
UNIT - III (3D Vision and Motion)

Topics

3D Vision and Motion: Methods for 3D vision
● projection schemes
● shape from shading
● photometric stereo
● shape from texture
● shape from focus
● active range finding
● surface representations
● point-based representation
● volumetric representations
● 3D object recognition
● 3D reconstruction
● introduction to motion
● triangulation
● bundle adjustment
● translational alignment
● parametric motion
● spline-based motion
● optical flow
● layered motion
❖ Projections in Computer Graphics:
Representing an n-dimensional object in n-1 dimensions is known as
projection. It is the process of converting a 3D object into a 2D representation:
a 3D object is mapped onto a 2D plane, {(x, y, z) -> (x, y)}. Projection is also
defined as the mapping or transformation of an object onto the projection
plane (view plane). When geometric objects are formed by the intersection of
lines with a plane, the plane is called the projection plane and the lines are
called projectors.

Types of Projections:

1. Parallel projections

2. Perspective projections
Center of Projection:

It is an arbitrary point from which the projection lines are drawn through each
point of an object.

● If the COP is located at a finite point in 3D space, the result is a perspective
projection.

● If the COP is located at infinity, all the projection lines are parallel and the
result is a parallel projection.

Parallel Projection:

A parallel projection is formed by extending parallel lines from each vertex of
the object until they intersect the plane of the screen. Parallel projection
transforms the object to the view plane along parallel lines. A projection is said
to be parallel if the center of projection is at an infinite distance from the
projection plane. A parallel projection preserves the relative proportions of
objects, so accurate views of the various sides of an object are obtained. The
projection lines are parallel to each other, extending from the object to intersect
the view plane. Because relative proportions are preserved, parallel projection is
used in drafting to produce scale drawings of 3D objects, although it is not a
realistic representation. The point where a projection line meets the view plane
is the projection of the corresponding vertex.

Parallel projection is divided into two categories, and each of these is further
subdivided.

Orthographic Projections:

In an orthographic projection the direction of projection is normal to the
projection plane; the projection lines are parallel to each other and make a 90°
angle with the view plane. Orthographic parallel projections are obtained by
projecting points along parallel lines that are perpendicular to the projection
plane. Orthographic projections are most often used to produce the front, side,
and top views of an object, called elevations. Engineering and architectural
drawings commonly employ these projections. The transformation equations for
an orthographic parallel projection are straightforward. Some special
orthographic parallel projections are the plan view and the side elevations. We
can also perform orthographic projections that display more than one face of an
object; such views are called axonometric orthographic projections.

Oblique Projections:

Oblique projections are obtained by projecting along parallel lines that are
not perpendicular to the projection plane. An oblique projection shows the
front and top surfaces, capturing the three dimensions of height, width and
depth. The front (principal) surface of the object is parallel to the plane of
projection, which makes this projection effective for pictorial representation.
● Isometric projections: Orthographic projections that show more than one
side of an object are called axonometric orthographic projections. The most
common axonometric projection is the isometric projection, in which the
direction of projection makes equal angles with all three principal axes.
Parallelism of lines is preserved, but angles are not.

● Dimetric projections: The direction of projection makes equal angles with
exactly two of the principal axes.

● Trimetric projections: The direction of projection makes unequal angles
with the three principal axes.

Cavalier Projections:

The projectors make an angle of 45 degrees with the projection plane, so lines
perpendicular to the projection plane are projected with no change in length;
the depth of the object is shown at its true scale.

Cabinet Projections:

The projectors make an angle of about 63.4 degrees with the projection plane,
so lines perpendicular to the projection plane are projected at one half of their
actual length. This gives a more realistic appearance of the object.
Perspective Projections:
● A perspective projection is produced by straight lines radiating from a
common point and passing through points on the object to the plane of
projection.

● Perspective projection is a geometric technique used to produce a
three-dimensional graphic image on a plane, corresponding to what a
person sees.

● Any set of parallel lines of the object that is not parallel to the projection
plane is projected into converging lines. Each different set of parallel lines
has its own vanishing point.

● Coordinate positions are transferred to the view plane along lines that
converge to a point called the projection reference point.

● Distances and angles are not preserved and parallel lines do not remain
parallel. Instead, they all converge at a single point called the center of
projection. There are three types of perspective projections.

Two characteristics of perspective projection are vanishing points and
perspective foreshortening. Due to foreshortening, objects and lengths appear
smaller the farther they are from the center of projection. The projection lines
are not parallel, and we must specify a center of projection (COP).

Different types of perspective projections:

● One-point perspective projection: exactly one principal axis has a finite
vanishing point. This is the simplest perspective projection to draw.

● Two-point perspective projection: exactly two principal axes have finite
vanishing points. This projection gives a better impression of depth.

● Three-point perspective projection: all three principal axes have finite
vanishing points. This is the most difficult perspective projection to draw.

Perspective foreshortening:

The size of the perspective projection of an object varies inversely with the
distance of the object from the center of projection.
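To make the two projection schemes concrete, here is a minimal sketch (my own illustration, not from the original notes, assuming NumPy is available): 3D points are projected orthographically by dropping z, and perspectively by a pinhole at the origin with an assumed focal length f. The perspective result shows foreshortening: the farther point maps closer to the image center.

```python
import numpy as np

def orthographic_project(points):
    """Parallel (orthographic) projection onto the z = 0 plane: just drop z."""
    return points[:, :2]

def perspective_project(points, f=1.0):
    """Simple pinhole perspective projection with focal length f.

    Assumes the center of projection is at the origin and the image
    plane is at z = f, so x' = f * x / z and y' = f * y / z.
    """
    z = points[:, 2:3]
    return f * points[:, :2] / z

# Two points with the same x, y, one twice as far from the camera.
pts = np.array([[1.0, 1.0, 2.0],
                [1.0, 1.0, 4.0]])
print(orthographic_project(pts))   # identical x, y -> no foreshortening
print(perspective_project(pts))    # farther point appears smaller (foreshortened)
```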


❖ Shape from shading – photometric stereo – shape from texture – shape from focus
Photometric stereo is a technique in computer vision for estimating the
surface normals of objects by observing that object under different lighting
conditions (photometry). It is based on the fact that the amount of light
reflected by a surface is dependent on the orientation of the surface in
relation to the light source and the observer.[1] By measuring the amount of
light reflected into a camera, the space of possible surface orientations is
limited. Given enough light sources from different angles, the surface
orientation may be constrained to a single orientation or even
overconstrained.

The technique was originally introduced by Woodham in 1980.[2] The


special case where the data is a single image is known as shape from
shading, and was analyzed by B. K. P. Horn in 1989.[3] Photometric stereo
has since been generalized to many other situations, including extended
light sources and non-Lambertian surface finishes. Current research aims
to make the method work in the presence of projected shadows, highlights,
and non-uniform lighting.

Under Woodham's original assumptions (Lambertian reflectance, known point-like
distant light sources, and uniform albedo) the problem can be solved by inverting
the linear equation

I = L · n

where I is a (known) vector of m observed intensities, n is the (unknown) surface
normal, and L is a (known) m × 3 matrix of normalized light directions.

This model can easily be extended to surfaces with non-uniform albedo, while
keeping the problem linear.[4] Taking an albedo (reflectivity) k, the formula for the
reflected light intensity becomes:

I = k (L · n)

If L is square (there are exactly 3 lights) and non-singular, it can be inverted, giving:

L⁻¹ I = k n

Since the normal vector n is known to have length 1, k must be the length of the
vector k n, and n is the normalized direction of that vector. If L is not square (there
are more than 3 lights), a generalization of the inverse can be obtained using the
Moore–Penrose pseudoinverse,[5] by simply multiplying both sides with Lᵀ, giving:

Lᵀ I = Lᵀ k (L · n)
(Lᵀ L)⁻¹ Lᵀ I = k n

After which the normal vector and albedo can be solved as described above.
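A minimal per-pixel sketch of this solve (an illustration under the stated assumptions of at least three known light directions and Lambertian reflectance, not the notes' own code): stack the observed intensities, apply the Moore–Penrose pseudoinverse of the light matrix, and split the result into albedo (length) and normal (direction).

```python
import numpy as np

def photometric_stereo(intensities, lights):
    """Recover per-pixel albedo and surface normal.

    intensities : (m, H, W) array of images under m light directions
    lights      : (m, 3) array of normalized light directions
    Solves I = k (L . n) per pixel via the Moore-Penrose pseudoinverse.
    """
    m, H, W = intensities.shape
    I = intensities.reshape(m, -1)            # (m, H*W)
    G = np.linalg.pinv(lights) @ I            # (3, H*W), each column is k*n
    albedo = np.linalg.norm(G, axis=0)        # k = |k n|
    normals = G / np.maximum(albedo, 1e-8)    # n = (k n) / k
    return albedo.reshape(H, W), normals.reshape(3, H, W)
```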


❖ Active range finding


❖ Point-based representation:

Point-based geometry representations allow 3D models to be processed and
visualized without the need for costly surface reconstruction or triangulation.
They are therefore a flexible and efficient alternative to traditional spline-based
or mesh-based surface representations.

A crucial component of any interactive application using a point-based
representation is visualization. Due to the initial lack of hardware support for
point primitives, early point-based rendering techniques were pure software
implementations, and hence were too slow for real-time visualization of highly
complex data sets. However, the increasing programmability of graphics
hardware (GPUs) allows for hardware-accelerated point rendering using
so-called surface splatting, where each point sample is equipped with a normal
vector and a radius, and therefore represents a small circle or ellipse in object
space.

Figure 1: A model represented by elliptical splats (left), rendered using flat
shading (center left), Gouraud shading (center right), and Phong shading (right).
High visual quality and flexible rendering are achieved by the inherent
anti-aliasing of surface splatting, as well as by multi-pass deferred shading with
per-pixel Phong shading and shadow mapping (Botsch et al., 2004; Botsch et al., 2005).

Figure 2: Phong shading can be implemented efficiently using deferred shading, based
on the depicted three rendering passes.

The point-based rendering metaphor, where elliptical splats are generated from simple
OpenGL points, has also been successfully applied in scientific visualization. For
instance, in molecular visualization, individual atoms and their bonds can be
represented by spheres and cylinders, respectively, which are generated and rasterized
completely on the GPU (Sigg et al., 2006). Thanks to the high rendering performance,
even dynamic (pre-computed) MD simulations of large membrane patches can be
visualized in real time, which has been exploited for an interactive “atom-level magnifier
tool” in a combined mesoscopic and molecular visualization.

Figure 3: Point-based molecule rendering allows for an interactive magnifier tool that
bridges the gap between cell visualization at the mesoscopic scale (left) and the
molecular scale (right).


❖ 3D object recognition – 3D reconstruction


Over the years, object detection has become more and more advanced. It has
progressed from recognizing objects in simple two-dimensional (2D) images to
identifying objects in the complex three-dimensional (3D) world around us. Early
techniques like template matching, which involved finding objects by comparing parts of
an image to stored reference images, were developed in the 1970s and formed the
basis for 2D object detection. In the 1990s, the introduction of technologies such as
LIDAR (Light Detection and Ranging) made it possible for systems to capture depth and
spatial information more easily. Today, multi-modal fusion methods, which combine 2D
images with 3D data, have paved the way for highly accurate 3D object detection
systems.

An Overview of 2D Object Detection


Before we take a look at 3D object detection, let’s understand how 2D object detection
works. 2D object detection is a computer vision technique that enables computers to
recognize and locate objects within flat, two-dimensional images. It works by analyzing
an object's horizontal (X) and vertical (Y) position in a picture. For example, if you pass
an image of players on a soccer field to a 2D object detection model like Ultralytics
YOLOv8, it can analyze the image and draw bounding boxes around each object (in this
case, the players), precisely identifying their location.
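For context, a hedged sketch of how such a 2D detector might be invoked with the Ultralytics Python package; the checkpoint name and input file are placeholders, not part of the original text.

```python
# pip install ultralytics  (assumed package name)
from ultralytics import YOLO

model = YOLO("yolov8n.pt")           # assumed pretrained checkpoint name
results = model("soccer_match.jpg")  # hypothetical input image

for r in results:
    for box in r.boxes:              # each detection: 2D bounding box + class + score
        print(box.xyxy, box.cls, box.conf)
```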

However, 2D object detection has its limitations. Since it only considers two dimensions,
it doesn’t understand depth. This can make it hard to judge how far away or big an
object is. For example, a large object far away might appear the same size as a smaller
object that’s closer, which can be confusing. The lack of depth information can cause
inaccuracies in applications like robotics or augmented reality, where knowing the true
size and distance of objects is necessary. That’s where the need for 3D object detection
comes in.

Gaining Spatial Awareness with 3D Object Detection


3D object detection is an advanced computer vision technique that allows computers to
identify objects in a three-dimensional space, giving them a much deeper understanding
of the world around them. Unlike 2D object detection, 3D object detection also takes
into consideration data about depth. Depth information provides more details, like where
an object is, how big it is, how far away it is, and how it's positioned in the real 3D world.
Interestingly, 3D detection can also handle situations where one object partially hides
another (occlusions) better and remains reliable even when the perspective changes. It
is a powerful tool for use cases that need precise spatial awareness.

3D object detection is vital for applications like self-driving cars, robotics, and
augmented reality systems. It works by using sensors like LiDAR or stereo cameras.
These sensors create detailed 3D maps of the environment, known as point clouds or
depth maps. These maps are then analyzed to detect objects in a 3D environment.

There are many advanced computer vision models designed specifically for
handling 3D data, like point clouds. For example, VoteNet is a model that
uses a method called Hough voting to predict where the center of an object
is in a point cloud, making it easier to detect and classify objects accurately.
Similarly, VoxelNet is a model that converts point clouds into a grid of small
cubes called voxels to simplify data analysis.
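As a rough illustration of the voxelization idea behind models like VoxelNet (a simplified sketch, not the actual VoxelNet pipeline), a point cloud can be binned into a regular grid of voxels before any learning takes place:

```python
import numpy as np

def voxelize(points, voxel_size=0.2):
    """Bin an (N, 3) point cloud into occupied voxel indices.

    Returns the integer grid coordinates of each occupied voxel and the
    number of points that fell into it. Real detectors (e.g. VoxelNet)
    additionally encode per-voxel features with a small network.
    """
    origin = points.min(axis=0)
    idx = np.floor((points - origin) / voxel_size).astype(np.int64)  # (N, 3)
    occupied, counts = np.unique(idx, axis=0, return_counts=True)
    return occupied, counts

# Example: 1000 random points in a 10 m cube.
cloud = np.random.rand(1000, 3) * 10.0
voxels, counts = voxelize(cloud, voxel_size=0.5)
print(voxels.shape, counts.max())
```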
Key Differences Between 2D and 3D Object Detection

Now that we understand 2D and 3D object detection, let's explore their key
differences. 3D object detection is more complicated than 2D object detection because
it works with point clouds. Analyzing 3D data, like the point clouds generated by LiDAR,
requires a lot more memory and computing power. Another difference is the complexity
of the algorithms involved. 3D object detection models need to be more complex to be
able to handle depth estimation, 3D shape analysis, and analysis of an object’s
orientation.

3D object detection models involve heavier mathematical and


computational work than 2D object detection models. Processing 3D data
in real-time can be challenging without advanced hardware and
optimizations. However, these differences make 3D object detection more
suited for applications requiring better spatial understanding. On the other
hand, 2D object detection is often used for simpler applications like security
systems that need image recognition or video analysis.

Pros and Cons of 3D Object Detection :-

3D object detection offers several advantages that make it stand out from traditional 2D
object detection methods. By capturing all three dimensions of an object, it provides
precise details about its location, size, and orientation with respect to the real world.
Such precision is crucial for applications like self-driving cars, where knowing the exact
position of obstacles is vital for safety. Another advantage of using 3D object detection
is that it can help you get a much better understanding of how different objects relate to
each other in 3D space.

Despite the many benefits, there are also limitations related to 3D object detection. Here
are some of the key challenges to keep in mind:

● Higher computational costs: Working with 3D data requires more powerful


hardware resources, and the cost can add up quickly.
● More complex data requirements: 3D object detection often relies on advanced
sensors like LiDAR, which can be expensive and not necessarily available in all
environments.
● Collecting and processing data: The complex data requirements of 3D object
detection make gathering, preparing, and processing the large datasets needed
to train the models both time-consuming and resource-intensive.
● Increased model complexity: The models used for 3D object detection are
generally more complicated, with more layers and parameters than those used
for 2D object detection.

➔ Applications of 3D Object Detection


Now that we've discussed the pros and cons of 3D object detection,
let's take a closer look at some of its use cases.

➔ Autonomous Vehicles

In self-driving cars, 3D object detection is vital for perceiving the
vehicle's surroundings. It lets the vehicle detect pedestrians, other
cars, and obstacles, and it provides precise information about their
position, size, and orientation in the real world. The detailed data
obtained through 3D object detection systems contributes to a much
safer self-driving experience for the passengers on board.

➔ Robotics
Robotic systems use 3D object detection for several applications.
They use it to navigate through different types of environments, pick
up and place objects, and interact with their surroundings. Such use
cases are particularly important in dynamic settings like warehouses
or manufacturing facilities, where robots need to understand
three-dimensional layouts to function effectively.
➔ Augmented and Virtual Reality (AR/VR)

Another interesting use case of 3D object detection is in augmented and


virtual reality applications. 3D object detection is used to accurately place
virtual objects in a realistic VR or AR environment. Doing so increases the
overall user experience of such technologies. It also allows the VR/AR
systems to recognize and track physical objects, creating immersive
environments where digital and physical elements interact seamlessly. For
example, gamers using AR/VR headsets can get a much more immersive
experience with the help of 3D object detection. It makes interactions with
virtual objects in 3D spaces a lot more engaging.

3D object detection makes it possible for systems to understand depth


and space more effectively than 2D object detection methods. It plays
a key role in applications like self-driving cars, robots, and AR/VR,
where knowing an object’s size, distance, and position is important.
While 3D object detection requires more processing power and
complex data, its ability to provide accurate and detailed information
makes it a very valuable tool in many fields. As technology advances,
the efficiency and accessibility of 3D object detection will likely
improve, paving the way for even broader adoption and innovation
across various industries.

❖ 3D Reconstruction
➔ 3D Reconstruction Basic Terminology (Traditional
Computer Vision Approach)

One of the most captivating achievements in this domain of Computer Vision is 3D


reconstruction, which aims to generate three-dimensional models of real-world objects
or scenes from two-dimensional images or videos. This cutting-edge technology holds
immense potential across various industries, including robotics, augmented reality,
medical imaging, and more. In this blog, we will explore the fascinating world of 3D
reconstruction, its techniques, and its diverse applications.

Understanding 3D Reconstruction

3D reconstruction, in essence, involves converting a set of 2D images or videos into a


coherent and accurate 3D representation. This process relies on sophisticated
algorithms and mathematical models to estimate the geometry and appearance of the
objects or environments depicted in the input data.
➔ Techniques of 3D Reconstruction:

1. Multi-View Stereo (MVS): Multi-View Stereo is an extension of stereo vision, which


uses multiple images of a scene taken from different viewpoints to reconstruct a
detailed 3D model. MVS algorithms take advantage of dense correspondence matching
between image pixels to generate a high-resolution 3D point cloud. This technique is
widely used in photogrammetry, cultural heritage preservation, and robotics.

2. Depth from Focus: Depth from Focus is a technique that estimates depth information
based on variations in the focus of an imaging system. By capturing multiple images
with different focus settings, the algorithm analyzes the sharpness of different image
regions and infers the corresponding depth values (a minimal sketch of such a
sharpness analysis is given after this list). This method is particularly useful in
applications where traditional stereo or structure-from-motion techniques may not be
applicable, such as micro-scale object reconstruction.

3. Shape from Shading: Shape from Shading techniques utilize the variations in the
intensity of an object's surface to infer its 3D shape. By assuming certain lighting
conditions, the algorithm estimates the surface normals at each pixel and generates a
depth map. This method finds applications in fields like computer graphics and medical
imaging.

4. Volumetric Reconstruction: Volumetric reconstruction aims to generate 3D


representations of objects or scenes using a voxel grid, where each voxel (3D pixel)
represents a small volume element. This technique is employed in medical imaging
(e.g., CT and MRI scans), where it allows for the creation of detailed 3D models of
anatomical structures.
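Returning to Depth from Focus, a common sharpness measure is the local response of the Laplacian; the sketch below (my own illustration, assuming OpenCV is available and that `images` is a focal stack with known focus distances) assigns each pixel the focus distance of the frame in which it appears sharpest:

```python
import cv2
import numpy as np

def depth_from_focus(images, focus_distances):
    """Assign each pixel the focus distance of the frame where it is sharpest.

    images          : list of grayscale frames (same scene, different focus)
    focus_distances : focus distance corresponding to each frame
    """
    sharpness = []
    for img in images:
        lap = cv2.Laplacian(img.astype(np.float32), cv2.CV_32F)
        # Local sharpness: squared Laplacian response, smoothed over a window.
        sharpness.append(cv2.GaussianBlur(lap * lap, (9, 9), 0))
    best = np.argmax(np.stack(sharpness), axis=0)   # index of sharpest frame per pixel
    return np.asarray(focus_distances)[best]        # coarse depth map
```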

➔ Applications of 3D Reconstruction:
1. Robotics and Automation: 3D reconstruction plays a vital role in robotic applications,
enabling robots to perceive their environments, localize themselves, and plan actions
accordingly. Robots equipped with 3D vision systems can navigate complex
environments, perform pick-and-place tasks with precision, and collaborate safely with
humans in shared workspaces.

2. Virtual Reality (VR): In virtual reality, 3D reconstruction is instrumental in creating


realistic and immersive virtual environments. By capturing real-world scenes using
depth sensors or multiple cameras, developers can recreate these scenes in 3D and
allow users to explore them in virtual space.

3. 3D Printing and Manufacturing: 3D reconstruction is used in conjunction with 3D


printing to fabricate physical objects from digital models. Industrial applications benefit
from this technology by creating prototypes, custom components, and complex
geometries that would be difficult to achieve through traditional manufacturing
techniques.

4. Environmental Monitoring: 3D reconstruction is utilized for environmental monitoring


and surveying applications. Drones equipped with cameras and LiDAR sensors can
capture aerial imagery and create accurate 3D models of terrains, forests, and
infrastructure for urban planning, disaster management, and conservation efforts.

5. Cultural Heritage Restoration: Cultural heritage sites and artifacts can deteriorate
over time due to environmental factors and human impact. 3D reconstruction assists in
the restoration and conservation of these valuable assets by creating digital archives
and aiding in virtual reconstruction efforts.

6. Architecture and Construction: Architects and engineers use 3D reconstruction for


building information modeling (BIM), allowing them to create detailed 3D
representations of structures before construction begins. This helps in visualizing
designs, optimizing layouts, and detecting potential issues early in the planning phase.

7. Gaming and Simulation: 3D reconstruction plays a crucial role in creating realistic


virtual worlds and characters for gaming and simulation applications. Game developers
and simulation engineers use this technology to achieve higher levels of immersion and
realism, enhancing the user experience.

❖ Introduction to motion – triangulation – bundle adjustment

➔ Triangulation

In computer vision, triangulation refers to the process of determining a point in 3D


space given its projections onto two, or more, images. In order to solve this problem it is
necessary to know the parameters of the camera projection function from 3D to 2D for
the cameras involved, in the simplest case represented by the camera matrices.
Triangulation is sometimes also referred to as reconstruction or intersection.

The triangulation problem is in principle trivial. Since each point in an image


corresponds to a line in 3D space, all points on the line in 3D are projected to the point
in the image. If a pair of corresponding points in two, or more images, can be found it
must be the case that they are the projection of a common 3D point x. The set of lines
generated by the image points must intersect at x (3D point) and the algebraic
formulation of the coordinates of x (3D point) can be computed in a variety of ways, as
is presented below.

In the following, it is assumed that triangulation is made on corresponding image points


from two views generated by pinhole cameras.
In the real case, the positions of the image points y1 and y2 cannot be measured
exactly. The reason is a combination of factors such as:

● Geometric distortion, for example lens distortion, which means that the 3D-to-2D
mapping of the camera deviates from the pinhole camera model. To some extent
these errors can be compensated for, leaving a residual geometric error.
● A single ray of light from x (the 3D point) is dispersed in the lens system of the
camera according to a point spread function. Recovering the corresponding image
point from measurements of the dispersed intensity function in the images
introduces errors.
● In a digital camera, the image intensity function is only measured at discrete
sensor elements. Inexact interpolation of the discrete intensity function has to be
used to recover the true one.
● The image points y1' and y2' used for triangulation are often found using various
types of feature extractors, for example for corners or interest points in general.
There is an inherent localization error for any type of feature extraction based on
neighborhood operations.
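Because of these errors, the back-projected rays generally do not intersect exactly, so in practice x is estimated, for example with the linear (DLT) triangulation sketched below (my own illustration, assuming two known 3×4 camera matrices and one matched point pair):

```python
import numpy as np

def triangulate_dlt(P1, P2, y1, y2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2 : 3x4 camera projection matrices
    y1, y2 : corresponding image points (x, y) in each view
    Builds the homogeneous system A X = 0 from the projection equations
    and solves it in the least-squares sense with the SVD.
    """
    A = np.vstack([
        y1[0] * P1[2] - P1[0],
        y1[1] * P1[2] - P1[1],
        y2[0] * P2[2] - P2[0],
        y2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]   # dehomogenize to get the 3D point
```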

➔ Bundle adjustment:

Bundle adjustment is the refinement step in structure from motion. It refines a
visual reconstruction to produce jointly optimal 3D structure P and camera poses
C by minimizing the total re-projection error.

The cost function can be minimized with:
1. Gradient descent
2. The Newton method
3. Gauss-Newton
4. Levenberg-Marquardt

Gradient descent, for example, starts from an initialization Xk = X0 and
iteratively updates the parameters along the negative gradient of the cost.
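A minimal sketch of this refinement using SciPy's least-squares solver (my own illustration; the parameterization, the `project` helper, and the data layout are assumptions, and camera rotations are held fixed to keep the sketch short):

```python
import numpy as np
from scipy.optimize import least_squares

def project(X, pose):
    """Hypothetical pinhole projection of 3D points X (N, 3) for one camera pose.

    pose = (R, t); assumes unit focal length and no lens distortion.
    """
    R, t = pose
    Xc = X @ R.T + t
    return Xc[:, :2] / Xc[:, 2:3]

def residuals(params, n_cams, n_pts, observations):
    """Stack re-projection errors over all (camera, point) observations.

    params holds all camera translations and all 3D points, flattened.
    observations is a list of (cam_idx, pt_idx, measured 2D point).
    """
    t_all = params[:n_cams * 3].reshape(n_cams, 3)
    X_all = params[n_cams * 3:].reshape(n_pts, 3)
    errs = []
    for cam, pt, uv in observations:
        pred = project(X_all[pt:pt + 1], (np.eye(3), t_all[cam]))[0]
        errs.append(pred - uv)          # re-projection error for this observation
    return np.concatenate(errs)

# least_squares minimizes 0.5 * sum(residuals**2); method='lm' is Levenberg-Marquardt.
# result = least_squares(residuals, x0, args=(n_cams, n_pts, observations), method='lm')
```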

➔ Parametric Modeling

Parametric modeling refers to the ability to change the shape of a model's
geometry when a dimension value is changed. The various components of a
model are described in terms of features.

For instance, an object can include various types of features such as grooves,
chamfers, holes, and fillets. The basic unit of a parametric solid model is called
a 'feature'.

Main Purpose of Parametric Modeling

Parametric modeling is used to design objects and systems on the computer
with component properties that simulate real-world behavior. To manipulate the
system attributes, parametric modeling uses feature-based, surface, and solid
modeling design tools. One of the most essential features of parametric
modeling is that interlinked features change automatically; in other words, it
allows the designer to define whole classes of shapes rather than just giving
instances of them. Before the invention of parametrics it was difficult for
designers to change the form of a model.

For instance, the designer previously had to alter the length, breadth, and height
of a 3D solid separately. Using parametric modelling, the designer only needs to
change one parameter, since the other two are automatically updated.
Parametric models therefore emphasise and parameterize the steps involved in
shaping an object. Product design engineering service companies gain a lot
from this.

Process of Parametric Modelling

A series of mathematical equations serves as the foundation for parametric
models. Parametric models must be based on accurate project data in order to
be taken seriously. The effectiveness of a modeling solution is determined by
the level of expertise behind the information-analysis approach and the depth of
the underlying project information.

Types of Parametric Models


There are two types of Parametric Models. These are discussed below:
1. Constructive Solid Geometry (CSG)

2. Boundary Representation (BR)

Constructive Solid Geometry (CSG)

CSG (Constructive Solid Geometry) defines a model by combining fundamental
(primitive) solid shapes and solids created using extrusion and sweeping
operations. It builds a model using Boolean operations. CSG works with a
collection of 3D solid primitives, such as a cone, cylinder, prism, box, or sphere,
which are then combined using basic Boolean operations.
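As a tiny illustration of CSG-style Boolean combination (my own sketch using signed distance functions rather than a real CAD kernel; function names and shapes are assumptions):

```python
import numpy as np

# Signed distance functions for two primitives: negative inside, positive outside.
def sphere(p, center, radius):
    return np.linalg.norm(p - center, axis=-1) - radius

def box(p, center, half_size):
    q = np.abs(p - center) - half_size
    return np.linalg.norm(np.maximum(q, 0), axis=-1) + np.minimum(q.max(axis=-1), 0)

# CSG Boolean operations expressed on signed distances.
union        = lambda a, b: np.minimum(a, b)
intersection = lambda a, b: np.maximum(a, b)
difference   = lambda a, b: np.maximum(a, -b)   # a minus b

# Example: a box with a spherical bite taken out of one corner.
p = np.random.rand(5, 3)                        # sample query points
d = difference(box(p, np.array([0.5, 0.5, 0.5]), np.array([0.4, 0.4, 0.4])),
               sphere(p, np.array([0.9, 0.9, 0.9]), 0.3))
print(d < 0)                                    # True where the point is inside the solid
```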

Boundary Representation (BR)

In BR (Boundary Representation), a solid model is created by specifying the
surfaces (points, edges, etc.) that define its spatial limits. The object is then
created by connecting these spatial points. This technique is widely used in
Finite Element Method (FEM) programmes because it makes it easier to
manage the interior meshing of the volume.

Advantages of Parametric Modeling


The advantages of 3D parametric modeling over conventional 2D drawings

are as follows:

● It is able to create flexible designs


● Enhanced product visualisation because you can start with basic

objects and little detail

● Enhanced downstream application integration and shortened

engineering cycle time

● New designs can be made using already created design data.

● Rapid design turnaround, efficiency improvement

Terminologies Related to Parametric Modeling


● Parametric Model: A 3D (three-dimensional) model with parameters that can
be customized according to our requirements. We can easily change the
parameter values to change the shape, dimensions, and other properties of the
model.

● NURBS Model: A mathematical approach used in CAD and in computer
graphics. The curve or surface is controlled through the use of control points.

● Polygonal Model: Also known as a mesh model, this type of model is used to
create 3D objects from small components. These components, or polygons,
consist of edges, vertices, and faces. A single polygon is usually called a face,
while a polygon mesh consists of many connected faces (polygons). We can
use it to build a 3D polygonal model.

● Visual Programming: Allows us to create parametric 2D or 3D models by
inserting nodes, setting and changing parameters, and connecting inputs. It is
best suited to people who do not have coding skills, because it lets the user
exploit the power of algorithms through actions and visual objects rather than a
script of code.

● Nodes: Also called components; they process and contain data about a
model's parameters.

● List: A collection of items that can be used as an input or output source,
managed in a structured way using indexes that start from 0.

● Data Types: Every item has its own data type. Some items consist of curves
or points, some consist of solids, and others contain only numbers; the available
types vary with the parametric modeling software.

Parametric Modelling Tools


Today's market offers a wide variety of software options for parametric
modelling. This software can be broadly categorised as:

● Small-scale use

● Large-scale use

● Industry-specific modeling


The third category, Industry-Specific Parametric Software, has seen the

most growth. Some of the top business software includes:

❖ SolidWorks: SolidWorks, which was first released in 1995 as a

low-cost alternative to the other parametric modelling software

solutions, was acquired by Dassault Systemes in 1997. It has a large

following in the plastics industry and is mainly employed in

mechanical design applications.

❖ CATIA: CATIA was developed by Dassault Systemes in France in the

late 1970s. These complex industries—aeronautics, automobiles,

and shipbuilding—all make extensive use of this software.

❖ PTC: PTC's Creo Parametric is a widely used industry standard for 3D CAD
software. To speed up the design of components and assemblies, it offers a
wide variety of powerful yet adaptable 3D CAD capabilities.

❖ Spline-based motion

Spline-based motion in computer vision is a technique used to model and

analyze motion trajectories in a smooth and continuous manner. Here are

some key points to consider:

1. Definition of Splines

● Splines are piecewise-defined polynomials that provide a smooth curve

through a set of control points.


● Common types include linear splines, quadratic splines, and cubic

splines, with cubic splines being the most widely used due to their

smoothness and flexibility.

2. Applications in Motion Analysis

● Trajectory Modeling: Splines can effectively model the trajectories of

moving objects, providing a smooth representation of their paths.

● Object Tracking: In video analysis, splines can be used to interpolate

the position of an object over time, improving tracking accuracy.

● Motion Synthesis: In animation, splines help generate smooth motion

paths for characters or objects.

3. Advantages of Using Splines

● Smoothness: Provides a visually appealing and mathematically smooth

representation of motion.

● Flexibility: Can easily fit complex trajectories by adjusting control

points.

● Local Control: Changing one control point only affects nearby

segments, allowing for precise adjustments.

4. Types of Splines in Motion Modeling

● B-Splines: A generalization of splines that offer more control over the

shape of the curve without increasing the number of control points.


● Catmull-Rom Splines: A type of interpolating spline that passes

through its control points, suitable for animation.

● Hermite Splines: Defined by endpoints and tangents, allowing for

control over the shape of the curve.

5. Mathematical Representation

● Splines are represented using basis functions, typically defined over a

specific interval.

● For cubic splines, the function is defined as a piecewise cubic

polynomial, ensuring continuity in both the function and its first and

second derivatives.

6. Optimization and Fitting

● When fitting splines to data, optimization techniques are employed to

minimize the error between the spline and the observed data points.

● Regularization may be used to prevent overfitting, especially in noisy

data scenarios.

7. Challenges

● Computational Complexity: Fitting high-dimensional splines can be

computationally intensive.

● Parameter Selection: Choosing the right number and placement of

control points is crucial for effective modeling.


8. Integration with Other Techniques

● Splines can be combined with machine learning techniques for

improved motion prediction and recognition.

● They can also be integrated with Kalman filters for better tracking in

dynamic environments.
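To make the fitting and trajectory-modeling points above concrete, here is a minimal sketch (my own illustration, assuming SciPy is available) that fits a smoothing cubic spline to a noisy 2D trajectory and resamples it densely:

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Noisy 2D trajectory samples (e.g. tracked object positions over time).
t = np.linspace(0, 2 * np.pi, 40)
x = np.cos(t) + np.random.normal(0, 0.02, t.size)
y = np.sin(t) + np.random.normal(0, 0.02, t.size)

# Fit a cubic parametric spline; the smoothing factor s trades fidelity
# against smoothness (a simple form of regularization against noise).
tck, u = splprep([x, y], s=0.05, k=3)

# Resample the smooth trajectory densely, e.g. to interpolate positions
# between video frames for tracking or motion synthesis.
u_fine = np.linspace(0, 1, 200)
x_s, y_s = splev(u_fine, tck)
print(x_s[:3], y_s[:3])
```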

Conclusion

Spline-based motion modeling provides a powerful tool in computer vision,

allowing for smooth and flexible representations of motion trajectories. Its

applications range from simple object tracking to complex animation and

motion analysis, making it a valuable technique in both research and practical

implementations.

❖ Optical Flow Concept


Optical flow quantifies the motion of objects between consecutive frames captured by a
camera. These algorithms attempt to capture the apparent motion of brightness
patterns in the image. It is an important subfield of computer vision, enabling machines
to understand scene dynamics and movement.

The concept of optical flow dates back to the early works of James Gibson in the
1950s. Gibson introduced the concept in the context of visual perception. Researchers
didn’t start studying and using optical flow until the 1980s when computational tools
were introduced.

A significant milestone was the development of the Lucas-Kanade method in 1981. This
provided a foundational algorithm for estimating optical flow in a local window of an
image. The Horn-Schunck algorithm followed soon after, introducing a global approach
to optical flow estimation across the entire image.

Optical flow estimation relies on the assumption that the brightness of a point is
constant over short periods. Mathematically, this is expressed through the optical flow
equation

I_x v_x + I_y v_y + I_t = 0

where:
● I_x and I_y are the spatial gradients of the pixel intensity in the x and y
directions, respectively
● I_t is the temporal gradient
● v_x and v_y are the flow velocities in the x and y directions, respectively.

More recent breakthroughs involve leveraging deep learning models like FlowNet,
FlowNet 2.0, and LiteFlowNet. These models transformed optical flow estimation by
significantly improving accuracy and computational efficiency. This is largely because
of the integration of Convolutional Neural Networks (CNNs) and the availability of large
datasets.

Even in settings with occlusions, optical flow techniques nowadays can accurately
anticipate complicated patterns of apparent motion.
Techniques and Algorithms for Optical Flow

The evolution of computational approaches has produced different types of optical
flow algorithms, each with a unique way of calculating the motion pattern. Traditional
algorithms like the Lucas-Kanade and Horn-Schunck methods laid the groundwork for
this area of computer vision.
The Lucas-Kanade Method

This method caters to use cases with a sparse feature set. It operates on the
assumption that flow is locally smooth, applying a Taylor-series approximation to the
image gradients. The optical flow equation, which involves two unknown velocity
components per point, can then be solved for each point in the feature set. This
method is highly efficient for tracking well-defined corners and textures, often
identified by the Shi-Tomasi corner detector or the Harris corner detector.
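A hedged sketch of sparse Lucas-Kanade tracking with OpenCV; the video file name and parameter values are placeholders, not from the original text:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("video.mp4")            # hypothetical input video
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Sparse feature set: Shi-Tomasi corners in the first frame.
p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Pyramidal Lucas-Kanade: estimate where each corner moved in this frame.
    p1, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None,
                                               winSize=(21, 21), maxLevel=3)
    good_new = p1[status.flatten() == 1]
    prev_gray, p0 = gray, good_new.reshape(-1, 1, 2)
```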

The Horn-Schunck Algorithm

This algorithm is a dense optical flow technique. It takes a global approach by


assuming smoothness in the optical flow across the entire image. This method
minimizes a global error function and can infer the flow for every pixel. It offers more
detailed structures of motion at the cost of higher computational complexity.
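OpenCV's main module does not ship a Horn-Schunck implementation, but a dense per-pixel flow field of the same kind can be illustrated with the Farneback method (a sketch assuming the `prev_gray` and `gray` frames from the previous snippet):

```python
import cv2
import numpy as np

# Dense optical flow: one (vx, vy) vector per pixel, as in Horn-Schunck-style methods.
flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])  # per-pixel speed and direction
```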

However, novel deep learning algorithms have ushered in a new era of optical flow
algorithms. Models like FlowNet, LiteFlowNet, and PWC-Net use CNNs to learn from
vast datasets of images. This enables the prediction with greater accuracy and
robustness, especially in challenging scenarios. For example, in scenes with occlusions,
varying illumination, and complex dynamic textures.

These algorithms can be compared in terms of accuracy, speed, and computational
requirements.
Traditional techniques such as Lucas-Kanade and Horn-Schunck are foundational and
should not be discounted. However, they generally can’t compete with the accuracy and
robustness of deep learning approaches. Deep learning methods, while powerful, often
require substantial computational resources. This means they might not be as suitable
for real-time applications.
