
NANDHA ENGINEERING COLLEGE

COMPUTER VISION
17AIX07

UNIT - 1 INTRODUCTION TO IMAGE FORMATION AND PROCESSING

January | 2024
CONTENT
1 Computer Vision
2 Geometric Primitives and Transformation
3 Photometric Image Formation
4 Digital Camera
5 Point Operators
Computer Vision
What is CV?
 Computer Vision is one of the most powerful and compelling types of AI, and you have almost surely experienced it in any number of ways without even knowing it.
 Computer Vision is the field of Computer Science that focuses on replicating parts of the complexity of the human vision system and enabling computers to identify and process objects in images and videos in the same way that humans do.
 The ultimate goal is for computers to emulate the striking perceptual capability of human eyes and brains, or even to surpass or assist humans in certain ways.
History of Computer Vision

• The origins of computer vision can be traced back to the 1950s and 1960s, when researchers began exploring pattern recognition and image analysis. Early efforts focused on simple tasks like character recognition.

• In the 1970s and 1980s, advancements were made in edge detection, shape recognition, and early attempts at three-dimensional (3D) vision. However, limited computational power hindered progress.

• The 1990s witnessed an increased focus on robust algorithms for image understanding, feature extraction, and object recognition. Applications extended to medical imaging, surveillance, and industrial automation.

• The 2000s saw a surge in computer vision applications, driven by improved hardware capabilities and machine learning techniques. Notable breakthroughs include the development of Convolutional Neural Networks (CNNs) for image classification and object detection.
Concept of Computer Vision
• Computer vision involves teaching machines to interpret and make decisions based on visual data. The concept encompasses various tasks, including:

 Image acquisition
 Preprocessing
 Feature extraction
 Image recognition
 Object detection
 Segmentation
 3D Reconstruction
 Motion analysis
 Machine learning integration
 Applications
Convolutional Neural Networks (CNNs)

Definition

• Convolutional Neural Networks (CNNs or ConvNets) are a class of deep neural networks designed for processing and analyzing visual data, making them especially effective for tasks such as image recognition and computer vision. CNNs are characterized by their unique architecture, which includes convolutional layers, pooling layers, and fully connected layers.
Key Concept for CNNs

Convolutional Layers
Convolutional layers apply filters (also known as kernels) to input data, enabling the network to learn hierarchical features such as edges, textures, and patterns.

Activation Functions
Activation functions, such as ReLU (Rectified Linear Unit), introduce non-linearity to the network, allowing it to learn complex relationships and patterns.

Pooling Layers
Pooling layers downsample the spatial dimensions of the input data, reducing computational complexity and retaining important features. Common pooling operations include max pooling and average pooling.

Fully Connected Layers
Fully connected layers connect every neuron from one layer to every neuron in the next layer, consolidating learned features for classification or regression tasks.
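To make these layer roles concrete, here is a minimal sketch in PyTorch (the library choice, layer sizes, and the 28x28 grayscale input with 10 output classes are illustrative assumptions, not something the slides prescribe), chaining a convolutional layer, a ReLU activation, max pooling, and a fully connected classifier:

import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    # Minimal CNN: convolution -> ReLU -> max pooling -> fully connected layer.
    def __init__(self, num_classes=10):
        super().__init__()
        # Convolutional layer: 8 learned 3x3 filters over a 1-channel image.
        self.conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()                    # non-linearity
        self.pool = nn.MaxPool2d(kernel_size=2)  # downsample 28x28 -> 14x14
        # Fully connected layer consolidates pooled features for classification.
        self.fc = nn.Linear(8 * 14 * 14, num_classes)

    def forward(self, x):
        x = self.pool(self.relu(self.conv(x)))
        return self.fc(x.flatten(start_dim=1))

# Example: a batch of four hypothetical 28x28 grayscale images.
logits = TinyCNN()(torch.randn(4, 1, 28, 28))
print(logits.shape)  # torch.Size([4, 10])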
Key Concept for CNNs

Convolutional Filters
Filters capture local patterns in the input data, effectively learning features like edges, corners, and textures. These filters are adapted during training to recognize higher-level features.

Weight Sharing
CNNs leverage weight sharing, where the same set of parameters (weights and biases) is used across different regions of the input, facilitating the detection of similar features in different parts of an image.

Striding
Striding controls the step size of the filter as it moves across the input data, influencing the spatial dimensions of the output. Striding helps reduce the spatial dimensions and computational load.

Padding
Padding involves adding extra pixels to the input data, preventing information loss at the borders during convolution. It ensures that the spatial dimensions of the input and output match appropriately.

Dropout
Dropout is a regularization technique used to prevent overfitting by randomly setting a fraction of input units to zero during training, reducing reliance on any single neuron.
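As a worked example of how stride and padding affect output size, the standard rule along one spatial dimension is out = floor((in + 2*padding - kernel) / stride) + 1; a small sketch (the function name and the 28-pixel input are illustrative assumptions):

def conv_output_size(in_size, kernel, stride=1, padding=0):
    # Standard output-size rule for a convolution along one spatial dimension.
    return (in_size + 2 * padding - kernel) // stride + 1

# A 3x3 filter with stride 1 and padding 1 preserves a 28-pixel dimension:
print(conv_output_size(28, kernel=3, stride=1, padding=1))  # 28
# Stride 2 roughly halves it, reducing spatial size and computational load:
print(conv_output_size(28, kernel=3, stride=2, padding=1))  # 14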
Applications
CNNs are widely used in various applications:
• Image classification
• Object detection
• Facial recognition
• Image segmentation
• Medical image analysis
• Autonomous vehicles
• Natural language processing (when
combined with recurrent networks)
Geometric Primitives and Transformation

Definition
 Geometric primitives in computer vision refer to basic
shapes and structures used to represent objects in an
image. These primitives serve as foundational elements
for various computer vision tasks, including object
recognition, image analysis, and scene understanding.
Common geometric primitives include points, lines,
circles, rectangles, and polygons.
 Geometric transformations involve altering the positions,
orientations, and sizes of geometric primitives within an
image. These transformations play a crucial role in tasks
such as image registration, object tracking, and image
manipulation.
Key Concept

Points
Represented by coordinates (x, y), points serve as the fundamental building blocks for constructing more complex geometric shapes.

Lines
Defined by two points or a point and a direction vector, lines are essential for representing edges and contours in images.

Circles
Defined by a center point and a radius, circles are used to model rounded objects or features in images.

Rectangles
Composed of four points or defined by a center, width, and height, rectangles are used to represent objects with rectangular shapes.

Polygons
Composed of a sequence of connected points, polygons are versatile geometric primitives used to represent complex shapes with multiple sides.
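These primitives map directly onto drawing routines in common libraries; a minimal sketch using OpenCV (assuming the opencv-python package is available; coordinates and colors are illustrative):

import numpy as np
import cv2

canvas = np.zeros((200, 300, 3), dtype=np.uint8)               # blank BGR image

pt = (60, 60)                                                   # a point as (x, y) coordinates
cv2.circle(canvas, pt, 30, (0, 255, 0), 2)                      # circle: center point + radius
cv2.line(canvas, (10, 150), (290, 150), (255, 0, 0), 1)         # line: two endpoints
cv2.rectangle(canvas, (120, 30), (200, 90), (0, 0, 255), 2)     # rectangle: two opposite corners
poly = np.array([[230, 40], [280, 70], [250, 120]], np.int32)   # polygon: connected points
cv2.polylines(canvas, [poly], True, (0, 255, 255), 1)

cv2.imwrite("primitives.png", canvas)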
Key Concept

Translation
Shifts the position of geometric primitives horizontally and vertically.

Scaling
Enlarges or shrinks geometric primitives based on a scaling factor.

Rotation
Rotates geometric primitives around a specified point or axis.

Affine Transformation
Combines translation, rotation, scaling, and shearing to perform a more generalized transformation.

Shearing
Distorts geometric primitives by shifting one axis relative to the other.
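A minimal sketch of these transformations as 3x3 homogeneous matrices in NumPy, applied to an image with OpenCV's warpAffine (the parameter values and the synthetic test image are illustrative assumptions):

import numpy as np
import cv2

theta = np.deg2rad(30)   # rotation angle
tx, ty = 20.0, 10.0      # translation offsets
sx, sy = 1.5, 0.75       # scale factors
sh = 0.2                 # shear factor

# Homogeneous 3x3 matrices so individual transforms can be chained by multiplication.
T = np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], float)     # translation
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])                                     # rotation
S = np.diag([sx, sy, 1.0])                                    # scaling
H = np.array([[1, sh, 0], [0, 1, 0], [0, 0, 1]], float)       # shearing

A = T @ R @ S @ H        # affine transform combining all four

img = np.zeros((100, 100), dtype=np.uint8)
img[40:60, 40:60] = 255                                # a small white square to transform
warped = cv2.warpAffine(img, A[:2, :], (100, 100))     # warpAffine expects the top 2x3 block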
Photometric Image Formation

Definition
Photometric image formation in computer vision refers to the process
by which a digital image is created based on the interaction of light
with a scene and the subsequent capture of this light by an imaging
device, such as a camera. This process involves various factors related
to illumination, reflection, and the characteristics of the imaging
system.
Key Concept

Illumination
Illumination represents the incident light on a scene, and it plays a crucial role in image formation. The intensity, direction, and color of light impact how objects in the scene are captured.

Reflection
Reflection describes how surfaces in a scene interact with incident light. Different materials exhibit varying reflectance properties, influencing how much light they absorb or reflect.

Surface Properties
The material properties of surfaces, such as diffuse and specular reflectance, affect how light interacts with them. Diffuse reflection is responsible for the scattered appearance, while specular reflection creates mirror-like highlights.

Shading Models
Shading models are mathematical representations that simulate how light and shadows interact with surfaces. Common models include Lambertian reflectance for diffuse surfaces and Phong or Blinn-Phong models for specular highlights.
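As an illustration of the shading models named above, a minimal NumPy sketch of the ambient, Lambertian (diffuse), and Phong (specular) terms for a single point light (the vectors and material coefficients are illustrative assumptions; no shadows or attenuation):

import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

n = normalize(np.array([0.0, 0.0, 1.0]))   # surface normal
l = normalize(np.array([0.5, 0.5, 1.0]))   # direction towards the light
v = normalize(np.array([0.0, 0.0, 1.0]))   # direction towards the camera

k_a, k_d, k_s, shininess = 0.1, 0.7, 0.3, 32   # illustrative material coefficients

ambient = k_a
diffuse = k_d * max(np.dot(n, l), 0.0)                  # Lambertian term: proportional to n . l
r = 2.0 * np.dot(n, l) * n - l                          # mirror reflection of the light direction
specular = k_s * max(np.dot(r, v), 0.0) ** shininess    # Phong specular highlight

intensity = ambient + diffuse + specular
print(round(float(intensity), 3))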
Key Concept

Ambient, Diffuse & Specular Components
In computer graphics and vision, the interaction of light is often decomposed into ambient, diffuse, and specular components. This decomposition aids in simulating realistic lighting.

Light Source
The position, type, and characteristics of light sources in a scene impact how objects are illuminated and, consequently, how they appear in the captured image.

Camera Characteristics
The characteristics of the imaging device, such as the camera, influence image formation. This includes factors like exposure time, aperture size, and sensor sensitivity.
Digital Camera

Definition
A digital camera in computer vision refers to
an electronic imaging device that captures
visual information in the form of digital
images. It plays a fundamental role in
computer vision applications by providing a
means to acquire, process, and analyze
visual data for various tasks such as image
recognition, object detection, and scene
understanding.
Key Concept

Image Sensor
The image sensor converts light into electrical signals, forming the basis for digital image creation. Common types include CMOS (Complementary Metal-Oxide-Semiconductor) and CCD (Charge-Coupled Device) sensors.

Resolution
The resolution of a digital camera is the number of pixels it can capture. Higher resolution contributes to finer details in images.

Lens System
The lens system focuses light onto the image sensor. The choice of lenses influences factors like focal length, aperture, and depth of field.
Key Concept

Colour Representation
Digital cameras capture color information using RGB (Red, Green, Blue) channels. Color representation is crucial for various computer vision tasks.

Shutter Speed & Exposure
Shutter speed and exposure settings determine how long the camera's shutter remains open. They impact the amount of light reaching the sensor and influence image quality.

White Balance
White balance adjustments ensure accurate color reproduction under different lighting conditions.
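One simple, widely taught white-balance method (a gray-world sketch; not necessarily what any particular camera firmware does) scales each channel so the per-channel means match the overall mean:

import numpy as np

def gray_world_white_balance(img):
    # Scale each RGB channel so its mean matches the overall mean (gray-world assumption).
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gains = channel_means.mean() / channel_means
    return np.clip(img * gains, 0, 255).astype(np.uint8)

# Example: a flat image with a strong blue cast is pulled back towards neutral gray.
cast = np.tile(np.array([80, 90, 160], dtype=np.uint8), (4, 4, 1))
print(gray_world_white_balance(cast)[0, 0])   # roughly equal channel values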
Key Concept

Image Processing
In-built image processing capabilities enhance images and may include features like noise reduction, sharpness adjustment, and color correction.

Frame Rate
The frame rate indicates how many images per second the camera can capture. It is crucial for applications like video analysis.

Auto Focus
Autofocus systems and focus modes contribute to capturing sharp and clear images by adjusting the focus based on the scene.
Point Operators

Definition

• Point operators in computer vision refer to a class of image processing operations where each pixel in an image is independently transformed based on a predefined mathematical function. These operations are applied individually to every pixel, often without considering the surrounding pixels, and are fundamental for enhancing or modifying image characteristics.
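Since a point operator acts on each pixel independently, it can be written as g(x, y) = T(f(x, y)) and applied elementwise; a minimal NumPy sketch (the helper name and sample values are illustrative):

import numpy as np

def apply_point_operator(image, T):
    # Apply the per-pixel transform T to every pixel, ignoring its neighbours.
    out = T(image.astype(np.float64))
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
brighter = apply_point_operator(img, lambda p: p + 50)   # add a constant offset to each pixel
print(brighter)   # [[ 50 114] [178 255]]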
Key Concept

Pixel Transformation
Point operators involve transforming the intensity or color of each pixel independently based on a specific mathematical rule or function.

Gamma Correction
Gamma correction is a point operation used to adjust the overall brightness of an image, particularly in cases where the display device's characteristics need to be taken into account.

Brightness & Contrast Adjustment
Common point operators include operations for adjusting the brightness and contrast of an image. These operations scale or shift pixel values to achieve the desired visual effect.

Thresholding
Thresholding is a point operation that converts an image into a binary form by setting pixels above or below a certain threshold to specific values.
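A minimal NumPy sketch of the operations above on a tiny grayscale image (the alpha, beta, gamma, and threshold values are illustrative assumptions):

import numpy as np

img = np.array([[30, 120], [180, 240]], dtype=np.uint8)
f = img.astype(np.float64) / 255.0                 # normalize intensities to [0, 1]

# Brightness & contrast: scale (alpha) and shift (beta) the pixel values.
alpha, beta = 1.2, 0.1
bright_contrast = np.clip(alpha * f + beta, 0.0, 1.0)

# Gamma correction: raise normalized intensities to 1/gamma.
gamma = 2.2
gamma_corrected = f ** (1.0 / gamma)

# Thresholding: binarize around a fixed threshold.
binary = np.where(f > 0.5, 255, 0).astype(np.uint8)

print((bright_contrast * 255).astype(np.uint8))
print((gamma_corrected * 255).astype(np.uint8))
print(binary)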
Key Concept

Histogram Equalization
Histogram equalization is a point operation that enhances the contrast of an image by redistributing pixel values to cover the entire intensity range.

Logarithmic & Exponential Transformation
Logarithmic and exponential transformations are point operations used for enhancing details in certain intensity ranges.

Negative Transformation
A simple point operation involves obtaining the negative of an image, where pixel values are inverted.

Bit-Plane Slicing
Bit-plane slicing involves extracting specific bits from the binary representation of pixel values, allowing for detailed analysis or modification.
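A minimal NumPy sketch of these operations on a toy grayscale image (the ramp image is an illustrative assumption):

import numpy as np

img = np.arange(256, dtype=np.uint8).reshape(16, 16)   # toy 16x16 grayscale ramp

# Negative transformation: invert the pixel values.
negative = 255 - img

# Logarithmic transformation: compress the dynamic range of bright regions.
c = 255.0 / np.log1p(255.0)
log_transformed = (c * np.log1p(img.astype(np.float64))).astype(np.uint8)

# Histogram equalization: map each value through the normalized cumulative histogram.
hist = np.bincount(img.ravel(), minlength=256)
cdf = hist.cumsum() / img.size
equalized = (cdf[img] * 255).astype(np.uint8)

# Bit-plane slicing: extract the most significant bit plane.
bit_plane_7 = ((img >> 7) & 1) * 255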
