
Unit - I

INTRODUCTION TO
COMPUTER VISION
Dr Resmi K R

Mission, Vision and Core Values

Vision: Excellence and Service
Mission: Christ University is a nurturing ground for an individual's holistic development to make effective contribution to the society in a dynamic environment.
Core Values: Faith in God | Moral Uprightness | Love of Fellow Beings | Social Responsibility | Pursuit of Excellence
Syllabus

Basic Concepts of Computer Vision - Pros and Cons of Human Vision - Computer Vision and Image Processing - Different Applications of Computer Vision - Geometric Camera Models: Image Formation, Basic Image Formats, Geometric Camera Calibration - Light and Shading: Pixel Brightness, Inference from Shading, Shape from One Shaded Image - Color: Human Color Perception, Representing Color, A Model of Image Color, Inference from Color.

Introduction
● Computer Vision is a field of Artificial Intelligence that enables machines to capture
and interpret visual information from the world just like humans do.
● Computer vision aims to automate the human vision system to recognize objects,
understand scenes, and make judgments after analyzing the visual data.

Steps in Computer Vision

● Image Acquisition: acquire an image or a video by capturing it with a device such as a camera or a sensor.
● Data Preprocessing: preprocessing removes unwanted noise and enhances image quality, making the data suitable for further analysis.
● Feature Extraction: the processed data is passed to computer vision algorithms that extract features (patterns) from the image. Initial features include edges, shapes, and textures; with further iterations, the algorithm can detect more advanced features.
● High-Level Understanding: the final stage of the process. Here, the extracted features are used to infer an interpretation of the image. The output depends on the specific task performed, such as classification or detection.
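The four steps above can be sketched end to end. The snippet below is a toy illustration only, using a synthetic image and NumPy alone; a real system would acquire frames from a camera and use a vision library such as OpenCV, and the "high-level understanding" rule here is a made-up threshold, not a real classifier.

```python
import numpy as np

# 1. Image acquisition: a synthetic 8x8 grayscale "image"
#    (in practice this would come from a camera or sensor).
image = np.zeros((8, 8), dtype=float)
image[2:6, 2:6] = 1.0  # a bright square on a dark background

# 2. Preprocessing: a 3x3 mean filter to suppress noise.
padded = np.pad(image, 1, mode="edge")
smoothed = sum(padded[i:i + 8, j:j + 8] for i in range(3) for j in range(3)) / 9.0

# 3. Feature extraction: summed gradient magnitudes as crude edge features.
gx = np.abs(np.diff(smoothed, axis=1)).sum()
gy = np.abs(np.diff(smoothed, axis=0)).sum()

# 4. High-level understanding: a toy decision based on the features.
label = "object present" if gx + gy > 1.0 else "empty scene"
print(label)
```

Each stage consumes the previous stage's output, which is the essential structure of any vision pipeline regardless of how sophisticated the individual steps are.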

Steps in Digital Image processing

Tasks in Computer Vision

● Image Classification: categorizes an image into one of several predefined categories/classes. This is a supervised task.

Image Classification of Flowers

● Object Detection: identifies which objects are in an image and locates them. Instead of simply saying that an image contains a dog and a cat, object detection shows where they are located in the image. The algorithm can also detect multiple objects at the same time.

Object Detection of Animals

● Semantic Segmentation—Semantic segmentation labels each pixel in an image with a
specific class. Unlike object detection, which focuses on bounding boxes around
objects, semantic segmentation provides a detailed understanding of the scene.

Semantic Segmentation for Autonomous Driving

● Instance Segmentation—Instance segmentation is an extension of semantic
segmentation that differentiates between multiple instances of the same object class.
For example, in an image with several cars, instance segmentation would not only
label all the vehicles but also distinguish between individual cars, assigning a unique
label to each one.

Instance Segmentation of Cars

● Keypoint Detection—Keypoint detection identifies specific points of
interest within an object, such as the corners of a box or the joints in a
human body.

Keypoint Detection for Facial Recognition

Real-time example

Consider how the number '8' is stored in the form of an image.

[Figure: zoomed view of the digit '8', showing its individual pixel values]
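To make this concrete, an image of the digit is nothing more than a grid of intensity values. The 5x5 binary pattern below is hand-made and purely illustrative (1 = bright pixel, 0 = dark); a real grayscale image would store values from 0 to 255.

```python
# A 5x5 binary "image" of the digit 8; each entry is one pixel's intensity.
eight = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
]

# Printing the grid makes the stored pattern visible.
for row in eight:
    print("".join("#" if p else "." for p in row))
```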

Applications of computer vision

● Visual inspection of equipment
● Quality management
● Tumor detection
● Animal monitoring (smart farming)
● Plant disease detection
● Parking occupancy detection
● Vehicle counting
Human Vision vs Machine Vision

● Speed: the human visual system can process 10 to 12 images per second; machine vision runs at high speed, hundreds to thousands of parts per minute (PPM).
● Resolution: human vision has high image resolution; machine vision offers high resolution and magnification.
● Interpretation: humans handle complex information and are best for qualitative interpretation of an unstructured scene; machines follow their program precisely and are best for quantitative and numerical analysis of a structured scene.
● Light spectrum: human eyes are sensitive only to visible light, electromagnetic wavelengths from about 390 to 770 nanometres (nm); some machine-vision systems function at infrared (IR), ultraviolet (UV), or X-ray wavelengths. Machine vision may require additional lighting to highlight the parts being inspected, but it can record light beyond the human visible spectrum.
● Consistency, reliability & safety: human performance is impaired by boredom, distraction, and fatigue; machine vision delivers continuous, repeatable performance 24/7.
Geometric Camera Models

When we take a picture, a real 3D scene is captured by a camera as a 2D image, so a 3D-to-2D conversion takes place. Computer vision works in the opposite direction, recovering 3D information from the 2D image:
Real Scene (3D) → Real Camera (2D) → CV Output (3D)

Pinhole Camera Model
In a pinhole camera, a hole the size of a pin is made on one side of a box, with a thin sheet of paper (the screen) on the other side. Light entering the hole projects an image of the world onto the paper. The captured image is upside down, i.e., inverted.

Image Formation in the Eye

f → distance between the pinhole and the screen (in a pinhole camera); for a lens camera, the focal length

z → distance of the object from the camera

(x, y, z) → real-world 3D coordinates

(u, v) → 2D image coordinates
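With these symbols, similar triangles in the pinhole geometry give the projection equations u = f·x/z and v = f·y/z (signs dropped, i.e. using the non-inverted virtual-image-plane convention). A minimal sketch:

```python
def project(x, y, z, f):
    """Perspective projection of a 3D point (x, y, z) onto the image
    plane of a pinhole camera with focal length f (z must be nonzero)."""
    u = f * x / z
    v = f * y / z
    return u, v

# A point 4 units in front of the camera, with focal length 2:
print(project(1.0, 2.0, 4.0, 2.0))  # (0.5, 1.0)
```

Note that doubling the depth z halves the image coordinates, which is exactly why distant objects appear smaller.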

Digital Image Processing, 2nd ed. www.imageprocessingbook.com

Elements of Visual Perception

• Cornea: a transparent, curved, refractive window through which light enters the eye. It is continuous with the sclera, an opaque membrane that extends over and covers the posterior portion of the optic globe.
• Iris: part of the light entering through the cornea is blocked by the visible, colored, opaque surface of the iris.
• Pupil: the opening at the center of the iris; it controls the amount of light entering the eyeball.
• Choroid: beneath the sclera is a membrane called the choroid, which contains blood vessels that nourish the cells of the eye.
• Retina: beneath the choroid lies the retina, the innermost membrane of the eye, where incoming light is sensed by receptor cells. The retina has two types of photoreceptor cells, rods and cones: rods support dim-light (scotopic) vision, and cones support bright-light (photopic) vision.
• Fovea: the central portion of the retina at its posterior part.
Link (Elements of Visual Perception): https://youtu.be/_xKbjYBnHhc
© 2002 R. C. Gonzalez & R. E. Woods
Digital Image Processing, 2nd ed. www.imageprocessingbook.com

Brightness adaptation

• Dynamic range of the human visual system: about 10^-6 to 10^4
• The eye cannot accomplish this range simultaneously
• The current sensitivity level of the visual system is called the brightness adaptation level

Brightness Adaptation and Discrimination

Example: Mach bands

Example: Simultaneous contrast

Further examples of human perception phenomena

Image Sampling and Quantization

The output of most image sensors is an analog signal. We cannot apply digital processing to it directly, and we cannot store it, because a signal that can take infinitely many values would require infinite memory. So the analog signal has to be converted into a digital signal.
To create a digital image, we need to convert the continuous data into digital form. This is done in two steps:
Sampling
Quantization


Image Sensing and Acquisition

• Nowadays most visible and near-IR electromagnetic imaging is done with 2-dimensional charge-coupled devices (CCDs).


A Simple Image Formation Model

• Let f(x, y) be an image function; then

f(x, y) = i(x, y) r(x, y),
where i(x, y) is the illumination function
and r(x, y) is the reflectance function.
Note: 0 < i(x, y) < ∞ and 0 < r(x, y) < 1.

• For digital images the minimum gray level is usually 0, but the maximum depends on the number of quantization levels used to digitize the image. The most common choice is 256 levels, so the maximum level is 255.
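The model f(x, y) = i(x, y) r(x, y) can be evaluated numerically and the result mapped onto the usual 256 gray levels. The illumination and reflectance values below are made up purely for illustration:

```python
# Illumination (unbounded above) and reflectance (between 0 and 1)
# at three sample pixels -- values chosen for illustration only.
illumination = [100.0, 500.0, 900.0]
reflectance = [0.10, 0.50, 0.90]

# f(x, y) = i(x, y) * r(x, y) at each pixel.
f = [i * r for i, r in zip(illumination, reflectance)]

# Map to 256 discrete gray levels (0..255), relative to the maximum.
fmax = max(f)
gray = [round(255 * v / fmax) for v in f]
print(gray)
```

The brightest pixel lands on level 255 and the rest scale down proportionally, which is one simple way to honor the "maximum level is 255" convention.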


Image Sampling and Quantization

• Sampling: digitizing the 2-dimensional spatial coordinate values


• Quantization: digitizing the amplitude values (brightness level)
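The two steps can be demonstrated on a 1-D signal: sampling picks values at discrete positions, and quantization snaps each amplitude to one of a fixed number of levels. The sine signal and the choice of 8 samples and 5 levels below are arbitrary, for illustration only:

```python
import math

def sample_and_quantize(signal, n_samples, levels, duration=1.0):
    """Sample a continuous signal (a function of time) at n_samples
    evenly spaced points, then quantize the amplitudes (assumed to lie
    in [0, 1]) to `levels` discrete values."""
    samples = [signal(duration * k / n_samples) for k in range(n_samples)]
    step = 1.0 / (levels - 1)                      # spacing between levels
    return [round(s / step) * step for s in samples]

# A continuous "signal" with amplitude in [0, 1]:
sig = lambda t: 0.5 + 0.5 * math.sin(2 * math.pi * t)
print(sample_and_quantize(sig, 8, 5))
```

For a 2-D image the same idea applies twice over: sampling fixes the pixel grid (spatial coordinates), and quantization fixes the gray levels.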


Result after Sampling and Quantization

The result of sampling and quantization is a matrix of real numbers. f(x, y) is sampled so that the resulting digital image has M rows and N columns, and the values of the coordinates (x, y) become discrete quantities.

            ⎡ f(0, 0)      f(0, 1)      ...  f(0, N-1)   ⎤
            ⎢ f(1, 0)      f(1, 1)      ...  f(1, N-1)   ⎥
f(x, y) =   ⎢ ...          ...          ...  ...         ⎥
            ⎣ f(M-1, 0)    f(M-1, 1)    ...  f(M-1, N-1) ⎦


Difference between Sampling and Quantization

Sampling digitizes the spatial coordinates (x, y) of the image, whereas quantization digitizes its amplitude (gray-level) values.
Basic relationships between pixels

● Neighborhood and Connectivity
Consider a pixel p at coordinates (x, y); its relationships with the surrounding pixels define neighborhoods and connectivity.
● Neighbors of a Pixel
A pixel p at (x, y) has four horizontal and vertical neighbors, denoted N4(p): (x+1, y), (x−1, y), (x, y+1), (x, y−1). Its four diagonal neighbors, (x+1, y+1), (x+1, y−1), (x−1, y+1), (x−1, y−1), are denoted ND(p); together, N4(p) and ND(p) form the 8-neighbors of p, N8(p).
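A small helper makes the neighborhoods concrete, assuming the standard textbook definitions (4-neighbors are the horizontal/vertical neighbors; 8-neighbors add the four diagonals) and clipping at the image border:

```python
def neighbors(x, y, rows, cols):
    """Return the 4-neighbors and 8-neighbors of pixel (x, y)
    that lie inside a rows x cols image."""
    n4_offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    nd_offsets = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # diagonal neighbors
    inside = lambda a, b: 0 <= a < rows and 0 <= b < cols
    n4 = [(x + dx, y + dy) for dx, dy in n4_offsets if inside(x + dx, y + dy)]
    n8 = n4 + [(x + dx, y + dy) for dx, dy in nd_offsets if inside(x + dx, y + dy)]
    return n4, n8

n4, n8 = neighbors(1, 1, 3, 3)  # center pixel of a 3x3 image
print(len(n4), len(n8))         # 4 8
```

Border pixels have fewer neighbors: a corner pixel of the same 3x3 image has only two 4-neighbors and three 8-neighbors.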

32
33
Distance Measures

● Euclidean distance: measures the straight-line distance between two pixels p = (x, y) and q = (s, t): De(p, q) = √((x − s)² + (y − t)²)
● L1 distance (city-block or Manhattan distance): does not go in straight lines but along grid blocks: D4(p, q) = |x − s| + |y − t|
● Chebyshev distance (chessboard distance): D8(p, q) = max(|x − s|, |y − t|)
The most intuitive picture of the Chebyshev distance is the movement of the king on a chessboard: it can move one step in any direction (up, down, left, right, and the diagonals), so D8 is the number of king moves between the two squares.
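The three measures are easy to compare side by side; for p = (0, 0) and q = (3, 4) they give 5, 7, and 4 respectively:

```python
import math

def euclidean(p, q):
    """Straight-line distance between pixels p and q."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cityblock(p, q):
    """L1 / Manhattan distance: sum of coordinate differences."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def chessboard(p, q):
    """Chebyshev distance: largest coordinate difference (king moves)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(euclidean(p, q), cityblock(p, q), chessboard(p, q))  # 5.0 7 4
```

Note the ordering D8 ≤ De ≤ D4 that always holds between the three measures.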
IMAGE FILE FORMATS

Common raster image file formats include JPEG (lossy compression, well suited to photographs), PNG (lossless compression, supports transparency), BMP (uncompressed bitmap), TIFF (flexible, common in scanning and publishing), and GIF (limited to a 256-colour palette, supports simple animation).

Color Models

● A color model is a mathematical representation of colors as sets of values or


coordinates. These models provide a systematic way to specify colors in digital
images, enabling efficient processing, storage, and communication of color
information.
● Different color models are:
1. RGB (Red, Green, Blue)
2. CMY/CMYK (Cyan, Magenta, Yellow, [Black])
3. HSV (Hue, Saturation, Value)
4. HSL (Hue, Saturation, Lightness)
5. YUV / YCbCr
6. LAB (CIELAB)

RGB (Red, Green, Blue) Model
The RGB model is the most widely used color model in digital imaging. Colors are created by combining
red, green, and blue light in varying intensities.
Applications:
Primarily used in monitors, televisions, and cameras.

CMY and CMYK (Cyan, Magenta, Yellow, Black) Model
● The CMY model is a subtractive color model used in color printing. It works by subtracting varying amounts of cyan, magenta, and yellow from white light. CMYK adds a black (K) component, since dark tones printed by mixing the three colored inks alone come out muddy and waste ink.

Applications: widely used in printing processes to achieve a broad range of colors.
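With RGB channels normalized to [0, 1], the subtractive relationship is simply C = 1 − R, M = 1 − G, Y = 1 − B; the common CMYK convention then pulls the shared black component out as K. A minimal sketch (the normalization to [0, 1] is an assumption for illustration):

```python
def rgb_to_cmyk(r, g, b):
    """Convert normalized RGB (each channel in [0, 1]) to CMYK."""
    c, m, y = 1 - r, 1 - g, 1 - b  # subtractive primaries
    k = min(c, m, y)               # shared black component
    if k == 1.0:                   # pure black: avoid division by zero
        return 0.0, 0.0, 0.0, 1.0
    # Remove the black component from each channel.
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k

print(rgb_to_cmyk(1.0, 0.0, 0.0))  # pure red -> (0.0, 1.0, 1.0, 0.0)
```

Pure red needs no cyan ink at all (cyan absorbs red), which the first output channel confirms.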

HSV (Hue, Saturation, Value) Model

● Description: The HSV model represents colors in a way that aligns with human perception. It separates the color itself (hue) from its purity (saturation) and its brightness (value).
● Applications: useful for image enhancement, object recognition, and color-based segmentation.
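Python's standard library exposes the RGB-to-HSV conversion directly, which makes the separation of hue, saturation, and value easy to check, e.g. for pure red:

```python
import colorsys

# RGB channels normalized to [0, 1]; returns (hue, saturation, value),
# each also in [0, 1] (hue 0.0 = red, 1/3 = green, 2/3 = blue).
h, s, v = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # pure red
print(h, s, v)  # 0.0 1.0 1.0
```

Pure red sits at hue 0 with full saturation and full value; darkening the color changes only v, which is exactly why HSV is convenient for color-based segmentation.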

● YUV and YCbCr Models
Description: These models are used in video compression and broadcasting. Y represents luminance, while U and V (Cb and Cr in YCbCr) represent chrominance.
Applications: widely employed in image and video compression standards such as JPEG and MPEG.
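The luminance component is a weighted sum of the RGB channels. The sketch below uses the standard full-range BT.601 coefficients adopted by JPEG's YCbCr (these specific constants are not given in the slides; they are the usual JFIF convention):

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr (inputs and outputs in 0..255)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b             # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b  # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b  # red-difference chroma
    return y, cb, cr

y, cb, cr = rgb_to_ycbcr(255, 255, 255)  # white
print(round(y), round(cb), round(cr))    # 255 128 128
```

Neutral grays map to Cb = Cr = 128, so all the picture's "color" lives in the two chroma channels, which compression schemes can then subsample aggressively.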

● CIELAB Model

Description: The CIELAB model is a perceptually uniform color model developed by the International
Commission on Illumination (CIE). It represents colors in a way that closely matches human vision.

Applications: Useful in color difference measurements, image analysis, and ensuring consistent color
reproduction across different devices.

Additional Video Links

● Geometric Camera Calibration:
https://www.youtube.com/watch?v=e7Dz4JjAonY
● Human vision:
https://www.youtube.com/watch?v=_xKbjYBnHhc
