1 Intro To CV
COMPUTER VISION, OFTEN ABBREVIATED
AS CV, IS DEFINED AS A FIELD OF STUDY
THAT SEEKS TO DEVELOP TECHNIQUES TO
HELP COMPUTERS "SEE" AND UNDERSTAND
THE CONTENT OF DIGITAL IMAGES SUCH
AS PHOTOGRAPHS AND VIDEOS.
INTRODUCTION
Computer Vision is a field of artificial intelligence (AI) that
enables computers and systems to derive meaningful information
from digital images, videos, and other visual inputs—and take
actions or make recommendations based on that information.
If AI enables computers to think, computer vision enables them
to see, observe, and understand.
Digital image processing (DIP) focuses on enhancing, restoring, compressing, or
transforming images using mathematical operations, while CV
aims to extract meaningful information, recognize objects, or
perform actions based on images using artificial intelligence.
APPLICATIONS OF
COMPUTER VISION
•Image Recognition: Identifying objects, places, people, writing, and actions in images or videos. It is
used in applications such as photo tagging on social media, diagnostics in healthcare, and autonomous
vehicles.
•Face Detection and Recognition: Identifying and verifying human faces in images and videos. This is
widely used in security systems and user identification processes.
•Object Detection: Locating objects within an image and identifying each object. This is crucial for
applications like self-driving cars, where the system must recognize objects on the road.
•Image Segmentation: Dividing an image into parts or segments to simplify its analysis. It's used in
medical imaging to distinguish different tissues and organs.
•Optical Character Recognition (OCR): Converting different types of documents, such as scanned
paper documents or PDFs, into editable and searchable data.
•Augmented Reality (AR): Overlaying digital content on the real world. Examples include Snapchat
filters and apps like Pokemon Go.
HOW DOES COMPUTER
VISION WORK
Computer Vision relies on pattern recognition. Machines are trained to
recognize patterns through:
1.Data Collection: Gathering a large dataset of images or videos with
labeled objects.
2.Preprocessing: Preparing the data for analysis, such as resizing images,
normalizing pixel values, and augmenting the dataset with transformations
like rotations and flips.
3.Feature Extraction: Identifying the important features or patterns
within the images. Traditional methods include techniques like edge
detection, texture analysis, and histogram of oriented gradients (HOG).
4.Model Training: Using machine learning algorithms, particularly deep
learning, to create models that can recognize patterns and objects in
images. Convolutional Neural Networks (CNNs) are particularly effective for
image-related tasks.
5.Inference: Applying the trained model to new images to recognize
patterns, detect objects, and classify images.
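The preprocessing step above (resizing, normalizing pixel values, augmenting with flips and rotations) can be sketched in a few lines. This is a minimal illustration in plain Python, assuming a grayscale image stored as a list of rows; the function names are hypothetical, and a real pipeline would normally use an image library instead.

```python
def resize_nearest(img, new_h, new_w):
    """Resize a grayscale image (list of rows) with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

def normalize(img):
    """Map 8-bit pixel values (0..255) to floats in the range 0.0..1.0."""
    return [[p / 255 for p in row] for row in img]

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def augment(img):
    """Return the original image plus two transformed copies."""
    return [img, flip_horizontal(img), rotate_90(img)]

# Tiny 2x2 example image (hypothetical data).
image = [[0, 64],
         [128, 255]]
resized = resize_nearest(image, 4, 4)
batch = [normalize(variant) for variant in augment(resized)]
```

Each augmented copy keeps the label of the original image, which is why augmentation enlarges the training set without any new annotation work.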
COMPUTER VISION AND IMAGE PROCESSING
Computer vision is distinct from image processing.
Image processing is the process of creating a new image
from an existing image, typically simplifying or enhancing the
content in some way. It is a type of digital signal processing
and is not concerned with understanding the content of an
image.
A given computer vision system may require image
processing to be applied to raw input, e.g. pre-processing
images.
Examples of image processing include:
•Normalizing photometric properties of the image, such as
brightness or color.
•Cropping the bounds of the image, such as centering an
object in a photograph.
•Removing digital noise from an image, such as digital
artifacts from low light levels.
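The three examples above each take an image and return a new image. The sketch below shows one simple way to do each, assuming a grayscale image stored as a list of rows. The function names are hypothetical, and the specific techniques (linear contrast stretch, 3x3 median filter) are common choices picked for illustration, not methods named on the slide.

```python
def stretch_brightness(img):
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]

def crop(img, top, left, height, width):
    """Return the sub-image of the given size starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def median_filter_3x3(img):
    """Replace each pixel by the median of its 3x3 neighbourhood.

    Border pixels use only the neighbours that exist.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(img[j][i]
                            for j in range(max(0, y - 1), min(h, y + 2))
                            for i in range(max(0, x - 1), min(w, x + 2)))
            row.append(window[len(window) // 2])
        out.append(row)
    return out

# A flat dark patch with one bright noise pixel in the middle.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
clean = median_filter_3x3(noisy)
```

None of these functions knows what the image shows; they only transform pixel values, which is the sense in which image processing is not concerned with understanding content.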
HOW DOES COMPUTER VISION
WORK?
Let’s leave our fluffy cat friends for a moment on the side and let’s get more technical🤔😹.
Below is a simple illustration of the grayscale image buffer which stores our image of
Abraham Lincoln. Each pixel’s brightness is represented by a single 8-bit number, whose
range is from 0 (black) to 255 (white):
Black 0 00000000
White 255 11111111
In fact, pixel values are almost universally stored, at the hardware
level, in a one-dimensional array. For example, the data from the image
above is stored as one long list of unsigned chars, typically one row after another.
This way of storing image data may run counter to your expectations, since the data
certainly appears to be two-dimensional when it is displayed. Yet this is the case, since
computer memory consists simply of a linear list of ever-increasing addresses.
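To make the one-dimensional layout concrete, here is a small sketch that assumes row-major order (each row stored directly after the previous one), which is the usual convention. The buffer contents and the helper names are made up for illustration.

```python
# A 4x3 grayscale image stored as one flat sequence of 8-bit values.
width, height = 4, 3
buffer = bytes([0, 50, 100, 150,    # row 0
                10, 60, 110, 160,   # row 1
                20, 70, 120, 170])  # row 2

def pixel_at(buf, width, x, y):
    """Look up the pixel in column x, row y of a row-major buffer."""
    return buf[y * width + x]

def to_rows(buf, width):
    """Rebuild the two-dimensional view that we see on screen."""
    return [list(buf[i:i + width]) for i in range(0, len(buf), width)]

value = pixel_at(buffer, width, x=2, y=1)
rows = to_rows(buffer, width)
```

The index formula `y * width + x` is all that is needed to move between the flat storage and the two-dimensional picture.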
THE EVOLUTION OF
COMPUTER VISION
Before the advent of deep learning, the tasks that computer vision could
perform were very limited and required a lot of manual coding and effort
by developers and human operators. For instance, if you wanted to
perform facial recognition, you would have to perform the following steps:
1.Create a database
2.Annotate images
3.Capture new images
4.Feature Selection
5.Feature Reduction
6.Correction of error margins
THE EVOLUTION OF COMPUTER VISION: TO
DEEP LEARNING
COMPUTER VISION APPLIES
HERE TOO
The increasing sophistication of artificial
neural networks (ANNs), coupled with the availability of
AI-powered chips, has driven unparalleled enterprise
interest in computer vision (CV). This technology
is expected to find applications in many
industries and, according to GlobalData forecasts, will
reach a market size of $28bn by 2030. The increasing
adoption of AI-powered computer vision solutions and
consumer drones, together with rising Industry 4.0 adoption, will
drive this growth. Here are the top computer
vision trends behind the growth of computer
vision for modern enterprises:
Deep learning
3D holographic imaging
Thermal Imaging
Liquid Lenses
Drones
COMPUTER VISION FOR AUTONOMOUS ROBOTS
LANE DETECTION
GAMES
FACE-TAGGING
The difference between computer vision and image processing in
Computer vision aims to gain a high-level understanding from images or
videos.
For instance, object recognition, which is the process of identifying the
type of objects in an image, is a computer vision problem. In computer
vision, you receive an image as input, and you can produce an image as
output or some other type of information. Image processing, by contrast,
doesn't need such a high-level understanding of the image. It is a
sub-field of signal processing applied to images. For example, if
you have noisy or blurred images, image processing
deblurs or denoises them to make the objects in the image clearly
visible to machines.
Image processing tasks include filtering, noise removal, edge detection, and
color processing. In each case, you receive an image as input and
produce another image as output, which can then be used to train a machine
through computer vision.
The main difference between computer vision and image processing is
the goal (not the methods used). For example, if the goal is to enhance
the image quality for later use, it is called image processing. If the
goal is to interpret images as humans do, as in object recognition, defect detection or
automatic driving, it is called computer vision.
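The distinction can be seen in the types of the outputs. In the toy sketch below, `threshold` is an image processing step (image in, image out), while `count_objects` produces information about the image (image in, number out). Both functions are hypothetical illustrations; counting connected blobs stands in for what a trained model would do in a real vision system.

```python
def threshold(img, t):
    """Image processing: image in, image out (1 where pixel >= t, else 0)."""
    return [[1 if p >= t else 0 for p in row] for row in img]

def count_objects(binary):
    """Toy vision step: image in, information out.

    Counts groups of 1-pixels that touch horizontally or vertically.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                stack = [(y, x)]
                while stack:  # flood fill over the current blob
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count

image = [[200, 200, 0, 0],
         [200, 0, 0, 180],
         [0, 0, 0, 180]]
mask = threshold(image, 128)      # still an image
n_objects = count_objects(mask)   # a statement about the image
```

The same thresholding code could serve either goal; what makes the second step "vision" is that its result describes the scene rather than being another picture.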
WHAT IS COMPUTER VISION IN
AI AND MACHINE LEARNING?
Computer vision is the process of perceiving and interpreting images and
videos available in digital formats. In machine learning (ML)
and AI, computer vision is used to train a model to recognize
certain patterns and to store what it has learned so that it can
predict results on new, real-life inputs.
The main purpose of using computer vision technology in ML and AI
is to create a model that can work by itself without human
intervention. The whole process involves methods of acquiring,
processing, analyzing, and understanding digital images
so that the results can be used in real-world scenarios.
Let’s clear things up: artificial intelligence
(AI), machine learning (ML), and deep
learning (DL) are three different things.
•Artificial intelligence is a science like
mathematics or biology. It studies ways to
build intelligent programs and machines that
can creatively solve problems, which has
always been considered a human
prerogative.
•Machine learning is a subset of artificial
intelligence (AI) that provides systems the
ability to automatically learn and improve
from experience without being explicitly
programmed. In ML, there are different
algorithms (e.g. neural networks) that help to
solve problems.
•Deep learning, or deep neural learning,
is a subset of machine learning that uses
neural networks to analyze different
factors with a structure that is similar to the
IMAGE DIGITIZATION
BLACK AND WHITE IMAGE: An image that consists of only black and white
pixels is called a black and white (binary) image.
8 bit COLOR FORMAT: This is one of the most widely used image formats. It has 256 different
shades and is commonly known as a grayscale image. In this format, 0
stands for black, 255 stands for white, and 127 stands for mid-gray.
16 bit COLOR FORMAT: This is a color image format. It has 65,536 different colors
and is also known as High Color format. In this format the bits are not
distributed as they are in a grayscale image:
a 16 bit pixel is divided into three channels, red, green
and blue, which together form the familiar RGB format.
Pixel = f(r, g, b)
Green = f(0, 255, 0), with 8 bits per channel
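As a rough sketch of how 16 bits can be shared between three channels, the code below assumes the common 5-6-5 layout (5 bits red, 6 bits green, 5 bits blue). The slide does not say which split it has in mind, so this layout and the function names are assumptions.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit r, g, b values into one 16-bit pixel (5-6-5 layout)."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def unpack_rgb565(value):
    """Split a 16-bit pixel back into approximate 8-bit r, g, b values."""
    r5 = (value >> 11) & 0x1F   # top 5 bits
    g6 = (value >> 5) & 0x3F    # middle 6 bits
    b5 = value & 0x1F           # bottom 5 bits
    # Scale each channel back to the 0..255 range.
    return (r5 * 255 // 31, g6 * 255 // 63, b5 * 255 // 31)

green = pack_rgb565(0, 255, 0)
white = pack_rgb565(255, 255, 255)
total_colors = 2 ** 16
```

Because each channel keeps only 5 or 6 of its 8 bits, packing and unpacking is lossy for most colors; pure green, white and black survive the round trip exactly.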
[Figure: example grids of pixel values.]
Sampling: The sampling rate determines
the spatial resolution of the digitized
image.
[Figure: the same image sampled at resolutions of 1024, 512, 256, 128, 64 and 32.]
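A minimal way to see the effect of a lower sampling rate is to keep only every k-th pixel in each direction. The sketch below assumes a grayscale image stored as a list of rows; `downsample` is a hypothetical helper, and it skips the smoothing step that a careful resampler would apply first.

```python
def downsample(img, factor):
    """Keep every `factor`-th pixel along both axes."""
    return [row[::factor] for row in img[::factor]]

# An 8x8 test image whose pixel values encode their own position.
image = [[x + 8 * y for x in range(8)] for y in range(8)]
half = downsample(image, 2)      # 4x4
quarter = downsample(image, 4)   # 2x2
```

Halving the sampling rate in each direction leaves a quarter of the pixels, which is why the coarser versions in the figure lose fine detail so quickly.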
Quantization: The quantization level determines the number of grey
levels in the digitized image. The magnitude of each sample is
expressed as a digital value. The transition
between the continuous values of the image function and their digital
equivalent is called quantization.
Rounding of grey levels: 1-bit to 16-bit rounding
[Figure: the same image quantized to 8, 7, 6 and 5 bits.]
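Reducing the number of grey levels can be sketched by dropping low-order bits. This assumes an 8-bit grayscale input and a target depth between 1 and 8 bits; `quantize` is a hypothetical helper, and it truncates rather than rounds to the nearest level.

```python
def quantize(img, bits):
    """Reduce an 8-bit grayscale image to 2**bits grey levels.

    The low-order (8 - bits) bits of every pixel are set to zero.
    """
    shift = 8 - bits
    return [[(p >> shift) << shift for p in row] for row in img]

row = [[0, 100, 200, 255]]
five_bit = quantize(row, 5)   # 32 possible levels
one_bit = quantize(row, 1)    # 2 possible levels
```

With fewer bits, nearby intensities collapse onto the same level, which shows up in images as visible banding in smooth regions.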
COLOR IMAGES
$$ f(x, y) = \begin{bmatrix} r(x, y) \\ g(x, y) \\ b(x, y) \end{bmatrix} $$
COLOR SENSING IN CAMERA:
PRISM
Requires three chips and precise alignment.
[Figure: a prism splits the incoming light onto three sensors: CCD(R), CCD(G) and CCD(B).]
COLOR SENSING IN CAMERA:
COLOR FILTER ARRAY
• In traditional systems, color filters are applied to a single
layer of photodetectors in a tiled mosaic pattern.
Bayer grid
Why more green?
demosaicing (interpolation)
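The mosaic idea can be sketched as follows, assuming the common RGGB arrangement of the Bayer grid (the slide does not specify the exact layout). Each sensor site records only one color, and green sites are twice as frequent as red or blue ones; the usual explanation is that human vision is most sensitive to green. The function names are hypothetical.

```python
from collections import Counter

def bayer_channel(x, y):
    """Color recorded at site (x, y) in an RGGB Bayer grid (assumed layout)."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"

def mosaic(rgb_img):
    """Keep, for every pixel, only the one sample its sensor site records."""
    index = {"R": 0, "G": 1, "B": 2}
    return [[pixel[index[bayer_channel(x, y)]] for x, pixel in enumerate(row)]
            for y, row in enumerate(rgb_img)]

# How often each color is sampled in a 4x4 patch of the grid.
counts = Counter(bayer_channel(x, y) for y in range(4) for x in range(4))

# A flat 2x2 color patch: every pixel is (r, g, b) = (10, 20, 30).
patch = [[(10, 20, 30), (10, 20, 30)],
         [(10, 20, 30), (10, 20, 30)]]
raw = mosaic(patch)
```

Demosaicing is the reverse step: the two missing color values at each site are estimated by interpolating from neighbouring sites of the right color.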
COLOR SENSING IN CAMERA:
FOVEON X3
• CMOS sensor; takes advantage of the fact that red, blue
and green light penetrate silicon to different depths.
http://www.foveon.com/article.php?a=67
ALTERNATIVE COLOR SPACES
Various other color representations can be computed from RGB.
This can be done for:
•Decorrelating the color channels: principal components.
•Bringing color information to the fore: hue, saturation and brightness.
•Perceptual uniformity: CIELuv, CIELab, …
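As one example of computing another representation from RGB, the sketch below converts 8-bit RGB values to hue, saturation and value using Python's standard `colorsys` module. The wrapper `rgb8_to_hsv` and its output units (degrees and percentages) are choices made here for readability, not part of the slide.

```python
import colorsys

def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return (round(h * 360), round(s * 100), round(v * 100))

red = rgb8_to_hsv(255, 0, 0)
green = rgb8_to_hsv(0, 255, 0)
gray = rgb8_to_hsv(128, 128, 128)
```

In this representation the color itself sits in a single number (hue), separate from how vivid or how bright it is, which is what "bringing color information to the fore" refers to.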
COLOR TRANSFORMATION -
EXAMPLES
SKIN COLOR
[Figure: skin colors plotted in RGB space and in rg space, with axes r and g.]
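The "rg" label most likely refers to normalized rg chromaticity, where each channel is divided by the sum of all three; that reading is an assumption, since the slide only shows the figure. Under that assumption the transformation can be sketched as below, with `rg_chromaticity` as a hypothetical helper name.

```python
def rg_chromaticity(r, g, b):
    """Map an RGB pixel to normalized (r, g) chromaticity coordinates.

    r = R / (R + G + B), g = G / (R + G + B).
    Black has no defined chromaticity; (0.0, 0.0) is an arbitrary choice here.
    """
    total = r + g + b
    if total == 0:
        return (0.0, 0.0)
    return (r / total, g / total)

bright = rg_chromaticity(200, 100, 100)
dim = rg_chromaticity(100, 50, 50)   # same color, half the intensity
```

Both pixels map to the same point, so overall brightness drops out of the representation. That property is what makes such a space attractive for grouping pixels of similar color, for example skin tones under different lighting.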
SKIN DETECTION
Or the domain of f:
KEY STAGES IN DIGITAL IMAGE
PROCESSING
[Figure: block diagram of the key stages, starting from the problem domain; each slide highlights one stage in turn. Images taken from Gonzalez & Woods, Digital Image Processing (2002).]
•Image Acquisition
•Image Enhancement
•Image Restoration
•Morphological Processing
•Segmentation
•Object Recognition
•Representation & Description
•Image Compression
•Colour Image Processing
Thank you