Unit-5 Computer Vision
Facial recognition
Facial recognition is most frequently encountered on smartphones. It is a technology that recognises and verifies a
person, object, etc. by comparing visuals against pre-defined data. It is often used for security and safety
purposes.
For example, face unlock on devices and traffic cameras both use facial recognition.
Facial filters
A face filter is a computer-generated effect that applies predesigned edits or changes to a loaded image. Modern
social media apps like Snapchat and Instagram use this technology: they extract facial landmarks and process
them using AI to get the best result.
Google Lens
To search by image, Google uses computer vision to capture and analyse different features of the input image, compares
them against its database of images, and then returns the search results.
Medical Imaging
For the last few decades, computer vision applications in medical imaging have been a trustworthy aid for physicians
and doctors. They create and analyse images and help doctors with their interpretation.
Such applications are also used to read and convert 2D scan images into interactive 3D models.
Single object
This means the image given as input to the Computer Vision application contains a single object. It is divided into two
categories:
1. Image Classification
Image Classification is the task of identifying the object in the input image and assigning it a label from a predefined
set of categories.
2. Classification + Localization
As the name suggests, the task identifies the object and locates it in the input image.
Multiple objects
This means the image given as input to the Computer Vision application contains multiple objects. It is divided into two
categories:
1. Object detection
Object detection tasks extract features from the input and use learning algorithms to recognize instances of an object
category.
2. Instance segmentation
Instance segmentation assigns a label to each pixel of the image, so that every individual object gets its own region. It
is used for tasks such as counting the number of objects.
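The counting use case can be sketched in a few lines. This is only an illustration: the mask below is hypothetical, and it assumes a segmentation step has already labelled every pixel with 0 for background or a positive instance number for the object it belongs to.

```python
# Hypothetical per-pixel instance mask (not produced by a real model):
# 0 = background, 1, 2, 3 = three separate object instances.
mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
    [3, 3, 0, 0, 0],
]

def count_objects(mask):
    """Count the distinct non-background labels in the mask."""
    instance_ids = {label for row in mask for label in row if label != 0}
    return len(instance_ids)

print(count_objects(mask))  # 3
```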
Basics of Images
The word “pixel” means a picture element.
Pixels
• Pixels are the fundamental element of a photograph.
• They are the smallest unit of information that makes up a picture.
• They are typically arranged in a 2-dimensional grid.
• In general terms, the more pixels you have, the more closely the image resembles the original.
Resolution
• The number of pixels in an image is called its resolution.
• Resolution is conventionally expressed as the number of pixels along the width and the height of the image.
• For example, a resolution of 1080 x 720 means the picture is 1080 pixels wide and 720 pixels high.
• A megapixel is a million pixels.
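As a quick worked example of these definitions, the sketch below computes the total pixel count and the megapixel figure for the 1080 x 720 resolution mentioned above. The variable names are illustrative only.

```python
# Resolution from the example above: 1080 pixels wide, 720 pixels high.
width, height = 1080, 720

total_pixels = width * height          # number of pixels in the grid
megapixels = total_pixels / 1_000_000  # one megapixel = one million pixels

print(total_pixels)  # 777600
print(megapixels)    # 0.7776
```

So an image at this resolution holds a little under 0.8 megapixels.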
Pixel value
• A pixel value represents the brightness of the pixel.
• The range of a pixel value is 0 to 255 (2^8 - 1), because each value is stored in 8 bits,
• where 0 is taken as black (no colour) and 255 is taken as white.
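The range can be checked with a short sketch, assuming an 8-bit image. The helper `clamp_pixel` is a hypothetical name introduced here to show how an out-of-range brightness is forced back into the valid range.

```python
BITS_PER_PIXEL = 8                    # assumption: 8 bits per pixel value
MAX_VALUE = 2 ** BITS_PER_PIXEL - 1   # largest value 8 bits can hold

def clamp_pixel(value):
    """Keep a brightness value inside the valid 0..MAX_VALUE range."""
    return max(0, min(MAX_VALUE, int(value)))

print(MAX_VALUE)         # 255
print(clamp_pixel(300))  # 255 (brighter than white is stored as white)
print(clamp_pixel(-20))  # 0   (darker than black is stored as black)
```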
Greyscale Images
• Greyscale images are images which have a range of shades of grey without apparent colour. The lightest shade is
white (total presence of colour, 255) and the darkest shade is black (0).
• Intermediate shades of grey have equal brightness levels of the three primary colours, RGB.
• Computers store the images we see in the form of these numbers.
RGB colours
• All the coloured images are made up of three primary colours Red, Green and Blue.
• All the other colours are formed by mixing these primary colours in different proportions.
• A computer stores RGB images in three different channels called the R channel, the G channel and the B channel.
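The idea of three channels can be illustrated with a tiny example. The 2 x 2 image and the function name `split_channels` below are made up for this sketch; each pixel is written as an (R, G, B) tuple of values from 0 to 255.

```python
# A hypothetical 2 x 2 colour image: red, green, blue and a mid-grey pixel.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]

def split_channels(img):
    """Separate an image of (R, G, B) tuples into three 2D channels."""
    r = [[px[0] for px in row] for row in img]
    g = [[px[1] for px in row] for row in img]
    b = [[px[2] for px in row] for row in img]
    return r, g, b

r_channel, g_channel, b_channel = split_channels(image)
print(r_channel)  # [[255, 0], [0, 128]]
print(g_channel)  # [[0, 255], [0, 128]]
print(b_channel)  # [[0, 0], [255, 128]]
```

Note that the grey pixel has the same value (128) in all three channels, which matches the description of grey shades as equal levels of red, green and blue.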
Image Features
• A feature is a description of an image.
• Features are the specific structures in the image such as points, edges or objects.
• Other examples of features relate to motion in image sequences, or to shapes defined in terms of curves or
boundaries between different image regions.
OpenCV (Open Source Computer Vision Library) is a tool that helps a computer extract these features from
images. It is capable of processing images and videos to identify objects, faces, or even handwriting.
Convolution
Convolutions are one of the most critical, fundamental building blocks in computer vision and image processing.
We learned that computers store images as numbers and that pixels are arranged in a particular manner to create the
picture we can recognize. As we change the values of these pixels, the image changes.
Image convolution is simply an element-wise multiplication of two matrices followed by a sum. Convolution uses
a ‘kernel’ to extract certain ‘features’ from an input image.
Kernel: A kernel is a small matrix used for blurring, sharpening, and many other effects. It is slid across the image
and multiplied with the input so that the output is enhanced in a certain desirable manner.
Q: Mention the steps of convolution.
Ans: The steps of convolution are:
1. Take two matrices of the same dimensions (a region of the input image and the kernel).
2. Multiply them, element-by-element (i.e., not the dot-product, just a simple multiplication).
3. Sum the elements together.
4. The sum becomes the value of the centre pixel of that region in the output image.
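The steps above can be sketched in plain Python. This is a minimal illustration, not a library implementation: it assumes a square kernel of odd size, does not flip the kernel, and only computes output values where the kernel fits entirely inside the image, so the output is smaller than the input. The image values and the sharpening kernel are example data chosen for this sketch.

```python
def convolve(image, kernel):
    """Slide the kernel over the image; multiply element-wise and sum."""
    k = len(kernel)  # assumption: square kernel with odd size
    rows, cols = len(image), len(image[0])
    output = []
    for i in range(rows - k + 1):
        out_row = []
        for j in range(cols - k + 1):
            total = 0
            for a in range(k):
                for b in range(k):
                    # Steps 2 and 3: element-wise multiply, then accumulate.
                    total += image[i + a][j + b] * kernel[a][b]
            # Step 4: the sum is the output value for this region's centre.
            out_row.append(total)
        output.append(out_row)
    return output

# Example data: a bright 2 x 2 square on a dark background.
image = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
# A commonly used 3 x 3 sharpening kernel.
sharpen = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]
print(convolve(image, sharpen))  # [[130, 130], [130, 130]]
```

Each bright pixel (50) becomes 130 because the kernel boosts the centre value and subtracts its four neighbours, which exaggerates the contrast with the dark border.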
Convolution Layer
The first convolution layer is responsible for capturing low-level features such as edges, colour, gradient
orientation, etc. A convolution layer contains several kernels, which process the image to produce
several features. The output of this layer is called the feature map.
For example, think of a child: we teach the child the landmarks of an object in one image, and when the child finds
similar landmarks in another image, the child identifies the object. The same is true of AI: convolution picks out the
landmarks from the input for further processing.
Pooling Layer
The pooling layer is responsible for reducing the spatial size of the convolved feature while still retaining the important
features. This makes the result more resistant to small transformations, distortions, and translations in the input image.
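One common way to pool is max pooling, which keeps only the largest value in each small window. The notes do not say which kind of pooling is meant, so the sketch below is an assumption: it uses non-overlapping 2 x 2 windows, and the feature map values are made up for illustration.

```python
def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, rows - size + 1, size):
        pooled_row = []
        for j in range(0, cols - size + 1, size):
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled

# Hypothetical 4 x 4 feature map produced by a convolution layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
print(max_pool(feature_map))  # [[6, 2], [2, 9]]
```

The 4 x 4 map shrinks to 2 x 2, yet the strongest responses (6 and 9) survive. Moving a strong value by one pixel inside its window would not change the output, which is the resistance to small shifts described above.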
*********************