
COMPUTER VISION/ AL701 NOTES

UNIT-1
Unit I: Introduction to computer vision, Introduction to images, Image Processing VS Computer Vision,
Problems in Computer Vision, Basic image operations, Mathematical operations on images: Datatype
Conversion, Contrast Enhancement, Brightness Enhancement, Bitwise operations: Different Bitwise
Operations

What is computer vision?


Computer vision is a field of artificial intelligence (AI) that uses machine learning and neural
networks to teach computers and systems to derive meaningful information from digital images,
videos and other visual inputs—and to make recommendations or take actions when they see
defects or issues.
How does computer vision work?
Computer vision needs lots of data. It runs analyses of that data over and over until it discerns
distinctions and ultimately recognizes images.
For example, to train a computer to recognize automobile tires, it needs to be fed vast quantities
of images of tires and tire-related items so that it learns the differences and can recognize a tire,
especially one with no defects.
Two essential technologies are used to accomplish this: a type of machine learning called deep
learning and a convolutional neural network (CNN).
History of Computer Vision
Computer vision is not a new technology because scientists and experts have been trying to
develop machines that can see and understand visual data for almost six decades. The evolution
of computer vision is classified as follows:
o 1959: The first experiments in computer vision began in 1959, when researchers showed a
cat an array of images. They found that the visual system responds first to hard edges or
lines; scientifically, this means that image processing begins with simple shapes such as
straight edges.
o 1960: Artificial intelligence was established as a field of academic study, partly with the aim
of solving the problem of human vision.
o 1963: Another milestone was reached when computers were able to transform 2D images
into 3D forms.
o 1974: Optical character recognition (OCR) and intelligent character recognition (ICR)
technologies were introduced. OCR solved the problem of recognizing text printed in any
font or typeface, while ICR can interpret handwritten text. These technologies underpin
document and invoice processing, vehicle number-plate recognition, mobile payments,
machine translation, etc.
o 1982: Algorithms were developed to detect edges, corners, curves, and other basic shapes.
Scientists also developed networks of cells that could recognize patterns.
o 2000: Research focus turned to the study of object recognition.
o 2001: The first real-time face recognition applications were developed.
o 2010: The ImageNet dataset, with millions of tagged images, became available; it can be
considered the foundation for today's Convolutional Neural Network (CNN) and deep
learning models.
o 2012: A CNN was used as an image recognition technology with a greatly reduced error rate.
o 2014: The COCO dataset was released to support object detection and future research.
How does Computer Vision Work?
Computer vision is a technique for extracting information from visual data, such as images and
videos. Although computer vision is meant to work the way the human eye and brain do, how the
human brain operates and solves visual object recognition remains one of the biggest open
questions for IT professionals.

On one level, computer vision is all about pattern recognition: machine systems are trained to
understand visual data such as images and videos. First, a vast amount of labeled visual data is
provided to the machine. This labeled data enables the machine to analyze the patterns across all
the data points and relate them to the labels. For example, suppose we provide millions of dog
images. The computer learns from this data, analyzes each photo for shapes, the distances between
shapes, colors, and so on, identifies the patterns common to dogs, and generates a model. As a
result, this computer vision model can accurately detect whether each new input image contains a
dog. A minimal sketch of such a training setup is shown below.
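The sketch below assumes TensorFlow/Keras is installed and a hypothetical folder data/ containing two sub-folders of labeled images, dog/ and not_dog/; these names are illustrative and not part of the notes.

import tensorflow as tf

# Load labeled images from a (hypothetical) directory: data/dog and data/not_dog
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/", image_size=(128, 128), batch_size=32)

# A tiny convolutional network for the binary dog / not-dog decision
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=5)   # the model learns dog-like patterns from the labeled data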
Task Associated with Computer Vision
Although computer vision is used in many fields, a few tasks are common to most computer
vision systems. These tasks are described below:

o Object classification: Object classification is the computer vision task of classifying an
image, for example deciding whether an image contains a dog, a person's face, or a banana.
It analyzes the visual content (images and videos) and assigns the object to a defined
category, so with image classification we can accurately predict the class of an object
present in an image.
o Object Identification/detection: Object identification or detection uses image
classification to identify and locate the objects in an image or video. With this technique,
the system can count the objects in a given image or scene and determine their exact
locations and labels. For example, in a given image one dog, one cat, and one duck can be
detected and classified using object detection.
o Object Verification: The system processes videos, finds the objects based on search
criteria, and tracks their movement.
o Object Landmark Detection: The system defines the key points for the given object in
the image data.
o Image Segmentation: Image segmentation goes beyond detecting the classes present in an
image, as image classification does; it classifies each pixel of the image to specify which
object it belongs to, determining the role of every pixel in the image.
o Object Recognition: In this, the system recognizes the object's location with respect to
the image.
How to learn computer Vision?
Computer vision builds on the basic concepts of machine learning, deep learning, and
artificial intelligence. If you are eager to learn computer vision, you should work through the
following steps:
1. Build your foundation:
o Before entering this field, you must have strong knowledge of advanced
mathematical concepts such as Probability, statistics, linear algebra, calculus, etc.
o The knowledge of programming languages like Python would be an extra
advantage to getting started with this domain.
2. Digital Image Processing:
You should understand image editing tools and their functions, such as histogram
equalization, median filtering, etc. You should also know how images and videos are
compressed using formats such as JPEG and MPEG. Once you know the basics of image
processing and restoration, you can kick-start your journey into this domain.
3. Machine learning understanding
To enter this domain, you must have a deep understanding of basic machine learning
concepts such as neural networks, CNNs, SVMs, recurrent neural networks, generative
adversarial networks (GANs), etc.
4. Basic computer vision: This is the step where you learn to understand the mathematical
models used to describe and analyze visual data.
Applications of computer vision
Computer vision is one of the most advanced innovations of artificial intelligence and machine
learning. With the increasing demand for AI and machine learning technologies, computer
vision has become a center of attention across sectors. It has a great impact on many
industries, including retail, security, healthcare, automotive, agriculture, etc.
Below are some most popular applications of computer vision:
o Facial recognition: Computer vision has enabled machines to detect the face images of
people in order to verify their identity. The machine is given input images in which
computer vision algorithms detect facial features and compare them with databases of
stored face profiles. Popular social media platforms like Facebook also use facial
recognition to detect and tag users. Various government security agencies employ this
capability to identify criminals in video feeds.
o Healthcare and Medicine: Computer vision has played an important role in the
healthcare and medicine industry. Traditional approaches for evaluating cancerous tumors
are time-consuming and less accurate, whereas computer vision technology provides faster
and more accurate chemotherapy response assessments, and doctors can identify, with
life-saving precision, the cancer patients who need surgery sooner.
o Self-driving vehicles: Computer vision also helps self-driving vehicles make sense of their
surroundings. Video is captured from different angles around the car and fed into the
software, which detects other cars and objects, reads traffic signals, identifies pedestrian
paths, and so on, so the vehicle can drive its passengers safely to their destination.
o Optical character recognition (OCR)
Optical character recognition helps us extract printed or handwritten text from visual data
such as images. Further, it also enables us to extract text from documents like invoices,
bills, articles, etc.
o Machine inspection: Computer vision is vital for image-based automatic inspection. It
detects defects, missing features, functional flaws, and other irregularities in manufactured
products, and it informs inspection goals and the choice of lighting and material-handling
techniques.
o Retail (e.g., automated checkouts): Computer vision is also being implemented in the
retail industry to track products and shelves and to record product movements in the
store. AI-based computer vision techniques can automatically charge the customer for
the marked products upon checkout from the retail store.
o 3D model building: 3D model building, or 3D modeling, is a technique for generating a
3D digital representation of any object or surface using software. Computer vision plays a
role here too, constructing 3D computer models from real objects.
Furthermore, 3D modeling has a variety of applications in various places, such as
Robotics, Autonomous driving, 3D tracking, 3D scene reconstruction, and AR/VR.
o Medical imaging: Computer vision helps medical professionals make better decisions
regarding treating patients by developing visualization of specific body parts such as
organs and tissues. It helps them get more accurate diagnoses and a better patient care
system. For example, Computed Tomography (CT) and Magnetic Resonance Imaging (MRI)
scans are used to diagnose pathologies, to guide medical interventions such as surgical
planning, and for research purposes.
o Automotive safety: Computer vision has added an important safety feature in
automotive industries. For example, if a vehicle is taught to detect objects and dangers, it
can help prevent accidents and save lives and property.
o Surveillance: This is one of computer vision's most important and beneficial use cases.
Nowadays, CCTV cameras are fitted almost everywhere, such as streets, roads, highways,
shops, and stores, to spot suspicious or criminal activity. Computer vision helps analyze
live footage of public places to identify suspicious behavior, detect dangerous objects,
and prevent crime by maintaining law and order.
o Fingerprint recognition and biometrics: Computer vision technology detects
fingerprints and biometrics to validate a user's identity. Biometrics deals with recognizing
persons based on physiological characteristics, such as the face, fingerprint, vascular
pattern, or iris, and behavioral traits, such as gait or speech. It combines Computer Vision
with knowledge of human physiology and behavior.
Introduction to images
Digital image processing means processing a digital image by means of a digital computer. We
can also say that it is the use of computer algorithms to obtain an enhanced image or to extract
useful information from an image.
Digital image processing is the use of algorithms and mathematical models to process and
analyze digital images. The goal of digital image processing is to enhance the quality of images,
extract meaningful information from images, and automate image-based tasks.
The basic steps involved in digital image processing are:
1. Image acquisition: This involves capturing an image using a digital camera or scanner, or
importing an existing image into a computer.
2. Image enhancement: This involves improving the visual quality of an image, such as
increasing contrast, reducing noise, and removing artifacts.
3. Image restoration: This involves removing degradation from an image, such as blurring,
noise, and distortion.
4. Image segmentation: This involves dividing an image into regions or segments, each of
which corresponds to a specific object or feature in the image.
5. Image representation and description: This involves representing an image in a way that
can be analyzed and manipulated by a computer, and describing the features of an image
in a compact and meaningful way.
6. Image analysis: This involves using algorithms and mathematical models to extract
information from an image, such as recognizing objects, detecting patterns, and
quantifying features.
7. Image synthesis and compression: This involves generating new images or compressing
existing images to reduce storage and transmission requirements.
Digital image processing is widely used in a variety of applications, including medical
imaging, remote sensing, computer vision, and multimedia.

Image processing mainly includes the following steps:

1. Importing the image via image acquisition tools;
2. Analysing and manipulating the image;
3. Producing the output, which can be an altered image or a report based on the analysis of that image.
What is an image?
An image is defined as a two-dimensional function, F(x, y), where x and y are spatial coordinates,
and the amplitude of F at any pair of coordinates (x, y) is called the intensity of the image at that
point. When x, y, and the amplitude values of F are all finite, we call it a digital image.
In other words, an image can be defined by a two-dimensional array specifically arranged in
rows and columns.
A digital image is composed of a finite number of elements, each of which has a particular value
at a particular location. These elements are referred to as picture elements, image elements, and
pixels; pixel is the term most widely used to denote the elements of a digital image.
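As a small illustration of pixels in practice, the following sketch (assuming OpenCV is installed and 'example.jpg' is any available image file) loads a digital image and inspects individual picture elements:

import cv2

img = cv2.imread('example.jpg')   # the digital image as an array of pixels
print(img.shape)                  # (rows, columns, channels), e.g. (480, 640, 3)
print(img[0, 0])                  # the pixel (picture element) at row 0, column 0
print(img[0, 0, 0])               # one channel's intensity at that pixel, in the range 0-255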
Types of an image
1. BINARY IMAGE– A binary image, as its name suggests, contains only two pixel values,
0 and 1, where 0 refers to black and 1 refers to white. This image is also known as a
monochrome image.
2. BLACK AND WHITE IMAGE– An image that consists of only black and white pixels is
called a black and white image.
3. 8-bit COLOR FORMAT– This is the most common image format. It has 256 different
shades and is commonly known as a grayscale image. In this format, 0 stands for black,
255 stands for white, and 127 stands for gray.
4. 16-bit COLOR FORMAT– This is a color image format with 65,536 different colors; it is
also known as the high color format. In this format the distribution of color is not the
same as in a grayscale image.
A 16-bit format is actually divided into three further channels, red, green, and blue: the familiar
RGB format.
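A short sketch of how these image types relate in OpenCV (it assumes an image file 'example.jpg'; the threshold value 127 is just an illustrative choice):

import cv2

img = cv2.imread('example.jpg')                                # color (BGR) image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)                   # 8-bit grayscale: 0 = black, 255 = white, 127 = gray
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)   # binary image: pixels are 0 or 255 (conceptually 0 and 1)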
Image as a Matrix
As we know, images are represented in rows and columns; the following shows how an image is
represented as a matrix.
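In the standard notation (with M rows and N columns of intensity samples), the matrix is:

f(x, y) =
\begin{bmatrix}
f(0,0) & f(0,1) & \cdots & f(0,N-1) \\
f(1,0) & f(1,1) & \cdots & f(1,N-1) \\
\vdots & \vdots & \ddots & \vdots \\
f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1)
\end{bmatrix}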

The right side of this equation is a digital image by definition. Every element of this matrix is
called an image element, picture element, or pixel.

DIGITAL IMAGE REPRESENTATION IN MATLAB:

In MATLAB, indexing starts from 1 instead of 0. Therefore, MATLAB's f(1,1) corresponds to f(0,0)
in the mathematical notation.

Hence the two representations of the image are identical, except for the shift in origin.
In MATLAB, matrices are stored in variables such as X, x, input_image, and so on. Variable names
must begin with a letter, as in other programming languages.

PHASES OF IMAGE PROCESSING:


1.ACQUISITION– It could be as simple as being given an image which is in digital form. The
main work involves:
a) Scaling
b) Color conversion(RGB to Gray or vice-versa)
2.IMAGE ENHANCEMENT– It is among the simplest and most appealing areas of image
processing. It is used to bring out hidden details in an image and is subjective.
3.IMAGE RESTORATION– It also deals with improving the appearance of an image, but it is
objective (restoration is based on mathematical or probabilistic models of image degradation).
4.COLOR IMAGE PROCESSING– It deals with pseudocolor and full-color image processing;
color models applicable to digital image processing are used here.
5.WAVELETS AND MULTI-RESOLUTION PROCESSING– It is the foundation for representing
images at various degrees of resolution.
6.IMAGE COMPRESSION– It involves developing functions to reduce the amount of data needed
to store or transmit an image. It mainly deals with image size or resolution.
7.MORPHOLOGICAL PROCESSING-It deals with tools for extracting image components
that are useful in the representation & description of shape.
8.SEGMENTATION PROCEDURE-It includes partitioning an image into its constituent parts
or objects. Autonomous segmentation is the most difficult task in Image Processing.
9.REPRESENTATION & DESCRIPTION– It follows the output of the segmentation stage;
choosing a representation is only part of the solution for transforming raw data into processed data.
10.OBJECT DETECTION AND RECOGNITION-It is a process that assigns a label to an
object based on its descriptor.
OVERLAPPING FIELDS WITH IMAGE PROCESSING

If the input is an image and the output is also an image, the process is termed Digital Image
Processing.
If the input is an image and the output is some kind of information or description, the process is
termed Computer Vision.
If the input is a description or code and the output is an image, the process is termed Computer
Graphics.
If the input is a description, keywords, or code and the output is also a description or keywords,
the process is termed Artificial Intelligence.
Advantages of Digital Image Processing:
1. Improved image quality: Digital image processing algorithms can improve the visual
quality of images, making them clearer, sharper, and more informative.
2. Automated image-based tasks: Digital image processing can automate many image-based
tasks, such as object recognition, pattern detection, and measurement.
3. Increased efficiency: Digital image processing algorithms can process images much
faster than humans, making it possible to analyze large amounts of data in a short amount
of time.
4. Increased accuracy: Digital image processing algorithms can provide more accurate
results than humans, especially for tasks that require precise measurements or
quantitative analysis.
Disadvantages of Digital Image Processing:
1. High computational cost: Some digital image processing algorithms are computationally
intensive and require significant computational resources.
2. Limited interpretability: Some digital image processing algorithms may produce results
that are difficult for humans to interpret, especially for complex or sophisticated
algorithms.
3. Dependence on quality of input: The quality of the output of digital image processing
algorithms is highly dependent on the quality of the input images. Poor quality input
images can result in poor quality output.
4. Limitations of algorithms: Digital image processing algorithms have limitations, such as
the difficulty of recognizing objects in cluttered or poorly lit scenes, or the inability to
recognize objects with significant deformations or occlusions.
5. Dependence on good training data: The performance of many digital image processing
algorithms is dependent on the quality of the training data used to develop the algorithms.
Poor quality training data can result in poor performance of the algorithm.
Difference between Image Processing and Computer Vision
Computer Vision:
In Computer Vision, computers or machines are made to gain high-level understanding from the
input digital images or videos with the purpose of automating tasks that the human visual system
can do. It uses many techniques and Image Processing is just one of them.
Image Processing:
Image Processing is the field of enhancing images by tuning many parameters and features of
the images, so Image Processing is a subset of Computer Vision. Here, transformations are
applied to an input image and the resulting output image is returned. Some of these
transformations are sharpening, smoothing, stretching, etc.
Since both fields deal with visual data, i.e., images and videos, there is often confusion about the
difference between these fields of computer science. The comparison below summarizes the
difference between them.
Difference between Image Processing and Computer Vision:

Image Processing: Image processing is mainly focused on processing the raw input images to enhance them or to prepare them for other tasks.
Computer Vision: Computer vision is focused on extracting information from the input images or videos to gain a proper understanding of them and predict the visual input, like the human brain.

Image Processing: Image processing uses methods like anisotropic diffusion, hidden Markov models, independent component analysis, different filtering techniques, etc.
Computer Vision: Image processing is one of the methods used for computer vision, along with other machine learning techniques, CNNs, etc.

Image Processing: Image Processing is a subset of Computer Vision.
Computer Vision: Computer Vision is a superset of Image Processing.

Image Processing: Examples of image processing applications are rescaling an image (digital zoom), correcting illumination, changing tones, etc.
Computer Vision: Examples of computer vision applications are object detection, face detection, handwriting recognition, etc.

BASIC IMAGE OPERATIONS


Fundamental Image Processing Steps
Image Acquisition
Image acquisition is the first step in image processing. This step is also known as preprocessing
in image processing. It involves retrieving the image from a source, usually a hardware-based
source.
Image Enhancement
Image enhancement is the process of bringing out and highlighting certain features of interest in
an image that has been obscured. This can involve changing the brightness, contrast, etc.
Image Restoration
Image restoration is the process of improving the appearance of an image. However, unlike
image enhancement, image restoration is done using certain mathematical or probabilistic
models.
Color Image Processing
Color image processing includes a number of color modeling techniques in a digital domain.
This step has gained prominence due to the significant use of digital images over the internet.
Wavelets and Multiresolution Processing
Wavelets are used to represent images in various degrees of resolution. The images are
subdivided into wavelets or smaller regions for data compression and for pyramidal
representation.
Compression
Compression is a process used to reduce the storage required to save an image or the bandwidth
required to transmit it. This is done particularly when the image is for use on the Internet.
Morphological Processing
Morphological processing is a set of operations that process images based on their shapes,
extracting image components that are useful for representing and describing shape.
Segmentation
Segmentation is one of the most difficult steps of image processing. It involves partitioning an
image into its constituent parts or objects.
Representation and Description
After an image is segmented into regions in the segmentation process, each region is represented
and described in a form suitable for further computer processing. Representation deals with the
image’s characteristics and regional properties. Description deals with extracting quantitative
information that helps differentiate one class of objects from the other.
Recognition
Recognition assigns a label to an object based on its description.
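The steps above can be strung together in code. The following is a minimal sketch of a few of them (acquisition, enhancement, segmentation, and representation/description) using OpenCV; the file name and parameter values are illustrative assumptions, not part of these notes.

import cv2

# Acquisition: read the image from a file (hardware capture would use cv2.VideoCapture)
img = cv2.imread('parts.jpg', cv2.IMREAD_GRAYSCALE)

# Enhancement / restoration: suppress noise with a median filter
denoised = cv2.medianBlur(img, 5)

# Segmentation: separate objects from the background with Otsu thresholding
_, mask = cv2.threshold(denoised, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Representation & description: extract object boundaries and simple descriptors
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    area = cv2.contourArea(c)           # a quantitative descriptor of each region
    x, y, w, h = cv2.boundingRect(c)    # a simple representation of its location
    print(area, (x, y, w, h))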

MATHEMATICAL OPERATIONS ON IMAGES: DATATYPE CONVERSION,
CONTRAST ENHANCEMENT, BRIGHTNESS ENHANCEMENT
Image arithmetic, or mathematical operations on images, is a method of manipulating images by
applying ordinary arithmetic or logical operators. Because these operations are carried out pixel by
pixel, each pixel's value in the output image is determined solely by the corresponding pixels in the
input images, so the input images must normally be the same size. One of the inputs can also be a
constant value, for example when adding a brightness offset to an image.
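A brief sketch of the operations named in this section's title (datatype conversion, brightness enhancement, and contrast enhancement), assuming OpenCV/NumPy and an arbitrary input image; the offset and gain values are illustrative:

import cv2
import numpy as np

img = cv2.imread('example.jpg')

# Datatype conversion: uint8 (0-255) -> float32 (0.0-1.0) and back
img_float = img.astype(np.float32) / 255.0
img_uint8 = np.clip(img_float * 255.0, 0, 255).astype(np.uint8)

# Brightness enhancement: add a constant offset (beta) to every pixel
brighter = cv2.convertScaleAbs(img, alpha=1.0, beta=50)

# Contrast enhancement: multiply every pixel by a gain factor (alpha > 1)
higher_contrast = cv2.convertScaleAbs(img, alpha=1.5, beta=0)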

Despite being a straightforward method of image processing, image arithmetic has many uses. Due to its
simplicity, one of its key benefits is speed. For example, when reducing random noise by adding
successive images or detecting motion by subtracting two successive images, the images being
processed are often snapshots of the same scene recorded at various times.
In image arithmetic, logical operators are frequently employed to combine binary images. Logical
operators are usually applied bitwise when working with integer images. This enables the use of a
binary mask to select particular regions of an image, as in the sketch below.
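For example, a binary mask can be combined with an image using OpenCV's bitwise operations (a minimal sketch; the image path and circle geometry are illustrative):

import cv2
import numpy as np

img = cv2.imread('example.jpg')

# Build a binary mask: a white (255) filled circle on a black (0) background
mask = np.zeros(img.shape[:2], dtype=np.uint8)
cv2.circle(mask, (img.shape[1] // 2, img.shape[0] // 2), 100, 255, -1)

# Bitwise AND keeps only the pixels where the mask is non-zero
region = cv2.bitwise_and(img, img, mask=mask)

# Other bitwise operations: cv2.bitwise_or, cv2.bitwise_xor, cv2.bitwise_not
inverted_mask = cv2.bitwise_not(mask)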

Addition and Subtraction of Images

The OpenCV function cv2.add() or the straightforward NumPy operation addition = image1 + image2 can
be used to combine two images. Both images should be of the same depth and type; alternatively, the
second operand can simply be a scalar value. However, directly adding the pixel values is often not ideal,
because the result can saturate or wrap around, so we frequently use the cv2.addWeighted() function,
which computes a weighted sum, instead.
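The difference matters because cv2.add() saturates at 255, while plain NumPy addition on uint8 arrays wraps around modulo 256, as this single-pixel sketch shows:

import cv2
import numpy as np

x = np.uint8([250])
y = np.uint8([10])

print(cv2.add(x, y))   # 250 + 10 = 260 is saturated to 255
print(x + y)           # 250 + 10 = 260 % 256 = 4 (wrap-around)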

import cv2
import numpy as np
from google.colab.patches import cv2_imshow

# Read the two input images
image1 = cv2.imread('/content/montains.jpeg')
image2 = cv2.imread('/content/sunrise.jpeg')

# Resize both images to the same dimensions so they can be added
image1 = cv2.resize(image1, (512, 512))
image2 = cv2.resize(image2, (512, 512))

# Weighted addition: 0.5*image1 + 0.5*image2 + 0
added_image = cv2.addWeighted(image1, 0.5, image2, 0.5, 0)

cv2_imshow(image1)
cv2_imshow(image2)
cv2_imshow(added_image)

Output

Subtraction of images: With cv2.subtract(), we can subtract the pixel values of one image from
another and merge the result. The images must be the same size and depth. It is also common to use
a single image as input and subtract a constant value from all of its pixels.

import cv2
import numpy as np
from google.colab.patches import cv2_imshow

# Read the two input images
image1 = cv2.imread('/content/montains.jpeg')
image2 = cv2.imread('/content/sunrise.jpeg')

# Resize both images to the same dimensions so they can be subtracted
image1 = cv2.resize(image1, (300, 300))
image2 = cv2.resize(image2, (300, 300))

# Pixel-wise subtraction (results below 0 are saturated to 0)
subtracted_image = cv2.subtract(image1, image2)

cv2_imshow(image1)
cv2_imshow(image2)
cv2_imshow(subtracted_image)

Output

Multiplication and Division of Images


Multiplication: Multiplication is used in computer vision to scale images and to perform transformations
such as rotation and scaling. It is also used in image processing operations like convolution and
correlation. Multiplication is a computationally efficient operation, particularly when performed with
SIMD (Single Instruction, Multiple Data) instructions.

Using the cv2.multiply() function, we scale the image by multiplying its pixel values by a constant
factor (0.4 in the example below). The cv2.multiply() function performs element-wise multiplication
of the image's pixel values.

Scaling images using multiplication operation

import cv2
import numpy as np
from google.colab.patches import cv2_imshow

input_image = cv2.imread('/content/sunrise.jpeg')
input_image = cv2.resize(input_image, (300, 300))

# Element-wise multiplication by a scalar factor scales the pixel intensities down
scaling_factor = 0.4
scaled_image = cv2.multiply(input_image, np.array([scaling_factor]))

cv2_imshow(input_image)
cv2_imshow(scaled_image)

Output

Division: In computer vision, division is used for image normalization and contrast adjustment. It is
also employed in several feature extraction methods. When working with large images, division can
be computationally expensive.

Below is an example of using division to scale down (normalize) the pixel intensities of an image.


import cv2
import numpy as np
from google.colab.patches import cv2_imshow

input_image = cv2.imread('/content/montains.jpeg')
input_image = cv2.resize(input_image, (300, 300))

# Divide every pixel value by a constant (2.0 here) to scale intensities down;
# cv2.divide performs element-wise division with saturation
normalized_image = cv2.divide(input_image, np.array([2.0]))

cv2_imshow(input_image)
cv2_imshow(normalized_image)

Output

Blending of Images

This is likewise image addition, but the images are given different weights to create an impression of
blending or transparency. Here the first image has a weight of 0.7, while the second image has a
weight of 0.3.

import cv2
from google.colab.patches import cv2_imshow

image1 = cv2.imread('/content/montains.jpeg')
image2 = cv2.imread('/content/air_balloons.jpeg')

# Resize both images to the same dimensions
image1 = cv2.resize(image1, (512, 512))
image2 = cv2.resize(image2, (512, 512))

# Weighted blend: 0.7*image1 + 0.3*image2 + 0
blended_image = cv2.addWeighted(image1, 0.7, image2, 0.3, 0)

cv2_imshow(image1)
cv2_imshow(image2)
cv2_imshow(blended_image)

Output (blended image)


Comparison of Basic Mathematical Operations

 Image processing operations such as blending and background subtraction make use of addition
and subtraction.

 Scaling, blending, filtering, and feature extraction all rely on multiplication.

 Normalization, contrast adjustment, and colour balance are all accomplished by division.

 Subtraction and division are not commutative, but addition and multiplication are.

 Subtraction and division are not associative, although addition and multiplication are.

 Multiplication is distributive over addition and subtraction; division distributes over them only
from the right (for example, (a + b) / c = a/c + b/c, but c / (a + b) is not c/a + c/b).

 Each operation has distinct properties and applications, and a combination of these operations is
frequently utilised to do more complicated computer vision tasks.

Advanced Mathematical Operations

Advanced mathematical operations are an essential part of computer vision and image processing. They
allow us to extract useful information from images, detect patterns, and perform complex tasks like
object recognition and tracking. These operations can be used to perform tasks like image filtering,
segmentation, feature extraction, and classification.

Image Filtering

 The process of enhancing, blurring, or sharpening images.

 Involves convolving an image with a kernel or filter mask.

 The Gaussian, Sobel, and Laplacian filters are typical filters.

 Moreover, non-linear filters like median and bilateral filters are frequently employed.
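A short filtering sketch with OpenCV (the file name and kernel sizes are illustrative):

import cv2

img = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)

blurred = cv2.GaussianBlur(img, (5, 5), 0)              # Gaussian smoothing
median = cv2.medianBlur(img, 5)                         # non-linear median filter (good for salt-and-pepper noise)
bilateral = cv2.bilateralFilter(img, 9, 75, 75)         # edge-preserving smoothing
edges_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)     # Sobel gradient in the x direction
laplacian = cv2.Laplacian(img, cv2.CV_64F)              # Laplacian (second-derivative) filter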

Convolutional Operations

 A mathematical operation used to extract features from an image.

 Involves applying a number of learnable filters to a given image.

 Semantic segmentation, object detection, and image classification are typical uses.

 A few well-known convolutional neural networks are MobileNet, ResNet, VGG, and AlexNet.
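Outside a full neural network, the underlying convolution can be illustrated with cv2.filter2D(), which slides a kernel over the image (strictly it computes correlation, which equals convolution for symmetric kernels). The 3x3 sharpening kernel below is hand-written for illustration; in a CNN such kernels would be learned from data.

import cv2
import numpy as np

img = cv2.imread('example.jpg')

# A fixed 3x3 sharpening kernel (a learnable filter would be estimated during training)
kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]], dtype=np.float32)

sharpened = cv2.filter2D(img, -1, kernel)   # -1 keeps the input image depth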

Fourier Transform

 The technique for converting images from the spatial to the frequency domain.

 Involves breaking down an image into its individual frequency components.

 Edge detection, noise reduction, and image compression are typical uses.
 In image processing, the discrete Fourier transform (DFT) and the fast Fourier transform (FFT)
are frequently utilized.
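A minimal sketch of the DFT of an image using NumPy's FFT routines (the log-magnitude spectrum is what is usually visualized; the file name is illustrative):

import cv2
import numpy as np

img = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)

f = np.fft.fft2(img)                           # 2-D discrete Fourier transform (computed via the FFT)
fshift = np.fft.fftshift(f)                    # move the zero-frequency component to the center
magnitude = 20 * np.log(np.abs(fshift) + 1)    # log-magnitude spectrum, suitable for display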

Wavelet Transform

 The technique for converting images from the spatial to the frequency domain.

 Involves breaking an image down into many wavelets with various sizes and orientations.

 Image compression, feature extraction, and denoising are frequent uses.

 There are several wavelet families that are often utilised, including Haar, Daubechies, and
Coiflets.
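A sketch of a single-level 2-D wavelet decomposition; it assumes the PyWavelets package (pywt), which is not mentioned in these notes:

import cv2
import pywt   # PyWavelets, assumed installed (pip install PyWavelets)

img = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)

# One level of the 2-D discrete wavelet transform with the Haar wavelet
cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')
# cA is the approximation (low-frequency) sub-band; cH, cV, cD are the horizontal,
# vertical, and diagonal detail sub-bands used in compression, feature extraction, and denoising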
