
INTRODUCTION TO IMAGE PROCESSING WITH PYTHON

Nour Eddine ALAA and Ismail Zine El Abidne


March 5, 2021

LAMAI Laboratory FST Marrakech

Cadi Ayyad University

Contents

1 Abstract

2 Introduction :
2.1 What is a Digital Image
2.2 Image Processing
2.3 Internship Objectives
2.4 Project Results

3 Python
3.1 Definition
3.2 Libraries

4 OpenCV:
4.1 Installing Python and OpenCV
4.1.1 Download the Python 3 Installer
4.1.2 Installing OpenCV
4.1.3 Python Virtual Environments :
4.2 Reading, displaying, and saving images :
4.2.1 Reading image :
4.2.2 Display an image :
4.2.3 Saving Image :
4.2.4 Color image space :
4.2.5 Changing Color space :
4.2.6 Cropping Image :
4.2.7 Transformations of Images :
4.3 Image Processing in OpenCV :
4.3.1 2D convolution :
4.3.2 Smoothing Image :
4.3.3 Morphological Transformations :
4.3.4 Edge detection :
4.3.5 Image Thresholding and segmentation :
4.3.6 Image Histogram :
4.3.7 Template Matching with Multiple Objects :
4.4 Feature Detection and Description :
4.4.1 What are keypoints ?
4.4.2 Harris Corner Detection :
4.4.3 Shi-Tomasi Corner Detector and Good Features to Track :
4.5 Face Detection using Haar Cascades :
4.5.1 Basic concept of HAAR cascade algorithm :
4.5.2 Haar-cascade Detection in OpenCV :

5 Linear Filters
5.1 Convolution
5.2 Low Pass Filter
5.2.1 Mean Filter
5.2.2 Gaussian filter
5.2.3 Binomial filter
5.3 High Pass Filter
5.3.1 DOG Filter
5.3.2 Laplacian of Gaussian (LoG)

6 Non Linear Filters
6.1 Median
6.2 Maximum Filter
6.3 Minimum Filter

7 Directional Filters
7.1 What is a Gradient of the image?
7.2 Laplacian
7.3 Filters
7.3.1 Directional Filter :
7.3.2 Prewitt Operator
7.3.3 Roberts Operator
7.3.4 Kirsch Filter
7.3.5 Sobel Filter :
7.4 Gaussian Blur
7.4.1 Laplacian Filter

8 Image Restoration
8.1 Image deblurring
8.2 Image in-painting
8.3 Super resolution
8.4 The Wiener filter
8.5 The Perona-Malik equation
1 Abstract
A methodological study of the significance of image processing and its applications
in the field of computer vision is carried out here. In an image processing
operation, the input is an image and the output is an enhanced, higher-quality
image or extracted information, depending on the techniques used. Image processing
is usually referred to as digital image processing. Our study provides a solid
introduction to image processing along with segmentation techniques and computer
vision fundamentals and applications, and should be of worth to the image
processing and computer vision research communities.

2 Introduction :
2.1 What is a Digital Image
A digital image is a representation of a real image as a set of numbers that can
be stored and handled by a digital computer. In order to translate the image
into numbers, it is divided into small areas called pixels (picture elements). For
each pixel, the imaging device records a number, or a small set of numbers,
that describe some property of this pixel, such as its brightness (the intensity
of the light) or its color. The numbers are arranged in an array of rows and
columns that correspond to the vertical and horizontal positions of the pixels
in the image. Digital images have several basic characteristics. One is the type
of the image. For example, a black and white image records only the intensity
of the light falling on the pixels. A color image can have three colors, normally
RGB (Red, Green, Blue) or four colors, CMYK (Cyan, Magenta, Yellow, black).
RGB images are usually used in computer monitors and scanners, while CMYK
images are used in color printers. There are also non-optical images such as
ultrasound or X-ray in which the intensity of sound or X-rays is recorded. In
range images, the distance of the pixel from the observer is recorded. Resolution
is expressed in the number of pixels per inch (ppi). A higher resolution gives
a more detailed image. A computer monitor typically has a resolution of 100
ppi, while a printer has a resolution ranging from 300 ppi to more than 1440
ppi. This is why an image looks much better in print than on a monitor.

For a color image it works a little differently: every pixel contains three values,
R (red), G (green) and B (blue), and these values are the ones that control the
pixel's color.

2.2 Image Processing


Image processing is a method to convert an image into digital form and
perform some operations on it, in order to get an enhanced image or to extract
some useful information from it. It is a type of signal processing in which the
input is an image, such as a video frame or photograph, and the output may be an
image or characteristics associated with that image. An image processing system
usually treats images as two-dimensional signals and applies standard signal
processing methods to them. Image processing is among the most rapidly growing
technologies today, with applications in various aspects of business. It also
forms a core research area within the engineering and computer science
disciplines.
Image processing basically includes the following three steps:
• Importing the image via image acquisition tools.
• Analysing and manipulating the image.
• Output in which result can be altered image or report that is based on
image analysis.
Low-level image processing algorithms include :
• Edge detection
• Segmentation.
• Classification.
• Feature detection and matching.

Edge detection

Classification and Feature detection and matching

2.3 Internship Objectives


We present basic image processing denoising methods. We first recall briefly the
main features of linear 1D filtering techniques. Then we present standard linear
methods, like the Gaussian blur.

2.4 Project Results


We develop many filters, like the Gauss filter, the Sobel filter, the Prewitt
operator and so on; we will see them all in the next chapters.

3 Python
3.1 Definition
Python is an interpreted, high-level, general-purpose programming language.
Created by Guido van Rossum and first released in 1991, Python's design philos-
ophy emphasizes code readability with its notable use of significant whitespace.
Its language constructs and object-oriented approach aim to help program-
mers write clear, logical code for small and large-scale projects.[26] Python is
dynamically typed and garbage-collected. It supports multiple programming
paradigms, including procedural, object-oriented, and functional programming.
Python is often described as a "batteries included" language due to its com-
prehensive standard library.[27] Python was conceived in the late 1980s as a
successor to the ABC language. Python 2.0, released in 2000, introduced features
like list comprehensions and a garbage collection system capable of collecting
reference cycles. Python 3.0, released in 2008, was a major revision of the lan-
guage that is not completely backward-compatible, and much Python 2 code
does not run unmodified on Python 3. Due to concern about the amount of code
written for Python 2, support for Python 2.7 (the last release in the 2.x se-
ries) was extended to 2020. Language developer Guido van Rossum shouldered
sole responsibility for the project until July 2018 but now shares his leadership
as a member of a five-person steering council.[28][29][30] Python interpreters are
available for many operating systems. A global community of programmers de-
velops and maintains CPython, an open source[31] reference implementation. A
non-profit organization, the Python Software Foundation, manages and directs
resources for Python and CPython development.

3.2 Libraries
We use three main libraries to get the data from an image, analyse it, apply
our filters, and then save the result as a PNG.
1. skimage
scikit-image is a collection of algorithms for image processing and com-
puter vision. The main package of skimage only provides a few utilities
for converting between image data types; for most features, you need to
import one of the following subpackages: color, data, draw, exposure,
filters, and so on.
src = "https://scikit-image.org/docs/dev/api/skimage.html"
We use this library especially to import the sample data (images), to which
we then apply our filters.
• Loading Images from skimage
Within the scikit-image package, there are several sample images
provided in the data module. Let’s say we want to load a single
image to perform a few experiments. Instead of using an external

image, we can simply load one of the images provided within the
package!
Here is the Python code to do this:

Listing 1: Load Image from skimage

from skimage.io import imread, imshow
from skimage import data

image = data.coins()
imshow(image)
Notice that I have used the imshow function here to view the image
in the notebook itself.
• Reading Images from our System using skimage
What if you want to load an image from your machine instead of
the ones provided in the package? For this, we can use the imread
function from skimage.
We can read images in two formats – colored and grayscale. We will
see both of these in action and understand how they're different.
The imread function has a parameter "as_gray" which is used to
specify if the image must be converted into a grayscale image or not.
We will start with reading an image in grayscale format, by setting
the parameter to True:

Listing 2: Load Image from Our Local Desktop

from skimage.io import imread, imshow
import matplotlib.pyplot as plt

image_gray = imread('images.jpeg', as_gray=True)

imshow(image_gray)
Now, we'll load the image in the original color format. For this, we
will have to set the parameter 'as_gray' to False:

Listing 3: Load Image from Our Local Desktop

from skimage.io import imread, imshow
import matplotlib.pyplot as plt

image_color = imread('images.jpeg', as_gray=False)

print(image_color.shape)
imshow(image_color)

2. OpenCV
OpenCV (Open Source Computer Vision Library) is an open source com-
puter vision and machine learning software library. OpenCV was built
to provide a common infrastructure for computer vision applications and
to accelerate the use of machine perception in commercial products.
Being a BSD-licensed product, OpenCV makes it easy for businesses to
utilize and modify the code. The library has more than 2500 optimized
algorithms, which include a comprehensive set of both classic and state-
of-the-art computer vision and machine learning algorithms. These algo-
rithms can be used to detect and recognize faces, identify objects, classify
human actions in videos, track camera movements, track moving objects,
extract 3D models of objects, produce 3D point clouds from stereo cam-
eras, stitch images together to produce a high-resolution image of an entire
scene, find similar images in an image database, remove red eyes from
images taken using flash, follow eye movements, recognize scenery and es-
tablish markers to overlay it with augmented reality, etc. OpenCV has
a user community of more than 47 thousand people and an estimated
number of downloads exceeding 18 million. The library is used extensively
in companies, research groups and by governmental bodies.
src = "https://opencv.org/about/"
This library is very powerful and has more features than we could cover
here; we use it to display and save our images after applying our filters.
• Reading, Writing and Displaying Images
Machines see and process everything using numbers, including im-
ages and text. How do you convert images to numbers – I can hear
you wondering. Two words – pixel values:
Every number represents the pixel intensity at that particular loca-
tion. In the above image, I have shown the pixel values for a grayscale
image where every pixel contains only one value i.e. the intensity of
the black color at that location.
Note that color images will have multiple values for a single pixel.
These values represent the intensity of respective channels – Red,
Green and Blue channels for RGB images, for instance.
Reading and writing images is essential to any computer vision project,
and the OpenCV library makes this task a whole lot easier.
Now, let’s see how to import an image into our machine using OpenCV.

Listing 4: Python example

# import the libraries
import numpy as np
import matplotlib.pyplot as plt
import cv2

# reading the image
image = cv2.imread('index.png')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# plotting the image
plt.imshow(image)

# saving the image
cv2.imwrite('test_write.jpg', image)

• Changing Color Spaces


A color space is a protocol for representing colors in a way that makes
them easily reproducible. We know that grayscale images have single
pixel values and color images contain 3 values for each pixel – the
intensities of the Red, Green and Blue channels.
Most computer vision use cases process images in RGB format. How-
ever, applications like video compression and device independent
storage – these are heavily dependent on other color spaces, like the
Hue-Saturation-Value or HSV color space.
OpenCV reads a given image in the BGR format by default. So,
you’ll need to change the color space of your image from BGR to
RGB when reading images using OpenCV. Let’s see how to do that:

Listing 5: Changing Color Spaces

# import the required libraries
import numpy as np
import matplotlib.pyplot as plt
import cv2

image = cv2.imread('index.jpg')

# converting the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# plotting the grayscale image
plt.imshow(gray_image)

# converting the image to HSV format
hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
# plotting the HSV image
plt.imshow(hsv_image)

3. NumPy
NumPy is the fundamental package for scientific computing with Python.
It contains, among other things: a powerful N-dimensional array object;
sophisticated (broadcasting) functions; tools for integrating C/C++ and
Fortran code; and useful linear algebra, Fourier transform, and random
number capabilities. Besides its obvious scientific uses, NumPy can also
be used as an efficient multi-dimensional container of generic data. Arbi-
trary data types can be defined. This allows NumPy to seamlessly and
speedily integrate with a wide variety of databases.
src = "https://www.numpy.org/"
We use this library to apply our filters as matrices because, as mentioned
before, an image is a multi-dimensional matrix.

• Python Lists vs NumPy Arrays – What’s the Difference?
If you’re familiar with Python, you might be wondering why use
NumPy arrays when we already have Python lists? After all, these
Python lists act as an array that can store elements of various types.
This is a perfectly valid question and the answer to this is hidden in
the way Python stores an object in memory.
A Python object is actually a pointer to a memory location that
stores all the details about the object, like bytes and the value. Al-
though this extra information is what makes Python a dynamically
typed language, it also comes at a cost which becomes apparent when
storing a large collection of objects, like in an array.
Python lists are essentially an array of pointers, each pointing to a
location that contains the information related to the element. This
adds a lot of overhead in terms of memory and computation. And
most of this information is rendered redundant when all the objects
stored in the list are of the same type!
To overcome this problem, we use NumPy arrays that contain only
homogeneous elements, i.e. elements having the same data type.
This makes it more efficient at storing and manipulating the array.
This difference becomes apparent when the array has a large number
of elements, say thousands or millions. Also, with NumPy arrays,
you can perform element-wise operations, something which is not
possible using Python lists!
This is the reason why NumPy arrays are preferred over Python lists
when performing mathematical operations on a large amount of data.
• Creating a NumPy Array
– Basic ndarray
NumPy arrays are very easy to create given the complex prob-
lems they solve. To create a very basic ndarray, you use the
np.array() method. All you have to pass are the values of the
array as a list:

Listing 6: np.array

np.array([1, 2, 3, 4])
Output
array([1, 2, 3, 4])
This array contains integer values. You can specify the type of
data in the dtype argument:
np.array([1, 2, 3, 4], dtype=np.float32)
Output
array([1., 2., 3., 4.])

– Multi-Dimensional Array
NumPy arrays can be multi-dimensional too.

Listing 7: Multi-Dimensional Array

np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
Output
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

4 OpenCV:
4.1 Installing Python and OpenCV
4.1.1 Download the Python 3 Installer
• Open a browser window and navigate to the Download page for Windows
at python.org.
• Underneath the heading at the top that says Python Releases for Win-
dows, click on the link for the Latest Python 3 Release - Python 3.x.x.
(As of this writing, the latest is Python 3.6.5.)
• Scroll to the bottom and select either Windows x86-64 executable installer
for 64-bit or Windows x86 executable installer for 32-bit. (See below.)
• Once you have chosen and downloaded an installer, simply run it by
double-clicking on the downloaded file. A dialog should appear that looks
something like this:

• Then just click Install Now. That should be all there is to it. A few
minutes later you should have a working Python 3 installation on your
system.

4.1.2 Installing OpenCV


Launch the Anaconda prompt from the start menu:

We should see a similar prompt if we followed the instructions.

To install OpenCV, we need to type the following command at the prompt:

conda install -c conda-forge opencv

The prompt will show that it is "solving environment". This takes quite a bit of
time if you are on a slower Internet connection.

Try to use a faster Internet connection, preferably a wired one, for uninterrupted
and best results. If you are on a work- or institution-based Internet connection,
it is likely that an HTTP timeout will occur, in which case we need to enter the
command once again and restart the procedure.

Once the environment is resolved by conda it will list the packages that will
be installed, namely: opencv, libopencv, py-opencv. Enter y to proceed with
the installation.

We can verify that the installation was successful by launching the Python
interpreter. OpenCV is referred to as cv2 in Python. Type at the prompt:
• import cv2 : if the prompt returns without an error, then Python has
successfully imported the OpenCV library. But we should also verify the
version of OpenCV, so we need to type:
• print(cv2.__version__) : as of March 2019 the version displayed is 3.4.1,
which is the officially supported version of OpenCV in the Anaconda environ-
ment. If we want to work with a different version, then while installing we
can specify the version as “opencv=3.4.1” as shown below.

4.1.3 Python Virtual Environments :
Later, to read an image from your local machine, we will create a Python file
readimage.py, put the images in the same directory, and start coding. Happy
coding!
It is often useful to have one or more Python environments where you can
experiment with different combinations of packages without affecting your main
installation. Python supports this through virtual environments. The virtual
environment is a copy of an existing version of Python with the option to inherit
existing packages. A virtual environment is also useful when you need to work
on a shared system and do not have permission to install packages as you will
be able to install them in the virtual environment.
Firstly we need to open Terminal

Install the virtualenv package


The virtualenv package is required to create virtual environments. You can
install it with pip: pip install virtualenv

Create the virtual environment

To create a virtual environment, you must specify a path. For example to create
one in the local directory called ‘miasi’, type the following: virtualenv miasi

Activate the virtual environment

You can activate the python environment by running the following command:
• Mac OS / Linux : source miasi/bin/activate

• Windows : miasi\Scripts\activate
You should see the name of your virtual environment in brackets on your
terminal line, e.g. (miasi).
Any python commands you use will now work within your virtual environment.

4.2 Reading, displaying, and saving images :


4.2.1 Reading image :
We are going to use the OpenCV library. First, we have to install it in our
'miasi' environment by running the command pip install opencv-python.
Now everything is ready.
In OpenCV there are many ways to read images, but we will use three main flags:
• cv2.IMREAD_COLOR or (1) : Loads a color image. Any transparency
of the image will be neglected. It is the default flag.

• cv2.IMREAD_GRAYSCALE or (0) : Loads the image in grayscale mode.

• cv2.IMREAD_UNCHANGED or (-1) : Loads the image as such, including
the alpha channel.

Reading an image with OpenCV: as we can see below, the imread() function
returns our image as a NumPy array.
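The original shows this step as a screenshot; the following is a minimal sketch of
the same idea (the file name index.png is an assumption):

import cv2

# 1 = color (default), 0 = grayscale, -1 = unchanged (keeps the alpha channel)
img_color = cv2.imread('index.png', cv2.IMREAD_COLOR)
img_gray = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)

print(type(img_color))   # <class 'numpy.ndarray'>
print(img_color.shape)   # (height, width, 3) for a color image
print(img_gray.shape)    # (height, width)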

4.2.2 Display an image :


Use the function cv2.imshow() to display an image in a window. The window
automatically fits to the image size. The first argument is a window name, which
is a string. The second argument is our image. You can create as many windows as
you wish, but with different window names. Next, the function cv2.waitKey() is
used in OpenCV for keyboard binding; it performs the rendering of the image
loaded in the previous step. It takes a number that indicates the rendering time
in milliseconds. Basically, we use this function to wait for a specified duration
until we encounter a keyboard event. The program stops at this point and waits
for you to press any key to continue. If we don't pass any argument, or if we
pass 0 as the argument, this function waits for a keyboard event indefinitely.

Finally, the function cv2.destroyAllWindows() destroys all the windows we
created. If you want to destroy any specific window, use the function
cv2.destroyWindow(), where you pass the exact window name as the argument.

Displaying Image with OpenCV
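The original presents this as a screenshot; a minimal sketch of the same flow
(the file name index.png is an assumption):

import cv2

image = cv2.imread('index.png')

cv2.imshow('My Image', image)   # first argument: window name; second: the image
cv2.waitKey(0)                  # 0: wait indefinitely for a key press
cv2.destroyAllWindows()         # destroy every window we created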

4.2.3 Saving Image :


The first argument is the file name; the second argument is the image you want to save.

Saving Image with OpenCV
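A minimal sketch of cv2.imwrite() (the file names are assumptions):

import cv2

image = cv2.imread('index.png')
cv2.imwrite('test_write.png', image)   # file name first, then the image to save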


This will save the image in PNG format in the working directory.

4.2.4 Color image space :


In computer vision and image processing, color space refers to a specific way of
organizing colors. A color space is actually a combination of two things, a color
model and a mapping function. The reason we want color models is because it
helps us in representing pixel values using tuples. The mapping function maps
the color model to the set of all possible colors that can be represented. There
are many different color spaces that are useful. Some of the more popular color
spaces are RGB, HSV, YUV, Lab, and so on.

• RGB : Probably the most popular color space. It stands for Red, Green,
and Blue. In this color space, each color is represented as a weighted
combination of red, green, and blue. So every pixel value is represented
as a tuple of three numbers corresponding to red, green, and blue. Each
value ranges between 0 and 255.

• HSV : Stands for Hue, Saturation, and Value. This is a cylindrical
system where we separate three of the most primary properties of colors
and represent them using different channels. This is closely related to
how the human visual system understands color. This gives us a lot of
flexibility as to how we can handle images.

There is a difference in pixel ordering between OpenCV and Matplotlib. OpenCV
follows BGR (Blue, Green, Red) order, while Matplotlib follows RGB
(Red, Green, Blue) order.

4.2.5 Changing Color space :


For color conversion, we use the function cv2.cvtColor(input_image, flag),
where flag determines the type of conversion.
For BGR =⇒ Gray conversion we use the flag cv2.COLOR_BGR2GRAY.
For BGR =⇒ RGB conversion we use the flag cv2.COLOR_BGR2RGB.
Similarly, for BGR =⇒ HSV we use the flag cv2.COLOR_BGR2HSV.

4.2.6 Cropping Image :
All we are doing is slicing arrays. We first supply the startY : endY coordinates,
followed by the startX : endX coordinates to the slice. That’s it. We’ve cropped
the image!
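A minimal sketch of cropping by array slicing (the coordinates and file name are
assumptions):

import cv2

image = cv2.imread('index.png')

# rows first (startY:endY), then columns (startX:endX)
cropped = image[60:160, 320:420]
cv2.imwrite('cropped.png', cropped)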

4.2.7 Transformations of Images :
Resizing :
Resizing an image means changing the dimensions of it, be it width alone, height
alone or both. Also, the aspect ratio of the original image could be preserved
in the resized image. To resize an image, OpenCV provides cv2.resize() function.

Preferable interpolation methods are cv2.INTER_AREA for shrinking and
cv2.INTER_CUBIC (slow) or cv2.INTER_LINEAR for zooming. By default, the
interpolation method used is cv2.INTER_LINEAR for all resizing purposes.
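A minimal sketch of cv2.resize() following these recommendations (the file name
and scale factors are assumptions):

import cv2

image = cv2.imread('index.png')
h, w = image.shape[:2]

# shrink to half size: INTER_AREA is preferred for shrinking
small = cv2.resize(image, (w // 2, h // 2), interpolation=cv2.INTER_AREA)

# zoom by a factor of 2 using scale factors instead of a target size
big = cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)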

Translation : Translation basically means that we are shifting the image
by adding/subtracting from the x and y coordinates. In order to do this, we need
to create a transformation matrix M, as follows:

M = \begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{pmatrix}

Here, the t_x and t_y values are the x and y translation values; that is, the
image will be moved by t_x units to the right and by t_y units downwards. Once
we create a matrix like this, we can use the function cv2.warpAffine() to
apply it to our image. The third argument of cv2.warpAffine() refers to
the size of the output image.
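A minimal sketch of a translation with cv2.warpAffine() (the shift values are
assumptions):

import cv2
import numpy as np

image = cv2.imread('index.png')
rows, cols = image.shape[:2]

# move 100 px to the right and 50 px downwards
M = np.float32([[1, 0, 100],
                [0, 1, 50]])
translated = cv2.warpAffine(image, M, (cols, rows))   # third argument: output size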

Rotation :
To rotate an image using OpenCV Python, first, calculate the affine matrix that
does the affine transformation (linear mapping of pixels), then warp the input
image with the affine matrix.
Using cv2.getRotationMatrix2D(), we can specify the center point around which
the image would be rotated as the first argument, then the angle of rotation in
degrees, and a scaling factor for the image at the end.
Rotation is also a form of transformation, and we can achieve it by using the
following transformation matrix:

R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}

where θ is the angle of rotation in the counterclockwise direction.
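A minimal sketch of a rotation about the image center (the angle and scale are
assumptions):

import cv2

image = cv2.imread('index.png')
rows, cols = image.shape[:2]

# center point, 45 degrees counterclockwise, scaling factor 1.0
M = cv2.getRotationMatrix2D((cols / 2, rows / 2), 45, 1.0)
rotated = cv2.warpAffine(image, M, (cols, rows))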

4.3 Image Processing in OpenCV :
In this section, we are going to see how to apply cool visual effects to images.
We will learn how to use fundamental image processing operators, discuss edge
detection, compute histograms, perform template matching, and see how we can use
image filters to apply various effects to photos.

4.3.1 2D convolution :
Convolution involving one-dimensional signals is referred to as 1D convolution
or just convolution. Otherwise, if the convolution is performed between two
signals spanning along two mutually perpendicular dimensions (i.e., if signals
are two-dimensional in nature), then it will be referred to as 2D convolution.
This concept can be extended to involve multi-dimensional signals due to which
we can have multi-dimensional convolution.
In the digital domain, convolution is performed by multiplying and accu-
mulating the instantaneous values of the overlapping samples corresponding to
two input signals, one of which is flipped. This definition of 1D convolution is
applicable even for 2D convolution except that, in the latter case, one of the
inputs is flipped twice.
This kind of operation is extensively used in the field of digital image pro-
cessing, wherein the 2D matrix representing the image is convolved with a
comparatively smaller matrix called a 2D kernel.
The kernel is called the image filter, and the process of applying this kernel to
the given image is called image filtering. The output obtained after applying
the kernel to the image is called the filtered image. Depending on the values in
the kernel, it performs different functions such as blurring, detecting edges, and
so on.

As for one-dimensional signals, images also can be filtered with various low-
pass filters (LPF), high-pass filters (HPF), etc. A LPF helps in removing noise,
or blurring the image. A HPF helps in finding edges in an image. OpenCV
provides a function, cv2.filter2D(), to convolve a kernel with an image. As an
example, we will try an averaging filter on an image. A 3x3 averaging filter
kernel can be defined as follows:

K = \frac{1}{9} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}
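A minimal sketch of convolving this kernel with an image via cv2.filter2D()
(the file name is an assumption):

import cv2
import numpy as np

image = cv2.imread('index.png')

kernel = np.ones((3, 3), np.float32) / 9    # the 3x3 averaging kernel K above
filtered = cv2.filter2D(image, -1, kernel)  # -1: keep the source image depth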

4.3.2 Smoothing Image :
Image blurring is achieved by convolving the image with a low-pass filter kernel.
It is useful for removing noise. It actually removes high-frequency content (e.g.
noise, edges) from the image, resulting in edges being blurred when this filter
is applied.
Averaging :
OpenCV provides a function, cv2.filter2D(), to convolve a kernel with an image.
As an example, we will try an averaging filter on an image. A 5x5 averaging
filter kernel can be defined as follows:

K = \frac{1}{25} \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{pmatrix}
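OpenCV also provides cv2.blur(), which builds this normalized box kernel
internally; a minimal sketch (file name is an assumption):

import cv2

image = cv2.imread('index.png')
blurred = cv2.blur(image, (5, 5))   # equivalent to filtering with the 5x5 kernel K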

Gaussian Filtering :
As in any other signals, images also can contain different types of noise, espe-
cially because of the source (camera sensor). Image Smoothing techniques help
in reducing the noise. In OpenCV, image smoothing (also called blurring) could
be done in many ways.

Gaussian filters have the properties of having no overshoot to a step function


input while minimizing the rise and fall time. In terms of image processing, any
sharp edges in images are smoothed while minimizing too much blurring.

OpenCV provides the cv2.GaussianBlur() function to apply Gaussian smoothing
to the input source image.
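A minimal sketch (the kernel size and file name are assumptions):

import cv2

image = cv2.imread('index.png')
# 5x5 kernel; sigmaX = 0 lets OpenCV derive sigma from the kernel size
smoothed = cv2.GaussianBlur(image, (5, 5), 0)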

Median Filtering :
Here, the function cv2.medianBlur() computes the median of all the pixels under
the kernel window and the central pixel is replaced with this median value. This
is highly effective in removing salt-and-pepper noise. One interesting thing to
note is that, in the Gaussian and box filters, the filtered value for the central
element can be a value which may not exist in the original image. However this
is not the case in median filtering, since the central element is always replaced
by some pixel value in the image. This reduces the noise effectively. The kernel
size must be a positive odd integer.
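A minimal sketch (the kernel size and file name are assumptions):

import cv2

image = cv2.imread('index.png')
denoised = cv2.medianBlur(image, 5)   # kernel size must be a positive odd integer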

Sharpening :
The level of sharpening depends on the type of kernel we use. We have a lot of
freedom to customize the kernel here, and each kernel will give you a different
kind of sharpening. To just sharpen an image, we would use a kernel like this:

M = \begin{pmatrix} -1 & -1 & -1 \\ -1 & 9 & -1 \\ -1 & -1 & -1 \end{pmatrix}

If we want to do excessive sharpening, we would use the following kernel:

M = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -9 & 1 \\ 1 & 1 & 1 \end{pmatrix}
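A minimal sketch applying the first sharpening kernel with cv2.filter2D()
(the file name is an assumption):

import cv2
import numpy as np

image = cv2.imread('index.png')
kernel = np.array([[-1, -1, -1],
                   [-1,  9, -1],
                   [-1, -1, -1]], dtype=np.float32)
sharpened = cv2.filter2D(image, -1, kernel)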

4.3.3 Morphological Transformations :
Morphological transformations are some simple operations based on the image
shape. They are normally performed on binary images. They need two inputs: one
is our original image; the second is called the structuring element or kernel, which
decides the nature of the operation. Two basic morphological operators are Erosion
and Dilation. Then variant forms like Opening, Closing, Gradient, etc.
also come into play. We will see them one-by-one with the help of the following image:

Erosion : The basic idea of erosion is just like soil erosion: it erodes away
the boundaries of the foreground object (always try to keep the foreground in white).
So what does it do? The kernel slides through the image (as in 2D convolution).
A pixel in the original image (either 1 or 0) will be considered 1 only if all the
pixels under the kernel are 1; otherwise it is eroded (made zero).

So what happens is that all the pixels near the boundary will be discarded,
depending upon the size of the kernel. So the thickness or size of the foreground
object decreases, or simply the white region decreases in the image. It is useful for
removing small white noise (as we have seen in the colorspace chapter), detaching
two connected objects, etc.

Dilation : It is just the opposite of erosion. Here, a pixel element is '1' if at
least one pixel under the kernel is '1'. So it increases the white region in the
image, or the size of the foreground object increases. Normally, in cases like
noise removal, erosion is followed by dilation, because erosion removes white
noise but also shrinks our object, so we dilate it. Since the noise is gone, it
won't come back, but our object area increases. It is also useful in joining
broken parts of an object.
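A minimal sketch of erosion and dilation (the kernel size and file name are
assumptions):

import cv2
import numpy as np

image = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((5, 5), np.uint8)   # 5x5 structuring element

eroded = cv2.erode(image, kernel, iterations=1)
dilated = cv2.dilate(image, kernel, iterations=1)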

Opening : Opening is just another name for erosion followed by dilation. It
is useful in removing noise, as we explained above. Here we use the function
cv2.morphologyEx().

Closing :
Closing is the reverse of Opening: Dilation followed by Erosion. It is useful in
closing small holes inside the foreground objects, or small black points on the object.

Morphological Gradient : It is the difference between dilation and erosion


of an image.
The result will look like the outline of the object.
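A minimal sketch of opening, closing and the morphological gradient (file name is
an assumption):

import cv2
import numpy as np

image = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((5, 5), np.uint8)

opening = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)      # erosion then dilation
closing = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)     # dilation then erosion
gradient = cv2.morphologyEx(image, cv2.MORPH_GRADIENT, kernel) # dilation minus erosion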

4.3.4 Edge detection :
Edge detection is an image processing technique for finding the boundaries of
objects within images. It works by detecting discontinuities in brightness. Edge
detection is used for image segmentation and data extraction in areas such as
image processing, computer vision, and machine vision.
Common edge detection algorithms include Sobel, Canny, Prewitt, Roberts,
and fuzzy logic methods.
Laplacian : The Laplacian of an image highlights the areas of rapid changes
in intensity and can thus be used for edge detection. If we let I(x, y) represent
the intensities of an image, then the Laplacian of the image is given by the
following formula:

L(x, y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}

The discrete approximation of the Laplacian at a specific pixel can be deter-
mined by taking the weighted mean of the pixel intensities in a small neighbor-
hood of the pixel. The Laplacian operator:

K = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}
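A minimal sketch using the built-in cv2.Laplacian() (file name is an assumption):

import cv2

gray = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)
# CV_64F keeps the negative responses that an 8-bit output would clip
laplacian = cv2.Laplacian(gray, cv2.CV_64F)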

Canny Edge Detection : The Canny edge detector is an edge detection
operator that uses a multi-stage algorithm to detect a wide range of edges in
images. It was developed by John F. Canny in 1986. Canny also produced a
computational theory of edge detection explaining why the technique works.The
Canny edge detection algorithm is composed of 5 steps:
• Noise reduction;
• Gradient calculation;
• Non-maximum suppression;

• Double threshold;
• Edge Tracking by Hysteresis.
One last important thing to mention is that the algorithm is based on grayscale
pictures. Therefore, the prerequisite is to convert the image to grayscale before
following the above-mentioned steps.
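A minimal sketch (the two hysteresis thresholds and file name are assumptions):

import cv2

gray = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(gray, 100, 200)   # low threshold, high threshold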

4.3.5 Image Thresholding and segmentation :
Simple Thresholding : Here, the matter is straight-forward. For every pixel,
the same threshold value is applied. If the pixel value is smaller than the
threshold, it is set to 0, otherwise it is set to a maximum value. The func-
tion cv2.threshold is used to apply the thresholding. The first argument is the
source image, which should be a grayscale image. The second argument is the
threshold value which is used to classify the pixel values. The third argument is
the maximum value which is assigned to pixel values exceeding the threshold.
OpenCV provides different types of thresholding, which are given by the fourth
parameter of the function. Basic thresholding as described above is done by
using the type cv2.THRESH_BINARY. All simple thresholding types are:
• cv2.THRESH_BINARY
• cv2.THRESH_BINARY_INV

• cv2.THRESH_TRUNC
• cv2.THRESH_TOZERO
• cv2.THRESH_TOZERO_INV
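A minimal sketch of simple thresholding (the threshold value 127 and the file
name are assumptions):

import cv2

gray = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)

# returns the threshold actually used and the binarized image
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)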


Adaptive Thresholding :

We used one global value as a threshold. But this might not be good in
all cases, e.g. if an image has different lighting conditions in different areas.
In that case, adaptive thresholding can help. Here, the algorithm determines
the threshold for a pixel based on a small region around it. So we get different
thresholds for different regions of the same image, which gives better results for
images with varying illumination.
In addition to the parameters described above, the method cv2.adaptiveThreshold
takes three input parameters.
The adaptiveMethod decides how the threshold value is calculated:
• cv2.ADAPTIVE_THRESH_MEAN_C : The threshold value is the mean
of the neighbourhood area minus the constant C.

• cv2.ADAPTIVE_THRESH_GAUSSIAN_C : The threshold value is a
Gaussian-weighted sum of the neighbourhood values minus the constant
C.

The blockSize determines the size of the neighbourhood area, and C is a constant
that is subtracted from the mean or weighted sum of the neighbourhood pixels.
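A minimal sketch (blockSize = 11, C = 2 and the file name are assumptions):

import cv2

gray = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)
adaptive = cv2.adaptiveThreshold(gray, 255,
                                 cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY,
                                 11,  # blockSize: size of the neighbourhood area
                                 2)   # C: constant subtracted from the weighted sum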

4.3.6 Image Histogram :
You can consider histogram as a graph or plot, which gives you an overall idea
about the intensity distribution of an image. It is a plot with pixel values (rang-
ing from 0 to 255, not always) in X-axis and corresponding number of pixels in
the image on Y-axis. It is just another way of understanding the image. By
looking at the histogram of an image, you get intuition about the contrast, bright-
ness, intensity distribution, etc. of that image. Almost all image processing tools
today provide features on histograms.
Histogram Terminology :
Now that we have an idea of what a histogram is, we can look into how to find it.
Both OpenCV and NumPy come with built-in functions for this. Before using
those functions, we need to understand some terminology related to histograms.
• Bins : The histogram above shows the number of pixels for every pixel
value, from 0 to 255. In fact, we used 256 values (bins) to show the
histogram. It could be 8, 16, 32 etc. OpenCV uses histSize to refer to
bins.
• Dims : It is the number of parameters for which we collect the data. In
this case, we collect data regarding only one thing, intensity value. So
here it is 1.

• Range : It is the range of intensity values you want to measure. Normally,
it is [0, 256], i.e. all intensity values.
Histogram Calculation in OpenCV : We will use cv2.calcHist() function
to find the histogram. Let’s familiarize with the function and its parameters :
cv2.calcHist(images, channels, mask, histSize, ranges)
• images : it is the source image, of type uint8 or float32. It should be given
in square brackets, i.e. “[img]”.
• channels : it is also given in square brackets. It is the index of the channel for
which we calculate the histogram. For example, if the input is a grayscale image,
its value is [0]. For a color image, you can pass [0], [1] or [2] to calculate the
histogram of the blue, green or red channel respectively.
• mask : mask image. To find the histogram of the full image, it is given as “None”.
But if you want to find the histogram of a particular region of the image, you have
to create a mask image for that and give it as the mask. (I will show an
example later.)
• histSize : this represents our BIN count. It needs to be given in square
brackets. For full scale, we pass [256].
• ranges : this is our RANGE of intensity values. Normally, it is [0, 256].
Alternatively, we can use NumPy for the histogram; it makes the code more concise.
NumPy has a special function to compute histograms, np.histogram(). The arguments
of the routine are the input image, the number of bins, and the range of the bins.
It returns an array with the histogram values and the edge values of the bins.
Plotting Histograms :
There are two ways to do this:
• Short Way : using Matplotlib plotting functions
• Long Way : using OpenCV drawing functions
Matplotlib comes with a histogram plotting function: matplotlib.pyplot.hist().
It directly finds the histogram and plots it. We need not use the cv2.calcHist()
or np.histogram() function to find the histogram.
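A minimal sketch of both ways (the file name is an assumption):

import cv2
from matplotlib import pyplot as plt

gray = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)

# OpenCV: histogram of channel [0] over the full image, 256 bins, range [0, 256]
hist = cv2.calcHist([gray], [0], None, [256], [0, 256])

# Matplotlib: computes and plots the histogram in one call
plt.hist(gray.ravel(), 256, [0, 256])
plt.show()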

Or we can use the normal plot function of Matplotlib, which is good for plotting
the BGR channels. For that, you need to find the histogram data first.

4.3.7 Template Matching with Multiple Objects :
Template matching is a method for searching for and finding the location of a tem-
plate image in a larger image. OpenCV comes with a function, cv2.matchTemplate(),
for this purpose. It simply slides the template image over the input image (as in
2D convolution) and compares the template with the patch of the input image under
the template image. Several comparison methods are implemented in OpenCV. It
returns a grayscale image, where each pixel denotes how well the neighbourhood of
that pixel matches the template.
If the input image is of size (WxH) and the template image is of size (wxh), the
output image will have a size of (W-w+1, H-h+1). Once you have the result, you can
use the cv2.minMaxLoc() function to find where the maximum/minimum value is.
Take it as the top-left corner of the rectangle and take (w,h) as the width and
height of the rectangle. That rectangle is your region of template.
Here, as an example, we will search for Mario’s coins in his photo. So I created
a template as below:
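A minimal sketch of multi-object matching (the file names and the 0.8 score
threshold are assumptions):

import cv2
import numpy as np

img = cv2.imread('mario.png')
template = cv2.imread('coin.png')
h, w = template.shape[:2]

res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
# keep every location whose score exceeds the threshold -> multiple matches
loc = np.where(res >= 0.8)
for pt in zip(*loc[::-1]):   # switch from (row, col) to (x, y)
    cv2.rectangle(img, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)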

4.4 Feature Detection and Description :
Feature detection plays a crucial role in image registration. There exist quite a
few feature detection algorithms in the literature, like BRISK, FAST, SURF, etc. [14].
Each of these algorithms has its own advantages and disadvantages. BRISK
is rotation and scale invariant, but it takes more time to detect the feature
points. On the other hand, FAST, as the name suggests, takes less time to
detect the key points, but it is not scale invariant. To overcome the demerits
of the BRISK and FAST feature detection algorithms, the authors of [14] propose a
hybrid feature detection algorithm which consumes less time to detect the feature
key points and is also rotation and scale invariant. They also present a
comparative analysis of BRISK, FAST and the proposed algorithm in terms of the
time needed to detect feature points, taking five feature key points in every
remote-sensing image; their results show that the hybrid feature detector takes
less time to detect the five feature points.

4.4.1 What are keypoints ?


When we build object recognition systems, we need to detect the interesting
regions to create a signature for the image. These interesting regions are charac-
terized by keypoints. This is why keypoint detection is critical in many modern
computer vision systems.
So keypoints refer to the interesting regions in the image; it means that
something is happening in that region. If the region is just uniform, then it's
not very interesting. For example, corners are interesting because there is a
sharp change in intensity in two different directions. Each corner is a unique
point where two edges meet.

4.4.2 Harris Corner Detection :


Harris Corner Detector is a corner detection operator that is commonly used
in computer vision algorithms to extract corners and infer features of an im-
age. It was first introduced by Chris Harris and Mike Stephens in 1988 upon
the improvement of Moravec’s corner detector. Compared to the previous one,
Harris’ corner detector takes the differential of the corner score into account
with reference to direction directly, instead of using shifting patches at every
45-degree angle, and has been proved to be more accurate in distinguishing
between edges and corners. Since then, it has been improved and adopted in
many algorithms to preprocess images for subsequent applications.

OpenCV has the function cv2.cornerHarris() for this purpose. Its argu-
ments are :

• img : Input image, it should be grayscale and float32 type.


• blockSize : It is the size of the neighbourhood considered for corner detection.
• ksize : Aperture parameter of the Sobel derivative used.
• k : Harris detector free parameter in the equation.
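A minimal sketch (the parameter values follow common tutorial defaults and are
assumptions):

import cv2
import numpy as np

img = cv2.imread('index.png')
gray = np.float32(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

dst = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
img[dst > 0.01 * dst.max()] = [0, 0, 255]   # mark the strong corners in red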

4.4.3 Shi-Tomasi Corner Detector and Good Features to Track :
The Shi-Tomasi corner detector is based entirely on the Harris corner detector.
However, one slight variation in a ”selection criteria” made this detector much
better than the original. It works quite well where even the Harris corner de-
tector fails. So here’s the minor change that Shi and Tomasi did to the original
Harris corner detector.
The change :
The Harris corner detector has a corner selection criterion. A score is calculated
for each pixel, and if the score is above a certain value, the pixel is marked as
a corner. The score is calculated using two eigenvalues. That is, you give the
two eigenvalues to a function; the function manipulates them and gives back
a score.
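A minimal sketch using cv2.goodFeaturesToTrack() (the parameter values are
assumptions):

import cv2
import numpy as np

img = cv2.imread('index.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# up to 25 corners, quality level 0.01, minimum distance 10 px between corners
corners = cv2.goodFeaturesToTrack(gray, 25, 0.01, 10)
for x, y in np.int32(corners).reshape(-1, 2):
    cv2.circle(img, (int(x), int(y)), 3, (0, 255, 0), -1)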

4.5 Face Detection using Haar Cascades :
Face detection using Haar cascades is a machine learning based approach where
a cascade function is trained with a set of input data.

OpenCV already contains many pre-trained classifiers for faces, eyes, smiles,
etc. We will be using the face classifier; you can experiment with other classifiers
as well.

4.5.1 Basic concept of HAAR cascade algorithm :


Object Detection using Haar feature-based cascade classifiers is an effective
method proposed by Paul Viola and Michael Jones in the 2001 paper, ”Rapid
Object Detection using a Boosted Cascade of Simple Features”. It is a machine
learning based approach in which a cascade function is trained from a lot of
positive and negative images. It is then used to detect objects in other images.
Here we will work with face detection. Initially, the algorithm needs a lot
of positive images (images of faces) and negative images (images without faces)
to train the classifier. Then we need to extract features from it. For this, Haar
features shown in below image are used. They are just like our convolutional
kernel. Each feature is a single value obtained by subtracting the sum of pixels
under the white rectangle from the sum of pixels under the black rectangle.

Now all possible sizes and locations of each kernel are used to calculate plenty
of features. For each feature calculation, we need to find the sum of the pixels
under the white and black rectangles. To solve this, they introduced the integral
image. It simplifies the calculation of the sum of the pixels, however large the
number of pixels may be, to an operation involving just four pixels.

4.5.2 Haar-cascade Detection in OpenCV :


Here we will deal with detection. OpenCV already contains many pre-trained
classifiers for faces, eyes, smiles, etc. Those XML files are stored in the
opencv/data/haarcascades/ folder. Let's create a face and eye detector with OpenCV.

We use the function:

detectMultiScale(image, scaleFactor = 1.1, minNeighbors = 3, flags = 0,
minSize = (0, 0), maxSize = (0, 0))
Parameters

• image : matrix of type CV_8U containing the image in which objects are
detected.
• object : vector of rectangles where each rectangle contains the detected
object. The rectangles may be partially outside the original image.
• scaleFactor : parameter specifying how much the image size is reduced at
each image scale.
• minNeighbors : parameter specifying how many neighbors each candidate
rectangle should have to retain it.

• flags : parameter with the same meaning as for an old cascade in the
function cvHaarDetectObjects. It is not used for a new cascade.
• minSize : minimum possible object size. Objects smaller than this are
ignored.

• maxSize : maximum possible object size. Objects larger than this are
ignored. If maxSize == minSize, the model is evaluated on a single scale.
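A minimal sketch of a face and eye detector (the XML paths and image name are
assumptions; adjust them to your installation):

import cv2

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')

img = cv2.imread('faces.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
    roi = gray[y:y + h, x:x + w]          # search for eyes inside each face
    for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(roi):
        cv2.rectangle(img, (x + ex, y + ey), (x + ex + ew, y + ey + eh),
                      (0, 255, 0), 2)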


5 Linear Filters
5.1 Convolution
Convolution involving one-dimensional signals is referred to as 1D convolution
or just convolution. Otherwise, if the convolution is performed between two
signals spanning along two mutually perpendicular dimensions (i.e., if signals
are two-dimensional in nature), then it will be referred to as 2D convolution.
This concept can be extended to involve multi-dimensional signals due to which
we can have multi-dimensional convolution.
This kind of operation is extensively used in the field of digital image processing
wherein the 2D matrix representing the image will be convolved with a compar-
atively smaller matrix called 2D kernel.
The kernel is called the image filter, and the process of applying this kernel to
the given image is called image filtering. The output obtained after applying
the kernel to the image is called the filtered image.
If f is the image that we want to filter and g the filter, then:

f(x, y) * g(x, y) = \mathcal{F}^{-1}\Big( \mathcal{F}(f(x, y)) \cdot \underbrace{\mathcal{F}(g(x, y))}_{G(u,v)} \Big)

where G is the transfer function of the filter. We will present the filters in the
discrete case: x and y are the coordinates of the pixels, and f takes integer
values (between 0 and 255). There are three types of filters:
1. Low Pass Filter : decreases the noise but attenuates the details of the
image.
2. High Pass Filter : increases the contours and the details but also increases
the noise.
3. Band Pass Filter : deletes some unwanted frequencies.
We generally don't do any global convolution but a local transformation based
on the neighborhood of the point (x, y).

Local Convolution
The convolution kernel k of the filter has compact support included in
[x_1, x_2] \times [y_1, y_2]:

g(x, y) = (f * k)(x, y) = \sum_{i=x_1}^{x_2} \sum_{j=y_1}^{y_2} f(x - i, y - j)\, k(i, j)

5.2 Low Pass Filter


5.2.1 Mean Filter
Mean filtering is a simple, intuitive and easy to implement method of smoothing
images, i.e. reducing the amount of intensity variation between one pixel and
the next. It is often used to reduce noise in images.
The idea of mean filtering is simply to replace each pixel value in an image
with the mean (‘average’) value of its neighbors, including itself. This has the ef-
fect of eliminating pixel values which are unrepresentative of their surroundings.
Mean filtering is usually thought of as a convolution filter. Like other convo-
lutions it is based around a kernel, which represents the shape and size of the
neighborhood to be sampled when calculating the mean. Often a 3x3 square ker-
nel is used, although larger kernels (e.g. 5x5 squares) can be used for more severe
smoothing. (Note that a small kernel can be applied more than once in order to
produce a similar, but not identical, effect as a single pass with a large kernel.)

5.2.2 Gaussian filter

The kernel coefficients are samples of the 2D Gaussian
G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}.
If, for example, σ = 0.8, we obtain the 3x3 filter:

\begin{pmatrix} G(-1,-1) & G(0,-1) & G(1,-1) \\ G(-1,0) & G(0,0) & G(1,0) \\ G(-1,1) & G(0,1) & G(1,1) \end{pmatrix} \approx \frac{1}{16} \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix}

If σ = 1, for a 5x5 filter we obtain:

\frac{1}{300} \begin{pmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 18 & 30 & 18 & 4 \\ 6 & 30 & 48 & 30 & 6 \\ 4 & 18 & 30 & 18 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{pmatrix}

Ideally, a filter of size (6σ + 1) x (6σ + 1) is used. Generally, the Gaussian
filter with σ < 1 is used to reduce noise, and if σ > 1 it is used to produce an
image from which a personalized "blur mask" can be made. Note that the larger σ
is, the stronger the blurring applied to the image.
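A minimal sketch that reproduces the σ = 0.8 kernel above with
cv2.getGaussianKernel() (the file name is an assumption):

import cv2
import numpy as np

g = cv2.getGaussianKernel(3, 0.8)   # 1D Gaussian, size 3, sigma 0.8
kernel_2d = g @ g.T                 # outer product gives the separable 2D kernel
print(np.round(kernel_2d * 16))     # close to [[1,2,1],[2,4,2],[1,2,1]]

image = cv2.imread('index.png')
smoothed = cv2.filter2D(image, -1, kernel_2d)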

5.2.3 Binomial filter


The coefficients of this filter are obtained from Newton's binomial expansion. The
1D binomial filter of order 4 is the filter given by the vector
v = \frac{1}{16} (1, 4, 6, 4, 1). The 2D binomial filter of order 4 is the
separable filter given by v^t v:

\frac{1}{256} \begin{pmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{pmatrix}

These filters are low-pass filters: they soften the details of the image (and there-
fore the additive noise), but by eroding the edges they add blurring to the image.
We will see in a later section how to reduce blurring.
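A minimal sketch exploiting separability with cv2.sepFilter2D() (the file name
is an assumption):

import cv2
import numpy as np

v = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16   # 1D binomial of order 4

image = cv2.imread('index.png')
# filter the rows with v, then the columns with v: equivalent to the 5x5 kernel
smoothed = cv2.sepFilter2D(image, -1, v, v)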

5.3 High Pass Filter


5.3.1 DOG Filter
A Gaussian filter can be seen as a low-pass filter (only low frequencies are
retained), hence the blurring effect observed in the "Gaussian blur" filter. By
subtracting two Gaussians, we then obtain the equivalent of a band-pass filter:
the coefficients are the result of the subtraction of two Gaussians of different
variances, G_{\sigma_1} and G_{\sigma_2}:

K = G_{\sigma_1} - G_{\sigma_2}

where each 3x3 kernel is obtained by sampling the Gaussian
G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}} at x, y \in \{-1, 0, 1\}:

G_\sigma = \frac{1}{2\pi\sigma^2} \begin{pmatrix} e^{-1/\sigma^2} & e^{-1/(2\sigma^2)} & e^{-1/\sigma^2} \\ e^{-1/(2\sigma^2)} & 1 & e^{-1/(2\sigma^2)} \\ e^{-1/\sigma^2} & e^{-1/(2\sigma^2)} & e^{-1/\sigma^2} \end{pmatrix}
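A minimal sketch of a difference of Gaussians (the two sigmas and the file name
are assumptions):

import cv2

gray = cv2.imread('index.png', cv2.IMREAD_GRAYSCALE)

# ksize (0, 0) lets OpenCV derive the kernel size from each sigma
g1 = cv2.GaussianBlur(gray, (0, 0), sigmaX=1.0)
g2 = cv2.GaussianBlur(gray, (0, 0), sigmaX=2.0)
dog = cv2.subtract(g1, g2)   # band-pass response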

5.3.2 Laplacian of Gaussian (LoG)
This filter is defined by the function:

K(x, y) = \Delta G_\sigma(x, y) = \frac{x^2 + y^2 - 2\sigma^2}{2\pi\sigma^6}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}

6 Non Linear Filters


6.1 Median
The median filter is a nonlinear, statistics-based signal processing technique.
Each noisy value of the digital image or sequence is replaced by the median
value of its neighborhood (mask). The pixels of the mask are ranked in
order of their gray levels, and the median value of the group replaces
the noisy value. The median filtering output is g(x, y) = med{f(x − i, y − j) : (i, j) ∈ W},
where f(x, y) and g(x, y) are the original image and the output image respectively,
and W is the two-dimensional mask. The mask size is n × n (where n is commonly
odd), such as 3 × 3 or 5 × 5; the mask shape may be linear, square, circular, cross, etc.
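OpenCV provides cv2.medianBlur for square masks with odd n; a minimal sketch (the input path is a placeholder):

import cv2

img = cv2.imread('noisy.png')       # placeholder input
med3 = cv2.medianBlur(img, 3)       # 3x3 median mask
med5 = cv2.medianBlur(img, 5)       # 5x5 median mask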

6.2 Maximum Filter


The maximum filter replaces each pixel value of a digital image with the maximum
value (i.e., the value of the brightest pixel) of its neighborhood pixel window.
It is the opposite of what the minimum filter does to an image.
Applying the maximum filter removes the negative outlier noise present in a
digital image.
When a maximum filter is applied, the darker objects present in the image are
eroded; for this reason the maximum filter is sometimes called an erosion filter. With
respect to the lighter pixels, some call it a dilation filter.
To be more specific, brighter objects are dilated and darker objects are eroded
when a maximum filter is applied to a digital image. Let's apply the filter to the
Lena image (a code sketch covering both the maximum and minimum filters follows Figure 2):

Figure 1: Maximum filter. (a) Original, (b) Gaussian noise, (c) Maximum 3×3, (d) Maximum 5×5

6.3 Minimum Filter


When the minimum filter is applied to a digital image, it picks up the minimum
value of the neighborhood pixel window and assigns it to the current pixel. A
pixel with the minimum value is the darkest among the pixels present in the pixel
window.
The dark values present in an image are enhanced by the minimum filter.
The minimum filter is also called a dilation filter (with respect to dark structures):
when it is applied, the boundaries of dark objects present in an image are extended.
The minimum filter is one of the morphological filters; other morphological
filters include the maximum filter and the median filter.
The minimum filter removes any positive outlier noise present in a digital image.
Let's apply the filter to the Lena image:

Figure 2: Minimum filter. (a) Original, (b) Gaussian noise, (c) Minimum 3×3, (d) Minimum 5×5
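Since, for grayscale images, the maximum filter is a morphological dilation and the minimum filter an erosion, both can be sketched with OpenCV as follows (the window size and path are assumptions):

import cv2
import numpy as np

img = cv2.imread('lena.png', cv2.IMREAD_GRAYSCALE)   # placeholder path
win = np.ones((3, 3), np.uint8)                       # 3x3 pixel window
max_filtered = cv2.dilate(img, win)   # local maximum over the window
min_filtered = cv2.erode(img, win)    # local minimum over the window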

7 Directional Filters
7.1 What is a Gradient of the image?
The gradient of an image is simply the change of intensity of the image colors in
the X direction, the Y direction, or both.
We can find the gradient of an image with the help of the Sobel and Laplacian
derivatives of the image: Sobel is used for either the X or Y direction, or both
combined, while the Laplacian acts in both directions.

7.2 Laplacian
The Laplacian is a 2-D isotropic measure of the 2nd spatial derivative of an
image. The Laplacian of an image highlights regions of rapid intensity change
and is therefore often used for edge detection (see zero crossing edge detectors).
The Laplacian is often applied to an image that has first been smoothed with
something approximating a Gaussian smoothing filter in order to reduce its sen-
sitivity to noise, and hence the two variants will be described together here. The
operator normally takes a single graylevel image as input and produces another
graylevel image as output.

7.3 Filters
7.3.1 Directional Filter
The approximation of ∂I/∂x is done by convolution with

hx =
[ 0 −1 0 ]
[ 0  1 0 ]
[ 0  0 0 ]

and that of ∂I/∂y by convolution with

hy =
[  0 0 0 ]
[ −1 1 0 ]
[  0 0 0 ]

which means:
• ∂I/∂x (i, j) = −I(i − 1, j) + I(i, j)
• ∂I/∂y (i, j) = −I(i, j − 1) + I(i, j)
Let's apply this filter to the coins image:

Figure 3: Directional filter. (a) Original, (b) Ix, (c) Iy
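These two kernels can be applied with cv2.filter2D, which computes a correlation and therefore matches the two difference formulas above directly; a minimal sketch (the path is a placeholder):

import cv2
import numpy as np

img = cv2.imread('coins.png', cv2.IMREAD_GRAYSCALE)   # placeholder path
hx = np.array([[0, -1, 0], [0, 1, 0], [0, 0, 0]], np.float32)
hy = np.array([[0, 0, 0], [-1, 1, 0], [0, 0, 0]], np.float32)
Ix = cv2.filter2D(img, cv2.CV_32F, hx)   # approximation of dI/dx
Iy = cv2.filter2D(img, cv2.CV_32F, hy)   # approximation of dI/dy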

7.3.2 Prewitt Operator
The Prewitt operator is used for edge detection in an image. It detects two types of
edges:
• Horizontal edges

• Vertical Edges
Edges are calculated using the difference between corresponding pixel intensities
of an image. All the masks used for edge detection are also known as
derivative masks. As we have seen before, an image is also a signal, and changes
in a signal can only be calculated using differentiation; that is why these
operators are also called derivative operators or derivative masks.
The Prewitt operator provides two masks, one for detecting edges in the horizontal
direction and another for detecting edges in the vertical direction.
Vertical direction:
hx =
[ −1 0 1 ]
[ −1 0 1 ]
[ −1 0 1 ]
When we apply this mask to the image, it emphasizes vertical edges. It simply
works like a first-order derivative and calculates the difference of pixel intensities
in an edge region. Since the center column is zero, it does not include the
original values of the image but rather calculates the difference of the right and
left pixel values around the edge. This increases the edge intensity, which appears
enhanced compared to the original image.
Horizontal direction:
hy =
[ −1 −1 −1 ]
[  0  0  0 ]
[  1  1  1 ]
This mask emphasizes the horizontal edges in an image. It works on the same
principle as the mask above and calculates the difference between the pixel intensities
around a particular edge. Since the center row of the mask consists of zeros, it does not
include the original edge values of the image but rather calculates the difference
of the pixel intensities above and below the particular edge, thus amplifying
the sudden change of intensities and making the edge more visible. Both
masks follow the principle of a derivative mask: they contain coefficients of opposite
sign, and the coefficients of each mask sum to zero. The third property of derivative
masks (increasing the weights to strengthen the edge response) does not apply here,
since both masks are standardized and we cannot change their values.
Now it's time to see these masks in action:

Figure 4: Prewitt filter. (a) Original, (b) Prewitt X, (c) Prewitt Y, (d) Prewitt (combined)

The source code is provided later in the document.

There is an easier way: we can apply the Prewitt filter in a few lines with OpenCV:

Figure 5: Prewitt filter using OpenCV
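Figure 5 shows the OpenCV version; here is a plausible minimal equivalent (a sketch, not necessarily the authors' exact code):

import cv2
import numpy as np

img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)
kx = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], np.float32)   # vertical edges
ky = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], np.float32)   # horizontal edges
gx = cv2.filter2D(img, cv2.CV_32F, kx)
gy = cv2.filter2D(img, cv2.CV_32F, ky)
prewitt = cv2.magnitude(gx, gy)   # combined edge strength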

7.3.3 Roberts Operator


The Roberts cross operator is used in image processing and computer vision for
edge detection. It was one of the first edge detectors and was initially proposed
by Lawrence Roberts in 1963 [1]. As a differential operator, the idea behind the
Roberts cross operator is to approximate the gradient of an image through discrete
differentiation, achieved by computing the sum of the squares of the
differences between diagonally adjacent pixels.

Figure 6: Roberts filter. (a) Original, (b) Roberts X, (c) Roberts Y, (d) Roberts (combined)

The source code is provided later in the document.

There is an easier way: we can apply the Roberts filter in a few lines with OpenCV:

Figure 7: Roberts filter using OpenCV
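Similarly, a plausible minimal OpenCV version of the Roberts cross (a sketch, not necessarily the authors' exact code):

import cv2
import numpy as np

img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)
kx = np.array([[1, 0], [0, -1]], np.float32)    # first diagonal difference
ky = np.array([[0, 1], [-1, 0]], np.float32)    # second diagonal difference
gx = cv2.filter2D(img, cv2.CV_32F, kx)
gy = cv2.filter2D(img, cv2.CV_32F, ky)
roberts = cv2.magnitude(gx, gy)   # gradient magnitude from the two diagonals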

7.3.4 Kirsch Filter

The Kirsch compass mask is also a derivative mask used for finding edges.
Like the Robinson compass masks, it finds edges in all eight directions of a
compass. The only difference between the Robinson and Kirsch compass masks is
that Robinson uses a fixed standard mask, while in Kirsch we change the mask
according to our own requirements.
With the help of the Kirsch compass masks we can find edges in the following
eight directions:
• North
• North West
• West
• South West
• South
• South East
• East
• North East

We take a standard mask that follows all the properties of a derivative mask
and then rotate it to find the edges.
For example, let's take the following mask, which is in the North direction, and
rotate it to produce all the direction masks (a code sketch that generates the
masks by rotation follows Figure 8).
North Direction Mask
[ −3 −3 5 ]
[ −3  0 5 ]
[ −3 −3 5 ]

North West Direction Mask
[ −3  5  5 ]
[ −3  0  5 ]
[ −3 −3 −3 ]

West Direction Mask
[  5  5  5 ]
[ −3  0 −3 ]
[ −3 −3 −3 ]

South West Direction Mask
[  5  5 −3 ]
[  5  0 −3 ]
[ −3 −3 −3 ]

South Direction Mask
[ 5 −3 −3 ]
[ 5  0 −3 ]
[ 5 −3 −3 ]

South East Direction Mask
[ −3 −3 −3 ]
[  5  0 −3 ]
[  5  5 −3 ]

East Direction Mask
[ −3 −3 −3 ]
[ −3  0 −3 ]
[  5  5  5 ]

North East Direction Mask
[ −3 −3 −3 ]
[ −3  0  5 ]
[ −3  5  5 ]
As you can see, all the directions are covered and each mask gives the
edges in its own direction. To help you better understand these masks, we will
apply them to a real image: suppose we have a sample picture from which we
have to find all the edges. We now apply all the masks above:

Figure 8: Kirsch filter. (a) Original, (b)–(i) responses in the eight directions (N, NW, W, SW, S, SE, E, NE)
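A minimal sketch that generates the eight Kirsch masks by rotating the border of the North mask and keeps, per pixel, the strongest response (the path is a placeholder):

import cv2
import numpy as np

img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
# clockwise positions of the 3x3 border, starting at the top-left corner
border = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
north = np.array([-3, -3, 5, 5, 5, -3, -3, -3], np.float32)  # border values of the North mask
responses = []
for k in range(8):
    mask = np.zeros((3, 3), np.float32)
    for (i, j), v in zip(border, np.roll(north, -k)):  # rotate the border by one step per direction
        mask[i, j] = v
    responses.append(cv2.filter2D(img, -1, mask))
edges = np.max(responses, axis=0)   # strongest edge over the eight directions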

7.3.5 Sobel Filter
The Sobel operator is an approximation to the derivative of an image, computed
separately in the x and y directions. We use a 3×3 kernel for each direction. The
gradient kernel for the x-direction has negative numbers on the left-hand side and
positive numbers on the right-hand side, while preserving a little of the center
pixels. Similarly, the gradient kernel for the y-direction has negative numbers in
the top row and positive numbers in the bottom row, preserving a little of the
middle-row pixels.

Gx =
[ −1 0 1 ]
[ −2 0 2 ]
[ −1 0 1 ]

Gy =
[ −1 −2 −1 ]
[  0  0  0 ]
[  1  2  1 ]

Figure 9: Sobel filter applied to the original image
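OpenCV implements this operator directly as cv2.Sobel; a minimal sketch (the path is a placeholder):

import cv2

img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)   # derivative in the x direction (Gx)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)   # derivative in the y direction (Gy)
mag = cv2.magnitude(gx, gy)                      # gradient magnitude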

7.4 Gaussian Blur


The Gaussian smoothing operator is a 2-D convolution operator that is used
to ‘blur’ images and remove detail and noise. In this sense it is similar to the
mean filter, but it uses a different kernel that represents the shape of a Gaussian
(‘bell-shaped’) hump. This kernel has some special properties which are detailed
below.
In 2-D, an isotropic (i.e. circularly symmetric) Gaussian has the form:
G(x, y) = (1/(2πσ²)) e^{−(x²+y²)/(2σ²)}

Let's apply the filter to the coins image:

Figure 10: Gaussian filter. (a) Original, (b) Gaussian blur

We can apply the Gaussian filter in a few lines with OpenCV:

Figure 11: Gaussian filter using OpenCV
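Figure 11 shows the OpenCV version; here is a plausible minimal equivalent (the kernel size, σ, and path are assumptions):

import cv2

img = cv2.imread('coins.png')                     # placeholder path
blurred = cv2.GaussianBlur(img, (5, 5), 1.0)      # 5x5 kernel, sigma = 1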

7.4.1 Laplacian Filter


The Laplacian of an image highlights the areas of rapid change in intensity and
can thus be used for edge detection. If we let I(x, y) represent the intensities of
an image, then the Laplacian of the image is given by the following formula:

L(x, y) = ∂²I/∂x² + ∂²I/∂y²

The discrete approximation of the Laplacian at a specific pixel can be determined
by taking the weighted mean of the pixel intensities in a small neighborhood of
the pixel. The Laplacian operator:

D =
[ 0  1 0 ]
[ 1 −4 1 ]
[ 0  1 0 ]
Let's apply this filter to the coins image:

Figure 12: Laplacian filter. (a) Original, (b) Laplacian
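A minimal OpenCV sketch, smoothing first to reduce noise sensitivity as discussed in section 7.2 (the path is a placeholder):

import cv2

img = cv2.imread('coins.png', cv2.IMREAD_GRAYSCALE)
smoothed = cv2.GaussianBlur(img, (3, 3), 0)        # pre-smoothing reduces noise sensitivity
lap = cv2.Laplacian(smoothed, cv2.CV_64F)          # apply the operator D
edges = cv2.convertScaleAbs(lap)                   # back to 8-bit for display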

8 Image Restoration
When we refer to image restoration problems, we basically mean that we
have a degraded image and we want to recover the clean, non-degraded image.
There could be many reasons for an image to get degraded; mainly, degradation
may occur during image transmission, formation, and storage.
There are many tasks in image restoration; let's talk about three main ones:

8.1 Image deblurring


Deblurring is the process of removing blurring effects from images, caused for
example by defocus aberration or motion blur.
In the forward model, such a blurring effect is typically modelled as a 2-dimensional
convolution between the so-called point-spread function and a target sharp input
image, where the sharp input image (which has to be recovered) is unknown and
the point-spread function can be either known or unknown.

Figure 13: Deblurring

8.2 Image in-painting


Image in-painting is the process of reconstructing lost or deteriorated parts of
images and videos. This technique is often used to remove unwanted objects
from an image or to restore damaged portions of old photos. The figure below
shows an example in-painting result.

Figure 14: In-painting

8.3 Super resolution


Super resolution is the process of upscaling and/or improving the details within
an image. Often a low-resolution image is taken as input and the same image
is upscaled to a higher resolution, which is the output. The details in the
high-resolution output are filled in where they are essentially unknown.
Super resolution is essentially what we see in films and series like CSI, where
someone zooms into an image and it improves in quality and the details just
appear.
Examples of ×2 super resolution
The following are examples of ×2 super resolution (doubling the image size)
from the same model trained on the Div2K dataset, 800 high-resolution images
from a variety of subject-matter categories. Although this version of the model
was trained on a generic-category data set, it has managed to improve this image
well: look closely at the added detail in the face, the hair, the folds of the
clothes, and all of the background.

Figure 15: Super resolution, example 1

Figure 16: Super resolution, example 2

8.4 The Wiener filter


The Wiener filter is the MSE-optimal stationary linear filter for images degraded
by additive noise and blurring. Calculation of the Wiener filter requires the
assumption that the signal and noise processes are second-order stationary (in the
random-process sense). For this description, only noise processes with zero
mean will be considered (this is without loss of generality).
Wiener filters are usually applied in the frequency domain. Given a degraded
image x(n, m), one takes the Discrete Fourier Transform (DFT) to obtain X(u, v).
The original image spectrum is estimated by taking the product of X(u, v) with
the Wiener filter G(u, v):

S(u, v) = G(u, v)X(u, v)

The inverse DFT is then used to obtain the image estimate from its spectrum.
The Wiener filter is defined in terms of the following spectra:

• H(u, v): Fourier transform of the point-spread function (PSF)

• Ps(u, v): power spectrum of the signal process, obtained by taking the
Fourier transform of the signal autocorrelation

• Pn(u, v): power spectrum of the noise process, obtained by taking the
Fourier transform of the noise autocorrelation

The Wiener filter is:

G(u, v) = H*(u, v) Ps(u, v) / (|H(u, v)|² Ps(u, v) + Pn(u, v))

Dividing through by Ps makes its behaviour easier to explain:

G(u, v) = H*(u, v) / (|H(u, v)|² + Pn(u, v)/Ps(u, v))

The term Pn/Ps can be interpreted as the reciprocal of the signal-to-noise
ratio. Where the signal is very strong relative to the noise, Pn/Ps ≈ 0 and the
Wiener filter becomes H⁻¹(u, v), the inverse filter for the PSF. Where the signal
is very weak, Pn/Ps → ∞ and G(u, v) → 0.
For the case of additive white noise and no blurring, the Wiener filter simplifies to:

G(u, v) = Ps(u, v) / (Ps(u, v) + σn²)

where σn² is the noise variance.

After this brief introduction to the Wiener filter, let's apply it:

Figure 17: The Wiener filter
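A minimal frequency-domain sketch following the formulas above, where the unknown ratio Pn/Ps is replaced by a hand-chosen constant nsr (the function name, the PSF, and nsr are all assumptions):

import numpy as np

def wiener_deconvolve(x, psf, nsr=0.01):
    # G(u, v) = H*(u, v) / (|H(u, v)|^2 + Pn/Ps), with Pn/Ps approximated by the constant nsr
    H = np.fft.fft2(psf, s=x.shape)
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(np.fft.fft2(x) * G))

With no blurring (a PSF equal to a single 1 at the origin), this reduces to the simplified additive-noise form above.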

8.5 The Perona-Malik equation


Perona and Malik propose a nonlinear diffusion method for avoiding the blur-
ring and localization problems of linear diffusion filtering [326, 328]. They apply
an inhomogeneous process that reduces the diffusivity at those locations which
have a larger likelihood to be edges. This likelihood is measured by |∇u|2 . The
Perona–Malik filter is based on the equation

∂t u = div(g(|∇u|2 )∇u)
and it uses diffusivities such as

g(s²) = 1 / (1 + s²/λ²),   λ > 0
Although Perona and Malik name their filter anisotropic, it should be noted that
– in our terminology – it would be regarded as an isotropic model.
The flux function is defined as

φ(s) = sg(s2 )

Figure 18: Diffusivity function, the corresponding flux function, and its derivative

Let's apply the filter to a noisy image:

Figure 19: The Perona-Malik Filter
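A minimal explicit-scheme sketch of the Perona–Malik equation with the diffusivity g above (the step size dt, λ, and the iteration count are assumptions; borders are handled periodically via np.roll for brevity):

import numpy as np

def perona_malik(u0, n_iter=50, dt=0.2, lam=15.0):
    u = u0.astype(np.float64).copy()
    for _ in range(n_iter):
        # differences toward the four neighbours
        dN = np.roll(u, 1, axis=0) - u
        dS = np.roll(u, -1, axis=0) - u
        dE = np.roll(u, -1, axis=1) - u
        dW = np.roll(u, 1, axis=1) - u
        # diffusivity g(s^2) = 1 / (1 + s^2 / lambda^2) on each directional difference
        cN = 1.0 / (1.0 + (dN / lam) ** 2)
        cS = 1.0 / (1.0 + (dS / lam) ** 2)
        cE = 1.0 / (1.0 + (dE / lam) ** 2)
        cW = 1.0 / (1.0 + (dW / lam) ** 2)
        u += dt * (cN * dN + cS * dS + cE * dE + cW * dW)   # explicit update step
    return u

The step size dt ≤ 0.25 keeps this explicit scheme stable.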

https://www.sunpower-uk.com/glossary/what-is-a-frequency-filter/
https://sisu.ut.ee/imageprocessing/book/1
https://www.encyclopedia.com/computing/news-wires-white-papers-and-books/digital-images
https://packaging.python.org/tutorials/installing-packages/
https://github.com/
https://www.analyticsvidhya.com/blog/

