Introduction To Image Processing With Python
Nour Eddine ALAA and Ismail Zine El Abidne
March 5, 2021
Contents
1 Abstract
2 Introduction
2.1 What is a Digital Image
2.2 Image Processing
2.3 Internship Objectives
2.4 Project Results
3 Python
3.1 Definition
3.2 Libraries
4 OpenCV
4.1 Installing Python and OpenCV
4.1.1 Download the Python 3 Installer
4.1.2 Installing OpenCV
4.1.3 Python Virtual Environments
4.2 Reading, displaying, and saving images
4.2.1 Reading image
4.2.2 Display an image
4.2.3 Saving Image
4.2.4 Color image space
4.2.5 Changing Color space
4.2.6 Cropping Image
4.2.7 Transformations of Images
4.3 Image Processing in OpenCV
4.3.1 2D convolution
4.3.2 Smoothing Image
4.3.3 Morphological Transformations
4.3.4 Edge detection
4.3.5 Image Thresholding and segmentation
4.3.6 Image Histogram
4.3.7 Template Matching with Multiple Objects
4.4 Feature Detection and Description
4.4.1 What are keypoints?
4.4.2 Harris Corner Detection
4.4.3 Shi-Tomasi Corner Detector and Good Features to Track
4.5 Face Detection using Haar Cascades
4.5.1 Basic concept of HAAR cascade algorithm
4.5.2 Haar-cascade Detection in OpenCV
5 Linear Filters
5.1 Convolution
5.2 Low Pass Filter
5.2.1 Mean Filter
5.2.2 Gaussian filter
5.2.3 Binomial filter
5.3 High Pass Filter
5.3.1 DoG Filter
5.3.2 Laplacian of Gaussian (LoG)
7 Directional Filters
7.1 What is a Gradient of the image?
7.2 Laplacian
7.3 Filters
7.3.1 Directional Filter
7.3.2 Prewitt Operator
7.3.3 Roberts Operator
7.3.4 Kirsch Filter
7.3.5 Sobel Filter
7.4 Gaussian Blur
7.4.1 Laplacian Filter
8 Image Restoration
8.1 Image deblurring
8.2 Image in-painting
8.3 Super resolution
8.4 The Wiener filter
8.5 The Perona-Malik equation
1 Abstract
This report presents a methodological study of image processing and its applications in the field of computer vision. In an image processing operation, the input is an image and the output is an enhanced, higher-quality image, depending on the techniques used; image processing usually refers to digital image processing. Our study provides a solid introduction to image processing, along with segmentation techniques, computer vision fundamentals, and applied applications that will be of worth to the image processing and computer vision research communities.
2 Introduction :
2.1 What is a Digital Image
A digital image is a representation of a real image as a set of numbers that can
be stored and handled by a digital computer. In order to translate the image
into numbers, it is divided into small areas called pixels (picture elements). For
each pixel, the imaging device records a number, or a small set of numbers,
that describe some property of this pixel, such as its brightness (the intensity
of the light) or its color. The numbers are arranged in an array of rows and
columns that correspond to the vertical and horizontal positions of the pixels
in the image. Digital images have several basic characteristics. One is the type
of the image. For example, a black and white image records only the intensity
of the light falling on the pixels. A color image can have three colors, normally
RGB (Red, Green, Blue) or four colors, CMYK (Cyan, Magenta, Yellow, black).
RGB images are usually used in computer monitors and scanners, while CMYK
images are used in color printers. There are also non-optical images such as
ultrasound or X-ray in which the intensity of sound or X-rays is recorded. In
range images, the distance of the pixel from the observer is recorded. Resolution
is expressed in the number of pixels per inch (ppi). A higher resolution gives
a more detailed image. A computer monitor typically has a resolution of 100
ppi, while a printer has a resolution ranging from 300 ppi to more than 1440
ppi. This is why an image looks much better in print than on a monitor.
A color image works a little differently: every pixel contains three values, R (red), G (green), and B (blue), and these values are the ones that control the pixel's color.
Figure : edge detection example.
3 Python
3.1 Definition
Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects.[26] Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python is often described as a "batteries included" language due to its comprehensive standard library.[27] Python was conceived in the late 1980s as a successor to the ABC language. Python 2.0, released in 2000, introduced features like list comprehensions and a garbage collection system capable of collecting reference cycles. Python 3.0, released in 2008, was a major revision of the language that is not completely backward-compatible, and much Python 2 code does not run unmodified on Python 3. Due to concern about the amount of code written for Python 2, support for Python 2.7 (the last release in the 2.x series) was extended to 2020. Language developer Guido van Rossum shouldered sole responsibility for the project until July 2018 but now shares his leadership as a member of a five-person steering council.[28][29][30] Python interpreters are available for many operating systems. A global community of programmers develops and maintains CPython, an open source[31] reference implementation. A non-profit organization, the Python Software Foundation, manages and directs resources for Python and CPython development.
3.2 Libraries
We use three main libraries to get data from an image, analyse it, apply our filters, and then save the result as a PNG.
1. skimage
scikit-image is a collection of algorithms for image processing and computer vision. The main package of skimage only provides a few utilities for converting between image data types; for most features, you need to import one of the following subpackages: color, data, draw, exposure, filters, and so on.
src = "https://scikit-image.org/docs/dev/api/skimage.html"
We use this library especially to import the data (image) from it; we then apply our filters to it.
• Loading Images from skimage
Within the scikit-image package, there are several sample images provided in the data module. Let's say we want to load a single image to perform a few experiments. Instead of using an external image, we can simply load one of the images provided within the package!
Here is the Python code to do this:

from skimage import data
from skimage.io import imshow

image = data.coins()
imshow(image)

Notice that I have used the imshow function here to view the image in the notebook itself.
• Reading Images from our System using skimage
What if you want to load an image from your machine instead of the ones provided in the package? For this, we can use the imread function from skimage.
We can read images in two formats: colored and grayscale. We will see both of these in action and understand how they're different.
The imread function has a parameter "as_gray" which is used to specify whether the image must be converted into a grayscale image or not. We will start with reading an image in grayscale format, by setting the parameter to True:
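A minimal sketch (the filename puppy.jpeg is a placeholder for any image on your machine):

from skimage.io import imread, imshow

# as_gray=True converts the image to grayscale on load
image_gray = imread('puppy.jpeg', as_gray=True)
imshow(image_gray)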
2. OpenCV
OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and modify the code. The library has more than 2500 optimized algorithms, which include a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high resolution image of an entire scene, find similar images from an image database, remove red eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, etc. OpenCV has a user community of more than 47 thousand people and an estimated number of downloads exceeding 18 million. The library is used extensively in companies, research groups and by governmental bodies.
src = "https://opencv.org/about/"
This library is very powerful and has more features than I could imagine; we use it to display and restore our images after applying our filters.
• Reading, Writing and Displaying Images
Machines see and process everything using numbers, including images and text. How do you convert images to numbers? I can hear you wondering. Two words: pixel values.
Every number represents the pixel intensity at that particular location. In the above image, I have shown the pixel values for a grayscale image where every pixel contains only one value, i.e. the intensity of the black color at that location.
Note that color images will have multiple values for a single pixel. These values represent the intensity of the respective channels: Red, Green and Blue channels for RGB images, for instance.
Reading and writing images is essential to any computer vision project. And the OpenCV library makes this function a whole lot easier.
Now, let's see how to import an image into our machine using OpenCV (test.jpg is a placeholder filename):

import cv2
from matplotlib import pyplot as plt

# reading the image
image = cv2.imread('test.jpg')
plt.imshow(image)
# saving the image
cv2.imwrite('test_write.jpg', image)
3. NumPy
NumPy is the fundamental package for scientific computing with Python. It contains, among other things: a powerful N-dimensional array object; sophisticated (broadcasting) functions; tools for integrating C/C++ and Fortran code; useful linear algebra, Fourier transform, and random number capabilities. Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
src = "https://www.numpy.org/"
We use this library to apply our filters as matrices because, as we mentioned before, an image is a multi-dimensional matrix.
• Python Lists vs NumPy Arrays – What’s the Difference?
If you’re familiar with Python, you might be wondering why use
NumPy arrays when we already have Python lists? After all, these
Python lists act as an array that can store elements of various types.
This is a perfectly valid question and the answer to this is hidden in
the way Python stores an object in memory.
A Python object is actually a pointer to a memory location that
stores all the details about the object, like bytes and the value. Al-
though this extra information is what makes Python a dynamically
typed language, it also comes at a cost which becomes apparent when
storing a large collection of objects, like in an array.
Python lists are essentially an array of pointers, each pointing to a
location that contains the information related to the element. This
adds a lot of overhead in terms of memory and computation. And
most of this information is rendered redundant when all the objects
stored in the list are of the same type!
To overcome this problem, we use NumPy arrays that contain only
homogeneous elements, i.e. elements having the same data type.
This makes it more efficient at storing and manipulating the array.
This difference becomes apparent when the array has a large number
of elements, say thousands or millions. Also, with NumPy arrays,
you can perform element-wise operations, something which is not
possible using Python lists!
This is the reason why NumPy arrays are preferred over Python lists
when performing mathematical operations on a large amount of data.
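As a quick sketch of the difference, the same operator means element-wise arithmetic on an array but repetition on a list:

import numpy as np

a = np.array([1, 2, 3, 4])
print(a * 2)             # element-wise: [2 4 6 8]
print([1, 2, 3, 4] * 2)  # list repetition: [1, 2, 3, 4, 1, 2, 3, 4]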
• Creating a NumPy Array
– Basic ndarray
NumPy arrays are very easy to create given the complex prob-
lems they solve. To create a very basic ndarray, you use the
np.array() method. All you have to pass are the values of the
array as a list:
Listing 6: np.array

np.array([1, 2, 3, 4])

Output

array([1, 2, 3, 4])

This array contains integer values. You can specify the type of data in the dtype argument:

np.array([1, 2, 3, 4], dtype=np.float32)

Output

array([1., 2., 3., 4.])
– Multi-Dimensional Array
NumPy arrays can be multi-dimensional too.
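For instance, passing a list of lists to np.array() gives a 2-D array (a small sketch):

import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])
print(m.shape)  # (2, 3): two rows, three columns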
4 OpenCV:
4.1 Installing Python and OpenCV
4.1.1 Download the Python 3 Installer
• Open a browser window and navigate to the Download page for Windows
at python.org.
• Underneath the heading at the top that says Python Releases for Win-
dows, click on the link for the Latest Python 3 Release - Python 3.x.x.
(As of this writing, the latest is Python 3.6.5.)
• Scroll to the bottom and select either Windows x86-64 executable installer for 64-bit or Windows x86 executable installer for 32-bit.
• Once you have chosen and downloaded an installer, simply run it by double-clicking on the downloaded file; a setup dialog should appear.
• Then just click Install Now. That should be all there is to it. A few
minutes later you should have a working Python 3 installation on your
system.
4.1.2 Installing OpenCV
We should see a similar prompt if we followed the instructions. The prompt will show that it is "solving environment"; this can take quite a bit of time on a slower Internet connection.
Try to use a faster Internet connection, preferably a wired one, for uninterrupted and best results. On a work or institution network, it is likely that an HTTP timeout will occur, in which case we need to enter the command once again and restart the procedure.
Once the environment is resolved, conda will list the packages that will be installed, namely: opencv, libopencv, py-opencv. Enter y to proceed with the installation.
We can verify that the installation was successful by launching the Python interpreter. OpenCV is referred to as cv2 in Python. Type at the prompt:
• import cv2 : if the prompt returns without an error, Python has successfully imported the OpenCV library. But we should also verify the version of OpenCV, so we type:
• print(cv2.__version__) : as of March 2019 the version displayed is 3.4.1, which is the officially supported version of OpenCV in the Anaconda environment. If we want to work with a different version, then while installing we can specify the version as "opencv=3.4.1".
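Put together, the whole check is just two lines (a minimal sketch):

import cv2
print(cv2.__version__)  # e.g. '3.4.1'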
4.1.3 Python Virtual Environments :
It is often useful to have one or more Python environments where you can
experiment with different combinations of packages without affecting your main
installation. Python supports this through virtual environments. The virtual
environment is a copy of an existing version of Python with the option to inherit
existing packages. A virtual environment is also useful when you need to work
on a shared system and do not have permission to install packages as you will
be able to install them in the virtual environment.
First, we need to open a terminal.
To create a virtual environment, you must specify a path. For example, to create one in the local directory called 'miasi', type the following: virtualenv miasi
You can activate the python environment by running the following command:
• Mac OS / Linux : source miasi/bin/activate
• Windows : miasi\Scripts\activate
You should see the name of your virtual environment in brackets on your
terminal line e.g. (miasi).
Any python commands you use will now work with your virtual environment
4.2 Reading, displaying, and saving images :
4.2.1 Reading image :
To read an image from the local machine, we create a Python file readimage.py, put the images in the same directory, and start coding. The imread() function converts our image to a NumPy array, as we can check below.
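A minimal sketch (test.jpg is a placeholder filename):

import cv2

image = cv2.imread('test.jpg')
print(type(image))   # <class 'numpy.ndarray'>
print(image.shape)   # (height, width, 3) for a color image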
4.2.2 Display an image :
The image is shown in a window with cv2.imshow(), and cv2.waitKey() pauses the program for a keyboard event; if we pass 0 as the argument, this function waits for a keyboard event indefinitely.
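A sketch of the usual display loop:

import cv2

image = cv2.imread('test.jpg')
cv2.imshow('Image', image)
cv2.waitKey(0)            # 0 = wait indefinitely for a key press
cv2.destroyAllWindows()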
4.2.3 Saving Image :
The cv2.imwrite() function, shown earlier, saves the image to disk under the given filename.
4.2.4 Color image space :
• RGB : Probably the most popular color space. It stands for Red, Green,
and Blue. In this color space, each color is represented as a weighted
combination of red, green, and blue. So every pixel value is represented
as a tuple of three numbers corresponding to red, green, and blue. Each
value ranges between 0 and 255.
• HSV : It stands for Hue, Saturation, and Value. This color space separates the intensity information from the color information and represents them using different channels. This is closely related to how the human visual system understands color. This gives us a lot of flexibility as to how we can handle images.
4.2.5 Changing Color space :
OpenCV converts between color spaces with the cv2.cvtColor() function.
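For example (a small sketch):

import cv2

image = cv2.imread('test.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # BGR to grayscale
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)    # BGR to HSV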
4.2.6 Cropping Image :
All we are doing is slicing arrays. We first supply the startY : endY coordinates,
followed by the startX : endX coordinates to the slice. That’s it. We’ve cropped
the image!
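In code (a sketch; the coordinates are arbitrary placeholders):

import cv2

image = cv2.imread('test.jpg')
# rows (y) come first in NumPy slicing, then columns (x)
cropped = image[50:200, 100:300]  # startY:endY, startX:endX
cv2.imwrite('cropped.jpg', cropped)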
4.2.7 Transformations of Images :
Resizing :
Resizing an image means changing the dimensions of it, be it width alone, height
alone or both. Also, the aspect ratio of the original image could be preserved
in the resized image. To resize an image, OpenCV provides cv2.resize() function.
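A sketch of both usages, an explicit size and scale factors:

import cv2

image = cv2.imread('test.jpg')
resized = cv2.resize(image, (640, 480))         # exact output size (width, height)
half = cv2.resize(image, None, fx=0.5, fy=0.5)  # scale both axes by 0.5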
Translation : Translation basically means that we are shifting the image by adding/subtracting the x and y coordinates. In order to do this, we need to create a transformation matrix M, as follows:

M = \begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{pmatrix}

Here, the tx and ty values are the x and y translation values; that is, the image will be moved by tx units to the right and by ty units downwards. So once we create a matrix like this, we can use the function cv2.warpAffine() to apply it to our image. The third argument in cv2.warpAffine() refers to the size of the output image.
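A sketch with a shift of 100 pixels right and 50 down:

import cv2
import numpy as np

image = cv2.imread('test.jpg')
rows, cols = image.shape[:2]
M = np.float32([[1, 0, 100], [0, 1, 50]])        # tx = 100, ty = 50
shifted = cv2.warpAffine(image, M, (cols, rows)) # third argument: output size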
Rotation :
To rotate an image using OpenCV Python, first calculate the affine matrix that does the affine transformation (linear mapping of pixels), then warp the input image with the affine matrix.
Using cv2.getRotationMatrix2D(), we can specify the center point around which the image would be rotated as the first argument, then the angle of rotation in degrees, and a scaling factor for the image at the end.
Rotation is also a form of transformation, and we can achieve it by using the following transformation matrix:

R = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}

θ is the angle of rotation in the counterclockwise direction.
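A sketch rotating an image by 45 degrees around its center:

import cv2

image = cv2.imread('test.jpg')
rows, cols = image.shape[:2]
# center, angle in degrees, scale
M = cv2.getRotationMatrix2D((cols / 2, rows / 2), 45, 1.0)
rotated = cv2.warpAffine(image, M, (cols, rows))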
4.3 Image Processing in OpenCV :
In this section, we are going to see how to apply cool visual effects to images.
We will learn how to use fundamental image processing operators, discuss edge detection, histogram computation and template matching, and see how we can use image filters to apply various effects to photos.
4.3.1 2D convolution :
Convolution involving one-dimensional signals is referred to as 1D convolution
or just convolution. Otherwise, if the convolution is performed between two
signals spanning along two mutually perpendicular dimensions (i.e., if signals
are two-dimensional in nature), then it will be referred to as 2D convolution.
This concept can be extended to involve multi-dimensional signals due to which
we can have multi-dimensional convolution.
In the digital domain, convolution is performed by multiplying and accu-
mulating the instantaneous values of the overlapping samples corresponding to
two input signals, one of which is flipped. This definition of 1D convolution is
applicable even for 2D convolution except that, in the latter case, one of the
inputs is flipped twice.
This kind of operation is extensively used in the field of digital image pro-
cessing wherein the 2D matrix representing the image will be convolved with a
comparatively smaller matrix called 2D kernel.
This kernel is called the image filter, and the process of applying this kernel to
the given image is called image filtering. The output obtained after applying
the kernel to the image is called the filtered image. Depending on the values in
the kernel, it performs different functions such as blurring, detecting edges, and
so on.
As for one-dimensional signals, images also can be filtered with various low-
pass filters (LPF), high-pass filters (HPF), etc. A LPF helps in removing noise,
or blurring the image. A HPF helps in finding edges in an image. OpenCV
provides a function, cv2.filter2D(), to convolve a kernel with an image. As an
example, we will try an averaging filter on an image. A 3x3 averaging filter
kernel can be defined as follows :
K = \frac{1}{9} \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}
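A sketch applying this kernel with cv2.filter2D():

import cv2
import numpy as np

image = cv2.imread('test.jpg')
kernel = np.ones((3, 3), np.float32) / 9   # the 3x3 averaging kernel above
filtered = cv2.filter2D(image, -1, kernel) # -1: keep the source depth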
4.3.2 Smoothing Image :
Image blurring is achieved by convolving the image with a low-pass filter kernel.
It is useful for removing noise. It actually removes high frequency content (e.g. noise, edges) from the image, resulting in edges being blurred when this filter is applied.
Averaging :
OpenCV provides a function, cv2.filter2D(), to convolve a kernel with an image.
As an example, we will try an averaging filter on an image. A 5x5 averaging
filter kernel can be defined as follows:
K = \frac{1}{25} \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{pmatrix}
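The built-in cv2.blur() applies exactly this kernel, so we don't have to build it by hand (a sketch):

import cv2

image = cv2.imread('test.jpg')
blurred = cv2.blur(image, (5, 5))  # 5x5 box (averaging) filter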
Gaussian Filtering :
As in any other signals, images also can contain different types of noise, espe-
cially because of the source (camera sensor). Image Smoothing techniques help
in reducing the noise. In OpenCV, image smoothing (also called blurring) could
be done in many ways.
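A sketch with cv2.GaussianBlur():

import cv2

image = cv2.imread('test.jpg')
# 5x5 kernel; sigma = 0 lets OpenCV derive it from the kernel size
smoothed = cv2.GaussianBlur(image, (5, 5), 0)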
Median Filtering :
Here, the function cv2.medianBlur() computes the median of all the pixels under
the kernel window and the central pixel is replaced with this median value. This
is highly effective in removing salt-and-pepper noise. One interesting thing to
note is that, in the Gaussian and box filters, the filtered value for the central
element can be a value which may not exist in the original image. However this
is not the case in median filtering, since the central element is always replaced
by some pixel value in the image. This reduces the noise effectively. The kernel
size must be a positive odd integer.
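A sketch (the kernel size 5 is the required positive odd integer):

import cv2

image = cv2.imread('noisy.jpg')
median = cv2.medianBlur(image, 5)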
Sharpening :
The level of sharpening depends on the type of kernel we use. We have a lot of freedom to customize the kernel here, and each kernel will give you a different kind of sharpening. To just sharpen an image, we would use a kernel like this:

M = \begin{pmatrix} -1 & -1 & -1 \\ -1 & 9 & -1 \\ -1 & -1 & -1 \end{pmatrix}

If we want to do excessive sharpening, we would use the following kernel:

M = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -9 & 1 \\ 1 & 1 & 1 \end{pmatrix}
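Applying the first kernel (a sketch):

import cv2
import numpy as np

image = cv2.imread('test.jpg')
kernel = np.array([[-1, -1, -1],
                   [-1,  9, -1],
                   [-1, -1, -1]], np.float32)
sharpened = cv2.filter2D(image, -1, kernel)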
4.3.3 Morphological Transformations :
Morphological transformations are simple operations based on the image shape. They are normally performed on binary images. They need two inputs: one is our original image, the second is called the structuring element or kernel, which decides the nature of the operation. Two basic morphological operators are Erosion and Dilation. Their variant forms, like Opening, Closing, and Gradient, also come into play. We will see them one by one.
Erosion : The basic idea of erosion is just like soil erosion: it erodes away the boundaries of the foreground object (always try to keep the foreground in white). So what does it do? The kernel slides through the image (as in 2D convolution). A pixel in the original image (either 1 or 0) will be considered 1 only if all the pixels under the kernel are 1; otherwise it is eroded (made to zero).
So what happens is that all the pixels near the boundary will be discarded, depending upon the size of the kernel. So the thickness or size of the foreground object decreases, or simply the white region decreases in the image. It is useful for removing small white noise (as we have seen in the colorspace chapter), detaching two connected objects, etc.
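A sketch (the kernel is a 5x5 block of ones; the filename is a placeholder):

import cv2
import numpy as np

img = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((5, 5), np.uint8)
erosion = cv2.erode(img, kernel, iterations=1)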
Dilation : It is just the opposite of erosion. Here, a pixel element is '1' if at least one pixel under the kernel is '1'. So it increases the white region in the image, or the size of the foreground object increases. Normally, in cases like noise removal, erosion is followed by dilation: erosion removes white noise, but it also shrinks our object, so we dilate it. Since the noise is gone, it won't come back, but our object area increases. Dilation is also useful in joining broken parts of an object.
Closing :
Closing is the reverse of Opening: Dilation followed by Erosion. It is useful in closing small holes inside the foreground objects, or small black points on the object.
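A sketch of dilation and closing with the same 5x5 kernel:

import cv2
import numpy as np

img = cv2.imread('binary.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((5, 5), np.uint8)
dilation = cv2.dilate(img, kernel, iterations=1)
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)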
4.3.4 Edge detection :
Edge detection is an image processing technique for finding the boundaries of
objects within images. It works by detecting discontinuities in brightness. Edge
detection is used for image segmentation and data extraction in areas such as
image processing, computer vision, and machine vision.
Common edge detection algorithms include Sobel, Canny, Prewitt, Roberts,
and fuzzy logic methods.
Laplacian : The Laplacian of an image highlights the areas of rapid changes
in intensity and can thus be used for edge detection. If we let I(x,y) represent
the intensities of an image then the Laplacian of the image is given by the
following formula:
L(x, y) = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}
The discrete approximation of the Laplacian at a specific pixel can be deter-
mined by taking the weighted mean of the pixel intensities in a small neighbor-
hood of the pixel. The Laplacian operator:

K = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}
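A sketch using the built-in cv2.Laplacian():

import cv2

img = cv2.imread('test.jpg', cv2.IMREAD_GRAYSCALE)
lap = cv2.Laplacian(img, cv2.CV_64F)  # 64-bit float keeps negative responses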
Canny Edge Detection : The Canny edge detector is an edge detection
operator that uses a multi-stage algorithm to detect a wide range of edges in
images. It was developed by John F. Canny in 1986. Canny also produced a
computational theory of edge detection explaining why the technique works. The
Canny edge detection algorithm is composed of 5 steps:
• Noise reduction;
• Gradient calculation;
• Non-maximum suppression;
• Double threshold;
• Edge Tracking by Hysteresis.
One last important thing to mention, is that the algorithm is based on grayscale
pictures. Therefore, the pre-requisite is to convert the image to grayscale before
following the above-mentioned steps.
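A sketch; the two arguments are the hysteresis thresholds:

import cv2

img = cv2.imread('test.jpg', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)  # minVal=100, maxVal=200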
4.3.5 Image Thresholding and segmentation :
Simple Thresholding : Here, the matter is straight-forward. For every pixel,
the same threshold value is applied. If the pixel value is smaller than the
threshold, it is set to 0, otherwise it is set to a maximum value. The func-
tion cv2.threshold is used to apply the thresholding. The first argument is the
source image, which should be a grayscale image. The second argument is the
threshold value which is used to classify the pixel values. The third argument is
the maximum value which is assigned to pixel values exceeding the threshold.
OpenCV provides different types of thresholding, given by the fourth parameter of the function. Basic thresholding as described above is done by using the type cv2.THRESH_BINARY. All simple thresholding types are:
• cv2.THRESH_BINARY
• cv2.THRESH_BINARY_INV
• cv2.THRESH_TRUNC
• cv2.THRESH_TOZERO
• cv2.THRESH_TOZERO_INV
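A sketch of basic binary thresholding:

import cv2

img = cv2.imread('gradient.png', cv2.IMREAD_GRAYSCALE)
ret, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)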
Adaptive Thresholding :
We used one global value as a threshold. But this might not be good in
all cases, e.g. if an image has different lighting conditions in different areas.
In that case, adaptive thresholding can help. Here, the algorithm determines
the threshold for a pixel based on a small region around it. So we get different
thresholds for different regions of the same image which gives better results for
images with varying illumination.
In addition to the parameters described above, the method cv2.adaptiveThreshold takes three input parameters. The adaptiveMethod decides how the threshold value is calculated:
• cv2.ADAPTIVE_THRESH_MEAN_C : the threshold value is the mean of the neighbourhood area minus the constant C.
• cv2.ADAPTIVE_THRESH_GAUSSIAN_C : the threshold value is a Gaussian-weighted sum of the neighbourhood values minus the constant C.
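A sketch (blockSize=11 neighbourhood, C=2):

import cv2

img = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)
th = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                           cv2.THRESH_BINARY, 11, 2)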
4.3.6 Image Histogram :
You can consider histogram as a graph or plot, which gives you an overall idea
about the intensity distribution of an image. It is a plot with pixel values (rang-
ing from 0 to 255, not always) in X-axis and corresponding number of pixels in
the image on Y-axis. It is just another way of understanding the image. By
looking at the histogram of an image, you get intuition about the contrast, brightness, intensity distribution, etc. of that image. Almost all image processing tools today provide features on histograms.
Histogram Terminology :
Now that we have an idea of what a histogram is, we can look into how to find it. Both OpenCV and NumPy come with built-in functions for this. Before using those functions, we need to understand some terminology related to histograms.
• Bins : The histogram above shows the number of pixels for every pixel
value, from 0 to 255. In fact, we used 256 values (bins) to show the
histogram. It could be 8, 16, 32 etc. OpenCV uses histSize to refer to
bins.
• Dims : It is the number of parameters for which we collect the data. In
this case, we collect data regarding only one thing, intensity value. So
here it is 1.
• Range : It is the range of intensity values you want to measure. Normally, it is [0,256], i.e. all intensity values.
Histogram Calculation in OpenCV : We will use cv2.calcHist() function
to find the histogram. Let’s familiarize with the function and its parameters :
cv2.calcHist(images, channels, mask, histSize, ranges)
• images : it is the source image of type uint8 or float32. it should be given
in square brackets, ie, “[img]”.
• channels : it is also given in square brackets. It is the index of the channel for
which we calculate histogram. For example, if input is grayscale image,
its value is [0]. For color image, you can pass [0],[1] or [2] to calculate
histogram of blue,green or red channel respectively.
• mask : mask image. To find histogram of full image, it is given as “None”.
But if you want to find histogram of particular region of image, you have
to create a mask image for that and give it as mask. (I will show an
example later.)
• histSize : this represents our BIN count. It needs to be given in square brackets. For full scale, we pass [256].
• ranges : this is our RANGE. Normally, it is [0,256].
However, we can also use NumPy for histograms; it makes the code more concise. NumPy has a special function to compute histograms, np.histogram(). The arguments of the routine are the input image, the number of bins, and the range of the bins. It returns an array with histogram values and edge values for the bins.
Plotting Histograms :
There are two ways for this,
• Short Way : using Matplotlib plotting functions
• Long Way : using OpenCV drawing functions
Matplotlib comes with a histogram plotting function : matplotlib.pyplot.hist()
It directly finds the histogram and plots it. We need not use the cv2.calcHist()
or np.histogram() function to find the histogram.
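A sketch of both ways:

import cv2
from matplotlib import pyplot as plt

img = cv2.imread('test.jpg', cv2.IMREAD_GRAYSCALE)

# the short way
plt.hist(img.ravel(), 256, [0, 256])

# via cv2.calcHist: images, channels, mask, histSize, ranges
hist = cv2.calcHist([img], [0], None, [256], [0, 256])
plt.plot(hist)
plt.show()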
Or we can use the normal plot function of matplotlib, which is good for a BGR plot. For that, you need to find the histogram data first.
4.3.7 Template Matching with Multiple Objects :
Template Matching is a method for searching and finding the location of a tem-
plate image in a larger image. OpenCV comes with a function cv2.matchTemplate()
for this purpose. It simply slides the template image over the input image (as in
2D convolution) and compares the template and patch of input image under the
template image. Several comparison methods are implemented in OpenCV. It
returns a grayscale image, where each pixel denotes how well the neighbourhood of that pixel matches the template.
If input image is of size (WxH) and template image is of size (wxh), output
image will have a size of (W-w+1, H-h+1). Once you have the result, you can use the cv2.minMaxLoc() function to find where the maximum/minimum value is. Take it as the top-left corner of the rectangle and take (w,h) as the width and height of the rectangle. That rectangle is your region of the template.
Here, as an example, we will search for Mario’s coins in his photo. So I created
a template as below:
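A sketch matching multiple objects by thresholding the response map (filenames and the 0.8 threshold are placeholders):

import cv2
import numpy as np

img = cv2.imread('mario.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
template = cv2.imread('coin.png', cv2.IMREAD_GRAYSCALE)
h, w = template.shape

res = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)
loc = np.where(res >= 0.8)  # keep every location above the similarity threshold
for pt in zip(*loc[::-1]):  # switch from (row, col) to (x, y)
    cv2.rectangle(img, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)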
4.4 Feature Detection and Description :
Feature detection plays a crucial role in image registration. There exist quite a few feature detection algorithms in the literature, like BRISK, FAST, SURF etc. [14]. Each of these algorithms has its own advantages and disadvantages. BRISK is rotation and scale invariant, but it takes more time to detect the feature points. On the other hand FAST, as the name suggests, takes less time to detect the key points, but it is not scale invariant. To overcome the demerits of the BRISK and FAST feature detection algorithms, this paper proposes a hybrid feature detection algorithm, which consumes less time to detect the feature key points and is also rotation and scale invariant. This paper also focuses on a comparative analysis of BRISK, FAST and the proposed algorithm in terms of time to detect feature points. This paper takes five feature key points in every remote-sensing image and also deals with feature detection using the above three algorithms. It can be observed from the results and tables that the hybrid feature detector takes less time to detect five feature points.
4.4.2 Harris Corner Detection :
The Harris detector was one of the first operators able to distinguish between edges and corners. Since then, it has been improved and adopted in many algorithms to preprocess images for subsequent applications.
OpenCV has the function cv2.cornerHarris() for this purpose. Its arguments are :
• img : input image, grayscale and of float32 type.
• blockSize : the size of the neighbourhood considered for corner detection.
• ksize : aperture parameter of the Sobel derivative used.
• k : Harris detector free parameter in the equation.
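A sketch marking strong corners in red:

import cv2
import numpy as np

img = cv2.imread('chessboard.png')
gray = np.float32(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))
dst = cv2.cornerHarris(gray, 2, 3, 0.04)   # blockSize=2, ksize=3, k=0.04
img[dst > 0.01 * dst.max()] = [0, 0, 255]  # threshold the corner response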
4.4.3 Shi-Tomasi Corner Detector and Good Features to Track :
The Shi-Tomasi corner detector is based entirely on the Harris corner detector.
However, one slight variation in a ”selection criteria” made this detector much
better than the original. It works quite well where even the Harris corner de-
tector fails. So here’s the minor change that Shi and Tomasi did to the original
Harris corner detector.
The change :
The Harris corner detector has a corner selection criterion: a score is calculated for each pixel, and if the score is above a certain value, the pixel is marked as a corner. The score is calculated using two eigenvalues. That is, you give the two eigenvalues to a function; the function manipulates them and gives back a score.
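Shi-Tomasi is exposed in OpenCV as cv2.goodFeaturesToTrack(); a sketch:

import cv2
import numpy as np

img = cv2.imread('blocks.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# maxCorners, qualityLevel, minDistance
corners = cv2.goodFeaturesToTrack(gray, 25, 0.01, 10)
for x, y in np.int32(corners).reshape(-1, 2):
    cv2.circle(img, (int(x), int(y)), 3, (0, 255, 0), -1)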
4.5 Face Detection using Haar Cascades :
Face detection using Haar cascades is a machine learning based approach where
a cascade function is trained with a set of input data.
OpenCV already contains many pre-trained classifiers for faces, eyes, smiles, etc. We will be using the face classifier; you can experiment with other classifiers as well.
4.5.1 Basic concept of HAAR cascade algorithm :
Now all possible sizes and locations of each kernel are used to calculate plenty
of features. For each feature calculation, we need to find the sum of the pixels
under the white and black rectangles. To solve this, they introduced the integral image. It simplifies the calculation of the sum of the pixels, however large the number of pixels may be, to an operation involving just four pixels.
4.5.2 Haar-cascade Detection in OpenCV :
Detection is done with the detectMultiScale() method of cv2.CascadeClassifier; among its parameters:
• flags : parameter with the same meaning for an old cascade as in the function cvHaarDetectObjects. It is not used for a new cascade.
• minSize : minimum possible object size. Objects smaller than this are ignored.
• maxSize : maximum possible object size. Objects larger than this are ignored. If maxSize == minSize, the model is evaluated on a single scale.
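A sketch of face detection, assuming the opencv-python package, which ships the cascade files under cv2.data.haarcascades (the image filename is a placeholder):

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
img = cv2.imread('people.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)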
5 Linear Filters
5.1 Convolution
Convolution involving one-dimensional signals is referred to as 1D convolution
or just convolution. Otherwise, if the convolution is performed between two
signals spanning along two mutually perpendicular dimensions (i.e., if signals
are two-dimensional in nature), then it will be referred to as 2D convolution.
This concept can be extended to involve multi-dimensional signals due to which
we can have multi-dimensional convolution.
This kind of operation is extensively used in the field of digital image processing
wherein the 2D matrix representing the image will be convolved with a compar-
atively smaller matrix called 2D kernel.
This kernel is called the image filter, and the process of applying this kernel to
the given image is called image filtering. The output obtained after applying
the kernel to the image is called the filtered image.
If f is the image that we want to filter and g the filter, then:

f(x, y) * g(x, y) = F^{-1}\big( F(f(x, y)) \cdot \underbrace{F(g(x, y))}_{G(u,v)} \big)

G is the transfer function of the filter. We will present the filters in the discrete case: x and y are the coordinates of the pixels and f is an integer (in [0, ..., 255]). There are three types of filters:
1. Low Pass Filter decreases the noise but attenuates the details of the
image.
2. High Pass Filter increases the contours and the details but also increases the noise.
3. Band Pass Filter removes some unwanted frequencies.
We generally don't do a global convolution but a local transformation based on the neighborhood of the point (x, y).
Local Convolution
The convolution kernel k of the filter has compact support included in [x1, x2] × [y1, y2]:

g(x, y) = (f * k)(x, y) = \sum_{i=x_1}^{x_2} \sum_{j=y_1}^{y_2} f(x - i, y - j) \, k(i, j)
5.2 Low Pass Filter
5.2.2 Gaussian filter
If σ = 1, for a 5x5 filter we will have:

\frac{1}{300} \begin{pmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 18 & 30 & 18 & 4 \\ 6 & 30 & 48 & 30 & 6 \\ 4 & 18 & 30 & 18 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{pmatrix}
Ideally, a filter of size (6σ+1) x (6σ+1) is used. Generally the Gaussian filter with σ < 1 is used to reduce noise, while σ > 1 is used to produce a blurred copy of the image for a personalized "blur mask". Note that the larger σ is, the stronger the blurring applied to the image.
5.2.3 Binomial filter

\frac{1}{256} \begin{pmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{pmatrix}
These filters are low-pass filters: they soften the details of the image (and therefore the additive noise) but, by eroding the edges, add blurring to the image. We will see in a later section how to reduce blurring.
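A sketch building the 5x5 binomial kernel from the binomial coefficients and applying it:

import cv2
import numpy as np

b = np.array([1, 4, 6, 4, 1], dtype=np.float32)
K = np.outer(b, b) / 256  # separable: gives the 5x5 binomial kernel above
img = cv2.imread('test.jpg')
smoothed = cv2.filter2D(img, -1, K)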
5.3 High Pass Filter
5.3.2 Laplacian of Gaussian (LoG)
This filter is defined by the function:

K(x, y) = \Delta G_\sigma(x, y) = \frac{x^2 + y^2 - 2\sigma^2}{2\pi\sigma^6} \, e^{-\frac{x^2 + y^2}{2\sigma^2}}
When a maximum filter is applied to a digital image, the darker objects present in the image are eroded; in this sense the maximum filter is called an erosion filter. With respect to the lighter pixels, some call it a dilation filter. To be more specific, brighter objects are dilated and darker objects are eroded when a maximum filter is applied to a digital image. Let's apply our filter to the Lena image:
The minimum filter is also called a dilation filter in the same sense: when a minimum filter is applied, the dark object boundaries present in an image are extended.
The minimum filter is one of the morphological filters; the other morphological filters include the maximum filter and the median filter.
The minimum filter removes any positive outlier noise present in a digital image. Let's apply our filter to the Lena image:
7 Directional Filters
7.1 What is a Gradient of the image?
The gradient of an image is nothing but the change of intensity of the image colors in the X, Y or both directions.
We can find the gradient of an image with the help of the Sobel and Laplacian derivatives of the image. Sobel is used for either the X or Y direction, or even in combined form, while the Laplacian helps in both directions.
7.2 Laplacian
The Laplacian is a 2-D isotropic measure of the 2nd spatial derivative of an
image. The Laplacian of an image highlights regions of rapid intensity change
and is therefore often used for edge detection (see zero crossing edge detectors).
The Laplacian is often applied to an image that has first been smoothed with
something approximating a Gaussian smoothing filter in order to reduce its sen-
sitivity to noise, and hence the two variants will be described together here. The
operator normally takes a single graylevel image as input and produces another
graylevel image as output.
7.3 Filters
7.3.1 Directional Filter :
The approximation of \partial I / \partial x is done by convolution with

h_x = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}

and \partial I / \partial y is done by convolution with

h_y = \begin{pmatrix} 0 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}

Which means:
• \partial I / \partial x (i, j) = -I(i - 1, j) + I(i, j)
• \partial I / \partial y (i, j) = -I(i, j - 1) + I(i, j)
So let's apply this filter to the coin image:
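A sketch applying both difference kernels and combining them into a gradient magnitude:

import cv2
import numpy as np

img = cv2.imread('coins.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
hx = np.array([[0, -1, 0], [0, 1, 0], [0, 0, 0]], np.float32)
hy = np.array([[0, 0, 0], [-1, 1, 0], [0, 0, 0]], np.float32)
dx = cv2.filter2D(img, -1, hx)
dy = cv2.filter2D(img, -1, hy)
magnitude = np.sqrt(dx ** 2 + dy ** 2)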
7.3.2 Prewitt Operator
Prewitt operator is used for edge detection in an image. It detects two types of
edges
• Horizontal edges
• Vertical Edges
Edges are calculated by using the difference between corresponding pixel intensities of an image. All the masks that are used for edge detection are also known as derivative masks. As we have seen before, an image is also a signal, so changes in a signal can only be calculated using differentiation. That is why these operators are also called derivative operators or derivative masks.
The Prewitt operator provides us two masks, one for detecting edges in the horizontal direction and another for detecting edges in the vertical direction.
Vertical direction :

h_x = \begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}

When we apply this mask to the image, it brings out the vertical edges. It simply works like a first order derivative and calculates the difference of pixel intensities in an edge region. As the center column is zero, it does not include the original values of the image but rather calculates the difference of the right and left pixel values around the edge. This increases the edge intensity, which becomes enhanced compared to the original image.
Horizontal Direction :

h_y = \begin{pmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}

This mask brings out the horizontal edges in an image. It works on the same principle as the above mask and calculates the difference among the pixel intensities of a particular edge. As the center row of the mask consists of zeros, it does not include the original values of the edge in the image but rather calculates the difference of the pixel intensities above and below the particular edge, thus increasing the sudden change of intensities and making the edge more visible. Both of the above masks follow the principle of a derivative mask: they have opposite signs in them and their coefficients sum to zero. These masks are standardized, so we can't change the values in them.
Now it's time to see these masks in action (see the sketch below):
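A sketch applying both Prewitt masks with cv2.filter2D():

import cv2
import numpy as np

img = cv2.imread('sample.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
kx = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], np.float32)   # vertical edges
ky = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], np.float32)   # horizontal edges
prewitt_x = cv2.filter2D(img, -1, kx)
prewitt_y = cv2.filter2D(img, -1, ky)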
62
(a) Original (b) Perwitt X
63
Figure 5: Perwitt Filter Using OpenCv
7.3.3 Roberts Operator
Figure 7: Roberts filter using OpenCV (original and Roberts X).
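A minimal sketch using the standard definition of the Roberts cross, which is a pair of 2x2 difference kernels (the definition is an assumption; it is not spelled out in this report):

import cv2
import numpy as np

img = cv2.imread('sample.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
r1 = np.array([[1, 0], [0, -1]], np.float32)
r2 = np.array([[0, 1], [-1, 0]], np.float32)
roberts = np.abs(cv2.filter2D(img, -1, r1)) + np.abs(cv2.filter2D(img, -1, r2))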
7.3.4 Kirsch Filter
The Kirsch compass mask is another derivative mask used for finding edges. Unlike the Prewitt and Sobel operators, it detects edges in all eight compass directions:
• North
• North West
• West
• South West
• South
• South East
• East
• North East
We take a standard mask which follows all the properties of a derivative mask
and then rotate it to find the edges.
For example, let's take the following mask, which is in the North direction, and then rotate it to make all the direction masks.
North Direction Mask

\begin{pmatrix} -3 & -3 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & 5 \end{pmatrix}

North West Direction Mask

\begin{pmatrix} -3 & 5 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & -3 \end{pmatrix}

West Direction Mask

\begin{pmatrix} 5 & 5 & 5 \\ -3 & 0 & -3 \\ -3 & -3 & -3 \end{pmatrix}

South West Direction Mask

\begin{pmatrix} 5 & 5 & -3 \\ 5 & 0 & -3 \\ -3 & -3 & -3 \end{pmatrix}

South Direction Mask

\begin{pmatrix} 5 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & -3 & -3 \end{pmatrix}

South East Direction Mask

\begin{pmatrix} -3 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & 5 & -3 \end{pmatrix}

East Direction Mask

\begin{pmatrix} -3 & -3 & -3 \\ -3 & 0 & -3 \\ 5 & 5 & 5 \end{pmatrix}

North East Direction Mask

\begin{pmatrix} -3 & -3 & -3 \\ -3 & 0 & 5 \\ -3 & 5 & 5 \end{pmatrix}
As you can see, all the directions are covered and each mask will give you the edges in its own direction. To help you better understand these masks, we will apply them to a real image. Suppose we have a sample picture in which we have to find all the edges.
Now we will apply all the above masks (see the sketch below):
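A sketch that generates all eight compass masks by rotating the border values of the North mask one step at a time, keeping the strongest response per pixel:

import cv2
import numpy as np

img = cv2.imread('sample.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)

# border positions of a 3x3 kernel in clockwise order
ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
k = np.array([[-3, -3, 5], [-3, 0, 5], [-3, -3, 5]], np.float32)  # North mask

responses = []
for _ in range(8):
    responses.append(cv2.filter2D(img, -1, k))
    vals = [k[p] for p in ring]
    k = k.copy()
    for p, v in zip(ring, vals[-1:] + vals[:-1]):  # rotate values one step
        k[p] = v
edges = np.max(responses, axis=0)  # strongest directional response per pixel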
Figure: the sample image filtered in the west, south west, south, south east, east and north east directions.
7.3.5 Sobel Filter :
The Sobel operator is an approximation to the derivative of an image. It is separated into the y and x directions. If we look at the x-direction, the gradient of the image in the x-direction is equal to the operator below. We use a 3x3 kernel, one for each of the x and y directions. The gradient for the x-direction has negative numbers on the left hand side and positive numbers on the right hand side, and we are preserving a little bit of the center pixels. Similarly, the gradient for the y-direction has negative numbers on the bottom and positive numbers on top, and here we are preserving a little bit of the middle row pixels.

G_x = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix} \qquad G_y = \begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}
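OpenCV provides these as cv2.Sobel(); a sketch:

import cv2

img = cv2.imread('sample.png', cv2.IMREAD_GRAYSCALE)
sobel_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)  # first derivative in x
sobel_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)  # first derivative in y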
Figure: the original image and the Sobel responses.
7.4 Gaussian Blur
Figure: the original image and its Gaussian blur.
7.4.1 Laplacian Filter

L(x, y) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}

The discrete approximation of the Laplacian at a specific pixel can be determined by taking the weighted mean of the pixel intensities in a small neighborhood of the pixel. The Laplacian operator:

D = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}
So let's apply this filter to the coin image:
8 Image Restoration
When we refer to image restoration problems, we basically mean that we have a degraded image and we want to recover the clean, non-degraded image. There could be many reasons for an image to get degraded; mainly, degradation may occur during image transmission, formation, and storage.
There are a lot of tasks in image restoration; let's talk about three main ones:
8.1 Image deblurring
Figure 13: Deblurring
8.2 Image in-painting
Figure 14: in-painting
8.3 Super resolution
Figure 15: Super resolution.
8.4 The Wiener filter
The Wiener filter requires the assumption that the signal and noise processes are second-order stationary (in the random process sense). For this description, only noise processes with zero mean will be considered (this is without loss of generality).
Wiener filters are usually applied in the frequency domain. Given a degraded image x(n,m), one takes the Discrete Fourier Transform (DFT) to obtain X(u,v). The original image spectrum is estimated by taking the product of X(u,v) with the Wiener filter G(u,v). The inverse DFT is then used to obtain the image estimate from its spectrum.
The Wiener filter is defined in terms of the power spectra Ps(u,v) of the signal and Pn(u,v) of the noise, and the blurring filter H(u,v):

G(u, v) = \frac{H^*(u, v)}{|H(u, v)|^2 + \frac{P_n(u, v)}{P_s(u, v)}}

In the absence of blurring, this reduces to:

G(u, v) = \frac{P_s(u, v)}{P_s(u, v) + \sigma_n^2}

where \sigma_n^2 is the noise variance.
After this brief introduction to the Wiener filter, let's apply it:
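A sketch using SciPy's adaptive implementation, scipy.signal.wiener, which estimates the local statistics from the image itself (window size is an assumption):

import cv2
from scipy.signal import wiener

img = cv2.imread('noisy.png', cv2.IMREAD_GRAYSCALE).astype(float)
restored = wiener(img, (5, 5))  # 5x5 local window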
Figure 17: The Wiener filter
8.5 The Perona-Malik equation
The Perona-Malik model evolves the image u according to the diffusion equation

\partial_t u = \operatorname{div}\big( g(|\nabla u|^2) \nabla u \big)

and it uses diffusivities such as

g(s^2) = \frac{1}{1 + s^2 / \lambda^2} \qquad (\lambda > 0)

Although Perona and Malik name their filter anisotropic, it should be noted that, in our terminology, it would be regarded as an isotropic model.
The flux function is defined as

\phi(s) = s \, g(s^2)
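A minimal explicit-scheme sketch of Perona-Malik diffusion (the parameter values are assumptions; the time step dt must stay small for stability):

import numpy as np

def perona_malik(u, n_iter=20, dt=0.2, lam=10.0):
    u = u.astype(float).copy()
    for _ in range(n_iter):
        # differences to the four nearest neighbours
        dn = np.roll(u, -1, axis=0) - u
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        g = lambda d: 1.0 / (1.0 + (d / lam) ** 2)  # Perona-Malik diffusivity
        u += dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u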
Figure 18: Diffusivity function, the corresponding flux function, and its derivative.
Sources :
https://www.sunpower-uk.com/glossary/what-is-a-frequency-filter/
https://sisu.ut.ee/imageprocessing/book/1
https://www.encyclopedia.com/computing/news-wires-white-papers-and-books/digital-images
https://packaging.python.org/tutorials/installing-packages/
https://github.com/
https://www.analyticsvidhya.com/blog/