Basic Operations in Image Processing - Poorvi Joshi - 2019 Batch
Image resizing –
Resizing is the scaling of images. Many times we need to resize an image, i.e. either shrink it or scale it up, to meet size requirements.
Interpolation Method for Resizing –
cv2.INTER_AREA: This is used when we need to shrink an image.
cv2.INTER_CUBIC: This is slower but produces higher-quality results, especially when enlarging.
cv2.INTER_LINEAR: This is primarily used when zooming is required. This is the default interpolation
technique in OpenCV.
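As an illustration, a minimal resizing sketch with cv2.resize(); the input file name and the scale factors are assumed example values:

import cv2

# A minimal sketch; "input.jpg" and the scale factors are assumptions.
img = cv2.imread("input.jpg")

# Shrink to half size with INTER_AREA (preferred when shrinking).
small = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)

# Enlarge to double size with INTER_CUBIC (slower, better quality).
large = cv2.resize(img, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)

cv2.imwrite("resized_small.jpg", small)
cv2.imwrite("resized_large.jpg", large)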
Code :-
Output :-
Eroding an image
Erosion erodes away the boundaries of the foreground object (always try to keep the foreground in white). It is
normally performed on binary images. It needs two inputs: the original image and a structuring element, or
kernel, which decides the nature of the operation.
kernel: a structuring element used for erosion. If element = Mat(), i.e. no kernel is supplied, a 3 x 3 rectangular
structuring element is used. A kernel can be created using getStructuringElement().
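A minimal erosion sketch, assuming a binary input image named binary_input.png and a 5 x 5 rectangular kernel:

import cv2
import numpy as np

# A minimal sketch; the file name and kernel size are assumptions.
img = cv2.imread("binary_input.png", cv2.IMREAD_GRAYSCALE)

# 5 x 5 rectangular structuring element (kernel).
kernel = np.ones((5, 5), np.uint8)
# Equivalent: kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

# Erode the white foreground; more iterations erode the boundary further.
eroded = cv2.erode(img, kernel, iterations=1)

cv2.imwrite("eroded.png", eroded)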
Code:-
Input :-
Output :-
Blurring an Image
It helps in noise removal. As noise is a high-frequency signal, applying a low-pass filter kernel suppresses it.
It helps in smoothing the image.
Low-intensity edges are removed.
It helps in hiding details when necessary; for example, in many cases the police deliberately want to hide the
face of a victim, and in such cases blurring is required.
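A minimal blurring sketch; the input file name and the 5 x 5 kernel sizes are assumed example values:

import cv2

# A minimal sketch; "input.jpg" and the kernel sizes are assumptions.
img = cv2.imread("input.jpg")

# Averaging (box) blur with a 5 x 5 kernel.
avg_blur = cv2.blur(img, (5, 5))

# Gaussian blur with a 5 x 5 kernel; sigma derived from the kernel size.
gauss_blur = cv2.GaussianBlur(img, (5, 5), 0)

cv2.imwrite("blurred.jpg", gauss_blur)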
Code
Input
Output
Gray scaling of images
Gray scaling is the process of converting an image from other color spaces e.g. RGB, CMYK, HSV, etc. to shades
of gray.
Dimension reduction: For example, an RGB image has three color channels and hence three dimensions,
while a grayscale image is single-channel.
Reduces model complexity: Consider training a neural network on RGB images of 10x10x3 pixels. The input
layer will have 300 input nodes. On the other hand, the same neural network will need only 100 input
nodes for grayscale images.
For other algorithms to work: Many algorithms are customized to work only on grayscale images e.g.
Canny edge detection function pre-implemented in OpenCV library works on Grayscale images only.
Three methods are commonly used: averaging the color channels, reading the image with flag = 0 in cv2.imread(), and converting with cv2.cvtColor().
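A minimal sketch of the three methods, assuming an input file named input.jpg:

import cv2

# Method 1: channel averaging (a simple numpy-based assumption).
img = cv2.imread("input.jpg")
gray_avg = img.mean(axis=2).astype("uint8")

# Method 2: read the image directly as grayscale (flag = 0).
gray_flag = cv2.imread("input.jpg", 0)

# Method 3: convert an already-loaded color image with cvtColor().
gray_cvt = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cv2.imwrite("gray.jpg", gray_cvt)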
Code
Output
Affine Transformation
The affine transformation is applied as follows:
• Consider every pixel coordinate in the image.
• Calculate the dot product of the pixel coordinate with a transformation matrix. The matrix differs depending
on the type of transformation being performed, as discussed below. The dot product gives the pixel
coordinate in the transformed image.
• Determine the pixel value in the transformed image using the pixel coordinate calculated in the previous
step. Since the dot product may produce non-integer pixel coordinates, we apply interpolation (see the sketch after the list of interpolation types below).
Types of interpolation: -
1. Nearest-neighbor (order = 0)
2. Bi-linear (order = 1)
3. Bi-quadratic (order = 2)
4. Bi-cubic (order = 3)
5. Bi-quartic (order = 4)
6. Bi-quintic (order = 5)
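The sketch below applies a generic 2 x 3 affine matrix with cv2.warpAffine(); the input file name and the shear factor of 0.3 are assumptions used only for illustration:

import cv2
import numpy as np

# A minimal sketch; "input.jpg" and the shear factor are assumptions.
img = cv2.imread("input.jpg")
rows, cols = img.shape[:2]

# A 2 x 3 affine matrix; here a horizontal shear.
M = np.float32([[1, 0.3, 0],
                [0, 1,   0]])

# warpAffine applies M to the pixel coordinates; the interpolation flag
# decides how non-integer coordinates are handled.
sheared = cv2.warpAffine(img, M, (cols, rows), flags=cv2.INTER_LINEAR)

cv2.imwrite("affine.jpg", sheared)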
Scaling
Scaling is a process of changing the distance (compression or elongation) between points in one or more axes.
This change in distance causes the object in the image to appear larger or smaller than the original input.
If the value of kx or ky is less than 1, then the objects in the image will appear
smaller.
Missing pixel values will be filled with 0 or based on the value of the warp parameter.
If the value of kx or ky is greater than 1, then the objects in the image will appear larger.
If the value of kx and ky are equal, the image is compressed or elongated by the same amount along both axes.
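A minimal scaling sketch expressed as an affine transform; the file name and the factors kx = ky = 0.5 are assumed values:

import cv2
import numpy as np

# A minimal sketch; "input.jpg" and the scale factors are assumptions.
img = cv2.imread("input.jpg")
rows, cols = img.shape[:2]

kx, ky = 0.5, 0.5  # < 1 shrinks the objects, > 1 enlarges them

# Scaling as an affine transform: [[kx, 0, 0], [0, ky, 0]].
M = np.float32([[kx, 0, 0],
                [0, ky, 0]])
scaled = cv2.warpAffine(img, M, (cols, rows))  # missing pixels filled with 0

cv2.imwrite("scaled.jpg", scaled)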
Code
Output
Rotation
Rotation is the process of changing the angular orientation of an image with respect to a fixed point.
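A minimal rotation sketch with cv2.getRotationMatrix2D() and cv2.warpAffine(); the 45-degree angle and the file name are assumptions, while the 0.6 scale factor mirrors the value mentioned below:

import cv2

# A minimal sketch; "input.jpg" and the rotation angle are assumptions.
img = cv2.imread("input.jpg")
rows, cols = img.shape[:2]

# Rotate 45 degrees about the image centre, scaling by 0.6 at the same time.
M = cv2.getRotationMatrix2D((cols / 2, rows / 2), 45, 0.6)
rotated = cv2.warpAffine(img, M, (cols, rows))

cv2.imwrite("rotated.jpg", rotated)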
Code
Output
Because the scale argument of the getRotationMatrix2D() function is set to 0.6, the picture is also scaled down at the same time as it is rotated.
Translating an image
Translation is the process of shifting the image along the various axes (x-, y- and z-axis). For a 2D image, we can
perform translation along one or both axes independently.
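A minimal translation sketch; the file name and the shifts of 60 and 30 pixels are assumed values:

import cv2
import numpy as np

# A minimal sketch; "input.jpg" and the shift amounts are assumptions.
img = cv2.imread("input.jpg")
rows, cols = img.shape[:2]

tx, ty = 60, 30  # shift in pixels along x and y

# Translation matrix [[1, 0, tx], [0, 1, ty]].
M = np.float32([[1, 0, tx],
                [0, 1, ty]])
shifted = cv2.warpAffine(img, M, (cols, rows))

cv2.imwrite("translated.jpg", shifted)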
Code
Output
Edge detection
Edges are a set of points in an image where there is a change of intensity between one side of that point and the
other. From calculus, we know that the changes in intensity can be measured by using the first or second
derivative.
First derivative filter :-
1. Sobel filter
2. Prewitt filter
3. Canny filter
Canny edge detection : -
Edge detection involves detecting sharp edges in the image. This is essential in the context of image
recognition and object localization/detection. There are several algorithms for detecting edges owing to its
wide applicability. We will be using one such algorithm, known as Canny edge detection.
A Gaussian filter is used on the image for smoothing.
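A minimal Canny sketch with cv2.Canny(); the file name and the hysteresis thresholds 100 and 200 are assumed values:

import cv2

# A minimal sketch; "input.jpg" and the thresholds are assumptions.
img = cv2.imread("input.jpg", 0)  # read as grayscale

# cv2.Canny applies Gaussian smoothing and gradient computation internally;
# 100 and 200 are the lower and upper hysteresis thresholds.
edges = cv2.Canny(img, 100, 200)

cv2.imwrite("edges.jpg", edges)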
Code
Output
Histogram
A histogram plots the number of pixels at each intensity value in the image. For an 8-bit based image such as JPEG, whose values span [0, 255], the number of values on the x-axis will be 256.
Histograms are a useful tool in determining the quality of the image.
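A minimal sketch of computing and plotting a grayscale histogram; the use of cv2.calcHist() with matplotlib, and the input file name, are assumptions:

import cv2
from matplotlib import pyplot as plt

# A minimal sketch; "input.jpg" is an assumed file name.
img = cv2.imread("input.jpg", 0)  # grayscale

# calcHist([image], [channel], mask, [histSize], [range])
hist = cv2.calcHist([img], [0], None, [256], [0, 256])

plt.plot(hist)
plt.xlabel("Intensity value")
plt.ylabel("Number of pixels")
plt.show()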
Code
Output
Histogram equalisation
It is a method in image processing of contrast adjustment using the image's histogram. This method usually
increases the global contrast of many images, especially when the usable data of the image is represented by
close contrast values. OpenCV provides cv2.equalizeHist(); its input is a grayscale image and its output is the
histogram-equalized image.
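A minimal sketch using cv2.equalizeHist(), assuming an input file named input.jpg:

import cv2

# A minimal sketch; equalizeHist expects a grayscale image.
img = cv2.imread("input.jpg", 0)

equalized = cv2.equalizeHist(img)

cv2.imwrite("equalized.jpg", equalized)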
Code
Input
Output
Simple thresholding
The basic thresholding technique is binary thresholding. The different simple thresholding techniques are:
cv2.THRESH_BINARY: If the pixel intensity is greater than the set threshold, the value is set to 255 (white),
else it is set to 0 (black).
cv2.THRESH_BINARY_INV: The inverted or opposite case of cv2.THRESH_BINARY.
cv2.THRESH_TRUNC: If the pixel intensity is greater than the threshold, it is truncated to the threshold, i.e.
the pixel value is set equal to the threshold. All other values remain the same.
cv2.THRESH_TOZERO: The pixel intensity is set to 0 for all pixels with intensity less than the threshold value;
other pixels remain unchanged.
cv2.THRESH_TOZERO_INV: The inverted or opposite case of cv2.THRESH_TOZERO.
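A minimal sketch of the simple thresholding flags with cv2.threshold(); the file name, the threshold of 127 and the maximum value of 255 are assumed example values:

import cv2

# A minimal sketch; "input.jpg" and the threshold values are assumptions.
img = cv2.imread("input.jpg", 0)

# Threshold at 127; the maximum value 255 is used by the BINARY variants.
_, th_binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
_, th_binary_inv = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)
_, th_trunc = cv2.threshold(img, 127, 255, cv2.THRESH_TRUNC)
_, th_tozero = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO)

cv2.imwrite("binary.jpg", th_binary)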
Code
Input
Output
Segmentation
Segmentation is the process of separating an image into multiple logical regions. The regions can be defined as
pixels sharing similar characteristics such as intensity, texture, etc.
Histogram based segmentation
In the histogram-based method, a threshold is determined by using the histogram of the image. Each pixel in the
image is compared with the threshold value.
The various histogram-based methods differ in their techniques of determining the threshold.
1. Otsu’s segmentation
2. Adaptive segmentation
Otsu’s segmentation
In Otsu thresholding, the threshold value is not chosen manually but is determined automatically. A bimodal image
(one with two distinct ranges of image values) is considered; the histogram it generates contains two peaks. A
generic condition would therefore be to choose a threshold value that lies in the middle of the two histogram peak values.
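A minimal Otsu sketch; the file name is an assumption, and the threshold argument passed to cv2.threshold() is ignored when the THRESH_OTSU flag is set:

import cv2

# A minimal sketch; "coins.jpg" is assumed to be a roughly bimodal image.
img = cv2.imread("coins.jpg", 0)

# Otsu's method picks the threshold from the histogram automatically.
otsu_value, otsu = cv2.threshold(img, 0, 255,
                                 cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold:", otsu_value)

cv2.imwrite("otsu.jpg", otsu)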
Code
Input
Output
Adaptive segmentation
A constant threshold value won’t help in the case of variable lighting conditions in different areas. Adaptive
segmentation is the method where the threshold value is calculated for smaller regions. This leads to
different threshold values for different regions with respect to the change in lighting.
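A minimal adaptive thresholding sketch with cv2.adaptiveThreshold(); the file name, the 11 x 11 block size and the constant 2 are assumed values:

import cv2

# A minimal sketch; "page.jpg" is assumed to be an unevenly lit image.
img = cv2.imread("page.jpg", 0)

# The threshold is computed per 11 x 11 neighbourhood (Gaussian-weighted
# mean), and the constant 2 is subtracted from it.
adaptive = cv2.adaptiveThreshold(img, 255,
                                 cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 11, 2)

cv2.imwrite("adaptive.jpg", adaptive)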
Code
Output
Counting the number of figures using Canny edge detection and contours
Steps :-
1. Convert the image to grayscale
2. Perform edge detection
3. Threshold the grayscale image
4. Find, count, and draw the contours
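A minimal sketch of the four steps; the file name, the Canny thresholds and the OpenCV 4.x findContours() return signature are assumptions:

import cv2

# A minimal sketch; "shapes.png" is assumed to contain light figures on a
# dark background.
img = cv2.imread("shapes.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)        # 1. grayscale
edges = cv2.Canny(gray, 50, 150)                    # 2. edge detection
_, thresh = cv2.threshold(gray, 127, 255,           # 3. thresholding
                          cv2.THRESH_BINARY)

# 4. find, count and draw contours (taken here from the thresholded image).
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
print("Number of figures:", len(contours))
cv2.drawContours(img, contours, -1, (0, 255, 0), 2)

cv2.imwrite("contours.png", img)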
Code
Output
Output
Denoising a colored image
Syntax: cv2.fastNlMeansDenoisingColored(P1, P2, float P3, float P4, int P5, int P6)
Parameters:
P1 – source image array
P2 – destination image array
P3 – parameter regulating the filter strength for the luminance component
P4 – same as above, but for the color components // not used for grayscale images
P5 – size in pixels of the template patch that is used to compute weights
P6 – size in pixels of the window that is used to compute a weighted average for the given pixel
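A minimal sketch with cv2.fastNlMeansDenoisingColored(); the file name, the filter strengths of 10 and the window sizes 7 and 21 are assumed example values:

import cv2

# A minimal sketch; "noisy.jpg" is an assumed file name.
img = cv2.imread("noisy.jpg")

# h = 10 (luminance strength), hColor = 10, template window 7, search window 21.
denoised = cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)

cv2.imwrite("denoised.jpg", denoised)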
Code
Output
Finding coordinates of contours
The coordinates of each vertex of a contour are contained in the contour itself. In this approach, we will be
using the numpy library to convert all the coordinates of a contour into a linear array. This linear array
contains the x and y coordinates of each vertex. The key point here is that the first coordinate in the array is
always the coordinate of the topmost vertex and hence can help in detecting the orientation of an
image.
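A minimal sketch of extracting contour coordinates; the file name and the use of cv2.approxPolyDP() to obtain the vertices are assumptions:

import cv2

# A minimal sketch; "shapes.png" is assumed to contain light figures on a
# dark background.
img = cv2.imread("shapes.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

for cnt in contours:
    # Approximate the contour to its vertices, then flatten the coordinates
    # into a linear array [x0, y0, x1, y1, ...].
    approx = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
    pts = approx.ravel()
    # The first (x, y) pair is the topmost vertex of the contour.
    print("Topmost vertex:", pts[0], pts[1])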
Code
Output
Bilateral filter
Compared with the Gaussian filter, a normalization factor and a range weight are added to the equation. Sigma 's'
denotes the spatial extent of the kernel, i.e. the size of the neighbourhood, and sigma 'r' denotes the minimum
amplitude of an edge. The range weight ensures that only those pixels with intensity values similar to that of the
central pixel are considered for blurring, while sharp intensity changes are maintained. The smaller the value of
sigma 'r', the sharper the edge. As sigma 'r' tends to infinity, the equation tends to a Gaussian blur.
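A minimal sketch using cv2.bilateralFilter(); the file name, the neighbourhood diameter of 9 and both sigma values of 75 are assumed example values:

import cv2

# A minimal sketch; "input.jpg" and the parameter values are assumptions.
img = cv2.imread("input.jpg")

# d = 9 neighbourhood diameter, sigmaColor (range sigma) = 75,
# sigmaSpace (spatial sigma) = 75.
filtered = cv2.bilateralFilter(img, 9, 75, 75)

cv2.imwrite("bilateral.jpg", filtered)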
Code
Output
Intensity transformation
Intensity transformations are applied on images for contrast manipulation or image thresholding. These are in
the spatial domain, i.e. they are performed directly on the pixels of the image at hand, as opposed to being
performed on the Fourier transform of the image.
1. Log transformation
2. Gamma transformation
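A minimal sketch of the log and gamma transformations implemented directly with numpy; the file name and the gamma value of 0.5 are assumed examples:

import cv2
import numpy as np

# A minimal sketch; "input.jpg" and gamma = 0.5 are assumptions.
img = cv2.imread("input.jpg", 0).astype(np.float64)

# Log transformation: s = c * log(1 + r), scaled back to [0, 255].
c = 255 / np.log(1 + img.max())
log_img = (c * np.log(1 + img)).astype(np.uint8)

# Gamma (power-law) transformation: s = 255 * (r / 255) ** gamma.
gamma = 0.5
gamma_img = (255 * (img / 255) ** gamma).astype(np.uint8)

cv2.imwrite("log_transformed.jpg", log_img)
cv2.imwrite("gamma_transformed.jpg", gamma_img)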
Code
Output
Log transformed-
Gamma transformed –
Background subtraction (Running average)
A running average of the frames is used to separate the foreground from the background. The video sequence is
analysed over a particular set of frames. During this sequence of frames, the running average over the
current frame and the previous frames is computed. This gives us the background model, and any new object
introduced during the video becomes part of the foreground. The current frame then holds the newly
introduced object together with the background. Finally, the absolute difference between the background model
(which is a function of time) and the current frame (which contains the newly introduced object) is computed.
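A minimal running-average sketch using cv2.accumulateWeighted(); the webcam index 0 and the learning rate 0.02 are assumptions:

import cv2

# A minimal sketch; assumes a webcam is available at index 0.
cap = cv2.VideoCapture(0)

avg = None  # running-average background model (floating-point image)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    if avg is None:
        avg = gray.astype("float")
        continue

    # Update the running average; the weight controls how fast the model adapts.
    cv2.accumulateWeighted(gray, avg, 0.02)

    # Foreground = absolute difference between frame and background model.
    diff = cv2.absdiff(gray, cv2.convertScaleAbs(avg))

    cv2.imshow("foreground", diff)
    if cv2.waitKey(30) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()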
Code
Output
Morphological operations