
Image Processing and Computer Vision are both exciting fields of Computer Science.
Computer Vision:
In Computer Vision, computers or machines are made to gain high-level understanding from input digital images or videos, with the purpose of automating tasks that the human visual system can do. It uses many techniques, and Image Processing is just one of them.
Image Processing:
Image Processing is the field of enhancing images by tuning many parameters and features of the images; it is thus a subset of Computer Vision. Here, transformations are applied to an input image and the resultant output image is returned. Some of these transformations are sharpening, smoothing, and stretching.
Now, as both fields work with visuals, i.e., images and videos, there is a lot of confusion about the difference between these two fields of computer science. In this article we will discuss the difference between them.
Difference between Image Processing and Computer Vision:
Image Processing:
 Mainly focused on processing raw input images to enhance them or to prepare them for other tasks.
 Uses methods like anisotropic diffusion, hidden Markov models, independent component analysis, and various filtering techniques.
 Image Processing is a subset of Computer Vision.
 Example applications: rescaling an image (digital zoom), correcting illumination, changing tones.

Computer Vision:
 Focused on extracting information from input images or videos to gain a proper understanding of them and interpret the visual input the way a human brain does.
 Image processing is one of the methods used for computer vision, along with other machine learning techniques, CNNs, etc.
 Computer Vision is a superset of Image Processing.
 Example applications: object detection, face detection, handwriting recognition.

What is Image Processing?


The Art of Beautifying Images
Imagine you have a photograph that isn’t quite perfect – maybe it’s
too dark, or the colors are dull. Image processing is like a magic
wand that transforms this photo into a better version. It involves
altering or improving digital images using various methods and
tools. Think of it as editing a photo to make it look more appealing
or to highlight certain features. It’s all about changing the image
itself.

What is Computer Vision?


Teaching Computers to Interpret Images
Now, imagine a robot looking at the same photograph. Unlike
humans, it doesn’t naturally understand what it’s seeing. This is
where computer vision comes in. It’s like teaching the robot to
recognize and understand the content of the image – is it a picture
of a cat, a car, or a tree? Computer vision doesn’t change the image.
Instead, it tries to make sense of it, much like how our brain
interprets what our eyes see.

Core Principles & Techniques


Computer Vision (CV): Seeing Beyond the
Surface
In the realm of Computer Vision, the goal is to teach computers to
understand and interpret visual information from the world around
them. Let’s explore some of the key principles and techniques that
make this possible:

Pattern Recognition
Think of this as teaching a computer to play a game of ‘spot the
difference’. By recognizing patterns, computers can identify
similarities and differences in images. This skill is crucial for tasks
like facial recognition or identifying objects in a scene.

Deep Learning
Deep Learning is like giving a computer a very complex brain that
learns from examples. By feeding it thousands, or even millions, of
images, a computer learns to identify and understand various
elements in these images. This is the backbone of modern computer
vision, enabling machines to recognize objects, people, and even
emotions.

Object Detection
This is where computers get really smart. Object detection is about
identifying specific objects within an image. It’s like teaching a
computer to not just see a scene, but to understand what each part
of that scene is. For instance, in a street scene, it can distinguish
cars, people, trees, and buildings.

Image Processing: Transforming Pixels into Perfection
In the world of Image Processing, the magic lies in altering and
enhancing images to make them more useful or visually appealing.
Let’s break down some of the fundamental principles and
techniques:

Image Enhancement
This is like giving a makeover to an image. Image enhancement can
brighten up a dark photo, bring out hidden details, or make colors
pop. It’s all about improving the look and feel of an image to make it
more pleasing or informative.

Filtering
Imagine sifting through the ‘noise’ to find the real picture. Image
filtering involves removing or reducing unwanted elements from an
image, like blurring, smoothening rough edges, or sharpening blurry
parts. It helps in cleaning up the image to highlight the important
features.
Transformation Techniques
This is where an image can take on a new shape or form.
Transformation techniques might include resizing an image, rotating
it, or even warping it to change perspective. It’s like reshaping the
image to fit a specific purpose or requirement.

These techniques form the toolbox of image processing, enabling us to manipulate and enhance images in countless ways.

Distinctions Between Computer Vision and Image Processing
Image Processing: Visual Perfection
The primary aim of image processing is to improve image quality.
Whether it’s enhancing contrast, adjusting colors, or smoothing
edges, the focus is on making the image more visually appealing or
suitable for further use. It’s about transforming the raw image into a
refined version of itself.
Image processing focuses on enhancing and transforming images.
It’s vital in fields like digital photography for color correction,
medical imaging for clearer scans, and graphic design for creating
stunning visuals. These transformations not only improve aesthetics
but also make images more suitable for analysis, laying the
groundwork for deeper interpretation, including by computer vision
systems.

Computer Vision: Decoding the Visual World


Computer vision, on the other hand, seeks to extract meaning from
images. The goal isn’t to change how the image looks but to
understand what the image represents. This involves identifying
objects, interpreting scenes, and even recognizing patterns and
behaviors within the image. It’s more about comprehension rather
than alteration.
Computer Vision, conversely, aims to extract meaning and
understanding from images. It’s at the heart of AI and robotics,
helping machines recognize faces, interpret road scenes for
autonomous vehicles, and understand human behavior. The success
of these tasks often relies on the quality of image processing. High-
quality, well-processed images can significantly enhance the
accuracy of computer vision algorithms.

Techniques and Tools


Image Processing Techniques and Tools
In image processing, the toolkit includes a range of software and
algorithms specifically designed for modifying images. This includes:

 Software like Photoshop and GIMP, for manual edits such as retouching and resizing.
 Algorithms for automated tasks, like histogram equalization for contrast adjustment and filters for noise reduction and edge enhancement.

Computer Vision Techniques and Tools

Computer Vision, on the other hand, employs a different set of methodologies:
 Machine Learning and Deep Learning algorithms such as Convolutional Neural Networks (CNNs) are pivotal for tasks like image classification and object recognition.
 Pattern recognition tools are used to identify and classify objects within an image, essential for applications like facial recognition.

Image Processing:
 The input and output are images.
 Changes the input's properties.
 Doesn't interpret an image.
 Often the first step of an application.

Computer Vision:
 The input can be an image or a video; the output can be a label or a bounding box.
 Usually doesn't change the input's properties.
 Extracts useful information from the input.
 We use it after the image-processing stage.

Image Processing Algorithms in Computer Vision





In the field of computer vision, image preprocessing is a crucial step
that involves transforming raw image data into a format that can be
effectively utilized by machine learning algorithms. Proper
preprocessing can significantly enhance the accuracy and efficiency
of image recognition tasks. This article explores various image
preprocessing algorithms commonly used in computer vision.
What is image processing in computer vision?
Image processing in computer vision refers to a set of techniques
and algorithms used to manipulate and analyze digital images to
extract meaningful information. It involves transforming raw image
data into a format that is easier to understand and interpret by both
humans and machine learning algorithms. The goal of image
processing is to improve the quality of images, enhance specific
features, and prepare the data for further analysis, recognition, and
decision-making tasks.
Key Objectives of Image Processing
1. Enhancement: Improving the visual appearance of an
image or accentuating certain features to make them more
prominent.
2. Restoration: Correcting defects or distortions in an image
to recover the original scene.
3. Segmentation: Dividing an image into meaningful regions
or objects for easier analysis.
4. Compression: Reducing the size of image files for storage
and transmission efficiency.
5. Analysis: Extracting quantitative or qualitative information
from images.
Now we will discuss the different preprocessing algorithms used in computer vision:
1. Image Resizing
Image resizing changes the dimensions of an image. The challenge
is to maintain image quality while altering its size. Here are the
main interpolation methods:
a) Nearest Neighbor Interpolation:
 Simplest and fastest method
 Selects the pixel value of the nearest neighbor
 Results in a blocky image when enlarging
b) Bilinear Interpolation:
 Uses a weighted average of the 4 nearest pixel values
 Smoother results than nearest neighbor, but can cause
some blurring
c) Bicubic Interpolation:
 Uses a weighted average of the 16 nearest pixel values
 Produces the smoothest results, especially for photographic
images
Code Implementation
Python
import cv2
import numpy as np
import matplotlib.pyplot as plt

def resize_image(image, target_size):
    print(f"Original image shape: {image.shape}")
    print(f"Target size: {target_size}")

    # Resize with three different interpolation methods
    resized_nn = cv2.resize(image, target_size, interpolation=cv2.INTER_NEAREST)
    resized_bilinear = cv2.resize(image, target_size, interpolation=cv2.INTER_LINEAR)
    resized_bicubic = cv2.resize(image, target_size, interpolation=cv2.INTER_CUBIC)

    print(f"Resized image shape: {resized_nn.shape}")
    return resized_nn, resized_bilinear, resized_bicubic

# Example usage
image_path = '/kaggle/input/sample-image/tint1.jpg'
image = cv2.imread(image_path)

if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")

    target_size = (800, 600)
    nn, bilinear, bicubic = resize_image(image, target_size)

    # Display the results
    plt.figure(figsize=(15, 5))

    plt.subplot(131)
    plt.imshow(cv2.cvtColor(nn, cv2.COLOR_BGR2RGB))
    plt.title('Nearest Neighbor')
    plt.axis('off')

    plt.subplot(132)
    plt.imshow(cv2.cvtColor(bilinear, cv2.COLOR_BGR2RGB))
    plt.title('Bilinear')
    plt.axis('off')

    plt.subplot(133)
    plt.imshow(cv2.cvtColor(bicubic, cv2.COLOR_BGR2RGB))
    plt.title('Bicubic')
    plt.axis('off')

    plt.tight_layout()
    plt.show()

print("Script completed")
Output: the image resized with Nearest Neighbor, Bilinear, and Bicubic interpolation, shown side by side.

Code Explanation:
 Import Libraries: Imports cv2 for image
processing, numpy for calculations, and matplotlib.pyplot for
plotting images.
 Define resize_image Function: Resizes the image using
three methods: Nearest Neighbor, Bilinear, and Bicubic
interpolation.
 Load Image: Reads the image from the specified path and
checks if the image was loaded successfully.
 Perform Resizing: Applies Nearest Neighbor, Bilinear, and
Bicubic resizing methods to the image.
 Display Results: Shows the original image resized using
different methods side by side using matplotlib.
2. Image Normalization
Normalization adjusts pixel intensity values to a standard scale.
Common techniques include:
a) Min-Max Scaling:
 Scales values to a fixed range, typically [0, 1].
Formula: (x - min) / (max - min)
b) Z-score Normalization:
 Transforms data to have a mean of 0 and standard deviation
of 1.
Formula: (x - mean) / standard_deviation
c) Histogram Equalization:
Enhances contrast by spreading out the most frequent intensity
values.
Implementation
Python
import cv2
import numpy as np
import matplotlib.pyplot as plt

def normalize_image(image):
    print("Performing image normalization...")

    # Min-Max Scaling to the full [0, 255] range
    min_max = cv2.normalize(image, None, 0, 255, cv2.NORM_MINMAX)

    # Z-score Normalization, channel by channel
    z_score = np.zeros_like(image, dtype=np.float32)
    for i in range(3):  # for each channel
        channel = image[:, :, i]
        mean = np.mean(channel)
        std = np.std(channel)
        # add a small value to avoid division by zero
        z_score[:, :, i] = (channel - mean) / (std + 1e-8)
    z_score = cv2.normalize(z_score, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # Histogram Equalization (on the grayscale image)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    hist_eq = cv2.equalizeHist(gray)

    print("Normalization complete.")
    return min_max, z_score, hist_eq

# Example usage
image_path = '/kaggle/input/sample-image/input_image.jpg'
image = cv2.imread(image_path)

if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")

    min_max, z_score, hist_eq = normalize_image(image)

    # Display the results
    plt.figure(figsize=(20, 5))

    plt.subplot(141)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original')
    plt.axis('off')

    plt.subplot(142)
    plt.imshow(cv2.cvtColor(min_max, cv2.COLOR_BGR2RGB))
    plt.title('Min-Max Scaling')
    plt.axis('off')

    plt.subplot(143)
    plt.imshow(cv2.cvtColor(z_score, cv2.COLOR_BGR2RGB))
    plt.title('Z-score Normalization')
    plt.axis('off')

    plt.subplot(144)
    plt.imshow(hist_eq, cmap='gray')
    plt.title('Histogram Equalization')
    plt.axis('off')

    plt.tight_layout()
    plt.show()

print("Script completed")
Output: the original image alongside its Min-Max scaled, Z-score normalized, and histogram-equalized versions.

Code Explanation:
 Import Libraries: Imports cv2, numpy, and
matplotlib.pyplot for image processing and visualization.
 Define normalize_image Function: Applies Min-Max
Scaling, Z-Score Normalization, and Histogram Equalization
to the image.
 Load Image: Reads an image from a specified path and
checks if it loaded successfully.
 Perform Normalization: Executes Min-Max Scaling, Z-
Score Normalization, and Histogram Equalization to adjust
the image.
 Display Results: Shows the original and processed images
side by side using matplotlib.
3. Image Augmentation
Image augmentation creates modified versions of images to expand
the training dataset. Key techniques include:
a) Geometric Transformations:
 Rotation: Turning the image around a center point.
 Flipping: Mirroring the image horizontally or vertically.
 Scaling: Changing the size of the image.
 Cropping: Cutting out a part of the image to use.
b) Color Adjustments:
 Brightness: Making the image lighter or darker.
 Contrast: Changing the difference between light and dark
areas.
 Saturation: Adjusting the intensity of colors.
 Hue: Shifting the overall color tone of the image.
c) Noise Addition:
 Adding Random Noise: Introducing random variations to
pixel values to simulate imperfections.
Python
import cv2
import numpy as np
import matplotlib.pyplot as plt

def augment_image(image):
    print("Performing image augmentation...")

    # Rotation by 45 degrees around the image centre
    rows, cols = image.shape[:2]
    M = cv2.getRotationMatrix2D((cols / 2, rows / 2), 45, 1)
    rotated = cv2.warpAffine(image, M, (cols, rows))

    # Flipping (1 = horizontal flip)
    flipped = cv2.flip(image, 1)

    # Brightness adjustment (scale pixel values by 1.5)
    bright = cv2.convertScaleAbs(image, alpha=1.5, beta=0)

    # Add Gaussian noise in float space, then clip back to the valid 0-255 range
    noise = np.random.normal(0, 25, image.shape)
    noisy = np.clip(image.astype(np.float32) + noise, 0, 255).astype(np.uint8)

    print("Augmentation complete.")
    return rotated, flipped, bright, noisy

# Example usage
image_path = '/kaggle/input/sample-image/input_image.jpg'
image = cv2.imread(image_path)

if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    rotated, flipped, bright, noisy = augment_image(image)

    # Display the results
    plt.figure(figsize=(20, 5))

    plt.subplot(151)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original')
    plt.axis('off')

    plt.subplot(152)
    plt.imshow(cv2.cvtColor(rotated, cv2.COLOR_BGR2RGB))
    plt.title('Rotated')
    plt.axis('off')

    plt.subplot(153)
    plt.imshow(cv2.cvtColor(flipped, cv2.COLOR_BGR2RGB))
    plt.title('Flipped')
    plt.axis('off')

    plt.subplot(154)
    plt.imshow(cv2.cvtColor(bright, cv2.COLOR_BGR2RGB))
    plt.title('Brightness Adjusted')
    plt.axis('off')

    plt.subplot(155)
    plt.imshow(cv2.cvtColor(noisy, cv2.COLOR_BGR2RGB))
    plt.title('Noisy')
    plt.axis('off')

    plt.tight_layout()
    plt.show()

print("Script completed")
Output: the original image with its rotated, flipped, brightness-adjusted, and noisy versions.
Code Explanation:
 Import Libraries: Imports cv2 for image processing,
numpy for numerical operations, and matplotlib.pyplot for
displaying images.
 Define augment_image Function: Performs four types of
image augmentations: rotation, flipping, brightness
adjustment, and noise addition.
 Load Image: Reads the image from the specified path and
confirms successful loading.
 Perform Augmentation: Applies rotation, horizontal
flipping, brightness adjustment, and adds Gaussian noise to
the image.
 Display Results: Shows the original and augmented
images (rotated, flipped, brightness adjusted, noisy) side by
side using matplotlib.
4. Image Denoising
Denoising removes noise from images, enhancing quality and
clarity. Common methods include:
a) Gaussian Blur:
 Applies a Gaussian function to smooth the image.
 Effective for reducing Gaussian noise.
b) Median Filtering:
 Replaces each pixel with the median of neighboring pixels.
 Effective for salt-and-pepper noise.
c) Bilateral Filtering:
Preserves edges while reducing noise.
Code Implementation
Python
import cv2
import matplotlib.pyplot as plt

def denoise_image(image):
    print("Performing image denoising...")

    # Gaussian blur with a 5x5 kernel
    gaussian = cv2.GaussianBlur(image, (5, 5), 0)
    # Median filter with a 5-pixel aperture
    median = cv2.medianBlur(image, 5)
    # Bilateral filter: reduces noise while preserving edges
    bilateral = cv2.bilateralFilter(image, 9, 75, 75)

    print("Denoising complete.")
    return gaussian, median, bilateral

# Example usage
image_path = '/kaggle/input/sample-image/noisy_image.jpg'
image = cv2.imread(image_path)

if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    gaussian, median, bilateral = denoise_image(image)

    # Display the results
    plt.figure(figsize=(20, 5))

    plt.subplot(141)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original (Noisy)')
    plt.axis('off')

    plt.subplot(142)
    plt.imshow(cv2.cvtColor(gaussian, cv2.COLOR_BGR2RGB))
    plt.title('Gaussian Blur')
    plt.axis('off')

    plt.subplot(143)
    plt.imshow(cv2.cvtColor(median, cv2.COLOR_BGR2RGB))
    plt.title('Median Blur')
    plt.axis('off')

    plt.subplot(144)
    plt.imshow(cv2.cvtColor(bilateral, cv2.COLOR_BGR2RGB))
    plt.title('Bilateral Filter')
    plt.axis('off')

    plt.tight_layout()
    plt.show()

print("Script completed")
Output: the noisy image alongside its Gaussian-blurred, median-filtered, and bilateral-filtered versions.
Code Explanation:
 Import Libraries: Imports cv2 for image processing and
matplotlib.pyplot for displaying images.
 Define denoise_image Function: Applies three different
denoising techniques to the noisy image: Gaussian Blur,
Median Blur, and Bilateral Filter.
 Load Image: Reads the noisy image from the specified path
and checks if the image was successfully loaded.
 Perform Denoising: Applies Gaussian Blur, Median Blur,
and Bilateral Filter to the noisy image to reduce noise.
 Display Results: Uses matplotlib to display the original
noisy image and the results of the three denoising methods
side by side.
5. Edge Detection
Edge detection identifies boundaries of objects within images. Key
algorithms include:
a) Sobel Operator
 Gradient Calculation: Measures changes in image
intensity.
 Edge Emphasis: Highlights horizontal and vertical edges
using convolution kernels.
b) Canny Edge Detector
 Multi-stage Algorithm: Involves noise reduction, gradient
calculation, and edge tracking.
 Intensity Gradients and Non-maximum
Suppression: Detects edges by suppressing non-maximal
pixels.
c) Laplacian of Gaussian (LoG)
 Combines Gaussian Smoothing with Laplacian
Operator: Smooths image to reduce noise, then applies
Laplacian to detect edges.
Code Example:
Python
import cv2
import numpy as np
import matplotlib.pyplot as plt

def detect_edges(image):
    print("Performing edge detection...")

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Sobel: gradient magnitude from horizontal and vertical derivatives
    sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    sobel = np.sqrt(sobelx**2 + sobely**2)
    sobel = np.uint8(sobel / sobel.max() * 255)

    # Canny with lower/upper hysteresis thresholds of 100 and 200
    canny = cv2.Canny(gray, 100, 200)

    # Laplacian of Gaussian: smooth first, then apply the Laplacian
    blur = cv2.GaussianBlur(gray, (3, 3), 0)
    log = cv2.Laplacian(blur, cv2.CV_64F)
    log = np.uint8(np.absolute(log))

    print("Edge detection complete.")
    return sobel, canny, log

# Example usage
image_path = '/kaggle/input/sample-image/tint1.jpg'
image = cv2.imread(image_path)

if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    sobel, canny, log = detect_edges(image)

    # Display the results
    plt.figure(figsize=(20, 5))

    plt.subplot(141)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original')
    plt.axis('off')

    plt.subplot(142)
    plt.imshow(sobel, cmap='gray')
    plt.title('Sobel')
    plt.axis('off')

    plt.subplot(143)
    plt.imshow(canny, cmap='gray')
    plt.title('Canny')
    plt.axis('off')

    plt.subplot(144)
    plt.imshow(log, cmap='gray')
    plt.title('Laplacian of Gaussian')
    plt.axis('off')

    plt.tight_layout()
    plt.show()

print("Script completed")
Output: the original image with its Sobel, Canny, and Laplacian of Gaussian edge maps.
Explanation:
 Import Libraries: Imports cv2 for image processing,
numpy for calculations, and matplotlib.pyplot for plotting
images.
 Define detect_edges Function: Converts the image to
grayscale and applies Sobel, Canny, and Laplacian of
Gaussian methods to detect edges.
 Load Image: Reads an image from a specified path and
verifies if it was loaded successfully.
 Perform Edge Detection: Calls detect_edges to get edge-
detected versions of the image using Sobel, Canny, and LoG
techniques.
 Display Results: Uses matplotlib to show the original
image and the edge-detected results side by side.
6. Image Binarization
Binarization converts an image to black and white based on a
threshold. Methods include:
a) Global Thresholding:
 Applies a single threshold value for the entire image.
 Otsu's method automatically determines the optimal
threshold.
b) Adaptive Thresholding:
 Uses different thresholds for different regions of the image.
 Better for images with varying illumination.
Python
import cv2
import matplotlib.pyplot as plt

def binarize_image(image):
    print("Performing image binarization...")

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Global thresholding (Otsu's method picks the threshold automatically)
    _, global_thresh = cv2.threshold(gray, 0, 255,
                                     cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Adaptive thresholding: per-region, Gaussian-weighted thresholds
    adaptive_thresh = cv2.adaptiveThreshold(gray, 255,
                                            cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                            cv2.THRESH_BINARY, 11, 2)

    print("Binarization complete.")
    return global_thresh, adaptive_thresh

# Example usage
image_path = '/kaggle/input/sample-image/tint1.jpg'
image = cv2.imread(image_path)

if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    global_thresh, adaptive_thresh = binarize_image(image)

    # Display the results
    plt.figure(figsize=(15, 5))

    plt.subplot(131)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original')
    plt.axis('off')

    plt.subplot(132)
    plt.imshow(global_thresh, cmap='gray')
    plt.title('Global Thresholding')
    plt.axis('off')

    plt.subplot(133)
    plt.imshow(adaptive_thresh, cmap='gray')
    plt.title('Adaptive Thresholding')
    plt.axis('off')

    plt.tight_layout()
    plt.show()

print("Script completed")
Output: the original image with its globally and adaptively thresholded versions.
Explanation:
 Import Libraries: Imports cv2 for image processing and
matplotlib.pyplot for visualizing images.
 Define binarize_image Function: Converts the image to
grayscale and applies global thresholding (Otsu’s method)
and adaptive thresholding for binary image creation.
 Load Image: Reads an image from the specified path and
checks if it was loaded correctly.
 Perform Binarization: Calls binarize_image to apply global
and adaptive thresholding methods on the grayscale image
to create binary images.
 Display Results: Uses matplotlib to show the original
image, global thresholded image, and adaptive thresholded
image in a 1x3 grid layout.
What is Computer Vision?
Computer vision is a field of study within artificial intelligence (AI) that focuses on enabling computers to interpret and extract information from images and videos, in a manner similar to human vision. It involves developing algorithms and techniques to extract meaningful information from visual inputs and make sense of the visual world.
Prerequisite: before starting Computer Vision, it is recommended that you have foundational knowledge of Machine Learning, Deep Learning, and OpenCV. You can refer to our tutorial page on prerequisite technologies.

Computer Vision Examples:


Here are some examples of computer vision:
 Facial recognition: Identifying individuals through visual
analysis.
 Self-driving cars: Using computer vision to navigate and
avoid obstacles.
 Robotic automation: Enabling robots to perform tasks
and make decisions based on visual input.
 Medical anomaly detection: Detecting abnormalities in
medical images for improved diagnosis.
 Sports performance analysis: Tracking athlete
movements to analyze and enhance performance.
 Manufacturing fault detection: Identifying defects in
products during the manufacturing process.
 Agricultural monitoring: Monitoring crop growth,
livestock health, and weather conditions through visual
data.
These are just a few examples of the many ways that computer
vision is used today. As the technology continues to develop, we
can expect to see even more applications for computer vision in
the future.

Computer Vision Tutorials Index


Overview of computer vision and its Applications
 Computer Vision – Introduction
 A Quick Overview to Computer Vision
 Applications of Computer Vision
 Image Formation Tools & Technique
o Digital Photography
o Satellite Image Processing
o Lidar(Light Detection and Ranging)
o Synthetic Image Generation
o Image Stitching & Composition
o Fundamentals of Image Formation
o Image Formats
 Beginner’s Guide to Photoshop Tools
Image Processing & Transformation
 Digital Image
o Digital Image Processing Basics
o Digital image color spaces
o RGB, HSV,
 Image Transformation:
o Pixel Transformation
o Geometric transformations
o Fourier Transforms for Image Transformation
o Intensity Transformation
 Image Enhancement Techniques
o Histogram Equalization
o Color correction
o Color Inversion using Pillow
o Automatic color correction with
OpenCV and Python
o Contrast Enhancement
o Image Sharpening
o sharpen() function in Wand
o Edge Detection
o Image Edge Detection Operators
o Edge Detection using Pillow
o OpenCV – Roberts Edge
Detection
o OpenCV – Canny Edge Detector
o Edge detection using Prewitt,
Scharr and Sobel Operator
o Noise Reduction & Filtering Technique
o Smoothing and Blurring the
Image
o Gaussian Smoothing
o GaussianBlur()
method
o Apply a Gauss filter
to an image
o Spatial Filtering
o Spatial Filters –
Averaging filter and
Median filter
o MedianFilter() and
ModeFilter()
o Image Restoration
Using Spatial
Filtering
o Bilateral Filtering
o Morphological operations
o Erosion and Dilation of Images
o Difference between Opening and
Closing in Digital Image
Processing
o Image Denoising Techniques
o Denoising of colored images
using opencv
o Total Variation Denoising
o Wavelet Denoising
o Non-Local Means Denoising
Feature Extraction and Description:
 Feature detection and matching with OpenCV-Python
 Boundary Feature Descriptors
 Region Feature Descriptors
 Interest point detection
 Local feature descriptors
 Harris Corner Detection
 Scale-Invariant Feature Transform (SIFT)
 Speeded-Up Robust Features (SURF)
o Mahotas – Speeded-Up Robust Features
 Histogram of Oriented Gradients (HOG)
 Principal Component as Feature Detectors
 Local Binary Patterns (LBP)
 Convolutional Neural Networks (CNN)
Deep Learning for Computer Vision
 Convolutional Neural Networks (CNN)
o Introduction to Convolution Neural Network
o Types of Convolutions
o Strided Convolutions
o Dilated Convolution
o Flattened
Convolutions
o Spatial and Cross-
Channel
convolutions
o Depthwise
Separable
Convolutions
o Grouped
Convolutions
o Shuffled Grouped
Convolutions
o Continuous Kernel
Convolution
o What are Pooling Layers?
o Introduction to Padding
o Same and Valid Padding
 Data Augmentation in Computer Vision
 Deep ConvNets Architectures for Computer Vision
o ImageNet Dataset
o Transfer Learning for Computer Vision
o What is Transfer Learning?
o Residual Network
o ResNet
o Inception Network
o GoogleNet (or
InceptionNet)
o Inception Network
V1
o Inception V2 and V3
o MobileNet
o Image Recognition
with Mobilenet
o EfficientNet
o Visual Geometry Group Network
(VGGNet)
o VGG-16 | CNN
model
o FaceNet Architecture
 AutoEncoders
o How Autoencoders works
o Encoder and Decoder network architecture
o Difference between Encoder and
Decoder
o Latent space representation
o Implementing an Autoencoder in PyTorch
o Autoencoders for Computer Vision:
o Feedforward Autoencoders
o Deep Convolutional
Autoencoders
o Variational autoencoders (VAEs)
o Denoising autoencoders
o Sparse autoencoders
o Adversarial Autoencoder
o Applications of Autoencoders
o Dimensionality reduction and
feature extraction using
autoencoders
o Image compression and
reconstruction techniques
o Anomaly detection and outlier
identification with autoencoders
 Generative Adversarial Network (GAN)
o Deep Convolutional GAN
o StyleGAN – Style Generative Adversarial
Networks
o Cycle Generative Adversarial Network
(CycleGAN)
o Super Resolution GAN (SRGAN)
o Selection of GAN vs Adversarial Autoencoder
models
o Real-Life Application of GAN
o Image and Video Generation
using DCGANs
o Conditional GANs for image
synthesis and style transfer
o VAEs for image generation and
latent space manipulation
o Evaluation metrics for generative models
Object Detection and Recognition
 Introduction to Object Detection and Recognition
o Introduction to Object Detection?
 Traditional Approaches for Object Detection and
Recognition
o Feature-based approaches: SIFT, SURF, HOG
o Sliding Window Approach
o Selective Search for Object Detection
o Haar Cascades for Object Detection
o Template Matching
 Object Detection Techniques
o Bounding Box Predictions in Object Detection
o Intersection over Union
o Non – Max Suppression
o Anchor Boxes in Object Detection
o Region Proposals in Object Detection
o Feature Pyramid Networks (FPN)
o Contextual information and attention
mechanisms
o Object tracking and re-identification
 Neural network-based approach for Object Detection and
Recognition
o Region Proposals in Object Detection | R-CNN
o Fast R-CNN
o Faster R – CNN
o Single Shot MultiBox Detector (SSD)
o You Only Look Once (YOLO) Algorithm in Object Detection
o YOLO v2 – Object Detection
 Object Recognition in Video
 Evaluation Metrics for Object Detection and Recognition
o Intersection over Union (IoU)
o Precision, recall, and F1 score
o Mean Average Precision (mAP)
 Object Detection and Recognition Applications
o Object Detection and Self-Driving Cars
o Object Localization
o Landmark Detection
o Face detection and recognition
o What is Face Recognition Task?
o DeepFace Recognition
o Eigen Faces for Face Recognition
o Emojify using Face Recognition
with Machine Learning
o Face detection and landmark
localization
o Facial expression recognition
o Hand gesture recognition
o Pedestrian detection
o Object Detection with Detection Transformer
(DETR) by Facebook
o Vehicle detection and tracking
o Object detection for autonomous driving
o Object recognition in medical imaging
Image Segmentation
 Introduction to Image Segmentation
 Point, Line & Edge Detection
 Thresholding Technique for Image Segmentation
 Contour Detection & Extraction
 Graph-based Segmentation
 Region-based Segmentation
o Region and Edge Based Segmentation
o Watershed Segmentation Algorithm
o Semantic Segmentation
 Deep Learning Approaches to Image Segmentation
o Fully convolutional networks (FCN)
o U-Net architecture for semantic segmentation
o Image Segmentation Using UNet
o Mask R-CNN for instance segmentation
o Mask R – CNN
o Encoder-Decoder architectures (e.g., SegNet,
DeepLab)
 Evaluation Metrics for Image Segmentation
o Pixel-level evaluation metrics (e.g., accuracy,
precision, recall)
o Region-level evaluation metrics (e.g., Jaccard
Index, Dice coefficient)
o Mean Intersection over Union (mIoU)
o Boundary-based evaluation metrics (e.g., average precision, F-measure)
3D Reconstruction
 Structure From Motion for 3D Reconstruction
 Monocular Depth Estimation Techniques
 Fusion Techniques for 3D Reconstruction
o LiDAR | Light Detection and Ranging
o Depth Sensor Fusion
 Volumetric Reconstruction
 Point Cloud Reconstruction

Computer Vision Interview Questions

 Computer Vision Interview

Computer Vision Projects

 Top Computer Vision Projects


How does Computer Vision Work?
Computer Vision works much like the human eye and brain. To get information, the eye first captures an image and sends that signal to the brain. The brain then processes the signal, converts it into meaningful information about the object, and recognizes/categorizes the object based on its properties.
Computer Vision works in a similar fashion: a camera captures the objects, pattern recognition algorithms process the visual data, and the object is identified based on its properties. But before giving unknown data to the machine/algorithm, we train the machine on a vast amount of labelled visual data. This labelled data enables the machine to analyze the patterns in all the data points and relate them to the labels.
Example: Suppose we provide thousands of recordings of bird songs. The computer learns from this data, analyzes each sound (its pitch, the duration of each note, its rhythm, etc.), identifies the patterns characteristic of bird songs, and generates a model. As a result, this recognition model can accurately detect whether an input sound contains a bird song.
Evolution of Computer Vision
Time Period: 2010-2015
1. Development of deep learning algorithms for image recognition.
2. Introduction of convolutional neural networks (CNNs) for image classification.
3. Use of computer vision in autonomous vehicles for object detection and navigation.

Time Period: 2015-2020
1. Advancements in real-time object detection with systems like YOLO (You Only Look Once).
2. Advancements in facial recognition technology, used in applications such as unlocking smartphones and surveillance.
3. Integration of computer vision in augmented reality (AR) and virtual reality (VR) systems.
4. Use of computer vision in medical imaging for disease diagnosis.

Time Period: 2020-2025 (Predicted)
1. Further advancements in real-time object detection and image recognition.
2. More sophisticated use of computer vision in autonomous vehicles.
3. Increased use of computer vision in healthcare for early disease detection and treatment.
4. Integration of computer vision in more consumer products, like smart home devices.

Applications of Computer Vision


1. Healthcare: Computer vision is used in medical imaging to
detect diseases and abnormalities. It helps in analyzing X-
rays, MRIs, and other scans to provide accurate diagnoses.
2. Automotive Industry: In self-driving cars, computer vision
is used for object detection, lane keeping, and traffic sign
recognition. It helps in making autonomous driving safe and
efficient.
3. Retail: Computer vision is used in retail for inventory
management, theft prevention, and customer behaviour
analysis. It can track products on shelves and monitor
customer movements.
4. Agriculture: In agriculture, computer vision is used for
crop monitoring and disease detection. It helps in
identifying unhealthy plants and areas that need more
attention.
5. Manufacturing: Computer vision is used for quality control in manufacturing. It can detect defects in products that are hard to spot with the human eye.
6. Security and Surveillance: Computer vision is used in
security cameras to detect suspicious activities, recognize
faces, and track objects. It can alert security personnel
when it detects a threat.
7. Augmented and Virtual Reality: In AR and VR, computer
vision is used to track the user’s movements and interact
with the virtual environment. It helps in creating a more
immersive experience.
8. Social Media: Computer vision is used in social media for
image recognition. It can identify objects, places, and
people in images and provide relevant tags.
9. Drones: In drones, computer vision is used for navigation
and object tracking. It helps in avoiding obstacles and
tracking targets.
10. Sports: In sports, computer vision is used for player
tracking, game analysis, and highlight generation. It can
track the movements of players and the ball to provide
insightful statistics.

An image processing algorithm takes in an image or a video as input, processes it, and the result of the processing is still an image or a video. A computer vision algorithm, however, takes in an image or a video, processes it, and constructs explicit and meaningful descriptions from it.
What is an Image?

An image can be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray-level of the image at that point. When x, y, and the amplitude of f are all finite, discrete values, we call the image a digital image.

— Rafael C. Gonzalez & Richard E. Woods


What is a Video?

A basic definition of a video is images stacked along the temporal axis. A video can be characterized by its aspect ratio, frame rate, interlaced vs. progressive scanning, color model, compression method, etc.
Courtesy of http://www.ctralie.com/Research/SlidingWindowVideo-
SOCG2016/slidingvideo.html

Considering a slice of the video of length M, X[n] is the first frame, X[n+M-1] is the last frame, and Y[n] is the vector stacking all of these frames to form a slice (part) of the whole video.

Image Handling and I/O

OpenCV supports a lot of image and video formats for I/O. First, let's understand a few paradigms when it comes to video analysis. With the way just about all of today's cameras record, videos boil down to frames displayed at 30-60 FPS (frames per second), where the frames are images. Thus, image processing and video analysis use identical methods for the most part.

For cv2.imread(), the following formats are supported:

 Windows bitmaps — *.bmp, *.dib


 JPEG files — *.jpeg, *.jpg, *.jpe
 JPEG 2000 files — *.jp2
 Portable Network Graphics — *.png
 Portable image format — *.pbm, *.pgm, *.ppm
 Sun rasters — *.sr, *.ras
 TIFF files — *.tiff, *.tif

For cv2.VideoCapture(), AVI files — *.avi format has full support


because it seems that AVI is the only format with decent
cross-platform support. See here for more info.

Below are some examples of image and video I/O in OpenCV. The original post shows them as screenshots: reading and displaying an image and closing it on a keypress; loading and playing an existing video; and recording a video from your webcam and saving it to disk after you press 'q'.
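A minimal sketch of the same I/O operations, assuming a local file test1.jpeg and the default webcam at index 0 (these are placeholders, not the files from the original post):

Python
import cv2

# Read and display an image; close the window on a keypress
img = cv2.imread('test1.jpeg')
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

# Record from the default webcam and save as AVI until 'q' is pressed
cap = cv2.VideoCapture(0)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter('output.avi', cv2.VideoWriter_fourcc(*'XVID'), 30.0, (w, h))

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    out.write(frame)
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
out.release()
cv2.destroyAllWindows()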

As you can see above, the type of an image is NumPy's 'ndarray' (n-dimensional array). Let's have a look at some image manipulations (array manipulations), some of them without using OpenCV's core modules.

1. Get the basic image properties


import cv2

img = cv2.imread('test1.jpeg')
print("The properties of the image are:")
print("Shape: " + str(img.shape))
print("Total no. of pixels: " + str(img.size))
print("Data type of image: " + str(img.dtype))
Output for the code above.

2. Access and modify image pixels

Access an individual pixel for its gray-level/intensity:


import cv2
import numpy as np

m = cv2.imread("test1.jpeg")
height, width, depth = np.shape(m)

y = 1  # y coordinate (across height)
x = 1  # x coordinate (across width)

print("Value at (1, 1, 0) = " + str(m[y][x][0]))  # pixel value at depth zero (blue)
print("Value at (1, 1, 1) = " + str(m[y][x][1]))  # pixel value at depth one (green)
print("Value at (1, 1, 2) = " + str(m[y][x][2]))  # pixel value at depth two (red)

Output for the above code


To simply iterate over all pixels in the image we can use:
import cv2
import numpy as np

m = cv2.imread("test1.jpeg")
height, width, depth = np.shape(m)

# iterate over the entire image
for y in range(0, height):
    for x in range(0, width):
        print(m[y][x])

Behold, this will print a lot of numbers in your terminal!!

To modify the pixel value:


import cv2
import numpy as np

m = cv2.imread("test1.jpeg")
height, width, depth = np.shape(m)

# zero out the blue channel of every pixel
for py in range(0, height):
    for px in range(0, width):
        m[py][px][0] = 0

cv2.imshow('matrix', m)
cv2.imwrite('output2.png', m)
cv2.waitKey(0)
cv2.destroyAllWindows()

(Left) Original image. The others are manipulated by setting all pixels of certain channels to 0.
3. Splitting Image Channels

Generally, an RGB image has 24-bit 'color-depth' data, i.e., three 8-bit channels of RGB data. These channels are the colors blue, green, and red, with intensity levels ranging from 0 to 255. It's possible to split the image in OpenCV using cv2.split(), but this method is computationally costly, so we opt for NumPy indexing instead, as it's much more efficient and should be used where possible.

Splitting using the cv2.split() method is as easy as:


b,g,r = cv2.split(img)

In OpenCV, the order of channels is BGR and not RGB.

Splitting using indexing:


import cv2
import numpy as np

m = cv2.imread("test1.jpeg")

blue = m[:, :, 0]
green = m[:, :, 1]
red = m[:, :, 2]
Computer Vision vs. Image Processing: What’s the
difference?

Madhurjya Chowdhury
Published on:
20 Jan 2022, 12:00 am







Exploring computer vision vs. image processing through an in-depth analysis
What is the difference between image processing and computer vision? Both are concerned with images, and that's about the only thing they have in common. Computer vision and image processing are two distinct tools with different applications. In this post, we'll look at each of them in greater detail and explore the differences between them.

Image Processing
As the name implies, an image is handled in image processing: an input file undergoes at least one change. With the help of dedicated software, this can be done by a person.

A number of transformations are carried out automatically. Sharpening, juxtaposing, smoothing, and edge detection are just a few to name; they all happen on their own, and a graphic designer only needs to start a specific operation. Resizing, stretching, improving, and adding new layers or words are all examples of manual transformations. These processes require a greater level of focus and action on the part of the graphic designer. In image processing, you begin with image X, process it, and then get image Y as a result.

Computer Vision
It's a different story when it comes to computer vision. A picture or video is used as input in computer vision, but nothing changes in the file itself. The objective is to deduce meaning from the image and its components. While some image processing methods are used by computer vision to solve problems, processing has never been the primary focus. In fact, image processing algorithms are used to accomplish computer vision jobs.

In driver assistance, for example, computer vision is employed to aid the driver, especially in bad weather. It examines the environment around the car and assesses potential hazards, impediments, and other pertinent events that a driver may encounter while driving, such as a person crossing the street.

Computer Vision vs. Image Processing


Computer vision in the motor industry
As previously said, the automotive industry is one of the most important industries in which computer vision is used. Consider the following examples. Did you realise that over 3,000 deaths occur in traffic accidents every day? Computer vision and image processing are just two of many methods available to address this issue. Computer vision technologies can also be utilised to address the problem of distracted driving.

Anyone who has driven a vehicle after a lousy night's sleep can attest to the
fact that it is quite unsafe! As a result, computer vision technology can assist
you in staying awake and determining when you are too tired or sleepy to
drive. Depending on your visual state or head motions, the computer vision
programme can continuously check your condition. Computer vision and
image recognition technology could detect when you're not paying attention
to the road and are about to fall asleep. Your vehicle sends you an alarm to
get you back on track or to suggest that you sleep before driving again.

Computer Vision in manufacturing


On manufacturing lines, Pharma Packaging Systems uses computer vision
technology to count capsules efficiently. Furthermore, computer vision
techniques are employed to control manufacturing processes. In addition,
computer vision aids businesses in a variety of ways, such as checking
product components against production specifications, analysing lids, and
determining fill levels.

Fitness and sports


Sentio has created a platform for following and analysing football players,
giving coaches a complete picture of their matches. Additionally, computer
vision and image processing systems are utilised to increase shooting
precision during sports training (the Noah system), as well as to help
swimmers enhance their technique by collecting data in real-time on
everything from stroking frequency to speed and turnaround time (FINIS
LaneVision).

Image enhancement in the healthcare sector


Image enhancement is a method for improving image quality and
perceptibility that is commonly employed in modern healthcare. This is used
in medical imaging to reduce noise and brighten details in order to improve
the image's visual representation. Furthermore, this method incorporates
both objective and subjective improvements. Many medical imaging
modalities, such as CT, MRI, and X-ray, have limited contrast, as it turns out.
As a result, the image quality degrades. This is why image enhancement is
so important.

Image processing for missing people


Image processing technology is utilised to locate missing people in Australia. The Missing Persons Action Network (MPAN) uses Facebook to quickly get the word out to the friends of a missing person. Using Facebook's face recognition techniques, the application can also identify persons in the background of photographs. As a result, a large network of friends increases the chances of a missing person being recognised.

Fundamental Steps in Digital Image Processing





An Image is defined as a two dimensional function f(x, y). (x, y) is
the spatial coordinate (or location) and f is the intensity at that
point. If x, y and f all are finite and discrete, then the image is said
to be a digital image. Digital image consists of finite and discrete
image elements called pixels, each of them having a location and
intensity value. In digital image processing, we process digital
images using a digital computer.
Digital Image processing is not a field which is used for only high-
end applications. There are various fundamental steps in digital
image processing. We will discuss all the steps and processes that
can be applied for different images.
Classification
We can categorise the steps in digital image processing as three
types of computerised processing, namely low level, mid level and
high level processing.
Low Level Processing
Low level processing involves basic operations such as image
preprocessing, image enhancement, image restoration, image
sharpening, etc. The main characteristic of low level processing is
that both its inputs and outputs are images.
Mid Level Processing
Mid level processing involves tasks like image classification, object identification, image segmentation, etc. The main characteristic of mid level processing is that its inputs are generally images, whereas its outputs are attributes extracted from those images.
High Level Processing
High level processing involves making sense of an ensemble of recognised objects, along with the cognitive tasks associated with computer vision.
Fundamental Steps in Digital Image Processing
Image Acquisition
Image acquisition is the first step in digital image processing: we obtain the image in digital form. This is done using sensing materials, such as sensor strips and sensor arrays, together with an electromagnetic light source. The light falls on an object and is reflected or transmitted, and the sensing material captures it. The sensor outputs the image as a voltage waveform in response to the electrical power supplied to it. Reflected light is captured, for example, with visible light sources, whereas with X-ray sources the transmitted rays are captured.
Image Acquisition
The captured image is an analog image, since the sensor output is continuous. To digitise the image, we use sampling and quantization, which discretize it: sampling discretizes the image's spatial coordinates, whereas quantization discretizes its amplitude values.
Sampling and Quantization
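As a rough illustration (not part of the original article), coarser spatial sampling and gray-level quantization can be mimicked with a few array operations; the input path is a placeholder:

Python
import cv2

gray = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical path

# Coarser spatial sampling: keep every 4th pixel in each direction
sampled = gray[::4, ::4]

# Coarser quantization: reduce 256 gray levels down to 8
levels = 8
step = 256 // levels
quantized = (gray // step) * step

cv2.imwrite('sampled.png', sampled)
cv2.imwrite('quantized.png', quantized)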
Image Enhancement
Image enhancement is the manipulation of an image for a specific purpose and objective, used widely in photo beautification applications. Enhancement is performed using filters, which minimise noise in an image; each filter suits a specific situation. A correlation operation between the filter and the input image matrix yields the enhanced output image in the spatial domain. To simplify the process, we can instead perform multiplication in the frequency domain, which gives the same result: we transform the image from the spatial domain to the frequency domain using the discrete Fourier transform (DFT), multiply by the filter, and then return to the spatial domain using the inverse discrete Fourier transform (IDFT). Some filters used in the frequency domain are the Butterworth filter and the Gaussian filter.
The most commonly used filters are the high pass filter and the low pass filter. A low pass filter smoothens an image by averaging neighbouring pixel values, thus minimising random noise; it gives a blurring effect and softens sharp edges. A high pass filter is used to sharpen images using spatial differentiation; examples are the Laplace filter and the high boost filter. There are also non-linear filters for different purposes; for example, a median filter is used to eliminate salt-and-pepper noise.
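As a sketch of the DFT, multiply, IDFT pipeline described above (assumptions: a grayscale input at a placeholder path, and an ideal low-pass filter with a hand-picked cutoff radius rather than a Butterworth or Gaussian one):

Python
import cv2
import numpy as np

gray = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical path

# DFT, with the zero-frequency component shifted to the centre
f = np.fft.fftshift(np.fft.fft2(gray))

# Ideal low-pass filter: keep frequencies within radius r of the centre
rows, cols = gray.shape
crow, ccol = rows // 2, cols // 2
r = 30  # hand-picked cutoff radius
y, x = np.ogrid[:rows, :cols]
mask = ((y - crow) ** 2 + (x - ccol) ** 2 <= r * r).astype(np.float32)

# Multiply in the frequency domain, then transform back (IDFT)
smoothed = np.fft.ifft2(np.fft.ifftshift(f * mask))
smoothed = np.clip(np.abs(smoothed), 0, 255).astype(np.uint8)

cv2.imwrite('lowpass_output.png', smoothed)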
Image Restoration
Like image enhancement, image restoration is related to improving an image, but where enhancement is a largely subjective step, restoration is an objective one. Restoration is applied to a degraded image and tries to recover the original. First we estimate the degradation model, and then we find the restored image.
We can estimate the degradation by observation, experimentation, or mathematical modelling. Observation is used when you know nothing about the setup or environment in which the image was taken. In experimentation, we find the point spread function of an impulse with a similar setup. In mathematical modelling, we even consider the environment in which the image was taken, which makes it the best of the three methods.

Image Restoration Block Diagram

To find the restored image, we generally use one of three filters: the inverse filter, the minimum mean square error (Wiener) filter, or the constrained least squares filter. Inverse filtering is the simplest method but cannot be used in the presence of noise. The Wiener filter minimises the mean square error. Constrained least squares filtering adds a constraint and is the best method.
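For illustration only, a simplified NumPy sketch of the Wiener (minimum mean square error) filter, assuming the blur kernel is known exactly; K is a hand-tuned constant standing in for the noise-to-signal power ratio:

Python
import numpy as np

def wiener_deconvolve(degraded, kernel, K=0.01):
    """Restore a blurred grayscale image given its (known) blur kernel."""
    # Transform the degraded image and the kernel to the frequency domain
    H = np.fft.fft2(kernel, s=degraded.shape)
    G = np.fft.fft2(degraded)
    # Wiener filter: F_hat = conj(H) / (|H|^2 + K) * G
    F_hat = (np.conj(H) / (np.abs(H) ** 2 + K)) * G
    return np.abs(np.fft.ifft2(F_hat))

# Example (hypothetical): restore an image blurred by 5x5 box averaging
# blurred = cv2.blur(gray, (5, 5))
# restored = wiener_deconvolve(blurred, np.ones((5, 5)) / 25.0)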
Colour Image Processing
Colour image processing is motivated by the fact that colour makes classification easier, and the human eye can distinguish thousands of colours far more easily than shades of grey. Colour image processing is divided into two types: pseudo colour (reduced colour) processing and full colour processing. In pseudo colour processing, grey levels are mapped to colours; it was used earlier. Nowadays, full colour processing is used with full colour sensors such as digital cameras and colour scanners, as the price of full colour sensor hardware has dropped significantly.
There are various colour models, such as RGB (Red Green Blue), CMY (Cyan Magenta Yellow), and HSI (Hue Saturation Intensity). Different colour models are used for different purposes: RGB suits computer monitors, whereas CMY suits printers, so internal hardware converts RGB to CMY and vice versa. Humans, however, do not naturally think in RGB or CMY; they perceive colour in terms closer to HSI.

Colour Models
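A small sketch of converting between colour models with OpenCV (the file path is a placeholder); note that OpenCV loads images in BGR channel order:

Python
import cv2

img = cv2.imread('input_image.jpg')  # loaded as BGR

# Convert to other colour models
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

h, s, v = cv2.split(hsv)
print("Hue range:", h.min(), "-", h.max())  # OpenCV hue spans 0-179 for 8-bit images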
Wavelets
Wavelets represent an image at various degrees of resolution. The wavelet transform is one member of the class of linear transforms, along with the Fourier, cosine, sine, Hartley, Slant, Haar, and Walsh-Hadamard transforms. Transforms produce the coefficients of a linear expansion that decomposes a function into a weighted sum of orthogonal or biorthogonal basis functions. All these transforms are reversible and interconvertible; they all express the same information and energy, and hence are equivalent. They vary only in how the information is represented.
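As a sketch, assuming the third-party PyWavelets package (pip install PyWavelets), a single-level 2-D Haar decomposition and its exact reconstruction look like this:

Python
import cv2
import pywt  # PyWavelets, assumed installed

gray = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical path

# Single-level 2-D Haar wavelet decomposition
cA, (cH, cV, cD) = pywt.dwt2(gray, 'haar')
# cA: low-resolution approximation; cH/cV/cD: horizontal/vertical/diagonal detail

# The transform is reversible: reconstruct the original image
reconstructed = pywt.idwt2((cA, (cH, cV, cD)), 'haar')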
Compression
Compression deals with decreasing the storage required for image information or the bandwidth required to transmit it. Compression technology has grown widely in this era; many people know it through the common image extension JPEG (Joint Photographic Experts Group), which is a compression standard. Compression works by removing redundant and irrelevant data. In the encoding process, the image goes through a series of stages: mapper, quantizer, and symbol encoder. The mapper may be reversible or irreversible; an example of a mapper is run-length encoding. The quantizer reduces accuracy and is an irreversible process. The symbol encoder assigns short codes to more frequent data and is a reversible process.

Image Compression Block Diagram

To get back the original image, we perform decompression through the stages of symbol decoder and inverse mapper. Compression may be lossy or lossless: if after decompression we get the exact same image, the compression is lossless; otherwise it is lossy. Examples of lossless compression are Huffman coding, bit-plane coding, LZW (Lempel-Ziv-Welch) coding, and pulse code modulation (PCM); PNG is a lossless image format. JPEG is an example of lossy compression. Lossy compression is widely used in practice because the change is not visible to the naked eye and it saves far more storage and bandwidth than lossless compression.
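A quick way to see the lossless/lossy trade-off is to save the same image as PNG and as JPEG at different quality settings and compare file sizes (paths are placeholders):

Python
import os
import cv2

img = cv2.imread('input_image.jpg')  # hypothetical path

cv2.imwrite('lossless.png', img)                             # lossless
cv2.imwrite('q90.jpg', img, [cv2.IMWRITE_JPEG_QUALITY, 90])  # mild lossy
cv2.imwrite('q20.jpg', img, [cv2.IMWRITE_JPEG_QUALITY, 20])  # heavy lossy

for f in ['lossless.png', 'q90.jpg', 'q20.jpg']:
    print(f, os.path.getsize(f), 'bytes')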
Morphological Image Processing
In morphological image processing, we try to understand the structure of the image. We find the image components present in digital images, which is useful for representing and describing the images' shape and structure: boundaries, holes, connected components, the convex hull, thinning, thickening, skeletons, etc. It is a fundamental step for the subsequent stages.
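A minimal sketch of erosion, dilation, and boundary extraction on a binarized image (the input path and the 5x5 structuring element are placeholder choices):

Python
import cv2
import numpy as np

gray = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical path
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

kernel = np.ones((5, 5), np.uint8)  # structuring element
eroded = cv2.erode(binary, kernel, iterations=1)
dilated = cv2.dilate(binary, kernel, iterations=1)

# Boundary extraction: original minus its erosion
boundary = cv2.subtract(binary, eroded)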
Segmentation
Segmentation extracts information from images on the basis of two properties: similarity and discontinuity. For example, a sudden change in intensity values represents an edge. Detection of isolated points, line detection, and edge detection are some of the tasks associated with segmentation. Segmentation can be done by various methods such as thresholding, clustering, superpixels, graph cuts, region growing, region splitting and merging, and morphological watersheds.
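For example, threshold-based segmentation followed by connected-component labelling might look like this sketch (the input path is a placeholder):

Python
import cv2

gray = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical path

# Segment by Otsu thresholding, then label the connected regions
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
num_labels, labels = cv2.connectedComponents(mask)

print(f"Found {num_labels - 1} segmented regions (label 0 is the background)")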
Feature Extraction
Feature extraction is the next step after segmentation. We extract features from images, regions, and boundaries; corner detection is one example, sketched after the figure below. These features should be independent of and insensitive to variations in parameters such as scaling, rotation, translation, and illumination. Boundary features can be described by boundary feature descriptors such as shape numbers and chain codes, Fourier descriptors, and statistical moments.

Region and Boundary Extraction
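As an illustration of the corner detection example above, a Harris corner sketch with OpenCV; the input path and the 1% response threshold are placeholder choices:

Python
import cv2
import numpy as np

img = cv2.imread('input_image.jpg')  # hypothetical path
gray = np.float32(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

# Harris corner response (blockSize=2, Sobel aperture=3, k=0.04)
response = cv2.cornerHarris(gray, 2, 3, 0.04)

# Mark strong corners (response above 1% of the maximum) in red
img[response > 0.01 * response.max()] = [0, 0, 255]
cv2.imwrite('corners.png', img)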


Image Pattern Classification
In image pattern classification, we assign labels to images on the basis of the features extracted, for example classifying an image as a cat image. Classical methods for image pattern classification are minimum-distance classifiers, correlation, and the Bayes classifier. Modern methods use neural networks and deep learning models such as deep convolutional neural networks, which pair naturally with the image processing techniques above.
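A toy sketch of the classical minimum-distance classifier mentioned above; the two-dimensional feature vectors and class means are made up purely for illustration:

Python
import numpy as np

def minimum_distance_classify(feature, class_means):
    # Assign the label whose mean feature vector is closest (Euclidean distance)
    distances = {label: np.linalg.norm(feature - mean)
                 for label, mean in class_means.items()}
    return min(distances, key=distances.get)

# Hypothetical features (e.g., mean intensity, edge density) per class
class_means = {'cat': np.array([0.4, 0.7]), 'dog': np.array([0.6, 0.3])}
print(minimum_distance_classify(np.array([0.45, 0.65]), class_means))  # -> 'cat'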
Applications
1. In medical diagnosis, Gamma ray imaging, X-ray imaging,
ultrasound imaging, MRI imaging is used to know about the
internal organs and bones of our body.
2. In satellite imaging and astronomy, infrared imaging is used.
3. In forensics, for biometrics such as thumbprints and retina
scan, digital image processing is used.
4. We can find defects in manufactured packaged goods using
microwave imaging.
5. We can find information about circuit boards and
microprocessors.
6. Using image restoration, we can identify the car number
plates of moving cars from CCTV for police investigations.
7. Beautify filters are used in social media platforms which use
image enhancement.
8. We can classify and identify images using deep learning
models.
Conclusion
A picture is worth a thousand words, and the world is filled with beautiful pictures. Manipulating these images according to our needs is what digital image processing is all about, and we live in a world that uses advanced digital image processing in diverse fields.
Frequently Asked Questions on Digital Image
Processing-FAQ's
What is the resolution of an image?
Resolution is the smallest discernible detail in an image. There are
two types of resolution - spatial resolution and intensity resolution.
What are the components of a digital image processing
system?
The components of a digital image processing system include physical sensors and digitizers for image acquisition; image displays such as flat colour monitors; hardcopy devices such as laser printers; computers with specialised hardware and software for digital image processing; mass storage; and networking and cloud communication.
Why is digital image processing more popular than analog image processing?
Analog image processing is costlier and more time consuming, whereas digital image processing is cheaper and faster.
What is the commonly used language for image processing?
Python, C/C++ with openCV, Matlab, Java.
