Image Processing and Computer Vision Are Both Very Exciting Fields of Computer Science
Computer Vision:
In Computer Vision, computers or machines are made to gain a high-level understanding of input digital images or videos, with the goal of automating tasks that the human visual system can do. It draws on many techniques, and Image Processing is just one of them.
Image Processing:
Image Processing is the field of enhancing images by tuning many parameters and features of the image, so Image Processing can be viewed as a subset of Computer Vision. Here, transformations are applied to an input image and the resulting output image is returned. Some of these transformations are sharpening, smoothing, stretching, etc.
Now, since both fields deal with visual data, i.e., images and videos, there is a lot of confusion about the difference between them. In this article we will discuss how they differ.
Difference between Image Processing and Computer Vision:
Focus: Image processing is mainly focused on processing the raw input images to enhance them or to prepare them for other tasks. Computer vision is focused on extracting information from the input images or videos in order to understand them and interpret the visual input the way the human brain does.
Methods: Image processing uses methods like anisotropic diffusion, hidden Markov models, independent component analysis, different filters, etc. Image processing is one of the methods used for computer vision, along with other machine learning techniques such as CNNs.
Relationship: Image Processing is a subset of Computer Vision; Computer Vision is a superset of Image Processing.
Input and output: In image processing, both the input and the output are images. In computer vision, the input can be an image or a video, and the output can be a label or a bounding box.
Interpretation: Image processing does not interpret the image. Computer vision extracts useful information from the input.
Examples: Some image processing applications are rescaling an image (digital zoom), correcting illumination, and changing tones. Some computer vision applications are object detection, face detection, and handwriting recognition.
Pattern Recognition
Think of this as teaching a computer to play a game of ‘spot the
difference’. By recognizing patterns, computers can identify
similarities and differences in images. This skill is crucial for tasks
like facial recognition or identifying objects in a scene.
Deep Learning
Deep Learning is like giving a computer a very complex brain that
learns from examples. By feeding it thousands, or even millions, of
images, a computer learns to identify and understand various
elements in these images. This is the backbone of modern computer
vision, enabling machines to recognize objects, people, and even
emotions.
Object Detection
This is where computers get really smart. Object detection is about
identifying specific objects within an image. It’s like teaching a
computer to not just see a scene, but to understand what each part
of that scene is. For instance, in a street scene, it can distinguish
cars, people, trees, and buildings.
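As a small illustration, OpenCV ships a classic HOG-based pedestrian detector. The sketch below (the file path is only a placeholder) draws a box around each detected person:
Python
import cv2

# Minimal sketch: detect people with OpenCV's built-in HOG + linear SVM detector.
# 'street.jpg' is a placeholder path, not a file from this article.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

image = cv2.imread('street.jpg')
if image is not None:
    boxes, weights = hog.detectMultiScale(image, winStride=(8, 8))
    for (x, y, w, h) in boxes:
        # Draw a green bounding box around each detection
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite('street_detections.jpg', image)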
Image Enhancement
This is like giving a makeover to an image. Image enhancement can
brighten up a dark photo, bring out hidden details, or make colors
pop. It’s all about improving the look and feel of an image to make it
more pleasing or informative.
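For instance, a very simple way to brighten a dark photo and boost its contrast in OpenCV is a linear rescaling of pixel values. This is only a sketch; the file path and the alpha/beta values are placeholders:
Python
import cv2

# Minimal sketch: brighten a dark photo and increase its contrast.
# alpha scales contrast, beta shifts brightness; values are illustrative.
image = cv2.imread('dark_photo.jpg')  # placeholder path
if image is not None:
    enhanced = cv2.convertScaleAbs(image, alpha=1.3, beta=40)
    cv2.imwrite('enhanced.jpg', enhanced)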
Filtering
Imagine sifting through the ‘noise’ to find the real picture. Image
filtering involves removing or reducing unwanted elements from an
image, like blurring, smoothening rough edges, or sharpening blurry
parts. It helps in cleaning up the image to highlight the important
features.
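As a quick sketch (with a placeholder file path and an illustrative kernel), smoothing and sharpening with OpenCV might look like this:
Python
import cv2
import numpy as np

# Minimal sketch: smooth noise with a Gaussian blur, then sharpen
# the original with a simple 3x3 kernel. 'photo.jpg' is a placeholder path.
image = cv2.imread('photo.jpg')
if image is not None:
    smoothed = cv2.GaussianBlur(image, (5, 5), 0)
    sharpen_kernel = np.array([[0, -1, 0],
                               [-1, 5, -1],
                               [0, -1, 0]])
    sharpened = cv2.filter2D(image, -1, sharpen_kernel)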
Transformation Techniques
This is where an image can take on a new shape or form.
Transformation techniques might include resizing an image, rotating
it, or even warping it to change perspective. It’s like reshaping the
image to fit a specific purpose or requirement.
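A minimal sketch of these transformations with OpenCV is shown below; the file path, angle, and corner coordinates are placeholders chosen only for illustration:
Python
import cv2
import numpy as np

# Minimal sketch: resize, rotate, and perspective-warp an image.
# 'photo.jpg' and the numeric values are placeholders.
image = cv2.imread('photo.jpg')
if image is not None:
    h, w = image.shape[:2]
    # Resize to half the original dimensions
    resized = cv2.resize(image, (w // 2, h // 2))
    # Rotate 30 degrees about the image centre
    M_rot = cv2.getRotationMatrix2D((w / 2, h / 2), 30, 1.0)
    rotated = cv2.warpAffine(image, M_rot, (w, h))
    # Warp perspective by moving the bottom corners inward
    src = np.float32([[0, 0], [w, 0], [0, h], [w, h]])
    dst = np.float32([[0, 0], [w, 0], [w * 0.2, h], [w * 0.8, h]])
    M_persp = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(image, M_persp, (w, h))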
1. Image Resizing
Resizing changes the dimensions of an image. Common interpolation methods are Nearest Neighbor, Bilinear, and Bicubic interpolation.
Python
import cv2
import numpy as np
import matplotlib.pyplot as plt

def resize_image(image, size=(256, 256)):
    print("Performing image resizing...")
    # Resize with three interpolation methods (target size is illustrative)
    nn = cv2.resize(image, size, interpolation=cv2.INTER_NEAREST)
    bilinear = cv2.resize(image, size, interpolation=cv2.INTER_LINEAR)
    bicubic = cv2.resize(image, size, interpolation=cv2.INTER_CUBIC)
    print("Resizing complete.")
    return nn, bilinear, bicubic

# Example usage
image_path = '/kaggle/input/sample-image/tint1.jpg'
image = cv2.imread(image_path)
if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    nn, bilinear, bicubic = resize_image(image)
    plt.subplot(131)
    plt.imshow(cv2.cvtColor(nn, cv2.COLOR_BGR2RGB))
    plt.title('Nearest Neighbor')
    plt.axis('off')
    plt.subplot(132)
    plt.imshow(cv2.cvtColor(bilinear, cv2.COLOR_BGR2RGB))
    plt.title('Bilinear')
    plt.axis('off')
    plt.subplot(133)
    plt.imshow(cv2.cvtColor(bicubic, cv2.COLOR_BGR2RGB))
    plt.title('Bicubic')
    plt.axis('off')
    plt.tight_layout()
    plt.show()
print("Script completed")
Output:
Code Explanation:
Import Libraries: Imports cv2 for image
processing, numpy for calculations, and matplotlib.pyplot for
plotting images.
Define resize_image Function: Resizes the image using
three methods: Nearest Neighbor, Bilinear, and Bicubic
interpolation.
Load Image: Reads the image from the specified path and
checks if the image was loaded successfully.
Perform Resizing: Applies Nearest Neighbor, Bilinear, and
Bicubic resizing methods to the image.
Display Results: Shows the original image resized using
different methods side by side using matplotlib.
2. Image Normalization
Normalization adjusts pixel intensity values to a standard scale.
Common techniques include:
a) Min-Max Scaling:
Scales values to a fixed range, typically [0, 1].
Formula: (x - min) / (max - min)
b) Z-score Normalization:
Transforms data to have a mean of 0 and standard deviation
of 1.
Formula: (x - mean) / standard_deviation
c) Histogram Equalization:
Enhances contrast by spreading out the most frequent intensity
values.
Implementation
Python
import cv2
import numpy as np
import matplotlib.pyplot as plt

def normalize_image(image):
    print("Performing image normalization...")
    # Min-Max Scaling
    min_max = cv2.normalize(image, None, 0, 255, cv2.NORM_MINMAX)
    # Z-score Normalization
    z_score = np.zeros_like(image, dtype=np.float32)
    for i in range(3):  # for each channel
        channel = image[:, :, i]
        mean = np.mean(channel)
        std = np.std(channel)
        # add a small value to avoid division by zero
        z_score[:, :, i] = (channel - mean) / (std + 1e-8)
    z_score = cv2.normalize(z_score, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # Histogram Equalization
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    hist_eq = cv2.equalizeHist(gray)
    print("Normalization complete.")
    return min_max, z_score, hist_eq
# Example usage
image_path = '/kaggle/input/sample-image/input_image.jpg'
image = cv2.imread(image_path)
if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    min_max, z_score, hist_eq = normalize_image(image)
    plt.subplot(141)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original')
    plt.axis('off')
    plt.subplot(142)
    plt.imshow(cv2.cvtColor(min_max, cv2.COLOR_BGR2RGB))
    plt.title('Min-Max Scaling')
    plt.axis('off')
    plt.subplot(143)
    plt.imshow(cv2.cvtColor(z_score, cv2.COLOR_BGR2RGB))
    plt.title('Z-score Normalization')
    plt.axis('off')
    plt.subplot(144)
    plt.imshow(hist_eq, cmap='gray')
    plt.title('Histogram Equalization')
    plt.axis('off')
    plt.tight_layout()
    plt.show()
print("Script completed")
Output:
Normalization
Code Explanation:
Import Libraries: Imports cv2, numpy, and
matplotlib.pyplot for image processing and visualization.
Define normalize_image Function: Applies Min-Max
Scaling, Z-Score Normalization, and Histogram Equalization
to the image.
Load Image: Reads an image from a specified path and
checks if it loaded successfully.
Perform Normalization: Executes Min-Max Scaling, Z-
Score Normalization, and Histogram Equalization to adjust
the image.
Display Results: Shows the original and processed images
side by side using matplotlib.
3. Image Augmentation
Image augmentation creates modified versions of images to expand
the training dataset. Key techniques include:
a) Geometric Transformations:
Rotation: Turning the image around a center point.
Flipping: Mirroring the image horizontally or vertically.
Scaling: Changing the size of the image.
Cropping: Cutting out a part of the image to use.
b) Color Adjustments:
Brightness: Making the image lighter or darker.
Contrast: Changing the difference between light and dark
areas.
Saturation: Adjusting the intensity of colors.
Hue: Shifting the overall color tone of the image.
c) Noise Addition:
Adding Random Noise: Introducing random variations to
pixel values to simulate imperfections.
Python
import cv2
import numpy as np
import matplotlib.pyplot as plt

def augment_image(image):
    print("Performing image augmentation...")
    # Rotation
    rows, cols = image.shape[:2]
    M = cv2.getRotationMatrix2D((cols / 2, rows / 2), 45, 1)
    rotated = cv2.warpAffine(image, M, (cols, rows))
    # Flipping
    flipped = cv2.flip(image, 1)  # 1 for horizontal flip
    # Brightness adjustment
    bright = cv2.convertScaleAbs(image, alpha=1.5, beta=0)
    # Add Gaussian noise (clip to the valid [0, 255] range before casting)
    noise = np.random.normal(0, 25, image.shape)
    noisy = np.clip(image.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    print("Augmentation complete.")
    return rotated, flipped, bright, noisy
# Example usage
image_path = '/kaggle/input/sample-image/input_image.jpg'
image = cv2.imread(image_path)
if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    rotated, flipped, bright, noisy = augment_image(image)
    plt.subplot(151)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original')
    plt.axis('off')
    plt.subplot(152)
    plt.imshow(cv2.cvtColor(rotated, cv2.COLOR_BGR2RGB))
    plt.title('Rotated')
    plt.axis('off')
    plt.subplot(153)
    plt.imshow(cv2.cvtColor(flipped, cv2.COLOR_BGR2RGB))
    plt.title('Flipped')
    plt.axis('off')
    plt.subplot(154)
    plt.imshow(cv2.cvtColor(bright, cv2.COLOR_BGR2RGB))
    plt.title('Brightness Adjusted')
    plt.axis('off')
    plt.subplot(155)
    plt.imshow(cv2.cvtColor(noisy, cv2.COLOR_BGR2RGB))
    plt.title('Noisy')
    plt.axis('off')
    plt.tight_layout()
    plt.show()
print("Script completed")
Output:
Augmentation
Code Explanation:
Import Libraries: Imports cv2 for image processing,
numpy for numerical operations, and matplotlib.pyplot for
displaying images.
Define augment_image Function: Performs four types of
image augmentations: rotation, flipping, brightness
adjustment, and noise addition.
Load Image: Reads the image from the specified path and
confirms successful loading.
Perform Augmentation: Applies rotation, horizontal
flipping, brightness adjustment, and adds Gaussian noise to
the image.
Display Results: Shows the original and augmented
images (rotated, flipped, brightness adjusted, noisy) side by
side using matplotlib.
4. Image Denoising
Denoising removes noise from images, enhancing quality and
clarity. Common methods include:
a) Gaussian Blur:
Applies a Gaussian function to smooth the image.
Effective for reducing Gaussian noise.
b) Median Filtering:
Replaces each pixel with the median of neighboring pixels.
Effective for salt-and-pepper noise.
c) Bilateral Filtering:
Preserves edges while reducing noise.
Code Implementation
Python
import cv2
import matplotlib.pyplot as plt

def denoise_image(image):
    print("Performing image denoising...")
    # Gaussian Blur: smooths the image with a Gaussian kernel
    gaussian = cv2.GaussianBlur(image, (5, 5), 0)
    # Median Blur: replaces each pixel with the median of its neighbourhood
    median = cv2.medianBlur(image, 5)
    # Bilateral Filter: reduces noise while preserving edges
    # (kernel sizes and filter parameters here are typical illustrative values)
    bilateral = cv2.bilateralFilter(image, 9, 75, 75)
    print("Denoising complete.")
    return gaussian, median, bilateral
# Example usage
image_path = '/kaggle/input/sample-image/noisy_image.jpg'
image = cv2.imread(image_path)
if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    gaussian, median, bilateral = denoise_image(image)
    plt.subplot(141)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original (Noisy)')
    plt.axis('off')
    plt.subplot(142)
    plt.imshow(cv2.cvtColor(gaussian, cv2.COLOR_BGR2RGB))
    plt.title('Gaussian Blur')
    plt.axis('off')
    plt.subplot(143)
    plt.imshow(cv2.cvtColor(median, cv2.COLOR_BGR2RGB))
    plt.title('Median Blur')
    plt.axis('off')
    plt.subplot(144)
    plt.imshow(cv2.cvtColor(bilateral, cv2.COLOR_BGR2RGB))
    plt.title('Bilateral Filter')
    plt.axis('off')
    plt.tight_layout()
    plt.show()
print("Script completed")
Output:
Denoising
Code Explanation:
Import Libraries: Imports cv2 for image processing and
matplotlib.pyplot for displaying images.
Define denoise_image Function: Applies three different
denoising techniques to the noisy image: Gaussian Blur,
Median Blur, and Bilateral Filter.
Load Image: Reads the noisy image from the specified path
and checks if the image was successfully loaded.
Perform Denoising: Applies Gaussian Blur, Median Blur,
and Bilateral Filter to the noisy image to reduce noise.
Display Results: Uses matplotlib to display the original
noisy image and the results of the three denoising methods
side by side.
5. Edge Detection
Edge detection identifies boundaries of objects within images. Key
algorithms include:
a) Sobel Operator
Gradient Calculation: Measures changes in image
intensity.
Edge Emphasis: Highlights horizontal and vertical edges
using convolution kernels.
b) Canny Edge Detector
Multi-stage Algorithm: Involves noise reduction, gradient
calculation, and edge tracking.
Intensity Gradients and Non-maximum
Suppression: Detects edges by suppressing non-maximal
pixels.
c) Laplacian of Gaussian (LoG)
Combines Gaussian Smoothing with Laplacian
Operator: Smooths image to reduce noise, then applies
Laplacian to detect edges.
Code Example:
Python
import cv2
import numpy as np
import matplotlib.pyplot as plt

def detect_edges(image):
    print("Performing edge detection...")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Sobel: combine horizontal and vertical gradients
    sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    sobel = np.sqrt(sobelx**2 + sobely**2)
    sobel = np.uint8(sobel / sobel.max() * 255)
    # Canny
    canny = cv2.Canny(gray, 100, 200)
    # Laplacian of Gaussian
    blur = cv2.GaussianBlur(gray, (3, 3), 0)
    log = cv2.Laplacian(blur, cv2.CV_64F)
    log = np.uint8(np.absolute(log))
    print("Edge detection complete.")
    return sobel, canny, log
# Example usage
image_path = '/kaggle/input/sample-image/tint1.jpg'
image = cv2.imread(image_path)
if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    sobel, canny, log = detect_edges(image)
    plt.subplot(141)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original')
    plt.axis('off')
    plt.subplot(142)
    plt.imshow(sobel, cmap='gray')
    plt.title('Sobel')
    plt.axis('off')
    plt.subplot(143)
    plt.imshow(canny, cmap='gray')
    plt.title('Canny')
    plt.axis('off')
    plt.subplot(144)
    plt.imshow(log, cmap='gray')
    plt.title('Laplacian of Gaussian')
    plt.axis('off')
    plt.tight_layout()
    plt.show()
print("Script completed")
Output:
Edge Detection
Explanation:
Import Libraries: Imports cv2 for image processing,
numpy for calculations, and matplotlib.pyplot for plotting
images.
Define detect_edges Function: Converts the image to
grayscale and applies Sobel, Canny, and Laplacian of
Gaussian methods to detect edges.
Load Image: Reads an image from a specified path and
verifies if it was loaded successfully.
Perform Edge Detection: Calls detect_edges to get edge-
detected versions of the image using Sobel, Canny, and LoG
techniques.
Display Results: Uses matplotlib to show the original
image and the edge-detected results side by side.
6. Image Binarization
Binarization converts an image to black and white based on a
threshold. Methods include:
a) Global Thresholding:
Applies a single threshold value for the entire image.
Otsu's method automatically determines the optimal
threshold.
b) Adaptive Thresholding:
Uses different thresholds for different regions of the image.
Better for images with varying illumination.
Python
import cv2
import matplotlib.pyplot as plt

def binarize_image(image):
    print("Performing image binarization...")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Global thresholding with Otsu's method
    _, global_thresh = cv2.threshold(gray, 0, 255,
                                     cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Adaptive thresholding
    adaptive_thresh = cv2.adaptiveThreshold(gray, 255,
                                            cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                            cv2.THRESH_BINARY, 11, 2)
    print("Binarization complete.")
    return global_thresh, adaptive_thresh
# Example usage
image_path = '/kaggle/input/sample-image/tint1.jpg'
image = cv2.imread(image_path)
if image is None:
    print(f"Failed to load image from {image_path}")
else:
    print(f"Successfully loaded image from {image_path}")
    global_thresh, adaptive_thresh = binarize_image(image)
    plt.subplot(131)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.title('Original')
    plt.axis('off')
    plt.subplot(132)
    plt.imshow(global_thresh, cmap='gray')
    plt.title('Global Thresholding')
    plt.axis('off')
    plt.subplot(133)
    plt.imshow(adaptive_thresh, cmap='gray')
    plt.title('Adaptive Thresholding')
    plt.axis('off')
    plt.tight_layout()
    plt.show()
print("Script completed")
Output:
Binarization
Explanation:
Import Libraries: Imports cv2 for image processing and
matplotlib.pyplot for visualizing images.
Define binarize_image Function: Converts the image to
grayscale and applies global thresholding (Otsu’s method)
and adaptive thresholding for binary image creation.
Load Image: Reads an image from the specified path and
checks if it was loaded correctly.
Perform Binarization: Calls binarize_image to apply global
and adaptive thresholding methods on the grayscale image
to create binary images.
Display Results: Uses matplotlib to show the original
image, global thresholded image, and adaptive thresholded
image in a 1x3 grid layout.
What is Computer Vision?
Computer vision is a field of study within artificial intelligence (AI) that focuses on enabling computers to interpret and extract information from images and videos, in a manner similar to human vision. It involves developing algorithms and techniques to extract meaningful information from visual inputs and make sense of the visual world.
Prerequisite: Before starting Computer Vision, it is recommended that you have foundational knowledge of Machine Learning, Deep Learning, and OpenCV. You can refer to our tutorial page on prerequisite technologies.
Exploring every end of computer vision vs image processing through an in-depth analysis
What is the difference between image processing and computer vision? Both
are concerned with images. And that's the only thing they have in
common. Computer vision and image processing are two distinct tools with
different applications. In this post, we'll look at each of these in greater detail
and explore the differences between them.
Image Processing
As the name implies, in image processing an image is handled: the input file undergoes at least one change. With the help of dedicated software, this can even be done manually by a person.
Computer Vision
It is a different story when it comes to computer vision. A picture or video is used as input, but nothing in the file itself changes. The objective is to deduce meaning from the image and its components. While computer vision uses some image processing methods to solve problems, processing has never been its primary focus; rather, image processing algorithms are used as tools to accomplish computer vision jobs.
One example is driver assistance: computer vision is employed to aid the driver, especially in bad weather. It examines the environment around the car and assesses potential hazards, obstacles, and other relevant events that a driver may encounter, such as a person crossing the street.
Anyone who has driven a vehicle after a lousy night's sleep can attest that it is quite unsafe. Computer vision technology can therefore help you stay awake and determine when you are too tired or sleepy to drive. Based on your visual state or head motions, a computer vision programme can continuously monitor your condition. Computer vision and image recognition technology can detect when you are not paying attention to the road or are about to fall asleep, and your vehicle sends you an alarm to get you back on track or to suggest that you rest before driving again.
Colour Models
Wavelets
Wavelets represent an image at various degrees of resolution. The wavelet transform belongs to a class of linear transforms that also includes the Fourier, cosine, sine, Hartley, Slant, Haar, and Walsh-Hadamard transforms. These transforms are coefficients of a linear expansion that decomposes a function into a weighted sum of orthogonal or biorthogonal basis functions. All of them are reversible and interconvertible, and they express the same information and energy, so they are equivalent; they differ only in how the information is represented.
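As a rough illustration (assuming the PyWavelets package is installed, and using a placeholder file path), a single-level 2-D Haar wavelet transform splits an image into a half-resolution approximation plus three detail bands, and the step is fully reversible:
Python
import cv2
import pywt  # PyWavelets, assumed to be installed

# Minimal sketch: one level of a 2-D Haar wavelet decomposition.
# 'photo.jpg' is a placeholder path.
gray = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)
if gray is not None:
    approx, (horiz, vert, diag) = pywt.dwt2(gray, 'haar')
    # 'approx' is a half-resolution version of the image; the other three
    # arrays hold horizontal, vertical, and diagonal detail coefficients.
    reconstructed = pywt.idwt2((approx, (horiz, vert, diag)), 'haar')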
Compression
Compression deals with decreasing the storage required for image information or the bandwidth required to transmit it. Compression technology has grown widely in this era; many people know it through the common image format JPEG (Joint Photographic Experts Group), which is a compression technology. Compression works by removing redundant and irrelevant data. In the encoding process, the image goes through a series of stages: mapper, quantizer, and symbol encoder. The mapper may be reversible or irreversible; an example of a mapper is run-length encoding. The quantizer reduces accuracy and is an irreversible step. The symbol encoder assigns shorter codes to more frequent data and is a reversible step.
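To make the mapper stage concrete, here is a minimal sketch of run-length encoding (the helper names are illustrative, not from any particular library): each run of identical values is replaced by a (value, count) pair, and the process is fully reversible.
Python
# Minimal sketch of run-length encoding, a simple reversible mapper.
# Function names are illustrative.
def rle_encode(values):
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1          # extend the current run
        else:
            encoded.append([v, 1])       # start a new run
    return encoded

def rle_decode(encoded):
    decoded = []
    for value, count in encoded:
        decoded.extend([value] * count)
    return decoded

# Example: a row of pixel values with long runs compresses well
row = [255, 255, 255, 0, 0, 255, 255, 255, 255]
runs = rle_encode(row)        # [[255, 3], [0, 2], [255, 4]]
assert rle_decode(runs) == row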