Open CV Notes

The document provides a comprehensive guide on image processing using OpenCV, detailing a basic pipeline for image manipulation and analysis through Python scripts. It covers essential operations such as reading images, resizing, color conversion, and various smoothing techniques, along with their definitions and use cases. The content is structured to facilitate understanding for both practitioners and students in the field of computer vision.

Uploaded by

khansamaira395

Open CV

June 19, 2025

1 Image Processing in OpenCV: Exhaustive Breakdown of a Basic Pipeline with Definitions, Code, and Explanations

1.1 Introduction
In the domain of computer vision and image analysis, preprocessing an image is a fundamental step.
OpenCV, an open-source computer vision library, provides a wide array of tools to perform tasks such
as image reading, scaling, filtering, edge detection, and morphological transformations.
This section contains an extremely detailed walkthrough of a Python script built using OpenCV. The
goal is to understand the purpose and operation of each line of code, and to define every technical term
and function employed, using formal descriptions, relevant examples, and analytical commentary.

1.2 Python Script Overview

Before dissecting the code, let us present the complete script to establish context:

import cv2 as cv
img = cv.imread('Photo/Varanasi.jpg')

def resacleFrame(frame, scale=0.25):
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)

img = resacleFrame(img)
cv.imshow('images', img)

gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('Gray', gray)

blur = cv.GaussianBlur(img, (19, 19), cv.BORDER_DEFAULT)
cv.imshow('Blur', blur)

canny = cv.Canny(img, 125, 175)
cv.imshow('Canny', canny)

dilated = cv.dilate(canny, (7, 7), iterations=3)
cv.imshow('dilated', dilated)

eroded = cv.erode(dilated, (3, 3), iterations=1)
cv.imshow('eroded', eroded)

resized = cv.resize(img, (500, 500), interpolation=cv.INTER_CUBIC)
cv.imshow('resized', resized)

cropped = img[50:400, 250:400]
cv.imshow('cropped', cropped)

cv.waitKey(0)

1.3 Line-by-Line Analysis with Definitions and Explanations

Line 1: Importing OpenCV
import cv2 as cv
Definition: OpenCV (Open Source Computer Vision Library) is a software library of programming
functions mainly aimed at real-time computer vision. It is written in C++ and has bindings for Python,
Java, and MATLAB.
Usage: We import the cv2 module using the alias cv to simplify notation.

Line 2: Reading an Image from Disk

img = cv.imread('Photo/Varanasi.jpg')
Function: cv.imread(filepath) reads an image from the specified file.
• Returns a NumPy array representing the image.
• Reads the image in BGR format (Blue, Green, Red), not RGB.
• If the image is not found, it returns None.
Use Case: Reading an image into memory for processing.

Line 3–7: Defining a Rescaling Function

def resacleFrame(frame, scale=0.25):
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)
Definition: shape is an attribute of NumPy arrays. For a color image it returns the tuple (rows, cols, channels). Here:
• frame.shape[0] = height
• frame.shape[1] = width
cv.resize(): Changes the size of an image. It expects the target size as (width, height), which is why dimension lists the width first. The interpolation method defines how pixel values are calculated:
• cv.INTER_AREA: Preferred for shrinking images.
• cv.INTER_LINEAR: Default, a good general choice for zooming.
• cv.INTER_CUBIC: High-quality zooming, slower.
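
The dimension arithmetic inside resacleFrame can be checked without loading an image. The sketch below is a minimal illustration only; the helper name scaled_dimension and the sample shapes are assumptions and are not part of the original script.

```python
def scaled_dimension(shape, scale=0.25):
    """Return (width, height) in the order cv.resize expects,
    given a NumPy-style shape tuple (rows, cols[, channels])."""
    height, width = shape[0], shape[1]
    # int() truncates toward zero, exactly as in the original function.
    return (int(width * scale), int(height * scale))


# A hypothetical 1080-row by 1920-column color image:
print(scaled_dimension((1080, 1920, 3)))   # (480, 270)
print(scaled_dimension((100, 200), 0.5))   # (100, 50)
```

Note that the result lists the width first, while shape lists the height first; mixing the two orders is a common source of distorted output.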

Line 8: Applying the Rescale Function

img = resacleFrame(img)
Reduces each dimension to 25% of its original length, so the pixel count drops to roughly 1/16 of the original. This lowers the computational cost during testing or on devices with limited resources.

Line 9: Displaying the Rescaled Image

cv.imshow('images', img)
cv.imshow(): Opens a GUI window to display the image.
• First argument: Window title.
• Second argument: Image to display.

Line 10–11: Grayscale Conversion
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('Gray', gray)

cv.cvtColor(): Converts images from one color space to another.

Definition of Grayscale:
• An image format where each pixel carries intensity information only (0-255).
• Reduces memory and computation.
Use Case: Essential preprocessing step for edge detection, segmentation, thresholding.

Line 12–13: Gaussian Blur

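blur = cv.GaussianBlur(img, (19, 19), cv.BORDER_DEFAULT)
cv.imshow('Blur', blur)

Definition: Gaussian blur is a smoothing technique using a Gaussian function. It reduces noise and detail.
Parameters:
• Kernel size (19, 19): The larger the kernel, the smoother the result. Both values must be positive and odd.
• cv.BORDER_DEFAULT: In cv.GaussianBlur the third positional argument is sigmaX, the standard deviation along x, not the border mode. The constant cv.BORDER_DEFAULT has the integer value 4, so this call blurs with sigmaX = 4. To choose the border handling explicitly, pass it by keyword as borderType=cv.BORDER_DEFAULT.
Use Case: Applied before edge detection to avoid false edges from noise.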

Line 14–15: Canny Edge Detection

canny = cv.Canny(img, 125, 175)
cv.imshow('Canny', canny)

Definition: Canny edge detection is a multi-stage algorithm to detect a wide range of edges.
How It Works:
1. Apply Gaussian Blur
2. Compute Gradient Magnitude and Direction
3. Non-Maximum Suppression
4. Hysteresis Thresholding with two thresholds
Thresholds (125, 175): Pixels whose gradient magnitude is below 125 are discarded. Pixels above 175 are kept as strong edges. Pixels in between are kept only if they are connected to a strong edge.
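
The thresholding rule can be sketched on a single row of gradient magnitudes. This is a simplified, pure-Python illustration of hysteresis only (one dimension, left and right neighbors); the function name and sample values are assumptions, and it is not how cv.Canny is implemented internally.

```python
def hysteresis_1d(magnitudes, low, high):
    """Keep strong values (> high) and any weak values (low..high)
    that are connected to a kept value through adjacent weak values."""
    n = len(magnitudes)
    keep = [m > high for m in magnitudes]
    changed = True
    while changed:
        changed = False
        for i, m in enumerate(magnitudes):
            if keep[i] or m < low:
                continue  # already kept, or too weak to ever be kept
            left = i > 0 and keep[i - 1]
            right = i < n - 1 and keep[i + 1]
            if left or right:
                keep[i] = True
                changed = True
    return keep


row = [200, 150, 130, 100, 150, 90]
print(hysteresis_1d(row, 125, 175))
# [True, True, True, False, False, False]
```

The second 150 is dropped even though it lies between the thresholds, because the 100 next to it breaks the connection to the strong edge at the start of the row.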

Line 16–17: Dilation

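dilated = cv.dilate(canny, (7, 7), iterations=3)
cv.imshow('dilated', dilated)

Definition: Dilation is a morphological operation that expands the white regions (foreground) in a binary image.
Kernel: The second argument is the structuring element, which determines the neighborhood used for expansion. cv.dilate expects it as an array, so the tuple (7, 7) is read as a small array holding two values rather than as a 7x7 kernel size. For a true 7x7 neighborhood, pass a structuring element such as cv.getStructuringElement(cv.MORPH_RECT, (7, 7)).
Iterations: Repeats the dilation three times.
Use Case: Thickens detected edges to fill gaps or connect components.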
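
To make the effect of the neighborhood and of the iteration count concrete, here is a NumPy-only sketch of binary dilation with a square window. It is an illustration under stated assumptions (zero padding at the border, a full square structuring element, the hypothetical name dilate_binary), not OpenCV's implementation.

```python
import numpy as np


def dilate_binary(image, ksize=3, iterations=1):
    """Each output pixel becomes the maximum of its ksize x ksize neighborhood."""
    pad = ksize // 2
    rows, cols = image.shape
    out = image.copy()
    for _ in range(iterations):
        padded = np.pad(out, pad, mode="constant", constant_values=0)
        # Collect every shifted copy of the image covered by the window.
        shifted = [padded[dy:dy + rows, dx:dx + cols]
                   for dy in range(ksize) for dx in range(ksize)]
        out = np.max(shifted, axis=0)
    return out


edge = np.zeros((5, 5), dtype=np.uint8)
edge[2, 2] = 255                      # a single white pixel

once = dilate_binary(edge, ksize=3, iterations=1)
twice = dilate_binary(edge, ksize=3, iterations=2)
print(int((once == 255).sum()))       # 9  (a 3x3 block)
print(int((twice == 255).sum()))      # 25 (a 5x5 block)
```

Each iteration grows the white region by one more ring of pixels for a 3x3 window, which is why repeated dilation thickens thin Canny edges.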

Line 18–19: Erosion

eroded = cv.erode(dilated, (3, 3), iterations=1)
cv.imshow('eroded', eroded)

Definition: Erosion shrinks the white regions. It is the counterpart of dilation, although eroding a dilated image does not in general restore the original exactly.
Use Case: Removes noise and reduces object size.

Line 20–21: Resizing Using Cubic Interpolation
resized = cv.resize(img, (500, 500), interpolation=cv.INTER_CUBIC)
cv.imshow('resized', resized)
Definition: Resizing to a fixed dimension (500,500) using cubic interpolation.
Cubic Interpolation: Uses 16 neighboring pixels to compute each new pixel value — produces
smooth results.

Line 22–23: Cropping

cropped = img[50:400, 250:400]
cv.imshow('cropped', cropped)
Definition: Cropping is the process of extracting a sub-region from an image.
• img[y1:y2, x1:x2] selects rows 50 to 399 and columns 250 to 399; the end index of a slice is exclusive.
• The slice does not alter the original image, but it is a NumPy view that shares memory with img. Call .copy() if the crop will be modified independently.
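
The slicing behavior can be tried on a small synthetic array instead of a real photo. The array size below is a hypothetical stand-in for a loaded image; only NumPy is needed.

```python
import numpy as np

# Stand-in for a loaded image: 6 rows, 8 columns, 3 channels.
img = np.zeros((6, 8, 3), dtype=np.uint8)

view = img[1:4, 2:5]                 # rows 1..3, columns 2..4 (end index exclusive)
independent = img[1:4, 2:5].copy()   # a separate copy of the same region

view[:] = 255                        # writes through to img
print(view.shape)                    # (3, 3, 3)
print(int(img[1, 2, 0]))             # 255: the original changed via the view
print(int(independent.max()))        # 0: the copy is unaffected

# Slices that run past the image border are clipped silently.
print(img[0:100, 0:100].shape)       # (6, 8, 3)
```

The last line matters for the script above: after rescaling, the image may have fewer than 400 rows or columns, in which case the crop is simply smaller than requested and no error is raised.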

Line 24: Holding Windows Open

cv.waitKey(0)
Definition: cv.waitKey() waits for a key event. Argument 0 means wait indefinitely.
Use Case: Prevents the image windows from closing immediately after display.

1.4 Conclusion of the Breakdown

Each function and operation in the above script serves a critical role in the image preprocessing pipeline.
The program covers an extensive range of fundamental operations (reading, resizing, converting, filtering, edge detection, morphological transformations, and region extraction), each of which is indispensable for real-world computer vision tasks such as:
• Object detection and recognition
• Image segmentation
• Preprocessing for machine learning and deep learning models
• Real-time surveillance and motion tracking
The modularity and clarity of this script make it an excellent starting point for any computer vision
practitioner or student.

2 Image Smoothing Techniques Using OpenCV: An In-Depth Line-by-Line Analysis and Conceptual Overview

2.1 Introduction
Image smoothing, also known as image blurring, is a fundamental technique in computer vision and image
processing. It involves reducing image noise and detail by averaging pixel values with their neighbors.
Smoothing serves as a preprocessing step for edge detection, feature extraction, and denoising.
This section comprehensively analyzes a Python script that demonstrates four major smoothing
techniques using OpenCV:
• Averaging (Box Filter)
• Gaussian Blur
• Median Blur
• Bilateral Filter
We define each concept, explain its role in image processing, and offer a line-by-line commentary on
the implementation.

2.2 Complete Python Script Context

import cv2 as cv
import numpy as np
img = cv.imread('Photo\\Varanasi.jpg')

def resacleFrame(frame, scale=0.15):
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)

img = resacleFrame(img)
cv.imshow('cv', img)

average = cv.blur(img, (3, 3))
cv.imshow('averatge', average)

gaus_Avg = cv.GaussianBlur(img, (3, 3), 0)
cv.imshow('Gaussian Blur', gaus_Avg)

median = cv.medianBlur(img, 3)
cv.imshow('Medoan', median)

bilateral = cv.bilateralFilter(img, 5, 15, 15)
cv.imshow('Bilateral', bilateral)

cv.waitKey(0)

2.3 Detailed Explanation and Functionality of Each Line

Line 1–2: Importing Libraries
import cv2 as cv
import numpy as np

cv2: OpenCV library for image processing.

NumPy (np): Fundamental library for matrix operations; essential as OpenCV images are NumPy
arrays.

Line 3: Reading the Image

img = cv.imread('Photo\\Varanasi.jpg')

Reads the image from the given path. The double backslash (\\) is an escaped backslash, which is the Windows path separator. The image is loaded as a NumPy array in BGR format.

Line 4–8: Image Rescaling Function

Purpose: Reduces image size to 15% of original to optimize display and performance.

• frame.shape[1] = width
• frame.shape[0] = height
• cv.resize(...) scales the image using INTER_AREA interpolation (ideal for shrinking).

Line 9: Apply Rescaling
img = resacleFrame(img)
Why scale down? Speeds up display and processing, useful in real-time applications or GUI
visualizations.

Line 10: Displaying Original Image

cv.imshow('cv', img)
Displays the original (rescaled) image in a window labeled "cv".

2.4 Image Smoothing Techniques

We now analyze four distinct smoothing methods applied in the script. Each is unique in how it handles
neighboring pixel values and noise.

2.4.1 1. Averaging (Box Blur)

average = cv.blur(img, (3, 3))
cv.imshow('averatge', average)
Definition: Averaging replaces each pixel's value with the average of its surrounding pixels defined by the kernel size.
Mathematical Operation:
\[
I'(x, y) = \frac{1}{mn} \sum_{i=-a}^{a} \sum_{j=-b}^{b} I(x + i, y + j)
\]
where m × n is the kernel size (e.g., 3 × 3), with m = 2a + 1 and n = 2b + 1.

Effect: Softens the image uniformly; blurs both noise and edges.
Use Case: Basic noise reduction.
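
The averaging formula can be written out directly in NumPy for a single-channel image. This is a sketch under stated assumptions: the name box_filter is hypothetical, a is applied along rows and b along columns, and border pixels are handled by edge replication, which may differ from the border mode cv.blur uses.

```python
import numpy as np


def box_filter(image, a=1, b=1):
    """Mean over a (2a+1) x (2b+1) window, i.e. I'(x, y) from the formula."""
    m, n = 2 * a + 1, 2 * b + 1
    rows, cols = image.shape
    padded = np.pad(image.astype(np.float64), ((a, a), (b, b)), mode="edge")
    total = np.zeros((rows, cols), dtype=np.float64)
    # Add up every shifted copy covered by the window, then divide by m*n.
    for i in range(m):
        for j in range(n):
            total += padded[i:i + rows, j:j + cols]
    return total / (m * n)


spike = np.zeros((5, 5), dtype=np.uint8)
spike[2, 2] = 9                       # one bright pixel on a dark background
smoothed = box_filter(spike)          # 3x3 window
print(smoothed[1:4, 1:4])             # every entry is 1.0: the spike is spread evenly
```

The single bright value is distributed equally over its 3x3 neighborhood, which is why averaging blurs edges as readily as it blurs noise.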

2.4.2 2. Gaussian Blur

gaus_Avg = cv.GaussianBlur(img, (3, 3), 0)
cv.imshow('Gaussian Blur', gaus_Avg)
Definition: Gaussian blur uses a weighted average where closer pixels have more influence, based on a Gaussian distribution.
Mathematical Form:
\[
G(x, y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{x^2 + y^2}{2\sigma^2}}
\]
Parameters:
• (3, 3): Kernel size.
• 0: sigmaX. A value of 0 tells OpenCV to compute the standard deviation from the kernel size.
Effect: More natural blur compared to averaging; reduces noise while preserving edges somewhat better than a box filter.
Use Case: Preprocessing before edge detection or for photographic effects.
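
The weights implied by G(x, y) can be tabulated for a small kernel. The sketch below samples the formula on an integer grid and normalizes the result so the weights sum to 1; the function name and the choice sigma = 1.0 are assumptions, and OpenCV derives its own coefficients, so the numbers need not match cv.GaussianBlur exactly.

```python
import numpy as np


def gaussian_kernel(ksize=3, sigma=1.0):
    """Sample G(x, y) on a ksize x ksize grid centered at the origin."""
    half = ksize // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    kernel = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))
    kernel /= 2.0 * np.pi * sigma ** 2
    # Normalize so that filtering preserves overall brightness.
    return kernel / kernel.sum()


k = gaussian_kernel(3, 1.0)
print(np.round(k, 3))
print(bool(k[1, 1] > k[0, 1] > k[0, 0]))   # True: center > edge > corner
```

Unlike the box filter, the center pixel receives the largest weight and the corners the smallest, which is what "closer pixels have more influence" means in practice.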

2.4.3 3. Median Blur

median = cv.medianBlur(img, 3)
cv.imshow('Medoan', median)
Definition: Median filtering replaces each pixel with the median of neighboring pixels.
Why Median?
• Effective against salt-and-pepper noise.
• Better preserves edges than mean-based methods.
Kernel: 3 refers to a 3x3 neighborhood.
Use Case: Cleaning binary images, smoothing text, reducing salt-and-pepper artifacts.
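
A quick numeric comparison shows why the median resists salt-and-pepper noise. The nine values below are a made-up 3x3 neighborhood with one "salt" pixel; only the Python standard library is used.

```python
from statistics import mean, median

# Hypothetical 3x3 neighborhood, flattened; 255 is an isolated noise pixel.
window = [52, 50, 51, 49, 255, 50, 48, 51, 50]

print(median(window))            # 50: the outlier is ignored
print(round(mean(window), 1))    # 72.9: the outlier drags the average up
```

A mean-based filter would replace the center pixel with a visibly brighter value, while the median returns a value typical of the surrounding region.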

2.4.4 4. Bilateral Filter

bilateral = cv.bilateralFilter(img, 5, 15, 15)
cv.imshow('Bilateral', bilateral)

Definition: Bilateral filtering smooths images while preserving edges using both:
• Spatial distance (how close pixels are)

• Intensity difference (how similar pixels are)

Parameters:
• 5: Diameter of pixel neighborhood.

• 15: sigmaColor (σ_color). Larger values mean pixels with larger intensity differences will be mixed.
• 15: sigmaSpace (σ_space). Larger values mean more distant pixels will influence the blur.
Effect: Removes noise while retaining sharp edges. Best among all for edge-preserving filtering.
Use Case: Medical imaging, cartoonizing, HDR photography, skin smoothing in portraits.

Final Step: Hold Display Windows Open

cv.waitKey(0)

Definition: Waits for a key event indefinitely to keep the image windows open.
Use Case: Prevents automatic closing of image display windows until user interaction.

2.5 Comparative Summary of Smoothing Techniques

Method | Edge Preservation | Noise Reduction | Use Case
Averaging | Poor | Moderate | Basic smoothing
Gaussian Blur | Moderate | Good | Preprocessing for detection
Median Blur | Good | Excellent (salt-and-pepper) | Text, binary images
Bilateral Filter | Excellent | Excellent | Facial smoothing, HDR, medical

2.6 Real-World Applications

• Autonomous Vehicles: Preprocess road scenes before edge detection.

• Medical Imaging: Denoise X-ray, MRI, or CT scan images.

• Surveillance: Enhance visibility in low-quality security footage.
• Face Detection: Smoothing images to improve face detection accuracy.

3 Color Space Conversions in OpenCV: A Comprehensive Theoretical and Practical Overview

3.1 Introduction
Color spaces define how color information is represented in an image. Although digital images are often
stored in the RGB (Red, Green, Blue) format, alternative color spaces can be better suited for specific
image processing tasks such as segmentation, enhancement, tracking, and analysis.
In this section, we explore a Python script that performs multiple color space transformations using OpenCV. Each conversion is explained in depth with definitions, mathematical foundations, code explanation, and real-world use cases.

3.2 Complete Python Script Overview

import cv2 as cv
img = cv.imread('Photo\\Varanasi.jpg')

def resacleFrame(frame, scale=0.15):
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)

img = resacleFrame(img)
cv.imshow('Varansi', img)

gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('gray', gray)

hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
cv.imshow('HSV', hsv)

lab = cv.cvtColor(img, cv.COLOR_BGR2LAB)
cv.imshow('LAB', lab)

rgb = cv.cvtColor(img, cv.COLOR_BGR2RGB)
cv.imshow('RGB', rgb)

hsv_bgr = cv.cvtColor(hsv, cv.COLOR_HSV2BGR)
cv.imshow('HSV to BGR', hsv_bgr)

lab_bgr = cv.cvtColor(lab, cv.COLOR_LAB2BGR)
cv.imshow('LAB to BGR', lab_bgr)

cv.waitKey(0)

3.3 Line-by-Line Explanation and Theoretical Insights

Lines 1–2: Importing and Reading Image
import cv2 as cv
img = cv.imread('Photo\\Varanasi.jpg')

cv2: The OpenCV Python binding.

cv.imread(): Loads an image from disk in BGR format by default.
Note: OpenCV loads images in BGR, not RGB. This is important for correct color interpretation.

Lines 3–8: Rescaling Function

Reduces image size for faster processing and display.

cv.INTER_AREA: Recommended interpolation method for image shrinking.

Line 9: Rescaling the Image

img = resacleFrame(img)

Purpose: Reduces image dimensions to 15% of original size.

Line 10: Display Original Image
cv.imshow('Varansi', img)

Shows the rescaled BGR image in a window labeled "Varansi".

3.4 Color Space Conversions

The next part of the script performs several transformations between color spaces.

3.4.1 1. BGR to Grayscale

gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('gray', gray)

Definition: Grayscale images store luminance (brightness) only, reducing the image to one channel.
Mathematical Concept:
Y = 0.299R + 0.587G + 0.114B
Use Case: Ideal for edge detection, thresholding, and simplifying input for deep learning.
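
The luminance formula can be applied by hand to a few BGR pixels. This is a NumPy-only sketch with a hypothetical function name; it rounds to the nearest integer, and cv.cvtColor may differ by a count in some cases because of its own internal arithmetic.

```python
import numpy as np


def bgr_to_gray(img_bgr):
    """Y = 0.299 R + 0.587 G + 0.114 B for an array in BGR channel order."""
    b = img_bgr[..., 0].astype(np.float64)
    g = img_bgr[..., 1].astype(np.float64)
    r = img_bgr[..., 2].astype(np.float64)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return np.clip(np.rint(y), 0, 255).astype(np.uint8)


# One row of four pixels in BGR order: white, pure green, pure blue, pure red.
pixels = np.array([[[255, 255, 255],
                    [0, 255, 0],
                    [255, 0, 0],
                    [0, 0, 255]]], dtype=np.uint8)
print(bgr_to_gray(pixels).tolist())   # [[255, 150, 29, 76]]
```

Green contributes the most and blue the least, reflecting how the weights approximate the eye's sensitivity to each primary.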

3.4.2 2. BGR to HSV (Hue, Saturation, Value)

hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
cv.imshow('HSV', hsv)

HSV:

• Hue (H): Color type (0–179 for 8-bit images in OpenCV, i.e., the hue angle in degrees divided by 2)
• Saturation (S): Vividness (0–255)
• Value (V): Brightness (0–255)
Why HSV? It separates color (H) from intensity (V), which is useful in:

• Color tracking
• Skin detection
• Lighting-invariant operations
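
The 8-bit value ranges can be illustrated with the standard-library colorsys module. The helper below is a hypothetical sketch that rescales colorsys output to the ranges listed above (hue angle halved, S and V scaled to 0–255); it is meant to show the convention, not to reproduce cv.cvtColor bit for bit.

```python
import colorsys


def bgr_to_hsv_8bit(b, g, r):
    """Convert one 8-bit BGR pixel to (H, S, V) with H in 0..179, S and V in 0..255."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # colorsys returns h in [0, 1); 180 steps cover the full 360-degree circle.
    return (round(h * 180) % 180, round(s * 255), round(v * 255))


print(bgr_to_hsv_8bit(0, 0, 255))      # pure red   -> (0, 255, 255)
print(bgr_to_hsv_8bit(0, 255, 0))      # pure green -> (60, 255, 255)
print(bgr_to_hsv_8bit(255, 0, 0))      # pure blue  -> (120, 255, 255)
print(bgr_to_hsv_8bit(128, 128, 128))  # mid gray   -> (0, 0, 128)
```

The gray pixel shows the separation mentioned above: it has no hue or saturation at all, and only the V channel carries its brightness.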

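3.4.3 3. BGR to LAB (CIE L*a*b*)

lab = cv.cvtColor(img, cv.COLOR_BGR2LAB)
cv.imshow('LAB', lab)

CIE L*a*b*:
• L*: Lightness (0 is black, 100 is white). For 8-bit images OpenCV rescales this range to 0–255.
• a*: Green-Red axis (stored with an offset of 128 in 8-bit images)
• b*: Blue-Yellow axis (stored with an offset of 128 in 8-bit images)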

Use Case: LAB is designed to be approximately perceptually uniform, so numeric color differences correspond roughly to perceived differences. It is often used for:
• Image enhancement
• Histogram equalization

• Professional photo editing

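3.4.4 4. BGR to RGB

rgb = cv.cvtColor(img, cv.COLOR_BGR2RGB)
cv.imshow('RGB', rgb)

Definition: RGB is the standard channel order for most image viewers and libraries, unlike OpenCV's BGR.
Use Case: Displaying images using matplotlib, which assumes RGB. Note that cv.imshow() assumes BGR, so the 'RGB' window in this script shows the red and blue channels swapped.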

3.4.5 5. HSV to BGR

hsv_bgr = cv.cvtColor(hsv, cv.COLOR_HSV2BGR)
cv.imshow('HSV to BGR', hsv_bgr)

Conversion: Reconstructs the original image (approximately) from HSV.

Use Case: After processing in HSV (e.g., masking), conversion back to BGR is needed for saving/display.

3.4.6 6. LAB to BGR

lab_bgr = cv.cvtColor(lab, cv.COLOR_LAB2BGR)
cv.imshow('LAB to BGR', lab_bgr)

Purpose: Converts LAB image back to BGR format after processing.

Caution: Due to rounding errors and color approximations, the reverse transformation may not be
pixel-perfect.

Final Step: Wait for Key Event

cv.waitKey(0)

Function: Keeps all GUI windows open until a key is pressed.

3.5 Summary of Color Spaces

Color Space | Channels | Advantages | Applications
Grayscale | 1 | Simple, low memory | Edge detection, thresholding
HSV | 3 | Separates color/intensity | Color tracking, segmentation
LAB | 3 | Perceptually uniform | Image enhancement, color correction
RGB | 3 | Natural display order | Visualization, graphics

3.6 Real-World Use Cases

• Self-Driving Cars: HSV for lane detection under varying light.

• Medical Imaging: LAB for contrast enhancement.

• Augmented Reality: RGB conversion for visualization.
• Photography: LAB and HSV for color manipulation in editing tools.

3.7 Conclusion
This script demonstrates the powerful and flexible color space conversion capabilities of OpenCV. Each color model is optimized for specific tasks; choosing the right one can drastically improve the performance of your vision system. Understanding these spaces is essential for all image processing pipelines, especially in computer vision, robotics, and graphics applications.

4 Geometric Transformations in OpenCV: Translation, Rotation, and Flipping

4.1 Introduction
Geometric transformations are fundamental operations in computer vision, used to alter the spatial
configuration of an image without changing its content. These include operations such as:
• Translation: Shifting the image along x and/or y axes.
• Rotation: Rotating the image about a defined point.
• Flipping: Mirroring the image across an axis.
Such transformations are essential in data augmentation, robot vision, image registration, and graphical applications. This section analyzes a Python script that implements these transformations using OpenCV.

4.2 Complete Script Overview

import cv2 as cv
import numpy as np
img = cv.imread('Photo/download.jpg')
cv.imshow('image', img)

Explanation:
• cv2 is the OpenCV module.
• numpy is used for numerical matrix operations.
• The image is read in BGR format using cv.imread() and displayed using cv.imshow().

4.3 Image Translation

4.3.1 Definition
Translation refers to shifting the image in the horizontal (x-axis) and vertical (y-axis) directions.
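Mathematical Form:
\[
\begin{bmatrix} x' \\ y' \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
Where:
• t_x: shift in x-direction
• t_y: shift in y-direction
• x, y: original coordinates
• x', y': translated coordinates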

4.3.2 Code and Explanation

def Translate(img, x, y):
    transMat = np.float32([[1, 0, x], [0, 1, y]])
    dimension = (img.shape[1], img.shape[0])
    return cv.warpAffine(img, transMat, dimension)

tranlated = Translate(img, 100, -100)

# cv.imshow('translated', tranlated)

Function Description:

• transMat: The transformation matrix for shifting.
• img.shape[1] is width, img.shape[0] is height.
• cv.warpAffine() applies the affine transformation.

• The image is shifted 100 pixels right and 100 pixels upward.
Use Case: Shifting images for data augmentation in machine learning, or aligning objects in robotics.
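
The effect of the translation matrix on a single coordinate can be verified with plain NumPy. The helper names below are hypothetical; the matrix has the same layout as transMat in the function above.

```python
import numpy as np


def translation_matrix(tx, ty):
    """2x3 affine matrix [[1, 0, tx], [0, 1, ty]]."""
    return np.float32([[1, 0, tx], [0, 1, ty]])


def apply_affine(mat, x, y):
    """Multiply the 2x3 matrix by the homogeneous coordinate (x, y, 1)."""
    xp, yp = mat @ np.array([x, y, 1], dtype=np.float32)
    return float(xp), float(yp)


M = translation_matrix(100, -100)
print(apply_affine(M, 10, 20))   # (110.0, -80.0)
```

The point moves 100 pixels to the right and, because the image y-axis points downward, 100 pixels up. A destination with a negative coordinate falls outside the output canvas, which is why part of a translated image is cut off.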

4.4 Image Rotation

4.4.1 Definition
Rotation involves rotating an image around a fixed point, usually the center of the image.
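Mathematical Form:
\[
\mathrm{RotMatrix} =
\begin{bmatrix}
\cos\theta & -\sin\theta & (1 - \cos\theta)\,x_0 + \sin\theta\,y_0 \\
\sin\theta & \cos\theta & (1 - \cos\theta)\,y_0 - \sin\theta\,x_0
\end{bmatrix}
\]
where (x_0, y_0) is the center of rotation and θ is the rotation angle.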

4.4.2 Code and Explanation

def Rotate(img, angle, rotPoint=None):
    (h, w) = img.shape[:2]
    if rotPoint is None:
        rotPoint = (w // 2, h // 2)
    rotMat = cv.getRotationMatrix2D(rotPoint, angle, 1.0)
    dimension = (w, h)
    return cv.warpAffine(img, rotMat, dimension)

rotated = Rotate(img, -45)

# cv.imshow('rotated', rotated)

rot_rotated = Rotate(rotated, -45)

# cv.imshow('rotated twice', rot_rotated)

Function Breakdown:
• angle: Rotation angle in degrees (negative = clockwise).
• rotPoint: Center of rotation. Defaults to image center.
• cv.getRotationMatrix2D(): Returns a 2x3 affine rotation matrix.

• cv.warpAffine(): Applies the rotation.

Use Case:
• Aligning or re-orienting images.

• Simulating camera or object movement.

• Augmenting training data in classification tasks.
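
The rotation matrix from Section 4.4.1 can be built and checked numerically. This NumPy sketch follows that formula as written (function name and sample center are assumptions); cv.getRotationMatrix2D is documented with the signs of the sine terms reversed, which amounts to negating θ, because it works in image coordinates where y points down.

```python
import math

import numpy as np


def rotation_matrix(theta_deg, x0, y0):
    """2x3 matrix rotating by theta about (x0, y0), per the formula in 4.4.1."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return np.array([
        [c, -s, (1 - c) * x0 + s * y0],
        [s,  c, (1 - c) * y0 - s * x0],
    ])


R = rotation_matrix(90, 50, 40)
center = R @ np.array([50, 40, 1])      # the rotation center must not move
neighbor = R @ np.array([51, 40, 1])    # a point one unit along +x from the center
print(np.round(center, 6))              # [50. 40.]
print(np.round(neighbor, 6))            # [50. 41.]
```

The third column of the matrix is exactly the correction that keeps the chosen center fixed; without it the image would rotate about the origin at the top-left corner.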

4.5 Image Flipping

4.5.1 Definition
Flipping reverses the pixels of an image either horizontally, vertically, or both.

4.5.2 Code and Modes

flip = cv.flip(img, 0) # Vertical Flip
# cv.imshow('fliped', flip)

flip2 = cv.flip(img, 1) # Horizontal Flip
# cv.imshow('fliped2', flip2)

flip3 = cv.flip(img, -1) # Both Horizontal and Vertical
# cv.imshow('fliped3', flip3)

cv.flip() Axis Codes:

• 0: Flip vertically.
• 1: Flip horizontally.
• -1: Flip both axes.
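
The three flip codes correspond to reversing the row order, the column order, or both. The NumPy slicing below is an equivalent sketch on a tiny array, so the result can be read off directly; the array contents are made up for illustration.

```python
import numpy as np

img = np.arange(6, dtype=np.uint8).reshape(2, 3)   # [[0 1 2], [3 4 5]]

vertical = img[::-1, :]      # same result as flip code 0: rows reversed
horizontal = img[:, ::-1]    # same result as flip code 1: columns reversed
both = img[::-1, ::-1]       # same result as flip code -1: rows and columns reversed

print(vertical.tolist())     # [[3, 4, 5], [0, 1, 2]]
print(horizontal.tolist())   # [[2, 1, 0], [5, 4, 3]]
print(both.tolist())         # [[5, 4, 3], [2, 1, 0]]
```

Flipping both axes is the same as rotating the image by 180 degrees.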
Use Case:

• Horizontal flip for facial recognition symmetry.

• Vertical flip for mirrored environments.
• Combined flip for geometric data augmentation.

4.6 Final Step: Holding Windows Open

cv.waitKey(0)

Purpose: Pauses the execution and keeps image windows open until a key is pressed.

4.7 Summary of Geometric Transformations

Transformation | Function | Matrix Involved | Typical Use Cases
Translation | cv.warpAffine() | [[1, 0, x], [0, 1, y]] | Object tracking, registration
Rotation | cv.getRotationMatrix2D() | 2x3 rotation matrix | Image orientation, augmentation
Flipping | cv.flip() | N/A (built-in logic) | Data augmentation, symmetry correction

4.8 Real-World Applications

• Augmented Reality: Reorient virtual objects in the user view.
• Medical Imaging: Standardize direction of X-rays or MRIs.

• Machine Learning: Geometric augmentations increase dataset diversity.

• Autonomous Vehicles: Adjust camera inputs to align with trajectory.

4.9 Conclusion
This script demonstrates how spatial transformations such as translation, rotation, and flipping can be
implemented in OpenCV. These operations are critical for real-time applications, data preprocessing,
and robust model training. Mastering them provides a strong foundation for more advanced computer
vision workflows.

5 Channel Splitting and Merging in OpenCV: Theory, Code, and Applications

5.1 Introduction
In digital image processing, color images are typically represented using multiple channels — each channel
encodes intensity values for a primary color component. In the case of the BGR format (used by
OpenCV), these components are:
• B: Blue Channel
• G: Green Channel

• R: Red Channel
Manipulating these channels independently enables various applications such as color filtering, enhancement, and object detection based on specific spectral properties.
This section provides an in-depth explanation of how to split and merge color channels using OpenCV,
based on the provided Python script.

5.2 Complete Python Script Overview

import cv2 as cv
import numpy as np
img = cv.imread('Photo\\Varanasi.jpg')

def resacleFrame(frame, scale=0.15):
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)

img = resacleFrame(img)
cv.imshow('Varansi', img)

blank = np.zeros(img.shape[:2], dtype='uint8')

b, g, r = cv.split(img)

blue = cv.merge([b, blank, blank])
cv.imshow('Blue', blue)

green = cv.merge([blank, g, blank])
cv.imshow('Green', green)

red = cv.merge([blank, blank, r])
cv.imshow('Red', red)

cv.waitKey(0)

5.3 Line-by-Line Explanation and Theoretical Background

Lines 1–2: Importing Libraries
import cv2 as cv
import numpy as np

These are the standard imports for OpenCV and NumPy.

Line 3: Reading the Image
img = cv.imread('Photo\\Varanasi.jpg')
Loads the image in BGR format. The image is stored as a 3-dimensional NumPy array of shape
(height, width, 3).

Lines 4–8: Rescaling the Image

def resacleFrame(frame, scale=0.15):
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)
Purpose: Downscale the image to reduce memory and computation.
cv.INTER_AREA: Preferred interpolation method for reducing image size.

Line 9: Applying Rescaling

img = resacleFrame(img)

Line 10: Displaying the Rescaled Image

cv.imshow('Varansi', img)
Shows the complete rescaled image for reference.

Line 11: Creating a Blank Image

blank = np.zeros(img.shape[:2], dtype='uint8')
Explanation:
• img.shape[:2] returns (height, width) — for a single-channel blank image.
• uint8 specifies pixel values from 0 to 255.
• Used as placeholder for zeroing unused color channels.

5.4 Channel Splitting

b, g, r = cv.split(img)
cv.split(): Decomposes a 3-channel BGR image into its individual blue, green, and red components.
After Splitting:
• b, g, and r are 2D arrays (grayscale images).
• Each pixel in these arrays corresponds to the intensity of that color in the original image.

5.5 Channel Merging and Visualization

To visualize each channel separately in color, we merge the target channel with two blank matrices.

5.5.1 Blue Channel Visualization

blue = cv.merge([b, blank, blank])
cv.imshow('Blue', blue)
Interpretation:
• b is retained in the blue channel.
• Green and red are zeroed out.
• Result: Pure blue intensities in the image.
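
Because OpenCV images are NumPy arrays, the same split-and-merge idea can be sketched with array indexing alone. The two-pixel image below is made up for illustration; np.dstack plays the role that cv.merge plays in the script.

```python
import numpy as np

# A 1x2 image in BGR order: pixel 0 = (10, 20, 30), pixel 1 = (40, 50, 60).
img = np.array([[[10, 20, 30], [40, 50, 60]]], dtype=np.uint8)

# Splitting: each channel is a 2D array of shape (height, width).
b, g, r = img[..., 0], img[..., 1], img[..., 2]

# Merging one channel with two blank channels keeps only that color.
blank = np.zeros(img.shape[:2], dtype=np.uint8)
blue = np.dstack([b, blank, blank])
print(blue[0, 0].tolist(), blue[0, 1].tolist())   # [10, 0, 0] [40, 0, 0]

# Merging all three original channels restores the image exactly.
restored = np.dstack([b, g, r])
print(bool(np.array_equal(restored, img)))        # True
```

The blank array must have the same height, width, and dtype as the channels it is merged with, which is why the script builds it from img.shape[:2] with dtype uint8.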

5.5.2 Green Channel Visualization

green = cv.merge([blank, g, blank])
cv.imshow('Green', green)

5.5.3 Red Channel Visualization

red = cv.merge([blank, blank, r])
cv.imshow('Red', red)

5.6 Final Display Hold

cv.waitKey(0)
Prevents all OpenCV display windows from closing until a key is pressed.

5.7 Why Split and Merge Channels?

5.7.1 Applications of Splitting
• Feature Detection: Extract features from a specific color component.
• Masking: Apply masks only on the green or red channel.
• Analysis: Measure distribution and histogram of color intensities.

5.7.2 Applications of Merging

• Channel Manipulation: Modify brightness/contrast of one channel.
• Color Emphasis: Highlight or isolate certain colors in the image.
• Image Reconstruction: Combine edited channels to form the final image.

5.8 Matrix Insight

A pixel at column x and row y in the original image has (NumPy indexes the row first):
img[y, x] = [B, G, R]
After splitting:
b[y, x] = B, g[y, x] = G, r[y, x] = R
After merging with blank channels:
New Image[y, x] = [B, 0, 0] (for Blue Visualization)

5.9 Conclusion
Channel splitting and merging is a powerful low-level operation in image processing. It provides the
flexibility to perform channel-specific transformations, filtering, and analysis. Understanding this concept
is essential for tasks in color science, machine vision, and neural network preprocessing.

6 Contour Detection in OpenCV: Theory, Code Dissection, and Applications

6.1 Introduction
Contour detection is a fundamental operation in computer vision. A contour can be defined as a curve
joining all continuous points along a boundary which share the same color or intensity. In binary images,
this typically means tracing the edges of white regions.
This section provides a comprehensive, line-by-line analysis of a Python script that detects and
draws contours using OpenCV. Concepts covered include grayscale conversion, thresholding, binary
segmentation, contour retrieval modes, and drawing functions.

6.2 Complete Script Overview

import cv2 as cv
import numpy as np

img = cv.imread('Photo\\Varanasi.jpg')

def resacleFrame(frame, scale=0.25):
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)

img = resacleFrame(img)
cv.imshow('JPG', img)

blank = np.zeros(img.shape, dtype='uint8')
cv.imshow('Blank', blank)

gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('Gray', gray)

ret, thresh = cv.threshold(gray, 125, 255, cv.THRESH_BINARY)
cv.imshow('Thresh', thresh)

contours, hierarchies = cv.findContours(thresh, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)
print(f'{len(contours)}')

cv.drawContours(blank, contours, -1, (0, 0, 255), 1)
cv.imshow('Countours', blank)

cv.waitKey(0)

6.3 Step-by-Step Code Explanation

Lines 1–3: Imports and Image Loading
import cv2 as cv
import numpy as np
img = cv.imread('Photo\\Varanasi.jpg')

• cv2: OpenCV library for image processing.

• numpy: For matrix and numerical operations.
• The image is loaded in BGR format.

Lines 4–9: Image Rescaling

def resacleFrame(frame, scale=0.25):
    ...
img = resacleFrame(img)

Scales the image to 25% of its original dimensions for better display and processing efficiency.

Line 10: Display Original Image

cv.imshow('JPG', img)

Line 11: Creating a Blank Image
blank = np.zeros(img.shape, dtype='uint8')
cv.imshow('Blank', blank)

Purpose: An empty canvas of the same shape as the original image, used to draw contours.

Line 12–13: Grayscale Conversion

gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('Gray', gray)

Why Grayscale? Most contour detection techniques operate on single-channel images where pixel
intensities define object boundaries.

6.4 Thresholding for Binary Conversion

ret, thresh = cv.threshold(gray, 125, 255, cv.THRESH_BINARY)
cv.imshow('Thresh', thresh)

cv.threshold(): Converts grayscale to binary:
\[
\mathrm{dst}(x, y) =
\begin{cases}
\mathrm{maxVal}, & \text{if } \mathrm{src}(x, y) > \mathrm{thresh} \\
0, & \text{otherwise}
\end{cases}
\]

Parameters:
• gray: Source image
• 125: Threshold value
• 255: Maximum value assigned to pixels above threshold

• cv.THRESH_BINARY: Thresholding method

Use Case: Converts image to black and white, necessary for contour extraction.
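
The binary thresholding rule can be reproduced in one line of NumPy. The function name and the sample values are assumptions chosen to show the boundary case; note that a pixel exactly equal to the threshold is set to 0 because the comparison is strict.

```python
import numpy as np


def threshold_binary(src, thresh, max_val):
    """dst = max_val where src > thresh, otherwise 0."""
    return np.where(src > thresh, max_val, 0).astype(np.uint8)


gray = np.array([[0, 125, 126],
                 [200, 124, 255]], dtype=np.uint8)
binary = threshold_binary(gray, 125, 255)
print(binary.tolist())   # [[0, 0, 255], [255, 0, 255]]
```

The result contains only the two values 0 and 255, which is the form of input that contour extraction expects.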

6.5 Finding Contours

contours, hierarchies = cv.findContours(thresh, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)

Definition:
cv.findContours() retrieves contours from a binary image.
Returns:
• contours: List of contour points (arrays of (x, y))

• hierarchies: Structural hierarchy among contours

Modes:
• cv.RETR_LIST: Retrieves all contours, no hierarchy.
• cv.RETR_TREE: Retrieves all contours and builds a full hierarchy tree.
• cv.RETR_EXTERNAL: Retrieves only the outermost contours.
Contour Approximation:
• cv.CHAIN_APPROX_NONE: Stores all contour points.
• cv.CHAIN_APPROX_SIMPLE: Removes redundant points.

6.6 Drawing Contours

cv.drawContours(blank, contours, -1, (0, 0, 255), 1)
cv.imshow('Countours', blank)

Parameters:
• blank: Destination canvas

• contours: List of contours

• -1: Draw all contours (use index to draw specific one)
• (0, 0, 255): Red color in BGR

• 1: Thickness of contour lines

6.7 Alternative Method: Canny + Contours (commented)

# blur = cv.GaussianBlur(gray, (5,5), cv.BORDER_DEFAULT)
# canny = cv.Canny(blur, 125, 175)
# cv.imshow('Canny', canny)

Explanation:
• Blurring reduces noise before edge detection.
• Canny edge detection produces binary edges.
• Contours can be found using these edges as input.

6.8 Summary of Key Functions

Function | Purpose | Category
cv.cvtColor() | Convert BGR to Grayscale | Preprocessing
cv.threshold() | Convert grayscale to binary | Segmentation
cv.findContours() | Extract contours from binary image | Feature Extraction
cv.drawContours() | Draw contours on an image | Visualization

6.9 Real-World Applications

• Object Detection: Localize objects based on boundary shapes.
• Image Segmentation: Divide image into regions.

• Shape Matching: Compare contour structures for object classification.

• Medical Imaging: Outline tumors, tissues in X-rays/MRIs.
• Robotics: Detect obstacles or follow lines based on contours.

6.10 Conclusion
Contour detection is a foundational technique in image analysis. Through operations like grayscale
conversion, thresholding, and morphological processing, contours can be reliably extracted and visualized.
Understanding these functions builds a strong base for more advanced tasks in computer vision such as
shape analysis, object tracking, and real-time robotic perception.

7 Bitwise Operations in OpenCV: Theory, Logic, and Visual Image Manipulation
7.1 Introduction
Bitwise operations are logical manipulations applied at the binary level between two images. Each pixel
in the resulting image is computed by applying binary logic (AND, OR, XOR, NOT) to the corresponding
pixels in the input images.
These operations are extremely useful in:
• Image masking
• Region of Interest (ROI) extraction
• Image blending
• Set-theoretic shape operations
In this section, we explore a Python script implementing all major bitwise operations using simple
geometric shapes: a rectangle and a circle.

7.2 Complete Script Overview

import cv2 as cv
import numpy as np

img = cv.imread('Photo\\Varanasi.jpg')

def resacleFrame(frame, scale=0.15):
    # Scale width and height by the same factor to keep the aspect ratio.
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)

img = resacleFrame(img)

# Single-channel black canvas; each shape is drawn on its own copy.
blank = np.zeros((400, 400), dtype='uint8')

rectangle = cv.rectangle(blank.copy(), (30, 30), (370, 370), 255, -1)
circle = cv.circle(blank.copy(), (200, 200), 200, 255, -1)

# Intersection of the two shapes.
binand = cv.bitwise_and(rectangle, circle)
cv.imshow('Bitwise AND', binand)

# Union of the two shapes.
binor = cv.bitwise_or(rectangle, circle)
cv.imshow('Bitwise OR', binor)

# Regions covered by exactly one of the shapes.
binxor = cv.bitwise_xor(rectangle, circle)
cv.imshow('Bitwise XOR', binxor)

# Inverted rectangle.
binnot = cv.bitwise_not(rectangle)
cv.imshow('Bitwise NOT', binnot)

cv.waitKey(0)

7.3 Line-by-Line Code Explanation

Lines 1–3: Importing and Reading
import cv2 as cv
import numpy as np
img = cv.imread('Photo\\Varanasi.jpg')

• cv2: OpenCV library.
• numpy: Matrix operations.
• img: The actual image isn’t used for logic, but is read and scaled.

Lines 4–9: Rescaling Image

def resacleFrame(frame, scale=0.15): ...
img = resacleFrame(img)

Downscales image for preview or context — not used in logic processing here.

Line 10: Creating Blank Image

blank = np.zeros((400, 400), dtype='uint8')

Creates a 400x400 grayscale image filled with 0 (black background).

7.4 Drawing Shapes on Blank Canvas

rectangle = cv.rectangle(blank.copy(), (30, 30), (370, 370), 255, -1)
circle = cv.circle(blank.copy(), (200, 200), 200, 255, -1)

• cv.rectangle(): Draws a white square.

• cv.circle(): Draws a white circle.
• 255: White color in grayscale.
• -1: Fills the shape.
• blank.copy(): Ensures original blank image isn’t modified.

Result: Two binary images (rectangle and circle) ready for bitwise operations.

7.5 Bitwise AND

binand = cv.bitwise_and(rectangle, circle)
cv.imshow('Bitwise AND', binand)

Logical Operation:
Result(x, y) = Rectangle(x, y) ∧ Circle(x, y)
Interpretation: Only the intersection of the two shapes is white; rest is black.

7.6 Bitwise OR
binor = cv.bitwise_or(rectangle, circle)
cv.imshow('Bitwise OR', binor)

Logical Operation:
Result(x, y) = Rectangle(x, y) ∨ Circle(x, y)
Interpretation: Union of the shapes is white; only where both are black is the result black.

7.7 Bitwise XOR

binxor = cv.bitwise_xor(rectangle, circle)
cv.imshow('Bitwise XOR', binxor)

Logical Operation:
Result(x, y) = Rectangle(x, y) ⊕ Circle(x, y)
Interpretation: Only regions where the shapes do not overlap are white.

7.8 Bitwise NOT
binnot = cv.bitwise_not(rectangle)
cv.imshow('Bitwise NOT', binnot)

Logical Operation:
Result(x, y) = ¬Rectangle(x, y)
Interpretation: Inverts all pixels — white becomes black, black becomes white.

7.9 Visual Comparison Summary

Operation   Region Highlighted      Application
AND         Intersection            Mask intersection or overlap detection
OR          Union                   Combining ROIs or masks
XOR         Non-overlapping parts   Detect change regions
NOT         Inversion               Mask inversion or background change

7.10 Applications of Bitwise Operations

• Masking: Use AND to apply a mask over an image.
• Segmentation: Use XOR to isolate unique regions.
• ROI Manipulation: Use OR to combine multiple regions.
• Inversion Tasks: Use NOT to invert binary masks or background.

7.11 Conclusion
Bitwise operations are efficient, low-level operations that form the backbone of image masking, segmentation,
and blending workflows. When combined with contour extraction or thresholding, these operations
unlock powerful tools for image analysis, robotics vision, and medical imaging.

8 Image Masking in OpenCV: Theoretical and Practical Exploration
8.1 Introduction
Masking is a fundamental operation in image processing where certain regions of an image are selected
for processing while others are ignored. A mask is a binary matrix (black-and-white image) that acts as
a filter to specify which parts of the original image should be preserved or altered.
In OpenCV, masking is often implemented using bitwise operations in combination with binary masks.
This section explains a Python script that uses OpenCV to create a circular mask and apply it
to a resized image of Varanasi. The explanation includes visual logic, data structures, and pixel-level
implications of masking.

8.2 Complete Python Script Overview

import cv2 as cv
import numpy as np

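img = cv.imread('Photo\\Varanasi.jpg')

def resacleFrame(frame, scale=0.15):
    # Scale width and height by the same factor to keep the aspect ratio.
    width = frame.shape[1] * scale
    height = frame.shape[0] * scale
    dimension = (int(width), int(height))
    return cv.resize(frame, dimension, interpolation=cv.INTER_AREA)

img = resacleFrame(img)

# Single-channel black canvas with the same height and width as the image.
blank = np.zeros(img.shape[:2], dtype='uint8')
cv.imshow('blank', blank)

# cv.circle draws directly on blank and returns it, so mask is the same array.
mask = cv.circle(blank, (img.shape[1] // 2, img.shape[0] // 2), 100, 255, -1)
cv.imshow('mask', mask)

# Keep the image only where the mask is non-zero.
masked = cv.bitwise_and(img, img, mask=mask)
cv.imshow('masked', masked)

cv.waitKey(0)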

8.3 Step-by-Step Code Explanation

Lines 1–3: Importing and Reading Image
import cv2 as cv
import numpy as np
img = cv.imread('Photo\\Varanasi.jpg')

• cv2: OpenCV for image processing.

• numpy: For matrix operations.
• img: Loaded in BGR format.

Lines 4–9: Rescaling Function and Execution

def resacleFrame(...):
...
img = resacleFrame(img)

Resizes the image to 15% of its original dimensions using cv.INTER_AREA, which is effective for
downscaling.

Line 10: Creating a Blank Image

blank = np.zeros(img.shape[:2], dtype='uint8')

Explanation:
• img.shape[:2] gives (height, width) — creating a single-channel (grayscale) image.
• All pixels initialized to 0 (black).
Purpose: Acts as the canvas on which the circular mask is drawn.

Line 11: Displaying Blank Canvas

cv.imshow('blank', blank)

8.4 Creating the Mask

mask = cv.circle(blank, (img.shape[1] // 2, img.shape[0] // 2), 100, 255, -1)
cv.imshow('mask', mask)

Explanation:
• Draws a filled white circle (255) at the center of the image on the blank canvas.
• Radius = 100 pixels.
• The result is a binary mask with a white circle on black background.

Mathematical Description:
\[
\mathrm{mask}(x, y) =
\begin{cases}
255, & \text{if } (x - c_x)^2 + (y - c_y)^2 < r^2 \\
0, & \text{otherwise}
\end{cases}
\]

Where (c_x, c_y) is the center and r is the radius.
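
The inequality can also be evaluated directly with NumPy, which makes the pixel-level meaning of the mask explicit. This is a sketch under assumptions: the image size is hypothetical, and cv.circle() rasterises the outline in its own way, so the two masks may disagree on a few boundary pixels.

```python
import numpy as np

height, width = 300, 400          # hypothetical image size
cx, cy, r = width // 2, height // 2, 100

# ys has shape (height, 1) and xs has shape (1, width); broadcasting
# evaluates the inequality at every pixel coordinate.
ys, xs = np.ogrid[:height, :width]
inside = (xs - cx) ** 2 + (ys - cy) ** 2 < r ** 2

mask = np.where(inside, 255, 0).astype(np.uint8)
print(mask.shape, mask[cy, cx], mask[0, 0])
```

Note the index order: the array is addressed as mask[y, x], while cv.circle() takes its center as (x, y).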

8.5 Applying the Mask

masked = cv.bitwise_and(img, img, mask=mask)
cv.imshow('masked', masked)

Bitwise AND Operation with Mask

\[
\mathrm{Output}(x, y) =
\begin{cases}
\mathrm{img}(x, y), & \text{if } \mathrm{mask}(x, y) = 255 \\
0, & \text{otherwise}
\end{cases}
\]

• The original image is preserved where the mask is white.

• All other regions are turned to black.

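cv.bitwise_and(...):
• First argument: input image
• Second argument: the same image; a value ANDed with itself is unchanged, so the operation returns the image's own pixel values
• mask=mask: optional 8-bit single-channel mask; the operation is applied only where the mask is non-zero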

8.6 Visual Logic

• Left: Original image

• Middle: Binary mask (circle)

• Right: Output image with only circular region preserved

8.7 Applications of Masking

• Object Isolation: Select circular features like eyes, dials, fruits.
• ROI Extraction: Focus analysis only on region of interest.
• Blurring Specific Zones: Apply filters selectively.
• Medical Imaging: Highlight anatomical zones like tumors or vessels.

8.8 Conclusion
Masking is a vital technique in computer vision that provides spatial selectivity in image processing tasks.
Whether used for object isolation, attention-based filtering, or ROI-focused analytics, masks guide how
and where image operations are applied. This script serves as a minimal yet powerful example of how
to implement masks using NumPy arrays and OpenCV’s bitwise operations.

9 Image Histograms in OpenCV: A Deep Dive into Pixel Distribution and Analysis
9.1 Introduction
An image histogram is a graphical representation of the distribution of pixel intensities in a digital
image. It plots the number of pixels for each intensity value. Histograms are crucial for understanding
image contrast, brightness, dynamic range, and for preprocessing tasks like thresholding, equalization,
and segmentation.
This chapter dissects a script that computes and visualizes both grayscale and color histograms, with
and without masking, using OpenCV and matplotlib. The discussion covers every line in technical
detail and explains underlying principles with mathematical rigor.

9.2 Script Overview

import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt
• cv2: OpenCV library for image manipulation.
• numpy: Matrix and numerical computations.
• matplotlib.pyplot: For histogram plotting.

Image Reading and Rescaling

img = cv.imread('Photo\\Varanasi.jpg')
def resacleFrame(frame, scale=0.25):
    ...
img = resacleFrame(img)
cv.imshow('JPG', img)
• The image is loaded in BGR format.
• Rescaled to 25% of its original size using cv.INTER_AREA.

Blank Canvas for Masking

blank = np.zeros(img.shape[:2], dtype='uint8')
Creates a single-channel image of same height and width as the input, filled with zeros (black).

9.3 Part A — Grayscale Histogram (Commented)

The following section, though commented out, is worth analyzing theoretically.

Grayscale Conversion
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('gray', gray)
Converts the image from 3-channel BGR to a single-channel grayscale using luminance-weighted
average:
Y = 0.299 · R + 0.587 · G + 0.114 · B

Mask Creation for Grayscale

circle = cv.circle(blank, (img.shape[1]//2, img.shape[0]//2), 200, 255, -1)
mask = cv.bitwise_and(gray, gray, mask=circle)
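• Draws a filled white circle of radius 200 pixels centered on the image.
• The circle is then applied to gray using cv.bitwise_and, keeping only pixels inside the circular region.
• Note that the variable named mask therefore holds the masked grayscale image rather than a 0/255 mask. cv.calcHist() counts every pixel whose mask value is non-zero, so pixels inside the circle whose intensity is exactly 0 would be left out; passing circle itself as the mask avoids this.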

Grayscale Histogram Computation
gray_hist = cv.calcHist([gray], [0], mask, [256], [0, 256])

Explanation:
• First arg: List of images.
• Second arg: Channel index (0 for grayscale).
• Third arg: Binary mask.
• Fourth arg: Number of bins (256 for 8-bit image).
• Fifth arg: Intensity range [0, 256]; the upper bound is exclusive, so intensities 0 to 255 are covered.
Mathematical Interpretation: Let I(x, y) be pixel intensities and M(x, y) the mask:
\[
H(i) = \sum_{x=0}^{W} \sum_{y=0}^{H} \delta\big(I(x, y) = i\big) \cdot \delta\big(M(x, y) = 255\big)
\]

Grayscale Histogram Plotting

plt.figure()
plt.title('Grayscale Histogram')
plt.xlabel('bins')
plt.ylabel('# of pixels')
plt.plot(gray_hist)
plt.xlim([0, 256])
plt.show()

9.4 Part B — Color Histogram (Active)

Circular Mask for Color Histogram
mask = cv.circle(blank, (img.shape[1]//2,img.shape[0]//2), 100, 255, -1)
masked = cv.bitwise_and(img, img, mask=mask)
cv.imshow('Mask', masked)

• Creates a circular mask with radius 100.

• Applies it on the 3-channel image.

Color Histogram Calculation

colors = ('b', 'g', 'r')
for i, col in enumerate(colors):
    hist = cv.calcHist([img], [i], None, [256], [0, 256])
    plt.plot(hist, color=col)
    ...

• Iterates through channels: blue (0), green (1), red (2).

• Computes histogram for each channel.
• None mask means histogram for entire image.

Histogram Plotting for Each Color Channel

plt.title('Color Histogram')
plt.xlabel('bins')
plt.ylabel('# of pixels')
plt.xlim([0,256])
plt.show()

9.5 Histogram Interpretation and Analysis
• X-axis (bins): Intensity values from 0 (black) to 255 (white).
• Y-axis: Count of pixels for each intensity.
• Peak at high intensities: Brighter image.
• Spread across range: High contrast.
• Narrow spike: Low contrast or under/overexposure.

9.6 Mathematical Foundation: Histogram Function

\[
H_c(i) = \sum_{x=0}^{W} \sum_{y=0}^{H} \delta\big(I_c(x, y) = i\big)
\]

Where:
• c ∈ {B, G, R} is the color channel.
• i ∈ [0, 255] is the bin index.
• δ(·) is an indicator function written in Kronecker-delta notation (1 if the condition holds, else 0).

9.7 Use Cases of Histograms

• Contrast Enhancement: Detect if histogram is clustered at one end.
• Equalization: Flatten the histogram to improve visibility.
• Image Comparison: Use histogram correlation as similarity metric.
• Segmentation: Intensity-based region separation.
• Camera Feedback: Auto exposure and lighting adjustment.

9.8 Conclusion
Histograms are essential for analyzing the tonal and color distribution of images. Through OpenCV’s
calcHist and Python’s matplotlib, we can visualize and interpret image data at a statistical level.
Whether working on segmentation, enhancement, or machine learning preprocessing, histograms provide
powerful insight into the underlying pixel structure of images.
This script illustrates both grayscale and color histogram construction, as well as spatially restricted
analysis using masks—offering a practical toolbox for any researcher in computer vision or digital image
processing.

10 Thresholding in OpenCV: Fixed and Adaptive Methods with Binary Segmentation
10.1 Introduction
Thresholding is a fundamental technique in image processing used to segment images by converting
grayscale images into binary images. In its simplest form, thresholding sets all pixels above a certain
intensity to one value (usually white) and all below to another (usually black).
This section deeply explores fixed (global) and adaptive thresholding using OpenCV. These are often
used in:
• Document scanning (binarization),
• Object segmentation,
• Optical character recognition (OCR),
• Industrial quality inspection.

10.2 Script Overview
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt
img = cv.imread('Photo\\Varanasi.jpg')

• cv2: OpenCV for image processing.

• numpy: For image shape and numerical computations.
• matplotlib.pyplot: Optional (not used in this particular script).

Image Rescaling
def resacleFrame(frame, scale=0.25):
...
img = resacleFrame(img)

Purpose: Reduce the size of the image to 25% for faster processing and display. cv.INTER_AREA is
used for downsampling.

Original Image and Grayscale Conversion

cv.imshow('JPG', img)
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
cv.imshow('gray', gray)

• Converts the 3-channel BGR image to a 1-channel grayscale image.

• Thresholding is applied to this single-channel image; cv.adaptiveThreshold() in particular requires an 8-bit single-channel input.

10.3 Binary Thresholding

threshold, thresh = cv.threshold(gray, 150, 255, cv.THRESH_BINARY)
cv.imshow('thresh', thresh)

Definition:
cv.threshold() applies a global fixed threshold:
\[
I'(x, y) =
\begin{cases}
\mathrm{maxVal} = 255, & \text{if } I(x, y) > 150 \\
0, & \text{otherwise}
\end{cases}
\]

• gray: input image

• 150: threshold value

• 255: maximum value (white)
• cv.THRESH_BINARY: operation mode

Interpretation: Segments the image into two parts based on intensity — good when lighting is
uniform.

10.4 Inverse Binary Thresholding

threshold, thresh_inv = cv.threshold(gray, 150, 255, cv.THRESH_BINARY_INV)
cv.imshow('thresh_inv', thresh_inv)

Definition:
\[
I'(x, y) =
\begin{cases}
0, & \text{if } I(x, y) > 150 \\
\mathrm{maxVal} = 255, & \text{otherwise}
\end{cases}
\]
Use Case: Useful when the foreground is darker than the background (e.g., dark text on white
paper).

10.5 Adaptive Thresholding

adaptive_thresh = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_MEAN_C,
cv.THRESH_BINARY, 21, 3)
cv.imshow('adaptive', adaptive_thresh)

Concept:
Fixed thresholding fails under non-uniform lighting. Adaptive thresholding calculates the threshold value
for a pixel based on a small neighborhood around it.

Mathematical Formulation:
\[
I'(x, y) =
\begin{cases}
255, & \text{if } I(x, y) > T(x, y) \\
0, & \text{otherwise}
\end{cases}
\]
Where T(x, y) is the mean (or Gaussian-weighted mean) of the pixel intensities in the blockSize × blockSize window around (x, y), minus the constant C.

Parameters Explained:
• gray: Input image.
• 255: Maximum value.
• cv.ADAPTIVE_THRESH_MEAN_C: Uses mean of block.
• cv.THRESH_BINARY: Binary threshold.
• 21: Block size (must be odd).
• 3: Constant subtracted from the mean.

Modes:
• cv.ADAPTIVE_THRESH_MEAN_C: Mean of neighborhood.
• cv.ADAPTIVE_THRESH_GAUSSIAN_C: Weighted Gaussian mean.

10.6 Visual Comparisons

• Fixed Binary: Sharp cutoff at 150, sensitive to lighting.
• Inverse Binary: Inverts result; useful for dark-on-light features.
• Adaptive: Locally adjusted, ideal for scanned documents or scenes with shadows.

10.7 Applications of Thresholding Techniques

• Document Scanning and OCR: Adaptive thresholding improves text clarity.
• Edge Detection Preprocessing: Binary images simplify edge analysis.
• Medical Imaging: Segment tumors or tissues.
• License Plate Recognition: Helps isolate characters.
• Fingerprint Recognition: Prepares binary ridge maps.

10.8 Conclusion
Thresholding transforms grayscale images into binary masks that are crucial for further analysis. While
fixed thresholding is computationally cheaper and sufficient under consistent lighting, adaptive thresholding
is significantly more robust in real-world scenarios with variable illumination. Both techniques
form the foundation for many high-level vision tasks in industry and research.
