
COMPUTER VISION - EC01 & EC02

Assignment-2 (Chapter 3)
Q(3.1) Write a simple application to change the color balance of an image by multiplying each
color value by a different user-specified constant. If you want to get fancy, you can make this
application interactive, with sliders.
1. Do you get different results if you take out the gamma transformation before or after doing the
multiplication? Why or why not?
2. Take the same picture with your digital camera using different color balance settings (most
cameras control the color balance from one of the menus). Can you recover what the color balance
ratios are between the different settings? You may need to put your camera on a tripod and align
the images manually or automatically to make this work.
3. Can you think of any reason why you might want to perform a color twist (Section 3.1.2) on the images?

Answer for Q(3.1)


1)
import cv2
import numpy as np

def adjust_color_balance(image, r_gain, g_gain, b_gain):
    # Split the image into its color channels
    b, g, r = cv2.split(image)
    # Multiply each channel by its respective gain
    r = np.clip(r * r_gain, 0, 255).astype(np.uint8)
    g = np.clip(g * g_gain, 0, 255).astype(np.uint8)
    b = np.clip(b * b_gain, 0, 255).astype(np.uint8)
    # Merge the channels back together
    return cv2.merge([b, g, r])

# Define color balance values
r_gain = 1.2
g_gain = 1.0
b_gain = 0.8

# Load image and adjust color balance
image = cv2.imread("C:\\Users\\HP\\Desktop\\360_F_386607662_ivepH9SyTymYgeVz51p4Rj0roA8ACS6l.jpg")
adjusted_image = adjust_color_balance(image, r_gain, g_gain, b_gain)
cv2.imshow('Original Image', image)
cv2.imshow('Adjusted Image', adjusted_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Yes, we get different results depending on whether the gamma transformation is taken out before or after the multiplication, because gamma correction is a non-linear mapping and multiplication does not commute with it.
Applying Gamma Transformation Before Multiplication:

When gamma transformation is applied before the color multiplication, the gains are applied to gamma-encoded pixel values. Gamma correction is a non-linear mapping used to account for non-linear display response and perceived brightness, so a gain g applied to an encoded value v^(1/γ) behaves like a gain of roughly g^γ on the underlying linear intensity (for example, g = 1.2 with γ ≈ 2.2 acts like a linear gain of about 1.5). The color balance change is therefore stronger than intended and interacts with the non-linear encoding, which makes the outcome harder to predict and control.

Applying Gamma Transformation After Multiplication:

Conversely, when gamma correction is applied after the color multiplication, the gains are applied to the raw (approximately linear) color channels first, and the gamma correction then only adjusts the overall brightness and contrast for display. Because the color balance modification is based on the original linear color values, the result is more predictable and controllable, and the impact of each operation is easier to manage and understand.
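A minimal sketch contrasting the two orderings, assuming a display gamma of about 2.2 (both the gamma value and the per-channel gains array, given here in OpenCV's BGR order, are illustrative assumptions):

import numpy as np

gamma = 2.2  # assumed display gamma

def apply_gain_linear(image, gains):
    # Decode gamma to (approximately) linear intensities, apply per-channel gains, re-encode.
    linear = (image.astype(np.float32) / 255.0) ** gamma
    balanced = np.clip(linear * gains, 0, 1)
    return (255 * balanced ** (1 / gamma)).astype(np.uint8)

def apply_gain_gamma(image, gains):
    # Apply the same gains directly to the gamma-encoded values, for comparison.
    encoded = image.astype(np.float32) / 255.0
    balanced = np.clip(encoded * gains, 0, 1)
    return (255 * balanced).astype(np.uint8)

gains = np.array([0.8, 1.0, 1.2], dtype=np.float32)  # hypothetical (B, G, R) gains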

2)
% Load two images taken with different color balance settings
img1 = imread('image_daylight.jpg');  % Image 1 (Daylight)
img2 = imread('image_tungsten.jpg');  % Image 2 (Tungsten)

% Extract a region of interest (ROI) where the color balance is analyzed
% Assuming the ROI is manually defined, e.g., an [x, y, width, height] rectangle
roi = [50, 50, 100, 100];  % Example ROI

% Crop the images to the selected region
roi_img1 = imcrop(img1, roi);
roi_img2 = imcrop(img2, roi);

% Calculate the mean RGB values in the ROI for both images
mean_rgb_img1 = mean(reshape(roi_img1, [], 3), 1);  % Image 1 (Daylight)
mean_rgb_img2 = mean(reshape(roi_img2, [], 3), 1);  % Image 2 (Tungsten)

% Calculate the color balance ratio between the two images
color_balance_ratio = mean_rgb_img1 ./ mean_rgb_img2;

% Display the results
disp('Color Balance Ratio (Daylight to Tungsten):');
disp(['Red Ratio: ', num2str(color_balance_ratio(1))]);
disp(['Green Ratio: ', num2str(color_balance_ratio(2))]);
disp(['Blue Ratio: ', num2str(color_balance_ratio(3))]);
Color Balance Ratio (Daylight to Tungsten):
Red Ratio: 1.7105
Green Ratio: 1.0561
Blue Ratio: 0.80957

3)
Performing a color twist on images, as discussed in Section 3.1.2, involves applying a
linear transformation to the color channels, such as mixing the red, green, and blue
channels to create new color combinations. This can be particularly useful in various
scenarios:
1. Correcting Color Casts: A color twist can be used to correct undesired color casts
caused by lighting conditions (e.g., under artificial lighting that skews color tones). By
adjusting how each channel interacts with the others, you can create a more neutral or
realistic color appearance.
2. Stylizing Images: If you are aiming to create a specific artistic or cinematic look, a
color twist can help you apply uniform color changes across the image. For example,
adding a warm or cool tone by blending channels can achieve a unique style for an image
or a series of photos.
3. Enhancing Color Depth and Contrast: By twisting the colors in subtle ways, you can
increase the perceptual contrast or separation between similar colors, which might make
the image appear more vibrant or detailed.
4. Simulating Film Emulation or Filters: In some cases, photographers or filmmakers use
color twists to mimic the look of traditional film stock or to apply filters that affect the
way colors are rendered. This can give digital images a retro or analog look.
5. Harmonizing Multiple Images: If you're working with multiple images shot under
different lighting conditions or with different cameras, a color twist can help harmonize
the appearance of those images by shifting their color profiles to a common reference.
Thus, applying a color twist on an image can enhance visual aesthetics, correct color
inconsistencies, or create certain stylistic effects based on the context.
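As a small illustration, a color twist can be implemented as a 3×3 matrix applied to every RGB pixel; the "warming" matrix below is an arbitrary example, not a value from the text:

import cv2
import numpy as np

# Rows of the matrix produce the new R, G, B from the old R, G, B (arbitrary warming example).
twist = np.array([[1.05, 0.05, 0.00],
                  [0.02, 1.00, 0.02],
                  [0.00, 0.05, 0.90]], dtype=np.float32)

def color_twist(image_bgr, matrix):
    # Convert to RGB floats, apply the linear channel mix, and convert back to 8-bit BGR.
    rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB).astype(np.float32)
    twisted = rgb @ matrix.T
    return cv2.cvtColor(np.clip(twisted, 0, 255).astype(np.uint8), cv2.COLOR_RGB2BGR)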
Q(3.2) If you have access to the RAW image for the camera, perform the demosaicing yourself
(Section 10.3.1). If not, just subsample an RGB image in a Bayer mosaic pattern. Instead of just
bilinear interpolation, try one of the more advanced techniques described in Section 10.3.1.
Compare your result to the one produced by the camera. Does your camera perform a simple linear
mapping between RAW values and the color-balanced values in a JPEG? Some high-end cameras
have a RAW+JPEG mode, which makes this comparison much easier.
Answer for Q(3.2)
To simulate a RAW image, a Bayer filter is applied to a JPEG image. Then the Malvar, He, and Cutler algorithm mentioned in Section 10.3.1 is applied for demosaicing.

CODE
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import convolve
from skimage import io

# Bayer filter (G at (even,even) and (odd,odd), R at (even,odd), B at (odd,even))
def bayer_mosaic(rgb_image):
    h, w, _ = rgb_image.shape
    bayer = np.zeros((h, w))
    # Green pixels
    bayer[::2, ::2] = rgb_image[::2, ::2, 1]
    bayer[1::2, 1::2] = rgb_image[1::2, 1::2, 1]
    # Red pixels
    bayer[::2, 1::2] = rgb_image[::2, 1::2, 0]
    # Blue pixels
    bayer[1::2, ::2] = rgb_image[1::2, ::2, 2]
    return bayer

# Malvar-He-Cutler Demosaicing
def demosaic_malvar_he_cutler(bayer_image):
    h_G_at_RB = np.array([[0, 0, -1, 0, 0],
                          [0, 0, 2, 0, 0],
                          [-1, 2, 4, 2, -1],
                          [0, 0, 2, 0, 0],
                          [0, 0, -1, 0, 0]]) / 8
    h_R_at_G = np.array([[0, 0, 0.5, 0, 0],
                         [0, -1, 0, -1, 0],
                         [-1, 4, 5, 4, -1],
                         [0, -1, 0, -1, 0],
                         [0, 0, 0.5, 0, 0]]) / 8
    h_B_at_G = h_R_at_G
    h_R_at_B = np.array([[0, 0, -1.5, 0, 0],
                         [0, 2, 0, 2, 0],
                         [-1.5, 0, 6, 0, -1.5],
                         [0, 2, 0, 2, 0],
                         [0, 0, -1.5, 0, 0]]) / 8
    h_B_at_R = h_R_at_B

    h, w = bayer_image.shape
    # Masks marking where each color was actually sampled
    R_mask = np.zeros((h, w), dtype=bool); R_mask[::2, 1::2] = True
    B_mask = np.zeros((h, w), dtype=bool); B_mask[1::2, ::2] = True
    G_mask = ~(R_mask | B_mask)

    # Keep the sampled values and interpolate each color only where it is missing
    # (a single R-at-G / B-at-G kernel is used for both green positions, a simplification)
    green_channel = np.where(G_mask, bayer_image, convolve(bayer_image, h_G_at_RB))
    red_channel = np.where(R_mask, bayer_image,
                           np.where(G_mask, convolve(bayer_image, h_R_at_G),
                                    convolve(bayer_image, h_R_at_B)))
    blue_channel = np.where(B_mask, bayer_image,
                            np.where(G_mask, convolve(bayer_image, h_B_at_G),
                                     convolve(bayer_image, h_B_at_R)))

    demosaiced_image = np.stack((red_channel, green_channel, blue_channel), axis=-1)
    return np.clip(demosaiced_image, 0, 1)  # Clip to valid range [0, 1]

# Load Image (replace with actual image path or URL)
rgb_image = io.imread('/home/evoprime/Athena/Downloads/fruits.jpeg') / 255.0
h, w, _ = rgb_image.shape

# Apply Bayer filtering
bayer_image = bayer_mosaic(rgb_image)

# Perform demosaicing
demosaiced_image = demosaic_malvar_he_cutler(bayer_image)

plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.imshow(rgb_image)
plt.title('Original RGB Image')
plt.axis('off')
plt.subplot(1, 3, 2)
plt.imshow(bayer_image, cmap='gray')
plt.title('Bayer Mosaic Image')
plt.axis('off')
plt.subplot(1, 3, 3)
plt.imshow(demosaiced_image)
plt.title('Demosaiced Image (Malvar, He, Cutler)')
plt.axis('off')
plt.show()

OUTPUT
OBSERVATION
The image cannot be reconstructed perfectly using the Malvar-He-Cutler algorithm (or any demosaicing algorithm). In addition, the camera's JPEG pipeline applies white balancing, gamma correction, compression, and other color adjustments, so the mapping between RAW values and the color-balanced JPEG values is not a simple linear one; this accounts for the differences seen here.

Q(3.3) Answer the following questions and optionally validate them experimentally:
1. Most captured images have gamma correction applied to them. Does this invalidate the
basic compositing equation (3.8); if so, how should it be fixed?
2. The additive (pure reflection) model may have limitations. What happens if the glass is tinted,
especially to a non-gray hue? How about if the glass is dirty or smudged? How could you model
wavy glass or other kinds of refractive objects?

Answer for Q(3.3)

1.

Yes, gamma correction does invalidate the basic compositing equation (C = (1 − α)B + αF)
because the equation assumes that the images are in linear color space. Gamma correction
applies a non-linear transformation to image colors, which affects how the colors blend
during compositing.

To fix this:
● First, reverse the gamma correction by converting the gamma-corrected images
back into linear color space. This is done by applying the inverse gamma function,
typically with gamma ≈ 2.2.
● After this, apply the compositing equation in the linear color space.
● Once compositing is completed, reapply gamma correction to the final image so it can be displayed correctly on non-linear devices like monitors (see the sketch below).
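A minimal sketch of this linearize, composite, re-encode pipeline, assuming γ ≈ 2.2 and foreground/background images already scaled to [0, 1]:

import numpy as np

GAMMA = 2.2  # assumed display gamma

def composite_linear(F, B, alpha, gamma=GAMMA):
    # Undo gamma so blending happens on (approximately) linear intensities.
    F_lin = F ** gamma
    B_lin = B ** gamma
    C_lin = (1.0 - alpha) * B_lin + alpha * F_lin
    # Re-apply gamma for display on non-linear devices.
    return C_lin ** (1.0 / gamma)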

2.

○ Tinted Glass: If the glass is tinted, especially with a non-gray hue, the reflection will be altered by the tint color. For example, blue-tinted glass will make reflections appear bluish. This can be modeled by applying a per-channel color filter to the reflection (see the sketch after this list), simulating the effect of the tint on the reflected light.
○ Dirty or Smudged Glass: Dirt or smudges will scatter and diffuse the reflection,
reducing its sharpness and clarity. To model this, a noise pattern or blur effect can
be added to the reflection, simulating the scattering of light due to dirt or smudges
on the glass.
○ Wavy Glass or Refractive Objects: Wavy or irregular glass causes distortions in the
reflection. This can be modeled by using a spatial distortion map, which warps the
reflection based on the irregularities of the glass. Displacement mapping or ripple
effects can simulate the distortion caused by uneven or wavy glass.
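As a small illustration of the tinted-glass case, the reflected component can be attenuated per channel before it is added to the transmitted image; the tint vector below is an arbitrary example, and the images are assumed to be floats in [0, 1]:

import numpy as np

def composite_tinted_reflection(transmitted, reflection, tint):
    # Attenuate the reflection by a per-channel tint (e.g., bluish glass) before adding it
    # to the transmitted image, then clip to the valid range.
    return np.clip(transmitted + reflection * tint, 0.0, 1.0)

tint = np.array([0.4, 0.5, 0.9])  # hypothetical RGB attenuation for blue-tinted glass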

Q(3.4) Set up a blue or green background, e.g., by buying a large piece of colored posterboard.
Take a picture of the empty background, and then of the background with a new object in front of
it. Pull the matte using the difference between each colored pixel and its assumed corresponding
background pixel.

Answer for Q(3.4)


Matting Process (Extracting the Object)

● The goal is to extract the object from the second image by comparing it to the first, where
only the background is present. The difference in pixel values between the two images will
help identify where the object is located.

Technique for Matting

There are several approaches you can use based on the readings:
Difference Matting:

● Step 1: For each pixel in the image with the object, compare its color (RGB values) with
the corresponding pixel in the background image.
● Step 2: If the difference between the two pixel values is above a certain threshold
(indicating a change, i.e., the presence of the object), mark this pixel as part of the
foreground. Otherwise, it's part of the background.

Alpha Matting (From Smith and Blinn 1996):

The equation used to calculate the final composited pixel C is:

𝐶 = (1 − 𝛼)𝐵 + 𝛼𝐹

Where:

● B is the background color.


● F is the foreground color.
● α is the opacity or "alpha channel" value at each pixel, which ranges from 0 (completely
transparent) to 1 (completely opaque).
● Step 1: Calculate α (opacity) for each pixel. If the pixel difference is large, α will be close
to 1 (object). If the difference is small, α will be close to 0 (background).
● Step 2: Use the calculated α values to compute the foreground object colors at each pixel
using the equation:

𝐹 = (𝐶 − (1 − 𝛼)𝐵) ÷ 𝛼

● Step 3: Once the matte (foreground with transparency values) is obtained, you can use this
to composite the foreground object onto any new background by combining it with another
image.

Compositing the Object

After extracting the foreground object, you can insert it into a different background by applying
the compositing equation:

𝐶𝑛𝑒𝑤 = (1 − 𝛼)𝐵𝑛𝑒𝑤 + 𝛼𝐹

Where Bnew is the new background and F is your extracted foreground object.

Input Images:
Code:

import cv2
import numpy as np
import os

background_img = cv2.imread('solidblue.png')
foreground_img = cv2.imread('ballInSolidBlue.jpg')

if background_img is None:
    raise ValueError("Error loading background image.")
if foreground_img is None:
    raise ValueError("Error loading image with object.")

cv2.imshow('Background Image', background_img)
cv2.imshow('Foreground Image (with Object)', foreground_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

background_img = cv2.resize(background_img, (foreground_img.shape[1], foreground_img.shape[0]))
background = background_img.astype(np.float32) / 255.0
foreground = foreground_img.astype(np.float32) / 255.0

difference = np.abs(foreground - background)
threshold = 0.2
alpha = np.max(difference, axis=2) > threshold
alpha = alpha.astype(np.float32)

foreground_extracted = foreground * np.expand_dims(alpha, axis=2)
transparent_background = np.ones_like(foreground) * 255
result_image = np.where(np.expand_dims(alpha, axis=2), foreground_extracted * 255,
                        transparent_background).astype(np.uint8)

cv2.imwrite('extracted_object.png', result_image)
cv2.imshow('Extracted Object', result_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Output Image after bg removal:


Q(3.5) Implement a difference keying algorithm consisting of the following steps:
1. Compute the mean and variance (or median and robust variance) at each pixel in an
“empty” video sequence.
2. For each new frame, classify each pixel as foreground or background (set the background
pixels to RGBA=0).
3. (Optional) Compute the alpha channel and composite over a new background.
4. (Optional) Clean up the image using morphology (Section 3.3.1), label the connected
components (Section 3.3.3), compute their centroids, and track them from frame to frame. Use this
to build a “people counter”.

Answer for Q(3.5)


import cv2

import numpy as np

from scipy import ndimage


from google.colab.patches import cv2_imshow

import matplotlib.pyplot as plt

# File paths

fg_path = '/content/drive/MyDrive/NITC_Projects/Computer_vision/foreground_tajmahal.mp4'

bg_path = '/content/drive/MyDrive/NITC_Projects/Computer_vision/Taj_mahal_background.mp4'

new_bg_path = '/content/drive/MyDrive/NITC_Projects/Computer_vision/new_background.png'

output_file = '/content/output_video.mp4'

# Load background and foreground videos

cap_bg = cv2.VideoCapture(bg_path)

cap_fg = cv2.VideoCapture(fg_path)

# Load new background image

new_bg = cv2.imread(new_bg_path)

# Parameters

bg_mean = None

threshold = 10 # Adjust threshold as needed

def compute_mean_and_variance(bg_frames):

global bg_mean

# Compute mean over the background frames

frames = [cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) for frame in bg_frames]

stack = np.stack(frames, axis=-1)

bg_mean = np.mean(stack, axis=-1)

def classify_pixels(frame):
gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

diff = np.abs(gray_frame - bg_mean)

foreground_mask = diff > threshold

return foreground_mask

def cleanup_using_morphology(mask):

# Perform morphological operations to clean up the mask

kernel = np.ones((5, 5), np.uint8)

cleaned_mask = cv2.morphologyEx(mask.astype(np.uint8) * 255, cv2.MORPH_CLOSE, kernel)

return cleaned_mask

def label_connected_components(mask):

# Label connected components

labeled_image, num_labels = ndimage.label(mask)

return labeled_image, num_labels

def compute_centroids(labeled_image, num_labels):

# Compute centroids of labeled components

centroids = ndimage.center_of_mass(labeled_image, labeled_image, range(1, num_labels + 1))

return centroids

def track_centroids(frame, centroids):

# Draw centroids on the frame

for centroid in centroids:

x, y = int(centroid[1]), int(centroid[0])

cv2.circle(frame, (x, y), 5, (0, 255, 0), -1)

return frame
# Read a few background frames to compute mean and variance

bg_frames = [cap_bg.read()[1] for _ in range(10)]

compute_mean_and_variance(bg_frames)

frame_indices = [50, 150, 250, 350] # Example frame indices to show

processed_frames = []

for idx in range(max(frame_indices) + 1):

ret_fg, frame_fg = cap_fg.read()

if not ret_fg:

break

if idx in frame_indices:

# Classify pixels as foreground or background

foreground_mask = classify_pixels(frame_fg)

# Compute alpha channel and composite over new background

alpha_channel = foreground_mask.astype(np.uint8) * 255

new_bg_resized = cv2.resize(new_bg, (frame_fg.shape[1], frame_fg.shape[0]))

foreground = cv2.bitwise_and(frame_fg, frame_fg, mask=alpha_channel)

background = cv2.bitwise_and(new_bg_resized, new_bg_resized, mask=cv2.bitwise_not(alpha_channel))

composited_frame = cv2.add(foreground, background)

# Clean up using morphology

cleaned_mask = cleanup_using_morphology(foreground_mask)
# Label connected components, compute centroids, and track

labeled_image, num_labels = label_connected_components(cleaned_mask)

centroids = compute_centroids(labeled_image, num_labels)

tracked_frame = track_centroids(composited_frame, centroids)

# Save processed frame

processed_frames.append((frame_fg, tracked_frame))

# Plot before and after for selected frames

fig, axs = plt.subplots(4, 2, figsize=(12, 16))

for i, (before, after) in enumerate(processed_frames):

axs[i, 0].imshow(cv2.cvtColor(before, cv2.COLOR_BGR2RGB))

axs[i, 0].set_title(f'Frame {frame_indices[i]} Before')

axs[i, 0].axis('off')

axs[i, 1].imshow(cv2.cvtColor(after, cv2.COLOR_BGR2RGB))

axs[i, 1].set_title(f'Frame {frame_indices[i]} After')

axs[i, 1].axis('off')

plt.tight_layout()

plt.show()

# Release video captures

cap_bg.release()

cap_fg.release()
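The optional "people counter" of step 4 is not implemented above; one possible way to extend the per-frame loop, assuming that each sufficiently large connected component corresponds to one person (MIN_AREA is an arbitrary tuning constant), would be:

# Inside the per-frame loop, after computing labeled_image and num_labels:
MIN_AREA = 500  # assumed minimum component size, in pixels
areas = ndimage.sum(cleaned_mask > 0, labeled_image, range(1, num_labels + 1))
people_count = int(np.sum(np.array(areas) >= MIN_AREA))
print(f"Frame {idx}: approximately {people_count} people detected")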
Q(3.6) Write a variety of photo enhancement or effects filters: contrast, solarization (quantization),
etc. Which ones are useful (perform sensible corrections) and which ones are more creative (create
unusual images)?

Answer for Q(3.6)


1. Contrast Adjustment:
- Useful: Enhances the overall visibility of an image by increasing the difference between light
and dark areas. This is particularly useful for correcting underexposed or overexposed images.
2. Solarization (Quantization):
- Creative: This effect inverts the colors of an image at certain intensity levels, creating a
surreal, artistic look. It's more about creating unusual images rather than performing sensible
corrections.
3. Brightness Adjustment:
- Useful: Modifies the overall lightness or darkness of an image. This can help in correcting
images that are too dark or too bright.
4. Sharpening:
- Useful: Enhances the edges within an image, making it appear clearer and more defined. This
is useful for improving the clarity of slightly blurred images.
5. Blurring:
- Creative: Softens the details in an image, which can be used for artistic effects or to reduce
noise. Gaussian blur is a common technique used for this purpose.
6. Sepia Tone:
- Creative: Applies a warm brown tone to the image, giving it a vintage look. This is more
about creating a specific aesthetic rather than correcting the image.
7. Edge Detection:
- Creative: Highlights the edges within an image, often used for artistic effects or to create
outlines. Techniques like the Canny edge detector are commonly used.
8. Histogram Equalization:
- Useful: Improves the contrast of an image by spreading out the most frequent intensity
values. This is particularly useful for enhancing the details in images with poor contrast.
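A minimal sketch of a few of these filters (one corrective, two creative), assuming an 8-bit image loaded with OpenCV; the gain, threshold, and level values are arbitrary examples:

import cv2
import numpy as np

def adjust_contrast(image, gain=1.5, bias=0):
    # Simple linear contrast stretch: out = gain * in + bias, clipped to [0, 255].
    return cv2.convertScaleAbs(image, alpha=gain, beta=bias)

def solarize(image, threshold=128):
    # Invert pixel values above the threshold for a solarization effect.
    return np.where(image >= threshold, 255 - image, image).astype(np.uint8)

def posterize(image, levels=4):
    # Quantize each channel to a small number of levels.
    step = 256 // levels
    return ((image // step) * step).astype(np.uint8)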

Q(3.7) Compute the gray level (luminance) histogram for an image and equalize it so that the tones
look better (and the image is less sensitive to exposure settings). You may want to use the following
steps:
1. Convert the color image to luminance.
2. Compute the histogram, the cumulative distribution, and the compensation transfer function.
3. (Optional) Try to increase the “punch” in the image by ensuring that a certain fraction of pixels
(say, 5%) are mapped to pure black and white.
4. (Optional) Limit the local gain in the transfer function. One way to do this is to limit while
performing the accumulation, keeping any unaccumulated values “in reserve”.
5. Compensate the luminance channel through the lookup table and re-generate the color image
using color ratios.
6. (Optional) Color values that are clipped in the original image, i.e., have one or more saturated
color channels, may appear unnatural when remapped to a non-clipped value. Extend your
algorithm to handle this case in some useful way.
Answer for Q(3.7)
Code:
import numpy as np
import cv2
import matplotlib.pyplot as plt

def rgb_to_luminance(image):
    luminance = 0.299 * image[:, :, 2] + 0.587 * image[:, :, 1] + 0.114 * image[:, :, 0]
    return luminance.astype(np.uint8)

def compute_histogram_and_cdf(luminance):
    histogram, bins = np.histogram(luminance.flatten(), 256, [0, 256])
    cdf = histogram.cumsum()
    cdf_normalized = cdf * (histogram.max() / cdf.max())
    return histogram, cdf, cdf_normalized

def equalize_histogram(luminance, cdf):
    cdf_m = np.ma.masked_equal(cdf, 0)
    cdf_m = (cdf_m - cdf_m.min()) * 255 / (cdf_m.max() - cdf_m.min())
    cdf = np.ma.filled(cdf_m, 0).astype('uint8')
    equalized_luminance = cdf[luminance]
    return equalized_luminance

def reconstruct_color_image(original_image, equalized_luminance):
    R = original_image[:, :, 2]
    G = original_image[:, :, 1]
    B = original_image[:, :, 0]
    ratio_R = R / (0.299 * R + 0.587 * G + 0.114 * B + 1e-6)
    ratio_G = G / (0.299 * R + 0.587 * G + 0.114 * B + 1e-6)
    ratio_B = B / (0.299 * R + 0.587 * G + 0.114 * B + 1e-6)
    new_R = equalized_luminance * ratio_R
    new_G = equalized_luminance * ratio_G
    new_B = equalized_luminance * ratio_B
    new_image = np.stack([new_B, new_G, new_R], axis=2)
    new_image = np.clip(new_image, 0, 255).astype(np.uint8)
    return new_image

def process_image(image):
    luminance = rgb_to_luminance(image)
    hist, cdf, cdf_normalized = compute_histogram_and_cdf(luminance)
    equalized_luminance = equalize_histogram(luminance, cdf)
    equalized_image = reconstruct_color_image(image, equalized_luminance)
    return equalized_image, hist, cdf_normalized

image = cv2.imread("C:\\Users\\aarus\\Desktop\\carr.jpeg")
equalized_image, hist, cdf_normalized = process_image(image)
cv2.imwrite('equalized_image.jpg', equalized_image)
plt.figure()
plt.subplot(121), plt.plot(hist), plt.title('Histogram')
plt.subplot(122), plt.plot(cdf_normalized), plt.title('CDF')
plt.show()
cv2.imshow('Original Image', image)
cv2.imshow('Equalized Image', equalized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Input:

Output:

Answer for Q(3.8)

1. Luminance Computation (from Exercise 3.7):

● First, convert the colour image into a grayscale luminance image. This can be done using
the formula:
L=0.2126*R+0.7152*G+0.0722*B

● Here, R, G, and B are the red, green, and blue colour channels of the image, and L represents the luminance (brightness) channel.

2. Divide the Image into Patches:

● Split the image into small patches (e.g., 8x8 or 16x16 pixels).
● For each patch, compute its own grey-level (luminance) histogram.

3. Bilinear Distribution to Adjacent Vertices:

● For each pixel, distribute its luminance value across adjacent vertices in its patch using
bilinear interpolation. This means that each pixel contributes its value to the nearest
vertices in the patch grid, with weights based on its distance to those vertices.
● If we let a pixel's luminance value be I(x,y) at location (x,y), and the nearest vertices be at
positions (x1,y1) (x2,y2), etc., distribute I(x,y) based on how close the pixel is to each
vertex.

4. Convert to Cumulative Distribution Function (CDF):

● For each patch, convert the histogram into a cumulative distribution function (CDF). The
CDF is essential for mapping luminance values to new values such that the histogram is
equalized.
● The CDF is calculated as CDF(i) = Σ_{j=0}^{i} P(j), where P(j) is the probability of grey level j in the histogram.

5. Interpolate Between Adjacent CDFs:

● To smooth the transition between patches, interpolate the CDFs of adjacent patches using
bilinear interpolation. This ensures that the transition between different regions of the
image is smooth and prevents artifacts from appearing at patch boundaries.
● For a pixel at location (x,y), interpolate its new intensity based on the CDF values of the
surrounding patch vertices.

6. Remap Luminance Values Using Interpolated CDFs:

● After interpolation, use the resulting CDF to remap the luminance values of the pixels in
each patch.
● The remapped luminance L′ for a pixel with luminance L is obtained from the interpolated CDF (rescaled to the output intensity range):

L′ = CDF(L)
● This remaps the luminance values in a way that improves the contrast locally.

7. Reconstruct the Colour Image:

● After adjusting the luminance values, the colour image is regenerated by combining the
adjusted luminance with the original colour ratios. This can be done using the method:

I’ = L’*(I/L)

Where I is the original intensity of the colour channels (R, G, B), and L is the original luminance.

This ensures that the colour proportions are maintained while enhancing the luminance contrast.

8. Optional refinements:

● Apply low-pass filtering to the CDFs to further smooth the transitions between patches.
● Handle colour clipping or saturation by identifying pixels where the colour channels have
saturated values and applying specific rules to these pixels to avoid unnatural colours
after histogram equalization.
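For reference, OpenCV's CLAHE implements essentially this scheme (per-tile histograms with a clip limit and bilinear interpolation between neighbouring tile mappings); a minimal sketch applied to the luminance channel, with arbitrary clipLimit and tileGridSize values:

import cv2

def clahe_luminance(image_bgr, clip_limit=2.0, tile_grid_size=(8, 8)):
    # Equalize only the lightness (L) channel in Lab space, then convert back to BGR.
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)
    l_eq = clahe.apply(l)
    return cv2.cvtColor(cv2.merge([l_eq, a, b]), cv2.COLOR_LAB2BGR)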

Code:

import cv2

import numpy as np

# Step 1: Convert the image to grayscale (luminance calculation)

def convert_to_luminance(image):

return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Step 2: Local histogram equalization function

def local_histogram_equalization(image, patch_size=(8, 8)):

# Divide the image into patches


h, w = image.shape

patches_x = w // patch_size[0]

patches_y = h // patch_size[1]

equalized_image = np.zeros_like(image)

# Loop through each patch

for i in range(patches_y):

for j in range(patches_x):

# Get the current patch

x_start = j * patch_size[0]

y_start = i * patch_size[1]

x_end = (j + 1) * patch_size[0]

y_end = (i + 1) * patch_size[1]

patch = image[y_start:y_end, x_start:x_end]

# Step 3: Compute the histogram and equalize

equalized_patch = cv2.equalizeHist(patch)

# Step 4: Assign the equalized patch back to the output image

equalized_image[y_start:y_end, x_start:x_end] = equalized_patch


return equalized_image

# Step 5: Reapply luminance to original image

def apply_luminance_to_color_image(original_image, luminance_image):

# Split the original image into R, G, B channels

b, g, r = cv2.split(original_image)

# Compute the ratios for each color channel relative to the original luminance

original_luminance = convert_to_luminance(original_image).astype(float) + 1e-10  # Avoid division by zero

luminance_image = luminance_image.astype(float)

b_new = (b / original_luminance) * luminance_image

g_new = (g / original_luminance) * luminance_image

r_new = (r / original_luminance) * luminance_image

# Clip values to stay within valid range [0, 255]

b_new = np.clip(b_new, 0, 255).astype(np.uint8)

g_new = np.clip(g_new, 0, 255).astype(np.uint8)

r_new = np.clip(r_new, 0, 255).astype(np.uint8)

# Merge the channels back


return cv2.merge([b_new, g_new, r_new])

# Load the image

image = cv2.imread('input_image.jpg')

# Convert to grayscale (luminance)

luminance_image = convert_to_luminance(image)

# Apply local histogram equalization

equalized_luminance = local_histogram_equalization(luminance_image)

# Reapply the equalized luminance to the original color image

final_image = apply_luminance_to_color_image(image, equalized_luminance)

# Save or display the result

cv2.imwrite('output_image.jpg', final_image)

cv2.imshow('Equalized Image', final_image)

cv2.waitKey(0)

cv2.destroyAllWindows()

Answer for Q(3.9)


a) In the clamping (replication) method,
f̂(i, j) = f(k, l), where
k = max(0, min(M − 1, i)),
l = max(0, min(N − 1, j)).
Advantages: Simple to implement and effectively preserves the boundary values by duplicating them, which may lead to smoother transitions at the edges.
Disadvantages: 1) Can introduce artifacts and unrealistic boundary behavior, especially when the image has edges at the boundary, leading to less natural-looking results.
2) The inherent assumption that the border value should remain constant may not be realistic for all applications.
b) In the zero-padding method,
f̂(i, j) = f(i, j) if 0 ≤ i < M and 0 ≤ j < N,
f̂(i, j) = 0 otherwise.
Advantages: 1) Maintains the original image size, allowing easy integration with filters and maintaining the dimensionality required in operations like convolution.
2) Reduces border effects by ensuring that pixel values at the edges are not treated differently from interior pixels.
Disadvantages: 1) Can cause artificial discontinuities at the borders, which may lead to undesirable artifacts in the convolution output, especially if the filter interacts heavily with the edges.
2) Requires additional memory and computational resources to handle the increased size of the padded input.
c) In the mirror (reflection) padding method, f̂(i, j) = f(k, l), where
k = −i if i < 0, k = 2(M − 1) − i if i ≥ M, and k = i otherwise;
l = −j if j < 0, l = 2(N − 1) − j if j ≥ N, and l = j otherwise.
Advantages: 1) Preserves the structure of the image by reflecting edge pixels, which results in fewer artifacts compared to zero padding.
2) Creates a smooth transition at the edges of the image, which can help preserve image features and achieve better performance in applications such as image convolution in CNNs.

Disadvantages: Mirrors pixel values, which may introduce unrealistic data, especially if the image has significant features or patterns at the borders, potentially distorting the data representation.

d) In the wrap (cyclic) padding method, f̂(i, j) = f(k, l), where
k = i mod M and l = j mod N.

Advantages: 1) Useful for periodic or cyclic data, as it avoids introducing artificial edges.
2) No information loss at the boundaries, creating a continuous appearance and preventing the
introduction of artificial borders.
3)Can help preserve important features near edges across various applications, including CNNs.
Disadvantages: 1)Can lead to unrealistic artifacts, especially in non-periodic data, causing
discontinuities or misleading visual representations.
2)Computationally more complex to implement as it needs to account for the wraparound nature
of indices.
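Before the OpenGL version below, the four padding modes can also be compared directly with NumPy; a small sketch on a 1-D signal:

import numpy as np

signal = np.array([1, 2, 3, 4, 5])

print(np.pad(signal, 2, mode='edge'))      # clamp:  [1 1 1 2 3 4 5 5 5]
print(np.pad(signal, 2, mode='constant'))  # zero:   [0 0 1 2 3 4 5 0 0]
print(np.pad(signal, 2, mode='reflect'))   # mirror: [3 2 1 2 3 4 5 4 3]
print(np.pad(signal, 2, mode='wrap'))      # wrap:   [4 5 1 2 3 4 5 1 2]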

import glfw
from OpenGL.GL import *
from PIL import Image
import numpy as np

def load_texture(path):
img = Image.open(path)
img = img.transpose(Image.FLIP_TOP_BOTTOM)
img_data = np.array(img, dtype=np.uint8)

texture = glGenTextures(1)
glBindTexture(GL_TEXTURE_2D, texture)
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, img.width, img.height, 0, GL_RGB,
GL_UNSIGNED_BYTE, img_data)

glGenerateMipmap(GL_TEXTURE_2D)
return texture

if not glfw.init():
raise Exception("GLFW can't be initialized")

window = glfw.create_window(800, 600, "Texture Clamping Modes", None, None)

if not window:
glfw.terminate()
raise Exception("GLFW window can't be created")

glfw.make_context_current(window)

glViewport(0, 0, 800, 600)


vertices = [
# positions # texture coords
0.5, 0.5, 0.0, 2.0, 2.0, # Top Right
0.5, -0.5, 0.0, 2.0, 0.0, # Bottom Right
-0.5, -0.5, 0.0, 0.0, 0.0, # Bottom Left
-0.5, 0.5, 0.0, 0.0, 2.0 # Top Left
]

indices = [0, 1, 3, 1, 2, 3]

vertices = np.array(vertices, dtype=np.float32)


indices = np.array(indices, dtype=np.uint32)

VBO = glGenBuffers(1)
VAO = glGenVertexArrays(1)
EBO = glGenBuffers(1)

glBindVertexArray(VAO)

glBindBuffer(GL_ARRAY_BUFFER, VBO)
glBufferData(GL_ARRAY_BUFFER, vertices.nbytes, vertices, GL_STATIC_DRAW)

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, EBO)
glBufferData(GL_ELEMENT_ARRAY_BUFFER, indices.nbytes, indices,
GL_STATIC_DRAW)

glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 5 * vertices.itemsize, ctypes.c_void_p(0))


glEnableVertexAttribArray(0)

glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 5 * vertices.itemsize,


ctypes.c_void_p(12))
glEnableVertexAttribArray(1)
texture = load_texture("path_to_your_texture.jpg")
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR)
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR)

# Wrap mode: try GL_REPEAT, GL_CLAMP_TO_EDGE, or GL_MIRRORED_REPEAT here
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_MIRRORED_REPEAT)
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_MIRRORED_REPEAT)

glClearColor(0.1, 0.1, 0.1, 1.0)

while not glfw.window_should_close(window):


glfw.poll_events()
glClear(GL_COLOR_BUFFER_BIT)
glBindTexture(GL_TEXTURE_2D, texture)
glBindVertexArray(VAO)
glDrawElements(GL_TRIANGLES, len(indices), GL_UNSIGNED_INT, None)
glfw.swap_buffers(window)

glDeleteVertexArrays(1, [VAO])
glDeleteBuffers(1, [VBO])
glDeleteBuffers(1, [EBO])
glfw.terminate()
Observations
GL_REPEAT: The texture will repeat when the texture coordinates exceed [0.0, 1.0]
GL_CLAMP_TO_EDGE: The texture will extend and stretch the edge pixels.
GL_MIRRORED_REPEAT: The texture will mirror itself when the coordinates go beyond [0.0,
1.0].
Answer for Q(3.10)

Here we perform convolution using separable filters on a grayscale image.

We implement the padding mechanisms (zero, replication, cyclic wrap, mirror).

We first perform horizontal convolution followed by vertical convolution; during both convolutions, padding is applied based on the selected mode.

IMPLEMENTATION:

import numpy as np

# Padding function considering the padding mechanisms


def pad_image(image, pad_width, mode='zero'):
M, N = image.shape[:2]
if mode == 'zero':
padded_image = np.pad(image, pad_width, mode='constant', constant_values=0)

elif mode == 'replicate':


padded_image = np.pad(image, pad_width, mode='edge')

elif mode == 'wrap':


padded_image = np.pad(image, pad_width, mode='wrap')

elif mode == 'mirror':


padded_image = np.pad(image, pad_width, mode='reflect')
else:
raise ValueError(f"Unknown padding mode: {mode}")

return padded_image

# Separable convolution function


def separable_convolution(image, h_kernel, v_kernel, padding_mode='zero'):
# Determine padding width
h_pad = len(h_kernel) // 2
v_pad = len(v_kernel) // 2

# Step 1: Pad the image based on the selected padding mode


padded_image = pad_image(image, ((v_pad, v_pad), (h_pad, h_pad)), mode=padding_mode)

# Step 2: Perform horizontal convolution


intermediate_image = np.zeros_like(padded_image)
for i in range(padded_image.shape[0]): # Loop over each row
for j in range(h_pad, padded_image.shape[1] - h_pad): # Only convolve valid pixels
# Convolve the horizontal kernel
intermediate_image[i, j] = np.sum(padded_image[i, j - h_pad:j + h_pad + 1] * h_kernel)

# Step 3: Perform vertical convolution


final_image = np.zeros_like(intermediate_image)
for i in range(v_pad, intermediate_image.shape[0] - v_pad): # Only convolve valid pixels
for j in range(intermediate_image.shape[1]): # Loop over each column
# Convolve the vertical kernel
final_image[i, j] = np.sum(intermediate_image[i - v_pad:i + v_pad + 1, j] * v_kernel)
# Step 4: Crop the result to original image size
cropped_image = final_image[v_pad:-v_pad, h_pad:-h_pad]

return cropped_image

# Example usage
if __name__ == "__main__":
# Example grayscale image (random values)
image = np.random.rand(5, 5)

# Example kernels
h_kernel = np.array([1, 2, 1]) / 4 # Horizontal kernel
v_kernel = np.array([1, 2, 1]) / 4 # Vertical kernel

# Apply separable convolution with different padding modes


result_zero = separable_convolution(image, h_kernel, v_kernel, padding_mode='zero')
result_replicate = separable_convolution(image, h_kernel, v_kernel, padding_mode='replicate')
result_wrap = separable_convolution(image, h_kernel, v_kernel, padding_mode='wrap')
result_mirror = separable_convolution(image, h_kernel, v_kernel, padding_mode='mirror')

print("Original Image:\n", image)


print("Result with Zero Padding:\n", result_zero)
print("Result with Replication Padding:\n", result_replicate)
print("Result with Wrap Padding:\n", result_wrap)
print("Result with Mirror Padding:\n", result_mirror)
CODE EXPLANATION:

1)Padding Mechanisms:

The pad_image function pads the input image based on the selected mode:

1.1) Zero Padding: Fills the out-of-bounds pixels with zeros.


1.2)Replication (Clamping): Replicates edge pixels.
1.3)Wrap (Cyclic): Wraps the image around like a torus.
1.4)Mirror (Reflection): Reflects the image at the boundaries.

2)Separable Convolution:

Step 2.1: Pad the image according to the padding mode.


Step 2.2: Perform horizontal convolution using the horizontal kernel.
Step 2.3: Perform vertical convolution using the vertical kernel.
Step 2.4: Crop the resulting image to match the size of the original image.

3) Efficiency: Since the kernel is separable, we first convolve along the rows and then along the columns, reducing the per-pixel cost from O(K²) to O(2K) operations for a K×K kernel.

Answer for Q(3.11)


1. Sampling the Continuous Gaussian Filter
- Coefficients Summing to 1: When you sample a continuous Gaussian function at discrete
locations, the resulting discrete filter may not exactly sum to 1. This is because the Gaussian
function is infinite in extent, but a discrete filter has a finite number of samples. If the discrete
filter is not normalized, its coefficients will not naturally sum to 1, which can lead to incorrect
signal processing, such as unintended amplification or attenuation of the signal. To ensure the
sum equals 1, you would need to normalize the discrete filter after sampling
- Sampling the Derivative of a Gaussian: If you sample the derivative of a Gaussian, the sum of
the samples should theoretically sum to 0, reflecting the zero mean property of the derivative of a
Gaussian. However, due to finite sampling and truncation, the sum might not be exactly 0. This
can result in non-zero bias and the presence of non-vanishing higher-order moments, which can
distort the processed signal.
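A small numerical check of this point (sampling a unit-area Gaussian and its derivative on a truncated grid, then inspecting the sums):

import numpy as np

sigma = 1.0
x = np.arange(-3, 4)  # truncate the infinite support to roughly [-3*sigma, 3*sigma]

g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
dg = -x / sigma**2 * g  # samples of the Gaussian derivative

print(g.sum())   # close to, but not exactly, 1 -> normalize by g.sum() if an exact sum is needed
print(dg.sum())  # ~0 by symmetry here; truncation or asymmetric sampling can leave a small bias
g_normalized = g / g.sum()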

2. Interpolation and Convolution with a Continuous Gaussian


- Interpolation and Resampling with Sinc: One approach to avoid the inaccuracies from direct
sampling is to first interpolate the original signal using a sinc function, apply the continuous
Gaussian filter, and then prefilter with a sinc before resampling. While this approach could
theoretically yield more accurate results, it is computationally expensive and complex.
Moreover, sinc functions have infinite support, making practical implementation challenging.
- Simpler Frequency Domain Approach: A simpler method in the frequency domain involves
directly applying a Gaussian filter in the Fourier domain. This involves taking the Fourier
transform of the signal, multiplying by the Gaussian filter's frequency response, and then taking
the inverse Fourier transform. This method leverages the convolution theorem and can be more
computationally efficient than time-domain convolution, especially for large signals.

3. Gaussian Frequency Response in the Fourier Domain


- Inverse FFT for Discrete Filter: Producing a Gaussian frequency response in the Fourier
domain and then taking the inverse FFT is a common and effective approach to generate a
discrete filter. This method ensures that the filter exhibits the desired frequency characteristics.
However, one must consider that this method inherently assumes periodicity due to the Fourier
transform's properties, which could introduce artifacts if the signal is not appropriately
windowed or padded.

4. Truncation of the Filter


- Effects on Frequency Response: Truncating the Gaussian filter to create a finite-length
discrete filter alters its frequency response. Specifically, truncation can introduce side lobes and
ripple effects in the frequency domain, due to the sharp cut-off in the time domain (a
phenomenon related to the Gibbs effect). This truncation can result in a less accurate
approximation of the ideal Gaussian filter, and it may introduce artifacts such as ringing or
aliasing, particularly in high-frequency components.

5. Rotational Invariance of 2D Filters


- Rotational Invariance in Discrete Filters: In theory, a continuous Gaussian filter is
rotationally invariant, meaning it looks the same in all directions. However, when implemented
in a discrete grid, especially in two dimensions, the filter loses some of this rotational invariance
due to the discrete sampling and the Cartesian grid's inherent directional bias. This effect is more
pronounced in separable filters, where the 2D filter is applied as two 1D filters along the axes.
- Improving Rotational Invariance: To improve rotational invariance, one might consider using
non-separable filters or increasing the filter size to better approximate the continuous Gaussian's
rotational symmetry. Another approach is to design a filter directly in the frequency domain that
has a circularly symmetric frequency response, which can then be transformed back to the spatial
domain. However, achieving perfect rotational invariance in a discrete filter is generally not
possible due to the limitations of discrete sampling.

Answer for Q(3.12)

Selective Sharpening

Selective sharpening focuses on enhancing the edges and details in specific parts of an image
without affecting the entire image. This technique is particularly useful when you want to
highlight certain features, such as the eyes in a portrait or the texture in a landscape, while
keeping other areas smooth.

● Edge Detection: The filter identifies edges in the image where there is a significant
change in pixel intensity.
● Sharpening: It increases the contrast along these edges, making them appear crisper and
more defined.
● Selective Application: The sharpening effect is applied only to the detected edges, leaving other areas untouched (a sketch of this approach follows below).
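A hedged sketch of one way to implement such selective sharpening (unsharp masking applied only where the gradient magnitude exceeds a threshold; the amount and threshold values are arbitrary):

import cv2
import numpy as np

def selective_sharpen(image, amount=1.0, edge_threshold=50):
    # Unsharp mask: detail = original - blurred.
    blurred = cv2.GaussianBlur(image, (5, 5), 1.0)
    detail = image.astype(np.float32) - blurred.astype(np.float32)

    # Edge mask from the gradient magnitude of the grayscale image.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    edges = (cv2.magnitude(gx, gy) > edge_threshold).astype(np.float32)[..., None]

    # Boost detail only at edge pixels; leave smooth regions untouched.
    sharpened = image.astype(np.float32) + amount * detail * edges
    return np.clip(sharpened, 0, 255).astype(np.uint8)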

Gaussian filter:
Smooths the image by averaging the pixels within a specified kernel size. Makes a weighted
average of the pixels around it using a Gaussian function, which gets rid of noise and details.
# Function to apply Gaussian filter

def manual_gaussian_blur(image, kernel_size=5, sigma=1):

kernel = cv2.getGaussianKernel(kernel_size, sigma)

kernel = np.outer(kernel, kernel)


return cv2.filter2D(image, -1, kernel)

Median filter:
Reduces noise while preserving edges. Changes the value of each pixel to the median value of
the pixels next to it. This gets rid of "salt-and-pepper" noise.

# Function to apply Median filter

def manual_median_blur(image, kernel_size=5):

padded_image = cv2.copyMakeBorder(image, kernel_size//2, kernel_size//2, kernel_size//2, kernel_size//2, cv2.BORDER_REFLECT)

median_blurred = np.zeros_like(image)

for i in range(image.shape[0]):

for j in range(image.shape[1]):

for k in range(image.shape[2]):

median_blurred[i, j, k] = np.median(padded_image[i:i+kernel_size, j:j+kernel_size, k])

return median_blurred

Bilateral :
Smooths the image while preserving edges. Using a weighted average to reduce noise without
blurring lines, it combines information about space and intensity.
def manual_bilateral_filter(image, d=9, sigma_color=75, sigma_space=75):

bilateral_filtered = np.zeros_like(image)

half_d = d // 2

for i in range(image.shape[0]):

for j in range(image.shape[1]):

for k in range(image.shape[2]):

i_min = max(i - half_d, 0)

i_max = min(i + half_d + 1, image.shape[0])


j_min = max(j - half_d, 0)

j_max = min(j + half_d + 1, image.shape[1])

region = image[i_min:i_max, j_min:j_max, k]

distances = np.exp(-((np.arange(i_min, i_max)[:, None] - i)**2 + (np.arange(j_min, j_max) - j)**2) / (2 * sigma_space**2))

color_diffs = np.exp(-(region - image[i, j, k])**2 / (2 * sigma_color**2))

weights = distances * color_diffs

bilateral_filtered[i, j, k] = np.sum(weights * region) / np.sum(weights)

return bilateral_filtered

Sharpening :
Makes the edges and features in the picture look better. It uses a kernel to bring out the
differences between pixels that are close to each other, which makes edges look sharper and
more defined.

# Function to apply Sharpening

def manual_sharpen(image):

kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])

return cv2.filter2D(image, -1, kernel)

Applying above defined filters on a Low resolution/noisy image :

import cv2

import numpy as np
import matplotlib.pyplot as plt

import urllib.request

# Function to download image from URL

def url_to_image(url):

resp = urllib.request.urlopen(url)

image = np.asarray(bytearray(resp.read()), dtype="uint8")

image = cv2.imdecode(image, cv2.IMREAD_COLOR)

return image

# Image URL

image_url = 'https://cdn.clippingpath.in/wp-content/uploads/2017/11/Monalisa.jpg'

image = url_to_image(image_url)

# Apply filters

gaussian_blur = manual_gaussian_blur(image)

median_blur = manual_median_blur(image)

bilateral_filter = manual_bilateral_filter(image)

sharpened_image = manual_sharpen(image)

# Display the images

titles = ['Original Image', 'Gaussian Blur', 'Median Blur', 'Bilateral Filter', 'Sharpened Image']

images = [image, gaussian_blur, median_blur, bilateral_filter, sharpened_image]

plt.figure(figsize=(20, 5))

for i in range(5):

plt.subplot(1, 5, i+1)
plt.imshow(cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB))

plt.title(titles[i])

plt.axis('off')

plt.show()

Result :

Answer for Q(3.13)

Steerable filter Coefficients part:


Freeman and Adelson's (1991) steerable filter algorithm is a method used in image
processing to analyze and enhance images based on directional information. The key idea
behind steerable filters is that they can be adjusted, or "steered," to detect features in
various orientations.

Here’s a breakdown of how it works:

1. Directional Sensitivity: Steerable filters can be oriented in different directions to


detect features like edges, textures, or patterns in specific orientations within an
image.
2. Filter Representation: The filters are represented as a combination of basis
filters. This allows the filter to be easily adjusted to any desired orientation by
combining these basis filters with appropriate coefficients.
3. Efficient Computation: Instead of creating a new filter for each orientation,
steerable filters use a set of basis filters and mathematical functions to compute the
response for any given orientation. This makes the algorithm computationally
efficient.
4. Applications: They are used in various image analysis tasks, including edge
detection, texture analysis, and feature extraction, where orientation-specific
information is crucial.

Overall, Freeman and Adelson’s algorithm provides a powerful and flexible tool for
analyzing images with respect to directional information, making it widely applicable in
computer vision and image processing fields.

Code:
import numpy as np

import cv2

import matplotlib.pyplot as plt

def sobel_filters(image):

Gx = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)

Gy = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)

return Gx, Gy

def steerable_response(Gx, Gy, theta):

theta_rad = np.deg2rad(theta)

response = Gx * np.cos(theta_rad) + Gy * np.sin(theta_rad)

return response
# Each successive imread overwrites the previous one, so only the last image is actually used.
image = cv2.imread('/content/Air_India_Boeing_747-400_takes_off_from_Toronto.jpg', cv2.IMREAD_GRAYSCALE)
image = cv2.imread('/content/color-image.jpg', cv2.IMREAD_GRAYSCALE)
image = cv2.imread('/content/mahindra-thar-roxx-png-image.png', cv2.IMREAD_GRAYSCALE)

if image is None:

print("Error: Could not load the image. Please check the path.")

else:

Gx, Gy = sobel_filters(image)

orientations = [0, 45, 90, 135]

responses = [steerable_response(Gx, Gy, theta) for theta in orientations]

plt.figure(figsize=(12, 8))

plt.subplot(2, 3, 1)

plt.title('Original Image')

plt.imshow(image, cmap='gray')

for i, theta in enumerate(orientations):

plt.subplot(2, 3, i + 2)

plt.title(f'Response at {theta}°')

plt.imshow(responses[i], cmap='gray')
plt.tight_layout()

plt.show()


Input Images:
Output:
Summary of the code:
The code performs edge detection on a grayscale image using Sobel filters and computes
steerable filter responses at various orientations (0°, 45°, 90°, 135°). Here's the summary:
1. Imports: numpy, cv2 (OpenCV), and matplotlib.pyplot for numerical operations,
image processing, and plotting.
2. Sobel Filters: The sobel_filters function calculates the gradients in the x and y
directions (Gx, Gy) using Sobel filters, highlighting edges in the image.
3. Steerable Filter Response: The steerable_response function computes the
response of the image to a steerable filter at any orientation (theta) by combining
Gx and Gy.
4. Image Loading: A grayscale image is loaded, and an error message is shown if it
fails.
5. Gradient Calculation: Sobel filters are applied to compute Gx and Gy.
6. Response Calculation: The filter response is computed for orientations 0°, 45°,
90°, and 135°.
7. Plotting: The original image and the filter responses at each orientation are
displayed using matplotlib.

This code helps visualize how an image responds to edge detection filters at different
angles.

Various order filters part:

This code applies first- and second-order Gaussian derivative filters to multiple images to detect edges, corners, and intersections. The breakdown of the code is as follows:

1. Gaussian Derivative Filters

● The code defines gaussian_derivative_filters() to compute first-order Gaussian


derivative filters that detect horizontal (G0) and vertical (G90) edges.
● The filter size is calculated as size = ceil(3 * sigma) * 2 + 1, where sigma is the
standard deviation for the Gaussian. This ensures the filter captures sufficient
image details.

2. Second-Order Gaussian Derivative Filters

● The code defines second_order_gaussian_filters() to compute second-order


Gaussian derivative filters. These filters help detect corners (Gxx, Gyy) and
intersections (Gxy) by calculating higher-order variations in the pixel intensity.

3. Applying Filters to the Image


● The apply_filters() function applies both the first- and second-order filters to the
input image using convolution.
● The function returns the responses of the filters: G0 and G90 for first-order
responses, and Gxx, Gyy, and Gxy for second-order responses.

4. Testing on Multiple Images

● The test_filters_on_images() function takes a list of image file paths, applies the
filters to each image, and visualizes the results using matplotlib.
● It shows the original image, the first-order filter responses (for horizontal and
vertical edges), and the second-order filter responses (for corners and
intersections).

CODE:

import numpy as np

import cv2

import matplotlib.pyplot as plt

from scipy.ndimage import convolve

import os

def gaussian_derivative_filters(sigma, size):

x = np.arange(-size // 2 + 1, size // 2 + 1)

xx, yy = np.meshgrid(x, x)

G = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))

G = G / np.sum(G)
G0 = -(xx / sigma**2) * G

G90 = -(yy / sigma**2) * G

return G0, G90

def second_order_gaussian_filters(sigma, size):

x = np.arange(-size // 2 + 1, size // 2 + 1)

xx, yy = np.meshgrid(x, x)

G = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))

G = G / np.sum(G) # Normalize

Gxx = (xx**2 / sigma**4 - 1 / sigma**2) * G

Gyy = (yy**2 / sigma**4 - 1 / sigma**2) * G

Gxy = (xx * yy / sigma**4) * G

return Gxx, Gyy, Gxy

def apply_filters(image, sigma):


size = int(np.ceil(sigma * 3) * 2 + 1)

G0, G90 = gaussian_derivative_filters(sigma, size)

G0_response = convolve(image, G0)

G90_response = convolve(image, G90)

Gxx, Gyy, Gxy = second_order_gaussian_filters(sigma, size)

Gxx_response = convolve(image, Gxx)

Gyy_response = convolve(image, Gyy)

Gxy_response = convolve(image, Gxy)

return G0_response, G90_response, Gxx_response, Gyy_response, Gxy_response

def test_filters_on_images(image_paths, sigma=1.5):

for image_path in image_paths:

image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

if image is None:

print(f"Error: Could not load the image from '{image_path}'.")

continue
G0_response, G90_response, Gxx_response, Gyy_response, Gxy_response = apply_filters(image, sigma)

plt.figure(figsize=(14, 8))

plt.subplot(2, 3, 1)

plt.title('Original Image')

plt.imshow(image, cmap='gray')

plt.colorbar()

plt.subplot(2, 3, 2)

plt.title('First-order G0 (Horizontal Edges)')

plt.imshow(G0_response, cmap='gray')

plt.colorbar()

plt.subplot(2, 3, 3)

plt.title('First-order G90 (Vertical Edges)')

plt.imshow(G90_response, cmap='gray')

plt.colorbar()

plt.subplot(2, 3, 4)

plt.title('Second-order Gxx (Corners)')


plt.imshow(Gxx_response, cmap='gray')

plt.colorbar()

plt.subplot(2, 3, 5)

plt.title('Second-order Gxy (Intersections)')

plt.imshow(Gxy_response, cmap='gray')

plt.colorbar()

plt.subplot(2, 3, 6)

plt.title('Second-order Gyy')

plt.imshow(Gyy_response, cmap='gray')

plt.colorbar()

plt.tight_layout()

plt.show()

image_paths = [

r"/content/Air_India_Boeing_747-400_takes_off_from_Toronto.jpg",

r"/content/color-image.jpg",

r"/content/mahindra-thar-roxx-png-image.png",

]
test_filters_on_images(image_paths, sigma=1.5)

Input Images:
Output:
Summary of the Code:
This code applies Gaussian derivative filters to detect edges, corners, and
intersections in grayscale images. It uses both first-order and second-order filters to
identify structural features like edges and corners in the image.

Key Components:

1. Gaussian Derivative Filters:


○ First-order filters (gaussian_derivative_filters()):
■ These filters detect horizontal (G0) and vertical (G90) edges in the
image by computing the first derivatives of the Gaussian function.
○ Second-order filters (second_order_gaussian_filters()):
■ These filters detect corners (Gxx, Gyy) and intersections (Gxy) by
computing the second derivatives of the Gaussian function.
2. Filter Application:
○ The function apply_filters() applies both the first- and second-order filters
to the input image using convolution. The function returns responses for
horizontal edges (G0), vertical edges (G90), corners (Gxx, Gyy), and
intersections (Gxy).
3. Testing on Multiple Images:
○ The test_filters_on_images() function takes a list of image file paths and
applies the filters to each image. It visualizes the following results for each
image:
■ Original image.
■ First-order responses (G0: horizontal edges, G90: vertical edges).
■ Second-order responses (Gxx: corners, Gyy: corners, Gxy:
intersections).
4. Visualization:
○ Matplotlib is used to plot the original image and filter responses side by
side, making it easy to observe the detected edges, corners, and
intersections.

With a smaller sigma, the filters detect small, sharp edges and fine textures, but they also pick up more noise because they are more sensitive to small variations in pixel intensity. With a larger sigma, the filters respond to broader, smoother edges and ignore small-scale detail and noise, which makes them better suited to detecting larger, more gradual changes in intensity across the image.
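As a quick check of this behaviour, the same test harness can be re-run with a small and a large sigma (a usage sketch; it reuses the image_paths list and test_filters_on_images() defined above, and the specific sigma values are only illustrative):

# Sigma sweep: small sigma -> fine, noisy detail; large sigma -> broad, smooth structures
for sigma in (0.8, 1.5, 3.0):
    print(f"--- Filter responses with sigma = {sigma} ---")
    test_filters_on_images(image_paths, sigma=sigma)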

Answer for Q(3.14)


1. Bilateral Filter is a non-linear, edge-preserving, and noise-reducing smoothing filter,
useful for denoising images while maintaining edges.
2. Guided Filter is a fast, edge-preserving filter that uses a guidance image to perform
smoothing. It’s often used in tasks like edge-aware smoothing and detail enhancement.

CODE IMPLEMENTATION:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image


image = cv2.imread(r'C:\Users\harshitha\OneDrive\Pictures\pic1.webp', cv2.IMREAD_COLOR)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # Convert BGR to RGB for displaying with matplotlib
# 1. Bilateral Filter (Edge-preserving smoothing)
def apply_bilateral_filter(image, d, sigma_color, sigma_space):
"""Applies a bilateral filter to the input image."""
print("Applying bilateral filter...")
return cv2.bilateralFilter(image, d, sigma_color, sigma_space)

# 2. Guided Filter (using OpenCV's ximgproc module if available)


def guided_filter(input_image, guide_image, radius, eps):
"""Applies a guided filter using the guide image."""
print("Applying guided filter...")
# OpenCV provides a fast guided filter in the ximgproc module; install via pip install opencv-contrib-python if needed
guide_image = guide_image.astype(np.float32) / 255.0
input_image = input_image.astype(np.float32) / 255.0
return cv2.ximgproc.guidedFilter(guide_image, input_image, radius, eps)

# 3. Noise Reduction using Bilateral Filtering


bilateral_filtered_image = apply_bilateral_filter(image_rgb, d=9, sigma_color=75,
sigma_space=75)

# 4. Guided Filter for Edge-aware Smoothing


if cv2.__version__ >= '4.0.0':
guided_filtered_image = guided_filter(image_rgb, image_rgb, radius=8, eps=0.01)

# Displaying results
plt.figure(figsize=(10, 10))

# Original Image
plt.subplot(1, 3, 1)
plt.imshow(image_rgb)
plt.title('Original Image')
plt.axis('off')

# Bilateral Filtered Image


plt.subplot(1, 3, 2)
plt.imshow(bilateral_filtered_image)
plt.title('Bilateral Filter')
plt.axis('off')

# Guided Filtered Image


if cv2.__version__ >= '4.0.0':
plt.subplot(1, 3, 3)
plt.imshow(guided_filtered_image)
plt.title('Guided Filter')
plt.axis('off')
plt.show()

CODE EXPLANATION:
1.Bilateral Filter:

i. This filter applies edge-preserving smoothing by considering both spatial distance and intensity
difference. It's useful for noise reduction and smooths regions while keeping edges intact.

ii. Parameters:

● d: Diameter of each pixel neighborhood.


● sigma_color: Color sigma, larger values mean more colors in the neighborhood will be
mixed together.
● sigma_space: Spatial sigma, larger values mean farther pixels influence each other.

2.Guided Filter:
i. A fast and efficient method for edge-preserving filtering, where the filtering result is guided by
another image. It helps smooth textures while preserving sharp edges.

ii. The guide_image and input_image are the same in this case, but they can be different in tasks like joint filtering (a small sketch of this is shown after this explanation).

iii. Parameters:

● radius: Defines the filter window.


● eps: Regularization parameter, determines how much detail to preserve.

3.Display:

● The original image, the result of the bilateral filter, and the guided filter are displayed
side by side using Matplotlib for visual comparison.
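As an illustration of point 2-ii above, here is a minimal sketch of joint filtering in which the guide differs from the input: a synthetically noised copy of the image is smoothed while the edge information comes from a clean grayscale guide. It reuses the guided_filter() helper defined above, assumes opencv-contrib-python is installed, and the noise level is only illustrative:

# Joint guided filtering: noisy input, clean single-channel guide
noisy = np.clip(image_rgb.astype(np.float32) + np.random.normal(0, 15, image_rgb.shape), 0, 255).astype(np.uint8)
guide = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2GRAY)  # clean luminance guide
joint_filtered = guided_filter(noisy, guide, radius=8, eps=0.01)  # output is float in [0, 1]

plt.figure(figsize=(8, 4))
plt.subplot(1, 2, 1); plt.imshow(noisy); plt.title('Noisy input'); plt.axis('off')
plt.subplot(1, 2, 2); plt.imshow(np.clip(joint_filtered, 0, 1)); plt.title('Joint guided filter'); plt.axis('off')
plt.show()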

Answer for Q(3.15)

Fourier Transform Properties

1. Linearity (Superposition) Property

This property states that the Fourier transform of a linear combination of functions
is the same linear combination of their Fourier transforms. If f(t) and g(t) are two
time-domain signals with Fourier transforms F(ω) and G(ω), and a and b are
constants, then:
F{a f(t)+b g(t)}=a F{f(t)}+b F{g(t)}

Proof:

F{af(t)+bg(t)}=∫−∞∞(a f(t)+b g(t)) e−jωtdt

By linearity of the integral:

=a ∫−∞∞ f(t) e−jωt dt + b ∫−∞∞ g(t) e−jωt dt = a F{f(t)} + b F{g(t)}

2. Time Shift Property

Shifting a signal in time corresponds to multiplying its Fourier transform by a


complex exponential. If f(t) has a Fourier transform F(ω), then the time-shifted
function f(t−t0) has a Fourier transform:

F{f(t−t0)}=e−jωt0F(ω)

Proof:

F{f(t−t0)}=∫−∞∞f(t−t0)e−jωtdt

Let τ=t−t0:

=∫−∞∞ f(τ) e−jω(τ+t0) dτ = e−jωt0 ∫−∞∞ f(τ) e−jωτ dτ = e−jωt0 F(ω)

3. Time Reversal (Flip) Property

If f(t) has a Fourier transform F(ω)then the time-reversed function f(−t) has a
Fourier transform:

F{f(−t)}= F(−ω)

Proof:

F{f(−t)}=∫−∞∞f(−t)e−jωtdt

Let τ=−t, so that dt=-dτ:


=∫−∞∞f(τ)ejωτ(−dτ)

=∫−∞∞f(τ)ejωτdτ

= F(−ω)

4. Convolution Property

The Fourier transform of the convolution of two time-domain signals is the product
of their individual Fourier transforms. If f(t) and g(t) have Fourier transforms F(ω)
and G(ω), then:

F{f(t)∗g(t)}=F(ω)G(ω)

Proof: The convolution of f(t) and g(t) is defined as:

(f∗g)(t)=∫−∞∞f(τ)g(t−τ)dτ

Taking the Fourier transform:

F{(f∗g)(t)}=∫−∞∞(∫−∞∞f(τ)g(t−τ)dτ)e−jωt dt

Interchanging the order of integration:

=∫−∞∞f(τ)(∫−∞∞g(t−τ)e−jωtdt)dτ

Let t′= t−τ, so that dt=dt′:

=∫−∞∞f(τ)e−jωτ(∫−∞∞g(t′)e−jωt′dt′)dτ

= F(ω)G(ω)
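This property has a direct discrete analogue that is easy to verify numerically: circular convolution of two sequences equals the inverse DFT of the product of their DFTs. The following NumPy sketch (not part of the proof above) checks this on random data:

import numpy as np

# Check: circular convolution == IDFT of the product of DFTs
N = 256
rng = np.random.default_rng(0)
f = rng.standard_normal(N)
g = rng.standard_normal(N)

conv_direct = np.array([sum(f[m] * g[(n - m) % N] for m in range(N)) for n in range(N)])
conv_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real

print(np.allclose(conv_direct, conv_fft))  # expected: True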

5. Correlation Property

The Fourier transform of the correlation of two functions is the product of one
Fourier transform and the complex conjugate of the other. For signals f(t) and g(t):

F{(f⋆g)(t)}=F∗(ω)G(ω)
Proof: The correlation is defined as:

(f⋆g)(t)=∫−∞∞f∗(τ)g(t+τ)dτ

Taking the Fourier transform and using similar steps as


convolution, but with f∗(τ), leads to:

F{(f⋆g)(t)}=F∗(ω)G(ω)

6. Multiplication Property

The Fourier transform of the product of two time-domain signals is the convolution
of their Fourier transforms. If f(t) and g(t) have Fourier transforms F(ω) and G(ω),
then:

F{f(t)g(t)}=1/2π∫−∞∞F(ξ)G(ω−ξ)dξ

Proof:

F{f(t)g(t)}=∫−∞∞f(t)g(t)e−jωtdt

Expanding f(t) and g(t) in terms of their inverse Fourier transforms and then
interchanging integrals yields the convolution.

7. Differentiation in Time Domain

Differentiation of a signal in the time domain corresponds to multiplication by jω


in the frequency domain. If f(t) has a Fourier transform F(ω), then:

F{dnf(t)/dtn}=(jω)nF(ω)

Proof (for n=1):

F{df(t)/dt}=∫−∞∞df(t)/dt e−jωtdt

Using integration by parts:

=[f(t)e−jωt]−∞∞+jω∫−∞∞f(t)e−jωtdt
Since the boundary terms vanish:

=jωF(ω)

8. Scaling in Time Domain

If f(t) has a Fourier transform F(ω), then scaling the time variable by a results in
the Fourier transform being scaled in frequency and divided by ∣a∣:

F{f(at)}=1/∣a∣ F(ω/a)

Proof:

F{f(at)}=∫−∞∞f(at)e−jωtdt

Let τ=at, so that dt=dτ/a :

=1/a∫−∞∞f(τ)e−j(ω/a)τdτ

For a>0:

=1/a F(ω/a)

9. Fourier Transform of a Real Signal

If f(t) is real, then its Fourier transform satisfies the conjugate symmetry property:

F(−ω)=F∗(ω)

This follows directly from the fact that for a real signal f(t), the imaginary part of
the Fourier transform must cancel out when performing the inverse Fourier
transform, ensuring that f(t) remains real.

Proof:

Given the Fourier transform definition:

F(ω)=∫−∞∞f(t)e−jωtdt

If f(t) is real, then:


F∗(ω) = ∫−∞∞ f(t) ejωt dt = F(−ω)

Thus, the Fourier transform of a real signal satisfies


F(−ω)=F∗(ω), which shows that the real part of F(ω) is an
even function of ω, and the imaginary part is an odd
function of ω.

10. Parseval's Theorem

Parseval's Theorem states that the total energy of a signal in the time domain is
equal to the total energy in the frequency domain. For a signal f(t) with Fourier
transform F(ω), the theorem can be written as:

∫−∞∞ ∣f(t)∣² dt = 1/2π ∫−∞∞ ∣F(ω)∣² dω

Proof:

The energy of the signal in the time domain is given by:

∫−∞∞ ∣f(t)∣² dt

Using the inverse Fourier transform of f(t), we substitute:

f(t)=1/2π ∫−∞∞F(ω)ejωtdω

Thus:

∫−∞∞ ∣f(t)∣² dt = ∫−∞∞ ∣ 1/2π ∫−∞∞ F(ω) ejωt dω ∣² dt

Expanding and simplifying using orthogonality properties of the exponential


functions leads to:

= 1/2π ∫−∞∞ ∣F(ω)∣² dω

This proves that the energy is preserved between the time and frequency domains.
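A discrete analogue of Parseval's theorem holds for the DFT (the sum of ∣f[n]∣² equals 1/N times the sum of ∣F[k]∣²) and can be verified numerically with a short NumPy sketch (not part of the proof above):

import numpy as np

# Check: sum |f[n]|^2 == (1/N) * sum |F[k]|^2 for the DFT
rng = np.random.default_rng(1)
f = rng.standard_normal(1024)
F = np.fft.fft(f)

energy_time = np.sum(np.abs(f) ** 2)
energy_freq = np.sum(np.abs(F) ** 2) / len(f)
print(np.isclose(energy_time, energy_freq))  # expected: True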
Fourier Transform Pairs

1. Impulse Function (Dirac Delta Function)

The Dirac delta function δ(t) is defined as:

δ(t)=0 for t≠0,∫−∞∞δ(t)dt=1

The Fourier transform of δ(t) is:

F{δ(t)}=∫−∞∞δ(t)e−jωtdt

Proof:

We know that the Dirac delta function has the sifting property:

∫−∞∞δ(t−t0)f(t)dt=f(t0)
For the Fourier transform of δ(t), we can apply this property with f(t)=e−jωt:

∫−∞∞ δ(t) e−jωt dt = e−jω⋅0 = 1

Thus, the Fourier transform of the delta function is:

F{δ(t)}= 1

Fourier Pair:

δ(t)⟷1

2. Shifted Impulse

The shifted impulse function δ(t−t0) is defined as:

δ(t−t0)=0 for t≠t0,∫−∞∞δ(t−t0)dt=1

The Fourier transform of δ(t−t0) is:

F{δ(t−t0)}=∫−∞∞δ(t−t0)e−jωtdt

Proof:

Applying the shifting property of the Dirac delta function:

∫−∞∞δ(t−t0)e−jωtdt=e−jωt0

Thus, the Fourier transform of the shifted delta function is:

F{δ(t−t0)}=e−jωt0

Fourier Pair:

δ(t−t0)⟷e−jωt0
3. Box Filter (Rectangular Pulse)

The rectangular pulse, or box filter, is defined as:

rect(t/T) = { 1, ∣t∣ ≤ T/2
0, ∣t∣ > T/2 }

The Fourier transform of the rectangular pulse is:

F{rect(t/T)}=∫−∞∞rect(t/T)e−jωtdt

Proof:

For rect(t/T), the function is non-zero only for ∣t∣≤T/2,so


we integrate over that interval:

F{rect(t/T)}=∫−T/2T/2e−jωtdt

This is a standard integral:

= [e−jωt/(−jω)] evaluated from t = −T/2 to t = T/2 = (e−jωT/2 − ejωT/2)/(−jω)

Using Euler's formula ejx − e−jx = 2j sin(x), we get:

= −2j sin(ωT/2)/(−jω) = 2 sin(ωT/2)/ω

Thus, the Fourier transform of the rectangular pulse is:

F{rect(t/T)}=T sinc(ωT/2)

where sinc(x)=sin(x)/x.

Fourier Pair:

rect(t/T)⟷T sinc(ωT/2)
4. Tent Function (Triangular Pulse)

The tent function, or triangular pulse, is the convolution of two rectangular pulses:

tri(t)=rect(t)∗rect(t)

The Fourier transform of tri(t) is:

F{tri(t)}=sinc2(ω/2)

Proof:

By the convolution theorem, the Fourier transform of a convolution is the product


of the Fourier transforms:

F{rect(t)∗rect(t)}=F{rect(t)}⋅F{rect(t)}

We already know that the Fourier transform of rect(t) is sinc(ω/2), so:

F{tri(t)}=sinc(ω/2)⋅sinc(ω/2)=sinc2(ω/2)

Fourier Pair:

tri(t)⟷sinc2(ω/2)

5. Gaussian Function

The Gaussian function is defined as:

f(t)=e−t2/(2σ2)

The Fourier transform of a Gaussian is also a Gaussian:

F{e−t2/(2σ2)}=σsqrt(2π)e−σ2ω2/2

Proof:

We start by applying the definition of the Fourier transform:


F{e−t2/(2σ2)}=∫−∞∞e−t2/(2σ2)e−jωtdt

To solve this, complete the square in the exponent:

e−t2/(2σ2) e−jωt = e−(t2+2jωσ2t)/(2σ2)

Completing the square in t gives:

= e−(t+jσ2ω)2/(2σ2) e−σ2ω2/2

This allows us to transform the integral into a Gaussian integral, which results in:

F{e−t2/(2σ2)}=σ sqrt(2π)e−σ2ω2/2

Fourier Pair:

e−t2/(2σ2)⟷σ sqrt(2π) e−σ2ω2/2

6. Laplacian of Gaussian (LoG)

The Laplacian of Gaussian (LoG) is defined as the second derivative of the


Gaussian function:

LoG(t)=d2/dt2(e−t2/(2σ2))


Fourier Transform of LoG:

The Fourier transform of the Laplacian of Gaussian can be derived using the
property that differentiation in the time domain corresponds to multiplication by
jω in the frequency domain. The Fourier transform of the Gaussian function is
already known:
F{e−t2/(2σ2)}=σ sqrt(2π) e−σ2ω2/2

Since the second derivative in the time domain is equivalent to multiplying by
−ω2 in the frequency domain, the Fourier transform of the Laplacian of
Gaussian is:

F{d2/dt2(e−t2/(2σ2))}=−ω2⋅σ sqrt(2π) e−σ2ω2/2

Proof:

1. First, recall the differentiation property of Fourier transforms:

F{d2/dt2 f(t)}=(jω)2F{f(t)}=−ω2F{f(t)}

2. Apply this property to the Gaussian function:

F{d2/dt2(e−t2/(2σ2))}=−ω2⋅σ sqrt(2π) e−σ2ω2/2

Thus, the Fourier transform of the Laplacian of Gaussian is:

F{LoG(t)}=−ω2⋅σ sqrt(2π) e−σ2ω2/2

Fourier Pair:

d2/dt2(e−t2/(2σ2))⟷−ω2⋅σ sqrt(2π) e−σ2ω2/2

7. Gabor Function

The Gabor function is essentially a Gaussian modulated by a sinusoidal function.


It's widely used in signal processing, especially for time-frequency analysis.

Function:
f(t)=e−t2/(2σ2)⋅ejω0t

where e−t2/(2σ2) is a Gaussian, and ejω0t is a complex sinusoid.

Fourier Transform of Gabor Function:

We already know the Fourier transform of a Gaussian. Now we need to handle the
modulation by ejω0t, which corresponds to a frequency shift in the frequency
domain.

By the modulation property of the Fourier transform:

F{e−t2/(2σ2)ejω0t}=F{e−t2/(2σ2) }shifted by ω0

Thus, the Fourier transform of the Gabor function is:

F{e−t2/(2σ2)⋅ejω0t}=σ sqrt(2π) e−σ2(ω−ω0)2/2

Proof:

1. Use the known Fourier transform of a Gaussian:

F{e−t2/(2σ2)}=σ sqrt(2π) e−σ2ω2/2

2. Apply the modulation property:

F{e−t2/(2σ2)⋅ejω0t}=F{e−t2/(2σ2)} shifted by ω0

Thus, the Fourier transform of the Gabor function is:

F{e−t2/(2σ2)⋅ejω0t}=σ sqrt(2π) e−σ2(ω−ω0)2/2

Fourier Pair:

e−t2/(2σ2)⋅ejω0t ⟷ σ sqrt(2π) e−σ2(ω−ω0)2/2
8. Unsharp Mask

The Unsharp Mask is a simple image sharpening technique that involves


subtracting a blurred version of an image from the original image. In terms of
signal processing, it's represented as:

f(t)=δ(t)−e−t2/(2σ2)

The unsharp mask enhances high-frequency components by emphasizing the


difference between the original signal and its smoothed (Gaussian) version.

Fourier Transform of Unsharp Mask:

We already know the Fourier transforms of δ(t) and e−t2/(2σ2) . Using linearity, the
Fourier transform of the unsharp mask is:

F{δ(t)−e−t2/(2σ2)}=1−σ sqrt(2π) e−σ2ω2/2

Proof:

1. The Fourier transform of δ(t) is 1.

2. The Fourier transform of e−t2/(2σ2) is σ sqrt(2π) e−σ2ω2/2.

3. Using linearity of the Fourier transform, we get:


F{δ(t)−e−t2/(2σ2)}=1−σ sqrt(2π) e−σ2ω2/2

Fourier Pair:

δ(t)−e−t2/(2σ2)⟷1−σ sqrt(2π) e−σ2ω2/2

9. Windowed Sinc
The Windowed Sinc function is a sinc function multiplied by a window function
(usually rectangular) to control sidelobes in frequency analysis or filtering. It is
defined as:

Windowed Sinc(t)=sinc(t)⋅rect(t/T)

The Fourier transform of the windowed sinc function is a smoothed rectangular


function.

Fourier Transform of Windowed Sinc:

The sinc function has a well-known Fourier transform. Applying the multiplication
property in the time domain results in a convolution in the frequency domain.

Proof:

1. The Fourier transform of sinc(t) is rect(ω).

2. The Fourier transform of rect(t/T) is T⋅sinc(ωT/2).

3. By the convolution theorem, multiplying two functions in the time


domain is equivalent to convolving their Fourier transforms:

F{sinc(t)⋅rect(t/T)}=rect(ω)∗T⋅sinc(ωT/2)

Answer for Q 3.16

Introduction

This involves implementing and evaluating various image filters for resizing operations,
focusing on both magnification and minification. The goal is to apply and compare the effects of
different filters, such as the windowed sinc filter and Gaussian filter, on synthetic and natural
images to understand their performance and visual quality.
Code Explanation

1. Filter Creation and Application:


- sinc_filter(size, cutoff): Generates a windowed sinc filter to perform high-quality resampling.
This filter reduces artifacts like ringing by applying a Hamming window.
- apply_filter(img, filt): Applies the created filter to an image using convolution, which
smooths the image according to the filter's characteristics.
- apply_gaussian_filter(image, ksize, sigma): Applies a Gaussian blur to the image to smooth
it, which can be useful for reducing noise and achieving a softening effect.

2. Image Resizing:
- resize_image(image, scale, interpolation): Resizes an image by scaling it up or down using
bilinear interpolation (or other methods if specified). This helps in understanding how different
filters perform during magnification or minification.

3. Synthetic Chirp Image:


- generate_chirp_image(size): Creates a synthetic chirp image with varying frequency patterns,
which helps in testing the filtering and resizing algorithms on images with different spatial
frequencies.

4. Saving and Displaying Results:


- Saves processed images (both resized and filtered) to disk for evaluation.
- Displays animations of the images using Matplotlib to visually compare the effects of
different filters and resizing operations. This provides an interactive way to assess filter
performance.

5. Visualization:
- Uses show_animation() to create and display animations of processed images, allowing for an
effective visual comparison of the different filtering and resizing techniques.
This approach helps in analyzing the quality and efficiency of various filters applied to image
resizing tasks, offering insights into their merits and deficiencies.

Code:
import numpy as np
import cv2
from scipy.signal.windows import hamming
from matplotlib import pyplot as plt
import matplotlib.animation as animation

# 1. Windowed Sinc Filter


def sinc_filter(size, cutoff):
"""Creates a windowed sinc filter."""
print("Creating windowed sinc filter...")
x = np.arange(-size // 2 + 1., size // 2 + 1.)
sinc_func = np.sinc(2 * cutoff * x)
window = hamming(size) # Apply Hamming window to reduce ringing
sinc_filter = sinc_func * window
return sinc_filter / np.sum(sinc_filter) # Normalize filter

# 2. Applying the filter using convolution


def apply_filter(img, filt):
"""Applies the filter using convolution."""
print("Applying filter using convolution...")
return cv2.filter2D(img, -1, filt)

# 3. Gaussian Filter
def apply_gaussian_filter(image, ksize, sigma):
"""Applies a Gaussian filter to the image."""
print("Applying Gaussian filter...")
return cv2.GaussianBlur(image, (ksize, ksize), sigma)

# 4. Bilinear and Bicubic Interpolation for Resizing


def resize_image(image, scale, interpolation=cv2.INTER_LINEAR):
"""Resizes an image by a given scale factor using the specified interpolation."""
print(f"Resizing image by a scale factor of {scale}...")
return cv2.resize(image, None, fx=scale, fy=scale, interpolation=interpolation)

# 5. Generating Synthetic Chirp Image


def generate_chirp_image(size):
"""Creates a synthetic chirp image with varying frequency."""
print("Generating synthetic chirp image...")
x = np.linspace(0, 1, size)
y = np.linspace(0, 1, size)
xx, yy = np.meshgrid(x, y)
chirp_image = np.sin(2 * np.pi * (xx**2 + yy**2)) # quadratic phase gives the spatially varying (chirp) frequency
chirp_image = ((chirp_image + 1) * 127.5).astype(np.uint8) # Normalize to 0-255
return chirp_image

# Save the synthetic chirp image


print("Saving synthetic chirp image...")
chirp_image = generate_chirp_image(256)
cv2.imwrite('chirp_image.png', chirp_image)

# Load a natural image for testing (replace 'path_to_image' with actual image path)
print("Loading natural image...")
image = cv2.imread('C:/Users/fawaz/Downloads/IMG_8680.jpg', cv2.IMREAD_GRAYSCALE)
if image is None:
raise ValueError("Image not found or could not be loaded.")

# Print original image details


print(f"Original image shape: {image.shape}")
print(f"Original image dtype: {image.dtype}")

# Display the original image


plt.figure(figsize=(8, 8))
plt.title('Original Image')
plt.imshow(image, cmap='gray')
plt.axis('off')
plt.show()

# 6. Resize images (magnification and minification)


print("Resizing images...")
magnified_image = resize_image(image, 2.0) # 2x magnification
minified_image = resize_image(image, 0.5) # 0.5x minification

# Applying filters to the resized images

# 7. Applying Gaussian filter


print("Applying Gaussian filter to resized images...")
gaussian_filtered_magnified = apply_gaussian_filter(magnified_image, 15, 2)
gaussian_filtered_minified = apply_gaussian_filter(minified_image, 15, 2)

# 8. Applying Windowed Sinc filter


print("Applying windowed sinc filter to resized images...")
sinc_filt = sinc_filter(31, 0.1) # Windowed sinc filter with cutoff frequency
sinc_filtered_magnified = apply_filter(magnified_image, sinc_filt)
sinc_filtered_minified = apply_filter(minified_image, sinc_filt)

# Save results for both magnified and minified images


print("Saving processed images...")
cv2.imwrite('magnified_image.png', magnified_image)
cv2.imwrite('minified_image.png', minified_image)
cv2.imwrite('gaussian_filtered_magnified.png', gaussian_filtered_magnified)
cv2.imwrite('gaussian_filtered_minified.png', gaussian_filtered_minified)
cv2.imwrite('sinc_filtered_magnified.png', sinc_filtered_magnified)
cv2.imwrite('sinc_filtered_minified.png', sinc_filtered_minified)

# Print details about the saved images


print("Saved images:")
print(" - magnified_image.png")
print(" - minified_image.png")
print(" - gaussian_filtered_magnified.png")
print(" - gaussian_filtered_minified.png")
print(" - sinc_filtered_magnified.png")
print(" - sinc_filtered_minified.png")

# 9. Visualizing and Comparing Filters

def show_animation(images, title, delay=100):


"""Display animation of images with Matplotlib."""
print(f"Displaying animation for {title}...")
fig = plt.figure()
ims = []
for img in images:
im = plt.imshow(img, animated=True, cmap='gray')
ims.append([im])
ani = animation.ArtistAnimation(fig, ims, interval=delay, blit=True, repeat_delay=1000)
plt.title(title)
plt.axis('off')
plt.show()

# Create animation for magnified images


images_to_animate_magnify = [magnified_image, gaussian_filtered_magnified,
sinc_filtered_magnified]
show_animation(images_to_animate_magnify, "Magnified Images")

# Create animation for minified images


images_to_animate_minify = [minified_image, gaussian_filtered_minified,
sinc_filtered_minified]
show_animation(images_to_animate_minify, "Minified Images")

Answer for Q3.17

Image pyramids are a fundamental concept in image processing and computer vision. They
involve creating progressively lower resolution representations of an image to facilitate
multi-scale analysis.

This technique is particularly useful for tasks like object detection, image blending, texture
analysis, and optical flow.

Types of Image Pyramids:


Gaussian Pyramid: This is a multi-resolution representation of an image in which the
image is repeatedly smoothed (blurred) and downsampled. Each layer is a smaller and
smoother version of the previous layer.
Laplacian Pyramid: This is derived from the Gaussian pyramid and represents the
difference between successive levels of the Gaussian pyramid. It captures the details (high
frequencies) that are lost between scales.

Steps in Constructing a Pyramid:

1. Blurring the Image (Filtering)

2. Downsampling (Decimation)

Filters Used in Image Pyramids:

1. 2x2 Block Filtering: The simplest downsampling technique. It reduces the image by
averaging the pixel values in a 2x2 block and keeps only one pixel from the block.

2. Burt and Adelson's Binomial Kernel:

Kernel: 1/16(1,4,6,4,1)

This is a 5-tap separable filter that approximates a Gaussian filter. It smooths the
image before downsampling to avoid aliasing.

3. 7-Tap or 9-Tap Filters: These filters use a larger kernel (e.g., 7 or 9-tap) for
smoothing. They provide better filtering by considering more pixel values, ensuring a
high-quality downsampled image.

Decimation and Aliasing

When we downsample an image (reduce its resolution), we must ensure that the image is
first smoothed (filtered). Without proper filtering, high-frequency components from the
original image may cause aliasing, a visual distortion where unwanted patterns or artifacts
appear in the downsampled image. This is why the choice of filter plays an essential role in
the quality of the pyramid.

Comparison of Filters

● Block filtering is the simplest but may cause artifacts due to aliasing.
● Burt and Adelson's binomial kernel strikes a good balance between computational
cost and image quality, reducing high-frequency artifacts without being too costly.
● High-quality 7 or 9-tap filters provide the best image quality, especially for detailed images, but they are computationally more expensive (their frequency responses are compared in the sketch below).
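One way to make this comparison concrete is to plot the 1-D frequency response of each smoothing kernel; the band above 0.25 cycles/sample is the one that aliases after 2x downsampling. This is a small sketch (the 2-tap box stands in for 2x2 block averaging, and the 7-tap weights are the same illustrative ones used in the code below):

import numpy as np
import matplotlib.pyplot as plt

kernels = {
    '2-tap box': np.array([1, 1]) / 2.0,
    '5-tap binomial': np.array([1, 4, 6, 4, 1]) / 16.0,
    '7-tap binomial': np.array([1, 6, 15, 20, 15, 6, 1]) / 64.0,
}
freqs = np.linspace(0, 0.5, 256)  # cycles per sample, up to Nyquist
for name, k in kernels.items():
    H = np.abs(np.array([np.sum(k * np.exp(-2j * np.pi * f * np.arange(len(k)))) for f in freqs]))
    plt.plot(freqs, H, label=name)
plt.axvline(0.25, linestyle='--', color='gray')  # frequencies above this alias after 2x decimation
plt.xlabel('frequency (cycles/sample)')
plt.ylabel('|H(f)|')
plt.legend()
plt.show()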
CODE:

import cv2

import numpy as np

import matplotlib.pyplot as plt

# Load an example image (grayscale or color)

image = cv2.imread('image_path', cv2.IMREAD_GRAYSCALE)

# Function to display images

def display_image(image, title="Image"):

plt.imshow(image, cmap='gray')

plt.title(title)

plt.axis('off')

plt.show()

# 2x2 Block Filter

def block_filter(image, levels):

pyramid = [image]

for i in range(levels):

h, w = pyramid[-1].shape[:2]

downsampled = cv2.resize(pyramid[-1], (w // 2, h // 2), interpolation=cv2.INTER_AREA) # true 2x2 block averaging

pyramid.append(downsampled)

return pyramid

# Burt and Adelson's Binomial Kernel

def binomial_filter(image, levels):

binomial_kernel = np.array([1, 4, 6, 4, 1]) / 16.0

pyramid = [image]

for i in range(levels):
filtered = cv2.sepFilter2D(pyramid[-1], -1, binomial_kernel, binomial_kernel)

downsampled = filtered[::2, ::2] # subsample after the binomial smoothing (pyrDown would blur a second time)

pyramid.append(downsampled)

return pyramid

# High-Quality 7 or 9-tap Filter

def high_quality_filter(image, levels):

# You can create a 7-tap filter for example

high_quality_kernel = np.array([1, 6, 15, 20, 15, 6, 1]) / 64.0

pyramid = [image]

for i in range(levels):

filtered = cv2.sepFilter2D(pyramid[-1], -1, high_quality_kernel, high_quality_kernel)

downsampled = filtered[::2, ::2] # subsample after the 7-tap smoothing (pyrDown would blur a second time)

pyramid.append(downsampled)

return pyramid

# Function to shift the image by n pixels

def shift_image(image, shift_x, shift_y):

rows, cols = image.shape

matrix = np.float32([[1, 0, shift_x], [0, 1, shift_y]])

shifted_image = cv2.warpAffine(image, matrix, (cols, rows))

return shifted_image

# Visualizing the pyramid levels

def visualize_pyramids(pyramids, title_prefix):

for i, level in enumerate(pyramids):

display_image(level, title=f"{title_prefix} Level {i}")


# Set the number of pyramid levels

levels = 4

# Generate and visualize pyramids for each filter

block_pyramid = block_filter(image, levels)

binomial_pyramid = binomial_filter(image, levels)

high_quality_pyramid = high_quality_filter(image, levels)

# Display the results

visualize_pyramids(block_pyramid, "Block Filter Pyramid")

visualize_pyramids(binomial_pyramid, "Binomial Filter Pyramid")

visualize_pyramids(high_quality_pyramid, "High-Quality Filter Pyramid")

# Compare images with shifted inputs

shifted_image = shift_image(image, 2, 2)

block_pyramid_shifted = block_filter(shifted_image, levels)

visualize_pyramids(block_pyramid_shifted, "Shifted Block Filter Pyramid")

Answer for Q 3.18

Steps for Blending Two Images

1. Construct Laplacian Pyramids:

Gaussian Pyramid: Create a Gaussian pyramid for each image by applying


Gaussian blur and downsampling.
Laplacian Pyramid: Calculate Laplacian images by subtracting the
upsampled version of the next Gaussian level from the current Gaussian level.

2. Create Mask Pyramids: Construct Gaussian pyramids for the binary


mask and its complement.
3. Blend Using Masks:

Multiply each Laplacian level by the corresponding mask from the mask
pyramids.

Sum the weighted Laplacian images to get the blended pyramid.

4. Reconstruct Final Image:Reconstruct the image from the blended


Laplacian pyramid by iteratively upsampling and adding.

Generalization for n Images

1. Pyramids for n Images:Construct Gaussian and Laplacian pyramids for


each of the n images.
2. Mask Pyramids:Create Gaussian pyramids for masks corresponding to
each label (1 through n).
3. Weighted Summation: Multiply each Laplacian pyramid by its
corresponding mask and sum them. Ensure mask weights sum to 1 for
correct blending.
4. Reconstruction:Reconstruct the final image from the combined
Laplacian pyramid.

Python Code:

import cv2

import numpy as np

def build_pyramid(image, levels, mode='gaussian'):

pyramid = [image]

for _ in range(levels - 1):


image = cv2.pyrDown(image) if mode == 'gaussian' else cv2.pyrUp(image)

pyramid.append(image)

return pyramid

def build_laplacian_pyramid(gaussian_pyramid):

laplacian_pyramid = []

for i in range(len(gaussian_pyramid) - 1):

gaussian_expanded = cv2.pyrUp(gaussian_pyramid[i + 1])


laplacian_pyramid.append(cv2.subtract(gaussian_pyramid[i], gaussian_expanded))

laplacian_pyramid.append(gaussian_pyramid[-1])

return laplacian_pyramid

def blend_pyramids(lap_pyramids, mask_pyramids):

blended_pyramid = []

levels = len(lap_pyramids[0])

for i in range(levels):

blended = np.zeros_like(lap_pyramids[0][i], dtype=np.float64)

total_mask = np.zeros_like(lap_pyramids[0][i], dtype=np.float64)

for lap, masks in zip(lap_pyramids, mask_pyramids):

m = masks[i][..., np.newaxis] if lap[i].ndim == 3 else masks[i] # broadcast the 2-D mask over the color channels

blended += lap[i] * m

total_mask += m

# Avoid division by zero

total_mask[total_mask == 0] = 1

blended /= total_mask

blended_pyramid.append(blended)

return blended_pyramid
def reconstruct_image(pyramid):

image = pyramid[-1]

for level in reversed(pyramid[:-1]):

image = cv2.pyrUp(image)

image = cv2.add(image, level)

return image

def main(image_files, mask_files, output_file, levels=4):

# Read images and masks

images = [cv2.imread(file) for file in image_files]

masks = [cv2.imread(file, cv2.IMREAD_GRAYSCALE) / 255.0 for file in mask_files]

# Construct pyramids for all images

gaussian_pyramids = [build_pyramid(image, levels) for image in images]

laplacian_pyramids = [build_laplacian_pyramid(pyramid) for pyramid in


gaussian_pyramids]

# Construct pyramids for masks

mask_pyramids = [build_pyramid(mask, levels) for mask in masks]

# Blend pyramids

blended_pyramid = blend_pyramids(laplacian_pyramids, mask_pyramids)

# Reconstruct the final blended image

blended_image = reconstruct_image(blended_pyramid)

# Save the output image

cv2.imwrite(output_file, np.clip(blended_image, 0, 255).astype(np.uint8))

# Example usage

image_files = ['image1.jpg', 'image2.jpg', 'image3.jpg']


mask_files = ['mask1.png', 'mask2.png', 'mask3.png']

output_file = 'blended_image.jpg'

main(image_files, mask_files, output_file)

In the weighted summation stage, the weights (from the mask) should ideally
sum to 1 at each pixel location to avoid the need for renormalization. If the
sum of weights is not 1, you would need to normalize the weights to ensure the
final result is correctly balanced. This can be handled by adjusting the mask
values or scaling the final blended image.
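A minimal sketch of that renormalization step, assuming the per-label masks have been stacked into a single float array of shape (n_labels, H, W) (the function name is just illustrative):

import numpy as np

def normalize_masks(masks, eps=1e-8):
    # Force the per-pixel weights to sum to 1 before the weighted Laplacian summation
    total = masks.sum(axis=0, keepdims=True)
    return masks / np.maximum(total, eps)  # eps avoids division by zero in empty regions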

Use Cases

● Exposure Fusion: Blend images taken with different exposures to create


an image with well-exposed regions.
● Creative Blends: Combine different scenes creatively, blending elements
seamlessly.

This approach provides a flexible and effective way to blend images while
preserving details and transitions.

Answer for Q 3.19)


i) The goal of this question is to implement pyramid blending using PyTorch: build Gaussian and Laplacian pyramids with fixed (non-trained) convolution kernels, blend them with a mask pyramid, and reconstruct the final image. The exercise aims to familiarize us with PyTorch's deep learning primitives without training convolution weights (a minimal sketch is given below).
ii) The blended output is visually similar to the result produced by the OpenCV-based pipeline in Exercise 3.18.
iii) Using PyTorch for this exercise allows leveraging GPU acceleration and seamless integration with other deep learning tasks. However, for simple image processing tasks, traditional libraries like OpenCV are usually more straightforward and efficient.
iv) The choice of API therefore depends on the specific use case.
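Since the answer above only summarizes the approach, here is a minimal hedged sketch of what the PyTorch version can look like: a fixed 5x5 binomial (Burt–Adelson) kernel is applied with F.conv2d for blurring, downsampling and upsampling use slicing and F.interpolate, and the blend mirrors the OpenCV pipeline from Exercise 3.18. All function names and parameters are illustrative, and the nearest-neighbour upsample followed by a blur is only an approximation of the classical expand step:

import torch
import torch.nn.functional as F

def binomial_kernel(channels):
    # Fixed 5x5 binomial kernel, one copy per channel for a grouped (depthwise) convolution
    k = torch.tensor([1., 4., 6., 4., 1.])
    k2d = torch.outer(k, k)
    k2d = k2d / k2d.sum()
    return k2d.expand(channels, 1, 5, 5).clone()

def gauss_blur(x):
    c = x.shape[1]
    w = binomial_kernel(c).to(dtype=x.dtype, device=x.device)
    return F.conv2d(x, w, padding=2, groups=c)

def downsample(x):
    return gauss_blur(x)[:, :, ::2, ::2]              # blur, then keep every second sample

def upsample(x, size):
    return gauss_blur(F.interpolate(x, size=size, mode='nearest'))

def laplacian_pyramid(x, levels):
    pyr = []
    for _ in range(levels - 1):
        down = downsample(x)
        pyr.append(x - upsample(down, x.shape[-2:]))   # band-pass residual
        x = down
    pyr.append(x)                                      # low-pass residual at the top
    return pyr

def blend(img_a, img_b, mask, levels=4):
    # img_a, img_b: (1, C, H, W) float tensors; mask: (1, 1, H, W) with values in [0, 1]
    la = laplacian_pyramid(img_a, levels)
    lb = laplacian_pyramid(img_b, levels)
    gm = [mask]
    for _ in range(levels - 1):
        gm.append(downsample(gm[-1]))                  # Gaussian pyramid of the mask
    blended = [m * a + (1 - m) * b for a, b, m in zip(la, lb, gm)]
    out = blended[-1]
    for lap in reversed(blended[:-1]):
        out = upsample(out, lap.shape[-2:]) + lap      # collapse the pyramid
    return out

# Usage (illustrative): convert an HxWx3 float image in [0, 1] with
# torch.from_numpy(img).permute(2, 0, 1)[None].float(), a HxW mask with
# torch.from_numpy(mask)[None, None].float(), then call blend(a, b, m).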
Answer for Q(3.20)
Step-by-Step Implementation
1. Construct the Laplacian Pyramid
The Laplacian pyramid is constructed from a Gaussian pyramid. Each level of the Laplacian
pyramid is derived by subtracting the upsampled version of the next level's Gaussian image from
the current level's Gaussian image.
2. Apply Local Contrast Manipulation
Contrast is manipulated at each level of the pyramid using a pointwise transformation function
that depends on local intensities and their gradient magnitudes. The local Laplacian technique
allows contrast manipulation that preserves edges by modifying these values differently at each
level.
3. Reconstruct the Image
After manipulating each level of the Laplacian pyramid, reconstruct the final image by
progressively adding the modified levels back together.

Python Code
import cv2

import numpy as np

def gaussian_pyramid(image, levels):

"""Construct the Gaussian pyramid of an image."""

pyramid = [image]

for _ in range(levels - 1):

image = cv2.pyrDown(image) # Downsample the image

pyramid.append(image)

return pyramid

def laplacian_pyramid(gaussian_pyramid):

"""Construct the Laplacian pyramid from a Gaussian pyramid."""

laplacian_pyr = []
for i in range(len(gaussian_pyramid) - 1):

size = (gaussian_pyramid[i].shape[1], gaussian_pyramid[i].shape[0])

gaussian_expanded = cv2.pyrUp(gaussian_pyramid[i + 1], dstsize=size)

laplacian = cv2.subtract(gaussian_pyramid[i], gaussian_expanded)

laplacian_pyr.append(laplacian)

laplacian_pyr.append(gaussian_pyramid[-1]) # Add the top level of the Gaussian pyramid

return laplacian_pyr

def manipulate_contrast(laplacian_pyramid, alpha=1.0, beta=0.0):

"""Apply local contrast manipulation on each level of the Laplacian pyramid."""

manipulated_pyramid = []

for layer in laplacian_pyramid:

# Adjust contrast and brightness; alpha controls contrast, beta controls brightness

manipulated = cv2.convertScaleAbs(layer, alpha=alpha, beta=beta)

manipulated_pyramid.append(manipulated)

return manipulated_pyramid

def reconstruct_image(manipulated_pyramid):

"""Reconstruct the final image from the manipulated Laplacian pyramid."""

image = manipulated_pyramid[-1]

for i in range(len(manipulated_pyramid) - 2, -1, -1):

size = (manipulated_pyramid[i].shape[1], manipulated_pyramid[i].shape[0])

image = cv2.pyrUp(image, dstsize=size)

image = cv2.add(image, manipulated_pyramid[i])

return image

def local_laplacian_filter(image, levels=5, alpha=1.0, beta=0.0):

"""Main function to apply local Laplacian filtering."""


# Step 1: Construct Gaussian pyramid

gaussian_pyr = gaussian_pyramid(image, levels)

# Step 2: Construct Laplacian pyramid

laplacian_pyr = laplacian_pyramid(gaussian_pyr)

# Step 3: Manipulate contrast locally at each pyramid level

manipulated_pyr = manipulate_contrast(laplacian_pyr, alpha, beta)

# Step 4: Reconstruct the final image

final_image = reconstruct_image(manipulated_pyr)

return final_image

# Example usage

image = cv2.imread('input_image.jpg') # Load your input image

levels = 5 # Number of pyramid levels

alpha = 1.2 # Contrast manipulation parameter

beta = 0.0 # Brightness adjustment parameter

output_image = local_laplacian_filter(image, levels, alpha, beta)

cv2.imwrite('output_image.jpg', output_image) # Save the result

Explanation of Key Functions


• gaussian_pyramid: Builds the Gaussian pyramid by repeatedly downsampling the image.
• laplacian_pyramid: Constructs the Laplacian pyramid by subtracting the upsampled
version of each Gaussian level from the current level.
• manipulate_contrast: Applies contrast manipulation at each level, preserving edges while
adjusting local intensities.
• reconstruct_image: Reconstructs the final image from the manipulated Laplacian pyramid
by progressively adding upscaled pyramid levels.
Parameters
• alpha: Adjusts the contrast of the manipulated Laplacian layers.
• beta: Adjusts the brightness, allowing for tone manipulation.
Testing and Fine-Tuning
• Test the implementation on images with varying levels of detail and contrast.
• Fine-tune the parameters alpha and beta based on the desired effect, as these control the
intensity of contrast and brightness adjustments.
Answer for Q 3.21
Step 1: Implement a Wavelet Family
We will implement a wavelet decomposition based on the separable 2D wavelet transform
described in Section 3.5.4 using high-pass and low-pass filters. For simplicity, we can use one of
the predefined wavelet families like Daubechies, Coiflets, or Haar wavelets from the PyWavelets
library.

To implement the wavelet transform:

Load the image (grayscale or color).


Apply the wavelet decomposition: This will split the image into different frequency bands (sub-
bands) such as low-low (LL), low-high (LH), high-low (HL), and high-high (HH).
Reconstruct the image using inverse wavelet transform after modification
(compression/denoising).
For this exercise, we use PyWavelets to implement the wavelet decomposition:

import pywt
import numpy as np
import matplotlib.pyplot as plt
from skimage import data, img_as_float

# Load an example image


image = img_as_float(data.camera())

# Perform a 2D wavelet decomposition using Daubechies wavelet ('db1')


coeffs = pywt.wavedec2(image, 'db1', level=3)
cA = coeffs[0] # approximation (LL) at the coarsest level
cH, cV, cD = coeffs[1] # detail sub-bands (LH, HL, HH) at the coarsest level

# Visualize the decomposed components


plt.figure(figsize=(8, 8))
plt.subplot(221), plt.imshow(cA, cmap='gray'), plt.title('Approximation (LL)')
plt.subplot(222), plt.imshow(cH, cmap='gray'), plt.title('Horizontal (LH)')
plt.subplot(223), plt.imshow(cV, cmap='gray'), plt.title('Vertical (HL)')
plt.subplot(224), plt.imshow(cD, cmap='gray'), plt.title('Diagonal (HH)')
plt.show()

This code decomposes the image into different sub-bands using the Daubechies wavelet (db1).
The result is four bands: approximation (low-low), horizontal (low-high), vertical (high-low),
and diagonal (high-high).

Step 2: Implement Laplacian Pyramid (Exercise 3.17)


Now, we need to construct an image pyramid (as asked in Exercise 3.17) with separable filters.
The Laplacian pyramid is created by subtracting an image from its lower-resolution version.

We'll use the Burt and Adelson’s binomial kernel for this:
import cv2
import numpy as np

# Create a Gaussian pyramid for the input image


def gaussian_pyramid(image, levels):
pyramid = [image]
for i in range(levels):
image = cv2.pyrDown(image)
pyramid.append(image)
return pyramid

# Create a Laplacian pyramid by subtracting Gaussian levels


def laplacian_pyramid(image, levels):
g_pyr = gaussian_pyramid(image, levels)
l_pyr = [g_pyr[-1]]
for i in range(levels, 0, -1):
size = (g_pyr[i-1].shape[1], g_pyr[i-1].shape[0])
g_up = cv2.pyrUp(g_pyr[i], dstsize=size)
laplacian = cv2.subtract(g_pyr[i-1], g_up)
l_pyr.append(laplacian)
return l_pyr

# Load the grayscale image


image = cv2.imread('image.png', 0)

# Construct a 3-level Laplacian pyramid


lap_pyr = laplacian_pyramid(image, 3)

# Display the Laplacian pyramid levels


for i, lap in enumerate(lap_pyr):
plt.subplot(1, 4, i+1)
plt.imshow(lap, cmap='gray')
plt.title(f'Laplacian Level {i}')
plt.show()

Step 3: Apply to Compression or Denoising


Next, we will choose one of the tasks: compression or denoising. Let's perform denoising by
thresholding small wavelet coefficients (coring):

Denoising Task:

After performing the wavelet decomposition, we threshold small wavelet coefficients to zero.
This is known as coring and can be done using a piecewise linear function that sets small values
to zero.
Threshold and Reconstruct:
# Set a threshold to zero out small coefficients (coring): keep the approximation band
# and soft-threshold every detail sub-band
threshold = 0.1
coeffs_thresh = [coeffs[0]] + [
    tuple(pywt.threshold(c, threshold, mode='soft') for c in detail)
    for detail in coeffs[1:]
]

# Reconstruct the image from the thresholded coefficients


image_denoised = pywt.waverec2(coeffs_thresh, 'db1')

# Display the denoised image


plt.imshow(image_denoised, cmap='gray')
plt.title('Denoised Image')
plt.show()

Comparison of Techniques:
Wavelet Decomposition: Highly effective at preserving edges and textures due to its multi-scale,
multi-orientation properties.
Laplacian Pyramid: Can be overcomplete and more prone to shift-variance, but simpler for
multi-resolution image processing.
In denoising, wavelets often perform better than the Laplacian pyramid because they preserve the
important features of the image while efficiently removing noise. This is due to the multi-level
decomposition and better localization of frequency components.

Conclusion:
For denoising, wavelets typically outperform Laplacian pyramids due to their ability to localize
features across scales and orientations. For compression, wavelets also tend to be more efficient
in retaining the important features while allowing for better compression ratios due to tighter
frequency localization.
Answer for Ex3.22:

1. Affine Transformation

Here's a Python code for affine transformation:


import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load an image
image = cv2.imread('image.jpg')
rows, cols, ch = image.shape

# Affine transformation matrix


pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[10, 100], [200, 50], [100, 250]])

# Get affine transform matrix


M = cv2.getAffineTransform(pts1, pts2)

# Perform the affine transformation


affine_image = cv2.warpAffine(image, M, (cols, rows))
# Show the image
plt.subplot(121), plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB)),
plt.title('Input')
plt.subplot(122), plt.imshow(cv2.cvtColor(affine_image,
cv2.COLOR_BGR2RGB)), plt.title('Affine Transform')
plt.show()

Result:

2. Perspective Transformation

Perspective transformations handle how 3D objects are projected onto 2D


surfaces and allow for more flexible transformations, like changes in
perspective and vanishing points.
code:
import numpy as np
import cv2
import matplotlib.pyplot as plt

image = cv2.imread('image.jpg')
rows, cols, ch = image.shape
# Four points from input image
pts1 = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])

# Corresponding points in the output image


pts2 = np.float32([[0, 0], [300, 0], [0, 300], [300, 300]])

# Get the perspective transformation matrix


M = cv2.getPerspectiveTransform(pts1, pts2)

# Perform the perspective transformation


perspective_image = cv2.warpPerspective(image, M, (cols, rows))

# Show the image


plt.subplot(121), plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB)),
plt.title('Input')
plt.subplot(122), plt.imshow(cv2.cvtColor(perspective_image,
cv2.COLOR_BGR2RGB)), plt.title('Perspective Transform')
plt.show()
Result:

3. Bilinear Interpolation

We can use bilinear interpolation during image transformations to get smoother


results.
Discussion on MIP-mapping and Anisotropy
MIP-mapping:
MIP-mapping is a texture filtering technique that precomputes multiple levels of
textures at different resolutions. When applying a MIP-map:

Selecting the coarser level: When you choose a lower resolution level in the
MIP-map, you are essentially sampling from a smaller texture, which leads to
fewer details being preserved, resulting in a blurrier image.
Selecting the finer level: If you sample from a higher resolution level, you might
be picking too many samples in a small space, leading to aliasing (jagged lines
or noise).
Tri-linear MIP-mapping: This method interpolates between two adjacent
MIP-map levels. Blending between a blurred (coarser) image and an aliased
(finer) image can result in better visual quality because it balances the amount
of blur and aliasing. However, it may not completely solve the problem in all
cases.

Anisotropic Filtering:
When the ratio of horizontal and vertical resampling rates becomes very
different, MIP-mapping tends to fail because it doesn't consider directional
sampling. For example, when rendering textures that are viewed at steep angles
(like roads or walls), the texture can become distorted.

Solutions for anisotropic filtering problems:

Anisotropic filtering: This technique samples more along the axis that has a
higher resampling rate (the axis where more distortion would occur). This
results in better quality, especially when textures are viewed from oblique
angles.
Elliptical Weighted Average (EWA): This method adapts the kernel used for
texture filtering based on the sampling rate of both horizontal and vertical axes,
thus reducing aliasing and blur for anisotropic cases.
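As a small numeric illustration of the level-selection step discussed above, the standard isotropic rule picks the MIP level from the larger of the two per-axis footprints, which is exactly why strongly anisotropic views end up over-blurred along one axis. A sketch (the function name and the log2-of-footprint rule are the usual textbook formulation, not code from this exercise):

import numpy as np

def mip_levels(scale_x, scale_y):
    # scale_x, scale_y: how many source texels map to one screen pixel along each axis
    rho = max(scale_x, scale_y)             # isotropic footprint = the worse of the two axes
    level = max(0.0, np.log2(rho))          # fractional MIP level
    lo, hi = int(np.floor(level)), int(np.ceil(level))
    return lo, hi, level - np.floor(level)  # two levels to fetch and the tri-linear blend weight

print(mip_levels(4.0, 1.0))  # strongly anisotropic: picks level ~2, over-blurring the low-rate axis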

Visual Quality Comparison of Interpolants:


When experimenting with different interpolants (like bilinear, nearest neighbor,
bicubic), you can evaluate their visual quality as follows:

Nearest neighbor: Fast but can result in pixelated images, especially for larger
transformations.
Bilinear interpolation: Smooth results, but can still blur fine details.
Bicubic interpolation: Provides smoother results compared to bilinear, but
computationally more expensive.

we can change the interpolation method


code:
import numpy as np
import cv2
import matplotlib.pyplot as plt

image = cv2.imread('image.jpg')
rows, cols, ch = image.shape
pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[10, 100], [250, 50], [100, 300]]) # increased skew for a more visible effect

# Get affine transform matrix


M = cv2.getAffineTransform(pts1, pts2)

# Use nearest neighbor interpolation


affine_image_nn = cv2.warpAffine(image, M, (cols, rows),
flags=cv2.INTER_NEAREST)

# Use bilinear interpolation (default)


affine_image_bilinear = cv2.warpAffine(image, M, (cols, rows),
flags=cv2.INTER_LINEAR)

# Use bicubic interpolation


affine_image_bicubic = cv2.warpAffine(image, M, (cols, rows),
flags=cv2.INTER_CUBIC)

# Display the results


plt.figure(figsize=(12, 4))
plt.subplot(131), plt.imshow(cv2.cvtColor(affine_image_nn,
cv2.COLOR_BGR2RGB)), plt.title('Nearest Neighbor')
plt.subplot(132), plt.imshow(cv2.cvtColor(affine_image_bilinear,
cv2.COLOR_BGR2RGB)), plt.title('Bilinear')
plt.subplot(133), plt.imshow(cv2.cvtColor(affine_image_bicubic,
cv2.COLOR_BGR2RGB)), plt.title('Bicubic')
plt.show()

Result:

We can then visually compare the images to assess their quality.
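Beyond the visual check, one simple quantitative proxy is to shrink and re-enlarge the image with each interpolant and measure the PSNR against the original (a sketch reusing the image variable loaded above; the 0.25 scale factor is arbitrary):

# Round-trip (downscale then upscale) each interpolant and compare PSNR to the original
methods = {'nearest': cv2.INTER_NEAREST, 'bilinear': cv2.INTER_LINEAR, 'bicubic': cv2.INTER_CUBIC}
for name, flag in methods.items():
    small = cv2.resize(image, None, fx=0.25, fy=0.25, interpolation=flag)
    restored = cv2.resize(small, (image.shape[1], image.shape[0]), interpolation=flag)
    print(name, cv2.PSNR(image, restored))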

Answer to Qn 3.23

Part 1:
import cv2
import numpy as np
from scipy.interpolate import Rbf
import matplotlib.pyplot as plt

# Load the image


image = cv2.imread('input_image.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # Convert image to RGB

# Define function to show image


def show_image(image, title="Image"):
plt.imshow(image)
plt.title(title)
plt.axis('off')
plt.show()

# Select points to be displaced (you can choose more points interactively in practice)
src_points = np.array([[100, 100], [200, 200], [300, 300], [400, 400]], dtype=np.float32) # example points
dst_points = np.array([[150, 100], [200, 150], [250, 300], [450, 400]], dtype=np.float32) # displaced points

# Visualize original points


image_with_points = image.copy()
for pt in src_points:
cv2.circle(image_with_points, tuple(pt.astype(int)), 5, (255, 0, 0), -1)
show_image(image_with_points, "Original Points")

# Define interpolation function (using Thin Plate Spline or RBF)


def interpolate_warp(src, dst, img_shape):
# Get grid of coordinates in the image
grid_x, grid_y = np.meshgrid(np.arange(img_shape[1]), np.arange(img_shape[0]))

# Interpolation using Radial Basis Function (RBF)


rbf_x = Rbf(src[:, 0], src[:, 1], dst[:, 0], function='thin_plate')
rbf_y = Rbf(src[:, 0], src[:, 1], dst[:, 1], function='thin_plate')

# Warp coordinates for the entire image


map_x = rbf_x(grid_x, grid_y).astype(np.float32)
map_y = rbf_y(grid_x, grid_y).astype(np.float32)

return map_x, map_y

# Interpolate the sparse displacement field to create a dense motion field


map_x, map_y = interpolate_warp(src_points, dst_points, image.shape[:2])

# Apply the warp to the image


warped_image = cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)

# Show the warped image


show_image(warped_image, "Warped Image")

Part 2:

import cv2
import numpy as np
import matplotlib.pyplot as plt
# Beier-Neely Warp Function
def beier_neely_warp(image, src_lines, dst_lines):
def calc_u_v(x, y, P, Q):
PQ = Q - P
PQ_len = np.linalg.norm(PQ)
PQ_normalized = PQ / PQ_len if PQ_len != 0 else np.array([0, 0])

u = ((x - P[0]) * PQ_normalized[0] + (y - P[1]) * PQ_normalized[1]) / PQ_len


v = ((x - P[0]) * -PQ_normalized[1] + (y - P[1]) * PQ_normalized[0])

return u, v

rows, cols, _ = image.shape


warped_image = np.zeros_like(image, dtype=np.float64) # float accumulator to avoid uint8 overflow during summation
weight_sum = np.zeros((rows, cols), dtype=np.float32)

for i, (src_line, dst_line) in enumerate(zip(src_lines, dst_lines)):


P_src, Q_src = src_line
P_dst, Q_dst = dst_line
PQ_src = Q_src - P_src
PQ_dst = Q_dst - P_dst

for y in range(rows):
for x in range(cols):
u, v = calc_u_v(x, y, P_src, Q_src)

new_x = P_dst[0] + u * (Q_dst[0] - P_dst[0]) - v * (Q_dst[1] - P_dst[1])


new_y = P_dst[1] + u * (Q_dst[1] - P_dst[1]) + v * (Q_dst[0] - P_dst[0])

if 0 <= new_x < cols and 0 <= new_y < rows:


warped_image[y, x] += image[int(new_y), int(new_x)]
weight_sum[y, x] += 1

# Normalize warped image by weight sum


warped_image[weight_sum > 0] /= weight_sum[weight_sum > 0][:, None]

return warped_image.astype(np.uint8)

# Load the image


image = cv2.imread('input_image.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Example source and destination lines (start and end points)


src_lines = np.array([[[100, 100], [200, 200]], [[300, 300], [400, 400]]], dtype=np.float32)
dst_lines = np.array([[[120, 110], [220, 190]], [[320, 320], [410, 390]]], dtype=np.float32)

# Apply Beier-Neely Warp


warped_image = beier_neely_warp(image, src_lines, dst_lines)

# Show original and warped image


plt.subplot(1, 2, 1)
plt.imshow(image)
plt.title('Original Image')

plt.subplot(1, 2, 2)
plt.imshow(warped_image)
plt.title('Warped Image')
plt.show()

Part 3:

import cv2
import numpy as np
import matplotlib.pyplot as plt

def create_grid(image, grid_size=(10, 10)):


h, w = image.shape[:2]
grid_x = np.linspace(0, w, grid_size[1])
grid_y = np.linspace(0, h, grid_size[0])
return np.meshgrid(grid_x, grid_y)

def deform_grid(grid, move_idx, displacement):


grid_deformed = [g.copy() for g in grid] # deep-copy each coordinate array (a plain list copy would be shallow)
grid_deformed[0][move_idx] += displacement[0] # X displacement
grid_deformed[1][move_idx] += displacement[1] # Y displacement
return grid_deformed

def warp_image(image, grid, deformed_grid):


map_x, map_y = np.meshgrid(np.arange(image.shape[1]), np.arange(image.shape[0]))

map_x = map_x.astype(np.float32)
map_y = map_y.astype(np.float32)

for i in range(len(grid[0].flatten())):
x_src, y_src = grid[0].flatten()[i], grid[1].flatten()[i]
x_dst, y_dst = deformed_grid[0].flatten()[i], deformed_grid[1].flatten()[i]

map_x[int(y_src), int(x_src)] = x_dst


map_y[int(y_src), int(x_src)] = y_dst

warped_image = cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)


return warped_image

# Load image and create a control grid


image = cv2.imread('input_image.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

grid = create_grid(image, grid_size=(10, 10))

# Define which grid point to move and by how much


move_idx = (5, 5) # Grid point index (row, col)
displacement = np.array([30, 20]) # X and Y displacement

deformed_grid = deform_grid(grid, move_idx, displacement)


warped_image = warp_image(image, grid, deformed_grid)

# Show the original and warped image


plt.subplot(1, 2, 1)
plt.imshow(image)
plt.title('Original Image')

plt.subplot(1, 2, 2)
plt.imshow(warped_image)
plt.title('Warped Image')
plt.show()

Part 4:

Proving Whether the Beier–Neely Warp Reduces to Point-Based Deformation:

Yes, the Beier–Neely warp reduces to a point-based deformation as the line segments become shorter.
This is because when the length of the line segments approaches zero, the line segments effectively
reduce to points. In this case, the interpolation formula collapses to a point-based interpolation, where the
deformation is driven purely by the individual points.

The main reason for this is that the influence of the lines is based on their length and proximity. As lines
get smaller, their effect becomes localised around the points they connect, resulting in behaviour similar
to point-based warping techniques.
Exercise 3.25:
Code:
import cv2

import numpy as np

def read_images(img1_path, img2_path):

img1 = cv2.imread(img1_path)

img2 = cv2.imread(img2_path)

return img1, img2

def convert_to_rgb(img):

if len(img.shape) == 2: # Grayscale image

img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)

elif img.shape[2] == 4: # Image with alpha channel

img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)

return img

def resize_images(img1, img2):

# Resize img2 to the size of img1

img2_resized = cv2.resize(img2, (img1.shape[1], img1.shape[0]))

return img1, img2_resized

def warp_image(img, pts1, pts2):

matrix = cv2.getAffineTransform(pts1, pts2) # Affine transform using 3 points

warped_img = cv2.warpAffine(img, matrix, (img.shape[1], img.shape[0]))

return warped_img
def morph_images(img1, img2, alpha):

# Ensure the images have the same number of channels

if img1.shape[2] != img2.shape[2]:

if img1.shape[2] == 1: # Grayscale to RGB

img1 = cv2.cvtColor(img1, cv2.COLOR_GRAY2RGB)

if img2.shape[2] == 1: # Grayscale to RGB

img2 = cv2.cvtColor(img2, cv2.COLOR_GRAY2RGB)

return cv2.addWeighted(img1, alpha, img2, 1 - alpha, 0)

def main():

# Paths to your input images

img1_path = 'C:\\Users\\mohan\\Downloads\\Desktop\\male.jpeg' # Replace with the first image path

img2_path = 'C:\\Users\\mohan\\Downloads\\Desktop\\fem.webp' # Replace with the second image path

img1, img2 = read_images(img1_path, img2_path)

# Convert images to RGB

img1 = convert_to_rgb(img1)

img2 = convert_to_rgb(img2)

# Get correspondences (manually defined or through feature detection)

pts1 = np.float32([[50, 50], [200, 50], [50, 200]]) # Example points in img1

pts2 = np.float32([[60, 60], [220, 50], [60, 220]]) # Corresponding points in img2

# Warp images towards each other

warped_img1 = warp_image(img1, pts1, pts2)

warped_img2 = warp_image(img2, pts2, pts1)


# Resize images to the same size

warped_img1, warped_img2 = resize_images(warped_img1, warped_img2)

# Adjust alpha blending ratio

alpha = 0.5 # Blend ratio between 0 and 1

# Morph the images by cross-dissolving

morphed_image = morph_images(warped_img1, warped_img2, alpha)

# Save or display the result

cv2.imwrite('morphed_image.jpg', morphed_image)

cv2.imshow('Morphed Image', morphed_image)

cv2.waitKey(0)

cv2.destroyAllWindows()

if __name__ == "__main__":

main()

Explanation:

Main Functions and Their Operations


1. read_images(img1_path, img2_path):
○ Purpose: Load two images from specified paths.
○ Operations: Uses cv2.imread() to read each image from its path and
returns the two images.
2. convert_to_rgb(img):
○ Purpose: Ensure images are in RGB format, handling grayscale or
images with alpha channels.
○ Operations: Converts images to RGB if they are in grayscale, or
strips the alpha channel if present.
3. resize_images(img1, img2):
○ Purpose: Resize the second image to match the dimensions of the
first, ensuring they can be morphed together.
○ Operations: Uses cv2.resize() to match the dimensions of img2 to
img1.
4. warp_image(img, pts1, pts2):
○ Purpose: Apply an affine transformation to an image based on
specified point correspondences.
○ Operations: Calculates the affine transform matrix using
cv2.getAffineTransform() and applies it using cv2.warpAffine().
5. morph_images(img1, img2, alpha):
○ Purpose: Blend two images together using a specified alpha value.
○ Operations: Blends the images using cv2.addWeighted(), which
linearly combines images based on the alpha value.

Workflow in main() Function

● Image Loading: Calls read_images() to load the two specified images.


● Color Standardization: Converts both images to RGB format to avoid
issues during processing.
● Correspondences Setup: Manually sets points that correspond between the
two images for warping.
● Image Warping: Warps both images toward each other using predefined
points.
● Image Resizing: Ensures both warped images have the same dimensions.
● Image Morphing: Blends the warped images using a specified mix ratio
(alpha).
● Display and Save: Shows the final morphed image and saves it to a file.

Key Points:

● Point Correspondences: Critical for accurately warping the images so that


similar features align.
● Alpha Blending: Controls the mix of the two images, allowing for a gradual
transition or morph between them.
● Image Formats: Handling different image formats (grayscale, RGB, with or
without alpha) ensures the code is robust and can handle various input types.

This process showcases a combination of image processing techniques to achieve a


visually appealing result, commonly used in graphics and multimedia applications.

Output:

Answer for Q 3.26


import cv2

import numpy as np

import json

import os
# Global variables for cropping, transformations, and image management

cropping = False

x_start, y_start, x_end, y_end = 0, 0, 0, 0

selected_image = None

canvas = None

transformation_mode = None

transformation_points = []

dragging_point = None

drag_threshold = 10 # Threshold for detecting drag

images = [] # List to hold images and their transformations

current_image_index = 0 # Index of the currently selected image

# Mouse callback function to handle cropping and dragging

def mouse_crop(event, x, y, flags, param):

global x_start, y_start, x_end, y_end, cropping, selected_image, dragging_point, transformation_points, transformation_mode

if event == cv2.EVENT_LBUTTONDOWN:

if transformation_mode:

# Check if the click is near any transformation point

for i, pt in enumerate(transformation_points):

if np.linalg.norm(np.array([x, y]) - np.array(pt)) < drag_threshold:

dragging_point = i

break

else:

# Start cropping
cropping = True

x_start, y_start = x, y

elif event == cv2.EVENT_MOUSEMOVE:

if cropping:

# Update the end coordinates for cropping

x_end, y_end = x, y

# Draw a rectangle to show the cropping area

img_copy = image.copy()

cv2.rectangle(img_copy, (x_start, y_start), (x_end, y_end), (0, 255, 0), 1)

cv2.imshow("Image", img_copy)

elif dragging_point is not None:

# Update the dragged point position

transformation_points[dragging_point] = (x, y)

update_transformed_image()

cv2.imshow("Transformed Image", selected_image)

elif event == cv2.EVENT_LBUTTONUP:

if cropping:

# Finish cropping

cropping = False

x_end, y_end = x, y

if x_start != x_end and y_start != y_end:

# Crop the selected region


selected_image = image[min(y_start, y_end):max(y_start, y_end),

min(x_start, x_end):max(x_start, x_end)].copy()

cv2.imshow("Cropped Image", selected_image)

else:

dragging_point = None

# Function to update the transformed image based on points

def update_transformed_image():

global selected_image, transformation_points, transformation_mode

if transformation_mode == 'affine' and len(transformation_points) == 3:

pts1 = np.float32([[0, 0], [selected_image.shape[1] - 1, 0], [0, selected_image.shape[0] - 1]])

pts2 = np.float32(transformation_points)

affine_matrix = cv2.getAffineTransform(pts1, pts2)

selected_image = cv2.warpAffine(selected_image, affine_matrix, (selected_image.shape[1],


selected_image.shape[0]))

elif transformation_mode == 'perspective' and len(transformation_points) == 4:

pts1 = np.float32([[0, 0], [selected_image.shape[1] - 1, 0], [0, selected_image.shape[0] - 1],


[selected_image.shape[1] - 1, selected_image.shape[0] - 1]])

pts2 = np.float32(transformation_points)

perspective_matrix = cv2.getPerspectiveTransform(pts1, pts2)

selected_image = cv2.warpPerspective(selected_image, perspective_matrix,


(selected_image.shape[1], selected_image.shape[0]))

# Function to save the canvas and transformations

def save_canvas(filename):
global canvas, images

data = {
    'canvas': canvas.tolist(),
    'images': [{'image': img.tolist(), 'transformation': trans} for img, trans in images]
}
with open(filename, 'w') as f:
    json.dump(data, f)

# Function to load the canvas and transformations

def load_canvas(filename):

global canvas, images

with open(filename, 'r') as f:

data = json.load(f)

canvas = np.array(data['canvas'], dtype=np.uint8)

images = [(np.array(img['image'], dtype=np.uint8), img['transformation']) for img in


data['images']]

# Function to paste the cropped image onto the canvas

def paste_on_canvas(image, location):

global canvas

canvas_height, canvas_width = canvas.shape[:2]

img_height, img_width = image.shape[:2]

# Check if the image fits on the canvas

if location[1] + img_height <= canvas_height and location[0] + img_width <= canvas_width:

canvas[location[1]:location[1] + img_height, location[0]:location[0] + img_width] = image


else:

print("Image does not fit on the canvas at the given location.")

return canvas

# Function to translate the image

def translate(image, tx, ty):

translation_matrix = np.float32([[1, 0, tx], [0, 1, ty]])

dimensions = (image.shape[1], image.shape[0])

return cv2.warpAffine(image, translation_matrix, dimensions)

# Function to apply affine transformation

def affine_transform(image):

rows, cols = image.shape[:2]

# Define source and destination points

pts1 = np.float32([[0, 0], [cols - 1, 0], [0, rows - 1]])

pts2 = np.float32([[0, rows * 0.33], [cols * 0.85, rows * 0.25], [cols * 0.15, rows * 0.7]])

# Get the affine transformation matrix

affine_matrix = cv2.getAffineTransform(pts1, pts2)

return cv2.warpAffine(image, affine_matrix, (cols, rows))

# Function to apply perspective transformation

def perspective_transform(image):

rows, cols = image.shape[:2]

# Define source and destination points

pts1 = np.float32([[0, 0], [cols - 1, 0], [0, rows -1], [cols -1, rows -1]])
pts2 = np.float32([[cols*0.1, rows*0.2], [cols*0.9, rows*0.1], [cols*0.2, rows*0.9], [cols*0.8,
rows*0.8]])

# Get the perspective transformation matrix

perspective_matrix = cv2.getPerspectiveTransform(pts1, pts2)

return cv2.warpPerspective(image, perspective_matrix, (cols, rows))

# Function to apply similarity transformation

def similarity_transform(image):

rows, cols = image.shape[:2]

# Define the rotation center, angle, and scale

center = (cols / 2, rows / 2)

angle = 30 # degrees

scale = 0.8

# Get the rotation matrix

similarity_matrix = cv2.getRotationMatrix2D(center, angle, scale)

return cv2.warpAffine(image, similarity_matrix, (cols, rows))

# Function to apply rigid transformation (rotation and translation)

def rigid_transform(image):

rows, cols = image.shape[:2]

# Define the rotation angle and translation

angle = 30 # degrees

tx, ty = 50, 50 # Translation

# Rotation matrix

center = (cols / 2, rows / 2)

rigid_matrix = cv2.getRotationMatrix2D(center, angle, 1)


# Translation

rigid_matrix[0, 2] += tx

rigid_matrix[1, 2] += ty

return cv2.warpAffine(image, rigid_matrix, (cols, rows))

# Main function for the image editor

def image_editor():

global image, selected_image, canvas, transformation_mode, transformation_points, images,


current_image_index

# Load the image

image_path = r"C:\Users\Namita\OneDrive\Desktop\College\Semester 7\CV\cv_assgn_img.png"


# Replace with your image path

image = cv2.imread(image_path)

# Create an empty canvas

canvas_height, canvas_width = 600, 800

canvas = np.ones((canvas_height, canvas_width, 3), dtype=np.uint8) * 255

# Set up the window and mouse callback

cv2.namedWindow("Image")

cv2.setMouseCallback("Image", mouse_crop)

cv2.imshow("Image", image)

while True:

key = cv2.waitKey(1) & 0xFF


if key == ord('q'):

break

elif key == ord('s'): # Save canvas

filename = 'canvas.json'

save_canvas(filename)

elif key == ord('l'): # Load canvas

filename = 'canvas.json'

load_canvas(filename)

elif key == ord('a'): # Set affine transformation mode

transformation_mode = 'affine'

transformation_points = [(0, 0), (image.shape[1], 0), (0, image.shape[0])]

elif key == ord('p'): # Set perspective transformation mode

transformation_mode = 'perspective'

transformation_points = [(0, 0), (image.shape[1], 0), (0, image.shape[0]), (image.shape[1],


image.shape[0])]

elif key == ord('t'): # Translate transformation

tx, ty = 100, 50

selected_image = translate(selected_image, tx, ty)

elif key == ord('r'): # Rigid transformation

selected_image = rigid_transform(selected_image)

elif key == ord('i'): # Similarity transformation

selected_image = similarity_transform(selected_image)

elif key == ord('f'): # Apply affine transformation

selected_image = affine_transform(selected_image)

elif key == ord('x'): # Apply perspective transformation

selected_image = perspective_transform(selected_image)
elif key == ord('n'): # Paste current image to canvas

location = (100, 100) # Define location to paste

canvas = paste_on_canvas(selected_image, location)

cv2.imshow("Canvas", canvas)

cv2.destroyAllWindows()

# Run the image editor

image_editor()

OUTPUTS:
Ans for question no. 3.27:

To extend the 3D viewer to support texture-mapped polygon rendering, we'll need


to add functionality for loading textures, mapping the texture onto the polygons,
and then rendering the polygons with the texture applied. The process involves
several steps, from handling texture coordinates to modifying the rendering
pipeline to support textures.
Step 1: Set Up the Environment
In addition to Tkinter and NumPy, we need the PIL library (Pillow) for handling
image textures:
pip install numpy pillow

Step 2: Modify the Viewer Class


We'll extend the existing Viewer3D class to handle texture mapping:
Loading Textures: Add a method to load an image as a texture.
Texture Coordinates: Associate each polygon with a set of texture coordinates (u,
v) corresponding to its vertices.
Rendering with Textures: Modify the rendering pipeline to map the texture onto
the polygon surface.

Step 3: Implement the Texture Mapping


Here’s the updated implementation:
import tkinter as tk
import numpy as np
from tkinter import filedialog
from PIL import Image, ImageTk
import json

class Viewer3D:
def __init__(self, root):
self.root = root
self.canvas = tk.Canvas(root, width=800, height=600, bg="white")
self.canvas.pack()

self.points = [] # List of 3D points


self.lines = [] # List of lines defined by point indices
self.polygons = [] # List of polygons defined by point indices and texture
coordinates
self.object_transform = np.eye(4) # Identity matrix for object transform
self.camera_transform = np.eye(4) # Identity matrix for camera transform
self.texture = None
self.texture_image = None
# Event bindings
self.canvas.bind("<ButtonPress-1>", self.on_button_press)
self.canvas.bind("<B1-Motion>", self.on_mouse_drag)
self.canvas.bind("<ButtonRelease-1>", self.on_button_release)

# Add menu
menu = tk.Menu(root)
root.config(menu=menu)
file_menu = tk.Menu(menu, tearoff=0)
menu.add_cascade(label="File", menu=file_menu)
file_menu.add_command(label="Load Scene", command=self.load_scene)
file_menu.add_command(label="Save Scene", command=self.save_scene)
file_menu.add_command(label="Load Texture", command=self.load_texture)
file_menu.add_separator()
file_menu.add_command(label="Exit", command=root.quit)

def load_scene(self):
file_path = filedialog.askopenfilename(filetypes=[("JSON files", "*.json")])
if file_path:
with open(file_path, 'r') as f:
data = json.load(f)
self.points = data['points']
self.lines = data['lines']
self.polygons = data['polygons']
self.object_transform = np.array(data['object_transform'])
self.camera_transform = np.array(data['camera_transform'])
self.render()

def save_scene(self):
data = {
'points': self.points,
'lines': self.lines,
'polygons': self.polygons,
'object_transform': self.object_transform.tolist(),
'camera_transform': self.camera_transform.tolist()
}
file_path = filedialog.asksaveasfilename(defaultextension=".json")
if file_path:
with open(file_path, 'w') as f:
json.dump(data, f)

def load_texture(self):
file_path = filedialog.askopenfilename(filetypes=[("Image files",
"*.png;*.jpg")])
if file_path:
self.texture_image = Image.open(file_path).convert("RGB")  # force 3-channel RGB so pixel unpacking works
self.texture = self.texture_image.load() # Load texture pixels
self.render()

def on_button_press(self, event):


self.start_x = event.x
self.start_y = event.y

def on_mouse_drag(self, event):


dx = event.x - self.start_x
dy = event.y - self.start_y
self.start_x = event.x
self.start_y = event.y

# Apply transformation based on mode (object or camera)


rotation = self._get_rotation_matrix(dx, dy)
self.object_transform = np.dot(rotation, self.object_transform)

self.render()

def on_button_release(self, event):


pass

def _get_rotation_matrix(self, dx, dy):


angle_x = np.radians(dx)
angle_y = np.radians(dy)

rotation_x = np.array([
[1, 0, 0, 0],
[0, np.cos(angle_x), -np.sin(angle_x), 0],
[0, np.sin(angle_x), np.cos(angle_x), 0],
[0, 0, 0, 1]
])

rotation_y = np.array([
[np.cos(angle_y), 0, np.sin(angle_y), 0],
[0, 1, 0, 0],
[-np.sin(angle_y), 0, np.cos(angle_y), 0],
[0, 0, 0, 1]
])

return np.dot(rotation_y, rotation_x)

def transform_point(self, point, transform):


p = np.array([point[0], point[1], point[2], 1])
transformed_point = np.dot(transform, p)
return transformed_point[:3] / transformed_point[3]

def project_point(self, point):


# Simple orthographic projection
return [point[0], point[1]]

def render(self):
self.canvas.delete("all")

transformed_points = [
self.transform_point(point, np.dot(self.camera_transform,
self.object_transform))
for point in self.points
]

projected_points = [self.project_point(point) for point in transformed_points]

for line in self.lines:


p1 = projected_points[line[0]]
p2 = projected_points[line[1]]
self.canvas.create_line(p1[0] + 400, p1[1] + 300, p2[0] + 400, p2[1] + 300,
fill="blue")

for polygon in self.polygons:


poly_points = [projected_points[i] for i in polygon['indices']]
flat_points = [(p[0] + 400, p[1] + 300) for p in poly_points]

if self.texture:
self.draw_textured_polygon(flat_points, polygon['uvs'])
else:
self.canvas.create_polygon(flat_points, outline="black", fill="gray",
stipple="gray50")

def draw_textured_polygon(self, points, uvs):


# Using basic barycentric coordinates and interpolation for simplicity
if len(points) < 3:
return # Not a valid polygon

for i in range(0, len(points) - 2):


tri_points = [points[0], points[i+1], points[i+2]]
tri_uvs = [uvs[0], uvs[i+1], uvs[i+2]]

self.fill_triangle(tri_points, tri_uvs)

def fill_triangle(self, points, uvs):


# Simple implementation of barycentric coordinates for texture mapping
p1, p2, p3 = points
uv1, uv2, uv3 = uvs

min_x = max(0, min(p1[0], p2[0], p3[0]))


max_x = min(self.canvas.winfo_width(), max(p1[0], p2[0], p3[0]))
min_y = max(0, min(p1[1], p2[1], p3[1]))
max_y = min(self.canvas.winfo_height(), max(p1[1], p2[1], p3[1]))

for x in range(int(min_x), int(max_x)):


for y in range(int(min_y), int(max_y)):
w1, w2, w3 = self.barycentric_coords([x, y], p1, p2, p3)
if w1 >= 0 and w2 >= 0 and w3 >= 0: # Inside the triangle
u = w1 * uv1[0] + w2 * uv2[0] + w3 * uv3[0]
v = w1 * uv1[1] + w2 * uv2[1] + w3 * uv3[1]
if 0 <= u < 1 and 0 <= v < 1:
color = self.get_texture_color(u, v)
self.canvas.create_line(x, y, x+1, y, fill=color)

def barycentric_coords(self, p, a, b, c):


# Compute barycentric coordinates
detT = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
w1 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / detT
w2 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / detT
w3 = 1 - w1 - w2
return w1, w2, w3

def get_texture_color(self, u, v):


# Map u, v to texture coordinates and fetch the color
tex_x = int(u * self.texture_image.width)
tex_y = int(v * self.texture_image.height)
r, g, b = self.texture[tex_x, tex_y]
return f'#{r:02x}{g:02x}{b:02x}'

if __name__ == "__main__":
root = tk.Tk()
viewer = Viewer3D(root)
root.mainloop()
Explanation
Loading the Texture:

The load_texture method loads an image and stores its pixel data.
We use the PIL library to handle image operations.
Texture Coordinates:

Each polygon now includes a list of (u, v) texture coordinates for each vertex.
These are stored in the polygons list as part of the polygon data.
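For concreteness, a hypothetical entry in the polygons list might look as follows (the key names 'indices' and 'uvs' are the ones the render() method above reads; the specific values are only illustrative):

# Hypothetical polygon entry: 'indices' index into self.points, and 'uvs'
# give the (u, v) texture coordinate of each corresponding vertex.
quad = {
    "indices": [0, 1, 2, 3],
    "uvs": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
}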
Rendering with Textures:

The draw_textured_polygon and fill_triangle methods handle rendering polygons


with the texture applied.
We calculate barycentric coordinates for each pixel within the polygon to
interpolate the texture coordinates (u, v).
Texture Sampling:

The get_texture_color method samples the texture using the interpolated (u, v)
coordinates, fetching the appropriate color from the texture image.
Step 4: Run and Test
Run the program to load a scene, apply transformations, load a texture, and render
the polygons with the texture mapped.

Notes
Barycentric Coordinates: This is a basic approach to interpolate the texture over
the polygon. For large or complex scenes, more sophisticated methods or
optimizations might be required.
Perspective Correction: In this example, texture mapping is done without
perspective correction. For accurate 3D rendering, you'd need to implement
perspective-correct texture mapping. This involves dividing texture coordinates by
the depth (w) before interpolation.
Performance: The implementation is simplified for educational purposes and may
not be optimized for large-scale 3D scenes or complex textures.
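As a rough sketch of the perspective-correct interpolation mentioned in the notes (this helper is not part of the viewer above; z1, z2, z3 are assumed camera-space depths of the triangle's three vertices):

# Interpolate u/z, v/z and 1/z with the barycentric weights, then divide back.
def perspective_correct_uv(w1, w2, w3, uv1, uv2, uv3, z1, z2, z3):
    inv_z = w1 / z1 + w2 / z2 + w3 / z3
    u = (w1 * uv1[0] / z1 + w2 * uv2[0] / z2 + w3 * uv3[0] / z3) / inv_z
    v = (w1 * uv1[1] / z1 + w2 * uv2[1] / z2 + w3 * uv3[1] / z3) / inv_z
    return u, v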

Answer for 3.28:


For synthetic noise:
import cv2

import numpy as np

from skimage import img_as_float

from skimage.metrics import peak_signal_noise_ratio as psnr, structural_similarity as ssim

import matplotlib.pyplot as plt

# Load the image (for synthetic noise, we start with a clean image)

image = cv2.imread('cv1.jpeg')

image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) # Convert to grayscale for simplicity

image_float = img_as_float(image) # Convert to float for PSNR/SSIM calculations

# Add synthetic Gaussian noise

noise_stddev = 25

noisy_image = image_float + noise_stddev / 255.0 * np.random.randn(*image_float.shape)

noisy_image = np.clip(noisy_image, 0, 1)

# Apply Gaussian Smoothing

gaussian_denoised = cv2.GaussianBlur(noisy_image, (5, 5), 1.5)


# Apply Non-Local Means Denoising

nlm_denoised = cv2.fastNlMeansDenoising(np.uint8(noisy_image*255), None, h=10,


templateWindowSize=7, searchWindowSize=21)

nlm_denoised = nlm_denoised / 255.0 # Normalize for comparison

# Compute PSNR and SSIM for both methods

psnr_gaussian = psnr(image_float, gaussian_denoised,data_range=1)

ssim_gaussian = ssim(image_float, gaussian_denoised,data_range=1)

psnr_nlm = psnr(image_float, nlm_denoised,data_range=1)

ssim_nlm = ssim(image_float, nlm_denoised,data_range=1)

# Display Results

print(f"Gaussian Smoothing -> PSNR: {psnr_gaussian}, SSIM: {ssim_gaussian}")

print(f"NLM Denoising -> PSNR: {psnr_nlm}, SSIM: {ssim_nlm}")

# Plot the images for comparison

fig, axes = plt.subplots(1, 4, figsize=(20, 5))

axes[0].imshow(image, cmap='gray')

axes[0].set_title('Original Image')

axes[0].axis('off')

axes[1].imshow(noisy_image, cmap='gray')

axes[1].set_title('Noisy Image')

axes[1].axis('off')

axes[2].imshow(gaussian_denoised, cmap='gray')

axes[2].set_title(f'Gaussian Denoised\nPSNR: {psnr_gaussian:.2f}, SSIM: {ssim_gaussian:.2f}')


axes[2].axis('off')

axes[3].imshow(nlm_denoised, cmap='gray')

axes[3].set_title(f'NLM Denoised\nPSNR: {psnr_nlm:.2f}, SSIM: {ssim_nlm:.2f}')

axes[3].axis('off')

plt.show()

For real noise:


import cv2

import numpy as np

from skimage import img_as_float

from skimage.metrics import peak_signal_noise_ratio as psnr, structural_similarity as ssim

import matplotlib.pyplot as plt

# Load the real-world low-light image

# You can use a low-light image or any real-world noisy image sequence

image = cv2.imread('cv2.jpeg')

image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) # Convert to grayscale for simplicity

image_float = img_as_float(image) # Convert to float for PSNR/SSIM calculations


# Apply Gaussian Smoothing

gaussian_denoised = cv2.GaussianBlur(image_float, (5, 5), 1.5)

# Apply Non-Local Means Denoising

nlm_denoised = cv2.fastNlMeansDenoising(np.uint8(image_float*255), None, h=10, templateWindowSize=7,


searchWindowSize=21)

nlm_denoised = nlm_denoised / 255.0 # Normalize for comparison

# Since we don't have the ground truth clean image for real-world data, we won't compute PSNR/SSIM here.

# However, you can still visually compare the denoised images

# Plot the images for visual comparison

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

axes[0].imshow(image, cmap='gray')

axes[0].set_title('Original Low-Light Image')

axes[0].axis('off')

axes[1].imshow(gaussian_denoised, cmap='gray')

axes[1].set_title('Gaussian Denoised')

axes[1].axis('off')

axes[2].imshow(nlm_denoised, cmap='gray')

axes[2].set_title('NLM Denoised')

axes[2].axis('off')

plt.show()
We have implemented Gaussian Smoothing and Non-Local Means (NLM).
Yes, the performance of image denoising algorithms such as Gaussian smoothing
and NLM does depend on an accurate estimate of the noise level. Here's how:
1. Impact of Noise Level Estimate:
Gaussian Smoothing:
● Noise level sensitivity: Gaussian smoothing applies a low-pass filter that
removes high-frequency noise but blurs details. While it doesn't explicitly
estimate noise, the filter size and standard deviation (which correspond to
the "strength" of smoothing) need to match the actual noise level.
● Too low filter strength: Not enough noise reduction.
● Too high filter strength: Excessive blurring, loss of edges and fine details.
● Summary: Gaussian smoothing is relatively simple but lacks adaptability to
varying noise levels. Its performance is not directly dependent on estimating
noise but rather on tuning the smoothing parameters.
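To make the tuning point concrete, a minimal sketch (assuming noisy_image is the grayscale float image from the synthetic-noise script above; the parameter values are arbitrary examples, not recommendations):

mild_smooth = cv2.GaussianBlur(noisy_image, (3, 3), 0.8)    # light smoothing, leaves residual noise
strong_smooth = cv2.GaussianBlur(noisy_image, (9, 9), 3.0)  # heavy smoothing, removes noise but blurs edges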

Non-Local Means (NLM):


● Noise level sensitivity: NLM explicitly requires an estimate of the noise
level. The filter strength (h) parameter directly controls how much denoising
is applied based on the noise level.
● Accurate noise estimate: NLM excels at removing noise while preserving
details, as it adaptively averages similar pixel patches.
● Underestimation of noise: The algorithm will fail to remove enough noise,
leaving visible grain in the image.
● Overestimation of noise: NLM will overly smooth the image, blurring details
along with the noise.
● Summary: NLM is highly dependent on an accurate noise level estimate.
When the noise level is correctly estimated, it significantly outperforms
simpler methods like Gaussian smoothing.

2. Comparison of Techniques:

Gaussian Smoothing:
● Advantages: Simple, fast, and works well in cases where the noise is evenly
distributed and the image doesn’t have many sharp edges.
● Disadvantages: Blurs the entire image uniformly, including edges and fine
details, leading to loss of sharpness. It is less effective when noise is high or
complex.
● Conclusion: Better suited for mild, uniform noise and situations where
computational simplicity is more important than preserving fine details.

Non-Local Means (NLM):


● Advantages: Excellent at preserving edges and fine details. It adapts to the
image structure, making it more effective for high-variance noise or textures.
● Disadvantages: More computationally expensive, and its performance
heavily depends on a good noise level estimate.
● Conclusion: NLM works better in most practical scenarios, especially when
noise level is known or estimated accurately. It is especially effective for
low-light images and images with fine textures where detail preservation is
crucial.

3. Conclusions:
● NLM is superior to Gaussian smoothing when preserving details and
adapting to varying noise levels.
● Accurate noise level estimation is critical for NLM to perform optimally. If
the noise level is underestimated or overestimated, it may under- or over-
smooth the image.
● For Gaussian smoothing, the performance is mainly dependent on choosing
appropriate filter parameters, though it's less flexible compared to NLM.
● In real-world scenarios, NLM tends to outperform simpler methods like
Gaussian smoothing, particularly in high-noise environments (e.g., low-light
sequences).
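A minimal sketch of tying the NLM strength to an estimated noise level, assuming the grayscale float image noisy_image in [0, 1] from the synthetic-noise script above; the 0.8 factor used to convert the estimate into h on the 8-bit scale is a heuristic assumption, not a prescribed value:

import numpy as np
import cv2
from skimage.restoration import estimate_sigma

sigma_est = estimate_sigma(noisy_image)      # estimated noise std in [0, 1] units
h_est = max(1.0, 0.8 * sigma_est * 255.0)    # heuristic conversion to the 8-bit h parameter
nlm_adaptive = cv2.fastNlMeansDenoising(
    np.uint8(noisy_image * 255), None, h=h_est,
    templateWindowSize=7, searchWindowSize=21) / 255.0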

Answer for 3.29


import cv2
import numpy as np
import matplotlib.pyplot as plt

def load_image(image_path):
    image = cv2.imread(image_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image

def create_rainbow_mask(image):
    hsv_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
    # HSV ranges for the rainbow colours (red wraps around the hue axis)
    lower_red1, upper_red1 = np.array([0, 50, 50]), np.array([10, 255, 255])
    lower_red2, upper_red2 = np.array([170, 50, 50]), np.array([180, 255, 255])
    lower_orange, upper_orange = np.array([11, 50, 50]), np.array([25, 255, 255])
    lower_yellow, upper_yellow = np.array([26, 50, 50]), np.array([35, 255, 255])
    lower_green, upper_green = np.array([36, 50, 50]), np.array([85, 255, 255])
    lower_blue, upper_blue = np.array([86, 50, 50]), np.array([130, 255, 255])
    lower_violet, upper_violet = np.array([131, 50, 50]), np.array([160, 255, 255])
    mask_red1 = cv2.inRange(hsv_image, lower_red1, upper_red1)
    mask_red2 = cv2.inRange(hsv_image, lower_red2, upper_red2)
    mask_red = mask_red1 | mask_red2
    mask_orange = cv2.inRange(hsv_image, lower_orange, upper_orange)
    mask_yellow = cv2.inRange(hsv_image, lower_yellow, upper_yellow)
    mask_green = cv2.inRange(hsv_image, lower_green, upper_green)
    mask_blue = cv2.inRange(hsv_image, lower_blue, upper_blue)
    mask_violet = cv2.inRange(hsv_image, lower_violet, upper_violet)
    rainbow_mask = (mask_red | mask_orange | mask_yellow | mask_green |
                    mask_blue | mask_violet)
    return rainbow_mask

def amplify_rainbow(image, mask):
    factor = 1.8
    image_float = np.float32(image) / 255.0
    # Keep only the pixels inside the rainbow mask
    rainbow_region = np.where(mask[:, :, np.newaxis] > 0, image_float, 0)
    amplified_rainbow = np.clip(rainbow_region * factor, 0, 1)
    final_image = np.copy(image_float)
    final_image[mask > 0] = amplified_rainbow[mask > 0]
    return np.uint8(final_image * 255)

def display_images(original, modified):
    fig, ax = plt.subplots(1, 2, figsize=(12, 6))
    ax[0].imshow(original)
    ax[0].set_title('Original Image')
    ax[1].imshow(modified)
    ax[1].set_title('Modified Image')
    plt.show()

image_path = "/content/r.jpg"
image = load_image(image_path)
rainbow_mask = create_rainbow_mask(image)
amplified_image = amplify_rainbow(image, rainbow_mask)
display_images(image, amplified_image)
