Assignment-2 (Chapter 3)
Q(3.1) Write a simple application to change the color balance of an image by multiplying each
color value by a different user-specified constant. If you want to get fancy, you can make this
application interactive, with sliders.
1. Do you get different results if you take out the gamma transformation before or after doing the
multiplication? Why or why not?
2. Take the same picture with your digital camera using different color balance settings (most
cameras control the color balance from one of the menus). Can you recover what the color balance
ratios are between the different settings? You may need to put your camera on a tripod and align
the images manually or automatically to make this work.
3. Can you think of any reason why you might want to perform a color twist (Section 3.1.2) on the images?
Answer for Q(3.1)
1)
Yes, the results differ. If the multiplication is performed directly on the gamma-corrected values, the gains act on non-linear intensities, so a given gain produces a different visual shift than it would on linear values and the outcome is harder to predict.
Conversely, when gamma correction is applied after the color multiplication, the process
allows for more direct control over color balance. In this approach, color balance
adjustments are made to the raw color channels first, which involves multiplying each
channel by its respective gain factor. Following these adjustments, gamma correction is
applied to the color-balanced image to fine-tune the overall brightness and contrast. This
sequence ensures that the color balance modifications are based on the original image’s
color values, leading to more predictable and controllable outcomes. The final brightness
and contrast adjustment via gamma correction occurs after the color adjustments, making
it easier to manage and understand the impact of each operation.
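A small Python sketch of this comparison (the gains, the gamma value of 2.2, and the random stand-in image are illustrative assumptions, not part of the exercise):
import numpy as np

def apply_gamma(img, gamma=2.2):
    return np.clip(img, 0, 1) ** (1.0 / gamma)

def remove_gamma(img, gamma=2.2):
    return np.clip(img, 0, 1) ** gamma

def color_balance(img, gains=(1.2, 1.0, 0.8)):
    return img * np.array(gains)

img = np.random.rand(4, 4, 3)                               # stand-in for a gamma-corrected photo
out_nonlinear = color_balance(img)                          # gains applied to gamma-corrected values
out_linear = apply_gamma(color_balance(remove_gamma(img)))  # gains applied in linear space
print(np.abs(out_nonlinear - out_linear).max())             # non-zero: the two orderings differ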
2)
% Load two images taken with different color balance settings
img1 = imread('image_daylight.jpg'); % Image 1 (Daylight)
img2 = imread('image_tungsten.jpg'); % Image 2 (Tungsten)
% Pick a neutral (gray/white) region of interest common to both aligned images
roi = [100, 100, 50, 50]; % [x, y, width, height] - example region
roi_img1 = im2double(imcrop(img1, roi));
roi_img2 = im2double(imcrop(img2, roi));
% Calculate the mean RGB values in the ROI for both images
mean_rgb_img1 = mean(reshape(roi_img1, [], 3), 1); % Image 1 (Daylight)
mean_rgb_img2 = mean(reshape(roi_img2, [], 3), 1); % Image 2 (Tungsten)
% The per-channel ratios recover the relative color balance between the settings
balance_ratios = mean_rgb_img1 ./ mean_rgb_img2
3)
Performing a color twist on images, as discussed in Section 3.1.2, involves applying a
linear transformation to the color channels, such as mixing the red, green, and blue
channels to create new color combinations. This can be particularly useful in various
scenarios:
1. Correcting Color Casts: A color twist can be used to correct undesired color casts
caused by lighting conditions (e.g., under artificial lighting that skews color tones). By
adjusting how each channel interacts with the others, you can create a more neutral or
realistic color appearance.
2. Stylizing Images: If you are aiming to create a specific artistic or cinematic look, a
color twist can help you apply uniform color changes across the image. For example,
adding a warm or cool tone by blending channels can achieve a unique style for an image
or a series of photos.
3. Enhancing Color Depth and Contrast: By twisting the colors in subtle ways, you can
increase the perceptual contrast or separation between similar colors, which might make
the image appear more vibrant or detailed.
4. Simulating Film Emulation or Filters: In some cases, photographers or filmmakers use
color twists to mimic the look of traditional film stock or to apply filters that affect the
way colors are rendered. This can give digital images a retro or analog look.
5. Harmonizing Multiple Images: If you're working with multiple images shot under
different lighting conditions or with different cameras, a color twist can help harmonize
the appearance of those images by shifting their color profiles to a common reference.
Thus, applying a color twist on an image can enhance visual aesthetics, correct color
inconsistencies, or create certain stylistic effects based on the context.
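As a small illustration, a color twist can be written as a 3x3 matrix applied to each RGB pixel; the matrix values below are arbitrary example numbers for a slight warming effect:
import numpy as np

# Illustrative 3x3 color-twist matrix (a slight "warming" mix; the values are arbitrary)
twist = np.array([[1.00, 0.10, 0.00],
                  [0.05, 0.95, 0.00],
                  [0.00, 0.05, 0.90]])

def color_twist(img, M):
    # img is H x W x 3 in [0, 1]; each output channel is a linear mix of the input channels
    return np.clip(img @ M.T, 0, 1)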
Q(3.2) If you have access to the RAW image for the camera, perform the demosaicing yourself
(Section 10.3.1). If not, just subsample an RGB image in a Bayer mosaic pattern. Instead of just
bilinear interpolation, try one of the more advanced techniques described in Section 10.3.1.
Compare your result to the one produced by the camera. Does your camera perform a simple linear
mapping between RAW values and the color-balanced values in a JPEG? Some high-end cameras
have a RAW+JPEG mode, which makes this comparison much easier.
Answer for Q(3.2)
To simulate a RAW image, a Bayer mosaic is applied to a JPEG image. Then the Malvar-He-Cutler
algorithm mentioned in Section 10.3.1 is applied for demosaicing.
CODE
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import convolve
from skimage import io
# Bayer filter (GRBG-style sampling of the RGB image)
def bayer_mosaic(rgb_image):
    h, w, _ = rgb_image.shape
    bayer = np.zeros((h, w))
    # Green pixels
    bayer[::2, ::2] = rgb_image[::2, ::2, 1]
    bayer[1::2, 1::2] = rgb_image[1::2, 1::2, 1]
    # Red pixels
    bayer[::2, 1::2] = rgb_image[::2, 1::2, 0]
    # Blue pixels
    bayer[1::2, ::2] = rgb_image[1::2, ::2, 2]
    return bayer
# Malvar-He-Cutler Demosaicing
def demosaic_malvar_he_cutler(bayer_image):
    # Malvar-He-Cutler 5x5 interpolation kernels, applied to the raw Bayer data
    h_G_at_RB = np.array([[0, 0, -1, 0, 0],
                          [0, 0, 2, 0, 0],
                          [-1, 2, 4, 2, -1],
                          [0, 0, 2, 0, 0],
                          [0, 0, -1, 0, 0]]) / 8
    h_R_at_G = np.array([[0, 0, 0.5, 0, 0],
                         [0, -1, 0, -1, 0],
                         [-1, 4, 5, 4, -1],
                         [0, -1, 0, -1, 0],
                         [0, 0, 0.5, 0, 0]]) / 8
    h_B_at_G = h_R_at_G  # the row/column variants of this kernel are folded into one here
    h_R_at_B = np.array([[0, 0, -1.5, 0, 0],
                         [0, 2, 0, 2, 0],
                         [-1.5, 0, 6, 0, -1.5],
                         [0, 2, 0, 2, 0],
                         [0, 0, -1.5, 0, 0]]) / 8
    h_B_at_R = h_R_at_B

    h, w = bayer_image.shape
    # Masks matching the pattern produced by bayer_mosaic()
    red_mask = np.zeros((h, w), dtype=bool)
    red_mask[::2, 1::2] = True
    blue_mask = np.zeros((h, w), dtype=bool)
    blue_mask[1::2, ::2] = True
    green_mask = ~(red_mask | blue_mask)

    # Keep the known samples and interpolate only the missing ones
    red_channel = np.where(red_mask, bayer_image, 0.0)
    green_channel = np.where(green_mask, bayer_image, 0.0)
    blue_channel = np.where(blue_mask, bayer_image, 0.0)
    green_channel[~green_mask] = convolve(bayer_image, h_G_at_RB)[~green_mask]
    red_channel[green_mask] = convolve(bayer_image, h_R_at_G)[green_mask]
    red_channel[blue_mask] = convolve(bayer_image, h_R_at_B)[blue_mask]
    blue_channel[green_mask] = convolve(bayer_image, h_B_at_G)[green_mask]
    blue_channel[red_mask] = convolve(bayer_image, h_B_at_R)[red_mask]

    demosaiced_image = np.stack((red_channel, green_channel, blue_channel), axis=-1)
    return np.clip(demosaiced_image, 0, 1)  # Clip to valid range [0, 1]
# Load Image
rgb_image = io.imread('/home/evoprime/Athena/Downloads/fruits.jpeg') / 255.0  # Replace with actual image path or URL
h, w, _ = rgb_image.shape
# Simulate the RAW capture and demosaic it
bayer = bayer_mosaic(rgb_image)
demosaiced = demosaic_malvar_he_cutler(bayer)
plt.subplot(1, 2, 1); plt.imshow(rgb_image); plt.title('Original (camera JPEG)'); plt.axis('off')
plt.subplot(1, 2, 2); plt.imshow(demosaiced); plt.title('Demosaiced (Malvar-He-Cutler)'); plt.axis('off')
plt.show()
OUTPUT
OBSERVATION
The image cannot be reconstructed perfectly using the Malvar-He-Cutler algorithm (or any other
demosaicing algorithm) alone. JPEG-processed images also undergo white balancing, gamma
correction, compression, and other color adjustments, which accounts for the differences observed here.
Q(3.3) Answer the following questions and optionally validate them experimentally:
1. Most captured images have gamma correction applied to them. Does this invalidate the
basic compositing equation (3.8); if so, how should it be fixed?
2. The additive (pure reflection) model may have limitations. What happens if the glass is tinted,
especially to a non-gray hue? How about if the glass is dirty or smudged? How could you model
wavy glass or other kinds of refractive objects?
1.
Yes, gamma correction does invalidate the basic compositing equation (C = (1 − α)B + αF)
because the equation assumes that the images are in linear color space. Gamma correction
applies a non-linear transformation to image colors, which affects how the colors blend
during compositing.
To fix this:
● First, reverse the gamma correction by converting the gamma-corrected images
back into linear color space. This is done by applying the inverse gamma function,
typically with gamma ≈ 2.2.
● After this, apply the compositing equation in the linear color space.
● Once compositing is completed, reapply gamma correction to the final image so it
can be displayed correctly on non-linear devices like monitors.
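A minimal sketch of that fix (assuming the images are stored as floats in [0, 1] and a display gamma of 2.2):
import numpy as np

GAMMA = 2.2  # assumed display gamma

def composite_linear(F, B, alpha):
    # Undo gamma, composite in linear space, then re-apply gamma for display
    F_lin = F ** GAMMA
    B_lin = B ** GAMMA
    C_lin = (1 - alpha) * B_lin + alpha * F_lin
    return C_lin ** (1.0 / GAMMA)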
2.
○ Tinted Glass: If the glass is tinted, especially with a non-gray hue, the reflection
will be altered by the tint color. For example, a blue-tinted glass will make
reflections appear bluish. This can be modeled by applying a color filter to the
reflection, simulating the effect of the tint on the reflected light.
○ Dirty or Smudged Glass: Dirt or smudges will scatter and diffuse the reflection,
reducing its sharpness and clarity. To model this, a noise pattern or blur effect can
be added to the reflection, simulating the scattering of light due to dirt or smudges
on the glass.
○ Wavy Glass or Refractive Objects: Wavy or irregular glass causes distortions in the
reflection. This can be modeled by using a spatial distortion map, which warps the
reflection based on the irregularities of the glass. Displacement mapping or ripple
effects can simulate the distortion caused by uneven or wavy glass.
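One possible way to fold these three effects into the reflection component R before adding it to the transmitted image (the tint vector, blur sigma, and wave amplitude below are illustrative assumptions):
import numpy as np
import cv2

def reflection_model(R, tint=(0.8, 0.9, 1.1), smudge_sigma=3, wave_amp=2.0):
    R = R * np.array(tint)                          # tinted glass: per-channel attenuation of the reflection
    R = cv2.GaussianBlur(R, (0, 0), smudge_sigma)   # dirty/smudged glass: scattering modelled as blur
    h, w = R.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float32)
    map_x = xx + wave_amp * np.sin(yy / 10.0)       # wavy glass: sinusoidal displacement map
    map_y = yy + wave_amp * np.sin(xx / 10.0)
    return cv2.remap(R.astype(np.float32), map_x, map_y, cv2.INTER_LINEAR)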
Q(3.4) Set up a blue or green background, e.g., by buying a large piece of colored posterboard.
Take a picture of the empty background, and then of the background with a new object in front of
it. Pull the matte using the difference between each colored pixel and its assumed corresponding
background pixel
● The goal is to extract the object from the second image by comparing it to the first, where
only the background is present. The difference in pixel values between the two images will
help identify where the object is located.
There are several approaches you can use based on the readings:
Difference Matting:
● Step 1: For each pixel in the image with the object, compare its color (RGB values) with
the corresponding pixel in the background image.
● Step 2: If the difference between the two pixel values is above a certain threshold
(indicating a change, i.e., the presence of the object), mark this pixel as part of the
foreground. Otherwise, it's part of the background.
The compositing equation relating the observed pixel C to the background B and foreground F is:
𝐶 = (1 − 𝛼)𝐵 + 𝛼𝐹
so the foreground colour can be recovered (wherever 𝛼 > 0) as:
𝐹 = (𝐶 − (1 − 𝛼)𝐵) ÷ 𝛼
● Step 3: Once the matte (foreground with transparency values) is obtained, you can use this
to composite the foreground object onto any new background by combining it with another
image.
After extracting the foreground object, you can insert it into a different background by applying
the compositing equation:
𝐶𝑛𝑒𝑤 = (1 − 𝛼)𝐵𝑛𝑒𝑤 + 𝛼𝐹
where 𝐵𝑛𝑒𝑤 is the new background and 𝐹 is the extracted foreground object.
Input Images:
Code:
import cv2
import numpy as np
import os

background_img = cv2.imread('solidblue.png')
foreground_img = cv2.imread('ballInSolidBlue.jpg')
if background_img is None or foreground_img is None:
    raise FileNotFoundError('Could not load the background or foreground image.')

# Work with floats in [0, 1] and compute the per-pixel colour difference
bg = background_img.astype(np.float32) / 255.0
fg = foreground_img.astype(np.float32) / 255.0
diff = np.linalg.norm(fg - bg, axis=2)

# Difference matting: pixels whose difference exceeds the threshold are foreground
threshold = 0.2
alpha = (diff > threshold).astype(np.float32)
alpha = cv2.GaussianBlur(alpha, (5, 5), 0)   # soften the matte edges slightly
alpha3 = cv2.merge([alpha, alpha, alpha])

# Pull the matte: the extracted object over a black background (alpha * F)
result_image = (alpha3 * fg * 255).astype(np.uint8)
cv2.imwrite('extracted_object.png', result_image)
cv2.imshow('Extracted object', result_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
import cv2
import numpy as np
import matplotlib.pyplot as plt

# File paths
fg_path = '/content/drive/MyDrive/NITC_Projects/Computer_vision/foreground_tajmahal.mp4'
bg_path = '/content/drive/MyDrive/NITC_Projects/Computer_vision/Taj_mahal_background.mp4'
new_bg_path = '/content/drive/MyDrive/NITC_Projects/Computer_vision/new_background.png'
output_file = '/content/output_video.mp4'

cap_bg = cv2.VideoCapture(bg_path)
cap_fg = cv2.VideoCapture(fg_path)
new_bg = cv2.imread(new_bg_path)

# Parameters
bg_mean = None
diff_threshold = 30  # grey-level difference treated as foreground

def compute_mean_and_variance(bg_frames):
    # Per-pixel mean of the grayscale background frames
    global bg_mean
    gray = np.stack([cv2.cvtColor(f, cv2.COLOR_BGR2GRAY).astype(np.float32) for f in bg_frames])
    bg_mean = gray.mean(axis=0)

def classify_pixels(frame):
    # Pixels that deviate strongly from the background mean are labelled foreground
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
    foreground_mask = (np.abs(gray_frame - bg_mean) > diff_threshold).astype(np.uint8) * 255
    return foreground_mask

def cleanup_using_morphology(mask):
    # Opening removes speckle noise, closing fills small holes
    kernel = np.ones((5, 5), np.uint8)
    cleaned_mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    cleaned_mask = cv2.morphologyEx(cleaned_mask, cv2.MORPH_CLOSE, kernel)
    return cleaned_mask

def label_connected_components(mask):
    # Centroids of the connected foreground components (label 0 is the background)
    _, _, _, centroids = cv2.connectedComponentsWithStats(mask)
    return centroids[1:]

def draw_centroids(frame, centroids):
    for centroid in centroids:
        x, y = int(centroid[0]), int(centroid[1])  # centroid is (column, row)
        cv2.circle(frame, (x, y), 5, (0, 0, 255), -1)
    return frame

# Read a few background frames to compute mean and variance
bg_frames = [f for ok, f in (cap_bg.read() for _ in range(30)) if ok]
compute_mean_and_variance(bg_frames)

processed_frames = []
frame_indices = {0, 30, 60, 90}  # frames to visualise
idx = 0
while True:
    ret_fg, frame_fg = cap_fg.read()
    if not ret_fg:
        break
    if idx in frame_indices:
        foreground_mask = classify_pixels(frame_fg)
        cleaned_mask = cleanup_using_morphology(foreground_mask)
        # Label connected components, compute centroids, and track
        tracked_frame = draw_centroids(frame_fg.copy(), label_connected_components(cleaned_mask))
        processed_frames.append((frame_fg, tracked_frame))
    idx += 1

fig, axs = plt.subplots(len(processed_frames), 2, figsize=(10, 4 * len(processed_frames)), squeeze=False)
for i, (orig, tracked) in enumerate(processed_frames):
    axs[i, 0].imshow(cv2.cvtColor(orig, cv2.COLOR_BGR2RGB))
    axs[i, 0].axis('off')
    axs[i, 1].imshow(cv2.cvtColor(tracked, cv2.COLOR_BGR2RGB))
    axs[i, 1].axis('off')
plt.tight_layout()
plt.show()

cap_bg.release()
cap_fg.release()
Q(3.6) Write a variety of photo enhancement or effects filters: contrast, solarization (quantization),
etc. Which ones are useful (perform sensible corrections) and which ones are more creative (create
unusual images)?
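No implementation was included for this exercise; a minimal sketch of two such filters, a contrast/brightness adjustment (a sensible correction) and solarization by quantization (a more creative effect), might look like this (gain, bias, and level count are example values):
import numpy as np
import cv2

def adjust_contrast(image, gain=1.3, bias=10):
    # Sensible correction: scale pixel values (gain) and add a brightness offset (bias)
    return cv2.convertScaleAbs(image, alpha=gain, beta=bias)

def solarize_quantize(image, levels=4):
    # Creative effect: quantize each channel to a few levels (posterization/solarization)
    step = 256 // levels
    return ((image // step) * step + step // 2).astype(np.uint8)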
Q(3.7) Compute the gray level (luminance) histogram for an image and equalize it so that the tones
look better (and the image is less sensitive to exposure settings). You may want to use the following
steps:
1. Convert the color image to luminance.
2. Compute the histogram, the cumulative distribution, and the compensation transfer function.
3. (Optional) Try to increase the “punch” in the image by ensuring that a certain fraction of pixels
(say, 5%) are mapped to pure black and white.
4. (Optional) Limit the local gain in the transfer function. One way to do this is to limit while
performing the accumulation, keeping any unaccumulated values “in reserve”.
5. Compensate the luminance channel through the lookup table and re-generate the color image
using color ratios.
6. (Optional) Color values that are clipped in the original image, i.e., have one or more saturated
color channels, may appear unnatural when remapped to a non-clipped value. Extend your
algorithm to handle this case in some useful way.
Answer for Q(3.7)
Code:
import numpy as np
import cv2

def rgb_to_luminance(image):
    # OpenCV loads BGR, so channel 2 is red and channel 0 is blue
    luminance = 0.299 * image[:, :, 2] + 0.587 * image[:, :, 1] + 0.114 * image[:, :, 0]
    return luminance.astype(np.uint8)

def compute_histogram_and_cdf(luminance):
    histogram, bins = np.histogram(luminance.flatten(), 256, [0, 256])
    cdf = histogram.cumsum()
    cdf_normalized = cdf * (histogram.max() / cdf.max())
    return histogram, cdf, cdf_normalized

image = cv2.imread('input_image.jpg')
luminance = rgb_to_luminance(image)
histogram, cdf, cdf_normalized = compute_histogram_and_cdf(luminance)
# Compensation transfer function: the normalised CDF scaled to [0, 255]
transfer = np.round(255 * cdf / cdf[-1]).astype(np.uint8)
equalized_luminance = transfer[luminance]
# Re-generate the colour image using the original colour ratios
ratio = equalized_luminance.astype(float) / np.maximum(luminance.astype(float), 1)
equalized_color = np.clip(image.astype(float) * ratio[:, :, None], 0, 255).astype(np.uint8)
Output:
● First, convert the colour image into a grayscale luminance image. This can be done using
the formula:
L=0.2126*R+0.7152*G+0.0722*B
● Here, R, G, and B are the red, green, and blue colour channels of the image, and L
represents the luminance (brightness) channel.
● Split the image into small patches (e.g., 8x8 or 16x16 pixels).
● For each patch, compute its own grey-level (luminance) histogram.
● For each pixel, distribute its luminance value across adjacent vertices in its patch using
bilinear interpolation. This means that each pixel contributes its value to the nearest
vertices in the patch grid, with weights based on its distance to those vertices.
● If we let a pixel's luminance value be I(x,y) at location (x,y), and the nearest vertices be at
positions (x1,y1) (x2,y2), etc., distribute I(x,y) based on how close the pixel is to each
vertex.
● For each patch, convert the histogram into a cumulative distribution function (CDF). The
CDF is essential for mapping luminance values to new values such that the histogram is
equalized.
● The CDF is calculated as CDF(j) = Σ_{i=0..j} P(i), where P(i) is the probability of the grey level i in the histogram.
● To smooth the transition between patches, interpolate the CDFs of adjacent patches using
bilinear interpolation. This ensures that the transition between different regions of the
image is smooth and prevents artifacts from appearing at patch boundaries.
● For a pixel at location (x,y), interpolate its new intensity based on the CDF values of the
surrounding patch vertices.
● After interpolation, use the resulting CDF to remap the luminance values of the pixels in
each patch.
● The remapped luminance L′ for a pixel with luminance L is obtained from the
interpolated CDF (scaled to the display range):
L′ = CDF(L)
● This remaps the luminance values in a way that improves the contrast locally.
● After adjusting the luminance values, the colour image is regenerated by combining the
adjusted luminance with the original colour ratios. This can be done using the method:
I′ = L′ · (I / L)
where I is the original intensity of each colour channel (R, G, B) and L is the original
luminance.
This ensures that the colour proportions are maintained while enhancing the luminance contrast.
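For comparison, OpenCV's CLAHE implements this tile-based scheme (including the local gain limit); a short sketch with an assumed clip limit and tile size:
import cv2

image = cv2.imread('input_image.jpg')
lab = cv2.cvtColor(image, cv2.COLOR_BGR2LAB)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
lab[:, :, 0] = clahe.apply(lab[:, :, 0])          # equalize only the lightness channel
enhanced = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
cv2.imwrite('clahe_output.jpg', enhanced)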
Additional refinements:
● Apply low-pass filtering to the CDFs to further smooth the transitions between patches.
● Handle colour clipping or saturation by identifying pixels where the colour channels have
saturated values and applying specific rules to these pixels to avoid unnatural colours
after histogram equalization.
Code:
import cv2
import numpy as np

def convert_to_luminance(image):
    # Weighted sum of the B, G, R channels (OpenCV channel order)
    b = image[:, :, 0].astype(float)
    g = image[:, :, 1].astype(float)
    r = image[:, :, 2].astype(float)
    return (0.114 * b + 0.587 * g + 0.299 * r).astype(np.uint8)

def local_histogram_equalization(luminance_image, patch_size=(64, 64)):
    h, w = luminance_image.shape
    patches_x = w // patch_size[0]
    patches_y = h // patch_size[1]
    equalized_image = np.zeros_like(luminance_image)
    for i in range(patches_y):
        for j in range(patches_x):
            x_start = j * patch_size[0]
            y_start = i * patch_size[1]
            x_end = (j + 1) * patch_size[0]
            y_end = (i + 1) * patch_size[1]
            patch = luminance_image[y_start:y_end, x_start:x_end]
            equalized_patch = cv2.equalizeHist(patch)
            equalized_image[y_start:y_end, x_start:x_end] = equalized_patch
    return equalized_image

image = cv2.imread('input_image.jpg')
luminance_image = convert_to_luminance(image)
equalized_luminance = local_histogram_equalization(luminance_image)

# Re-generate the colour image using the original colour ratios: I' = L' * (I / L)
luminance_image = luminance_image.astype(float)
luminance_image[luminance_image == 0] = 1  # avoid division by zero
ratio = equalized_luminance.astype(float) / luminance_image
final_image = np.clip(image.astype(float) * ratio[:, :, None], 0, 255).astype(np.uint8)

cv2.imwrite('output_image.jpg', final_image)
cv2.imshow('Equalized', final_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Disadvantages: Mirror padding reflects pixel values, which may introduce unrealistic data, especially if the
image has significant features or patterns at the borders, potentially distorting the data
representation.
Cyclic (wrap) padding: f(i, j) = f(k, l), where k = i mod M and l = j mod N.
Advantages: 1) Useful for periodic or cyclic data, as it avoids introducing artificial edges.
2) No information loss at the boundaries, creating a continuous appearance and preventing the
introduction of artificial borders.
3)Can help preserve important features near edges across various applications, including CNNs.
Disadvantages: 1)Can lead to unrealistic artifacts, especially in non-periodic data, causing
discontinuities or misleading visual representations.
2)Computationally more complex to implement as it needs to account for the wraparound nature
of indices.
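A quick way to see these boundary behaviours is NumPy's padding modes (a small sketch on an arbitrary 1-D array):
import numpy as np

x = np.array([1, 2, 3, 4])
print(np.pad(x, 2, mode='constant'))   # zero:       [0 0 1 2 3 4 0 0]
print(np.pad(x, 2, mode='edge'))       # replicate:  [1 1 1 2 3 4 4 4]
print(np.pad(x, 2, mode='wrap'))       # cyclic:     [3 4 1 2 3 4 1 2]
print(np.pad(x, 2, mode='reflect'))    # mirror:     [3 2 1 2 3 4 3 2]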
import glfw
from OpenGL.GL import *
from PIL import Image
import numpy as np
def load_texture(path):
    img = Image.open(path)
    img = img.transpose(Image.FLIP_TOP_BOTTOM)
    img_data = np.array(img, dtype=np.uint8)
    texture = glGenTextures(1)
    glBindTexture(GL_TEXTURE_2D, texture)
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, img.width, img.height, 0, GL_RGB,
                 GL_UNSIGNED_BYTE, img_data)
    glGenerateMipmap(GL_TEXTURE_2D)
    return texture
if not glfw.init():
    raise Exception("GLFW can't be initialized")
window = glfw.create_window(800, 600, "Texture wrapping modes", None, None)
if not window:
    glfw.terminate()
    raise Exception("GLFW window can't be created")
glfw.make_context_current(window)
# Quad vertices: x, y, z, u, v (texture coordinates go beyond [0, 1] to show the wrap mode)
vertices = np.array([ 0.5,  0.5, 0.0, 2.0, 2.0,
                      0.5, -0.5, 0.0, 2.0, 0.0,
                     -0.5, -0.5, 0.0, 0.0, 0.0,
                     -0.5,  0.5, 0.0, 0.0, 2.0], dtype=np.float32)
indices = np.array([0, 1, 3, 1, 2, 3], dtype=np.uint32)
VBO = glGenBuffers(1)
VAO = glGenVertexArrays(1)
EBO = glGenBuffers(1)
glBindVertexArray(VAO)
glBindBuffer(GL_ARRAY_BUFFER, VBO)
glBufferData(GL_ARRAY_BUFFER, vertices.nbytes, vertices, GL_STATIC_DRAW)
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, EBO)
glBufferData(GL_ELEMENT_ARRAY_BUFFER, indices.nbytes, indices, GL_STATIC_DRAW)
# Texture wrap mode under test: GL_REPEAT, GL_CLAMP_TO_EDGE, or GL_MIRRORED_REPEAT
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE)
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE)
# ... (shader setup and render loop omitted)
glDeleteVertexArrays(1, [VAO])
glDeleteBuffers(1, [VBO])
glDeleteBuffers(1, [EBO])
glfw.terminate()
Observations
GL_REPEAT: The texture will repeat when the texture coordinates exceed [0.0, 1.0]
GL_CLAMP_TO_EDGE: The texture will extend and stretch the edge pixels.
GL_MIRRORED_REPEAT: The texture will mirror itself when the coordinates go beyond [0.0,
1.0].
Answer for Q(3.10)
We are implementing the padding mechanisms (zero, replication, cyclic wrap, mirror).
During both convolutions, padding will be applied based on the selected mode.
IMPLEMENTATION:
import numpy as np
from scipy.ndimage import convolve1d

def pad_image(image, pad, mode='zero'):
    # Map the four mechanisms onto np.pad modes: zero, replicate, cyclic wrap, mirror
    np_mode = {'zero': 'constant', 'replicate': 'edge', 'wrap': 'wrap', 'mirror': 'reflect'}[mode]
    padded_image = np.pad(image, pad, mode=np_mode)
    return padded_image

def separable_convolve(image, h_kernel, v_kernel, mode='zero'):
    pad = len(h_kernel) // 2
    padded = pad_image(image, pad, mode)
    rows = convolve1d(padded, h_kernel, axis=1)                            # convolve along rows first
    cropped_image = convolve1d(rows, v_kernel, axis=0)[pad:-pad, pad:-pad] # then columns, then crop
    return cropped_image

# Example usage
if __name__ == "__main__":
    image = np.random.rand(5, 5)            # example grayscale image (random values)
    h_kernel = np.array([1, 2, 1]) / 4      # horizontal kernel
    v_kernel = np.array([1, 2, 1]) / 4      # vertical kernel
    print(separable_convolve(image, h_kernel, v_kernel, mode='mirror'))
1)Padding Mechanisms:
The pad_image function pads the input image based on the selected mode:
2)Separable Convolution:
3) Efficiency: Since the kernel is separable, we first convolve along the rows and then along the
columns, reducing the per-pixel cost from O(K²) to O(2K) multiplications for a K x K kernel.
Selective Sharpening
Selective sharpening focuses on enhancing the edges and details in specific parts of an image
without affecting the entire image. This technique is particularly useful when you want to
highlight certain features, such as the eyes in a portrait or the texture in a landscape, while
keeping other areas smooth.
● Edge Detection: The filter identifies edges in the image where there is a significant
change in pixel intensity.
● Sharpening: It increases the contrast along these edges, making them appear crisper and
more defined.
● Selective Application: The sharpening effect is applied only to the detected edges,
leaving other areas untouched.
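A minimal sketch of this idea, detecting edges and blending a sharpened copy back only where the edge mask is set (the Canny thresholds and kernel are illustrative):
import cv2
import numpy as np

def selective_sharpen(image, low=100, high=200):
    sharpen_kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
    sharpened = cv2.filter2D(image, -1, sharpen_kernel)
    edges = cv2.Canny(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), low, high)
    mask = cv2.dilate(edges, np.ones((3, 3), np.uint8)) > 0
    out = image.copy()
    out[mask] = sharpened[mask]          # sharpen only along the detected edges
    return out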
Gaussian filter:
Smooths the image by averaging the pixels within a specified kernel size. Makes a weighted
average of the pixels around it using a Gaussian function, which gets rid of noise and details.
# Function to apply Gaussian filter
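The Gaussian filter function itself is missing from the listing; a minimal stand-in consistent with the later call to manual_gaussian_blur(image) (kernel size and sigma are assumed values, and cv2/numpy are imported as elsewhere in the listing) could be:
def manual_gaussian_blur(image, ksize=5, sigma=1.5):
    # Build a normalized 2-D Gaussian kernel and convolve each channel with it
    ax = np.arange(ksize) - ksize // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    kernel /= kernel.sum()
    return cv2.filter2D(image, -1, kernel)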
Median filter:
Reduces noise while preserving edges. Changes the value of each pixel to the median value of
the pixels next to it. This gets rid of "salt-and-pepper" noise.
def manual_median_blur(image, ksize=3):
    pad = ksize // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode='edge')
    median_blurred = np.zeros_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            for k in range(image.shape[2]):
                median_blurred[i, j, k] = np.median(padded[i:i + ksize, j:j + ksize, k])
    return median_blurred
Bilateral :
Smooths the image while preserving edges. Using a weighted average to reduce noise without
blurring lines, it combines information about space and intensity.
def manual_bilateral_filter(image, d=9, sigma_color=75, sigma_space=75):
    bilateral_filtered = np.zeros_like(image, dtype=np.float64)
    half_d = d // 2
    img = image.astype(np.float64)
    padded = np.pad(img, ((half_d, half_d), (half_d, half_d), (0, 0)), mode='edge')
    y, x = np.mgrid[-half_d:half_d + 1, -half_d:half_d + 1]
    spatial_w = np.exp(-(x**2 + y**2) / (2 * sigma_space**2))   # domain (spatial) weights
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            for k in range(image.shape[2]):
                window = padded[i:i + d, j:j + d, k]
                range_w = np.exp(-(window - img[i, j, k])**2 / (2 * sigma_color**2))  # range weights
                w = spatial_w * range_w
                bilateral_filtered[i, j, k] = np.sum(w * window) / np.sum(w)
    return bilateral_filtered.astype(image.dtype)
Sharpening :
Makes the edges and features in the picture look better. It uses a kernel to bring out the
differences between pixels that are close to each other, which makes edges look sharper and
more defined.
def manual_sharpen(image):
    # Sharpening kernel: boosts the centre pixel relative to its 4-neighbours
    kernel = np.array([[0, -1, 0],
                       [-1, 5, -1],
                       [0, -1, 0]])
    return cv2.filter2D(image, -1, kernel)

import cv2
import numpy as np
import matplotlib.pyplot as plt
import urllib.request

def url_to_image(url):
    resp = urllib.request.urlopen(url)
    image = np.asarray(bytearray(resp.read()), dtype=np.uint8)
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    return image
# Image URL
image_url = 'https://cdn.clippingpath.in/wp-content/uploads/2017/11/Monalisa.jpg'
image = url_to_image(image_url)
# Apply filters
gaussian_blur = manual_gaussian_blur(image)
median_blur = manual_median_blur(image)
bilateral_filter = manual_bilateral_filter(image)
sharpened_image = manual_sharpen(image)
titles = ['Original Image', 'Gaussian Blur', 'Median Blur', 'Bilateral Filter', 'Sharpened Image']
images = [image, gaussian_blur, median_blur, bilateral_filter, sharpened_image]
plt.figure(figsize=(20, 5))
for i in range(5):
plt.subplot(1, 5, i+1)
plt.imshow(cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB))
plt.title(titles[i])
plt.axis('off')
plt.show()
Result :
Overall, Freeman and Adelson’s algorithm provides a powerful and flexible tool for
analyzing images with respect to directional information, making it widely applicable in
computer vision and image processing fields.
Code:
import numpy as np
import cv2
import matplotlib.pyplot as plt

def sobel_filters(image):
    Gx = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)
    Gy = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)
    return Gx, Gy

def steerable_response(Gx, Gy, theta):
    theta_rad = np.deg2rad(theta)
    response = np.cos(theta_rad) * Gx + np.sin(theta_rad) * Gy  # first-order steering
    return response
image = cv2.imread('/content/Air_India_Boeing_747-400_takes_off_from_Toronto.jpg',
                   cv2.IMREAD_GRAYSCALE)
if image is None:
    print("Error: Could not load the image. Please check the path.")
else:
    Gx, Gy = sobel_filters(image)
    thetas = [0, 45, 90, 135]
    responses = [steerable_response(Gx, Gy, t) for t in thetas]
    plt.figure(figsize=(12, 8))
    plt.subplot(2, 3, 1)
    plt.title('Original Image')
    plt.imshow(image, cmap='gray')
    for i, theta in enumerate(thetas):
        plt.subplot(2, 3, i + 2)
        plt.title(f'Response at {theta}°')
        plt.imshow(responses[i], cmap='gray')
    plt.tight_layout()
    plt.show()
Input Images:
Output:
Summary of the code:
The code performs edge detection on a grayscale image using Sobel filters and computes
steerable filter responses at various orientations (0°, 45°, 90°, 135°). Here's the summary:
1. Imports: numpy, cv2 (OpenCV), and matplotlib.pyplot for numerical operations,
image processing, and plotting.
2. Sobel Filters: The sobel_filters function calculates the gradients in the x and y
directions (Gx, Gy) using Sobel filters, highlighting edges in the image.
3. Steerable Filter Response: The steerable_response function computes the
response of the image to a steerable filter at any orientation (theta) by combining
Gx and Gy.
4. Image Loading: A grayscale image is loaded, and an error message is shown if it
fails.
5. Gradient Calculation: Sobel filters are applied to compute Gx and Gy.
6. Response Calculation: The filter response is computed for orientations 0°, 45°,
90°, and 135°.
7. Plotting: The original image and the filter responses at each orientation are
displayed using matplotlib.
This code helps visualize how an image responds to edge detection filters at different
angles.
This code applies first- and second-order Gaussian derivative filters to multiple images to
detect edges, corners, and intersections. The breakdown of the code is as follows:
● The test_filters_on_images() function takes a list of image file paths, applies the
filters to each image, and visualizes the results using matplotlib.
● It shows the original image, the first-order filter responses (for horizontal and
vertical edges), and the second-order filter responses (for corners and
intersections).
CODE:
import numpy as np
import cv2
import os
import matplotlib.pyplot as plt

def gaussian_derivative_kernels(size, sigma):
    # 2-D Gaussian and its first-order derivatives along x and y
    x = np.arange(-size // 2 + 1, size // 2 + 1)
    xx, yy = np.meshgrid(x, x)
    G = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    G = G / np.sum(G)  # Normalize
    G0 = -(xx / sigma**2) * G    # derivative along x (responds to vertical edges)
    G90 = -(yy / sigma**2) * G   # derivative along y (responds to horizontal edges)
    return G, G0, G90
if image is None:
continue
G0_response, G90_response, Gxx_response, Gyy_response, Gxy_response = apply_filters(image, sigma)
plt.figure(figsize=(14, 8))
plt.subplot(2, 3, 1)
plt.title('Original Image')
plt.imshow(image, cmap='gray')
plt.colorbar()
plt.subplot(2, 3, 2)
plt.imshow(G0_response, cmap='gray')
plt.colorbar()
plt.subplot(2, 3, 3)
plt.imshow(G90_response, cmap='gray')
plt.colorbar()
plt.subplot(2, 3, 4)
plt.colorbar()
plt.subplot(2, 3, 5)
plt.imshow(Gxy_response, cmap='gray')
plt.colorbar()
plt.subplot(2, 3, 6)
plt.title('Second-order Gyy')
plt.imshow(Gyy_response, cmap='gray')
plt.colorbar()
plt.tight_layout()
plt.show()
image_paths = [
r"/content/Air_India_Boeing_747-400_takes_off_from_Toronto.jpg",
r"/content/color-image.jpg",
r"/content/mahindra-thar-roxx-png-image.png",
]
test_filters_on_images(image_paths, sigma=1.5)
Input Images:
Output:
Summary of the Code:
This code applies Gaussian derivative filters to detect edges, corners, and
intersections in grayscale images. It uses both first-order and second-order filters to
identify structural features like edges and corners in the image.
Key Components:
With a smaller sigma, the filters detect small, sharp edges and fine textures, but they also pick
up more noise and are more sensitive to small variations in pixel intensity. With a larger sigma,
the filters detect broader, smoother edges and ignore small-scale details or noise, which is best
for detecting larger, more gradual changes in intensity across the image.
CODE IMPLEMENTATION:
import cv2
import numpy as np
import matplotlib.pyplot as plt

image = cv2.imread('input_image.jpg')
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Edge-preserving smoothing
bilateral_result = cv2.bilateralFilter(image_rgb, d=9, sigmaColor=75, sigmaSpace=75)
# Guided filter (requires opencv-contrib); here the guide image is the input itself
guided_result = cv2.ximgproc.guidedFilter(guide=image_rgb, src=image_rgb, radius=8, eps=500)

# Displaying results
plt.figure(figsize=(10, 10))
# Original Image
plt.subplot(1, 3, 1)
plt.imshow(image_rgb)
plt.title('Original Image')
plt.axis('off')
plt.subplot(1, 3, 2); plt.imshow(bilateral_result); plt.title('Bilateral Filter'); plt.axis('off')
plt.subplot(1, 3, 3); plt.imshow(guided_result); plt.title('Guided Filter'); plt.axis('off')
plt.show()
CODE EXPLANATION:
1.Bilateral Filter:
i. This filter applies edge-preserving smoothing by considering both spatial distance and intensity
difference. It's useful for noise reduction and smooths regions while keeping edges intact.
ii. Parameters:
2.Guided Filter:
i. A fast and efficient method for edge-preserving filtering, where the filtering result is guided by
another image. It helps smooth textures while preserving sharp edges.
ii. The guide_image and input_image are the same in this case, but they can be different in tasks
like joint filtering.
iii. Parameters:
3.Display:
● The original image, the result of the bilateral filter, and the guided filter are displayed
side by side using Matplotlib for visual comparison.
1. Linearity Property
This property states that the Fourier transform of a linear combination of functions
is the same linear combination of their Fourier transforms. If f(t) and g(t) are two
time-domain signals with Fourier transforms F(ω) and G(ω), and a and b are
constants, then:
F{a f(t) + b g(t)} = a F{f(t)} + b F{g(t)}
Proof:
2. Time Shift Property
F{f(t − t₀)} = e^(−jωt₀) F(ω)
Proof:
F{f(t − t₀)} = ∫−∞∞ f(t − t₀) e^(−jωt) dt
Let τ = t − t₀, so t = τ + t₀ and the integral becomes ∫−∞∞ f(τ) e^(−jω(τ + t₀)) dτ = e^(−jωt₀) F(ω).
3. Time Reversal Property
If f(t) has a Fourier transform F(ω), then the time-reversed function f(−t) has the
Fourier transform:
F{f(−t)} = F(−ω)
Proof:
F{f(−t)}=∫−∞∞f(−t)e−jωtdt
=∫−∞∞f(τ)ejωτdτ
= F(−ω)
4. Convolution Property
The Fourier transform of the convolution of two time-domain signals is the product
of their individual Fourier transforms. If f(t) and g(t) have Fourier transforms F(ω)
and G(ω), then:
F{f(t)∗g(t)}=F(ω)G(ω)
(f ∗ g)(t) = ∫−∞∞ f(τ) g(t − τ) dτ
F{(f ∗ g)(t)} = ∫−∞∞ (∫−∞∞ f(τ) g(t − τ) dτ) e^(−jωt) dt
= ∫−∞∞ f(τ) (∫−∞∞ g(t − τ) e^(−jωt) dt) dτ
= ∫−∞∞ f(τ) e^(−jωτ) (∫−∞∞ g(t′) e^(−jωt′) dt′) dτ
= F(ω) G(ω)
5. Correlation Property
The Fourier transform of the correlation of two functions is the product of one
Fourier transform and the complex conjugate of the other. For signals f(t) and g(t):
F{(f⋆g)(t)}=F(ω)G∗(ω)
Proof: The correlation is defined as:
(f⋆g)(t)=∫−∞∞f∗(τ)g(t+τ)dτ
F{(f⋆g)(t)}=F(ω)G∗(ω)
6. Multiplication Property
The Fourier transform of the product of two time-domain signals is the convolution
of their Fourier transforms. If f(t) and g(t) have Fourier transforms F(ω) and G(ω),
then:
F{f(t)g(t)}=1/2π∫−∞∞F(ξ)G(ω−ξ)dξ
Proof:
F{f(t)g(t)}=∫−∞∞f(t)g(t)e−jωtdt
Expanding f(t) and g(t) in terms of their inverse Fourier transforms and then
interchanging integrals yields the convolution.
7. Differentiation Property
F{dⁿf(t)/dtⁿ} = (jω)ⁿ F(ω)
Proof (for n = 1, using integration by parts):
F{df(t)/dt} = ∫−∞∞ (df(t)/dt) e^(−jωt) dt
= [f(t) e^(−jωt)]−∞∞ + jω ∫−∞∞ f(t) e^(−jωt) dt
Since the boundary terms vanish:
= jω F(ω)
8. Scaling Property
If f(t) has a Fourier transform F(ω), then scaling the time variable by a results in
the Fourier transform being scaled in frequency and divided by |a|:
F{f(at)} = (1/|a|) F(ω/a)
Proof:
F{f(at)}=∫−∞∞f(at)e−jωtdt
=1/a∫−∞∞f(τ)e−j(ω/a)τdτ
For a>0:
=1/a F(ω/a)
9. Conjugate Symmetry (Real Signals)
If f(t) is real, then its Fourier transform satisfies the conjugate symmetry property:
F(−ω)=F∗(ω)
This follows directly from the fact that for a real signal f(t), the imaginary part of
the Fourier transform must cancel out when performing the inverse Fourier
transform, ensuring that f(t) remains real.
Proof:
F(ω)=∫−∞∞f(t)e−jωtdt
10. Parseval's Theorem
Parseval's Theorem states that the total energy of a signal in the time domain is
equal to the total energy in the frequency domain. For a signal f(t) with Fourier
transform F(ω), the theorem can be written as:
∫−∞∞ |f(t)|² dt = (1/2π) ∫−∞∞ |F(ω)|² dω
Proof:
Start from ∫−∞∞ |f(t)|² dt and substitute the inverse transform
f(t) = (1/2π) ∫−∞∞ F(ω) e^(jωt) dω
Thus:
∫−∞∞ |f(t)|² dt = ∫−∞∞ |(1/2π) ∫−∞∞ F(ω) e^(jωt) dω|² dt
= (1/2π) ∫−∞∞ |F(ω)|² dω
This proves that the energy is preserved between the time and frequency domains.
Fourier Transform Pairs
1. Impulse (Dirac Delta)
F{δ(t)} = ∫−∞∞ δ(t) e^(−jωt) dt
Proof:
We know that the Dirac delta function has the sifting property:
∫−∞∞δ(t−t0)f(t)dt=f(t0)
For the Fourier transform of δ(t), we can apply this property with f(t) = e^(−jωt):
∫−∞∞ δ(t) e^(−jωt) dt = e^(−jω·0) = 1
F{δ(t)} = 1
Fourier Pair:
δ(t)⟷1
2. Shifted Impulse
F{δ(t−t0)}=∫−∞∞δ(t−t0)e−jωtdt
Proof:
∫−∞∞δ(t−t0)e−jωtdt=e−jωt0
F{δ(t−t0)}=e−jωt0
Fourier Pair:
δ(t−t0)⟷e−jωt0
3. Box Filter (Rectangular Pulse)
rect(t/T) = 1 for |t| ≤ T/2, and 0 for |t| > T/2
F{rect(t/T)} = ∫−∞∞ rect(t/T) e^(−jωt) dt
Proof:
F{rect(t/T)} = ∫−T/2 T/2 e^(−jωt) dt
= (e^(−jωT/2) − e^(jωT/2)) / (−jω) = 2 sin(ωT/2) / ω
F{rect(t/T)} = T sinc(ωT/2)
where sinc(x)=sin(x)/x.
Fourier Pair:
rect(t/T)⟷T sinc(ωT/2)
4. Tent Function (Triangular Pulse)
The tent function, or triangular pulse, is the convolution of two rectangular pulses:
tri(t)=rect(t)∗rect(t)
F{tri(t)} = sinc²(ω/2)
Proof:
F{rect(t) ∗ rect(t)} = F{rect(t)} · F{rect(t)}
F{tri(t)} = sinc(ω/2) · sinc(ω/2) = sinc²(ω/2)
Fourier Pair:
tri(t) ⟷ sinc²(ω/2)
5. Gaussian Function
f(t) = e^(−t²/(2σ²))
F{e^(−t²/(2σ²))} = σ√(2π) e^(−σ²ω²/2)
Proof: complete the square in the exponent,
e^(−t²/(2σ²)) e^(−jωt) = e^(−(t² + 2jωσ²t)/(2σ²))
This allows us to transform the integral into a Gaussian integral, which results in:
F{e^(−t²/(2σ²))} = σ√(2π) e^(−σ²ω²/2)
Fourier Pair:
e^(−t²/(2σ²)) ⟷ σ√(2π) e^(−σ²ω²/2)
6. Laplacian of Gaussian (LoG)
LoG(t) = d²/dt²(e^(−t²/(2σ²)))
The Fourier transform of the Laplacian of Gaussian can be derived using the
property that differentiation in the time domain corresponds to multiplication by
jω in the frequency domain. The Fourier transform of the Gaussian function is
already known:
F{e^(−t²/(2σ²))} = σ√(2π) e^(−σ²ω²/2)
F{d²/dt²(e^(−t²/(2σ²)))} = −ω² · σ√(2π) e^(−σ²ω²/2)
Proof:
F{d²/dt² f(t)} = (jω)² F{f(t)} = −ω² F{f(t)}
F{d²/dt²(e^(−t²/(2σ²)))} = −ω² · σ√(2π) e^(−σ²ω²/2)
F{LoG(t)} = −ω² · σ√(2π) e^(−σ²ω²/2)
Fourier Pair:
d²/dt²(e^(−t²/(2σ²))) ⟷ −ω² · σ√(2π) e^(−σ²ω²/2)
7. Gabor Function
Function:
f(t) = e^(−t²/(2σ²)) · e^(jω₀t)
We already know the Fourier transform of a Gaussian. Now we need to handle the
modulation by e^(jω₀t), which corresponds to a frequency shift in the frequency
domain.
F{e^(−t²/(2σ²)) e^(jω₀t)} = F{e^(−t²/(2σ²))} shifted by ω₀
F{e^(−t²/(2σ²)) · e^(jω₀t)} = σ√(2π) e^(−σ²(ω−ω₀)²/2)
Proof:
F{e^(−t²/(2σ²))} = σ√(2π) e^(−σ²ω²/2)
F{e^(−t²/(2σ²)) · e^(jω₀t)} = F{e^(−t²/(2σ²))} shifted by ω₀
F{e^(−t²/(2σ²)) · e^(jω₀t)} = σ√(2π) e^(−σ²(ω−ω₀)²/2)
Fourier Pair:
e^(−t²/(2σ²)) · e^(jω₀t) ⟷ σ√(2π) e^(−σ²(ω−ω₀)²/2)
8. Unsharp Mask
f(t) = δ(t) − e^(−t²/(2σ²))
We already know the Fourier transforms of δ(t) and e^(−t²/(2σ²)). Using linearity, the
Fourier transform of the unsharp mask is:
F{δ(t) − e^(−t²/(2σ²))} = 1 − σ√(2π) e^(−σ²ω²/2)
Fourier Pair:
δ(t) − e^(−t²/(2σ²)) ⟷ 1 − σ√(2π) e^(−σ²ω²/2)
9. Windowed Sinc
The Windowed Sinc function is a sinc function multiplied by a window function
(usually rectangular) to control sidelobes in frequency analysis or filtering. It is
defined as:
Windowed Sinc(t)=sinc(t)⋅rect(t/T)
The sinc function has a well-known Fourier transform. Applying the multiplication
property in the time domain results in a convolution in the frequency domain.
Proof:
F{sinc(t) · rect(t/T)} ∝ rect(ω) ∗ T sinc(ωT/2)
i.e. the ideal low-pass (rectangular) spectrum of the sinc is convolved with the sinc-shaped spectrum of the window, which smooths the band edges and introduces sidelobes.
Introduction
This involves implementing and evaluating various image filters for resizing operations,
focusing on both magnification and minification. The goal is to apply and compare the effects of
different filters, such as the windowed sinc filter and Gaussian filter, on synthetic and natural
images to understand their performance and visual quality.
Code Explanation
2. Image Resizing:
- resize_image(image, scale, interpolation): Resizes an image by scaling it up or down using
bilinear interpolation (or other methods if specified). This helps in understanding how different
filters perform during magnification or minification.
5. Visualization:
- Uses show_animation() to create and display animations of processed images, allowing for an
effective visual comparison of the different filtering and resizing techniques.
This approach helps in analyzing the quality and efficiency of various filters applied to image
resizing tasks, offering insights into their merits and deficiencies.
Code:
import numpy as np
import cv2
from scipy.signal.windows import hamming
from matplotlib import pyplot as plt
import matplotlib.animation as animation
# 3. Gaussian Filter
def apply_gaussian_filter(image, ksize, sigma):
"""Applies a Gaussian filter to the image."""
print("Applying Gaussian filter...")
return cv2.GaussianBlur(image, (ksize, ksize), sigma)
# Load a natural image for testing (replace 'path_to_image' with actual image path)
print("Loading natural image...")
image = cv2.imread('C:/Users/fawaz/Downloads/IMG_8680.jpg', cv2.IMREAD_GRAYSCALE)
if image is None:
raise ValueError("Image not found or could not be loaded.")
Image pyramids are a fundamental concept in image processing and computer vision. They
involve creating progressively lower resolution representations of an image to facilitate
multi-scale analysis.
This technique is particularly useful for tasks like object detection, image blending, texture
analysis, and optical flow.
2. Downsampling (Decimation)
1. 2x2 Block Filtering: The simplest downsampling technique. It reduces the image by
averaging the pixel values in a 2x2 block and keeps only one pixel from the block.
2. Burt and Adelson's Binomial Kernel: 1/16 (1, 4, 6, 4, 1)
This is a 5-tap separable filter that approximates a Gaussian filter. It smooths the
image before downsampling to avoid aliasing.
3. 7-Tap or 9-Tap Filters: These filters use a larger kernel (e.g., 7 or 9-tap) for
smoothing. They provide better filtering by considering more pixel values, ensuring a
high-quality downsampled image.
When we downsample an image (reduce its resolution), we must ensure that the image is
first smoothed (filtered). Without proper filtering, high-frequency components from the
original image may cause aliasing, a visual distortion where unwanted patterns or artifacts
appear in the downsampled image. This is why the choice of filter plays an essential role in
the quality of the pyramid.
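A quick way to see this (a sketch; the image path is a placeholder) is to compare naive decimation with filtered decimation:
import cv2

image = cv2.imread('test_image.jpg', cv2.IMREAD_GRAYSCALE)
naive = image[::2, ::2]              # drop every other pixel: prone to aliasing
filtered = cv2.pyrDown(image)        # Gaussian smoothing before 2x decimation
cv2.imwrite('naive_downsample.png', naive)
cv2.imwrite('filtered_downsample.png', filtered)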
Comparison of Filters
● Block filtering is the simplest but may cause artifacts due to aliasing.
● Burt and Adelson's binomial kernel strikes a good balance between computational
cost and image quality, reducing high-frequency artifacts without being too costly.
● High-quality 7 or 9-tap filters provide the best image quality, especially for detailed
images, but they are computationally more expensive.
CODE:
import cv2
import numpy as np
import matplotlib.pyplot as plt

def show_image(image, title):
    # Display helper (name chosen here)
    plt.imshow(image, cmap='gray')
    plt.title(title)
    plt.axis('off')
    plt.show()
def gaussian_pyramid_pyrdown(image, levels):
    # Baseline: OpenCV's 5-tap Gaussian smoothing + 2x decimation
    pyramid = [image]
    for i in range(levels):
        downsampled = cv2.pyrDown(pyramid[-1])
        pyramid.append(downsampled)
    return pyramid

def gaussian_pyramid_binomial(image, levels):
    # Burt and Adelson's binomial kernel 1/16 [1 4 6 4 1] applied before decimation
    binomial_kernel = np.array([1, 4, 6, 4, 1], dtype=np.float32) / 16
    pyramid = [image]
    for i in range(levels):
        filtered = cv2.sepFilter2D(pyramid[-1], -1, binomial_kernel, binomial_kernel)
        downsampled = cv2.pyrDown(filtered)
        pyramid.append(downsampled)
    return pyramid

def gaussian_pyramid_highquality(image, levels, taps=9):
    # Higher-quality 7- or 9-tap Gaussian smoothing before decimation
    pyramid = [image]
    for i in range(levels):
        filtered = cv2.GaussianBlur(pyramid[-1], (taps, taps), 0)
        downsampled = cv2.pyrDown(filtered)
        pyramid.append(downsampled)
    return pyramid
def shift_image(image, dx, dy):
    # Translate the image to test how shift-sensitive each pyramid is
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    return cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

image = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder path
levels = 4
shifted_image = shift_image(image, 2, 2)
Multiply each Laplacian level by the corresponding mask from the mask
pyramids.
Python Code:
import cv2
import numpy as np

def build_gaussian_pyramid(image, levels):
    pyramid = [image]
    for i in range(levels):
        image = cv2.pyrDown(image)
        pyramid.append(image)
    return pyramid

def build_laplacian_pyramid(gaussian_pyramid):
    laplacian_pyramid = []
    for i in range(len(gaussian_pyramid) - 1):
        size = (gaussian_pyramid[i].shape[1], gaussian_pyramid[i].shape[0])
        upsampled = cv2.pyrUp(gaussian_pyramid[i + 1], dstsize=size)
        laplacian_pyramid.append(gaussian_pyramid[i].astype(np.float32) - upsampled.astype(np.float32))
    laplacian_pyramid.append(gaussian_pyramid[-1].astype(np.float32))
    return laplacian_pyramid

def blend_pyramids(lap_pyramids, masks_pyramids):
    # Weighted sum of the Laplacian pyramids, normalised by the total mask weight per level
    blended_pyramid = []
    levels = len(lap_pyramids[0])
    for i in range(levels):
        blended = np.zeros_like(lap_pyramids[0][i])
        total_mask = np.zeros_like(lap_pyramids[0][i])
        for lap, masks in zip(lap_pyramids, masks_pyramids):
            blended += lap[i] * masks[i]
            total_mask += masks[i]
        total_mask[total_mask == 0] = 1
        blended /= total_mask
        blended_pyramid.append(blended)
    return blended_pyramid

def reconstruct_image(pyramid):
    image = pyramid[-1]
    for level in reversed(pyramid[:-1]):
        image = cv2.pyrUp(image, dstsize=(level.shape[1], level.shape[0]))
        image = image + level
    return image

# Blend pyramids (lap_pyramids and mask pyramids are built per input image)
# blended_pyramid = blend_pyramids(lap_pyramids, masks_pyramids)
blended_image = reconstruct_image(blended_pyramid)
# Example usage
output_file = 'blended_image.jpg'
cv2.imwrite(output_file, np.clip(blended_image, 0, 255).astype(np.uint8))
In the weighted summation stage, the weights (from the mask) should ideally
sum to 1 at each pixel location to avoid the need for renormalization. If the
sum of weights is not 1, you would need to normalize the weights to ensure the
final result is correctly balanced. This can be handled by adjusting the mask
values or scaling the final blended image.
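A small sketch of that renormalization step (the helper name is chosen here, not taken from the exercise):
import numpy as np

def normalize_masks(masks):
    # Ensure the per-pixel mask weights sum to 1 before the weighted summation
    total = np.sum(masks, axis=0)
    total[total == 0] = 1
    return [m / total for m in masks]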
Use Cases
This approach provides a flexible and effective way to blend images while
preserving details and transitions.
Python Code
import cv2
import numpy as np
def gaussian_pyramid(image, levels):
    pyramid = [image]
    for i in range(levels):
        image = cv2.pyrDown(image)
        pyramid.append(image)
    return pyramid
def laplacian_pyramid(gaussian_pyramid):
    laplacian_pyr = []
    for i in range(len(gaussian_pyramid) - 1):
        size = (gaussian_pyramid[i].shape[1], gaussian_pyramid[i].shape[0])
        upsampled = cv2.pyrUp(gaussian_pyramid[i + 1], dstsize=size)
        laplacian = cv2.subtract(gaussian_pyramid[i], upsampled)
        laplacian_pyr.append(laplacian)
    laplacian_pyr.append(gaussian_pyramid[-1])
    return laplacian_pyr
def manipulate_bands(laplacian_pyr, alpha=1.5, beta=0):
    manipulated_pyramid = []
    for level in laplacian_pyr:
        # Adjust contrast and brightness; alpha controls contrast, beta controls brightness
        manipulated = cv2.convertScaleAbs(level, alpha=alpha, beta=beta)
        manipulated_pyramid.append(manipulated)
    return manipulated_pyramid

def reconstruct_image(manipulated_pyramid):
    image = manipulated_pyramid[-1]
    for level in reversed(manipulated_pyramid[:-1]):
        size = (level.shape[1], level.shape[0])
        image = cv2.add(cv2.pyrUp(image, dstsize=size), level)
    return image

def enhance(image, levels=4):
    # Wrapper (name chosen here) tying the steps together
    gaussian_pyr = gaussian_pyramid(image, levels)
    laplacian_pyr = laplacian_pyramid(gaussian_pyr)
    manipulated_pyr = manipulate_bands(laplacian_pyr)
    final_image = reconstruct_image(manipulated_pyr)
    return final_image
# Example usage
import pywt
import numpy as np
import matplotlib.pyplot as plt
from skimage import data, img_as_float

image = img_as_float(data.camera())
cA, (cH, cV, cD) = pywt.dwt2(image, 'db1')   # approximation, horizontal, vertical, diagonal bands
This code decomposes the image into different sub-bands using the Daubechies wavelet (db1).
The result is four bands: approximation (low-low), horizontal (low-high), vertical (high-low),
and diagonal (high-high).
We'll use the Burt and Adelson’s binomial kernel for this:
import cv2
import numpy as np
Denoising Task:
After performing the wavelet decomposition, we threshold small wavelet coefficients to zero.
This is known as coring and can be done using a piecewise linear function that sets small values
to zero.
Threshold and Reconstruct:
# Set a threshold to zero out small coefficients (coring); keep the approximation band untouched
coeffs = pywt.wavedec2(image, 'db1', level=2)
threshold = 0.1
coeffs_thresh = [coeffs[0]] + [tuple(pywt.threshold(d, threshold, mode='soft') for d in detail)
                               for detail in coeffs[1:]]
denoised = pywt.waverec2(coeffs_thresh, 'db1')
Comparison of Techniques:
Wavelet Decomposition: Highly effective at preserving edges and textures due to its multi-scale,
multi-orientation properties.
Laplacian Pyramid: Can be overcomplete and more prone to shift-variance, but simpler for
multi-resolution image processing.
In denoising, wavelets often perform better than the Laplacian pyramid because they preserve the
important features of the image while efficiently removing noise. This is due to the multi-level
decomposition and better localization of frequency components.
Conclusion:
For denoising, wavelets typically outperform Laplacian pyramids due to their ability to localize
features across scales and orientations. For compression, wavelets also tend to be more efficient
in retaining the important features while allowing for better compression ratios due to tighter
frequency localization.
Answer for Ex3.22:
1. Affine Transformation
# Load an image
image = cv2.imread('image.jpg')
rows, cols, ch = image.shape
Result:
2. Perspective Transformation
image = cv2.imread('image.jpg')
rows, cols, ch = image.shape
# Four points from input image
pts1 = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
# Where those points should map to in the output (example values)
pts2 = np.float32([[0, 0], [300, 0], [0, 300], [300, 300]])
M = cv2.getPerspectiveTransform(pts1, pts2)
perspective_result = cv2.warpPerspective(image, M, (300, 300))
3. Bilinear Interpolation
Selecting the coarser level: When you choose a lower resolution level in the
MIP-map, you are essentially sampling from a smaller texture, which leads to
fewer details being preserved, resulting in a blurrier image.
Selecting the finer level: If you sample from a higher resolution level, you might
be picking too many samples in a small space, leading to aliasing (jagged lines
or noise).
Tri-linear MIP-mapping: This method interpolates between two adjacent
MIP-map levels. Blending between a blurred (coarser) image and an aliased
(finer) image can result in better visual quality because it balances the amount
of blur and aliasing. However, it may not completely solve the problem in all
cases.
Anisotropic Filtering:
When the ratio of horizontal and vertical resampling rates becomes very
different, MIP-mapping tends to fail because it doesn't consider directional
sampling. For example, when rendering textures that are viewed at steep angles
(like roads or walls), the texture can become distorted.
Anisotropic filtering: This technique samples more along the axis that has a
higher resampling rate (the axis where more distortion would occur). This
results in better quality, especially when textures are viewed from oblique
angles.
Elliptical Weighted Average (EWA): This method adapts the kernel used for
texture filtering based on the sampling rate of both horizontal and vertical axes,
thus reducing aliasing and blur for anisotropic cases.
Nearest neighbor: Fast but can result in pixelated images, especially for larger
transformations.
Bilinear interpolation: Smooth results, but can still blur fine details.
Bicubic interpolation: Provides smoother results compared to bilinear, but
computationally more expensive.
image = cv2.imread('image.jpg')
rows, cols, ch = image.shape
pts1 = np.float32([[50, 50], [200, 50], [50, 200]])
pts2 = np.float32([[10, 100], [250, 50], [100, 300]])  # Increased skew for more visible effect
M = cv2.getAffineTransform(pts1, pts2)
# Bilinear interpolation is the default resampling mode in warpAffine
skewed_result = cv2.warpAffine(image, M, (cols, rows), flags=cv2.INTER_LINEAR)
Result:
Answer to Qn 3.23
Part 1:
import cv2
import numpy as np
from scipy.interpolate import Rbf
import matplotlib.pyplot as plt

image = cv2.imread('image.jpg')
rows, cols = image.shape[:2]
# Select points to be displaced (you can choose more points interactively in practice)
src_points = np.array([[100, 100], [200, 200], [300, 300], [400, 400]], dtype=np.float32)  # Example points
dst_points = np.array([[150, 100], [200, 150], [250, 300], [450, 400]], dtype=np.float32)  # Displaced points
# Fit radial basis functions that map destination coordinates back to source coordinates
rbf_x = Rbf(dst_points[:, 0], dst_points[:, 1], src_points[:, 0], function='thin_plate')
rbf_y = Rbf(dst_points[:, 0], dst_points[:, 1], src_points[:, 1], function='thin_plate')
grid_x, grid_y = np.meshgrid(np.arange(cols), np.arange(rows))
map_x = rbf_x(grid_x, grid_y).astype(np.float32)
map_y = rbf_y(grid_x, grid_y).astype(np.float32)
warped_image = cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)
plt.imshow(cv2.cvtColor(warped_image, cv2.COLOR_BGR2RGB)); plt.title('RBF warp'); plt.axis('off'); plt.show()
Part 2:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Beier-Neely Warp Function
def beier_neely_warp(image, src_lines, dst_lines):
    def calc_u_v(x, y, P, Q):
        # Beier-Neely line coordinates: u along the segment, v perpendicular distance
        X = np.array([x, y], dtype=np.float64)
        PQ = Q - P
        PQ_len = np.linalg.norm(PQ)
        PQ_normalized = PQ / PQ_len if PQ_len != 0 else np.array([0.0, 0.0])
        perp = np.array([-PQ_normalized[1], PQ_normalized[0]])
        u = np.dot(X - P, PQ) / (PQ_len ** 2) if PQ_len != 0 else 0.0
        v = np.dot(X - P, perp)
        return u, v
for y in range(rows):
for x in range(cols):
u, v = calc_u_v(x, y, P_src, Q_src)
return warped_image.astype(np.uint8)
plt.subplot(1, 2, 2)
plt.imshow(warped_image)
plt.title('Warped Image')
plt.show()
Part 3:
import cv2
import numpy as np
import matplotlib.pyplot as plt
map_x = map_x.astype(np.float32)
map_y = map_y.astype(np.float32)
for i in range(len(grid[0].flatten())):
x_src, y_src = grid[0].flatten()[i], grid[1].flatten()[i]
x_dst, y_dst = deformed_grid[0].flatten()[i], deformed_grid[1].flatten()[i]
plt.subplot(1, 2, 2)
plt.imshow(warped_image)
plt.title('Warped Image')
plt.show()
Part 4:
Yes, the Beier–Neely warp reduces to a point-based deformation as the line segments become shorter.
This is because when the length of the line segments approaches zero, the line segments effectively
reduce to points. In this case, the interpolation formula collapses to a point-based interpolation, where the
deformation is driven purely by the individual points.
The main reason for this is that the influence of the lines is based on their length and proximity. As lines
get smaller, their effect becomes localised around the points they connect, resulting in behaviour similar
to point-based warping techniques.
Exercise 3.25:
Code:
import cv2
import numpy as np
img1 = cv2.imread(img1_path)
img2 = cv2.imread(img2_path)
def convert_to_rgb(img):
return img
return warped_img
def morph_images(img1, img2, alpha):
if img1.shape[2] != img2.shape[2]:
def main():
img1 = convert_to_rgb(img1)
img2 = convert_to_rgb(img2)
pts1 = np.float32([[50, 50], [200, 50], [50, 200]]) # Example points in img1
pts2 = np.float32([[60, 60], [220, 50], [60, 220]]) # Corresponding points in img2
cv2.imwrite('morphed_image.jpg', morphed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
if __name__ == "__main__":
main()
Explanation:
Key Points:
Output:
import numpy as np
import json
import os
# Global variables for cropping, transformations, and image management
cropping = False
selected_image = None
canvas = None
transformation_mode = None
transformation_points = []
dragging_point = None
if event == cv2.EVENT_LBUTTONDOWN:
if transformation_mode:
for i, pt in enumerate(transformation_points):
dragging_point = i
break
else:
# Start cropping
cropping = True
x_start, y_start = x, y
if cropping:
x_end, y_end = x, y
img_copy = image.copy()
cv2.imshow("Image", img_copy)
transformation_points[dragging_point] = (x, y)
update_transformed_image()
if cropping:
# Finish cropping
cropping = False
x_end, y_end = x, y
else:
dragging_point = None
def update_transformed_image():
pts2 = np.float32(transformation_points)
pts2 = np.float32(transformation_points)
def save_canvas(filename):
global canvas, images
data = {
'canvas': canvas.tolist(),
json.dump(data, f)
def load_canvas(filename):
data = json.load(f)
global canvas
return canvas
def affine_transform(image):
pts2 = np.float32([[0, rows * 0.33], [cols * 0.85, rows * 0.25], [cols * 0.15, rows * 0.7]])
def perspective_transform(image):
pts1 = np.float32([[0, 0], [cols - 1, 0], [0, rows -1], [cols -1, rows -1]])
pts2 = np.float32([[cols*0.1, rows*0.2], [cols*0.9, rows*0.1], [cols*0.2, rows*0.9], [cols*0.8,
rows*0.8]])
def similarity_transform(image):
angle = 30 # degrees
scale = 0.8
def rigid_transform(image):
angle = 30 # degrees
# Rotation matrix
rigid_matrix[0, 2] += tx
rigid_matrix[1, 2] += ty
def image_editor():
image = cv2.imread(image_path)
cv2.namedWindow("Image")
cv2.setMouseCallback("Image", mouse_crop)
cv2.imshow("Image", image)
while True:
break
filename = 'canvas.json'
save_canvas(filename)
filename = 'canvas.json'
load_canvas(filename)
transformation_mode = 'affine'
transformation_mode = 'perspective'
tx, ty = 100, 50
selected_image = rigid_transform(selected_image)
selected_image = similarity_transform(selected_image)
selected_image = affine_transform(selected_image)
selected_image = perspective_transform(selected_image)
elif key == ord('n'): # Paste current image to canvas
cv2.imshow("Canvas", canvas)
cv2.destroyAllWindows()
image_editor()
OUTPUTS:
Answer:
class Viewer3D:
def __init__(self, root):
self.root = root
self.canvas = tk.Canvas(root, width=800, height=600, bg="white")
self.canvas.pack()
# Add menu
menu = tk.Menu(root)
root.config(menu=menu)
file_menu = tk.Menu(menu, tearoff=0)
menu.add_cascade(label="File", menu=file_menu)
file_menu.add_command(label="Load Scene", command=self.load_scene)
file_menu.add_command(label="Save Scene", command=self.save_scene)
file_menu.add_command(label="Load Texture", command=self.load_texture)
file_menu.add_separator()
file_menu.add_command(label="Exit", command=root.quit)
def load_scene(self):
file_path = filedialog.askopenfilename(filetypes=[("JSON files", "*.json")])
if file_path:
with open(file_path, 'r') as f:
data = json.load(f)
self.points = data['points']
self.lines = data['lines']
self.polygons = data['polygons']
self.object_transform = np.array(data['object_transform'])
self.camera_transform = np.array(data['camera_transform'])
self.render()
def save_scene(self):
data = {
'points': self.points,
'lines': self.lines,
'polygons': self.polygons,
'object_transform': self.object_transform.tolist(),
'camera_transform': self.camera_transform.tolist()
}
file_path = filedialog.asksaveasfilename(defaultextension=".json")
if file_path:
with open(file_path, 'w') as f:
json.dump(data, f)
def load_texture(self):
file_path = filedialog.askopenfilename(filetypes=[("Image files",
"*.png;*.jpg")])
if file_path:
self.texture_image = Image.open(file_path)
self.texture = self.texture_image.load() # Load texture pixels
self.render()
self.render()
rotation_x = np.array([
[1, 0, 0, 0],
[0, np.cos(angle_x), -np.sin(angle_x), 0],
[0, np.sin(angle_x), np.cos(angle_x), 0],
[0, 0, 0, 1]
])
rotation_y = np.array([
[np.cos(angle_y), 0, np.sin(angle_y), 0],
[0, 1, 0, 0],
[-np.sin(angle_y), 0, np.cos(angle_y), 0],
[0, 0, 0, 1]
])
def render(self):
self.canvas.delete("all")
transformed_points = [
self.transform_point(point, np.dot(self.camera_transform,
self.object_transform))
for point in self.points
]
if self.texture:
self.draw_textured_polygon(flat_points, polygon['uvs'])
else:
self.canvas.create_polygon(flat_points, outline="black", fill="gray",
stipple="gray50")
self.fill_triangle(tri_points, tri_uvs)
if __name__ == "__main__":
root = tk.Tk()
viewer = Viewer3D(root)
root.mainloop()
Explanation
Loading the Texture:
The load_texture method loads an image and stores its pixel data.
We use the PIL library to handle image operations.
Texture Coordinates:
Each polygon now includes a list of (u, v) texture coordinates for each vertex.
These are stored in the polygons list as part of the polygon data.
Rendering with Textures:
The get_texture_color method samples the texture using the interpolated (u, v)
coordinates, fetching the appropriate color from the texture image.
Step 4: Run and Test
Run the program to load a scene, apply transformations, load a texture, and render
the polygons with the texture mapped.
Notes
Barycentric Coordinates: This is a basic approach to interpolate the texture over
the polygon. For large or complex scenes, more sophisticated methods or
optimizations might be required.
Perspective Correction: In this example, texture mapping is done without
perspective correction. For accurate 3D rendering, you'd need to implement
perspective-correct texture mapping. This involves dividing texture coordinates by
the depth (w) before interpolation.
Performance: The implementation is simplified for educational purposes and may
not be optimized for large-scale 3D scenes or complex textures.
import numpy as np
import cv2
import matplotlib.pyplot as plt

# Load the image (for synthetic noise, we start with a clean image)
image = cv2.imread('cv1.jpeg', cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
noise_stddev = 25
noisy_image = image + np.random.normal(0, noise_stddev / 255.0, image.shape)
noisy_image = np.clip(noisy_image, 0, 1)
# Denoise with Gaussian smoothing and Non-Local Means
gaussian_denoised = cv2.GaussianBlur(noisy_image, (5, 5), 1.5)
nlm_denoised = cv2.fastNlMeansDenoising((noisy_image * 255).astype(np.uint8), None, h=15).astype(np.float32) / 255.0
# Display Results
fig, axes = plt.subplots(1, 4, figsize=(16, 4))
axes[0].imshow(image, cmap='gray')
axes[0].set_title('Original Image')
axes[0].axis('off')
axes[1].imshow(noisy_image, cmap='gray')
axes[1].set_title('Noisy Image')
axes[1].axis('off')
axes[2].imshow(gaussian_denoised, cmap='gray')
axes[2].set_title('Gaussian Denoised')
axes[2].axis('off')
axes[3].imshow(nlm_denoised, cmap='gray')
axes[3].set_title('NLM Denoised')
axes[3].axis('off')
plt.show()
import numpy as np
import cv2
import matplotlib.pyplot as plt

# You can use a low-light image or any real-world noisy image sequence
image = cv2.imread('cv2.jpeg', cv2.IMREAD_GRAYSCALE)
gaussian_denoised = cv2.GaussianBlur(image, (5, 5), 1.5)
nlm_denoised = cv2.fastNlMeansDenoising(image, None, h=15)
# Since we don't have the ground truth clean image for real-world data, we won't compute PSNR/SSIM here.
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
axes[0].imshow(image, cmap='gray')
axes[0].set_title('Noisy Input Image')
axes[0].axis('off')
axes[1].imshow(gaussian_denoised, cmap='gray')
axes[1].set_title('Gaussian Denoised')
axes[1].axis('off')
axes[2].imshow(nlm_denoised, cmap='gray')
axes[2].set_title('NLM Denoised')
axes[2].axis('off')
plt.show()
We have implemented Gaussian Smoothing and Non-Local Means (NLM).
Yes, the performance of image denoising algorithms like Gaussian Smoothing and
Non-Local Means(NLM) does depend on the correct choice of noise level
estimation. Here’s how:
1. Impact of Noise Level Estimate:
Gaussian Smoothing:
● Noise level sensitivity: Gaussian smoothing applies a low-pass filter that
removes high-frequency noise but blurs details. While it doesn't explicitly
estimate noise, the filter size and standard deviation (which correspond to
the "strength" of smoothing) need to match the actual noise level.
● Too low filter strength: Not enough noise reduction.
● Too high filter strength: Excessive blurring, loss of edges and fine details.
● Summary: Gaussian smoothing is relatively simple but lacks adaptability to
varying noise levels. Its performance is not directly dependent on estimating
noise but rather on tuning the smoothing parameters.
2. Comparison of Techniques:
Gaussian Smoothing:
● Advantages: Simple, fast, and works well in cases where the noise is evenly
distributed and the image doesn’t have many sharp edges.
● Disadvantages: Blurs the entire image uniformly, including edges and fine
details, leading to loss of sharpness. It is less effective when noise is high or
complex.
● Conclusion: Better suited for mild, uniform noise and situations where
computational simplicity is more important than preserving fine details.
3. Conclusions:
● NLM is superior to Gaussian smoothing when preserving details and
adapting to varying noise levels.
● Accurate noise level estimation is critical for NLM to perform optimally. If
the noise level is underestimated or overestimated, it may under- or over-
smooth the image.
● For Gaussian smoothing, the performance is mainly dependent on choosing
appropriate filter parameters, though it's less flexible compared to NLM.
● In real-world scenarios, NLM tends to outperform simpler methods like
Gaussian smoothing, particularly in high-noise environments (e.g., low-light
sequences).
def create_rainbow_mask(image):
    hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    lower_red1 = np.array([0, 50, 50])
    upper_red1 = np.array([10, 255, 255])
    lower_red2 = np.array([170, 50, 50])
    upper_red2 = np.array([180, 255, 255])
    lower_orange = np.array([11, 50, 50])
    upper_orange = np.array([25, 255, 255])
    lower_yellow = np.array([26, 50, 50])
    upper_yellow = np.array([35, 255, 255])
    lower_green = np.array([36, 50, 50])
    upper_green = np.array([85, 255, 255])
    lower_blue = np.array([86, 50, 50])
    upper_blue = np.array([130, 255, 255])
    lower_violet = np.array([131, 50, 50])
    upper_violet = np.array([160, 255, 255])
    mask_red1 = cv2.inRange(hsv_image, lower_red1, upper_red1)
    mask_red2 = cv2.inRange(hsv_image, lower_red2, upper_red2)
    mask_red = mask_red1 | mask_red2
    mask_orange = cv2.inRange(hsv_image, lower_orange, upper_orange)
    mask_yellow = cv2.inRange(hsv_image, lower_yellow, upper_yellow)
    mask_green = cv2.inRange(hsv_image, lower_green, upper_green)
    mask_blue = cv2.inRange(hsv_image, lower_blue, upper_blue)
    mask_violet = cv2.inRange(hsv_image, lower_violet, upper_violet)
    rainbow_mask = mask_red | mask_orange | mask_yellow | mask_green | mask_blue | mask_violet
    return rainbow_mask