DT081A - Signal and Image Processing Lab 1 Report
1. Introduction
This lab covers the basics of signal and image processing and gives hands-on experience with
several techniques. It starts with generating and visualizing signals in both the time domain and
the frequency domain using the Fourier transform. Changing parameter values and plotting the
results shows how the signals behave. The lab also explores how sampling and aliasing affect
signal accuracy.
In the image processing section, we focus on tasks such as resizing images and reducing their
level of detail (sampling and quantization). We measure image quality using PSNR and SSIM
and apply techniques such as Gaussian blur to reduce noise. The effects of adding noise to
images are also analyzed.
In the audio processing part, we analyze sound signals by downsampling
(reducing the sample rate) and examining their frequency components. We also study methods
such as noise reduction, pitch shifting, and time stretching to modify the audio. Overall, this
lab helps in understanding how these techniques are used to improve both images and
audio.
2. Experimental Setup
Tools Used:
• Python
• VS Code
Libraries:
A sawtooth wave is a type of non-sinusoidal waveform that rises upward and then sharply
drops, creating a "saw-like" shape.
We use Python's SciPy library, which provides a simplified function to generate a sawtooth
wave:
The term (2 * np.pi * 5 * t) converts the frequency into angular form (radians per
second) for the time vector t. This produces a periodic wave that linearly rises and then
sharply drops after each cycle.
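The generation step described above can be sketched as follows. This is a minimal example that assumes scipy.signal.sawtooth is the SciPy function referred to and that the time vector covers one second with 1000 samples; neither detail is confirmed by the text.

```python
import numpy as np
from scipy import signal

# Assumed time vector: 1 second, 1000 samples, endpoint excluded
t = np.linspace(0, 1, 1000, endpoint=False)

# 5 Hz sawtooth: the argument is the phase in radians
sawtooth_wave = signal.sawtooth(2 * np.pi * 5 * t)

# Count the sharp drops between consecutive samples
# (one drop at each cycle boundary that falls inside the window)
drops = int(np.sum(np.diff(sawtooth_wave) < -1))
print(sawtooth_wave.min(), sawtooth_wave.max(), drops)
```

The wave stays within -1 and 1, and each sharp drop marks the end of one linear ramp.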
Q2:
• Increased frequency:
A higher frequency means the sine wave completes more cycles in the same period.
At 20 Hz, the wave will now complete 20 cycles per second.
• Visual impact:
The sine wave will appear more compressed, with more oscillations (cycles) in the
same time range (0 to 1 second) on the plot.
Q3:
The np.ceil function rounds each element of its input up to the nearest integer. The
sine function np.sin(2 * np.pi * 5 * t) ranges from -1 to 1. After applying
np.ceil, positive sine values round up to 1, while values in the range (-1, 0] round up
to 0 (a sample exactly equal to -1 would stay at -1).
After multiplying by 2 and subtracting 1, 1 stays 1 and 0 becomes -1. As a
result, the output starts at -1 (the sine is 0 at t = 0) and still oscillates between -1 and 1,
like the original square wave.
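The rounding argument above can be checked with a short sketch. The time vector is an assumption (one second, 1000 samples), and the expression 2 * np.ceil(x) - 1 is reconstructed from the description rather than copied from the lab code.

```python
import numpy as np

t = np.linspace(0, 1, 1000)        # assumed time vector
x = np.sin(2 * np.pi * 5 * t)      # 5 Hz sine, values in [-1, 1]

# ceil maps (0, 1] to 1 and (-1, 0] to 0; scaling then maps {1, 0} to {1, -1}
square_wave = 2 * np.ceil(x) - 1

print(np.unique(square_wave))      # distinct output levels
print(square_wave[0])              # value at t = 0
```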
Q4:
Adding the square wave and the sine wave together gives a composite signal:
composite_signal = sine_wave + square_wave
Amplitude:
Frequency Components:
It includes frequency components from both the sine wave (20 Hz) and the square
wave (5 Hz fundamental plus its odd harmonics).
Shape:
The signal follows the sine wave's smooth oscillations, with sudden jumps contributed
by the square wave. It is therefore more complex than either wave
alone.
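A minimal sketch of the composite signal follows. It assumes both waves have unit amplitude and that the square wave is built with np.sign; the lab code may construct it differently.

```python
import numpy as np

t = np.linspace(0, 1, 1000, endpoint=False)         # assumed time vector
sine_wave = np.sin(2 * np.pi * 20 * t)              # 20 Hz sine
square_wave = np.sign(np.sin(2 * np.pi * 5 * t))    # 5 Hz square (assumed construction)
composite_signal = sine_wave + square_wave

# With unit amplitudes the sum cannot exceed 2 in magnitude
peak = np.max(np.abs(composite_signal))
print(f"peak amplitude: {peak:.2f}")
```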
Q5:
Understanding different signal types is useful in many real-world applications,
including telecommunications, medical imaging, and audio processing.
Music production is one example, where various techniques
enhance sound quality and creativity.
In music production, different waveforms like sine, square, and sawtooth waves are used
to create and manipulate sounds. Each type of waveform has unique characteristics that
affect the final sound.
Sine Waves:
A sine wave generates a smooth, pure tone at a single fundamental frequency, which makes
it useful for testing audio equipment.
Square Waves:
A square wave switches abruptly between high and low levels, which creates sharp, edgy
sounds commonly used in bass lines and electronic music.
Sawtooth Waves:
A sawtooth wave rises gradually and drops sharply. It contains both odd and even harmonics,
which produce bright, rich sounds in electronic music.
Q1:
By analyzing the frequency domain plot, we found that frequencies of 10 Hz and 20 Hz are
present in the signal; they appear in the plot as spikes.
Q2:
After modifying the function to add a 30 Hz component, the frequency
domain plot shows an additional spike at 30 Hz.
Q3:
Adding random noise to the signal produces many small spikes with random
amplitudes spread across the frequency domain plot, while the original peaks at 10
Hz, 20 Hz, and 30 Hz remain visible.
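The observations for Q1 to Q3 can be reproduced with a sketch like the one below. The sampling rate, the equal amplitudes, the noise level, and the random seed are all assumptions chosen for illustration.

```python
import numpy as np

fs = 1000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs                      # 1 second of samples
rng = np.random.default_rng(0)              # fixed seed for a repeatable example

signal_clean = (np.sin(2 * np.pi * 10 * t)
                + np.sin(2 * np.pi * 20 * t)
                + np.sin(2 * np.pi * 30 * t))
signal_noisy = signal_clean + 0.5 * rng.standard_normal(t.size)

# Magnitude spectrum of the noisy signal
spectrum = np.abs(np.fft.rfft(signal_noisy))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)

# Frequencies of the three largest magnitudes
top3 = sorted(freqs[np.argsort(spectrum)[-3:]].tolist())
print(top3)
```

The noise raises the floor across all bins, but the three tone peaks remain far larger than any noise bin at this noise level.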
Q4:
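Here is the function used to generate the chirp signal:
def generate_signal(t):
    return np.sin(2 * np.pi * (10 + 20*t) * t)

t = np.linspace(0, 1, 1000)
The term (10 + 20*t) multiplies t inside the sine, so the phase is 2π(10t + 20t²). The
instantaneous frequency is the derivative of the phase divided by 2π, which is 10 + 40t.
The frequency therefore starts at 10 Hz (when t is 0) and increases linearly to 50 Hz
(when t is 1), not to 30 Hz as the factor (10 + 20*t) alone might suggest. Using
(10 + 10*t) * t instead would give a sweep from 10 Hz to 30 Hz.
This time-varying frequency creates a chirp, where the signal begins at a lower frequency
and moves to a higher frequency as time progresses.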
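The sweep range of a chirp can be checked numerically by differentiating the phase. The sketch below does this for the phase expression used in generate_signal and prints the start and end frequencies; the use of np.gradient is an illustrative choice, not part of the lab code.

```python
import numpy as np

t = np.linspace(0, 1, 1000)
phase = 2 * np.pi * (10 + 20 * t) * t       # argument of the sine in generate_signal

# Instantaneous frequency in Hz = (1 / (2*pi)) * d(phase)/dt
inst_freq = np.gradient(phase, t) / (2 * np.pi)

print(f"start: {inst_freq[0]:.1f} Hz, end: {inst_freq[-1]:.1f} Hz")
```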
Q5:
Understanding the frequency domain of a signal is important in many areas. In
medicine, for example, an EEG records the brain's electrical activity, which is then
analyzed in terms of its frequency components. Different brain states, such as sleep,
produce characteristic frequency patterns.
By analyzing these patterns in the frequency domain, doctors can diagnose conditions like
epilepsy, monitor brain function, and guide treatment strategies for neurological disorders.
The Nyquist rate is twice the highest frequency component in the signal:
Nyquist rate = 2 × f_max = 2 × 10 Hz = 20 Hz
The current sampling rate in the code is fs = 15 Hz, which is below the Nyquist rate
(20 Hz). The sampling rate is therefore not sufficient, and aliasing can occur.
Q2:
A higher sampling rate (25 Hz) provides more data points per second, which allows for a
more accurate representation of the original signal in the sampled version.
The sampled points will be spaced closer together, making the sampled signal more
closely resemble the smooth shape of the original continuous signal. Since 25 Hz is greater
than the Nyquist rate (20 Hz), aliasing will not occur.
Q3:
The Nyquist rate for a 20 Hz signal is 40 Hz (since Nyquist rate = 2 × signal frequency).
However, the signal is sampled at 30 Hz, which is less than the Nyquist rate. This
leads to aliasing.
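The three sampling cases discussed above can be compared with a small sketch. The helper names nyquist_rate and alias_frequency are hypothetical; the alias formula folds the signal frequency to the nearest multiple of the sampling rate and applies to a single sinusoid.

```python
def nyquist_rate(f_max):
    """Minimum sampling rate needed for a highest frequency f_max (in Hz)."""
    return 2 * f_max


def alias_frequency(f_signal, fs):
    """Apparent frequency of a sinusoid at f_signal when sampled at fs."""
    return abs(f_signal - fs * round(f_signal / fs))


# (signal frequency, sampling rate) pairs from the three questions above
for f_signal, fs in [(10, 15), (10, 25), (20, 30)]:
    sufficient = fs > nyquist_rate(f_signal)
    print(f"{f_signal} Hz sampled at {fs} Hz: sufficient={sufficient}, "
          f"appears as {alias_frequency(f_signal, fs)} Hz")
```

When the sampling rate is sufficient, the apparent frequency equals the true frequency; otherwise the signal shows up at a lower, false frequency.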
Q4:
The "wagon-wheel effect" occurs when the sampling rate is too low relative to the
signal's actual frequency (below the Nyquist rate), which makes the signal appear to move
at a different, often slower, rate than it really does. It can even create the illusion of
reverse motion, because the sampling points miss the true motion of the oscillations.
Q5:
Understanding aliasing is crucial in medical imaging, particularly in MRI (Magnetic
Resonance Imaging). In MRI, if the signal is sampled at a rate lower than the required
Nyquist rate, aliasing can occur, causing the image to appear distorted or showing false
patterns. This can lead to misinterpretation of the scans and potentially inaccurate
diagnoses. To prevent this, proper sampling rates and filtering techniques are used to
ensure the images accurately represent the underlying tissue or structures, allowing for
correct medical assessments.
For the test image we used, distortion becomes apparent at a resampling factor of 4.
Q2:
Characteristics of Distortion Introduced by Resampling:
• Pixelation:
As the resampling factor increases, fewer pixels represent the image, leading to a
blocky or pixelated appearance.
• Loss of Detail:
Fine details in the image, such as edges and textures, are blurred or lost because
they are no longer sampled at a high enough resolution.
• Smoothing:
The image becomes progressively smoother, with sharp edges becoming more
rounded or blurred due to insufficient pixel representation.
• Geometric Deformation:
Lines and shapes can lose their original form, becoming less defined and less
recognizable.
Key Differences:
Resampling Distortion is primarily spatial and affects the image's resolution and detail,
resulting in a blocky, pixelated appearance.
Requantization Distortion is tonal, affecting the image's color and intensity depth. It results
in banding and loss of smooth transitions, with sharper contrasts between adjacent
intensity levels.
Q3:
Most images use 8 bits per channel (24 bits total for RGB images). Reducing this too far
causes obvious distortion.
For this image, distortion becomes apparent at 4 bpp.
Q4:
Here is the function that applies both resampling and requantization to the image:
def image_combinition(image, factor, bpp):
    # Reduce spatial resolution first, then reduce the number of intensity levels
    resampled = resample_image(image, factor)
    resampled_requantized = requantize_image(resampled, bpp)
    return resampled_requantized
We experimented with different combinations of resampling factor and bits per pixel; the
results are shown below.
Of the combinations tested, factor 3 with bpp 6 still looks visually clear.
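The helpers resample_image and requantize_image are not shown in this report. The sketch below gives one possible implementation for 8-bit grayscale arrays, purely as an assumption: resampling keeps every factor-th pixel and repeats it back to the original size, and requantization clears the lowest 8 - bpp bits.

```python
import numpy as np


def resample_image(image, factor):
    """Keep every `factor`-th pixel, then repeat pixels back to the original size."""
    small = image[::factor, ::factor]
    restored = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return restored[:image.shape[0], :image.shape[1]]


def requantize_image(image, bpp):
    """Keep only the `bpp` most significant bits of an 8-bit image."""
    shift = 8 - bpp
    return (image >> shift) << shift


def image_combinition(image, factor, bpp):
    """Apply resampling first, then requantization."""
    return requantize_image(resample_image(image, factor), bpp)


# Synthetic 8-bit test image standing in for the lab image
image = np.arange(64, dtype=np.uint8).reshape(8, 8) * 4
result = image_combinition(image, factor=2, bpp=4)
print(result.shape, np.unique(result).size)
```

Keeping the output the same size as the input makes a pixel-by-pixel comparison with the original (for example PSNR) straightforward.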
Q5:
Nearest Neighbor Resampling:
This method results in blocky, pixelated images because each pixel in the downsampled
image is simply the closest original pixel. It's fast but introduces a lot of visual distortion,
especially at higher downsampling factors.
Bicubic Interpolation:
Bicubic interpolation produces much smoother images because it considers a larger pixel
neighborhood when resizing. This results in less distortion and better preservation of edges
and textures.
Q6:
Testing different types of images shows that the result depends on the amount of detail
in the image. The more detail an image has, the more the compression affects its quality.
3.2.2 PSNR Calculation
Q1:
For resampling factors (Factor 2, Factor 4, Factor 8):
As the resampling factor increases, the PSNR decreases, indicating a higher level of
distortion. For Factor 2, the PSNR is relatively high (34.01 dB), showing minimal distortion.
However, at Factor 8, the PSNR drops to 30.48 dB, meaning that the image has become
more distorted as more data is lost during the resampling process.
For bits per pixel (BPP 2, BPP 4, BPP 8):
As the bits per pixel (BPP) increases, the PSNR improves, showing less distortion. BPP 2
has the lowest PSNR (16.85 dB), meaning the image is heavily distorted. However, as the
BPP increases, the PSNR rises to 29.21 dB at BPP 4, and at BPP 8, the PSNR reaches
infinity (perfect reconstruction), indicating no distortion.
Q2:
The PSNR values generally correlate well with visual perception of image quality. Higher
PSNR values indicate better image quality, which aligns with clearer and more accurate
images upon visual inspection. For example:
• At high PSNR values (e.g., 34.01 dB for Factor 2), the image quality appears
relatively clear with minimal visible distortion.
• Lower PSNR values (e.g., 16.85 dB for BPP 2) correspond to a more noticeable
loss of detail and visible artifacts.
At BPP 8 the PSNR is infinite because the image is identical to the original, so no
difference is visible. Overall, PSNR captures the major differences in image quality,
but subtler visual differences are not always fully reflected by the PSNR alone.
Q3:
Here is the code for calculating PSNR for different resampling factors and bits per pixel.
# Parameters
factors = [2, 4, 8] # Different resampling factors
bpps = [2, 4, 8] # Different bits per pixel
psnr_resampled = []
psnr_requantized = []
Resampling:
As the resampling factor increases, the image loses detail, which generally results in a
lower PSNR value. Higher resampling factors lead to more pronounced aliasing and loss of
information.
Requantization:
Lowering the bits per pixel reduces the color depth, leading to quantization errors. A
decrease in bpp generally results in lower PSNR values, as the image loses subtle
variations in brightness.
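For reference, PSNR for 8-bit images is computed as 10 · log10(255² / MSE), where MSE is the mean squared error between the two images. The sketch below is one way to write it; the function name calculate_psnr is hypothetical and may differ from the one used in the lab code.

```python
import numpy as np


def calculate_psnr(original, processed, max_value=255.0):
    """PSNR in dB between two images of the same shape."""
    diff = original.astype(np.float64) - processed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")        # identical images: no distortion
    return 10 * np.log10(max_value ** 2 / mse)


# Tiny synthetic example: every pixel is off by one intensity level
image = np.full((4, 4), 100, dtype=np.uint8)
off_by_one = image + 1
print(calculate_psnr(image, image), calculate_psnr(image, off_by_one))
```

The infinite value for identical images matches the BPP 8 case, where requantizing an 8-bit image to 8 bits changes nothing.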
Q4:
The Structural Similarity Index (SSIM) is a perceptual metric used to measure the similarity
between two images. Unlike Peak Signal-to-Noise Ratio (PSNR), which quantifies the error
between pixel values, SSIM focuses on the perceived quality of images by considering
changes in structural information, luminance, and contrast.
SSIM:
Designed to correlate well with human visual perception. It evaluates images based on
how they look to human eyes, considering factors like texture and patterns.
PSNR:
Primarily based on pixel-wise differences, which does not account for how humans
perceive image quality. Small pixel differences can significantly impact PSNR but may not
affect perceived quality.
Q5:
Here is the function to calculate SSIM:
from skimage.metrics import structural_similarity as ssim

def calculate_ssim(original, compressed):
    # For 8-bit grayscale images the full data range is 255
    return ssim(original, compressed, data_range=255)
Comparison:
• PSNR: Provides a measure of the average error between the original and processed
images. Higher values indicate better quality but may not align with perceptual
quality.
• SSIM: Measures structural similarity and is more aligned with human perception.
Values closer to 1 indicate more similarity.
Q3:
At a noise level (standard deviation) of 130 the image becomes unrecognizable, and the
corresponding PSNR value is 27.8658 dB.
Q4:
Here is the call that applies a Gaussian blur to the image:
gaussian_filtered_img = gaussian_filter(noisy_img, sigma=2)
The resulting PSNR values were:
PSNR before denoising (noisy): 28.40 dB
PSNR after Gaussian blur: 31.90 dB
Effectiveness:
This method smooths the image by averaging the pixels with a Gaussian kernel. While it
can reduce noise, it may also blur edges and details, which can lead to a reduction in
image sharpness.
Limitations:
This method can lead to a loss of important details, particularly in high-frequency areas
(edges and textures). The amount of blurring depends on the choice of the sigma
parameter, and too much blurring can degrade the image quality.
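The trade-off described above can be tried on synthetic data. The sketch below assumes gaussian_filter comes from scipy.ndimage and uses a smooth gradient image with an arbitrary noise level, so the numbers it prints will not match the lab results.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def psnr(reference, test, max_value=255.0):
    """PSNR in dB between two float images of the same shape."""
    mse = np.mean((reference - test) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(max_value ** 2 / mse)


rng = np.random.default_rng(0)                        # fixed seed for repeatability
clean = np.tile(np.linspace(0, 255, 64), (64, 1))     # synthetic horizontal gradient
noisy = clean + rng.normal(0, 25, clean.shape)        # additive Gaussian noise, std 25

gaussian_filtered = gaussian_filter(noisy, sigma=2)

psnr_noisy = psnr(clean, noisy)
psnr_blurred = psnr(clean, gaussian_filtered)
print(f"noisy: {psnr_noisy:.2f} dB, blurred: {psnr_blurred:.2f} dB")
```

A smooth gradient has no edges to lose, so it favors the blur; on images with fine texture the gain would be smaller because the filter also removes real detail.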
Q5:
Advanced Noise Reduction Technique: Non-Local Means (NLM)
Description: Non-Local Means (NLM) is an advanced denoising algorithm that exploits the
redundancy of information within an image. Unlike traditional methods that consider only
local pixel neighborhoods, NLM uses the entire image to find similar patches and denoise
them.
Q1:
Here are the calculated duration and highest frequency component of the audio signal:
Duration of the audio clip: 0.50 seconds
Highest frequency component: 220.00 Hz
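Values like these can be obtained from the sample count and the FFT. The sketch below uses a synthetic 220 Hz tone as a stand-in for the audio clip and assumes that the reported frequency is the strongest peak in the spectrum; the sample rate is also an assumption.

```python
import numpy as np

samplerate = 8000                                   # assumed sample rate in Hz
t = np.arange(int(0.5 * samplerate)) / samplerate
data = np.sin(2 * np.pi * 220 * t)                  # synthetic stand-in for the clip

# Duration follows directly from the number of samples
duration = len(data) / samplerate

# Frequency of the largest peak in the magnitude spectrum
spectrum = np.abs(np.fft.rfft(data))
freqs = np.fft.rfftfreq(len(data), d=1 / samplerate)
peak_freq = freqs[np.argmax(spectrum)]

print(f"Duration: {duration:.2f} s, strongest component: {peak_freq:.2f} Hz")
```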
Q2:
In a spectrogram, we can observe:
Time vs. Frequency: The x-axis represents time, while the y-axis represents frequency.
Each point in the spectrogram shows the intensity of a specific frequency at a specific
time.
Bright Areas: The bright areas (or lighter colors) in the spectrogram represent higher
energy or intensity of sound at specific frequencies and times. This indicates that the
sound at those frequencies is more prominent during that time period.
Q3:
Here is the code that trims the audio to 0.5 seconds:
duration = 0.5 # seconds
num_samples = int(samplerate * duration)
data = data[:num_samples]
The spectrum changes very little, because trimming removes only a small part of the audio.
Q4:
Here are the descriptions for music, human and environmental sound:
1. Music wave:
Waveform: Shows distinct patterns with varying intensity and amplitude over time.
Frequency Spectrum: Contains multiple peaks, indicating a mixture of different
frequencies (harmonic and non-harmonic).
Spectrogram: Displays multiple bright bands that repeat over time, signifying consistent
frequency patterns (indicative of music or a repetitive sound).
2. Human sound wave:
Waveform: Speech typically shows clear segments of activity (words) followed by short
silences or lower amplitude segments.
Frequency Spectrum: Displays a few dominant peaks, which are likely to correspond to
formants (key frequencies in speech).
Spectrogram: Shows distinct, spaced-out bright bands (formants) corresponding to the
vowels and consonants in speech.
Q5:
Here is the code for downsampling:
from scipy.signal import resample

# Downsample audio: keep 1/10 of the samples using Fourier-based resampling
downsample_factor = 10
downsampled_data = resample(data, len(data) // downsample_factor)
Q6:
Noise Reduction is a common audio processing technique used to remove unwanted
background noise while preserving the main audio signal. This is achieved by identifying
noise components in the frequency spectrum and reducing or filtering them out.
In the lab, we downsampled audio and observed changes in the frequency
spectrum. Noise reduction can be implemented by analyzing the frequency spectrum
using techniques like Fourier Transform. By identifying and minimizing noise frequencies,
the cleaner parts of the audio signal can be enhanced, similar to how we explored different
sound characteristics in the waveform and frequency spectrum.
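One simple way to act on the frequency spectrum is to zero out weak FFT bins and transform back. The sketch below illustrates this idea on a synthetic tone; the threshold of 10% of the peak magnitude, the noise level, and the sample rate are assumptions, and real noise reduction tools use more careful methods.

```python
import numpy as np

samplerate = 8000                                   # assumed sample rate in Hz
t = np.arange(samplerate // 2) / samplerate         # 0.5 seconds
rng = np.random.default_rng(0)                      # fixed seed for repeatability

clean = np.sin(2 * np.pi * 220 * t)                 # synthetic 220 Hz tone
noisy = clean + 0.3 * rng.standard_normal(t.size)   # add white noise

# Zero every bin whose magnitude is below the assumed threshold
spectrum = np.fft.rfft(noisy)
threshold = 0.1 * np.max(np.abs(spectrum))
spectrum[np.abs(spectrum) < threshold] = 0
denoised = np.fft.irfft(spectrum, n=noisy.size)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(f"MSE noisy: {mse_noisy:.4f}, MSE denoised: {mse_denoised:.6f}")
```

This works well here because the tone occupies a single strong bin; for speech or music, a fixed threshold would also remove quiet parts of the wanted signal.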
4. Discussion:
The lab experiments highlighted the effects of resampling, requantization, and noise on
both audio and image quality. Resampling led to loss of detail and aliasing, while
requantization reduced color depth and smoothness. We observed that PSNR, while
useful, doesn’t always align with visual perception, and SSIM provided a better measure
for structural integrity. Noise addition significantly degraded quality, and basic noise
reduction methods, like Gaussian blur, improved PSNR but often at the cost of fine details.
The use of more advanced techniques like Non-Local Means filtering demonstrated better
noise reduction without sacrificing sharpness.
5. Conclusion:
This lab provided hands-on experience with signal and image processing techniques such
as Fourier analysis, resampling, requantization, and noise addition. We observed how
downsampling and compression impact signal quality and explored the limitations of
PSNR compared to SSIM for visual quality evaluation. Noise reduction methods highlighted
the importance of filtering in maintaining signal fidelity. Overall, the lab deepened our
understanding of key processing concepts and their practical trade-offs.
6. About Me:
Hamza Ali