DT081A - Signal and Image Processing Lab 1 Report

Uploaded by

Ali Jutt
DT081A - Signal and Image Processing: Lab 1 Report

1. Introduction
This lab covers the basics of signal and image processing and gives hands-on experience
with several techniques. It starts with generating and visualizing signals in both the time
domain and the frequency domain, using Fourier transforms. Parameter values are changed
and the results plotted to show how the signals behave. The lab also explores how sampling
and aliasing affect signal accuracy.

In the image processing section, we focus on tasks such as resizing images and changing
their level of detail (sampling and quantization). We measure image quality using PSNR and
SSIM and apply techniques such as Gaussian blur to reduce noise. The effects of adding
noise to images are also analyzed.

In the audio processing part, we analyze sound signals by downsampling (reducing the
sample rate) and examining their frequency components. We also study methods such as
noise reduction, pitch shifting, and time stretching for modifying audio. Overall, this
lab helps in understanding how these techniques are used to improve both images and
audio.

2. Experimental Setup

Tools Used:

• Python
• VS Code

Libraries:

• numpy, matplotlib, scipy, PIL, scipy.ndimage, skimage.metrics, sounddevice, librosa

3. Results and Analysis

3.1. Signal Generation and Analysis


3.1.1 Signal Generation and Visualization
Q:1

A sawtooth wave is a type of non-sinusoidal waveform that rises upward and then sharply
drops, creating a "saw-like" shape.

To implement a sawtooth wave mathematically, we can use the formula below, where T is the period:

sawtooth(t)= 2 * (t / T - np.floor(t / T + 0.5))

We use Python's SciPy library, which provides a simplified function to generate a sawtooth
wave:

sawtooth_wave = signal.sawtooth(2 * np.pi * 5 * t)

The term (2 * np.pi * 5 * t) is the phase, in radians, of a 5 Hz wave at each point of
the time vector t. This produces a periodic wave that rises linearly and then drops
sharply at the end of each cycle.
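
The closed-form expression can also be evaluated directly with NumPy. The sketch below is illustrative rather than the lab's own code: the helper name sawtooth_formula and the choice of 1000 samples are assumptions. As written, the formula gives 0 at t = 0, whereas a ramp that begins each cycle at -1 (the convention used by signal.sawtooth) corresponds to the same formula shifted by half a period.

```python
import numpy as np

def sawtooth_formula(t, period):
    """Sawtooth with values in [-1, 1], from the closed-form expression."""
    return 2 * (t / period - np.floor(t / period + 0.5))

freq = 5.0                      # Hz, same frequency as the SciPy call
period = 1.0 / freq             # seconds per cycle
t = np.linspace(0, 1, 1000, endpoint=False)

wave = sawtooth_formula(t, period)

# Shifting by half a period gives a ramp that starts at -1 when t = 0.
wave_shifted = sawtooth_formula(t + period / 2, period)
```

Plotting wave and wave_shifted against t shows the same shape with a half-cycle offset.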

Q:2

To change the frequency of the sine wave to 20 Hz, the frequency term in the call is replaced:


sine_wave = np.sin(2 * np.pi * 20 * t) # Changed frequency to 20 Hz

Effects on the Plot:

• Increased frequency:

A higher frequency means the sine wave completes more cycles in the same time interval.
At 20 Hz, the wave completes 20 cycles per second.

• Visual impact:
The sine wave will appear more compressed, with more oscillations (cycles) in the
same time range (0 to 1 second) on the plot.

Q: 3

The np.ceil function rounds each element of x up to the nearest integer. For the
sine function, np.sin(2 * np.pi * 5 * t) ranges from -1 to 1. When np.ceil is applied,
positive sine values round up to 1, while values in the range (-1, 0] round to 0 (a
sample exactly equal to -1 stays at -1).

After multiplying by 2 and subtracting 1, 1 stays 1 and 0 becomes -1. As a result, the
output starts at -1 (because the sine is 0 at t = 0) and still oscillates between -1 and
1, like the original square wave.
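
The rounding behaviour can be checked numerically. This is a minimal sketch that assumes the modified line has the form 2 * np.ceil(...) - 1, which is inferred from the description rather than quoted from the lab code. It also shows one edge case: a sample where the sine is exactly -1 is mapped to -3, so a comparison-based version (named robust here, an illustrative choice) is included for reference.

```python
import numpy as np

t = np.linspace(0, 1, 1000, endpoint=False)
sine = np.sin(2 * np.pi * 5 * t)

# ceil maps values in (0, 1] to 1 and values in (-1, 0] to 0,
# so 2 * ceil - 1 gives +1 and -1 respectively.
square_wave = 2 * np.ceil(sine) - 1

# Comparison-based alternative that never leaves the set {-1, +1}.
robust = np.where(sine > 0, 1.0, -1.0)

# The two versions agree everywhere except where the sine equals exactly -1.
mask = sine > -1
agree = np.array_equal(square_wave[mask], robust[mask])
```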

Q:4

Adding the square wave and the sine wave together gives a composite signal:
composite_signal = sine_wave + square_wave
Amplitude:

The composite signal's amplitude ranges from -2 to 2, the sum of the amplitudes of
the sine wave and the square wave.

Frequency Components:

It includes frequency components from both the sine wave (20 Hz) and the square
wave (5 Hz).

Shape:

It follows the smooth oscillations of the sine wave, with sudden jumps wherever the
square wave switches level, so it is more complex than either wave alone.
Q:5
Understanding different signal types is useful in many real-world fields, such as
telecommunications, medical imaging, and audio processing.
One example is music production, where different waveforms such as sine, square, and
sawtooth waves are used to create and manipulate sounds. Each type of waveform has
unique characteristics that affect the final sound.

How Different Signal Types Are Used:

Sine Waves:
Generate smooth, pure tones containing only the fundamental frequency, which makes them
useful for testing audio equipment.
Square Waves:
Switch abruptly between high and low levels, producing sharp, edgy sounds commonly used
in bass lines and electronic music.
Sawtooth Waves:
Rise gradually and drop sharply, and contain both odd and even harmonics, producing the
bright, rich sounds used in electronic music.

3.1.2 Fourier Transform and Frequency Domain Analysis

Q1:
The frequency domain plot shows spikes at 10 Hz and 20 Hz, so these are the frequencies
present in the signal.

Q2:
After modifying the function to add a 30 Hz component, the frequency domain plot shows
an additional spike at 30 Hz.
Q3:
Adding random noise to the signal produces many small spikes of random amplitude across
the whole frequency domain plot, while the original peaks at 10 Hz, 20 Hz, and 30 Hz
remain.

Here is the code to add noise:


noise_level = 0.5
noise = noise_level * np.random.normal(size=t.shape)
noisy_signal = signal + noise
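
The claim that the three peaks survive the noise can be verified with a short FFT check. This is a sketch under stated assumptions: the 1000 Hz sampling rate, the unit amplitudes, and the fixed random seed are illustrative choices, not values taken from the lab code.

```python
import numpy as np

fs = 1000                                   # assumed sampling rate in Hz
t = np.arange(fs) / fs                      # 1 second of samples
clean = sum(np.sin(2 * np.pi * f * t) for f in (10, 20, 30))

rng = np.random.default_rng(0)              # fixed seed for repeatability
noisy = clean + 0.5 * rng.normal(size=t.shape)

# One-sided magnitude spectrum and the matching frequency axis.
spectrum = np.abs(np.fft.rfft(noisy)) / len(t)
freqs = np.fft.rfftfreq(len(t), d=1 / fs)

# Frequencies of the three largest bins, sorted in ascending order.
top3 = np.sort(freqs[np.argsort(spectrum)[-3:]])
```

With these settings the tone bins are far larger than the noise floor, so top3 contains the three tone frequencies.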

Q4:
Here is the function to generate a chirp signal:

def generate_signal(t):
    return np.sin(2 * np.pi * (10 + 20 * t) * t)

t = np.linspace(0, 1, 1000)

The phase of this signal is 2π(10t + 20t²), and the instantaneous frequency is the
derivative of the phase divided by 2π, which is 10 + 40t. The frequency therefore starts
at 10 Hz (when t is 0) and increases linearly to 50 Hz (when t is 1), not to 30 Hz as the
factor (10 + 20*t) might suggest; a sweep ending at 30 Hz would need (10 + 10*t) instead.
This steadily changing frequency creates a chirp, where the signal begins at a lower
frequency and increases to a higher frequency as time progresses.
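
The sweep range of a chirp can be checked numerically by differentiating its phase, since the instantaneous frequency is the phase derivative divided by 2π. The sketch below applies this to the expression used in generate_signal; the variable names are illustrative.

```python
import numpy as np

t = np.linspace(0, 1, 1000)

# Phase of sin(2*pi*(10 + 20*t)*t), in radians.
phase = 2 * np.pi * (10 + 20 * t) * t

# Instantaneous frequency in Hz: derivative of the phase divided by 2*pi.
inst_freq = np.gradient(phase, t) / (2 * np.pi)

start_hz, end_hz = inst_freq[0], inst_freq[-1]
```

Away from the two end points, inst_freq follows 10 + 40*t, so the sweep runs from about 10 Hz to about 50 Hz.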
Q5:
Understanding the frequency domain of a signal is important in multiple areas like in
medical field if we talk about EEG, where the brain's electrical activity is recorded and
analyzed in terms of its frequency components. Different brain states such as sleep,
produce characteristic frequency patterns.
By analyzing these patterns in the frequency domain, doctors can diagnose conditions like
epilepsy, monitor brain function, and guide treatment strategies for neurological disorders.

3.1.3 Sampling and Aliasing


Q1:
The Nyquist rate is the minimum sampling rate required to accurately represent a
continuous signal without introducing aliasing.

The Nyquist rate is twice the highest frequency component in the signal:
Nyquist rate = 2 × f = 2 × 10 Hz = 20 Hz
The current sampling rate in the code is fs = 15 Hz, which is below the Nyquist rate
(20 Hz), so it is not sufficient and leads to aliasing.

Q2:
A higher sampling rate (25 Hz) provides more data points per second, which allows for a
more accurate representation of the original signal in the sampled version.
The sampled points are spaced closer together, so the sampled signal more closely
resembles the smooth shape of the original continuous signal. Since 25 Hz is greater
than the Nyquist rate (20 Hz), aliasing will not occur.

Q3:
The Nyquist rate for a 20 Hz signal is 40 Hz (since Nyquist rate = 2 × signal frequency).
However, the signal is sampled at 30 Hz, which is below the Nyquist rate, so aliasing
occurs.
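
The apparent frequency produced by undersampling can be worked out by folding the true frequency around multiples of the sampling rate. The helper below is a small sketch of that calculation for a single real sinusoid; the name alias_frequency is an illustrative choice, not part of the lab code.

```python
def alias_frequency(f_signal, fs):
    """Apparent frequency (Hz) of a real sinusoid at f_signal sampled at fs."""
    folded = f_signal % fs            # wrap into [0, fs)
    return min(folded, fs - folded)   # fold into [0, fs/2]

print(alias_frequency(10, 15))  # 10 Hz tone sampled at 15 Hz -> 5
print(alias_frequency(20, 30))  # 20 Hz tone sampled at 30 Hz -> 10
print(alias_frequency(10, 25))  # 10 Hz tone sampled at 25 Hz -> 10 (no aliasing)
```

So the 10 Hz signal sampled at 15 Hz appears as a 5 Hz signal, and the 20 Hz signal sampled at 30 Hz appears as a 10 Hz signal.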
Q4:
The "wagon-wheel effect" occurs when the sampling rate is too low relative to the
signal's actual frequency (below the Nyquist rate), which makes the signal appear to move
at a different, often slower, rate than it really does. This can sometimes create the
illusion of reverse motion, as the sampling points miss the true motion of the oscillations.

Q5:
Understanding aliasing is crucial in medical imaging, particularly in MRI (Magnetic
Resonance Imaging). In MRI, if the signal is sampled at a rate lower than the required
Nyquist rate, aliasing can occur, causing the image to appear distorted or showing false
patterns. This can lead to misinterpretation of the scans and potentially inaccurate
diagnoses. To prevent this, proper sampling rates and filtering techniques are used to
ensure the images accurately represent the underlying tissue or structures, allowing for
correct medical assessments.

3.2. Image Processing


3.2.1 Image Sampling and Quantization
Q1:
The resampling factor affects image quality by reducing the image's resolution.
Increasing the resampling factor (for example, reducing the image by factors of 2, 4, or
8) leads to a loss of detail and blocky pixels, because fewer pixels are used to
represent the original content. Distortion becomes more noticeable as the factor
increases, until significant details are lost and the image looks pixelated or unclear.

Low factor (2):
The image loses some detail and some pixelation starts to show, but it remains recognizable.
Medium factor (4):
The image starts showing distortion. Edges blur or disappear, and the overall
appearance becomes pixelated.
High factor (8):
The image becomes heavily distorted, to the point that individual pixels or blocks of pixels
become visible.

For the image used in our tests, distortion becomes apparent at a factor of 4.
Q2:
Characteristics of Distortion Introduced by Resampling:
• Pixelation:
As the resampling factor increases, fewer pixels represent the image, leading to a
blocky or pixelated appearance.
• Loss of Detail:
Fine details in the image, such as edges and textures, are blurred or lost because
they are no longer sampled at a high enough resolution.
• Smoothing:
The image becomes progressively smoother, with sharp edges becoming more
rounded or blurred due to insufficient pixel representation.
• Geometric Deformation:
Lines and shapes can lose their original form, becoming less defined and less
recognizable.

Characteristics of Distortion Introduced by Requantization:


• Posterization:
Smooth gradients are broken into discrete steps or bands of color/intensity. This
leads to a loss of smooth transitions between shades, creating an artificial look with
distinct color blocks.
• Loss of Tonal Range:
Requantization reduces the number of available intensity levels, leading to a flatter
image with less dynamic range and subtlety in shading.
• Banding Artifacts:
In areas with gradual changes in intensity, visible bands or stripes may appear due
to the reduced number of levels available to represent the image's tonal range.

Key Differences:
Resampling Distortion is primarily spatial and affects the image's resolution and detail,
resulting in a blocky, pixelated appearance.
Requantization Distortion is tonal, affecting the image's color and intensity depth. It results
in banding and loss of smooth transitions, with sharper contrasts between adjacent
intensity levels.
Q3:
Most images use 8 bits per channel (24 bits in total for RGB images), and reducing this
too far causes obvious distortion.
For this image, distortion becomes apparent at 4 bpp.

Q:4
Here is the function that applies both resampling and requantization to the image:
def image_combinition(image, factor, bpp):
    resampled = resample_image(image, factor)
    resampled_requantized = requantize_image(resampled, bpp)
    return resampled_requantized

We experimented with different combinations of resampling factor and bits per pixel.
From the results, the combination of factor 3 and 6 bpp remains visually clear.
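
The helpers resample_image and requantize_image are not listed in this report. The sketch below shows one plausible way to write them, assuming an 8-bit NumPy image, nearest-neighbor resampling (keep every factor-th pixel and repeat it), and uniform requantization to 2**bpp levels. The lab's actual helpers may differ.

```python
import numpy as np

def resample_image(image, factor):
    """Keep every `factor`-th pixel, then repeat it to restore the size."""
    small = image[::factor, ::factor]
    big = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return big[:image.shape[0], :image.shape[1]]

def requantize_image(image, bpp):
    """Reduce an 8-bit image to 2**bpp intensity levels."""
    step = 2 ** (8 - bpp)
    return (image // step) * step

# Synthetic 8x8 gradient standing in for a real photograph.
image = np.arange(64, dtype=np.uint8).reshape(8, 8) * 4
result = requantize_image(resample_image(image, 2), 4)
```

In this sketch, requantizing at 8 bpp leaves the image unchanged, which is consistent with a distortion-free result at 8 bpp.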

Q5:
Nearest Neighbor Resampling:
This method results in blocky, pixelated images because each pixel in the downsampled
image is simply the closest original pixel. It's fast but introduces a lot of visual distortion,
especially at higher downsampling factors.
Bicubic Interpolation:
Bicubic interpolation produces much smoother images because it considers a larger pixel
neighborhood when resizing. This results in less distortion and better preservation of edges
and textures.

Q6:
Testing different types of images shows that the result depends on the amount of detail
in the image. The more detail an image has, the more the compression affects its quality.
3.2.2 PSNR Calculation
Q1:
For resampling factors (Factor 2, Factor 4, Factor 8):
As the resampling factor increases, the PSNR decreases, indicating a higher level of
distortion. For Factor 2, the PSNR is relatively high (34.01 dB), showing minimal distortion.
However, at Factor 8, the PSNR drops to 30.48 dB, meaning that the image has become
more distorted as more data is lost during the resampling process.
For bits per pixel (BPP 2, BPP 4, BPP 8):
As the bits per pixel (BPP) increases, the PSNR improves, showing less distortion. BPP 2
has the lowest PSNR (16.85 dB), meaning the image is heavily distorted. However, as the
BPP increases, the PSNR rises to 29.21 dB at BPP 4, and at BPP 8, the PSNR reaches
infinity (perfect reconstruction), indicating no distortion.

Q2:
The PSNR values generally correlate well with visual perception of image quality. Higher
PSNR values indicate better image quality, which aligns with clearer and more accurate
images upon visual inspection. For example:
• At high PSNR values (e.g., 34.01 dB for Factor 2), the image quality appears
relatively clear with minimal visible distortion.
• Lower PSNR values (e.g., 16.85 dB for BPP 2) correspond to a more noticeable
loss of detail and visible artifacts.
At the extreme (BPP 8, with infinite PSNR) the image is identical to the original. PSNR
therefore captures the major differences in image quality, although smaller visual
differences are not always fully reflected by PSNR alone.

Q3:
Here is the code for calculating PSNR for different resampling factors and bits per pixel.

# Parameters
factors = [2, 4, 8]  # Different resampling factors
bpps = [2, 4, 8]     # Different bits per pixel

psnr_resampled = []
psnr_requantized = []

# Calculate PSNR for different resampling factors
for factor in factors:
    img_resampled = resample_image(img, factor)
    psnr_value = calculate_psnr(img, img_resampled)
    psnr_resampled.append(psnr_value)
    print(f"Factor {factor}: PSNR = {psnr_resampled[-1]:.2f} dB")

# Calculate PSNR for different bits per pixel
for bpp in bpps:
    img_requantized = requantize_image(img, bpp)
    psnr_value = calculate_psnr(img, img_requantized)
    psnr_requantized.append(psnr_value)
    print(f"BPP {bpp}: PSNR = {psnr_requantized[-1]:.2f} dB")

And the output is:

Resampling:
As the resampling factor increases, the image loses detail, which generally results in a
lower PSNR value. Higher resampling factors lead to more pronounced aliasing and loss of
information.
Requantization:
Lowering the bits per pixel reduces the color depth, leading to quantization errors. A
decrease in bpp generally results in lower PSNR values, as the image loses subtle
variations in brightness.
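
The calculate_psnr helper used above is not listed in this report. A minimal sketch of the standard definition, PSNR = 10·log10(MAX² / MSE), is given below, assuming 8-bit images (MAX = 255). The cast to float is there so that differences between uint8 pixels do not wrap around.

```python
import numpy as np

def calculate_psnr(original, processed, max_value=255.0):
    """PSNR in dB; returns infinity when the images are identical."""
    diff = original.astype(np.float64) - processed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10 * np.log10(max_value ** 2 / mse)

img = np.full((4, 4), 100, dtype=np.uint8)
off_by_one = img + 1            # every pixel differs by exactly 1

psnr_same = calculate_psnr(img, img)
psnr_off = calculate_psnr(img, off_by_one)
```

Identical images give infinite PSNR, matching the BPP 8 result, and an error of one gray level per pixel gives 20·log10(255), roughly 48 dB.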

Q4:
The Structural Similarity Index (SSIM) is a perceptual metric used to measure the similarity
between two images. Unlike Peak Signal-to-Noise Ratio (PSNR), which quantifies the error
between pixel values, SSIM focuses on the perceived quality of images by considering
changes in structural information, luminance, and contrast.

Key Differences Between SSIM and PSNR

SSIM:
Designed to correlate well with human visual perception. It evaluates images based on
how they look to human eyes, considering factors like texture and patterns.

PSNR:
Primarily based on pixel-wise differences, which does not account for how humans
perceive image quality. Small pixel differences can significantly impact PSNR but may not
affect perceived quality.
Q5:
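Here is the function to calculate SSIM:
# Assumed import; skimage.metrics is listed under Libraries
from skimage.metrics import structural_similarity as ssim

def calculate_ssim(original, compressed):
    # Data range is taken from the compressed image's own min and max
    # (for a full-range 8-bit grayscale image this is 255)
    return ssim(original, compressed,
                data_range=compressed.max() - compressed.min())

And here is the result for different factors and bpp: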

Comparison:
• PSNR: Provides a measure of the average error between the original and processed
images. Higher values indicate better quality but may not align with perceptual
quality.
• SSIM: Measures structural similarity and is more aligned with human perception.
Values closer to 1 indicate more similarity.

3.2.3 Noise Addition and PSNR Analysis


Q1:
Noise typically starts to become visibly apparent in the image when the PSNR value drops
below 30 dB.
Q2:
The PSNR (Peak Signal-to-Noise Ratio) value is a common metric for assessing the quality
of reconstructed or compressed images compared to the original.

These values can be interpreted as follows:

Below 30 dB:
Image quality is generally considered noticeably poor. Images may show visible
artifacts or distortions.
30-40 dB:
This range is often viewed as acceptable for standard images. Quality may be satisfactory,
but there might still be some noticeable issues depending on the content.
40-50 dB:
Indicates high-quality images, with only minor distortions. This range is often deemed
acceptable for most professional applications.
Above 50 dB:
Typically signifies an excellent quality image, very close to the original.

Q3:
At a noise level (standard deviation) of 130 the image becomes unrecognizable, and the
corresponding PSNR value is 27.8658 dB.

Q4:
Here is the call that applies a Gaussian blur to the image:
gaussian_filtered_img = gaussian_filter(noisy_img, sigma=2)
The resulting PSNR values are:
PSNR before denoising (noisy): 28.40 dB
PSNR after Gaussian blur: 31.90 dB

Effectiveness:
This method smooths the image by averaging the pixels with a Gaussian kernel. While it
can reduce noise, it may also blur edges and details, which can lead to a reduction in
image sharpness.
Limitations:
This method can lead to a loss of important details, particularly in high-frequency areas
(edges and textures). The amount of blurring depends on the choice of the sigma
parameter, and too much blurring can degrade the image quality.

Q5:
Advanced Noise Reduction Technique: Non-Local Means (NLM)
Description: Non-Local Means (NLM) is an advanced denoising algorithm that exploits the
redundancy of information within an image. Unlike traditional methods that consider only
local pixel neighborhoods, NLM uses the entire image to find similar patches and denoise
them.

Improvements Over Simpler Methods:


1. Preservation of Details: Unlike methods like Gaussian blur, which can excessively
smooth images and lose important details, NLM preserves edges and fine
structures by considering a broader context.
2. Adaptability: NLM can adapt to varying noise levels and structures within the
image, making it effective for complex images where simpler filters might fail.
3. Versatility: NLM is effective for various types of noise, including Gaussian and
Poisson noise, and can be applied to color images as well.

3.3. Audio Processing


3.3.1 Audio Signal Analysis

Q1:
Here are the calculated duration and highest frequency component of the audio signal:
Duration of the audio clip: 0.50 seconds
Highest frequency component: 220.00 Hz
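
The report does not show how these two numbers were obtained. One common approach is to divide the number of samples by the sample rate and to read the peak of the magnitude spectrum. The sketch below assumes a synthetic 220 Hz tone at 8000 Hz as a stand-in for the real clip, and it reports the strongest spectral peak, which for a single tone coincides with its highest frequency component.

```python
import numpy as np

samplerate = 8000                          # assumed sample rate in Hz
# Stand-in clip: 4000 samples of a 220 Hz tone.
data = np.sin(2 * np.pi * 220 * np.arange(4000) / samplerate)

duration = len(data) / samplerate          # seconds

spectrum = np.abs(np.fft.rfft(data))
freqs = np.fft.rfftfreq(len(data), d=1 / samplerate)
peak_freq = freqs[np.argmax(spectrum)]     # frequency of the strongest bin

print(f"Duration of the audio clip: {duration:.2f} seconds")
print(f"Strongest frequency component: {peak_freq:.2f} Hz")
```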

Q2:
In a spectrogram, we can observe:
Time vs. Frequency: The x-axis represents time, while the y-axis represents frequency.
Each point in the spectrogram shows the intensity of a specific frequency at a specific
time.
Bright Areas: The bright areas (or lighter colors) in the spectrogram represent higher
energy or intensity of sound at specific frequencies and times. This indicates that the
sound at those frequencies is more prominent during that time period.

Q3:
Here is the code to trim the clip to 0.5 seconds:
duration = 0.5 # seconds
num_samples = int(samplerate * duration)
data = data[:num_samples]

The spectrum changes very little, because trimming removes only a small part of the audio.

Q4:
Here are descriptions of the music, human, and environmental sounds:
1. Music wave:
Waveform: Shows distinct patterns with varying intensity and amplitude over time.
Frequency Spectrum: Contains multiple peaks, indicating a mixture of different
frequencies (harmonic and non-harmonic).
Spectrogram: Displays multiple bright bands that repeat over time, signifying consistent
frequency patterns (indicative of music or a repetitive sound).
2. Human sound wave:
Waveform: Speech typically shows clear segments of activity (words) followed by short
silences or lower amplitude segments.
Frequency Spectrum: Displays a few dominant peaks, which are likely to correspond to
formants (key frequencies in speech).
Spectrogram: Shows distinct, spaced-out bright bands (formants) corresponding to the
vowels and consonants in speech.

3. Environmental sound wave:
Waveform: More continuous and less structured compared to speech, suggesting a
constant or background noise.
Frequency Spectrum: Broad distribution of lower frequencies with less distinct peaks,
characteristic of environmental sounds like wind or traffic.
Spectrogram: Appears more filled, with less regular or structured patterns, indicating the
randomness of environmental noises.

Q5:
Here is the code for downsampling:
# Downsample audio
downsample_factor = 10
downsampled_data = resample(data, len(data) // downsample_factor)

Effects of Downsampling on Audio Quality:

Quality Reduction: Downsampling generally reduces audio quality. This is especially
noticeable in higher frequencies, which are lost if the new sampling rate does not
satisfy the Nyquist criterion (i.e., the new rate should be at least twice the highest
frequency present in the audio).
Aliasing: If the original audio contains frequencies above half the new sampling rate,
aliasing occurs, leading to distortion and a change in the perceived audio. This can
manifest as a "gritty" or "tinny" sound.

Q6:
Noise Reduction is a common audio processing technique used to remove unwanted
background noise while preserving the main audio signal. This is achieved by identifying
noise components in the frequency spectrum and reducing or filtering them out.
In the lab, we worked with down sampling audio and observed changes in the frequency
spectrum. Noise reduction can be implemented by analyzing the frequency spectrum
using techniques like Fourier Transform. By identifying and minimizing noise frequencies,
the cleaner parts of the audio signal can be enhanced, similar to how we explored different
sound characteristics in the waveform and frequency spectrum.
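
The idea of identifying and minimizing noise frequencies can be illustrated with a simple spectral gate: transform to the frequency domain, zero the weak bins, and transform back. This is a hedged sketch, not the method used in the lab. The function name spectral_gate, the 10% threshold, and the synthetic 220 Hz test tone are all assumptions, and practical noise reduction tools work on short overlapping frames rather than on the whole clip at once.

```python
import numpy as np

def spectral_gate(signal_in, keep_fraction=0.1):
    """Zero every FFT bin whose magnitude is below a fraction of the peak."""
    spectrum = np.fft.rfft(signal_in)
    magnitude = np.abs(spectrum)
    mask = magnitude >= keep_fraction * magnitude.max()
    return np.fft.irfft(spectrum * mask, n=len(signal_in))

fs = 8000                                   # assumed sample rate in Hz
t = np.arange(fs) / fs                      # 1 second of samples
clean = np.sin(2 * np.pi * 220 * t)
rng = np.random.default_rng(0)              # fixed seed for repeatability
noisy = clean + 0.3 * rng.normal(size=t.shape)

denoised = spectral_gate(noisy)

# Mean squared error against the clean tone, before and after gating.
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
```

For a steady tone in broadband noise, err_after comes out well below err_before, because almost all of the noise energy sits in bins that the gate removes.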

4. Discussion:
The lab experiments highlighted the effects of resampling, requantization, and noise on
both audio and image quality. Resampling led to loss of detail and aliasing, while
requantization reduced color depth and smoothness. We observed that PSNR, while
useful, doesn’t always align with visual perception, and SSIM provided a better measure
for structural integrity. Noise addition significantly degraded quality, and basic noise
reduction methods, like Gaussian blur, improved PSNR but often at the cost of fine details.
More advanced techniques like Non-Local Means filtering offer better noise reduction
without sacrificing as much sharpness.

5. Conclusion:
This lab provided hands-on experience with signal and image processing techniques such
as Fourier analysis, resampling, requantization, and noise addition. We observed how
downsampling and compression impact signal quality and explored the limitations of
PSNR compared to SSIM for visual quality evaluation. Noise reduction methods highlighted
the importance of filtering in maintaining signal fidelity. Overall, the lab deepened our
understanding of key processing concepts and their practical trade-offs.
6. About Me:

Hamza Ali
