Notebook 1: Data Preparation, EDA, and Data Augmentation
Variability: Images can vary greatly in terms of lighting conditions, scale, orientation,
occlusion, and background clutter. These variations make it difficult to generalize and
classify images accurately.
Overfitting: Overfitting occurs when a model learns the training data too well and fails to
generalize to unseen images. This challenge requires techniques like data augmentation,
regularization, and model selection to mitigate.
Class imbalance: In real-world datasets, certain classes may have a significantly larger
number of samples compared to others. This class imbalance can lead to biased models
that perform poorly on underrepresented classes.
Interpretability: Deep learning models used for image classification, such as convolutional
neural networks (CNNs), can be challenging to interpret. Understanding why a model
makes specific predictions is an ongoing research area.
Such capabilities have wide-ranging applications, for example in driver assistance, where the system needs to locate objects on the road, or in medical imaging, where the system needs to identify specific structures or anomalies within medical scans.
Detection, Localization, and Prediction: In the context of road signs and traffic-related
scenarios, detection, localization, and prediction tasks play crucial roles in traffic
management, driver assistance systems, and overall road safety. Let's explore an example:
Detection: Object detection can be applied to detect and identify specific road signs
within an image or a video stream. For instance, a computer vision system equipped
with object detection algorithms can analyze a live video feed from a traffic camera
and identify various road signs such as stop signs, yield signs, or speed limit signs.
The system would draw a bounding box around each detected sign, indicating its presence and location within the scene.
Prediction: Prediction tasks in the context of road signs and traffic-related scenarios
involve estimating future attributes or behaviors associated with the detected signs.
For example, an advanced driver assistance system (ADAS) can analyze a sequence
of video frames and predict the future position or behavior of a pedestrian crossing
sign. This prediction can assist the system in determining the appropriate action,
such as alerting the driver or adjusting the vehicle's speed, to ensure safety when
approaching the sign.
Combining detection, localization, and prediction tasks in road sign and traffic-related applications enables advanced systems to accurately identify, locate, and anticipate the behavior of signs and other road users, improving overall road safety.
By setting appropriate color thresholds, pixels with color values falling within
the specified range are classified as the desired color region, while those
outside the range are considered part of the background. This allows for the
extraction of specific color regions within the image, facilitating subsequent
analysis and decision-making in applications such as traffic light detection and
control.
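As a minimal sketch of this idea (the image path and HSV bounds below are illustrative assumptions, not values from the dataset), color thresholding can be done with OpenCV's inRange:

import cv2
import numpy as np

# Illustrative input; any BGR road-scene image would do
image = cv2.imread("example_road_scene.jpg")
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# Example HSV bounds for a red-ish region (e.g., a stop sign or a red traffic light)
lower_red = np.array([0, 120, 70])
upper_red = np.array([10, 255, 255])

# Pixels inside the range become 255 in the mask; everything else is treated as background
mask = cv2.inRange(hsv, lower_red, upper_red)

# Keep only the pixels that fall inside the color range
color_region = cv2.bitwise_and(image, image, mask=mask)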
Techniques such as the Canny edge detector or gradient-based methods can be used
to identify edges and separate objects based on the detected edges.
Canny Edge Detector: The Canny edge detector is a popular algorithm used for
edge detection. It works by computing the gradient magnitude of the image and
identifying areas with significant changes in intensity. In the context of road
sign segmentation, the Canny edge detector can help separate the sign from
the surrounding background by detecting the edges of the sign.
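A minimal sketch of this step with OpenCV (the file name is a placeholder and the thresholds 100/200 are common defaults rather than tuned values):

import cv2

# Placeholder path; load the sign image in grayscale for edge detection
image = cv2.imread("example_road_sign.jpg", cv2.IMREAD_GRAYSCALE)

# A light blur suppresses noise so the gradient-based detector responds to real edges
blurred = cv2.GaussianBlur(image, (5, 5), 0)

# Canny computes gradient magnitudes and keeps edges via hysteresis thresholding,
# producing a binary edge map that outlines the sign
edges = cv2.Canny(blurred, 100, 200)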
Edge- and region-based techniques such as these can segment the sign from the background by identifying the regions corresponding to the sign's shape and color.
Active Contours (Snakes) Model: The active contours model, also known as
snakes, is a popular algorithm used for contour-based segmentation. It involves
iteratively deforming an initial contour to fit the boundaries of objects in the
image. By minimizing an energy function that combines data and smoothness
terms, the contour adapts to the object's shape. In the context of road signs, the
active contours model can be used to accurately delineate the sign's
boundaries.
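As a hedged sketch using scikit-image's active_contour (the image path and the initial circle's centre and radius are illustrative assumptions):

import numpy as np
from skimage import io, color, filters
from skimage.segmentation import active_contour

# Placeholder path; convert to grayscale and smooth before fitting the snake
image = color.rgb2gray(io.imread("example_road_sign.jpg"))
smoothed = filters.gaussian(image, sigma=3)

# Initial contour: a circle placed roughly around the expected sign location (row, col)
s = np.linspace(0, 2 * np.pi, 200)
init = np.column_stack([100 + 80 * np.sin(s), 100 + 80 * np.cos(s)])

# The snake iteratively deforms the initial contour, minimizing an energy that balances
# image forces (edges) against the smoothness of the curve
snake = active_contour(smoothed, init, alpha=0.015, beta=10, gamma=0.001)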
Contour-detection methods can likewise be applied to segment the sign by finding the contours that separate it from the surrounding background.
Semantic segmentation: Semantic segmentation assigns a class label to every pixel in the image, so that all pixels belonging to the same category (for example, road, sign, vehicle, or sky) share the same label.
Instance segmentation:
Example: In road sign and traffic-related scenarios, this involves segmenting and labeling each road sign, pedestrian, vehicle, or other relevant object as a separate instance. For example, if there are multiple stop signs in an image, instance segmentation techniques can distinguish between them and assign unique labels to each instance.
Panoptic segmentation:
Example: It can be applied in urban planning. By segmenting an urban scene into both objects (e.g., cars, buildings, trees) and amorphous regions (e.g., roads, sidewalks, parks), planners can obtain a comprehensive understanding of the city layout, identify potential areas for development, and analyze the distribution of different urban elements.
Generation: Image generation refers to the task of creating new images based on learned
patterns and characteristics from existing data. Generative models, such as Generative
Adversarial Networks (GANs) or Variational Autoencoders (VAEs), are used to generate
new images that resemble the training data. Image generation has applications in image
synthesis, data augmentation, and creative content generation.
Keypoint Detection and Matching: Keypoint detection involves identifying specific points of
interest in an image, such as corners, edges, or distinctive regions. Keypoints serve as
landmarks or reference points for further analysis or tracking. Keypoint matching focuses
on finding correspondences between keypoints in different images, enabling tasks like
image alignment, object tracking, or 3D reconstruction. These tasks are commonly used in
applications like augmented reality, image stitching, and image registration.
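A small sketch of keypoint detection and matching with ORB in OpenCV (the two file names are placeholders for any overlapping views of the same scene):

import cv2

# Placeholder paths for two overlapping views
img1 = cv2.imread("view_1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view_2.jpg", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors with ORB
orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching with Hamming distance (suitable for ORB's binary descriptors)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Visualize the 30 best correspondences between the two images
vis = cv2.drawMatches(img1, kp1, img2, kp2, matches[:30], None, flags=2)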
Problem Statement
DriveViz, a deep research and AI company specializing in fleet management and transportation intelligence, aims to expand its product portfolio by developing Advanced Driver Assistance Systems (ADAS) and establishing its brand in the connected car market. The company's primary objective is to enhance driver safety and reduce the mortality rate caused by road accidents.
After extensive research, DriveViz has identified a key problem that needs to be addressed within
the ADAS program. The problem revolves around the accurate classification of road signs under
various weather conditions. DriveViz believes that solving this problem will play a crucial role in
achieving their goal of improving driver safety and reducing the occurrence of road accidents.
As a valued contributor to DriveViz's ADAS development initiative, you have been presented with
an opportunity to contribute your expertise and help tackle the challenge of developing a robust
road sign classification system for improved driver assistance systems.
! pip install opencv-python albumentations torch torchvision gdown --quiet
! gdown --fuzzy "https://drive.google.com/file/d/1OpowBroHNYx1hx6rXOIfwL7k9VLcTNzf/
! unzip -q -o notebook1_roadsigndataset.zip
import matplotlib.pyplot as plt
import cv2
import os
def plot_image_histograms(image_folder):
    """
    Plot the histograms of image widths and heights for the images in the specified folder.

    Parameters:
        image_folder (str): Path to the folder containing the images.

    Returns:
        None
    """
    widths = []
    heights = []

    # Iterate over the images in the folder
    for filename in os.listdir(image_folder):
        if filename.endswith(".jpg") or filename.endswith(".png"):
            image_path = os.path.join(image_folder, filename)
            image = cv2.imread(image_path)

            # Extract the width and height of the image
            height, width, _ = image.shape
            widths.append(width)
            heights.append(height)

    # Create subplots with 1 row and 3 columns
    fig, axs = plt.subplots(1, 3, figsize=(18, 5))

    # Plot the histogram of image widths
    axs[0].hist(widths, bins=30, color='blue', alpha=0.7)
    axs[0].set_xlabel("Width")
    axs[0].set_ylabel("Frequency")
    axs[0].set_title("Histogram of Image Widths")

    # Plot the histogram of image heights
    axs[1].hist(heights, bins=30, color='green', alpha=0.7)
    axs[1].set_xlabel("Height")
    axs[1].set_ylabel("Frequency")
    axs[1].set_title("Histogram of Image Heights")

    # Create a scatter plot of image width vs. height
    axs[2].scatter(widths, heights, color='red', alpha=0.5)
    axs[2].set_xlabel("Width")
    axs[2].set_ylabel("Height")
    axs[2].set_title("Scatter Plot of Image Width vs. Height")

    # Adjust spacing between subplots
    plt.tight_layout()

    # Show the plot
    plt.show()

# Specify the folder containing the images
image_folder = "/content/roadsigndataset/train"

# Call the function to plot the histograms
plot_image_histograms(image_folder)
Unexpected variations in image dimensions can indicate data corruption, incomplete data,
or errors during data collection or preprocessing.
Preprocessing techniques like resizing, cropping, or normalizing often require the images to
have consistent dimensions. By checking the height and width, we can determine the
necessary preprocessing steps to bring all images to a uniform size or aspect ratio,
enabling fair comparisons, accurate analysis, and efficient model training.
In many computer vision tasks, images serve as input to deep learning models or other
machine learning algorithms. These models often have specific input shape requirements.
By checking the image dimensions, we can ensure that the images align with the expected
input dimensions of the models we intend to use.
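As a minimal sketch of such a check-and-fix step (the 224x224 target size is an illustrative assumption; pick whatever size the downstream model expects), every image can be resized to a common shape:

import os
import cv2

image_folder = "/content/roadsigndataset/train"
target_size = (224, 224)  # (width, height); illustrative, match it to the model's input

resized_images = []
for filename in os.listdir(image_folder):
    if filename.endswith(".jpg") or filename.endswith(".png"):
        image = cv2.imread(os.path.join(image_folder, filename))
        # cv2.resize expects (width, height); INTER_AREA is a good choice for downscaling
        resized_images.append(cv2.resize(image, target_size, interpolation=cv2.INTER_AREA))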
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
To find the distribution of the quality of images, we can use various image quality metrics such
as Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index
(SSIM), or any other suitable quality metric.
Peak Signal-to-Noise Ratio (PSNR) is a widely used metric in image and video processing to
measure the quality of a reconstructed or compressed image/video compared to the original,
reference image/video. It provides an objective measure of the amount of distortion or loss
introduced during the compression or reconstruction process.
PSNR is calculated based on the mean squared error (MSE) between the original and
reconstructed images/videos. The higher the PSNR value, the closer the reconstructed
image/video is to the original, indicating better quality. It is expressed in decibels (dB).
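For reference, when a distorted image and its original are both available, PSNR can be computed directly (a minimal sketch; the file names are placeholders). The function below instead uses a no-reference variant, letting each image's own variance stand in for the MSE term.

import cv2
import numpy as np

# Placeholder file names for an original image and its compressed/reconstructed version
reference = cv2.imread("original.png", cv2.IMREAD_GRAYSCALE).astype(np.float64)
distorted = cv2.imread("reconstructed.png", cv2.IMREAD_GRAYSCALE).astype(np.float64)

mse = np.mean((reference - distorted) ** 2)
# PSNR in decibels; 255 is the maximum possible value for 8-bit images
psnr = float("inf") if mse == 0 else 20 * np.log10(255.0 / np.sqrt(mse))
print(f"PSNR: {psnr:.2f} dB")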
def calculate_image_quality_psnr(image_folder):
    """
    Calculate the Peak Signal-to-Noise Ratio (PSNR) for images in the specified folder.

    Parameters:
        image_folder (str): Path to the folder containing the images.

    Returns:
        list: List of PSNR values for each image.
    """
    psnr_values = []

    # Iterate over the images in the folder
    for filename in os.listdir(image_folder):
        if filename.endswith(".jpg") or filename.endswith(".png"):
            image_path = os.path.join(image_folder, filename)
            image = cv2.imread(image_path)

            # Convert the image to grayscale for PSNR calculation
            gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

            # Calculate a no-reference PSNR proxy: with no original to compare against,
            # the image's own variance stands in for the MSE term
            mse = np.mean((gray_image - gray_image.mean()) ** 2)
            if mse == 0:
                psnr = float('inf')
            else:
                max_pixel_value = np.max(gray_image)
                psnr = 20 * np.log10(max_pixel_value / np.sqrt(mse))

            psnr_values.append(psnr)

    return psnr_values
def plot_image_quality_distribution_PSNR(image_folder):
    """
    Plot the distribution of image quality based on Peak Signal-to-Noise Ratio (PSNR).

    Parameters:
        image_folder (str): Path to the folder containing the images.

    Returns:
        None
    """
    # Calculate PSNR values for the images
    psnr_values = calculate_image_quality_psnr(image_folder)

    # Plot the distribution of PSNR values
    plt.hist(psnr_values, bins=30, color='purple', alpha=0.7)
    plt.xlabel("Peak Signal-to-Noise Ratio (PSNR)")
    plt.ylabel("Frequency")
    plt.title("Distribution of Image Quality (PSNR)")
    plt.show()

# Specify the folder containing the images
image_folder = "/content/roadsigndataset/train"

# Call the function to plot the distribution
plot_image_quality_distribution_PSNR(image_folder)
PSNR is a widely used metric to evaluate image quality: it expresses, in decibels, the ratio between the maximum possible pixel value (the peak signal) and the mean squared error (MSE) between two images. In this notebook no external reference image is available, so each image's own variance stands in for the MSE term, making the values a rough, no-reference proxy for contrast and noise rather than a true fidelity measure.
The PSNR values in the range of roughly 8 to 23 dB indicate noticeable differences in quality across the dataset; lower PSNR values typically correspond to higher levels of noise or quality degradation.
The high peak near 12 to 16 dB suggests a significant concentration of images with similar levels of degradation. This range may represent a common characteristic or a specific type of image in the dataset.
Limitations of Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR) as Image
Quality Metrics
Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR) are commonly used metrics
to measure the quality of images, but they have certain limitations that make them less suitable
for assessing perceptual image quality.
MSE and PSNR evaluate image quality based on pixel-level differences between the
original and reconstructed images. While these metrics can measure the level of distortion
or error in an image, they are highly sensitive to even small changes in pixel values. This
sensitivity does not always align with human perception of image quality. Human visual
perception is more focused on higher-level features, such as edges, textures, and overall
visual appearance, rather than pixel-level differences.
They treat all pixels equally, regardless of their visual importance. As a result, these metrics
may not reflect the visual quality perceived by humans accurately. For example, a small
amount of distortion in a smooth region of an image may be less noticeable to the human
eye compared to distortion in high-contrast or detailed regions.
The Structural Similarity Index (SSIM) is a widely used image quality metric that measures the
similarity between two images. It aims to capture perceptual differences between images by
considering three components: luminance, contrast, and structural similarity.
Luminance: SSIM takes into account the similarity of pixel intensities in the images. It
considers the mean values of the pixels, which represent the overall brightness.
Contrast: SSIM evaluates the similarity of contrast between the images. It calculates the
standard deviation of pixel intensities, which measures the variability or sharpness of the
image.
Structural Similarity: SSIM focuses on the structural patterns and textures present in the
images. It computes the covariance of pixel intensities and their spatial arrangement,
capturing local image structures.
The SSIM index ranges between -1 and 1, where 1 indicates perfect similarity and -1 represents
complete dissimilarity. A higher SSIM value indicates a greater similarity between the images.
In addition to these components, SSIM also incorporates a Gaussian window to account for the
varying importance of different image regions. The Gaussian window assigns higher weights to
the central region and lower weights to the surrounding regions. This emphasizes the impact of
local structures on the overall similarity measure.
The Gaussian window helps to give more importance to the fine details and edges in the image
while reducing the influence of smooth regions. It enables SSIM to effectively capture the
structural similarity between images, even when they differ in terms of global luminance and
contrast.
To calculate SSIM with the Gaussian window, the pixel intensities of the images are multiplied by
the Gaussian weights before computing the mean, standard deviation, and covariance. This
weighting scheme allows SSIM to prioritize the local structural information and produce more
accurate similarity measurements.
Daytime and nighttime scenes have distinct visual characteristics due to differences in
lighting conditions, shadows, and overall visibility.
Road and traffic conditions vary significantly between day and night. Factors such as traffic
density, pedestrian activity, and lighting conditions can impact driver behavior and the
effectiveness of safety systems.
Road safety is a primary concern in the domain of road and traffic. Day and night driving
pose unique challenges, and accurate classification and detection of objects and road
signs under both conditions are essential for driver safety.
Knowing the distribution of day vs. night images helps in assessing the performance of
computer vision models designed for road and traffic-related tasks. If the dataset is imbalanced,
efforts can be made to collect more samples from the underrepresented class (e.g., nighttime
images) to create a balanced dataset for training.
import cv2
import numpy as np
import os
import matplotlib.pyplot as plt

# Define the image folder path
image_folder = "/content/roadsigndataset/train"

# Initialize an empty list to store V values
v_values = []

# Iterate over the images in the folder
for image_file in os.listdir(image_folder):
    if image_file.endswith(".jpg") or image_file.endswith(".png"):
        image_path = os.path.join(image_folder, image_file)

        # Load the image
        image = cv2.imread(image_path)

        # Convert the image to HSV color space
        hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

        # Calculate the V values of the top quarter of the image
        height, width, _ = hsv_image.shape
        top_quarter_v_values = hsv_image[:height//4, :, 2].flatten()

        # Append the V values to the list
        v_values.extend(top_quarter_v_values)

# Plot the histogram of V values
plt.hist(v_values, bins=30, color='blue', alpha=0.7)
plt.xlabel("V Value")
plt.ylabel("Frequency")
plt.title("Distribution of V Values for the Top Quarter of Images")
plt.show()
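Building on this histogram, each image can be given a rough day/night label by thresholding the mean V value of its top quarter (a minimal sketch; the threshold of 100 is an illustrative assumption that should really be read off the histogram above).

import os
import cv2

image_folder = "/content/roadsigndataset/train"
v_threshold = 100  # illustrative cut-off between night-like and day-like images

day_count, night_count = 0, 0
for image_file in os.listdir(image_folder):
    if image_file.endswith(".jpg") or image_file.endswith(".png"):
        image = cv2.imread(os.path.join(image_folder, image_file))
        hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        height = hsv_image.shape[0]

        # A bright top quarter (typically sky) suggests a daytime scene
        mean_top_v = hsv_image[:height // 4, :, 2].mean()
        if mean_top_v >= v_threshold:
            day_count += 1
        else:
            night_count += 1

print(f"Day-like images: {day_count}, night-like images: {night_count}")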
Task 4: Run YOLOv5 (an available object detection model) and find the distribution of objects in the images
This information helps in understanding the types of objects commonly present on the
road, such as cars, pedestrians, bicycles, traffic signs, and other relevant elements. It
allows us to identify patterns, trends, and potential challenges related to traffic flow,
congestion, or safety.
Knowing the frequency and positioning of traffic signs, traffic lights, or road markings can aid in optimizing road layouts, identifying areas where additional signage is required, or evaluating the effectiveness of existing traffic control measures.
Understanding the distribution of objects in images is crucial for training object detection
models effectively. By analyzing the frequency and diversity of objects, we can determine the
appropriate training strategies, data augmentation techniques, and model architectures to
handle different object classes and their variations. It helps in building accurate and robust
object detection systems specific to the road and traffic-related domain.
To use the YOLOv5 model in Google Colab, you can follow these step-by-step instructions:
!pip install torch torchvision
!pip install numpy matplotlib
!pip install scikit-image
This will download the YOLOv5 model weights (yolov5m_Objects365.pt) into the current
directory of the Colab notebook.
!wget https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5m_Objects365.pt
!wget https://github.com/ultralytics/yolov5/raw/master/data/Objects365.yaml
HTTP request sent, awaiting response... 302 Found
HTTP request sent, awaiting response... 200 OK
Length: 9204 (9.0K) [text/plain]
Saving to: ‘Objects365.yaml’
!git clone https://github.com/ultralytics/yolov5.git
Copy the Objects365.yaml configuration file into the yolov5/data directory. You can
download the file manually from the YOLOv5 GitHub repository and upload it to the Colab
notebook, then run the following code cell to move the file:
!mv '/content/Objects365.yaml' '/content/yolov5/data'
!mv '/content/yolov5m_Objects365.pt' '/content/yolov5'
!pip install ultralytics
Collecting ultralytics
  Downloading ultralytics-8.0.131-py3-none-any.whl (626 kB)
Requirement already satisfied: matplotlib>=3.2.2, opencv-python>=4.6.0, Pillow>=7.1.2, PyYAML>=5.3.1, requests>=2.23.0, scipy>=1.4.1, torch>=1.7.0, torchvision>=0.8.1, tqdm>=4.64.0, pandas>=1.1.4, seaborn>=0.11.0, psutil (and their dependencies)
Installing collected packages: ultralytics
Successfully installed ultralytics-8.0.131
import torch
import os
import matplotlib.pyplot as plt
import ultralytics
os.chdir("/content/yolov5")
# Load the locally downloaded Objects365 checkpoint through the YOLOv5 hub interface
model = torch.hub.load('.', 'custom', path="yolov5m_Objects365.pt", source='local')
Fusing layers...
YOLOv5m summary: 290 layers, 22323858 parameters, 0 gradients
Adding AutoShape...
import cv2

# Define the image folder path
image_folder = "/content/roadsigndataset/train"

# Initialize an empty dictionary to store the object distribution
object_distribution = {}

# Iterate over the images in the folder
for image_file in os.listdir(image_folder):
    if image_file.endswith(".jpg") or image_file.endswith(".png"):
        image_path = os.path.join(image_folder, image_file)
        image = cv2.imread(image_path)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # Perform object detection using YOLOv5
        results = model(image)

        # Extract the class labels for detected objects
        class_labels = results.pandas().xyxy[0]['name'].tolist()

        # Update the object distribution dictionary
        for label in class_labels:
            if label in object_distribution:
                object_distribution[label] += 1
            else:
                object_distribution[label] = 1

# Print the object distribution
for label, count in object_distribution.items():
    print(f"{label}: {count}")
Traffic Sign: 94
Stop Sign: 66
Clock: 15
Street Lights: 88
Car: 86
Person: 17
Traffic Light: 35
Van: 7
Vase: 2
Potted Plant: 3
Flower: 1
Train: 2
Microphone: 1
Speaker: 1
Hat: 3
Handbag/Satchel: 1
Candle: 1
Truck: 5
Flag: 4
SUV: 5
Bus: 3
Motorcycle: 1
Picture/Frame: 3
Pickup Truck: 2
Volleyball: 2
Crane: 1
Mirror: 2
Lamp: 2
Basketball: 1
Backpack: 2
Air Conditioner: 6
Tent: 1
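To make these counts easier to compare, the distribution can be plotted as a bar chart (a minimal sketch that reuses the object_distribution dictionary built in the detection cell above).

import matplotlib.pyplot as plt

# Sort classes by frequency so the most common road objects appear first
labels, counts = zip(*sorted(object_distribution.items(), key=lambda item: item[1], reverse=True))

plt.figure(figsize=(12, 5))
plt.bar(labels, counts, color="steelblue")
plt.xticks(rotation=90)
plt.ylabel("Number of detections")
plt.title("Distribution of objects detected by YOLOv5 in the training images")
plt.tight_layout()
plt.show()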
Data augmentation
Data augmentation is a technique commonly used in machine learning, including computer
vision tasks like image classification. It involves applying various transformations and
modifications to the existing dataset to create new training samples with altered versions of the
original data. Data augmentation serves several purposes:
Domain Adaptation: Data augmentation can be used for domain adaptation, which involves
training a model on a source domain and applying it to a target domain. In computer vision,
domains could differ in lighting conditions, camera angles, or imaging equipment, leading
to variations that the model needs to handle. Data augmentation can simulate these
variations in the source domain to make the model more adaptable to the target domain.
By applying domain-specific transformations during augmentation, the model learns to
generalize better and performs well on the target domain by mimicking the variations
present in the real-world data.
Increase Variability and Generalization: Data augmentation helps increase the variability
and diversity of the training data. In many cases, the available dataset is limited, and
training a model solely on this data can lead to overfitting. Overfitting occurs when a model
becomes too specific to the training data and fails to generalize well to new, unseen
examples. By applying random transformations and modifications to the existing data,
such as rotation, scaling, flipping, cropping, or adding noise, data augmentation generates
new samples that exhibit variations of the original data. This expanded dataset allows the
model to learn robust and generalized patterns, enabling better performance on unseen
data during inference.
Mitigate Class Imbalance: Class imbalance occurs when certain classes or categories
have significantly fewer samples compared to others. This can lead to biased models that
perform poorly on underrepresented classes. Data augmentation can help address class
imbalance by artificially increasing the number of samples for minority classes. By
applying augmentation techniques specifically to the underrepresented classes, the
dataset can be rebalanced, allowing the model to learn better representations for all
classes and prevent bias towards the majority classes.
Improve Robustness: By exposing the model to different transformations, the model learns to recognize and generalize patterns in the presence of such variations. This robustness enables the model to perform well on test data that may have different lighting conditions, orientations, or other factors that were not present in the original training data.
Data augmentation refers to a set of techniques used to create new training data samples by
applying various transformations and modifications to the existing dataset. These
transformations alter the appearance or characteristics of the data while preserving the label or
class information.
Data augmentation techniques can be applied to different types of data, including images, text,
audio, and time series. In computer vision, image data augmentation is widely used and includes
operations such as:
Noise addition: Different types of noise, such as Gaussian noise or random pixel value
perturbations, can be added to the images to mimic real-world variations or improve
robustness.
Occlusion and cutout: Artificial occlusions or cutout regions can be introduced into the
images to simulate partial object occlusion or missing information.
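As a small sketch of these two ideas using Albumentations (default parameters are used here; the image path is a placeholder):

import cv2
import albumentations as A

# Illustrative pipeline: random pixel noise plus rectangular cutout regions
noise_and_cutout = A.Compose([
    A.GaussNoise(p=1.0),      # add Gaussian noise to mimic sensor noise
    A.CoarseDropout(p=1.0),   # drop small rectangular patches to simulate occlusion
])

image = cv2.imread("example_road_sign.jpg")  # placeholder path
augmented_image = noise_and_cutout(image=image)["image"]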
These are just a few examples of data augmentation techniques used in computer vision. The
specific choice and combination of augmentation techniques depend on the characteristics of
the dataset, the nature of the problem, and the desired variations needed to improve the model's
performance.
Albumentations
Albumentations is a Python library specifically designed for data augmentation in computer vision tasks. The primary goal of Albumentations is to facilitate the augmentation process for deep learning and machine learning practitioners working on image classification, object detection,
segmentation, and other computer vision tasks. By leveraging the power of data augmentation, Albumentations helps improve model performance, generalization, and robustness by exposing the model to a broader range of variations and scenarios.
It provides a flexible and easy-to-use interface for applying a wide range of image transformations and augmentations to enhance the diversity and variability of the training dataset. Albumentations offers a rich set of augmentation techniques that can be easily applied to images. These techniques include geometric transformations like rotation, scaling, and flipping, as well as color manipulations such as brightness adjustment, contrast enhancement, and hue shifts. In addition, Albumentations supports more advanced transformations such as perspective transformations, elastic deformations, and noise injection.
One of the key features of Albumentations is its ability to handle complex augmentation pipelines. Users can chain together multiple augmentation operations, specifying the desired parameters and probabilities for each transformation. This allows for the creation of diverse and customizable augmentation pipelines tailored to specific needs.
These specific transformations are chosen to simulate low light, low saturation, and a particular range of rotation:
ColorJitter: This transformation randomly applies color variations to the images, such as
changes in hue, saturation, and brightness. It helps the model learn to recognize objects under
different color conditions, such as images captured with low saturation cameras.
GaussNoise: Adding Gaussian noise to the images simulates image noise commonly
encountered in real-world scenarios. It helps the model learn to distinguish and classify objects
even when there is noise present in the image.
Rotate: Rotation augmentation introduces variations by randomly rotating the images within a
specified range (30 degrees in this case). This is useful to handle situations where the road signs
might be tilted or not perfectly aligned in the input images.
RandomRain: Simulating rain effects in the images adds realism and helps the model generalize
better to images captured during rainy weather conditions. It allows the model to learn the
characteristics of road signs under rainy conditions.
RandomShadow: The presence of shadows can affect the appearance of road signs. By applying
random shadow effects, the model learns to recognize road signs even when they are partially
covered by shadows.
RandomFog: Fog is another common environmental factor that can impact the visibility of road
signs. Introducing random fog effects helps the model adapt to such conditions and learn to
classify road signs accurately in foggy scenarios.
https://colab.research.google.com/drive/18zMbkgy3VE-o43WQasB9iEXrAIves8St#scrollTo=a4gjMwn9JDcS&printMode=true 24/27
17/07/2023, 13:26 notebook_1_data_preparation_and_eda_and_data_augmentation.ipynb - Colaboratory
RandomGravel: Road surfaces with gravel or other textured elements can introduce variations in
the appearance of road signs. By applying random gravel effects, the model becomes more
robust to such variations and can recognize road signs in diverse environments.
RandomSnow: Simulating snow on the images helps the model handle road sign classification in
snowy conditions, where the presence of snow might affect the appearance of the signs.
import cv2
import albumentations as A
import os
import matplotlib.pyplot as plt

# Define the image folder path
image_folder = "/content/roadsigndataset/train"

# Define data augmentation transformations
augmentation_transforms = A.Compose([
    A.RandomBrightnessContrast(p=0.5),
    A.ColorJitter(p=0.5),
    A.GaussNoise(p=0.5),
    A.Rotate(limit=30, p=0.5),
    A.RandomRain(p=0.5),
    A.RandomShadow(p=0.5),
    A.RandomFog(p=0.5),
    #A.RandomGravel(p=0.5),
    A.RandomSnow(p=0.5),
])

# Create an empty list to store augmented images
augmented_images = []

# Apply data augmentation to each image in the dataset
# Iterate over the images in the folder
for image_file in os.listdir(image_folder):
    if image_file.endswith(".jpg") or image_file.endswith(".png"):
        image_path = os.path.join(image_folder, image_file)

        # Read the image
        image = cv2.imread(image_path)

        # Apply data augmentation
        augmented = augmentation_transforms(image=image)
        augmented_image = augmented["image"]

        # Add the augmented image to the list or save it to disk
        augmented_images.append(augmented_image)

        # Display the augmented image
        plt.imshow(cv2.cvtColor(augmented_image, cv2.COLOR_BGR2RGB))
        plt.axis('off')
        plt.show()