IPRREPORT

1. The document describes a project that uses a U-NET convolutional neural network to perform semantic change detection on bi-temporal satellite images to identify changes in buildings and roads over time. 2. It first trains a U-NET model on a dataset to perform multi-class semantic segmentation of urban areas into buildings and roads, obtaining 91.41% accuracy. 3. The main goal is then to refine the model for binary semantic segmentation focused only on buildings and roads, achieving 95.25% accuracy and 92.21% mean IoU.
Intellectual Property Rights ( EC73 )

ASSIGNMENT

Semantic Change Detection Using Bi-Temporal Satellite Images

Rashmi T
( USN: 1MS20EC078 )

Contents:
1. IPR Forms
2. Semantic Change Detection Innovation Document.
3. Similarity Check Report
ABSTRACT
Semantic change detection (SCD) plays a crucial role in remote sensing and image analysis
by identifying and monitoring dynamic changes in land cover and land use. SCD is an
improved form of conventional Change Detection (CD): it tells 'where' and 'how' changes
have occurred instead of only 'where'. This project focuses on binary
semantic change detection of buildings and roads using bi-temporal satellite images.
Accurate detection of changes in urban infrastructure holds immense significance for
applications such as urban planning, infrastructure management, and navigation systems.

The project begins with a concise introduction to the U-NET convolutional neural network
architecture and its implementation for multiclass semantic segmentation. The UNET model
serves as the foundation for subsequent binary semantic change detection tasks. The main
objective is to develop a robust system that can accurately detect and track changes in
buildings and roads across different timestamps. The LEVIR-CD dataset is used, with
buildings and roads manually annotated using Label Studio to provide ground truth labels
for training and evaluation. Semantic segmentation is then performed using the U-NET
model, enabling precise identification and segmentation of buildings and roads in the
bi-temporal satellite images. To quantify changes between the two timestamps, the project
incorporates the Gray-Level Co-occurrence Matrix (GLCM). By comparing GLCM features
extracted from the segmented masks, differences in texture and spatial relationships are
measured, providing a reliable indication of change in roads and buildings. We obtained
an accuracy of 91.41% in multi-class semantic segmentation with a mean IoU
of 50.72%. Binary semantic segmentation, which focuses on roads and
buildings, reaches an accuracy of 95.25% with a mean IoU of 92.21%.
The successful implementation of this project has diverse applications in various domains.
The developed system facilitates efficient and automated detection of dynamic changes in
buildings and roads, enhancing urban planning, infrastructure management, and navigation.
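
The GLCM comparison mentioned above can be illustrated with a minimal sketch. It assumes integer-valued masks, a single horizontal pixel offset, and two common GLCM features (contrast and homogeneity); the function names and the use of absolute feature differences as a change score are illustrative assumptions, not the project's actual implementation.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised gray-level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    total = 0
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < height and 0 <= c2 < width:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image is too small for the chosen offset")
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Large when neighbouring pixels often carry different levels."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


def homogeneity(p):
    """Large when neighbouring pixels mostly carry the same level."""
    size = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(size) for j in range(size))


def texture_change(mask_t1, mask_t2, levels):
    """Assumed change score: absolute difference of each GLCM feature."""
    p1, p2 = glcm(mask_t1, levels), glcm(mask_t2, levels)
    return {
        "contrast": abs(contrast(p1) - contrast(p2)),
        "homogeneity": abs(homogeneity(p1) - homogeneity(p2)),
    }


# Toy binary masks: the foreground block grows between the two timestamps.
mask_t1 = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
mask_t2 = [[0, 0, 0, 0], [0, 1, 1, 1], [0, 1, 1, 1], [0, 0, 0, 0]]
print(texture_change(mask_t1, mask_t2, levels=2))
```

Identical masks give a score of zero for every feature, so larger values indicate a stronger difference in the spatial arrangement of the segmented regions.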

CLAIMS

1. Used Label Studio to generate the dataset.
2. The U-NET architecture produced results with better accuracy without requiring a
complex system.
3. Only two classes were considered, which kept the focus clear.
4. Improved the mean IoU, which was one of the main factors determining the
performance of the model.
5. Generated the desired output with a minimal dataset.
6. Cost efficient.
PROCEDURE

The objective of semantic change detection using bi-temporal satellite images is to
automatically identify and map changes in the buildings and roads of the images between two
time periods. This involves the following steps:
1. Data Pre-processing: Pre-processing involves preparing the bi-temporal satellite images
for analysis. This includes collecting the dataset and augmenting the data.
2. Feature Extraction: Feature extraction involves identifying and extracting relevant
features from the bi-temporal satellite images.
3. Labelling: Labelling involves creating a training dataset that consists of labelled
examples of semantic changes in the bi-temporal satellite images. This may involve
manually labelling changes or using automated change detection algorithms.
4. Model Selection: Model selection involves selecting an appropriate ML algorithm for
detecting semantic changes in the bi-temporal satellite images. Popular algorithms for
change detection include Support Vector Machines (SVMs), Random Forests, and
Convolutional Neural Networks (CNNs).
5. Training: Training involves training the ML algorithm on the labelled dataset. This
involves splitting the dataset into training and validation sets, selecting appropriate
hyperparameters, and training the model.
6. Prediction: Prediction involves using the trained ML model to predict semantic changes
in the bi-temporal satellite images. This may involve applying the model to new bi-temporal
satellite images or applying the model to a time series of images to detect changes over
time.
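
The dataset split mentioned in step 5 can be sketched as follows. The function name, the 80/20 ratio, the fixed seed, and the file names are hypothetical choices for illustration; the report does not state the split that was actually used.

```python
import random


def split_pairs(image_paths, mask_paths, val_fraction=0.2, seed=42):
    """Shuffle (image, mask) pairs together and split them into train and validation sets."""
    if len(image_paths) != len(mask_paths):
        raise ValueError("every image needs exactly one mask")
    pairs = list(zip(image_paths, mask_paths))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(pairs)
    n_val = int(len(pairs) * val_fraction)
    return pairs[n_val:], pairs[:n_val]


# Hypothetical file names, used only to show the call.
images = [f"img_{i}.png" for i in range(10)]
masks = [f"mask_{i}.png" for i in range(10)]
train_pairs, val_pairs = split_pairs(images, masks)
print(len(train_pairs), len(val_pairs))
```

Pairing each image with its mask before shuffling guarantees that the two never drift out of order.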

DIFFERENCE BETWEEN CHANGE DETECTION AND SEMANTIC CHANGE DETECTION
Change Detection (CD) detects changes between two different images of the same scene
taken at different times. Semantic Change Detection (SCD) is a CD technique that
intuitively assigns semantic meaning to detected change areas. CD says “where” the
change is, whereas SCD tells “where and how” two images taken at
different times differ. Semantic change detection is a more advanced form of change detection
that goes beyond the identification of physical changes and instead focuses on detecting
changes in the meaning or semantic content of objects or features in an image. The output of
semantic change detection algorithms typically consists of maps or visualisations that
highlight the changes detected in the features of interest. These outputs may also include
statistical or quantitative information about the changes, such as the area of a new
construction or the change in the length of a road segment.
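
The distinction can be made concrete with a small sketch that compares two per-pixel label maps. The class ids and names below are assumptions for illustration: the binary map answers only "where", while the semantic map also records the from/to classes ("how").

```python
from collections import Counter

# Hypothetical class ids for illustration.
CLASS_NAMES = {0: "background", 1: "building", 2: "road"}


def binary_change_map(labels_t1, labels_t2):
    """CD: 1 where the label differs between the two dates, else 0."""
    return [[int(a != b) for a, b in zip(row1, row2)]
            for row1, row2 in zip(labels_t1, labels_t2)]


def semantic_change_map(labels_t1, labels_t2):
    """SCD: (from_class, to_class) where the label differs, else None."""
    return [[None if a == b else (CLASS_NAMES[a], CLASS_NAMES[b])
             for a, b in zip(row1, row2)]
            for row1, row2 in zip(labels_t1, labels_t2)]


def transition_counts(labels_t1, labels_t2):
    """Number of changed pixels per (from_class, to_class) transition."""
    changes = semantic_change_map(labels_t1, labels_t2)
    return Counter(cell for row in changes for cell in row if cell is not None)


labels_t1 = [[0, 0], [2, 2]]
labels_t2 = [[1, 0], [2, 0]]
print(binary_change_map(labels_t1, labels_t2))
print(transition_counts(labels_t1, labels_t2))
```

Multiplying each transition count by the ground area of one pixel would give the kind of quantitative output described above, such as the area of new construction.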
CONVOLUTIONAL NEURAL NETWORK – UNET
In a Convolutional Neural
Network (CNN), the convolutional layers are the core building blocks that perform feature
extraction from input data, such as images. These layers consist of a set of learnable filters
(also known as kernels or weights) that are applied to the input data, producing a set of
feature maps that highlight different aspects of the input. Each filter in a convolutional layer
is a small matrix of weights that slides (convolves) over the input data. At each position, the
dot product between the filter and the corresponding input data patch is computed, and the
result is stored in the corresponding position of the output feature map. By sliding the filter
over the entire input data, the convolutional layer produces a new feature map that
highlights a particular pattern or feature that the filter is sensitive to. The CNN used here is
a UNET.
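
The sliding dot product described above can be sketched in plain Python. This is a minimal illustration with no padding and a stride of 1; like most deep learning frameworks, it does not flip the kernel. The example image and kernel values are made up.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` and store one dot product per position."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Dot product between the filter and the image patch at (r, c).
            acc = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# A 4 x 4 image with a vertical edge between the second and third columns.
image = [[0, 0, 1, 1]] * 4
# A 2 x 2 filter that responds to dark-to-bright transitions from left to right.
vertical_edge = [[-1, 1], [-1, 1]]
print(conv2d_valid(image, vertical_edge))
```

The feature map is non-zero only in the column where the filter straddles the edge, which is what "highlighting a particular pattern" means in practice. In a trained network the filter weights are learned rather than hand-picked.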

The U-Net is a convolutional neural network architecture designed for image segmentation
tasks. It was first introduced in a research paper titled "U-Net: Convolutional Networks for
Biomedical Image Segmentation" by Ronneberger et al. in 2015. The U-Net architecture
consists of a contracting path and an expanding path. The contracting path is a typical
convolutional network that extracts features from the input image, while the expanding path
uses upsampling and concatenation operations to produce a segmentation mask with the
same dimensions as the original input image. The contracting path is composed of multiple
convolutional and max-pooling layers that gradually reduce the spatial resolution of the
feature maps. This allows the network to capture high-level features and contextual
information from the input image. The expanding path consists of multiple deconvolutional
and concatenation layers that gradually increase the spatial resolution of the feature maps.
The deconvolutional layers perform upsampling to increase the spatial resolution, while the
concatenation layers merge the feature maps from the contracting path with the
corresponding feature maps in the expanding path. This allows the network to recover the
spatial information lost during the contracting path and produce an accurate segmentation
mask.
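
How the two paths change the feature-map shape can be traced with a small bookkeeping sketch. It assumes padded convolutions (so only pooling and upsampling change the spatial size), 2 x 2 pooling, 64 channels at the first level, and four levels; these values follow a common U-Net configuration and are not confirmed by this report.

```python
def unet_shape_trace(input_size=256, base_channels=64, depth=4):
    """Return (stage, spatial_size, channels) for each level of a U-Net-style network."""
    if input_size % (2 ** depth) != 0:
        raise ValueError("input_size must be divisible by 2 ** depth")
    trace = []
    skips = []
    size, channels = input_size, base_channels
    # Contracting path: each level halves the resolution and doubles the channels.
    for _ in range(depth):
        trace.append(("encoder", size, channels))
        skips.append((size, channels))
        size //= 2
        channels *= 2
    trace.append(("bottleneck", size, channels))
    # Expanding path: upsample, concatenate the matching encoder output, then convolve.
    for skip_size, skip_channels in reversed(skips):
        size *= 2
        channels //= 2
        # Concatenation is only possible when the spatial sizes match.
        assert size == skip_size and channels == skip_channels
        trace.append(("decoder", size, channels))
    return trace


for stage, size, channels in unet_shape_trace():
    print(f"{stage:<10} {size:>4} x {size:<4} channels={channels}")
```

The last decoder level has the same spatial size as the input, which is why the network can emit one class prediction per input pixel.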

MULTICLASS SEMANTIC SEGMENTATION

Fig.(1)
The motivation behind the first part of our project is to demonstrate the effectiveness of
multiclass semantic segmentation using the existing UNET architecture on a representative
dataset. By showcasing the capabilities of this approach, we aim to lay the foundation for
the subsequent phases of our project, which involve semantic change detection in buildings
and roads from bi-temporal satellite images. In this initial phase, we focus on utilising the
widely adopted UNET architecture for multiclass semantic segmentation. We select an
existing dataset that encompasses diverse urban scenes and contains annotated ground truth
labels for buildings and roads. By training the UNET model on this dataset, we aim to
showcase its ability to accurately classify and segment different classes within the urban
environment. Through rigorous experimentation and parameter tuning, we optimise the
UNET model's performance for multiclass segmentation. We evaluate the model using
appropriate metrics such as intersection over union (IoU), accuracy, and precision-recall
curves. The results of this demonstration provide valuable insights into the strengths and
limitations of the UNET architecture for the specific task of multiclass semantic
segmentation. By successfully demonstrating the efficacy of multiclass semantic
segmentation using the UNET architecture on an existing dataset, we establish a strong
foundation for the subsequent phases of our project. This demonstration serves as a
precursor to the main focus of our project, which involves extending this methodology to
detect semantic changes in buildings and roads from bi-temporal satellite images.
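
Pixel accuracy and mean IoU, two of the metrics named above, can be computed as in the following sketch. The toy masks are made up, and the convention of skipping classes that appear in neither mask is an assumption; evaluation libraries differ on that point.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    pairs = [(p, t) for row_p, row_t in zip(pred, truth) for p, t in zip(row_p, row_t)]
    return sum(p == t for p, t in pairs) / len(pairs)


def mean_iou(pred, truth, num_classes):
    """Average of per-class intersection over union, skipping absent classes."""
    pairs = [(p, t) for row_p, row_t in zip(pred, truth) for p, t in zip(row_p, row_t)]
    ious = []
    for cls in range(num_classes):
        intersection = sum(p == cls and t == cls for p, t in pairs)
        union = sum(p == cls or t == cls for p, t in pairs)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious)


# Toy example: one foreground pixel (class 1) that the prediction misses.
truth = [[0, 0, 0, 0], [0, 0, 0, 1]]
pred = [[0, 0, 0, 0], [0, 0, 0, 0]]
print(pixel_accuracy(pred, truth))           # 7 of 8 pixels are correct
print(mean_iou(pred, truth, num_classes=2))  # class 1 contributes an IoU of 0
```

In this toy case the accuracy is high while the mean IoU is much lower, because a rare class that is missed entirely barely affects accuracy but pulls the class average down. This is one general reason the two metrics can diverge on imbalanced data; whether it explains the figures reported here has not been verified.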

BINARY SEMANTIC SEGMENTATION

Fig.(2)
The main focus of our project is binary semantic segmentation. Using the
multiclass semantic segmentation model as our reference, we narrow the scope
to the two target classes and optimise the model to accurately segment buildings and roads
in satellite imagery. The dataset used for this phase is manually annotated with masks
specifically for buildings and roads, enabling focused training and evaluation.
• Binary Semantic Segmentation: Binary semantic segmentation is a type of image analysis
technique that involves partitioning an image into two distinct classes or categories, usually
labelled as "foreground" and "background". The goal of binary semantic segmentation is to
identify and separate the objects or regions of interest from the background in an image. In
binary semantic segmentation, each pixel in the image is assigned a binary value of either 0
or 1, where 0 represents the background and 1 represents the foreground. The segmentation
process involves identifying the boundaries or edges of the foreground objects and
separating them from the background based on certain criteria. In our analysis, the foreground
consists of roads and buildings; all other classes are treated as background.
• Dissimilarity Decoder: A dissimilarity block is a component commonly used in image
processing and computer vision systems to detect changes or differences between two or
more images of the same scene taken at different times. The goal is to identify any
significant changes in the scene, which can be used for a variety of applications such as
surveillance, environmental monitoring, and industrial quality control. The change detection
block works by taking the two semantically segmented images, and detecting the
dissimilarity between them. The resulting output image will highlight the regions where
significant changes have occurred between the two input images. The Euclidean distance
between the two feature vectors x and y, each with n components, is calculated as:
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
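
The two components above can be sketched together. The Euclidean distance is the square root of the summed squared differences between corresponding vector components. The class ids, the per-pixel feature layout, and the threshold are assumptions for illustration, not the project's actual decoder.

```python
import math

# Hypothetical class ids treated as foreground (building and road).
FOREGROUND_CLASSES = {1, 2}


def to_binary_mask(label_map, foreground=FOREGROUND_CLASSES):
    """Map a multi-class label map to 1 (foreground) or 0 (background)."""
    return [[1 if value in foreground else 0 for value in row] for row in label_map]


def euclidean_distance(x, y):
    """Square root of the summed squared component differences."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dissimilarity_map(features_t1, features_t2, threshold):
    """Mark a pixel as changed when its two feature vectors are far apart."""
    return [[int(euclidean_distance(f1, f2) > threshold) for f1, f2 in zip(row1, row2)]
            for row1, row2 in zip(features_t1, features_t2)]


print(to_binary_mask([[0, 1], [2, 3]]))
print(euclidean_distance([0, 0], [3, 4]))

# One row of two pixels, each described by a two-component feature vector.
features_t1 = [[[0.0, 0.0], [1.0, 1.0]]]
features_t2 = [[[0.0, 0.1], [0.0, 0.0]]]
print(dissimilarity_map(features_t1, features_t2, threshold=0.5))
```

The threshold controls how large a difference must be before it counts as a change, so it would need tuning on validation data.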

GENERATING GROUND TRUTH LABELS


Ground truth labels or masks are a critical component in semantic segmentation, as they
highlight the regions of interest while ignoring the other areas of an image. In our project,
the regions of interest are roads and buildings. To create training datasets for our machine
learning models, we used Label Studio, an open-source data labelling tool. We sorted the
ground truth labels and images in order and fed them to our network for semantic
segmentation. To reduce computational costs and improve performance, we cropped and
resized the 1024 x 1024 images and masks to 256 x 256 before augmenting them to
generate a larger number of training images. Using Label Studio ensured that our ground
truth labels were accurate and consistent, which is crucial for training high-quality models.
Additionally, by augmenting our training set, we increased the diversity of our data and
improved the model's ability to generalise to new images. This approach allowed us to
develop an effective and efficient model for semantic segmentation in our project.
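
The preparation step can be sketched as follows, under one possible reading of the text: each 1024 x 1024 image is cut into non-overlapping 256 x 256 tiles, and each tile is augmented with flips. The function names and the choice of flips are illustrative assumptions; images are represented as nested lists to keep the sketch dependency-free.

```python
def tile_origins(image_size=1024, tile_size=256):
    """Top-left corners of the non-overlapping tiles that cover the image."""
    if image_size % tile_size != 0:
        raise ValueError("image_size must be a multiple of tile_size")
    steps = range(0, image_size, tile_size)
    return [(top, left) for top in steps for left in steps]


def crop(grid, top, left, size):
    """Cut a size x size window out of a 2-D grid of pixels."""
    return [row[left:left + size] for row in grid[top:top + size]]


def flip_horizontal(grid):
    return [row[::-1] for row in grid]


def flip_vertical(grid):
    return [list(row) for row in grid[::-1]]


def augment_pair(image, mask):
    """Apply the same transform to the image and its mask so the labels stay aligned."""
    transforms = (lambda grid: grid, flip_horizontal, flip_vertical)
    return [(transform(image), transform(mask)) for transform in transforms]


print(len(tile_origins()))  # tiles per 1024 x 1024 image
```

Applying every geometric transform to the image and the mask together is the essential point: a flipped image paired with an unflipped mask would teach the model wrong labels.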

REQUIREMENT SPECIFICATION
HARDWARE REQUIREMENT
• Intel i7 microprocessor.
• 16GB RAM.
• NVIDIA ® GeForce ® GTX 1650 Ti (4GB).
• 256 GB SSD and 1TB HD.
• Windows 10 OS.
SOFTWARE REQUIREMENT
• Miniconda 3.
• Conda 23.1.0.
• Python 3.9.16.
• CUDA 11.3.1.
• CUDNN 8.2.1.

RESULTS :
