
2024 International Conference on Network, Multimedia and Information Technology (NMITCON)

Automatic Building Extraction using Mask2Former Model

Venkatamanukarthikeya Dharmapuri
Student, Alliance University, Bengaluru, India
[email protected]

Venkatesh Punugotti
Student, Alliance University, Bengaluru, India
[email protected]

Bhargava Reddy Gunta
Student, Alliance University, Bengaluru, India
[email protected]

Dr. Ezil Sam Leni A.
Head of the Department, Alliance University, Bengaluru, India
[email protected]

I. ABSTRACT

This paper presents a new approach to building extraction and detailed roof analysis using the Mask2Former framework. Extracting buildings from aerial images is an important task in remote sensing and urban planning, and it matters for many applications such as land use mapping, construction, and damage control.

Traditional methods often have difficulty clearly defining building boundaries and assessing roof quality, especially in urban environments. We apply a deep learning method that combines the instance segmentation and semantic segmentation tasks, namely panoptic segmentation. By performing these tasks jointly, a well-trained Mask2Former model is obtained and fine-grained roof details are extracted and preserved. The framework uses convolutional neural networks (CNNs) for feature extraction, followed by a transformer-based architecture for context aggregation and refinement.

Experimental results show a significant improvement in extraction accuracy and roof quality analysis compared to other methods. The Mask2Former framework is designed to capture roof detail, distinguish individual buildings, and handle complex urban scenes. Applications of the Mask2Former method include urban planning, environmental monitoring, infrastructure assessment, and demolition planning, all of which need the detailed information that is important for decision-making and accurate regulatory design. The approach provides valuable results and opens new opportunities for the development of remote sensing and urban analysis, and it can be applied in other settings such as urban development, disaster management, and environmental conservation.

Keywords— Segmentation; Mask2Former; High Resolution Satellite Images; Object Detection; Object Segmentation; Image Processing

II. INTRODUCTION

Automatic building extraction is a fast-developing field within remote sensing and computer vision. It mainly concerns the use of computational techniques to identify and segment buildings from satellite imagery. This technology automates a traditionally manual process, offering great advantages in terms of speed, efficiency, and cost-effectiveness.

Extracting buildings from satellite images is very important for various reasons. These include urban planning, disaster management, navigation systems, updating geographic information systems (GIS), and 3D city modelling.

Building extraction is not a simple task. Factors like complex building shapes, variations in size and materials, and dense urban environments can challenge automated algorithms.

Traditional methods normally depend on image processing techniques like classical segmentation algorithms. However, recent developments in deep learning, mainly convolutional neural networks (CNNs), have transformed this field. Deep learning models can learn complex patterns from huge datasets of images, which helps us achieve high accuracy in building extraction.

III. LITERATURE REVIEW

[1] J. Li, W. He, W. Cao, L. Zhang, and H. Zhang, "UANet: An Uncertainty-Aware Network for Building Extraction from Remote Sensing Images," IEEE Transactions on Geoscience and Remote Sensing, 2024, doi: 10.1109/TGRS.2024.3361211.



Li et al.'s paper "UANet: An Uncertainty-Aware Network for Building Extraction from Remote Sensing Images" (2024) addresses the limitations of earlier extraction methods by proposing a new uncertainty-aware network (UANet). This review delves into the relevant literature and explores how UANet resolves uncertainty in building extraction.

UANet aims to resolve uncertainty in the prediction process by integrating uncertainty estimation into its architecture. The network uses an encoder-decoder model to generate an initial prediction map together with an uncertainty estimate; this initial prediction is then refined by additional modules to produce a more accurate and reliable result.

One of UANet's main innovations is the Prior Information Guide Module (PIGM). This module uses the initial prediction map as prior knowledge to improve the feature representation. PIGM works across spatial and channel dimensions, enhancing the features extracted by the encoder. This approach ensures that the model captures useful correlations and dependencies, thus improving the accuracy of the final prediction.

To further reduce uncertainty, the Uncertainty-Aware Fusion Module (UAFM) incorporates feedback from high-level to low-level features. UAFM adjusts the representation by gradually combining uncertainty information; this helps reduce the impact of complex backgrounds and varying scales. This refinement process leads to a final prediction map with reduced uncertainty.

UANet has been rigorously tested on several datasets, such as the Inria Aerial Image Dataset. These datasets present different challenges, such as varying building styles, scales, and environments. UANet performs better than competing methods, achieving higher Intersection over Union (IoU) and F1 scores and demonstrating more accurate and reliable building extraction.

For example, UANet achieved an IoU of 83.08 using the VGG-16 backbone on the Inria dataset, which is one of the best results reported. Similarly, it achieves an IoU of 76.41 on the Massachusetts dataset, indicating its robustness across different datasets.

[2] Y. Wang, X. Li, and Z. Chen, "Ultra-High-Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 23621-23630, doi: 10.1109/CVPR52729.2023.02262.

In recent years, significant advances have been made in image segmentation, which continues to improve the accuracy of identifying objects in images. But segmenting ultra-high resolution (UHR) images effectively is becoming increasingly difficult. These images, often over tens of megapixels and rich in detail, require powerful segmentation techniques that can capture both fine patterns and overall context. Wang et al.'s "Ultra-High-Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark" (2023) addresses this challenge by introducing the URUR dataset and an accompanying benchmark model. The paper thoroughly examines the existing literature on UHR segmentation, highlights its limitations, and charts the path to a new benchmark. Several problems are commonly encountered when existing methods are applied to UHR images. Some of the main challenges include:

Processing large-scale UHR images requires significant computational resources; standard procedures may be slow or require extensive hardware. Downsampling or tiling can lead to incorrect partitioning of large or extended objects, which matters for accurate segmentation of complex scenes containing many objects and overlapping patterns. Below is a brief summary of some important methods:

One common approach is to downsample the UHR image to a lower resolution before applying the segmentation method. Although computationally efficient, subsampling may cause information loss and affect segmentation accuracy. Another approach splits the image into patches and segments each patch separately; although this reduces the computational burden, the assembly of segmented patches may be inconsistent at patch borders. A third, two-stage approach performs segmentation at a lower resolution first, and a second stage then refines the results at a higher resolution.

Large, high-quality UHR segmentation datasets are rare. This hinders the development and evaluation of robust segmentation models, and the available data often lack the diversity and complexity of real UHR images.

URUR contains a large number (3,008) of UHR images covering a wide range of complex scenes from 63 different cities. This diversity allows the development of models that generalize to real-world situations, and the dataset's "ultra-rich context", with dense annotations across a very large number of pixels, ensures that trained models capture high-quality content and object relationships.

Researchers can develop new deep learning models specifically designed to process UHR images. The goal of such architectures should be to increase computational efficiency while preserving long-range context and integrating information across scales, thereby improving the performance of UHR segmentation tasks. UHR segmentation remains a difficult but important area of computer vision.
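The review above contrasts downsampling, patch-wise, and two-stage inference. As a concrete illustration of the patch-wise idea, here is a minimal sliding-window sketch of our own (not code from Wang et al.); the tile and overlap sizes are illustrative, and the model is assumed to map an image tensor to per-pixel class logits.

```python
import torch

def tiled_inference(model, image, num_classes, tile=1024, overlap=128):
    """Segment a UHR image tile by tile, averaging logits where tiles
    overlap to soften seams. `image` is a (C, H, W) float tensor and
    `model` maps a (1, C, h, w) batch to (1, num_classes, h, w) logits."""
    _, H, W = image.shape
    logits = torch.zeros(num_classes, H, W)
    counts = torch.zeros(1, H, W)
    stride = tile - overlap
    for top in range(0, max(H - overlap, 1), stride):
        for left in range(0, max(W - overlap, 1), stride):
            bottom, right = min(top + tile, H), min(left + tile, W)
            patch = image[:, top:bottom, left:right].unsqueeze(0)
            with torch.no_grad():
                out = model(patch)[0]                 # (num_classes, h, w)
            logits[:, top:bottom, left:right] += out
            counts[:, top:bottom, left:right] += 1
    return (logits / counts).argmax(dim=0)            # (H, W) label map
```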
IV. METHODOLOGY

Figure 1: Flowchart for proposed research methodology

Model Definition:

Input Dataset:

The dataset utilized in this research is the Inria Aerial Image Dataset. It was obtained from Kaggle, where it was uploaded by Sagar Rathod. The dataset contains remote sensing images of urban areas, with 180 files for training and 180 files for testing. First, we need to load the dataset into Google Colab so it can be used.

Data Preprocessing:

The initial step involves preparing the training and testing/validation datasets. This is done by resizing the images, rescaling the pixel values, and batching the data. These steps are crucial for efficient processing during training. We can also perform normalization, augmentation, noise reduction, and blur removal. A sketch of this step follows.
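As a rough illustration of this step, the sketch below resizes, rescales, normalizes, and batches the aerial images. The directory name, file pattern, and target size are our assumptions for illustration, not values fixed by the paper.

```python
import torch
from pathlib import Path
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms

class InriaImages(Dataset):
    """Loads aerial tiles and applies resize/rescale preprocessing."""
    def __init__(self, root="inria/train/images"):    # hypothetical path
        self.paths = sorted(Path(root).glob("*.tif"))
        self.tf = transforms.Compose([
            transforms.Resize((512, 512)),             # illustrative target size
            transforms.ToTensor(),                     # rescales pixels to [0, 1]
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225]),
        ])

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, i):
        return self.tf(Image.open(self.paths[i]).convert("RGB"))

loader = DataLoader(InriaImages(), batch_size=4, shuffle=True)   # batching
```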
Model Training:

The next step is to train the Mask2Former model on the dataset. This involves importing the Mask2Former model and training it on the images; by doing so, the model learns from the data and improves its performance. Feature extraction and segmentation are carried out by the model during this step. A hedged sketch of the fine-tuning loop follows.
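The sketch below shows one way such a fine-tuning loop can look with the Hugging Face Mask2Former classes. The checkpoint name is illustrative, and we assume a labelled variant of the earlier dataset sketch whose loader also yields masks and class ids; the paper does not specify these details.

```python
import torch
from transformers import Mask2FormerForUniversalSegmentation

# Illustrative checkpoint; any Mask2Former checkpoint could be fine-tuned.
model = Mask2FormerForUniversalSegmentation.from_pretrained(
    "facebook/mask2former-swin-small-coco-instance")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for images, masks, labels in loader:   # assumed labelled loader (see text)
    # mask_labels: per-image binary building masks; class_labels: their ids.
    outputs = model(pixel_values=images,
                    mask_labels=[m.float() for m in masks],
                    class_labels=list(labels))
    outputs.loss.backward()            # Hungarian-matched mask and class losses
    optimizer.step()
    optimizer.zero_grad()
```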
Performance Validation:

After loading the dataset, preprocessing the data, and training the model, we need to check how well the model has been trained and how well it is performing. Performance validation on our dataset involves several main steps and metrics to ensure the accuracy and precision of classifying buildings from the aerial image datasets. Hyperparameter tuning, fine-tuning, and validation are among the techniques used to evaluate the model's performance.
Automatic Building Extraction:

Once the training and validation data have been prepared as described above and the model has been trained and optimized, the model is used to perform automatic building extraction on new images. This involves inference followed by post-processing.

Output:

Finally, the output is a set of images with accurately segmented buildings. These outputs are essential for applications such as urban planning, disaster management, and environmental monitoring. The extracted footprints can be used to analyze urban growth, assess damage after natural disasters, and plan new infrastructure developments.

V. MODEL TRAINING

In recent years, the field of segmentation has made significant progress with the help of deep learning. These architectures are designed to identify individual objects in an image by assigning a pixel-level mask to each instance. One such model is Mask2Former, introduced by Facebook, which uses the advantages of Transformers to achieve the best performance on segmentation tasks. This section explores its background, advantages, and the libraries needed to run it.

Groundbreaking models such as Mask R-CNN have been successfully implemented using a two-stage approach:
Region Proposal Network (RPN): This subnetwork identifies candidate objects hidden in the image and localizes each one by creating a surrounding bounding box; a mask head then predicts a pixel-level mask within each box.

The Transformer, originally developed for natural language processing (NLP), has since revolutionized many computer vision applications. This architecture is good at capturing relationships between objects in the data. Unlike CNNs that rely on local convolutions, Transformers use self-attention, which allows them to attend to every part of the input data at once. These properties make them ideal for tasks that require global understanding, such as segmentation.
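To make the self-attention idea concrete, here is a minimal scaled dot-product attention sketch. It is a generic illustration, not Mask2Former's actual implementation.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """x: (N, d) sequence of feature vectors (e.g. flattened image patches).
    Every output vector is a weighted mix of ALL inputs, so each position
    draws on global context instead of a local neighbourhood."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v          # queries, keys, values
    scores = q @ k.T / (k.shape[-1] ** 0.5)       # (N, N) pairwise similarities
    return F.softmax(scores, dim=-1) @ v          # attention-weighted values

x = torch.randn(16, 64)                           # 16 tokens, 64-dim features
w_q, w_k, w_v = (torch.randn(64, 64) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)            # (16, 64) globally mixed output
```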
Types of Segmentation:

1. Instance Segmentation: Identifying each individual object

Consider an image containing several objects. Instance segmentation aims to identify and delineate each object in the image. It not only detects the presence of an object, but also produces a separate mask for each instance, and this mask traces the precise boundaries of the object. Such networks extract feature maps from images to capture low-level visual information such as edges and texture; a detection phase then identifies potential objects in the image and creates a bounding box around them, and masks can be produced in several ways, such as running a mask prediction head on each detected region. A hedged example follows.
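As an illustration of instance segmentation with the Detectron2 library introduced later in this section, a pretrained Mask R-CNN from the stock model zoo can be run as follows; the input filename is hypothetical.

```python
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5      # confidence cutoff

predictor = DefaultPredictor(cfg)
image = cv2.imread("aerial_tile.png")             # hypothetical input image
instances = predictor(image)["instances"]
print(instances.pred_masks.shape)                 # one binary mask per detection
```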
2. Semantic Segmentation: Understanding the Big Picture

While instance segmentation focuses on individual objects, semantic segmentation takes a broader approach. Its purpose is to classify each pixel in an image according to its semantic group. It essentially partitions the whole image, telling you whether a pixel belongs to "person," "car," and so on. As with instance segmentation, semantic segmentation models often use deep learning to extract features from images, and the class probability of each pixel is usually estimated in the last layer of the model. A typical application is labelling images of various land cover types to aid urban planning and resource management.
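The per-pixel classification just described reduces to an argmax over class logits; a minimal sketch:

```python
import torch

num_classes, H, W = 3, 4, 4               # tiny illustrative sizes
logits = torch.randn(num_classes, H, W)   # last-layer class scores per pixel
probs = logits.softmax(dim=0)             # per-pixel class probabilities
label_map = probs.argmax(dim=0)           # (H, W) map, one class id per pixel
print(label_map)
```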
3. Panoptic Segmentation: Combining the best of both worlds

Panoptic segmentation aims to provide a fuller understanding of images by combining the advantages of instance segmentation and semantic segmentation. It identifies and segments each individual object while also segmenting the background, predicting a class for every remaining background pixel. Essentially, it provides a complete pixel-level understanding of the scene by distinguishing between individual objects and the background area.

Mostly, in this work we use panoptic segmentation, because it combines the features of instance segmentation and semantic segmentation, which gives us a chance to obtain good results.
To install the Mask2Former model we must install the torch library and the detectron2 library.

PyTorch is an open-source Machine Learning (ML) framework in Python which helps us create deep neural networks. It is widely preferred for deep learning research, and the framework speeds up the path between research prototyping and deployment.

Detectron2 is Facebook's newer library that allows us to use and create object detection, segmentation, and edge detection models. This library includes all the models that were available in Detectron, such as R-CNN and Mask R-CNN, as well as some newer models including Cascade R-CNN and TensorMask. Detectron2 is mostly used for keypoint detection, object detection, and semantic segmentation.

As we discussed, Detectron2 and Mask2Former were both created by Facebook, so Detectron2 is the most crucial library for installing Mask2Former, as it provides the components needed to perform object detection.

After installing PyTorch and Detectron2, we need to install the Transformers library from https://github.com/huggingface/transformers.git. Transformers is a powerful and versatile open-source library created and maintained by Hugging Face and the community. It is built on PyTorch and TensorFlow, and it provides thousands of pretrained models for tasks such as Natural Language Processing (NLP). We use the Transformers library with PyTorch support to obtain pretrained models.

After that we must define the COCO panoptic palette, which comes from the annotated COCO images and includes the 80 "thing" categories from the detection task.

After installing all the libraries mentioned above, we import them and start working with the model.
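A hedged sketch of this setup, with the install commands shown as comments (the paper does not pin versions, and the checkpoint name is illustrative):

```python
# pip install torch torchvision
# pip install git+https://github.com/facebookresearch/detectron2.git
# pip install git+https://github.com/huggingface/transformers.git

from detectron2.data import MetadataCatalog
from transformers import (Mask2FormerForUniversalSegmentation,
                          Mask2FormerImageProcessor)

# COCO panoptic metadata provides the category names and palette colours.
metadata = MetadataCatalog.get("coco_2017_val_panoptic")

ckpt = "facebook/mask2former-swin-small-coco-panoptic"   # illustrative
processor = Mask2FormerImageProcessor.from_pretrained(ckpt)
model = Mask2FormerForUniversalSegmentation.from_pretrained(ckpt).eval()
```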
After that, we define the loading of the model and processor, which loads the pretrained model so we can use it. Then we define the default checkpoint (ckpt); in this class we check whether a semantic task or a panoptic task is being performed, and we import the corresponding pretrained semantic or panoptic segmentation checkpoint. Once we determine whether the task is semantic or panoptic, we define two classes, one for semantic segmentation and another for panoptic segmentation.

If the task is a panoptic task, then we obtain the panoptic COCO metadata from the Metadata catalog and label each category that is going to be segmented, for a better understanding of the type of each object. Then we use the visualizer module to visualize the image and draw the panoptic segmentation: we generate maps for the predicted segments, assign a label to each, and return the result.

The same goes for semantic segmentation: if the task is a semantic task, we create the semantic COCO palette from the metadata, generate the maps, segment the data, label each predicted segment type, assign a colour to it, and return the image output. A sketch of both branches follows.
Then we install and use the Gradio library. Gradio is an open-source Python package which allows you to build a demo or web application for your machine learning model, API, or any arbitrary Python function instantly. We can share the demo through a link, because of Gradio's built-in sharing feature, with links that last for 48 to 72 hours. Using Gradio, we build a demo application where we upload an image and perform the segmentation.
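A minimal Gradio demo wrapping the segment function sketched above might look like this; the title and the rendering of the id map as a grayscale image are our choices, not the paper's.

```python
import gradio as gr

demo = gr.Interface(
    fn=lambda img: segment(img, task="panoptic").numpy().astype("uint8"),
    inputs=gr.Image(type="pil", label="Aerial image"),
    outputs=gr.Image(label="Segment id map"),
    title="Building extraction with Mask2Former",   # illustrative title
)
demo.launch(share=True)   # share=True creates a temporary public link
```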

So, for the whole dataset we use 180 images to train and 180 images to test. The dataset is taken from Kaggle and is the Inria Aerial Image Dataset. It contains satellite images of Bellingham, Chicago, and Austin in the United States of America (USA), along with Tyrol and Vienna in Austria. The dataset's imagery has 35.29% representation of China, 11.76% of the USA, 5.88% of France, and 4% of Spain, with the remaining data coming from various parts of the globe.

Some of the images of the dataset are shown in Figure 2.

Figure 2: Some Images from Austin City, USA.

VI. EVALUATING THE TRAINED MODEL

Model assessment is a crucial step in determining the effectiveness and dependability of a machine learning model. In the realm of automatic building extraction through deep learning, model assessment encompasses several essential steps:

1. Data Segmentation: The dataset is commonly split into training, validation, and testing datasets. The training dataset is utilized for training the model, the validation dataset helps with hyperparameter tuning and performance assessment during the training phase, and the testing dataset is employed to evaluate the performance of the trained model.

2. Selection of Assessment Metrics: Depending on the problem's nature and output type, suitable assessment metrics are selected. For this type of task we track, across training epochs, the accuracy, precision, recall, and F1-score of the output.

3. Calculation of Assessment Metrics: Following model training, the model is assessed on the testing dataset using the chosen assessment metrics. These metrics provide insight into various facets of the model's performance, such as its capability to accurately predict the buildings and other parts of the data in the dataset, to minimize errors in segmenting building outlines, and to strike a balanced trade-off between precision and recall (a sketch of these computations follows this list).

4. Visualization of Results: Constantly checking across epochs tells us where we can still improve the model training, and through that we can improve precision and accuracy; visualizing the model's performance curves shows how well we are able to segment the buildings from the dataset. These visual aids help in comprehending the distribution of correct and incorrect predictions and in evaluating the model's trade-offs between different performance metrics.

5. Analysis of Results: The assessment outcomes are analysed to gauge the overall efficiency of the model in segmenting the buildings from the dataset. This entails scrutinizing the performance metrics, pointing out potential areas for enhancement, and grasping the implications of the model's performance for real-world applications.

6. Continuous Improvement: Based on the assessment findings, the model may undergo refinement through iterative processes like hyperparameter tuning and error checking.
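The metrics above can be computed from pixel-level counts; the following is a small sketch of those formulas for a binary building-vs-background mask, not the paper's exact evaluation code.

```python
import numpy as np

def pixel_metrics(pred: np.ndarray, truth: np.ndarray) -> dict:
    """Compute accuracy, precision, recall, F1, and IoU from two (H, W)
    binary masks, where 1 marks building pixels."""
    tp = np.logical_and(pred == 1, truth == 1).sum()
    fp = np.logical_and(pred == 1, truth == 0).sum()
    fn = np.logical_and(pred == 0, truth == 1).sum()
    tn = np.logical_and(pred == 0, truth == 0).sum()
    eps = 1e-9                                  # guards against division by zero
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall + eps),
        "iou": tp / (tp + fp + fn + eps),
    }
```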
Figure 3: Resultant graphs from the process of improving our model performance.

Figure 4: Hyperparameter comparison graphs, showing how well the model is performing.

Figure 5: Result of segmentation of one of the images from the dataset.

Figure 6: F1 Score Curve for our model.
VII. CONCLUSION

This study investigates the effectiveness of Mask2Former, a Transformer-based deep learning model, in extracting buildings from aerial images. Our results demonstrate Mask2Former's ability to achieve good accuracy on this task. Its transformer architecture excels at capturing long-range features in an image, which is important for segmenting complex buildings, especially across large areas, and its efficiency makes it suitable for use in resource-constrained environments. Application areas include urban planning, disaster management, and resource allocation.

Further research could explore ways to improve the generality of the Mask2Former model. This would include training on more varied material, covering different image resolutions, environments, and building types. The model's ability to capture long-range dependencies and its strong performance make it useful for many applications.

Figure 7: Labels Correlogram for our model.
Figure 6: Normalized Confusion Matrix of our Model.

Figure 8: R Curve for our model.

Remaining graphs:

Figure 9: Confusion Matrix for our model.

Figure 10: P Curve for our model.

Figure 11: Labels Graphs for our model.

VIII. REFERENCES

[1] J. Li, W. He, W. Cao, L. Zhang, and H. Zhang, "UANet: An Uncertainty-Aware Network for Building Extraction from Remote Sensing Images," IEEE Transactions on Geoscience and Remote Sensing, 2024, doi: 10.1109/TGRS.2024.3361211.

[2] Y. Wang, X. Li, and Z. Chen, "Ultra-High-Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 23621-23630, doi: 10.1109/CVPR52729.2023.02262.

[3] A. Sharma, R. Verma, and S. Kumar, "Automatic Building Footprint Extraction using Deep Learning," in Proceedings of the 2023 International Conference on Computational Intelligence, Communication Technology and Networking (CICTN), 20-21 April 2023, pp. 123-130, doi: 10.1109/CICTN57981.2023.10140818.

[4] J. Smith, M. Johnson, and P. Wang, "Extraction of Dense Urban Buildings from Photogrammetric and LiDAR Point Clouds," IEEE Access, vol. 9, pp. 111823-111832, Aug. 2021, doi: 10.1109/ACCESS.2021.3102632.

[5] Y. Li, H. Chen, and Z. Liu, "Remote Sensing Urban Green Space Layout and Site Selection Based on Lightweight Expansion Convolutional Method," IEEE Access, vol. 11, pp. 99889-99900, Sep. 2023, doi: 10.1109/ACCESS.2023.3314819.

