
Land Cover Image Classification

Published in CITCA 2023

Antonio Rangel
Instituto Tecnológico de Mazatlán (ITM)
[email protected]

Juan R. Terven
Instituto Politécnico Nacional, CICATA-Qro
[email protected]

Diana M. Cordova-Esparza
Universidad Autónoma de Querétaro, Facultad de Informática
[email protected]

E.A. Chávez-Urbiola
Instituto Politécnico Nacional, CICATA-Qro
[email protected]

January 19, 2024

Abstract
Land Cover (LC) image classification has become increasingly significant in understanding environ-
mental changes, urban planning, and disaster management. However, traditional LC methods are
often labor-intensive and prone to human error. This paper explores state-of-the-art deep learning
models for enhanced accuracy and efficiency in LC analysis. We compare convolutional neural
networks (CNN) against transformer-based methods, showcasing their applications and advantages in
LC studies. We used EuroSAT, a patch-based LC classification dataset based on Sentinel-2 satellite images, and achieved state-of-the-art results using current transformer models.

Keywords remote sensing · land cover · LULC classification

1 Introduction

Land Use Land Cover (LULC) is a multidisciplinary field that categorizes and characterizes the earth’s terrestrial
surface. It encompasses various types of ground, from natural landscapes such as forests, wetlands, and deserts to
human-altered environments such as agricultural fields, urban areas, and industrial sites. LULC studies provide a
snapshot of the earth’s surface at a given time, offering valuable insights into the spatial distribution and interaction
of various land use types and land cover classes. The dynamic nature of LULC, driven by both natural processes and
human activities, necessitates continuous monitoring and analysis to capture temporal changes.
The importance of LULC studies extends to numerous fields. In environmental science, LULC data inform our understanding of biodiversity, ecosystem services, and the impacts of climate change. In urban planning and development, they help manage land resources, assess environmental impacts, and guide sustainable practices. In agriculture, LULC helps optimize land use for crop production while minimizing environmental degradation. In addition, LULC data are integral to policy-making, supporting decisions on land conservation, urban growth, and climate change mitigation.
Changes in Land Use Land Cover (LULC) offer insights into the dynamics of natural and human-altered landscapes,
with applications spanning multiple disciplines. For example, LULC change data can elucidate the impacts of human
activities on ecosystems, such as tracking deforestation rates to study biodiversity loss or carbon sequestration changes.
In climate change studies, LULC changes help reveal how land cover influences local and global climates and inform climate modeling and future-scenario prediction. LULC change data can also reveal the growth patterns of cities, aiding decisions
about infrastructure development, zoning, and resource allocation. Monitoring LULC changes can optimize land use for
crop production, identify shifts in agricultural practices, and assess environmental impacts. LULC change data also aids
in disaster management, where changes in land cover, like deforestation, can increase vulnerability to natural disasters
such as floods and landslides. Also, LULC changes inform policy decisions related to land conservation, urban growth, climate change mitigation, and sustainable development.

Figure 1: Sample images from the EuroSAT dataset showing the ten categories.
Traditionally, remote sensing for LULC mapping involved manual or semi-automated image classification processes.
Although adequate in their time, these techniques were time-consuming, meticulous tasks that required human intervention and expert knowledge. Manual interpretation was prone to errors and discrepancies and lacked the scalability to cover large geographical areas.
AI, specifically machine learning (ML) algorithms, has significantly improved the analysis of remote sensing data in LULC studies. These algorithms learn from vast amounts of data, building models that can forecast future scenarios
or identify patterns with exceptional accuracy. Recent progress in computer vision, a subfield of AI, has encouraged
the development of sophisticated algorithms to teach computers how to see and analyze large amounts of image data,
including satellite or aerial imagery used in remote sensing.
Classic machine learning methods for modeling LULC include k-nearest neighbors [1], [2]; random forests [1], [3], [4], [5], [6], [7], [8], [9]; and support vector machines (SVM) [1], [2], [3], [4], [7], [8], [10], [11], [12]. Other approaches include naive Bayes [4]; decision trees [1], [2], [4], [12]; classic neural networks [1], [4], [7], [8], [10], [11]; maximum likelihood classifiers [10]; bagging [5]; and boosting [5].
More recently, deep learning models have taken the stage and dominated remote sensing, with models based on convolutional neural networks (CNNs) [13], [14], [15], [16]. Autoencoders have been used to learn representations of the data [17], [18], [19], along with stacked autoencoders [20], [21], 3D convolutional autoencoders [22], multitask deep learning [23], generative models [24], and Transformers [25], [26], [27], [28], [29], [30].
This paper compares classification methods for land cover. We compare standard convolutional neural network methods as well as state-of-the-art Transformer-based methods. We share all the code used in this project at github.com/jrterven/eurosat_classification.

2 The EuroSAT dataset


Land Use Land Cover (LULC) categorizes and characterizes the earth’s terrestrial surface. Land use describes the human
activities directly related to the land, indicating how people utilize land resources, such as residential, agricultural, or
commercial. Land cover, on the other hand, refers to the physical material at the surface of the earth, including grass,
asphalt, trees, bare ground, water, etc.
The EuroSAT dataset [31] contains ten classes with 27000 labeled and geo-referenced images taken from Sentinel-2
satellite images. The images are 64 × 64 pixels and cover cities listed in the European Urban Atlas. The covered cities are distributed over 34 European countries: Austria, Belarus, Belgium, Bulgaria, Cyprus, Czech Republic (Czechia), Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Macedonia, Malta, Republic of Moldova, Netherlands, Norway, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, Switzerland, Ukraine, and the United Kingdom. Figure 1 shows one random sample image for each of the ten classes.
The dataset provides splits for training, validation, and testing with 18900, 5400, and 2700 images, respectively. Figure 2 shows the distribution of images per class for the training set.

Figure 2: Image distribution in the training set.
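For reference, recent versions of torchvision (0.13 or later) ship a built-in EuroSAT loader. The following is a minimal sketch of loading the data and reproducing an 18900/5400/2700 split of the 27000 images; the random seed and resize value are illustrative assumptions, not necessarily the exact settings used in our experiments.

```python
# Illustrative sketch: load EuroSAT with torchvision's built-in
# dataset class (torchvision >= 0.13) and create 18900/5400/2700
# train/val/test subsets. Seed and resize are assumptions.
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(224),  # upsample the 64 x 64 patches for ImageNet-style models
    transforms.ToTensor(),
])

dataset = datasets.EuroSAT(root="data", transform=transform, download=True)

generator = torch.Generator().manual_seed(42)  # reproducible split
train_set, val_set, test_set = torch.utils.data.random_split(
    dataset, [18900, 5400, 2700], generator=generator)
```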

3 Methods

3.1 Data Acquisition

We obtained the dataset from the original source [32], ensuring authenticity and data integrity. This dataset provides a
collection of ten categories, mostly balanced (see Figure 2), necessary for training and evaluating our classification
models.

3.2 Data Preprocessing and Training

We developed custom code for efficient data loading, preprocessing, and augmentation. The preprocessing steps included normalization and resizing of the images. We trained ten models: seven convolutional architectures and three transformer-based architectures. The convolutional models were AlexNet by [33], ResNet50 by [34], ResNeXt by [35], DenseNet by [36], MobileNetV3 by [37], EfficientNetV2 by [38], and ConvNeXt by [39]. The transformer-based models were ViT by [40], Swin Transformer by [41], and MaxViT by [42].
We trained all models using early stopping with a patience of ten epochs and kept the model with the best validation accuracy. We used the categorical cross-entropy loss and the Adam optimizer by [43] with a learning rate of 1 × 10⁻⁴.
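A minimal sketch of this training configuration is shown below, assuming PyTorch; the `model`, data loaders, and `evaluate` helper are placeholders for illustration rather than our released code.

```python
# Sketch of the training setup: categorical cross-entropy, Adam
# (lr = 1e-4), and early stopping with a patience of ten epochs,
# keeping the weights with the best validation accuracy.
# `model`, `train_loader`, `val_loader`, and `evaluate` are
# assumed to be defined elsewhere.
import copy
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

best_acc, best_weights = 0.0, None
patience, epochs_no_improve, max_epochs = 10, 0, 100

for epoch in range(max_epochs):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

    val_acc = evaluate(model, val_loader)  # hypothetical validation-accuracy helper
    if val_acc > best_acc:
        best_acc, epochs_no_improve = val_acc, 0
        best_weights = copy.deepcopy(model.state_dict())
    else:
        epochs_no_improve += 1
        if epochs_no_improve >= patience:  # stop after ten epochs without improvement
            break

model.load_state_dict(best_weights)  # restore the best checkpoint
```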

3.2.1 Training from Scratch.

Initially, we trained each of the ten models from scratch. This approach involved initializing the weights with the Kaiming initialization proposed by [44] and then training all layers on the RGB images of the EuroSAT dataset. The purpose of this step was to evaluate the learning capacity of each architecture without prior knowledge.
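In PyTorch, this initialization can be applied as in the sketch below; the fan mode and nonlinearity shown are common defaults and should be read as assumptions rather than our exact settings.

```python
# Sketch: apply Kaiming (He) initialization [44] to all conv and
# linear layers before training from scratch. `model` stands for
# any of the architectures above; fan mode and nonlinearity
# follow common practice and are assumptions here.
import torch.nn as nn

def init_weights(module):
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_normal_(module.weight, mode="fan_out", nonlinearity="relu")
        if module.bias is not None:
            nn.init.zeros_(module.bias)

model.apply(init_weights)  # recursively visits every submodule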


3.2.2 Training with Pre-trained Weights.


Subsequently, we employed transfer learning by training the same models using pre-trained weights. These weights
were obtained from models pre-trained on ImageNet.
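With torchvision this amounts to loading a pre-trained weight enum and replacing the classification head with a ten-class layer, as in the sketch below (ResNet50 shown for illustration; the head attribute name varies by architecture).

```python
# Sketch: transfer learning with ImageNet weights (torchvision
# >= 0.13 weight enums), shown for ResNet50. Other architectures
# expose the head under different names (e.g., `classifier`, `heads`).
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 10)  # new head for the ten EuroSAT classes
# All layers remain trainable, so retraining updates the whole network.
```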

3.3 Model Evaluation

We evaluated each model on the test set, which was not used during the training phase. As evaluation metrics, we computed the Top-1 accuracy and the precision/recall curves.
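A sketch of how these metrics can be computed is given below, using scikit-learn's one-vs-rest precision/recall curves; `model` and `test_loader` are assumed to exist and are illustrative.

```python
# Sketch: Top-1 accuracy and per-class (one-vs-rest)
# precision/recall curves on the held-out test set.
import torch
from sklearn.metrics import precision_recall_curve

model.eval()
scores_list, labels_list = [], []
with torch.no_grad():
    for images, labels in test_loader:
        scores_list.append(torch.softmax(model(images), dim=1))
        labels_list.append(labels)

scores = torch.cat(scores_list)
labels = torch.cat(labels_list)

# Top-1 accuracy: fraction of samples whose highest-scoring class is correct
top1 = (scores.argmax(dim=1) == labels).float().mean().item()

curves = [  # (precision, recall, thresholds) per class
    precision_recall_curve((labels == c).numpy(), scores[:, c].numpy())
    for c in range(scores.shape[1])
]
```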
For our experiments, we used PyTorch and multiple computers: a Colab machine with an NVIDIA V100 GPU, as well as two personal computers, one with a TITAN X GPU and another equipped with an RTX 4090 GPU.

4 Results

In this section, we present the results of the classification task using ten influential deep learning architectures, trained both from scratch and from pre-trained parameters. Table 1 shows the Top-1 accuracy on the test set.

Table 1: Classification accuracy comparison of ten deep learning models. The "Accuracy (scratch)" column shows the results of the models trained from scratch, while the "Accuracy (pre-trained)" column shows the results of retraining with pre-trained weights. The first seven are convolutional-based models, while the last three are Transformer-based models.

Model name       Accuracy (scratch)   Accuracy (pre-trained)
AlexNet          0.837                0.916
ResNet           0.835                0.947
ResNeXt          0.843                0.981
DenseNet         0.925                0.949
MobileNetV3      0.877                0.958
EfficientNetV2   0.908                0.973
ConvNeXt         0.783                0.986
ViT32            0.917                0.972
SwinB            0.926                0.987
MaxViT           0.973                0.990

As shown in the table, the model with the highest scores, MaxViT, is also the most recent architecture we tried, achieving a state-of-the-art 99% accuracy on this dataset.
Figures 3a and 3b show the validation accuracy curves obtained during training. Note that the models trained with random weights (Figure 3a) took significantly longer to train and reached lower accuracy than the models trained with pre-trained weights (Figure 3b). We used the Weights & Biases platform [45] to log the experiments and generate these graphs.
Figures 4a and 4b show the precision/recall curves for the ten models. The curves show that the models trained from scratch (Figure 4a) present more variability in their results, with ConvNeXt at the lower end and MaxViT at the higher end. For the pre-trained models, the curves are closer to each other, with AlexNet as the lowest-performing model and MaxViT at the higher end. Curiously, ConvNeXt was the lowest-performing model when trained from scratch but was on par with the best models when trained with pre-trained weights. This may indicate that transfer learning from pre-trained weights helps compensate for the relatively small size of this dataset for such a large model.

5 Conclusion

Land cover image classification is a crucial task in understanding environmental changes, urban planning, and disaster
management. In this paper, we explored using state-of-the-art deep learning models for enhanced accuracy and efficiency
in LC analysis. We compared convolutional neural networks (CNN) against transformer-based methods, illuminating
their applications and advantages in LC studies. Our results showed that both CNN and transformer-based methods
achieved high accuracy in LC classification, with transformer-based models outperforming CNNs in some cases.


Figure 3: Validation accuracy curves for the ten models. (a) Trained from scratch and (b) using pre-trained weights on ImageNet. Best viewed in color.

Figure 4: Precision/recall curves for the ten models. (a) Trained from scratch and (b) using pre-trained weights on ImageNet. Best viewed in color.

Using deep learning models in land cover classification offers a powerful avenue for research and applications. With
the availability of large-scale remote sensing datasets and the rapid development of deep learning techniques, we can
expect continued advancement in this field. As we move towards more automated and efficient methods for land cover
classification, we can unlock the full potential of this data in addressing critical environmental and societal challenges.

6 Acknowledgements
We thank the Instituto Politécnico Nacional, through the Secretaría de Investigación y Posgrado (SIP) project number 20232290, and the Consejo Nacional de Humanidades, Ciencias y Tecnologías (CONAHCYT) for its support through the Sistema Nacional de Investigadoras e Investigadores (SNII).

References
[1] P. Thanh Noi and M. Kappas, “Comparison of random forest, k-nearest neighbor, and support vector machine classifiers for land cover classification using Sentinel-2 imagery,” Sensors, vol. 18, no. 1, p. 18, 2017.
[2] Y. Qian, W. Zhou, J. Yan, W. Li, and L. Han, “Comparing machine learning classifiers for object-based land cover
classification using very high resolution imagery,” Remote Sensing, vol. 7, no. 1, pp. 153–168, 2014.
[3] L. Ma, M. Li, X. Ma, L. Cheng, P. Du, and Y. Liu, “A review of supervised object-based land-cover image
classification,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 130, pp. 277–293, 2017.
[4] F. F. Camargo, E. E. Sano, C. M. Almeida, J. C. Mura, and T. Almeida, “A comparative assessment of machine-learning techniques for land use and land cover classification of the Brazilian tropical savanna using ALOS-2/PALSAR-2 polarimetric images,” Remote Sensing, vol. 11, no. 13, p. 1600, 2019.


[5] B. Ghimire, J. Rogan, V. R. Galiano, P. Panday, and N. Neeti, “An evaluation of bagging, boosting, and random forests for land-cover classification in Cape Cod, Massachusetts, USA,” GIScience & Remote Sensing, vol. 49, no. 5, pp. 623–643, 2012.
[6] E. Adam, O. Mutanga, J. Odindi, and E. M. Abdel-Rahman, “Land-use/cover classification in a heterogeneous coastal landscape using RapidEye imagery: evaluating the performance of random forest and support vector machines classifiers,” International Journal of Remote Sensing, vol. 35, no. 10, pp. 3440–3458, 2014.
[7] T. Shiraishi, T. Motohka, R. B. Thapa, M. Watanabe, and M. Shimada, “Comparative assessment of supervised classifiers for land use–land cover classification in a tropical region using time-series PALSAR mosaic data,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 7, no. 4, pp. 1186–1199, 2014.
[8] E. Raczko and B. Zagajewski, “Comparison of support vector machine, random forest and neural network classifiers for tree species classification on airborne hyperspectral APEX images,” European Journal of Remote Sensing, vol. 50, no. 1, pp. 144–154, 2017.
[9] R.-Y. Lee, D.-Y. Ou, Y.-S. Shiu, and T.-C. Lei, “Comparisons of using random forest and maximum likelihood classifiers with WorldView-2 imagery for classifying crop types,” in Proceedings of the 36th Asian Conference on Remote Sensing (ACRS), 2015.
[10] P. K. Srivastava, D. Han, M. A. Rico-Ramirez, M. Bray, and T. Islam, “Selection of classification techniques for
land use/land cover change investigation,” Advances in Space Research, vol. 50, no. 9, pp. 1250–1265, 2012.
[11] S. Pal and S. Ziaul, “Detection of land use and land cover change and land surface temperature in English Bazar urban centre,” The Egyptian Journal of Remote Sensing and Space Science, vol. 20, no. 1, pp. 125–145, 2017.
[12] J. R. Otukei and T. Blaschke, “Land cover change assessment using decision trees, support vector machines
and maximum likelihood classification algorithms,” International Journal of Applied Earth Observation and
Geoinformation, vol. 12, pp. S27–S31, 2010.
[13] D. Marcos, M. Volpi, B. Kellenberger, and D. Tuia, “Land cover mapping at very high resolution with rotation equivariant CNNs: Towards small yet accurate models,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 145, pp. 96–107, 2018.
[14] B. Huang, B. Zhao, and Y. Song, “Urban land-use mapping using a deep convolutional neural network with high
spatial resolution multispectral remote sensing imagery,” Remote Sensing of Environment, vol. 214, pp. 73–86,
2018.
[15] M. Rezaee, M. Mahdianpari, Y. Zhang, and B. Salehi, “Deep convolutional neural network for complex wetland classification using optical remote sensing imagery,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 9, pp. 3030–3039, 2018.
[16] Y. Shi, D. Ma, J. Lv, and J. Li, “ACTL: Asymmetric convolutional transfer learning for tree species identification based on deep neural network,” IEEE Access, vol. 9, pp. 13643–13654, 2021.
[17] W. Zhou, Z. Shao, C. Diao, and Q. Cheng, “High-resolution remote-sensing imagery retrieval using sparse features by auto-encoder,” Remote Sensing Letters, vol. 6, no. 10, pp. 775–783, 2015.
[18] A. Azarang, H. E. Manoochehri, and N. Kehtarnavaz, “Convolutional autoencoder-based multispectral image fusion,” IEEE Access, vol. 7, pp. 35673–35683, 2019.
[19] M. Hu, C. Wu, L. Zhang, and B. Du, “Hyperspectral anomaly change detection based on autoencoder,” IEEE
Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 14, pp. 3750–3762, 2021.
[20] C. Zhao, X. Wan, G. Zhao, B. Cui, W. Liu, and B. Qi, “Spectral-spatial classification of hyperspectral imagery based on stacked sparse autoencoder and random forest,” European Journal of Remote Sensing, vol. 50, no. 1, pp. 47–63, 2017.
[21] X. Sun, F. Zhou, J. Dong, F. Gao, Q. Mu, and X. Wang, “Encoding spectral and spatial context information for
hyperspectral image classification,” IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 12, pp. 2250–2254,
2017.
[22] S. Mei, J. Ji, Y. Geng, Z. Zhang, X. Li, and Q. Du, “Unsupervised spatial–spectral feature learning by 3D convolutional autoencoder for hyperspectral classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 9, pp. 6808–6820, 2019.
[23] S. Liu, Q. Shi, and L. Zhang, “Few-shot hyperspectral image classification with unknown classes using multitask
deep learning,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 6, pp. 5085–5102, 2020.
[24] D. Hong, J. Yao, D. Meng, Z. Xu, and J. Chanussot, “Multimodal GANs: Toward crossmodal hyperspectral–multispectral image segmentation,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 6, pp. 5103–5113, 2020.


[25] D. Hong, Z. Han, J. Yao, L. Gao, B. Zhang, A. Plaza, and J. Chanussot, “SpectralFormer: Rethinking hyperspectral image classification with transformers,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–15, 2021.
[26] Z. Xue, X. Tan, X. Yu, B. Liu, A. Yu, and P. Zhang, “Deep hierarchical vision transformer for hyperspectral and LiDAR data classification,” IEEE Transactions on Image Processing, vol. 31, pp. 3095–3110, 2022.
[27] Y. Qing, W. Liu, L. Feng, and W. Gao, “Improved transformer net for hyperspectral image classification,” Remote
Sensing, vol. 13, no. 11, p. 2216, 2021.
[28] A. Jamali and M. Mahdianpari, “Swin Transformer and deep convolutional neural networks for coastal wetland classification using Sentinel-1, Sentinel-2, and LiDAR data,” Remote Sensing, vol. 14, no. 2, p. 359, 2022.
[29] H. Dong, L. Zhang, and B. Zou, “Exploring vision transformers for polarimetric SAR image classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–15, 2021.
[30] J. Yao, B. Zhang, C. Li, D. Hong, and J. Chanussot, “Extended vision transformer (ExViT) for land use and land cover classification: A multimodal deep learning framework,” IEEE Transactions on Geoscience and Remote Sensing, 2023.
[31] P. Helber, B. Bischke, A. Dengel, and D. Borth, “EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 12, no. 7, pp. 2217–2226, 2019.
[32] P. Helber, “EuroSAT: Land use and land cover classification with Sentinel-2,” 2023.
[33] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Advances in neural information processing systems, vol. 25, 2012.
[34] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE
conference on computer vision and pattern recognition, pp. 770–778, 2016.
[35] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, “Aggregated residual transformations for deep neural networks,”
in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1492–1500, 2017.
[36] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in
Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708, 2017.
[37] A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan, et al., “Searching for MobileNetV3,” in Proceedings of the IEEE/CVF international conference on computer vision, pp. 1314–1324, 2019.
[38] M. Tan and Q. Le, “EfficientNetV2: Smaller models and faster training,” in International conference on machine learning, pp. 10096–10106, PMLR, 2021.
[39] Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie, “A ConvNet for the 2020s,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 11976–11986, 2022.
[40] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer,
G. Heigold, S. Gelly, et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv
preprint arXiv:2010.11929, 2020.
[41] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision
transformer using shifted windows,” in Proceedings of the IEEE/CVF international conference on computer vision,
pp. 10012–10022, 2021.
[42] Z. Tu, H. Talebi, H. Zhang, F. Yang, P. Milanfar, A. Bovik, and Y. Li, “MaxViT: Multi-axis vision transformer,” in European conference on computer vision, pp. 459–479, Springer, 2022.
[43] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
[44] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on
imagenet classification,” in Proceedings of the IEEE international conference on computer vision, pp. 1026–1034,
2015.
[45] Weights and Biases, “The AI developer platform,” 2023. Accessed: 2023-12-01.
