
DEEP LEARNING BASED DETECTION OF DEFORESTATION

IN SATELLITE IMAGES
A Report submitted
in partial fulfillment for the award of the Degree of

BACHELOR OF TECHNOLOGY
in
ELECTRONICS AND COMMUNICATION ENGINEERING
By

L.RAMANA 1012004907
P.SWETHA 1011904032
K.AKSHAYA 1011904002
N.SAI GOWTHAM 1011904025

Under the esteemed guidance of

Smt G.SILPALATHA
(Ph.D),

Academic Consultant
Department of ECE

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING


Y.S.R ENGINEERING COLLEGE OF YOGI VEMANA UNIVERSITY
Proddatur-516360, Y.S.R (Dt)
ANDHRA PRADESH
2023
DEPARTMENT OF ELECTRONICS AND COMMUNICATION
ENGINEERING
Y.S.R ENGINEERING COLLEGE OF YOGI VEMANA UNIVERSITY
Proddatur-516360, Y.S.R (Dt)
ANDHRA PRADESH

CERTIFICATE
This is to certify that the project report entitled DEEP LEARNING
BASED DETECTION OF DEFORESTATION IN SATELLITE IMAGES
submitted by L.Ramana, P.Swetha, K.Akshaya & N.Sai Gowtham to the
Y.S.R Engineering College of Yogi Vemana University, Proddatur, in partial
fulfillment for the award of the degree of B.Tech in Electronics and
Communication Engineering is a bonafide record of project work carried out
by them under my supervision.
The contents of this report, in full or in parts, have not been submitted to
any other Institution or University for the award of any degree or diploma.

Project Guide Head of the Department

Smt.G.SILPALATHA Dr. B.P.SANTOSH KUMAR


(Ph.D), M.Tech, Ph.D., MISTE, MIE,
Academic Consultant Associate professor & HOD
Department of ECE Department of ECE
Y.S.R. Engineering College of YVU, Y.S.R. Engineering College of YVU,
Proddatur-516360 Proddatur-516360

Internal Examiner External examiner

DEPARTMENT OF ELECTRONICS AND COMMUNICATION
ENGINEERING
Y.S.R ENGINEERING COLLEGE OF YOGI VEMANA UNIVERSITY
Proddatur-516360, Y.S.R (Dt)
ANDHRA PRADESH

DECLARATION

We declare that this project report titled Deep Learning Based Detection Of
Deforestation In Satellite Images submitted in partial fulfillment of the degree of B.Tech in
Electronics and Communication Engineering is a record of original work carried out by us
under the Guidance of Smt G.Silpalatha. The matter embodied in this project report has not
been submitted by us for the award of any other degree or diploma.

L.Ramana 1012004907
P.Swetha 1011904032
K.Akshaya 1011904002
N.Sai Gowtham 1011904025

ACKNOWLEDGMENTS
We take this opportunity to express our deepest gratitude and appreciation to all those
who have helped us directly or indirectly towards the successful completion of this project.
It is a great pleasure to express our deep sense of gratitude and veneration to our guide,
Smt G.Silpalatha, Academic Consultant, Department of Electronics and Communication
Engineering, for her valuable guidance and thought-provoking discussions throughout the
course of the project work.
We extend our profound gratitude to Dr. S. Shafiulla Basha, Associate Professor,
Department of Electronics and Communication Engineering, for his encouragement and
support throughout the project.
We extend our profound gratitude to Dr. B.P. Santosh Kumar, Associate
Professor, Project Coordinator & Head of the Department of Electronics and Communication
Engineering, for his encouragement and support throughout the project.
We take this opportunity to offer our gratitude to Prof. K. Venkata Ramanaiah,
Dean, Faculty of Engineering, Y.S.R Engineering College of Yogi Vemana University,
Proddatur, for providing all sorts of help during the project work.
We take this opportunity to offer our gratitude to Prof. C. Nagaraju, Principal
of Y.S.R Engineering College of Yogi Vemana University, Proddatur, for providing all sorts
of help during the project work.
We express our thanks to all our college teaching and non-teaching staff members
who encouraged and helped us in some way or other throughout the project work.
Finally, we are thankful to all our friends who have in some way or other
helped us towards the completion of this project work.

L.Ramana 1012004907
P.Swetha 1011904032
K.Akshaya 1011904002
N.Sai Gowtham 1011904025

ABSTRACT
Human activity is an undeniable factor that increases total forest area
loss. Various NGOs, governments, and private companies are looking for ways to prevent
human-driven deforestation. A model that can generate interpretable deforestation predictions
is a valuable asset for preventing some causes of forest area loss, such as illegal logging and the
creation of pasture or plantation fields. Advances in satellite data analysis and artificial neural
networks have resulted in different methods of creating such a model. This work lists the most used
machine learning techniques for creating interpretable predictions. The problem of automatically
monitoring the deforestation process is considered for the efficient prevention of illegal
deforestation. An image segmentation model based on U-Net and the ResNet family of deep
neural networks (DNNs) was created. The forest/deforestation dataset was collected by
parsing areas of Ukrainian forestries, where satellite images of 512×512 pixels contain areas
with forest, deforestation, and other land cover. To overcome the imbalance of the created dataset,
a hybrid loss function was created and tested in the training environment. K-fold cross-validation
and numerous runs with different random seeds were conducted to prove the usefulness and
stability of the model and dataset during the training and validation process. These results
demonstrate that variation of images in the dataset and randomness of initialization have no
significant effect on model performance. Future research will be needed, however, in view of the
possible growth of the datasets: performance could be improved by the larger data representation,
but some decrease of performance could also be observed due to wider data variability. This is
especially important for the deployment of DNNs on devices with limited computational
resources at the Edge Computing layer.
Keywords: Deep learning, U-Net, ResNet-50, deep neural networks

TABLE OF CONTENTS

DESCRIPTION PAGE NO
CERTIFICATE ii

DECLARATION iii

ACKNOWLEDGEMENTS iv

ABSTRACT v
TABLE OF CONTENTS vi

LIST OF FIGURES x

LIST OF TABLES xi

1. INTRODUCTION 1-12
1.1 Digital Image Processing 1
1.2 What is an Image 1
1.2.1 Analog Image 1
1.2.2 Digital Image 1
1.3 Representation of Digital Image 2
1.3.1 Neighbors of a Pixel 3
1.3.2 Image Resolution 3
1.3.3 Human Visual System 3
1.3.4 Brightness and Contrast 3
1.4 Image Formats 3
1.4.1 JPG 3
1.4.2 GIF 4
1.4.3 PNG 4
1.4.4 SVG 4
1.5 Types of Digital Images 5
1.5.1 Black and White Images 5
1.5.2 Colour Images 5
1.5.3 Binary or Bi-level Images 5
1.5.4 Indexed Coloured Images 5

1.6 Resolution 6
1.6.1 Pixel Resolution 7
1.7 Colour Terminology 7
1.7.1 Primary and Secondary Colours and
Additive and Subtractive Colour Mixing 7
1.7.2 Colour Gamut 8
1.7.3 Colour Management 8
1.7.4 Hue 8
1.7.5 Saturation 8
1.7.6 Brightness 9
1.7.7 Luminance 9
1.7.8 Chrominance 9
1.8 Digital Image Colour Spaces 9
1.8.1 RGB 9
1.8.2 Hue Saturation Value 10
1.8.3 Hue Saturation Lightness 11
1.9 Forest 11
1.9.1 Deforestation 11
1.9.2 Degradation 12

2. LITERATURE SURVEY 13-16


2.1 Introduction 13
2.2 Fundamental Principles of the Evaluation Model 14
2.2.1 Systematization and Connectivity 14
2.2.2 Scientific Integrity and Objectivity 14
2.2.3 Sustainable Development 15
2.2.4 Transparency 15
2.2.5 Reliability 15
2.2.6 Comparability 15
2.2.7 Qualitative Versus Quantitative Analysis 16

3. SEGMENTATION TECHNIQUES AND CNN 17-43
3.1 A Mathematical Definition of Segmentation 17
3.2 A Review on Existing Segmentation Techniques 17
3.2.1 Histogram Thresholding 18
3.2.2 Edge Based Segmentation 19
3.2.3 Active Contour Based Segmentation 19
3.2.4 Region Based Segmentation 20
3.3 Region Growing 22
3.4 Region Based Method 23
3.4.1 Growth of Regions 23
3.4.2 Growth Algorithm 23
3.4.3 Growth Types 24
3.4.4 Seed Growth 24
3.4.5 Neighbor Growth 24
3.4.6 Disadvantages of Region Growing 25
3.5 Clustering Method 25
3.6 Types of Clustering 25
3.6.1 Hierarchical Clustering 26
3.6.2 K-Means Clustering 28
3.6.3 Fuzzy C-Means Clustering 29
3.6.4 QT Clustering Algorithm 30
3.6.5 Spectral Clustering 31
3.7 Comparison Between Data Clustering 31
3.7.1 K-Means Algorithm 33
3.7.2 Fuzzy C-Means 33
3.7.3 EM Algorithm 33
3.8 Convolutional Neural Networks 34
3.9 Convolutional Layer 35
3.10 Pooling Layer 39
3.11 Batch Normalization 40
3.12 Residual Connections 41

4. PROPOSED METHODOLOGY 44-50
4.1 Introduction 44
4.2 Resnet 45
4.3 Proposed Model 47
4.3.1 Data Set Used 47
4.3.2 Loss Function 49
4.3.3 Model 50

5. SOFTWARE DESCRIPTION 51-57


5.1 Introduction 51
5.2 Basic Building MATLAB 51
5.2.1 MATLAB window 51
5.3 MATLAB files 53
5.3.1 M-files 53
5.3.2 MAT-files 53
5.4 MATLAB systems 53
5.4.1 Development Environment 53
5.4.2 MATLAB Mathematical Functions 54
5.4.3 MATLAB Language 54
5.4.4 Graphics 54
5.4.5 MATLAB Application interface 54
5.5 Some Basic Commands 55
5.6 Some Basic Plot Commands 56
5.7 MATLAB Working Environment 56
5.7.1 MATLAB Desktop 56
5.7.2 Using MATLAB Editor to Create M-Files 57
5.7.3 Getting Help 57

6. RESULTS 58-59
7. CONCLUSION 60
REFERENCES 61-62
APPENDIX 63-65

LIST OF FIGURES
FIGURE NO. DESCRIPTION PAGE NO
1.1 Resolution of the image: as the pixels get larger, details in the image are less 6
1.2 Resolution of an image if the size is reduced 7
1.3 RGB Cube 10
1.4 Different percentage of HSV 10
1.5 Different percentage of HSL 11
2.1 RADAR data examples 12
3.1 Histogram of images 19
3.2 Raw data 27
3.3 Traditional representation 27
3.4 Hierarchical feature extraction CNN 35
3.5 Illustration of convolution layer 36
3.6 Example of valid cross correlation without zero padding 38
3.7 Example of valid cross correlation with zero padding 38
3.8 Illustration pooling layer 39
3.9 Illustration of residual connection 41
3.10 Comparison of the standard residual design with the bottleneck design 42
4.1 ResNet functioning 46
4.2 Satellite image, image from initial dataset, mask 48
6.1 Input image 58
6.2 Output image using U-NET 58
6.3 Output image using RESNET 59

LIST OF TABLES
TABLE NO. DESCRIPTION PAGE NO
3.1 Example data 33
4.1 Resnet description 47

CHAPTER 1
INTRODUCTION
1.1 Digital Image Processing
Digital images play an important role both in daily-life applications, such as satellite
television, magnetic resonance imaging and computed tomography, and in areas of research
and technology, such as geographical information systems and astronomy. An image is a 2D
representation of a three-dimensional scene. A digital image is basically a numerical
representation of an object. The term digital image processing refers to the manipulation of an
image by means of a processor. The different elements of an image processing system include
image acquisition, image storage, image processing and display.
Digital image processing is the use of computer algorithms to perform processing on
digital images. As a subcategory or field of digital signal processing, digital image processing
has many advantages over analog image processing. It allows a much wider range of
algorithms to be applied to the input data and can avoid problems such as the build-up of noise
and signal distortion during processing. Since images are defined over two dimensions
(perhaps more), digital image processing may be modeled in the form of multidimensional
systems. Digital image processing methods stem from two principal application areas:
improvement of pictorial information for human interpretation, and processing of image data
for storage, transmission and representation for autonomous machine perception.
1.2 What is an Image?
An image is a two-dimensional function that represents a measure of a characteristic,
such as brightness or colour, of a viewed scene. An image is a projection of a 3D scene onto a
2D projection plane. It can be defined as a two-variable function f(x, y), where for each position
(x, y) in the projection plane, f(x, y) defines the light intensity at that point.
1.2.1 Analog Image
An analog image can be mathematically represented as a continuous range of values
representing position and intensity. An analog image is characterized by a physical magnitude
varying continuously in space.
1.2.2 Digital Image
A digital image is composed of picture elements called pixels. Pixels are the smallest
samples of an image. A pixel represents the brightness at one point. Conversion of an analog
image into a digital image involves two important operations, namely sampling and quantization.
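As an illustration of these two operations (a minimal sketch, not part of this project's
implementation; the names and values are illustrative), the following Python code samples a
continuous one-dimensional intensity profile at discrete positions and quantizes the samples
to 256 gray levels:

import numpy as np

def sample_and_quantize(f, n_samples=512, levels=256):
    # Sampling: evaluate the continuous function at discrete positions.
    x = np.linspace(0.0, 1.0, n_samples)
    values = f(x)                                 # continuous intensities in [0, 1]
    # Quantization: map each sample to one of `levels` integer gray levels.
    q = np.clip(np.round(values * (levels - 1)), 0, levels - 1)
    return q.astype(np.uint8)

# Example: a smooth intensity ramp with a sinusoidal variation.
digital = sample_and_quantize(lambda x: 0.5 + 0.5 * np.sin(2 * np.pi * x))

The same two steps, applied over two dimensions, turn a continuous scene into a pixel grid.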

Advantages of digital images:


 The processing of images is faster and more cost-effective.
 Digital images can be effectively stored and efficiently transmitted from one place to
another.
 When shooting a digital image one can immediately see if the image is good or not.
 When the image is in digital format, the reproduction of the image is both faster and
cheaper.
 Copying a digital image is easy. The quality of the digital image will not be degraded
even if it is copied several times.
 Digital technology offers plenty of scope for versatile image manipulation.
Drawbacks of digital image
 Misuse of copyright has become easier because image can be copied from the internet
just by clicking the mouse a couple of times.
 A digital file cannot be enlarged beyond a certain size without compromising on
quality.
 The processing of an image by means of a computer is generally termed digital image
processing. The advantages of using computers for the processing of images are:
Flexibility and Adaptability: The main advantage of digital computers when compared
to analog electronic and optical information processing devices is that no hardware
modifications are necessary in order to reprogram digital computers to solve different tasks.
This feature makes digital computers an ideal device for processing image signals adaptively.
Data Storage and Transmission: With the development of different image
compression algorithms, the digital data can be effectively stored. The digital data within the
computer can be easily transmitted from one place to another. The only limitations of digital
imaging and digital image processing are the memory and processing speed capabilities of
computers. Different image processing techniques include image enhancement, image
restoration, image fusion and image watermarking.
1.3 Representation of Digital Image
A digital image is a two-dimensional discrete signal. A digital image is also an N×N
array of elements. Each element in the array is a number which represents the sampled
intensity. Converting an image into digital format can be done either with a digital camera or
with a scanner. A digital image can also be created directly on a computer screen. However, it is
restricted both in spatial coordinates (sampling) and in its allowed intensities (quantization).

1.3.1 Neighbors of a Pixel


A pixel has four neighbors, in the east, west, north and south directions; these are called the
4-neighbors of a pixel ‘P’. A pixel ‘P’ has eight neighbors if the diagonal directions
(north-east, north-west, south-east and south-west) are included as well.
1.3.2 Image Resolution
Resolution gives the degree of distinguishable details. Resolution can be broadly
classified into spatial resolution and gray-level resolution. Spatial resolution is the smallest
discernible detail in an image. Spatial resolution depends on the number of pixels. The
principal factor determining spatial resolution is sampling. Gray-level resolution refers to the
smallest discernible change in the gray level. Gray level resolution depends on the number of
gray levels.
1.3.3 Human Visual System
The human visual system (HVS) is one of the most complex systems in existence. Our
visual system allows us to organize and understand the many complex elements in our
environment. The visual system consists of the eye, which transforms light into neural signals,
and the related parts of the brain that process the neural signals and extract the necessary information.
The human eye serves to project and convert light into neural activity.
1.3.4 Brightness and Contrast
Brightness is the psychological concept or sensation associated with the amount of
light stimulus. Light source intensity depends upon the total light emitted and the size of the
solid angle from which it is emitted. Two sources of equal intensity do not appear equally
bright. Luminance, the intensity per unit area, is a psychophysical property that can be
measured. The term contrast is used to emphasize the difference in luminance of objects. The
perceived brightness of a surface depends upon local background.
1.4 Image Formats
The difference in image types is the result of the need for compression. By default,
most images have a large file size, which is not conducive to use on the web. The most
compressed image file types are .jpg, .gif and .png.
1.4.1 JPG
The JPG file format, short for Joint Photographic Experts Group, is a type of image
compression that works best with photographs and complex images. JPGs use a compression
method that removes non-human-visible colours from images to decrease file sizes. Be careful,
though. If you decrease the quality of a JPG too much, you will begin to lose important colour
information that cannot be recovered. The JPG file format also allows you to save progressive
JPGs, which will load in stages. You may have experienced this before when visiting a website
and watching as an image slowly loses its blurriness and becomes clearer. Use JPGs for
product photos, human portraits and other images where colour variances are important. Do
not use JPGs if you need transparency, which is the ability to see through an image and
decipher the background behind it. JPGs do not support transparency.
1.4.2 GIF

A GIF, or a Graphics Interchange Format, reduces the number of colours in an image


to 256, from potentially thousands of colours coming from a digital camera. GIFs also support
transparency. GIFs have the unique ability to display a sequence of images, similar to videos,
called an animated GIF, which is a series of separate GIF images that are linked together to
automatically create motion, or animation. GIFs, like JPGs, also have the ability to load in
segments on web pages. These images, known as interlaced GIFs, tend to be slightly larger
than regular GIFs, but they allow a GIF image to be partially visible as it is loading on a web
page. GIFs can be used effectively for limited-colour images, such as logos and graphs, or for
images where transparency is important. Do not use GIFs for full-colour product photos and
staff portraits, for example, where colour variances are important, as GIF colours are limited
to 256. Although the GIF format is still in use, it should generally be avoided in favor of the
PNG format, which does nearly everything better.
1.4.3 PNG

PNGs, or Portable Network Graphics, were created as an alternative to the GIF file
format, when the GIF technology was copyrighted and required permission to use. PNGs allow
for 5 to 25 percent greater compression than GIFs, and with a wider range of colours. Like
GIFs, PNG file formats also support transparency, but PNGs support variable transparency,
where users can control the degree to which an image is transparent. The downside to
advanced transparency in PNGs is that not all older browsers will display the transparency the
same. PNGs also support image interlacing, similar to GIFs, but PNGs use two-dimensional
interlacing, which makes them load twice as fast as GIF images. If you are interested in this
interlacing technology.
1.4.4 SVG
The SVG (Scalable Vector Graphics) standard has been around for more than a decade, but
with the recent emergence of HTML5 it is finally coming of age. For now, know that SVG allows you to
create very high-quality graphics and animations that do not lose detail as their size increases.
This means that with SVG you could create one graphic that looked great on a tiny mobile
phone screen or on a 60-inch computer monitor.
1.5 Types of Digital Images
For photographic purposes, there are two important types of digital images—colour
and black and white. Colour images are made up of coloured pixels while black and white
images are made of pixels in different shades of gray.
1.5.1 Black and White Images
A black and white image is made up of pixels each of which holds a single number
corresponding to the gray level of the image at a particular location. These gray levels span
the full range from black to white in a series of very fine steps, normally 256 different grays.
Since the eye can barely distinguish about 200 different gray levels, this is enough to give the
illusion of a stepless tonal scale.
Assuming 256 gray levels, each black and white pixel can be stored in a single byte (8 bits) of memory.
1.5.2 Colour Images
A colour image is made up of pixels each of which holds three numbers corresponding
to the red, green, and blue levels of the image at a particular location. Red, green, and blue
(sometimes referred to as RGB) are the primary colours for mixing light—these so-called
additive primary colours are different from the subtractive primary colours used for mixing
paints (cyan, magenta, and yellow). Any colour can be created by mixing the correct amounts
of red, green, and blue light. Assuming 256 levels for each primary, each colour pixel can be
stored in three bytes (24 bits) of memory. This corresponds to roughly 16.7 million different
possible colours. Note that for images of the same size, a black and white version will use
three times less memory than a colour version.
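These memory figures can be checked with a small sketch (illustrative only, not code from
this project):

def image_memory_bytes(width, height, bytes_per_pixel):
    # Uncompressed size: one value per pixel per channel.
    return width * height * bytes_per_pixel

w, h = 512, 512                      # the image size used in this work's dataset
gray = image_memory_bytes(w, h, 1)   # 8-bit grayscale: 1 byte per pixel
rgb = image_memory_bytes(w, h, 3)    # 24-bit colour: 3 bytes per pixel
print(gray, rgb, rgb // gray)        # 262144 786432 3 -> colour needs 3x the memory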
1.5.3 Binary or Bi-Level Images
Binary images use only a single bit to represent each pixel. Since a bit can only exist
in two states—on or off, every pixel in a binary image must be one of two colours, usually
black or white. This inability to represent intermediate shades of gray is what limits their
usefulness in dealing with photographic images.
1.5.4 Indexed Colour Images
Some colour images are created using a limited palette of colours, typically 256
different colours. These images are referred to as indexed colour images because the data for
each pixel consists of a palette index indicating which of the colours in the palette applies to
that pixel. There are several problems with using indexed colour to represent photographic
images. First, if the image contains more different colours than are in the palette, techniques
such as dithering must be applied to represent the missing colours and this degrades the image.
Second, combining two indexed colour images that use different palettes or even retouching
part of a single indexed colour image creates problems because of the limited number of
available colours.
1.6 Resolution
The more points at which we sample the image by measuring its colour, the more
detail we can capture. The density of pixels in an image is referred to as its resolution. The
higher the resolution, the more information the image contains. If we keep the image size the
same and increase the resolution, the image gets sharper and more detailed. Alternatively, with
a higher resolution image, we can produce a larger image with the same amount of detail.

Fig. 1.1 Resolution of the image: as the pixels get larger, details in the image are less

As we reduce the resolution of an image while keeping its pixels the same size, the image
gets smaller and smaller while the amount of detail (per square inch) stays the same.

Fig. 1.2 Resolution of an image if the size is reduced


1.6.1 Pixel Resolution
The term resolution is often used for a pixel count in digital imaging, even though
American, Japanese, and international standards specify that it should not be so used, at least
in the digital camera field. An image of N pixels high by M pixels wide can have any
resolution less than N lines per picture height, or N TV lines. But when the pixel counts are
referred to as resolution, the convention is to describe the pixel resolution with a set of two
positive integers, where the first number is the number of pixel columns (width) and
the second is the number of pixel rows (height), for example 7680 by 4320. Another popular
convention is to cite resolution as the total number of pixels in the image, typically given as the
number of megapixels, which can be calculated by multiplying pixel columns by pixel rows
and dividing by one million.
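A minimal sketch of this megapixel convention (illustrative only, not code from this project):

def megapixels(columns, rows):
    # Total pixel count divided by one million.
    return columns * rows / 1_000_000

print(megapixels(7680, 4320))    # 33.1776 -> a 7680x4320 image is about 33 megapixels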
1.7 Colour Terminology
While pixels are normally stored within the computer according to their red, green, and
blue levels, this method of specifying colours (sometimes called the RGB colour space) does
not correspond to the way we normally perceive and categorize colours. There are many
different ways to specify colours, but the most useful ones work by separating out the hue,
saturation, and brightness components of a colour.
1.7.1 Primary and Secondary Colours and Additive and Subtractive Colour Mixing
Primary colours are those that cannot be created by mixing other colours. Because of
the way we perceive colours using three different sets of wavelengths, there are three primary
colours. Any colour can be represented as some mixture of these three primary colours. There
are two ways to combine colours: additive and subtractive colour mixing. Subtractive colour
mixing is the one most of us learned in school, and it describes how two coloured paints or
inks combine on a piece of paper. The three subtractive primaries are Cyan (blue-green),
Magenta (purple-red), and Yellow (not Blue, Red, and Yellow as we were taught). Additive
colour mixing refers to combining lights of two different colours, for example by shining two
coloured spotlights on the same white wall. The additive colour model is the one used in
computer displays as the image is formed on the face of the monitor by combining beams of
red, green, and blue light in different proportions. Colour printers use the subtractive colour
model and use cyan, magenta, and yellow inks. To compensate for the impure nature of most
printing inks, a fourth colour, black, is also used, since the black obtained by combining cyan,
magenta, and yellow inks is often a murky dark green rather than a deep, rich black. For this
and other reasons, commercial colour printing presses use a 4-colour process to reproduce
colour images in magazines. A colour created by mixing equal amounts of two primary colours
is called a secondary.
1.7.2 Colour Gamut
In the real world, the ideal of creating any visible colour by mixing three primary
colours is never actually achieved. The dyes, pigments, and phosphors used to create colours
on paper or computer screens are imperfect and cannot recreate the full range of visible
colours. The actual range of colours achievable by a particular device or medium is called its
colour gamut and this is mostly but not entirely determined by the characteristics of its primary
colours. Since different devices such as computer monitors, printers, scanners, and
photographic film all have different colour gamuts, the problem of achieving consistent colour
is quite challenging. Different media also differ in their total dynamic range—how dark is the
darkest achievable black and how light is the brightest white.
1.7.3 Colour Management
The process of getting an image to look the same between two or more different media
or devices is called colour management, and there are many different colour management
systems available today. Unfortunately, most are complex, expensive, and not available for a
full range of devices.
1.7.4 Hue
The hue of a colour identifies what is commonly called “colour.” For example, all reds
have a similar hue value whether they are light, dark, intense, or pastel.
1.7.5 Saturation
The saturation of a colour identifies how pure or intense the colour is. A fully saturated
colour is deep and brilliant—as the saturation decreases, the colour gets paler and more washed
out until it eventually fades to neutral.
1.7.6 Brightness
The brightness of a colour identifies how light or dark the colour is. Any colour whose
brightness is zero is black, regardless of its hue or saturation. There are different schemes for
specifying a colour's brightness and depending on which one is used, the results of lightening
a colour can vary considerably.
1.7.7 Luminance
The luminance of a colour is a measure of its perceived brightness. The computation
of luminance takes into account the fact that the human eye is far more sensitive to certain
colours (like yellow-green) than to others (like blue).
1.7.8 Chrominance
Chrominance is a complementary concept to luminance. If you think of how a
television signal works, there are two components—a black and white image which represents
the luminance and a colour signal which contains the chrominance information. Chrominance
is a 2-dimensional colour space that represents hue and saturation, independent of brightness.
1.8 Digital Image Colour Spaces
A colour space is a mathematical system for representing colours. Since it takes at least
three independent measurements to determine a colour, most colour spaces are three-
dimensional. Many different colour spaces have been created over the years in an effort to
categorize the full gamut of possible colours according to different characteristics.
1.8.1 RGB
Most computer monitors work by specifying colours according to their red, green, and
blue components. These three values define a 3-dimensional colour space called the RGB colour
space. The RGB colour space can be visualized as a cube with red varying along one axis,
green varying along the second, and blue varying along the third. Every colour that can be
created by mixing red, green, and blue light is located somewhere within the cube. The
following image (Fig. 1.3) shows the outside of the RGB cube:
The eight corners of the cube correspond to the three primary colours (Red, Green, Blue), the
three secondary colours (Cyan, Magenta, Yellow) and black and white. All the different
neutral grays are located on the diagonal of the cube that connects the black and the white
vertices.

1.8.2 HSV (Hue, Saturation, Value)


The HSV colour space attempts to characterize colours according to their hue,
saturation, and value (brightness). This colour space is based on a so-called hex cone model
which can be visualized as a prism with a hexagon on one end that tapers down to a single
point at the other. The hexagonal face of the prism is derived by looking at the RGB cube
centered on its white corner. The cube, when viewed from this angle, looks like a hexagon
with white in the center and the primary and secondary colours making up the six vertices of
the hexagon. Successive cross-sections of the HSV hex cone as it narrows to its vertex are
illustrated in Fig. 1.4, showing how the colours get darker and darker, eventually reaching black.

Fig. 1.3 RGB cube

Fig. 1.4 Different percentages of HSV value
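The hue/saturation/value decomposition described above can be illustrated with Python's
standard-library colorsys module (an illustration only, not code from this project; RGB values
are scaled to [0, 1]):

import colorsys

# A saturated red: full R, no G, no B.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))    # (0.0, 1.0, 1.0): hue 0, full S and V

# Darkening the same colour lowers V while H and S stay unchanged.
print(colorsys.rgb_to_hsv(0.25, 0.0, 0.0))   # (0.0, 1.0, 0.25)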

1.8.3 HSL (Hue, Saturation, Lightness)


The HSL colour space (also sometimes called HLS) attempts to characterize colours
according to their hue, saturation, and lightness (brightness). This colour space is based on a
double hex cone model, which consists of a hexagon in the middle that converges down to a
point at each end. Like the HSV colour space, the HSL space goes to black at one end, but
unlike HSV, it tends toward white at the opposite end. The most saturated colours appear in
the middle. Note that unlike in the HSV colour space, this central cross-section has 50% gray
in the center and not white.

Fig. 1.5 Different levels of HSL


1.9 Forest
The Kyoto protocol defines a forest area using three parameters. The first one is a
minimum land area between 0.05 and 1 hectare. Next, the minimum value of tree cover in this
area needs to be between 10 and 30%. Third, the minimum tree height needs to be in the
interval of 2 and 5 meters. This means that each country has a window to apply this concept
depending on its landscape. As an unfortunate direct result of this definition, countries can
exploit this concept to damage forests enough not to lose the forest status. Sasaki and Putz
(2009) warn that the Kyoto Protocol definition needs to be revised and explore alternatives.
Nevertheless, for this project, the definition given above will be maintained.
1.9.1 Deforestation
According to Myers (1991), and as the name implies, deforestation is when a forest
area is destroyed in such a way that it can no longer be classified as forest. According to Wibowo and
Byron (1999), the same term can be used to address both human activities, discussed in this
work's Introduction, and natural causes. Some examples of naturally caused deforestation are,
but not limited to, soil erosion, flooding, forest fires, and hurricanes. While deforestation effects
can be perceived with satellite images, the same cannot be stated about selective logging.
1.9.2 Degradation
Forest degradation is a process that usually precedes deforestation. The Kyoto protocol
does not provide a definition, and there is no universally agreed interpretation (Sasaki and
Putz, 2009). This project defines that when a forest loses 10 to 30% of canopy cover on forest
area, forest degradation is in course. It is easy to see the effect from the ground, but not so
easy using satellites.

Figure 2.1: Radar data examples. Each green pixel is 15 m² of forest area. A red
pixel corresponds to deforestation, and colors between yellow and orange are levels of
degradation.
For this project, radar sensors provide the technology to differentiate both of them, as seen in
Figure 2.1. Human-based degradation can also be the result of selective logging, forest usage by
guerillas, or drug trafficking. Selective logging is a type of tree removal in which the objective
is to retrieve a limited number of marketable tree species. This practice can damage other
trees and affect soil and local fauna. Since it is hard to capture with satellite images, it will be
out of scope for this study, unless it can be categorized as forest degradation.

CHAPTER 2
LITERATURE SURVEY
2.1 Introduction
Forests are considered an important part of the environment. Their major
purpose is to absorb carbon dioxide and generate oxygen through photosynthesis,
maintaining a balanced and healthy atmosphere. Examination of environmental disasters, such
as biodiversity loss, deforestation, depletion of natural resources, etc., necessitates
continuous change detection in forests. Nowadays, land cover change
analysis is performed using satellite images. Several techniques have been introduced for forest
change detection, but missing data in the satellite images is a serious problem due to artifacts,
cloud occlusion, and so on. Thus, techniques handling missing data for forest change detection
are essential. As a result, this survey provides a review of forest change detection
mechanisms, presenting a complete analysis of 25 papers on forest change detection methods,
such as machine learning techniques and pixel-based techniques. In
addition, a detailed investigation is carried out based on the performance measures, images
adopted, datasets used, evaluation metrics, and accuracy ranges. Finally, the issues faced by
different forest change detection methods are summarized to help researchers develop
improved detection methods.
Sustainable development is an agreed-upon global forestry trend because it impacts
other deforestation-related industries. Many factors, such as forest canopy density, forest
degradation, the patterns and processes of deforestation, and logging intensity, are important
for the sustainable development of forest ecosystems and the global carbon budget [1–8]. The
important principles of Chinese forestry blueprinting have been well integrated with the
harnessing of natural resources, environmental protection, and ecological balance. Improving
the ecological environment, focusing on ecoengineering, and effectively maintaining ecology
have become the leading demands of socioeconomic forestry development in this century [1–
8]. The change of this leading demand has given the Chinese forestry industry the most
favourable economic development position.
Specifically, forestry has become the center and foundation of ecological maintenance
and socioeconomic sustainable development. Forestry is no longer a single industry; rather, it
is a comprehensive and dynamic system within which any development or change of its
components would directly or indirectly impact the entire economy. Therefore, in response to
sustainable forestry development, a broad set of regulations, guidelines, and the technology
were developed to control and safeguard forest management practice that encompasses
silvicultural treatment, forest conservation strategy, cutting rate control, deforestation and
forest degradation monitoring, developing cableway logging [9–12], reducing impact logging,
assessment of biomass and carbon stock, and so forth [12]. Many studies have examined the
patterns and processes of deforestation [6–8, 13–17], but information about the light cableway
skidding technology beneficial to forest ecosystems is still limited [10, 18–25].
Noticeably, technological innovation is one of the most important components of the
forestry system. Scientific deforestation plays a vital role in sustainable forestry
development and forestry competitiveness. The adoption of deforestation technology may
have a direct or an indirect and beneficial or damaging impact on the overall forestry system;
furthermore, it also has immediate implications for the sustainable development of forest
resources as well as the economy and society. Accordingly, the engineering-based study of
the adoption of rational analytic methods for the evaluation of sustainable forestry
development is of great significance.
2.2 Fundamental Principles of the Evaluation Model
2.2.1. Systematization and Connectivity
The forest ecosystem is complicated and features systematization and
comprehensiveness, comprising many interrelated and interacting components. This system is
also interlocked with every operation of the forestry business, demonstrating its innate
diversification. Noticeably, every subsystem in the forest ecosystem is not only relatively
independent but also interdependent. The subsystems are directly or indirectly related to each
other, developing dynamically and composing a compatible system.
2.2.2. Scientific Integrity and Objectivity
Scientific integrity refers to the fact that research in any discipline should make its
subject objective, use detectable laws, and have theoretical verifiability, strict logicality, and
united scientific value. Specifically, the construction of the model should not only conform to
the fundamental scientific principles but also reflect the internal components, law of
development and characteristics of the material and technical base of the forest ecosystem
itself. Additionally, the construction of the evaluation model and the choosing of an index
system should consider the architecture layer, organization, and archetypes. Objectivity means
that the construction of the evaluation model and the chosen index should be as consistent as
possible with the objective reality and the objective laws of forest ecosystem development. In
addition, all of the data from the index and evaluation systems should be as objective and
accurate as possible. The data collection process should be based on the statistics released by
national or provincial statistics departments or profession-qualified statistics institutions to
guarantee data authority and reliability.
2.2.3. Sustainable Development
Because the forest ecosystem has economic, ecological, and social benefits, forest
operators cannot be driven merely by economic interests but must be guided by a sustainable
development principle to reconcile these three main benefits. Additionally, forest operators
should be oriented by the market mechanism and the law of value reasonably and should
consider the economic and social sustainable development of the forest in the pursuit of profits.
Hence, the related governmental departments should provide necessary guidelines, education
and regulatory supervision to ensure that the forest operators are engaged in sustainable
development efforts.
2.2.4. Transparency
Transparency means that the construction of the evaluation model and index system for
the sustainable forest development is set to be open, transparent, accurate, and specific so that
some constructive suggestions can be adopted.
2.2.5. Reliability
Reliability requires that the methods adopted by the evaluation model for sustainable
forest development should be feasible and practical. Otherwise, the model would lose its
significance and operability. The index system should have a reliable, continuous, and
authoritative data source. If some important indexes fail to attain reliable data sources,
they should be set aside until the data collection process is ready, or regarded only as
a theoretical basis with the calculation omitted. On all accounts, the index system should
represent the facts and be precise, workable, and practical.
2.2.6. Comparability
Comparability means that the construction of the evaluation model and index system for
the sustainable forest development should be available for objective comparison with the
alternatives. Different types of statistics are used to reflect the sustainable development of the
forest during construction; so, different types of indexes should be compared. For example,
the dimensionless method can give index comparability.

2.2.7. Qualitative versus Quantitative Analysis


In the sustainable forestry development system, the interaction of every component
demonstrates its complexity, comprehensiveness, connectivity, and compatibility. Therefore,
in the specific evaluation process, qualitative analysis should be applied to evaluate the
objectives, construct a scientific and feasible index evaluation model, and use specific data to
assess the state of the sustainable development. Only in this fashion can the scientific integrity,
objectivity, and sustainability of the evaluation system be obtained. Therefore, quantitative
analysis cannot ignore the significance of qualitative analysis, and actually, the related index
of the sustainable forestry development cannot be fully quantified. Instead, necessary notes
and explications should be added to specify the definition, function, and equations of every
index. In the evaluation practice, the combination of qualitative and quantitative analysis
makes scientific evaluation possible.

CHAPTER 3
SEGMENTATION TECHNIQUES AND CNN
3.1 A Mathematical Definition of Segmentation
The following is a very general definition of image segmentation. It uses a
homogeneity predicate P(R) that helps formalize the notion of homogeneity in an image: a
region R is homogeneous if and only if P(R) = True. The homogeneity can therefore be
defined in an infinity of different ways: on the grey levels, on the textures, or even on
non-obvious properties of the image.
Definition 1 (segmentation): Let I be the set of pixels (the input image) and P(R) the
homogeneity predicate defined on groups of connected pixels.
A segmentation S of I is a partitioning set of image regions \{R_1, R_2, \ldots, R_n\} such that

\bigcup_{i=1}^{n} R_i = I \quad \text{and} \quad R_i \cap R_j = \emptyset \;\; \forall i \neq j (eq. 3.1)

is a mathematical definition of a partition: the union of all the regions forms the whole image
and all the regions are distinct.

P(R_i) = \text{True} \;\; \forall i (eq. 3.2)

signifies that the homogeneity predicate is valid on every region.

P(R_i \cup R_j) = \text{False} \;\; \forall R_i \text{ adjacent to } R_j (eq. 3.3)

signifies that the union of two adjacent regions cannot satisfy the homogeneity predicate, i.e.
two adjacent regions must be distinct regarding the homogeneity predicate.

(R_j \subset R_i \;\wedge\; R_j \neq \emptyset \;\wedge\; P(R_i) = \text{True}) \Rightarrow (P(R_j) = \text{True}) (eq. 3.4)

signifies that the homogeneity predicate is valid on any sub-region of a region where it is
verified.
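As a concrete illustration (an assumption made for illustration, not a definition used later in
this report), a grey-level homogeneity predicate can be written as a small Python function
that accepts a region when the spread of its intensities stays below a tolerance t:

import numpy as np

def homogeneity_predicate(region_pixels, t=10):
    # region_pixels: 1-D array of grey levels belonging to one region R.
    region_pixels = np.asarray(region_pixels)
    return (region_pixels.max() - region_pixels.min()) <= t    # P(R)

print(homogeneity_predicate([120, 122, 125]))   # True:  P(R) holds
print(homogeneity_predicate([120, 122, 180]))   # False: the region is not homogeneous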
3.2 A Review on Existing Segmentation Techniques
A wide range of very specialized segmentation techniques currently exists, and since
research is very active in this field, the set of available techniques and algorithms constantly
evolves. Therefore, a complete study that would review all the state-of-the-art techniques is
not relevant in the context of this document. Instead, this section tries to present a simple yet
homogeneous and relevant classification of the existing techniques into a number of families.
For each family, the general functional philosophy is analyzed and a non-exhaustive list of
algorithms is presented, with a short explanation of the specificities of each of them.
There are numerous types of classifications proposed in the specialized literature, each
of which is relevant to the point of view required by the study. Since this research
project deals with medical image segmentation, where a large majority of the acquired data is
grey-scaled, all the techniques concerning colour images will be left aside. The techniques
are categorized into three main families:
a. Pixel based techniques (also known as histogram thresholding);
b. Edge based techniques;
c. Region based techniques.
This classification is very commonly encountered in numerous papers, which distinguish families such as:
 Histogram thresholding
 Edge based segmentation
 Tree or graph based approaches
 Region growing
 Clustering
 Probabilistic and Bayesian approaches
 Neural networks segmentation
 Other approaches.
3.2.1 Histogram Thresholding
The pixel-based family of techniques is probably the simplest one; it essentially
consists in finding an acceptable threshold in the grey levels of the input image in order to
separate the object(s) from the background. It is often referred to as histogram thresholding
since the grey-level histogram of an ideal image will clearly show two distinct peaks
assimilable to Gaussians (which can be obtained by applying a filter to the image) representing
the distribution of grey levels for one object and its background.
Histogram-Based Methods
Histogram-based methods are very efficient when compared to other image
segmentation methods because they typically require only one pass through the pixels. In this
technique, a histogram is computed from all of the pixels in the image, and the peaks and
valleys in the histogram are used to locate the clusters in the image. Color or intensity can be
used as the measure. A refinement of this technique is to recursively apply the histogram-
seeking method to clusters in the image in order to divide them into smaller clusters. This is
repeated with smaller and smaller clusters until no more clusters are formed.
One disadvantage of the histogram-seeking method is that it may be difficult to identify
significant peaks and valleys in the image. In this technique of image classification, distance
metrics and integrated region matching are commonly used.

Figure 3.1 Histogram of images


The threshold value for an image can be computed by using several different methods, such as
Gaussian filtering.
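A hedged sketch of such a thresholding step is given below (illustrative only; the function
and parameter names are not from this project). It smooths the histogram with a Gaussian
filter, finds the two dominant peaks, and places the threshold in the valley between them:

import numpy as np
from scipy.ndimage import gaussian_filter1d

def valley_threshold(gray_image, sigma=2.0, peak_gap=20):
    hist, _ = np.histogram(gray_image, bins=256, range=(0, 256))
    smooth = gaussian_filter1d(hist.astype(float), sigma)
    p1 = int(np.argmax(smooth))                    # first dominant peak
    masked = smooth.copy()
    masked[max(0, p1 - peak_gap):p1 + peak_gap] = 0
    p2 = int(np.argmax(masked))                    # second dominant peak
    lo, hi = sorted((p1, p2))
    return lo + int(np.argmin(smooth[lo:hi + 1]))  # valley between the peaks

# binary = gray_image > valley_threshold(gray_image)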
3.2.2 Edge-Based Segmentation
The edge-based family of techniques tries to detect edges in an image so that the
boundaries of the objects can be inferred. The simplest method of this type is known as
detect and link: the algorithm first tries to detect local discontinuities and then tries to build
longer ones by connecting them, hopefully leading to closed boundaries which circumscribe
the objects in the image. The main disadvantage of this technique lies in the fact that,
depending on the quality of the input image, the algorithm is not guaranteed to produce closed
edges. As a consequence, the image will not sharply split into regions.
MR image segmentation based on edge detection has been proposed where a
combination of Marr-Hildreth operator for edge detection and morphological operations for
the refinement of the detected edges is used to segment 3D MR images. A boundary tracing
method is proposed, where the operator clicks a pixel in a region to be outlined and the method
then finds the boundary starting from that point. The method is, however, restricted to
segmentation of large, well-defined structures, but not to distinguish fine tissue types. Edge-
based segmentation methods usually suffer from over or under-segmentation, induced by
improper threshold selection. In addition, the edges found are usually not closed and
complicated edge linking techniques are further required.
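As a small sketch of the detection step (using the Sobel operator as a stand-in; the
Marr-Hildreth operator cited above is a Laplacian-of-Gaussian and could be substituted), a
gradient-magnitude edge map can be obtained as follows:

import numpy as np
from scipy import ndimage

def sobel_edges(gray, threshold=50.0):
    gx = ndimage.sobel(gray.astype(float), axis=1)   # horizontal gradient
    gy = ndimage.sobel(gray.astype(float), axis=0)   # vertical gradient
    magnitude = np.hypot(gx, gy)                     # gradient magnitude
    return magnitude > threshold                     # binary edge map

The edge linking stage that follows detection is considerably more involved and is not
sketched here.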
3.2.3 Active Contour-Based Segmentation
An active contour deforms to fit the object’s shape by minimizing a gradient-dependent
attraction force while at the same time maintaining the smoothness of the contour shape. Thus,
unlike edge detection, active contour methods are much more robust to noise as the
requirements for contour smoothness and contour continuity act as a type of regularization.
Another advantage of this approach is that prior knowledge about the object’s shape can be
built into the contour parameterization process. However, active contour-based algorithms
usually require initialization of the contour close to the object boundary for it to converge
successfully to the true boundary. More importantly, active contour methods have difficulty
handling deeply convoluted boundaries, such as the CSF, GM and WM boundaries, due to their
contour smoothness requirement. Hence, they are often not appropriate for the segmentation
of brain tissues. Nevertheless, it has been applied successfully to the segmentation of
intracranial boundary, brain outer surface and Neuro-anatomic structures in MR brain images.
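A usage sketch with scikit-image's active contour implementation is shown below
(illustrative only; the parameter values follow that library's documented example and are not
tuned for any data in this report). Note how the contour is initialized as a circle close to the
expected boundary, reflecting the initialization requirement discussed above:

import numpy as np
from skimage.filters import gaussian
from skimage.segmentation import active_contour

def fit_snake(gray, center_row, center_col, radius, n_points=200):
    # Initialize the snake as a circle near the object boundary.
    s = np.linspace(0, 2 * np.pi, n_points)
    init = np.column_stack([center_row + radius * np.sin(s),
                            center_col + radius * np.cos(s)])
    # Smoothing regularizes the image force; alpha and beta control the
    # contour's elasticity and rigidity (the smoothness requirement above).
    return active_contour(gaussian(gray, sigma=3), init,
                          alpha=0.015, beta=10.0, gamma=0.001)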
3.2.4 Region-Based Segmentation
The region-based family of techniques fundamentally aims at iteratively building
regions in the image until a certain level of stability is reached. The region growing algorithms
start from well-chosen seeds (usually defined by the user). They then expand the seed regions
by annexing their homogeneous neighbors. The process is iterated until all the pixels in the
image have been classified. The region splitting algorithms use the entire image as a seed and
split it into regions until no more heterogeneity can be found. An algorithm that associates the
advantages of both methods, called the Split, Merge and Group (SMG) algorithm, has been
developed by Horowitz and Pavlidis.
The shape of an object can be described in terms of its boundary or the region it
occupies. Image regions belonging to an object generally have homogeneous characteristics,
e.g. similar in intensity or texture. Region-based segmentation techniques attempt to segment
an image by identifying the various homogeneous regions that correspond to different objects
in an image. Unlike clustering methods, region-based methods explicitly consider spatial
interactions between neighboring voxels. In its simplest form, region growing methods usually
start by locating some seeds representing distinct regions in the image. The seeds are then
grown until they eventually cover the entire image. The region growing process is therefore
governed by a rule that describes the growth mechanism and a rule that checks the homogeneity
of the regions at each growth step. The region growing technique has been applied to MRI
segmentation.
The main goal of segmentation is to partition an image into regions. Some
segmentation methods such as "Thresholding" achieve this goal by looking for the boundaries
between regions based on discontinuities in gray levels or color properties. Region-based
segmentation is a technique for determining the region directly. The basic formulation for
Region-Based Segmentation is:

(a) \bigcup_{i=1}^{n} R_i = R (eq. 3.5)

(b) R_i is a connected region, i = 1, 2, \ldots, n (eq. 3.6)

(c) R_i \cap R_j = \emptyset \;\; \forall i \neq j (eq. 3.7)

(d) P(R_i) = \text{TRUE} for i = 1, 2, \ldots, n (eq. 3.8)

(e) P(R_i \cup R_j) = \text{FALSE} for any adjacent regions R_i and R_j (eq. 3.9)

P(R_i) is a logical predicate defined over the points in set R_i, and \emptyset is the null set.
(a) means that the segmentation must be complete; that is, every pixel must be in a region.
(b) requires that points in a region must be connected in some predefined sense.
(c) indicates that the regions must be disjoint.
(d) deals with the properties that must be satisfied by the pixels in a segmented region; for
example, P(R_i) = TRUE if all pixels in R_i have the same gray level.
(e) indicates that adjacent regions R_i and R_j are different in the sense of predicate P.
A semi-automatic, interactive MRI segmentation algorithm was developed that employs a
simple region growing technique for lesion segmentation. An automatic statistical region
growing algorithm, based on a robust estimation of the local region mean and variance for every
voxel in the image, was proposed for MRI segmentation. Furthermore, relaxation labeling,
region splitting, and constrained region merging were used to improve the quality of the MRI
segmentation. The determination of an appropriate region homogeneity criterion is an
important factor in region growing segmentation methods. However, such a homogeneity
criterion may be difficult to obtain a priori. An adaptive region growing method has been proposed
where the homogeneity criterion is learned automatically from characteristics of the region to
be segmented while searching for the region. Other region-based segmentation techniques,
1. Split-and-merge based segmentation and
2. Watershed based segmentation,
have also been proposed for MRI segmentation.
Split-and-merge based segmentation
In the split-and-merge technique, an image is first split into many small regions during the
splitting stage according to a rule, and then the regions are merged if they are similar enough
to produce the final segmentation.


Watershed-based segmentation
In the watershed-based segmentation, the gradient magnitude image is considered as a
topographic relief where the brightness value of each voxel corresponds to a physical
elevation. An immersion-based approach is used to calculate the watersheds. The operation
can be described by imagining that holes are pierced in each local minimum of the topographic
relief. Then, the surface is slowly immersed in water, which causes a flooding of all the
catchment basins, starting from the basin associated with the global minimum. As soon as two
catchment basins begin to merge, a dam is built. The procedure results in a partitioning of the
image in many catchment basins of which the borders define the watersheds. To reduce over-
segmentation, the image is smoothed by 3D adaptive anisotropic diffusion prior to watershed
operation.

3.3 Region Growing


Region growing is a technique for extracting an image region that is connected based
on some predefined criteria. These criteria can be based on intensity information and/or edges
in the image. In its simplest form, region growing requires a seed point that is manually
selected by an operator and extracts all pixels connected to the initial seed based on some
predefined criteria. For example, one possible criterion might be to grow the region until an
edge in the image is met. Like thresholding, region growing is seldom used alone but usually
within a set of image-processing operations, particularly for the delineation of small, simple
structures such as tumors and lesions.
Region growing is a simple region-based image segmentation method. It is also
classified as a pixel-based image segmentation method since it involves the selection of initial
seed points. This approach to segmentation examines neighboring pixels of initial “seed
points” and determines whether the pixel neighbors should be added to the region. The process
is iterated, in the same manner as general data clustering algorithms. A general discussion
of the region growing algorithm is given below.
Advantages:
1. Region growing methods can correctly separate regions that have the properties we define.
2. Region growing methods can provide good segmentation results for original images that
have clear edges.
3. The concept is simple: we only need a small number of seed points to represent the property
we want, and then grow the region.


Disadvantages:
1. The computation is expensive in both time and power.
2. Noise or variation of intensity may result in holes or over-segmentation.
3. This method may not distinguish the shading of real images.
3.4 Region Based Method
3.4.1 Growth of Regions
The first region growing method was the seeded region growing method. This method
takes a set of seeds as input along with the image. The seeds mark each of the objects to be
segmented. The regions are iteratively grown by comparing all unallocated neighboring pixels
to the regions. The difference between a pixel's intensity value and the region's mean, δ, is
used as a measure of similarity. The pixel with the smallest difference measured this way is
allocated to the respective region. This process continues until all pixels are allocated to a
region. The growth of the regions is carried out from the seeds that were determined as input,
where each one of them contains the following information:
 Position: These are x, y and z coordinates within the image. It is known that this
point belongs to the region of interest.
 Intensity: The voxel intensity is important to determine the range of intensities that
will be included in the region (if the inclusion criterion makes use of this value).
Another input to the algorithm is the three-dimensional image, stored as a cubic matrix.
The algorithm output will be a matrix with the same dimensions as the input image. This
output matrix is initially filled with zeroes in all positions, and the seeds are marked in it
to let the region grow.
3.4.2 Growth Algorithm
An auxiliary FIFO (First In, First Out) structure is used, in which the seeds are initially
placed, and in which the neighbors that belong to the region and remain to be visited are
queued up. In Algorithm 1 it is possible to see the pseudocode of the Voxel Grow algorithm
in detail. The algorithm successively takes elements from the queue. Each of these elements
is one of the volume's voxels that has already been accepted. For each of them we must visit
its neighbors and decide whether each neighbor belongs to the region according to the selection
criterion. To compare neighbors, 6-connectedness is used. One of the most remarkable
aspects of this technique is that it always grows by neighbors, so it maintains connectivity
between the elements that are included within the segmented region.
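As an illustration, the following is a minimal Python sketch of such a FIFO-based growth with
a seed-intensity criterion (the function name, the NumPy volume representation, and the
threshold rule are our assumptions for illustration, not the original Voxel Grow code):

from collections import deque
import numpy as np

def region_grow(volume, seed, threshold):
    # Grow a region from 'seed' over 6-connected voxels whose intensity
    # differs from the seed intensity by at most 'threshold'.
    output = np.zeros(volume.shape, dtype=np.uint8)  # zeroes everywhere, as described above
    seed_intensity = float(volume[seed])
    queue = deque([seed])      # auxiliary FIFO structure holding accepted voxels
    output[seed] = 1           # mark the seed so the region can grow from it
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        x, y, z = queue.popleft()          # take an already accepted voxel
        for dx, dy, dz in offsets:         # visit its 6-connected neighbors
            n = (x + dx, y + dy, z + dz)
            if all(0 <= n[i] < volume.shape[i] for i in range(3)) and not output[n]:
                if abs(float(volume[n]) - seed_intensity) <= threshold:
                    output[n] = 1          # accept the neighbor into the region
                    queue.append(n)
    return output

Because voxels are only ever reached through accepted neighbors, the resulting region is
always connected, exactly as described above.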

3.4.3 Growth Types


Three growth variations are provided to consider if a voxel belongs or not to the region
of interest. The first one considers the variation of voxel intensity in relation to the seed
intensity. The second one considers the local intensity variation in relation to the neighbor
being visited. The last one considers the three-dimensional gradient of the image.
3.4.4 Seed Growth
In this case the seed intensity is always taken as the reference. Each new voxel is added
to the region if the intensity difference between it and the intensity of the seed remains
within a previously determined threshold. This threshold is compared directly with the
intensity difference. This technique yields regions that contain voxels whose intensities
are within a certain range.
3.4.5 Neighbor Growth
Unlike the previous case, this variation considers that a voxel belongs to the region
if the intensity difference with its neighbor remains below the threshold. In this technique,
voxels that have large variations of intensity with their neighbors are excluded.
3.4.6 Disadvantages of Region Growing
The primary disadvantage of region growing is that it requires manual interaction to obtain
the seed point. Thus, for each region that needs to be extracted, a seed must be planted. Split-
and-merge is an algorithm related to region growing, but it does not require a seed point.
Region growing can also be sensitive to noise, causing extracted regions to have holes or
even become disconnected. To help alleviate these problems, a homotopic region-growing
algorithm has been proposed that preserves the topology between an initial region and an
extracted region. Fuzzy analogies to region growing have also been developed.
3.5 Clustering Method
Clustering can be considered the most important unsupervised learning problem,
because no information is provided about the "right answer" for any of the objects. It classifies
a set of observations and finds a reasonable structure in the data set. No a priori
information about the classes is required; that is, neither the number of clusters nor the rules of
assignment into clusters are known. They have to be discovered exclusively from the given
data set, without any reference to a training set. Cluster analysis allows many choices about
the nature of the algorithm for combining groups. There are two basic approaches to
classification, which we call supervised and unsupervised. In the case of unsupervised
classification, or clustering, we do not have labels. If we know the labels of our input data, the
problem is considered supervised; otherwise, it is called unsupervised.


Clustering definition:
Clustering is the grouping of data with similar characteristics. To divide the data into
several groups, the similarity of objects is used; distance functions are used to measure the
similarity of two objects in the data set.
Cluster analysis:
Cluster analysis or clustering is the assignment of a set of observations into subsets
(called clusters) so that observations in the same cluster are similar in some sense. Clustering
is a method of unsupervised learning, and a common technique for statistical data analysis
used in many fields, including machine learning, data mining, pattern recognition, image
analysis and bioinformatics. Besides the term clustering, there are a number of terms with
similar meanings, including automatic classification, numerical taxonomy, botryology and
typological analysis.
3.6 Types of Clustering
Hierarchical algorithms find successive clusters using previously established clusters.
These algorithms can be either agglomerative ("bottom-up") or divisive ("top-down").
Agglomerative algorithms begin with each element as a separate cluster and merge them into
successively larger clusters. Divisive algorithms begin with the whole set and proceed to
divide it into successively smaller clusters. Partitional algorithms typically determine all
clusters at once, but can also be used as divisive algorithms in the hierarchical clustering.
Density-based clustering algorithms are devised to discover arbitrary-shaped clusters. In this
approach, a cluster is regarded as a region in which the density of data objects exceeds a
threshold. DBSCAN and OPTICS are two typical algorithms of this kind.
Two-way clustering, co-clustering or biclustering are clustering methods in which not
only the objects are clustered but also the features of the objects; i.e., if the data is represented
in a data matrix, the rows and columns are clustered simultaneously. Many clustering
algorithms require specification of the number of clusters to produce in the input data set, prior
to execution of the algorithm. Barring knowledge of the proper value beforehand, the
appropriate value must be determined, a problem for which a number of techniques have been
developed.
Distance measure
An important step in any clustering is to select a distance measure, which will
determine how the similarity of two elements is calculated. This will influence the shape of

the clusters, as some elements may be close to one another according to one distance and
farther away according to another.
For example, in a 2-dimensional space, the distance between the point (x = 1, y = 0)
and the origin (x = 0, y = 0) is always 1 according to the usual norms, but the distance between
the point (x = 1, y = 1) and the origin can be 2, √2 or 1 if you take respectively the 1-norm, 2-
norm or infinity-norm distance.
Common distance functions
 The Euclidean distance (also called distance as the crow flies or 2-norm distance). A
review of cluster analysis in health psychology research found that the most common
distance measure in published studies in that research area is the Euclidean distance or
the squared Euclidean distance.
 The Manhattan distance (aka taxicab norm or 1-norm)
 The maximum norm (aka infinity norm)
 The Mahalanobis distance corrects data for different scales and correlations in the
variables.
 The angle between two vectors can be used as a distance measure when clustering high
dimensional data. See Inner product space.
 The Hamming distance measures the minimum number of substitutions required to
change one member into another.
Another important distinction is whether the clustering uses symmetric or asymmetric
distances. Many of the distance functions listed above have the property that distances are
symmetric (the distance from object A to B is the same as the distance from B to A). In other
applications this is not the case. (A true metric gives symmetric measures of distance.)
3.6.1 Hierarchical Clustering
Hierarchical clustering creates a hierarchy of clusters which may be represented in a
tree structure called a dendrogram. The root of the tree consists of a single cluster containing
all observations, and the leaves correspond to individual observations.
Algorithms for hierarchical clustering are generally either agglomerative, in which one
starts at the leaves and successively merges clusters together; or divisive, in which one starts
at the root and recursively splits the clusters. The choice of which clusters to merge or split is
determined by a linkage criterion, which is a function of the pairwise distances between
observations. Cutting the tree at a given height will give a clustering at a selected precision.
In the following example, cutting after the second row will yield clusters {a} {b c} {d e} {f}.

Cutting after the third row will yield clusters {a} {b c} {d e f}, which is a coarser clustering,
with a smaller number of larger clusters.
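As a hedged illustration, the dendrogram cutting described above can be reproduced with
SciPy (the six 2-D points below are our own toy data, not from the text):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Six illustrative 2-D points standing in for the elements a..f.
points = np.array([[0.0, 0.0], [1.0, 0.1], [1.1, 0.2],
                   [4.0, 4.0], [4.1, 4.2], [8.0, 0.0]])

# Agglomerative clustering with single linkage and Euclidean distance.
Z = linkage(points, method='single', metric='euclidean')

# "Cutting the tree" at two heights gives a finer or coarser clustering.
fine = fcluster(Z, t=1.0, criterion='distance')
coarse = fcluster(Z, t=3.0, criterion='distance')
print(fine, coarse)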
Agglomerative hierarchical clustering
For example, suppose this data is to be clustered, and the Euclidean distance is the distance
metric.

Fig 3.2 Raw data

The hierarchical clustering dendrogram would be as such:

Fig 3.3 Traditional representation


This method builds the hierarchy from the individual elements by progressively
merging clusters. In our example, we have six elements {a} {b} {c} {d} {e} and {f}. The first
step is to determine which elements to merge in a cluster. Usually, we want to take the two

closest elements, according to the chosen distance.


Optionally, one can also construct a distance matrix at this stage, where the number in
the i-th row j-th column is the distance between the i-th and j-th elements. Then, as clustering
progresses, rows and columns are merged as the clusters are merged and the distances updated.
This is a common way to implement this type of clustering, and has the benefit of caching
distances between clusters. A simple agglomerative clustering algorithm is described in the
single-linkage clustering page; it can easily be adapted to different types of linkage.
Suppose we have merged the two closest elements b and c, we now have the following
clusters {a}, {b, c}, {d}, {e} and {f}, and want to merge them further. To do that, we need to
take the distance between {a} and {b c}, and therefore define the distance between two
clusters. Usually the distance between two clusters and is one of the following:
 The maximum distance between elements of each cluster (also called complete linkage
clustering):
 The minimum distance between elements of each cluster (also called single-linkage
clustering):
(eq. 3.10)

 The mean distance between elements of each cluster (also called average linkage
clustering, used e.g. in UPGMA):

(eq. 3.11)

(eq. 3.12)

 The sum of all intra-cluster variance. The increase in variance for the cluster being
merged (Ward's criterion).
 Each agglomeration occurs at a greater distance between clusters than the previous
agglomeration, and one can decide to stop clustering either when the clusters are too
far apart to be merged (distance criterion) or when there is a sufficiently small number
of clusters (number criterion).
3.6.2 K-means Clustering
The k-means algorithm assigns each point to the cluster whose center (also called
centroid) is nearest. The center is the average of all the points in the cluster — that is, its

coordinates are the arithmetic mean for each dimension separately over all the points in the
cluster.
Example: The data set has three dimensions and the cluster has two points: X = (x1, x2, x3)
and Y = (y1, y2, y3). Then the centroid Z becomes Z = (z1, z2, z3), where z1 = (x1 + y1)/2
and z2 = (x2 + y2)/2 and z3 = (x3 + y3)/2.
The algorithm steps are
 Choose the number of clusters, k.
 Randomly generate k clusters and determine the cluster centers, or directly generate k
random points as cluster centers.
 Assign each point to the nearest cluster center.
 Re compute the new cluster centers.
 Repeat the two previous steps until some convergence criterion is met
The main advantages of this algorithm are its simplicity and speed which allows it to
run on large datasets. Its disadvantage is that it does not yield the same result with each
run, since the resulting clusters depend on the initial random assignments. It minimizes
intra-cluster variance, but does not ensure that the result has a global minimum of variance.
Other popular variants of K-means include the Fast Genetic K-means Algorithm (FGKA)
and the Incremental Genetic K-means Algorithm (IGKA).
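A minimal NumPy sketch of these steps (the function name and initialization scheme are ours;
empty clusters are not handled):

import numpy as np

def kmeans(points, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Directly generate k random points from the data as cluster centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to the nearest cluster center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Re-compute each center as the mean of the points assigned to it.
        new_centers = np.array([points[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break  # convergence criterion: no center moved
        centers = new_centers
    return labels, centers

Running the sketch twice with different seeds can return different clusterings, which
illustrates the dependence on the initial random assignments noted above.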
3.6.3 Fuzzy C-means Clustering
In fuzzy clustering, each point has a degree of belonging to clusters, as in fuzzy logic,
rather than belonging completely to just one cluster. Thus, points on the edge of a cluster may
belong to the cluster to a lesser degree than points in the center of the cluster. For each point x
we have a coefficient giving the degree of its membership in the k-th cluster, uk(x). Usually
those coefficients are normalized so that, for every x, they sum to one:

∑ (k = 1 to c) uk(x) = 1 (eq. 3.13)

With fuzzy c-means, the centroid of a cluster is the mean of all points, weighted by their degree
of belonging to the cluster:

center_k = ∑x uk(x)^m · x / ∑x uk(x)^m (eq. 3.14)

The degree of belonging is related to the inverse of the distance to the cluster center:

uk(x) = 1 / d(center_k, x) (eq. 3.15)

Then the coefficients are normalized and fuzzified with a real parameter m > 1 so that their
sum is 1:

uk(x) = 1 / ∑j ( d(center_k, x) / d(center_j, x) )^(2/(m−1)) (eq. 3.16)

For m equal to 2, this is equivalent to normalizing the coefficients linearly to make their sum
1. When m is close to 1, the cluster center closest to the point is given much more weight
than the others, and the algorithm becomes similar to k-means.
The fuzzy c-means algorithm is very similar to the k-means algorithm:
 Choose a number of clusters.
 Assign randomly to each point coefficients for being in the clusters.
 Repeat until the algorithm has converged (that is, the coefficients' change between two
iterations is no more than ε, the given sensitivity threshold): compute the centroid for each
cluster using the formula above; then, for each point, compute its coefficients of being in
the clusters, using the formula above.
The algorithm minimizes intra-cluster variance as well, but it has the same problems as k-means:
the minimum is a local minimum, and the results depend on the initial choice of weights.
The expectation-maximization algorithm is a more statistically formalized method which
includes some of these ideas: partial membership in classes. It has better convergence
properties and is in general preferred to fuzzy c-means.
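A sketch of one fuzzy c-means iteration under the formulas reconstructed above (names are
ours; a small constant avoids division by zero):

import numpy as np

def fuzzy_cmeans_step(points, centers, m=2.0):
    # Distances d(center_k, x) for every point/center pair.
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2) + 1e-12
    # Memberships u_k(x) = 1 / sum_j (d_ik / d_ij)^(2/(m-1))  (eq. 3.16)
    u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
    # Centroids weighted by the fuzzified memberships (eq. 3.14)
    um = u ** m
    centers = (um.T @ points) / um.sum(axis=0)[:, None]
    return u, centers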
3.6.4 QT Clustering Algorithm
QT (quality threshold) clustering is an alternative method of partitioning data, invented
for gene clustering. It requires more computing power than k-means, but does not require
specifying the number of clusters a priori, and always returns the same result when run several
times. The algorithm is:
 The user chooses a maximum diameter for clusters.
 Build a candidate cluster for each point by including the closest point, the next closest,
and so on, until the diameter of the cluster surpasses the threshold.
 Save the candidate cluster with the most points as the first true cluster, and remove all
points in the cluster from further consideration (if more than one candidate cluster has the
maximum number of points, a tie-breaking rule is needed).
The distance between a point and a group of points is computed using complete
linkage, i.e., as the maximum distance from the point to any member of the group.

Locality-sensitive hashing:
Locality-sensitive hashing can be used for clustering. Feature space vectors are sets,
and the metric used is the Jaccard distance. The feature space can be considered high-
dimensional. The min-wise independent permutations LSH scheme (sometimes MinHash) is
then used to put similar items into buckets. With just one set of hashing methods, there are
only clusters of very similar elements. By seeding the hash functions several times, it is
possible to get bigger clusters.
Graph-theoretic methods:
Formal concept analysis is a technique for generating clusters of objects and attributes,
given a bipartite graph representing the relations between the objects and attributes. Other
methods for generating overlapping cluster are discussed by Jardine and Sibson (1968) and
Cole and Wishart (1970).
3.6.5 Spectral Clustering
Given a set of data points A, the similarity matrix may be defined as a matrix S in which
Sij represents a measure of the similarity between points i and j. Spectral clustering
techniques make use of the spectrum of the similarity matrix of the data to perform
dimensionality reduction for clustering in fewer dimensions. One such technique is the Shi-
Malik algorithm, commonly used for image segmentation. It partitions points into two sets
(S1, S2) based on the eigenvector v corresponding to the second-smallest eigenvalue of the
Laplacian matrix L = D − S of S, where D is the diagonal degree matrix with Dii = ∑j Sij.
This partitioning may be done in various ways, such as by taking the median m of the
components in v, and placing all points whose component in v is greater than m in S1, and the
rest in S2. The algorithm can be used for hierarchical clustering by repeatedly partitioning the
subsets in this fashion. A related algorithm is the Meila-Shi algorithm, which takes the
eigenvectors corresponding to the k largest eigenvalues of the matrix P = SD⁻¹ for some k,
and then invokes another clustering algorithm to cluster the points by their respective k
components in these eigenvectors.
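A simplified sketch of the median split described above, using the unnormalized Laplacian
L = D − S (the Shi-Malik normalized-cut formulation solves a generalized eigenproblem
instead; this simplification and the names are our assumptions for illustration):

import numpy as np

def fiedler_split(S):
    D = np.diag(S.sum(axis=1))          # degree matrix, D_ii = sum_j S_ij
    L = D - S                           # graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)
    v = eigvecs[:, 1]                   # eigenvector of the second-smallest eigenvalue
    m = np.median(v)                    # split at the median of its components
    return v > m                        # True -> S1, False -> S2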
3.7 Comparisons Between Data Clustering
There have been several suggestions for a measure of similarity between two
clusterings. Such a measure can be used to compare how well different data clustering
algorithms perform on a set of data. Many of these measures are derived from the matching
matrix, e.g., the Rand measure, the Adjusted Rand Index and the Fowlkes-Mallows Bk measures.
With small variations, such kind of measures also include set matching based clustering

criteria, e.g., the measure of Meila and Heckerman.

Several different clustering systems based on mutual information have been proposed.
One is Marina Meila's 'Variation of Information' metric; another provides hierarchical
clustering. The adjusted-for-chance version of the mutual information is the Adjusted Mutual
Information (AMI), which corrects mutual information for agreement solely due to chance
between clusterings.
Clustering algorithms essentially perform the same function as classifier methods
without the use of training data. Thus, they are termed unsupervised methods. To compensate
for the lack of training data, clustering methods iteratively alternate between segmenting the
image and characterizing the properties of each class. In a sense, clustering methods train
themselves, using the available data. Three commonly used clustering algorithms are the k-
means or ISODATA algorithm, the fuzzy c-means algorithm, and the expectation-
maximization (EM) algorithm. The K-means clustering algorithm clusters data by iteratively
computing a mean intensity for each class and segmenting the image by classifying each pixel
into the class with the closest mean. For example, when the K-means algorithm is applied to a
slice of an MR brain image, the number of classes can be assumed to be three, representing
cerebrospinal fluid, gray matter, and white matter. The fuzzy c-means
algorithm generalizes the K-means algorithm, allowing for soft segmentations based on fuzzy
set theory. The EM algorithm applies the same clustering principles with the underlying
assumption that the data follow a Gaussian mixture model.
3.7.1 K-Means Algorithm
The K-means algorithm is an iterative technique that is used to partition an image into
K clusters. The basic algorithm is:
1. Pick K cluster centers, either randomly or based on some heuristic
2. Assign each pixel in the image to the cluster that minimizes the variance between the
pixel and the cluster center
3. Re-compute the cluster centers by averaging all of the pixels in the cluster
4. Repeat steps 2 and 3 until convergence is attained (e.g. no pixels change clusters)
In this case, variance is the squared or absolute difference between a pixel and a cluster
center. The difference is typically based on pixel color, intensity, texture, and location, or a
weighted combination of these factors. K can be selected manually, or by a heuristic.
This algorithm is guaranteed to converge, but it may not return the optimal solution.
The quality of the solution depends on the initial set of clusters and the value of K. The detailed

description of clustering methods for images is given in the literature. Another approach to
partitioning an image into K clusters is the statistical hierarchical agglomerative clustering
technique for identification of image regions by color similarity. This method uses a
binary mask and ranks the color components of the clusters' central components. The basic
algorithm is:
1. Each pixel is a separate cluster.
2. Clusters with the same masks are joined into new clusters.
3.7.2 Fuzzy C-Means
The fuzzy c-means algorithm, like the k-means algorithm, aims to minimize an objective
function. The fuzzy c-means algorithm is more flexible than the k-means algorithm: in the
k-means algorithm, the feature vectors of the data set are partitioned into hard clusters, and
each feature vector can be a member of exactly one cluster. The fuzzy c-means algorithm
relaxes this condition and allows a feature vector to have multiple membership grades in
multiple clusters. Suppose a data set with known clusters contains a data point which is close
to both clusters and equidistant to them. Fuzzy clustering gracefully copes with such dilemmas
by assigning this data point equal but partial memberships to both clusters; that is, the point
belongs to both clusters with some degree of membership, varying from 0 to 1.
Example:
Suppose we have the data in Table 3.1. We choose k = 2, where k is the number of
clusters, and we can use either the crisp clustering method or the fuzzy clustering method
to form the two clusters. In crisp clustering, each object belongs to exactly one cluster; in
fuzzy clustering, an object can belong to both clusters with different degrees of membership.
Table 3.1 Example Data

3.7.3 EM Algorithm
Expectation Maximization (EM) is one of the most common algorithms used for
density estimation of data points in an unsupervised setting. The algorithm relies on finding
the maximum likelihood estimates of parameters when the data model depends on certain
latent variables. In EM, alternating steps of Expectation (E) and Maximization (M) are
performed iteratively till the results converge. The E step computes an expectation of the
likelihood by including the latent variables as if they were observed, and a maximization (M)
step, which computes the maximum likelihood estimates of the parameters by maximizing the
expected likelihood found on the last E step. The parameters found on the M step are then used
to begin another E step, and the process is repeated until convergence. Mathematically, for a
given training dataset {x(1), x(2), ..., x(m)} and model p(x, z), where z is the latent variable, we
have the log-likelihood

ℓ(θ) = ∑ (i = 1 to m) log p(x(i); θ) (eq. 3.17)

ℓ(θ) = ∑ (i = 1 to m) log ∑ (z(i)) p(x(i), z(i); θ) (eq. 3.18)
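As an illustration of the alternating E and M steps, a sketch of EM for a two-component 1-D
Gaussian mixture (the initialization and the names are our assumptions):

import numpy as np
from scipy.stats import norm

def em_gmm_1d(x, iters=50):
    w = np.array([0.5, 0.5])              # mixture weights
    mu = np.array([x.min(), x.max()])     # crude initial means
    sigma = np.array([x.std(), x.std()])  # initial standard deviations
    for _ in range(iters):
        # E step: responsibilities r[i, k] = P(z = k | x_i) under current parameters.
        dens = w * norm.pdf(x[:, None], mu, sigma)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M step: maximum-likelihood updates given the responsibilities.
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return w, mu, sigma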

3.8 Convolutional Neural Networks


We introduced the concept of fully-connected layers. Image processing often deals
with high dimensional input data. If only fully-connected layers are used, the number of
parameters grows rapidly and becomes extremely high, making optimization very
computationally intensive or even impossible. For this reason, new layer types were
introduced to reduce the number of parameters. In image processing, we can reduce the
number of parameters by incorporating prior knowledge about the strong local correlation of
neighboring pixels into the layer structure. Hence, neurons are only connected to a small local
area of the input and are no longer connected to all other input neurons. As a result,
convolutional layers (see Section 3.9) were introduced and replaced most fully-connected
layers in neural networks [LeCun et al., 1989]; hence the name convolutional neural network.
In the following section, we briefly explain how convolutional neural networks extract
information from an image and also discuss the different layer types.
In a convolutional neural network, information is extracted hierarchically [Zeiler et al.,
2014]. The first layers extract simple features such as edges or color blobs. Deeper layers
extract feature combinations from previous layers based on the linear combination of
previously extracted features. In the final convolutional layers, high-level features are
extracted from the image. Figure 3.1 presents a hierarchical feature extraction. The top row
illustrates a convolutional neural network with multiple layers. Each layer extracts some low-
level features, which are shown underneath.

Fig 3.4: Hierarchical feature extraction of a convolutional neural network. The bottom rows
show features extracted from the ImageNet dataset. The top row illustrates the layers of a CNN.
For example, the first layers extract color blobs and edges, while the middle layers extract
combinations such as circles. Thereafter, certain objects are extracted that are hopefully
linearly separable by a classifier (i.e., the final fully-connected layer).
3.9 Convolutional Layer
The convolutional layer is motivated by the fact that, in an image, the information of each
pixel has a strong local correlation to neighboring pixels (e.g., edges are an important feature
formed by local correlations). Since features can be present in several areas of an image, a
filter needs to slide over the complete input data to extract them. The local correlations are
utilized by convolving a small filter K with the input data. The filter often has a symmetric
kernel size of k × k. Although the layer is called a convolutional layer, the cross-correlation is
typically calculated because this helps to omit kernel flipping. For a two-dimensional input
matrix I and filter K, the two-dimensional cross-correlation is calculated as follows:

(I ⊛ K)(i, j) = ∑ (m = −h to h) ∑ (n = −h to h) I(i + m, j + n) · K(m, n) (eq. 3.19)

Notably, we calculate a valid cross-correlation. This means that the calculation area is
constrained to pixels (i, j) where the filter K ∈ ℝ^(k×k) lies fully within the input matrix
I ∈ ℝ^(p×q). Let h = ⌊k/2⌋, where ⌊·⌋ denotes integer division. Thus, we can define the
calculation area with i ∈ {h, h + 1, ..., p − h} and j ∈ {h, h + 1, ..., q − h}. The parameters
of the filters are learned during training of the neural network.
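A direct NumPy sketch of the valid cross-correlation in eq. 3.19 (the function name and
0-based corner indexing are our conventions):

import numpy as np

def valid_cross_correlation(I, K):
    # The k x k filter stays fully inside the p x q input, so the output
    # has size (p - k + 1) x (q - k + 1); no kernel flipping is performed.
    p, q = I.shape
    k = K.shape[0]
    out = np.zeros((p - k + 1, q - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(I[i:i + k, j:j + k] * K)
    return out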

In the following section, we explain a so-called two-dimensional convolutional layer
and provide an illustration of this layer in Figure 3.5. The feature map F_l ∈ ℝ^(w_l × h_l × d_l)
is the output of the l-th convolutional layer with width w_l, height h_l, and depth d_l. While the
width w_l and height h_l depend on the size of the input map F_(l−1), the depth d_l is the number
of filters a convolutional layer can learn during optimization. Moreover, the depth d_l is a
hyperparameter that is often defined before training. Let v and a be the run indexes over the
depths d_l and d_(l−1), respectively. Thus, we can extend equation (3.19) to the three-
dimensional case:

F_l(i, j, v) = ∑ (a = 1 to d_(l−1)) ∑m ∑n F_(l−1)(i + m, j + n, a) · K_v(m, n, a) (eq. 3.20)

Fig 3.5: Illustration of a convolutional layer with stride 𝒔𝒍 = 𝟐 and padding 𝒑𝒍 = 𝟏.


The color emphasizes the difference between each tensor. In comparison to the fully-
connected layers, it is easier to consider the neurons as structured in a matrix rather than as
a vector. The total number of neurons N_l in a convolutional layer equals the size of the
feature map; therefore, N_l = w_l · h_l · d_l. Two key components are required to realize the
convolution in a neural network and reduce the number of parameters: the local receptive field
and weight sharing.
Local receptive field:
Each neuron of the l-th convolutional layer is only connected to a local area R_(i,j) in the
(l−1)-th layer with the size k_l × k_l × d_(l−1), where d_(l−1) is the depth of the input layer to
the convolutional layer. This local area or local receptive field describes the size of the region
in the input that contributed to the feature calculation. As such, each local receptive field can
learn its own filter K_(i,j) with the same size as R_(i,j). The displacement of each local
receptive field in a convolutional layer is defined by the stride s_l ∈ ℕ*. Without weight
sharing (which is explained next), each of the N_l neurons would have k_l · k_l · d_(l−1) + 1
parameters, while the convolutional layer would have N_l(k_l · k_l · d_(l−1) + 1) parameters in
total. Notably, one parameter is added due to the bias b of each neuron.
Weight sharing:
Since the same feature can appear at multiple locations, the concept of weight sharing
was proposed. This makes it unnecessary to learn the same feature extractor multiple times
and reduces the parameters significantly. Weight sharing implies that all neurons belonging to
the same slice v have the same filter K_v. Therefore, the depth d_l controls how many filters
can be learned. This reduces the total parameters of the convolutional layer by the factor
w_l · h_l; hence, the layer only has d_l(k_l · k_l · d_(l−1) + 1) parameters. In Figure 3.6, we
provide a simple example of a convolutional layer with stride s_l = 2 and kernel size k_l = 2.
To calculate the final results, we use the cross-correlation in Equation 3.19 and add the bias
b_l. For example, in the top row of Figure 3.6, we calculate the result for the first cell as
follows:

F_l(1, 1) = ∑ (m, n = 1 to 2) F_(l−1)(m, n) · K_l(m, n) + b_l (eq. 3.21)

First, the filter K_l (size: 2 × 2) is applied to the top left area of the (l−1)-th layer (i.e., the
light red area). Thereafter, the bias b_l is added and the result is the top left pixel of the l-th
layer (i.e., the light red pixel). Then, the filter is shifted by the stride s_l to the right and the
same calculation is performed again. This calculation is shown as the light green area and pixels.
The local receptive field must be fully connected to the input. Thus, the size of the feature
map F_l can be calculated by:

w_l = (w_(l−1) − k_l) / s_l + 1 (eq. 3.22)

h_l = (h_(l−1) − k_l) / s_l + 1 (eq. 3.23)

This would always reduce the size of the input tensor by at least k_l − 1. Therefore, padding
was introduced. Padding artificially increases the size of the (l−1)-th layer by adding a border
around the input tensor. The size of the border is defined by p_l ∈ ℕ, and the added border
typically contains only zeros. Hence, padding is also known as zero-padding. In Figure 3.7,
we illustrate zero-padding with padding p_l = 1 and stride s_l = 2 for an example matrix. The
width and height are then calculated as follows:

w_l = (w_(l−1) + 2p_l − k_l) / s_l + 1, h_l = (h_(l−1) + 2p_l − k_l) / s_l + 1 (eq. 3.24)
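For quick reference, a small helper of ours that evaluates eq. 3.24:

def conv_output_size(w_in, k, s, p):
    # w_out = (w_in + 2p - k) / s + 1, using integer division.
    return (w_in + 2 * p - k) // s + 1

# Example: a 512-pixel-wide input with a 3x3 kernel, stride 1 and padding 1
# keeps its width: conv_output_size(512, 3, 1, 1) == 512.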
Fig 3.6: Example of a valid cross-correlation calculation with stride sl = 2 and without
zero-padding. Only the first two steps are shown.

Fig 3.7: Example of a valid cross-correlation calculation with zero-padding 𝒑𝒍 = 𝟏 and


stride 𝒔𝒍 = 𝟐 . The zeros with a light green background are added because of the zero-
padding.
3.10 Pooling Layer


The pooling layers are used to reduce the spatial dimensions and are defined by three
aspects: a specific operation applied to the filter area, the filter size 𝑘 × 𝑘 , and the stride 𝑠 .
The most common operations are maximum and average pooling. While maximum pooling
[Zhou et al., 1988] (max-pooling) calculates the maximum of the filter area, average pooling
calculates the average of the filter area. Average pooling is often used as the last layer to reduce
the spatial dimensions before the fully-connected layer is employed. Usually, only the width
and height dimensions are reduced, not the depth of the input tensor. An illustration of
max-pooling with filter size k_l = 2 and stride s_l = 2 is shown in Figure 3.8.

Fig 3.8: Illustration of a pooling layer example. The input layer (size: 4× 4× 𝟏) is max-
pooled with filter size 𝒌𝒍 = 𝟐 and stride 𝒔𝒍 = 𝟐 into an output layer of size 2×2×1.
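A NumPy sketch of this max-pooling operation (the function name is ours):

import numpy as np

def max_pool2d(x, k=2, s=2):
    # Max over k x k windows with stride s, applied per channel;
    # width and height shrink, but the depth stays unchanged.
    h, w, d = x.shape
    out = np.zeros(((h - k) // s + 1, (w - k) // s + 1, d))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i * s:i * s + k, j * s:j * s + k].max(axis=(0, 1))
    return out

# A 4x4x1 input with k = s = 2 yields a 2x2x1 output, as in Fig 3.8.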
Pooling layers help a model become invariant for small translations of the input;
however, the spatial meaning of a pixel is lost [Goodfellow et al., 2016]. In this context,
invariant means that most output values of the pooling layer do not change if the input is
shifted (i.e., translated) by a small amount. In the past, pooling layers were integrated into
neural networks many times because they are an efficient way to reduce the total parameters.
This acts as a regularization method and can counter overfitting on small datasets [Krizhevsky
et al., 2012]. Due to increased computing power and data availability, Springenberg et al.
[2014] suggest that pooling layers should be replaced by convolutional layers or omitted. For
example, the convolutional neural networks in our experiments only contain two or three
pooling layers.

Without this regularization method to counteract overfitting, other methods such as


batch normalization [Ioffe et al., 2015], dropout [Srivastava et al., 2014], data augmentation
[Krizhevsky et al., 2012], and weight decay [Krogh et al., 1992] are currently used (and often
required for a small dataset). Good overviews and explanations of common regularization
methods can be found in [Kukačka et al., 2017] and [Goodfellow et al., 2016].
3.11 Batch Normalization
Batch normalization counters several problems that arise when training deep neural
networks. First, it accelerates the training processes by a substantial margin due to improved
convergence properties [Ioffe et al., 2015]. Secondly, it allows higher learning rates and a less
careful weight initialization [Ioffe et al., 2015]. Thirdly, it can act as a regularization and
reduce the need for dropout [Goodfellow et al., 2016]. Currently, several normalization
methods are available [Ba et al., 2016; Miyato et al., 2018; Salimans et al., 2016; Ulyanov et
al., 2016; Wu et al., 2018]. Since we use batch normalization in this work, it is explained in
this section. Ioffe et al. [2015] identified internal covariate shift as a problem for slow
convergence. In neural networks, the inputs of internal layers are affected by the parameters
of all previous layers. A small adjustment to parameters in the beginning becomes amplified
as the networks become deeper. If the parameters change due to training, the distribution of
the layer input also changes. The layer must be adapted and coordinated to this change, which
is known as an internal covariate shift. An internal covariate shift can be reduced by
normalizing the activations of a layer to have zero mean and unit variance.
Consider a layer with a d-dimensional activation vector s = (s_1, ..., s_d) and a mini-batch size
of m being used for gradient descent. In this case, each input has d activations. We can arrange
this in the activation matrix S ∈ ℝ^(m×d), where the rows represent the samples of the mini-
batch and the columns are the corresponding activations s_j. The values of S are normalized by
column (i.e., the d dimensions are treated independently) as follows:

ŝ_j = (s_j − μ_j) / σ_j (eq. 3.25)

where μ_j and σ_j are the mean and standard deviation of each column, respectively. Mean and
variance are computed over the mini-batch by

μ_j = (1/m) ∑ (i = 1 to m) S_ij (eq. 3.26)

σ_j² = (1/m) ∑ (i = 1 to m) (S_ij − μ_j)² (eq. 3.27)

A simple normalization can reduce the representation power of a neural network. For example,
a normalized input to a sigmoid nonlinearity would constrain the function to its linear area.
Therefore, two additional parameters are used to apply a linear transformation:

y_j = γ_j · ŝ_j + β_j (eq. 3.28)

where γ_j and β_j are parameters of the neural network that are optimized during gradient
descent. This allows the neural network to restore the original activation by driving γ_j to σ_j
and β_j to μ_j.
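A forward-pass sketch of these equations for one mini-batch (inference-time running statistics
are omitted; the names are ours):

import numpy as np

def batch_norm_forward(S, gamma, beta, eps=1e-5):
    # S is the m x d activation matrix of the mini-batch.
    mu = S.mean(axis=0)                     # per-activation mean (eq. 3.26)
    var = S.var(axis=0)                     # per-activation variance (eq. 3.27)
    S_hat = (S - mu) / np.sqrt(var + eps)   # normalization (eq. 3.25)
    return gamma * S_hat + beta             # learned linear transform (eq. 3.28)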

3.12 Residual Connections


The findings of Eldan et al. [2016] show that deeper neural networks are desirable since
they can better approximate functions. Based on the findings of He et al. [2015a], it can be
argued that when compared to a shallow network, a deeper network should have the same or
better error for the same test set. However, naive stacking of layers (i.e., adding more layers
to a neural network) does not usually help the optimization method find a solution with a lower
error. Therefore, residual connections (also known as skip connections) were proposed by He
et al. [2015a] to deal with this problem. At the time of writing, deep neural networks with
residual connections represent state-of-the-art networks for many tasks.

Fig 3.9: Illustration of a residual connection, which is the shortcut from x to the sum
F(x) + x (i.e., the identity connection).

He et al. [2015a] concluded that the optimizer often faces difficulties in finding a
favorable solution with a small error for deep neural networks. As a result, He et al. [2015a]
introduced residual connections to ease the optimization process for very deep neural
networks. Figure 3.9 illustrates the basic concept of a residual connection. A residual
connection is often implemented in deep neural networks by adding connections that act as a
shortcut over one or more stacked layers and forward the identity x to the output of the stacked
layers. Let H(x) be the desired mapping. Instead of driving F(x) to H(x), we can reformulate
the problem so that F(x) := H(x) − x fits the residual mapping. Thus, the desired mapping H(x)
is F(x) + x. This is realized by the shortcut connection (as seen in Figure 3.9) and is
motivated by the fact that it might be more difficult for deeper layers to learn an identity
mapping than to drive F(x) to zero [He et al., 2016].
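A minimal tf.keras sketch of such a block (the layer sizes are illustrative; the input depth
must equal 'filters' for the identity shortcut to be addable):

import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    shortcut = x                                      # the identity x
    y = layers.Conv2D(filters, 3, padding='same')(x)  # first layer of F(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding='same')(y)  # second layer of F(x)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([y, shortcut])                   # F(x) + x via the shortcut
    return layers.ReLU()(y)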
A bottleneck architecture was also proposed to reduce computation complexity in
terms of floating-point operations (FLOPs) since complexity does not scale well by adding
more layers to a neural network. For example, training a 200-layer ResNet with bottleneck
architecture on ImageNet takes approximately three weeks on eight graphics processing units
(GPUs) and would not otherwise be possible [He et al., 2016]. In a bottleneck architecture, a
block of two convolutional layers is replaced with three convolutional layers. While this may
seem counterintuitive at first due to the additional convolutional layer, it has a major impact
on computational complexity. The convolutional layers perform the following three steps.

Figure 3.10: Comparison of the standard residual connection design with the
bottleneck design.

The bottleneck design handles an input with a four times greater depth dimension than the
standard design, yet the time complexity is the same for both designs. First, a convolutional
layer with a filter size of 1 × 1 × d_(l−1) is employed to reduce the depth dimension of the
input [Lin et al., 2013]. As explained in Section 3.9, the convolutional layer can reduce the
depth dimension d_(l−1) of the input map F_(l−1) to d_l by having only d_l filters. This is
illustrated in Figure 3.10, where the input map with d_(l−1) = 256 is reduced to d_l = 64.
Secondly, the time-consuming 3 × 3 × d_l convolution is only calculated on the reduced
dimension d_l. Finally, the last convolutional layer restores the depth dimension d_(l−1) by
again performing a 1 × 1 convolution. The depth dimension is restored via the same method
used to reduce the depth dimension in the first layer; however, the number of filters is now
greater than the input depth.
For the example in Figure 3.10, the numbers of parameters are 73,728 and 69,632 for the
standard and bottleneck designs, respectively. While both have similar complexity in terms of
FLOPs, the bottleneck design operates on an input that has a four times greater depth
dimension.

CHAPTER 4
PROPOSED METHODOLOGY
4.1 Introduction
Deforestation, whether legal or not, should be monitored by the authorities, and it can
be difficult to grasp all possible forest cuttings using just the human eye; furthermore, the
process of counting the area of deforestation by hand from unfiltered satellite imagery is
another problem for a human. Regions of illegal deforestation might remain unplanted for a
long time and reduce the amount of usable trees. Attempts to solve the aforesaid problems
were made in the past [1], [2]. Image thresholding [3] and morphological image
transformations [4] were used [1], which were adequate for regions of deforestation visible
and understandable to the human eye, for example, cuttings which were created a few days
ago and left the terrain without any vegetation; but such cuttings are just a small percentage
of all of them. Most of the areas of interest have some amount of vegetation and probably
even some newly planted trees, so these areas cannot be determined by such methods. Also,
satellite imagery taken in the ultraviolet or infrared spectra carries more information about
the amount of vegetation, and deforestation tasks have been solved using the Normalized
Difference Vegetation Index (NDVI) [5], [6].
This is an acceptable solution, but the use of open-source satellite imagery with low
resolution is not an option in our case, because the area of the forestry sections can be much
smaller than the resolution of this data. Therefore, the dataset was created in the visible range
of the electromagnetic spectrum. The dataset consists of 322 images of 512x512 pixels and is
saved in the tfrecord format [7]. The tfrecord format was used to increase compatibility with
Google Cloud Services. There are, however, a number of obstacles that need to be overcome;
for example, the model predictions are more accurate in the regions which were the most
numerous in the dataset distribution. The fact that the dataset for model training was created
from data on Ukrainian forestries means that most of the forest is located in steppe areas, and
the terrain was not an important factor for model training. But some areas with special
conditions, such as forests located on mountain slopes, could be completely misclassified due
to the shadow of the mountain, and predictions become more dependent on the time of day
when the satellite images were taken. Also, areas around the forest boundary are areas of
uncertainty for the model.

The aforesaid problems could be fixed by a much larger dataset in comparison to the
current size. The main problem for such dataset enlargement is maintaining the consistency
of labelling rules among all of the images, to prevent ambiguous areas. For example, some
areas could look like light deforestation; some labellers could mark these areas as forest, and
others as deforestation. Such images would make model learning more complicated. The
research was done with the model-centric [8] view, which declares that the results could
become better with a more sophisticated model architecture or model hyperparameters. After
satisfactory results with our created U-Net model [9], the dataset was recreated with the data-
centric view [8] to further increase the accuracy. The data-centric view declares that the model
cannot return good results without good data, also known as "garbage in, garbage out".
Therefore, the dataset was recreated with more precise segmentation.
4.2 ResNet
A neural network operates by taking a picture and generating weights depending on the
complexity of the image, separating the weights for each consecutive value. In a CNN, the
ease of pre-processing the given input is better when compared to other deep learning
algorithms. The training of classifiers in a CNN utilizes a simple technique with fundamental
capabilities, which helps to identify the similar features of the targeted object. The structure
of a CNN is modeled on the human brain, in which the neuron structure is built mainly on the
visual cortex. The response of every neuron is important in a particular section of the visual
area, which is called the receptive field.
One of the models created is the deep residual network, or ResNet. This design was
created to overcome difficulties in the convolutional network model, since the time and the
number of layers required for network training are high. Skipping connections, or creating
shortcuts, is the core operation of ResNet and is widely used in such applications. The
ResNet model has a benefit over other architectural approaches in that its efficiency does not
degrade as the design becomes deeper. Furthermore, the computational level of complexity
is low, and the training capability of the network is drastically improved. One of the
advantages of the proposed model is its level-skipping functionality, where it can skip two to
three levels that affect the ReLU and batch normalization. This work uses residual learning
applied to several levels of layers.
In ResNet the residual block is given as follows:
y = F(x, {Wi}) + x (eq. 4.1)

Fig 4.1. ResNet Functioning


where the input layer is denoted by 'x', the output layer by 'y', and the function F represents
the residual mapping to be learned.
Assumptions of residual block
1. The addition of more / new layers should not degrade performance: the model enables the
skipping of layers if they are not needed.
2. If there is a benefit from the additional layer and regularization is present, the weights
or kernels of the layers will be non-zero, and the performance of the model may
improve marginally.
As a result of the "skip connection" / "residual connection", adding more layers ensures that
the model's performance does not degrade but may somewhat improve. One may create a very
deep network by stacking the network layers on top of one another. The identity function can
be derived for every block in a simple process due to the presence of ResNet blocks. This
implies that one may add more ResNet blocks without affecting training-set performance. In
a ResNet, two types of blocks are employed, depending on the dimension levels of the input
and output.
 Identification block
This block is termed the standard block in residual networks, and one of the important
considerations in this block is that the dimensions of the input and output activations
are the same.
 Block of convolutions
This block is used when the dimensions of the input and output activations do not match;
its shortcut path contains a CONV2D layer, which distinguishes it from the
identification block.
The input and output dimensions need to be the same to obtain the residual block using the
ResNet. Furthermore, each ResNet block is made up of two or three layers: the blocks in
ResNet-18 and ResNet-34 have two layers, while those in ResNet-50 and ResNet-101 have
three layers. The first layers of a ResNet utilize a 7×7 convolution followed by 3×3
max-pooling, both with a stride of 2. In the suggested work, Network-18 and Network-20 are
investigated. The input image considered for the process is resized into a 224×224 grid.
ResNet weights are initialized and trained utilizing Stochastic Gradient Descent (SGD) with
typical momentum settings. The proposed network structure is shown in Table 4.1.
Table 4.1. ResNet description

4.3 Proposed Model


4.3.1. Dataset Used
The dataset comprises 322 pairs of visible-spectrum images and corresponding masks of
shape 512 × 512 × 3. Each image is part of satellite imagery taken over Ukrainian forestries,
with a view from a height of around 3 kilometres, taken between 2018 and 2021, for a total of
84 million pixels. A small number of images contain partial cloud cover due to atmospheric
correction. The dataset spans diverse types of forestries with different terrain, which is useful
for more solid model training. Table 1 contains the overall number of segmented pixels per
class. 40 pixels were left unlabeled during the dataset creation. The distribution of classes in
the dataset is imbalanced, and this became one of the main training problems.
The dataset was created by parsing areas of Ukrainian forestries using PyAutoGui and
Google Earth Pro. Images contain not only areas with forest or deforestation, but also areas
with roads, villages, rivers, and ponds, to make our model more robust and able to predict
areas of deforestation more accurately. The dataset contains masks with three classes:
"Forest", "Deforestation" and "Other". Finally, the dataset with satellite imagery was created
for the task considered here and uploaded to a GitHub repository, where it is available with
the corresponding code for distributed training on TPU.
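A hedged sketch of how such a tfrecord file could be read with tf.data (the feature keys,
the raw-bytes encoding, and the file name below are hypothetical; the actual schema is
defined in the repository code):

import tensorflow as tf

FEATURES = {
    'image': tf.io.FixedLenFeature([], tf.string),  # hypothetical key
    'mask': tf.io.FixedLenFeature([], tf.string),   # hypothetical key
}

def parse_example(proto):
    # Decode one serialized (image, mask) pair of shape 512 x 512 x 3,
    # assuming raw uint8 bytes rather than PNG/JPEG encoding.
    ex = tf.io.parse_single_example(proto, FEATURES)
    image = tf.reshape(tf.io.decode_raw(ex['image'], tf.uint8), (512, 512, 3))
    mask = tf.reshape(tf.io.decode_raw(ex['mask'], tf.uint8), (512, 512, 3))
    return image, mask

dataset = tf.data.TFRecordDataset('deforestation.tfrecord').map(parse_example)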
Dataset benchmarking
After the creation of the baseline U-Net model [9] for initial predictions, we segmented
the dataset one more time from scratch to create a more accurate one. The second version of
the dataset was much better than the first one, but the problem with both of them is the limited
ability to determine the accuracy of the model using such metrics as the F1 Score and
Intersection over Union (IoU), because the number of minor areas of deforestation and small
trees was so large that it was too labour-intensive to mark absolutely all of them. Also, no
subject matter experts were involved in the process of dataset segmentation, so there might be
incorrectly labelled areas of deforestation. For example, can we segment an area as
deforestation if it contains a certain density of trees and trees of a certain age? This question
is hard to answer using satellite imagery.

Fig 4.2. The original satellite imagery (left), the image from the initial dataset (centre),
the mask from the final dataset (right).

Fig. 4.2 shows an example of an image and mask from different versions of the training
dataset. The quality of segmentation in the first mask is lower than in the second one, but the
results of model predictions from the initial dataset were still accurate enough.
4.3.2. Loss Function
Usage of the Categorical Cross-entropy [11] loss function gave poor results due to treating
all classes equally, whereas in our case prediction of the "Deforestation" class is the most
important. Dice loss and Tversky loss are common choices in image segmentation tasks. Dice
loss is a widely used loss function in computer vision. Tversky loss can be seen as a
generalization of the Dice coefficient: it adds weights to false positives and false negatives
with the help of coefficients. The idea of combining these loss functions to merge their
strengths was taken into account. Therefore, a loss function similar to the hybrid loss function
in AnatomyNet [12] was created and modified to treat the "Deforestation" class as the most
valuable by multiplying its contribution to the total loss by a value proportional to the number
of classes the model should predict. Using available data about loss functions for image
segmentation [5], [6], different combinations were checked. The following loss functions were
tested: Focal Tversky Loss [13], Dice loss [14], [15], Focal Loss [16], [17], etc. The Tversky
loss function and Dice loss function proved to be the best solution for the current problem
[18], so they were combined, with a factor lambda controlling the Dice subloss. Focal Tversky
Loss was not as good as Tversky Loss with manual weighting of the class subclasses. Despite
it being a good idea to use a hybrid loss [12] instead of simple Categorical Cross-entropy, the
results could be improved much further with a proper new loss function which uses the borders
of segmented classes instead of areas [19].
TP_p(c) = ∑n p_n(c) · g_n(c) (eq. 4.2)

FN_p(c) = ∑n (1 − p_n(c)) · g_n(c) (eq. 4.3)

FP_p(c) = ∑n p_n(c) · (1 − g_n(c)) (eq. 4.4)

L_Tversky = C − ∑c w_c · TP_p(c) / (TP_p(c) + α · FN_p(c) + β · FP_p(c)) (eq. 4.5)

L_Dice = C − ∑c 2 · TP_p(c) / (2 · TP_p(c) + FN_p(c) + FP_p(c)) (eq. 4.6)

L = L_Tversky + λ · L_Dice (eq. 4.7)


where TP_p(c), FN_p(c) and FP_p(c) are the true positives, false negatives and false positives for class c calculated from the prediction probabilities, p_n(c) is the predicted probability of pixel n belonging to class c, g_n(c) is the ground truth for pixel n belonging to class c, C is the total number of classes (C = 3 in our case), λ is the trade-off between the Dice loss L_{Dice} and the Tversky loss L_{Tversky}, α and β are the trade-off penalties for false negatives and false positives (both set to 0.5 in our case), and w_c is the weight for class c; in our case the weights for the classes “Forest”, “Deforestation” and “Other” are 0.4, 2.2 and 0.4 respectively. A high weight for the “Deforestation” class is important to overcome the dataset imbalance, since the correct representation of the “Forest” class is the easiest to learn.
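As an illustration, the hybrid loss of eqs. 4.2–4.7 can be written in MATLAB as below. This is a minimal sketch: the function name hybridLoss, the H-by-W-by-C array layout and the value lambda = 1 are our own assumptions for the example, not code from the training pipeline.

function L = hybridLoss(P, G)
% hybridLoss  Weighted Tversky + Dice loss of eqs. 4.2-4.7 (illustrative sketch).
% P, G : H-by-W-by-C arrays of predicted probabilities and one-hot ground truth.
w      = [0.4 2.2 0.4];   % class weights: "Forest" 0.4, "Deforestation" 2.2, "Other" 0.4
alpha  = 0.5; beta = 0.5; % penalties for false negatives and false positives
lambda = 1;               % assumed trade-off for the Dice sub-loss
C  = size(P, 3);          % number of classes
TP = squeeze(sum(P .* G, [1 2]))';       % eq. 4.2, one value per class
FN = squeeze(sum((1 - P) .* G, [1 2]))'; % eq. 4.3
FP = squeeze(sum(P .* (1 - G), [1 2]))'; % eq. 4.4
Ltversky = C - sum(w .* TP ./ (TP + alpha*FN + beta*FP)); % eq. 4.5
Ldice    = C - sum(2*TP ./ (2*TP + FN + FP));             % eq. 4.6
L = Ltversky + lambda * Ldice;                            % eq. 4.7
end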
4.3.3. Model
The architecture chosen for the model was the standard U-Net together with a ResNet-family network, with encoder filter counts {32, 64, 128, 256, 512, 1024}, a bottleneck with 2048 filters, and a decoder part mirroring the encoder, starting with 1024 filters and ending with 32. The total number of trainable parameters is 124,424,995, which occupies 475 megabytes of disk space. The RMSprop optimizer [20] was used with a learning rate of 1e-6, with the other parameters set to their default values. This learning rate was shown experimentally to be optimal for this problem, and it helps the model learn correct representations of the “Deforestation” class more accurately; the default learning rate resulted in constant “overshoot” in the weight updates. To speed up training of this model with more than 100 million parameters, the distributed TPU (Tensor Processing Unit) [21], [22] strategy was used. Recently, the efficiency of TPU-based training and inference of various DNNs was demonstrated by us in applications ranging from classification problems [23] (including medical applications [24]) to gesture and pose recognition, with a detailed scaling analysis of GPU and TPU performance [25].
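In MATLAB, the optimizer settings described above would be expressed through trainingOptions. The sketch below is a hypothetical configuration consistent with the text (RMSprop, learning rate 1e-6, other parameters left at their defaults); it uses the baseline unetLayers network from the appendix script, whereas the exact filter configuration described above would require a custom layer graph.

% Hypothetical training configuration matching the description above.
imageSize  = [512 512 3];
numClasses = 3;
lgraph  = unetLayers(imageSize, numClasses); % baseline U-Net
options = trainingOptions('rmsprop', ...
    'InitialLearnRate', 1e-6, ...            % experimentally chosen learning rate
    'VerboseFrequency', 10);
% net = trainNetwork(ds, lgraph, options);   % ds: combined image/label datastore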


CHAPTER 5
SOFTWARE DESCRIPTION
5.1 Introduction
MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. MATLAB stands for matrix laboratory; it was originally written to provide easy access to the matrix software developed by the LINPACK (linear system package) and EISPACK (eigensystem package) projects. MATLAB is therefore built on a foundation of sophisticated matrix software in which the basic element is an array that does not require pre-dimensioning, which allows many technical computing problems, especially those with matrix and vector formulations, to be solved in a fraction of the time.
MATLAB features a family of application-specific solutions called toolboxes. Very important to most users of MATLAB, toolboxes allow learning and applying specialized technology. They are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB environment to solve particular classes of problems. Areas in which toolboxes are available include signal processing, control systems, neural networks, fuzzy logic, wavelets, simulation and many others.
Typical uses of MATLAB include: math and computation, algorithm development, data acquisition, modeling, simulation and prototyping, data analysis, exploration and visualization, scientific and engineering graphics, and application development, including graphical user interface building.
5.2 Basic Building Blocks of MATLAB
The basic building block of MATLAB is the matrix. The fundamental data type is the array; vectors, scalars, real matrices and complex matrices are handled as specific classes of this basic data type. The built-in functions are optimized for vector operations, and no dimension statements are required for vectors or arrays.
5.2.1 MATLAB Window
MATLAB works through several windows: the Command window, Workspace window, Current Directory window, Command History window, Editor window, Graphics (Figure) window and Online Help window.


a. Command window
The command window is where the user types MATLAB commands and expressions at the prompt (>>) and where the output of those commands is displayed. It is opened when the application program is launched. All commands, including user-written programs, are typed in this window at the MATLAB prompt for execution.
b. Work space window
MATLAB defines the workspace as the set of variables that the user creates in a work
session. The workspace browser shows these variables and some information about them.
Double clicking on a variable in the workspace browser launches the Array Editor, which can
be used to obtain information.
c. Current directory window
The Current Directory tab shows the contents of the current directory, whose path is shown in the current directory window. For example, in the Windows operating system the path might be C:\MATLAB\Work, indicating that directory “work” is a subdirectory of the main directory “MATLAB”, which is installed in drive C. Clicking on the arrow in the current directory window shows a list of recently used paths. MATLAB uses a search path to find M-files and other MATLAB-related files.
d. Command history window
The Command History Window contains a record of the commands a user has entered
in the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands.
e. Editor window
The MATLAB editor is both a text editor specialized for creating M-files and a graphical MATLAB debugger. The editor can appear in a window by itself, or it can be a sub-window in the desktop. In this window one can write, edit, create and save programs in files called M-files.
f. Graphics or figure window:
The output of all graphic commands typed in the command window is seen in this
window.
g. Online help window
MATLAB provides online help for all its built-in functions and programming language constructs. The principal way to get help online is to use the MATLAB Help Browser, opened as a separate window either by clicking on the question mark symbol (?) on the desktop toolbar, or by typing help browser at the prompt in the command window.
The Help Browser is a web browser integrated into the MATLAB desktop that displays Hypertext Markup Language (HTML) documents. It consists of two panes: the help navigator pane, used to find information, and the display pane, used to view the information. Self-explanatory tabs other than the navigator pane are used to perform a search.
5.3 MATLAB Files
MATLAB has two types of files for storing information: M-files and MAT-files.
5.3.1 M-Files
These are standard ASCII text files with the .m extension. Users can create their own matrices and programs using M-files, which are text files containing MATLAB code. The MATLAB editor or another text editor is used to create a file containing the same statements that would be typed at the MATLAB command line, and the file is saved under a name that ends in .m. There are two types of M-files:
1. Script Files
A script file is an M-file with a set of MATLAB commands in it; it is executed by typing the name of the file on the command line. Script files work on the global variables currently present in the workspace.
2. Function Files
A function file is also an M-file, except that the variables in a function file are all local. This type of file begins with a function definition line. A minimal example of each type is shown below.
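The two sketches that follow are hypothetical examples (file, function and variable names are our own) illustrating the difference:

% --- Script file, saved as plotsine.m: runs in the base workspace ---
t = 0:0.01:2*pi; % this variable remains in the workspace after the script runs
y = sin(t);
plot(t, y), title('Sine wave')

% --- Function file, saved as addtwo.m: all variables are local ---
function s = addtwo(a, b)
% addtwo  Return the sum of the two inputs.
s = a + b;
end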
5.3.2 MAT-Files
These are binary data files with the .mat extension, created by MATLAB when data is saved. The data are written in a special format that only MATLAB can read, and they are loaded back into MATLAB with the ‘load’ command.
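For example (the variable name A and file name results are hypothetical):

A = magic(4);  % create some data
save results A % writes A to results.mat in the current directory
clear A
load results   % restores A into the workspace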
5.4 MATLAB System
The MATLAB system consists of five main parts:
5.4.1 Development Environment
This is the set of tools and facilities that help you use MATLAB functions and files.
Many of these tools are graphical user interfaces. It includes the MATLAB desktop and
Command Window, a command history, an editor and debugger, and browsers for viewing
help, the workspace, files, and the search path.


5.4.2 MATLAB Mathematical Function


This is a vast collection of computational algorithms ranging from elementary functions like sum, sine, cosine, and complex arithmetic, to more sophisticated functions like matrix inverse, matrix eigenvalues, Bessel functions, and fast Fourier transforms.
5.4.3 MATLAB Language
This is a high-level matrix/array language with control flow statements, functions, data
structures, input/output, and object-oriented programming features. It allows both
"programming in the small" to rapidly create quick and dirty throw-away programs, and
"programming in the large" to create complete large and complex application programs.
5.4.4 Graphics
MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well
as annotating and printing these graphs. It includes high-level functions for two-dimensional
and three-dimensional data visualization, image processing, animation, and presentation
graphics. It also includes low-level functions that allow you to fully customize the appearance
of graphics as well as to build complete graphical user interfaces on your MATLAB
applications.
5.4.5 MATLAB Application Program Interface (API)
This is a library that allows you to write C and FORTRAN programs that interact with
MATLAB. It includes facilities for calling routines from MATLAB (dynamic linking), calling
MATLAB as a computational engine, and for reading and writing MAT-files.
5.5 Some Basic Commands
 pwd :- prints the working directory
 demo :- demonstrates what is possible in MATLAB
 who :- lists all of the variables in your MATLAB workspace
 whos :- lists the variables and describes their matrix size
 clear :- erases variables and functions from memory
 clear x :- erases the matrix 'x' from your workspace
 close :- by itself, closes the current figure window
 figure :- creates an empty figure window
 hold on :- holds the current plot and all axis properties so that subsequent graphing
commands add to the existing graph
 hold off :- sets the next plot property of the current axes to "replace"


 find :- finds indices of nonzero elements, e.g. d = find(x>100) returns the indices of the vector x whose elements are greater than 100
 break :- terminate execution of m-file or WHILE or FOR loop
 load :- loads contents of matlab.mat into current workspace
 save filename x y z :- saves the matrices x, y and z into the file titled filename.mat
 save filename x y z -ascii :- saves the matrices x, y and z into the file titled filename.dat in ASCII format
 load filename :- loads the contents of filename into current workspace; the file can
be a binary (.mat) file
 load filename.dat:- loads the contents of filename.dat into the variable filename
 xlabel(‘ ’) :- Allows you to label x-axis
 ylabel(‘ ‘) :- Allows you to label y-axis
 title(‘ ‘) :- Allows you to give title for plot
 subplot() :- Allows you to create multiple plots in the same window
5.6 Some Basic Plot Commands
Kinds of plots:
 plot(x,y) :- creates a Cartesian plot of the vectors x & y
 plot(y) :- creates a plot of y vs. the numerical values of the elements in the y-
vector
 semilogx(x,y) :- plots y versus x with a logarithmic scale on the x-axis
 semilogy(x,y) :- plots y versus x with a logarithmic scale on the y-axis
 loglog(x,y) :- plots y versus x with logarithmic scales on both axes
 polar(theta,r) :- creates a polar plot of the vectors r & theta where theta is in radians
 bar(x) :- creates a bar graph of the vector x. (Note also the command stairs(x))
 bar(x, y) :- creates a bar-graph of the elements of the vector y, locating the bars
according to the vector elements of 'x'
Plot description:
 grid :- creates a grid on the graphics plot
 title('text') :- places a title at top of graphics plot
 xlabel('text') :- writes 'text' beneath the x-axis of a plot
 ylabel('text') :- writes 'text' beside the y-axis of a plot
 text(x,y,'text') :- writes 'text' at the location (x,y)


 text(x,y,'text','sc') :-writes 'text' at point x,y assuming lower left corner is (0,0) and
upper right corner is (1,1)
 axis([xmin xmax ymin ymax]) :- sets scaling for the x- and y-axes on the current plot
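A short session combining several of these commands (the data values are an arbitrary illustration):

x = 0:0.1:10;
y = exp(-0.3*x) .* sin(2*x); % damped oscillation
plot(x, y)
grid
title('Damped oscillation')
xlabel('time (s)')
ylabel('amplitude')
axis([0 10 -1 1])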
5.7 MATLAB Working Environment
5.7.1 MATLAB Desktop

Matlab Desktop is the main Matlab application window. The desktop contains five sub
windows, the command window, the workspace browser, the current directory window, the
command history window, and one or more figure windows, which are shown only when the
user displays a graphic. The command window is where the user types MATLAB commands
and expressions at the prompt (>>) and where the output of those commands is displayed.
MATLAB defines the workspace as the set of variables that the user creates in a work session.
The workspace browser shows these variables and some information about them.
Double clicking on a variable in the workspace browser launches the Array Editor, which can be used to obtain information and, in some instances, edit certain properties of the variable.
The current Directory tab above the workspace tab shows the contents of the current
directory, whose path is shown in the current directory window. For example, in the windows
operating system the path might be as follows: C:\MATLAB\Work, indicating that directory “work” is a subdirectory of the main directory “MATLAB”, which is installed in drive C. Clicking on the arrow in the current directory window shows a list of recently used paths. Clicking on the button to the right of the window allows the user to change the current directory.
MATLAB uses a search path to find M-files and other MATLAB-related files, which are organized in directories in the computer file system. Any file run in MATLAB must reside in the current directory or in a directory that is on the search path. By default, the files supplied with MATLAB and MathWorks toolboxes are included in the search path. The easiest way to see which directories are on the search path, or to add or modify the search path, is to select Set Path from the File menu of the desktop and then use the Set Path dialog box. It is good practice to add any commonly used directories to the search path to avoid repeatedly having to change the current directory.
The Command History Window contains a record of the commands a user has entered
in the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands. This action launches a


menu from which to select various options in addition to executing the commands. This is a useful feature when experimenting with various commands in a work session.
5.7.2 Using the MATLAB Editor to Create M-Files
The MATLAB editor is both a text editor specialized for creating M-files and a
graphical MATLAB debugger. The editor can appear in a window by itself, or it can be a sub
window in the desktop. M-files are denoted by the extension .m, as in pixelup.m.
The MATLAB editor window has numerous pull-down menus for tasks such as
saving, viewing, and debugging files. Because it performs some simple checks and also uses
color to differentiate between various elements of code, this text editor is recommended as the
tool of choice for writing and editing M-functions.
To open the editor, type edit at the prompt; typing edit filename opens the M-file filename.m in an editor window, ready for editing. As noted earlier, the file must be in the current directory or in a directory on the search path.
5.7.3 Getting Help
The principal way to get help online is to use the MATLAB help browser, opened as a
separate window either by clicking on the question mark symbol (?) on the desktop toolbar,
or by typing help browser at the prompt in the command window. The Help Browser is a web browser integrated into the MATLAB desktop that displays Hypertext Markup Language (HTML) documents. The Help Browser consists of two panes: the help navigator
pane, used to find information, and the display pane, used to view the information. Self-
explanatory tabs other than navigator pane are used to perform a search.


CHAPTER 6
RESULTS

Fig 6.1 Input image

Fig 6.2 Output image using U-Net


Fig 6.3 Output image using ResNet-50

Output:
3×1 cell array: {'OTHERS'} {'DEFORESTATION'} {'FOREST'}

Evaluating semantic segmentation results using U-Net
Selected metrics: global accuracy, class accuracy, IoU, weighted IoU, BF score.
* Processed 322 images.
* Finalizing... Done.
* Data set metrics:

GlobalAccuracy    MeanAccuracy    MeanIoU    WeightedIoU    MeanBFScore
______________    ____________    _______    ___________    ___________
    0.94013          0.92896      0.82956      0.89712        0.59794

Evaluating semantic segmentation results using ResNet-50
Selected metrics: global accuracy, class accuracy, IoU, weighted IoU, BF score.
* Processed 322 images.
* Finalizing... Done.
* Data set metrics:

GlobalAccuracy    MeanAccuracy    MeanIoU    WeightedIoU    MeanBFScore
______________    ____________    _______    ___________    ___________
    0.99749          0.9969       0.99094      0.99501        0.9608
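These tables are the console output of MATLAB's evaluateSemanticSegmentation, as called in the appendix script. A minimal invocation looks like the following, where pxdsResults and pxdsTruth are placeholder names for the predicted and ground-truth pixel label datastores:

% Compare predicted labels against the ground truth over the whole test set.
metrics = evaluateSemanticSegmentation(pxdsResults, pxdsTruth);
metrics.DataSetMetrics % global/mean accuracy, mean IoU, weighted IoU, mean BF score
metrics.ClassMetrics   % per-class accuracy, IoU and BF score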


CHAPTER 7
CONCLUSION
The results obtained allow us to conclude that the problem of automatically monitoring the deforestation process, for efficient prevention of illegal deforestation, can be resolved efficiently by the proposed method. Despite the limited number of satellite images in the considered dataset, the proposed image segmentation models based on U-Net and the ResNet family achieved reasonable results under strictly defined segmentation metrics, with mean and standard deviation values measured by k-fold cross-validation and numerous runs with different random seeds. The dataset with satellite imagery and segmented masks was uploaded to a GitHub repository and could be increased in size and variety of data to check the corresponding influence. It should be emphasized that the training/validation methods and segmentation results obtained can be used in a more general context (they are actually used for the medical applications mentioned above), but more extended research will be necessary, especially for deployment of the U-Net and ResNet DNNs on Edge Computing TPU-based devices with limited computational resources for the aforementioned applications.


REFERENCES
[1] A. K. Ludeke, R. C. Maggio, and L. M. Reid, “An analysis of anthropogenic deforestation
using logistic regression and GIS,” Journal of Environmental Management, vol. 31, no. 3, pp.
247–259, 1990.
[2] J. R. Makana and S. C. Thomas, “Impacts of selective logging and agricultural clearing on
forest structure, floristic composition and diversity, and timber tree regeneration in the Ituri
Forest, Democratic Republic of Congo,” in Forest Diversity and Management, pp. 315–337,
Springer, 2006.
[3] L. Miles and V. Kapos, “Reducing greenhouse gas emissions from deforestation and forest
degradation: global land-use implications,” Science, vol. 320, no. 5882, pp. 1454–1455, 2008.
[4] J. Phelps, E. L. Webb, and A. Agrawal, “Does REDD+ threaten to recentralize forest
governance?” Science, vol. 328, no. 5976, pp. 312–313, 2010.
[5] M. R. W. Rands, W. M. Adams, L. Bennun et al., “Biodiversity conservation: Challenges
beyond 2010,” Science, vol. 329, no. 5997, pp. 1298–1303, 2010.
[6] E. H. Baur, R. B. McNab, L. E. Williams, V. H. Ramos, J. Radachowsky, and M. R. Guariguata, “Multiple forest use through commercial sport hunting: Lessons from a community-based model from the Petén, Guatemala,” Forest Ecology and Management, vol. 268, pp. 112–120, 2012.
[7] P. Cronkleton, M. R. Guariguata, and M. A. Albornoz, “Multiple use forestry planning: Timber and Brazil nut management in the community forests of Northern Bolivia,” Forest Ecology and Management, vol. 268, pp. 49–56, 2012.
[8] M. S. Mon, N. Mizoue, N. Z. Htun, T. Kajisa, and S. Yoshida, “Factors affecting deforestation and forest degradation in selectively logged production forest: a case study in Myanmar,” Forest Ecology and Management, vol. 267, pp. 190–198, 2012.
[9] L. G. Z. Xinnian and W. Yilong, “Design models of the single span cableway on the
accurate catenary method,” Journal of Fujian College of Forestry, 1999.
[10] Z. Xinnian, Z. Zhengxiong, and W. Zhilong, “Progress in forest ecological logging,” Journal of Fujian College of Forestry, vol. 27, p. 6, 2007.
[11] Z. Chuanfang, L. Minrong, Z. Chunxia, and Z. Huiru, Annual Report on Competitiveness of China's Provincial Forestry No. 1 (2004–2006), Social Sciences Academic Press, Beijing, China, 2010.
[12] F. Huirong, Z. Xinnian, L. Minhui et al., “Three benefits comparison on skidding methods
of light-duty cableway and road-cutting,” Scientia Silvae Sinicae, vol. 48, p. 6, 2012.


[13] S. Eckert, H. R. Ratsimba, L. O. Rakotondrasoa, L. G. Rajoelison, and A. Ehrensperger, “Deforestation and forest degradation monitoring and assessment of biomass and carbon stock of lowland rainforest in the Analanjirofo region, Madagascar,” Forest Ecology and Management, vol. 262, no. 11, pp. 1996–2007, 2011.
[14] M. T. Moroni, “Aspects of forest carbon management in Australia—a discussion paper,” Forest Ecology and Management, vol. 275, pp. 111–116, 2012.
[15] L. Rist, P. Shanley, T. Sunderland et al., “The impacts of selective logging on non-timber
forest products of livelihood importance,” Forest Ecology and Management, vol. 268, pp. 57–
69, 2012.
[16] W. A. Mugasha, T. Eid, O. M. Bollandsås et al., “Allometric models for prediction of above- and belowground biomass of trees in the miombo woodlands of Tanzania,” Forest Ecology and Management, vol. 310, pp. 87–101, 2013.
[17] J. Williams, “Exploring the onset of high-impact mega-fires through a forest land
management prism,” Forest Ecology and Management, vol. 294, pp. 4–10, 2013.
[18] B. Mertens and E. F. Lambin, “Spatial modelling of deforestation in southern Cameroon:
spatial disaggregation of diverse deforestation processes,” Applied Geography, vol. 17, no. 2,
pp. 143–162, 1997.
[19] L. Petersen and A. Sandhövel, “Forestry policy reform and the role of incentives in Tanzania,” Forest Policy and Economics, vol. 2, no. 1, pp. 39–55, 2001.
[20] O. Flores, S. Gourlet-Fleury, and N. Picard, “Local disturbance, forest structure and
dispersal effects on sapling distribution of light-demanding and shade-tolerant species in a
French Guianian forest,” Acta Oecologica, vol. 29, no. 2, pp. 141–154, 2006.
[21] W. F. Laurance, J. M. Fay, R. J. Parnell, G. Sounguet, A. Formia, and M. E. Lee, “Does rainforest logging threaten marine turtles?” Oryx, vol. 42, no. 2, pp. 246–251, 2008.
[22] J. Brunet, Ö. Fritz, and G. Richnau, “Biodiversity in European beech forests: a review with recommendations for sustainable forest management,” Ecological Bulletins, vol. 53, pp. 77–94, 2010.
[23] J. Radachowsky, V. H. Ramos, R. McNab, E. H. Baur, and N. Kazakov, “Forest concessions in the Maya Biosphere Reserve, Guatemala: a decade later,” Forest Ecology and Management, vol. 268, pp. 18–28, 2012.
[24] N. Sasaki, K. Chheng, and S. Ty, “Managing production forests for timber production
and carbon emission reductions under the REDD+ scheme,” Environmental Science and
Policy, vol. 23, pp. 35–44, 2012.


APPENDIX
SOURCE CODE
clc; close all; warning off all
% Load the satellite imagery and the ground-truth pixel label masks.
imgDir = ('ImageDatastore\*');
imds = imageDatastore(imgDir);
classes = ["OTHERS" "DEFORESTATION" "FOREST"];
pixelLabelID = [47 80 126];                % grey levels encoding each class
pixDir_t = ('PixelLabelDatastore 2d\*');   % ground-truth masks
pxds_t = pixelLabelDatastore(pixDir_t,classes,pixelLabelID);
pixDir = ('PixelLabelDatastore 2d 1st\*'); % first-version masks
pxds = pixelLabelDatastore(pixDir,classes,pixelLabelID);
image = 98;                                % index of the example image to display
imageSize = [512 512 3];
numClasses = 3;
unet_net = unetLayers(imageSize, numClasses); % baseline U-Net architecture
options = trainingOptions('sgdm', ...
'InitialLearnRate',1e-3, ...
'MaxEpochs',5, ...
'VerboseFrequency',10);
ds = combine(imds,pxds);                   % paired image/label datastore
net = resnet50();                          % pretrained ResNet-50
net.Layers; % net = trainNetwork(ds,lgraph,options)
% Image datastore for the KNN classifier trained on ResNet-50 features.
imds1 = imageDatastore('database', 'LabelSource', 'foldernames', 'IncludeSubfolders',true);
[trainingSet, testSet] = splitEachLabel(imds1, 0.6, 'randomize');
imageSize = net.Layers(1).InputSize;
augmentedTrainingSet = augmentedImageDatastore(imageSize, trainingSet, ...
'ColorPreprocessing', 'gray2rgb');
augmentedTestSet = augmentedImageDatastore(imageSize, testSet, ...
'ColorPreprocessing', 'gray2rgb');
featureLayer = 'fc1000';                   % final fully connected layer of ResNet-50


% Extract deep features for the training and test sets.
trainingFeatures = activations(net, augmentedTrainingSet, featureLayer, ...
'MiniBatchSize', 32, 'OutputAs', 'columns');
test_features = activations(net, augmentedTestSet, featureLayer, ...
'MiniBatchSize', 32, 'OutputAs', 'columns');
% 60/20/20 split of images and labels into train/validation/test sets.
imdsTrain = imds.Files(1:round(.6*size(imds.Files,1)));
imdsVal = imds.Files(round(.6*size(imds.Files,1))+1: round(.8*size(imds.Files,1)));
imdsTest = imds.Files(round(.8*size(imds.Files,1))+1: end);
pxdsTrain = pxds.Files(1:round(.6*size(pxds.Files,1)));  % pxds.Files, not imds.Files
pxdsVal = pxds.Files(round(.6*size(pxds.Files,1))+1: round(.8*size(pxds.Files,1)));
pxdsTest = pxds.Files(round(.8*size(pxds.Files,1))+1: end);

% Specify the network image size. This is typically the same as the training image size.
imageSize = [512 512 3];
% Specify the number of classes.
numClasses = numel(classes);
imds = imageDatastore('ImageDatastore\*');
classNames = ["OTHERS" "DEFORESTATION" "FOREST"];
pixelLabelID = [47 80 126];
% Load the three versions of the segmentation masks for the example image.
pxds2 = pixelLabelDatastore('PixelLabelDatastore 2d 3rd\*',classNames,pixelLabelID);
C2 = readimage(pxds2,image);
pxds1 = pixelLabelDatastore('PixelLabelDatastore 2d 2nd\*',classNames,pixelLabelID);
C1 = readimage(pxds1,image);
pxds = pixelLabelDatastore('PixelLabelDatastore 2d 1st\*',classNames,pixelLabelID);


I = readimage(imds,image);
C = readimage(pxds,image);
categories(C)            % display the label categories present in the mask
B = labeloverlay(I,C);   % overlay the ResNet-50 mask on the input image
B1 = labeloverlay(I,C1); % overlay the U-Net mask on the input image
% Train a KNN classifier on the ResNet-50 features and cross-validate it.
labels = trainingSet.Labels;
mdl_knn = fitcknn(trainingFeatures',labels);
predictedLabels_knn = predict(mdl_knn, test_features');
CVMdl = crossval(mdl_knn);
L = kfoldLoss(CVMdl);
Acc = (1-L)*100;         % cross-validated accuracy in percent
% Recreate pxds from the ground-truth masks and compute class frequencies.
pxds = pixelLabelDatastore('PixelLabelDatastore 2d\*',classes,pixelLabelID);
total_pixels = countEachLabel(pxds);
frequency = total_pixels.PixelCount/sum(total_pixels.PixelCount);
% Evaluate both segmentation results against the ground truth.
metrics2 = evaluateSemanticSegmentation(pxds2,pxds_t);
training_loss2 = (1-metrics2.DataSetMetrics.GlobalAccuracy);
metrics1 = evaluateSemanticSegmentation(pxds1,pxds_t);
training_loss1 = (1-metrics1.DataSetMetrics.GlobalAccuracy);

figure;imshow(I)
title('Input Image')
figure;imshow(B1)
title('Output Deforestation recognition Image using U-NET with KNN')
figure; imshow(B)
title('Output Deforestation recognition Image using Resnet50 with KNN')
