INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH, VOLUME 8, ISSUE 11, NOVEMBER 2019, ISSN 2277-8616

Image Segmentation Using Convolutional Neural Network

Ravi Kaushik, Shailender Kumar

Ravi Kaushik is currently pursuing a master's degree in Software Engineering at Delhi Technological University, India. E-mail: [email protected]
Shailender Kumar is currently working as an Associate Professor at Delhi Technological University, India. E-mail: [email protected]

Abstract: Identifying regions in an image and labeling them into different classes is called image segmentation. Automatic image segmentation is one of the major research areas in computer vision, and new models that segment images better are proposed regularly. Computer vision makes human life easier by automating tasks that used to be done manually. In this survey we compare various image segmentation techniques and explain the merits and demerits of each technique in detail. Each methodology is analyzed against a common set of parameters, which are then used to compare the methods discussed in this work. Our focus is on techniques that can be optimized and improved beyond those that already exist. The survey also emphasizes the applications of image segmentation techniques and how to make them more useful in daily life. Automating the monitoring of time-consuming, repetitive activities is valuable because doing such tasks manually is cumbersome and increases the possibility of errors.

Index Terms: Computer Vision, Convolutional Neural Networks, Deep Learning, Edge Detection Models, Fully Connected Layer, Image Segmentation, Max Pooling.

1. INTRODUCTION

Machine learning is one of the defining skills of the digital age. When we dissect how a machine learns to classify, we see that it obtains inputs, the raw material needed for learning the specifics of the desired task. Features or attributes form the basis of what we feed into the learning algorithm. Machine learning plays a vital role in image processing and object identification, and many techniques are available for such tasks; we will discuss several methods that can be used to achieve them. In this world of digitization, images play a very important role in many areas of life, including scientific computing and visual persuasion tasks. Technically, images can be binary, grayscale, RGB, hue-saturation-value, or hue-saturation-lightness images. Each data record can be represented by a huge number of features, but not all features are significant for analysis or classification; thus, feature selection and feature extraction are significant research areas. In image segmentation, each pixel in an image is labeled with a class. This pixel-labeling task is also called dense prediction. Suppose an image contains various objects such as cars, trees, signals, and animals. Image segmentation will classify all the trees as a single class and likewise assign all animals and signals to their respective classes. One important thing to note is that image segmentation treats two objects of the same type as a single class; objects of the same type can be differentiated using instance segmentation. CNNs are used very frequently for segmenting images in pattern recognition and object identification. Here we discuss some real-life applications of CNNs, such as the work done by Olaf Ronneberger et al. to segment neural structures in microscopic images of human blood [1]. Major work has been done by Sadegh Karimpouli et al. [2] in identifying the different types of rocks present in images of rocky areas. Next, we discuss the work done in the field of remote sensing image segmentation by Xiaomeng Fu et al. [3]. Image segmentation is highly applicable in the medical field, as proper analysis of MRI or X-ray images is crucial for proper diagnosis. Pim Moeskops et al. [4] discuss one such application, segmenting MRI images of the brain and breast and cardiac CTA. A similar application is implemented by Zhuoling Li et al. [5] to detect deformities in eye cornea images and to identify objects in lung X-ray images.
Our survey compares many such implemented CNN models across various applications and produces results that indicate which model is best suited for different applications. We analyze each model thoroughly on the basis of efficiency, applications, area of use, size of training data required, and many other parameters. Section 1 is this introduction, which explains the motivation behind the work and its overall flow. Section 2 describes the basic concepts of image segmentation from scratch, so that the reader can understand how this technology actually works and can identify the crucial details while reading the survey. Section 3 contains the actual survey, with a detailed analysis of each work and its methodology. Section 4 contains a comparison on the basis of common parameters. Section 5 contains a brief analysis and the conclusion of all the work discussed here, and in the end the future scope of work that is still possible in enhancing performance and expanding the application areas of image segmentation using CNNs is discussed.
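To make the idea of dense prediction concrete, the short sketch below is our own illustration in Python/NumPy (not taken from any of the surveyed papers): a toy semantic-segmentation label mask assigns one class id per pixel, and two cars share the same class id, which is exactly the distinction instance segmentation would add on top.

```python
import numpy as np

# Hypothetical class ids for a 6x8 street scene:
# 0 = background, 1 = car, 2 = tree.
mask = np.zeros((6, 8), dtype=np.uint8)
mask[4:6, 0:3] = 1   # first car
mask[4:6, 5:8] = 1   # second car - same class id as the first one
mask[0:2, 3:5] = 2   # a tree

# Dense prediction: every pixel carries a label, so the mask has the
# same spatial size as the image it describes.
print(mask.shape)                       # (6, 8)
print(np.unique(mask))                  # [0 1 2]
print((mask == 1).sum(), "car pixels")  # both cars collapse into class 1
```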

2 BACKGROUND
In this section we discuss the techniques and background knowledge required to understand the research work covered in the later sections. The basics of image segmentation, its usefulness, and the CNN architecture are presented for the better understanding of readers.

2.1 NEED OF IMAGE SEGMENTATION
Suppose you are crossing a road and you see various objects around you, such as vehicles, traffic lights, the footpath, a zebra crossing, and pedestrians. While crossing the road, your eyes instantly analyze each object and process its location so that you can decide whether to cross at that moment or not. Can computers do this task? For a long time the answer was "no", until breakthrough inventions in computer vision and object identification. With the help of image segmentation and object identification techniques, it is now possible for computers to see real-world objects and, based on their positions, take the necessary actions. Image segmentation also plays a very important role in the medical field. Consider, for example, the task of identifying cancer cells in a microscopic image of human blood: it is crucial to identify the shape of the cells, and any unusual growth in the shape of blood cells can be flagged for the presence of cancer. This enables the recognition of cancer in blood at an early stage, so that it can be treated in time.

2.2 ROLE OF DEEP LEARNING IN IMAGE SEGMENTATION
Classical machine learning techniques take labelled training data and, based on that training, process new inputs and make decisions. Deep learning, by contrast, works with neural networks that can draw conclusions of their own, learning the relevant features automatically rather than relying on hand-crafted ones. This is useful, for example, for a self-driving car that must differentiate between a sign board and a pedestrian. Neural networks use layers that operate side by side, where the output of one layer depends on the outcome of another. This creates a system that can approximate human-like decision making, which is why such a model is often cited as an example of an artificial intelligence system.

2.3 COMPARISON OF VARIOUS IMAGE SEGMENTATION TECHNIQUES

2.3.1 Region based image segmentation
This is the simplest way of segmenting an image, based on its pixel values. It makes use of the fact that pixel values change sharply at the edges of an object relative to its background. In such cases we can set a threshold value and segment pixels according to whether their values fall below or above it. If there is one object in front of a single background, only one threshold value is needed; if there are multiple or overlapping objects, we might need multiple threshold values. This technique is also called threshold segmentation.
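A minimal sketch of the thresholding idea described above, assuming a grayscale image held in a NumPy array (the 0.5 threshold and the toy image are our own illustrative choices, not values from the paper):

```python
import numpy as np

def threshold_segment(gray, thresholds):
    """Assign each pixel a region id based on which threshold band it falls in."""
    labels = np.zeros_like(gray, dtype=np.int32)
    for i, t in enumerate(sorted(thresholds), start=1):
        labels[gray >= t] = i
    return labels

# Toy grayscale image: dark background (~0.1), one bright object (~0.9).
gray = np.full((5, 5), 0.1)
gray[1:4, 1:4] = 0.9

# Single object on a single background: one threshold is enough.
print(threshold_segment(gray, thresholds=[0.5]))
# Multiple or overlapping objects would need several thresholds, e.g.
# threshold_segment(gray, thresholds=[0.3, 0.6]).
```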
2.3.2 Edge based segmentation
In an image, what divides an object from its background or from another object is an edge. This technique uses the fact that whenever a new edge is encountered while scanning an image, a new object is being detected. Edge detection is done with the help of filters and convolutions: a filter matrix is run over the image to detect the specific type of edge for which the filter was designed.
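The sketch below (our own illustration, with a toy image) runs a horizontal Sobel kernel over an image with a naive sliding-window convolution to show how a filter matrix highlights edges:

```python
import numpy as np

# Horizontal Sobel kernel: responds strongly where intensity changes
# from top to bottom, i.e. at horizontal edges.
sobel_h = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]], dtype=float)

def convolve2d(image, kernel):
    """Naive 'valid' convolution: slide the kernel, take the element-wise product and sum."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark upper half, bright lower half -> one horizontal edge.
img = np.zeros((6, 6))
img[3:, :] = 1.0

edges = convolve2d(img, sobel_h)
print(np.abs(edges).max(axis=1))  # large responses only in the rows around the edge
```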
2.3.3 Clustering based image segmentation
This technique divides the data points (pixels) of an image into clusters of similar pixel values. The number of clusters is chosen based on the number of objects that might be present in the image. During the clustering process, similar data points are grouped into a single cluster, and in the end the image is divided into several regions. Although this is a time-consuming process, it provides accurate results on small datasets.
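A minimal sketch of clustering-based segmentation, assuming the scikit-learn library (our own choice; the paper does not name an implementation): pixels are clustered by colour and each cluster becomes one region.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_segment(image, n_clusters):
    """Cluster pixels by colour; pixels in the same cluster form one region."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)   # one row per pixel
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(pixels)
    return labels.reshape(h, w)

# Toy RGB image with two obvious colour groups -> ask for 2 clusters.
img = np.zeros((4, 4, 3))
img[:, 2:, :] = [1.0, 0.0, 0.0]   # right half is red
print(cluster_segment(img, n_clusters=2))
```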
2.3.4 CNN based image segmentation
This method is currently the state-of-the-art technique in the image segmentation field. It works on images of three dimensions, i.e. height, width, and number of channels. The first two dimensions give the image resolution, and the third dimension is the number of channels (RGB), i.e. the intensity values for the red, green, and blue colors. Images fed into the neural network are usually reduced in dimension, which cuts processing time and helps avoid overfitting. Even so, an image of size 224*224*3, when converted into one dimension, makes an input vector of 150528 values, which is still too large to be fed directly as a flat input to a plain neural network.

TABLE 1
Comparison of Image Segmentation Techniques

Technique     | Advantages                                                            | Disadvantages
Region based  | Simple calculations; fast operation speed                             | If there is no significant grayscale difference, or the grayscale pixel values overlap, it becomes difficult to get accurate segments
Edge based    | Good for images having better contrast between objects                | Not suitable when there are too many edges in the image
Cluster based | Works really well on small datasets and generates excellent clusters  | Computation time is too large and expensive
CNN based     | Provides the most accurate results                                    | Takes comparably longer time to train the model
2.4 DEFINITION OF CNN
CNN stands for Convolutional Neural Network, and it is widely used for the classification of images. The input images used in a CNN have three dimensions, i.e. height, width, and number of channels. The first two dimensions give the image resolution, and the third dimension is the number of channels (RGB), i.e. the intensity values for the red, green, and blue colors. Images fed into the neural network are usually reduced in dimension, which reduces processing time and helps avoid overfitting. Even so, an image of size 224*224*3, when converted into one dimension, makes an input vector of 150528 values, which is still too large to be fed as a flat input to a plain neural network.
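A two-line check of the arithmetic above, as our own NumPy illustration:

```python
import numpy as np

# A 224 x 224 RGB image: height x width x channels.
image = np.zeros((224, 224, 3), dtype=np.float32)

# Flattening it into a single vector for a plain fully connected network
# gives 224 * 224 * 3 = 150528 inputs per image - far too many weights
# per neuron, which is why convolutional layers are used instead.
flat = image.reshape(-1)
print(flat.shape)   # (150528,)
```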
2.5 WORK FLOW DIAGRAM OF CNN
This section discusses the design structure of a CNN based on the data and the optimization techniques that help most CNNs when the structure is not fixed in advance. We walk through the techniques layer by layer, following the flow of information through the network, to demonstrate the learning cycle of a CNN.

2.5.1 Convolution Layer
A filter, or kernel, is run over the image in fixed steps called strides. Selecting the stride size is crucial to achieving the desired results. While running the filter over the image, the dot product of the filter with the part of the image it currently covers is computed, and the sum of the resulting values is copied to the corresponding position of the convolved feature map. In this way we get a feature map of the image with reduced dimensions. Filters can be of many kinds, and each filter extracts a different kind of feature from the image: one filter may be responsible for extracting features based on shapes and edges, while another may extract features based on color intensities.
The parameters that help in adjusting a CNN's performance are:
Stride: the number of pixels by which the filter is moved over the image so that it focuses on a new set of pixels during convolution. Stride values typically range from 1 to 3, depending on how much loss of detail can be accommodated; the loss of information increases with the stride. Selecting the optimal stride value is a crucial task, and the accuracy of the model depends strongly on it.
Padding: the process of adding zeroes symmetrically around the border of the original image. This lets us obtain a feature map of the required size; commonly it is used to preserve the dimensions of the image after convolution.
Filters: also called kernels, these come in many types, for example:
a. The horizontal Sobel filter is used to detect horizontal edges in the image.
b. The vertical Sobel filter is used to detect vertical edges in the image.
c. The Laplace filter is used to detect both horizontal and vertical edges in the image.
d. A blurring filter is used to blur an image.
e. A sharpening filter is used to sharpen an image.
A small sketch of the convolution arithmetic, including stride and padding, is given after this list.
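The size of the convolved feature map follows from the input size, kernel size, stride and padding. The sketch below is our own illustration and assumes the PyTorch library, which the paper does not name; it checks the arithmetic for a 224x224 RGB input.

```python
import torch
import torch.nn as nn

# Feature-map size for a square input:
#   out = (in + 2 * padding - kernel) // stride + 1
def conv_output_size(in_size, kernel, stride, padding):
    return (in_size + 2 * padding - kernel) // stride + 1

x = torch.randn(1, 3, 224, 224)   # batch, channels, height, width

# Stride 2 halves the resolution; padding 1 keeps the 3x3 kernel from
# shrinking the border. Each of the 16 filters produces its own feature map.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=2, padding=1)
print(conv(x).shape)                                         # torch.Size([1, 16, 112, 112])
print(conv_output_size(224, kernel=3, stride=2, padding=1))  # 112
```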
Fig. 1 - Different Layers of a Convolutional Neural Network

2.5.2 Pooling Layer
This layer runs a small kernel over the image at a fixed stride and keeps only the pixel with the highest intensity in each window, discarding the others. The result is a reduced-dimension version of the feature map. This helps remove unnecessary sparse cells of the image that are of no use for classification. Max pooling therefore reduces the dimensionality of the network (or image), although it may cause some information loss. The underlying idea is that adjacent or nearby pixels can be approximated by the pixel carrying the maximum information.
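A small NumPy sketch of 2x2 max pooling with stride 2, as our own illustration of the operation described above:

```python
import numpy as np

def max_pool2d(feature_map, size=2, stride=2):
    """Keep only the highest value in each window, discarding the rest."""
    h, w = feature_map.shape
    oh, ow = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 9, 8],
                 [1, 1, 7, 5]], dtype=float)
print(max_pool2d(fmap))   # [[6. 2.]
                          #  [2. 9.]]
```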

2.5.3 Activation Layer
This layer mostly uses ReLU as the activation function. ReLU sets all negative values to zero and keeps positive values unchanged. This step usually follows the convolution and pooling layers. In numerous fields such as computer vision, CNNs built from these layers are becoming the state of the art, achieving near-human or better performance. They sound fascinating, but designing a CNN is a herculean task in itself: there is still no fixed formula for the design of a CNN. Many researchers have come up with general suggestions, but they do not always hold, as the task depends as much on the data as on the algorithm. A CNN exploits the spatial hierarchical features of the data, extracting features and helping classify them into different classes. This has led to the development of a whole stream of data augmentation and pre-processing methods to increase the amount of data, as more data allows for better training and helps avoid overfitting. This helps build models that are more robust to new samples, since they are made more tolerant to noise at the training phase.
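ReLU itself is a one-line function; this tiny NumPy sketch (our own illustration) shows the behaviour described above:

```python
import numpy as np

def relu(x):
    """Set negative activations to zero, keep positive values unchanged."""
    return np.maximum(0, x)

activations = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(activations))   # [0.  0.  0.  1.5 3. ]
```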

2.5.4 Drop Out Layer


The dropout layer is usually applied after a layer of neurons in the fully connected part of the network. It is a regularization layer: with each iteration, it randomly drops the weights of some input neurons so that the remaining neurons get more weight in the next iteration. Dropout rates of 20-50% are usually used. Using a dropout layer normally improves the performance of a network by about 1 to 2%.
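A short sketch of dropout behaviour, assuming PyTorch (our own choice of framework); the 0.3 rate is simply one value inside the 20-50% band mentioned above:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.3)   # 30% of inputs dropped, an illustrative rate

x = torch.ones(1, 10)
drop.train()               # dropout is active only during training
print(drop(x))             # roughly 30% of entries zeroed, the rest scaled by 1/0.7

drop.eval()                # at inference time dropout is a no-op
print(drop(x))             # all ones again
```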

2.5.5 Fully Connected Layer
From top to bottom these layers usually form a pyramid: the number of units keeps shrinking until it finally reaches the number of desired classes. Increasing the number of hidden units in a layer can increase the learning ability of the network, but the gain in accuracy saturates. There is no formula for the number of units to choose; it is usually found by trial and error, and the choices pile up with the depth of the fully connected part of the network. Most networks in the literature perform well with unit counts in multiples of 64. Networks of two to three fully connected layers are good if enough patterns are being passed to them after flattening the outputs of the convolutional layers.
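A minimal PyTorch sketch of such a pyramid-shaped fully connected head (our own illustration; the unit counts and the class count of 10 are hypothetical):

```python
import torch
import torch.nn as nn

# Unit counts shrink (here in multiples of 64) until the final layer
# matches the number of desired classes.
num_classes = 10                      # hypothetical class count
head = nn.Sequential(
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, num_classes),       # converges to the class scores
)

flattened = torch.randn(1, 256)       # e.g. flattened convolutional features
print(head(flattened).shape)          # torch.Size([1, 10])
```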

2.6 ARCHITECTURE OF CNN
The internal working architecture of a CNN is shown in the figure. The image is represented by a matrix of pixel values, and the convolution map is obtained by sliding filter matrices over the input image and multiplying.

Fig. 2 - CNN Architecture
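Putting the layers of Sections 2.5.1-2.5.5 together, the following compact PyTorch sketch (our own illustration, with illustrative sizes) mirrors the pipeline of Fig. 1 and Fig. 2: convolution, ReLU and max pooling repeated, then flatten, dropout and a fully connected classifier.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Dropout(p=0.3),
    nn.Linear(32 * 56 * 56, 10),      # 224 / 2 / 2 = 56 after two poolings
)

x = torch.randn(1, 3, 224, 224)       # one RGB image, 224 x 224
print(model(x).shape)                 # torch.Size([1, 10])
```

For dense per-pixel prediction, the flatten-and-classify tail would instead be replaced by convolutional layers that keep the spatial grid, as in the FCN and U-Net style work surveyed in Section 3.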


3 LITERATURE SURVEY
Here we discuss the research work already done by several authors in the field of image segmentation using CNNs. We start with the work of Olaf Ronneberger et al. [1], who have done crucial work on a biomedical image dataset. They segmented the neural structures found in microscopic images of human blood samples, and cell growth tracking in the human body is performed automatically using this approach. The typical task of a CNN is classification, where the output for a single image is one class label, but in many other applications localization information should be included in the output; that is, each pixel in an image needs to be assigned a class label. In their work they used a CNN to segment the image, to identify different cell types, and to track the growth of some specific cells in the body over a period of time. Along with identifying the cells, the location of those cells is also important for monitoring cell growth properly. They applied the segmentation task to a raw image, classifying objects with different colors, and then a black-and-white segmentation mask is generated using white for the foreground and black for the background. The image is then mapped with a pixel-wise loss weight to identify the border pixels. For that reason, Ciresan et al. predicted the class label of each pixel by providing a region local to that pixel and training the network with a sliding-window setup. The reported accuracy is 77.5% and the training time is 10 hours [1].
Next we discuss the work of Sadegh Karimpouli et al. [2], who used a CNN to segment rock images. Segmentation of rock images is a critical step, as the images are in grayscale format and conventional segmentation methods such as thresholding or the watershed algorithm do not work accurately on digital rock image data. Because these methods rely on color contrast to separate regions, a problem arises when two phases of similar color and contrast lie side by side: the algorithms consider them a single phase. The authors analyzed several machine learning approaches that could be used to solve the problem, and a convolutional auto-encoder was found to produce more accurate results than the others. A relatively small dataset is used to train the model, but a data augmentation

method helps to increase the size of the training dataset by creating a training seed consisting of manually and semi-manually segmented images. They used a dataset of images from Berea sandstone and divided it into training, testing and validation groups: only 10% of the data was used for validation, another 10% for testing, and the remaining 80% was used to train the model [2]. Finally, this dataset is given to a stochastic image generator, which enlarges the dataset so that the model can be trained more effectively. The CNN model implemented on this data provided results with an accuracy of 96%.
Segmentation of high-resolution remote sensing images is done by Xiaomeng Fu et al. [3] with the help of a fully convolutional neural network (FCN). It is shown that the FCN works better than a plain CNN for automatic segmentation of remote sensing images; the accuracy achieved with the FCN in this model is above 85%. With the increasing number of satellites there is a sudden increase in remote sensing image data, and it is almost impossible to segment those images manually to obtain useful results. Now that deep learning has advanced, it is possible to extract low-level to high-level features from raw input images automatically. Hinton has explained in his work that deep learning methods can obtain much more useful features than existing methods and also have good classification ability.
In MRI or X-ray images it is very difficult to identify the different parts of the body, so image segmentation helps in separating the different parts in those complex images. Pim Moeskops et al. [4] applied image segmentation to MRI datasets of the brain and breast and to cardiac CTA. The classical procedure is to train a model with hand-crafted features and then use it on input data to classify objects, whereas in this approach the CNN automatically extracts the features required for the classification task. Here a CNN is used to classify medical images of knee cartilage, brain regions, the pancreas and coronary arteries. Four results were obtained from the network: one when the network was trained only on brain MRI, a second only on breast MRI, a third only on cardiac MRI, and a fourth when the network was trained on all of them together. Each network is trained using 25000 batches of images per task [12].
Zhuoling Li et al. [5] have developed a domain adaptation method called CLU-CNN, which is designed specifically for medical images. As opposed to normal object identification tasks, which are designed for detecting pedestrians, cars or trees, here the task is detecting objects in medical images, such as deformities in eye cornea images or objects in lung X-ray images. Medical images have very different features from normal images, so they need to be handled differently, and model training also needs to keep this in mind. Another problem is that deep neural networks have a very large number of parameters, which causes a slow convergence rate; to overcome this, a sparse methodology and new learning-rate schemes have been introduced. Medical images are also expensive to obtain, so the available training data is much smaller than for general images; it might be as small as 400 images, which does not cover all the possible cases while training the model. Medical images may also be taken at different hospitals and under different environmental conditions, so images relating to the same problem might show a lot of dissimilarity. As a result of these shortcomings of medical datasets, a very deep neural network model cannot be used: it would have too many parameters to train properly and would need a larger dataset to work accurately. They therefore developed a framework named Clustering Convolutional Neural Networks (CLU-CNN), which aims specifically at the medical image segmentation task [5].
Hai Huang et al. [6] have applied Faster R-CNN to detect and recognize marine organisms in restored sea-surface images. R-CNN requires a lot of labeled training data to work accurately, and it is practically impossible to find such a big dataset of marine organisms, so three data augmentation methods for underwater imaging are proposed in this paper. An inverse process is used to simulate different kinds of turbulence in the marine environment, and the performance of each data augmentation method is then compared. Robotic capturing has many advantages, and some of the techniques have already been applied in practice; for example, Norway developed an ROV (Remotely Operated Vehicle) for submarine harvesting, but it requires an experienced operator who knows where and how to reach a location where many marine organisms can be found. Object detection and machine learning algorithms have become quite successful over the past decade, which helps with automatic object detection and recognition.
Cem M. Deniz et al. [7] have proposed a method to measure bone quality and assess fracture risk. Manual segmentation of magnetic resonance (MR) images is a time-consuming and error-prone process, so in order to deal with this problem and take full advantage of MR scan technology, an automatic femur segmentation technique has been developed. The method is based entirely on image segmentation using a convolutional neural network and greatly helps in the detection of the proximal femur. The dataset used by them is Microsoft COCO, which contains 2.5 million instances of labeled images [7]; they used around 20000 images for training, 2000 images for validation and 3000 images for testing. They achieved an accuracy of 97.8% and exceeded the performance of a 2D CNN by a large margin. The performance of the model improved as more layers and feature maps were added.
S. Ji et al. [8] have used a CNN on 3D input to recognize human activity. They use the CNN to process the raw input, thus automating the feature extraction process. The model works on 3D data, whereas most other CNN models only process 2D data, so it operates on both spatial and temporal dimensions to recognize the activity in a scene. It works on several frames of input and thus generates multiple channels of information; the final result is obtained by combining all the input channels and detecting the motion activity across the sequence of frames. The accuracy of their model came to 90.2%. They used a very large dataset: 49 hours of video from the TRECVID 2008 data [8]. When the video was taken, certain assumptions about the circumstances were made because of issues like small scale and changes in viewpoint, but such assumptions rarely hold in real-world scenarios. They designed the model to automatically identify the features that need to be extracted from the input, which is then fed to the model to identify the actions.
Ahmed Bassiouny et al. [9] used a CNN to categorize images into different categories given an input dataset of different scenes. They created an image segmentation


map of the image, representing the image by its segmentation map itself. An image segmentation map is a one-to-one mapping of each pixel of an image to a set of labels. This approach not only identifies the objects contained in the scene but also identifies the shape, location and size of the objects. They used two datasets, MSRC-21 and SIFT Flow, to train their model in two different ways; the accuracy achieved on each dataset comes to 81.9% and 66.8% respectively.
Venugopal K.R. et al. [10] have combined a CNN with the connected component (CC) algorithm to segment SEM images. The CNN is used to extract features directly from raw images with minimal pre-processing, and it is able to recognize patterns in an image that were not provided as training data before, provided the image resembles one of the training images. The accuracy (F-score) of the model is found to be 78%. They perform detection of neuronal tissue. Neurons are particularly difficult to segment as they are branched, intertwined and very closely packed, and the axons in neural structures are so thin that only electron microscopy has enough resolution to reveal them. In this approach all the steps of parameter tuning are automated; in order to achieve this automation and avoid human interaction in parameter tuning, a CNN has been introduced. To recognize 2D patterns under a very high degree of image transformation, the CNN is used like a multi-layer perceptron. Six feature maps are constructed in each layer of this CNN model, and all the filters in the CNN are given a size of 5 * 5 [10]. The weights of the edges in the model are set by random selection from a normal distribution of standard deviations. The input to the CNN is a raw EM image; detection of features in the image is done automatically by the CNN, which learns by adjusting its weights using the stochastic gradient descent learning algorithm.
Ji Shunping et al. [11] have used a 3D CNN to automatically classify crops in remote sensing images taken by satellites. The kernels are designed according to the structure of the multi-spectral remote sensing data. Fine-tuning of the 3D CNN's parameters is done in order to train the model with 3D samples of the crops, which leads the model to learn spatial and temporal representations. Full crop-growth-cycle samples are preserved for training the model again on fully grown crops.
Florent Marie et al. [12] have segmented images of kidneys and nephroblastoma with the help of case-based reasoning and a CNN architecture. They introduce two methods to achieve the required task: the first is CBR, a case-based reasoning method for determining region growth in the image segmentation process, and the second is a method to perform image segmentation on a small dataset. The input images are CT scans of kidneys of children having tumors, and the resulting accuracy obtained is between 88 and 90% overall. The CBR system they use starts from the CT scan and searches for the closest image that has already been segmented in the case base; this technique helps in saving a lot of time and computation. The major problem in designing the CNN model is the need for a large training dataset: each tumor is unique, and only a limited number of tumor segmentations are available from previous tests. Thus they developed a method to establish a dedicated deep neural network for each patient that carries out the task of tumor segmentation automatically, without the need for human intervention.
Bullock et al. [13] have segmented X-ray images to identify various regions such as bones, soft tissues and open-beam regions. This segmentation is required to quickly identify the regions in an X-ray image; as all regions in an X-ray image are of similar shade, it is a crucial task that must be done precisely. X-ray image segmentation is important in many medical applications such as image enhancement, computer-assisted surgery and anomaly detection, which normally require medical images to be segmented into three categories, i.e. open beam, soft tissue and bone. Current methods rely on classical processing techniques such as clustering, line fluctuation analysis or entropy-based methods, which normally require the tuning of hyper-parameters for each body part separately. They have designed a unique CNN architecture that performs the segmentation by extracting fine-grained features, while controlling the number of training parameters to control the overfitting problem [13]. This model attains an overall accuracy of 82% (TP: true positives).
Yan Song et al. [14] have discussed segmentation and synthesis methods for sonar images. SSS image segmentation is done using features extracted by a CNN with multiple pathways; the advantage of a multi-pathway CNN is that it can learn local and global features adaptively. SSS, i.e. side scan sonar, provides essential high-resolution underwater images, which are useful for many tasks such as mine detection, oceanographic research and underwater rescue. SSS images can be segmented into areas such as object highlights, object shadows or sea-floor areas, but there is a lot of noise and inhomogeneity in SSS images, which makes segmenting them very difficult and inaccurate.
Shan E Ahmed Raza et al. [15] have used a CNN-based deep architecture for the segmentation of cells, nuclei and glands in fluorescence microscopy and histology images. They use extra convolutional layers so that the model can work on variable input intensities and object sizes, making it robust to noisy data. This model helps build molecular profiles of individual cells.
Dan C. Cireşan et al. [16] have segmented neuronal structures in stacks of electron microscopy images. The label of each pixel is predicted from a square window centered on that pixel: the input layer maps each pixel of the window to a neuron, and then convolution and max pooling layers are applied to preserve the 2D information and extract features of the image. This approach outperforms the competing techniques by a large margin in all three categories. The solution is based on a DNN (deep neural network) used as a pixel classifier, and the probability of a pixel being a membrane is calculated.
Wang, S. et al. [17] have developed a technique to find Ulva prolifera regions in the ocean by extracting UAV images and applying image segmentation using a CNN. First, the raw images are processed with a super-pixel algorithm to obtain local multi-scale patches, and then CNN classification is applied to those patches. By combining super-pixel segmentation and CNN classification they produce much better results than pixel-level segmentation or instance-aware segmentation. First of all, the reflection of light is removed as a pre-processing step on the raw UAV images. Then super-pixel elements are extracted via energy-driven sampling, which iteratively refines the super-pixels by updating their boundaries. They define the energy function as the sum of two parts, H(s) and G(s) [17]: E(s) = H(s) + y*G(s), where H(s) is based on the likelihood of the colour of the super-pixels and G(s) is a prior on the shape of the super-pixel. In


the end they compared different numbers of super-pixels to evaluate and improve the accuracy and efficiency of the model.
Jie Chang et al. [18] have segmented MR images of the brain using a CNN to develop a method for automatic brain tumor detection. They developed a two-path model which contains an average pooling layer in one path and a max pooling layer in the other; finally, the CNN model is combined with a fully connected layer to predict optimized results. As MRI scans are very useful in detecting inner-body dysfunction, automatic region detection has become essential in the medical field. Pre-processing is needed here: not having properly normalized and quantifiable pixel intensities is one of the big defects of MRI data, so the training data must be pre-processed in order to extract meaningful, quantifiable data.
With the growing number of microscopy images, automatic nano-particle detection has become an essential task. Ayse Betul Oktay et al. [19] have discussed a method to detect nano-particles in microscopy images using segmentation. They propose a method for the detection of nano-particles and of their shapes and sizes with the help of deep learning algorithms. The method employs a multiple-output CNN (MOCNN) with two outputs: the first is the detection output, which gives the location of the nano-particle in the input image, and the other is the segmentation output, which gives the boundary of the segmented nano-particle. The final sizes of the nano-particles are determined by the Hough algorithm, which works on the segmentation outputs. The MOCNN is used for detecting, localizing and segmenting nano-particles found in microscopy images [19]: it takes an input image as a window and produces two outputs for it, one telling us the location of the nano-particle in the image and the other telling us the distance of the object boundary from the window center.
Despite being very accurate and one of the best approaches to image segmentation, CNNs do have uncertainties that nobody likes to tackle. Guotai Wang et al. [20] have analyzed these different kinds of uncertainty for CNN-based 2D and 3D image segmentation. Additionally, they analyzed a test-time-augmentation-based uncertainty to study the effect of different image transformations on the segmentation outputs. They propose test-time augmentation based uncertainties in order to monitor the effect of various transformations, consisting of image and noise transformations, on the input images. They then compared and combined the proposed aleatoric uncertainty with the uncertainty of the model, and concluded that test-time-augmentation-based aleatoric uncertainty provides better results than any other kind of uncertainty calculation method. Nowadays, UAVs are more useful than radars or satellites for monitoring ocean areas, as they provide high-resolution real-time images, which is not possible with the other two sources.

4 COMPARISON OF RELATED WORK DISCUSSED
We have compared the works surveyed above on the basis of various parameters like accuracy, training data size, applicable areas, complexity, methodology and data source, in tabular form. After comparing these works we have made certain observations. We found the maximum accuracy in the work of Deniz CM and Xiang S [7], where they achieved an accuracy of 97.8%. These results show that although each model is unique and serves a specific purpose, the performance metrics differ substantially between models. If we talk about training a model on a smaller dataset and still obtaining useful results, the work of Sadegh Karimpouli et al. [2] on segmenting rock images proves to be very efficient and useful. This may increase the complexity of the model, but it provides quality results, and complexity is not much of an issue given the availability of high-end computing hardware. Furthermore, the model proposed by Pim Moeskops et al. [4] uses an image set of 25000 images and provides a relatively good accuracy of 87%; their model is trained with a relatively large dataset but proves to be less complex than the other models. Considering the area of application, image segmentation is mostly used in the medical field and in surveillance systems for remote areas. Finally, we have analyzed and compared the work done by many researchers in the field of image segmentation using convolutional neural networks with the help of Table 2.

TABLE 2 - Comparison on the basis of various parameters

Paper | Accuracy         | Training Data Size | Application Area                                                            | Complexity | Technique Used                                            | Data Source
[1]   | 77.5%            | 3200 images        | Healthcare; blood cell growth monitoring                                    | High       | Microscopic imaging                                       | "PhC-U373" and "DIC-HeLa" data sets
[2]   | 96%              | 500 images         | Object classification; rock image classification                            | High       | Data augmentation and SegNet architecture                 | Digital rock samples introduced by Andra et al.
[3]   | 85%              | 15000 images       | Surveillance; segmentation of satellite images                              | Medium     | FCN and expanded CNN                                      | VGG16
[4]   | 87%              | 25000 images       | Healthcare; classifying images of knee cartilage, brain regions and pancreas | Low        | Weighted MRI and X-ray                                    | MR brain images from the OASIS project
[5]   | 72%              | 400 images         | Healthcare; detecting deformities in eye cornea images                      | High       | CLU-CNN                                                   | REFUGE Challenge 2018
[6]   | 60%              | 1800 images        | Marine organisms detection                                                  | Medium     | Faster R-CNN and data augmentation                        | Marine organisms data by the National Natural Science Foundation
[7]   | 97.8%            | 2000 images        | Healthcare; measuring bone quality                                          | Medium     | MR scan technology and CNN                                | Microsoft COCO
[8]   | 90.2%            | 49-hour video      | Automated video surveillance; human action recognition                      | High       | 3D CNN                                                    | TRECVID 2008 data
[9]   | 81.9% and 66.8%  | 591 + 2886 images  | Surveillance; scene recognition                                             | Low        | Segment Count feature vector (SegC) and SegC with CENTRIST | MSRC-21 and SIFT Flow
[10]  | 78%              | 3000 images        | Neuroscience; detection of neuronal tissues                                 | High       | CNN and Connected Component (CC) algorithm                | Drosophila first instar larva ventral nerve cord (VNC)

5 END SECTIONS

ACKNOWLEDGEMENTS
This work was supported by the Computer Science Department of Delhi Technological University. The authors thank the department and all the faculty who provided the opportunity to present this work.

CONCLUSION
In this work we have compared various image segmentation techniques and their state-of-the-art implementations. After researching the various techniques, we have found that the CNN is one of the most powerful tools for image segmentation. A detailed analysis of the CNN is also given here, explaining its different layers and the working of each layer. We have described the advantages and the fields where CNNs can be used in daily life. CNN technology is seeing a boom in adoption, making human life more convenient and less manual. A lot of work has already been done in fields like communication, medical tasks, crop monitoring, road transportation, activity detection and product quality monitoring. All these fields have seen great improvement after the use of these techniques, and much of the work done is itself state of the art.

5.3 FUTURE SCOPE
There is always room for improvement, innovation or change of existing techniques in any research field. So, despite the availability of so much quality research work in this field, a lot of work remains in making these automatic monitoring systems more accurate and reliable. There is more scope in handling the uncertainties where images are of bad quality or where the boundary pixels of segmented objects overlap. The accuracy of such systems needs to be increased to the extent that they can be relied upon for crucial tasks such as monitoring unidentified activities in restricted areas like country borders or ministerial offices, where the slightest inaccuracy may prove disastrous. Models also need to be implemented by combining two or more models into one system, so that a robot or any other automatic system can perform tasks more effectively and accurately. For example, a system could be designed that identifies objects and monitors the activities in its surroundings at the same time. This will increase the applicability of such techniques and make manual tasks far more automated than they have ever been.

REFERENCES
[1] Olaf Ronneberger, Philipp Fischer and Thomas Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation", MICCAI 2015: Medical Image Computing and Computer-Assisted Intervention, pp. 234-241.

[2] Sadegh Karimpouli, Pejman Tahmasebi, "Segmentation of digital rock images using deep convolutional autoencoder networks", Computers & Geosciences, Volume 126, 2019, Pages 142-150, ISSN 0098-3004, doi: 10.1016/j.cageo.2019.02.003.
[3] Xiaomeng Fu and Huiming Qu, "Semantic Segmentation of High-resolution Remote Sensing Image Based on Full Convolutional Neural Network", 2018 12th International Symposium on Antennas, Propagation and EM Theory (ISAPE), doi: 10.1109/ISAPE.2018.8634106.
[4] Pim Moeskops, Jelmer M. Wolterink, Bas H. M. van der Velden, Kenneth G. A. Gilhuijs, Tim Leiner, Max A. Viergever and Ivana Išgum, "Deep Learning for Multi-task Medical Image Segmentation in Multiple Modalities", MICCAI 2016: Medical Image Computing and Computer-Assisted Intervention, pp. 478-486.
[5] Zhuoling Li, Minghui Dong, Shiping Wen, Xiang Hu, Pan Zhou, Zhigang Zeng, "CLU-CNNs: Object detection for medical images", Neurocomputing, Volume 350, 2019, Pages 53-59, ISSN 0925-2312.
[6] Hai Huang, Hao Zhou, Xu Yang, Lu Zhang, Lu Qi and Ai-Yun Zang, "Faster R-CNN for marine organisms detection and recognition using data augmentation", Neurocomputing, Volume 337, 14 April 2019, Pages 372-384.
[7] Deniz CM, Xiang S, Hallyburton RS, Welbeck A, Babb JS, Honig S, Cho K, Chang G, "Segmentation of the Proximal Femur from MR Images using Deep Convolutional Neural Networks", Scientific Reports, 2018 Nov 7;8(1):16485, doi: 10.1038/s41598-018-34817-6, PubMed PMID: 30405145, PubMed Central PMCID: PMC6220200.
[8] S. Ji, W. Xu, M. Yang and K. Yu, "3D Convolutional Neural Networks for Human Action Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221-231, Jan. 2013, doi: 10.1109/TPAMI.2012.59.
[9] Ahmed Bassiouny, Motaz El-Saban, "Semantic segmentation as image representation for scene recognition", 2014 IEEE International Conference on Image Processing (ICIP).
[10] M. L.S., V.K. G. (2011), "Convolutional Neural Network Based Segmentation", in: Venugopal K.R., Patnaik L.M. (eds), Computer Networks and Intelligent Computing, ICIP 2011, Communications in Computer and Information Science, vol. 157, Springer, Berlin, Heidelberg.
[11] Ji Shunping, Zhang Chi, Xu Anjian, Shi Yun and Duan Yulin (2018), "3D Convolutional Neural Networks for Crop Classification with Multi-Temporal Remote Sensing Images", 10, 75, doi: 10.3390/rs10010075.
[12] Florent Marie, Lisa Corbat, Yann Chaussy, Thibault Delavelle, Julien Henriet, Jean-Christophe Lapayre, "Segmentation of deformed kidneys and nephroblastoma using Case-Based Reasoning and Convolutional Neural Network", Expert Systems with Applications, Volume 127, 2019, Pages 282-294, ISSN 0957-4174.
[13] Bullock, Joseph, Cuesta, Carolina and Quera-Bofarull, Arnau (2018), "XNet: A convolutional neural network (CNN) implementation for medical X-Ray image segmentation suitable for small datasets".
[14] Yan Song, Bo He, Peng Liu, Tianhong Yan, "Side scan sonar image segmentation and synthesis based on extreme learning machine", Applied Acoustics, Volume 146, 2019, Pages 56-65, ISSN 0003-682X.
[15] Shan E Ahmed Raza, Linda Cheung, Muhammad Shaban, Simon Graham, David Epstein, Stella Pelengaris, Michael Khan, Nasir M. Rajpoot, "Micro-Net: A unified model for segmentation of various objects in microscopy images", Medical Image Analysis, Volume 52, 2019, Pages 160-173, ISSN 1361-8415.
[16] Dan C. Cireşan, Alessandro Giusti, Luca M. Gambardella and Jürgen Schmidhuber, "Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images", Advances in Neural Information Processing Systems 25, January 2012.
[17] Wang, Shengke, Liu, Lu, Qu, Liang, Yu, Changyin, Sun, Yujuan, Gao, Feng and Dong, Junyu (2018), "Accurate Ulva Prolifera Regions Extraction of UAV Images with Superpixel and CNNs for Ocean Environment Monitoring", Neurocomputing, doi: 10.1016/j.neucom.2018.06.088.
[18] Jie Chang, Luming Zhang, Naijie Gu and Xiaoci Zhang, "A Mix-pooling CNN Architecture with FCRF for Brain Tumor Segmentation", Journal of Visual Communication and Image Representation 58, December 2018, doi: 10.1016/j.jvcir.2018.11.
[19] Ayse Betul Oktay, Anıl Gurses, "Automatic detection, localization and segmentation of nano-particles with deep learning in microscopy images", Micron, Volume 120, 2019, Pages 113-119, ISSN 0968-4328.
[20] Guotai Wang, Wenqi Li, Michael Aertsen, Jan Deprest, Sébastien Ourselin, Tom Vercauteren, "Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks", Neurocomputing, Volume 338, 2019, Pages 34-45.