The Segmentation of Oral Cancer MRI Images Using Residual Network
ISSN No:-2456-2165
Abstract:- Segmenting tumours from cancer MRI images is a classic research area in image processing and a tedious task. Manual segmentation of MRI images is very time consuming and liable to errors. Many researchers have investigated deep neural networks for segmenting oral MRI images, as they offer higher performance in segmenting oral cancer images automatically. Owing to gradient dissemination and complexity issues, the CNN takes more time and excess computational power to train on the images. Our aim is to build an automated technique for the segmentation of oral cancer images using Residual learning networks (ResNet) to remedy the gradient dissemination problem encountered with CNN. ResNet attains higher accuracy and trains faster than CNN. To accomplish this, ResNet adds a skip connection parallel to the convolution neural network layers. The proposed technique has been validated on an oral cancer (lip and tongue) image dataset. The proposed technique achieves accuracy, dice co-efficient, specificity and precision of 0.92, 0.95, 0.94 and 0.96 respectively, with a computational time of 63 mins.

Keywords:- Oral cancer, Segmentation, DNN, ResNet.

I. INTRODUCTION
At present, oral cancer is considered to be one of the greatest threats to human beings. It is an uncontrolled growth of cells that starts in the mouth and spreads to the lips, tongue and other parts of the face. Squamous cell carcinoma is the most deadly oral cancer, with a life expectancy of approximately five years. Early stage diagnosis may help in curing the disease at lower cost. Segmentation of oral cancer MRI images plays a crucial role in deciding the exact location of the tumour. MRI (Magnetic resonance imaging) helps physicians to explore the tissues and lesions of the tumours. Segmentation is the separation of the salient characteristics of an image from its background. It represents and extracts significant data by grouping pixels into regions of similarity. Pixels are grouped according to the change in their intensity across regions.

Automatic oral tumor segmentation using MRI images can solve this issue, providing early diagnosis and fast recovery. Early treatment can save the patient's life by finding the exact tumor location and type.

In recent years deep learning with neural networks has gained popularity, and most researchers have reported outstanding performance with high accuracy in segmenting images. The convolution neural network is the most important category of deep neural network and is capable of learning and extracting features from cancer MRI images.

In 2015 (A.A Pereira et al.) the authors designed a deep learning CNN model containing 3*3 convolution kernels that segments the tumors in cancer MRI images. They employed small kernel filters to obtain a deeper CNN and cascaded several convolution layers, which gave a response similar to that of larger kernels. Segmentation algorithms were proposed to overcome the issue of redundancy by assigning every pixel a class label. The CNN architecture was modified into an FCN (Fully Convolution Network), which classified each local block of an image using a U-shaped model with contracting and expanding paths. This model required a larger number of training images to achieve a precise segmentation and suffered from a longer computational time.

To overcome the gradient dissemination problem of the convolution neural network (CNN) technique and to improve computational performance, we have employed a residual network (ResNet18).

In this research paper, section 2 contains a literature survey of the research accomplished by other researchers and outlines the information obtained from their work. Section 3 contains a brief discussion of the proposed methodology: the proposed model using ResNet18 for segmentation. Section 4 explains the simulation set up required. Section 5 explains the performance evaluation with several parameters. Section 6 explains the results and the comparison of ResNet with CNN. The conclusion is given in section 7.
[Figure: block diagram with the blocks Noise Removal, Image Enhancement, Pre-processing, Segmentation, Optimization using Adaptive scheduling of stochastic gradients, and Performance Evaluation]
Fig. 1: Proposed methodology development and performance evaluation of Oral Cancer MRI images.
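[Figure: the input z passes through two stacked weight layers with a ReLU between them to give Q(z); an identity shortcut carries z to an addition node, giving P(z) = Q(z) + z, followed by a ReLU]
Fig. 2: Building block of Residual Network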
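The relation P(z) = Q(z) + z in Fig. 2 also indicates why the shortcut eases back-propagation. The following is a short worked derivation using the standard chain rule. The loss symbol \mathcal{L} is introduced here only for illustration, and the final ReLU is left out for simplicity.

```latex
P(z) = Q(z) + z
\quad\Longrightarrow\quad
\frac{\partial P}{\partial z} = \frac{\partial Q}{\partial z} + I,
\qquad
\frac{\partial \mathcal{L}}{\partial z}
  = \frac{\partial \mathcal{L}}{\partial P}
    \left( \frac{\partial Q}{\partial z} + I \right)
  = \frac{\partial \mathcal{L}}{\partial P}\,\frac{\partial Q}{\partial z}
  + \frac{\partial \mathcal{L}}{\partial P}
```

The second term passes the upstream gradient through the shortcut unchanged, so the gradient reaching z does not shrink to zero even when the contribution through Q is small.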
There are two major blocks in the ResNet model, which are explained below:

C. Identity block
The identity block is described as

$$ m = Q(z, \{K_i\}) + z \qquad (1) $$

Here z and m represent the input and output of the layers, and the function Q(z, {K_i}) defines the residual mapping to be learned. In the identity block, z and Q have the same dimensions. Fig. 3(a) explains the design of the identity block, which is composed of three constituents as shown below:
The first constituent is a 2D convolution layer with a filter of size 1*1 and a stride of (1, 1). Batch normalization is carried out to normalise the channels, and a rectified linear unit (ReLU) is applied as the nonlinear activation.
The second element is the same as the first one but with a filter of size (q * q).
The third element is the same as the first element but does not contain the ReLU activation function.
Finally, before the activation function is applied, the shortcut (the input) is added to the output of the third element.
[Figure: identity block. Input, then (Convolution 2D, Batch Normalization, ReLU), then (Convolution 2D, Batch Normalization, ReLU), then (Convolution 2D, Batch Normalization), then Output]
[Figure: block with the same three-element main path and an additional Convolution 2D and Batch Normalization on the shortcut path between Input and Output]
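The identity block of equation (1) can be illustrated with a minimal pure-Python sketch. It is not the authors' TensorFlow implementation: the convolutions are replaced by small dense weight matrices, batch normalization is omitted, and the names `relu`, `dense` and `identity_block` are hypothetical helpers chosen for this example. It only shows the structure: stacked weight layers with ReLU after all but the last, followed by the shortcut addition and the final activation.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(weights, values):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def identity_block(z, weight_layers):
    """Return relu(Q(z, {K_i}) + z) for a stack of weight matrices K_i.

    ReLU follows every weight layer except the last one, mirroring the
    three-element description in the text (batch normalization omitted).
    """
    h = list(z)
    last = len(weight_layers) - 1
    for index, weights in enumerate(weight_layers):
        h = dense(weights, h)
        if index < last:
            h = relu(h)
    if len(h) != len(z):
        raise ValueError("Q(z) and z must have the same dimension")
    # Shortcut: add the unchanged input before the final activation.
    return relu([q + x for q, x in zip(h, z)])


if __name__ == "__main__":
    zero = [[0.0] * 3 for _ in range(3)]
    # With all-zero weights Q(z) = 0, so the block reduces to relu(z).
    print(identity_block([1.0, -2.0, 3.0], [zero, zero, zero]))
```

With all-zero weights the residual branch contributes nothing and the block simply passes the (rectified) input through, which is the identity-mapping behaviour the shortcut is meant to make easy to learn.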
The ResNet 18 contains 4 convolution layers in each of its modules (together with the first convolution layer and the fully connected layer this gives 18 layers) and is composed of 5 stages built from convolution and identity blocks, followed by a final pooling and classification stage:
Stage 1: Contains a 2D convolution layer with a (7*7) kernel, 64 filters and a stride of (2, 2). The channels are normalised by batch normalisation, followed by the ReLU activation function. Max pooling with stride (2, 2) is applied at the end.
Stage 2: Contains two identity blocks with one 2D convolution block; both blocks use 3 filter sets (56, 56, 64), with kernel size (3 * 3) and (2, 2) stride.
Stage 3: Contains three identity blocks with one convolution block; both use 3 filter sets (28, 28, 128), with kernel size (3 * 3) and (2, 2) stride.
Stage 4: Contains four identity blocks with one convolution block; both use 3 filter sets (14, 14, 256), with kernel size (3 * 3) and (2, 2) stride.
Stage 5: Contains five identity blocks and one convolution block; both use 3 filter sets (7, 7, 512), with kernel size (3 * 3) and (2, 2) stride.
Stage 6: Average pooling of size 7*7 is used, the obtained output is flattened, and the fully connected layer reduces its input to the number of classes using the "softmax" activation unit.

E. Simulation set up
In this section, we explain in detail the simulations used to verify the performance of the deep residual network ResNet 18 in segmenting oral cancer MRI images. We have employed TensorFlow for our model.

The proposed model is examined and evaluated using an MRI oral cancer image dataset. The training set comprises 100 patients suffering from Oral Squamous cell Carcinoma. The dataset has 10 samples for every patient, giving a total of 1000 images with an image size of 225*225. We have resized the images to 128*128. The hyper-parameters used in the proposed model are described in Table 1.
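As a rough check on the ResNet 18 stage layout described above, the sketch below traces the side length of the feature map through the strided operations. The helper name `trace_sizes`, the "same"-padding rule (a layer with stride s maps size n to ceil(n / s)) and the stride list are all assumptions made for illustration. The stride list follows the common ResNet-18 layout in which the first residual stage keeps the resolution; it is used here because, for a 224*224 input, it reproduces the sizes 56, 28, 14 and 7 that appear in the stage descriptions.

```python
import math


def trace_sizes(input_size, strides):
    """Return the feature-map side length after each strided operation.

    Assumes 'same' padding, so a layer with stride s maps n to ceil(n / s).
    """
    sizes = []
    size = input_size
    for stride in strides:
        size = math.ceil(size / stride)
        sizes.append(size)
    return sizes


# Assumed strides: stage-1 convolution, stage-1 max pooling, then stages 2 to 5.
ASSUMED_STRIDES = [2, 2, 1, 2, 2, 2]

if __name__ == "__main__":
    print(trace_sizes(224, ASSUMED_STRIDES))  # [112, 56, 56, 28, 14, 7]
    print(trace_sizes(128, ASSUMED_STRIDES))  # [64, 32, 32, 16, 8, 4]
```

Under these assumptions a 224*224 input ends at a 7*7 feature map, which matches the 7*7 average pooling of Stage 6, whereas a 128*128 input would end at 4*4.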
To compare the two MRI images we have used the Dice Similarity Coefficient (DSC), Specificity (true negative rate), Sensitivity (true positive rate), Accuracy (A) and Precision (P). The efficiency evaluators specificity and sensitivity examine the robustness of our proposed methodology for segmenting MRI tumor images. In the formulas below, TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives.

Dice Similarity Coefficient
The Dice Similarity Coefficient calculates the overlap between the segmented Oral cancer MRI images and the ground truth images. It is given as shown below:

$$ \mathrm{DSC} = \frac{2TP}{FP + 2TP + FN} \qquad (3) $$

Specificity

$$ \mathrm{Specificity} = \frac{TN}{TN + FP} \qquad (4) $$

Accuracy

$$ \mathrm{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP} \qquad (5) $$

Precision

$$ \mathrm{Precision} = \frac{TP}{TP + FP} \qquad (6) $$
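Equations (3) to (6) can be computed directly from a predicted binary mask and its ground-truth mask. The following is a minimal sketch, not the authors' evaluation code: the function names `confusion_counts` and `segmentation_metrics`, the representation of masks as nested lists of 0/1 values, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
def confusion_counts(predicted, truth):
    """Count TP, FP, TN, FN over two equally sized binary masks."""
    tp = fp = tn = fn = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Assumption: report 0.0 instead of failing when a denominator is zero.
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(tp, fp, tn, fn):
    """Evaluate equations (3) to (6) from the confusion counts."""
    return {
        "dice": _ratio(2 * tp, fp + 2 * tp + fn),           # equation (3)
        "specificity": _ratio(tn, tn + fp),                 # equation (4)
        "accuracy": _ratio(tp + tn, tp + fn + tn + fp),     # equation (5)
        "precision": _ratio(tp, tp + fp),                   # equation (6)
    }


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 0, 0]]
    truth = [[1, 0, 0], [0, 1, 0]]
    print(segmentation_metrics(*confusion_counts(predicted, truth)))
```

For the toy masks in the example there is one true positive, one false positive, one false negative and three true negatives, which gives a Dice score of 0.5, a specificity of 0.75 and a precision of 0.5.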
Table 2: This shows comparison of different techniques with proposed methodology- performance evaluated for Dice score, Specificity, Accuracy, Precision and computation time in minutes

Techniques    Dice Score   Specificity   Accuracy   Precision   Computation Time (mins)
CNN           0.91         0.84          0.83       0.84        156
VGG Net-16    0.92         0.86          0.88       0.93        360
VGG Net-19    0.89         0.91          0.87       0.96        256
U-Net         0.86         0.83          0.80       0.91        354
UNet-Res      0.91         0.86          0.84       0.92        280
ResNet 18     0.95         0.94          0.92       0.96        63
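The computation times in Table 2 can be turned into relative speed-ups to make the comparison concrete. The sketch below only re-uses the values reported in the table; the dictionary name `COMPUTATION_TIME_MINS` and the helper `speedup_over` are hypothetical names introduced for this example.

```python
# Computation times in minutes, copied from Table 2.
COMPUTATION_TIME_MINS = {
    "CNN": 156,
    "VGG Net-16": 360,
    "VGG Net-19": 256,
    "U-Net": 354,
    "UNet-Res": 280,
    "ResNet 18": 63,
}


def speedup_over(baseline, times, reference="ResNet 18"):
    """Return how many times faster `reference` is than `baseline`."""
    return times[baseline] / times[reference]


if __name__ == "__main__":
    for name in COMPUTATION_TIME_MINS:
        if name != "ResNet 18":
            factor = speedup_over(name, COMPUTATION_TIME_MINS)
            print(f"ResNet 18 vs {name}: {factor:.2f}x faster")
```

On the reported figures, ResNet 18 is roughly 2.5 times faster than the plain CNN and between about 4 and 5.7 times faster than the other techniques.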
Graph 1: Represents the performance Evaluation (Dice Score, Specificity, Accuracy, Precision)
Graph 2: Represents the Computational time
IV. RESULT AND DISCUSSION

This section compares the results of our proposed methodology for segmenting Oral cancer MRI images with five segmentation techniques: CNN, VGG Net-16, VGG Net-19, U-Net and UNet-Res (residual block).

The data were collected from Aster CMI Hospital, Hebbal, and consist of MRI Oral cancer images related to lip and mouth cancer. A few images were taken from the Radiopaedia squamous cell carcinoma tongue article and the Digital Imaging and Communications in Medicine (DICOM) database, from: https://radiopaedia.org/articles/squamous-cell-carcinoma-tongue and https://www.dicomstandard.org.

A. Dataset Training
The proposed methodology is compared with CNN, VGG Net-16, VGG Net-19, U-Net and UNet-Res over the whole training process. Every sequence in the model is normalised as discussed in the pre-processing stage. The Adaptive scheduling of stochastic gradients optimization algorithm is employed to guide the optimization. It attains convergence faster than other optimization methods. It yields a low loss with the outlined features, which helps in optimization. The training performance of the model compared with CNN, VGG Net-16, VGG Net-19, U-Net and UNet-Res is shown in Graph 1. It shows that ResNet 18 has a lower training error and higher accuracy than the other techniques.

The proposed model is trained and validated over 32 epochs. The errors decrease rapidly over the training period and the training accuracy rises after every epoch.

B. Dataset Testing
In this process, the data are tested on the model for segmenting the tumors in Oral MRI images. The model is evaluated against the other techniques using the performance metrics defined in section 5 along with the computational time, which gives us the results of the segmentation. We have calculated the evaluation metrics on each patient dataset and estimated the average value of each metric. Graph 2 displays the performance of the proposed methodology in terms of computational time. To represent the viability, the average computational time for segmenting the data is calculated. The average computational time is defined as the processing time required for segmenting the Oral MRI images. Table 2 also shows that the proposed model has the minimal average computational time compared to the other techniques. This shows that the proposed methodology has higher accuracy and minimal average computational time.

The ResNet model accomplishes identity mapping, and these outputs are connected to the corresponding stacked layers without adding any extra parameters. This mechanism means that the layers of the ResNet model learn the residual between inputs and outputs, while the layers of CNN, VGG-16, VGG-19, U-Net and UNet-Res learn the true outputs directly. The gradients flow backwards easily, which results in quicker processing in comparison with other techniques. ResNet has the advantage of shortcut connections, which help in solving the gradient dissemination issue. ResNet also ensures that the higher layers perform as well as the lower layers.

V. CONCLUSION AND FUTURE SCOPE

Oral cancer MRI tumor segmentation is one of the essential requirements for the early treatment of Oral cancers. Although deep neural networks are a major strength in image segmentation, they have the drawback of gradient dissemination, which arises during the process of training. We have used a Residual Network (ResNet) to overcome this problem. Among residual networks, we have employed ResNet 18 in our proposed methodology since it has fewer errors during training and also high accuracy, as it contains a smaller number of layers. The proposed methodology performs well compared to the CNN, VGG 16, VGG 19, U-Net and UNet-Res models with respect to computation time. We have employed the Adaptive scheduling of stochastic gradients optimization technique. It has minimal execution time compared to all other techniques mentioned above. Our proposed methodology achieves accuracy, dice co-efficient, specificity and precision of 0.92, 0.95, 0.94 and 0.96 respectively, with a computational time of 63 mins.