Deep Learning-Based Methods for Brain Tumor Segmentation: A State-of-the-Art Review
Syed Shahid Abbas1, Salahuddin2*, Abdul Manan Razzaq3, Mubashar Hussain4, Meiraj Aslam5, Prince Hamza Shafique6 and Muhammad Asif Nadeem7

1,2,3,5,6 Department of Computer Science, NFC Institute of Engineering and Technology, Multan, Pakistan.
4 Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan.
7 Department of English, Institute of Southern Punjab, Multan, Punjab.
* Corresponding Author: Salahuddin. Email: [email protected]
_____________________________________________________________________________
Abstract:
Hospitals have recently begun using machine learning to speed up diagnosis and analysis. With this assistance, doctors can start treatment sooner. In the future, AI in healthcare may handle tasks ranging from the simple to the complex: answering phones, reviewing medical records, trend analysis in primary care, drug and therapy design, reading radiology images, creating treatment and diagnosis plans, and even conversing with patients. Deep learning models can interpret medical images such as CT, MRI, and X-ray scans to support a diagnosis, flagging inconsistencies and risks in the imaging data, and they are frequently used for cancer detection. Brain tumors must be segmented accurately from MRI images to aid clinical diagnosis and therapy planning; however, the absence of some diagnostic sequences in MRI studies makes this harder in routine practice. The recommended method performs better when its quantitative and qualitative results are compared with current medical image analysis practice. For lung cancer detection, chest CT scans perform better at accurately identifying malignant lung nodules, and early detection of lung cancer is crucial for patients' chances of survival. Earlier work used sparse chest computed tomography (CT) data to build a multi-view knowledge-based collaborative (MV-KBC) deep model that distinguishes benign from malignant nodules. The MV-KBC model achieved higher accuracy, but it can only be applied to supervised (labelled) image data. In this research, we present a novel deep learning-based multi-view model to alleviate this shortcoming. For semi-supervised medical image applications, the proposed model significantly improves accuracy while reducing computation and classification time.
1. Introduction
Over the years, interest in deep learning methods for image analysis has increased, and healthcare is one prominent application area. Medical imaging produces 3D images of the human body, usually obtained from CT or MRI scanners [1]. Doctors investigate the patient's condition based on these images and the medical record, and on that basis they can reach a diagnosis.
Multi-detector row CT (MDCT) is widely used to identify and confirm lung disease [9]. It is considered the gold standard for diagnosing these diseases and is particularly sensitive in detecting lung nodules. MRI can identify 85% to 95% of nodules measuring 5 to 11 mm [10]. Depending on the risk of lung cancer, caution is advised for lesions larger than 7 or 8 mm, even when MDCT measures them at only 1 or 2 mm. Nodules smaller than 7 mm in diameter should be followed up to determine their growth pattern [11]. Koyama et al. [16] stated that non-contrast lung MRI (Figure 1) is a better method for detecting malignant tumors than thin-section MDCT; there was no significant difference between the overall diagnosis and the diagnosis of malignant nodules (p > 0.05), and the reported 97.0% figure (p > 0.05) remains lower than that of MDCT.
Fig 1. Images of a normal patient and an affected patient
The main purpose of the MV-KBC deep model is to classify findings in chest CT images as positive or negative. Determining whether nodules are benign (non-cancerous) or malignant (cancerous) is an important task, and the model improves on standard nodule classification accuracy by combining multiple clinical characteristics, exploiting anatomical data, and aggregating features from multiple views. The use of unlabelled data may be particularly useful where annotated data are difficult and expensive to obtain. Treatment decisions are not affected by the presence or absence of lymph nodes in any ipsilateral bronchial region or hilum (which may indicate N1 disease). Ipsilateral mediastinal or subcarinal lymphadenopathy is a manifestation of N2 disease, even when only one region is affected. The classification stages N1 to N3 indicate increasing severity of disease, with N3 describing the most advanced nodal stage. In N3 disease involving Virchow, scalene, or hilar lymph nodes, major surgery is not recommended [12]. Hybrid scanners and images combine MRI and PET [13]. However, significant advances have made it possible to obtain pattern and functional information from single-modality images. The functional information provided by MRI does not add an extra dimension compared with dynamic MDCT. Although MRI cannot measure glucose metabolism, it is considered equivalent for functional and molecular analysis [14].
Fig 2. Different stages of a brain tumor
The evolution of cancer is shown in Figure 2. Machine learning techniques are used to detect cancer early, to the patient's benefit. Brain tumors usually take months or years to develop. Glioblastoma is the most common and deadly form of brain cancer, and it is one of our main targets because its development is not detected by the immune system. It causes fatigue and weakness, a constant need for sleep and a tendency to stay in bed or rest for days, weight loss and loss of muscle mass, difficulty swallowing food or drinks, and little or no appetite. Artificial intelligence (AI) is driving change and growth in many areas of healthcare, including MRI, CT, and X-ray imaging. AI can consolidate a wealth of patient information, enabling earlier diagnosis and treatment. AI is also being used to automate drug discovery, predict disease outbreaks, and streamline operations. AI-powered telemedicine and remote patient care are facilitating virtual consultations and data collection. Genomic analysis helps identify genetic factors related to health issues, while natural language processing (NLP) improves communication and electronic medical records. As AI integration continues to transform healthcare, careful attention to ethics, privacy, and regulatory issues is required, which calls for collaboration between AI experts, domain professionals, and physicians. To ensure patient safety, address ethical concerns, and preserve equity in new treatments, standards and guidelines for the responsible and equitable use of image analysis should be established. Magnetic resonance imaging (MRI) signals can identify tumors [15]. According to a recent study, lung cancer can be identified by diffusion-weighted imaging (DWI) and distinguished from the findings that DWI detects after obstructive lobar collapse [17]. Whole-body MRI combined with DWI can be used to determine the M stage in cancer patients with an accuracy comparable to PET-CT. In addition, quantitative DWI analysis can be used to distinguish infected from non-infected tumors [18].
2. Literature
In today's environment, brain tumors are among the most dangerous cancers. Early detection is important for screening and treatment planning. MRI is the preferred technique for brain tumors because it depicts soft tissue in detail without exposing the patient to ionizing radiation. Traditional brain tumor segmentation methods are reasonably effective analytically, and this study builds on machine learning techniques and capabilities. However, brain segmentation remains difficult, especially when some modalities are missing. Missing-modality patterns fall into three groups. The first concerns differences in the brain structure of patients [19]. The second is that gliomas vary in size, shape, and texture from patient to patient. The third is the difficulty of differentiating between low-contrast magnetic resonance imaging sequences. H. Fu et al. addressed brain segmentation with missing modalities in MRI [20]; their approach overcomes incomplete inputs through robust segmentation and latent representation learning. However, with many segmentation methods, successful fusion remains difficult [21]. Identification of anatomical landmarks is a prerequisite for the interpretation of medical images. Various methods are used in this process to diagnose patients and plan their care, including measurement analysis, configuration of segmentation methods, and image registration. Although it appears simple, manual identification of anatomical locations is often difficult and time-consuming. Automatic localization methods are faster and more accurate than manual identification, and they are especially useful when a large number of image locations must be processed accurately. Latent feature extraction is a recently introduced segmentation strategy for handling missing modalities; the open problems are how to reach the core of the task and how such features should be learned. The HeMIS implementation [22], building on Havaei's framework, independently encodes each available modality and then computes fixed statistics across the modality-specific features to estimate the final segmentation (see the sketch below). However, the mean and variance of each representation alone do not fully capture the latent representation; the mean is simply used to drive the segmentation. Chen et al. distilled each image into separate content and appearance codes, using uncertainty-aware operations [24]. Through a gating operation, the content code is then mapped into a shared representation for segmentation. Because the two encoders must be matched, however, this method is more complex and time-consuming.
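As a rough illustration of this kind of latent fusion, the sketch below computes a per-voxel mean and variance across whichever modality feature maps are available and concatenates them. The array shapes, the 16-channel embeddings, and the function name are assumptions for illustration; this is not the HeMIS implementation itself.

```python
import numpy as np

def mean_variance_fusion(modality_features):
    """Fuse feature maps from whichever MRI modalities are available.

    modality_features: list of (H, W, C) arrays, one per available modality
    (e.g. T1, T2, FLAIR); missing modalities are simply left out of the list.
    Returns an (H, W, 2*C) map holding the per-voxel mean and variance across
    the available modality embeddings, in the spirit of HeMIS-style fusion.
    """
    stacked = np.stack(modality_features, axis=0)          # (M, H, W, C)
    mean = stacked.mean(axis=0)
    # With a single modality the variance is identically zero.
    var = stacked.var(axis=0) if stacked.shape[0] > 1 else np.zeros_like(mean)
    return np.concatenate([mean, var], axis=-1)

# Example: only two of three modalities are available for this scan.
t1 = np.random.rand(64, 64, 16).astype(np.float32)
flair = np.random.rand(64, 64, 16).astype(np.float32)
print(mean_variance_fusion([t1, flair]).shape)  # (64, 64, 32)
```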
Accurate and timely identification of anatomical structures during examination and diagnosis is essential for patient treatment. The two biggest problems of existing systems are their reliance on hand-crafted features and the low performance of exhaustive search algorithms. J. Liu [26] proposed a system that uses artificial intelligence for automatic detection. Deep learning is also used in this field, taking images as input to estimate human anatomy. The model is developed to characterize various image analyses, and many morphological objects in the scanned images are used to distinguish the target anatomical structures in the body. The deep learning method performs better than the state-transition method [27], but its accuracy still needs to improve. The model then computes the relationships between structures to improve analysis performance. The deep network in that experiment combines recurrent and convolutional neural network features for classification. Tests showed that even the single-view mode increases efficiency and effectiveness in reasoning about cancer, and it shows great potential for Covid-19; it learned to use multiple views of a chest CT image. Deep learning [28] is currently the best image recognition approach, but it requires large training sets, which are not usually available in clinical settings. The authors of that study used a small amount of chest computed tomography data to distinguish benign from malignant nodules using a transferable multi-model ensemble (TMME). The technique feeds the image data through a ResNet-50 model pre-trained on the ImageNet database to predict lung nodules, and weight adjustments are applied to the nodes when errors are backpropagated through the network. Image, voxel value, and appearance are among the characteristics produced by these three models. Classifying lung nodules and the tissue around a cancer is important but very difficult to do correctly with machine learning. To solve this problem, the authors proposed a multi-model ensemble learning model based on a 3D convolutional neural network (MMEL-3DCNN). The number of patients worldwide increased by 26% in 2017. Classification of lung nodules before diagnosis is important, especially since computerized classification can help doctors reach a consensus. CT image classification can be done quickly and accurately using modern machine learning and computer algorithms. Low-dose chest computed tomography (LDCT) is expected to reduce the risk of death for people with early-stage disease. The accuracy and efficiency of cancer diagnosis will increase with AI support that equals or exceeds human experts in analytical ability. We developed an artificial intelligence (AI) framework for cancer diagnosis using a deep neural network (DNN) [29]. First, a semi-automatic annotation method is used to label the images. Then, a DNN-based malignancy classification and lung nodule (LN) identification model is developed to identify cancer, especially lung cancer, from LDCT images. The LN detection model was validated on large datasets using deep learning. Reddy et al. [30], for example, use deep learning to determine which images yield the highest accuracy. In some cases, such as when generating CT images from MRI data, synthetic data are used to build libraries containing the original images, which can be useful when the original modality is unavailable. We collected a large number of MRI images with different tumor types, locations, shapes, and appearances so that the model could be represented accurately. We continue to work with an SVM classifier and various optimization and activation choices (softmax, RMSProp, sigmoid, etc.) to analyze our results. Our solution was built using TensorFlow and Keras, taking advantage of Python's fast development workflow.
ANNs with multiple hidden layers between input and output are called deep neural networks (DNNs). With these techniques and methods, the hidden layers of a DNN can be constructed. The DNN architecture provides a compositional model that represents objects as hierarchical compositions of primitives; as a result, complex data can be modeled with fewer units than a shallow network requires, because each additional layer composes features from the layers below (a minimal sketch appears after this paragraph). Generally speaking, DNNs are feedforward networks, meaning that data flows directly from input to output, layer by layer. A convolutional neural network (CNN) is a neural network whose design consists of many such layers. Brain tumor symptoms include hormonal changes, blood clots, weakness, unsteady gait, slurred speech, mood swings, and blurred vision. The location of the tumor determines its type, and timely diagnosis can prolong the patient's life [31]. Benign tumors cannot invade neighboring tissue; they can be removed completely and are unlikely to return. Even though they do not spread to other tissue, benign brain tumors can still cause serious, permanent brain damage and death. Malignant brain tumors are far more dangerous: they divide rapidly and spread throughout the brain or spinal cord, multiplying in several places. MRI scans use powerful magnets and high-frequency radio waves to provide accurate information about tissue, whereas a computed tomography scan uses an X-ray beam. Image preprocessing, feature extraction, segmentation, and postprocessing are the stages involved in identifying brain disease.
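To make the feedforward idea concrete, the minimal Keras sketch below stacks a few dense hidden layers between an input and an output layer. The 256-feature input and the four output classes are illustrative assumptions, not values from this study.

```python
from tensorflow import keras
from tensorflow.keras import layers

# A plain feedforward DNN: data flows from the input through successive
# hidden layers to the output, with no recurrent connections.
model = keras.Sequential([
    layers.Input(shape=(256,)),             # e.g. a flattened feature vector
    layers.Dense(128, activation="relu"),   # hidden layer 1
    layers.Dense(64, activation="relu"),    # hidden layer 2
    layers.Dense(4, activation="softmax"),  # illustrative 4-class output
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```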
3. Technology
i. CNN
A CNN has two main building blocks, convolution layers and pooling layers, both considered components of the convolutional neural network. It is important to draw on prior research when developing and implementing models; the design specifies the structure of the CNN using many neurons, and a useful strategy for learning how to design neural networks is to study successful implementations [33]. This is possible because CNNs were intensively researched and applied in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) from 2012 to 2016, a period of rapid innovation. Since each brain slice being processed is small, a simple and effective cascaded convolutional neural network (C-ConvNet/C-CNN) is proposed in the next step. This C-CNN model uses two different pathways for local and global features. A distance-wise attention (DWA) mechanism is also proposed to improve the accuracy of tumor segmentation over other models; the DWA mechanism encodes the central location of the brain and the effect of the tumor within the structure [35].
Fig 3 illustrates the CNN architecture for image processing; it contains the various layers that segment and filter the image frames. A minimal sketch of this convolution-and-pooling structure follows.
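The Keras snippet below alternates convolution and pooling layers over single-channel MRI slices, as a rough sketch of the structure in Fig 3. The 128x128 input size and the binary tumor/no-tumor output are assumptions for illustration, not the C-CNN of [35].

```python
from tensorflow import keras
from tensorflow.keras import layers

# Minimal CNN: convolution layers extract local features, pooling layers
# downsample, and dense layers produce the final prediction.
model = keras.Sequential([
    layers.Input(shape=(128, 128, 1)),                         # one MRI slice
    layers.Conv2D(16, 3, activation="relu", padding="same"),   # convolution layer
    layers.MaxPooling2D(2),                                     # pooling layer
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),                      # tumor / no tumor
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```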
ii. RNN
A recurrent neural network (RNN) is a network architecture designed for sequential data or time series [36]. The underlying deep learning concepts are used in applications such as Siri, voice search, and Google Translate, and this effective tool is also used in medical treatment to protect human life [37]. In this model, researchers first build a network with the required features. Unlike a purely feedforward pass, an RNN feeds its memory of earlier steps back into the computation, so the current input and output are affected by previous ones. A series of layers processes the results collected from the previous step; the relationships and their potential contributions are therefore treated as coefficient values, which can be adjusted across many hidden levels. The RNN output is thus always based on the earlier states of the network, and because many layers combine their results, the prediction of the image value is accurate.
Fig 4. The RNN layers structure
This well-known RNN variant, the long short-term memory (LSTM) network, was developed by Sepp Hochreiter and Juergen Schmidhuber as a solution to the vanishing gradient problem. A plain RNN cannot predict the current state correctly when that prediction depends on context far in the past. Consider predicting whether a dish is suitable for someone with a nut allergy: whether nuts were mentioned several sentences earlier determines the answer, but if the relevant information comes from many earlier steps, a plain RNN has difficulty integrating it. The network's ability to predict the output depends on gates that control this flow of information. For example, if a gendered pronoun such as "he" occurred repeatedly in earlier sentences, the cell state can drop that information once it is no longer needed.

The gated recurrent unit (GRU) is similar to the LSTM in that it also tries to address the short-term memory problem of plain RNN models. Instead of cell states, it uses the hidden state to carry information, and it has two gates instead of three: a reset gate and an update gate. Like the gates in LSTMs, the reset and update gates control how much and what kind of information to keep (a minimal GRU sketch follows).
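A minimal GRU sketch, under the assumption of a toy sequence of 20 steps with 32 features per step; it only illustrates how the gated hidden state feeds a final prediction, not any model used in the cited work.

```python
from tensorflow import keras
from tensorflow.keras import layers

# GRU-based sequence classifier: the reset and update gates decide what the
# hidden state keeps or discards as the sequence is read step by step.
model = keras.Sequential([
    layers.Input(shape=(20, 32)),          # 20 time steps, 32 features each
    layers.GRU(64),                        # gated recurrent layer
    layers.Dense(1, activation="sigmoid"), # illustrative binary prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```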
iii. GNN
4. Comparison Analysis
The multi-view model is compared with single-view imaging and traditional imaging methods. The same data were used to ensure consistency, and the performance of each model was evaluated using the same protocol as in previous work. Different levels of noise simulation and data augmentation were also applied to test the robustness of the models. The network combines information from various imaging modalities to improve the accuracy of segmentation tasks, and it is designed to process three-dimensional medical data such as CT scans or MRI volumes. Leveraging multimodal data, which can include multiple imaging contrasts such as T1-weighted, T2-weighted, and FLAIR MRI scans, allows the network to collect complementary information that increases the accuracy and reliability of the segmentation results. Such information is important for identifying objects of interest, including organs, tumors, and lesions. Combining data from multiple sources makes it easier for the network to handle differences in image texture and appearance, resulting in more accurate and reliable segmentation. Network models often use encoder-decoder architectures with skip connections or attention mechanisms, and pre-trained or transfer learning models can also be used to improve performance, especially when training data are scarce. Metrics such as the Hausdorff distance, Jaccard index, and Dice coefficient, which measure the agreement between predicted and true segmentations, are commonly used to evaluate the network; they play an important role in disease assessment and treatment planning. However, to achieve good results on a specific medical task, thorough training on a large amount of data and careful hyperparameter tuning are required for successful deployment. A minimal sketch of the overlap metrics follows.
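For reference, the sketch below computes the Dice coefficient and Jaccard index for a pair of binary masks with NumPy; the toy masks are illustrative, and the Hausdorff distance is omitted for brevity.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between a binary predicted mask and the ground truth."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def jaccard_index(pred, target, eps=1e-7):
    """Intersection over union for the same pair of binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)

# Toy example with two small masks.
pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_coefficient(pred, truth), jaccard_index(pred, truth))
```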
5. Proposed System
i. Fusion Strategy
In this paper, we examine a set of deep multimodal fusion methods in the context of gesture recognition. In other words, given multiple video inputs (such as depth and color information), our goal is to combine information from the separate streams when describing motion. We focus specifically on the construction of shared representations across layers, which has been neglected in previous studies. In the past, late fusion was used to combine the outputs of one network per modality. We investigate this late fusion technique using the C3D [44] backbone, which has shown good results on a variety of tasks. In addition, we evaluate fusion at the middle levels of the network and provide a simple way to join the streams early by applying 1x1x1 convolutions across the different streams. Finally, we present a new model, called C3Dstitch, that can learn to combine the signals of two networks at any level using parallel streams. Late fusion (a method often used in action recognition) combines the results of two or more networks at the end of the processing pipeline. Three training schemes were investigated, including 1) late integration of the separate networks, computing the loss after averaging their outputs, and a scheme that combines joint training with fine-tuning; the backbone models were trained separately with the same training regime as the per-modality networks, except for the fusion stage, which was trained in several steps [46]. A minimal late-fusion sketch follows.
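As a minimal late-fusion sketch: two independently trained streams (for example, a depth network and a color network) each emit class probabilities, and the final decision averages them. The probability arrays below are stand-ins for real network outputs and are assumptions for illustration.

```python
import numpy as np

# Softmax outputs of two separately trained streams for two samples.
probs_stream_a = np.array([[0.7, 0.2, 0.1],
                           [0.1, 0.6, 0.3]])   # e.g. depth stream
probs_stream_b = np.array([[0.5, 0.3, 0.2],
                           [0.2, 0.5, 0.3]])   # e.g. color stream

# Late fusion: average the per-class probabilities, then take the argmax.
fused = (probs_stream_a + probs_stream_b) / 2.0
predictions = fused.argmax(axis=1)
print(predictions)  # final class per sample
```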
The main goal of this part of the work is the exchange of information between the feature maps of the middle layers of the network. Our first approach is to keep the streams separate at the lowest levels and then combine them in a later stage. Simple 1x1x1 convolutions applied to the concatenated outputs of the streams form this straightforward fusion concept. The combined input of the two merged modules must match the input size of the shared module in the layer after fusion; therefore, we halve the number of output filters in each of the 1x1x1 convolutional layers (i.e., divide the filter count by the number of streams). The 1x1x1 convolution thus serves to reduce the dimensionality of the filter space. As a result, the final architecture consists of three parts: a shared network that uses the combined representation in the final stages, and two early networks, one for each modality. A sketch of this fusion block follows.
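The Keras sketch below illustrates this mid-level fusion step under assumed feature-map sizes: the two streams are concatenated along the channel axis and a 1x1x1 convolution halves the channel count so the fused tensor matches what a single stream would feed to the next layer.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Assumed intermediate feature maps from two C3D-style streams.
feat_a = keras.Input(shape=(8, 28, 28, 64))   # stream A (e.g. depth)
feat_b = keras.Input(shape=(8, 28, 28, 64))   # stream B (e.g. color)

# Concatenate channels (128), then reduce back to 64 with a 1x1x1 convolution.
merged = layers.Concatenate(axis=-1)([feat_a, feat_b])
fused = layers.Conv3D(64, kernel_size=1, activation="relu")(merged)

fusion_block = keras.Model([feat_a, feat_b], fused)
fusion_block.summary()
```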
Up to this point, the model stage at which the streams are combined had to be chosen explicitly. Our objective in this section is to develop a paradigm that permits simultaneous knowledge exchange at several levels without restricting where individual or shared learning takes place. We provide a novel multi-stream methodology in which each modality has its own C3D network that interacts with the others at the fully connected and pooling layers. In this design, the outputs of corresponding layers are combined using a learned weighted average called a cross-stitch unit [47]. Put differently, all networks interact pairwise at every level, and the degree of interaction between the modalities is learned along the way. A minimal cross-stitch sketch follows.
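A minimal cross-stitch sketch in Keras, assuming a single 2x2 mixing matrix shared across all channels and initialised to the identity so the streams start out independent; the units described in [47] learn such weights at several layers of the network.

```python
import tensorflow as tf
from tensorflow import keras

class CrossStitch(keras.layers.Layer):
    """Learned weighted mixing of two parallel streams (cross-stitch style)."""

    def build(self, input_shape):
        # One 2x2 mixing matrix shared across all feature channels,
        # initialised to the identity so each stream starts unmixed.
        self.alpha = self.add_weight(
            name="alpha", shape=(2, 2),
            initializer=keras.initializers.Identity(),
            trainable=True)

    def call(self, inputs):
        x_a, x_b = inputs
        # Each output stream is a learned weighted average of both inputs.
        out_a = self.alpha[0, 0] * x_a + self.alpha[0, 1] * x_b
        out_b = self.alpha[1, 0] * x_a + self.alpha[1, 1] * x_b
        return out_a, out_b

# Usage: mix two same-shaped activations from parallel C3D-style streams.
a = tf.random.normal([2, 8, 28, 28, 64])
b = tf.random.normal([2, 8, 28, 28, 64])
mixed_a, mixed_b = CrossStitch()([a, b])
```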
6. Conclusion