Brain Tumor Classification Using Neural Network Based Methods
Kailash D.Kharat1 & Pradyumna P.Kulkarni2 & M.B.Nagori 3
Govt. Engineering College, Aurangabad, Maharashtra, India.
Dept. of Computer Science and Engg., Aurangabad, Maharashtra, India
E-mail- [email protected] , [email protected]
Abstract - Classification of MRI (magnetic resonance imaging) brain tumor images is a difficult task due to the variance and
complexity of tumors. This paper presents two neural network techniques for the classification of magnetic resonance images of
the human brain. The proposed technique consists of three stages: feature extraction, dimensionality reduction, and
classification. In the first stage, features are obtained from the MRI images using the discrete wavelet transform (DWT). In the
second stage, these features are reduced to the more essential ones using principal component analysis (PCA). In the
classification stage, two classifiers based on supervised machine learning have been developed. The first classifier is based on
a feed-forward artificial neural network (FF-ANN) and the second on a back-propagation neural network. The classifiers have been
used to classify MRI brain images as normal or abnormal.
Artificial Neural Networks (ANNs) have been developed for a wide range of applications such as function approximation, feature
extraction, optimization, and classification. In particular, they have been developed for image enhancement, segmentation,
registration, feature extraction, and object recognition and classification. Among these, object recognition and image
classification are more important, as they are a critical step for high-level processing such as brain tumor classification.
Multi-Layer Perceptron (MLP), Radial Basis Function (RBF), Hopfield, Cellular, and Pulse-Coupled neural networks have been used
for image segmentation. These networks can be categorized into feed-forward (associative) and feedback (auto-associative) networks.
Keywords-MRI; Feature Extraction; Feature Selection; Tumor Classification; Feed forward Neural Network; Back-Propagation
Neural Network.
I. INTRODUCTION
Early detection and classification of brain tumors is very important in clinical practice. Many researchers
have proposed different techniques for the classification of brain tumors based on different sources of
information. In this paper we propose a process for brain tumor classification, focusing on the analysis of
Magnetic Resonance (MR) images and Magnetic Resonance Spectroscopy (MRS) data collected for patients with
benign and malignant tumors. Our aim is to achieve a high accuracy in discriminating the two types of tumors
through a combination of several techniques for image segmentation, feature extraction and classification.
The proposed technique has the potential of assisting clinical diagnosis.

Necessary preprocessing steps prior to characterization and analysis of regions of interest (ROIs) are
segmentation and registration. Image registration is used to determine whether two subjects have ROIs in the
same location. However, in this work we do not take into account the location of the tumor in the
classification model so we do not employ registration. Image segmentation is required to delineate the
boundaries of the ROIs ensuring, in our case, that tumors are outlined and labeled consistently across
subjects. Segmentation can be performed manually, automatically, or semi-automatically. The manual method is
time consuming and its accuracy highly depends on the domain knowledge of the operator. Specifically, various
approaches have been proposed to deal with the task of segmenting brain tumors in MR images. The performance
of these approaches usually depends on the accuracy of the spatial probabilistic information collected by
domain experts. In previous work, we proposed an automatic segmentation algorithm that is based on the fuzzy
connectedness concept. The main idea is to assign to every pair of voxels, x, y, in the image, a real number
between 0 and 1 indicating their connectedness. Starting with several seed points, all the voxels are
automatically assigned to the structure to which they have the highest connectedness value. Utilizing the
statistical information cumulated during the segmentation process, this method can provide
satisfying results even in cases where the boundaries of the ROIs cannot be easily identified.

Having segmented the ROI, and in order to build a classification model, one needs to extract a set of
discriminative features from the ROI. Most characterization techniques are based on extracted global visual
features that refer to the entire image rather than to regions that are of interest. However, in medical
images, feature extraction has to focus on specific regions and capture not only shape but also structural
and internal volume properties that can be useful for building a classification model. Megalooikonomou et al.
proposed a method that efficiently extracts a k-dimensional feature vector using concentric spheres in 3D (or
circles in 2D) radiating out of the ROI's center of mass. The method has been applied successfully to
classification and similarity searches of spatial ROIs. In this paper, we propose an approach (see Figure 1)
for building a classification model from the MR images, from which a group of features is extracted. Instead
of employing all of the features to build the model, a preprocessing step of feature selection is performed,
aiming to remove the redundant features. Based on the statistical information, only the most informative
features extracted from the MR images are utilized in the model building process. In addition, in this paper,
we consider features from other sources (e.g., MRS data) in the classifier training process. This leads to
improved classification accuracy.

Figure 1. Proposed Work Model

II. METHODOLOGY
There are four major steps in the proposed approach for brain tumor classification: (a) ROI segmentation:
delineating the boundary of the tumor (ROI) in an MR image; (b) feature extraction: getting meaningful
features of the ROI identified in the previous step; (c) feature selection: removing the redundant features;
(d) classification: learning a classification model.

A. Segmentation
Within the segmentation process, each image region confined by a rectangular window is represented by a
feature vector of length R. These vectors, computed for Q selected regions, are organized in the pattern
matrix P_{R,Q} and form clusters in the R-dimensional space. The Q pattern vectors in P are fed into the input
NN layer, while the number C of the output layer elements represents the desired number of segmentation
classes. In each epoch of the network training process, the network weights W_{C,R} are recalculated by
minimizing the distances between each input pattern vector and the corresponding weights of the winning
neuron, that is, the neuron whose coefficients are closest to the current pattern. If the process completes
successfully, the network weights belonging to separate output elements represent typical class individuals.
In this paper, the region segmentation process consists of training the NN on all image regions extracted by
a rectangular sliding window with half overlap, and subsequent exploitation of the trained network for region
classification. The algorithm consists of the following successive steps:
1. Feature vectors computation to create the feature matrix P using the sliding window
2. Initialization of the learning process coefficients and the network weights matrix W
3. Iterative application of the competitive process and the Kohonen learning rule [10] for all feature
   vectors during the learning stage
4. NN simulation to assign class numbers to individual feature vectors
5. Evaluation of the regions classification results

B. Feature Extraction
The proposed system uses the Discrete Wavelet Transform (DWT) coefficients as the feature vector. The wavelet
is a powerful mathematical tool for feature extraction, and has been used to extract the wavelet coefficients
from MR images. Wavelets are localized basis functions, which are scaled and shifted versions of a fixed
mother wavelet. The main advantage of wavelets is that they provide localized frequency
information about a function of a signal, which is
particularly beneficial for classification. A review of the basic fundamentals of wavelet decomposition
follows.

The continuous wavelet transform of a signal x(t), a square-integrable function, relative to a real-valued
wavelet ψ(t) is defined as:

$$W_\psi(a, b) = \int_{-\infty}^{\infty} x(t)\, \psi^{*}_{a,b}(t)\, dt \qquad (1)$$

where

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t - b}{a}\right)$$

and the wavelet ψ_{a,b} is computed from the mother wavelet ψ by translation and dilation, a being the
dilation factor and b the translation parameter (both being real positive numbers). Under some mild
assumptions, the mother wavelet ψ satisfies the constraint of having zero mean.

Eq. (1) can be discretized by restraining a and b to a discrete lattice (a = 2^b; a ∈ R+; b ∈ R) to give the
discrete wavelet transform (DWT). The DWT is a linear transformation that operates on a data vector whose
length is an integer power of two, transforming it into a numerically different vector of the same length. It
is a tool that separates data into different frequency components, and then studies each component with
resolution matched to its scale. The DWT can be expressed as:

$$\mathrm{DWT}_{x(n)} = \begin{cases} d_{j,k} = \sum_{n} x(n)\, h^{*}_{j}(n - 2^{j}k) \\ a_{j,k} = \sum_{n} x(n)\, g^{*}_{j}(n - 2^{j}k) \end{cases} \qquad (2)$$

Here d_{j,k} and a_{j,k} are the detail and approximation coefficients, h(n) and g(n) are the high-pass and
low-pass filters, and j and k are the wavelet scale and translation factors. The main feature of the DWT is
the multiscale representation of a function. By using wavelets, a given function can be analyzed at various
levels of resolution. Fig. 2 illustrates the DWT schematically. The original image is processed along the x
and y directions by the h(n) and g(n) filters, which is the row representation of the original image. As a
result of this transform there are four subband images (LL, LH, HH, HL) at each scale (Fig. 2). The LL subband
image is used only for the DWT calculation at the next scale. To compute the wavelet features in the first
stage, the wavelet coefficients are calculated for the LL subband using the Haar wavelet.

C. Feature Selection and Reduction
One of the most common forms of dimensionality reduction is principal component analysis (PCA). Given a set
of data, PCA finds the linear lower-dimensional representation of the data such that the variance of the
reconstructed data is preserved. Restricting the feature vectors calculated from the wavelets to the
components selected by PCA should lead to an efficient supervised classification algorithm. So, the main idea
behind using PCA in our approach is to reduce the dimensionality of the wavelet coefficients. This leads to a
more efficient and accurate classifier.

The feature extraction process was carried out in two steps: first the wavelet coefficients were extracted
by the DWT, and then the essential coefficients were selected by the PCA.
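The two-step feature pipeline described above (Haar DWT, then PCA) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `haar_dwt2`, `ll_features` and `pca`, the toy 4x4 "images", the subband naming order, and the use of power iteration with deflation for PCA are all assumptions made for illustration, and image sides are assumed to be even.

```python
import math
import random


def haar_dwt2(image):
    """One level of the 2D Haar DWT; returns the (LL, LH, HL, HH) subbands."""
    def filter_rows(rows):
        # Orthonormal Haar pair: scaled sums (low-pass) and differences (high-pass).
        s = math.sqrt(2.0)
        low = [[(r[i] + r[i + 1]) / s for i in range(0, len(r) - 1, 2)] for r in rows]
        high = [[(r[i] - r[i + 1]) / s for i in range(0, len(r) - 1, 2)] for r in rows]
        return low, high

    def transpose(m):
        return [list(col) for col in zip(*m)]

    low, high = filter_rows(image)  # filter along x
    ll, lh = (transpose(m) for m in filter_rows(transpose(low)))   # then along y
    hl, hh = (transpose(m) for m in filter_rows(transpose(high)))
    return ll, lh, hl, hh


def ll_features(image, levels):
    """Apply the DWT `levels` times to the LL subband and flatten the result."""
    ll = image
    for _ in range(levels):
        ll = haar_dwt2(ll)[0]
    return [v for row in ll for v in row]


def pca(samples, n_components, iters=200):
    """Project samples onto their top principal components.

    Uses power iteration with deflation on the covariance matrix
    (an illustrative choice; a library eigensolver would normally be used).
    """
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(0)
    components = []
    for _ in range(n_components):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        start = math.sqrt(sum(x * x for x in v))
        v = [x / start for x in v]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left in any direction
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append(v)
        # Deflation: remove the variance explained by the component just found.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    scores = [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
              for c in centered]
    return scores, components


# Toy usage: four 4x4 "images" -> 2x2 LL subband (4 features) -> 2 PCA scores each.
images = [[[float((r * c + k) % 5) for c in range(4)] for r in range(4)]
          for k in range(4)]
features = [ll_features(img, levels=1) for img in images]
scores, components = pca(features, n_components=2)
```

In this sketch each row of `scores` is the reduced feature vector that would be passed to a classifier; the number of DWT levels and retained components are free parameters that the text does not fix.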
nodes. This was to avoid overfitting or underfitting the data. Due to hardware limitations, ten nodes in the
hidden layer were selected to run the final simulation. Figure 2 shows the design of the feed-forward neural
networks used in this research.

The 500 data points extracted from each subject were then used as inputs of the neural networks. The output
node resulted in either a 0 or 1, for control or patient data respectively. Since the nodes in the input
layer could take in values from a large range, a transfer function was used to transform the data before
sending it to the hidden layer, and another transfer function was applied before sending it to the output
layer. In this case, a tan-sigmoid transfer function was used between the input and hidden layer, and a
log-sigmoid function was used between the hidden layer and the output layer.

include a complexity term that reflects a prior distribution over the values that the parameters can take.

The activation function considered for each node in the network is the binary sigmoidal function defined
(with s = 1) as output = 1/(1 + e^(-x)), where x is the sum of the weighted inputs to that particular node.
This is a common function used in many BPNs. This function limits the output of all nodes in the network to
be between 0 and 1. Note that all neural networks were trained until the error for each training iteration
stopped decreasing.
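As a rough illustration of the network described above, the following sketch trains a small feed-forward network with back-propagation: ten tanh (tan-sigmoid) hidden nodes, one logistic (log-sigmoid) output node, and training that stops once the epoch error no longer decreases. The function names, learning rate, `patience` rule, weight initialization, and the toy two-feature data set are assumptions for illustration only; the paper's networks take 500 inputs per subject.

```python
import math
import random


def forward(W1, W2, x):
    """Forward pass: tanh hidden layer, logistic (log-sigmoid) output in (0, 1)."""
    h = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1]) for row in W1]
    z = sum(w * v for w, v in zip(W2[:-1], h)) + W2[-1]
    return h, 1.0 / (1.0 + math.exp(-z))


def predict(W1, W2, x):
    return forward(W1, W2, x)[1]


def train_ffnn(X, y, hidden=10, lr=0.5, max_epochs=5000, tol=1e-9, patience=50, seed=0):
    """Online back-propagation; stops when the epoch error stops decreasing."""
    rng = random.Random(seed)
    n_in = len(X[0])
    # The last entry of each weight vector is the bias.
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    best_err, stalled = float("inf"), 0
    for _ in range(max_epochs):
        err = 0.0
        for x, t in zip(X, y):
            h, out = forward(W1, W2, x)
            err += 0.5 * (out - t) ** 2
            # Error terms: derivative of the logistic output, then of tanh.
            delta_out = (out - t) * out * (1.0 - out)
            delta_h = [delta_out * W2[j] * (1.0 - h[j] ** 2) for j in range(hidden)]
            for j in range(hidden):
                W2[j] -= lr * delta_out * h[j]
            W2[hidden] -= lr * delta_out
            for j in range(hidden):
                for i in range(n_in):
                    W1[j][i] -= lr * delta_h[j] * x[i]
                W1[j][n_in] -= lr * delta_h[j]
        if err < best_err - tol:
            best_err, stalled = err, 0
        else:
            stalled += 1
            if stalled >= patience:
                break  # error has stopped decreasing
    return W1, W2


# Toy usage: two well-separated classes (0 = control, 1 = patient).
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y = [0, 0, 1, 1]
W1, W2 = train_ffnn(X, y)
labels = [round(predict(W1, W2, x)) for x in X]
```

Rounding the output node at 0.5 gives the 0/1 decision described in the text; in practice the stopping rule would usually be evaluated on held-out data rather than on the training error.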
