Breast Cancer Detection Using K-Nearest Neighbor Machine Learning Algorithm

The paper proposes using image processing techniques to prepare mammography images for feature extraction; the extracted features are then used as input for two machine learning models, a Back Propagation Neural Network model and a Logistic Regression model, to detect breast cancer. Image cropping, filtering, and wavelet transformation are applied to the images before extracting features and training the machine learning models. The results of the two models are then compared to determine the most accurate model for breast cancer detection.

2016 9th International Conference on Developments in eSystems Engineering

Breast Cancer Detection using K-nearest Neighbor Machine Learning Algorithm

Moh’d Rasoul Al-hadidi (1), Abdulsalam Alarabeyyat (2), Mohannad Alhanahnah (3)
(1) Computer Engineering Department, (2) Computer Science Department
(1,2) AlBalqa’ Applied University, (3) University of Kent
(1,2) Salt, Jordan, (3) Kent, UK
(1) [email protected]

Abstract— Breast cancer is very common among women all over the world, and detecting it in its early stages helps save lives. Radiologists can predict whether mammography images show cancer or not, but they may miss about 15% of cases. In this paper, we propose a new method to detect breast cancer with high accuracy. The method consists of two main parts. In the first part, image processing techniques are used to prepare the mammography images for the feature and pattern extraction process. In the second part, the extracted features are used as input for two types of supervised learning models, the Back Propagation Neural Network (BPNN) model and the Logistic Regression (LR) model, and the results and accuracy of both models are compared.

Keywords— Breast Cancer, Image Processing, Mammography, Machine Learning.

I. INTRODUCTION

Breast cancer is the second most dangerous cancer after lung cancer. Early detection can save lives because the tumor is then easier to treat and can be prevented from expanding. A tumor is an abnormal growth of cells.

For many years, X-ray imaging was the only method used to detect breast cancer [1, 2]. However, many other methods have since been proposed for the detection process that are more efficient than the X-ray procedure, such as neural networks [3], artificial intelligence, and data mining.

Every woman can perform a monthly self-examination, using her hand to check for any abnormal growth; another option is to visit a specialist doctor for a mammography test. Mammography is “the process of using low-dose X-rays to examine the human breast and is used as a diagnostic as well as a screening tool” [4].

Image processing techniques are used to convert images from one format to another and to extract features from images, which helps to obtain a more useful data set. A large number of applications related to human activities use image processing, from remote position interpretation to biomedical image interpretation [5].

Artificial neural networks (ANNs) are among the most common models in machine learning; they simulate the neural network of the human body, which consists of many neurons in many layers.

A group of neurons in the neural network can perform separate functions at the same time [6]. A neural network has a learning stage, in which the weights are adjusted to obtain the desired output, and a testing stage, in which the network is tested to measure its detection accuracy. Generally, there are three types of learning: supervised learning, which needs a teacher; unsupervised learning, which works without a teacher; and hybrid learning, which lies between supervised and unsupervised learning [7].

Another type of supervised learning algorithm is Logistic Regression (LR), a statistical classification model. It is used to predict which of several possible outcomes occurs. The mathematical form of the logistic function is shown in the following equation.

$$ f(x) = \frac{1}{1 + e^{-x}} $$

Fig. 1: Logistic Function Curve

By estimating the logistic function, the Logistic Regression (LR) model produces a value between zero and one, which makes it appealing for assessing a risk factor, that is, the risk of disease.

The first section of this paper is the introduction; the literature review on breast cancer detection is presented next, followed by the explanation of the

978-1-5090-5487-9/17 $31.00 © 2017 IEEE
DOI 10.1109/DeSE.2016.8
experiment, and finally the conclusions and future work are presented.

II. RELATED WORKS

Many studies in the field of breast cancer detection have introduced a variety of concepts and methods. Many researchers have presented methods and algorithms to detect breast cancer; here we briefly discuss some of them.

Zhang et al. used an Ultra-Wide Band (UWB) antenna to obtain microwave images of the breast containing cancerous information. They used gelatin-oil technology to obtain experimental results that demonstrate the efficiency of these microwave images in breast cancer detection. With UWB the detection process was more accurate, and the signal-to-noise ratios showed that the noise was concentrated around the tumor [8].

In [9], the authors proposed a method that uses a Gaussian pulse generator in a UWB application to make image-based breast cancer diagnosis faster. It requires a static inverter with a phase detector and an NMOS pulse-shaping circuit to accelerate the system's detection process.

A logistic Generalized Additive Model (GAM) was proposed by Roca-Pardinas et al. They used linear kernel smoothers with this logistic GAM and sped up their system using several techniques. In this simulation model they used odds-ratio curves. Using this model helps in the early detection of breast cancer [10].

In another study, a Back Propagation Neural Network was used for breast cancer detection, and the authors compared the results with another model based on a radial basis function network; the best results were obtained by the BPNN [11].

Jin et al. used a direct subtraction beamforming imager for breast cancer detection. They used numerical and electromagnetic simulation in their model and found that it has high resolution and robustness in detecting breast cancer [12].

Another model for breast cancer detection was proposed by Sajjadieh et al. They used an electromagnetic model based on the Finite Difference Time Domain (FDTD) method. This model gives higher accuracy in tumor detection than other signal processing algorithms [13].

III. EXPERIMENT AND METHODOLOGY

The proposed method for breast cancer detection consists of two main parts: image processing techniques and machine learning algorithms. These algorithms were implemented using Matlab software. In this work we extracted 209 images from the cases of 50 patients who have breast cancer, and the testing stage was applied to many people, whether or not they have breast cancer.

A. Image processing

We applied a sequence of image processing functions to the mammography images to generate images suitable as input for the machine learning algorithms.

The first process was image cropping, in which a specific part of the image (the margin) was deleted and the remaining part was extracted. In our dataset, the images all have margins of the same size, so a static cropping process is easy to apply, as shown in Fig. 2.

Fig. 2. Cropping Process

To enhance the images we used image filtering. Our data set has blur effects, and we used a Wiener filter to eliminate this noise. In Fig. 3 we can note that there are fewer white lines than in the original image.

In many works, researchers preferred to convert the images into binary (black and white) images because of the useful information they can obtain for the detection process. In our work, however, we kept the images in grayscale format, because when we tried to convert them we noticed that the white dot that represents the tumor vanished from the image, which means that the useful information would be lost, as shown in Fig. 3.

The filtered images were transformed from the spatial domain into the frequency domain using the Discrete Wavelet Transformation (DWT). The output of the wavelet transform is known as the wavelet decomposition, which consists of four matrices: the approximation coefficient matrix, the horizontal detail coefficient matrix, the vertical detail coefficient matrix, and the diagonal detail coefficient matrix. In our work, we used only the last three matrices; the first matrix was not used (see Fig. 4). The values in these matrices were used as input for our learning algorithm; we

can observe that the values shown in these plots are distributed around zero.

Fig. 3. Filtered Image

This distribution around zero was used in our model to extract information. By generating a new matrix that holds the zero-crossing counts, we obtained the input values for the learning model. The target outputs were 0's for normal images and 1's for images that have a tumor, so we needed to normalize the data by dividing the values in each column by the maximum value of that column.

Fig. 4. Plot of coefficient matrices

B. Machine Learning Algorithm

In our work, supervised learning was used. Specifically, we used two supervised machine learning algorithms, Logistic Regression and the Backpropagation Neural Network, and we compared their results.

Logistic Regression (LR) was used to classify the mammography images. Like any machine learning algorithm, LR needs a hypothesis and a cost function. The following equations show the hypothesis and the cost function.

Here, the weights are the parameters of the hypothesis, the x's are the input values, and the y's are the output values. Our goal is now to optimize the cost function, which can be achieved by repeating equation 4 many times until the desired cost is reached.

Several parameters had to be set to obtain the optimal value, such as the following:

• Value of alpha = 0.45

• Number of iterations = 1000

• Number of features = 750

Fig. 5 shows the cost function for different values of alpha during the training process.

Fig. 5. Error values of the cost function
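For reference, the standard textbook forms of the logistic regression hypothesis, cost function, and gradient-descent update are sketched below. The notation ($\theta$ for the weights, $m$ for the number of training images, $\alpha$ for the learning rate) is an assumption and is not taken from the paper; the last line is presumably the update that the text refers to as equation 4.

```latex
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
  \left[ y^{(i)} \log h_\theta\!\left(x^{(i)}\right)
       + \left(1 - y^{(i)}\right) \log\!\left(1 - h_\theta\!\left(x^{(i)}\right)\right) \right]

\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m}
  \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right) x_j^{(i)}
```

Because the hypothesis is a logistic function of the weighted inputs, its output always lies between zero and one and can be read as the estimated probability that an image contains a tumor.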

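As an illustration of the LR training procedure described above, the following minimal Python sketch runs batch gradient descent with the reported learning rate (alpha = 0.45) and iteration count (1000). The paper's implementation was written in Matlab and is not reproduced here; the function names, the two-feature toy data, and the cross-entropy form of the cost are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(theta, x):
    """Hypothesis: sigmoid of the weighted sum (theta[0] is the bias)."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return sigmoid(z)


def cost(theta, X, y):
    """Average cross-entropy cost over the training set."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        h = predict(theta, xi)
        total += yi * math.log(h + eps) + (1 - yi) * math.log(1 - h + eps)
    return -total / len(X)


def train(X, y, alpha=0.45, iterations=1000):
    """Batch gradient descent; returns the weights and the cost history."""
    m, n = len(X), len(X[0])
    theta = [0.0] * (n + 1)
    history = []
    for _ in range(iterations):
        errors = [predict(theta, xi) - yi for xi, yi in zip(X, y)]
        grad = [sum(errors) / m]  # gradient for the bias term
        for j in range(n):
            grad.append(sum(e * xi[j] for e, xi in zip(errors, X)) / m)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
        history.append(cost(theta, X, y))
    return theta, history


# Hypothetical toy data: two features already scaled to [0, 1].
random.seed(0)
X = [[random.uniform(0.0, 0.4), random.uniform(0.0, 0.4)] for _ in range(20)]
X += [[random.uniform(0.6, 1.0), random.uniform(0.6, 1.0)] for _ in range(20)]
y = [0] * 20 + [1] * 20  # 0 = normal, 1 = tumor

theta, history = train(X, y)
labels = [1 if predict(theta, xi) >= 0.5 else 0 for xi in X]
```

Plotting `history` for several values of `alpha` would give curves of the kind that Fig. 5 reports for the real data.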
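Both learning models take the normalized zero-crossing counts from Section III-A as input. The paper does not spell out how the zero-crossing matrix is built, so the sketch below makes an assumption: it counts sign changes along each row of the horizontal, vertical, and diagonal detail coefficient matrices, and then divides every feature column by its maximum over all images. The function names and the tiny coefficient matrices are hypothetical.

```python
def zero_crossings(values):
    """Count sign changes in a sequence, skipping exact zeros."""
    signs = [1 if v > 0 else -1 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def image_features(detail_matrices):
    """One zero-crossing count per row of each detail coefficient matrix."""
    return [zero_crossings(row) for matrix in detail_matrices for row in matrix]


def normalize_columns(feature_rows):
    """Divide every feature column by its maximum; all-zero columns stay 0."""
    maxima = [max(col) for col in zip(*feature_rows)]
    return [[v / m if m else 0.0 for v, m in zip(row, maxima)]
            for row in feature_rows]


# Hypothetical 2x4 detail matrices (horizontal, vertical, diagonal) per image.
image_a = [
    [[0.5, -0.2, 0.1, -0.4], [0.3, 0.2, -0.1, -0.6]],
    [[-0.1, -0.3, 0.2, 0.4], [0.7, -0.7, 0.7, 0.1]],
    [[0.2, 0.1, 0.3, 0.5], [-0.2, 0.0, 0.4, -0.1]],
]
image_b = [
    [[0.1, 0.2, 0.3, 0.4], [0.1, -0.1, 0.1, -0.1]],
    [[-0.5, 0.5, -0.5, 0.5], [0.2, 0.2, 0.2, 0.2]],
    [[0.3, -0.3, -0.3, -0.3], [0.9, 0.8, -0.7, 0.6]],
]

raw = [image_features(image_a), image_features(image_b)]
normalized = normalize_columns(raw)
```

After normalization every feature lies in the range 0 to 1, which matches the scale of the 0/1 target labels.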
The second learning algorithm used in our work was the Backpropagation Neural Network (BPNN).

Our aim remained to find an optimized method for detection and classification. The BPNN is easy to implement and has been widely used for classification purposes. The configuration parameters of our neural network are shown in Table 1. However, we could not use the same number of features as in the LR model because of Matlab memory limitations. We used 264 features with LR.

Table 1. The Configuration Parameters of Neural Network

Parameter                                Value
Number of hidden layers                  1
Number of neurons in the second layer    10
Number of neurons in the first layer     240
Used function                            Triangular
Epochs                                   1000

IV. RESULT AND DISCUSSION

In our proposed system we used 209 images that were extracted from the cases of 50 patients. These images were used for the training, testing, and validation processes.

Fig. 6. Regression of Neural Network

Fig. 7. MSE of Neural Network

The data were divided into 70% for training, 20% for testing, and 10% for validation.

Fig. 8. Gradient of Neural Network

Each image in our dataset has a resolution of 1024x1024. Our dataset consists of 96 normal images and 110 affected images; we arranged the labels as a vector consisting of 96 zeros and 110 ones, and this vector was used in the LR and BPNN models.

The output obtained with the training vector that we prepared for the machine learning models is shown in Fig. 6 for the BPNN, where the regression value of the neural network model exceeded 93.7%. Moreover, we can see that the Mean Squared Error (MSE) value is less than 0.07; the gradient performance of the neural network is shown in Fig. 8.

V. CONCLUSIONS

In this paper, we focused on breast cancer, a very dangerous disease that causes the death of many women around the world, and we proposed a method to diagnose this disease and give information about the patient's status. Our proposed model consists mainly of two parts: the first uses image processing techniques for feature extraction, and the second applies two supervised machine learning algorithms, LR and BPNN. We observed that the number of features used in the LR model was much higher than with the BPNN. Nevertheless, we also obtained a good regression value with the BPNN, exceeding 93% with only 240 features.

REFERENCES

[1] M. Brown, F. Houn, E. Sickles, and L. Kessler, “Screening mammography in community practice”, Amer. J. Roentgen, vol. 165, 1995.
[2] M. Alhadidi, M. Al-Gawagzeh, B. Alsaaidah, “Solving A Mammography Problems of Breast Cancer Detection Using Artificial Neural Networks And Image Processing techniques”, Indian Journal of Science and Technology, Vol. 5, No. 4, 2012.
[3] E. D. Ubeyli, “Implementing automated diagnostic systems for breast cancer detection”, Elsevier, Expert Systems with Applications, vol. 33, 2007.
[4] D. Kulkarni, S. Bhagyashree, G. Udupi, “Texture analysis of mammographic images”, International Journal of Computer Applications, vol. 5, 2010.
[5] T. Acharya, A. Ray, “Image processing: principles and applications”, Wiley-Interscience, Hoboken, New Jersey, ISBN 0471719986, 2005.
[6] A. Hopgood, Intelligent systems for engineers and scientists, Library of Congress Cataloging in Publication Data, 2000.
[7] H. Zhang, T. Arslan, B. Flynn, “A Single Antenna Based microwave System for Breast Cancer Detection: Experimental Results”, IEEE, 2013.
[8] E. Y. K. Ng, E. C. Kee, “Advanced integrated technique in breast cancer thermography”, Journal of Medical Engineering and Technology, Vol. 32, No. 2, 2008.
[9] S. H. Barboza, J. A. Palacio, E. Pontes, S. Kofuji, “Fifth Derivative Gaussian Pulse Generator for UWB Breast Cancer Detection System”, IEEE, 2014.
[10] J. Roca-Pardiñas, C. Cadarso-Suárez, P. Tahoces, M. Lado, “Assessing continuous bivariate effects among different groups through nonparametric regression models: An application to breast cancer detection”, Elsevier, Computational Statistics and Data Analysis, Vol. 52, 2008.
[11] P. Pawar, D. Patil, “Breast Cancer Detection Using Neural Network Models”, IEEE, International Conference on Communication Systems and Network Technologies, 2013.
[12] Y. Jin, J. Moura, Y. Jiang, “Breast Cancer Detection By Time Reversal Imaging”, IEEE, 2008.
[13] M. Sajjadieh, F. Foroohar, A. Asif, “Breast Cancer Detection using Time Reversal Signal Processing”, IEEE, 2009.
