Cancer Prediction Using Machine Learning

The document discusses a project using machine learning models to predict three types of cancer: breast cancer, lung cancer, and prostate cancer. For breast cancer prediction, the model uses SVM and considers attributes like clump thickness and cell size/shape to determine if a cancer is malignant or benign. For lung cancer prediction, the model uses Random Forest and considers attributes like smoking history. For prostate cancer prediction, the model also uses Random Forest and considers attributes like radius and texture to determine likelihood of having cancer. The goal is to make cancer prediction easier using machine learning techniques.

2022 2nd International Conference on Innovative Practices in Technology and Management (ICIPTM)

Cancer Prediction using Machine Learning


Ganta Sruthi, Department of CSE, Chandigarh University, Chandigarh, India
Chokkakula Likitha Ram, Department of CSE, Chandigarh University, Chandigarh, India
Malegam Koushik Sai, Student, B. Tech., Dept. of CSE, Chandigarh University, Chandigarh, India
Bhanu Pratap Singh, Student, B. Tech., Dept. of CSE, Chandigarh University, Chandigarh, India
Nikhil Majhotra, Student, B. Tech., Dept. of CSE, Chandigarh University, Chandigarh, India
Neha Sharma, Department of CSE, Chandigarh University, Chandigarh, India

978-1-6654-6643-1/22/$31.00 ©2022 IEEE | DOI: 10.1109/ICIPTM54933.2022.9754059


Abstract—Machine learning is increasingly being employed in cancer detection and diagnosis. In the future, cancer prediction may become much easier and possible without visiting a hospital. Many technologies are already being used and tested in the medical field, which suggests that detecting cancer will become easier. We test which algorithm gives the best result among CART, SVM, and KNN. We build a cancer prediction system using machine learning that covers three types of cancer: breast cancer, lung cancer, and prostate cancer. For breast cancer we use the SVM algorithm, and for lung and prostate cancer we use the Random Forest algorithm. Each of the three systems has its own set of attributes, which the user enters to get a result. For breast cancer, we consider attributes such as clump thickness, uniform cell size, and uniform cell shape, and the prediction result is whether the cancer is malignant or benign. For lung cancer, we consider smoking, yellow fingers, anxiety, peer pressure, etc. For prostate cancer, we consider radius, texture, perimeter, area, etc. For both lung and prostate cancer, the result is the likelihood of being affected by the cancer.

Keywords—Machine Learning, Breast Cancer, Lung Cancer, Prostate Cancer, Support Vector Machine (SVM), Random Forest, Dataset

I. INTRODUCTION

A tumor contains cancer cells that can spread to other parts of the body. Breast cancer is a leading cause of mortality among women worldwide and one of the most frequent and life-threatening malignant tumors. Early discovery and treatment of any disease, including cancer, can increase the chance of survival and reduce the chance of undergoing expensive treatment for benign tumors. Early detection also helps doctors give patients the appropriate treatment. A cancer diagnosis is a key prerequisite for such socioeconomic advantages.

Cancer can be detected by different methods, one of which is ML-based detection.

Figure 1.1 Cancer cells

Machine learning involves creating a model that is trained on certain training data and can then analyse other data to generate predictions [1]. For this machine learning system, several types of models have been explored and researched. There are various types of machine learning models, among them SVM, Random Forest, Decision Tree, and Logistic Regression.

Nowadays, machine learning is growing steadily in various fields, one of which is the medical field. Here, machine learning algorithms build a model that predicts the outcome of a disease from previous instances recorded in datasets, using patterns and correlations among a large number of cases.

This research paper is about cancer prediction using machine learning techniques. In this project, we discuss two machine learning models: SVM and Random Forest. We selected three datasets for our project: breast cancer, lung cancer, and prostate cancer. We include images of these three datasets.

According to data provided by the World Health Organization (WHO), about two million new cases of breast cancer were reported in 2018, of which 626,679 resulted in death [2]. This situation shows the importance of machine learning [3].

To detect breast cancer, we use the SVM model. The parameters we consider are clump thickness, uniform cell size, uniform cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland


chromatin, normal nucleoli, and mitoses. We plotted the correlation matrix of these features, as shown in Figure 1.2 [4].

Figure 1.2 Feature Correlation of Breast Cancer

Lung cancer is a type of cancer that occurs in the lungs. Smoking is commonly known to cause lung cancer. People who do not smoke can also get this cancer, but people who smoke a lot have a higher chance of getting it. In lung cancer, cells start growing abnormally in the lungs, form tumors, and start spreading to other parts of the body. For detection, we use the Random Forest model. The parameters we check are smoking, yellow fingers, anxiety, peer pressure, chronic disease, fatigue, allergy, wheezing, alcohol consumption, coughing, shortness of breath, swallowing difficulty, and chest pain, as depicted in Figure 1.3 [5].

Figure 1.3 Dataset of Lung Cancer

Prostate cancer is a type of cancer that occurs inside the prostate gland, so it affects only men. Usually, this cancer grows slowly and then attacks the prostate gland. The parameters we consider for prostate cancer are radius, texture, perimeter, area, smoothness, compactness, symmetry, etc. The accuracy of different algorithms is shown in Figure 1.4 [6].

Figure 1.4 Accuracy Comparison of Prostate Cancer

Machine learning plays a critical role in the analysis of massive volumes of data. Without depending on a precise instruction set, computer systems use machine learning to perform a given task based on trends and patterns [12]. Machine learning can be applied in different fields to produce accurate results more quickly from the given data. With these techniques, doctors can detect cancer at a much earlier stage. In this research paper, we use pictures of cancerous cells for training and testing in order to classify them as malignant or benign and obtain the best result as quickly as possible.

II. REQUIREMENTS

A. Existing System

Different researchers have used different methods and technologies to build cancer prediction systems. Some of the important research papers are discussed here.

Effectiveness of Data Mining-based Cancer Prediction System [7] is a system that evaluates the risk of breast, skin, and lung cancers. This model is useful because it presents multiple cancer models. It takes into account both genetic and non-genetic data to calculate risk. However, the Weka toolkit can handle only small datasets, and the model requires sophisticated numerical solutions, so its practical use is limited. We are attempting to resolve this issue.

Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks [8] is a paper that uses very deep CNNs to overcome the difficulties of melanoma recognition in dermoscopy images. It is ideal because deep neural

networks allow the model to learn regardless of data limitations, but they take a significant amount of processing power to train. We are trying to minimize the amount of processing power our model needs.

TABLE I. LITERATURE REVIEW
Project Title | Year | Algorithm | Advantages | Disadvantages
Comparison of Logistic Regression and Artificial Neural Network Models in Breast Cancer Risk Estimation [9] | 2010 | ANN and logistic regression | Presents cancer risk estimation using two models based on different techniques and compares them | Creating an ANN is based on less domain knowledge than creating a logistic regression model
Effectiveness of Data Mining-based Cancer Prediction System [7] | 2013 | Data mining, WEKA | Considers both genetic and non-genetic information and estimates the risk level | Uses the Weka toolkit, which can handle only small datasets
Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks [8] | 2016 | Deep learning, CNN, MATLAB | Deeper neural networks allow the model to learn irrespective of data limitations | Needs a lot of computational power to train the networks
Lung Cancer Detection using Deep Convolutional Networks [10] | 2018 | Deep learning and neural networks | Helps doctors make better-informed decisions when diagnosing lung cancer | Requires a large amount of data
Cancer Prediction Using Machine Learning Algorithms [11] | 2019 | SVM | Used to classify normal persons and tumor patients | Not suitable for large datasets

III. PROPOSED METHODOLOGY


Figure 3.1 Flow Chart

Algorithm

1. The webpage loads, and the user is offered the option of selecting one of three cancer types.

2. The user selects the cancer prediction system that he or she wishes to use.

3. The user then completes the form based on their symptoms.

4. The data is then sent to the route that deals with the machine learning models, and the result is returned on the following route.

• Webpages of the project

This is the homepage of the website, which shows breast cancer, lung cancer, and prostate cancer.

Figure 3.2 Home Page

This webpage is for breast cancer. The pink ribbon indicates breast cancer. On this webpage, we can see the parameters used to check whether the cancer is malignant or benign.

Figure 3.3 Breast cancer page

This webpage is for lung cancer. The grey ribbon indicates lung cancer. On this webpage, we can see the parameters used to predict whether the person is susceptible to lung cancer.

Figure 3.4 Lung Cancer page

This webpage is for prostate cancer. The blue ribbon indicates prostate cancer. On this webpage, we can see the parameters used to predict whether the person is susceptible to prostate cancer.
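
The four steps of the Algorithm (choose a cancer type, fill in the form, send the data to the model route, return the result) can be illustrated with a small dispatcher. This is only a sketch under stated assumptions: the names `ROUTES`, `handle_form`, `predict_breast`, and `predict_likelihood` are hypothetical, and the two prediction functions are placeholder rules standing in for the trained SVM and Random Forest models, not the models themselves.

```python
from typing import Callable, Dict, List


# Hypothetical placeholder rules standing in for the trained models.
# A real deployment would call the fitted SVM / Random Forest instead.
def predict_breast(features: List[float]) -> str:
    return "malignant" if sum(features) / len(features) > 5 else "benign"


def predict_likelihood(features: List[float]) -> str:
    return "likely" if sum(features) >= len(features) / 2 else "unlikely"


# Steps 1-2: the three options offered on the home page, one handler each.
ROUTES: Dict[str, Callable[[List[float]], str]] = {
    "breast": predict_breast,
    "lung": predict_likelihood,
    "prostate": predict_likelihood,
}


def handle_form(cancer_type: str, form: Dict[str, str]) -> str:
    """Steps 3-4: parse the submitted form and send it to the chosen model."""
    if cancer_type not in ROUTES:
        raise ValueError(f"unknown cancer type: {cancer_type!r}")
    if not form:
        raise ValueError("the form is empty")
    # Form fields arrive as text, so convert them before prediction.
    features = [float(value) for value in form.values()]
    return ROUTES[cancer_type](features)
```

For example, `handle_form("breast", {"clump_thickness": "8", "uniform_cell_size": "7"})` returns `"malignant"` under the placeholder rule. Keeping one table from cancer type to handler means a further prediction system could be added without changing the form-handling code.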

Figure 3.5 Prostate cancer page

Figure 3.6 Team Page

Figure 3.7 Testing result of Algorithm

IV. CONCLUSION

We concentrated on cancer prediction in this paper because cancer is an extremely serious disease that kills many people around the world. We considered three types of cancer: breast cancer, lung cancer, and prostate cancer. Breast cancer prognosis is important in medical care and biomedicine. One goal of this work was to create a classifier that could predict breast cancer, the most serious of these malignancies. We developed a collaborative strategy for diagnosing this disease and providing information on the patient's condition. We treated breast cancer prediction as a classification task and developed a Support Vector Machine (SVM) approach to classify breast cancer as benign or malignant. A Random Forest classifier was employed in the lung cancer and prostate cancer prediction systems, with the aim of predicting the likelihood of a person developing lung or prostate cancer based on a set of common factors.
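
The two kinds of output summarised above (a benign/malignant label from the SVM, a likelihood from the Random Forest) can be contrasted in a short sketch. Everything here is an assumption for illustration: the weights `W`, the bias `B`, and the stump rules in `TREES` are made up rather than learned, the SVM is shown with a linear decision rule although the paper does not state which kernel was used, and the forest likelihood is simplified to the fraction of trees voting for the positive class.

```python
from typing import Callable, List, Sequence


def svm_label(x: Sequence[float], w: Sequence[float], b: float) -> str:
    """Linear SVM decision rule: the sign of w.x + b selects the class."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "malignant" if score >= 0 else "benign"


def forest_likelihood(
    x: Sequence[float], trees: List[Callable[[Sequence[float]], int]]
) -> float:
    """Fraction of trees that vote for the positive class (1)."""
    if not trees:
        raise ValueError("the forest has no trees")
    votes = [tree(x) for tree in trees]
    return sum(votes) / len(votes)


# Hypothetical, hand-picked parameters (not learned from any dataset).
W = [0.5, 0.5]
B = -5.0
TREES = [
    lambda x: 1 if x[0] >= 1 else 0,          # e.g. smoking
    lambda x: 1 if x[1] >= 1 else 0,          # e.g. yellow fingers
    lambda x: 1 if x[0] + x[1] >= 2 else 0,   # both of the above
    lambda x: 1 if x[2] >= 1 else 0,          # e.g. chest pain
]
```

With these made-up values, `svm_label([8, 7], W, B)` gives `"malignant"` because the score is 2.5, and `forest_likelihood([1, 0, 1], TREES)` gives 0.5 because two of the four stumps vote positive.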
V. LIMITATION AND FUTURE SCOPE

Our project has some limitations that will need to be addressed in the future. Because the entire project currently runs offline, its scope is constrained. More cancer prediction models can be introduced over time. There is no database to keep track of the values entered by the user.

In the future, we will be able to improve our project. The project can be hosted online to expand its scope. Cancer could also be predicted from image or X-ray data. A database of user inputs could be maintained so that the models can learn and improve.
REFERENCES
[1] S. Sayed, "Machine Learning Is The Future Of Cancer Prediction,"
2018.
[2] H. Masood, "Breast Cancer Detection using Machine Learning
Algorithm," International Research Journal of Engineering and
Technology (IRJET), vol. 08, no. 02, 2021.
[3] Md. Milon Islam, Md. Rezwanul Haque, Hasib Iqbal, Md. Munirul Hasan, Mahmudul Hasan and Muhammad Nomani Kabir, "Breast Cancer Prediction: A Comparative Study Using Machine Learning Techniques," 01 September 2020.
[4] A. Jain, "Exploring Breast Cancer Data set," 29 April 2018.
[5] V. N. Pham, "Lung cancer dataset and visualization," 22 February
2019.
[6] Georgina Cosma, Stéphanie E. McArdle, Stephen Reeder, Gemma A.
Foulds and Simon Hood, "Identifying the Presence of Prostate Cancer
in Individuals Using Computational Data Extraction Analysis of High
Dimensional Peripheral Blood Flow Cytometric Phenotyping Data," 18
December 2017.
[7] A. Priyanga and S. Prakasam, "Effectiveness of Data Mining - based Cancer Prediction System," International Journal of Computer Applications, vol. 83, 2013.
[8] Lequan Yu, Hao Chen, Qi Dou and Jing Qin, "Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks," IEEE Transactions on Medical Imaging, 2016.
[9] Turgay Ayer, Jagpreet Chhatwal, Oguzhan Alagoz, Charles E. Kahn, Ryan W. Woods and Elizabeth S. Burnside, "Comparison of Logistic Regression and Artificial Neural Network Models in Breast Cancer Risk Estimation," 2010.
[10] J. Salomon, "Lung Cancer Detection using Deep Convolutional Networks," 2018.
[11] M. Agrawal, "Cancer Prediction Using Machine Learning Algorithms," International Journal of Science and Research, 2018.
[12] S. Sharma and V. M. Mishra, "Development of Sleep Apnea Device by detection of blood pressure and heart rate measurement," Int J Syst Assur Eng Manag, vol. 12, pp. 145–153, 2021. https://doi.org/10.1007/s13198-020-01041-3
