
Breast Cancer Prediction

Breast cancer is the most common cancer among women in India, with an alarming increase in cases among younger age groups. This project focuses on enhancing early detection using machine learning algorithms, achieving higher diagnostic accuracy compared to traditional methods. The study emphasizes the importance of early detection to improve patient outcomes as breast cancer cases continue to rise, particularly in younger populations.

Uploaded by

Kalyan Potnuru

BREAST CANCER PREDICTION

Breast cancer is a group of diseases in which cells in breast tissue change and divide uncontrollably, typically resulting in a lump or mass. Most breast cancers begin in the lobules (milk glands) or in the ducts that connect the lobules to the nipple.
1. INTRODUCTION
• Breast cancer is the most common cancer among women in India, in cities and in rural areas alike.

• Am I worried about the trends of breast cancer in India? Yes, very much. Can we change the trend? No, of course not. What we can change is our 'ignorance', our 'lack of awareness'. You will understand what I mean from the discussions in this section. Breast cancer is a global disease. Though most underlying causes and other features are broadly uniform around the world, every region has its own peculiarities. Let us assess the trend of breast cancer in India.
ABSTRACT

This project aims to enhance early breast cancer detection using various machine learning algorithms, including Logistic Regression, SVM, KNN, and Random Forest. These models are trained on patient data, focusing on features like clump thickness and cell size. By evaluating each algorithm's performance through cross-validation, we identify the most effective method for accurate diagnosis. This study not only highlights algorithm strengths and weaknesses but also aims to improve patient outcomes with better diagnostic tools.
• The key challenge in cancer detection is how to classify tumours as malignant or benign. Machine learning techniques can dramatically improve the accuracy of diagnosis. Research indicates that most experienced physicians can diagnose cancer with 79 percent accuracy, while 91 percent correct diagnosis is achieved using machine learning techniques. In this case study, our task is to classify tumours as malignant or benign using features obtained from several cell images. The first step in the cancer diagnosis process is a fine needle aspirate (FNA), which simply extracts some of the cells from the tumour. At that stage we do not know whether the tumour is malignant or benign. The accompanying cell images illustrate the difference: one shows a benign tumour, and the one on the right shows a malignant tumour.
Increasing Incidence of Breast Cancer in Younger Age Groups (30s and 40s)

• In India, more and more patients are being diagnosed with breast cancer in the younger age groups (in their thirties and forties). Please consider the adjoining graph (this is only a rough representation for your understanding; the data used are rough estimates):
• The horizontal axis shows the age groups: 20 to 30 years, 30 to 40 years, and so on. The vertical axis shows the percentage of cases. Blue represents the incidence 25 years ago, and maroon represents the situation today. 25 years ago, out of every 100 breast cancer patients, 2 were in the 20 to 30 age group, 7 were in the 30 to 40 group, and so on; 69% of the patients were above 50 years of age. Presently, 4% are in the 20 to 30 age group, 16% are in the 30 to 40 group, and 28% are in the 40 to 50 group, so roughly 48% of patients are below 50. An increasing number of patients are 25 to 40 years of age, and this is a very disturbing trend.
• Of course, one reason for the higher number of younger patients is our population pyramid, which is broad at the base and middle and narrow at the top: we have a huge population in the younger age groups and a much smaller one in the older age groups.
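The share of patients below 50 quoted above can be checked with a few lines of arithmetic. This is only a sketch on the rough estimates from the graph; treating everyone who is not above 50 as "below 50" for the earlier period is an assumption made here for illustration.

```python
# Present-day shares of breast cancer patients per age group (percent),
# as quoted in the text (rough estimates).
today = {"20-30": 4, "30-40": 16, "40-50": 28}
below_50_today = sum(today.values())

# 25 years ago the text gives 69% above 50; counting all remaining
# patients as below 50 is an assumption made for this sketch.
below_50_before = 100 - 69

print(below_50_today)   # 48
print(below_50_before)  # 31
```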
INCREASING INCIDENCE OF
BREAST CANCER IN INDIA

• Before the 2008 statistics, cancer of the cervix was the most common cancer in women in India, and breast cancer was the second most common.

• When the 2008 statistics came out, breast cancer had overtaken cervical cancer as the most common cancer in Indian cities, although cervical cancer was still common in rural areas.

• With the 2012 statistics report, breast cancer in India made a huge jump. The numbers rose so rapidly that the difference between breast and cervical cancer became quite stark: in most Indian cities, breast cancer amounted to almost 25 to 32 percent of all cancers in women, whereas cervical cancer was around 8 to 9 percent.

• And the numbers just keep rising. You can read a detailed analysis and discussion of breast cancer in India in the section

• The upper image shows a bar graph of the percentage distribution of the top ten cancers in females in Mumbai for the years 2006 to 2008. You can clearly see how common breast cancer already was back in 2008. As of May 2020, this number has only risen further.

• The lower image shows the trend of breast cancer in a few cities like
2. VISUALIZATION OF OUR FEATURES
• The adjacent figure shows the first 4 features from our data and how they relate to each other.

• Note that '0' denotes a benign tumour while '1' denotes a malignant tumour.

• One of the many observations is that mean radius and mean perimeter have a linear relationship.

• Whenever the mean area increases with respect to mean radius, the chances of a benign cell increase. This means that the patient can be safe.

• When the mean smoothness is in the middle of its range, the mean area decides whether the cell is benign or malignant: the higher the area, the greater the chance of a benign tumour.

• The same reading applies to each and every comparison. Note that we have only compared the first 5 features with each other; we have 30 features in total.
• A scatter plot of mean area against mean smoothness.
• The lower the mean area and mean smoothness, the higher the chances of a malignant tumour.
• The main problem here is that our data are not scaled, so the exact differences cannot be read off the plot. The qualitative conclusion still holds.
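Scaling is mentioned above but not shown. A minimal sketch of one common choice, standardization to zero mean and unit standard deviation, is given below; the function name `standardize` and the sample numbers are made up for illustration and are not taken from our dataset.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Made-up values on the very different scales the two features live on.
mean_area = [400.0, 600.0, 800.0, 1200.0]
mean_smoothness = [0.08, 0.09, 0.10, 0.12]

scaled_area = standardize(mean_area)
scaled_smoothness = standardize(mean_smoothness)
print(scaled_area)
print(scaled_smoothness)
```

After this step both features are expressed in standard deviations from their own mean, so distances on the two axes of a scatter plot become comparable.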
• The adjacent figure is a heatmap of our whole dataset, including our y variable (the target).
• Mean radius, mean perimeter and mean area are among the features most highly correlated with the worst perimeter feature.
• Correlations of this kind with our target variable can also be seen in the heatmap.
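Each cell of such a heatmap is a correlation coefficient between two columns. The sketch below computes Pearson correlations by hand on a few made-up rows; the helper names and the numbers are assumptions for illustration, and the perimeter is generated as 2πr (an idealized circle) only to show what a perfectly linear pair looks like.

```python
from math import pi, sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Pairwise correlations for a dict of column name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up values, not rows from our dataset.
radius = [10.0, 12.0, 14.0, 18.0, 20.0]
columns = {
    "mean radius": radius,
    "mean perimeter": [2 * pi * r for r in radius],  # idealized circle
    "mean smoothness": [0.10, 0.08, 0.11, 0.09, 0.10],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, [round(v, 2) for v in row.values()])
```

The target column can be added to `columns` in the same way to obtain its correlation with every feature.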
3. MODEL EVALUATION
3.1 RANDOM FOREST

Random Forest classification is a robust and widely used ensemble learning method in machine learning. This technique operates by constructing a multitude of decision trees during training and outputs the mode of the classes for classification tasks. By averaging multiple deep decision trees, Random Forest corrects for overfitting, providing a high level of accuracy and generalizability.
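The "mode of the classes" step can be sketched in a few lines. The three "trees" below are hand-written stand-ins rather than trained decision trees, and the feature names `f1` and `f2` are hypothetical; only the voting logic is the point of the example.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common class label among the trees' predictions."""
    return Counter(predictions).most_common(1)[0][0]

def forest_predict(trees, sample):
    """Ask every tree for a label and return the majority label."""
    return majority_vote([tree(sample) for tree in trees])

# Hand-written stand-ins for trained trees (hypothetical thresholds).
trees = [
    lambda s: 1 if s["f1"] > 0.5 else 0,
    lambda s: 1 if s["f2"] > 0.3 else 0,
    lambda s: 1 if s["f1"] + s["f2"] > 1.0 else 0,
]

sample = {"f1": 0.7, "f2": 0.2}
print(forest_predict(trees, sample))  # 0: two of the three trees vote 0
```

Using an odd number of trees avoids ties in a two-class problem.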
SUPPORT VECTOR CLASSIFICATION

• Support Vector Classification (SVC) is a powerful and versatile supervised machine learning algorithm that excels in classification tasks. It is part of the Support Vector Machine (SVM) family and works by finding the optimal hyperplane that best separates the data points of different classes in a high-dimensional space.
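Once a separating hyperplane w·x + b = 0 has been found, classifying a new point only requires checking which side of it the point falls on. The sketch below shows that decision rule for a linear hyperplane; the weights `w` and offset `b` are hypothetical values chosen by hand, not the result of training on our data.

```python
from math import sqrt

def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Label 1 on the non-negative side of the hyperplane, 0 on the other."""
    return 1 if decision_value(w, b, x) >= 0 else 0

def margin_width(w):
    """Width of the margin between the planes w.x + b = +1 and w.x + b = -1."""
    return 2 / sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane in a two-feature space.
w = [3.0, 4.0]
b = -5.0

print(classify(w, b, [1.0, 1.0]))  # 1, since 3 + 4 - 5 = 2
print(classify(w, b, [0.0, 0.0]))  # 0, since the score is -5
print(margin_width(w))             # 0.4, since the norm of w is 5
```

Training an SVM amounts to choosing w and b so that this margin is as wide as possible while the classes stay on their own sides.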
K-NEIGHBORS CLASSIFIER
The K-Nearest Neighbors (K-NN) algorithm is a fundamental and intuitive machine learning technique used for classification and regression tasks. This non-parametric algorithm works by identifying the k nearest data points to the query point, based on a chosen distance metric (such as Euclidean distance). In classification, it assigns the class label most frequently represented among these neighbors, making it a powerful tool for pattern recognition and decision-making.
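The procedure described above fits in a short function. This is a minimal sketch using Euclidean distance on made-up two-feature points; the function name `knn_predict` and the data are assumptions for illustration, not our project code.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Label the query with the most frequent class among its k nearest points."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: dist(train_points[i], query))
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up, already scaled points in a two-feature space.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(points, labels, (1.1, 1.0)))  # 0
print(knn_predict(points, labels, (3.0, 3.0)))  # 1
```

Because the method relies on distances, features should be brought to a comparable scale before it is applied.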
ALGORITHM COMPARISON

In the landscape of machine learning algorithms for breast cancer detection, we compare a spectrum of methods, from the traditional Logistic Regression and Support Vector Machine (SVM) to more complex architectures like Gradient Boosting and LightGBM. This comparative study evaluates the performance of each algorithm using cross-validation, ensuring robustness in our results. Each model is trained and validated on the dataset, providing accuracy metrics that highlight its predictive power. A bar plot of this comparison shows the strengths and potential weaknesses of each algorithm, guiding us toward the most effective method for early and accurate detection.
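The cross-validation loop behind such a comparison can be sketched as follows. To keep the example self-contained, it compares two toy classifiers (a 1-nearest-neighbour rule and a majority-class baseline) rather than the models named above, and it uses made-up points; all function names here are hypothetical.

```python
from collections import Counter
from math import dist

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(fit_predict, X, y, k=3):
    """Mean accuracy of fit_predict(train_X, train_y, test_X) over k folds."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train_X = [x for i, x in enumerate(X) if i not in held_out]
        train_y = [t for i, t in enumerate(y) if i not in held_out]
        preds = fit_predict(train_X, train_y, [X[i] for i in fold])
        correct = sum(p == y[i] for p, i in zip(preds, fold))
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

def nearest_neighbour(train_X, train_y, test_X):
    """Toy model 1: copy the label of the closest training point."""
    return [train_y[min(range(len(train_X)),
                        key=lambda i: dist(train_X[i], q))] for q in test_X]

def majority_class(train_X, train_y, test_X):
    """Toy model 2: always predict the most frequent training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return [label for _ in test_X]

# Made-up points; the classes alternate so every fold contains both.
X = [(1.0, 1.0), (3.0, 3.0), (1.1, 0.9), (3.1, 2.9), (0.9, 1.2), (2.9, 3.1)]
y = [0, 1, 0, 1, 0, 1]

results = {
    "nearest neighbour": cross_val_accuracy(nearest_neighbour, X, y),
    "majority class": cross_val_accuracy(majority_class, X, y),
}
for name, score in results.items():
    print(f"{name}: {score:.2f}")
```

The `results` dictionary holds one mean accuracy per model, which is the quantity a comparison bar plot would display.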
RESULT

The adjacent figure shows the result of a breast cancer prediction. Depending on the prediction result, where 2 indicates breast cancer and any other value indicates no breast cancer, the program prints a message stating whether the patient has breast cancer or not.
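The message logic described above can be sketched as a small function. The encoding (2 means breast cancer, anything else means no breast cancer) follows the description in this slide; the function name and the exact wording of the messages are assumptions.

```python
def describe_prediction(prediction):
    """Turn a predicted class value into a human-readable message."""
    # Encoding as stated in the slide: 2 means breast cancer.
    if prediction == 2:
        return "The patient has breast cancer."
    return "The patient does not have breast cancer."

print(describe_prediction(2))
print(describe_prediction(0))
```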
4. CONCLUSION
• It is important to understand that breast cancer can occur in the younger population as well. We saw that 37.7% of all newly detected breast cancer cases were in the 25 to 49 years age group. Many think of breast cancer as an illness of the elderly, but it is not so. Also, many breast cancers in young women tend to be very aggressive and have to be tackled with a proper sequence of surgery and other treatments. Needless to say, early detection gives high cure rates.
• All the above points lead to one necessity: 'Detect Breast Cancer Early'. The number of cases is rising, more younger women are being affected, and most present only after symptoms develop (so usually at stage 2B and beyond, rarely at an earlier stage). Since we cannot prevent this cancer, all we can do is detect it early.
