Breast Cancer Survival Prediction With Machine Learning
I hope you now have an overview of the dataset we are using for
the task of breast cancer survival prediction. This dataset was
collected from Kaggle. You can download this dataset from here.
In the section below, I will walk you through the task of
predicting breast cancer survival with machine learning using
Python.
import pandas as pd
import numpy as np
import plotly.express as px
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the dataset and preview the first five rows
data = pd.read_csv("BRCA.csv")
print(data.head())
Negative
4 II Infiltrating Ductal Carcinoma Positive Positive
Negative
Patient_Status
0 Alive
1 Dead
2 Alive
3 Alive
4 Dead
print(data.isnull().sum())
Patient_ID 7
Age 7
Gender 7
Protein1 7
Protein2 7
Protein3 7
Protein4 7
Tumour_Stage 7
Histology 7
ER status 7
PR status 7
HER2 status 7
https://thecleverprogrammer.com/2022/03/08/breast-cancer-survival-prediction-with-machine-learning/ 4/17
7/7/23, 9:55 PM Breast Cancer Survival Prediction with Machine Learning | Aman Kharwal
Surgery_type 7
Date_of_Surgery 7
Date_of_Last_Visit 24
Patient_Status 20
dtype: int64
So this dataset has some null values in each column. I will drop
the rows that contain these null values:
data = data.dropna()
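To see what `dropna()` does before applying it to the full dataset, here is a minimal sketch on a small hypothetical DataFrame. The toy values are assumptions for illustration, not rows from BRCA.csv. By default, `dropna()` removes every row that has at least one missing value.

```python
import numpy as np
import pandas as pd

# Toy data, NOT taken from BRCA.csv: two of the four rows contain a missing value
toy = pd.DataFrame({
    "Age": [36.0, 43.0, np.nan, 58.0],
    "Patient_Status": ["Alive", None, "Dead", "Alive"],
})

null_counts = toy.isnull().sum()   # missing values per column
clean = toy.dropna()               # drop rows with at least one missing value

print(null_counts)
print(f"rows before: {len(toy)}, rows after: {len(clean)}")
```

Note that the surviving rows keep their original index labels, which is why an index can have gaps after rows are dropped.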
Now let’s have a look at the insights about the columns of this
data:
data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 317 entries, 0 to 333
Data columns (total 16 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Patient_ID 317 non-null object
1 Age 317 non-null float64
2 Gender 317 non-null object
3 Protein1 317 non-null float64
4 Protein2 317 non-null float64
5 Protein3 317 non-null float64
6 Protein4 317 non-null float64
7 Tumour_Stage 317 non-null object
8 Histology 317 non-null object
9 ER status 317 non-null object
10 PR status 317 non-null object
11 HER2 status 317 non-null object
12 Surgery_type 317 non-null object
print(data.Gender.value_counts())
FEMALE 313
MALE 4
Name: Gender, dtype: int64
# Tumour Stage
stage = data["Tumour_Stage"].value_counts()
labels = stage.index
counts = stage.values

figure = px.pie(values=counts,
                names=labels, hole=0.5,
                title="Tumour Stages of Patients")
figure.show()
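A pie chart displays each category's share of the total. As a small sketch of what the chart above encodes, `value_counts(normalize=True)` returns those shares directly. The stage labels below are hypothetical toy values, not the real dataset column.

```python
import pandas as pd

# Toy stage labels (an assumption for illustration, not the real BRCA.csv column)
toy_stage = pd.Series(["II", "II", "III", "I", "II", "III", "II", "I"])

counts = toy_stage.value_counts()                 # absolute frequency per stage
shares = toy_stage.value_counts(normalize=True)   # fraction of the total per stage

# Combine both views; the two Series are aligned on the stage label
summary = pd.DataFrame({"count": counts, "percent": (shares * 100).round(1)})
print(summary)
```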
So most of the patients are in the second stage. Now let’s have
a look at the histology of breast cancer patients. (Histology is a
description of a tumour based on how abnormal the cancer cells
and tissue look under a microscope and how quickly cancer can
grow and spread):
# Histology
histology = data["Histology"].value_counts()
labels = histology.index
counts = histology.values
figure = px.pie(values=counts,
                names=labels, hole=0.5,
                title="Histology of Patients")
figure.show()
# ER status
print(data["ER status"].value_counts())
# PR status
print(data["PR status"].value_counts())
# HER2 status
print(data["HER2 status"].value_counts())
Positive 317
Name: ER status, dtype: int64
Positive 317
Name: PR status, dtype: int64
Negative 288
Positive 29
Name: HER2 status, dtype: int64
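The output above shows that, after dropping null values, ER status and PR status contain only the value Positive, so these two columns cannot help a model distinguish between patients. Here is a minimal sketch of how such constant columns could be detected with `nunique()`. The toy frame is an assumption that only mimics this pattern.

```python
import pandas as pd

# Toy frame mimicking the pattern above (values are illustrative assumptions)
toy = pd.DataFrame({
    "ER status": ["Positive"] * 5,
    "PR status": ["Positive"] * 5,
    "HER2 status": ["Negative", "Negative", "Positive", "Negative", "Negative"],
})

# A column with a single distinct value carries no information for a classifier
unique_counts = toy.nunique()
constant_columns = unique_counts[unique_counts <= 1].index.tolist()

print(constant_columns)
```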
# Surgery_type
surgery = data["Surgery_type"].value_counts()
labels = surgery.index
counts = surgery.values
figure = px.pie(values=counts,
                names=labels, hole=0.5,
                title="Type of Surgery of Patients")
figure.show()
data["Tumour_Stage"] = data["Tumour_Stage"].map({"
data["Histology"] = data["Histology"].map({"Infilt
                                            "Infilt
data["ER status"] = data["ER status"].map({"Positi
data["PR status"] = data["PR status"].map({"Positi
data["HER2 status"] = data["HER2 status"].map({"Po
data["Gender"] = data["Gender"].map({"MALE": 0, "F
data["Surgery_type"] = data["Surgery_type"].map({"
                                                  "
print(data.head())
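The code above encodes each categorical column as integers with `Series.map()` and a dictionary. The dictionaries are not fully visible above, so the sketch below uses a hypothetical mapping on toy data; the integer codes are assumptions, not the article's values. It also shows one thing to watch for: any label missing from the dictionary becomes NaN.

```python
import pandas as pd

# Toy column; the labels and integer codes are illustrative assumptions
toy = pd.DataFrame({"Tumour_Stage": ["I", "II", "III", "II", "IV"]})
stage_codes = {"I": 1, "II": 2, "III": 3}   # hypothetical mapping, "IV" left out on purpose

encoded = toy["Tumour_Stage"].map(stage_codes)

# Labels absent from the dictionary are mapped to NaN, so check before training
unmapped = toy.loc[encoded.isna(), "Tumour_Stage"].unique().tolist()
print(encoded.tolist())
print("unmapped labels:", unmapped)
```

Running a check like this after encoding is a cheap way to catch a misspelled or unexpected category before it reaches the model as a missing value.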
# Splitting data
x = np.array(data[['Age', 'Gender', 'Protein1', 'P
                   'Tumour_Stage', 'Histology', 'E
                   'HER2 status', 'Surgery_type']]
y = np.array(data[['Patient_Status']])
xtrain, xtest, ytrain, ytest = train_test_split(x,
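`train_test_split` shuffles the rows and holds out part of them for testing. The call above is cut off, so the sketch below runs on synthetic data, and `test_size=0.2`, `random_state=42`, and `stratify=y` are assumptions rather than the article's settings. Stratifying keeps the Alive/Dead ratio the same in both parts, and keeping `y` as a 1-D array avoids the warning that scikit-learn estimators emit when fitted on column-vector labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix and labels (not the real data)
x = rng.normal(size=(100, 4))
y = np.array(["Alive"] * 80 + ["Dead"] * 20)   # 1-D label array

xtrain, xtest, ytrain, ytest = train_test_split(
    x, y,
    test_size=0.2,      # assumption: hold out 20% of the rows
    random_state=42,    # assumption: fixed seed for reproducibility
    stratify=y,         # keep the class ratio the same in both splits
)

print(xtrain.shape, xtest.shape)
print("Dead in test set:", int((ytest == "Dead").sum()))
```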
model = SVC()
model.fit(xtrain, ytrain)
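The model above is fitted but never scored. Here is a minimal sketch of how a held-out split could be used to measure accuracy, on synthetic data from `make_classification` rather than the real dataset. Wrapping `SVC` in a pipeline with `StandardScaler` is a swapped-in addition, not the article's method: SVMs are sensitive to feature scale, and the features here mix ages with much smaller protein values.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data standing in for the encoded features
x, y = make_classification(n_samples=300, n_features=8, random_state=0)
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.2, random_state=42, stratify=y
)

# Scale the features, then fit the same SVC classifier used above
model = make_pipeline(StandardScaler(), SVC())
model.fit(xtrain, ytrain)

# Score only on rows the model has not seen during training
accuracy = accuracy_score(ytest, model.predict(xtest))
print(f"test accuracy: {accuracy:.3f}")
```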
Now let’s input all the features that we have used to train this
machine learning model and predict whether a patient will
survive breast cancer or not:
# Prediction
# features = [['Age', 'Gender', 'Protein1', 'Prote
features = np.array([[36.0, 1, 0.080353, 0.42638,
print(model.predict(features))
['Alive']
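`predict` expects a 2-D array with one row per patient, which is why the feature values above are wrapped in double brackets. A small sketch with a toy model shows the same shape requirement and how a flat list of values can be reshaped into a single row. The two features, their values, and the training rows are all assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# Toy training data with two made-up features; not the article's dataset
xtrain = np.array([[30.0, 0.1], [35.0, 0.2], [70.0, 0.9], [75.0, 1.0]])
ytrain = np.array(["Alive", "Alive", "Dead", "Dead"])

model = SVC()
model.fit(xtrain, ytrain)

# A single patient given as a flat array must be reshaped to (1, n_features)
new_patient = np.array([32.0, 0.15]).reshape(1, -1)
prediction = model.predict(new_patient)

print(new_patient.shape)
print(prediction)
```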
Summary
So this is how we can use machine learning for the task of
breast cancer survival prediction. As the use of data in
healthcare is very common today, we can use machine learning
to predict whether a patient will survive a deadly disease like
breast cancer. I hope you liked this article on breast cancer
survival prediction with machine learning using Python.
Feel free to ask valuable questions in the comments section
below.