Breast Cancer Classification Using Machine Learning
Decision Tree
A decision tree is a classifier with a tree structure.
Test items are classified according to their feature
values. In a decision tree, each internal node represents
a test on a feature, each branch represents an outcome of
that test, and each leaf node represents a class label.
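To make the roles of nodes, branches, and leaves concrete, the following is a minimal hand-built sketch in Python. The feature names (`mean_radius`, `mean_texture`), the thresholds, and the names `Node`, `Leaf`, `classify`, and `toy_tree` are hypothetical and chosen only for illustration; the tree is not learned from the dataset used in this work.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # class label stored at a leaf node


@dataclass
class Node:
    feature: str                # feature tested at this internal node
    threshold: float            # hypothetical split value
    below: Union["Node", Leaf]  # branch taken when value <= threshold
    above: Union["Node", Leaf]  # branch taken when value > threshold


def classify(tree, sample):
    """Follow branches from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.below if sample[tree.feature] <= tree.threshold else tree.above
    return tree.label


# Illustrative tree only; the thresholds are made up, not fitted to data.
toy_tree = Node(
    "mean_radius", 15.0,
    below=Node("mean_texture", 20.0, below=Leaf("benign"), above=Leaf("malignant")),
    above=Leaf("malignant"),
)

print(classify(toy_tree, {"mean_radius": 12.0, "mean_texture": 18.0}))  # prints: benign
```

In practice the tests and thresholds are chosen by a training algorithm rather than written by hand; the sketch only shows how a fitted tree routes one test item to a class label.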
Random Forest
A random forest is a collection of separate decision
trees. Each decision tree produces a classification
prediction. The forest determines the test object's class
by combining the votes from its decision trees.
2. Feature Selection
When a machine learning method is given redundant
features, its accuracy drops [16]. We applied feature
selection to the data to increase the accuracy of the
models and reduce overfitting. To choose features from
the dataset for the conventional models, we employed two
strategies. First, for the Decision Tree and Random
Forest models, we compute the feature importances from
the latest training result and pick features accordingly.
Second, we use correlation, which shows how the dataset's
features are related to one another. The heatmap view
makes it easy to see which features are strongly
correlated, and we plot the heatmap using the seaborn
library. From each set of strongly correlated features we
keep only one feature to represent the whole set. To avoid
Experimental Section
The experimental section included the following steps.
1. Importing libraries
2. Exploratory Data Analysis (EDA)
a. Providing file path
b. Reading data from the file
9. Getting the correlation
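The correlation step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the experiments: the data is synthetic, the column names (`radius`, `perimeter`, `texture`) and the 0.9 cut-off are assumptions, and `plot_heatmap` is a hypothetical helper that wraps seaborn's `heatmap` function.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names and values are assumptions.
rng = np.random.default_rng(0)
radius = rng.normal(14.0, 3.5, size=200)
df = pd.DataFrame({
    "radius": radius,
    "perimeter": 2 * np.pi * radius + rng.normal(0.0, 1.0, size=200),
    "texture": rng.normal(19.0, 4.0, size=200),
})

# Absolute pairwise Pearson correlation between features.
corr = df.corr().abs()

# Keep only the upper triangle so each pair is examined once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

threshold = 0.9  # assumed cut-off for "strongly correlated"
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
reduced = df.drop(columns=to_drop)

print("dropped:", to_drop)
print("kept:", list(reduced.columns))


def plot_heatmap(corr_matrix):
    """Optional: draw the heatmap with seaborn (not called here)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.heatmap(corr_matrix, annot=True, cmap="coolwarm")
    plt.show()
```

Because `perimeter` is constructed as a near-linear function of `radius`, the pair is strongly correlated and only one of the two is kept, which mirrors the idea of letting one feature represent each correlated group.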
10. Training of the dataset
logistic regression accuracy: 0.9912087912087912
Decision tree accuracy: 1.0
Random forest accuracy: 0.9978021978021978
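A training step of this kind might look like the sketch below. It assumes scikit-learn's bundled Wisconsin breast cancer dataset as a stand-in for the file read during EDA (the file path is not given in the text), and the 80/20 split, random seeds, scaling step, and hyperparameters are all assumptions, so the printed numbers will not match the figures reported above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled dataset stands in for the file read in the
# EDA step, whose path is not given in the text.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Assumed model settings; the original configuration is not stated.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        "train": accuracy_score(y_train, model.predict(X_train)),
        "test": accuracy_score(y_test, model.predict(X_test)),
    }
    print(f"{name}: train={results[name]['train']:.4f} "
          f"test={results[name]['test']:.4f}")
```

The sketch reports accuracy on both the training split and the held-out split. An unpruned decision tree typically reaches an accuracy of 1.0 on the data it was trained on, so comparing the two values gives a rough sense of how much each model overfits.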
11. Compiling the data