Decision Tree Code Explanation
1. Importing Libraries
python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn import tree
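numpy (as np): For working with arrays and numerical data
matplotlib.pyplot (as plt): For drawing plots
load_breast_cancer: Loads the built-in breast cancer dataset
train_test_split: Splits data into training and testing sets
DecisionTreeClassifier: The decision tree model
accuracy_score: Measures the fraction of correct predictions
tree: Provides plot_tree for drawing the trained tree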
2. Loading the Dataset
python
data = load_breast_cancer()
X = data.data
y = data.target
What it does:
data = load_breast_cancer() : Loads the breast cancer dataset (569 samples with 30 features each)
X = data.data : Gets the input features (measurements such as tumor radius, texture, etc.)
y = data.target : Gets the labels (0 = malignant, 1 = benign)
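To check what was loaded, a short sketch like the following prints the array shape, the class names, and a few feature names. It uses only attributes of the object returned by load_breast_cancer; the choice to show three feature names is arbitrary.

```python
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target

# One row per sample, one column per feature.
print("Feature matrix shape:", X.shape)

# Class names are indexed by label value: target_names[0] is label 0, and so on.
print("Class names:", list(data.target_names))

# The first few feature names, to see what the columns measure.
print("First features:", list(data.feature_names[:3]))
```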
3. Splitting the Data
What it does:
Splits the data into training (80%) and testing (20%) sets
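The 80/20 split described here can be sketched with train_test_split. The test_size=0.2 value follows from the stated percentages; random_state=42 is an assumption, chosen only to make the split repeatable.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X, y = data.data, data.target

# test_size=0.2 holds out 20% of the samples for testing.
# random_state=42 is an assumed value that makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("Training samples:", X_train.shape[0])
print("Testing samples:", X_test.shape[0])
```

The variable names X_train, X_test, y_train, and y_test match the ones used in the training and prediction steps.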
4. Training the Model
python
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
What it does:
The model learns patterns by asking questions like "Is the tumor radius > 15?" and building a tree of decisions from the answers.
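To see which question the trained tree actually asks first, one option is to read the root node from the fitted model's tree_ attribute. This is a sketch: the split settings are assumed (test_size=0.2, random_state=42), and note that scikit-learn phrases each test as "feature <= threshold" rather than "feature > value".

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42  # assumed split settings
)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Node 0 is the root. tree_.feature stores the index of the feature tested
# at each node, and tree_.threshold the value it is compared against.
# Samples whose feature value is <= the threshold go to the left child.
root_feature = data.feature_names[clf.tree_.feature[0]]
root_threshold = clf.tree_.threshold[0]

print(f"Root question: is '{root_feature}' <= {root_threshold:.3f}?")
print("Tree depth:", clf.get_depth())
print("Number of leaves:", clf.get_n_leaves())
```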
5. Making Predictions
python
y_pred = clf.predict(X_test)
What it does:
Uses the trained model to predict outcomes for the test data
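As an illustration, the first few predictions can be printed next to the true labels. The sketch assumes the same split settings as elsewhere (test_size=0.2, random_state=42), and showing five rows is an arbitrary choice.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42  # assumed split settings
)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# predict returns one label (0 or 1) per row of X_test.
y_pred = clf.predict(X_test)

# Translate label values into class names for the first five test samples.
for predicted, actual in zip(y_pred[:5], y_test[:5]):
    print(f"predicted={data.target_names[predicted]:<10} actual={data.target_names[actual]}")
```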
6. Calculating Accuracy
What it does:
Compares the model's predictions (y_pred) with the actual answers (y_test) and reports the fraction that match
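The comparison can be sketched with accuracy_score, which is already imported in step 1. The variable name accuracy, the print format, and the split settings (test_size=0.2, random_state=42) are assumptions. The manual calculation is included only to show what the score means.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42  # assumed split settings
)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# accuracy_score takes the true labels first, then the predictions.
accuracy = accuracy_score(y_test, y_pred)

# The same number by hand: the fraction of predictions equal to the true label.
manual_accuracy = np.mean(y_pred == y_test)

print(f"Accuracy: {accuracy * 100:.2f}%")
```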
7. Predicting a New Sample
python
new_sample = np.array([X_test[0]])
prediction = clf.predict(new_sample)
prediction_class = "Benign" if prediction[0] == 1 else "Malignant"
print(f"Predicted Class for the new sample: {prediction_class}")
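What it does:
Takes the first test sample (X_test[0]), wraps it in a 2D array because predict expects one row per sample, and prints "Benign" if the predicted label is 1, otherwise "Malignant"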
8. Visualizing the Decision Tree
python
plt.figure(figsize=(12,8))
tree.plot_tree(clf, filled=True, feature_names=data.feature_names, class_names=data.target_names)
plt.title("Decision Tree - Breast Cancer Dataset")
plt.show()
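What it does:
Draws the trained tree. Each node shows a test on one feature (a feature value compared with a threshold):
If yes → go left branch
If no → go right branch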
Each path from top to bottom represents a different rule for classification, making the model
interpretable and easy to understand!
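The same top-to-bottom rules can also be read as text. The sketch below uses export_text from sklearn.tree, which is not part of the original code, and assumes the same split settings as elsewhere (test_size=0.2, random_state=42). Printing only the first 12 lines is an arbitrary choice to keep the output short.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42  # assumed split settings
)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# export_text renders the tree as indented if/else rules.
# Each "|---" line is one test, and each "class:" line is a leaf.
rules = export_text(clf, feature_names=list(data.feature_names))

# Show only the beginning of the rule listing.
print("\n".join(rules.splitlines()[:12]))
```

Each indented line corresponds to one node in the plotted tree, so a path from the first line down to a "class:" line is one complete classification rule.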