Logistic Regression vs Decision Tree
1. Model Type:
● Logistic Regression is a linear model that predicts the probability of class membership using a sigmoid function.
● Decision Tree is a non-linear, tree-based model that splits the data based on feature values.

2. Decision Boundary:
● Logistic Regression creates a straight-line (linear) boundary.
● Decision Tree can form complex, non-linear boundaries through recursive splitting.

3. Interpretability:
● Logistic Regression is highly interpretable, with meaningful coefficients.
● Decision Tree is also interpretable, but can become complex if grown too deep.

4. Handling Non-linearity and Interactions:
● Logistic Regression struggles with non-linearity unless the features are transformed.
● Decision Tree naturally handles non-linear relationships and feature interactions.

Conclusion:
● Use Logistic Regression when the data is linearly separable and model simplicity is key.
● Use a Decision Tree for non-linear data and when interpretability through rules is preferred (see the sketch below).
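To make the comparison concrete, here is a minimal sketch, assuming scikit-learn and its make_moons toy dataset (neither is mentioned in the notes), that fits both models on the same non-linear data and compares their test accuracy:

```python
# Minimal sketch: Logistic Regression vs a Decision Tree on non-linear data
# (assumes scikit-learn; dataset and hyperparameters are illustrative).
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Two interleaving half-moons: not separable by a single straight line.
X, y = make_moons(n_samples=500, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Linear model: sigmoid applied to a linear combination of the features.
log_reg = LogisticRegression().fit(X_train, y_train)

# Tree model: recursive axis-aligned splits on feature values.
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)

print("Logistic Regression accuracy:", log_reg.score(X_test, y_test))
print("Decision Tree accuracy:      ", tree.score(X_test, y_test))
```

On data like this, the tree's piecewise boundary usually follows the curved class boundary better, while the logistic model is limited to one straight line unless the features are transformed.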
3. Update Step: Calculate new centroids by taking the mean of all points assigned to each cluster.

4. Repeat Steps 2 and 3 until the centroids no longer change significantly (i.e., convergence).

3(b) Prove that your algorithm will converge in a finite number of iterations

Proof of Finite Convergence:
● At each iteration, K-Means:
○ Minimizes the within-cluster sum of squared distances (WCSS), its cost function.
○ Reassigns each point to its nearest centroid → the cost function decreases or stays the same.
○ Updates each centroid to the mean of its assigned points → further reduces (or preserves) the WCSS.
● There are only finitely many ways to assign n points to k clusters, and the cost function never increases; with consistent tie-breaking, no assignment can repeat, so K-Means must converge after a finite number of steps, even if the result is not optimal (it may be a local minimum). The sketch below checks this monotonicity numerically.
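A minimal NumPy sketch of the assignment and update steps; the function name, the random initialization, and the toy data below are illustrative assumptions, not taken from the notes. It records the WCSS after every iteration so the non-increasing cost used in the proof can be inspected directly:

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain K-Means: assignment step, update step, repeat until stable."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # random initial centroids
    history = []  # WCSS after each iteration; should never increase

    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])

        # WCSS under the current assignment and the updated centroids.
        history.append(((X - new_centroids[labels]) ** 2).sum())

        # Stop once the centroids no longer move (convergence).
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return labels, centroids, history

# Example: the recorded WCSS values form a non-increasing sequence.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in (0.0, 3.0, 6.0)])
labels, centroids, history = kmeans(X, k=3)
print(history)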
● BIC penalizes more complex models (higher k) to avoid overfitting.
● Choose the number of clusters k where the BIC is minimized (see the sketch below).
● It helps find a balance between model complexity and accuracy.
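The notes do not say how the BIC is computed here. One common, readily available route is scikit-learn's GaussianMixture (with spherical covariances, a model closely related to K-Means), whose .bic() method applies exactly this kind of complexity penalty; that substitution is an assumption of this sketch:

```python
# Sketch of BIC-based selection of k (assumes scikit-learn's GaussianMixture).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.4, size=(60, 2)) for c in (0.0, 3.0, 6.0)])  # 3 true clusters

bic_scores = {}
for k in range(1, 8):
    gm = GaussianMixture(n_components=k, covariance_type="spherical", random_state=0).fit(X)
    bic_scores[k] = gm.bic(X)  # lower is better: fit quality plus a penalty for more components

best_k = min(bic_scores, key=bic_scores.get)
print(bic_scores)
print("k chosen by BIC:", best_k)  # expected to land near the true value of 3
```

Larger k always fits the data at least as well, so without the penalty term the score would keep improving; the BIC minimum marks the point where extra clusters stop paying for their added complexity.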
3. Calculate error: (see the sketch after this list)
Formula:

Decrease weights of correctly classified samples.

6. Repeat steps 2–5:
For a fixed number of rounds or until the desired performance is reached.

7. Final Model:
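These numbered steps follow the usual AdaBoost-style boosting recipe. Assuming that, the sketch below fills in the standard formulas the notes leave out (the weighted error for step 3, the learner weight alpha, and the exponential weight update); the decision-stump weak learner and the toy dataset are illustrative choices:

```python
# A minimal AdaBoost-style boosting loop (binary labels y in {-1, +1}).
# The weighted-error formula, the learner weight alpha, and the exponential
# weight update are the standard AdaBoost choices, assumed here because the
# notes list the steps but not the formulas.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
y = np.where(y == 1, 1, -1)

n = len(X)
w = np.full(n, 1.0 / n)           # 1. Initialize: equal sample weights
learners, alphas = [], []

for t in range(50):               # 6. Repeat steps 2-5 for a fixed number of rounds
    # 2. Train a weak learner (a decision stump) on the weighted data.
    stump = DecisionTreeClassifier(max_depth=1, random_state=t).fit(X, y, sample_weight=w)
    pred = stump.predict(X)

    # 3. Calculate error: weighted fraction of misclassified samples.
    err = np.sum(w * (pred != y)) / np.sum(w)
    if err >= 0.5:                # weak learner no better than chance: stop early
        break

    # 4. Learner weight: alpha = 0.5 * ln((1 - err) / err).
    alpha = 0.5 * np.log((1 - err) / (err + 1e-12))

    # 5. Update weights: increase weights of misclassified samples,
    #    decrease weights of correctly classified ones, then renormalize.
    w *= np.exp(-alpha * y * pred)
    w /= w.sum()

    learners.append(stump)
    alphas.append(alpha)

# 7. Final model: sign of the alpha-weighted vote of all weak learners.
F = sum(a * clf.predict(X) for a, clf in zip(alphas, learners))
print("Training accuracy of the boosted ensemble:", np.mean(np.sign(F) == y))
```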