ML Assignment 3
1. Logistic Regression
How it works: Logistic regression predicts the probability of a binary outcome by passing a
linear combination of the independent variables through the logistic (sigmoid) function, which
maps any real value to a probability between 0 and 1. The learned coefficients describe how
each independent variable shifts the log-odds of the binary dependent variable.
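A minimal sketch of this in Python (scikit-learn on synthetic data; the dataset and parameters
here are illustrative, not part of the assignment):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative synthetic binary-classification data
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

model = LogisticRegression().fit(X, y)

# The model's probability is the sigmoid of a linear combination:
# P(y=1 | x) = 1 / (1 + exp(-(w.x + b)))
z = X @ model.coef_.ravel() + model.intercept_[0]
manual_prob = 1 / (1 + np.exp(-z))
print(np.allclose(manual_prob, model.predict_proba(X)[:, 1]))  # True
```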
Key strengths:
1. Simple to implement and interpret.
2. Effective for binary classification problems.
Limitations:
1. Assumes linearity between independent variables and the log-odds.
2. Can struggle with complex, non-linear relationships.
2. K-Nearest Neighbors (KNN)
How it works: KNN classifies a new point by finding the k closest training points and
assigning the class held by the majority of those neighbors. There is no explicit training
step; the stored data itself serves as the model.
Key strengths:
1. Simple and intuitive.
2. No training phase, making it fast for small datasets.
Limitations:
1. Computationally expensive for large datasets.
2. Sensitive to irrelevant or redundant features.
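A minimal sketch of the neighbor vote described above (scikit-learn on synthetic data; k=5 is
an illustrative choice):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Fitting" only stores the training data; the real work happens at
# prediction time, when each query is compared to every stored point.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print(knn.score(X_test, y_test))
```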
3. Decision Tree
How it works: A decision tree is a tree-like model of decisions and their possible
consequences. It repeatedly splits the dataset into subsets based on the value of an input
feature, testing one feature at each internal node until a leaf assigns a class.
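A minimal sketch (scikit-learn on the built-in iris dataset; the depth limit is an illustrative
choice), printing the learned splits so the tree can be read directly:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# max_depth caps tree growth, a common guard against overfitting;
# depth 2 is an illustrative choice.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Each node tests one feature against a threshold.
print(export_text(tree, feature_names=list(iris.feature_names)))
```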
Key strengths:
1. Easy to understand and visualize.
2. Can handle both numerical and categorical data.
Limitations:
1. Prone to overfitting.
2. Can be unstable with small variations in data.
4. Support Vector Machine (SVM)
How it works: SVM finds the hyperplane that separates the classes with the maximum margin.
Kernel functions let it handle non-linear boundaries by implicitly mapping the data into a
higher-dimensional space where a linear separator exists.
Key strengths:
1. Effective in high-dimensional spaces.
2. Robust to overfitting, especially with the proper kernel choice.
Limitations:
1. Computationally intensive, especially with large datasets.
2. Less effective with overlapping classes or non-linear problems without an appropriate
kernel.
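A minimal sketch of how much the kernel choice matters (scikit-learn on a synthetic two-moons
dataset; the noise level is illustrative):

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-moons: a boundary no straight line can follow
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

# A linear kernel underfits here; an RBF kernel follows the curvature.
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(kernel, clf.score(X, y))
```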
5. Dataset with Noise (e.g., data with many irrelevant or misleading features)
Recommended Algorithm: K-Nearest Neighbors (KNN)
Explanation: KNN is relatively robust to noisy data because it classifies each point by a
majority vote among its k nearest neighbors. With a sufficiently large k, individual mislabeled
or noisy instances are outvoted and have little effect on the overall classification. While it
may not be the most computationally efficient choice for large datasets, its simplicity and
resilience to noisy instances make it a strong candidate for this scenario; since KNN distances
can be distorted by irrelevant features, pairing it with feature scaling or feature selection
further improves its robustness.
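A small experiment sketch of this effect (scikit-learn; the noise rate and k values are
illustrative assumptions): flipping a fraction of the training labels hurts k=1 far more than a
larger k.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Flip 20% of the training labels to simulate label noise
rng = np.random.default_rng(0)
flip = rng.random(len(y_train)) < 0.2
y_noisy = np.where(flip, 1 - y_train, y_train)

# With k=1 a single flipped neighbor decides the class; with a larger k
# the majority vote tends to outvote the mislabeled points.
for k in (1, 15):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_noisy)
    print(k, knn.score(X_test, y_test))
```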