Machine_Learning_Notes
1. Distance-Based Methods
Definition: Methods that measure the similarity or difference between data points in terms of distance metrics, such as Euclidean or Manhattan distance.
Key Idea: Data points closer in distance are assumed to have similar labels or values.
Example: Predicting house prices by finding houses nearby with similar features.
Formula:
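The notes do not say which metric is meant here. Assuming it is the Euclidean distance (a common default), the distance between two points x and y with n features is:

```latex
d(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
```

For instance, with x = (1, 2) and y = (4, 6) this gives sqrt(3^2 + 4^2) = 5.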
Working:
1. Calculate the distance between the test point and all training points.
2. Select the k training points with the smallest distances (the nearest neighbors).
3. Assign the most common class label among the neighbors to the test point.
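The working above can be sketched in plain Python. This is a minimal illustration, not a tuned implementation: the function name knn_predict, the choice of Euclidean distance, the default k=3, and the fruit measurements are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, test_point, k=3):
    """Classify test_point by majority vote among its k nearest training points."""
    # Distance from the test point to every training point (Euclidean, assumed).
    distances = [
        (math.dist(point, test_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k training points with the smallest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical fruit data: (diameter in cm, weight in g).
train_points = [(7.0, 150.0), (7.5, 160.0), (6.8, 140.0),
                (3.0, 20.0), (2.8, 18.0), (3.2, 25.0)]
train_labels = ["apple", "apple", "apple", "plum", "plum", "plum"]

prediction = knn_predict(train_points, train_labels, (7.2, 155.0), k=3)
print(prediction)
```

Here the three closest training points are all apples, so the test fruit is labeled "apple". In practice the features would usually be scaled first, since the weight column would otherwise dominate the distance.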
Applications:
- Image recognition.
- Recommender systems.
Advantages: Easy to understand and implement.
Disadvantages:
Example: Classifying a fruit based on its features like color, size, and shape.
3. Decision Trees
Definition: A tree-structured model in which internal nodes represent decisions or tests on attributes, branches represent outcomes, and leaves represent class labels or values.
Key Concepts:
Working:
1. Select the attribute with the highest information gain to split the data.
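Information gain is the drop in label entropy after splitting on an attribute. The sketch below shows one way to compute it and pick the best attribute. It assumes categorical attributes stored as dictionaries; the helper names (entropy, information_gain, best_attribute) and the toy fruit rows are illustrative only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    total = len(labels)
    weighted = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - weighted

def best_attribute(rows, labels, attributes):
    """Attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# Hypothetical data: color separates the classes perfectly, size not at all.
rows = [
    {"color": "red", "size": "big"},
    {"color": "red", "size": "small"},
    {"color": "green", "size": "big"},
    {"color": "green", "size": "small"},
]
labels = ["apple", "apple", "pear", "pear"]

chosen = best_attribute(rows, labels, ["color", "size"])
print(chosen)
```

Splitting on color leaves two pure groups (gain of 1 bit), while splitting on size leaves two mixed groups (gain of 0), so color is chosen.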
Advantages:
Disadvantages:
- Prone to overfitting.
4. Naive Bayes
Bayes' Theorem: P(C | X) = P(X | C) * P(C) / P(X), where C is a class and X is the observed feature vector.
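Steps:
1. Compute the prior probability of each class from the training data.
2. Compute the likelihood of each feature given each class, treating features as conditionally independent.
3. Compute posterior probability and assign the class with the highest posterior.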
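A minimal sketch of these steps for word-based spam detection. It assumes a multinomial model with add-one (Laplace) smoothing and works in log space to avoid underflow; the function names and the tiny training set are made up for illustration. The denominator P(X) is dropped because it is the same for every class and does not change which class scores highest.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(documents, labels):
    """Count documents per class and words per class; documents are lists of words."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in zip(documents, labels):
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary

def predict(model, words):
    """Return the class with the highest (unnormalized) log posterior."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in words:
            # Add-one smoothing (an assumption) so unseen words do not zero out a class.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training messages.
documents = [
    ["win", "money", "now"],
    ["free", "money", "offer"],
    ["meeting", "at", "noon"],
    ["lunch", "at", "noon"],
]
labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(documents, labels)
prediction = predict(model, ["free", "money"])
print(prediction)
```

The message "free money" scores higher under the spam class because both words appear only in the spam training messages.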
Applications:
- Spam detection.
- Sentiment analysis.
Advantages:
1. Clustering: K-means
Definition: A clustering algorithm that partitions data into k clusters, where each data point belongs to the cluster with the nearest centroid (mean).
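Steps:
1. Choose k initial centroids.
2. Assign each data point to its nearest centroid.
3. Recompute each centroid as the mean of the points assigned to it.
4. Repeat steps 2 and 3 until the assignments stop changing.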
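A minimal sketch of K-means in plain Python, assuming Euclidean distance and, unless starting centroids are supplied, random initial centroids drawn from the data. The function signature, the initial_centroids and seed parameters, and the sample points are assumptions for illustration.

```python
import math
import random

def kmeans(points, k, initial_centroids=None, max_iterations=100, seed=0):
    """Cluster points into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Start from the given centroids, or pick k distinct data points at random.
    centroids = list(initial_centroids) if initial_centroids else rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assign every point to the index of its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments

# Hypothetical 2-D data with two visibly separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]

centroids, assignments = kmeans(points, k=2)
print(centroids, assignments)
```

The result depends on the starting centroids, so real use typically runs the algorithm several times with different seeds and keeps the clustering with the lowest total within-cluster distance.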
Applications:
- Customer segmentation.
- Document clustering.
Advantages:
- Easy to implement.
Disadvantages:
- Sensitive to outliers.