K Means
Assignment Step
For each data point, compute its distance to every centroid (e.g., Euclidean
distance) and assign the point to the cluster with the nearest centroid.
Update Step
Calculate the new centroid of each cluster as the mean of all data points assigned
to it.
Repeat
Alternate the assignment and update steps until the centroids no longer change
(or a maximum number of iterations is reached). K-Means requires the number of
clusters k to be chosen in advance.
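The steps above can be sketched in NumPy. This is an illustrative implementation, not from the source; the random initialisation scheme, the `n_iters` cap, and the convergence check are assumptions:

```python
import numpy as np

def k_means(X, k, n_iters=100, seed=0):
    """Minimal K-Means: alternate assignment and update steps."""
    rng = np.random.default_rng(seed)
    # Initialise centroids by picking k distinct data points at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid
        # by Euclidean distance.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, labels
```

Note that the result depends on the initialisation; in practice K-Means is often restarted several times and the run with the lowest within-cluster distance is kept.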
Naive Bayes
Gaussian Naive Bayes:
Assumes that features are continuous and follow a Gaussian (normal) distribution.
Multinomial Naive Bayes:
Works with discrete data, often used for text classification where features
represent word counts or frequencies.
Bernoulli Naive Bayes:
Works with binary features, such as word presence or absence in a document.
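The variants differ only in the per-feature likelihood model. As one example, Gaussian Naive Bayes can be sketched as follows (an illustrative NumPy implementation, not the source's code; the variance floor `1e-9` is an assumption to avoid division by zero):

```python
import numpy as np

class GaussianNaiveBayes:
    def fit(self, X, y):
        self.classes = np.unique(y)
        # Class priors P(c) and per-class, per-feature mean and variance.
        self.priors = np.array([(y == c).mean() for c in self.classes])
        self.means = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.vars = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        return self

    def predict(self, X):
        # log P(c) + sum over features of log N(x_f | mean, var);
        # the sum encodes the conditional-independence assumption.
        log_lik = -0.5 * (np.log(2 * np.pi * self.vars)[None]
                          + (X[:, None, :] - self.means[None]) ** 2
                          / self.vars[None]).sum(axis=2)
        return self.classes[(np.log(self.priors)[None] + log_lik).argmax(axis=1)]
```

Multinomial and Bernoulli variants follow the same template, replacing the Gaussian log-density with a multinomial or Bernoulli log-likelihood.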
Advantages
Simple and fast to implement.
Handles both continuous and discrete data.
Works well with high-dimensional datasets (e.g., text data).
Limitations
Relies on the assumption that features are conditionally independent given the
class, which often does not hold in practice.
Can struggle with data where features are highly correlated.
Assigns zero probability to feature values unseen in training (mitigated with
Laplace smoothing).
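The zero-probability problem and its Laplace fix can be illustrated for multinomial word counts (a hypothetical helper, not from the source; `alpha` is the smoothing constant, with `alpha=1` giving classic Laplace smoothing):

```python
import numpy as np

def fit_multinomial_log_probs(X_counts, y, alpha=1.0):
    """Per-class word log-probabilities with Laplace (add-alpha) smoothing."""
    classes = np.unique(y)
    log_probs = []
    for c in classes:
        # Total count of each word across all documents of class c.
        counts = X_counts[y == c].sum(axis=0)
        # Without smoothing, a word unseen in class c gets probability 0,
        # which zeroes out any document containing that word.
        smoothed = counts + alpha
        log_probs.append(np.log(smoothed / smoothed.sum()))
    return classes, np.array(log_probs)
```

With `alpha=0` a single unseen word would drive a document's class likelihood to zero regardless of all other evidence; smoothing keeps every word's probability strictly positive.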
VISION