ML Lecture 14
CSE343/CSE543/ECE363/ECE563
Lecture 14 | Take your own notes during lectures
Vinayak Abrol <[email protected]>
Eager and Lazy Learning
Eager: a learning method that builds a general model/mapping, i.e., an
input-independent target function, during training of the system.
- Examples: SVM, logistic regression, decision trees
- The target function is approximated globally during training
- Post-training queries to the system have no effect on the system
- Much less space is required
Lazy: generalization of the training data is, in principle, delayed until a query is made
to the system.
- Useful when the data set is continuously updated, e.g., a top-10 songs list
- There is, in principle, no training phase
- The target function is approximated locally, per query
- Large space requirements, slow inference, and sensitivity to noise
- Examples: k-NN, local regression, case-based reasoning (CBR)
Instance-Based Learning: KNN
Instance-based learning methods simply store the training examples (or a
reasonably sized subset) instead of learning an explicit description of the target
function.
- When a new instance is encountered, its relationship to the stored examples
is examined in order to assign a target function value for the new instance.
k-Nearest Neighbor
Pick the value of k with the lowest error rate on the validation set
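As a concrete illustration, here is a minimal NumPy sketch of k-NN classification with validation-based selection of k (the function name, toy data, and candidate k values are my own, not from the slides):

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Classify each query point by majority vote among its k nearest stored examples."""
    preds = []
    for q in X_query:
        dists = np.linalg.norm(X_train - q, axis=1)   # Euclidean distance to every stored example
        nearest = y_train[np.argsort(dists)[:k]]      # labels of the k closest neighbours
        preds.append(np.bincount(nearest).argmax())   # majority vote
    return np.array(preds)

# Pick k with the lowest error rate on a held-out validation set
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_tr, y_tr, X_val, y_val = X[:150], y[:150], X[150:], y[150:]
best_k = min(range(1, 16, 2),
             key=lambda k: np.mean(knn_predict(X_tr, y_tr, X_val, k) != y_val))
```

Note that all the work happens at query time (lazy learning): "training" is just storing `X_train` and `y_train`.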
K-Means
Assignment Step: assign each point to its nearest cluster centroid
Update Step: recompute each centroid as the mean of its assigned points
Notice that
- If the old clustering is the same as the new, then the next clustering will again be the same
- If the new clustering is different from the old, then the newer one has a lower cost, so k-means converges in finitely many steps
● k-means assumes the variance of the distribution of
each attribute (variable) is spherical
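The assignment/update iteration above can be sketched from scratch in NumPy (an illustration assuming Euclidean distance and random initialization from the data points; not the slides' code):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and update steps until the clustering repeats."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]  # init from k distinct data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # same clustering as before -> it will stay the same forever
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids
```

The stopping test implements the convergence argument above: once a clustering repeats, every subsequent iteration reproduces it.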
Density Based Clustering
This method is based on the idea that a cluster/group in a data space is a contiguous region of high point
density, separated from other clusters by sparse regions.
The data points in the separating, sparse regions are typically considered noise/outliers.
● Defined-distance methods (Density-Based Spatial Clustering of Applications with Noise, DBSCAN) use
a fixed distance threshold to differentiate between dense clusters and sparser noise. DBSCAN is among the
fastest clustering algorithms, but it assumes all significant clusters possess comparable densities.
● Kernel-density-based methods (mean-shift clustering) estimate the underlying distribution
from samples and move the kernel window towards the mean/center of mass to identify clusters.
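A minimal sketch of the DBSCAN idea just described (a flat, O(n²) illustration; `eps` and `min_pts` are the usual neighborhood radius and density threshold, and the code is my own, not the slides'):

```python
import numpy as np
from collections import deque

def dbscan(X, eps, min_pts):
    """Label each point with a cluster id; -1 marks noise left in sparse regions."""
    n = len(X)
    labels = np.full(n, -1)  # -1 = unvisited or noise
    dists = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    core = np.array([len(nb) >= min_pts for nb in neighbors])  # points in dense regions
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        # Grow a new cluster outward from this unvisited core point
        labels[i] = cluster
        queue = deque([i])
        while queue:
            j = queue.popleft()
            for nb in neighbors[j]:
                if labels[nb] == -1:
                    labels[nb] = cluster
                    if core[nb]:  # only core points expand the frontier
                        queue.append(nb)
        cluster += 1
    return labels
```

Points in sparse separating regions never join any cluster and keep the label -1, matching the noise/outlier notion above.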
https://www.youtube.com/watch?app=desktop&v=RDZUdRSDOok
Segmentation via Mean-Shift Clustering
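A minimal sketch of mean-shift with a flat (window) kernel, as used for segmentation-style clustering (the function name and `bandwidth` parameter are illustrative; real implementations typically use a Gaussian kernel and spatial indexing):

```python
import numpy as np

def mean_shift(X, bandwidth, n_iter=50, tol=1e-3):
    """Shift every point toward the mean of its neighborhood until it settles on a density mode."""
    modes = X.astype(float).copy()
    for _ in range(n_iter):
        new = np.empty_like(modes)
        for i, m in enumerate(modes):
            # Flat kernel: average all data points within one bandwidth of the current position
            mask = np.linalg.norm(X - m, axis=1) <= bandwidth
            new[i] = X[mask].mean(axis=0)
        if np.abs(new - modes).max() < tol:
            modes = new
            break  # all windows have stopped moving
        modes = new
    return modes
```

Points whose shifted positions converge to the same mode (within tolerance) belong to the same cluster; for image segmentation, each pixel's feature vector (e.g., position plus color) is shifted this way.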
Thanks