Introduction to Machine Learning
OVERVIEW OF ML
WHAT IS MACHINE LEARNING?
DATA
ALGORITHMS
The success of a machine learning system also depends on the algorithms.
FLAVORS OF CLASSIFICATION
SOME USEFUL CLASSIFIERS
Decision Trees
⚫ Examples are used to learn the topology and the order of questions
K-nearest neighbors
Association rules
⚫ Examples are used to learn the support and confidence of association rules, e.g.:
  If patrons=full and day=Friday then wait (0.3/0.7)
  If wait>60 and Reservation=no then wait (0.4/0.9)
SVMs
Neural Nets
⚫ Examples are used to learn the topology and edge weights
Naïve Bayes (Bayes net learning)
⚫ Examples are used to learn the topology and CPTs
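Since decision trees learn both which questions to ask and the order in which to ask them, a minimal sketch may help. It assumes scikit-learn, and the toy restaurant-wait features are hypothetical, chosen only to mirror the sample rules above:

# A decision tree learns the topology and the order of questions from
# examples. Assumes scikit-learn; the toy dataset is hypothetical.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features: [patrons_full, is_friday, wait_estimate_minutes, has_reservation]
X = [[1, 1, 30, 0],
     [1, 0, 70, 0],
     [0, 1, 10, 1],
     [0, 0, 90, 0],
     [1, 1, 65, 0],
     [0, 0, 15, 1]]
y = [1, 1, 0, 0, 1, 0]  # 1 = wait, 0 = leave

clf = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Print the learned questions in the order the tree asks them.
print(export_text(clf, feature_names=["patrons_full", "is_friday",
                                      "wait_estimate", "has_reservation"]))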
SUPERVISED LEARNING
UNSUPERVISED LEARNING
INSTANCE BASED LEARNING ALGORITHMS
Instances are simply examples drawn from the dataset, and instance based learning models work on an identified instance or group of instances that are critical to the problem.
New data is compared against the stored instances using a particular similarity measure to find the best match and make a prediction.
Instance based methods are also called lazy learning algorithms.
The focus here is on the representation of the instances and on the similarity measures used to compare them.
Basic idea:
⚫ If it walks like a duck and quacks like a duck, then it's probably a duck
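A minimal sketch of this lazy, similarity-based prediction with k-nearest neighbors, assuming scikit-learn; the tiny two-feature dataset is hypothetical:

# k-NN is a lazy learner: it stores the training instances and, at
# prediction time, labels a new point by the majority class among its
# k most similar (closest) stored neighbors.
# Assumes scikit-learn; the data is hypothetical.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[1.0, 1.1], [1.2, 0.9], [3.0, 3.2], [2.9, 3.1]]
y_train = ["duck", "duck", "goose", "goose"]

knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# A new instance near the "duck" cluster is classified purely by
# comparison with stored instances; no model is built ahead of time.
print(knn.predict([[1.1, 1.0]]))  # -> ['duck']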
TRAINING AND TESTING
The universal set (unobserved) is sampled into the observed training and test data.

Confusion matrix:

                          PREDICTED CLASS
                          Class=Yes                Class=No
ACTUAL   Class=Yes        a: TP (true positive)    b: FN (false negative)
CLASS    Class=No         c: FP (false positive)   d: TN (true negative)

Error = (b + c) / (a + b + c + d)
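A minimal sketch of computing these four counts and the error, assuming scikit-learn; the label vectors are hypothetical:

# Confusion-matrix counts for a binary classifier (1 = Yes, 0 = No).
# Assumes scikit-learn; the label vectors are hypothetical.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # actual classes
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]  # predicted classes

# labels=[1, 0] matches the table layout above:
# rows = actual (Yes, No), columns = predicted (Yes, No).
(tp, fn), (fp, tn) = confusion_matrix(y_true, y_pred, labels=[1, 0])
error = (fn + fp) / (tp + fn + fp + tn)
print(tp, fn, fp, tn, error)  # -> 3 1 1 3 0.25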
Bias is all about training error: when the training error is high, the model is said to have high bias.
In the case of overfitting, bias is low because the training error is at its minimum.
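To see this concretely, here is a minimal sketch, assuming numpy and scikit-learn; the synthetic data and the chosen polynomial degrees are hypothetical. Train error falls as the degree grows (low bias), while test error eventually rises again (overfitting):

# Train vs. test error as model complexity grows: low-degree models
# underfit (high train error = high bias); very high-degree models
# overfit (minimal train error, rising test error).
# Assumes numpy + scikit-learn; the synthetic dataset is hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, size=60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):
    feats = PolynomialFeatures(degree)
    model = LinearRegression().fit(feats.fit_transform(X_tr), y_tr)
    tr = mean_squared_error(y_tr, model.predict(feats.transform(X_tr)))
    te = mean_squared_error(y_te, model.predict(feats.transform(X_te)))
    print(f"degree={degree:2d}  train MSE={tr:.3f}  test MSE={te:.3f}")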
BALANCED FIT MODEL
BULLS-EYE DIAGRAM
The inner circle represents the ground truth.
WAYS TO GET A BALANCED FIT MODEL
Cross Validation
Regularization
Dimensionality Reduction
Ensemble Techniques
K FOLD CROSS VALIDATION
OPTIONS TO TRAIN A MODEL
Use all data to train the model and then use some of that
data to test the model.
PROBLEMS WITH THIS APPROACH?
We are testing the model on the same data on which it
was trained.
Split the available dataset into training and test sets.
PROBLEMS WITH THIS APPROACH?
This approach works well most of the time.
However, suppose most of the training samples come from one class and only a few come from the other class, while most of the test samples come from the other class. The measured performance would then be misleading.
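One common guard against such a skewed split is stratified sampling, which preserves the class proportions in both sets. A minimal sketch, assuming scikit-learn; the imbalanced toy labels are hypothetical:

# Stratified train/test split: class proportions are preserved in both
# sets, avoiding the skewed-split problem described above.
# Assumes scikit-learn; the toy labels are hypothetical.
from sklearn.model_selection import train_test_split

X = [[i] for i in range(10)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # imbalanced: 8 of class 0, 2 of class 1

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

print(y_tr, y_te)  # each half keeps the 4:1 class ratio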
K FOLD CROSS VALIDATION
We divide the whole dataset into folds, let's say 5, and then run multiple iterations.
In the first iteration, the first fold is used to test the model and the remaining four folds are used to train it.
In the second iteration, the second fold is used for testing and the remaining folds for training.
This process repeats until the last fold, which is used for testing while the remaining folds train the model.
Finally, the results from all iterations are averaged.
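A minimal sketch of 5-fold cross validation, assuming scikit-learn; the iris dataset and logistic regression model are hypothetical stand-ins:

# 5-fold cross validation: each fold serves as the test set exactly
# once, and the per-fold scores are averaged.
# Assumes scikit-learn; dataset and model are hypothetical stand-ins.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)

print(scores)           # one accuracy per fold
print(np.mean(scores))  # averaged result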
L1 AND L2 REGULARIZATION
L1 and L2 regularization are among the techniques that can be used to address the overfitting issue.
Consider the equation for the overfit case; a typical example is a high-degree hypothesis such as h(x) = theta0 + theta1*x + theta2*x^2 + theta3*x^3 + theta4*x^4.
If we somehow make sure that theta3 and theta4 are almost 0, the higher-order terms drop out and the equation changes to a much simpler curve.
The idea here is to shrink your parameters: theta3 and theta4, and even theta2 and theta1 if you can (in practice the intercept theta0 is usually not penalized).
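A minimal sketch of both penalties, assuming numpy and scikit-learn; the synthetic polynomial data is hypothetical and mirrors the theta3/theta4 discussion above. Note how L1 (Lasso) can drive some coefficients exactly to 0, while L2 (Ridge) only shrinks them:

# L2 (Ridge) and L1 (Lasso) regularization shrink the coefficients of
# a high-degree polynomial fit; L1 can zero some of them out entirely.
# Assumes numpy + scikit-learn; the synthetic data is hypothetical.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=(40, 1))
y = 1 + 2 * x.ravel() + rng.normal(0, 0.2, size=40)  # truly linear data

# Degree-4 features, so the model has theta1..theta4 to (over)fit with.
X = PolynomialFeatures(degree=4, include_bias=False).fit_transform(x)

for name, model in [("plain", LinearRegression()),
                    ("ridge (L2)", Ridge(alpha=1.0)),
                    ("lasso (L1)", Lasso(alpha=0.1, max_iter=10000))]:
    model.fit(X, y)
    # Coefficients for x, x^2, x^3, x^4 (theta1..theta4).
    print(name, np.round(model.coef_, 3))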
DIMENSIONALITY REDUCTION
ENSEMBLE METHODS
This is a very powerful and widely adopted class of techniques.
As the name suggests, ensemble methods combine multiple models that are built independently; the combined results produce the overall prediction.
It is critical to identify which independent models to combine, how their results should be combined, and in what way the required result is achieved.
The models that are combined are sometimes referred to as weak learners, since their results need not fulfill the expected outcome on their own. (A minimal code sketch follows the algorithm list below.)
The following are some of the ensemble method algorithms:
⚫ Random forest
⚫ Bagging (bootstrapped aggregation): sampling with replacement
⚫ AdaBoost
⚫ Boosting
⚫ Stacked generalization (blending)
⚫ Gradient boosting machines (GBM)
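A minimal sketch of the bagging idea in this family, assuming scikit-learn; the dataset and estimator counts are hypothetical choices:

# Bagging: train many models on bootstrap samples (sampling with
# replacement) and combine their votes; a random forest is bagging
# over decision trees. Assumes scikit-learn; choices are hypothetical.
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# BaggingClassifier's default base model is a decision tree.
bagged = BaggingClassifier(n_estimators=50, random_state=0)
forest = RandomForestClassifier(n_estimators=50, random_state=0)

for name, model in [("bagged trees", bagged), ("random forest", forest)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())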
MACHINE LEARNING APPLICATIONS
MACHINE LEARNING TOOLS AND FRAMEWORKS