Module-4 ML Landscape
Sushma B.
Assistant Professor
Dept. of Electronics & Communication Engineering
CMRIT, Bangalore
Perspectives
The evolution of ML algorithms, such as deep learning and
reinforcement learning, has led to state-of-the-art performance in
tasks like image recognition, speech processing, and game-playing.
Automating decision-making processes
The ability to process large datasets allows ML systems to uncover
patterns and insights that are otherwise impossible to detect
manually, enabling data-driven decision-making.
Classification based on
Whether or not they are trained with human supervision (supervised,
unsupervised, semisupervised, and reinforcement learning)
Whether or not they can learn incrementally on the fly (online versus
batch learning)
Whether they work by simply comparing new data points to known
data points, or instead detect patterns in the training data and build
a predictive model.
Supervised/unsupervised learning
Batch or online based learning
Model or instance based learning
Some algorithms can deal with partially labeled training data, usually
a lot of unlabeled data and a little bit of labeled data. This is called
semisupervised learning.
Semisupervised learning algorithms are combinations of unsupervised
and supervised algorithms.
Train on the labeled data, then use the model's predictions on the
unlabeled data to create new labeled points. These new points are added
to the training data, and the model is retrained; the process repeats
iteratively.
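A rough sketch of this self-training loop, assuming scikit-learn, a logistic-regression base model, the convention that unlabeled rows carry the label -1, and a hypothetical confidence threshold:

import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X, y, threshold=0.9, max_rounds=5):
    # y uses -1 to mark unlabeled rows (an assumed convention for this sketch)
    model = LogisticRegression(max_iter=1000)
    labeled = y != -1
    for _ in range(max_rounds):
        model.fit(X[labeled], y[labeled])            # train on labeled data only
        if labeled.all():
            break
        proba = model.predict_proba(X[~labeled])     # predict the unlabeled rows
        confident = proba.max(axis=1) >= threshold   # keep only confident predictions
        if not confident.any():
            break
        idx = np.where(~labeled)[0][confident]
        y[idx] = model.classes_[proba[confident].argmax(axis=1)]  # pseudo-label them
        labeled[idx] = True                          # treat them as labeled next round
    return model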
The learning system, called an agent in this context, can observe the
environment, select and perform actions, and get rewards in return.
The agent learns a strategy, called a policy.
A policy defines what action the agent should choose when it is in a
given situation.
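As a toy illustration (the state fields and action names below are made up, not taken from any RL library), a policy is simply a mapping from the observed state to an action:

def policy(state):
    # e.g. a cleaning-robot agent: recharge when the battery is low,
    # otherwise keep searching for trash
    if state["battery"] < 20:
        return "go_to_charger"
    return "search_for_trash"

print(policy({"battery": 15}))   # -> go_to_charger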
i) Batch learning
In batch learning, the system is incapable of learning incrementally: it
must be trained using all the available data.
This will generally take a lot of time and computing resources, so it is
typically done offline.
First the system is trained, and then it is launched into production
and runs without learning anymore; it just applies what it has learned.
This is called offline learning.
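A minimal sketch of the batch (offline) workflow, assuming scikit-learn and synthetic data purely for illustration:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X_all = rng.normal(size=(1000, 4))                   # all available training data
y_all = (X_all[:, 0] + X_all[:, 1] > 0).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=42)
model.fit(X_all, y_all)                              # offline: train once on everything

# In production the model only applies what it has learned; to take new
# data into account it must be retrained from scratch on old + new data.
X_new = rng.normal(size=(5, 4))
print(model.predict(X_new))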
life_satisfaction = θ₀ + θ₁ × GDP
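A minimal sketch of fitting this simple linear model with scikit-learn (the GDP and life-satisfaction numbers below are made-up illustrative values):

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[9054.0], [27195.0], [37675.0], [50962.0]])  # GDP per capita
y = np.array([6.0, 5.9, 7.3, 7.3])                         # life satisfaction

model = LinearRegression()
model.fit(X, y)
print(model.intercept_, model.coef_[0])    # estimates of θ0 and θ1
print(model.predict([[22587.0]]))          # prediction for a new country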
Split the data into two sets: the training set and the test set.
The error rate on new cases is called the generalization error (or
out-of-sample error), and by evaluating your model on the test set,
you get an estimate of this error. This value tells you how well your
model will perform on instances it has never seen before.
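A sketch of this split with scikit-learn, using the built-in iris dataset purely for illustration and a common 80/20 split:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)                   # learn from the training set only
test_error = 1 - accuracy_score(y_test, clf.predict(X_test))
print(test_error)                           # estimate of the generalization error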
The most important rule to remember is that the validation set and
the test set must be as representative as possible of the data you
expect to use in production.
Train-dev set: hold out part of the training dataset.
Train the model on the (reduced) training set and evaluate it on the
train-dev set.
If it performs well, then the model is not overfitting the training set;
so if it then performs poorly on the validation set, the problem must
come from data mismatch.
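A sketch of the train / train-dev / validation / test arrangement, using synthetic "web" vs. "app" data as a stand-in for a data-mismatch situation (the names, split sizes, and logistic-regression model are all assumptions for illustration):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_web = rng.normal(size=(10000, 5))                  # plentiful, but not representative
y_web = (X_web[:, 0] > 0).astype(int)
X_app = rng.normal(loc=0.5, size=(400, 5))           # scarce, but like production data
y_app = (X_app[:, 0] > 0).astype(int)

# Hold out part of the training (web) data as the train-dev set
X_train, X_train_dev, y_train, y_train_dev = train_test_split(
    X_web, y_web, test_size=0.1, random_state=42)
# Split the representative (app) data into validation and test sets
X_valid, X_test, y_valid, y_test = train_test_split(
    X_app, y_app, test_size=0.5, random_state=42)

model = LogisticRegression().fit(X_train, y_train)
print("train-dev accuracy: ", model.score(X_train_dev, y_train_dev))  # low => overfitting
print("validation accuracy:", model.score(X_valid, y_valid))          # much lower => data mismatch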
We learn about our surroundings through the five senses: eyes, ears,
nose, tongue and skin. We learn a lot of things over a lifetime; some of
them are based on experience and some on memorization. On that basis we
can divide learning methods into five types:
Rote Learning (memorization): Memorizing things without knowing
the concept/logic behind them.
Passive Learning (instructions): Learning from a teacher/expert.
Analogy (experience): Learning new things from our past experience.
Inductive Learning (experience): On the basis of past experience,
formulating a generalized concept.
Deductive Learning: Deriving new facts from past facts.
Although FIND-S will find a hypothesis consistent with the training data,
it has no way to determine whether it has found the only hypothesis in H
consistent with the data (i.e., the correct target concept), or whether
there are many other consistent hypotheses as well.
In case there are multiple hypotheses consistent with the training examples,
FIND-S will find the most specific one. It is unclear whether we should
prefer this hypothesis over, say, the most general one, or some other
hypothesis of intermediate generality.
In most practical learning problems there is some chance that the training
examples will contain at least some errors or noise. Such inconsistent sets of
training examples can severely mislead FIND-S, given the fact that it ignores
negative examples.
There can be several maximally specific hypotheses consistent with the data.
FIND-S finds only one of them.
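A compact sketch of FIND-S over conjunctive hypotheses, using EnjoySport-style data for illustration ('0' stands for the most specific value, '?' for the most general):

def find_s(examples):
    # examples: list of (attribute_tuple, label) pairs with labels 'Yes'/'No'
    n = len(examples[0][0])
    h = ['0'] * n                       # start with the most specific hypothesis
    for x, label in examples:
        if label != 'Yes':              # FIND-S ignores negative examples
            continue
        for i, value in enumerate(x):
            if h[i] == '0':
                h[i] = value            # first positive example: copy its values
            elif h[i] != value:
                h[i] = '?'              # generalize any attribute that disagrees
    return h

data = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'),   'Yes'),
    (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'),   'Yes'),
    (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), 'No'),
    (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), 'Yes'),
]
print(find_s(data))   # -> ['Sunny', 'Warm', '?', 'Strong', '?', '?']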
One obvious way to represent the version space is simply to list all of
its members.
This leads to a simple learning algorithm, which we might call the
List-Then-Eliminate algorithm.
The LIST-THEN-ELIMINATE algorithm first initializes the version
space to contain all hypotheses in H and then eliminates any
hypothesis found inconsistent with any training example.
In principle, the LIST-THEN-ELIMINATE algorithm can be applied whenever
the hypothesis space H is finite. However, since it requires exhaustive
enumeration of all hypotheses, it is not feasible in practice.
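A toy sketch of LIST-THEN-ELIMINATE on a deliberately tiny hypothesis space (only concrete values and '?' are enumerated; the all-empty hypothesis is omitted to keep the sketch short):

from itertools import product

def matches(h, x):
    # h covers x if every attribute is '?' or equals the example's value
    return all(hv == '?' or hv == xv for hv, xv in zip(h, x))

def list_then_eliminate(examples, attribute_values):
    # Start with the version space containing every hypothesis in H ...
    version_space = list(product(*[values + ['?'] for values in attribute_values]))
    # ... then eliminate any hypothesis inconsistent with some training example
    for x, label in examples:
        version_space = [h for h in version_space
                         if matches(h, x) == (label == 'Yes')]
    return version_space

attrs = [['Sunny', 'Rainy'], ['Warm', 'Cold']]
data = [(('Sunny', 'Warm'), 'Yes'), (('Rainy', 'Cold'), 'No')]
print(list_then_eliminate(data, attrs))
# -> [('Sunny', 'Warm'), ('Sunny', '?'), ('?', 'Warm')]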