2ML Problem
Summary
1. Analogy
Think about arithmetic classes in primary school. During the class hours, a
student looks at solved examples in a textbook and learns how to solve
simple three-digit addition problems. Let us say that her textbook has the
following problems along with the answers:
During the instructional hours, the student has access to both the
questions and the answers. In the exam, she will not have access to the
answers. But more importantly, she will not even be asked the same
questions! So, just memorizing the answers will not help. She would have to
learn how addition works. She needs to have a mental model of addition. In
other words, she would have to learn a function from the input (question)
to the output (answer).
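The difference between memorizing answers and learning a function can be sketched in a few lines of Python. The problems in `textbook` and the names `memorizer` and `learned_add` are hypothetical, chosen only for illustration; they are not the problems from the student's textbook.

```python
# Hypothetical textbook: three-digit addition questions mapped to their answers.
textbook = {
    (123, 456): 579,
    (250, 125): 375,
    (301, 208): 509,
}

def memorizer(question):
    """Answer only those questions that appear verbatim in the textbook."""
    return textbook.get(question)  # None for any unseen question

def learned_add(question):
    """Stand-in for a learnt function: maps any question to an answer."""
    a, b = question
    return a + b

exam_question = (412, 177)  # does not appear in the textbook

print(memorizer(exam_question))    # None
print(learned_add(exam_question))  # 589
```

The memorizer reproduces the textbook perfectly and still fails on the exam question, while the function generalizes to inputs it has never seen.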
This is exactly what happens in a regression problem. The inputs are a set
of data-points. The outputs corresponding to these inputs are real numbers
called targets or labels. A regression model has to learn the mapping from
input to output. Once this mapping or function is learnt, the model can
then be used to predict the output on unseen inputs. The collection of data-
points along with their targets is called a labeled dataset. A regression
model makes use of this dataset to learn a function. In our analogy, the
labeled dataset corresponds to the solved problems in the textbook.
Arranging the n data-points, each described by d features, in a matrix, we
get an n × d data-matrix. Let us call this matrix X:
\[
X = \begin{bmatrix}
x_{11} & \cdots & x_{1d} \\
\vdots & x_{ij} & \vdots \\
x_{n1} & \cdots & x_{nd}
\end{bmatrix}
\]
Each row of this matrix is the feature vector for one data-point. The
element $x_{ij}$ is the $j$-th feature of the $i$-th data-point. The labels can be put
together in a vector of size n. Let us call this y:
\[
y = \begin{bmatrix}
y_1 \\
\vdots \\
y_n
\end{bmatrix}
\]
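As a concrete illustration of the data-matrix and the label vector, here is a minimal sketch using plain Python lists. The numbers are made up, and the helper `feature` is a hypothetical name introduced only to mirror the 1-based indices used in the text.

```python
# Hypothetical labeled dataset with n = 4 data-points and d = 2 features.
X = [
    [1.0, 2.0],  # feature vector of data-point 1
    [2.0, 0.5],  # feature vector of data-point 2
    [3.0, 1.5],  # feature vector of data-point 3
    [4.0, 3.0],  # feature vector of data-point 4
]
y = [5.0, 3.0, 6.0, 10.0]  # one real-valued label per data-point

n = len(X)     # number of rows = number of data-points
d = len(X[0])  # number of columns = number of features

def feature(i, j):
    """Return x_ij, the j-th feature of the i-th data-point (1-based)."""
    return X[i - 1][j - 1]

print(n, d)           # 4 2
print(feature(2, 1))  # 2.0
```

Each row of `X` pairs with the entry of `y` at the same position, which is what makes the dataset labeled.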
3. Model
\[ f : \mathbb{R}^d \to \mathbb{R} \]
Pictorially:
[Figure: Feature Vector → Model → Label]
What is so special about an ML model? All models take some input and produce
a corresponding output. The key difference is that in a classical
programming setting, we are given the input and the function and are asked
to find the output. In machine learning, we are given both the input and the
output and have to learn the model f. The function, or the model, is the
unknown. This is what has to be learnt.
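The contrast between the two settings can be sketched as follows. This is an illustration under stated assumptions, not a prescribed method: the data is made up, the names `f_known` and `learn` are hypothetical, and the learner is restricted to functions of the form f(x) = w * x, with w chosen by least squares.

```python
# Classical programming: the input and the function are given; find the output.
def f_known(x):
    return 3.0 * x

output = f_known(2.0)
print(output)  # 6.0

# Machine learning: inputs and outputs are given; the function is the unknown.
inputs = [1.0, 2.0, 3.0, 4.0]
outputs = [3.0, 6.0, 9.0, 12.0]

def learn(xs, ys):
    """Least-squares estimate of w for models of the form f(x) = w * x."""
    w = sum(x * t for x, t in zip(xs, ys)) / sum(x * x for x in xs)

    def f(x):
        return w * x

    return f

f_learnt = learn(inputs, outputs)
print(f_learnt(5.0))  # 15.0, a prediction on an unseen input
```

In the first half the function is written by hand; in the second half it is recovered from input-output pairs and then applied to an input that was not in the data.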
4. Learning
ML is all about learning from data. But who or what is learning? More
importantly, who or what enables the learning? There is a learning algorithm
which drives the learning. We can think of the model as the outcome of the
learning process. During the learning stage, the dataset is fed as input to
a learning algorithm, which in turn outputs a model.
There is one important detail that is missing in this diagram. There are
several models that we could choose from. Going back to our analogy, there
are different ways to understand three-digit addition:
[Figure: Model Family]
The learning algorithm works with two ingredients:
• labeled dataset
• family of models
The task of the algorithm is to explore the space of models and pick the
one that best fits the labeled dataset. ML scientists have come up with a
variety of models. The simplest such model is a linear model. We shall take
this up in subsequent chapters.
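The idea of a learning algorithm that takes a labeled dataset and a family of models and returns the best-fitting model can be sketched as below. Everything here is an assumption made for illustration: the family is a small finite set of models f(x) = w * x, "best fit" is measured by the sum of squared errors, and the search is a plain exhaustive scan. Practical learning algorithms usually search much larger families in more efficient ways.

```python
def squared_error(model, xs, ys):
    """Sum of squared differences between predictions and labels."""
    return sum((model(x) - t) ** 2 for x, t in zip(xs, ys))

def learning_algorithm(dataset, family):
    """Return the model in the family that best fits the labeled dataset."""
    xs, ys = dataset
    return min(family, key=lambda model: squared_error(model, xs, ys))

def make_linear(w):
    """Build one member of the family: the model f(x) = w * x."""
    def model(x):
        return w * x
    model.w = w  # keep the parameter visible for inspection
    return model

# Hypothetical inputs to the learning algorithm.
family = [make_linear(w) for w in (0.5, 1.0, 1.5, 2.0, 2.5)]
dataset = ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2])

best = learning_algorithm(dataset, family)
print(best.w)  # 2.0
```

The dataset and the family are the two inputs; the selected model is the output of the learning stage.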
Remark: For some regression models, once you have learnt the model, you
can throw away the dataset (textbook). This is not true of all regression
models though! Think about how you learnt three-digit addition. Do you
still carry your primary school textbooks around? No! Your mind has a
representation of what addition is. This representation is what we call a
model.
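The remark can be illustrated with two small regression models. Both are assumptions chosen for this sketch rather than models named in the text: a model of the form f(x) = w * x, which keeps only the number w after learning, and a 1-nearest-neighbour regressor, which has to consult the dataset every time it predicts.

```python
# Hypothetical labeled dataset with one feature per data-point.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Model A: f(x) = w * x. Learning compresses the dataset into a single number.
w = sum(x * t for x, t in zip(xs, ys)) / sum(x * x for x in xs)

def linear_predict(x):
    """Predict using only the learnt parameter w; the dataset is not needed."""
    return w * x

# Model B: 1-nearest-neighbour regression. The dataset itself is the model.
def nearest_neighbour_predict(x, train_x=xs, train_y=ys):
    """Predict the label of the training point closest to x."""
    i = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_y[i]

print(w)                               # 2.0
print(nearest_neighbour_predict(2.6))  # 6.0
```

For Model A the textbook can be put away once w is known. Model B is closer to a student who looks up the most similar solved problem each time, so the textbook has to stay at hand.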
5. Summary