MachineLearning in short
MachineLearning in short
A computer program is said to learn from experience E with respect to some class of tasks T and
representation of a machine learning system, including, but not restricted to, the representation of
the input and output data, the algorithm involved in learning, the parameters that define the
Population
A set of all possible examples relating to the experiment under consideration. This is what the
machine learning models try to predict, the distribution of the target population.
Feature space is the input space of the model, where the variables (other than the target variable
which we want to predict) live. Features can be numeric or categorical. For example, the weight and
speed of a car are numeric features. Whether the car is a Chevy or a Tesla is a categorical feature.
If you are describing a set of cars using their color, speed, make and model, then all possible
Feature Vector X
Each entry in the feature space is referred to as an n-dimensional feature vector, where n is the
vector. example: [Orange colour, 280mph, 2800lbs] is a 3-dimensional feature vector with 3
The set of labels or target variables associated with each of the feature vectors make up the label
space.
There can be various cars Mustang GT, Roadster, Camaro, etc. All these are labels for the set of
True label y
This is the actual label associated with one particular data point.
Example
An example is a data point including the features and the label. The examples available in the
Roadster[Orange colour, 280mph, 2800lbs] is one example from a data set. If we have a dataset
of 100 cars that belong to just one company, it does not mean that we can make predictions about
cars from other companies as they might have totally different data distribution. It would be
really hard to make predictions about Ford cars based on Tesla cars data.
Predicted label y^
It is the label predicted for a given feature vector by the machine learning model. It may or may not
be correct.
The vector [Orange colour, 280mph, 2800lbs] can be predicted as a Roadster or an orange Beach
Buggy.
Training set
The set of examples that are used to train a machine learning model.
Validation set
This is a subset of the training set (or sometimes separate from the training set set) that is used to
check the current state of the model during the training process. This does not directly contribute to
the training of the model. Validation set can be used to train the parameters of the model or provide
Testing set
The set of data points that are not made accessible to the model unless it has been trained. It is used
Classification
Classification models are models that categorize or classify data into 2 (binary classifier) or more
Lets plot this data set, such that each axis represents one of the features’ values:
Now we’ll mark the given labels on the points:
If we try to separate the two classes using straight lines, there could theoretically be infinite possible
lines:
If we add another point, the number of possible lines reduce:
and more..
Now if we are given a new point, from the same data distribution, we know where to classify that
point based on our blue line. This blue line is the hypothesis obtained form the trained model.
We still cannot be sure if our blue line is the actual representation of the line dividing the original
Population. Consider the following line, this also separates the two classes of points.
If we get access to even more points, the line may actually change its position. That is why you might
have always heard, more data in machine learning usually yields better results.
Regression
A simple linear regression is a linear approach to modeling the relationship between a scalar label
and one or more explanatory features. Usually, regression models are used to predict continuous
The blue line describes approximately where the points from the distribution lie.
Now if we add some more points from the data distribution to the training set, the line changes
altogether:
Any new point that we add from the same data distribution will lie on this blue curve. Given one of
the 2 features’ value, we can predict the value of the other feature based on its location on the curve.
Training
As the model kept seeing new points, the position of the line kept on shifting. This is the process of
learning (in case of supervised learning). The more points we get, the better will be the learning
Hypothesis
It is a function (or model) that we believe is as close to the true function (or model) that describes
the data as possible. In our classification example, the blue line we obtained is one such hypothesis
that describes the data such that all points on one side of the line belong to a similar class.
Hypothesis space
Hypothesis space is the set of all possible models or functions that can be represented by n features,
not necessarily describing the data. The target function has to be selected form this hypothesis
It can be considered as a simple hypothesis space or a decision that intuitively helps us in ultimately
selecting the right model or function. For example, in our classification example, we intuitively
decided to select different forms of straight lines, and not circles or squares, because it was evident
that a line would be sufficient to separate the points. That was our heuristic. We could have selected
circles to engulf the 2 classes, but that would have restricted the test space to just those circles
obtained from the training data. Had we selected other shapes, it could have taken more tries to
arrive at the final line. The right choice of heuristic helps in arriving at the target function quicker.
Target Function
Target function is the function that actually represents the original data distribution. If we had
access to all the possible data points, we could train a model to learn the target function.
Parameters
Model parameters are internal variables whose values can be determined from the data. During
4. Choose a machine learning model based on the heuristic and the task at hand.