6.1 Unit-1 ML Handouts
L-1
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
Prerequisite knowledge for Complete understanding and learning of Topic: (Max. Four
important topics)
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Detailed content of the Lecture:
As with any method, there are different ways to train machine learning algorithms, each with
their own advantages and disadvantages.
To understand the pros and cons of each type of machine learning, we must first look at what
kind of data they ingest. In ML, there are two kinds of data: labelled data and unlabeled data.
Labelled data has both the input and output parameters in a completely machine-readable format, but it
requires a lot of human labour to label the data to begin with.
Unlabeled data only has one or none of the parameters in a machine-readable form. This
negates the need for human labour but requires more complex solutions.
There are also some types of machine learning algorithms that are used in very specific use-
cases, but three main methods are used today.
Supervised learning is one of the most basic types of machine learning. In this type, the machine
learning algorithm is trained on labelled data. Even though the data needs to be labelled
accurately for this method to work, supervised learning is extremely powerful when used in the
right circumstances.
The algorithm then finds relationships between the parameters given, essentially establishing a
cause and effect relationship between the variables in the dataset.
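As a small illustration (not part of the original handout), the following Python sketch trains a supervised model on a tiny labelled dataset using scikit-learn; the data and the choice of model are assumptions made purely for demonstration.

from sklearn.linear_model import LogisticRegression

X = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]   # input parameters (machine-readable)
y = [0, 0, 0, 1, 1, 1]                              # output labels supplied by human labelling

model = LogisticRegression()
model.fit(X, y)                                     # learn the input -> label relationship
print(model.predict([[2.5], [10.5]]))               # expected: [0 1]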
Unsupervised machine learning holds the advantage of being able to work with unlabeled
data. This means that human labor is not required to make the dataset machine-readable,
allowing much larger datasets to be worked on by the program.
In supervised learning, the labels allow the algorithm to find the exact nature of the
relationship between any two data points. However, unsupervised learning does not have
labels to work off of, resulting in the creation of hidden structures.
Relationships between data points are perceived by the algorithm in an abstract manner, with
no input required from human beings. The model may be predictive to make predictions in the
future, or descriptive to gain knowledge from data, or both.
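For comparison, a minimal unsupervised sketch in Python, assuming scikit-learn's k-means as the clustering method; the points and the number of clusters are illustrative only.

from sklearn.cluster import KMeans

X = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],            # unlabeled points: no output parameter given
     [8.0, 8.1], [7.9, 8.3], [8.2, 7.8]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)                                        # structure is discovered without any labels
print(kmeans.labels_)                                # cluster assignment found for each point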
Machine learning uses the theory of statistics in building mathematical models, because
the core task is making inference from a sample.
The creation of these hidden structures is what makes unsupervised learning algorithms
versatile. Instead of a defined and set problem statement, unsupervised learning algorithms
can adapt to the data by dynamically changing hidden structures. This offers more
post-deployment development than supervised learning
algorithms. The role of computer science is twofold: first, in training, we need efficient
algorithms to solve the optimization problem, as well as to store and process the massive
amount of data we generally have.
Second, once a model is learned, its representation and algorithmic solution for inference needs
to be efficient as well.
Reinforcement learning directly takes inspiration from how human beings learn from data in
their lives. It features an algorithm that improves upon itself and learns from new situations
using a trial-and-error method. Favorable outputs are encouraged or ‘reinforced’, and non-
favorable outputs are discouraged or ‘punished’. In certain applications, the efficiency of the
learning or inference algorithm, namely, its space and time complexity, may be as important
as its predictive accuracy.
Based on the psychological concept of conditioning, reinforcement learning works by putting
the algorithm in a work environment with an interpreter and a reward system. In every
iteration of the algorithm, the output result is given to the interpreter, which decides whether
the outcome is favorable or not.
In case of the program finding the correct solution, the interpreter reinforces the solution by
providing a reward to the algorithm.
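A toy Python sketch of this reward idea (the action names, reward values and learning rate are assumptions, not a full reinforcement learning implementation):

import random

action_values = {"left": 0.0, "right": 0.0}           # the learner's estimate of each action
favourable_action = "right"                           # what the interpreter rewards (assumed)

for _ in range(200):
    action = random.choice(list(action_values))       # trial
    reward = 1.0 if action == favourable_action else -1.0   # interpreter's verdict
    # reinforcement: nudge the estimate of the tried action toward the reward received
    action_values[action] += 0.1 * (reward - action_values[action])

print(max(action_values, key=action_values.get))      # typically prints "right"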
L-2
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used to
identify objects, persons, places, digital images, etc.
A popular use case of image recognition and face detection is automatic friend tagging
suggestion: Facebook provides a feature of automatic friend tagging suggestions.
2. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars.
Machine learning plays a significant role in self-driving cars. Tesla, a popular car
manufacturing company, is working on self-driving cars.
It uses machine learning methods to train its car models to detect people and objects
while driving.
3. Email Spam and Malware Filtering:
Email services use machine learning-based spam filters such as:
o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
4. Virtual Personal Assistant:
We have various virtual personal assistants such as Google Assistant, Alexa, Cortana and Siri. As the
name suggests, they help us find information using our voice instructions.
These assistants can help us in various ways just through our voice instructions, such as playing
music, calling someone, opening an email, scheduling an appointment, etc.
5. Stock Market Trading:
Machine learning is widely used in stock market trading. In the stock market, there is always a risk of
ups and downs in share prices, so a long short-term memory (LSTM) neural network is used
for the prediction of stock market trends.
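A minimal sketch of such an LSTM model using TensorFlow/Keras; the window length, layer sizes and the random stand-in data are assumptions, and a real system would use actual price histories.

import numpy as np
import tensorflow as tf

timesteps, n_features = 30, 1                                      # 30 past prices per window (assumed)
X = np.random.rand(64, timesteps, n_features).astype("float32")    # stand-in price windows
y = np.random.rand(64, 1).astype("float32")                        # stand-in next-day prices

model = tf.keras.Sequential([
    tf.keras.Input(shape=(timesteps, n_features)),
    tf.keras.layers.LSTM(32),                                      # remembers the recent price history
    tf.keras.layers.Dense(1),                                      # predicted next value
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, verbose=0)                               # trained on real windows in practice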
6. Medical Diagnosis:
In medical science, machine learning is used for disease diagnosis. With this, medical
technology is growing very fast and is able to build 3D models that can predict the exact position
of lesions in the brain.
L-3
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
Introduction: (Maximum 5 sentences): This is learning a class from its positive and negative
examples. We generalize and discuss the case of multiple classes, then regression, where the outputs
are continuous.
Prerequisite knowledge for Complete understanding and learning of Topic: (Max. Four
important topics)
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Detailed content of the Lecture:
During the design of the checkers learning system, the type of training experience available to a
learning system will have a significant effect on the success or failure of the learning.
2. Teacher or Not:
Supervised: the training experience will be labelled, which means all the board states will be
labelled with the correct move, so learning takes place in the presence of a supervisor or teacher.
Unsupervised: the training experience will be unlabeled, which means the board states will not
have the moves, so the learner generates random games and plays against itself with no
supervision or teacher involvement.
Semi-supervised: the learner generates game states and asks the teacher for help in finding the
correct move if the board state is confusing.
3. Is the training experience good?
Do the training examples represent the distribution of examples over which the final system
performance will be measured? Performance is best when training examples and test examples
come from the same or a similar distribution.
The checkers player learns by playing against itself, so its experience is indirect, and it may not
encounter moves that are common in human expert play. Once the proper training experience is
available, the next design step will be choosing the Target Function.
In this design step, we need to determine exactly what type of knowledge has to be learned and
how the program will use it.
When you are playing the checkers game, at any moment of time you make a decision on
choosing the best move from different possibilities. You think and apply the learning that you
have gained from experience. Here the learning is: for a specific board state, you move a checker
such that the board state tends towards the winning situation. Now the same learning has to be
defined in the form of a target function, as sketched below.
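An illustrative Python sketch of this decision rule (the board representation, the successor generator and the evaluation function are all hypothetical stand-ins, not the handout's actual checkers program):

def choose_move(board, legal_successors, evaluate):
    """Return the successor board that the learned evaluation scores highest."""
    return max(legal_successors(board), key=evaluate)

# Toy usage: boards are plain tuples, the successors are fixed, and the "learned"
# evaluation is just the sum of the tuple entries.
toy_successors = lambda board: [(1, 0), (0, 2), (2, 2)]
print(choose_move((0, 0), toy_successors, evaluate=sum))   # -> (2, 2)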
L-4
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
We can define a domain of numbers as our input, such as floating-point values from -50 to 50.
We can define a simple function with one numerical input variable and one numerical output
variable and use this as the basis for understanding neural networks for function
approximation.
We can then select a mathematical operation to apply to the inputs to get the output values. The
selected mathematical operation will be the mapping function, and because we are choosing it,
we will know what it is. In practice, this is not the case and is the reason why we would use a
supervised learning algorithm like a neural network to learn or discover the mapping function.
In this case, we will use the square of the input as the mapping function, defined as:
y = x^2
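A minimal Python sketch of approximating this mapping with a small neural network, assuming scikit-learn's MLPRegressor; the layer sizes and iteration count are arbitrary choices for illustration.

import numpy as np
from sklearn.neural_network import MLPRegressor

x = np.linspace(-50, 50, 1000).reshape(-1, 1)     # inputs from the chosen domain
y = (x ** 2).ravel()                              # the mapping function y = x^2

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
model.fit(x, y)                                   # the network learns to approximate the mapping

print(model.predict([[10.0]]))                    # roughly 100, up to approximation error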
L-5
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
Introduction: (Maximum 5 sentences): A learning algorithm is used to go through the given data
set. The Inductive Learning Algorithm (ILA) is an iterative and inductive machine learning algorithm
used for generating a set of classification rules of the form "IF-THEN" from a set of examples,
producing rules at each iteration and appending them to the rule set.
Prerequisite knowledge for Complete understanding and learning of Topic: (Max. Four
important topics)
Classification
Regression
Detailed content of the Lecture:
There are basically two methods for knowledge extraction: firstly from domain experts, and
secondly with machine learning.
For a very large amount of data, domain experts are not very useful or reliable, so we move
towards the machine learning approach for this work.
One way to use machine learning is to replicate the expert's logic in the form of
algorithms, but this work is very tedious, time-consuming and expensive.
So we move towards inductive algorithms, which generate the strategy for performing a task
themselves and do not need to be instructed separately at each step.
The need arose from the pitfalls present in previous algorithms; one of the major pitfalls was the
lack of generalisation of rules.
ID3 and AQ used the decision-tree production method, which was too specific, difficult to
analyse, and very slow for basic short classification problems.
The decision-tree-based algorithms were unable to work on a new problem if some attributes
were missing.
The ILA uses the method of producing a general set of rules instead of decision trees,
which overcomes the above problems.
THE ILA ALGORITHM: General requirements at the start of the algorithm (a minimal set-up sketch follows this list):
1. List the examples in the form of a table 'T' where each row corresponds to an example and
each column contains an attribute value.
2. Create a set of m training examples, each example composed of k attributes and a class
attribute with n possible decisions.
3. Create a rule set, R, having the initial value false.
4. Initially, all rows in the table are unmarked.
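A minimal Python sketch of this set-up; the attribute names, values and classes are invented for illustration and are not the handout's example table.

examples = [                                           # table 'T': one row per example
    {"size": "medium", "colour": "blue",  "shape": "brick",  "class": "yes"},
    {"size": "small",  "colour": "red",   "shape": "sphere", "class": "yes"},
    {"size": "large",  "colour": "green", "shape": "pillar", "class": "no"},
]
attributes = ["size", "colour", "shape"]               # the k attributes
decisions = {row["class"] for row in examples}         # the n possible decisions
rules = []                                             # rule set R, initially empty (false)
marked = [False] * len(examples)                       # all rows start unmarked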
Tom Mitchell, "Machine Learning", Tata McGraw-Hill, 1997.
L-6
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
VS_{H,D} ≡ {h ∈ H | Consistent(h, D)}
Note the difference between the definitions of consistent and satisfies:
– an example x satisfies hypothesis h when h(x) = 1, regardless of whether x is a positive or negative
example of the target concept
– an example x is consistent with hypothesis h iff h(x) = c(x)
We can represent the version space by listing all its members. This leads to the List-Then-Eliminate
concept learning algorithm:
For each training example ⟨x, c(x)⟩, remove from the version space any hypothesis h for which
h(x) ≠ c(x).
However, since it requires exhaustive enumeration of all hypotheses, in practice it is not feasible.
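A tiny Python sketch of List-Then-Eliminate over an explicitly enumerated hypothesis space; the hypotheses and training pairs are illustrative only.

# each hypothesis maps an example x to 0 or 1; c(x) is the target label
hypotheses = {
    "x >= 0": lambda x: 1 if x >= 0 else 0,
    "x >= 5": lambda x: 1 if x >= 5 else 0,
    "always 1": lambda x: 1,
}
training_examples = [(3, 1), (-2, 0)]                  # pairs (x, c(x))

version_space = dict(hypotheses)
for x, cx in training_examples:
    # remove every hypothesis h with h(x) != c(x)
    version_space = {name: h for name, h in version_space.items() if h(x) == cx}

print(list(version_space))                             # hypotheses consistent with all examples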
L-7
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
Introduction: (Maximum 5 sentences): The Candidate Elimination algorithm finds all hypotheses
that match all the given training examples. Unlike the Find-S algorithm and the List-Then-Eliminate
algorithm, it goes through both negative and positive examples, eliminating any inconsistent
hypothesis.
Prerequisite knowledge for Complete understanding and learning of Topic: (Max. Four
important topics)
Concepts of Supervised Learning
Version space
Detailed content of the Lecture:
The candidate elimination algorithm incrementally builds the version space given a
hypothesis space H and a set E of examples.
The examples are added one by one; each example possibly shrinks the version space by
removing the hypotheses that are inconsistent with the example.
The candidate elimination algorithm does this by updating the general and specific boundary
for each new example.
You can consider this as an extended form of the Find-S algorithm.
It considers both positive and negative examples.
Positive examples are used, as in the Find-S algorithm, to generalize the specific boundary,
while negative examples are used to make the general boundary more specific.
Terms Used:
Concept learning: basically, the learning task of the machine (learning from training data).
General Hypothesis: does not specify the features; every attribute is left open.
G = {'?', '?', '?', '?', …}: the number of '?' entries equals the number of attributes.
Specific Hypothesis: specifies the features to be learned (specific feature values).
S = {'ϕ', 'ϕ', 'ϕ', …}: ϕ denotes the most specific (empty) value, and the number of entries equals
the number of attributes.
Version Space: it is intermediate between the general hypothesis and the specific hypothesis. It is
not just one hypothesis but the set of all possible hypotheses consistent with the training data-set.
Algorithm (a runnable sketch follows these steps):
Step 1: Load the data set.
Step 2: Initialize the General Hypothesis G and the Specific Hypothesis S.
Step 3: For each training example:
Step 4: If the example is positive, then for each attribute:
            if attribute_value == hypothesis_value:
                do nothing
            else:
                replace the attribute value in S with '?' (basically generalizing it)
Step 5: If the example is negative:
            make the general hypotheses more specific.
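A compact, simplified Python sketch of these S and G updates for conjunctive hypotheses; the data is invented, and a full implementation would also prune redundant members of G.

def candidate_elimination(examples, n_attrs):
    S = ["phi"] * n_attrs                          # most specific hypothesis ('phi' = no value yet)
    G = [["?"] * n_attrs]                          # set of most general hypotheses
    for x, label in examples:
        if label == "yes":                         # positive example: generalize S
            for i, value in enumerate(x):
                if S[i] == "phi":
                    S[i] = value
                elif S[i] != value:
                    S[i] = "?"
            # drop general hypotheses that reject this positive example
            G = [g for g in G if all(g[i] in ("?", x[i]) for i in range(n_attrs))]
        else:                                      # negative example: specialize G
            new_G = []
            for g in G:
                for i in range(n_attrs):
                    if g[i] == "?" and S[i] not in ("?", x[i]):
                        h = list(g)
                        h[i] = S[i]                # specialize using the value kept in S
                        new_G.append(h)
            G = new_G
    return S, G

data = [(("sunny", "warm"), "yes"),
        (("rainy", "cold"), "no"),
        (("sunny", "hot"), "yes")]
print(candidate_elimination(data, 2))              # final S and G boundaries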
Tom Mitchell, "Machine Learning", Tata McGraw-Hill, 1997.
L-8
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
Consider the question "What is a chair?" We answer it by deciding which of the properties are relevant
for the concept "chair", and which of them have to be fulfilled or not fulfilled, respectively.
Hence, we obtain a description of the concept in terms of these properties.
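An illustrative Python sketch (not the handout's actual worked example) of a concept defined by relevant properties that must be fulfilled:

chair_concept = {"has_seat": True, "has_legs": True, "can_be_sat_on": True}   # assumed relevant properties

def belongs_to_concept(example, concept):
    """True iff every relevant property takes its required value; other properties are ignored."""
    return all(example.get(prop) == required for prop, required in concept.items())

candidate = {"has_seat": True, "has_legs": True, "can_be_sat_on": True, "colour": "red"}
print(belongs_to_concept(candidate, chair_concept))    # True under this toy definition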
Tom Mitchell, "Machine Learning", Tata McGraw-Hill, 1997.
L-9
LECTURE HANDOUTS
IT IV/VII-A
Course Name with Code : Machine Learning -
Introduction: (Maximum 5 sentences): A machine learning model can't directly see, hear, or sense
input examples. Instead, you must create a representation of the data to provide the model with a
useful vantage point into the data's key qualities. That is, in order to train a model, you must
choose the set of features that best represent the data.
Prerequisite knowledge for Complete understanding and learning of Topic: (Max. Four
important topics)
Concepts of Supervised Learning
Application of supervised learning
Detailed content of the Lecture:
A concept representation resource is a particular kind of digital resource for learning designed
to support the learning of disciplinary concepts. Such representation allows a learner to
manipulate properties, parameters and relationships, and explore relevant information
related to a concept.
In machine learning, feature learning or representation learning is a set of techniques that
allows a system to automatically discover the representations needed for feature detection
or classification from raw data.
Need of Representation Learning
Assume you’re developing a machine-learning algorithm to predict dog breeds based on
pictures. Because image data provides all of the answers, the engineer must rely heavily on it
when developing the algorithm.
Each observation or feature in the data describes the qualities of the dogs. The machine learning
system that predicts the outcome must comprehend how each attribute relates to the possible
outcomes, such as Pug, Golden Retriever, and so on.
Representation learning is a class of machine learning approaches that allow a system to
discover the representations required for feature detection or classification from raw data. The
requirement for manual feature engineering is reduced by allowing a machine to learn the
features and apply them to a given activity.
In representation learning, data is sent into the machine, and it learns the representation on its
own. It is a way of determining a data representation of the features, the distance function, and
the similarity function that determines how the predictive model will perform. Representation
learning works by reducing high-dimensional data to low-dimensional data, making it easier
to discover patterns and anomalies while also providing a better understanding of the data’s
overall behavior.
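A small Python sketch of this high-dimensional to low-dimensional idea, using PCA from scikit-learn as one simple way to learn a compact representation; the data shapes and component count are illustrative.

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 50)               # 200 examples described by 50 raw features
pca = PCA(n_components=2)                 # learn a 2-dimensional representation
Z = pca.fit_transform(X)

print(Z.shape)                            # (200, 2): a compact representation of the same data
print(pca.explained_variance_ratio_)      # how much of the data's variation each component keeps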
Basically, Machine learning tasks such as classification frequently demand input that is
mathematically and computationally convenient to process, which motivates representation
learning. Real-world data, such as photos, video, and sensor data, has resisted attempts to define
certain qualities algorithmically. An approach is to examine the data for such traits or
representations rather than depending on explicit techniques.
We must employ representation learning to ensure that the model provides invariant and
disentangled representations in order to increase its accuracy and performance. In this section, we
look at how representation learning can improve the model's performance in learning frameworks
such as supervised learning and unsupervised learning.