ML Unit 1
UNIT-1
Introduction: What is Machine Learning
In the real world, we are surrounded by humans who can learn from their experiences, and we have computers or machines which simply follow our instructions. But can a machine also learn from experience or past data the way a human does? This is where Machine Learning comes in.
Machine Learning is a subset of artificial intelligence that is mainly concerned with the development of algorithms which allow a computer to learn from data and past experience on its own. The term machine learning was first introduced by Arthur Samuel in 1959. We can define it in a summarized way as: Machine learning enables a machine to automatically learn from data, improve performance with experience, and predict things without being explicitly programmed.
With the help of sample historical data, known as training data, machine learning algorithms build a mathematical model that helps in making predictions or decisions without being explicitly programmed. Machine learning brings computer science and statistics together to create predictive models. It constructs or uses algorithms that learn from historical data: the more information we provide, the better the performance.
A Machine Learning system learns from historical data, builds prediction models, and, whenever it receives new data, predicts the output for it. The accuracy of the predicted output depends on the amount of data: a larger amount of data helps to build a better model, which predicts the output more accurately.
Suppose we have a complex problem where we need to perform some predictions. Instead of writing code for it, we just need to feed the data to generic algorithms, and with the help of these algorithms the machine builds the logic from the data and predicts the output. Machine learning has changed our way of thinking about such problems. The working of a machine learning algorithm can be summarized as: historical data is fed to the learning algorithm, the algorithm builds a model, and the model is then used to predict the output for new data.
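As a minimal illustration of this workflow (train on historical data, then predict on new data), here is a small Python sketch. It assumes scikit-learn is available and uses the built-in Iris dataset purely as stand-in historical data; neither appears in the original notes.

# A minimal sketch of the train-then-predict workflow (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Historical (training) data: feature matrix X and known outputs y.
X, y = load_iris(return_X_y=True)
X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.2, random_state=0)

# Build a predictive model from the training data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# When new data arrives, the learned model predicts the output for it.
print(model.predict(X_new[:5]))
print("accuracy on held-out data:", model.score(X_new, y_new))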
The need for machine learning is increasing day by day, because it can handle tasks that are too complex for a person to implement directly. As humans, we have limitations: we cannot manually process huge amounts of data, so we need computer systems, and this is where machine learning makes things easy for us.
We can train machine learning algorithms by providing them with large amounts of data and letting them explore the data, construct models, and predict the required output automatically. The performance of a machine learning algorithm depends on the amount of data, and it can be evaluated using a cost function. With the help of machine learning, we can save both time and money.
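The notes mention that performance can be judged by a cost function. As one common example (an assumption here, since no specific cost function is named above), the mean squared error used for regression models is:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

where m is the number of training examples, h_\theta(x^{(i)}) is the model's prediction for the i-th example, and y^{(i)} is its true output; training tries to make J(\theta) as small as possible.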
The importance of machine learning can be easily understood from its use cases. Currently, machine learning is used in self-driving cars, cyber fraud detection, face recognition, friend suggestions on Facebook, and so on. Top companies such as Netflix and Amazon have built machine learning models that use vast amounts of data to analyze user interests and recommend products accordingly.
At a broad level, machine learning can be classified into three types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised Learning
In supervised learning, the system creates a model using labeled data to understand the datasets and learn about each of them. Once training and processing are done, we test the model by providing sample data to check whether it predicts the correct output.
The goal of supervised learning is to map input data to output data. Supervised learning is based on supervision, much as a student learns under the supervision of a teacher. An example of supervised learning is spam filtering.
Supervised learning can be further divided into two categories of algorithms (a small sketch follows the list):
o Classification
o Regression
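As a small hedged sketch of the regression category (the data values and library choice are illustrative assumptions, not from the notes):

# Minimal supervised-learning sketch: regression on labeled data.
import numpy as np
from sklearn.linear_model import LinearRegression

# Labeled training data: inputs X with known outputs y (the "supervision").
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 4.0, 6.2, 7.9])            # roughly y = 2x

model = LinearRegression().fit(X, y)
print(model.predict([[5.0]]))                  # expect a value close to 10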
2) Unsupervised Learning
In unsupervised learning, we do not have a predetermined result; the machine tries to find useful insights from a huge amount of unlabeled data. It can be further divided into two categories of algorithms (a small sketch follows the list):
o Clustering
o Association
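A small hedged sketch of the clustering category (the points and the choice of k-means are illustrative assumptions):

# Minimal unsupervised-learning sketch: clustering unlabeled points with k-means.
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: no predetermined result, just raw points.
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)       # cluster assignments discovered from the data itself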
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method in which a learning agent gets a reward for each right action and a penalty for each wrong action. The agent learns automatically from this feedback and improves its performance. In reinforcement learning, the agent interacts with the environment and explores it. The goal of the agent is to collect the maximum number of reward points, and in doing so it improves its performance.
A robotic dog that automatically learns the movement of its limbs is an example of reinforcement learning.
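To make the reward/penalty idea concrete, here is a minimal sketch of the tabular Q-learning update rule; the state/action sizes, learning rate, and the single feedback step shown are illustrative assumptions, not part of the notes:

# Minimal reinforcement-learning sketch: the tabular Q-learning update rule.
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))      # the agent's current estimate of action values
alpha, gamma = 0.1, 0.9                  # learning rate and discount factor (assumed values)

def q_update(state, action, reward, next_state):
    # A reward for a right action (positive) or a penalty for a wrong one (negative)
    # is folded into the agent's value estimate for that state-action pair.
    best_next = np.max(Q[next_state])
    Q[state, action] += alpha * (reward + gamma * best_next - Q[state, action])

# One feedback step: the agent took action 1 in state 0, got reward +1, landed in state 3.
q_update(state=0, action=1, reward=1.0, next_state=3)
print(Q[0])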
Note: We will learn about the above types of machine learning in detail in later chapters.
A few decades ago (about 40-50 years), machine learning was science fiction, but today it is part of our daily life. Machine learning is making day-to-day life easier, from self-driving cars to Amazon's virtual assistant "Alexa". The idea behind machine learning, however, is quite old and has a long history.
Well-Posed Learning Problem – A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
A problem can be set up as a well-posed learning problem if it has three traits –
Task
Performance Measure
Experience
Some examples that illustrate well-posed learning problems are –
1. To better filter emails as spam or not
Task – Classifying emails as spam or not
Performance Measure – The fraction of emails accurately classified as spam or not spam
Experience – Observing you label emails as spam or not spam
2. A checkers learning problem
Task – Playing checkers game
Performance Measure – percent of games won against opponents
Experience – playing practice games against itself
3. Handwriting Recognition Problem
Task – recognizing handwritten words within images
Performance Measure – percent of words accurately classified
Experience – a database of handwritten words with given classifications
4. A Robot Driving Problem
Task – driving on public four-lane highways using vision sensors
Performance Measure – average distance traveled before an error
Experience – a sequence of images and steering commands recorded while observing a human driver
5. Fruit Prediction Problem
Task – predicting different types of fruits for recognition
Performance Measure – ability to correctly predict the maximum variety of fruits
Experience – training the machine with a large dataset of fruit images
6. Face Recognition Problem
Task – predicting different types of faces
Performance Measure – ability to correctly predict the maximum types of faces
Experience – training the machine with a large dataset of different face images
7. Automatic Translation of documents
Task – translating a document from one language into another language
Designing a Learning System:
Step 1) Choosing the Training Experience: The first and most important task is to choose the training data or training experience that will be fed to the machine learning algorithm. It is important to note that the data or experience fed to the algorithm has a significant impact on the success or failure of the model, so the training data or experience should be chosen wisely.
Below are the attributes that impact the success or failure of the model:
First, the training experience should be able to provide direct or indirect feedback regarding the choices made. For example, while playing chess, the training experience provides feedback such as: if this move is chosen instead of that one, the chances of success increase.
The second important attribute is the degree to which the learner controls the sequence of training examples. For example, when training data is first fed to the machine, its accuracy is very low, but as it gains experience by playing again and again against itself or an opponent, the algorithm receives feedback and controls the chess game accordingly.
The third important attribute is how well the training experience represents the distribution of examples over which the final performance will be measured. For example, a machine learning algorithm gains experience by going through a number of different cases and examples; by passing through more and more examples, it gains more experience and its performance improves.
Step 2) Choosing the Target Function: The next important step is choosing the target function. Based on the knowledge fed to the algorithm, the machine learning system chooses a NextMove function that describes which type of legal move should be taken. For example, while playing chess against an opponent, once the opponent has moved, the machine learning algorithm decides which of the possible legal moves to take in order to succeed.
Step 3) Choosing a Representation for the Target Function: When the algorithm knows all the possible legal moves, the next step is to choose an optimized move using some representation, e.g. linear equations, a hierarchical graph representation, a tabular form, etc. The NextMove function then selects the target move, i.e. the move among the candidates that offers the highest success rate. For example, while playing chess, if the machine has four possible moves, it chooses the optimized move that leads it to success. A small sketch of a linear representation is given below.
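A minimal sketch of such a linear representation, V_hat(b) = w0 + w1*x1 + ... + w6*x6, as in the classic checkers example; the particular feature values and weights below are illustrative assumptions:

# Sketch of a linear representation of the target function V(b) for a board game.
def v_hat(board_features, weights):
    # Approximate board value: V_hat(b) = w0 + w1*x1 + ... + wn*xn
    w0, *ws = weights
    return w0 + sum(w * x for w, x in zip(ws, board_features))

# Example: six board features (e.g. piece counts, kings, threatened pieces) and guessed weights.
features = [12, 12, 0, 0, 1, 2]
weights = [0.5, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5]   # w0..w6, initial guesses
print(v_hat(features, weights))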
Step 4) Choosing a Function Approximation Algorithm: An optimized move cannot be chosen from the training data alone. The training data has to go through a set of examples, and through these examples the learner approximates which moves should be chosen; after that, the machine adjusts the parameters of its target-function estimate accordingly. A small sketch of such an update rule is given below.
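A common function approximation algorithm in this setting is the LMS (least mean squares) weight-update rule, sketched below; the learning rate and the training value are illustrative assumptions:

# Sketch of the LMS weight-update rule used to fit V_hat to training values.
def lms_update(weights, board_features, v_train, eta=0.01):
    # w_i <- w_i + eta * (V_train(b) - V_hat(b)) * x_i   (the bias w0 uses x0 = 1)
    w0, *ws = weights
    v_hat = w0 + sum(w * x for w, x in zip(ws, board_features))
    error = v_train - v_hat
    return [w0 + eta * error] + [w + eta * error * x for w, x in zip(ws, board_features)]

weights = [0.0] * 7                      # w0..w6 start at zero
features = [12, 12, 0, 0, 1, 2]
weights = lms_update(weights, features, v_train=100.0)
print(weights)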
The EnjoySport Concept Learning Task:
Task T: Learn to predict the value of EnjoySport for an arbitrary day, based on the values of the attributes of the day.
Let us take a very simple hypothesis representation which consists of a conjunction of constraints on the instance attributes. We get a hypothesis h_i with the help of example i from our training set, as below:
h_i(x) = <x1, x2, x3, x4, x5, x6>
where x1, x2, x3, x4, x5 and x6 are the values of Sky, AirTemp, Humidity, Wind, Water and Forecast.
Hence h1 will look like this (corresponding to the first row of the training table):
h1(x=1) = <Sunny, Warm, Normal, Strong, Warm, Same>
Note: x=1 denotes a positive example.
We want to find the most suitable hypothesis which can represent the concept. For example, suppose Ramesh enjoys his favorite sport only on cold days with high humidity (independently of the values of the other attributes in the training examples). Then the target hypothesis is:
h(x=1) = <?, Cold, High, ?, ?, ?>
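To make the conjunctive-hypothesis representation concrete, here is a small sketch; the attribute order follows the list above, and the helper function name is an assumption:

# Sketch: a hypothesis as a tuple of attribute constraints, where "?" accepts any value.
# Attribute order: Sky, AirTemp, Humidity, Wind, Water, Forecast.
def matches(hypothesis, instance):
    # The instance is positive under the hypothesis if every constraint is satisfied.
    return all(h == "?" or h == x for h, x in zip(hypothesis, instance))

h = ("?", "Cold", "High", "?", "?", "?")    # Ramesh's concept: cold days with high humidity
x1 = ("Sunny", "Cold", "High", "Strong", "Warm", "Same")
x2 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
print(matches(h, x1))   # True  -> classified as positive
print(matches(h, x2))   # False -> classified as negative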
Concept learning search: What is concept learning? In terms of machine learning, concept learning can be formulated as the "problem of searching through a predefined space of potential hypotheses for the hypothesis that best fits the training examples".
Find-S Algorithm:
Example:
Consider the EnjoySport training examples described above.
Algorithmic trace (the Candidate Elimination algorithm maintains both a general boundary G and a specific boundary S; Find-S maintains only the maximally specific hypothesis S):
Initially: G = [[?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?],
                [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?], [?, ?, ?, ?, ?, ?]]
           S = [Null, Null, Null, Null, Null, Null]
Output:    G = [['sunny', ?, ?, ?, ?, ?], [?, 'warm', ?, ?, ?, ?]]
           S = ['sunny', 'warm', ?, 'strong', ?, ?]
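As a hedged sketch of how Find-S arrives at a maximally specific hypothesis S, here is a small implementation; the training rows follow the usual EnjoySport example (an assumption, since the table itself is not shown above):

# Sketch of the Find-S algorithm: start with the most specific hypothesis and
# generalize it only as far as the positive examples require.
data = [
    (("sunny", "warm", "normal", "strong", "warm", "same"),   "yes"),
    (("sunny", "warm", "high",   "strong", "warm", "same"),   "yes"),
    (("rainy", "cold", "high",   "strong", "warm", "change"), "no"),
    (("sunny", "warm", "high",   "strong", "cool", "change"), "yes"),
]

def find_s(examples):
    s = None                                    # the "Null" hypothesis: matches nothing yet
    for attrs, label in examples:
        if label != "yes":                      # Find-S ignores negative examples
            continue
        if s is None:
            s = list(attrs)                     # first positive example taken as-is
        else:
            s = [a if a == b else "?" for a, b in zip(s, attrs)]
    return s

print(find_s(data))    # ['sunny', 'warm', '?', 'strong', '?', '?']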
Remarks on version space and candidate elimination algorithm:
The version space learned by the CANDIDATE-ELIMINATION algorithm will converge toward the hypothesis that correctly describes the target concept, provided (1) there are no errors in the training examples, and (2) there is some hypothesis in H that correctly describes the target concept.
Inductive bias: In machine learning, the term inductive bias refers to a set of (explicit or implicit) assumptions made by a learning algorithm in order to perform induction, that is, to generalize a finite set of observations (training data) into a general model of the domain. Without a bias of that kind, induction would not be possible, since the observations can normally be generalized in many different ways. If all these possibilities were treated equally, i.e., without any bias in the sense of a preference for specific types of generalization (reflecting background knowledge about the target function to be learned), predictions for new situations could not be made.
Decision Tree:
There are various algorithms in machine learning, so choosing the best algorithm for the given dataset and problem is the main point to remember while creating a machine learning model. Below are two reasons for using a decision tree:
o Decision trees usually mimic the human thinking process while making a decision, so they are easy to understand.
o The logic behind a decision tree can be easily understood because it shows a tree-like structure.
Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes according
to the given conditions.
Pruning: Pruning is the process of removing the unwanted branches from the tree.
Parent/Child node: A node that splits into sub-nodes is called the parent node of those sub-nodes, and the sub-nodes are called its child nodes.
In a decision tree, to predict the class of a given record, the algorithm starts from the root node of the tree. It compares the value of the root attribute with the corresponding attribute of the record (from the real dataset) and, based on the comparison, follows the branch and jumps to the next node. For the next node, the algorithm again compares the attribute value with those of the sub-nodes and moves further. It continues this process until it reaches a leaf node of the tree. The complete process can be better understood using the algorithm below:
o Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM; a common measure is sketched after this list).
o Step-3: Divide S into subsets that contain the possible values of the best attribute.
o Step-4: Generate the decision tree node that contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where the nodes cannot be classified further; such a final node is called a leaf node.
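Step-2 relies on an Attribute Selection Measure. One common choice (an assumption here, since the notes do not name a specific measure) is information gain based on entropy:

Entropy(S) = -\sum_{i=1}^{c} p_i \log_2 p_i

Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\, Entropy(S_v)

where p_i is the proportion of examples in S belonging to class i, and S_v is the subset of S for which attribute A takes value v; the attribute with the highest gain is chosen for the split.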
Example: Suppose a candidate has a job offer and wants to decide whether to accept it or not. To solve this problem, the decision tree starts with the root node (the Salary attribute, chosen by the ASM). The root node splits into the next decision node (distance from the office) and one leaf node, based on the corresponding labels. The next decision node splits further into one decision node (cab facility) and one leaf node. Finally, that decision node splits into two leaf nodes (Accepted offer and Declined offer). The resulting tree asks, in order: Is the salary acceptable? Is the office close enough? Is a cab facility provided?
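A hedged code sketch of this job-offer example follows; the feature encoding, the training rows, and the use of scikit-learn are illustrative assumptions, so the learned splits need not match the description above exactly:

# Sketch of the job-offer decision as a decision tree.
# Features: salary (lakhs per year), distance to office (km), cab facility (1 = yes, 0 = no).
from sklearn.tree import DecisionTreeClassifier, export_text

X = [
    [4, 20, 0],    # low salary                      -> declined
    [9, 25, 0],    # good salary, far, no cab        -> declined
    [9, 25, 1],    # good salary, far, cab provided  -> accepted
    [10, 5, 0],    # good salary, near the office    -> accepted
]
y = ["Declined", "Declined", "Accepted", "Accepted"]

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(tree, feature_names=["salary", "distance", "cab"]))
print(tree.predict([[9, 30, 1]]))    # a new offer: good salary, far away, cab provided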
4. The training data may contain errors. "Decision tree learning methods are robust to errors, both errors in the classifications of the training examples and errors in the attribute values that describe these examples."