
UNIT-I

What is Machine Learning?


Machine Learning is a concept that allows a machine to learn from examples and experience without being explicitly programmed. So instead of you writing the code, you feed data to a generic algorithm, and the algorithm/machine builds the logic based on the given data.
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
What is Machine Learning?
Have you ever shopped online? While checking for a product, did you notice that it recommends products similar to the one you are looking for, or shows "people who bought this product also bought these" combinations? How is this recommendation done? This is machine learning.

What is Machine Learning?


Machine Learning is a subset of artificial intelligence which focuses mainly on machines learning from their experience and making predictions based on that experience.
What does it do?
It enables computers or machines to make data-driven decisions rather than being explicitly programmed to carry out a certain task. These programs or algorithms are designed in such a way that they learn and improve over time when exposed to new data.

How does Machine Learning Work?


A Machine Learning algorithm is trained on a training data set to create a model. When new input data is introduced to the ML algorithm, it makes a prediction on the basis of the model.
The prediction is evaluated for accuracy, and if the accuracy is acceptable, the Machine Learning algorithm is deployed. If the accuracy is not acceptable, the Machine Learning algorithm is trained again and again with an augmented training data set.
This is just a very high-level example as there are many factors and other steps involved.
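As an illustration only (not part of the original notes), the following Python sketch mimics the train, predict, evaluate, deploy-or-retrain loop described above, using scikit-learn's Iris dataset and a decision tree classifier; the 0.90 accuracy threshold is an assumed value.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Training data set -> model
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# New input data -> prediction -> evaluate accuracy
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)

ACCEPTABLE = 0.90  # assumed threshold, for illustration only
if accuracy >= ACCEPTABLE:
    print("Accuracy", accuracy, "- deploy the model")
else:
    print("Accuracy", accuracy, "- retrain with an augmented training data set")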

Types of Machine Learning


Machine learning is sub-categorized into three types:
Supervised Learning – Train Me!
Unsupervised Learning – I am self-sufficient in learning
Reinforcement Learning – My life, my rules! (Hit & Trial)

What is supervised Learning?


Supervised Learning is the type where the learning is guided by a teacher. We have a dataset which acts as the teacher, and its role is to train the model or the machine. Once the model is trained, it can start making predictions or decisions when new data is given to it.
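A minimal supervised-learning sketch in Python (the fruit features, labels and the choice of a k-nearest-neighbour classifier are invented purely for illustration): the labelled dataset plays the role of the teacher, and the trained model then predicts labels for new data.

from sklearn.neighbors import KNeighborsClassifier

# Labelled training data: [weight in grams, colour score 0=green .. 1=red] -> fruit name
X_train = [[150, 0.90], [170, 0.80], [130, 0.20], [140, 0.10]]
y_train = ["apple", "apple", "guava", "guava"]

# The labelled dataset "teaches" the model during fit()
model = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

# Once trained, the model predicts for new, unseen data
print(model.predict([[160, 0.85]]))   # -> ['apple']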
What is Unsupervised Learning?
The model learns through observation and finds structures in the data. Once the model is given a dataset, it automatically finds patterns and relationships in the dataset by creating clusters in it. What it cannot do is add labels to the clusters: it cannot say this is a group of apples or mangoes, but it will separate the apples from the mangoes.
Suppose we present images of apples, bananas and mangoes to the model; based on some patterns and relationships it creates clusters and divides the dataset into those clusters. Now if new data is fed to the model, it adds it to one of the created clusters.
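The clustering behaviour described above can be sketched with k-means in Python (the two-feature fruit measurements and the choice of k-means are illustrative assumptions): the model groups the data into clusters but never names them.

from sklearn.cluster import KMeans

# Unlabelled data: [weight in grams, colour score]; roughly two natural groups
X = [[150, 0.90], [155, 0.85], [160, 0.88], [300, 0.70], [310, 0.75], [320, 0.65]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)                  # cluster id (0 or 1) for each point; no names like "apple"
print(kmeans.predict([[152, 0.87]]))   # a new point is assigned to one of the created clusters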
What is Reinforcement Learning?
It is the ability of an agent to interact with the environment and find out what the best outcome is. It follows the hit-and-trial (trial-and-error) method. The agent is rewarded or penalized with a point for a correct or a wrong answer, and on the basis of the positive reward points gained the model trains itself. Once trained, it is ready to make predictions on new data presented to it.
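A tiny hit-and-trial sketch in plain Python (the two actions and their hidden reward probabilities are made up for illustration): the agent is rewarded (+1) or penalized (-1) after each action and gradually learns which action is better.

import random

random.seed(0)
reward_prob = {"left": 0.3, "right": 0.8}   # hidden from the agent
value = {"left": 0.0, "right": 0.0}         # the agent's learned estimates
counts = {"left": 0, "right": 0}

for step in range(1000):
    # explore occasionally, otherwise exploit the current best estimate
    if random.random() < 0.1:
        action = random.choice(["left", "right"])
    else:
        action = max(value, key=value.get)
    reward = 1 if random.random() < reward_prob[action] else -1   # reward or penalty
    counts[action] += 1
    value[action] += (reward - value[action]) / counts[action]    # running average

print(value)   # the estimate for "right" ends up clearly higher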

Machine Learning – Applications


Machine learning is one of the most exciting technologies that one would have ever come across.
As is evident from the name, it gives the computer the quality that makes it more similar to humans: the ability to learn. Machine learning is actively being used today, perhaps in many more places than one would expect. We probably use a learning algorithm dozens of times without even knowing it. Applications of Machine Learning include:
Web Search Engine: One of the reasons why search engines like Google and Bing work so well is that the system has learnt how to rank pages through a complex learning algorithm.
Photo Tagging Applications: Be it Facebook or any other photo tagging application, the ability to tag friends makes it even more engaging. This is all possible because of a face recognition algorithm that runs behind the application.
Spam Detector: Mail agents like Gmail or Hotmail do a lot of hard work for us in classifying mails and moving spam mails to the spam folder. This is again achieved by a spam classifier running in the back end of the mail application.
Image Recognition: Image recognition is one of the most significant Machine Learning applications. Basically, it is an approach for identifying and detecting a feature or an object in a digital image. Moreover, this technique can be used for further analysis such as pattern recognition, face detection, face recognition, optical character recognition and many more.
Sentiment Analysis: Sentiment analysis is another real-time machine learning application. It is also referred to as opinion mining, sentiment classification, etc. It is the process of determining the attitude or opinion of the speaker or the writer; in other words, the process of finding the emotion expressed in a text. The main concern of sentiment analysis is "what do other people think?". Assume that someone writes "the movie is not so good." Finding out the actual thought or opinion in the text (is it good or bad?) is the task of sentiment analysis. Sentiment analysis can also be applied in further applications such as review-based websites and decision-making applications.
Speech Recognition: Speech recognition is the process of transforming spoken words into text. It is also called automatic speech recognition, computer speech recognition or speech-to-text. This field has benefited from advances in machine learning and big data.

Design of a Learning system:


Steps involved in "Designing a Learning System":
Step 1: Choosing the Training Experience
Step 2: Choosing the Target Function
Step 3: Choosing a Representation for the Target Function
Step 4: Choosing a Function Approximation Algorithm
Step 5: The Final Design (Validation and Testing)

Step 1. Choosing the Training Experience: The first and very important task is to choose the training data or training experience which will be fed to the Machine Learning algorithm. The data or experience that we feed to the algorithm has a significant impact on the success or failure of the model, so the training data or experience should be chosen wisely.
Below are the attributes of the training experience which impact success or failure:
• Whether the training experience provides direct or indirect feedback regarding the choices made. For example, while playing chess the training experience can give feedback such as: if this move is chosen instead of that one, the chances of success increase.
• The degree to which the learner controls the sequence of training examples. For example, when training data is first fed to the machine its accuracy is very low, but as it gains experience by playing again and again against itself or an opponent, the algorithm receives feedback and controls the chess game accordingly.
• How well the training experience represents the distribution of examples over which the final performance will be measured. A Machine Learning algorithm gains experience by going through many different cases and examples; the more examples it passes through, the more experience it gains and the better its performance becomes.
Step 2. Choosing the Target Function: The next important step is choosing the target function. According to the knowledge fed to the algorithm, the machine will learn a NextMove function which describes which legal move should be taken.
For example, while playing chess, when the opponent plays, the machine learning algorithm decides which of the possible legal moves to take in order to succeed.
Step 3. Choosing a Representation for the Target Function: Once the machine knows all the possible legal moves, the next step is to choose a representation with which the optimized move can be computed, e.g. linear equations, a hierarchical graph representation, a tabular form, etc. The NextMove function then selects, out of the candidate moves, the one with the highest success rate.
For example, if the machine has 4 possible moves while playing chess, it will choose the optimized move which leads it to success.
Step 4. Choosing a Function Approximation Algorithm: An optimized move cannot be chosen from the training data alone. The system has to work through a set of examples; from these examples it approximates which moves should be chosen, and the outcomes provide feedback for the approximation.
For example, when training data for playing chess is fed to the algorithm, the machine will sometimes fail and sometimes succeed, and from each failure or success it estimates which move should be chosen next and what its success rate is.
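To make Steps 3 and 4 concrete, here is a small Python sketch in the spirit of the chess/checkers example: the target function V(board) is represented as a linear combination of board features, and the weights are tuned with a least-mean-squares (LMS) style update. The features, training values and learning rate are invented for illustration.

def v_hat(weights, features):
    # Linear representation of the target function: w0 + w1*x1 + ... + wn*xn
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, lr=0.001):
    # Function approximation: nudge each weight toward the training value
    error = v_train - v_hat(weights, features)
    weights[0] += lr * error
    for i, x in enumerate(features):
        weights[i + 1] += lr * error * x
    return weights

# Assumed features: (my pieces, opponent pieces, my kings) for a board state,
# paired with an assumed training value for that board
training_examples = [((12, 12, 0), 0.0), ((10, 6, 2), 60.0), ((4, 11, 0), -70.0)]
weights = [0.0, 0.0, 0.0, 0.0]

for _ in range(500):
    for features, v_train in training_examples:
        weights = lms_update(weights, features, v_train)

print(weights)   # learned weights; v_hat(weights, f) now approximates the training values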
Step 5. The Final Design: The final design is created at the end, after the system has gone through many examples, failures and successes, correct and incorrect decisions, and what the next step should be.
Example: Deep Blue, an ML-based intelligent computer, won a chess match against the chess expert Garry Kasparov and became the first computer to beat a human chess expert.

Choices in Designing the Checkers Learning Problem


Random Variable

A random variable is a quantity that is produced by a random process.

A random variable is often denoted as a capital letter, e.g. X, and values of the random variable
are denoted as a lowercase letter and an index, e.g. x1, x2, x3.

The set of values that a random variable can take is called its domain, and the domain of a random variable may be discrete or continuous.

A random variable x is a variable which randomly takes on values from a sample space. We often indicate a specific value x can take on with italics.

For example, if x represented the outcome of a coin flip, we might discuss a specific outcome
as x = heads.

Random variables can either be discrete like the coin, or continuous (can take on an uncountably
infinite amount of possible values).

A discrete random variable has a finite set of states: for example, the colors of a car.

A random variable that has values true or false is discrete and is referred to as a Boolean
random variable: for example, a coin toss.
A continuous random variable has a range of numerical values: for example, the height of
humans.

• Discrete Random Variable. Values are drawn from a finite set of states.
• Boolean Random Variable. Values are drawn from the set of {true, false}.
• Continuous Random Variable. Values are drawn from a range of real-valued numerical values.
• A value of a random variable can be specified via an equals operator: for example, X=True.
• The probability of a random variable is denoted as a function using the upper case P or Pr; for example, P(X) is the probability of all values for the random variable X.
• The probability of a value of a random variable can be denoted P(X=True), in this case indicating the probability of the random variable X having the value True.
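A short Python sketch of these kinds of random variables (the lists of states and the height parameters are assumed for illustration):

import random

random.seed(1)

# Boolean / discrete random variable: a coin toss with domain {heads, tails}
coin = random.choice(["heads", "tails"])

# Discrete random variable with a finite set of states: the colour of a car
car_colour = random.choice(["red", "blue", "white", "black"])

# Continuous random variable: a human height in cm drawn from a normal distribution
# (mean 170 and standard deviation 10 are assumed values)
height = random.gauss(170, 10)

print(coin, car_colour, round(height, 1))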

Probability Theory

Probability theory is a mathematical framework for quantifying our uncertainty about the world.

It allows us (and our software) to reason effectively in situations where being certain is
impossible.

Probability theory is at the foundation of many machine learning algorithms.

In plain English, the probability of any event has to be between 0 (impossible) and 1 (certain), and
the sum of the probabilities of all events should be 1.

This follows from the fact that the sample space must contain all possible outcomes. Therefore,
we are certain (probability 1) that one of the possible outcomes will occur.

Conditional Probabilities

Frequently we want to know the probability of an event given some other event has occurred. We
write conditional probability of an event A given event B as P(A | B).

Let's suppose we want to calculate the probability of event A when event B has already occurred, "the probability of A under the conditions of B"; it can be written as:

P(A|B)= P(A⋀B)/P(B)

where P(A⋀B) = joint probability of A and B


P(B) = marginal probability of B.

If the probability of A is given and we need to find the probability of B, then it will be given as:

P(B|A)= P(A⋀B)/P(A)
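A worked numerical example of these formulas (the events are chosen only for illustration): roll one fair die, let A = "the roll is even" and B = "the roll is greater than 3"; then P(A|B) = P(A⋀B)/P(B) = (1/3)/(1/2) = 2/3.

from fractions import Fraction

outcomes = range(1, 7)
A = {x for x in outcomes if x % 2 == 0}   # {2, 4, 6}
B = {x for x in outcomes if x > 3}        # {4, 5, 6}

p_B = Fraction(len(B), 6)                 # marginal probability of B = 1/2
p_A_and_B = Fraction(len(A & B), 6)       # joint probability of A and B = 1/3
p_A_given_B = p_A_and_B / p_B             # conditional probability = 2/3

print(p_B, p_A_and_B, p_A_given_B)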

Probability Distribution
A probability distribution is a summary of probabilities for the values of a random variable.

As a distribution, the mapping of the values of a random variable to a probability has a shape
when all values of the random variable are lined up. The distribution also has general properties
that can be measured.

Two important properties of a probability distribution are the expected value and the variance.

The expected value is the average or mean value of a random variable X, denoted E(X).

The variance is the spread of the values of a random variable around the mean, denoted Var(X).
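For a concrete example, the expected value and variance of a fair six-sided die (a discrete uniform random variable) can be computed directly:

values = [1, 2, 3, 4, 5, 6]
p = 1 / 6                                                 # each outcome equally likely

expected = sum(p * x for x in values)                     # E(X) = 3.5
variance = sum(p * (x - expected) ** 2 for x in values)   # Var(X) = 35/12 ≈ 2.917

print(expected, variance)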

Discrete Probability Distributions


• Poisson distribution.
• Bernoulli and binomial distributions.
• Multinoulli and multinomial distributions.
• Discrete uniform distribution.

• The probabilities of dice rolls form a discrete uniform distribution.
• The probabilities of coin flips form a Bernoulli distribution.
• The probabilities of car colors form a multinomial distribution.
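For illustration, samples from each of the listed discrete distributions can be drawn with NumPy (the parameters and sample sizes below are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)

print(rng.poisson(lam=3, size=5))                     # Poisson
print(rng.binomial(n=1, p=0.5, size=5))               # Bernoulli (binomial with n=1): coin flips
print(rng.binomial(n=10, p=0.5, size=5))              # binomial
print(rng.multinomial(n=10, pvals=[0.5, 0.3, 0.2]))   # multinomial: e.g. counts of car colours
print(rng.integers(low=1, high=7, size=5))            # discrete uniform: dice rolls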

Decision Theory
Decision theory is a body of principles associated with making decisions. Contemporary decision theory was developed in the middle of the 20th century with the support of several academic disciplines. Decision theory is typically followed by researchers who identify themselves as economists, statisticians, psychologists, political and social scientists or philosophers. Decision theory provides a formal structure for making rational choices in situations of uncertainty. Given a set of alternatives, a set of consequences, and a correspondence between those sets, decision theory offers conceptually simple procedures for choice. Decision theory originates from economics, in the use of the utility function of payoffs. It proposes that decisions be made by computing the utility and probability of the ranges of options, and it also lays down strategies for good decisions.
Decision theory is a set of concepts, principles, tools and techniques that help the decision maker
in dealing with complex decision problems under uncertainty. More specifically, decision theory
deals with methods for determining the optimal course of action when a number of alternatives
are available and their consequences cannot be forecasted with certainty.
According to David Lewis (1974), "decision theory (at least if we omit the frills) is not an
esoteric science, however unfamiliar it may seem to an outsider. Rather it is a systematic
exposition of the consequences of certain well-chosen platitudes about belief, desire, preference
and choice. It is the very core of our common-sense theory of persons, dissected out and
elegantly systematized".
The theoretical literature presents decision theory as a generalized approach to decision making. It enables the decision maker to analyze a set of complex situations with many alternatives and many different possible consequences, and to identify a course of action consistent with the basic economic and psychological desires of the decision maker.
Decision theory problems are characterized by the following:
1. A decision criterion
2. A list of alternatives
3. A list of possible future events (states of nature)
4. Payoffs associated with each combination of alternatives and events
5. The degree of certainty of possible future events
There are two categories of decision theory. Normative or prescriptive decision theory identifies the best decision to take, assuming an ideal decision maker who is fully informed, able to compute with perfect accuracy, and fully rational. The practical application of this prescriptive approach is called decision analysis, and it is aimed at finding tools, methodologies and software to help people make better decisions. The most systematic and comprehensive software tools developed in this way are called decision support systems.
In contrast, positive or descriptive decision theory explains observed behaviors under the assumption that the decision-making agents are behaving under some consistent rules. These rules may, for instance, have a technical or axiomatic framework, reconciling the Von Neumann-Morgenstern axioms with behavioral deviations from the expected utility hypothesis, or they may explicitly give a functional form for time-inconsistent utility functions.
The solution to any decision problem includes the following steps:
1. Identify the problem
2. Specify objectives and the decision criteria for choosing a solution
3. Develop alternatives
4. Analyze and compare alternatives
5. Select the best alternative
6. Implement the chosen alternative
7. Verify that desired results are achieved

Decisions in stages, decision trees:


In many instances, the choice of the best act is not made in one stage, and the decision problem involves a sequence of acts and events. There may be a number of basic alternatives, each leading to one of a number of situations depending on the outcome of a certain random process. At each such situation, a number of other alternatives may be available which also lead to a new set of situations depending on another set of events, and so on, with acts followed by events, followed by further acts and events. Discrete decision theory problems can be represented pictorially using decision trees, which chronologically portray the sequence of actions and events as they unfold. In such a diagram, a square node precedes the set of actions that can be taken by the decision maker, and a round node precedes the set of events or states of nature that could be encountered after a decision is made. The nodes are connected by branches.
Analysis of Decision Trees: After the tree has been drawn, it is analyzed from right to left. The aim of the analysis is to determine the best strategy of the decision maker, that is, an optimal sequence of decisions. To analyze a decision tree, managers must know a decision criterion, the probabilities assigned to each event, and the revenues and costs for the decision alternatives and the chance events that occur.
There are two possibilities for how to include revenues and costs in a decision tree. One possibility is to assign them only to terminating nodes, where they are included in the conditional value of the decision criterion associated with the decisions and events along the path from the start of the tree to that end node. However, it can also be appropriate to assign revenues and costs to branches. This reduces the arithmetic required to calculate the values of the decision criterion for terminating nodes and focuses attention on the parameters for sensitivity analysis.
When analyzing a decision tree, managers must start at the end of the tree and work backwards. They perform two kinds of calculations.
For chance event nodes, managers calculate certainty equivalents of the events emanating from these nodes. Under the assumption that the decision maker has a neutral attitude toward risk, the certainty equivalent of uncertain outcomes can be replaced by their expected value. At decision nodes, the alternative with the best expected value of the decision criterion is selected.
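A minimal Python sketch of this backward (rollback) analysis, assuming expected monetary value as the decision criterion and a risk-neutral decision maker; the small tree, its payoffs and probabilities are invented for illustration.

def rollback(node):
    if "payoff" in node:                                    # terminating node
        return node["payoff"]
    if node["type"] == "chance":                            # certainty equivalent = expected value
        return sum(p * rollback(child) for p, child in node["branches"])
    if node["type"] == "decision":                          # choose the best alternative
        return max(rollback(child) for _, child in node["branches"])

# Decision: "expand" (uncertain demand) or "do nothing" (a sure payoff of 20)
tree = {
    "type": "decision",
    "branches": [
        ("expand", {"type": "chance",
                    "branches": [(0.6, {"payoff": 100}),    # high demand
                                 (0.4, {"payoff": -50})]}), # low demand
        ("do nothing", {"payoff": 20}),
    ],
}

print(rollback(tree))   # 0.6*100 + 0.4*(-50) = 40, which beats the sure 20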
Benefits of Decision Trees
Decision trees provide a visual portrayal of sequential decisions, i.e. they picture a series of chronological decisions. They are universal, they make the structure of the decision process more explicit, and they facilitate communication among those solving the decision problem. Decision trees force the decision maker to consider all consequences of his decisions. Constructing and analyzing decision trees by computer makes it possible to experiment with them and to quickly establish the impact of changes in the input parameters of the tree on the choice of the best policy.
Limitations of decision trees:
1. Only one decision criterion can be considered.
2. The decision tree is an abstraction and simplification of the real problem. Only the
important decisions and events are included.
3. Managers cannot use decision trees if the chance event outcomes are continuous. Instead,
they must redefine the outcomes so that there is a finite set of possibilities.
The most significant result of the analysis of a decision tree is the choice of the best alternative in the first stage of the decision process. After this stage, changes in the decision situation may occur and additional information may be obtained, so it is usually necessary to update the decision tree and determine a new optimal strategy. This procedure is required before every further stage.
Theoretical studies have revealed that decision theory is a formal study of rational decision
making formed largely by the joint efforts of mathematicians, philosophers, social scientists,
economists, statisticians and management scientists (Jeffrey 1992). While decision theory has a history of applications to real-world problems in many disciplines, including economics, risk analysis, business management, and theoretical behavioral ecology, it has more recently gained acknowledgment as a beneficial approach to conservation in the last 20 years (Maguire 1986).
To summarize, decision theory is a structure of logical and mathematical concepts intended to assist managers in formulating rules that may lead to the most beneficial course of action under the given circumstances. Decision theory divides decisions into three categories: decisions under certainty, where a manager has far too much information to choose the best alternative; decisions under conflict, where a manager has to anticipate the moves and countermoves of one or more competitors; and decisions under uncertainty, where a manager has to dig up a lot of data to make sense of what is going on and where it is leading. It is established that decision theory can be applied to conditions of certainty, risk, or uncertainty. Decision theory recognizes that the ranking produced by using a criterion has to be consistent with the decision maker's objectives and preferences.
