Unit 1 ML
Machine learning is an application of artificial intelligence (AI) that provides systems the ability to
automatically learn and improve from experience without being explicitly programmed. Machine
learning focuses on the development of computer programs that can access data and use it to learn for
themselves.
Learning System
Types of Learning
Supervised Learning is learning in which the process is guided by a teacher. We have a dataset which acts as the teacher, and its role is to train the model or the machine. Once the model is trained, it can start making predictions or decisions when new data is given to it.
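A minimal sketch of this idea, assuming scikit-learn is available; the two-feature fruit dataset and its labels are made up for illustration.

```python
# Supervised learning sketch: labelled examples act as the "teacher",
# then the trained model predicts the label of unseen data.
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical features: [weight_in_grams, colour_score]; labels supervise the learning.
X_train = [[150, 0.9], [170, 0.8], [130, 0.2], [120, 0.1]]
y_train = ["apple", "apple", "mango", "mango"]

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)            # training guided by the labels

print(model.predict([[160, 0.85]]))    # prediction on new, unseen data
```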
Unsupervised Learning
In unsupervised learning, the model learns through observation and finds structures in the data. Once the model is given a dataset, it automatically finds patterns and relationships in the dataset by creating clusters in it. What it cannot do is add labels to the clusters: it cannot say this is a group of apples or mangoes, but it will separate all the apples from the mangoes.
Suppose we presented images of apples, bananas and mangoes to the model. Based on some patterns and relationships, it creates clusters and divides the dataset into those clusters. Now if new data is fed to the model, it adds it to one of the created clusters.
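A minimal sketch of this clustering behaviour, assuming scikit-learn is available; the feature values below are made up, and the cluster indices carry no labels such as "apple" or "mango".

```python
# Unsupervised learning sketch using k-means clustering.
from sklearn.cluster import KMeans

# Hypothetical two-feature data points (no labels are given to the model).
X = [[150, 0.9], [155, 0.85], [120, 0.1], [118, 0.15], [300, 0.5], [310, 0.55]]

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)                  # cluster index for each point (no names attached)
print(kmeans.predict([[152, 0.88]]))   # new data is assigned to one of the created clusters
```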
What is Reinforcement Learning?
It is the ability of an agent to interact with the environment and find out what the best outcome is. It follows the concept of trial and error. The agent is rewarded or penalized with a point for a correct or a wrong answer, and based on the positive reward points gained, the model trains itself. Again, once trained, it is ready to predict when new data is presented to it.
Definition:
A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with experience
E.
• Whether the training experience provides direct or indirect feedback regarding the choices
made by the performance system:
• Example:
– Indirect training examples consist of the move sequences and final outcomes of various games played, in which information about the correctness of specific moves early in the game must be inferred indirectly from the fact that the game was eventually won or lost – the credit assignment problem.
• The degree to which the learner controls the sequence of training examples:
• Example:
– The learner might rely on the teacher to select informative board states and to
provide the correct move for each
– The learner might itself propose board states that it finds particularly confusing and
ask the teacher for the correct move. Or the learner may have complete control over
the board states and (indirect) classifications, as it does when it learns by playing
against itself with no teacher present.
• How well it represents the distribution of examples over which the final system performance
P must be measured: In general learning is most reliable when the training examples follow
a distribution similar to that of future test examples.
• Example:
– If the training experience in playing checkers consists only of games played against itself, the learner might never encounter certain crucial board states that are very likely to be played by the human checkers champion. (Note, however, that most current machine learning theory rests on the crucial assumption that the distribution of training examples is identical to the distribution of test examples.)
• To determine what type of knowledge will be learned and how this will be used by the
performance program:
• Example:
– In playing checkers, the program needs to learn to choose the best move among the legal moves: ChooseMove: B -> M, which accepts as input any board from the set of legal board states B and produces as output some move from the set of legal moves M.
• Since a target function such as ChooseMove turns out to be very difficult to learn given the kind of indirect training experience available to the system, an alternative target function is an evaluation function that assigns a numerical score to any given board state, V: B -> R.
• Given the ideal target function V, we choose a representation that the learning system will use to describe the function V' that it will learn:
• Example:
– In playing checkers, V' can be represented as a linear function of board features, V'(b) = w0 + w1x1(b) + ... + wnxn(b), where each feature xi(b) counts some property of the board (for example, the number of black or red pieces or kings) and the wi are numerical weights to be learned.
• Each training example is given by <b, Vtrain(b)>, where Vtrain(b) is the training value for a board b.
• Adjusting the weights: To specify the learning algorithm for choosing the weights wi that best fit the set of training examples {<b, Vtrain(b)>}, we minimize the squared error E between the training values and the values predicted by the hypothesis V':
E = ∑ (Vtrain(b) − V'(b))², where the sum is over the training examples <b, Vtrain(b)>.
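A minimal sketch of one way to tune the weights, assuming the linear representation V'(b) = w0 + w1x1(b) + ... + wnxn(b) above; the LMS-style update rule, learning rate and the toy feature vectors are illustrative choices, not a prescribed implementation.

```python
# LMS-style weight tuning for the checkers evaluation function V'(b).
# The board features x(b) and the training values Vtrain(b) below are made-up numbers.
ETA = 0.1  # learning rate

def v_hat(weights, features):
    """Current estimate V'(b) for a board described by its feature vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train):
    """One LMS step: w_i <- w_i + eta * (Vtrain(b) - V'(b)) * x_i."""
    error = v_train - v_hat(weights, features)
    weights[0] += ETA * error                      # bias term uses x0 = 1
    for i, x in enumerate(features, start=1):
        weights[i] += ETA * error * x
    return weights

# Hypothetical training pairs <b, Vtrain(b)>, with b given as a feature vector.
training = [([3, 0, 1], 100.0), ([0, 3, 0], -100.0)]
w = [0.0, 0.0, 0.0, 0.0]
for features, v_train in training:
    w = lms_update(w, features, v_train)
print(w)
```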
The Final Design
• Performance System: To solve the given performance task by using the learned target function(s). It takes an instance of a new problem (a new game) as input and produces a trace of its solution (the game history) as output.
• Critic: To take as input the history or trace of the game and produce as output a set of training
examples of the target function.
• Generalizer: To take as input the training examples and produce an output hypothesis that is
its estimate of the target function. It generalizes from the specific training examples,
hypothesizing a general function that covers these examples and other cases beyond the
training examples.
• Experiment Generator: To take as input the current hypothesis (the currently learned function) and output a new problem (i.e., an initial board state) for the Performance System to explore. Its role is to pick new practice problems that will maximize the learning rate of the overall system.
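A schematic sketch of how these four modules could be wired into one training cycle; every function below is a made-up placeholder (the "game" is a trivial number-guessing stand-in, not checkers), intended only to show the flow of control between the modules.

```python
import random

def generate_problem(hypothesis):            # Experiment Generator: pick a new practice problem
    return random.randint(0, 10)

def play_game(state, hypothesis):            # Performance System: solve it with the learned function
    return [(state, hypothesis)]             # a trivial one-step "game trace"

def make_examples(trace):                    # Critic: turn the trace into training pairs
    TARGET = 5.0                             # made-up "true" value the critic can infer
    return [(state, TARGET) for state, _ in trace]

def fit_hypothesis(examples, old):           # Generalizer: fit a new hypothesis to the examples
    mean = sum(v for _, v in examples) / len(examples)
    return old + 0.5 * (mean - old)

hypothesis = 0.0
for _ in range(20):                          # the overall learning loop
    state = generate_problem(hypothesis)
    trace = play_game(state, hypothesis)
    examples = make_examples(trace)
    hypothesis = fit_hypothesis(examples, hypothesis)

print(hypothesis)                            # moves toward the critic's target value
```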
History of ML
• Pattern Recognition
• Quinlan’s ID3
• NLP
• Data Mining
• Text Learning
• Reinforcement Learning
• Ensembles
An Artificial Neural Network (ANN) has hidden layers which are used to respond to more complicated
tasks than the earlier perceptrons could. ANNs are a primary tool used for Machine Learning. Neural
networks use input and output layers and, normally, include a hidden layer (or layers) designed to
transform the input into data that can be used by the output layer. The hidden layers are excellent for
finding patterns too complex for a human programmer to detect, meaning a human could not find the
pattern and then teach the device to recognize it.
An Artificial Neural Network (ANN) is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve a specific problem.
The following diagram represents the general model of an ANN, which is inspired by a biological neuron. It is also called a perceptron.
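A minimal sketch of the perceptron just described: a weighted sum of inputs passed through a threshold activation. The weights and bias below are hand-picked so that the unit computes a logical AND; they are illustrative, not learned.

```python
# Perceptron sketch: weighted sum of inputs followed by a step activation.
def perceptron(inputs, weights, bias):
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation >= 0 else 0

# Example: a perceptron computing logical AND of two binary inputs.
w, b = [1.0, 1.0], -1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, perceptron(x, w, b))
```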
Clustering of this kind has many practical applications:
Marketing: It can be used to characterize and discover customer segments for marketing purposes.
Biology: It can be used for classification among different species of plants and animals.
Libraries: It is used for clustering different books on the basis of topics and information.
Insurance: It is used to group customers and their policies and to identify frauds.
City Planning: It is used to make groups of houses and to study their values based on their geographical locations and other factors.
Earthquake studies: By learning the earthquake-affected areas, we can determine the dangerous zones.
Reinforcement Learning
Reinforcement learning is an area of machine learning. It is about taking suitable actions to maximize reward in a particular situation. It is employed by various software and machines to find the best possible behavior or path to take in a specific situation. Reinforcement learning differs from supervised learning in that, in supervised learning, the training data comes with the answer key, so the model is trained with the correct answer itself, whereas in reinforcement learning there is no answer: the reinforcement agent decides what to do to perform the given task. In the absence of a training dataset, it is bound to learn from its experience.
Example: We have an agent and a reward, with many hurdles in between. The agent is supposed to find the best possible path to reach the reward.
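A minimal sketch of this trial-and-error idea using tabular Q-learning on a one-dimensional track (the hurdles are omitted for brevity); the environment, the reward of 1 at the goal cell and all learning parameters are made-up illustrative choices.

```python
import random

N_STATES, GOAL = 6, 5
ALPHA, GAMMA, EPSILON, EPISODES = 0.5, 0.9, 0.2, 500
ACTIONS = [-1, +1]                              # move left or move right

Q = [[0.0, 0.0] for _ in range(N_STATES)]       # Q[state][action]

for _ in range(EPISODES):
    state = 0
    while state != GOAL:
        if random.random() < EPSILON:           # explore (trial and error)
            a = random.randrange(2)
        else:                                   # exploit the best known action
            a = max((0, 1), key=lambda i: Q[state][i])
        nxt = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if nxt == GOAL else 0.0    # reward only when the goal is reached
        Q[state][a] += ALPHA * (reward + GAMMA * max(Q[nxt]) - Q[state][a])
        state = nxt

# After training, the greedy policy should choose "move right" (index 1) everywhere.
print([max((0, 1), key=lambda i: Q[s][i]) for s in range(GOAL)])
```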
Decision tree learning is one of the predictive modelling approaches used in statistics, data
mining and machine learning. It uses a decision tree (as a predictive model) to go from observations
about an item (represented in the branches) to conclusions about the item's target value (represented
in the leaves). Tree models where the target variable can take a discrete set of values are
called classification trees; in these tree structures, leaves represent class labels and branches
represent conjunctions of features that lead to those class labels. Decision trees where the target
variable can take continuous values (typically real numbers) are called regression trees. Decision trees
are among the most popular machine learning algorithms given their intelligibility and simplicity.
• Decision tree learning is one of the most widely used and practical methods for inductive
inference.
– Its inductive bias is a preference for small trees over large trees.
• The decision tree algorithms such as ID3 and C4.5 are very popular inductive inference algorithms, and they have been successfully applied to many learning tasks.
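A minimal sketch of decision tree learning, assuming scikit-learn is available; the tiny weather-style dataset is made up, and scikit-learn's entropy-based classifier stands in for ID3/C4.5 rather than reproducing them exactly.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical features: [outlook (0=sunny, 1=rain), humidity (0=normal, 1=high)]
X = [[0, 1], [0, 0], [1, 0], [1, 1]]
y = ["no", "yes", "yes", "no"]       # target class labels at the leaves

tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)
print(export_text(tree, feature_names=["outlook", "humidity"]))  # branches = feature tests
print(tree.predict([[0, 0]]))        # classify a new observation
```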
Bayesian Networks
Bayesian networks are a type of Probabilistic Graphical Model that can be used to build models from
data and/or expert opinion.
They can be used for a wide range of tasks including prediction, anomaly detection, diagnostics,
automated insight, reasoning, time series prediction and decision making under uncertainty. The figure below shows these capabilities in terms of the four major analytics disciplines: descriptive analytics, diagnostic analytics, predictive analytics and prescriptive analytics.
They are also commonly referred to as Bayes nets, Belief networks and sometimes Causal networks.
A Bayes net is a model. It reflects the states of some part of a world that is being modeled and it
describes how those states are related by probabilities. The model might be of your house, or your
car, your body, your community, an ecosystem, a stock-market, etc. Absolutely anything can be
modeled by a Bayes net. All the possible states of the model represent all the possible worlds that can
exist, that is, all the possible ways that the parts or states can be configured. The car engine can be running normally or giving trouble. Its tires can be inflated or flat. Your body can be sick or healthy, and so on.
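A minimal sketch of a Bayes net, using the classic rain/sprinkler/wet-grass example with made-up probability tables; the query is answered by brute-force enumeration rather than a dedicated inference library.

```python
# A tiny Bayes net: Rain -> Sprinkler, and (Rain, Sprinkler) -> WetGrass.
P_rain = {True: 0.2, False: 0.8}                        # P(Rain)
P_sprinkler = {True: {True: 0.01, False: 0.99},         # P(Sprinkler | Rain=True)
               False: {True: 0.4, False: 0.6}}          # P(Sprinkler | Rain=False)
P_wet_true = {(True, True): 0.99, (True, False): 0.9,   # P(Wet=True | Sprinkler, Rain)
              (False, True): 0.8, (False, False): 0.0}

def joint(rain, sprinkler, wet):
    """Joint probability P(Rain, Sprinkler, Wet) via the network's chain rule."""
    p_wet = P_wet_true[(sprinkler, rain)]
    return P_rain[rain] * P_sprinkler[rain][sprinkler] * (p_wet if wet else 1 - p_wet)

# Diagnostic query: how likely is rain, given that the grass is observed to be wet?
numerator = sum(joint(True, s, True) for s in (True, False))
denominator = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(numerator / denominator)                          # P(Rain=True | Wet=True)
```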
Support Vector Machine (SVM) is a relatively simple Supervised Machine Learning Algorithm used for
classification and/or regression. It is preferred for classification but is sometimes very useful for
regression as well. Basically, SVM finds a hyper-plane that creates a boundary between the types of
data. In 2-dimensional space, this hyper-plane is nothing but a line.
In SVM, we plot each data item in the dataset in an N-dimensional space, where N is the number of features/attributes in the data. Next, we find the optimal hyperplane that separates the data. From this it follows that, inherently, SVM can only perform binary classification (i.e., choose between two classes). However, there are various techniques to use for multi-class problems.
SVM works very well without any modifications for linearly separable data. Linearly Separable Data is
any data that can be plotted in a graph and can be separated into classes using a straight line.
We use Kernelized SVM for non-linearly separable data. Say, we have some non-linearly separable
data in one dimension. We can transform this data into two-dimensions and the data will become
linearly separable in two dimensions. This is done by mapping each 1-D data point to a
corresponding 2-D ordered pair.
So for any non-linearly separable data in any dimension, we can just map the data to a higher
dimension and then make it linearly separable. This is a very powerful and general transformation.
A kernel is nothing but a measure of similarity between data points. The kernel function in a kernelized SVM tells you, given two data points in the original feature space, what the similarity is between the points in the newly transformed feature space.
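A minimal sketch contrasting a linear SVM with a kernelized (RBF) SVM on non-linearly separable data, assuming scikit-learn is available; the synthetic concentric-circles dataset and the parameters are illustrative choices.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not separable by a straight line in the original 2-D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)

print("linear kernel accuracy:", linear_svm.score(X, y))  # poor: data not linearly separable
print("RBF kernel accuracy:   ", rbf_svm.score(X, y))     # near 1.0 after the implicit mapping
```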
Genetic Algorithms
Genetic Algorithms (GAs) are adaptive heuristic search algorithms that belong to the larger class of evolutionary algorithms. Genetic algorithms are based on the ideas of natural selection and genetics. They are an intelligent exploitation of random search, provided with historical data, to direct the search into the region of better performance in the solution space. They are commonly used to generate high-quality solutions for optimization problems and search problems.
Genetic algorithms simulate the process of natural selection, which means that those species which can adapt to changes in their environment are able to survive, reproduce and go on to the next generation. In simple words, they simulate “survival of the fittest” among individuals of consecutive generations to solve a problem. Each generation consists of a population of individuals, and each individual represents a point in the search space and a possible solution. Each individual is represented as a string of characters/integers/floats/bits. This string is analogous to the chromosome.
Genetic algorithms are based on an analogy with the genetic structure and behavior of chromosomes in a population. The following is the foundation of GAs, based on this analogy:
• Those individuals who are successful (fittest) mate to create more offspring than others.
• Genes from the “fittest” parents propagate throughout the generation; that is, sometimes parents create offspring which are better than either parent.
Search space
The population of individuals is maintained within the search space. Each individual represents a solution in the search space for the given problem. Each individual is coded as a finite-length vector (analogous to a chromosome) of components. These variable components are analogous to genes. Thus, a chromosome (individual) is composed of several genes (variable components).
Fitness Score
A fitness score is given to each individual, which shows the ability of that individual to “compete”. Individuals having an optimal (or near-optimal) fitness score are sought.
A GA maintains a population of n individuals (chromosomes/solutions) along with their fitness scores. The individuals having better fitness scores are given more chances to reproduce than others. The individuals with better fitness scores are selected to mate and produce better offspring by combining the chromosomes of the parents. The population size is static, so room has to be created for new arrivals. Thus, some individuals die and get replaced by new arrivals, eventually creating a new generation once all the mating opportunities of the old population are exhausted. It is hoped that over successive generations better solutions will arrive while the least fit die out.
Each new generation has, on average, more “good genes” than the individuals (solutions) of previous generations. Thus, each new generation has better “partial solutions” than previous generations. Once the offspring produced show no significant difference from the offspring produced by previous populations, the population has converged. The algorithm is then said to have converged to a set of solutions for the problem.
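A minimal sketch of a genetic algorithm on the toy “OneMax” problem, where each chromosome is a bit string and the fitness score is simply the number of 1s; the population size, rates and selection/crossover/mutation choices below are one simple illustrative configuration.

```python
import random

LENGTH, POP_SIZE, GENERATIONS, MUTATION_RATE = 20, 30, 40, 0.01

def fitness(chrom):
    return sum(chrom)                            # number of 1s in the bit string

def select(population):
    # Tournament selection: fitter individuals get more chances to reproduce.
    return max(random.sample(population, 3), key=fitness)

def crossover(p1, p2):
    point = random.randrange(1, LENGTH)          # single-point crossover
    return p1[:point] + p2[point:]

def mutate(chrom):
    return [1 - g if random.random() < MUTATION_RATE else g for g in chrom]

population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP_SIZE)]      # the new generation replaces the old

best = max(population, key=fitness)
print(fitness(best), best)                       # converges towards the all-ones string
```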
Issues in Machine Learning
• What algorithms exist for learning general target functions from specific training examples?
• When and how can prior knowledge held by the learner guide the process of generalizing from examples?
• What is the best strategy for choosing a useful next training experience, and how does the choice of this strategy alter the complexity of the learning problem?
• What is the best way to reduce the learning task to one or more function approximation problems?
• How can the learner automatically alter its representation to improve its ability to represent and learn the target function?