
Chapter 5

Learning

What is Learning?
– Learning is memorizing and remembering
  • A telephone number
  • Reading a textbook
  • Understanding a language (memorizing grammar & practicing)
  • Recognizing faces, signatures & fraudulent credit card transactions
– Learning is improving motor skills
  • Riding a bike
  • Exercising and practicing a skill
– Learning is understanding the strategy & rules of a game
  • Playing chess or football
– Learning is abstraction and exploration
  • Developing a scientific theory
  • Undertaking research
What is Learning?
• Learning is one of the keys to human intelligence. Do you
agree?
• The idea behind learning is that percepts should not only be
used for acting now, but also for improving the agent's
ability to act in the future.
– Learning is essential for unknown environments, i.e. when the agent
lacks knowledge. It enables the agent to organize new knowledge into
general, effective representations.
– Learning modifies the agent's decision-making mechanisms to
improve performance.

– Learning is nothing but
  • Feature extraction
  • Classification
Feature extraction
[Figure: hair length as a feature; male (y = 0 cm) vs. female (y = 30 cm)]

Task: to extract features which are good for classification.


Good features:
• Objects from the same class have similar feature values.
• Objects from different classes have different values.

[Figure: "good" features vs. "bad" features]


Classification
• Applications
  – Character recognition: different printing styles
  – Speech recognition: use of a dictionary or the syntax of the language
• Example: credit scoring
  – Differentiating between low-risk & high-risk customers
    from their income & savings

  Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk
                ELSE high-risk
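The discriminant above can be sketched directly in code. The threshold values for θ1 and θ2 below are hypothetical, since the slide leaves them abstract:

```python
# Hypothetical thresholds; the slide leaves theta1 and theta2 unspecified.
THETA1 = 30_000  # income threshold (assumed for illustration)
THETA2 = 5_000   # savings threshold (assumed for illustration)

def credit_risk(income, savings):
    """Apply the discriminant: low-risk iff both thresholds are exceeded."""
    if income > THETA1 and savings > THETA2:
        return "low-risk"
    return "high-risk"
```

A customer must exceed both thresholds to be classified low-risk; failing either one yields high-risk.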
Face Recognition
• Learning is training (adaptation) from a data set
  – Training examples of a person and test images
• Face recognition is challenging because of the effect of facial
  expression, lighting, occlusion, make-up, hair style, etc.
Learning Agents

The Basic Learning Model
• A computer program is said to learn from experience E
with respect to some class of tasks T and performance
measure P,
– if its performance at tasks T, as measured by P, improves with
experience E.
• Learning agents consist of four main components:
–Learning element – the part of the agent responsible for
improving its performance
–Performance element – the part that chooses the actions to take
–Critic – provides feedback to the learning element on how the
agent is doing with respect to a performance standard
–Problem generator – suggests actions that could lead to new,
informative experiences (suboptimal from the point of view of
the performance element, but designed to improve that element)
Types of Learning
• Supervised learning: occurs when a set of input/output pairs
is explicitly presented to the agent by a teacher
– The teacher provides a category label for each pattern in a training
set, then the learning algorithm finds a rule that does a good job of
predicting the output associated with a new input.
• Unsupervised learning: Learning when there is no information
about what the correct outputs are.
– In unsupervised learning or clustering there is no explicit teacher,
the system forms clusters or natural groupings of the input patterns.
• Reinforcement learning: an agent interacting with the world
makes observations, takes actions, & is rewarded or punished; it
should learn to choose actions so as to maximize reward.
– The agent is given an evaluation of its action, but not told the
correct action. Reward strengthens the likelihood of the action.
Typically, the environment is assumed to be stochastic.
Learning—A Two-Step Process
• Model construction:
– A training set is used to create the model.
– The model is represented as classification rules,
decision trees, or mathematical formulae

• Model usage:
– A test set is used to see how well the model works for
classifying future or unknown objects

Step 1: Model Construction

[Diagram: Training Data → Classification Algorithm → Classifier (Model)]

NAME   RANK            YEARS   TENURED
Mike   Assistant Prof  3       no
Mary   Assistant Prof  7       yes
Bill   Professor       2       yes
Jim    Associate Prof  7       yes
Dave   Assistant Prof  6       no
Anne   Associate Prof  3       no

IF rank = 'professor' OR years > 6
THEN tenured = 'yes'
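The learned rule can be checked against the training set in a few lines. This is a sketch: the rule and data are taken from the slide, and the helper name is ours:

```python
# Training set from the slide: (name, rank, years, tenured).
training = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor", 2, "yes"),
    ("Jim", "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]

def predict_tenured(rank, years):
    # IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
    return "yes" if rank == "Professor" or years > 6 else "no"
```

The rule reproduces the label of every one of the six training records, which is exactly what model construction aims for.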
Step 2: Using the Model in Prediction

[Diagram: Testing Data / Unseen Data → Classifier (Model)]

Testing data:
NAME     RANK            YEARS   TENURED
Tom      Assistant Prof  2       no
Merlisa  Associate Prof  7       no
George   Professor       5       yes
Joseph   Assistant Prof  7       yes

Unseen data: (Jeff, Professor, 4) → Tenured?
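Applying the rule learned in Step 1 to the test set above shows how model usage works, a sketch using the slide's data:

```python
# The rule learned in Step 1.
def predict_tenured(rank, years):
    return "yes" if rank == "Professor" or years > 6 else "no"

# Test set from the slide: (name, rank, years, actual label).
test_set = [
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]

# Count how many test labels the model reproduces.
correct = sum(predict_tenured(r, y) == t for _, r, y, t in test_set)
```

The model misclassifies Merlisa (7 years but not tenured), so it gets 3 of 4 test records right, and it predicts "yes" for the unseen record (Jeff, Professor, 4).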
Metrics for Performance Evaluation

                        PREDICTED CLASS
                        Class=Yes   Class=No
ACTUAL   Class=Yes      a (TP)      b (FN)
CLASS    Class=No       c (FP)      d (TN)

• Most widely-used metric:
  – To measure the performance of the model in general:

    Accuracy = (a + d) / (a + b + c + d) × 100
             = (TP + TN) / (TP + TN + FP + FN) × 100

  – To measure the performance of the model on each
    class: recall, precision and F-measure
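The accuracy formula translates directly into a small helper, a sketch using the cell names from the matrix above:

```python
def accuracy(a, b, c, d):
    """Overall accuracy in percent; a = TP, b = FN, c = FP, d = TN."""
    return (a + d) / (a + b + c + d) * 100
```

For example, with 50 true positives, 10 false negatives, 5 false positives and 35 true negatives, the model is 85% accurate overall.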
Confusion Matrix
Observe the following confusion matrix:

            Classified as A   Classified as B
Actual A    7                 2
Actual B    3                 2

The above confusion matrix can be used to calculate TP, FP,
precision, recall and F-measure, and also the accuracy of the system.
Learning methods
• There are various learning methods. Popular
learning techniques include the following.
– Decision tree: divides the decision space into piecewise
constant regions
– Neural networks: partition the space by non-linear boundaries
– Bayesian network: a probabilistic model
– Regression: linear or any other polynomial
– Support vector machine
– Expectation-maximization algorithm

NEURAL NETWORKS

Brain vs. Machine
• The Brain
– Pattern Recognition
– Association
– Complexity
– Noise Tolerance

• The Machine
– Calculation
– Precision
– Logic
Features of the Brain
• There are around ten billion (10^10) neurons in our brain
• Neuron switching time is ~10^-3 seconds
• Face recognition takes ~0.1 seconds
• On average, each neuron has several thousand connections
• Hundreds of operations per second
• High degree of parallel computation
• Distributed representations
• Neurons die off frequently (and are never replaced)
• Problems are compensated for by massive parallelism

Neural Network
• A neural network is represented as a layered set of interconnected
processors. These processor nodes are modeled loosely on the neurons of
the brain. Each node has a weighted connection to several other nodes in
adjacent layers. Individual nodes take the input received from connected
nodes and use the weights to compute output values.

• The inputs are fed simultaneously into the input layer.
• The weighted outputs of these units are fed into the hidden layer.
• The weighted outputs of the last hidden layer are inputs to units
making up the output layer.

Architecture of Neural network
• Neural networks are used to look for patterns in data, learn these
patterns, and then classify new patterns & make forecasts
• A network with only an input and an output layer is called a
single-layered neural network, whereas a multilayer neural network
also has one or more hidden layers.
– A network containing two hidden layers is called a three-layer neural
network, and so on.
Single-layered NN:  o = σ( Σ_{i=1}^{n} w_i x_i )

Sigmoid activation: σ(y) = 1 / (1 + e^(-y))

Multilayer NN: input nodes → hidden nodes → output nodes
A Multilayer Neural Network
• INPUT: records with a class attribute and normalized
attribute values.
– INPUT VECTOR: X = {x1, x2, …, xm},
where m is the number of attributes.
– INPUT LAYER – there are as many nodes as attributes,
i.e. as the length of the input vector.
• HIDDEN LAYER – neither its input nor its output can be
observed from outside.
– The number of nodes in the hidden layer and the number of
hidden layers depend on the implementation.
• OUTPUT LAYER – corresponds to the class attribute.
– There are as many nodes as classes (values of the class
attribute): Ok, where k = 1, 2, …, n and n is the number of classes

[Diagram: input layer → hidden layer → output layer]
Neuron with Activation
• ANN is an electronic network of neurons based on the neural
structure of the brain
• The neuron is the basic information processing unit of a NN. It
consists of:
1. A set of links, describing the neuron inputs, with weights W1,
W2, …, Wm
2. An adder function (linear combiner): computes the weighted sum
of the inputs (real numbers):

   y = Σ_{j=1}^{m} w_j x_j

3. Activation function: limits the output behavior of the neuron,
e.g. the sigmoid σ(y) = 1 / (1 + e^(-y)),
where e ≈ 2.718281828459045235
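The two steps above, the adder and the sigmoid activation, can be sketched as a single function (the name `neuron_output` is ours):

```python
import math

def neuron_output(x, w):
    """One neuron: weighted sum of inputs, then sigmoid activation."""
    # Adder (linear combiner): y = sum_j w_j * x_j
    y = sum(wj * xj for wj, xj in zip(w, x))
    # Activation: the logistic sigmoid limits the output to (0, 1)
    return 1.0 / (1.0 + math.exp(-y))
```

With all-zero weights the weighted sum is 0 and the sigmoid returns exactly 0.5; any real-valued input is squashed into the open interval (0, 1).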
Two Topologies of neural network
• NN can be designed in a feed forward or recurrent
manner
• In a feed forward neural network connections
between the units do not form a directed cycle.
– In this network, the information moves in only one
direction, forward, from the input nodes, through the
hidden nodes (if any), to the output nodes. There are
no cycles, loops, or feedback connections in the
network, that is, no connections extending from outputs
of units to inputs of units in the same layer or
previous layers.
• In recurrent networks data circulates back &
forth until the activation of the units is
stabilized
– Recurrent networks have a feedback loop where data
can be fed back into the input at some point before it
is fed forward again for further processing and final
output.
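A feed-forward pass as described above can be sketched as follows, one hidden layer feeding one output layer, with no cycles (the weight-matrix layout is our choice, one weight list per receiving unit):

```python
import math

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def feed_forward(x, hidden_weights, output_weights):
    """One forward pass: input -> hidden -> output, no feedback loops."""
    # Each hidden unit takes a weighted sum of the inputs.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)))
              for ws in hidden_weights]
    # Each output unit takes a weighted sum of the hidden activations.
    return [sigmoid(sum(w * hi for w, hi in zip(ws, hidden)))
            for ws in output_weights]
```

Because information only flows forward, each call is a pure function of the inputs and weights; a recurrent network would instead feed activations back into earlier layers.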
Training the neural network
• The purpose is to learn to generalize using a set of sample
patterns where the desired output is known.
• Back Propagation is the most commonly used method for
training multilayer feed forward NN.
– Back propagation learns by iteratively processing a set of training
data (samples).
– For each sample, weights are modified to minimize the error
between the desired output and the actual output.
• After propagating an input through the network, the error
is calculated and the error is propagated back through the
network while the weights are adjusted in order to make
the error smaller.
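A minimal sketch of the weight-update idea: one gradient step for a single sigmoid output unit, minimizing the squared error between desired and actual output. Full back propagation applies the same chain-rule update layer by layer; the function names and learning rate here are our assumptions:

```python
import math

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def train_step(w, x, target, lr=0.5):
    """One weight update for a single sigmoid unit (output-layer case)."""
    # Forward pass: compute the actual output.
    o = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
    # Error term: gradient of 1/2 (target - o)^2 through the sigmoid.
    delta = (target - o) * o * (1 - o)
    # Adjust each weight in the direction that makes the error smaller.
    return [wj + lr * delta * xj for wj, xj in zip(w, x)]
```

Iterating this step pushes the unit's output toward the desired target, which is the error-minimization behavior the slide describes.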
Pros and Cons of Neural Network
• Useful for learning complex data like handwriting, speech and
image recognition
Pros:
+ Can learn more complicated class boundaries
+ Fast application
+ Can handle a large number of features

Cons:
- Slow training time
- Hard to interpret
- Hard to implement: trial and error for choosing the number of nodes

• Neural networks need a long time for training.
• Neural networks have a high tolerance to noisy and incomplete data.
• Conclusion: use neural nets only if decision trees fail.
