Learning
Outline
• What is Learning?
• Rote learning
• Induction
• Decision Trees
• Neural Network
What is Learning?
• One of the most often heard criticisms of AI is that machines cannot be called intelligent until they are able to learn to do new things and to adapt to new situations, rather than simply doing as they are told.
• Definition of Learning [Simon 1983]: changes in the system that are adaptive in the sense that they enable the system to do the same task, or tasks drawn from the same population, more efficiently and more effectively the next time.
• The classes from which the classification procedure can choose can be described in a
variety of ways.
• Their definition will depend on the use to which they are put.
• Consider a rule:
If: the current goal is to get from place A to place B, and there is a WALL separating the two places
Then: look for a DOORWAY in the wall and go through it
• To use this rule successfully, the system's matching routine must be able to identify an object as a wall; then, to apply the rule, the system must be able to recognize a doorway.
Concept learning
• The idea of producing a classification program that can evolve its own class definitions is called concept learning or induction.
• The techniques used for this task depend on the way that classes (concepts) are described.
• If classes are described by scoring functions, concept learning can be done using the technique of coefficient adjustment.
Learning Through Classification
Learning Algorithms
• Supervised
– Class information exists
[Figure: labeled training examples grouped into Class 1, Class 2, Class 3, and Class 4]
• Unsupervised
– No class information (see the sketch below)
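As a minimal illustration (the toy feature vectors below are hypothetical), supervised data pairs each example with a class label, while unsupervised data has features only:

# Hypothetical toy data illustrating the two settings (Python).
# Supervised: each feature vector comes with a class label.
supervised_data = [
    ([0.2, 0.7], "Class 1"),
    ([0.9, 0.1], "Class 2"),
    ([0.4, 0.5], "Class 1"),
]
# Unsupervised: feature vectors only; no class information exists.
unsupervised_data = [
    [0.2, 0.7],
    [0.9, 0.1],
    [0.4, 0.5],
]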
Learning Agents
• Learns a classifier from examples: Training stage
[Figure: a decision-tree classifier learned from examples, splitting on texture (Red/Blue) and shape (Three edges/Four edges) to reach Square and Circular classes]
Artificial Neural Network
[Figure: a neural network mapping input features x1, x2, x3, …, xm to output classes Class 1, Class 2, …, Class n; a "?" marks the unknown mapping to be learned]
ANN AS A LINEAR CLASSIFIER
• Consider the AND and OR functions.
• The decision boundary x1 + x2 - 1/2 = 0 separates the OR patterns.
[Figure: a neuron with inputs x1 and x2, weights 1 and 1, and bias -1/2, realizing this boundary]
ANN AS A LINEAR CLASSIFIER
• The decision boundary x1 + x2 - 3/2 = 0 separates the AND patterns.
[Figure: a neuron with inputs x1 and x2, weights 1 and 1, and bias -3/2, realizing this boundary]
ANN AS A LINEAR CLASSIFIER
[Figure: two perceptrons side by side; left: weights 1 and 1 with bias -3/2, a neuron realizing the AND function; right: weights 1 and 1 with bias -1/2, a neuron realizing the OR function]
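As a quick check, a minimal Python sketch (the helper names step and linear_unit are my own) evaluates both boundaries on the four binary input patterns:

def step(net):
    """Threshold activation: fire iff the net input is positive."""
    return 1 if net > 0 else 0

def linear_unit(x1, x2, w1, w2, bias):
    """A single neuron: weighted sum of the inputs plus bias, then threshold."""
    return step(w1 * x1 + w2 * x2 + bias)

for x1 in (0, 1):
    for x2 in (0, 1):
        and_out = linear_unit(x1, x2, 1, 1, -3/2)  # boundary x1 + x2 - 3/2 = 0
        or_out = linear_unit(x1, x2, 1, 1, -1/2)   # boundary x1 + x2 - 1/2 = 0
        print(f"({x1},{x2}): AND={and_out} OR={or_out}")

The AND unit fires only for (1,1), while the OR unit fires for every pattern except (0,0).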
A 2 INPUT PERCEPTRON
[Figure: a two-input perceptron; inputs x1 and x2 with weights w1 and w2 plus bias w0 feed a summation unit Σ followed by an activation function f]
Introduction of Perceptron
[Figure: two scatter plots of classes ω1 and ω2 in the (x1, x2) plane; left: an assumed initial linear classifier; right: the weights w are modified to improve the classifier]
Perceptron Training Process
1. Set up the network input nodes, layer(s), and connections.
2. Randomly assign the weights Wij and the biases Θj.
3. Compute the output of each node j:
   netj = Σi Wij•Xi - Θj
   Yj = 1 if netj > 0; Yj = 0 if netj ≤ 0
4. Training computation: if (Tj - Yj) ≠ 0, then update
   ΔWij = η•(Tj - Yj)•Xi
   ΔΘj = -η•(Tj - Yj)
   and repeat from step 3 until every training pattern gives the expected output. A sketch of this procedure in Python follows.
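A minimal sketch of steps 1-4 in Python, assuming the single-output, single-layer case used in the worked examples below (the function name train_perceptron is my own):

def step(net):
    """Threshold activation: Y = 1 if net > 0, else 0."""
    return 1 if net > 0 else 0

def train_perceptron(patterns, targets, weights, theta, eta=0.1, max_epochs=100):
    """Single-layer perceptron training (steps 3-4 of the process above).

    patterns: list of input vectors X; targets: expected outputs T;
    weights, theta: initial weights and threshold; eta: learning rate.
    """
    weights = list(weights)
    for _ in range(max_epochs):
        updated = False
        for X, T in zip(patterns, targets):
            net = sum(w * x for w, x in zip(weights, X)) - theta  # step 3
            Y = step(net)
            if T - Y != 0:                        # step 4: wrong output, update
                for i, x in enumerate(X):
                    weights[i] += eta * (T - Y) * x   # dWij = eta*(T-Y)*Xi
                theta += -eta * (T - Y)               # dTheta = -eta*(T-Y)
                updated = True
        if not updated:                           # all patterns satisfied
            return weights, theta
    return weights, theta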
Exercise 1: Solving the OR problem
[Figure: a two-input perceptron with inputs X1 and X2, weights W11 and W21, threshold Θ, and activation f]
• Let W11=1, W21=0.5, Θ=0.5, and let the learning rate η=0.1.
• The initial net function is:
  net = W11•X1 + W21•X2 - Θ
  i.e. net = 1•X1 + 0.5•X2 - 0.5
• Feed the input patterns into the network one by one:
  (0,0), net = -0.5, Y=0, (T-Y)=0, O.K.
  (0,1), net = 0, Y=0, (T-Y)=1-0=1, need to update weights
  (1,0), net = 0.5, Y=1, (T-Y)=0, O.K.
  (1,1), net = 1, Y=1, (T-Y)=0, O.K.
• Update weights for pattern (0,1):
  ΔW11 = (0.1)(1)(0) = 0, W11 = W11 + ΔW11 = 1
  ΔW21 = (0.1)(1)(1) = 0.1, W21 = 0.5 + 0.1 = 0.6
  ΔΘ = -(0.1)(1) = -0.1, Θ = 0.5 - 0.1 = 0.4
• Applying the new weights to the net function:
  net = 1•X1 + 0.6•X2 - 0.4
• Verify the pattern (0,1) to see if it satisfies the expected output:
  (0,1), net = 0.2, Y=1, (T-Y)=0, O.K.
• Feed the next input patterns, again, one by one:
  (1,0), net = 0.6, Y=1, (T-Y)=0, O.K.
  (1,1), net = 1.2, Y=1, (T-Y)=0, O.K.
• Since the first pattern (0,0) has not been tested with the new weights, feed it again:
  (0,0), net = -0.4, Y=0, (T-Y)=0, O.K.
• Now all the patterns are satisfied. Hence, the network is successfully trained for the OR patterns.
• The final network is: net = 1•X1 + 0.6•X2 - 0.4
[Figure: the trained OR perceptron; inputs X1 (weight 1) and X2 (weight 0.6), threshold Θ=0.4, activation f, output Y]
• Recall process:
  Once the network is trained, we can apply any two-element vector as a pattern and feed it into the network for recognition. For example, we can feed (1,0) into the network:
  (1,0), net = 0.6, Y=1
  Therefore, this pattern is recognized as class 1.
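Running the train_perceptron sketch from above on the OR patterns reproduces this worked example:

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 1]                       # OR truth table

weights, theta = train_perceptron(patterns, targets,
                                  weights=[1.0, 0.5], theta=0.5, eta=0.1)
print(weights, theta)   # -> [1.0, 0.6] 0.4, i.e. net = 1*X1 + 0.6*X2 - 0.4

# Recall: feed (1,0) into the trained network
net = weights[0] * 1 + weights[1] * 0 - theta
print(1 if net > 0 else 0)   # -> 1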
Exercise 2: Solving the AND problem
• Let the training patterns be used as follows:

  X1  X2  T
  0   0   0
  0   1   0
  1   0   0
  1   1   1

[Figure: the four patterns in the (X1, X2) plane, with a separating line f1]
Sol:
• Let W11=0.5, W21=0.5, Θ=1, and let η=0.1.
• The initial net function is:
  net = 0.5•X1 + 0.5•X2 - 1
• Feed the input patterns into the network one by one:
  (0,0), net = -1, Y=0, (T-Y)=0, O.K.
  (0,1), net = -0.5, Y=0, (T-Y)=0, O.K.
  (1,0), net = -0.5, Y=0, (T-Y)=0, O.K.
  (1,1), net = 0, Y=0, (T-Y)=1, need to update weights
• Update weights for pattern (1,1), which does not satisfy the expected output:
  ΔW11 = (0.1)(1)(1) = 0.1, W11 = 0.5 + 0.1 = 0.6
  ΔW21 = (0.1)(1)(1) = 0.1, W21 = 0.5 + 0.1 = 0.6
  ΔΘ = -(0.1)(1) = -0.1, Θ = 1 - 0.1 = 0.9
• Applying the new weights to the net function:
  net = 0.6•X1 + 0.6•X2 - 0.9
[Figure: the trained AND perceptron; inputs X1 and X2 with weights 0.6 and 0.6, threshold Θ=0.9]
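The same train_perceptron sketch reproduces this result for the AND patterns; a further pass over all four patterns then finds no more errors:

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]                       # AND truth table

weights, theta = train_perceptron(patterns, targets,
                                  weights=[0.5, 0.5], theta=1.0, eta=0.1)
print(weights, theta)   # -> [0.6, 0.6] 0.9, i.e. net = 0.6*X1 + 0.6*X2 - 0.9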
Example: Solving the XOR problem
• Let the training patterns be used as follows:

  X1  X2  T
  0   0   0
  0   1   1
  1   0   1
  1   1   0

[Figure: the four patterns P1 (0,0), P2 (0,1), P3 (1,0), P4 (1,1) in the (X1, X2) plane; no single line separates {P2, P3} from {P1, P4}, so two lines f1 and f2, combined by f3, are needed]
• Let W11=1.0, W21=-1.0, Θ=0.
• If we choose a one-layer network, it can be shown that the training will not converge. This is because XOR is not a linearly separable problem, i.e., one single linear function is not enough to recognize the patterns. Therefore, the solution is to add one hidden layer for extra functions.
[Figure: a two-layer network; inputs X1 and X2 feed hidden units f1 and f2 through weights W11, W21, W12, W22 (thresholds Θ1 and Θ2), and the outputs of f1 and f2 feed the output unit f3 (threshold Θ3)]
• The following pattern table is formed:

  X1  X2  f1  f2  T3
  0   0   0   0   0
  0   1   0   1   1
  1   0   0   1   1
  1   1   1   1   0
• So we solve f1 as an AND problem and f2 as an OR problem.
• Assume the weights we found are:
  W11=0.3, W21=0.3, W12=1, W22=1, Θ1=0.5, Θ2=0.2
• Therefore the network functions for f1 and f2 are:
  f1 = 0.3•X1 + 0.3•X2 - 0.5
  f2 = 1•X1 + 1•X2 - 0.2
• Now we feed the inputs one by one to train the network for f1 and f2 separately, so that f1 satisfies the expected output T1 and f2 satisfies T2.
• Finally, we use the outputs of f1 and f2 as the input pattern to train the network for f3. We can derive the following result:
  f3 = -1•f1 + 0.5•f2 - 0.1
[Figure: the complete XOR network; X1 and X2 feed f1 (weights 0.3 and 0.3, Θ1=0.5) and f2 (weights 1 and 1, Θ2=0.2); f1 and f2 feed f3 (weights -1 and 0.5, Θ3=0.1), whose output is Y]
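A minimal Python sketch wiring the three units together and checking the XOR truth table (the weights follow the example above; they are one workable choice, not the only one):

def step(net):
    return 1 if net > 0 else 0

def xor_network(x1, x2):
    """Two-layer network: f1 realizes AND, f2 realizes OR,
    and f3 combines them (output 1 iff f2 fires but f1 does not)."""
    f1 = step(0.3 * x1 + 0.3 * x2 - 0.5)      # hidden unit: AND
    f2 = step(1.0 * x1 + 1.0 * x2 - 0.2)      # hidden unit: OR
    return step(-1.0 * f1 + 0.5 * f2 - 0.1)   # output unit f3

for x1 in (0, 1):
    for x2 in (0, 1):
        print(f"({x1},{x2}) -> {xor_network(x1, x2)}")
# prints 0, 1, 1, 0: the XOR truth table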