Machine Learning Lecture 11
Second, all the weighted inputs are added together with a bias b:
(w · x) + b = w1 * x1 + w2 * x2 + b
Function             Value at x = 7    Value at x = -7
Unit Step            1                 0
Linear Function      7                 -7
Hyperbolic Tangent   0.9999            -1
Sigmoid              0.999             0
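As a quick check of these values, here is a minimal numpy sketch (the helper names and the weights/inputs are my own, chosen so that the weighted sum equals 7) that computes (w · x) + b and passes it through each activation:

import numpy as np

def unit_step(x):
    return 1 if x > 0 else 0

def linear(x):
    return x

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical weights and inputs chosen so that (w . x) + b = 7
w = np.array([0.5, 0.5])
x = np.array([10.0, 4.0])
b = 0.0
z = np.dot(w, x) + b   # 7.0

for value in (z, -z):
    print(value, unit_step(value), linear(value), np.tanh(value), sigmoid(value))
#  7.0 -> 1,  7.0,  0.99999..., 0.99909...
# -7.0 -> 0, -7.0, -0.99999..., 0.00091...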
Linear Classifier
The simple neuron can solve linearly separable problems such as OR.
The simple neuron (perceptron) cannot classify problems that are not linearly separable, such as
XOR.
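For illustration, here is a minimal sketch of a single step-activation neuron with hand-picked weights (hypothetical values of my own, not from the lecture) that reproduces OR; no choice of w1, w2, b can do the same for XOR, because XOR is not linearly separable:

def perceptron(x1, x2, w1=1.0, w2=1.0, b=-0.5):
    # Unit-step activation on the weighted sum (w . x) + b
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, perceptron(x1, x2))   # output matches OR(x1, x2)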
This network has 2 inputs, a hidden layer with 2 neurons (h1 and h2), and an output
layer with 1 neuron (o1).
Notice that the inputs for o1 are the outputs from h1 and h2.
A hidden layer is any layer between the input (first) layer and output (last) layer.
There can be multiple hidden layers! (This is called Deep Learning.)
Example
Given this neural network
w1 = w3 = 0, w2 = w4 = 1, and b = 0 for h1, h2, and o1
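As a worked sketch (assuming a sigmoid activation f and an example input x = (2, 3), which is not given in the slide), feedforward gives:

h1 = h2 = f(0*2 + 1*3 + 0) = f(3) ≈ 0.9526
o1 = f(0*h1 + 1*h2 + 0) = f(0.9526) ≈ 0.7216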
Second: we initialize w1, w2, w3, w4, w5, w6, b1, b2, b3 with random values.
Third: we determine the predicted Sex from the given Weight and Height as follows.
o1 is called y_pred.
Training a Neural Network
Loss
We first need a way to quantify how “good” the network is doing so that it can try to do “better”.
That’s what the loss is.
We’ll use the mean squared error (MSE) loss:
MSE = (1/n) * Σ (y_true − y_pred)²
Where
• n is the number of samples, which is 4
• y represents the variable being predicted, which is Sex.
• y_true is the true value of the variable (the “correct answer”).
• y_pred is the predicted value of the variable. It’s whatever our network outputs.
• (y_true − y_pred)² is known as the squared error.
Our loss function is simply taking the average over all squared errors.
The better our predictions are, the lower our loss will be!
Better predictions = Lower loss.
Training a network = trying to minimize its loss.
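A minimal numpy sketch of this loss (the function name mse_loss is my own):

import numpy as np

def mse_loss(y_true, y_pred):
    # Average of the squared errors over all n samples
    return ((y_true - y_pred) ** 2).mean()

# Example: y_true for the 4 samples, with all-zero predictions
y_true = np.array([1, 0, 0, 1])
y_pred = np.array([0.0, 0.0, 0.0, 0.0])
print(mse_loss(y_true, y_pred))   # 0.5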
Training a Neural Network
The process of calculating h1, h2, and o1 is called feedforward, where the calculation goes from the input layer to the
hidden layer and then to the output layer.
The process of modifying the weights (w1, w2, w3, …, w6) is called backpropagation, where the modifications
go from the output layer to the hidden layer and then to the input layer.
In feedforward we calculate:
1. h1 and h2
2. o1
Training loop (flowchart):
1. Determine the mean squared error (MSE) loss.
2. If the loss is acceptable, stop.
3. Otherwise, modify the weights and biases.
4. Repeat for the specified iterations; when all iterations are done, stop.
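To make the loop concrete, here is a runnable sketch of the same stop-when-acceptable-or-out-of-iterations structure on a toy one-weight model (my own example, not the lecture's network):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y_true = np.array([2.0, 4.0, 6.0, 8.0])       # toy targets: y = 2 * x
w = np.random.normal()                        # random initial weight
learn_rate = 0.01
acceptable_loss = 1e-4

for iteration in range(1000):                 # repeat for the specified iterations
    y_pred = w * x                            # feedforward
    loss = ((y_true - y_pred) ** 2).mean()    # MSE loss
    if loss <= acceptable_loss:               # stop when the loss is acceptable
        break
    dL_dw = (-2 * x * (y_true - y_pred)).mean()   # gradient of the loss w.r.t. w
    w = w - learn_rate * dL_dw                # modify the weight

print(w)   # close to 2.0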
First Step: initialize w1, w2, w3, w4, w5, w6, b1, b2, b3
# Weights
self.w1 = np.random.normal()   # w1 = -0.17078787256065822
self.w2 = np.random.normal()   # w2 = 0.8018910260223238
self.w3 = np.random.normal()   # w3 = 2.042028648489558
self.w4 = np.random.normal()   # w4 = 0.9472266245782457
self.w5 = np.random.normal()   # w5 = 0.11610745255156371
self.w6 = np.random.normal()   # w6 = -0.04474672280574983
# Biases
self.b1 = np.random.normal()   # b1 = 0.049210103523584146
self.b2 = np.random.normal()   # b2 = -0.7372822297715569
self.b3 = np.random.normal()   # b3 = 0.6148445824873799
Second Step: Determine Hidden values (h1, h2) and Output (o1)
h1 = f(x1), where x1 = w1 * weight + w2 * height + b1
h2 = f(x2), where x2 = w3 * weight + w4 * height + b2
o1 = f(x3), where x3 = w5 * h1 + w6 * h2 + b3
(f is the activation function)
w1 = -0.170    w2 = 0.802     b1 = 0.049
w3 = 2.042     w4 = 0.947     b2 = -0.737
w5 = 0.116     w6 = -0.045    b3 = 0.615
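A minimal numpy sketch of this step, assuming a sigmoid activation f (consistent with the y_pred values between 0 and 1) and a hypothetical pre-scaled (weight, height) input:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Weights and biases from the slide
w1, w2, b1 = -0.170, 0.802, 0.049
w3, w4, b2 = 2.042, 0.947, -0.737
w5, w6, b3 = 0.116, -0.045, 0.615

def feedforward(weight, height):
    h1 = sigmoid(w1 * weight + w2 * height + b1)
    h2 = sigmoid(w3 * weight + w4 * height + b2)
    o1 = sigmoid(w5 * h1 + w6 * h2 + b3)   # o1 is y_pred
    return o1

print(feedforward(-2.0, -1.0))   # hypothetical input values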
Determine mean squared error (MSE) loss
y_pred    y_true
0.65      1
0.67      0
0.67      0
0.65      1
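Plugging these four samples into the MSE formula as a check:
L = [(1 − 0.65)² + (0 − 0.67)² + (0 − 0.67)² + (1 − 0.65)²] / 4
  = (0.1225 + 0.4489 + 0.4489 + 0.1225) / 4
  ≈ 0.286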
We use partial differentiation to determine how much each weight affects the loss L, so that the
new weight = old weight − learning rate * ∂L/∂weight.
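For example, a single gradient-descent update in code (the learning rate and gradient value here are hypothetical):

learn_rate = 0.1
w1 = -0.170
dL_dw1 = 0.05                     # assumed value of the partial derivative dL/dw1
w1 = w1 - learn_rate * dL_dw1     # new weight = old weight - learning rate * dL/dw1
print(w1)                         # -0.175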
Third Step: Modify Weights And Biases
L = (y_true − y_pred)², where y_pred = o1
∂L/∂y_pred = 2(y_true − y_pred) * (−1) = −2(y_true − y_pred)
Third Step: Modify Weights And Biases
Imagine we wanted to tweak w1. How would loss L change if we changed w1?
That’s a question the partial derivative can answer. How do we calculate it?
To start, let’s rewrite the partial derivative in terms of ∂y_pred/∂w1 instead:
∂L/∂w1 = ∂L/∂y_pred * ∂y_pred/∂w1
where L = (1 − y_pred)², since y_true = 1 in this example.
Third Step: Modify Weights And Biases
We can break down ∂L/∂w1 into
several parts we can calculate. Since w1 only affects h1 (not h2), we can write ∂y_pred/∂w1 = ∂y_pred/∂h1 * ∂h1/∂w1.
So ∂L/∂w1 = ∂L/∂y_pred * ∂y_pred/∂h1 * ∂h1/∂w1
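Putting the three factors together, here is a minimal numpy sketch of ∂L/∂w1, assuming sigmoid activations (so f'(x) = f(x)(1 − f(x))) and a hypothetical (weight, height, y_true) sample:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def deriv_sigmoid(x):
    s = sigmoid(x)
    return s * (1 - s)

# Weights and biases from the slides
w1, w2, b1 = -0.170, 0.802, 0.049
w3, w4, b2 = 2.042, 0.947, -0.737
w5, w6, b3 = 0.116, -0.045, 0.615

weight, height, y_true = -2.0, -1.0, 1    # hypothetical sample

# Feedforward, keeping the pre-activation sums
sum_h1 = w1 * weight + w2 * height + b1
h1 = sigmoid(sum_h1)
sum_h2 = w3 * weight + w4 * height + b2
h2 = sigmoid(sum_h2)
sum_o1 = w5 * h1 + w6 * h2 + b3
y_pred = sigmoid(sum_o1)

# Chain rule: dL/dw1 = dL/dy_pred * dy_pred/dh1 * dh1/dw1
dL_dypred = -2 * (y_true - y_pred)
dypred_dh1 = w5 * deriv_sigmoid(sum_o1)
dh1_dw1 = weight * deriv_sigmoid(sum_h1)
dL_dw1 = dL_dypred * dypred_dh1 * dh1_dw1
print(dL_dw1)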