Building Neural Networks: A Hands-On Journey from Scratch with Python
by Long Nguyen
Our sample neural network consists of 2 input neurons, 1 hidden layer with 2
neurons, and an output layer with 2 neurons. Some initial weight and bias
values have been provided to help with the calculation. Assume the expected
outputs are 0.1 and 0.9:
1. Forward Propagation
Note: I’m using sigmoid as the activation function
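For reference, the sigmoid function and its derivative, which appear throughout the calculations below, are:

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \sigma'(x) = \sigma(x)\,(1 - \sigma(x))$$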
Hidden Layer
Hidden Neuron 1:
Hidden Neuron 2:
Output Layer
Output Neuron 1:
Output Neuron 2:
So, the Mean Squared Error (MSE) is approximately 0.2085. This is a measure
of the difference between the expected and actual outputs. A lower MSE
indicates a better fit of the model to the given data.
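For reference, the exact error convention used in the original figures isn't reproduced here, but it typically takes one of two closely related forms:

$$E_{MSE} = \frac{1}{n}\sum_{i}(t_i - a_{oi})^2 \qquad \text{or} \qquad E = \sum_{i}\tfrac{1}{2}(t_i - a_{oi})^2$$

The half-squared-error variant is popular in backpropagation walkthroughs because its derivative with respect to an output simplifies to $(a_{oi} - t_i)$, which is the form used in the gradient calculations below.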
2. Backward Propagation
Once predictions are obtained, we need to train the network by adjusting
weights and biases based on prediction errors. This is achieved through
backward propagation.
Assume we use a learning rate of 0.5.
The objective is to reduce the error function by moving the weights in the
direction opposite to its gradient, performing a ‘descent’ along the gradient, hence “gradient descent”.
Output Layer
Applying the chain rule gives the formula for the change in the error
function with respect to a small change in weight w5.
First: the error function and its derivative with respect to ao1.
Second: the derivative of the activation with respect to the weighted sum,
i.e. the derivative of the sigmoid function.
Lastly: the derivative of the weighted sum with respect to w5, which is ah1,
the output of the h1 neuron in the previous layer.
Putting them together, this is the gradient of the error function. Applying
gradient descent, we get a new value of weight w5 by reducing the weight by
the learning rate times the gradient.
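Written out with the notation used in this walkthrough (s for weighted sums, a for activations, t for targets, η for the learning rate), and assuming the half-squared-error form of the error function, the chain and the resulting update are:

$$\frac{\partial E}{\partial w_5} = \frac{\partial E}{\partial a_{o1}} \cdot \frac{\partial a_{o1}}{\partial s_{o1}} \cdot \frac{\partial s_{o1}}{\partial w_5} = (a_{o1} - t_{o1}) \cdot a_{o1}(1 - a_{o1}) \cdot a_{h1}$$

$$w_5^{new} = w_5 - \eta \cdot \frac{\partial E}{\partial w_5}$$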
Let’s generalise the formulas for the delta of an output neuron, and the
formula to update the weight in an output layer.
From this exercise, you should be able to derive the formula for updating
the bias on your own; it’s very similar to updating weights. Hint: the final
result doesn’t involve the previous-layer neuron’s output.
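As a sketch of that generalisation (again assuming sigmoid activations and the half-squared-error), for an output neuron o receiving activation a_h from the previous layer:

$$\delta_o = (a_o - t_o)\, a_o (1 - a_o), \qquad w^{new} = w - \eta\, \delta_o\, a_h$$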
Now let’s apply real numbers from the example to those equations to
calculate new weights w5, w6, w7, w8
Output Neuron 1:
Output Neuron 2:
Hidden Layer
Applying the chain rule again gives the formula for the change in the error
function with respect to a small change in weight w1.
This formula is a little more complicated because we’re further away from
the output, so a lot more “chaining” of functions happens; take your time
going through this.
Looking at the first derivative, the derivative of the total error with
respect to ah1: because the total error is the sum of Eo1 and Eo2, the sum
rule gives us
The derivative of the total error with respect to ah1 now looks like

The derivative of ah1 with respect to sh1 is the derivative of the sigmoid
function, and the derivative of sh1 with respect to w1 is the output of the
previous-layer neuron (which is an input neuron, as we only have 1 hidden
layer in this example).
Putting it together
Let’s group the weighted sum of the deltas in the next layer (the output
layer) with the sigmoid derivative and call it Delta(h1)
Let’s generalise the formulas for the delta of a neuron in a hidden layer
and the formula to update the weight in a hidden layer.
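As a sketch, for a hidden neuron h1 whose output feeds each neuron k in the next layer through weight $w_{h1 \to k}$, and whose input $x_1$ arrives through weight $w_1$ (the exact weight numbering follows the article’s figures, which aren’t reproduced here):

$$\delta_{h1} = \Big(\sum_{k} \delta_k\, w_{h1 \to k}\Big)\, a_{h1}(1 - a_{h1}), \qquad w_1^{new} = w_1 - \eta\, \delta_{h1}\, x_1$$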
Now let’s apply real numbers from the example to those equations to
calculate new weights w1, w2, w3, w4
Hidden Neuron 1:
Hidden Neuron 2:
That’s it! All of our weights have been updated, and that was just 1
iteration (epoch). If we run it thousands or millions of times, the error
becomes smaller and smaller, increasing the accuracy of the network’s
predictions.
There were a lot of math formulas, calculations, and variables, so errors
are quite likely to occur. If you notice something incorrect, please let me
know!
First, let’s define the 2 functions for the sigmoid activation function and its
derivative. These 2 will be reused throughout the exercise.
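A minimal sketch of what these two helpers look like (the exact code block from the original post isn’t reproduced here, so names may differ slightly):

```python
import math

def sigmoid(x):
    # Logistic activation: squashes any real value into the range (0, 1).
    return 1 / (1 + math.exp(-x))

def sigmoid_derivative(output):
    # Derivative of the sigmoid expressed in terms of the sigmoid's output:
    # sigma'(x) = sigma(x) * (1 - sigma(x)).
    return output * (1 - output)
```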
Bias (bias): Similar to the intercept in a linear equation, the bias allows
the neuron to adjust its output independently of the input.
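Putting the neuron’s pieces together, here is a sketch of what the Neuron class might look like; the attribute and method names are assumptions rather than the original code, and it reuses the sigmoid helper defined above:

```python
import random

class Neuron:
    def __init__(self, num_inputs):
        # One weight per incoming connection, randomised to start with.
        self.weights = [random.uniform(-1, 1) for _ in range(num_inputs)]
        # The bias acts like the intercept in a linear equation.
        self.bias = random.uniform(-1, 1)
        self.output = 0.0  # last activation, filled in during the forward pass
        self.delta = 0.0   # error term, filled in during backpropagation

    def activate(self, inputs):
        # Weighted sum of the inputs plus the bias, squashed by the sigmoid.
        weighted_sum = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        self.output = sigmoid(weighted_sum)
        return self.output
```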
Neurons are then organised into layers — here’s the Layer class. Layers
organise neurons into meaningful groups. Neurons in the same layer share
the same input and output dimensions.
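A sketch of such a Layer class, again with assumed names, could be as simple as a container of neurons that all see the same inputs:

```python
class Layer:
    def __init__(self, num_neurons, num_inputs_per_neuron):
        self.neurons = [Neuron(num_inputs_per_neuron) for _ in range(num_neurons)]

    def forward(self, inputs):
        # Every neuron receives the same input vector; the layer's output is
        # the list of the neurons' activations.
        return [neuron.activate(inputs) for neuron in self.neurons]
```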
Now, the bulk of the logic is in the Network class. It represents the neural
network itself and orchestrates its training and prediction processes.
Key properties:
Key functions:
The train method trains the neural network for a specified number of
epochs using the provided training set and expected outputs. The
expected list uses one-hot encoding to indicate the expected output.
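Below is a sketch of how such a Network class might wire the forward pass, backpropagation, and the training loop together, using the formulas derived earlier. It assumes a single hidden layer (as in this example) and builds on the Neuron and Layer sketches above; method names and structure are my own guesses, not the original code:

```python
class Network:
    def __init__(self, layers, learning_rate=0.5):
        self.layers = layers
        self.learning_rate = learning_rate

    def forward(self, inputs):
        # Feed the inputs through each layer in turn.
        for layer in self.layers:
            inputs = layer.forward(inputs)
        return inputs

    def backward(self, inputs, expected):
        output_layer = self.layers[-1]
        # Output deltas: (output - target) * sigmoid'(output).
        for neuron, target in zip(output_layer.neurons, expected):
            neuron.delta = (neuron.output - target) * sigmoid_derivative(neuron.output)
        # Hidden deltas: weighted sum of the next layer's deltas * sigmoid'.
        hidden_layer, = self.layers[:-1]  # this example has exactly 1 hidden layer
        for i, neuron in enumerate(hidden_layer.neurons):
            downstream = sum(n.delta * n.weights[i] for n in output_layer.neurons)
            neuron.delta = downstream * sigmoid_derivative(neuron.output)
        # Update weights and biases: w -= lr * delta * previous activation.
        layer_inputs = [inputs, [n.output for n in hidden_layer.neurons]]
        for layer, prev_outputs in zip(self.layers, layer_inputs):
            for neuron in layer.neurons:
                for j, prev in enumerate(prev_outputs):
                    neuron.weights[j] -= self.learning_rate * neuron.delta * prev
                neuron.bias -= self.learning_rate * neuron.delta

    def train(self, training_set, expected_outputs, epochs):
        for epoch in range(epochs):
            total_error = 0.0
            for inputs, expected in zip(training_set, expected_outputs):
                outputs = self.forward(inputs)
                total_error += sum((t - o) ** 2 for t, o in zip(expected, outputs))
                self.backward(inputs, expected)
            print(f"epoch={epoch} error={total_error:.4f}")
```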
Now it’s time to try to run this code on some sample data. I’ve reused the
data from this tutorial.
This function creates a sample dataset and initialises the network with 1
hidden layer (with 2 neurons) and 1 output layer (with 2 neurons). Then,
training is run for 40 epochs with a learning rate of 0.5.
The neurons’ weights are randomised initially and updated as training goes
on.
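As a rough sketch of how the pieces above might be wired together (the dataset values below are placeholders, not the exact data from the referenced tutorial):

```python
def run_example():
    # Each sample is [feature1, feature2]; expected outputs are one-hot encoded
    # to indicate which of the two classes the sample belongs to.
    training_set = [
        [2.78, 2.55], [1.46, 2.36], [3.40, 4.40], [1.39, 1.85],
        [7.63, 2.76], [5.33, 2.09], [6.92, 1.77], [8.68, -0.24],
    ]
    expected_outputs = [
        [1, 0], [1, 0], [1, 0], [1, 0],
        [0, 1], [0, 1], [0, 1], [0, 1],
    ]
    network = Network(
        layers=[Layer(2, 2), Layer(2, 2)],  # 1 hidden layer + 1 output layer
        learning_rate=0.5,
    )
    network.train(training_set, expected_outputs, epochs=40)

run_example()
```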
If you run the code, you should get something similar to this
Now, the code is obviously quite lengthy and somewhat complex — let’s try to
simplify that by using numpy!
Using numpy helps us shorten the code a little bit; you can think of it as
doing the calculation in “bulk” with matrices, instead of looping over one
neuron and one layer at a time as our previous implementation did.
However, you’d need to have a pretty good mental model of the dimensions
of the matrices in each step in order to understand and write the correct
calculation, which can be a bit challenging. I’ve put comments in the code
about the dimensions expected for most of the calculations (based on the
test case).
The Layer class is a data class that encapsulates the parameters and
attributes associated with a layer in the neural network. We don’t need a
Neuron class anymore, since each neuron simply becomes an element in the
numpy arrays/matrices.
I’ve added a static method create which creates a network with random
weights and biases based on the specified number of neurons in each layer.
The rest of the functions are the same, except the calculation is done with
matrix multiplications instead of manually multiplying each neuron’s data.
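A sketch of what this numpy-based Layer and the vectorised forward pass might look like (shapes assume one sample at a time, and the names are my own, not necessarily the article’s):

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Layer:
    weights: np.ndarray        # shape: (n_inputs, n_neurons)
    biases: np.ndarray         # shape: (1, n_neurons)
    output: np.ndarray = None  # shape: (1, n_neurons), set during the forward pass

    @staticmethod
    def create(n_inputs, n_neurons):
        # Random initial weights and biases, mirroring the plain-Python version.
        return Layer(np.random.rand(n_inputs, n_neurons), np.random.rand(1, n_neurons))

def forward(layers, x):
    # x has shape (1, n_features); each layer computes sigmoid(x @ W + b)
    # for all of its neurons at once instead of looping over them.
    for layer in layers:
        layer.output = 1 / (1 + np.exp(-(x @ layer.weights + layer.biases)))
        x = layer.output
    return x
```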
There are obviously many ways of implementing this; one might simplify it
further by removing the Layer class completely and representing the whole
network with nested arrays. However, I find that approach a bit hard to wrap
my head around with all of the extra dimensions, so I went with this
approach for now.
The dimensions of the input and expected output are slightly changed to fit
numpy n-D arrays, but the data stays the same.
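The original post then shows the equivalent model built with TensorFlow. A minimal Keras sketch of the same architecture (2 inputs, a 2-neuron sigmoid hidden layer, and a 2-neuron sigmoid output layer) might look like the following; the optimiser, loss, sample data, and epoch count simply mirror the from-scratch version and are assumptions, not necessarily what the original code used:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(2, activation="sigmoid"),  # hidden layer
    tf.keras.layers.Dense(2, activation="sigmoid"),  # output layer
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.5), loss="mse")

# x_train: (n_samples, 2) features; y_train: (n_samples, 2) one-hot targets.
x_train = np.array([[2.78, 2.55], [7.63, 2.76]], dtype=np.float32)
y_train = np.array([[1, 0], [0, 1]], dtype=np.float32)
model.fit(x_train, y_train, epochs=40, verbose=1)
```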
However, I won’t go into details about this code, or TensorFlow in general,
as our aim in this blog is not to delve into the intricacies of TensorFlow
but to understand the fundamental workings of a neural network. TensorFlow
abstracts away many of the underlying details, making it an efficient tool
for practical applications but potentially not the best way to learn.
Sample output
In this blog journey, we took a dive behind the scenes of neural networks,
starting with a basic walkthrough of the math and then moving into a code
implementation in Python. We built these networks step by step, first with
plain Python, then with the help of a handy library called numpy, and
finally, we peeked into the powerful realm of TensorFlow.
As we wrap up, I invite you to try this out for yourself. Coding is a journey of
discovery, and building a neural network from scratch is like a backstage
tour. So, grab your coding gear, start tinkering, and enjoy the adventure of
learning. Happy coding!