Neural Networks Essay
Steven Baar
Professor Anderson
CIS156
3/5/25
A common task when learning any new programming language is to begin by creating a
“hello world” script to take that first step towards learning. This project takes that same first step into the Python data science ecosystem. The book I’m reporting on, Neural Networks and Deep Learning by Michael Nielsen,
explains how neural networks are structured and how we can use an algorithm to make a
machine “learn”, or in other words, how we can optimize our predictions towards the likelihood
of an occurrence. In the practical examples, we are guided through using a neural network for
handwritten digit recognition. Using popular Python data science libraries like NumPy, Theano,
and Matplotlib, we can design a neural network to take in a recorded collection of handwritten
digits, train it to predict the number with incredible accuracy, and then present our results in a
way that is meaningful to interpret. Initially, we need to decide how we are going to structure the collection of data for training, testing, and validation. This will involve tweaking the technique used in the book to return a higher success rate. Also, it wouldn’t be data science without a few charts; using Matplotlib, we can create charts to visualize our results from different models and compare
predictions.
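To make that last point concrete, a small sketch along the following lines (my own illustration, not code from the book) shows how Matplotlib could plot the per-epoch results of several models side by side; the function name plot_accuracy and the dictionary layout it expects are assumptions of this sketch.

    import matplotlib.pyplot as plt

    def plot_accuracy(results_by_model):
        # results_by_model maps a model label (e.g. "784-30-10") to a list of
        # per-epoch counts of correct predictions out of the 10,000 test images.
        # The actual numbers come from the training runs described later on.
        for label, accuracies in results_by_model.items():
            epochs = range(1, len(accuracies) + 1)
            plt.plot(epochs, accuracies, label=label)
        plt.xlabel("Epoch")
        plt.ylabel("Correct predictions (out of 10,000)")
        plt.title("Comparing models on the MNIST test set")
        plt.legend()
        plt.show()
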
Let’s begin by understanding our current data set and how we want to proceed with
setting up our network. The examples taken from the book use the MNIST database, which is a
collection of handwritten digits written in part by Census Bureau employees and American high
school students. This has been widely used in training machine learning models that have
emerged throughout the 21st century. Each image has already been converted to grayscale and the
size is normalized to a 28x28 (784-pixel) square. We have been given 60,000 images that will be used for training our model, along with an additional 10,000 images that will be used for evaluation. This smaller set was drawn from a group of human writers who are not present in our original dataset; therefore, we can measure how well the network predicts unseen data. Throughout the book, a variety of methods are used to load the data; typically, though, we will
be using methods that implement standard pickle and gzip libraries to initialize our different data
sets. Once we have all the images sectioned into their respective variables (training_data,
validation_data, and test_data), we will initialize our network’s weights and biases with random values drawn from a Gaussian distribution (mean 0, standard deviation 1) via NumPy’s randn() method. Values near zero are the most probable, and values become less likely the farther they fall from zero. After everything is set up, we can use a technique like Stochastic Gradient Descent to begin training; this relies on a cost function that measures how well the model did, and we update our weights to minimize the output of that function. To accomplish this, we need the NumPy library, which is the core of our ability to handle blazing fast vector computations. With these pieces in place, we can compare what our network predicted versus the labels provided for each image.
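As a rough sketch of that flow (my own simplification, not the book’s exact loader), the standard gzip and pickle libraries can unpack the data file and NumPy’s randn() can supply the Gaussian starting values; the file name mnist.pkl.gz matches the book’s data file, while the function names below are mine.

    import gzip
    import pickle

    import numpy as np

    def load_mnist(path="mnist.pkl.gz"):
        # The archive is assumed to hold a pickled tuple of
        # (training_data, validation_data, test_data).
        with gzip.open(path, "rb") as f:
            training_data, validation_data, test_data = pickle.load(f, encoding="latin1")
        return training_data, validation_data, test_data

    def init_parameters(sizes=(784, 30, 10)):
        # randn() draws from a Gaussian with mean 0 and standard deviation 1,
        # so values close to zero are the most likely starting points.
        biases = [np.random.randn(y, 1) for y in sizes[1:]]
        weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]
        return biases, weights
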
Sounds great, but it’s still unclear what is happening under the hood. Using arrays, we
can construct a layered network of sigmoid neurons. These neurons act similarly to how a perceptron works; however, instead of lighting up with a one or a zero, sigmoid neurons output values ranging from
0.00 to 1.00. Therefore, a value of 0.638 would be considered valid but 1.5 or a negative number
would not. We give a value, or “activation,” to each neuron. The paths that connect the neurons are each given a weight, a real number that may be positive or negative. If we multiply each incoming activation by its weight, sum the results, add a bias, and finally wrap it all in the sigmoid function, we return a result squashed into the valid range of 0.00 to 1.00. To construct our
neural network, we will use three layers: the input layer, the hidden layer, and the output layer. Our input layer consists of 784 neurons, one for each pixel of a handwritten digit; the pixels are read line by line and stacked into a single column of neurons. Our hidden layer can consist of an arbitrary number of neurons; there are some techniques for choosing a size that suits a particular problem, but typically experimentation is necessary. Finally, we have our output layer, consisting of 10 neurons that determine which number is guessed, 0 through 9. We then initialize our data using the main training_data set along with the smaller sets for testing and validation. Now that we have our neural
network set up, we can begin training using gradient descent. Our training_data is a tuple with
two entries: the first is a NumPy ndarray with 50,000 entries (the remaining 10,000 of the 60,000 training images are set aside as validation_data), and each entry can be further expanded into the 784 pixels that represent a single picture. The other entry in the tuple is another ndarray that stores an answer value (0-9) for what each number is supposed to be. Therefore, this gives us a tuple that contains the pixels of each image and the number it is supposed to be. The same is true for validation_data and test_data; however, each of those contains only 10,000 images. All the weights and biases are then initialized using the Gaussian distribution technique mentioned previously, and the initial tests begin. On average, our model will be quite inaccurate since these values were initialized randomly. After our first pass through, the model will determine a cost by comparing the values it predicted with the labels provided by the answers ndarray. Using the cost function with Stochastic Gradient Descent, we can calculate the gradient vector of
our cost and then move in the opposite direction to search for a local minimum. This can be visualized by imagining a ball rolling down a slope into a valley. We have now finished describing how the network learns.
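A short sketch in Python ties these ideas together: the sigmoid squashing function, a feedforward pass through the layers, and a single step against the gradient. The formulas follow the book’s description, but the helper names and structure here are my own illustration.

    import numpy as np

    def sigmoid(z):
        # Squashes any real number into the range 0.0 to 1.0.
        return 1.0 / (1.0 + np.exp(-z))

    def feedforward(a, biases, weights):
        # a starts as a (784, 1) column of pixel activations; each layer
        # computes sigmoid(w . a + b) and passes the result to the next layer.
        for b, w in zip(biases, weights):
            a = sigmoid(np.dot(w, a) + b)
        return a

    def gradient_step(params, grads, eta):
        # Nudge every parameter a small step opposite its gradient, like the
        # ball rolling down the slope into the valley described above.
        return [p - eta * g for p, g in zip(params, grads)]
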
The book recommends running the neural network through the Python interpreter console; however, in my example I have created an additional script that imports the necessary network modules and trains the network as many times as we decide to loop. The first operation we do is load the mnist_library module and call its load_data_wrapper() method; this extracts our data from the mnist.pkl.gz file and allows us to initialize and store everything in three variables. After our data is loaded, we can begin our loop, using our variable test_times to determine how many cycles we will test the network. Our other import, network, enables us to initialize the actual network itself. We use the Network class in network and set the layer sizes to 784, 30, 10. Now we can call the SGD (Stochastic Gradient Descent) method, passing in our training data, the number of epochs to run the network for a single loop, a mini-batch size of 10, a learning rate of 3.0, and finally the dataset to test against. Our
results are quite promising: from our initial batch we can pick our best run, giving us a total of 9473 out of 10000 images successfully identified by epoch 30. We can bring that number of correct predictions up even further by increasing the number of neurons in our hidden layer. Let’s now test a network with layer sizes of 784, 100, 10. Taking our best result out of 3 runs, we have much better results: at epoch 30, our predictions are correct for 9647 out of 10000 images.
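My driver script is not reproduced line for line here, but a sketch along the following lines, assuming the module names mentioned above (mnist_library and network) and the SGD(training_data, epochs, mini_batch_size, eta, test_data=...) signature from the book, captures the procedure; test_times is the looping variable described earlier.

    import mnist_library
    import network

    # Unpack the three datasets from mnist.pkl.gz.
    training_data, validation_data, test_data = mnist_library.load_data_wrapper()

    test_times = 3  # train several fresh networks and keep the best epoch
    for run in range(test_times):
        # 784 input neurons, 30 hidden neurons, 10 output neurons.
        net = network.Network([784, 30, 10])
        # 30 epochs, mini-batch size 10, learning rate 3.0, scored on test_data.
        net.SGD(training_data, 30, 10, 3.0, test_data=test_data)
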
This concludes a brief look into the book Neural Networks and Deep Learning. With knowledge taken from this lesson, we can create a system of layers that can potentially approximate any function imaginable. By tweaking our hidden layer and output layer, we can take in any
input and begin training our model to light up the neurons that we specifically want activated.
This can be as simple as an output layer with Boolean answers (true/false, yes/no, etc.) or can be
as complicated as determining the health or age of someone based on an image. I’m grateful to
have had the opportunity to absorb as much of this book as possible. I hope that after a few years
practicing calculus I will come back to this book with a fresh perspective; there are still many
optimizations that can be done to our technique to bring our performance up all the way to 99%.
From this project, I was able to clone a repository to my computer, read through the code,
understand how to implement my own scripts, use the debugger in complex ways to visualize our