Neural Network Introduction
y = w1 ∗ x1 + w2 ∗ x2 + w3 ∗ x3 + b (1)
Where w1, w2, w3 are called the weights and b is an intercept term called the
bias. The graph above, therefore, is simply a graphical representation of a simple
linear equation. The equation can also be vectorized like this:
y = W.X + b (2)
Where X = [x1, x2, x3] and W = [w1, w2, w3].T (the .T means transpose). The
transpose is there so that the dot product gives us exactly the result we want,
i.e. w1 * x1 + w2 * x2 + w3 * x3. This gives us the vectorized version of our
linear equation.
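As a quick illustration, here is a minimal NumPy sketch of the vectorized form; the specific numbers are made up:

import numpy as np

x = np.array([1.0, 2.0, 3.0])   # input features x1, x2, x3
w = np.array([0.5, -1.0, 2.0])  # weights w1, w2, w3
b = 0.1                         # bias

y = np.dot(w, x) + b            # same as w1*x1 + w2*x2 + w3*x3 + b
print(y)                        # 4.6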
With machine learning, what we are essentially trying to do is this: given a
large amount of data (pairs of X and corresponding y), can we write an algorithm
to figure out good values of W and b? We need a way for our model to find
optimal values for W and b, not the true values. True values probably don’t even
exist given the limitations of our mathematical model, since we are assuming a
linear function for a problem whose real input-to-output relationship might be
much more complex, and we don’t know what that function is.
By taking the observed data and a proposed model, we want to write an algorithm
to learn the values of W and b which best fit the data. Ultimately, by doing
that, we learn an approximate function which maps the inputs of our data to its
outputs. This type of algorithm is called an optimization algorithm, and there
are a few different optimization algorithms that are typically used in training
neural networks.
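To make this concrete, here is a minimal sketch of one such optimization algorithm: plain gradient descent on a linear model with a mean squared error loss. The data, learning rate, and number of steps are made up for illustration.

import numpy as np

# Toy data: 100 examples with 3 features, generated from known parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_W = np.array([0.5, -1.0, 2.0])
y = X @ true_W + 0.3

# Start from zero parameters and repeatedly step down the gradient
# of the mean squared error loss.
W = np.zeros(3)
b = 0.0
lr = 0.1
for step in range(200):
    y_pred = X @ W + b
    error = y_pred - y
    grad_W = 2 * X.T @ error / len(X)   # d(loss)/dW
    grad_b = 2 * error.mean()           # d(loss)/db
    W -= lr * grad_W
    b -= lr * grad_b

print(W, b)   # approaches [0.5, -1.0, 2.0] and 0.3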
Let’s say we have a dataset whose examples have shape (60000, 28, 28). The
first dimension is simply the number of examples we have, so each example has
the shape (28, 28). If we unroll this 28 by 28 array into a single dimension,
it becomes a 28 * 28 = 784 dimensional vector. Now, it can probably be
modeled somewhat like a linear equation, right? Given features from x1 to x784,
we get an output y. It could be represented like this:
y = w1 ∗ x1 + w2 ∗ x2 + ... + w784 ∗ x784 + b
This may actually work for really simple problems but in most cases, this model
will turn out to be insufficient. This is where Neural Networks may be more
effective.
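For instance, with NumPy the unrolling step is just a reshape; this small sketch uses a zero-filled array standing in for the dataset described above:

import numpy as np

examples = np.zeros((60000, 28, 28))      # dataset of 28 by 28 images
flattened = examples.reshape(60000, 784)  # each image becomes a 784-dim vector
print(flattened.shape)                    # (60000, 784)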
A neural network passes each weighted sum through a non-linear activation
function. This matters because in many, if not most, cases the input to output
map is going to be much more complex than a linear function. So, the activation
gives the model more flexibility, and allows it to learn non-linear patterns.
Now, instead of setting y to a weighted sum of our input features, we can compute
a few hidden outputs, which are weighted sums of our input features passed
through an activation function, and then take weighted sums of those hidden
outputs, and so on. We do this a few times, and then get to our output y. This
type of model gives our algorithm a much greater chance of learning a complex
function.
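A minimal forward pass for such a model might look like this in NumPy. The sketch assumes 2 input features, two hidden layers of 2 nodes each, one output node, a ReLU activation, and randomly initialized parameters; the sizes are chosen to match the example network described next:

import numpy as np

def forward(x, W1, b1, W2, b2, W3, b3):
    a1 = np.maximum(0, W1 @ x + b1)   # hidden layer 1: weighted sum + activation
    a2 = np.maximum(0, W2 @ a1 + b2)  # hidden layer 2: weighted sum + activation
    return W3 @ a2 + b3               # output layer

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0])                        # 2 input features
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # hidden layer 1 parameters
W2, b2 = rng.normal(size=(2, 2)), np.zeros(2)   # hidden layer 2 parameters
W3, b3 = rng.normal(size=(1, 2)), np.zeros(1)   # output layer parameters
print(forward(x, W1, b1, W2, b2, W3, b3))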
In the network above, we have two hidden layers. The first layer with all the X
features is called the input layer and the output y is called the output layer. In
this example, the output has only one node. A hidden layer can have many nodes
or very few nodes depending on how complex the problem may be. Here, both
hidden layers have 2 nodes each. Each node takes inputs from every node of the
preceding layer, computes a weighted sum of them, and passes that sum through
an activation function to produce its output. All the W’s and all the b’s
associated with these functions will have to be “learned” by our algorithm as it
attempts to optimize those values in order to best fit the given data. Note that
the total number of learnable parameters in any layer depends on the number of
nodes in that layer as well as on the number of nodes in the preceding layer. For
example, the learnable parameters for hidden layer 1 can be calculated as:
(number of nodes of the layer) * (number of nodes of the preceding layer) +
(number of nodes of the layer). Why? The first part is obvious: if every node of
a layer is connected to every node of the preceding layer, we can simply multiply
the number of nodes of these two layers to get the total number of weight
parameters. Also, each node in the layer has its own bias term - that gives us
the second term. So, for hidden
layer 1, we get: 2 * 2 + 2 = 6 learnable parameters.
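As a quick check, here is a small sketch of that formula, applied both to the 2-node hidden layer above and to the layer sizes of the digit-classification network described next:

def count_params(n_nodes, n_prev):
    # weights from every preceding node to every node, plus one bias per node
    return n_nodes * n_prev + n_nodes

print(count_params(2, 2))      # hidden layer 1 above: 2 * 2 + 2 = 6
print(count_params(128, 784))  # a 128-node layer fed by a 784-dim input: 100480
print(count_params(10, 128))   # a 10-node output layer fed by 128 nodes: 1290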
In the hand-written digit classification problem, we will have 128 nodes in each
of the two hidden layers, we will have 10 nodes for the output layer with each
node corresponding to one output class, and of course we already know that the
input is a 784 dimensional vector.
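Under those assumptions, the architecture could be sketched in Keras roughly like this; the relu and softmax activations anticipate the discussion below and are not spelled out in the description above:

from tensorflow import keras

# 784-dimensional input -> two 128-node hidden layers -> 10-class output
model = keras.Sequential([
    keras.layers.Input(shape=(784,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.summary()  # prints the learnable parameter count for each layer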
Activation Functions
We have talked about each node computing a weighted sum of the inputs it
receives from the preceding layer. And, before this sum is fed to the next layer’s
nodes, it goes through another function called an activation function. So, each
node actually does two things. The first step is the weighted sum, let’s call it Z:
Z = W.X + b (3)
The second step in the node is the activation function output, let’s call it A:
A = f (Z) (4)
There are various types of activation functions used in Neural Networks. One of
the more common ones is the rectified linear unit, or ReLU, function. It’s a
pretty simple function: it is linear for all the positive values and is simply set
to 0 for all the negative values, i.e. f(Z) = max(0, Z).
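In code, ReLU is a one-liner; here is a NumPy sketch applied to a few made-up values:

import numpy as np

def relu(z):
    # linear for positive values, 0 for negative values
    return np.maximum(0, z)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]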
The output layer of a classifier typically uses an activation, commonly softmax,
that produces probabilities for the various classes given the input. The class
with the highest probability gives us our prediction.
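For example, a small NumPy sketch with made-up probabilities for the 10 digit classes:

import numpy as np

# Hypothetical output probabilities for classes 0 through 9
probs = np.array([0.01, 0.02, 0.05, 0.02, 0.60, 0.10, 0.05, 0.05, 0.05, 0.05])
prediction = np.argmax(probs)  # index of the highest probability
print(prediction)              # 4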