The document provides an introduction to neural networks, focusing on their structure, types, and functionality, particularly Feedforward Neural Networks (FNN) and Multilayer Perceptrons (MLP). It discusses the limitations of simple linear models and compares the approaches of Support Vector Machines (SVM) and neural networks in adapting basis functions for large-scale problems. Additionally, it outlines the process of forward propagation in a two-layer perceptron model and the role of activation functions in both hidden and output layers.
13 - Introduction to Neural Networks
UCLA Math156: Machine Learning
Instructor: Lara Kassab

Neural Networks

The origin of neural networks is inspired by information processing models of biological systems, in particular the human brain. Neural networks are also called Artificial Neural Networks (ANN) or Neural Nets (NN). They consist of connected artificial neurons, called units or nodes, which loosely model the neurons in a brain.

Deep learning refers to training neural networks with multiple hidden layers.

Feedforward Neural Network
A Feedforward Neural Network (FNN) is one of the two main types of NNs. An FNN has a uni-directional flow of information between its layers: connections run forward from input nodes → (multiple) hidden nodes → output nodes, without any cycles or loops. This is in contrast to Recurrent Neural Networks (RNNs), whose connections form cycles, so information can also flow backward through the network.
FNNs can be regression or classification models depending on the activation function used in the output layer.

Multilayer Perceptron

A Multilayer Perceptron (MLP) is an FNN in which every node of one layer is connected to every node of the succeeding layer (except for the bias node). This architecture is called fully-connected.

Review of Linear Models
Generalized linear models for regression and classification have the form:

y(x, w) = f\Big( \sum_{j=0}^{M-1} w_j \phi_j(x) \Big)
The basis functions ϕ_j(x) are fixed nonlinear functions such as Gaussian RBFs, sigmoidal functions, etc. For regression, f is usually the identity function. For classification, f is usually a nonlinear activation function such as the logistic sigmoid or the sign function.
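As a small illustration, here is a minimal NumPy sketch of such a fixed-basis model, assuming hand-picked Gaussian RBF basis functions (the centers, width, and weights below are illustrative choices, not from the notes):

import numpy as np

# Sketch of a fixed-basis linear model y(x, w) = f( sum_j w_j * phi_j(x) ).
# The RBF centers `mu`, width `s`, and weights `w` are illustrative choices.

def gaussian_rbf_features(x, mu, s):
    # phi_j(x) = exp(-||x - mu_j||^2 / (2 s^2)), plus a constant phi_0(x) = 1
    dists = np.sum((x[None, :] - mu) ** 2, axis=1)
    return np.concatenate(([1.0], np.exp(-dists / (2 * s ** 2))))

def generalized_linear_model(x, w, mu, s, f=lambda a: a):
    # f is the identity for regression; swap in a sigmoid for classification
    return f(w @ gaussian_rbf_features(x, mu, s))

mu = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]])   # M - 1 = 3 centers, D = 2
w = np.array([0.1, 0.5, -0.3, 0.2])                    # M = 4 weights incl. bias w_0
print(generalized_linear_model(np.array([0.2, -0.4]), w, mu, s=1.0))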
Simple Linear Models: Limitations

These (fixed) linear basis function models have limited practical applicability on large-scale problems due to the curse of dimensionality: the number of basis functions (and hence of coefficients) needed to fit the data grows rapidly with the number of input features.

To extend to large-scale problems, we need to adapt the basis functions ϕ_j to the data. Both SVMs and neural networks address this limitation, in different ways.

SVM Approach
In an SVM, the number of basis functions is not fixed in advance. Basis functions are centered on the training samples, and the SVM selects a subset of them during training (the support vectors). How many are selected depends on the characteristics of the data, the choice of kernel, the hyperparameters (e.g. the regularization coefficient), etc. Although training involves nonlinear optimization, the objective function is convex. The resulting number of basis functions is typically much smaller than the number of training points, but it can still be large and tends to grow with the size of the training set.
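As a small illustration of how the number of support vectors varies with the regularization coefficient, here is a sketch using scikit-learn's SVC (the synthetic data and C values are arbitrary choices, not from the notes):

import numpy as np
from sklearn.svm import SVC

# The number of selected basis functions (support vectors) depends on the data,
# the kernel, and the regularization coefficient C.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)   # a nonlinear decision boundary

for C in (0.1, 1.0, 10.0):
    clf = SVC(kernel="rbf", C=C).fit(X, y)
    print(f"C={C}: {len(clf.support_vectors_)} support vectors out of {len(X)} samples")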
Neural Networks Approach

Neural networks fix the number of basis functions in advance, but allow the basis functions to be adaptive: each ϕ_j has its own parameters {w_ji} which are adapted during training. Training a neural network involves a non-convex optimization (with many local minima), but in return we obtain a more compact and faster model at the expense of a harder training problem.

Basic Neural Network Model
A neural network can be represented similarly to a linear model, but with generalized basis functions:

y(x, w) = f\Big( \sum_{j=0}^{M-1} w_j \phi_j(x) \Big)

There can be several activation functions f, and the construction is repeated layer after layer:

generalized model = nonlinear function( linear model )

The parameters of the nonlinear basis functions ϕ_j are adjusted during training.

Basic Neural Network Model
A basic FNN model can be described by a series of functional transformations. Given an input x = (x_1, ..., x_D)^T, we first construct M linear combinations of the form:

a_j = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)},  for j = 1, ..., M

The superscript (1) indicates that the parameters belong to the first layer of the network; the parameters w_{ji} are referred to as weights and the parameters w_{j0} as biases (the biases can be absorbed into the sum by defining x_0 = 1).

Two-layer Perceptron Model

Basic Neural Network Model
Recall from above:
a_j = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)},  for j = 1, ..., M
The quantities a_j are known as activations; they are the inputs to the activation functions.
The number of hidden units in a layer (M in this case) can be
regarded as the number of basis functions.
In neural networks, each basis function has parameters w_{ji} which can be adjusted (learned through the training process).

Basic Neural Network Model
Each activation a_j is transformed using a differentiable nonlinear activation function h,

z_j = h(a_j).

So, for the nodes of the (first) hidden layer we have:

z_j = \underbrace{h\Big( \underbrace{\sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)}}_{\text{linear model}} \Big)}_{\text{generalized linear model}},  for j = 1, ..., M
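A minimal NumPy sketch of this hidden-layer computation (the sizes, random weights, and tanh nonlinearity are illustrative choices):

import numpy as np

# Hidden layer: a = W1 x + b1 (activations), z = h(a) (hidden unit outputs).
# Shapes: x is (D,), W1 is (M, D), b1 is (M,); tanh is one possible choice of h.
D, M = 3, 4
rng = np.random.default_rng(0)
W1 = rng.normal(size=(M, D))   # weights w_ji^(1)
b1 = rng.normal(size=M)        # biases  w_j0^(1)

x = np.array([0.5, -1.0, 2.0])
a = W1 @ x + b1                # activations a_j
z = np.tanh(a)                 # hidden unit values z_j = h(a_j)
print(a, z)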
This process is repeated for each pair of consecutive layers until we reach the output layer.

Activation Functions for Hidden Layers

Examples of activation functions for hidden layers:

1. Logistic sigmoid, R → (0, 1): σ(a) = \frac{1}{1 + e^{-a}}
2. Hyperbolic tangent, R → (−1, 1): tanh(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}}
3. Rectified Linear Unit (ReLU), R → R_+: f(a) = max(0, a)
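In NumPy, these three functions can be sketched as follows (the function names are my own):

import numpy as np

def sigmoid(a):
    # logistic sigmoid: R -> (0, 1)
    return 1.0 / (1.0 + np.exp(-a))

def tanh(a):
    # hyperbolic tangent: R -> (-1, 1)
    return np.tanh(a)

def relu(a):
    # rectified linear unit: R -> [0, inf)
    return np.maximum(0.0, a)

a = np.linspace(-3.0, 3.0, 7)
print(sigmoid(a), tanh(a), relu(a), sep="\n")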
There are many choices of activation functions; we will discuss their key properties later.

Two-layer Perceptron Model
To give a brief start-to-finish picture, we will consider only a two-layer perceptron model.
The values z_i (i = 1, ..., M) are linearly combined to give the output unit activations:

a_k = \sum_{i=1}^{M} w_{ki}^{(2)} z_i + w_{k0}^{(2)},  for k = 1, ..., K

where K is the total number of outputs.
This corresponds to the second layer of the network, and again the w_{k0} are bias parameters. The output unit activations a_k are transformed by an appropriate activation function f to give the network outputs y_k.

Activation Functions for Output Layer
The choice of the activation function in the output layer is
determined by the task (e.g. regression, classification), the nature of the data, the assumed distribution of the target variables, etc.
For standard regression problems, the activation function is usually the identity function, so that y_k = a_k (note that the number of output nodes K can be equal to 1). For multiple binary classification problems, each output unit activation is usually transformed using a logistic sigmoid function, so that y_k = σ(a_k). For multiclass problems, a softmax activation function is usually used.
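The identity and logistic sigmoid were sketched above; a softmax output activation can be sketched as follows (the max-subtraction is a standard numerical-stability trick, not something discussed in the notes):

import numpy as np

def softmax(a):
    # multiclass output activation: y_k = exp(a_k) / sum_j exp(a_j)
    # subtracting max(a) avoids overflow without changing the result
    e = np.exp(a - np.max(a))
    return e / np.sum(e)

a = np.array([2.0, -1.0, 0.5])          # example output activations a_k
print(softmax(a), softmax(a).sum())     # class probabilities summing to 1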
Two-layer Perceptron Model

→ Forward propagation is the process where the input data is passed through the network's layers (i.e. evaluated) to generate an output.

Putting the 2-layer perceptron model together, the forward propagation is:

y_k(x, w) = f\Big( \sum_{j=1}^{M} w_{kj}^{(2)} \, h\Big( \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \Big) + w_{k0}^{(2)} \Big)
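A minimal NumPy sketch of this two-layer forward pass (the layer sizes, random weights, tanh hidden activation, and identity output are illustrative choices):

import numpy as np

def forward(x, W1, b1, W2, b2, h=np.tanh, f=lambda a: a):
    # two-layer perceptron forward propagation: y = f(W2 h(W1 x + b1) + b2)
    z = h(W1 @ x + b1)      # hidden layer: z_j = h(a_j)
    return f(W2 @ z + b2)   # output layer: y_k = f(a_k)

# Illustrative sizes: D = 3 inputs, M = 4 hidden units, K = 2 outputs.
D, M, K = 3, 4, 2
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(M, D)), np.zeros(M)
W2, b2 = rng.normal(size=(K, M)), np.zeros(K)

x = np.array([0.5, -1.0, 2.0])
print(forward(x, W1, b1, W2, b2))   # identity output, i.e. regression-style outputs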
This can be written more generally for an MLP with L layers. Note how this architecture is fully-connected.
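For instance, the general L-layer case can be sketched as a loop over layers (again, the layer sizes and activations below are illustrative choices):

import numpy as np

def mlp_forward(x, weights, biases, h=np.tanh, f=lambda a: a):
    # forward propagation through a fully-connected MLP:
    # apply h at every layer except the last, which uses the output activation f
    z = x
    for l, (W, b) in enumerate(zip(weights, biases)):
        a = W @ z + b
        z = f(a) if l == len(weights) - 1 else h(a)
    return z

# Illustrative 3-layer network with layer sizes 3 -> 5 -> 4 -> 2.
rng = np.random.default_rng(0)
sizes = [3, 5, 4, 2]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
print(mlp_forward(np.array([0.5, -1.0, 2.0]), weights, biases))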
Remarks

A few more remarks on FNNs:

1. Multiple distinct choices for the weight vector w in an FNN can give rise to the same mapping function from inputs to outputs. This property is called weight-space symmetry (Section 5.1.1).
2. An FNN can be sparse, with not all connections being present (i.e. not fully-connected).
3. A convolutional neural network (CNN) is a special kind of FNN with significant use in image and text processing.