
Deep Learning Concepts

Overview

Why Study Artificial Neural Networks?
• They are extremely powerful computational devices (Turing equivalent, universal computers)
• Massive parallelism makes them very efficient
• They can learn and generalize from training data – so there is no need for enormous feats of programming
• They are particularly fault tolerant – this is equivalent to the "graceful degradation" found in biological systems
• They are very noise tolerant – so they can cope with situations where normal symbolic systems would have difficulty
• In principle, they can do anything a symbolic/logic system can do, and more. (In practice, getting them to do it can be rather difficult…)
Learning in Neural Networks
• There are many forms of networks.
• Most operate by passing neural 'activations' through a network of connected neurons.
Learning in Neural Networks
• One of the most powerful features of neural networks is their ability to learn and generalize from a set of training data.
• They adapt the strengths/weights of the connections between neurons so that the final output activations are correct.
Learning in Neural Networks
• There are three broad types of learning:
1. Supervised learning (i.e. learning with a teacher)
2. Reinforcement learning (i.e. learning with limited feedback)
3. Unsupervised learning (i.e. learning with no help)
Single and multilayer feedforward nets
Single Layer Nets
• Units can be input and output
• Typically each input unit is connected to all output units
• Input units are not connected to each other
• Output units are not connected to each other
• In Hopfield nets, each unit functions as both input unit and output unit
Multilayer Net
• Multilayer nets have hidden layers between the input and the output
• Can solve more complex problems than single layer nets
• Training may be more involved, though not always
Setting the Weights
• Apart from the architecture, the method of setting the weights or values is another distinguishing feature
• This is called training
• Training may be supervised or unsupervised
• For fixed weight nets, there is no iterative training process
Supervised Training
• A sequence of input training vectors, each with an associated output vector
• Weights are adjusted accordingly
• E.g. pattern classification problems, where an input vector must be classified as belonging or not belonging to a certain set
Supervised Training
• Pattern association is another class of problems where the output is not yes/no but a pattern
• Such a neural network is called an associative memory
• Multilayer nets may be used to map input vectors of size n (n-tuples) to output vectors of size m (m-tuples)
Unsupervised Training
• Self-organizing neural nets group similar input vectors together and map them to the same output
• No training data is used
• Weights are modified to achieve the mapping
• Clustering is one application where unsupervised learning is used
Fixed Weight Nets
• Used for constraint optimization problems
• Constraints are represented by weights
The McCulloch-Pitts Neuron
• A set of synapses (i.e. connections) brings in activations from other neurons.
• A processing unit sums the inputs, and then applies a non-linear activation function (i.e. squashing/transfer/threshold function).
• An output line transmits the result to other neurons.
Networks of McCulloch-Pitts Neurons
• The simplest ANNs consist of a set of McCulloch-Pitts neurons labeled by indices k, i, j; activation flows between them via synapses with strengths w_{ki}, w_{ij}.
Some Useful Functions
• Common activation functions
• Identity function: f(x) = x for all x
• Binary step function with threshold θ (aka Heaviside function or threshold function):
  $$f(x) = \begin{cases} 1 & \text{if } x \ge \theta \\ 0 & \text{if } x < \theta \end{cases}$$
Some Useful Functions
• Binary sigmoid:
  $$f(x) = \frac{1}{1 + e^{-x}}$$
• Bipolar sigmoid:
  $$g(x) = 2 f(x) - 1 = \frac{2}{1 + e^{-x}} - 1$$
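A minimal NumPy sketch (mine, not from the slides) of the activation functions above:

```python
import numpy as np

def identity(x):
    # f(x) = x for all x
    return x

def binary_step(x, theta=0.0):
    # Heaviside / threshold function with threshold theta
    return np.where(x >= theta, 1.0, 0.0)

def binary_sigmoid(x):
    # f(x) = 1 / (1 + e^{-x}), output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def bipolar_sigmoid(x):
    # g(x) = 2 f(x) - 1, output in (-1, 1)
    return 2.0 * binary_sigmoid(x) - 1.0

x = np.linspace(-5, 5, 5)
print(binary_step(x, theta=1.0), binary_sigmoid(x), bipolar_sigmoid(x))
```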
The McCulloch-Pitts Neuron Equation
• The output out of a McCulloch-Pitts neuron as a function of its n inputs in_i:
  $$out = \mathrm{step}\!\left(\sum_{i=1}^{n} in_i - \theta\right)$$
  where step is the binary step (threshold) function above and θ is the neuron's threshold.
The Perceptron
• We can connect any number of McCulloch-Pitts neurons together in any way we like.
• An arrangement of one input layer of McCulloch-Pitts neurons feeding forward to one output layer of McCulloch-Pitts neurons is known as a Perceptron.
Logic Gates with MP Neurons
• We can use McCulloch-Pitts neurons to implement the basic logic gates.
• All we need to do is find the connection weights and neuron thresholds to produce the right outputs for each set of inputs.
• One can construct simple networks that perform NOT, AND, and OR (a code sketch follows the tables below).
• It is then a well known result from logic that we can construct any logical function from these three operations.
Implementation of Logical NOT, AND, and OR
• Logical OR: inputs x1 and x2 each connect to y with weight 2; output neuron threshold θ = 2

  x1  x2 | y
   0   0 | 0
   0   1 | 1
   1   0 | 1
   1   1 | 1
Implementation of Logical NOT, AND, and OR
• Logical AND: inputs x1 and x2 each connect to y with weight 1; threshold θ = 2

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 0
   1   1 | 1
Implementation of Logical NOT, AND, and OR
• Logical NOT: input x1 connects to y with weight -1, and a bias unit (fixed at 1) connects with weight 2; threshold θ = 2

  x1 | y
   0 | 1
   1 | 0
Implementation of Logical NOT, AND, and OR
• Logical AND NOT (y = x1 AND NOT x2): x1 connects to y with weight 2, x2 with weight -1; threshold θ = 2

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 1
   1   1 | 0
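A minimal Python sketch (mine, not from the slides) of a McCulloch-Pitts neuron and the four gates above, using the binary step neuron equation and the weights/thresholds shown in the tables:

```python
def mp_neuron(inputs, weights, theta):
    """Binary step neuron: fire (1) if the weighted input sum reaches theta."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= theta else 0

def OR(x1, x2):       return mp_neuron([x1, x2], [2, 2], theta=2)
def AND(x1, x2):      return mp_neuron([x1, x2], [1, 1], theta=2)
def NOT(x1):          return mp_neuron([x1, 1], [-1, 2], theta=2)   # second input is a fixed bias of 1
def AND_NOT(x1, x2):  return mp_neuron([x1, x2], [2, -1], theta=2)  # x1 AND (NOT x2)

# Print the truth tables
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "| OR:", OR(a, b), " AND:", AND(a, b), " AND NOT:", AND_NOT(a, b))
print("NOT:", NOT(0), NOT(1))
```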
Neural Networks
Convolutional Networks: LeNet-5
• Neural networks that use convolution in place of general matrix multiplication in at least one layer
• The original Convolutional Neural Network model goes back to 1989 (LeCun)
• AlexNet (Krizhevsky, Sutskever, Hinton 2012): ImageNet 2012, 15.4% error rate

Convolutional Neural Networks
Figure: Andrej Karpathy


Convolution
• Kernel: a 3 × 3 grid of weights w1 … w9
• Convolve the grayscale image with the kernel having weights w (learned by backpropagation) to produce a feature map
Convolution
• At each position, the kernel is placed over an image patch x and the output value is the dot product $w^T x$; sliding the kernel across the image fills in the feature map.
• What is the number of parameters?
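A minimal NumPy sketch (mine, not from the slides) of this operation: at each position the output is the dot product w^T x of the kernel weights with the image patch beneath it.

```python
import numpy as np

def convolve2d(image, kernel):
    # "Valid" convolution with stride 1 (as in most CNN libraries, the kernel
    # is not flipped, so strictly this is cross-correlation).
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    feature_map = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(kernel * patch)   # w^T x at this position
    return feature_map

image = np.random.rand(7, 7)    # toy grayscale image
kernel = np.random.rand(3, 3)   # 9 learnable weights, shared across all positions
print(convolve2d(image, kernel).shape)   # (5, 5)
```

This also answers the question above: the layer's parameters are just the kernel weights (9 here, plus an optional bias), shared across every position and independent of the image size.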
Learn Multiple Filters
• If we use 100 filters, we get 100 feature maps
Figure: I. Kokkinos
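Continuing the convolve2d sketch above (an illustration, not from the slides): applying 100 different kernels to the same image yields 100 feature maps, one per filter.

```python
kernels = np.random.rand(100, 3, 3)                      # 100 learnable filters
feature_maps = [convolve2d(image, k) for k in kernels]   # 100 feature maps
print(len(feature_maps))                                 # 100
```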
Pooling
• Max pooling: each output value is $\max\{a_i\}$ over the activations in its pooling window
• Other options: average pooling, L2-norm pooling, random pooling
• We have multiple feature maps, and get an equal number of subsampled maps
• This changes if cross-channel pooling is done
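A minimal NumPy sketch (mine, not from the slides) of 2 × 2 max pooling with stride 2: each output value is max{a_i} over a 2 × 2 window of the input feature map.

```python
import numpy as np

def max_pool2d(feature_map, size=2, stride=2):
    h, w = feature_map.shape
    oh, ow = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max()    # max{a_i} over the window
    return out

fm = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2d(fm))   # 2x2 output; pooling adds no learnable parameters
```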
LeNet-5
• Filters are of size 5 × 5, stride 1
• Pooling is 2 × 2, with stride 2
• How many parameters?
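A rough, hedged illustration of the parameter count, assuming the classic LeNet-5 configuration of 6 such 5 × 5 filters on a 1-channel input in the first convolutional layer, each with its own bias:

```python
# Parameter count for the first convolutional layer under the assumptions above.
filters, kernel_h, kernel_w, in_channels = 6, 5, 5, 1
conv1_params = filters * (kernel_h * kernel_w * in_channels + 1)   # weights + one bias per filter
print(conv1_params)   # 156; plain 2x2 pooling layers add no learnable weights
```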
AlexNet
• Input image: 227 × 227 × 3
• First convolutional layer: 96 filters with K = 11 applied with stride = 4
• Width and height of output: (227 − 11)/4 + 1 = 55
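A small sketch (mine, not from the slides) of the output-size arithmetic used above, plus the parameter count for this layer assuming one bias per filter:

```python
# Output spatial size of a convolution with no padding: (W - K) / stride + 1.
def conv_output_size(W, K, stride, padding=0):
    return (W + 2 * padding - K) // stride + 1

print(conv_output_size(227, 11, 4))   # 55, matching the slide
print(96 * (11 * 11 * 3 + 1))         # 34,944 parameters in this layer
```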
Success of Deep Learning
--Dramatic improvements in accuracy
--Solving a wide range of tasks traditionally thought to be difficult
--Resulted in an intense increase in interest, research, and investment in AI in general
--Enormous business value
--May need 10 years to reach full potential
Success of Deep Learning
--Amazing results obtained with relatively simple ideas
--Incremental algorithmic innovations
--Availability of large amounts of perceptual data
--Fast, highly parallel computation hardware (NVIDIA GPUs)
--Software layers: CUDA, TensorFlow, Keras
Steps of Deep Learning
--Define problem
--Metrics to measure success
--Define testing and validation process
--Vectorize data
--Define model
--Tune parameters, refine model
--Avoid overfitting

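A minimal Keras sketch (mine, not from the slides; it assumes vectorized input of 100 features, binary classification, and accuracy as the success metric) illustrating the workflow steps listed above:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Vectorize data (here: random stand-in data for the illustration)
x = np.random.rand(1000, 100).astype("float32")
y = np.random.randint(0, 2, size=(1000,))

# Define the model
model = keras.Sequential([
    layers.Dense(32, activation="relu", input_shape=(100,)),
    layers.Dropout(0.5),                      # one way to fight overfitting
    layers.Dense(1, activation="sigmoid"),
])

# Metric to measure success + training configuration
model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])

# Testing/validation process: hold out part of the data for validation
model.fit(x, y, epochs=5, batch_size=32, validation_split=0.2)
```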
Deep Learning Architectures
--Dense, CNN, RNN
--Vector data: Dense
--Image data: 2D convnets
--Sound data: 1D convnets (preferred)/RNN
--Text: 1D convnets (preferred)/RNN
--Timeseries: RNN (preferred)/1D convnets
--Sequence: RNN/1D convnets
--Video: 3D convnets, 2D CNN + RNN
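A hedged illustration (an assumed mapping, not from the slides) of the Keras layer types that correspond to the architectures listed above:

```python
from tensorflow.keras import layers

dense  = layers.Dense(64, activation="relu")      # vector data
conv2d = layers.Conv2D(32, 3, activation="relu")  # image data (2D convnet)
conv1d = layers.Conv1D(32, 5, activation="relu")  # text / sound / sequence data (1D convnet)
rnn    = layers.LSTM(32)                          # timeseries / sequence data (RNN)
conv3d = layers.Conv3D(16, 3, activation="relu")  # video (3D convnet)
```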
Limitations of Deep Learning
--Cannot read specs and design an algorithm
--Cannot code
--Certain things are too complex to express as a chain of simple vector transformations
--Cannot learn without data
--Cannot plan
Future of Deep Learning
--Models based on richer primitives
--New forms of learning algorithms
--Automatic Deep Learning
--Greater systematic use of previously learned features and architectures
Credits
• Many slides by:
• Shubhendu Trivedi and Risi Kondor, University of Chicago
• Michael Scherger, Department of Computer Science, Kent State University
Reference Book 1
• Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press, 2017. ISBN-13: 978-0-262-03561-3.
Reference Book 2
• Fundamentals of Neural Networks: Architectures, Algorithms and Applications, by Laurene Fausett, Pearson Education, 2006. ISBN-13: 978-8-131-70053-2.
Reference Book 3
• Deep Learning with Python, by Francois Chollet, Manning Publishers, 2018. ISBN-13: 978-1-617-29443-3.
