Chapter 5 Deep Learning
Neural Network
Dr. Võ Như Thành
Department of Mechatronics
Faculty of Mechanical Engineering
Email: [email protected]
Tel 0903532083
Content
Machine Learning vs. Neural Networks and Deep Learning
Advantages and Disadvantages
General Structure of Deep Learning and CNNs
Typical Famous CNN Architectures
Learning Types and Implementation
Reinforcement Learning
- Supervised and unsupervised learning work on static datasets, while RL works with data from a dynamic environment.
- The goal of RL is not to cluster or label the data, but to find the best sequence of actions that will generate the optimal outcome.
- RL allows an agent (a piece of software) to explore, interact with, and learn from the environment on a reward-and-punishment basis (see the sketch after this list).
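As a rough illustration of the reward-and-punishment idea, the update below is the core of tabular Q-learning, one common RL algorithm (a minimal MATLAB sketch; the state/action sizes and the sample transition are made-up values, not from this lecture):

% Toy tabular Q-learning update (illustrative values).
nStates = 16; nActions = 4;
alpha = 0.1;    % learning rate
gamma = 0.9;    % discount factor for future reward
Q = zeros(nStates, nActions);
% After taking action a in state s, landing in state s2 with reward r,
% the agent nudges its action-value estimate toward the observed return:
s = 1; a = 2; s2 = 5; r = -1;   % one example transition
Q(s,a) = Q(s,a) + alpha * (r + gamma * max(Q(s2,:)) - Q(s,a));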
Deep Learning (DL)
- Deep neural networks are the basis of deep learning.
- The term "deep" usually refers to the number of hidden layers in the neural network. Traditional neural networks contain only 2-3 hidden layers, while deep networks can have as many as 150 layers.
- Deep learning models are trained using large sets of labeled data and neural network architectures that learn features directly from the data, without the need for manual feature extraction.
Limitations
- Deep learning requires huge amounts of training data.
- Deep learning requires extensive computing power.
- Architectures can be complex and must often be highly tailored to a
specific application.
- The resulting models may not be easily interpretable.
Implementation
- GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units) are exploited for heavy computations.
- For low-end processing, FPGAs (Field-Programmable Gate Arrays) and CPUs are used.
Machine Learning (ML):
- It relies on feature extraction from the input images. In the Cat vs. Dog case, features may be whiskers, ears, eyes, etc. The feature extraction parameters are defined by us.
- On the basis of these features, a classifier gives the output.
Deep Learning (DL):
- Deep learning takes ML one step further: it automatically finds the features that are important for classification.
Convolutional Neural Network (CNN)
One of the most popular types of deep neural networks is the convolutional neural network (CNN or ConvNet). A CNN convolves learned features with the input data and uses 2D convolutional layers, making this architecture well suited to processing 2D data such as images and sounds.
Convolution
- Convolution puts the input images through a set of convolutional filters, each of which activates certain features from the images.
- Pooling simplifies the output by performing nonlinear down-sampling, reducing the number of parameters that the network needs to learn.
- The rectified linear unit (ReLU) allows for faster and more effective training by mapping negative values to zero (see the numeric sketch below).
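To make these three steps concrete, here is a small numeric MATLAB sketch on a toy 5x5 image (all values are illustrative; note that conv2 flips the filter, i.e., true convolution, whereas CNN layers use cross-correlation, but the sliding-window idea is the same):

A = magic(5);                 % toy 5x5 image
K = [1 0; 0 -1];              % 2x2 convolutional filter
C = conv2(A, K, 'valid');     % convolution: 4x4 feature map
R = max(C, 0);                % ReLU: negative values mapped to zero
% 2x2 max pooling with stride 2: keep the largest value in each block
P = zeros(2, 2);
for i = 1:2
    for j = 1:2
        block = R(2*i-1:2*i, 2*j-1:2*j);
        P(i,j) = max(block(:));
    end
end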
SOME CNN STRUCTURES
ALEXNET
- It is a simple yet powerful network architecture with 23 layers.
- It utilizes a CNN with convolution, ReLU, pooling, and fully connected classification layers.
- The key feature of AlexNet is training on a GPU architecture, which speeds up training.
VGG Net
- The VGG network was introduced by researchers at the Visual Geometry Group at Oxford.
- This network is specially characterized by its pyramidal shape: the bottom layers, which are closer to the image, are wider, whereas the top layers are narrower.
- It has 19 layers and is slow to train.
SOME CNN STRUCTURES
GoogLeNet
- GoogLeNet (or the Inception network) is a class of architectures designed by researchers at Google.
- GoogLeNet was the winner of ImageNet 2014, where it proved to be a powerful model.
- It has 144 layers.
- It offers a parallel architecture, a drastic change from the sequential architectures of previously used models.
ResNet - Residual Networks
- ResNet is one of the monster architectures that truly defines how deep a deep learning architecture can be.
- It uses 152 layers.
- Residual Networks (ResNet for short) consist of multiple subsequent residual modules, which are the basic building blocks of the ResNet architecture.
SOME CNN STRUCTURES
RCNN (Region-Based CNN)
- The Region-Based CNN architecture is said to be among the most influential of all deep learning architectures.
- What RCNN does is attempt to draw a bounding box around every object present in the image, and then recognize which object is in each box.
SOME CNN STRUCTURES
YOLO (You Only Look Once)
- YOLO is a state-of-the-art real-time system built on deep learning for solving image detection problems.
- It first divides the image into defined bounding boxes, and then runs a recognition algorithm in parallel over all of these boxes to identify the object class in each.
- After identifying these classes, it intelligently merges the boxes to form an optimal bounding box around each object.
How to Create and Train Deep Learning Models
Transfer Learning
- A process that involves fine-tuning a pretrained model.
- We start with an existing network, such as AlexNet or GoogLeNet, and feed in new data containing previously unknown classes.
- After making some tweaks to the network, we can perform a new task, such as categorizing only dogs and cats instead of 1,000 different objects (see the sketch after this list).
- This also has the advantage of needing much less data (processing thousands of images rather than millions), so computation time drops to minutes or hours.
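A minimal MATLAB sketch of this workflow, assuming the Deep Learning Toolbox with the AlexNet support package is installed and that imdsTrain is an imageDatastore of the new classes (the training settings here are assumptions; inspect net.Layers for the exact layer positions):

net = alexnet;                 % pretrained on 1000 ImageNet classes
layers = net.Layers;
numNewClasses = 2;             % e.g., dog vs. cat
% Swap the final fully connected and classification layers so the
% network predicts the new classes instead of the original 1000.
layers(end-2) = fullyConnectedLayer(numNewClasses, 'Name', 'FC_new');
layers(end)   = classificationLayer('Name', 'Output_new');
% AlexNet expects 227x227x3 input; an augmentedImageDatastore resizes.
augTrain = augmentedImageDatastore([227 227], imdsTrain, 'ColorPreprocessing', 'gray2rgb');
options = trainingOptions('sgdm', 'MaxEpochs', 5, 'MiniBatchSize', 32);
newNet = trainNetwork(augTrain, layers, options);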
How to Implement Deep Learning Models
Design CNN example
- Using the MNIST data previously introduced in the last lecture.
- The test data set has 10,000 grayscale images:
- 10 folders, with 1,000 images in each
- 28x28 pixels each
- Make sure the folders are in the right path.
Design CNN example
- Check the path and make sure it is accessible.
- For training, use 750 images from each folder; the total number of training images is therefore 7,500.
- The remaining 250 images in each folder are used for validation; the total number of validation images is therefore 2,500.
- This distribution of images will be done by MATLAB code, not manually (see the sketch after this list).
- Test images will be selected manually.
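A minimal sketch of this split in MATLAB, using imageDatastore and splitEachLabel (the root folder name mnist_test is an assumption; substitute your actual path):

% Label each image by the name of the folder that contains it.
imds = imageDatastore('mnist_test', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
% 750 images per class go to training; the remaining 250 go to validation.
[imdsTrain, imdsValidation] = splitEachLabel(imds, 750, 'randomized');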
Design CNN example – Layers in the model
Layers in the model
Layers of a CNN
Image Input Layer: This is the layer where we specify the image size.
imageInputLayer([M N n], 'Name', 'Input')
Ex. imageInputLayer([28 28 1], 'Name', 'Input')
Layers in the model
Layers of a CNN
Batch Normalization Layer: It normalizes the activations and gradients, making network training an easier optimization problem; it speeds up network training and reduces the sensitivity to network initialization.
batchNormalizationLayer('Name','BN_1')
ReLU Layer
The most common nonlinear activation function is the Rectified Linear Unit (ReLU).
reluLayer('Name','Relu_1')
Layers in the model
Max Pooling Layer
It is a down-sampling operation that reduces the spatial size of the feature map and removes redundant spatial information.
maxPooling2dLayer(PoolSize,'Stride',n,'Name','MaxPool_1')
maxPooling2dLayer(2,'Stride',2,'Name','MaxPool_1')
Fully Connected Layer
The last fully connected layer combines the features to classify the images.
fullyConnectedLayer(outputSize,Name,Value)
fullyConnectedLayer(10,'Name','FC')
Layers in the model
Softmax Layer:
The softmax activation function normalizes the output of the fully connected layer; the output of the softmax layer consists of positive numbers that sum to one, which can then be used as classification probabilities by the classification layer (see the numeric sketch below).
softmaxLayer('Name', Name)
softmaxLayer('Name', 'SoftMax')
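As a quick numeric illustration (the raw scores are made-up values):

z = [2.0 1.0 0.1];            % raw outputs of the fully connected layer
p = exp(z) ./ sum(exp(z));    % softmax: p is approx. [0.66 0.24 0.10], sum(p) is 1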
Classification Layer:
The final layer is the classification layer. This layer uses the probabilities returned by the softmax activation function for each input to assign the input to one of the mutually exclusive classes and to compute the loss.
classificationLayer('Name',Name)
classificationLayer('Name','Output Classification')
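Putting the pieces together, a minimal layer array for the 28x28 MNIST example might look like the sketch below. The convolution2dLayer (eight 3x3 filters) is an assumed addition, since a CNN needs at least one convolutional layer between the input and the classifier:

layers = [
    imageInputLayer([28 28 1], 'Name', 'Input')
    convolution2dLayer(3, 8, 'Padding', 'same', 'Name', 'Conv_1')  % assumed filter size/count
    batchNormalizationLayer('Name', 'BN_1')
    reluLayer('Name', 'Relu_1')
    maxPooling2dLayer(2, 'Stride', 2, 'Name', 'MaxPool_1')
    fullyConnectedLayer(10, 'Name', 'FC')
    softmaxLayer('Name', 'SoftMax')
    classificationLayer('Name', 'Output Classification')];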
Layers in the model
Training Parameters
trainingOptions(solverName, Name, Value)
trainingOptions('sgdm', 'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.2, 'LearnRateDropPeriod', 5, ...
    'MaxEpochs', 20, 'MiniBatchSize', 64, 'Plots', 'training-progress')
SolverName:
• 'sgdm': Stochastic Gradient Descent with Momentum (SGDM) optimizer. It requires a momentum rate.
• 'rmsprop': RMSProp optimizer. It requires the decay rate of the squared-gradient moving average.
• 'adam': Adam optimizer. It requires the decay rates of the gradient and squared-gradient moving averages.
Hardware Options:
'ExecutionEnvironment' - Hardware resource for training the network: 'auto' (default) | 'cpu' | 'gpu' | 'multi-gpu' | 'parallel'
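With the datastores and layer array from the earlier sketches, training is then a single call (a sketch; variable names follow the previous snippets, and ValidationData is an assumed addition):

options = trainingOptions('sgdm', ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.2, ...
    'LearnRateDropPeriod', 5, ...
    'MaxEpochs', 20, ...
    'MiniBatchSize', 64, ...
    'ValidationData', imdsValidation, ...
    'Plots', 'training-progress');
net = trainNetwork(imdsTrain, layers, options);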
Testing
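A minimal testing sketch, reusing net and imdsValidation from the earlier snippets (the test file name is an assumption):

img = imread('test_digit.png');     % a 28x28 grayscale test image
label = classify(net, img)          % predicted class for one image
% Overall accuracy on the held-out validation images:
predicted = classify(net, imdsValidation);
accuracy = mean(predicted == imdsValidation.Labels)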
Faculty of Mechanical Engineering
Vo Nhu Thanh, Ph.D., Senior Lecturer