CC511 Week 7 - Deep Learning

Deep learning uses neural networks to learn representations of data. Convolutional neural networks (CNNs) are commonly used for visual data. A CNN contains convolutional layers that learn hierarchical representations through local connections and weight sharing. It also uses pooling layers for downsampling, fully connected layers for classification, and may employ techniques like dropout to prevent overfitting. CNNs have achieved human-level performance on image recognition tasks.


CC 511 Artificial Intelligence

Deep Learning

Dr. Karma Fathalla


Lecture Highlights
• Deep Learning Applications
• Current Architectures
• Convolutional Neural Networks
DL and Data Science
• Scalability of neural networks - results get better
with more data and larger models, which in turn
require more computation to train.
DL and Feature Engineering
• Automated Feature Learning - ability to perform
automatic feature extraction from raw data.
• Hierarchical Feature Learning - ability to provide
different levels of abstraction of the data.
Traditional Recognition
Convolutional Neural Networks (CNN)
• A class of deep, feed-forward artificial neural
networks that are applied to analyzing visual
imagery.
Convolution Layer
• The # of output feature maps is usually larger than
the # of input feature maps.
Related terms
• Filter: a mask/window that holds the learned weights that
are convolved with the image. Its size specifies the patch or
receptive field of the image.
• Feature Map: the output of one filter applied to the
previous layer.
• Stride: the distance (number of rows and columns) that the
filter is moved across the input from its previous location.
• Padding: inventing mock inputs for the receptive field for
the filter to read, in case the filter is attempting to read off
the edge of the input feature map.
Spatial Dimensions
• 7x7 input (spatially) assume 3x3
filter => 5x5 output
• 7x7 input (spatially) assume 3x3
filter applied with stride 2 => 3x3
output!
• 7x7 input (spatially) assume 3x3
filter applied with stride 3? doesn’t
fit! cannot apply 3x3 filter on 7x7
input with stride 3.
Spatial Dimensions

• Output size: (N - F) / stride + 1
• e.g. N = 7, F = 3:
stride 1 => (7 - 3)/1 + 1 = 5
stride 2 => (7 - 3)/2 + 1 = 3
stride 3 => (7 - 3)/3 + 1 = 2.33 (does not fit)
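A quick check of the formula for the three examples above, using a small hypothetical helper (not part of the lecture code):

```python
# Output size of a convolution without padding: (N - F) / stride + 1.
def conv_output_size(N, F, stride):
    if (N - F) % stride != 0:
        raise ValueError(f"{F}x{F} filter does not fit a {N}x{N} input with stride {stride}")
    return (N - F) // stride + 1

print(conv_output_size(7, 3, 1))   # 5
print(conv_output_size(7, 3, 2))   # 3
# conv_output_size(7, 3, 3) -> ValueError: the filter does not fit
```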
Padding

• Input 7x7 and 3x3 filter, applied with stride 1, pad with a
1-pixel border => what is the output?
• 7x7 output!
• In general, it is common to see CONV layers with
stride 1, filters of size FxF, and zero-padding
with (F-1)/2. (will preserve size spatially)
F = 3 => zero pad with 1
F = 5 => zero pad with 2
F = 7 => zero pad with 3
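Extending the helper above to include padding (a standard generalization, not shown explicitly on the slide): with P pixels of zero padding the output size becomes (N - F + 2P) / stride + 1, so P = (F - 1)/2 with stride 1 preserves the spatial size.

```python
# Output size with zero padding; (F - 1) // 2 keeps a 7x7 input at 7x7.
def conv_output_size_padded(N, F, stride, pad):
    return (N - F + 2 * pad) // stride + 1

for F in (3, 5, 7):
    pad = (F - 1) // 2
    print(F, pad, conv_output_size_padded(7, F, 1, pad))   # always 7
```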
Weight Sharing
• Is the concept by which the CNN achieves translation
invariance.
• Based on the assumption that if one feature is useful to
compute at some spatial position (x1,y1), then it should also be
useful to compute at a different position (x2,y2).
• Is to constrain the neurons in each depth slice to use the same
weights and bias across the whole image.
• However, it is possible to relax the parameter sharing scheme,
and instead simply call the layer a Locally-Connected Layer.
Weight Sharing
• In practice, the weight update is performed
concurrently through parallelization algorithms and
special hardware called the Graphics Processing Unit
(GPU).
• GPUs have hundreds of simpler cores and thousands of
hardware threads that are applied to image regions
at the same time.
Number of parameters
• Input volume: 32x32x3; 10 5x5 filters with stride 1,
pad 2
• Number of parameters in this layer? Each filter has
5*5*3 + 1 = 76 params (+1 for bias) => 76*10 = 760
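The same count reproduced as a short Python calculation:

```python
# 10 filters of size 5x5 over a 3-channel input, each with one bias term.
num_filters, F, in_channels = 10, 5, 3
params_per_filter = F * F * in_channels + 1     # +1 for the bias
total_params = params_per_filter * num_filters
print(params_per_filter, total_params)          # 76 760
```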
Hierarchy of Convolution Layers
Activation Layer
• After each conv layer, it is conventional to apply a nonlinear function.
• In the past, nonlinear functions like tanh and sigmoid were used, but
researchers found that ReLU layers work far better because the
network is able to train a lot faster (because of the computational
efficiency) without making a significant difference to the accuracy. It also
helps to alleviate the vanishing gradient problem.
• Generalization would not be possible with a linear mapping, as in that case
a high level of abstraction/generalization could not be reached. Hence, to
map a class of images into a manifold of feature vectors, we need an
activation; without it, it would be really difficult to generalize, as pictures in
a class can have too much intra-class variation.
Activation Layer
• ReLU (Rectified Linear Unit): f(x) = max(0, x)
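A one-line NumPy sketch of ReLU: negative activations are clipped to zero, positive ones pass through unchanged.

```python
import numpy as np

def relu(x):
    # element-wise max(0, x)
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))   # [0. 0. 0. 1.5 3.]
```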
Pooling Layer
• It down-samples the previous layer’s feature map.
• Pooling layers follow a sequence of one or more convolutional layers.
• It may be considered as a technique to compress or generalize
feature representations and generally reduce the overfitting of the
training data by the model.
• They too have a receptive field, often much smaller than the
convolutional layer. Also, the stride or number of inputs that the
receptive field is moved for each activation is often equal to the size
of the receptive field to avoid any overlap.
• Pooling layers are often very simple, taking the average or the
maximum of the input values in order to form the new feature map.
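A minimal 2x2 max-pooling sketch with the stride equal to the pooling size, as described above (a NumPy illustration, assuming the input dimensions are divisible by 2):

```python
import numpy as np

def max_pool2x2(feature_map):
    # group the map into non-overlapping 2x2 blocks and keep the max of each
    H, W = feature_map.shape
    blocks = feature_map.reshape(H // 2, 2, W // 2, 2)
    return blocks.max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2x2(x))        # [[ 5.  7.]
                             #  [13. 15.]]
```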
Dropout Layer
• Probabilistically dropping out or ignoring nodes in the network is a
simple and effective regularization method.
• It offers a very computationally cheap and remarkably effective
regularization method to reduce overfitting and improve
generalization error in deep neural networks of all kinds.
• Dropout has the effect of making the training process noisy, forcing
nodes within a layer to probabilistically take on more or less
responsibility for the inputs.
• It encourages the network to actually learn a sparse representation.
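A sketch of (inverted) dropout at training time, one common formulation: each node is kept with probability keep_prob and the survivors are rescaled so the expected activation stays the same.

```python
import numpy as np

def dropout(activations, keep_prob=0.5, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    mask = rng.random(activations.shape) < keep_prob   # random keep/drop mask
    return activations * mask / keep_prob              # rescale the kept nodes

a = np.ones(8)
print(dropout(a, keep_prob=0.5))   # roughly half the entries are zeroed
```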
Fully Connected Layer
• The normal flat feed-forward neural network layer.
• It is preceded by a flattening procedure.
• Contains neurons that connect to the entire input volume, as in ordinary
Neural Networks.
• Spatial information is lost at this stage.
• These layers may have a non-linear activation function or a softmax
activation in order to output probabilities of class predictions.
• Fully connected layers are used at the end of the network after feature
extraction and consolidation has been performed by the convolutional and
pooling layers.
• They are used to create final non-linear combinations of features and for
making predictions by the network.
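A flatten + fully connected layer is just an affine map on the vectorized feature maps; the NumPy sketch below uses made-up shapes for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_maps = rng.random((8, 4, 4))   # 8 pooled feature maps of size 4x4
x = feature_maps.reshape(-1)           # flatten -> vector of length 128 (spatial layout lost)
W = rng.random((10, x.size))           # 10 output neurons, each connected to the whole input
b = rng.random(10)
logits = W @ x + b                     # one score per class
print(logits.shape)                    # (10,)
```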
Softmax Layer
• A softmax function is a type of squashing function that limits its
output to the range 0 to 1.
• This allows the output to be interpreted directly as a probability.
Softmax functions are multi-class sigmoids, meaning they
are used to determine the probabilities of multiple classes at once.
• Since the outputs of a softmax function can be interpreted as
probabilities, a softmax layer is typically the final layer of the neural
network.
• It is important to note that a softmax layer must have the same
number of nodes as the output layer.
• It allows for the calculation of the error.
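A numerically stable softmax in NumPy: subtracting the maximum logit does not change the result but avoids overflow in the exponentials.

```python
import numpy as np

def softmax(logits):
    z = logits - np.max(logits)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())                 # probabilities in (0, 1) that sum to 1
```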
Transfer Learning

• A technique which reuses a finished Deep Learning
model in another, more specific task.
• A pretrained CNN is used to process data from a different
dataset than the one it was trained on.
• The learned parameters are used as they are.
• Sometimes, some further training to fine-tune the CNN is
used. Also, some adaptation of the architecture might be
involved.
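A hedged transfer-learning sketch, assuming TensorFlow/Keras is available (not part of the lecture code): a CNN pretrained on ImageNet is reused as a frozen feature extractor and only a new classification head is trained on the target dataset.

```python
import tensorflow as tf

num_classes = 5                                   # assumed number of target-task classes
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False                            # keep the learned parameters as they are

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),  # new head for the new task
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Optionally unfreeze some top layers afterwards to fine-tune the CNN.
```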
Data Augmentation
• Artificially making the dataset larger
• By using a collection of simple image transformations
on the already included images to yield new ones,
such as: grayscale conversion, horizontal flips, vertical flips,
random crops, color jitter, translations, rotations.
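A few of the listed transformations written as plain NumPy operations (a sketch, assuming images are H x W x 3 arrays; function names are illustrative):

```python
import numpy as np

def horizontal_flip(img):
    return img[:, ::-1, :]                 # mirror left-right

def vertical_flip(img):
    return img[::-1, :, :]                 # mirror top-bottom

def random_crop(img, size, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    H, W, _ = img.shape
    top = rng.integers(0, H - size + 1)
    left = rng.integers(0, W - size + 1)
    return img[top:top + size, left:left + size, :]

def grayscale(img):
    return img.mean(axis=2, keepdims=True).repeat(3, axis=2)

img = np.random.rand(32, 32, 3)
augmented = [horizontal_flip(img), vertical_flip(img),
             random_crop(img, 24), grayscale(img)]
```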
Shortcomings of CNNs
• A black box: operates in the paradigm of non-
explainable AI, with the exception of visualization of
output structures at intermediate levels
• The application of CNNs in unsupervised settings is still
lagging behind
• Limitations to context reasoning
• Not invariant to other affine and non-affine
transformations
Famous CNNs Listing
• LeNet. The first successful applications of Convolutional Networks were developed by Yann LeCun in the 1990s.
• AlexNet. The first work that popularized Convolutional Networks in Computer Vision. The AlexNet was submitted
to the ImageNet ILSVRC challenge in 2012 and significantly outperformed the runner-up (top-5 error of
16% compared to the runner-up with 26% error). The network had a very similar architecture to LeNet, but was
deeper, bigger, and featured Convolutional Layers stacked on top of each other.
• ZF Net. The ILSVRC 2013 winner. It was an improvement on AlexNet by tweaking the architecture hyperparameters, in
particular by expanding the size of the middle convolutional layers and making the stride and filter size on the first
layer smaller.
• GoogLeNet. The ILSVRC 2014 winner was a Convolutional Network from Szegedy et al. from Google. Its main
contribution was the development of an Inception Module that dramatically reduced the number of parameters in
the network (4M, compared to AlexNet with 60M). Additionally, this paper uses Average Pooling instead of Fully
Connected layers at the top of the ConvNet, eliminating a large amount of parameters that do not seem to matter
much.
• VGGNet. The runner-up in ILSVRC 2014. Its main contribution was in showing that the depth of the network is a critical
component for good performance. Their final best network contains 16 CONV/FC layers and, appealingly, features
an extremely homogeneous architecture that only performs 3x3 convolutions and 2x2 pooling from the beginning
to the end.
• ResNet. Residual Network was the winner of ILSVRC 2015. It features special skip connections and a heavy use of
batch normalization. The architecture is also missing fully connected layers at the end of the network.
