NN 08: Neural Networks - Lecture 8
Agenda
• Loss Function
• Hyperparameters
• Regularization for good generalization
  • Dropout
  • Data augmentation
  • DropConnect
  • Reduce the number of parameters
  • Weight decay
• CNN Applications
  • Object Classification
  • Different Datasets for Object Recognition
  • Different CNN Architectures for Object Recognition
    • AlexNet, ZFNet, VGGNet, GoogLeNet, ResNet
Important Components of a Neural Network apart from the neurons
• Activation functions. Transform the weighted sum of inputs plus bias at each layer and
add non-linearity to the model.
• Loss function (cost function, objective function, error function). Measures how
well the NN reproduces the experimental training data.
• Optimization algorithm. Finds weight and bias values that minimize (locally) the
loss function.
Deep learning neural networks are trained using the stochastic gradient descent
optimization algorithm.
• Hyperparameters. Settings that are difficult to optimize (LR, momentum
term, # of hidden layers, etc.). They are fixed before training and are not learned.
• Regularization techniques. Prevent over-fitting of the NN to the training data.
Loss Function
• A loss function tells us how good our current classifier is.
• The loss is calculated with the loss function by comparing the target (actual) value with the value
predicted by the neural network.
• We then use gradient descent to update the weights of the neural network so that the loss is
minimized. This is how we train a neural network.
• The loss function is used to estimate the loss of the model so that the weights can be updated to reduce the
loss on the next evaluation.
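To make this loss-then-update loop concrete, here is a minimal sketch with a toy linear model and made-up data (all names and values below are illustrative, not from the lecture):

import numpy as np

# Toy regression data (illustrative only): 4 samples, 3 features
X = np.random.randn(4, 3)
y = np.random.randn(4, 1)

w = np.zeros((3, 1))   # weights
b = 0.0                # bias
lr = 0.1               # learning rate (a hyperparameter)

for step in range(100):
    y_pred = X @ w + b                    # forward pass: predicted values
    loss = np.mean((y_pred - y) ** 2)     # loss: compare predicted vs. target values
    grad = 2.0 * (y_pred - y) / len(y)    # gradient of the loss w.r.t. the predictions
    w -= lr * (X.T @ grad)                # gradient descent update of the weights
    b -= lr * grad.sum()                  # ... and of the bias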
The choice of Loss Function
• Regression Loss Functions
  • Mean Squared Error Loss
  • Mean Squared Logarithmic Error Loss
  • Mean Absolute Error Loss
• Binary Classification Loss Functions
  • Binary Cross-Entropy
  • Hinge Loss
  • Squared Hinge Loss
• Multi-Class Classification Loss Functions
  • Multi-Class Cross-Entropy Loss
  • Sparse Multiclass Cross-Entropy Loss
  • Kullback-Leibler Divergence Loss
Cross-entropy and mean squared error are the two main types of loss functions to use
when training neural network models.
Reference: https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/
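As a rough illustration of those two main families, here is a hedged NumPy sketch of mean squared error (regression) and multi-class cross-entropy (classification); the function and variable names are illustrative:

import numpy as np

def mse(y_true, y_pred):
    # Mean Squared Error (regression): average squared difference
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true_onehot, y_prob, eps=1e-12):
    # Multi-class cross-entropy: -sum of true * log(predicted probability)
    y_prob = np.clip(y_prob, eps, 1.0)
    return -np.mean(np.sum(y_true_onehot * np.log(y_prob), axis=1))

# Example: 2 samples, 3 classes
y_true = np.array([[1, 0, 0], [0, 1, 0]])
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
print(cross_entropy(y_true, y_prob))   # small loss -> confident, correct predictions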
Hyperparameters
• Hyperparameters are settings that are difficult to optimize,
i.e., it is not appropriate to learn them on the training set (see the sketch after the examples below).
• Examples:
• Network architecture
• Learning rate
• Filter size for convolution layer
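A hedged sketch of how such settings are typically fixed before training begins (the specific values below are purely illustrative, not recommendations from the lecture):

# Hyperparameters are chosen before training; they are not learned from the training set
hyperparams = {
    "learning_rate": 1e-2,    # LR
    "momentum": 0.9,          # momentum term
    "num_hidden_layers": 3,   # part of the network architecture
    "conv_filter_size": 3,    # filter size for convolution layers
    "batch_size": 128,
    "dropout_prob": 0.5,
    "weight_decay": 5e-4,
}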
Regularization for Good
Generalization
Regularization
• Regularization is any modification we make to a learning algorithm
that is intended to reduce its generalization error but not its training
error.
i.e., any method that prevents over-fitting or helps the optimization.
Under- and Over-fitting
• Under- and over-fitting are factors determining how well an ML algorithm will
perform, i.e., its ability to:
1. Make the training error small
2. Make the gap between training and test errors small
• Underfitting
Inability to obtain a low enough error rate on the training
set.
• Overfitting
The gap between training error and testing error is too large.
Regularization Strategies
1. Parameter Norm Penalties (L2- and L1-regularization)
2. Norm Penalties as Constrained Optimization
3. Regularization and Under-constrained Problems
4. Data Set Augmentation
5. Noise Robustness
6. Semi-supervised learning
7. Multi-task learning
8. Early Stopping
9. Parameter tying and parameter sharing
10. Sparse representations
11. Bagging and other ensemble methods
12. Dropout
13. Adversarial training
14. Tangent methods
The best-performing models on most benchmarks use some or all of these tricks.
Regularization: Dropout
See : http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
Dropout was used for training of fully connected layers.
Training:
• Setting to 0 the output of each hidden neuron with probability 0.5 (50%).
• The neurons which are “dropped out” in this way
• do not contribute to the forward pass
• and do not participate in back-propagation.
• So, every time an input is presented, the neural network samples a different
architecture, but all these architectures share weights.
Test:
At test time, we use all the neurons.
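A minimal NumPy sketch of this behaviour, following the AlexNet-paper convention of dropping neuron outputs at train time and using all neurons with rescaled outputs at test time (names are illustrative):

import numpy as np

def dropout_forward(h, p_drop=0.5, train=True):
    if train:
        # Set the output of each hidden neuron to 0 with probability p_drop;
        # a new mask is sampled for every input, so each pass sees a different architecture
        mask = (np.random.rand(*h.shape) >= p_drop)
        return h * mask
    # Test time: use all the neurons, scaling outputs by (1 - p_drop)
    # to match the expected activation seen during training
    return h * (1.0 - p_drop)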
Regularization: DropConnect
• Training: Drop connections between neurons (set weights to 0)
• Testing: Use all the connections.
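For contrast, a hedged sketch of DropConnect for a single fully connected layer, masking individual weights rather than whole neuron outputs (illustrative code, not from the lecture):

import numpy as np

def dropconnect_forward(x, W, b, p_drop=0.5, train=True):
    if train:
        # Drop connections: set a random subset of weights to 0 for this forward pass
        mask = (np.random.rand(*W.shape) >= p_drop)
        return x @ (W * mask) + b
    # Test time: use all the connections (as on the slide; some formulations also rescale)
    return x @ W + b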
Weight Decay
• Encourages the weights to be small in magnitude.
• Weight decay is an additional term in the weight update rule that causes
the weights to decay exponentially toward zero.
• When training neural networks, it is common to use "weight decay," where after each
update the weights are multiplied by a factor slightly less than 1. This prevents
the weights from growing too large.
• We regularize the cost function by changing it to
  E_new(w) = E(w) + (λ/2) Σ w²
i.e., adding a penalty equal to the sum of the squared values of the weights (this is called L2 regularization).
The regularization parameter λ determines how you trade off the original cost E against the
penalization of large weights.
The gradient descent update becomes w ← w − η ∂E/∂w − ηλw; the new term −ηλw causes each
weight to decay in proportion to its size.
When the regularization hyperparameter λ increases, the weights are pushed
toward smaller values (closer to 0).
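A minimal sketch of the update rule above, showing how the extra −ηλw term shrinks each weight in proportion to its size (function and variable names are illustrative):

def sgd_step_with_weight_decay(w, grad_E, lr=0.01, lam=5e-4):
    # Regularized cost: E_new(w) = E(w) + (lam / 2) * sum(w**2)
    # Its gradient adds lam * w, giving the extra -lr * lam * w term below
    return w - lr * grad_E - lr * lam * w

Equivalently, each update first multiplies w by (1 − lr·lam), a factor slightly less than 1, before applying the usual gradient step.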
CNN Applications
Object Recognition
Object Recognition
• Object recognition is the task of identifying which object category
is present in an image.
• It's challenging because objects can differ widely in
position, size, shape, appearance, etc.,
and we have to deal with occlusions, lighting changes, etc.
• Object recognition:
  • has direct applications to image search;
  • is closely related to object detection, the task of locating all
    instances of an object in an image,
    e.g., a self-driving car detecting pedestrians or stop signs.
CNNs for Recognition or Classification: Feature
Learning
CNNs: Training with Backpropagation
Recognition Datasets
Recognition Datasets
• In order to train and evaluate a machine learning system, we need to collect a
dataset. The design of the dataset can have major implications.
• Some questions to consider:
• Which categories to include?
• Where should the images come from?
• How many images to collect?
• How to normalize (preprocess) the images?
• During the last two decades:
• Datasets have gotten much larger (because of digital cameras and the
Internet)
• Computers got much faster
• Graphics processing units (GPUs) turned out to be really good at training
big neural nets; they're generally about 30 times faster than CPUs.
MNIST Dataset
ImageNet Dataset
• ImageNet is the modern object recognition
benchmark dataset. It was introduced in
2009, and has led to amazing progress in
object recognition since then.
• ImageNet is a dataset of over 15 million
labeled high-resolution images belonging
to roughly 22,000 categories. The images
were collected from the web and labeled
by human labelers
ImageNet, cont.
• Used for the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an
annual benchmark competition for object recognition, first held in 2010.
• ILSVRC uses a subset of ImageNet with roughly 1000 images in each of 1000
categories. In all, there are roughly 1.2 million training images, 50,000
validation images, and 150,000 testing images.
Different CNN
Architectures
LeNet
[LeCun et al., 1998]
AlexNet
Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012.
AlexNet
Input: 227x227x3 images
First layer (CONV1): 96 11x11 filters applied at stride 4
• Q: What is the output volume size? Hint: (227-11)/4+1 = 55
  Output volume: [55x55x96]
• Q: What is the total number of parameters in this layer?
  Parameters: (11*11*3)*96 ≈ 35K
After CONV1: 55x55x96
Second layer (POOL1): 3x3 filters applied at stride 2
• Q: What is the output volume size? Hint: (55-3)/2+1 = 27
  Output volume: 27x27x96
• Q: What is the number of parameters in this layer?
  Parameters: 0!
After POOL1: 27x27x96
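The hints above use the standard convolution output-size formula. A small sketch with hypothetical helper names that reproduces these numbers:

def conv_output_size(input_size, filter_size, stride, pad=0):
    # Standard formula: (W - F + 2P) / S + 1
    return (input_size - filter_size + 2 * pad) // stride + 1

def conv_num_params(filter_size, in_depth, num_filters):
    # One weight per filter element per input channel (biases ignored, as on the slide)
    return filter_size * filter_size * in_depth * num_filters

print(conv_output_size(227, 11, 4))   # 55    -> CONV1 output volume is 55x55x96
print(conv_num_params(11, 3, 96))     # 34848 -> roughly 35K parameters
print(conv_output_size(55, 3, 2))     # 27    -> POOL1 output volume is 27x27x96, 0 parameters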
AlexNet
Full (simplified) AlexNet architecture:
[227x227x3] INPUT
[55x55x96] CONV1: 96 11x11 filters at stride 4, pad 0
[27x27x96] MAX POOL1: 3x3 filters at stride 2
[27x27x96] NORM1: Normalization layer
[27x27x256] CONV2: 256 5x5 filters at stride 1, pad 2
[13x13x256] MAX POOL2: 3x3 filters at stride 2
[13x13x256] NORM2: Normalization layer
[13x13x384] CONV3: 384 3x3 filters at stride 1, pad 1
[13x13x384] CONV4: 384 3x3 filters at stride 1, pad 1
[13x13x256] CONV5: 256 3x3 filters at stride 1, pad 1
[6x6x256] MAX POOL3: 3x3 filters at stride 2
[4096] FC6: 4096 neurons
[4096] FC7: 4096 neurons
[1000] FC8: 1000 neurons (class scores)

Details/Retrospectives:
- first use of ReLU
- used Norm layers (not common anymore)
- heavy data augmentation
- dropout 0.5
- batch size 128
- SGD Momentum 0.9
- Learning rate 1e-2, reduced by 10 manually when val accuracy plateaus
- L2 weight decay 5e-4
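The same layer list can be written as a hedged PyTorch-style sketch (a rough reconstruction for illustration only; the original implementation's grouping, normalization, and padding details differ in places):

import torch.nn as nn

alexnet = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),                  # CONV1: 227x227x3 -> 55x55x96
    nn.MaxPool2d(kernel_size=3, stride=2),                                  # POOL1: -> 27x27x96
    nn.LocalResponseNorm(5),                                                # NORM1
    nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2), nn.ReLU(),      # CONV2: -> 27x27x256
    nn.MaxPool2d(kernel_size=3, stride=2),                                  # POOL2: -> 13x13x256
    nn.LocalResponseNorm(5),                                                # NORM2
    nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1), nn.ReLU(),     # CONV3: -> 13x13x384
    nn.Conv2d(384, 384, kernel_size=3, stride=1, padding=1), nn.ReLU(),     # CONV4: -> 13x13x384
    nn.Conv2d(384, 256, kernel_size=3, stride=1, padding=1), nn.ReLU(),     # CONV5: -> 13x13x256
    nn.MaxPool2d(kernel_size=3, stride=2),                                  # POOL3: -> 6x6x256
    nn.Flatten(),
    nn.Linear(6 * 6 * 256, 4096), nn.ReLU(), nn.Dropout(0.5),               # FC6
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),                      # FC7
    nn.Linear(4096, 1000),                                                  # FC8: class scores
)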
ImageNet Large Scale Visual Recognition Challenge
(ILSVRC) winners
ZFNet: improved hyperparameters over AlexNet
ZFNet
[Zeiler and Fergus, 2013]
• It is AlexNet but:
CONV1: change from (11x11 stride 4) to (7x7 stride 2)
CONV3,4,5: instead of 384, 384, 256 filters use 512, 1024, 512
• Top-5 error in ILSVRC’13: 11.7%
ImageNet Large Scale Visual Recognition Challenge
(ILSVRC) winners
Deeper Networks
VGGNet
[K. Simonyan and A. Zisserman, University of Oxford, 2014]
VGGNet
[K. Simonyan and A. Zisserman, University of Oxford, 2014]
VGG 16
ImageNet Large Scale Visual Recognition Challenge
(ILSVRC) winners
Deeper Networks
GoogLeNet
[Szegedy et al., 2014]
• 22 layers.
• No fully connected (FC) layers.
• Convolutions are broken down into a bunch of smaller convolutions (since this
requires fewer parameters in total).
• GoogLeNet has only 5 million parameters, compared with 60 million for AlexNet:
12x less than AlexNet.
• Top-5 error in ILSVRC'14: 6.7% test error on ImageNet.
"Inception module": design a good local network topology (a network within a network)
and then stack these modules on top of each other.
GoogLeNet
Q: What is the problem with this? Computational complexity!
Conv Ops:
[1x1 conv, 128] 28x28x128x1x1x256
[3x3 conv, 192] 28x28x192x3x3x256
[5x5 conv, 96]  28x28x96x5x5x256
Total: 854M ops
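A quick sanity check of the arithmetic behind these op counts (a rough multiply count only, assuming 28x28 output maps on a 256-channel input as in the example):

def conv_ops(out_h, out_w, num_filters, k, in_depth):
    # Multiplications: every output position x every filter x (k*k*in_depth) weights
    return out_h * out_w * num_filters * k * k * in_depth

total = (conv_ops(28, 28, 128, 1, 256)     # 1x1 conv, 128 ->  ~26M ops
         + conv_ops(28, 28, 192, 3, 256)   # 3x3 conv, 192 -> ~347M ops
         + conv_ops(28, 28, 96, 5, 256))   # 5x5 conv, 96  -> ~482M ops
print(total)                               # ~854M ops for the naive module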
Reminder: 1x1 convolutions
GoogLeNet
Solution: “bottleneck” layers that use 1x1 convolutions to reduce feature depth.
GoogLeNet
Using the same parallel layers as the naive example, and adding "1x1 conv, 64 filter" bottlenecks:
Conv Ops:
[1x1 conv, 64]  28x28x64x1x1x256
[1x1 conv, 64]  28x28x64x1x1x256
[1x1 conv, 128] 28x28x128x1x1x256
[3x3 conv, 192] 28x28x192x3x3x64
[5x5 conv, 96]  28x28x96x5x5x64
[1x1 conv, 64]  28x28x64x1x1x256
Total: 358M ops
• The bottleneck can also reduce depth after a pooling layer.
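As an illustration of the bottleneck idea, here is a hedged PyTorch-style sketch of an Inception-style module (channel counts follow the example above; this is not the exact GoogLeNet module):

import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    # Parallel 1x1 / 3x3 / 5x5 conv branches plus pooling, with 1x1 "bottleneck"
    # convolutions reducing feature depth before the expensive 3x3 and 5x5 convs.
    def __init__(self, in_ch=256):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 128, kernel_size=1)
        self.branch3 = nn.Sequential(nn.Conv2d(in_ch, 64, kernel_size=1),      # bottleneck
                                     nn.Conv2d(64, 192, kernel_size=3, padding=1))
        self.branch5 = nn.Sequential(nn.Conv2d(in_ch, 64, kernel_size=1),      # bottleneck
                                     nn.Conv2d(64, 96, kernel_size=5, padding=2))
        self.branch_pool = nn.Sequential(nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                                         nn.Conv2d(in_ch, 64, kernel_size=1))  # reduce depth after pooling

    def forward(self, x):
        # Concatenate branch outputs along the channel dimension: 128 + 192 + 96 + 64 = 480
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)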
GoogLeNet
[Szegedy et al., 2014]
• 22 layers.
• Efficient Inception module.
• Avoids expensive FC layers.
• 12x less parameters than AlexNet.
• Top-5 error in ILSVRC'14: 6.7% test error on ImageNet.
ResNet
[He et al., 2015]
• What happens when we continue stacking deeper layers on a “plain” convolutional
neural network?
ResNet
[He et al., 2015]
Hypothesis: the problem is an optimization problem; deeper models are harder to
optimize.
• The deeper model should be able to perform at least as well as the shallower
model.
• A solution by construction is copying the learned layers from the shallower model
and setting the additional layers to identity mappings.
Solution: Use network layers to fit a residual mapping instead of directly trying to
fit the desired underlying mapping.
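A hedged PyTorch-style sketch of one basic residual block: the stacked 3x3 conv layers fit the residual F(x), and the identity shortcut adds x back, so the block outputs F(x) + x (illustrative only; downsampling and projection shortcuts are omitted):

import torch.nn as nn

class ResidualBlock(nn.Module):
    # Two 3x3 conv layers fit the residual mapping F(x); the shortcut carries x unchanged
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(residual + x)    # output = F(x) + x (identity shortcut)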
ResNet
[He et al., 2015]
Full ResNet architecture:
- Stack residual blocks
- Every residual block has two 3x3 conv layers
- Periodically, double the number of filters and downsample spatially using stride 2
  (/2 in each dimension), e.g., going from "3x3 conv, 64 filters" to "3x3 conv, 128 filters, /2 spatially with stride 2"
- Additional conv layer at the beginning
- No FC layers besides the final FC 1000 to the output classes
- Global average pooling layer after the last conv layer
ResNet
[He et al., 2015]
Experimental Results:
- Able to train very deep networks
without degrading (152 layers on
ImageNet, 1202 on CIFAR)
- Deeper networks now achieve lower
training error, as expected
- Swept 1st place in all ILSVRC and
COCO 2015 competitions
ILSVRC 2015 classification winner (3.6%
top-5 error) -- better than "human
performance"! (Russakovsky 2014)
Comparing complexity...
- AlexNet: Smaller compute, still memory heavy, lower accuracy
- VGG: Highest memory, most operations
- GoogLeNet: Most efficient
- ResNet: Moderate efficiency depending on model, highest accuracy
- Inception-v4: ResNet + Inception!
Summary
• Loss Function
• Hyperparameters
• Regularization for good generalization
  • Dropout
  • Data augmentation
  • DropConnect
  • Reduce the number of parameters
  • Weight decay
• CNN Applications
  • Object Classification
  • Different Datasets for Object Recognition
  • Different CNN Architectures for Object Recognition
    • AlexNet, ZFNet, VGGNet, GoogLeNet, ResNet
Resources
1. Roger Grosse and Jimmy Ba, CSC421/2516 Winter 2019: Neural
Networks and Deep Learning, http://www.cs.toronto.edu.
2. Related lectures from CS231n @ Stanford.
http://cs231n.stanford.edu/
3. MIT 6.S191, Introduction to Deep Learning, 2020.
Thanks
for your attention…