Weight Dropout For Preventing Neural Networks From Overfitting
Forward propagation:
  Extract features: output of the pre-trained ResNet-50 model
  Batch normalization
  Non-linearity function: a = g(x)
  Randomly sample a mask tensor: M ~ Bernoulli(p)
  Convolutional layer: Y = (M ∗ W) x
  Compute class predictions: Ŷ = softmax(Y, Ws)
Backpropagation:
  Compute the derivative of the loss w.r.t. the parameters
  Update the softmax layer: Ws = Ws − α · ∂Loss/∂Ws
  Update the Weight Dropout layer: W = W − α · ∂Loss/∂W
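As a minimal sketch of the forward and update steps above (not the authors' implementation), the following NumPy code runs one training step with a fully connected layer standing in for the convolution; the shapes, keep probability p, and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def weight_dropout_step(x, y_true, W, Ws, p=0.5, lr=0.1):
    """One batch-gradient-descent step with weight dropout.

    x: input feature vector (d,), y_true: one-hot target (k,),
    W: weights of the dropped layer (h, d), Ws: softmax weights (k, h),
    p: keep probability of the Bernoulli mask, lr: learning rate.
    """
    # Forward pass: sample a fresh Bernoulli mask and apply it to the weights.
    M = rng.binomial(1, p, size=W.shape)    # M ~ Bernoulli(p)
    h = (M * W) @ x                         # Y = (M * W) x, linear stand-in for conv
    z = Ws @ h
    z = z - z.max()                         # numerical stability for softmax
    y_hat = np.exp(z) / np.exp(z).sum()     # class predictions

    # Backpropagation: cross-entropy through softmax gives dLoss/dz = y_hat - y_true.
    dz = y_hat - y_true
    dWs = np.outer(dz, h)                   # dLoss/dWs
    dW = np.outer(Ws.T @ dz, x) * M         # only mask-active weights receive gradient

    # Updates: Ws = Ws - lr * dLoss/dWs, W = W - lr * dLoss/dW.
    return W - lr * dW, Ws - lr * dWs, y_hat, M
```

Because a new mask M is drawn on every call, successive steps update different subsets of W, while weights masked out in a given step are left untouched by that step's update.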
Figure 6. Convolutional building blocks with weight dropout operation

Previous dropout methods were typically inserted right after the convolutional layer and the BN layer, which leads to intense fluctuations of the inputs produced by the BN layer. We attribute the failure of standard dropout to this incorrect placement of the dropout operations, and we suggest universal convolutional blocks that place the dropout operation before each convolution layer (see Fig. 6b). Incorporating the drop operation before the convolution operator leads to lower gradient variance and earlier convergence during training.

D. Optimization algorithm

Forward propagation of the model starts by choosing a training example from the training dataset. By running the training example through the ResNet-50 model, we extract features with frozen weight values. These features are the input to the Weight Dropout based convolutional layer. The BN and activation function layers have no learnable parameters. The output of the activation function is the input to the Weight Dropout layer, where a mask tensor is sampled from a distribution. Selecting a different mask tensor for each training example is the key to efficient training with the Weight Dropout layer. After a mask tensor is sampled, it is applied to the learnable parameters in order to compute the input to the convolutional layer. The resulting matrix of the convolutional layer is then flattened into a single column vector in the Dense layer. The predicting softmax layer takes this column vector and generates class predictions, after which the Cross-Entropy Loss function computes the error between the predicted and target values. The trainable parameters of the model are then optimized with batch gradient descent (BGD) by backpropagating the derivatives of the loss function with respect to the parameters. As mentioned earlier, the mask tensor in Weight Dropout should be different for each training step. During the backpropagation stage, only the elements that were active in the mask tensor during the forward pass are updated. The overall optimization procedure is given in Algorithm 1.

Algorithm 1. BGD Training with Weight Dropout

IV. EXPERIMENTAL RESULTS

We evaluate the accuracy of the proposed method on two types of computer vision tasks:

▪ Image Classification;
▪ Skin Lesion Segmentation.

The proposed method generalized successfully on all datasets.

A. MNIST

The MNIST handwritten digit classification dataset contains 28x28 grayscale images; every image represents one of 10 classes, a digit between zero and nine. Before feeding this dataset into the neural network, we up-sample the images while maintaining the aspect ratio and normalize each pixel value into the [0, 1] range. Table 1 shows the performance of various models in the fully connected layers. We use an initial learning rate of 0.1, which is reduced during training by means of callbacks.

TABLE 1. MNIST ACCURACY RATE FOR DIFFERENT DROPOUT MODELS

Model                 Training accuracy (%)   Validation accuracy (%)   Time (per epoch)
No-dropout            96.02                   85.26                     6 ms
Standard Dropout      94.57                   91.31                     14.2 ms
Max-pooling dropout   93.44                   90.54                     319 ms
Ising-dropout         92.85                   86.21                     9.7 ms
Proposed method       96.78                   94.51                     8.3 ms

If we increase the number of hidden layers, No-dropout overfits while the other models improve performance. Weight Dropout consistently gives higher validation accuracy compared to the other techniques.

B. CIFAR-10

CIFAR-10 is a dataset of ordinary tiny images consisting of 50,000 training images and 10,000 test images, each subset split into 10 classes. Because CIFAR-10 consists of tiny images, the accuracy rate on this dataset is not substantially higher than that of other up-to-date models. We present the model performance on the CIFAR-10 dataset in
Table 2. We used the PReLU activation function, as discussed in the experimental settings section.

TABLE 2. CIFAR-10 ACCURACY RATE FOR DIFFERENT DROPOUT MODELS

Model                 Training accuracy (%)   Validation accuracy (%)   Time (per epoch)
No-dropout            90.42                   81.62                     19 ms
Standard Dropout      89.21                   86.49                     43 ms
Max-pooling dropout   91.71                   87.98                     708 ms
Ising-dropout         88.54                   85.43                     28 ms
Proposed method       94.31                   91.68                     21 ms

C. Skin Lesion Segmentation

By comparing the model results with the no-dropout and standard dropout methods on a dermoscopic skin lesion segmentation dataset, we achieved competitive accuracy on the test set. The no-dropout method is exactly the U-Net model, while standard dropout is the method proposed in [11]. Figure 7 shows the overall pixel accuracy, Dice, and IoU scores of these three methods.

In future works, we will continue improving the dropout technique by choosing the dropped components not randomly but based on some information.

ACKNOWLEDGMENT

This study was supported by the BK21 Plus Project (SW Human Resource Development Program for Supporting Smart Life) funded by the Ministry of Education, School of Computer Science and Engineering, Kyungpook National University, Korea (21A20131600005). This work is also supported by the National Research Foundation of Korea, Grant Number 2020R1A2C1012196.

REFERENCES

[1] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” J. Mach. Learn. Res., vol. 15, pp. 1929–1958, 2014.
[2] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, “Improving neural networks by preventing co-adaptation of feature detectors,” pp. 1–18, 2012.
[3] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” NIPS, pp. 1106–1114, 2012.
[4] M. D. Zeiler and R. Fergus, “Stochastic pooling for regularization of deep convolutional neural networks,” 1st Int. Conf. Learn. Represent. ICLR 2013 - Conf. Track Proc., pp. 1–9, 2013.
[5] I. A. Popova and N. G. Stepanova, “Estimation of