CNN Architectures
CNN Architectures
CNN Architectures
LeNet-5, from the paper Gradient-Based Learning Applied to Document Recognition, is a very efficient
convolutional neural network for handwritten character recognition.
Authors: Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner
LeNet-5 Total seven layer , does not comprise an input, each containing a trainable parameters; each layer
has a plurality of the Map the Feature , a characteristic of each of the input FeatureMap extracted by means
of a convolution filter, and then each FeatureMap There are multiple neurons.
Detailed explanation of each layer parameter:
INPUT Layer
The first is the data INPUT layer. The size of the input image is uniformly normalized to 32 * 32.
Note: This layer does not count as the network structure of LeNet-5. Traditionally, the input
layer is not considered as one of the network hierarchy.
C1 layer-convolutional layer
Input picture: 32 * 32
Number of neurons: 28 28 6
Detailed description:
1. The first convolution operation is performed on the input image (using 6 convolution kernels of size 5 5)
to obtain 6 C1 feature maps (6 feature maps of size 28 28, 32-5 + 1 = 28).
2. Let's take a look at how many parameters are needed. The size of the convolution kernel is 5 5, and
there are 6 (5 * 5 + 1) = 156 parameters in total, where +1 indicates that a kernel has a bias.
3. For the convolutional layer C1, each pixel in C1 is connected to 5 5 pixels and 1 bias in the input
image, so there are 156 28 * 28 = 122304 connections in total. There are 122,304 connections, but we
only need to learn 156 parameters, mainly through weight sharing.
Sampling area: 2 * 2
Sampling method: 4 inputs are added, multiplied by a trainable parameter, plus a trainable
offset. Results via sigmoid
Sampling type: 6
Number of neurons: 14 14 6
Number of connections: (2 2 + 1) 6 14 14
The size of each feature map in S2 is 1/4 of the size of the feature map in C1.
Detailed description:
The pooling operation is followed immediately after the first convolution. Pooling is performed using 2 2
kernels, and S2, 6 feature maps of 14 14 (28/2 = 14) are obtained.
The pooling layer of S2 is the sum of the pixels in the 2 * 2 area in C1 multiplied by a weight coefficient plus
an offset, and then the result is mapped again.
So each pooling core has two training parameters, so there are 2x6 = 12 training parameters, but there are
5x14x14x6 = 5880 connections.
C3 layer-convolutional layer
Input: all 6 or several feature map combinations in S2
Each feature map in C3 is connected to all 6 or several feature maps in S2, indicating that the
feature map of this layer is a different combination of the feature maps extracted from the
previous layer.
One way is that the first 6 feature maps of C3 take 3 adjacent feature map subsets in S2 as
input. The next 6 feature maps take 4 subsets of neighboring feature maps in S2 as input.
The next three take the non-adjacent 4 feature map subsets as input. The last one takes all
the feature maps in S2 as input.
Detailed description:
After the first pooling, the second convolution, the output of the second convolution is C3, 16 10x10 feature
maps, and the size of the convolution kernel is 5 5. We know that S2 has 6 14 14 feature maps, how to get
16 feature maps from 6 feature maps? Here are the 16 feature maps calculated by the special combination
of the feature maps of S2. details as follows:
The first 6 feature maps of C3 (corresponding to the 6th column of the first red box in the figure above) are
connected to the 3 feature maps connected to the S2 layer (the first red box in the above figure), and the
next 6 feature maps are connected to the S2 layer The 4 feature maps are connected (the second red box
in the figure above), the next 3 feature maps are connected with the 4 feature maps that are not connected
at the S2 layer, and the last is connected with all the feature maps at the S2 layer. The convolution kernel
size is still 5 5, so there are 6 (3 5 5 + 1) + 6 (4 5 5 + 1) + 3 (4 5 5 + 1) +1 (6 5 5 + 1) = 1516 parameters.
The image size is 10 10, so there are 151600 connections.
The convolution structure of C3 and the first 3 graphs in S2 is shown below:
Sampling area: 2 * 2
Sampling method: 4 inputs are added, multiplied by a trainable parameter, plus a trainable
offset. Results via sigmoid
Sampling type: 16
The size of each feature map in S4 is 1/4 of the size of the feature map in C3
Detailed description:
S4 is the pooling layer, the window size is still 2 * 2, a total of 16 feature maps, and the 16 10x10 maps of
the C3 layer are pooled in units of 2x2 to obtain 16 5x5 feature maps. This layer has a total of 32 training
parameters of 2x16, 5x5x5x16 = 2000 connections.
C5 layer-convolution layer
Input: All 16 unit feature maps of the S4 layer (all connected to s4)
Detailed description:
The C5 layer is a convolutional layer. Since the size of the 16 images of the S4 layer is 5x5, which is the
same as the size of the convolution kernel, the size of the image formed after convolution is 1x1. This
results in 120 convolution results. Each is connected to the 16 maps on the previous level. So there are
(5x5x16 + 1) x120 = 48120 parameters, and there are also 48120 connections. The network structure of the
C5 layer is as follows:
Calculation method: calculate the dot product between the input vector and the weight
vector, plus an offset, and the result is output through the sigmoid function.
Detailed description:
Layer 6 is a fully connected layer. The F6 layer has 84 nodes, corresponding to a 7x12 bitmap, -1 means
white, 1 means black, so the black and white of the bitmap of each symbol corresponds to a code. The
training parameters and number of connections for this layer are (120 + 1) x84 = 10164. The ASCII
encoding diagram is as follows:
The value of the above formula w_ij is determined by the bitmap encoding of i, where i ranges from 0 to 9,
and j ranges from 0 to 7 * 12-1. The closer the value of the RBF output is to 0, the closer it is to i, that is, the
closer to the ASCII encoding figure of i, it means that the recognition result input by the current network is
the character i. This layer has 84x10 = 840 parameters and connections.
Summary
LeNet-5 is a very efficient convolutional neural network for handwritten character recognition.
Convolutional neural networks can make good use of the structural information of images.
The convolutional layer has fewer parameters, which is also determined by the main characteristics of
the convolutional layer, that is, local connection and shared weights.
Code Implementation
In [5]: from tensorflow import keras
from keras.datasets import mnist
from keras.layers import Conv2D, MaxPooling2D,AveragePooling2D
from keras.layers import Dense, Flatten
from keras.models import Sequential
model = Sequential()
model.add(Conv2D(6, kernel_size = (5,5), padding = 'valid', activation='tanh', input_sha
model.add(AveragePooling2D(pool_size= (2,2), strides = 2, padding = 'valid'))
model.add(Flatten())
model.add(Dense(120, activation='tanh'))
model.add(Dense(84, activation='tanh'))
model.add(Dense(10, activation='softmax'))
model.summary()
model.compile(loss=keras.metrics.categorical_crossentropy, optimizer=keras.optimizers.Ad
model.fit(x_train, y_train, batch_size=128, epochs=2, verbose=1, validation_data=(x_test
score = model.evaluate(x_test, y_test)
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_5 (Conv2D) (None, 28, 28, 6) 456
=================================================================
Total params: 62,006
Trainable params: 62,006
Non-trainable params: 0
_________________________________________________________________
Epoch 1/2
391/391 [==============================] - 14s 8ms/step - loss: 1.8395 - accuracy: 0.346
6 - val_loss: 1.7231 - val_accuracy: 0.3949
Epoch 2/2
391/391 [==============================] - 2s 5ms/step - loss: 1.6719 - accuracy: 0.4112
- val_loss: 1.6083 - val_accuracy: 0.4258
313/313 [==============================] - 1s 3ms/step - loss: 1.6083 - accuracy: 0.4258
Test Loss: 1.6083446741104126
Test accuracy: 0.42579999566078186
In [ ]:
Introduction
AlexNet was designed by Hinton, winner of the 2012 ImageNet competition, and his student
Alex Krizhevsky. It was also after that year that more and deeper neural networks were
proposed, such as the excellent vgg, GoogleLeNet. Its official data model has an accuracy
rate of 57.1% and top 1-5 reaches 80.2%. This is already quite outstanding for traditional
machine learning classification algorithms.
Size /
Filter Depth Stride Padding Number of Parameters Forward Computation
Operation
3* 227 * 227
11 * (11*11*3 + 1) * 96 * 55 *
Conv1 + Relu 96 4 (11*11*3 + 1) * 96=34944
11 55=105705600
96 * 55 * 55
Norm
(5 * 5 * 96 + 1) * 256 * 27 *
Conv2 + Relu 5*5 256 1 2 (5 * 5 * 96 + 1) * 256=614656
27=448084224
256 * 27 * 27
256 * 13 * 13
Norm
(3 * 3 * 256 + 1) * 384 * 13 *
Conv3 + Relu 3*3 384 1 1 (3 * 3 * 256 + 1) * 384=885120
13=149585280
384 * 13 * 13
(3 * 3 * 384 + 1) * 384 * 13 *
Conv4 + Relu 3*3 384 1 1 (3 * 3 * 384 + 1) * 384=1327488
13=224345472
384 * 13 * 13
(3 * 3 * 384 + 1) * 256 * 13 *
Conv5 + Relu 3*3 256 1 1 (3 * 3 * 384 + 1) * 256=884992
13=149563648
256 * 13 * 13
256 * 6 * 6
Dropout (rate
0.5)
4096
Dropout (rate
0.5)
4096
1000 classes
Conv:3.7million (6%) , FC: 58.6 Conv: 1.08 billion (95%) , FC: 58.6
Conv VS FC
million (94% ) million (5%)
After using ReLU f (x) = max (0, x), you will find that the value after the activation function has no range like
the tanh and sigmoid functions, so a normalization will usually be done after ReLU, and the LRU is a steady
proposal (Not sure here, it should be proposed?) One method in neuroscience is called "Lateral inhibition",
which talks about the effect of active neurons on its surrounding neurons.
1. Dropout
Dropout is also a concept often said, which can effectively prevent overfitting of neural networks. Compared
to the general linear model, a regular method is used to prevent the model from overfitting. In the neural
network, Dropout is implemented by modifying the structure of the neural network itself. For a certain layer
of neurons, randomly delete some neurons with a defined probability, while keeping the individuals of the
input layer and output layer neurons unchanged, and then update the parameters according to the learning
method of the neural network. In the next iteration, rerandom Remove some neurons until the end of
training.
1. Enhanced Data ( Data Augmentation )
In deep learning, when the amount of data is not large enough, there are generally 4 solutions:
Data augmentation- artificially increase the size of the training set-create a batch of "new"
data from existing data by means of translation, flipping, noise
Regularization——The relatively small amount of data will cause the model to overfit,
making the training error small and the test error particularly large. By adding a regular term
after the Loss Function , the overfitting can be suppressed. The disadvantage is that a need
is introduced Manually adjusted hyper-parameter.
Dropout- also a regularization method. But different from the above, it is achieved by
randomly setting the output of some neurons to zero
Code Implementation
In [1]: !pip install tflearn
x, y = oxflower17.load_data()
In [10]: print(x_train.shape)
print(y_train.shape)
# Pooling
model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))
# Batch Normalisation before passing it to the next layer
model.add(BatchNormalization())
# Pooling
model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))
# Batch Normalisation
model.add(BatchNormalization())
# Pooling
model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))
# Batch Normalisation
model.add(BatchNormalization())
# Output Layer
model.add(Dense(17))
model.add(Activation('softmax'))
model.summary()
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/keras/layers/normalizati
on/batch_normalization.py:581: _colocate_with (from tensorflow.python.framework.ops) is
deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 54, 54, 96) 34944
=================================================================
Total params: 24,834,833
Trainable params: 24,815,697
Non-trainable params: 19,136
_________________________________________________________________
In [12]: # Train
model.fit(x_train, y_train, batch_size=64, epochs=5, verbose=1,validation_split=0.2, shu
In [ ]:
VGG-Net
Introduction
The full name of VGG is the Visual Geometry Group, which belongs to the Department of
Science and Engineering of Oxford University. It has released a series of convolutional
network models beginning with VGG, which can be applied to face recognition and image
classification, from VGG16 to VGG19. The original purpose of VGG's research on the depth
of convolutional networks is to understand how the depth of convolutional networks affects
the accuracy and accuracy of large-scale image classification and recognition. -Deep-16
CNN), in order to deepen the number of network layers and to avoid too many parameters, a
small 3x3 convolution kernel is used in all layers.
VGG16 contains 16 layers and VGG19 contains 19 layers. A series of VGGs are exactly the
same in the last three fully connected layers. The overall structure includes 5 sets of
convolutional layers, followed by a MaxPool. The difference is that more and more cascaded
convolutional layers are included in the five sets of convolutional layers .
Each convolutional layer in AlexNet contains only one convolution, and the size of the
convolution kernel is 7 7 ,. In VGGNet, each convolution layer contains 2 to 4 convolution
operations. The size of the convolution kernel is 3 3, the convolution step size is 1, the
pooling kernel is 2 * 2, and the step size is 2. The most obvious improvement of VGGNet is to
reduce the size of the convolution kernel and increase the number of convolution layers.
Using multiple convolution layers with smaller convolution kernels instead of a larger
convolution layer with convolution kernels can reduce parameters on the one hand, and the
author believes that it is equivalent to more non-linear mapping, which increases the Fit
expression ability.
Two consecutive 3 3 convolutions are equivalent to a 5 5 receptive field, and three are
equivalent to 7 7. The advantages of using three 3 3 convolutions instead of one 7 7
convolution are twofold : one, including three ReLu layers instead of one , makes the decision
function more discriminative; and two, reducing parameters . For example, the input and
output are all C channels. 3 convolutional layers using 3 3 require 3 (3 3 C C) = 27 C C, and
1 convolutional layer using 7 7 requires 7 7 C C = 49C C. This can be seen as applying a
kind of regularization to the 7 7 convolution, so that it is decomposed into three 3 3
convolutions.
The 1 1 convolution layer is mainly to increase the non-linearity of the decision function
without affecting the receptive field of the convolution layer. Although the 1 1 convolution
operation is linear, ReLu adds non-linearity.
Network Configuration
Table 1 shows all network configurations. These networks follow the same design principles,
but differ in depth.
This picture is definitely used when introducing VGG16. This picture contains a lot of
information. My interpretation here may be limited. If you have any supplements, please leave
a message.
Number 1 : This is a comparison chart of 6 networks. From A to E, the network is getting deeper.
Several layers have been added to verify the effect.
Number 3: This is a correct way to do experiments, that is, use the simplest method to solve the
problem , and then gradually optimize for the problems that occur.
Network A: First mention a shallow network, this network can easily converge on ImageNet. And then?
Network A-LRN: Add something that someone else (AlexNet) has experimented to say is effective (LRN),
but it seems useless. And then?
Network D: Change the 1 1 convolution kernel to 3 3. Try it. The effect has improved again. Seems to be
the best (2014).
Training
The optimization method is a stochastic gradient descent SGD + momentum (0.9) with momentum. The
batch size is 256.
Regularization : L2 regularization is used, and the weight decay is 5e-4. Dropout is after the first two fully
connected layers, p = 0.5.
Although it is deeper and has more parameters than the AlexNet network, we speculate that VGGNet can
converge in less cycles for two reasons: one, the greater depth and smaller convolutions bring implicit
regularization ; Second, some layers of pre-training.
Parameter initialization : For a shallow A network, parameters are randomly initialized, the weight w is
sampled from N (0, 0.01), and the bias is initialized to 0. Then, for deeper networks, first the first four
convolutional layers and three fully connected layers are initialized with the parameters of the A network.
However, it was later discovered that it is also possible to directly initialize it without using pre-trained
parameters.
In order to obtain a 224 * 224 input image, each rescaled image is randomly cropped in each SGD iteration.
In order to enhance the data set, the cropped image is also randomly flipped horizontally and RGB color
shifted.
1. In the convolutional structure of VGGNet, a 1 * 1 convolution kernel is introduced. Without affecting the
input and output dimensions, non-linear transformation is introduced to increase the expressive power
of the network and reduce the amount of calculation.
1. During training, first train a simple (low-level) VGGNet A-level network, and then use the weights of the
A network to initialize the complex models that follow to speed up the convergence of training .
Answer 1
3 3x3 convolutions, using 3 non-linear activation functions, increasing non-linear expression capabilities,
making the segmentation plane more separable Reduce the number of parameters. For the convolution
kernel of C channels, 7x7 contains parameters , and the number of 3 3x3 parameters is greatly reduced.
Q2: The role of 1x1 convolution kernel
Answer 2
Increase the nonlinearity of the model without affecting the receptive field 1x1 winding machine is equivalent
to linear transformation, and the non-linear activation function plays a non-linear role
Q3: The effect of network depth on results (in the same year, Google also independently released
the network GoogleNet with a depth of 22 layers)
Answer 3
VGG and GoogleNet models are deep Small convolution VGG only uses 3x3, while GoogleNet uses 1x1,
3x3, 5x5, the model is more complicated (the model began to use a large convolution kernel to reduce the
calculation of the subsequent machine layer)
Code Implementation
From Scratch
In [2]: !pip install tflearn
x, y = oxflower17.load_data()
x_train = x.astype('float32') / 255.0
y_train = to_categorical(y, num_classes=17)
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/tensorflow/python/compa
t/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scop
e) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Downloading Oxford 17 category Flower Dataset, Please wait...
100.0% 60276736 / 60270631
Succesfully downloaded 17flowers.tgz 60270631 bytes.
File Extracted
Starting to parse images...
Parsing Done!
In [4]: print(x_train.shape)
print(y_train.shape)
model.add(Flatten())
model.add(Dense(units=4096,activation="relu"))
model.add(Dense(units=4096,activation="relu"))
model.add(Dense(units=17, activation="softmax"))
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_13 (Conv2D) (None, 224, 224, 64) 1792
=================================================================
Total params: 134,330,193
Trainable params: 134,330,193
Non-trainable params: 0
_________________________________________________________________
In [10]: # Train
model.fit(x_train, y_train, batch_size=64, epochs=5, verbose=1,validation_split=0.2, shu
In [ ]:
VGG Pretrained
In [11]: # download the data from g drive
import gdown
url = "https://drive.google.com/file/d/12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP/view?usp=sharin
file_id = url.split("/")[-2]
print(file_id)
prefix = 'https://drive.google.com/uc?/export=download&id='
gdown.download(prefix+file_id, "catdog.zip")
12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP
Downloading...
From: https://drive.google.com/uc?/export=download&id=12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP
To: /content/catdog.zip
100%|██████████| 9.09M/9.09M [00:00<00:00, 118MB/s]
'catdog.zip'
Out[11]:
Archive: catdog.zip
creating: train/
creating: train/Cat/
inflating: train/Cat/0.jpg
inflating: train/Cat/1.jpg
inflating: train/Cat/2.jpg
inflating: train/Cat/cat.2405.jpg
inflating: train/Cat/cat.2406.jpg
inflating: train/Cat/cat.2436.jpg
inflating: train/Cat/cat.2437.jpg
inflating: train/Cat/cat.2438.jpg
inflating: train/Cat/cat.2439.jpg
inflating: train/Cat/cat.2440.jpg
inflating: train/Cat/cat.2441.jpg
inflating: train/Cat/cat.2442.jpg
inflating: train/Cat/cat.2443.jpg
inflating: train/Cat/cat.2444.jpg
inflating: train/Cat/cat.2445.jpg
inflating: train/Cat/cat.2446.jpg
inflating: train/Cat/cat.2447.jpg
inflating: train/Cat/cat.2448.jpg
inflating: train/Cat/cat.2449.jpg
inflating: train/Cat/cat.2450.jpg
inflating: train/Cat/cat.2451.jpg
inflating: train/Cat/cat.2452.jpg
inflating: train/Cat/cat.2453.jpg
inflating: train/Cat/cat.2454.jpg
inflating: train/Cat/cat.2455.jpg
inflating: train/Cat/cat.2456.jpg
inflating: train/Cat/cat.2457.jpg
inflating: train/Cat/cat.2458.jpg
inflating: train/Cat/cat.2459.jpg
inflating: train/Cat/cat.2460.jpg
inflating: train/Cat/cat.2461.jpg
inflating: train/Cat/cat.2462.jpg
inflating: train/Cat/cat.2463.jpg
inflating: train/Cat/cat.2464.jpg
inflating: train/Cat/cat.855.jpg
inflating: train/Cat/cat.856.jpg
inflating: train/Cat/cat.857.jpg
inflating: train/Cat/cat.858.jpg
inflating: train/Cat/cat.859.jpg
inflating: train/Cat/cat.86.jpg
inflating: train/Cat/cat.860.jpg
inflating: train/Cat/cat.861.jpg
inflating: train/Cat/cat.862.jpg
inflating: train/Cat/cat.863.jpg
inflating: train/Cat/cat.864.jpg
inflating: train/Cat/cat.865.jpg
inflating: train/Cat/cat.866.jpg
inflating: train/Cat/cat.867.jpg
inflating: train/Cat/cat.868.jpg
inflating: train/Cat/cat.869.jpg
inflating: train/Cat/cat.87.jpg
inflating: train/Cat/cat.870.jpg
inflating: train/Cat/cat.871.jpg
inflating: train/Cat/cat.872.jpg
inflating: train/Cat/cat.873.jpg
inflating: train/Cat/cat.874.jpg
inflating: train/Cat/cat.875.jpg
inflating: train/Cat/cat.876.jpg
inflating: train/Cat/cat.877.jpg
inflating: train/Cat/cat.878.jpg
inflating: train/Cat/cat.879.jpg
inflating: train/Cat/cat.88.jpg
inflating: train/Cat/cat.880.jpg
inflating: train/Cat/cat.881.jpg
inflating: train/Cat/cat.882.jpg
inflating: train/Cat/cat.883.jpg
inflating: train/Cat/cat.884.jpg
inflating: train/Cat/cat.885.jpg
inflating: train/Cat/cat.886.jpg
inflating: train/Cat/cat.887.jpg
inflating: train/Cat/cat.888.jpg
inflating: train/Cat/cat.889.jpg
inflating: train/Cat/cat.89.jpg
inflating: train/Cat/cat.890.jpg
inflating: train/Cat/cat.891.jpg
inflating: train/Cat/cat.892.jpg
inflating: train/Cat/cat.893.jpg
inflating: train/Cat/cat.894.jpg
inflating: train/Cat/cat.895.jpg
inflating: train/Cat/cat.896.jpg
inflating: train/Cat/cat.897.jpg
inflating: train/Cat/cat.898.jpg
inflating: train/Cat/cat.899.jpg
inflating: train/Cat/cat.9.jpg
inflating: train/Cat/cat.90.jpg
inflating: train/Cat/cat.900.jpg
inflating: train/Cat/cat.901.jpg
inflating: train/Cat/cat.902.jpg
inflating: train/Cat/cat.903.jpg
inflating: train/Cat/cat.904.jpg
inflating: train/Cat/cat.905.jpg
inflating: train/Cat/cat.906.jpg
inflating: train/Cat/cat.907.jpg
inflating: train/Cat/cat.908.jpg
inflating: train/Cat/cat.909.jpg
inflating: train/Cat/cat.91.jpg
inflating: train/Cat/cat.910.jpg
inflating: train/Cat/cat.911.jpg
inflating: train/Cat/cat.912.jpg
inflating: train/Cat/cat.913.jpg
inflating: train/Cat/cat.914.jpg
inflating: train/Cat/cat.915.jpg
inflating: train/Cat/cat.916.jpg
inflating: train/Cat/cat.917.jpg
inflating: train/Cat/cat.918.jpg
inflating: train/Cat/cat.919.jpg
inflating: train/Cat/cat.92.jpg
inflating: train/Cat/cat.920.jpg
inflating: train/Cat/cat.93.jpg
inflating: train/Cat/cat.94.jpg
inflating: train/Cat/cat.946.jpg
inflating: train/Cat/cat.947.jpg
inflating: train/Cat/cat.948.jpg
inflating: train/Cat/cat.949.jpg
inflating: train/Cat/cat.95.jpg
inflating: train/Cat/cat.950.jpg
inflating: train/Cat/cat.951.jpg
inflating: train/Cat/cat.952.jpg
inflating: train/Cat/cat.953.jpg
inflating: train/Cat/cat.954.jpg
inflating: train/Cat/cat.955.jpg
inflating: train/Cat/cat.956.jpg
inflating: train/Cat/cat.957.jpg
inflating: train/Cat/cat.958.jpg
inflating: train/Cat/cat.959.jpg
inflating: train/Cat/cat.96.jpg
inflating: train/Cat/cat.960.jpg
inflating: train/Cat/cat.961.jpg
inflating: train/Cat/cat.962.jpg
inflating: train/Cat/cat.963.jpg
inflating: train/Cat/cat.964.jpg
inflating: train/Cat/cat.965.jpg
inflating: train/Cat/cat.966.jpg
inflating: train/Cat/cat.967.jpg
inflating: train/Cat/cat.968.jpg
inflating: train/Cat/cat.969.jpg
inflating: train/Cat/cat.97.jpg
inflating: train/Cat/cat.970.jpg
inflating: train/Cat/cat.971.jpg
inflating: train/Cat/cat.972.jpg
inflating: train/Cat/cat.973.jpg
inflating: train/Cat/cat.974.jpg
inflating: train/Cat/cat.975.jpg
inflating: train/Cat/cat.976.jpg
inflating: train/Cat/cat.977.jpg
inflating: train/Cat/cat.978.jpg
inflating: train/Cat/cat.979.jpg
inflating: train/Cat/cat.98.jpg
inflating: train/Cat/cat.980.jpg
inflating: train/Cat/cat.981.jpg
inflating: train/Cat/cat.982.jpg
inflating: train/Cat/cat.983.jpg
inflating: train/Cat/cat.984.jpg
inflating: train/Cat/cat.985.jpg
inflating: train/Cat/cat.986.jpg
inflating: train/Cat/cat.987.jpg
inflating: train/Cat/cat.988.jpg
inflating: train/Cat/cat.989.jpg
inflating: train/Cat/cat.99.jpg
inflating: train/Cat/cat.990.jpg
inflating: train/Cat/cat.991.jpg
inflating: train/Cat/cat.992.jpg
inflating: train/Cat/cat.993.jpg
inflating: train/Cat/cat.994.jpg
inflating: train/Cat/cat.995.jpg
inflating: train/Cat/cat.996.jpg
inflating: train/Cat/cat.997.jpg
inflating: train/Cat/cat.998.jpg
inflating: train/Cat/cat.999.jpg
creating: train/Dog/
inflating: train/Dog/10493.jpg
inflating: train/Dog/11785.jpg
inflating: train/Dog/9839.jpg
inflating: train/Dog/dog.2432.jpg
inflating: train/Dog/dog.2433.jpg
inflating: train/Dog/dog.2434.jpg
inflating: train/Dog/dog.2435.jpg
inflating: train/Dog/dog.2436.jpg
inflating: train/Dog/dog.2437.jpg
inflating: train/Dog/dog.2438.jpg
inflating: train/Dog/dog.2439.jpg
inflating: train/Dog/dog.2440.jpg
inflating: train/Dog/dog.2441.jpg
inflating: train/Dog/dog.2442.jpg
inflating: train/Dog/dog.2443.jpg
inflating: train/Dog/dog.2444.jpg
inflating: train/Dog/dog.2445.jpg
inflating: train/Dog/dog.2446.jpg
inflating: train/Dog/dog.2447.jpg
inflating: train/Dog/dog.2448.jpg
inflating: train/Dog/dog.2449.jpg
inflating: train/Dog/dog.2450.jpg
inflating: train/Dog/dog.2451.jpg
inflating: train/Dog/dog.2452.jpg
inflating: train/Dog/dog.2453.jpg
inflating: train/Dog/dog.2454.jpg
inflating: train/Dog/dog.2455.jpg
inflating: train/Dog/dog.2456.jpg
inflating: train/Dog/dog.2457.jpg
inflating: train/Dog/dog.2458.jpg
inflating: train/Dog/dog.2459.jpg
inflating: train/Dog/dog.2460.jpg
inflating: train/Dog/dog.2461.jpg
inflating: train/Dog/dog.844.jpg
inflating: train/Dog/dog.845.jpg
inflating: train/Dog/dog.846.jpg
inflating: train/Dog/dog.847.jpg
inflating: train/Dog/dog.848.jpg
inflating: train/Dog/dog.849.jpg
inflating: train/Dog/dog.85.jpg
inflating: train/Dog/dog.850.jpg
inflating: train/Dog/dog.851.jpg
inflating: train/Dog/dog.852.jpg
inflating: train/Dog/dog.853.jpg
inflating: train/Dog/dog.854.jpg
inflating: train/Dog/dog.855.jpg
inflating: train/Dog/dog.856.jpg
inflating: train/Dog/dog.857.jpg
inflating: train/Dog/dog.858.jpg
inflating: train/Dog/dog.859.jpg
inflating: train/Dog/dog.86.jpg
inflating: train/Dog/dog.860.jpg
inflating: train/Dog/dog.861.jpg
inflating: train/Dog/dog.862.jpg
inflating: train/Dog/dog.863.jpg
inflating: train/Dog/dog.864.jpg
inflating: train/Dog/dog.865.jpg
inflating: train/Dog/dog.866.jpg
inflating: train/Dog/dog.867.jpg
inflating: train/Dog/dog.868.jpg
inflating: train/Dog/dog.869.jpg
inflating: train/Dog/dog.87.jpg
inflating: train/Dog/dog.870.jpg
inflating: train/Dog/dog.871.jpg
inflating: train/Dog/dog.872.jpg
inflating: train/Dog/dog.873.jpg
inflating: train/Dog/dog.874.jpg
inflating: train/Dog/dog.875.jpg
inflating: train/Dog/dog.876.jpg
inflating: train/Dog/dog.877.jpg
inflating: train/Dog/dog.878.jpg
inflating: train/Dog/dog.879.jpg
inflating: train/Dog/dog.88.jpg
inflating: train/Dog/dog.880.jpg
inflating: train/Dog/dog.881.jpg
inflating: train/Dog/dog.882.jpg
inflating: train/Dog/dog.883.jpg
inflating: train/Dog/dog.884.jpg
inflating: train/Dog/dog.885.jpg
inflating: train/Dog/dog.886.jpg
inflating: train/Dog/dog.887.jpg
inflating: train/Dog/dog.888.jpg
inflating: train/Dog/dog.889.jpg
inflating: train/Dog/dog.89.jpg
inflating: train/Dog/dog.890.jpg
inflating: train/Dog/dog.891.jpg
inflating: train/Dog/dog.892.jpg
inflating: train/Dog/dog.893.jpg
inflating: train/Dog/dog.894.jpg
inflating: train/Dog/dog.895.jpg
inflating: train/Dog/dog.896.jpg
inflating: train/Dog/dog.897.jpg
inflating: train/Dog/dog.898.jpg
inflating: train/Dog/dog.9.jpg
inflating: train/Dog/dog.90.jpg
inflating: train/Dog/dog.91.jpg
inflating: train/Dog/dog.92.jpg
inflating: train/Dog/dog.93.jpg
inflating: train/Dog/dog.936.jpg
inflating: train/Dog/dog.937.jpg
inflating: train/Dog/dog.938.jpg
inflating: train/Dog/dog.939.jpg
inflating: train/Dog/dog.94.jpg
inflating: train/Dog/dog.940.jpg
inflating: train/Dog/dog.941.jpg
inflating: train/Dog/dog.942.jpg
inflating: train/Dog/dog.943.jpg
inflating: train/Dog/dog.944.jpg
inflating: train/Dog/dog.945.jpg
inflating: train/Dog/dog.946.jpg
inflating: train/Dog/dog.947.jpg
inflating: train/Dog/dog.948.jpg
inflating: train/Dog/dog.949.jpg
inflating: train/Dog/dog.95.jpg
inflating: train/Dog/dog.950.jpg
inflating: train/Dog/dog.951.jpg
inflating: train/Dog/dog.952.jpg
inflating: train/Dog/dog.953.jpg
inflating: train/Dog/dog.954.jpg
inflating: train/Dog/dog.955.jpg
inflating: train/Dog/dog.956.jpg
inflating: train/Dog/dog.957.jpg
inflating: train/Dog/dog.958.jpg
inflating: train/Dog/dog.959.jpg
inflating: train/Dog/dog.96.jpg
inflating: train/Dog/dog.960.jpg
inflating: train/Dog/dog.961.jpg
inflating: train/Dog/dog.962.jpg
inflating: train/Dog/dog.963.jpg
inflating: train/Dog/dog.964.jpg
inflating: train/Dog/dog.965.jpg
inflating: train/Dog/dog.966.jpg
inflating: train/Dog/dog.967.jpg
inflating: train/Dog/dog.968.jpg
inflating: train/Dog/dog.969.jpg
inflating: train/Dog/dog.97.jpg
inflating: train/Dog/dog.970.jpg
inflating: train/Dog/dog.971.jpg
inflating: train/Dog/dog.972.jpg
inflating: train/Dog/dog.973.jpg
inflating: train/Dog/dog.974.jpg
inflating: train/Dog/dog.975.jpg
inflating: train/Dog/dog.976.jpg
inflating: train/Dog/dog.977.jpg
inflating: train/Dog/dog.978.jpg
inflating: train/Dog/dog.979.jpg
inflating: train/Dog/dog.98.jpg
inflating: train/Dog/dog.980.jpg
inflating: train/Dog/dog.981.jpg
inflating: train/Dog/dog.982.jpg
inflating: train/Dog/dog.983.jpg
inflating: train/Dog/dog.984.jpg
inflating: train/Dog/dog.985.jpg
inflating: train/Dog/dog.986.jpg
inflating: train/Dog/dog.987.jpg
inflating: train/Dog/dog.988.jpg
inflating: train/Dog/dog.989.jpg
inflating: train/Dog/dog.99.jpg
inflating: train/Dog/dog.990.jpg
inflating: train/Dog/dog.991.jpg
inflating: train/Dog/dog.992.jpg
inflating: train/Dog/dog.993.jpg
inflating: train/Dog/dog.994.jpg
inflating: train/Dog/dog.995.jpg
inflating: train/Dog/dog.996.jpg
inflating: train/Dog/dog.997.jpg
inflating: train/Dog/dog.998.jpg
inflating: train/Dog/dog.999.jpg
creating: validation/
creating: validation/Cat/
inflating: validation/Cat/cat.2407.jpg
inflating: validation/Cat/cat.2408.jpg
inflating: validation/Cat/cat.2409.jpg
inflating: validation/Cat/cat.2410.jpg
inflating: validation/Cat/cat.2411.jpg
inflating: validation/Cat/cat.2412.jpg
inflating: validation/Cat/cat.2413.jpg
inflating: validation/Cat/cat.2414.jpg
inflating: validation/Cat/cat.2415.jpg
inflating: validation/Cat/cat.2416.jpg
inflating: validation/Cat/cat.2417.jpg
inflating: validation/Cat/cat.2418.jpg
inflating: validation/Cat/cat.2419.jpg
inflating: validation/Cat/cat.2420.jpg
inflating: validation/Cat/cat.2421.jpg
inflating: validation/Cat/cat.2422.jpg
inflating: validation/Cat/cat.2423.jpg
inflating: validation/Cat/cat.2424.jpg
inflating: validation/Cat/cat.2425.jpg
inflating: validation/Cat/cat.2426.jpg
inflating: validation/Cat/cat.2427.jpg
inflating: validation/Cat/cat.2428.jpg
inflating: validation/Cat/cat.2429.jpg
inflating: validation/Cat/cat.2430.jpg
inflating: validation/Cat/cat.2431.jpg
inflating: validation/Cat/cat.2432.jpg
inflating: validation/Cat/cat.2433.jpg
inflating: validation/Cat/cat.2434.jpg
inflating: validation/Cat/cat.2435.jpg
creating: validation/Dog/
inflating: validation/Dog/dog.2402.jpg
inflating: validation/Dog/dog.2403.jpg
inflating: validation/Dog/dog.2404.jpg
inflating: validation/Dog/dog.2405.jpg
inflating: validation/Dog/dog.2406.jpg
inflating: validation/Dog/dog.2407.jpg
inflating: validation/Dog/dog.2408.jpg
inflating: validation/Dog/dog.2409.jpg
inflating: validation/Dog/dog.2410.jpg
inflating: validation/Dog/dog.2411.jpg
inflating: validation/Dog/dog.2412.jpg
inflating: validation/Dog/dog.2413.jpg
inflating: validation/Dog/dog.2414.jpg
inflating: validation/Dog/dog.2415.jpg
inflating: validation/Dog/dog.2416.jpg
inflating: validation/Dog/dog.2417.jpg
inflating: validation/Dog/dog.2418.jpg
inflating: validation/Dog/dog.2419.jpg
inflating: validation/Dog/dog.2420.jpg
inflating: validation/Dog/dog.2421.jpg
inflating: validation/Dog/dog.2422.jpg
inflating: validation/Dog/dog.2423.jpg
inflating: validation/Dog/dog.2424.jpg
inflating: validation/Dog/dog.2425.jpg
inflating: validation/Dog/dog.2426.jpg
inflating: validation/Dog/dog.2427.jpg
inflating: validation/Dog/dog.2428.jpg
inflating: validation/Dog/dog.2429.jpg
inflating: validation/Dog/dog.2430.jpg
inflating: validation/Dog/dog.2431.jpg
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(224, 224),
batch_size=batch_size,
class_mode='binary')
validation_generator = validation_datagen.flow_from_directory(
validation_data_dir,
target_size=(224, 224),
batch_size=batch_size,
class_mode='binary')
In [ ]:
In [ ]:
Inception
Also known as GoogLeNet , it is a 22-layer network that won the 2014 ILSVRC Championship.
1. The original intention of the design is to expand the width and depth on its basis .
2. which is designed motives derived from improving the performance of the depth of the network
generally can increase the size of the network and increase the size of the data set to increase, but at
the same time cause the network parameters and easily fit through excessive , computing resources
inefficient and The production of high-quality data sets is an expensive issue.
3. Its design philosophy is to change the full connection to a sparse architecture and try to change it to a
sparse architecture inside the convolution.
4. The main idea is to design an inception module and increase the depth and width of the network by
continuously copying these inception modules , but GooLeNet mainly extends these inception modules
in depth.
There are four parallel channels in each inception module , and concat is performed at the end of the
channel .
1x1 conv is mainly used to reduce the dimensions in the article to avoid calculation bottlenecks. It also adds
additional softmax loss to some branches of the previous network layer to avoid the problem of gradient
disappearance.
1x1 conv: Borrowed from [ Network in Network ], the input feature map can be reduced in dimension
and upgraded without too much loss of the input spatial information;
1x1conv followed by 3x3 conv: 3x3 conv increases the receptive field of the feature map, and changes
the dimension through 1x1conv;
1x1 conv followed by 5x5 conv: 5x5 conv further increases the receptive field of the feature map, and
changes the dimensions through 1x1 conv;
3x3 max pooling followed by 1x1 conv: The author believes that although the pooling layer will lose
space information, it has been effectively applied in many fields, which proves its effectiveness, so a
parallel channel is added, and it is changed by 1x1 conv Its output dimension.
The most direct way to improve the performance of deep neural networks is to increase their size .
This includes depth, the number of levels, and their width, the size of each level unit .
Another easy and safe way is to increase the size of the training data.
Larger models mean more parameters, which makes it easier for the network to overfit , especially when the
number of label samples in the training data set is limited.
At the same time, because the production of high-quality training sets is tricky and expensive ,especially
when some human experts do it , there is a large error rate . As shown below.
Another shortcoming is that uniformly increasing the size of the network will increase the use of computing
resources . For example, in a deep network, if two convolutions are chained, any unified improvement of
their convolution kernels will cause demand for resources.
Power increase: If the increased capacity is inefficient, for example, if most of the weights end with 0 , then
a lot of computing resources are wasted. But because the computing resources are always limited, an
effective computational distribution always tends to increase the size of the model indiscriminately, and even
the main objective goal is to improve the performance of the results.
The basic method to solve these two problems is to finally change the fully connected network to a sparse
architecture, even inside the convolution.
The details of the GooLeNet network layer are shown in the following table:
To sum up:
128 1x1 convolution kernels are used to reduce dimensions and modify linear activation units
A fully connected layer of 1024 units and a modified linear activation unit;
A dropout layer that drops neuron connections with a 70% probability;
A linear layer with softmax loss as classification Predict 1000 categories, but removed during the
inference phase
Training Methodology
The momentum is set to 0.9 and the learning rate is set to decrease by 4% every 8 epochs.
Seven models were trained . To make the problem more detailed, some models were trained
on small crops, and some were trained on large crops .
The factors that make the model train well include : the sampling of patches of various sizes
in the image , the size of which is evenly distributed between 8% and 100%, and the aspect
ratio between 3/4 and 4/3.
Inception-v2(2015)
This architecture is a landmark in the development of deep network models . The most prominent
contribution is to propose a normalized Batch Normalization layer to unify the output range of the network. It
is fixed in a relatively uniform range. If the BN layer is not added, the value range of the network input and
output of each layer is greatly different, so the size of the learning rate will be different. The BN layer avoids
this situation This accelerates the training of the network and gives the network regular terms to a certain
extent , reducing the degree of overfitting of the network. In the subsequent development of network
models, most models have more or less added BN layers to the model.
In this paper, the BN layer is standardized before being input to the activation function. At the same time,
VGG uses 2 3x3 convs instead of 5x5 convs in the inception module to reduce the amount of parameters
and speed up the calculation.
Algorithm advantages:
1. Improved learning rate : In the BN model, a higher learning rate is used to accelerate training
convergence, but it will not cause other effects. Because if the scale of each layer is different, then the
learning rate required by each layer is different. The scale of the same layer dimension often also
needs different learning rates. Usually, the minimum learning is required to ensure the loss function to
decrease, but The BN layer keeps the scale of each layer and dimension consistent, so you can
directly use a higher learning rate for optimization.
1. Remove the dropout layer : The BN layer makes full use of the goals of the dropout layer. Remove
the dropout layer from the BN-Inception model, but no overfitting will occur.
1. Decrease the attenuation coefficient of L2 weight : Although the L2 loss controls the overfitting of
the Inception model, the loss of weight has been reduced by five times in the BN-Inception model.
1. Accelerate the decay of the learning rate : When training the Inception model, we let the learning
rate decrease exponentially. Because our network is faster than Inception, we will increase the speed of
reducing the learning rate by 6 times.
1. Remove the local response layer : Although this layer has a certain role, but after the BN layer is
added, this layer is not necessary.
1. Scramble training samples more thoroughly : We scramble training samples, which can prevent the
same samples from appearing in a mini-batch. This can improve the accuracy of the validation set by
1%, which is the advantage of the BN layer as a regular term. In our method, random selection is more
effective when the model sees different samples each time.
1. To reduce image distortion: Because BN network training is faster and observes each training sample
less often, we want the model to see a more realistic image instead of a distorted image.
Inception-v3-2015
This architecture focuses, how to use the convolution kernel two or more smaller size of the convolution
kernel to replace, but also the introduction of asymmetrical layers i.e. a convolution dimensional
convolution has also been proposed for pooling layer Some remedies that can cause loss of spatial
information; there are ideas such as label-smoothing , BN-ahxiliary .
Experiments were performed on inputs with different resolutions . The results show that although low-
resolution inputs require more time to train, the accuracy and high-resolution achieved are not much
different.
The computational cost is reduced while improving the accuracy of the network.
General Design Principles
We will describe some design principles that have been proposed through extensive experiments with
different architectural designs for convolutional networks. At this point, full use of the following principles can
be guessed, and some additional experiments in the future will be necessary to estimate their accuracy and
effectiveness.
2. The higher the dimensionality of the feature, the faster the training converges . That is, the
independence of features has a great relationship with the speed of model convergence. The more
independent features, the more thoroughly the input feature information is decomposed. It is easier to
converge if the correlation is strong. Hebbin principle : fire together, wire together.
3. Reduce the amount of calculation through dimensionality reduction . In v1, the feature is first
reduced by 1x1 convolutional dimensionality reduction. There is a certain correlation between different
dimensions. Dimension reduction can be understood as a lossless or low-loss compression. Even if the
dimensions are reduced, the correlation can still be used to restore its original information.
4. Balance the depth and width of the network . Only by increasing the depth and width of the network
in the same proportion can the performance of the model be maximized.
GooLeNet uses many dimensionality reduction methods, which has achieved certain results. Consider the
example of a 1x1 convolutional layer used to reduce dimensions before a 3x3 convolutional layer. In the
network, we expect the network to be highly correlated between the output neighboring elements at the
activation function. Therefore, we can reduce their activation values before aggregation , which should
generate similar local expression descriptions.
This paper explores experiments to decompose the network layer into different factors under different
settings in order to improve the computational efficiency of the method . Because the Inception network is
fully convolutional, each weight value corresponds to a product operation each time it is activated.
Therefore, any reduction in computational cost will result in a reduction in parameters. This means that we
can use some suitable decomposition factors to reduce the parameters and thus speed up the training.
With the same number of convolution kernels, larger convolution kernels (such as 5x5 or 7x7) are more
expensive to calculate than 3x3 convolution kernels , which is about a multiple of 25/9 = 2.78. Of course, the
5x5 convolution kernel can obtain more correlations between the information and activation units in the
previous network, but under the premise of huge consumption of computing resources, a physical reduction
in the size of the convolution kernel still appears.
However, we still want to know whether a 5x5 convolutional layer can be replaced by a multi-layer
convolutional layer with fewer parameters when the input and output sizes are consistent . If we scale the
calculation map of 5x5 convolution, we can see that each output is like a small fully connected network
sliding on the input window with a size of 5x5. Refer to Figure 1.
Therefore, we have developed a network that explores translation invariance and replaces one layer of
convolution with two layers of convolution: the first layer is a 3x3 convolution layer and the second layer is a
fully connected layer . Refer to Figure 1. We ended up replacing two 5x5 convolutional layers with two 3x3
convolutional layers. Refer to Figure 4 Figure 5. This operation can realize the weight sharing of
neighboring layers. It is about (9 + 9) / 25 times reduction in computational consumption.
Spatial Factorization into Asymmetric Convolutions
We are wondering if the convolution kernel can be made smaller, such as 2x2, but there is an asymmetric
method that can be better than this method. That is to use nx1 size convolution. For example, using the
[3x1 + 1x3] convolution layer. In this case, a single 3x3 convolution has the same receptive field. Refer to
Figure 3. This asymmetric method can save [((3x3)-(3 + 3)) / (3x3) = 33%] computing resources, and
replacing two 2x2 only saves [11%] Computing resources.
In theory, we can have a deeper discussion and use the convolution of [1xn + nx1] instead of the
convolutional layer of nxn. Refer to Figure 6. But this situation is not very good in the previous layer, but it
can perform better on a medium-sized feature map [mxm, m is between 12 and 20]. In this case, use [1x7 +
7x1] convolutional layer can get a very good result.
Inception-v1 introduced some auxiliary classifiers (referring to some branches of the previous layer adding
the softmax layer to calculate the loss back propagation) to improve the aggregation problem in deep
networks. The original motive is to pass the gradient back to the previous convolutional layer , so that they
can effectively and improve the aggregation of features and avoid the problem of vanishing gradients.
Traditionally, pooling layers are used in convolutional networks to reduce the size of feature maps . In order
to avoid bottlenecks in the expression of spatial information, the number of convolution kernels in the
network can be expanded before using max pooling or average pooling.
For example, for a dxd network layer with K feature maps, to generate a network layer with 2K [d / 2 xd / 2]
feature maps, we can use 2K convolution kernels with a step size of 1. Convolution and then add a pooling
layer to get it, then this operation requires [2d 2 K 2 ]. But using pooling instead of convolution, the
approximate operation is [2 * (d / 2) 2 xK 2 ], which reduces the operation by four times. However, this will
cause a description bottleneck, because the feature map is reduced to [(d / 2) 2 xK], which will definitely
cause the loss of spatial information on the network. Refer to Figure 9. However, we have adopted a
different method to avoid this bottleneck, refer to Figure 10. That is, two parallel channels are used , one is
a pooling layer (max or average), the step size is 2, and the other is a convolution layer , and then it is
concatenated during output.
Inception-v4-2016
After ResNet appeared, ResNet residual structure was added.
It is based on Inception-v3 and added the skip connection structure in ResNet. Finally, under the structure of
3 residual and 1 inception-v4 , it reached the top-5 error 3.08% in CLS (ImageNet calssification) .
1-Introduction Residual conn works well when training very deep networks. Because the Inception network
architecture can be very deep, it is reasonable to use residual conn instead of concat.
Compared with v3, Inception-v4 has more unified simplified structure and more inception modules.
The big picture of Inception-v4:
Fig9 is an overall picture, and Fig3,4,5,6,7,8 are all local structures. For the specific structure of each
module, see the end of the article.
One is named Inception-ResNet-v1, which is consistent with the calculation cost of Inception-v3. One is
named Inception-ResNet-v2, which is consistent with the calculation cost of Inception-v4.
Figure 15 shows the structure of both. However, Inception-v4 is actually slower in practice, probably
because it has more layers.
Another small technique is that we use the BN layer in the header of the traditional layer in the Inception-
ResNet module, but not in the header of the summations. ** There is reason to believe that the BN layer is
effective. But in order to add more Inception modules, we made a compromise between the two.
Inception-ResNet-v1
Inception-ResNet-v2
This article finds that scale can stabilize the training process before adding the residual module to the
activation layer . This article sets the scale coefficient between 0.1 and 0.3.
In order to prevent the occurrence of unstable training of deep residual networks, He suggested in the
article that it is divided into two stages of training. The first stage is called warm-up (preheating) , that is,
training the model with a very low learning first. In the second stage, a higher learning rate is used. And this
article finds that if the convolution sum is very high, even a learning rate of 0.00001 cannot solve this
training instability problem, and the high learning rate will also destroy the effect. But this article considers
scale residuals to be more reliable than warm-up.
Even if scal is not strictly necessary, it has no effect on the final accuracy, but it can stabilize the training
process.
Conclusion
Inception-ResNet-v1 : a network architecture combining inception module and resnet module with similar
calculation cost to Inception-v3;
Inception-ResNet-v2 : A more expensive but better performing network architecture.
Inception-v4 : A pure inception module, without residual connections, but with performance similar to
Inception-ResNet-v2.
Fig6-Inception-C: (Inception-v4)
Fig7-Reduction-A: (Inception-v4 & Inception-ResNet-v1 & Inception-ResNet-v2)
Fig8-Reduction-B: (Inception-v4)
Fig10-Inception-ResNet-A: (Inception-ResNet-v1)
Fig11-Inception-ResNet-B: (Inception-ResNet-v1)
Fig12-Reduction-B: (Inception-ResNet-v1)
Fig13-Inception-ResNet-C: (Inception-ResNet-v1)
Fig14-Stem: (Inception-ResNet-v1)
Fig16-Inception-ResNet-A: (Inception-ResNet-v2)
Fig17-Inception-ResNet-B: (Inception-ResNet-v2)
Fig18-Reduction-B: (Inception-ResNet-v2)
Fig19-Inception-ResNet-C: (Inception-ResNet-v2)
Summary
Inception v1 network, 1x1, 3x3, 5x5 conv and 3x3 pooling and stacking together, on the one hand,
increase the width of the network, and on the other hand, increase the adaptability of the network to
scale.
The network of v2 has been improved based on v1. On the one hand, the BN layer has been added to
reduce the internal covariate shift (the internal neuron's data distribution has changed), so that the
output of each layer is normalized to an N (0, 1) Gaussian, on the other hand, learning VGG replaces
5x5 in the inception module with two 3x3 convs, which reduces the number of parameters and speeds
up the calculation.
One of the most important improvements in v3 is Factorization, which decomposes 7x7 into two one-
dimensional convolutions (1x7, 7x1), and 3x3 is the same (1x3, 3x1). This benefit can speed up
calculations (redundant calculations Capacity can be used to deepen the network), and one conv can
be split into two convs, which further increases the network depth and increases the nonlinearity of the
network. It is also worth noting that the network input has changed from 224x224 to 299x299, which is
more refined. Designed 35x35 / 17x17 / 8x8 modules.
v4 studied whether the Inception module combined with the Residual Connection can be improved? It
was found that the structure of ResNet can greatly speed up training and improve performance at the
same time. An Inception-ResNet v2 network was obtained. At the same time, a deeper and more
optimized Inception v4 model was designed to achieve performance comparable to Inception-ResNet
v2
Code implementation
From Scratch
In [ ]: !pip install tflearn
In [ ]: # Get Data
import tflearn.datasets.oxflower17 as oxflower17
from keras.utils import to_categorical
x, y = oxflower17.load_data()
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/tensorflow/python/compa
t/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scop
e) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Downloading Oxford 17 category Flower Dataset, Please wait...
100.0% 60276736 / 60270631
Succesfully downloaded 17flowers.tgz 60270631 bytes.
File Extracted
Starting to parse images...
Parsing Done!
In [ ]: print(x_train.shape)
print(y_train.shape)
In [ ]: # Inception block
def inception_block(x, filters):
tower_1 = Conv2D(filters[0], (1, 1), padding='same', activation='relu')(x)
tower_1 = Conv2D(filters[1], (3, 3), padding='same', activation='relu')(tower_1)
x = AveragePooling2D((4, 4))(x)
x = Flatten()(x)
outputs = Dense(num_classes, activation='softmax')(x)
# Train
model.fit(x_train, y_train, batch_size=64, epochs=5, verbose=1,validation_split=0.2, shu
Model: "model"
________________________________________________________________________________________
__________
Layer (type) Output Shape Param # Connected to
========================================================================================
==========
input_1 (InputLayer) [(None, 224, 224, 3 0 []
)]
8)
4) 'conv2d_4[0][0]',
'conv2d_5[0][0]']
conv2d_6 (Conv2D) (None, 112, 112, 12 18560 ['concatenate[0][0]']
8)
2)
4)
8)
6) 'conv2d_9[0][0]',
'conv2d_10[0][0]']
'conv2d_14[0][0]',
'conv2d_15[0][0]']
'conv2d_19[0][0]',
/usr/local/lib/python3.10/dist-packages/keras/optimizers/legacy/adam.py:117: UserWarnin
g: The `lr` argument is deprecated, use `learning_rate` instead.
super().__init__(name, **kwargs)
'conv2d_20[0][0]']
'conv2d_24[0][0]',
'conv2d_25[0][0]']
'conv2d_29[0][0]',
'conv2d_30[0][0]']
'conv2d_34[0][0]',
'conv2d_35[0][0]']
'conv2d_39[0][0]',
'conv2d_40[0][0]']
)
conv2d_42 (Conv2D) (None, 28, 28, 192) 663744 ['conv2d_41[0][0]']
'conv2d_44[0][0]',
'conv2d_45[0][0]']
ing2D)
========================================================================================
==========
Total params: 5,448,529
Trainable params: 5,448,529
Non-trainable params: 0
________________________________________________________________________________________
__________
Train on 1088 samples, validate on 272 samples
Epoch 1/5
1088/1088 [==============================] - ETA: 0s - loss: 2.8350 - acc: 0.0377
/usr/local/lib/python3.10/dist-packages/keras/engine/training_v1.py:2335: UserWarning: `
Model.state_updates` will be removed in a future version. This property should not be us
ed in TensorFlow 2.0, as `updates` are applied automatically.
updates = self.state_updates
1088/1088 [==============================] - 73s 67ms/sample - loss: 2.8350 - acc: 0.037
7 - val_loss: 2.8338 - val_acc: 0.0368
Epoch 2/5
1088/1088 [==============================] - 23s 21ms/sample - loss: 2.8338 - acc: 0.064
3 - val_loss: 2.8343 - val_acc: 0.0368
Epoch 3/5
1088/1088 [==============================] - 23s 22ms/sample - loss: 2.8332 - acc: 0.064
3 - val_loss: 2.8345 - val_acc: 0.0368
Epoch 4/5
1088/1088 [==============================] - 23s 21ms/sample - loss: 2.8334 - acc: 0.064
3 - val_loss: 2.8361 - val_acc: 0.0368
Epoch 5/5
1088/1088 [==============================] - 23s 21ms/sample - loss: 2.8330 - acc: 0.064
3 - val_loss: 2.8362 - val_acc: 0.0368
<keras.callbacks.History at 0x7f9d3badf010>
Out[ ]:
Pretrained
In [1]: # download the data from g drive
import gdown
url = "https://drive.google.com/file/d/12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP/view?usp=sharin
file_id = url.split("/")[-2]
print(file_id)
prefix = 'https://drive.google.com/uc?/export=download&id='
gdown.download(prefix+file_id, "catdog.zip")
12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP
Downloading...
From: https://drive.google.com/uc?/export=download&id=12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP
To: /content/catdog.zip
100%|██████████| 9.09M/9.09M [00:00<00:00, 83.8MB/s]
'catdog.zip'
Out[1]:
Archive: catdog.zip
creating: train/
creating: train/Cat/
inflating: train/Cat/0.jpg
inflating: train/Cat/1.jpg
inflating: train/Cat/2.jpg
inflating: train/Cat/cat.2405.jpg
inflating: train/Cat/cat.2406.jpg
inflating: train/Cat/cat.2436.jpg
inflating: train/Cat/cat.2437.jpg
inflating: train/Cat/cat.2438.jpg
inflating: train/Cat/cat.2439.jpg
inflating: train/Cat/cat.2440.jpg
inflating: train/Cat/cat.2441.jpg
inflating: train/Cat/cat.2442.jpg
inflating: train/Cat/cat.2443.jpg
inflating: train/Cat/cat.2444.jpg
inflating: train/Cat/cat.2445.jpg
inflating: train/Cat/cat.2446.jpg
inflating: train/Cat/cat.2447.jpg
inflating: train/Cat/cat.2448.jpg
inflating: train/Cat/cat.2449.jpg
inflating: train/Cat/cat.2450.jpg
inflating: train/Cat/cat.2451.jpg
inflating: train/Cat/cat.2452.jpg
inflating: train/Cat/cat.2453.jpg
inflating: train/Cat/cat.2454.jpg
inflating: train/Cat/cat.2455.jpg
inflating: train/Cat/cat.2456.jpg
inflating: train/Cat/cat.2457.jpg
inflating: train/Cat/cat.2458.jpg
inflating: train/Cat/cat.2459.jpg
inflating: train/Cat/cat.2460.jpg
inflating: train/Cat/cat.2461.jpg
inflating: train/Cat/cat.2462.jpg
inflating: train/Cat/cat.2463.jpg
inflating: train/Cat/cat.2464.jpg
inflating: train/Cat/cat.855.jpg
inflating: train/Cat/cat.856.jpg
inflating: train/Cat/cat.857.jpg
inflating: train/Cat/cat.858.jpg
inflating: train/Cat/cat.859.jpg
inflating: train/Cat/cat.86.jpg
inflating: train/Cat/cat.860.jpg
inflating: train/Cat/cat.861.jpg
inflating: train/Cat/cat.862.jpg
inflating: train/Cat/cat.863.jpg
inflating: train/Cat/cat.864.jpg
inflating: train/Cat/cat.865.jpg
inflating: train/Cat/cat.866.jpg
inflating: train/Cat/cat.867.jpg
inflating: train/Cat/cat.868.jpg
inflating: train/Cat/cat.869.jpg
inflating: train/Cat/cat.87.jpg
inflating: train/Cat/cat.870.jpg
inflating: train/Cat/cat.871.jpg
inflating: train/Cat/cat.872.jpg
inflating: train/Cat/cat.873.jpg
inflating: train/Cat/cat.874.jpg
inflating: train/Cat/cat.875.jpg
inflating: train/Cat/cat.876.jpg
inflating: train/Cat/cat.877.jpg
inflating: train/Cat/cat.878.jpg
inflating: train/Cat/cat.879.jpg
inflating: train/Cat/cat.88.jpg
inflating: train/Cat/cat.880.jpg
inflating: train/Cat/cat.881.jpg
inflating: train/Cat/cat.882.jpg
inflating: train/Cat/cat.883.jpg
inflating: train/Cat/cat.884.jpg
inflating: train/Cat/cat.885.jpg
inflating: train/Cat/cat.886.jpg
inflating: train/Cat/cat.887.jpg
inflating: train/Cat/cat.888.jpg
inflating: train/Cat/cat.889.jpg
inflating: train/Cat/cat.89.jpg
inflating: train/Cat/cat.890.jpg
inflating: train/Cat/cat.891.jpg
inflating: train/Cat/cat.892.jpg
inflating: train/Cat/cat.893.jpg
inflating: train/Cat/cat.894.jpg
inflating: train/Cat/cat.895.jpg
inflating: train/Cat/cat.896.jpg
inflating: train/Cat/cat.897.jpg
inflating: train/Cat/cat.898.jpg
inflating: train/Cat/cat.899.jpg
inflating: train/Cat/cat.9.jpg
inflating: train/Cat/cat.90.jpg
inflating: train/Cat/cat.900.jpg
inflating: train/Cat/cat.901.jpg
inflating: train/Cat/cat.902.jpg
inflating: train/Cat/cat.903.jpg
inflating: train/Cat/cat.904.jpg
inflating: train/Cat/cat.905.jpg
inflating: train/Cat/cat.906.jpg
inflating: train/Cat/cat.907.jpg
inflating: train/Cat/cat.908.jpg
inflating: train/Cat/cat.909.jpg
inflating: train/Cat/cat.91.jpg
inflating: train/Cat/cat.910.jpg
inflating: train/Cat/cat.911.jpg
inflating: train/Cat/cat.912.jpg
inflating: train/Cat/cat.913.jpg
inflating: train/Cat/cat.914.jpg
inflating: train/Cat/cat.915.jpg
inflating: train/Cat/cat.916.jpg
inflating: train/Cat/cat.917.jpg
inflating: train/Cat/cat.918.jpg
inflating: train/Cat/cat.919.jpg
inflating: train/Cat/cat.92.jpg
inflating: train/Cat/cat.920.jpg
inflating: train/Cat/cat.93.jpg
inflating: train/Cat/cat.94.jpg
inflating: train/Cat/cat.946.jpg
inflating: train/Cat/cat.947.jpg
inflating: train/Cat/cat.948.jpg
inflating: train/Cat/cat.949.jpg
inflating: train/Cat/cat.95.jpg
inflating: train/Cat/cat.950.jpg
inflating: train/Cat/cat.951.jpg
inflating: train/Cat/cat.952.jpg
inflating: train/Cat/cat.953.jpg
inflating: train/Cat/cat.954.jpg
inflating: train/Cat/cat.955.jpg
inflating: train/Cat/cat.956.jpg
inflating: train/Cat/cat.957.jpg
inflating: train/Cat/cat.958.jpg
inflating: train/Cat/cat.959.jpg
inflating: train/Cat/cat.96.jpg
inflating: train/Cat/cat.960.jpg
inflating: train/Cat/cat.961.jpg
inflating: train/Cat/cat.962.jpg
inflating: train/Cat/cat.963.jpg
inflating: train/Cat/cat.964.jpg
inflating: train/Cat/cat.965.jpg
inflating: train/Cat/cat.966.jpg
inflating: train/Cat/cat.967.jpg
inflating: train/Cat/cat.968.jpg
inflating: train/Cat/cat.969.jpg
inflating: train/Cat/cat.97.jpg
inflating: train/Cat/cat.970.jpg
inflating: train/Cat/cat.971.jpg
inflating: train/Cat/cat.972.jpg
inflating: train/Cat/cat.973.jpg
inflating: train/Cat/cat.974.jpg
inflating: train/Cat/cat.975.jpg
inflating: train/Cat/cat.976.jpg
inflating: train/Cat/cat.977.jpg
inflating: train/Cat/cat.978.jpg
inflating: train/Cat/cat.979.jpg
inflating: train/Cat/cat.98.jpg
inflating: train/Cat/cat.980.jpg
inflating: train/Cat/cat.981.jpg
inflating: train/Cat/cat.982.jpg
inflating: train/Cat/cat.983.jpg
inflating: train/Cat/cat.984.jpg
inflating: train/Cat/cat.985.jpg
inflating: train/Cat/cat.986.jpg
inflating: train/Cat/cat.987.jpg
inflating: train/Cat/cat.988.jpg
inflating: train/Cat/cat.989.jpg
inflating: train/Cat/cat.99.jpg
inflating: train/Cat/cat.990.jpg
inflating: train/Cat/cat.991.jpg
inflating: train/Cat/cat.992.jpg
inflating: train/Cat/cat.993.jpg
inflating: train/Cat/cat.994.jpg
inflating: train/Cat/cat.995.jpg
inflating: train/Cat/cat.996.jpg
inflating: train/Cat/cat.997.jpg
inflating: train/Cat/cat.998.jpg
inflating: train/Cat/cat.999.jpg
creating: train/Dog/
inflating: train/Dog/10493.jpg
inflating: train/Dog/11785.jpg
inflating: train/Dog/9839.jpg
inflating: train/Dog/dog.2432.jpg
inflating: train/Dog/dog.2433.jpg
inflating: train/Dog/dog.2434.jpg
inflating: train/Dog/dog.2435.jpg
inflating: train/Dog/dog.2436.jpg
inflating: train/Dog/dog.2437.jpg
inflating: train/Dog/dog.2438.jpg
inflating: train/Dog/dog.2439.jpg
inflating: train/Dog/dog.2440.jpg
inflating: train/Dog/dog.2441.jpg
inflating: train/Dog/dog.2442.jpg
inflating: train/Dog/dog.2443.jpg
inflating: train/Dog/dog.2444.jpg
inflating: train/Dog/dog.2445.jpg
inflating: train/Dog/dog.2446.jpg
inflating: train/Dog/dog.2447.jpg
inflating: train/Dog/dog.2448.jpg
inflating: train/Dog/dog.2449.jpg
inflating: train/Dog/dog.2450.jpg
inflating: train/Dog/dog.2451.jpg
inflating: train/Dog/dog.2452.jpg
inflating: train/Dog/dog.2453.jpg
inflating: train/Dog/dog.2454.jpg
inflating: train/Dog/dog.2455.jpg
inflating: train/Dog/dog.2456.jpg
inflating: train/Dog/dog.2457.jpg
inflating: train/Dog/dog.2458.jpg
inflating: train/Dog/dog.2459.jpg
inflating: train/Dog/dog.2460.jpg
inflating: train/Dog/dog.2461.jpg
inflating: train/Dog/dog.844.jpg
inflating: train/Dog/dog.845.jpg
inflating: train/Dog/dog.846.jpg
inflating: train/Dog/dog.847.jpg
inflating: train/Dog/dog.848.jpg
inflating: train/Dog/dog.849.jpg
inflating: train/Dog/dog.85.jpg
inflating: train/Dog/dog.850.jpg
inflating: train/Dog/dog.851.jpg
inflating: train/Dog/dog.852.jpg
inflating: train/Dog/dog.853.jpg
inflating: train/Dog/dog.854.jpg
inflating: train/Dog/dog.855.jpg
inflating: train/Dog/dog.856.jpg
inflating: train/Dog/dog.857.jpg
inflating: train/Dog/dog.858.jpg
inflating: train/Dog/dog.859.jpg
inflating: train/Dog/dog.86.jpg
inflating: train/Dog/dog.860.jpg
inflating: train/Dog/dog.861.jpg
inflating: train/Dog/dog.862.jpg
inflating: train/Dog/dog.863.jpg
inflating: train/Dog/dog.864.jpg
inflating: train/Dog/dog.865.jpg
inflating: train/Dog/dog.866.jpg
inflating: train/Dog/dog.867.jpg
inflating: train/Dog/dog.868.jpg
inflating: train/Dog/dog.869.jpg
inflating: train/Dog/dog.87.jpg
inflating: train/Dog/dog.870.jpg
inflating: train/Dog/dog.871.jpg
inflating: train/Dog/dog.872.jpg
inflating: train/Dog/dog.873.jpg
inflating: train/Dog/dog.874.jpg
inflating: train/Dog/dog.875.jpg
inflating: train/Dog/dog.876.jpg
inflating: train/Dog/dog.877.jpg
inflating: train/Dog/dog.878.jpg
inflating: train/Dog/dog.879.jpg
inflating: train/Dog/dog.88.jpg
inflating: train/Dog/dog.880.jpg
inflating: train/Dog/dog.881.jpg
inflating: train/Dog/dog.882.jpg
inflating: train/Dog/dog.883.jpg
inflating: train/Dog/dog.884.jpg
inflating: train/Dog/dog.885.jpg
inflating: train/Dog/dog.886.jpg
inflating: train/Dog/dog.887.jpg
inflating: train/Dog/dog.888.jpg
inflating: train/Dog/dog.889.jpg
inflating: train/Dog/dog.89.jpg
inflating: train/Dog/dog.890.jpg
inflating: train/Dog/dog.891.jpg
inflating: train/Dog/dog.892.jpg
inflating: train/Dog/dog.893.jpg
inflating: train/Dog/dog.894.jpg
inflating: train/Dog/dog.895.jpg
inflating: train/Dog/dog.896.jpg
inflating: train/Dog/dog.897.jpg
inflating: train/Dog/dog.898.jpg
inflating: train/Dog/dog.9.jpg
inflating: train/Dog/dog.90.jpg
inflating: train/Dog/dog.91.jpg
inflating: train/Dog/dog.92.jpg
inflating: train/Dog/dog.93.jpg
inflating: train/Dog/dog.936.jpg
inflating: train/Dog/dog.937.jpg
inflating: train/Dog/dog.938.jpg
inflating: train/Dog/dog.939.jpg
inflating: train/Dog/dog.94.jpg
inflating: train/Dog/dog.940.jpg
inflating: train/Dog/dog.941.jpg
inflating: train/Dog/dog.942.jpg
inflating: train/Dog/dog.943.jpg
inflating: train/Dog/dog.944.jpg
inflating: train/Dog/dog.945.jpg
inflating: train/Dog/dog.946.jpg
inflating: train/Dog/dog.947.jpg
inflating: train/Dog/dog.948.jpg
inflating: train/Dog/dog.949.jpg
inflating: train/Dog/dog.95.jpg
inflating: train/Dog/dog.950.jpg
inflating: train/Dog/dog.951.jpg
inflating: train/Dog/dog.952.jpg
inflating: train/Dog/dog.953.jpg
inflating: train/Dog/dog.954.jpg
inflating: train/Dog/dog.955.jpg
inflating: train/Dog/dog.956.jpg
inflating: train/Dog/dog.957.jpg
inflating: train/Dog/dog.958.jpg
inflating: train/Dog/dog.959.jpg
inflating: train/Dog/dog.96.jpg
inflating: train/Dog/dog.960.jpg
inflating: train/Dog/dog.961.jpg
inflating: train/Dog/dog.962.jpg
inflating: train/Dog/dog.963.jpg
inflating: train/Dog/dog.964.jpg
inflating: train/Dog/dog.965.jpg
inflating: train/Dog/dog.966.jpg
inflating: train/Dog/dog.967.jpg
inflating: train/Dog/dog.968.jpg
inflating: train/Dog/dog.969.jpg
inflating: train/Dog/dog.97.jpg
inflating: train/Dog/dog.970.jpg
inflating: train/Dog/dog.971.jpg
inflating: train/Dog/dog.972.jpg
inflating: train/Dog/dog.973.jpg
inflating: train/Dog/dog.974.jpg
inflating: train/Dog/dog.975.jpg
inflating: train/Dog/dog.976.jpg
inflating: train/Dog/dog.977.jpg
inflating: train/Dog/dog.978.jpg
inflating: train/Dog/dog.979.jpg
inflating: train/Dog/dog.98.jpg
inflating: train/Dog/dog.980.jpg
inflating: train/Dog/dog.981.jpg
inflating: train/Dog/dog.982.jpg
inflating: train/Dog/dog.983.jpg
inflating: train/Dog/dog.984.jpg
inflating: train/Dog/dog.985.jpg
inflating: train/Dog/dog.986.jpg
inflating: train/Dog/dog.987.jpg
inflating: train/Dog/dog.988.jpg
inflating: train/Dog/dog.989.jpg
inflating: train/Dog/dog.99.jpg
inflating: train/Dog/dog.990.jpg
inflating: train/Dog/dog.991.jpg
inflating: train/Dog/dog.992.jpg
inflating: train/Dog/dog.993.jpg
inflating: train/Dog/dog.994.jpg
inflating: train/Dog/dog.995.jpg
inflating: train/Dog/dog.996.jpg
inflating: train/Dog/dog.997.jpg
inflating: train/Dog/dog.998.jpg
inflating: train/Dog/dog.999.jpg
creating: validation/
creating: validation/Cat/
inflating: validation/Cat/cat.2407.jpg
inflating: validation/Cat/cat.2408.jpg
inflating: validation/Cat/cat.2409.jpg
inflating: validation/Cat/cat.2410.jpg
inflating: validation/Cat/cat.2411.jpg
inflating: validation/Cat/cat.2412.jpg
inflating: validation/Cat/cat.2413.jpg
inflating: validation/Cat/cat.2414.jpg
inflating: validation/Cat/cat.2415.jpg
inflating: validation/Cat/cat.2416.jpg
inflating: validation/Cat/cat.2417.jpg
inflating: validation/Cat/cat.2418.jpg
inflating: validation/Cat/cat.2419.jpg
inflating: validation/Cat/cat.2420.jpg
inflating: validation/Cat/cat.2421.jpg
inflating: validation/Cat/cat.2422.jpg
inflating: validation/Cat/cat.2423.jpg
inflating: validation/Cat/cat.2424.jpg
inflating: validation/Cat/cat.2425.jpg
inflating: validation/Cat/cat.2426.jpg
inflating: validation/Cat/cat.2427.jpg
inflating: validation/Cat/cat.2428.jpg
inflating: validation/Cat/cat.2429.jpg
inflating: validation/Cat/cat.2430.jpg
inflating: validation/Cat/cat.2431.jpg
inflating: validation/Cat/cat.2432.jpg
inflating: validation/Cat/cat.2433.jpg
inflating: validation/Cat/cat.2434.jpg
inflating: validation/Cat/cat.2435.jpg
creating: validation/Dog/
inflating: validation/Dog/dog.2402.jpg
inflating: validation/Dog/dog.2403.jpg
inflating: validation/Dog/dog.2404.jpg
inflating: validation/Dog/dog.2405.jpg
inflating: validation/Dog/dog.2406.jpg
inflating: validation/Dog/dog.2407.jpg
inflating: validation/Dog/dog.2408.jpg
inflating: validation/Dog/dog.2409.jpg
inflating: validation/Dog/dog.2410.jpg
inflating: validation/Dog/dog.2411.jpg
inflating: validation/Dog/dog.2412.jpg
inflating: validation/Dog/dog.2413.jpg
inflating: validation/Dog/dog.2414.jpg
inflating: validation/Dog/dog.2415.jpg
inflating: validation/Dog/dog.2416.jpg
inflating: validation/Dog/dog.2417.jpg
inflating: validation/Dog/dog.2418.jpg
inflating: validation/Dog/dog.2419.jpg
inflating: validation/Dog/dog.2420.jpg
inflating: validation/Dog/dog.2421.jpg
inflating: validation/Dog/dog.2422.jpg
inflating: validation/Dog/dog.2423.jpg
inflating: validation/Dog/dog.2424.jpg
inflating: validation/Dog/dog.2425.jpg
inflating: validation/Dog/dog.2426.jpg
inflating: validation/Dog/dog.2427.jpg
inflating: validation/Dog/dog.2428.jpg
inflating: validation/Dog/dog.2429.jpg
inflating: validation/Dog/dog.2430.jpg
inflating: validation/Dog/dog.2431.jpg
Training
In [3]: from tensorflow import keras
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(224, 224),
batch_size=batch_size,
class_mode='binary')
validation_generator = validation_datagen.flow_from_directory(
validation_data_dir,
target_size=(224, 224),
batch_size=batch_size,
class_mode='binary')
Prediction
In [4]: import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
import os
img_path = "/content/train/Cat/1.jpg"
In [8]: test_image = image.load_img(img_path, target_size = (224,224))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
result = np.argmax(model.predict(test_image), axis=1)
print(result)
In [9]: if result[0] == 1:
prediction = 'dog'
print(prediction)
else:
prediction = 'cat'
print(prediction)
cat
In [ ]:
Resnet
Introduction
ResNet is a network structure proposed by the He Kaiming, Sun Jian and others of Microsoft
Research Asia in 2015, and won the first place in the ILSVRC-2015 classification task. At the same
time, it won the first place in ImageNet detection, ImageNet localization, COCO detection, and COCO
segmentation tasks. It was a sensation at the time.
ResNet, also known as residual neural network, refers to the idea of adding residual learning to the
traditional convolutional neural network, which solves the problem of gradient dispersion and accuracy
degradation (training set) in deep networks, so that the network can get more and more The deeper, both
the accuracy and the speed are controlled.
Deep Residual Learning for Image Recognition Original link : ResNet Paper
The first problem brought by increasing depth is the problem of gradient explosion / dissipation . This is
because as the number of layers increases, the gradient of backpropagation in the network will become
unstable with continuous multiplication, and become particularly large or special. small. Among them ,
the problem of gradient dissipation often occurs .
In order to overcome gradient dissipation, many solutions have been devised, such as using
BatchNorm, replacing the activation function with ReLu, using Xaiver initialization, etc. It can be said
that gradient dissipation has been well solved
Another problem of increasing depth is the problem of network degradation, that is, as the depth
increases, the performance of the network will become worse and worse, which is directly reflected in
the decrease in accuracy on the training set. The residual network article solves this problem. And after
this problem is solved, the depth of the network has increased by several orders of magnitude.
With network depth increasing, accuracy gets saturated (which might be unsurprising) and
then degrades rapidly. Unexpectedly, such degradation is not caused by overfitting, and
adding more layers to a favored deep model leads to higher training error.
The above figure is the error rate of the training set classified by the network on the CIFAR10-data set
with the increase of the network depth . It can be seen that if we directly stack the convolutional layers,
as the number of layers increases, the error rate increases significantly. The trend is that the deepest
56-layer network has the worst accuracy . We verified it on the VGG network. For the CIFAR-10
dataset, it took 5 minutes on the 18-layer VGG network to get the full network training. The 80%
accuracy rate was achieved, and the 34-layer VGG model took 8 minutes to get the 72% accuracy rate.
The problem of network degradation does exist.
The decrease in the training set error rate indicates that the problem of degradation is not caused by
overfitting. The specific reason is that it is left for further study. The author's other paper "Identity
Mappings in Deep Residual Networks" proved the occurrence of degradation. It is because the
optimization performance is not good, which indicates that the deeper the network, the more difficult the
reverse gradient is to conduct.
We can imagine that when we simply stack the network directly to a particularly long length, the internal
characteristics of the network have reached the best situation in one of the layers. At this time, the
remaining layers should not make any changes to the characteristics and learn automatically. The form of
identity mapping. That is to say, for a particularly deep deep network, the solution space of the shallow form
of the network should be a subset of the solution space of the deep network, in other words, a network
deeper than the shallow network will not have at least Worse effect, but this is not true because of network
degradation.
Then, we settle for the second best. In the case of network degradation, if we do not add depth, we can
improve the accuracy. Can we at least make the deep network achieve the same performance as the
shallow network, that is, let the layers behind the deep network achieve at least The role of identity mapping
. Based on this idea, the author proposes a residual module to help the network achieve identity mapping.
To understand ResNet, we must first understand what kind of problems will occur
when the network becomes deeper.
The first problem brought by increasing the network depth is the disappearance and explosion of
the gradient.
This problem was successfully solved after Szegedy proposed the BN (Batch Normalization) structure.
The BN layer can normalize the output of each layer. The size can still be kept stable after the reverse layer
transfer, and it will not be too small or too large.
The answer is still negative. The author mentioned the second problem-the degradation problem: when
the level reaches a certain level, the accuracy will saturate and then decline rapidly. This decline is not
caused by the disappearance of the gradient. It is not caused by overfit, but because the network is so
complicated that it is difficult to achieve the ideal error rate by unconstrained stocking training alone.
The degradation problem is not a problem of the network structure itself, but is caused by the current
insufficient training methods. The currently widely used training methods, whether it is SGD, AdaGrad, or
RMSProp, cannot reach the theoretically optimal convergence result after the network depth becomes
larger.
We can also prove that as long as there is an ideal training method, deeper networks will definitely perform
better than shallow networks.
The proof process is also very simple: Suppose that several layers are added behind a
network A to form a new network B. If the added level is just an identity
mapping of the output of A, that is, the output of A is after the level of B
becomes the output of B, there is no change, so the error rates of network A
and network B are equal, which proves that the deepened network will not be
worse than the network before deepening.
He Kaiming proposed a residual structure to implement the above identity mapping (Below Figure): In
addition to the normal convolution layer output, the entire module has a branch directly connecting the input
to the output. The output and the output of the convolution do The final output is obtained by arithmetic
addition. The formula is H (x) = F (x) + x, x is the input, F (x) is the output of the convolution branch, and H
(x) is the output of the entire structure. It can be shown that if all parameters in the F (x) branch are 0, H (x)
is an identity mapping. The residual structure artificially creates an identity map, which can make the entire
structure converge in the direction of the identity map, ensuring that the final error rate will not become
worse because the depth becomes larger. If a network can achieve the desired result by simply setting the
parameter values by hand, then this structure can easily converge to the result through training. This is a
rule that is unsuccessful when designing complex networks. Recall that in order to restore the original
distribution after BN processing, the formula y = rx + delta is used. When r is manually set to standard
deviation and delta is the mean, y is the distribution before BN processing. This is the use of this Rules.
What does residual learning mean?
The idea of residual learning is the above picture, which can be understood as a block, defined as follows:
1. Identity mapping refers to the curved curve on the right side of the figure above. As its name implies,
identity mapping refers to its own mapping, which is x itself;
1. F(x) Residual mapping refers to another branch, that is, part. This part is called residual mapping (y -x)
.
The residual module will significantly reduce the parameter value in the module, so that the parameters
in the network have a more sensitive response ability to the loss of reverse conduction, although the
fundamental It does not solve the problem that the loss of backhaul is too small, but it reduces the
parameters. Relatively speaking, it increases the effect of backhaul loss and also generates a certain
regularization effect.
Secondly, because there are branches of the identity mapping in the forward process, the gradient
conduction in the back-propagation process also has more simple paths , and the gradient can be
transmitted to the previous module after only one relu.
The so-called backpropagation is that the network outputs a value, and then compares it with the real
value to an error loss. At the same time, the loss is changed to change the parameter. The returned
loss depends on the original loss and gradient. Since the purpose is to change the parameter, The
problem is that if the intensity of changing the parameter is too small, the value of the parameter can be
reduced, so that the loss of the intensity of changing the parameter is relatively greater.
Therefore, the most important role of the residual module is to change the way of forward and
backward information transmission, thereby greatly promoting the optimization of the network.
Using the four criteria proposed by Inceptionv3, we will use them again to improve the residual module.
Using criterion 3, the dimensionality reduction before spatial aggregation will not cause information
loss, so the same method is also used here, adding 1 * 1 convolution The kernel is used to increase the
non-linearity and reduce the depth of the output to reduce the computational cost. You get the form of a
residual module that becomes a bottleneck. The figure above shows the basic form on the left and the
bottleneck form on the right.
To sum up, the shortcut module will help the features in the network perform identity mapping in the
forward process, and help conduct gradients in the reverse process, so that deeper models can be
successfully trained.
Why can the residual learning solve the problem of "the accuracy of the network
deepening declines"?
For a neural network model, if the model is optimal, then training can easily optimize the residual mapping
to 0, and only identity mapping is left at this time. No matter how you increase the depth, the network will
always be in an optimal state in theory. Because it is equivalent to all the subsequent added networks to
carry information transmission along the identity mapping (self), it can be understood that the number of
layers behind the optimal network is discarded (without the ability to extract features), and it does not
actually play a role. . In this way, the performance of the network will not decrease with increasing depth.
The author used two types of data, ImageNet and CIFAR, to prove the effectiveness of ResNet:
The first is ImageNet. The authors compared the training effect of ResNet structure and traditional structure
with the same number of layers. The left side of Figure is a VGG-19 network with a traditional structure
(each followed by BN), the middle is a 34-layer network with a traditional structure (each followed by BN),
and the right side is 34 layers ResNet (the solid line indicates a direct connection, and the dashed line
indicates a dimensional change using 1x1 convolution to match the number of features of the input and
output). Figure 3 shows the results after training these types of networks.
The data on the left shows that the 34-layer network (red line) with the traditional structure has a higher
error rate than the VGG-19 (blue-green line). Because the BN structure is added to each layer Therefore,
the high error is not caused by the gradient disappearing after the level is increased, but by the degradation
problem; the ResNet structure on the right side of Figure 3 shows that the 34-layer network (red line) has a
higher error rate than the 18-layer network (blue-green line). Low, this is because the ResNet structure has
overcome the degradation problem. In addition, the final error rate of the ResNet 18-layer network on the
right is similar to the error rate of the traditional 18-layer network on the left. This is because the 18-layer
network is simpler and can converge to a more ideal result even without the ResNet structure.
The ResNet structure like the left side of Fig. 4 is only used for shallow ResNet networks. If there are many
network layers, the dimensions near the output end of the network will be very large. Still using the structure
on the left side of Fig. 4 will cause a huge amount of calculation. For deeper networks, we all use the
bottleneck structure on the right side of Figure 4, first using a 1x1 convolution for dimensionality reduction,
then 3x3 convolution, and finally using 1x1 dimensionality to restore the original dimension.
In practice, considering the cost of the calculation, the residual block is calculated and optimized, that is, the
two 3x3 convolution layers are replaced with 1x1 + 3x3 + 1x1 , as shown below. The middle 3x3
convolutional layer in the new structure first reduces the calculation under one dimensionality-reduced 1x1
convolutional layer , and then restores it under another 1x1 convolutional layer , both maintaining accuracy
and reducing the amount of calculation .
This is equivalent to reducing the amount of parameters for the same number of layers , so it can be
extended to deeper models. So the author proposed ResNet with 50, 101 , and 152 layers , and not only did
not have degradation problems, the error rate was greatly reduced, and the computational complexity was
also kept at a very low level .
At this time, the error rate of ResNet has already dropped other networks a few streets, but it does not seem
to be satisfied. Therefore, a more abnormal 1202 layer network has been built. For such a deep network,
optimization is still not difficult, but it appears The problem of overfitting is quite normal. The author also said
that the 1202 layer model will be further improved in the future.
Diffrent Variants : -
From Scratch
In [1]: !pip install tflearn
x, y = oxflower17.load_data()
In [4]: print(x_train.shape)
print(y_train.shape)
# Residual blocks
x = residual_block(x, filters=16)
x = residual_block(x, filters=16)
x = residual_block(x, filters=32, downsample=True)
x = residual_block(x, filters=32)
x = residual_block(x, filters=64, downsample=True)
x = residual_block(x, filters=64)
# Train
model.fit(x_train, y_train, batch_size=64, epochs=5, verbose=1,validation_split=0.2, shu
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/keras/layers/normalizati
on/batch_normalization.py:581: _colocate_with (from tensorflow.python.framework.ops) is
deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
/usr/local/lib/python3.10/dist-packages/keras/optimizers/legacy/adam.py:117: UserWarnin
g: The `lr` argument is deprecated, use `learning_rate` instead.
super().__init__(name, **kwargs)
Train on 1088 samples, validate on 272 samples
Epoch 1/5
1088/1088 [==============================] - ETA: 0s - loss: 2.8100 - acc: 0.1967
/usr/local/lib/python3.10/dist-packages/keras/engine/training_v1.py:2335: UserWarning: `
Model.state_updates` will be removed in a future version. This property should not be us
ed in TensorFlow 2.0, as `updates` are applied automatically.
updates = self.state_updates
1088/1088 [==============================] - 24s 22ms/sample - loss: 2.8100 - acc: 0.196
7 - val_loss: 2.8402 - val_acc: 0.0699
Epoch 2/5
1088/1088 [==============================] - 9s 9ms/sample - loss: 1.6519 - acc: 0.4550
- val_loss: 2.8888 - val_acc: 0.0735
Epoch 3/5
1088/1088 [==============================] - 9s 9ms/sample - loss: 1.2091 - acc: 0.5928
- val_loss: 2.9678 - val_acc: 0.0735
Epoch 4/5
1088/1088 [==============================] - 10s 9ms/sample - loss: 0.9970 - acc: 0.6765
- val_loss: 3.1419 - val_acc: 0.0735
Epoch 5/5
1088/1088 [==============================] - 10s 9ms/sample - loss: 0.7557 - acc: 0.7445
- val_loss: 3.3201 - val_acc: 0.0404
<keras.callbacks.History at 0x7f2260d58df0>
Out[6]:
Pretrained
In [7]: # download the data from g drive
import gdown
url = "https://drive.google.com/file/d/12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP/view?usp=sharin
file_id = url.split("/")[-2]
print(file_id)
prefix = 'https://drive.google.com/uc?/export=download&id='
gdown.download(prefix+file_id, "catdog.zip")
12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP
Downloading...
From: https://drive.google.com/uc?/export=download&id=12jiQxJzYSYl3wnC8x5wHAhRzzJmmsCXP
To: /content/catdog.zip
100%|██████████| 9.09M/9.09M [00:00<00:00, 113MB/s]
'catdog.zip'
Out[7]:
Archive: catdog.zip
creating: train/
creating: train/Cat/
inflating: train/Cat/0.jpg
inflating: train/Cat/1.jpg
inflating: train/Cat/2.jpg
inflating: train/Cat/cat.2405.jpg
inflating: train/Cat/cat.2406.jpg
inflating: train/Cat/cat.2436.jpg
inflating: train/Cat/cat.2437.jpg
inflating: train/Cat/cat.2438.jpg
inflating: train/Cat/cat.2439.jpg
inflating: train/Cat/cat.2440.jpg
inflating: train/Cat/cat.2441.jpg
inflating: train/Cat/cat.2442.jpg
inflating: train/Cat/cat.2443.jpg
inflating: train/Cat/cat.2444.jpg
inflating: train/Cat/cat.2445.jpg
inflating: train/Cat/cat.2446.jpg
inflating: train/Cat/cat.2447.jpg
inflating: train/Cat/cat.2448.jpg
inflating: train/Cat/cat.2449.jpg
inflating: train/Cat/cat.2450.jpg
inflating: train/Cat/cat.2451.jpg
inflating: train/Cat/cat.2452.jpg
inflating: train/Cat/cat.2453.jpg
inflating: train/Cat/cat.2454.jpg
inflating: train/Cat/cat.2455.jpg
inflating: train/Cat/cat.2456.jpg
inflating: train/Cat/cat.2457.jpg
inflating: train/Cat/cat.2458.jpg
inflating: train/Cat/cat.2459.jpg
inflating: train/Cat/cat.2460.jpg
inflating: train/Cat/cat.2461.jpg
inflating: train/Cat/cat.2462.jpg
inflating: train/Cat/cat.2463.jpg
inflating: train/Cat/cat.2464.jpg
inflating: train/Cat/cat.855.jpg
inflating: train/Cat/cat.856.jpg
inflating: train/Cat/cat.857.jpg
inflating: train/Cat/cat.858.jpg
inflating: train/Cat/cat.859.jpg
inflating: train/Cat/cat.86.jpg
inflating: train/Cat/cat.860.jpg
inflating: train/Cat/cat.861.jpg
inflating: train/Cat/cat.862.jpg
inflating: train/Cat/cat.863.jpg
inflating: train/Cat/cat.864.jpg
inflating: train/Cat/cat.865.jpg
inflating: train/Cat/cat.866.jpg
inflating: train/Cat/cat.867.jpg
inflating: train/Cat/cat.868.jpg
inflating: train/Cat/cat.869.jpg
inflating: train/Cat/cat.87.jpg
inflating: train/Cat/cat.870.jpg
inflating: train/Cat/cat.871.jpg
inflating: train/Cat/cat.872.jpg
inflating: train/Cat/cat.873.jpg
inflating: train/Cat/cat.874.jpg
inflating: train/Cat/cat.875.jpg
inflating: train/Cat/cat.876.jpg
inflating: train/Cat/cat.877.jpg
inflating: train/Cat/cat.878.jpg
inflating: train/Cat/cat.879.jpg
inflating: train/Cat/cat.88.jpg
inflating: train/Cat/cat.880.jpg
inflating: train/Cat/cat.881.jpg
inflating: train/Cat/cat.882.jpg
inflating: train/Cat/cat.883.jpg
inflating: train/Cat/cat.884.jpg
inflating: train/Cat/cat.885.jpg
inflating: train/Cat/cat.886.jpg
inflating: train/Cat/cat.887.jpg
inflating: train/Cat/cat.888.jpg
inflating: train/Cat/cat.889.jpg
inflating: train/Cat/cat.89.jpg
inflating: train/Cat/cat.890.jpg
inflating: train/Cat/cat.891.jpg
inflating: train/Cat/cat.892.jpg
inflating: train/Cat/cat.893.jpg
inflating: train/Cat/cat.894.jpg
inflating: train/Cat/cat.895.jpg
inflating: train/Cat/cat.896.jpg
inflating: train/Cat/cat.897.jpg
inflating: train/Cat/cat.898.jpg
inflating: train/Cat/cat.899.jpg
inflating: train/Cat/cat.9.jpg
inflating: train/Cat/cat.90.jpg
inflating: train/Cat/cat.900.jpg
inflating: train/Cat/cat.901.jpg
inflating: train/Cat/cat.902.jpg
inflating: train/Cat/cat.903.jpg
inflating: train/Cat/cat.904.jpg
inflating: train/Cat/cat.905.jpg
inflating: train/Cat/cat.906.jpg
inflating: train/Cat/cat.907.jpg
inflating: train/Cat/cat.908.jpg
inflating: train/Cat/cat.909.jpg
inflating: train/Cat/cat.91.jpg
inflating: train/Cat/cat.910.jpg
inflating: train/Cat/cat.911.jpg
inflating: train/Cat/cat.912.jpg
inflating: train/Cat/cat.913.jpg
inflating: train/Cat/cat.914.jpg
inflating: train/Cat/cat.915.jpg
inflating: train/Cat/cat.916.jpg
inflating: train/Cat/cat.917.jpg
inflating: train/Cat/cat.918.jpg
inflating: train/Cat/cat.919.jpg
inflating: train/Cat/cat.92.jpg
inflating: train/Cat/cat.920.jpg
inflating: train/Cat/cat.93.jpg
inflating: train/Cat/cat.94.jpg
inflating: train/Cat/cat.946.jpg
inflating: train/Cat/cat.947.jpg
inflating: train/Cat/cat.948.jpg
inflating: train/Cat/cat.949.jpg
inflating: train/Cat/cat.95.jpg
inflating: train/Cat/cat.950.jpg
inflating: train/Cat/cat.951.jpg
inflating: train/Cat/cat.952.jpg
inflating: train/Cat/cat.953.jpg
inflating: train/Cat/cat.954.jpg
inflating: train/Cat/cat.955.jpg
inflating: train/Cat/cat.956.jpg
inflating: train/Cat/cat.957.jpg
inflating: train/Cat/cat.958.jpg
inflating: train/Cat/cat.959.jpg
inflating: train/Cat/cat.96.jpg
inflating: train/Cat/cat.960.jpg
inflating: train/Cat/cat.961.jpg
inflating: train/Cat/cat.962.jpg
inflating: train/Cat/cat.963.jpg
inflating: train/Cat/cat.964.jpg
inflating: train/Cat/cat.965.jpg
inflating: train/Cat/cat.966.jpg
inflating: train/Cat/cat.967.jpg
inflating: train/Cat/cat.968.jpg
inflating: train/Cat/cat.969.jpg
inflating: train/Cat/cat.97.jpg
inflating: train/Cat/cat.970.jpg
inflating: train/Cat/cat.971.jpg
inflating: train/Cat/cat.972.jpg
inflating: train/Cat/cat.973.jpg
inflating: train/Cat/cat.974.jpg
inflating: train/Cat/cat.975.jpg
inflating: train/Cat/cat.976.jpg
inflating: train/Cat/cat.977.jpg
inflating: train/Cat/cat.978.jpg
inflating: train/Cat/cat.979.jpg
inflating: train/Cat/cat.98.jpg
inflating: train/Cat/cat.980.jpg
inflating: train/Cat/cat.981.jpg
inflating: train/Cat/cat.982.jpg
inflating: train/Cat/cat.983.jpg
inflating: train/Cat/cat.984.jpg
inflating: train/Cat/cat.985.jpg
inflating: train/Cat/cat.986.jpg
inflating: train/Cat/cat.987.jpg
inflating: train/Cat/cat.988.jpg
inflating: train/Cat/cat.989.jpg
inflating: train/Cat/cat.99.jpg
inflating: train/Cat/cat.990.jpg
inflating: train/Cat/cat.991.jpg
inflating: train/Cat/cat.992.jpg
inflating: train/Cat/cat.993.jpg
inflating: train/Cat/cat.994.jpg
inflating: train/Cat/cat.995.jpg
inflating: train/Cat/cat.996.jpg
inflating: train/Cat/cat.997.jpg
inflating: train/Cat/cat.998.jpg
inflating: train/Cat/cat.999.jpg
creating: train/Dog/
inflating: train/Dog/10493.jpg
inflating: train/Dog/11785.jpg
inflating: train/Dog/9839.jpg
inflating: train/Dog/dog.2432.jpg
inflating: train/Dog/dog.2433.jpg
inflating: train/Dog/dog.2434.jpg
inflating: train/Dog/dog.2435.jpg
inflating: train/Dog/dog.2436.jpg
inflating: train/Dog/dog.2437.jpg
inflating: train/Dog/dog.2438.jpg
inflating: train/Dog/dog.2439.jpg
inflating: train/Dog/dog.2440.jpg
inflating: train/Dog/dog.2441.jpg
inflating: train/Dog/dog.2442.jpg
inflating: train/Dog/dog.2443.jpg
inflating: train/Dog/dog.2444.jpg
inflating: train/Dog/dog.2445.jpg
inflating: train/Dog/dog.2446.jpg
inflating: train/Dog/dog.2447.jpg
inflating: train/Dog/dog.2448.jpg
inflating: train/Dog/dog.2449.jpg
inflating: train/Dog/dog.2450.jpg
inflating: train/Dog/dog.2451.jpg
inflating: train/Dog/dog.2452.jpg
inflating: train/Dog/dog.2453.jpg
inflating: train/Dog/dog.2454.jpg
inflating: train/Dog/dog.2455.jpg
inflating: train/Dog/dog.2456.jpg
inflating: train/Dog/dog.2457.jpg
inflating: train/Dog/dog.2458.jpg
inflating: train/Dog/dog.2459.jpg
inflating: train/Dog/dog.2460.jpg
inflating: train/Dog/dog.2461.jpg
inflating: train/Dog/dog.844.jpg
inflating: train/Dog/dog.845.jpg
inflating: train/Dog/dog.846.jpg
inflating: train/Dog/dog.847.jpg
inflating: train/Dog/dog.848.jpg
inflating: train/Dog/dog.849.jpg
inflating: train/Dog/dog.85.jpg
inflating: train/Dog/dog.850.jpg
inflating: train/Dog/dog.851.jpg
inflating: train/Dog/dog.852.jpg
inflating: train/Dog/dog.853.jpg
inflating: train/Dog/dog.854.jpg
inflating: train/Dog/dog.855.jpg
inflating: train/Dog/dog.856.jpg
inflating: train/Dog/dog.857.jpg
inflating: train/Dog/dog.858.jpg
inflating: train/Dog/dog.859.jpg
inflating: train/Dog/dog.86.jpg
inflating: train/Dog/dog.860.jpg
inflating: train/Dog/dog.861.jpg
inflating: train/Dog/dog.862.jpg
inflating: train/Dog/dog.863.jpg
inflating: train/Dog/dog.864.jpg
inflating: train/Dog/dog.865.jpg
inflating: train/Dog/dog.866.jpg
inflating: train/Dog/dog.867.jpg
inflating: train/Dog/dog.868.jpg
inflating: train/Dog/dog.869.jpg
inflating: train/Dog/dog.87.jpg
inflating: train/Dog/dog.870.jpg
inflating: train/Dog/dog.871.jpg
inflating: train/Dog/dog.872.jpg
inflating: train/Dog/dog.873.jpg
inflating: train/Dog/dog.874.jpg
inflating: train/Dog/dog.875.jpg
inflating: train/Dog/dog.876.jpg
inflating: train/Dog/dog.877.jpg
inflating: train/Dog/dog.878.jpg
inflating: train/Dog/dog.879.jpg
inflating: train/Dog/dog.88.jpg
inflating: train/Dog/dog.880.jpg
inflating: train/Dog/dog.881.jpg
inflating: train/Dog/dog.882.jpg
inflating: train/Dog/dog.883.jpg
inflating: train/Dog/dog.884.jpg
inflating: train/Dog/dog.885.jpg
inflating: train/Dog/dog.886.jpg
inflating: train/Dog/dog.887.jpg
inflating: train/Dog/dog.888.jpg
inflating: train/Dog/dog.889.jpg
inflating: train/Dog/dog.89.jpg
inflating: train/Dog/dog.890.jpg
inflating: train/Dog/dog.891.jpg
inflating: train/Dog/dog.892.jpg
inflating: train/Dog/dog.893.jpg
inflating: train/Dog/dog.894.jpg
inflating: train/Dog/dog.895.jpg
inflating: train/Dog/dog.896.jpg
inflating: train/Dog/dog.897.jpg
inflating: train/Dog/dog.898.jpg
inflating: train/Dog/dog.9.jpg
inflating: train/Dog/dog.90.jpg
inflating: train/Dog/dog.91.jpg
inflating: train/Dog/dog.92.jpg
inflating: train/Dog/dog.93.jpg
inflating: train/Dog/dog.936.jpg
inflating: train/Dog/dog.937.jpg
inflating: train/Dog/dog.938.jpg
inflating: train/Dog/dog.939.jpg
inflating: train/Dog/dog.94.jpg
inflating: train/Dog/dog.940.jpg
inflating: train/Dog/dog.941.jpg
inflating: train/Dog/dog.942.jpg
inflating: train/Dog/dog.943.jpg
inflating: train/Dog/dog.944.jpg
inflating: train/Dog/dog.945.jpg
inflating: train/Dog/dog.946.jpg
inflating: train/Dog/dog.947.jpg
inflating: train/Dog/dog.948.jpg
inflating: train/Dog/dog.949.jpg
inflating: train/Dog/dog.95.jpg
inflating: train/Dog/dog.950.jpg
inflating: train/Dog/dog.951.jpg
inflating: train/Dog/dog.952.jpg
inflating: train/Dog/dog.953.jpg
inflating: train/Dog/dog.954.jpg
inflating: train/Dog/dog.955.jpg
inflating: train/Dog/dog.956.jpg
inflating: train/Dog/dog.957.jpg
inflating: train/Dog/dog.958.jpg
inflating: train/Dog/dog.959.jpg
inflating: train/Dog/dog.96.jpg
inflating: train/Dog/dog.960.jpg
inflating: train/Dog/dog.961.jpg
inflating: train/Dog/dog.962.jpg
inflating: train/Dog/dog.963.jpg
inflating: train/Dog/dog.964.jpg
inflating: train/Dog/dog.965.jpg
inflating: train/Dog/dog.966.jpg
inflating: train/Dog/dog.967.jpg
inflating: train/Dog/dog.968.jpg
inflating: train/Dog/dog.969.jpg
inflating: train/Dog/dog.97.jpg
inflating: train/Dog/dog.970.jpg
inflating: train/Dog/dog.971.jpg
inflating: train/Dog/dog.972.jpg
inflating: train/Dog/dog.973.jpg
inflating: train/Dog/dog.974.jpg
inflating: train/Dog/dog.975.jpg
inflating: train/Dog/dog.976.jpg
inflating: train/Dog/dog.977.jpg
inflating: train/Dog/dog.978.jpg
inflating: train/Dog/dog.979.jpg
inflating: train/Dog/dog.98.jpg
inflating: train/Dog/dog.980.jpg
inflating: train/Dog/dog.981.jpg
inflating: train/Dog/dog.982.jpg
inflating: train/Dog/dog.983.jpg
inflating: train/Dog/dog.984.jpg
inflating: train/Dog/dog.985.jpg
inflating: train/Dog/dog.986.jpg
inflating: train/Dog/dog.987.jpg
inflating: train/Dog/dog.988.jpg
inflating: train/Dog/dog.989.jpg
inflating: train/Dog/dog.99.jpg
inflating: train/Dog/dog.990.jpg
inflating: train/Dog/dog.991.jpg
inflating: train/Dog/dog.992.jpg
inflating: train/Dog/dog.993.jpg
inflating: train/Dog/dog.994.jpg
inflating: train/Dog/dog.995.jpg
inflating: train/Dog/dog.996.jpg
inflating: train/Dog/dog.997.jpg
inflating: train/Dog/dog.998.jpg
inflating: train/Dog/dog.999.jpg
creating: validation/
creating: validation/Cat/
inflating: validation/Cat/cat.2407.jpg
inflating: validation/Cat/cat.2408.jpg
inflating: validation/Cat/cat.2409.jpg
inflating: validation/Cat/cat.2410.jpg
inflating: validation/Cat/cat.2411.jpg
inflating: validation/Cat/cat.2412.jpg
inflating: validation/Cat/cat.2413.jpg
inflating: validation/Cat/cat.2414.jpg
inflating: validation/Cat/cat.2415.jpg
inflating: validation/Cat/cat.2416.jpg
inflating: validation/Cat/cat.2417.jpg
inflating: validation/Cat/cat.2418.jpg
inflating: validation/Cat/cat.2419.jpg
inflating: validation/Cat/cat.2420.jpg
inflating: validation/Cat/cat.2421.jpg
inflating: validation/Cat/cat.2422.jpg
inflating: validation/Cat/cat.2423.jpg
inflating: validation/Cat/cat.2424.jpg
inflating: validation/Cat/cat.2425.jpg
inflating: validation/Cat/cat.2426.jpg
inflating: validation/Cat/cat.2427.jpg
inflating: validation/Cat/cat.2428.jpg
inflating: validation/Cat/cat.2429.jpg
inflating: validation/Cat/cat.2430.jpg
inflating: validation/Cat/cat.2431.jpg
inflating: validation/Cat/cat.2432.jpg
inflating: validation/Cat/cat.2433.jpg
inflating: validation/Cat/cat.2434.jpg
inflating: validation/Cat/cat.2435.jpg
creating: validation/Dog/
inflating: validation/Dog/dog.2402.jpg
inflating: validation/Dog/dog.2403.jpg
inflating: validation/Dog/dog.2404.jpg
inflating: validation/Dog/dog.2405.jpg
inflating: validation/Dog/dog.2406.jpg
inflating: validation/Dog/dog.2407.jpg
inflating: validation/Dog/dog.2408.jpg
inflating: validation/Dog/dog.2409.jpg
inflating: validation/Dog/dog.2410.jpg
inflating: validation/Dog/dog.2411.jpg
inflating: validation/Dog/dog.2412.jpg
inflating: validation/Dog/dog.2413.jpg
inflating: validation/Dog/dog.2414.jpg
inflating: validation/Dog/dog.2415.jpg
inflating: validation/Dog/dog.2416.jpg
inflating: validation/Dog/dog.2417.jpg
inflating: validation/Dog/dog.2418.jpg
inflating: validation/Dog/dog.2419.jpg
inflating: validation/Dog/dog.2420.jpg
inflating: validation/Dog/dog.2421.jpg
inflating: validation/Dog/dog.2422.jpg
inflating: validation/Dog/dog.2423.jpg
inflating: validation/Dog/dog.2424.jpg
inflating: validation/Dog/dog.2425.jpg
inflating: validation/Dog/dog.2426.jpg
inflating: validation/Dog/dog.2427.jpg
inflating: validation/Dog/dog.2428.jpg
inflating: validation/Dog/dog.2429.jpg
inflating: validation/Dog/dog.2430.jpg
inflating: validation/Dog/dog.2431.jpg
train_generator = train_datagen.flow_from_directory(
train_data_dir,
target_size=(224, 224),
batch_size=batch_size,
class_mode='binary')
validation_generator = validation_datagen.flow_from_directory(
validation_data_dir,
target_size=(224, 224),
batch_size=batch_size,
class_mode='binary')