
Convolutional Neural Networks in Python

Master Data Science and Machine Learning with Modern Deep Learning
in Python, Theano, and TensorFlow

By: The LazyProgrammer ([Link])

Introduction

Chapter 1: Review of Feedforward Neural Networks

Chapter 2: Convolution

Chapter 3: The Convolutional Neural Network

Chapter 4: Sample Code in Theano

Chapter 5: Sample Code in TensorFlow

Conclusion
Introduction

This is the 3rd part in my Data Science and Machine Learning series
on Deep Learning in Python. At this point, you already know a lot
about neural networks and deep learning, including not just the
basics like backpropagation, but how to improve it using modern
techniques like momentum and adaptive learning rates. You've
already written deep neural networks in Theano and TensorFlow,
and you know how to run code using the GPU.

This book is all about how to use deep learning for computer vision
using convolutional neural networks. These are the state of the art
when it comes to image classification and they beat vanilla deep
networks at tasks like MNIST.

In this book we are going to up the ante and look at the Street View
House Numbers (SVHN) dataset - which uses larger color images at
various angles - so things are going to get tougher both
computationally and in terms of the difficulty of the classification task.
But we will show that convolutional neural networks, or CNNs, are
capable of handling the challenge!

Because convolution is such a central part of this type of neural
network, we are going to go in-depth on this topic. It has more
applications than you might imagine, such as modeling artificial
organs like the pancreas and the heart. I'm going to show you how to
build convolutional filters that can be applied to audio, like the echo
effect, and I'm going to show you how to build filters for image
effects, like the Gaussian blur and edge detection.

After describing the architecture of a convolutional neural network,
we will jump straight into code, and I will show you how to extend the
deep neural networks we built last time with just a few new functions
to turn them into CNNs. We will then test their performance and
show how convolutional neural networks written in both Theano and
TensorFlow can outperform the accuracy of a plain neural network
on the Street View House Numbers dataset.

All the materials used in this book are FREE. You can download and
install Python, Numpy, Scipy, Theano, and TensorFlow with pip or
easy_install.

Lastly, my goal is to show you that convolutional networks aren’t
magical and they don’t require expert-level math to figure out.
It’s just the same thing we had with regular neural networks:

y = softmax( relu(X.dot(W1)).dot(W2) )

Except we replace the first “dot product” with a convolution:

y = softmax( relu(conv(X, W1)).dot(W2) )

The way they are trained is exactly the same as before, so all your
skills with backpropagation, etc. carry over.
Chapter 1: Review of Feedforward Neural Networks

In this chapter we are going to review some important background
material that is needed in order to understand the material in this
book. I’m not going to cover the material in depth here but rather
just explain what it is that you need to know.

Train and Predict

You should know that the basic API that we can use for all
supervised learning problems is a fit(X, Y) or train(X, Y) function,
which takes in some data X and labels Y, and a predict(X) function
which just takes in some data X and makes a prediction that we will
try to make close to the corresponding Y.

Predict

We know that for neural networks the predict function is also called
the feedforward action, and this is simply the dot product and a
nonlinear function on each layer of the neural network.
e.g. z1 = s(w0x), z2 = s(w1z1), z3 = s(w2z2), y = s(w3z3)

We know that the nonlinearity we use in the hidden layers is usually
a relu, sigmoid, or tanh.

We know that the output is a sigmoid for binary classification and a
softmax for classification with 2 or more classes.
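As a rough illustration of the feedforward pass described above, here is a minimal numpy sketch. The layer sizes, the random weights, and the helper names (forward, relu, softmax) are made up for this example and bias terms are left out; it is not the code from the repo, and it is written for Python 3.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0)

def softmax(a):
    # subtract the row-wise max for numerical stability
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(X, weights):
    # hidden layers: dot product followed by a nonlinearity
    Z = X
    for W in weights[:-1]:
        Z = relu(Z.dot(W))
    # output layer: softmax over the classes
    return softmax(Z.dot(weights[-1]))

rng = np.random.RandomState(0)
X = rng.randn(4, 6)                           # 4 samples, 6 features (made-up sizes)
weights = [rng.randn(6, 5), rng.randn(5, 3)]  # one hidden layer, 3 classes
pY = forward(X, weights)
```

Each row of pY is a probability distribution over the classes, so the prediction is just the argmax of each row.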

Train
We know that training a neural network is simply the application of
gradient descent, which is the same thing we use for logistic
regression and linear regression when we don’t have a closed-form
solution. We know that linear regression has a closed-form solution
but we don’t necessarily have to use it, and that gradient descent is
a more general numerical optimization method.

W ← W - learning_rate * dJ/dW
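To make the update rule concrete, here is a small sketch that applies it to linear regression, where we can compare against the closed-form solution. The data, the learning rate, and the number of steps are arbitrary choices for illustration only.

```python
import numpy as np

rng = np.random.RandomState(1)
X = rng.randn(100, 2)
true_w = np.array([2.0, -3.0])
Y = X.dot(true_w)  # noiseless targets, made up for this example

w = np.zeros(2)
learning_rate = 0.1
for _ in range(200):
    # J = mean squared error, so dJ/dw = (2 / N) * X^T (Xw - Y)
    grad = 2 * X.T.dot(X.dot(w) - Y) / len(Y)
    w = w - learning_rate * grad

# closed-form solution of the same problem, for comparison
w_closed = np.linalg.solve(X.T.dot(X), X.T.dot(Y))
```

Both routes should land on (almost) the same weights; gradient descent just gets there iteratively, which is why it still works when no closed form exists.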

We know that libraries like Theano and TensorFlow will calculate the
gradient for us, which can get very complicated the more layers
there are. You’ll be thankful for this feature of these libraries when
you see that the output function becomes even more complex when
we incorporate convolution (although the derivation is still do-able
and I would recommend trying for practice).

At this point you should be familiar with how the cost function J is
derived from the likelihood and how we might not calculate J over
the entire training data set but rather in batches to improve training
time.

If you want to learn more about backpropagation and gradient
descent you’ll want to check out my first course on deep learning,
Deep Learning in Python part 1, which you can find at
[Link]

Data Preprocessing
When we work with images you know that an image is really a 2-D
array of data, and that if we have a color image we have a 3-D array
of data where one extra dimension is for the red, green, and blue
channels.

In the past, we’ve flattened this array into a vector, which is the usual
input into a neural network, so for example a 28 x 28 image
becomes a 784-dimensional vector, and a 3 x 32 x 32 image becomes
a 3072-dimensional vector.
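The flattening described above is a one-liner in numpy. This sketch uses arrays of zeros as stand-ins for real images; the variable names are only for illustration.

```python
import numpy as np

gray = np.zeros((28, 28))      # stand-in for one MNIST-sized image
color = np.zeros((3, 32, 32))  # stand-in for one SVHN-sized image

gray_vec = gray.flatten()      # 28 * 28 = 784 values
color_vec = color.flatten()    # 3 * 32 * 32 = 3072 values

# a batch of N images flattens to an N x D matrix
batch = np.zeros((10, 3, 32, 32))
batch_mat = batch.reshape(len(batch), -1)
```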

In this book, we are going to keep the dimensions of the original
image for a portion of the processing.

Where to get the data used in this book

This book will use the MNIST dataset (handwritten digits) and the
Street View House Numbers (SVHN) dataset.

The SVHN dataset is a much harder problem than MNIST since the
images are in color, the digits can be at an angle and in different
styles or fonts, and the dimensionality is much larger.

To get the code we use in this book you’ll want to go to:

[Link]

And look in the folder: cnn_class

If you’ve already checked out this repo then simply do a “git pull”
since this code will be on the master branch.

I would highly recommend NOT just running this code but using it as
a backup if yours doesn’t work, and try to follow along with the code
examples by typing them out yourself to build muscle memory.

Once you have the machine_learning_examples repo you’ll want to
create a folder adjacent to the cnn_class folder called large_files if
you haven’t already done that for a previous class.

That is where we will expect all the data to reside.

To get the MNIST data, you’ll want to go to
[Link]

I think it’s pretty straightforward to download at that point. We’re only
going to use the [Link] file since that’s the one with labels. You are
more than welcome to attempt the challenge and submit a solution
using the techniques you learn in this class.

You can get the Street View House Numbers data from
[Link]

You’ll want to get the files under “format 2”, which are the cropped
digits.

Note that these are MATLAB binary data files, so we’ll need to use
the Scipy library to load them, which I’m sure you have heard of if
you’re familiar with the Numpy stack.
Chapter 2: Convolution

In this chapter I’m going to give you guys a crash course in
convolution. If you really want to dig deep on this topic you’ll want to
take a course on signal processing or linear systems.

So what is convolution?

Think of your favorite audio effect (suppose that’s the “echo”). An
echo is simply the same sound bouncing back at you in the future,
but with less volume. We’ll see how we can do that mathematically
later.

All effects can be thought of as filters, like the one I’ve shown here,
and they are often drawn in block diagrams. In machine learning and
statistics these are sometimes called kernels.

        --------
x(t)--->| h(t) |--->y(t)
        --------

I’m representing our audio signal by this triangle. Remember that we
want to do 2 things: we want to hear this audio signal in the future,
which is basically a shift to the right, and this audio signal should
be lower in amplitude than the original.

The last operation is to sum them all together.

Notice that the width of the signal stays the same, because it hasn’t
gotten longer or shorter, which would change the pitch.

So how can we do this in math? Well we can represent the
amplitude changes by weights called w. And for this particular echo
filter we just make sure that each weight is less than the last.

e.g. y(t) = x(t) + 0.5x(t - delay) + 0.2x(t - 2*delay) + 0.1x(t - 3*delay) + …

For any general filter, there wouldn’t be this restriction on the
weights. The weights themselves would define the filter.

And we can write the operation as a summation.

y(n) = sum[m=-inf..+inf]{ h(m)x(n - m) }

So now here is what we consider the “definition” of convolution. We
usually represent it by an asterisk (e.g. y(n) = x(n) * h(n)). We can do
it for a continuous independent variable (where it would involve an
integral instead of a sum) or a discrete independent variable.

You can think of it as “sliding” the filter across the signal: each
value of n puts the filter at a new position, and the sum over m
adds up the overlapping products at that position.

I want to emphasize that it doesn’t matter if we slide the filter across
the signal, or if we slide the signal across the filter, since they would
give us the same result.
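Here is a small sketch of the summation above for finite-length signals, using the echo weights from the example. The delay of 3 samples and the tiny "triangle" signal are made-up values, and convolve1d is just an illustrative helper (numpy already provides np.convolve for this).

```python
import numpy as np

def convolve1d(x, h):
    # y(n) = sum over m of h(m) * x(n - m), for finite-length x and h
    y = np.zeros(len(x) + len(h) - 1)
    for n in range(len(y)):
        for m in range(len(h)):
            if 0 <= n - m < len(x):
                y[n] += h[m] * x[n - m]
    return y

# echo filter: the original sound plus quieter copies every `delay` samples
delay = 3
h = np.zeros(3 * delay + 1)
h[0], h[delay], h[2 * delay], h[3 * delay] = 1.0, 0.5, 0.2, 0.1

x = np.array([0.0, 1.0, 2.0, 1.0, 0.0])  # a small "triangle" signal
y = convolve1d(x, h)
y_swapped = convolve1d(h, x)  # slide the signal across the filter instead
```

Swapping the roles of the signal and the filter gives the same output, which is the commutativity property mentioned above.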

There are some very practical applications of this signal processing
technique.

One of my favorite examples is that we can build artificial organs.
Remember that the organ’s function is to regulate certain parameters
in your body.

So to replace an organ, we would need to build a machine that could
exactly match the response of that organ. In other words, for all the
input parameters, like blood glucose level, we need to output the
same parameters that the organ does, like how much insulin to
produce.

So for every input X we need to output an accurate Y.

In fact, that sounds a lot like machine learning, doesn’t it!

Since we’ll be working with images, we need to talk about
2-dimensional convolution, since images are 2-dimensional signals.

y(m,n) = sum[i=-inf..+inf]{ sum[j=-inf..+inf]{ h(i,j)x(m-i,n-j) } }

You can see from this formula that this just does both convolutions
independently in each direction. I’ve got some pseudocode here to
demonstrate how you might write this in code, but notice there’s a
problem. If i > n or j > m, we’ll go out of bounds.

def convolve(x, w):
    y = np.zeros(x.shape)
    for n in xrange(x.shape[0]):
        for m in xrange(x.shape[1]):
            for i in xrange(w.shape[0]):
                for j in xrange(w.shape[1]):
                    # problem: n-i and m-j can fall outside of x
                    y[n,m] += w[i,j]*x[n-i,m-j]
    return y

What that tells us is that the shape of Y is actually BIGGER than X.

Sometimes we just ignore these extra parts and consider Y to be the
same size as X. You’ll see when we do this in Theano and
TensorFlow how we can control the method in which the size of the
output is determined.
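One way to avoid the out-of-bounds problem is to allocate the bigger output up front and add shifted copies of the input into it. This is only a sketch: convolve2d_full is a hypothetical helper written for this illustration, not the library routine used later in the chapter.

```python
import numpy as np

def convolve2d_full(x, w):
    # the output has shape (N1 + M1 - 1, N2 + M2 - 1)
    n1, n2 = x.shape
    m1, m2 = w.shape
    y = np.zeros((n1 + m1 - 1, n2 + m2 - 1))
    for i in range(m1):
        for j in range(m2):
            # each filter weight adds a shifted, scaled copy of the input
            y[i:i + n1, j:j + n2] += w[i, j] * x
    return y

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
w = np.array([[1.0, 0.0],
              [0.0, 1.0]])
y = convolve2d_full(x, w)
```

With this filter the output is the input plus a copy of itself shifted one step down and to the right, so a 2 x 2 input gives a 3 x 3 output.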
Gaussian Blur

If you’ve ever done image editing with applications like Photoshop or
GIMP you are probably familiar with the blur filter. Sometimes it’s
called a Gaussian blur, and you’ll see why in a minute.

If you just want to see the code that’s already been written, check
out the file
[Link]
b/master/cnn_class/[Link] from GitHub.

The idea is the same as we did with the sound echo. We’re going to
take a signal and spread it out.

But this time instead of having predefined delays we are going to
spread out the signal in the shape of a 2-dimensional Gaussian.

Here is the definition of the filter:

W = np.zeros((20, 20))
for i in xrange(20):
    for j in xrange(20):
        dist = (i - 9.5)**2 + (j - 9.5)**2
        W[i, j] = np.exp(-dist / 50.)
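The same filter can be built without the double loop by using numpy broadcasting. This is a sketch of an equivalent construction; the normalization at the end is an optional extra that is not part of the original code (it keeps the overall brightness of the blurred image unchanged).

```python
import numpy as np

size = 20
center = (size - 1) / 2.0  # 9.5 for a 20 x 20 filter
idx = np.arange(size)

# squared distance of every (i, j) from the center, via broadcasting
dist = (idx[:, None] - center) ** 2 + (idx[None, :] - center) ** 2
W = np.exp(-dist / 50.0)

# optional (an assumption, not in the original): make the weights sum to 1
W_normalized = W / W.sum()
```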

The filter itself looks like this:

And this is the result on the famous Lena image:
The full code

import numpy as np
from scipy.signal import convolve2d
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# load the famous Lena image
img = mpimg.imread('[Link]')

# what does it look like?
plt.imshow(img)
plt.show()

# make it B&W
bw = img.mean(axis=2)
plt.imshow(bw, cmap='gray')
plt.show()

# create a Gaussian filter
W = np.zeros((20, 20))
for i in xrange(20):
    for j in xrange(20):
        dist = (i - 9.5)**2 + (j - 9.5)**2
        W[i, j] = np.exp(-dist / 50.)

# let's see what the filter looks like
plt.imshow(W, cmap='gray')
plt.show()

# now the convolution
out = convolve2d(bw, W)
plt.imshow(out, cmap='gray')
plt.show()

# what's that weird black stuff on the edges? let's check the size of output
print out.shape
# after convolution, the output signal is N1 + N2 - 1

# we can also just make the output the same size as the input
out = convolve2d(bw, W, mode='same')
plt.imshow(out, cmap='gray')
plt.show()
print out.shape

Edge Detection

Edge detection is another important operation in computer vision. If
you just want to see the code that’s already been written, check out
the file
[Link]
b/master/cnn_class/[Link] from GitHub.

Now I’m going to introduce the Sobel operator. The Sobel operator is
defined for 2 directions, X and Y, and they approximate the gradient
at each point of the image. Let’s call them Hx and Hy.

Hx = [Link]([

[-1, 0, 1],

[-2, 0, 2],

[-1, 0, 1],

], dtype=np.float32)
Hy = [Link]([

[-1, -2, -1],

[0, 0, 0],

[1, 2, 1],

], dtype=np.float32)
Now let’s do convolutions on these. So Gx is the convolution
between the image and Hx. Gy is the convolution between the image
and Hy.
You can think of Gx and Gy as sort of like vectors, so we can
calculate the magnitude and direction. So G = sqrt(Gx^2 + Gy^2).
We can see that after applying both operators what we get out is all
the edges detected.
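To see the Sobel operator respond to an edge without needing an image file, here is a sketch on a synthetic 8 x 8 image with a single vertical edge. The test image and the convolve2d_full helper are made up for this illustration; the book's own code uses scipy's convolve2d instead.

```python
import numpy as np

def convolve2d_full(x, w):
    # plain "full" 2-D convolution, written out for illustration
    n1, n2 = x.shape
    m1, m2 = w.shape
    y = np.zeros((n1 + m1 - 1, n2 + m2 - 1))
    for i in range(m1):
        for j in range(m2):
            y[i:i + n1, j:j + n2] += w[i, j] * x
    return y

Hx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
Hy = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float32)

# synthetic image: dark left half, bright right half (one vertical edge)
img = np.zeros((8, 8), dtype=np.float32)
img[:, 4:] = 1.0

Gx = convolve2d_full(img, Hx)
Gy = convolve2d_full(img, Hy)
G = np.sqrt(Gx * Gx + Gy * Gy)   # gradient magnitude
theta = np.arctan2(Gy, Gx)       # gradient direction
```

Away from the zero-padded border, Gy is zero everywhere (there is no horizontal edge) and G is nonzero only in the two columns next to the vertical edge.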

The full code

import numpy as np
from scipy.signal import convolve2d
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# load the famous Lena image
img = mpimg.imread('[Link]')

# make it B&W
bw = img.mean(axis=2)

# Sobel operator - approximate gradient in X dir
Hx = np.array([
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
], dtype=np.float32)

# Sobel operator - approximate gradient in Y dir
Hy = np.array([
    [-1, -2, -1],
    [0, 0, 0],
    [1, 2, 1],
], dtype=np.float32)

Gx = convolve2d(bw, Hx)
plt.imshow(Gx, cmap='gray')
plt.show()

Gy = convolve2d(bw, Hy)
plt.imshow(Gy, cmap='gray')
plt.show()

# Gradient magnitude
G = np.sqrt(Gx*Gx + Gy*Gy)
plt.imshow(G, cmap='gray')
plt.show()

The Takeaway
So what is the takeaway from all these examples of convolution?
Now you know that there are SOME filters that help us detect
features - so perhaps, it would be possible to just do a convolution in
the neural network and use gradient descent to find the best filter.
Chapter 3: The Convolutional Neural Network

All of the networks we’ve seen so far have one thing in common: all
the nodes in one layer are connected to all the nodes in the next
layer. This is the “standard” feedforward neural network. With
convolutional neural networks you will see how that changes.

Note that most of this material is inspired by LeCun, 1998
(Gradient-based learning applied to document recognition),
specifically the LeNet model.

Why do convolution?

Remember that you can think of convolution as a “sliding window” or
a “sliding filter”. So, if we are looking for a feature in an image, let’s
say for argument’s sake, a dog, then it doesn’t matter if the dog is in
the top right corner, or in the bottom left corner.

Our system should still be able to recognize that there is a dog in
there somewhere.

We call this “translational invariance”.

Question to think about: How can we ensure the neural network has
“rotational invariance?” What other kinds of invariances can you
think of?

Downsampling

Another important operation we’ll need before we build the
convolutional neural network is downsampling. So remember our
audio sample where we did an echo - that was a 16kHz sample.
Why 16kHz? Because this is adequate for representing voices.
The telephone has a sampling rate of 8kHz - that’s why voices sound
muffled over the phone.

For images, we just want to know if after we did the convolution, was
a feature present in a certain area of the image. We can do that by
downsampling the image, or in other words, changing its resolution.

So for example, we would downsample an image by converting it
from 32x32 to 16x16, and that would mean we downsampled by a
factor of 2 in both the horizontal and vertical direction.

There are a couple of ways of doing this: one is called maxpooling,
which means we take a 2x2 or 3x3 (or any other size) block and just
output the maximum value in that block.

Another way is average pooling - this means taking the average
value over the block. We will just use maxpooling in our code.

Theano has a function for this:

theano.tensor.signal.downsample.max_pool_2d
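To make the two pooling variants concrete, here is a numpy sketch for a single 2-D feature map and a fixed 2x2 block. The helper names are invented for this example, and it assumes the height and width are both even; it is not how Theano implements pooling.

```python
import numpy as np

def max_pool_2x2(x):
    # split the map into 2x2 blocks and keep the largest value in each
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

def avg_pool_2x2(x):
    # same blocks, but keep the average instead
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

x = np.arange(16, dtype=np.float32).reshape(4, 4)
pooled = max_pool_2x2(x)
averaged = avg_pool_2x2(x)
```

Either way a 4 x 4 map becomes 2 x 2, i.e. it is downsampled by a factor of 2 in each direction.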
The simplest CNN

The simplest convolutional net is just the kind I showed you in the
introduction to this book. It does not even need to incorporate
downsampling.

Just compute the hidden layer as follows:

Z = conv(X, W1)
Y = softmax(Z.dot(W2))

As stated previously, you could then train this simply by doing
gradient descent.

Exercise: Try this on MNIST. How well does it perform? Better or
worse than a fully-connected MLP?

The LeNet architecture

Now we are finally at the point where I can describe the layout of a
typical convolutional neural network, specifically the LeNet flavor.

You will see that it is just a matter of joining up the operations we
have already discussed.

So in the first layer, you take the image, and keep all the colors and
the original shape, meaning you don’t flatten it (i.e. it remains
3 x W x H).

Then you perform convolution on it.

Next you do maxpooling to reduce the size of the features.

Then you do another convolution and another maxpooling.

Finally, you flatten these features into a vector and you put it into a
regular, fully connected neural network like the ones we’ve been
talking about.

Schematically it would look like this:

The basic pattern is:

convolution / pool / convolution / pool / fully connected hidden layer /
logistic regression

Note that you can have arbitrarily many convolution + pool layers,
and more fully connected layers.

Some networks have only convolution. The design is up to you.

Technicalities

4-D tensor inputs: The input is a 4-D tensor, and it’s pretty easy to
see why. The image already takes up 3 dimensions, since we have
height, width, and color. The 4th dimension is just the number of
samples (i.e. for batch training).

4-D tensor filters / kernels: You might be surprised to learn that the
kernels are ALSO 4-D tensors. Now why is this? Well in the LeNet
model, you have multiple kernels per image and a different set of
kernels for each color channel. The output of the convolution is
called a feature map, and there are as many feature maps as there
are kernels. So basically you can think of this as, each kernel
will extract a different feature, and place it onto its own feature map.
Example:

Input image size: (3, 32, 32)

First kernel size: (3, M1, 5, 5)

Note that the order in which the dimensions appear is somewhat
arbitrary. For example, the data from the MATLAB files has N as the
last dimension, whereas Theano expects it to be in the first
dimension.

We’ll see that in TensorFlow the dimensions of the kernels are going
to be different from Theano.

Another thing to note is that the shapes of our filters are usually
MUCH smaller than the image itself. What this means is that the
same tiny filter gets applied across the entire image. This is the idea
of weight sharing.

By sharing this weight we’re introducing fewer parameters into the
model, and this is going to help us generalize better, since as you
know from my previous courses, when you have TOO many
parameters, you’ll end up overfitting.

You can think of this as a method of generalization.
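A quick back-of-the-envelope count shows how much weight sharing saves. This sketch assumes a layer of 20 filters of size 5 x 5 over 3 color channels on a 32 x 32 image, and compares it with a hypothetical fully connected layer that produces the same number of outputs; the fully connected layer is only there for comparison.

```python
# convolutional layer: 20 filters, 3 color channels, 5 x 5 each
num_maps, channels, k = 20, 3, 5
conv_params = num_maps * channels * k * k + num_maps  # weights + biases

# hypothetical fully connected layer producing the same number of outputs
in_size = 3 * 32 * 32                  # flattened input image
out_hw = 32 - k + 1                    # 28 after a convolution without padding
out_size = num_maps * out_hw * out_hw  # 20 x 28 x 28 output values
dense_params = in_size * out_size + out_size
```

The convolutional layer needs 1,520 parameters, while the fully connected equivalent would need about 48 million.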

In the schematic above, we assume a pooling size of (2, 2), which is
what we will also use in the code. This fits our data nicely because
both 28 (MNIST) and 32 (SVHN) can be divided by 2 twice evenly.

Training a CNN

Now this is the cool part.

It’s ridiculous how many people take my courses or read my books
and ask things like, “But, but, … what about X modern technique?”

Well, here’s how you train a CNN:

W ← W - learning_rate * dJ/dW

Look familiar?

That’s because it’s the same “backpropagation” (gradient descent)
equation from plain neural networks!

People think there is some kind of sorcery or well-kept secret behind
all of this that is going to take years and years of effort for them to
figure out.

People have been using convolution since the 1700s. LeCun himself
published his paper in 1998.

Researchers conjure up new ways to hack together neural networks
every day. The ones that become popular are the ones that perform
well.

You can imagine, however, with so many researchers researching
there is bound to be someone who does better than the others.

You too, can be a deep learning researcher. Just try different things.
Be creative. Use backprop. Easy, right?

Remember, in Theano, it’s just:

param = param - learning_rate * T.grad(cost, param)

Chapter 4: Sample Code in Theano

In this chapter we are going to look at the components of the Theano
convolutional neural network. This code can also be found at
[Link]
b/master/cnn_class/cnn_theano.py

So the first thing you might be wondering after learning about
convolution and downsampling is - does Theano have functions for
these? And of course the answer is yes.

In the LeNet we always do the convolution followed by pooling, so
we just call it convpool.

def convpool(X, W, b, poolsize=(2, 2)):
    conv_out = conv2d(input=X, filters=W)
    pooled_out = downsample.max_pool_2d(
        input=conv_out,
        ds=poolsize,
        ignore_border=True
    )
    return relu(pooled_out + b.dimshuffle('x', 0, 'x', 'x'))

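Notice that max pool requires some additional parameters: ds is the
pool size, and ignore_border=True drops any leftover rows or columns
that don’t fill a complete pooling block.

The last step, where we call dimshuffle() on the bias, is there for
broadcasting: b is a 1-D vector, but after the conv_pool operation
you get a 4-D tensor, so b has to be reshaped before the two can be
added. You’ll see that TensorFlow has a function that encapsulates
this for us.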
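If the dimshuffle call looks mysterious, this numpy sketch shows the same broadcasting idea. It is only an analogy in numpy, not Theano code, and the array sizes are made up.

```python
import numpy as np

# stand-in for the pooled output: (batch, feature maps, height, width)
pooled_out = np.zeros((2, 20, 14, 14), dtype=np.float32)
b = np.arange(20, dtype=np.float32)  # one bias per feature map

# numpy analogue of b.dimshuffle('x', 0, 'x', 'x'): shape (1, 20, 1, 1)
b4d = b[None, :, None, None]
out = pooled_out + b4d  # broadcast over batch, height and width
```

Every value in feature map number 5 gets bias 5 added, for every sample and every pixel.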

The next component is the rearranging of the input. Remember that
MATLAB does things a bit weirdly and puts the index to each sample
in the LAST dimension, but Theano expects it to be in the FIRST
dimension. It also happens to expect the color dimension to come
next. So that is what this code here is doing.

def rearrange(X):
    # input is (32, 32, 3, N)
    # output is (N, 3, 32, 32)
    N = X.shape[-1]
    out = np.zeros((N, 3, 32, 32), dtype=np.float32)
    for i in xrange(N):
        for j in xrange(3):
            out[i, j, :, :] = X[:, :, j, i]
    return out / 255

Also, as you know with neural networks we like our data to stay in a
small range, so we divide by the maximum value at the end which is
255.
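The same rearrangement can also be done in one step with numpy's transpose. This sketch compares a loop version against the one-liner on random stand-in data; rearrange_transpose is a name invented for this example and is not in the repo.

```python
import numpy as np

def rearrange_loop(X):
    # the explicit loop version, (32, 32, 3, N) -> (N, 3, 32, 32)
    N = X.shape[-1]
    out = np.zeros((N, 3, 32, 32), dtype=np.float32)
    for i in range(N):
        for j in range(3):
            out[i, j, :, :] = X[:, :, j, i]
    return out / 255

def rearrange_transpose(X):
    # move the axes (H, W, C, N) -> (N, C, H, W) in one step
    return X.transpose(3, 2, 0, 1).astype(np.float32) / 255

rng = np.random.RandomState(0)
X = rng.randint(0, 256, size=(32, 32, 3, 5)).astype(np.uint8)  # fake data
a = rearrange_loop(X)
b = rearrange_transpose(X)
```

Both versions give the same array; the transpose just avoids the Python loop over every sample.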

It’s also good to keep track of the size of each matrix as each
operation is done. You’ll see that, by default, Theano and TensorFlow
treat the edges of the result of the convolution a little differently,
and the order of the dimensions is also different.

So in Theano, our first filter has the dimensions
“num_feature_maps”, which you can think of as the number of
kernels or filters we are going to create, then it has
“num_color_channels”, which is 3 for a color image, and then the
filter width and height. I’ve chosen to use 5 since that’s what I usually
see in existing code, but of course this is a hyperparameter that you
can optimize.

# (num_feature_maps, num_color_channels, filter_width, filter_height)
W1_shape = (20, 3, 5, 5)
W1_init = np.random.randn(*W1_shape)
b1_init = np.zeros(W1_shape[0])

# (num_feature_maps, old_num_feature_maps, filter_width, filter_height)
W2_shape = (50, 20, 5, 5)
W2_init = np.random.randn(*W2_shape)
b2_init = np.zeros(W2_shape[0])

W3_init = np.random.randn(W2_shape[0]*5*5, M)
b3_init = np.zeros(M)

W4_init = np.random.randn(M, K)
b4_init = np.zeros(K)

Note that the bias is the same size as the number of feature maps.

Also note that this filter is a 4-D tensor, which is different from the
filters we were working with previously, which were 1-D and 2-D
filters.

So the OUTPUT of that first conv_pool operation will also be a 4-D
tensor. The first dimension of course will be the batch size. The
second is now no longer color, but the number of feature maps,
which after the first stage would be 20. The next 2 are the
dimensions of the new image after conv_pooling, which is 32 - 5 + 1,
which is 28, and then divided by 2 which is 14.

In the next stage, we’ll use a filter of size 50 x 20 x 5 x 5. This means
that we now have 50 feature maps. So the output of this will have the
first 2 dimensions as batch_size and 50. And then the next 2
dimensions will be the new image after conv_pooling, which will be
14 - 5 + 1, which is 10, and then divided by 2 which is 5.

In the next stage we pass everything into a vanilla, fully-connected
ANN, which we’ve used before. Of course this means we have to
flatten our output from the previous layer from 4 dimensions to
2 dimensions.

Since that image was 5x5 and had 50 feature maps, the new
flattened dimension will be 50x5x5 = 1250.
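The shape bookkeeping above is easy to automate. Here is a small sketch that tracks the image size through both conv + pool stages; the helper names conv_valid and pool are made up for this illustration.

```python
def conv_valid(n, k):
    # output length of a convolution with no padding and stride 1
    return n - k + 1

def pool(n, p):
    # pooling that drops any leftover rows/columns (ignore_border=True)
    return n // p

size = 32
shapes = []
for num_maps in (20, 50):
    # each stage: 5 x 5 convolution followed by 2 x 2 pooling
    size = pool(conv_valid(size, 5), 2)
    shapes.append((num_maps, size, size))

# number of inputs to the first fully connected layer
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
```

This gives (20, 14, 14) after the first stage, (50, 5, 5) after the second, and 1250 inputs to the fully connected layer, matching the numbers in the text.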

Now that we have all the initial weights and operations we need, we
can compute the output of the neural network. So we do the
convpool twice, and then notice this flatten() operation before I do
the dot product. That’s because Z2, after convpooling, will still be an
image.

# forward pass
Z1 = convpool(X, W1, b1)
Z2 = convpool(Z1, W2, b2)
Z3 = relu(Z2.flatten(ndim=2).dot(W3) + b3)
pY = T.nnet.softmax(Z3.dot(W4) + b4)

But if you call flatten() by itself it’ll turn into a 1-D array, which we
don’t want, and luckily Theano provides us with a parameter that
allows us to control how much to flatten the array. ndim=2 means the
result has 2 dimensions: the first dimension (the batch) is kept, and
all the remaining dimensions are flattened into the second.
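For comparison, here is what the two kinds of flattening look like in plain numpy. This is only an analogy to Theano's flatten(ndim=2); the batch size of 7 is arbitrary.

```python
import numpy as np

# stand-in for Z2: (batch, feature maps, height, width)
Z2 = np.zeros((7, 50, 5, 5), dtype=np.float32)

flat_all = Z2.flatten()                # 1-D: everything collapsed together
flat_2d = Z2.reshape(Z2.shape[0], -1)  # 2-D: one row of features per sample
```

Only the 2-D version can be fed to dot(W3), since W3 expects one row of 50*5*5 features per sample.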

The full code is as follows:

import numpy as np
import theano
import theano.tensor as T
import matplotlib.pyplot as plt

from theano.tensor.nnet import conv2d
from theano.tensor.signal import downsample
from scipy.io import loadmat
from sklearn.utils import shuffle
from datetime import datetime


def error_rate(p, t):
    return np.mean(p != t)


def relu(a):
    return a * (a > 0)


def y2indicator(y):
    N = len(y)
    ind = np.zeros((N, 10))
    for i in xrange(N):
        ind[i, y[i]] = 1
    return ind


def convpool(X, W, b, poolsize=(2, 2)):
    conv_out = conv2d(input=X, filters=W)

    # downsample each feature map individually, using maxpooling
    pooled_out = downsample.max_pool_2d(
        input=conv_out,
        ds=poolsize,
        ignore_border=True
    )
    return relu(pooled_out + b.dimshuffle('x', 0, 'x', 'x'))


def init_filter(shape, poolsz):
    w = np.random.randn(*shape) / np.sqrt(np.prod(shape[1:]) + shape[0]*np.prod(shape[2:] / np.prod(poolsz)))
    return w.astype(np.float32)


def rearrange(X):
    # input is (32, 32, 3, N)
    # output is (N, 3, 32, 32)
    N = X.shape[-1]
    out = np.zeros((N, 3, 32, 32), dtype=np.float32)
    for i in xrange(N):
        for j in xrange(3):
            out[i, j, :, :] = X[:, :, j, i]
    return out / 255


def main():
    # step 1: load the data, transform as needed
    train = loadmat('../large_files/train_32x32.mat')
    test = loadmat('../large_files/test_32x32.mat')

    # Need to scale! don't leave as 0..255
    # Y is a N x 1 matrix with values 1..10 (MATLAB indexes by 1)
    # So flatten it and make it 0..9
    # Also need indicator matrix for cost calculation
    Xtrain = rearrange(train['X'])
    Ytrain = train['y'].flatten() - 1
    del train
    Xtrain, Ytrain = shuffle(Xtrain, Ytrain)
    Ytrain_ind = y2indicator(Ytrain)

    Xtest = rearrange(test['X'])
    Ytest = test['y'].flatten() - 1
    del test
    Ytest_ind = y2indicator(Ytest)

    max_iter = 8
    print_period = 10

    lr = np.float32(0.00001)
    reg = np.float32(0.01)
    mu = np.float32(0.99)

    N = Xtrain.shape[0]
    batch_sz = 500
    n_batches = N / batch_sz

    M = 500
    K = 10
    poolsz = (2, 2)

    # after conv will be of dimension 32 - 5 + 1 = 28
    # after downsample 28 / 2 = 14
    W1_shape = (20, 3, 5, 5) # (num_feature_maps, num_color_channels, filter_width, filter_height)
    W1_init = init_filter(W1_shape, poolsz)
    b1_init = np.zeros(W1_shape[0], dtype=np.float32) # one bias per output feature map

    # after conv will be of dimension 14 - 5 + 1 = 10
    # after downsample 10 / 2 = 5
    W2_shape = (50, 20, 5, 5) # (num_feature_maps, old_num_feature_maps, filter_width, filter_height)
    W2_init = init_filter(W2_shape, poolsz)
    b2_init = np.zeros(W2_shape[0], dtype=np.float32)

    # vanilla ANN weights
    W3_init = np.random.randn(W2_shape[0]*5*5, M) / np.sqrt(W2_shape[0]*5*5 + M)
    b3_init = np.zeros(M, dtype=np.float32)
    W4_init = np.random.randn(M, K) / np.sqrt(M + K)
    b4_init = np.zeros(K, dtype=np.float32)

    # step 2: define theano variables and expressions
    X = T.tensor4('X', dtype='float32')
    Y = T.matrix('T')
    W1 = theano.shared(W1_init, 'W1')
    b1 = theano.shared(b1_init, 'b1')
    W2 = theano.shared(W2_init, 'W2')
    b2 = theano.shared(b2_init, 'b2')
    W3 = theano.shared(W3_init.astype(np.float32), 'W3')
    b3 = theano.shared(b3_init, 'b3')
    W4 = theano.shared(W4_init.astype(np.float32), 'W4')
    b4 = theano.shared(b4_init, 'b4')

    # momentum changes
    dW1 = theano.shared(np.zeros(W1_init.shape, dtype=np.float32), 'dW1')
    db1 = theano.shared(np.zeros(b1_init.shape, dtype=np.float32), 'db1')
    dW2 = theano.shared(np.zeros(W2_init.shape, dtype=np.float32), 'dW2')
    db2 = theano.shared(np.zeros(b2_init.shape, dtype=np.float32), 'db2')
    dW3 = theano.shared(np.zeros(W3_init.shape, dtype=np.float32), 'dW3')
    db3 = theano.shared(np.zeros(b3_init.shape, dtype=np.float32), 'db3')
    dW4 = theano.shared(np.zeros(W4_init.shape, dtype=np.float32), 'dW4')
    db4 = theano.shared(np.zeros(b4_init.shape, dtype=np.float32), 'db4')

    # forward pass
    Z1 = convpool(X, W1, b1)
    Z2 = convpool(Z1, W2, b2)
    Z3 = relu(Z2.flatten(ndim=2).dot(W3) + b3)
    pY = T.nnet.softmax(Z3.dot(W4) + b4)

    # define the cost function and prediction
    params = (W1, b1, W2, b2, W3, b3, W4, b4)
    reg_cost = reg*np.sum((param*param).sum() for param in params)
    cost = -(Y * T.log(pY)).sum() + reg_cost
    prediction = T.argmax(pY, axis=1)

    # step 3: training expressions and functions
    # you could of course store these in a list =)
    update_W1 = W1 + mu*dW1 - lr*T.grad(cost, W1)
    update_b1 = b1 + mu*db1 - lr*T.grad(cost, b1)
    update_W2 = W2 + mu*dW2 - lr*T.grad(cost, W2)
    update_b2 = b2 + mu*db2 - lr*T.grad(cost, b2)
    update_W3 = W3 + mu*dW3 - lr*T.grad(cost, W3)
    update_b3 = b3 + mu*db3 - lr*T.grad(cost, b3)
    update_W4 = W4 + mu*dW4 - lr*T.grad(cost, W4)
    update_b4 = b4 + mu*db4 - lr*T.grad(cost, b4)

    # update weight changes
    update_dW1 = mu*dW1 - lr*T.grad(cost, W1)
    update_db1 = mu*db1 - lr*T.grad(cost, b1)
    update_dW2 = mu*dW2 - lr*T.grad(cost, W2)
    update_db2 = mu*db2 - lr*T.grad(cost, b2)
    update_dW3 = mu*dW3 - lr*T.grad(cost, W3)
    update_db3 = mu*db3 - lr*T.grad(cost, b3)
    update_dW4 = mu*dW4 - lr*T.grad(cost, W4)
    update_db4 = mu*db4 - lr*T.grad(cost, b4)

train = [Link](
    inputs=[X, Y],
    updates=[
        (W1, update_W1),
        (b1, update_b1),
        (W2, update_W2),
        (b2, update_b2),
        (W3, update_W3),
        (b3, update_b3),
        (W4, update_W4),
        (b4, update_b4),
        (dW1, update_dW1),
        (db1, update_db1),
        (dW2, update_dW2),
        (db2, update_db2),
        (dW3, update_dW3),
        (db3, update_db3),
        (dW4, update_dW4),
        (db4, update_db4),
    ],
)

# create another function for this because we want it over the whole dataset
get_prediction = [Link](
    inputs=[X, Y],
    outputs=[cost, prediction],
)

t0 = [Link]()
LL = []
for i in xrange(max_iter):
    for j in xrange(n_batches):
        Xbatch = Xtrain[j*batch_sz:(j*batch_sz + batch_sz),]
        Ybatch = Ytrain_ind[j*batch_sz:(j*batch_sz + batch_sz),]

        train(Xbatch, Ybatch)
        if j % print_period == 0:
            cost_val, prediction_val = get_prediction(Xtest, Ytest_ind)
            err = error_rate(prediction_val, Ytest)
            print "Cost / err at iteration i=%d, j=%d: %.3f / %.3f" % (i, j, cost_val, err)
            [Link](cost_val)
print "Elapsed time:", ([Link]() - t0)
[Link](LL)
[Link]()

if __name__ == '__main__':
    main()
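
As the comment in the listing hints, the eight parameter updates and the eight velocity updates all follow one pattern, so they can be generated in a loop. Here is a minimal sketch of that pattern in plain Python 3 rather than Theano. The function name momentum_step and the toy quadratic cost are assumptions made up for illustration; only the rule itself (new velocity = mu*dW - lr*gradient, new weight = W + new velocity) comes from the listing.

```python
def momentum_step(params, velocities, grads, lr, mu):
    """Apply one momentum update to every parameter.

    Mirrors the pattern in the listing: the new velocity is
    mu*dW - lr*grad, and the new parameter is W plus that velocity.
    """
    new_params, new_velocities = [], []
    for w, dw, g in zip(params, velocities, grads):
        dw_new = mu * dw - lr * g
        new_velocities.append(dw_new)
        new_params.append(w + dw_new)
    return new_params, new_velocities


# Toy problem (an assumption for illustration): minimise sum(w**2),
# whose gradient with respect to each w is simply 2*w.
params = [3.0, -2.0]
velocities = [0.0, 0.0]
for _ in range(200):
    grads = [2.0 * w for w in params]
    params, velocities = momentum_step(params, velocities, grads, lr=0.05, mu=0.9)

final_cost = sum(w * w for w in params)
```

In Theano the same idea would mean building the list of (variable, update expression) pairs in a loop instead of typing each one out.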
Chapter 5: Sample Code in TensorFlow

In this chapter we are going to examine the code at:

[Link]
b/master/cnn_class/cnn_tf.py

As we did with Theano, we are going to examine each part of the code
in more depth before putting it all together. Hopefully this helps
you isolate each of the parts and understand how they work.

This is the ConvPool in TensorFlow. It’s almost the same as what we

did with Theano except that the conv2d() function takes in a new
parameter called strides.

def convpool(X, W, b):
    # just assume pool size is (2,2) because we need to augment it with 1s
    conv_out = [Link].conv2d(X, W, strides=[1, 1, 1, 1], padding='SAME')
    conv_out = [Link].bias_add(conv_out, b)
    pool_out = [Link].max_pool(conv_out, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    return pool_out

In the past we just assumed that we had to drag the filter along every
point of the signal, but in fact we can move with any size step we
want, and that’s what stride is. We’re also going to use the padding
parameter to control the size of the output.
Remember that the bias is a 1-D vector, and we used the dimshuffle
function in Theano to add it to the convolution output. Here we can
just use a function that TensorFlow built called bias_add().

Next we call the max_pool() function. Notice that the ksize parameter
is kind of like the poolsize parameter we had with Theano, but it’s
now 4-D instead of 2-D. We just add ones at the ends. Notice that
this function ALSO takes in a strides parameter, meaning we can
max_pool at EVERY step, but we’ll just use non-overlapping sub-
images like we did previously.
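
To make the idea of stride concrete, here is a small pure-Python 3 sketch. It is not TensorFlow code, and the helper name window_starts is hypothetical. It simply lists where a window of width k begins as it slides along a signal of length n with no padding.

```python
def window_starts(n, k, stride):
    """Start indices of every width-k window that fits in a length-n signal."""
    return list(range(0, n - k + 1, stride))


# Stride 1 visits every position, as in the earlier chapters.
every_point = window_starts(n=8, k=2, stride=1)

# Stride 2 with a width-2 window gives non-overlapping blocks,
# which is the setting used for max pooling here.
non_overlapping = window_starts(n=8, k=2, stride=2)
```

With stride 1 the windows overlap and there are 7 of them; with stride 2 there are only 4, which is why pooling with ksize 2 and stride 2 halves each spatial dimension.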

The next step is to rearrange the inputs. Remember that convolution

in Theano is not the same as convolution in TensorFlow. That means
we have to adjust not only the input dimensions but the filter
dimensions as well. The only change with the inputs is that the color
now comes last.

def rearrange(X):
    # input is (32, 32, 3, N)
    # output is (N, 32, 32, 3)
    N = [Link][-1]
    out = [Link]((N, 32, 32, 3), dtype=np.float32)
    for i in xrange(N):
        for j in xrange(3):
            out[i, :, :, j] = X[:, :, j, i]
    return out / 255
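
The double loop above can also be written as a single axis permutation. The following Python 3 / NumPy sketch compares the two on random stand-in data. The names rearrange_loop and rearrange_transpose and the fake array are assumptions for illustration, not part of the book's code.

```python
import numpy as np


def rearrange_loop(X):
    """Loop version from the text: (32, 32, 3, N) -> (N, 32, 32, 3), scaled to 0..1."""
    N = X.shape[-1]
    out = np.zeros((N, 32, 32, 3), dtype=np.float32)
    for i in range(N):
        for j in range(3):
            out[i, :, :, j] = X[:, :, j, i]
    return out / 255


def rearrange_transpose(X):
    """Same result using one axis permutation instead of loops."""
    return X.transpose(3, 0, 1, 2).astype(np.float32) / 255


# Fake data standing in for the SVHN array (an assumption for illustration).
X_fake = np.random.randint(0, 256, size=(32, 32, 3, 4)).astype(np.uint8)
same = np.allclose(rearrange_loop(X_fake), rearrange_transpose(X_fake))
```

The transpose version avoids the Python-level loops over all N samples, so it should be noticeably faster on the full dataset.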

The next step is unique to the TensorFlow implementation. If you

recall, TensorFlow allows us to not have to specify the size of each
dimension in its input.

This is great and allows for a lot of flexibility, but I hit a snag
during development: my RAM started swapping when I did this. In case
you haven’t noticed yet, the SVHN data is pretty big, about 73k
training samples.

So one way around this is to make the shapes constant, which you’ll
see later. That means we’ll always have to pass in batch_sz number
of samples each time, which means the total number of samples we
use has to be a multiple of it. In the code I used exact numbers but
you can also calculate it using the data.
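
As a small sketch of "calculating it using the data", integer division gives the largest sample count that splits evenly into batches. This is Python 3, and the helper name largest_multiple is hypothetical. The sample counts 73257 and 26032 are the ones noted in the full listing.

```python
def largest_multiple(n_samples, batch_sz):
    """Largest number of samples <= n_samples that divides evenly into batches."""
    return n_samples // batch_sz * batch_sz


batch_sz = 500
n_train = largest_multiple(73257, batch_sz)
n_test = largest_multiple(26032, batch_sz)
```

These come out to the same numbers that are hard-coded in the listing, 73000 and 26000.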

X = [Link](tf.float32, shape=(batch_sz, 32, 32, 3), name='X')

T = [Link](tf.float32, shape=(batch_sz, K), name='T')

Just to reinforce this idea, the filter is going to be in a different order
than before. So now the dimensions of the image filter come first,
then the number of color channels, then the number of feature maps.
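
To see the reordering, here is a Python 3 / NumPy sketch that permutes a filter array from the Theano-style layout (num_feature_maps, num_color_channels, height, width) into the layout described above. It only shows the axis order; it does not claim the two libraries would then produce identical outputs, since that depends on other details not covered here. The array contents are dummy values.

```python
import numpy as np

# Theano-style filter layout: (num_feature_maps, num_color_channels, height, width)
W_theano = np.arange(20 * 3 * 5 * 5, dtype=np.float32).reshape(20, 3, 5, 5)

# Move height and width to the front, then input channels, then feature maps.
W_tf = W_theano.transpose(2, 3, 1, 0)

tf_shape = W_tf.shape
```

After the transpose, the element at W_tf[h, w, c, m] is the same number as W_theano[m, c, h, w].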

# (filter_width, filter_height, num_color_channels, num_feature_maps)
W1_shape = (5, 5, 3, 20)
W1_init = init_filter(W1_shape, poolsz)
b1_init = [Link](W1_shape[-1], dtype=np.float32) # one bias per output feature map

# (filter_width, filter_height, old_num_feature_maps, num_feature_maps)
W2_shape = (5, 5, 20, 50)
W2_init = init_filter(W2_shape, poolsz)
b2_init = [Link](W2_shape[-1], dtype=np.float32)

# vanilla ANN weights
W3_init = [Link](W2_shape[-1]*8*8, M) / [Link](W2_shape[-1]*8*8 + M)
b3_init = [Link](M, dtype=np.float32)
W4_init = [Link](M, K) / [Link](M + K)
b4_init = [Link](K, dtype=np.float32)

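For the vanilla ANN portion, also notice that the outputs of the
convolution are now a different size. Because we use 'SAME' padding,
the convolutions no longer shrink the image and only the two 2x2
pools do, so the 32x32 input ends up 8x8 instead of the 5x5 we had
with Theano.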
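
The arithmetic behind the 8 and the 5 can be traced with a few lines of Python 3. This sketch uses the output-size rules TensorFlow documents for 'SAME' and 'VALID' padding. It assumes the Theano version corresponds to un-padded ('VALID'-style) convolution, and the function names conv_out_size and trace are made up for illustration.

```python
import math


def conv_out_size(n, k, padding, stride=1):
    """Output length along one axis for 'SAME' or 'VALID' padding."""
    if padding == 'SAME':
        return math.ceil(n / stride)
    if padding == 'VALID':
        return math.ceil((n - k + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def trace(n, padding):
    """Two rounds of 5x5 convolution (stride 1) then 2x2 pooling (stride 2)."""
    for _ in range(2):
        n = conv_out_size(n, k=5, padding=padding)            # convolution
        n = conv_out_size(n, k=2, padding=padding, stride=2)  # max pool
    return n


size_same = trace(32, 'SAME')    # 32 -> 32 -> 16 -> 16 -> 8
size_valid = trace(32, 'VALID')  # 32 -> 28 -> 14 -> 10 -> 5
```

This is why the first fully connected weight matrix has W2_shape[-1]*8*8 rows here.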

For the forward pass, the first 2 parts are the same as Theano.
One thing that’s different is TensorFlow objects don’t have a flatten
method, so we have to use reshape.

Z1 = convpool(X, W1, b1)

Z2 = convpool(Z1, W2, b2)

Z2_shape = Z2.get_shape().as_list()

Z2r = [Link](Z2, [Z2_shape[0], [Link](Z2_shape[1:])])

Z3 = [Link]( [Link](Z2r, W3) + b3 )

Yish = [Link](Z3, W4) + b4

Luckily this is pretty straightforward even when you pass in None
for the input shape parameter. You can just pass in -1 in reshape and
that dimension will be calculated automatically. But as you can
imagine, this will make your computation take longer.
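
NumPy's reshape treats -1 the same way, so the idea can be tried without TensorFlow. In this Python 3 sketch the array Z2 is a dummy stand-in with an arbitrary batch size of 4; it is not the real network output.

```python
import numpy as np

# Stand-in for the second convpool output: (batch, height, width, feature maps).
Z2 = np.zeros((4, 8, 8, 50), dtype=np.float32)

# Explicit version, as in the text: multiply out the trailing dimensions.
flat_explicit = Z2.reshape(Z2.shape[0], int(np.prod(Z2.shape[1:])))

# Letting reshape infer the second dimension with -1.
flat_inferred = Z2.reshape(Z2.shape[0], -1)
```

Both give a 2-D array with 8*8*50 = 3200 columns per sample.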

The last step is to calculate the output just before the softmax.
Remember that with TensorFlow the cost function requires the logits
without softmaxing, so we won’t do the softmax at this point.
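
One reason a cost function may want the raw logits is numerical stability: it can fold the softmax and the log together using the log-sum-exp trick. The pure-Python 3 sketch below shows the idea for a single sample. It is an illustration under that assumption, not TensorFlow's actual implementation, and both function names are hypothetical.

```python
import math


def cross_entropy_from_logits(logits, target):
    """Cross-entropy for one sample, computed directly from logits.

    Uses log-sum-exp with the max subtracted so large logits don't overflow.
    `target` is an indicator (one-hot) list.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_sum_exp) for z, t in zip(logits, target))


def cross_entropy_naive(logits, target):
    """Softmax first, then log; fine for small logits, overflows for large ones."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return -sum(t * math.log(e / total) for e, t in zip(exps, target))


small = [2.0, 1.0, 0.1]
target = [1, 0, 0]
stable_val = cross_entropy_from_logits(small, target)
naive_val = cross_entropy_naive(small, target)

# With very large logits the stable version still returns a finite number.
large_val = cross_entropy_from_logits([1000.0, 0.0, -1000.0], target)
```

For ordinary logits the two functions agree; the difference only shows up when the logits get large enough for exp() to overflow.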
The full code is as follows:

import numpy as np

import tensorflow as tf

import [Link] as plt

from datetime import datetime

from [Link] import convolve2d

from [Link] import loadmat

from [Link] import shuffle

def y2indicator(y):
    N = len(y)
    ind = [Link]((N, 10))
    for i in xrange(N):
        ind[i, y[i]] = 1
    return ind


def error_rate(p, t):
    return [Link](p != t)


def convpool(X, W, b):
    # just assume pool size is (2,2) because we need to augment it with 1s
    conv_out = [Link].conv2d(X, W, strides=[1, 1, 1, 1], padding='SAME')
    conv_out = [Link].bias_add(conv_out, b)
    pool_out = [Link].max_pool(conv_out, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    return pool_out


def init_filter(shape, poolsz):
    w = [Link](*shape) / [Link]([Link](shape[:-1]) + shape[-1]*[Link](shape[:-2] / [Link](poolsz)))
    return [Link](np.float32)


def rearrange(X):
    # input is (32, 32, 3, N)
    # output is (N, 32, 32, 3)
    N = [Link][-1]
    out = [Link]((N, 32, 32, 3), dtype=np.float32)
    for i in xrange(N):
        for j in xrange(3):
            out[i, :, :, j] = X[:, :, j, i]
    return out / 255


def main():
    train = loadmat('../large_files/train_32x32.mat') # N = 73257
    test = loadmat('../large_files/test_32x32.mat') # N = 26032

    # Need to scale! don't leave as 0..255
    # Y is a N x 1 matrix with values 1..10 (MATLAB indexes by 1)
    # So flatten it and make it 0..9
    # Also need indicator matrix for cost calculation
    Xtrain = rearrange(train['X'])
    Ytrain = train['y'].flatten() - 1
    print len(Ytrain)
    del train
    Xtrain, Ytrain = shuffle(Xtrain, Ytrain)
    Ytrain_ind = y2indicator(Ytrain)

    Xtest = rearrange(test['X'])
    Ytest = test['y'].flatten() - 1
    del test
    Ytest_ind = y2indicator(Ytest)

    # gradient descent params
    max_iter = 20
    print_period = 10
    N = [Link][0]
    batch_sz = 500
    n_batches = N / batch_sz

    # limit samples since input will always have to be same size
    # you could also just do N = N / batch_sz * batch_sz
    Xtrain = Xtrain[:73000,]
    Ytrain = Ytrain[:73000]
    Xtest = Xtest[:26000,]
    Ytest = Ytest[:26000]
    Ytest_ind = Ytest_ind[:26000,]

    # initialize weights
    M = 500
    K = 10
    poolsz = (2, 2)

    W1_shape = (5, 5, 3, 20) # (filter_width, filter_height, num_color_channels, num_feature_maps)
    W1_init = init_filter(W1_shape, poolsz)
    b1_init = [Link](W1_shape[-1], dtype=np.float32) # one bias per output feature map

    W2_shape = (5, 5, 20, 50) # (filter_width, filter_height, old_num_feature_maps, num_feature_maps)
    W2_init = init_filter(W2_shape, poolsz)
    b2_init = [Link](W2_shape[-1], dtype=np.float32)

    # vanilla ANN weights
    W3_init = [Link](W2_shape[-1]*8*8, M) / [Link](W2_shape[-1]*8*8 + M)
    b3_init = [Link](M, dtype=np.float32)
    W4_init = [Link](M, K) / [Link](M + K)
    b4_init = [Link](K, dtype=np.float32)

    # define variables and expressions
    # using None as the first shape element takes up too much RAM unfortunately
    X = [Link](tf.float32, shape=(batch_sz, 32, 32, 3), name='X')
    T = [Link](tf.float32, shape=(batch_sz, K), name='T')

    W1 = [Link](W1_init.astype(np.float32))
    b1 = [Link](b1_init.astype(np.float32))
    W2 = [Link](W2_init.astype(np.float32))
    b2 = [Link](b2_init.astype(np.float32))
    W3 = [Link](W3_init.astype(np.float32))
    b3 = [Link](b3_init.astype(np.float32))
    W4 = [Link](W4_init.astype(np.float32))
    b4 = [Link](b4_init.astype(np.float32))

    Z1 = convpool(X, W1, b1)
    Z2 = convpool(Z1, W2, b2)
    Z2_shape = Z2.get_shape().as_list()
    Z2r = [Link](Z2, [Z2_shape[0], [Link](Z2_shape[1:])])
    Z3 = [Link]( [Link](Z2r, W3) + b3 )
    Yish = [Link](Z3, W4) + b4

    cost = tf.reduce_sum([Link].softmax_cross_entropy_with_logits(Yish, T))

    train_op = [Link](0.0001, decay=0.99, momentum=0.9).minimize(cost)

    # we'll use this to calculate the error rate
    predict_op = [Link](Yish, 1)

    t0 = [Link]()
    LL = []
    init = tf.initialize_all_variables()
    with [Link]() as session:
        [Link](init)

        for i in xrange(max_iter):
            for j in xrange(n_batches):
                Xbatch = Xtrain[j*batch_sz:(j*batch_sz + batch_sz),]
                Ybatch = Ytrain_ind[j*batch_sz:(j*batch_sz + batch_sz),]

                if len(Xbatch) == batch_sz:
                    [Link](train_op, feed_dict={X: Xbatch, T: Ybatch})
                    if j % print_period == 0:
                        # due to RAM limitations we need to have a fixed size input
                        # so as a result, we have this ugly total cost and prediction computation
                        test_cost = 0
                        prediction = [Link](len(Xtest))
                        for k in xrange(len(Xtest) / batch_sz):
                            Xtestbatch = Xtest[k*batch_sz:(k*batch_sz + batch_sz),]
                            Ytestbatch = Ytest_ind[k*batch_sz:(k*batch_sz + batch_sz),]
                            test_cost += [Link](cost, feed_dict={X: Xtestbatch, T: Ytestbatch})
                            prediction[k*batch_sz:(k*batch_sz + batch_sz)] = [Link](
                                predict_op, feed_dict={X: Xtestbatch})
                        err = error_rate(prediction, Ytest)
                        print "Cost / err at iteration i=%d, j=%d: %.3f / %.3f" % (i, j, test_cost, err)
                        [Link](test_cost)
    print "Elapsed time:", ([Link]() - t0)
    [Link](LL)
    [Link]()


if __name__ == '__main__':
    main()
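
As a side note on the y2indicator helper in the listing, the per-sample loop can be replaced by NumPy's integer-array indexing. This is a Python 3 sketch; the function name y2indicator_vectorized and the default K=10 are assumptions for illustration.

```python
import numpy as np


def y2indicator_vectorized(y, K=10):
    """Build an N x K indicator (one-hot) matrix without a Python loop."""
    y = np.asarray(y)
    ind = np.zeros((len(y), K), dtype=np.float32)
    # Row i gets a 1 in column y[i].
    ind[np.arange(len(y)), y] = 1
    return ind


labels = [0, 3, 9]
indicator = y2indicator_vectorized(labels)
```

Each row of the result contains a single 1, in the column given by the label.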
Conclusion

I really hope you had as much fun reading this book as I did making
it.

Did you find anything confusing? Do you have any questions?

I am always available to help. Just email me at:
info@[Link]

Do you want to learn more about deep learning? Perhaps online

courses are more your style. I happen to have a few of them on
Udemy.

A lot of the material in this book is covered in this course, but you get
to see me derive the formulas and write the code live:

Deep Learning: Convolutional Neural Networks in Python

[Link]
networks-theano-tensorflow

The background and prerequisite knowledge for deep learning and

neural networks can be found in my class “Data Science: Deep
Learning in Python” (officially known as “part 1” of the series). In this
course I teach you the feedforward mechanism of a neural network
(which I assumed you already knew for this book), and how to derive
the training algorithm called backpropagation (which I also assumed
you knew for this book):

Data Science: Deep Learning in Python

[Link]

The corresponding book on Kindle is:

[Link]
action/us/[Link]/B01CVJ19E8
Are you comfortable with this material, and you want to take your
deep learning skillset to the next level? Then my follow-up Udemy
course on deep learning is for you. Similar to the previous book, I take
you through the basics of Theano and TensorFlow - creating
functions, variables, and expressions, and build up neural networks
from scratch. I teach you about ways to accelerate the learning
process, including batch gradient descent, momentum, and adaptive
learning rates. I also show you live how to create a GPU instance on
Amazon AWS EC2, and prove to you that training a neural network
with GPU optimization can be orders of magnitude faster than on
your CPU.

Data Science: Practical Deep Learning in Theano and TensorFlow

[Link]
tensorflow
In part 4 of my deep learning series, I take you through unsupervised
deep learning methods. We study principal components analysis
(PCA), t-SNE (jointly developed by the godfather of deep learning,
Geoffrey Hinton), deep autoencoders, and restricted Boltzmann
machines (RBMs). I demonstrate how unsupervised pretraining on a
deep network with autoencoders and RBMs can improve supervised
learning performance.

Unsupervised Deep Learning in Python

[Link]
Would you like an introduction to the basic building block of neural
networks - logistic regression? In this course I teach the theory of
logistic regression (our computational model of the neuron), and give
you an in-depth look at binary classification, manually creating
features, and gradient descent. You might want to check this course
out if you found the material in this book too challenging.

Data Science: Logistic Regression in Python

[Link]
The corresponding book for Deep Learning Prerequisites is:

[Link]
action/us/[Link]/B01D7GDRQ2

To get an even simpler picture of machine learning in general, where

we don’t even need gradient descent and can just solve for the
optimal model parameters directly in “closed-form”, you’ll want to
check out my first Udemy course on the classical statistical method -
linear regression:

Data Science: Linear Regression in Python

[Link]

If you are interested in learning about how machine learning can be

applied to language, text, and speech, you’ll want to check out my
course on Natural Language Processing, or NLP:

Data Science: Natural Language Processing in Python

[Link]
in-python

Are you interested in learning SQL (structured query language), a
language that can be applied to databases as small as the ones
sitting on your iPhone and as large as the ones that span multiple
continents? Do you want to not only learn the mechanics of the
language but also know how to apply it to real-world data analytics
and marketing problems? Check out my course here:

SQL for Marketers: Dominate data analytics, data science, and big
data
[Link]
science-big-data

Finally, I am always giving out coupons and letting you know when
you can get my stuff for free. But you can only do this if you are a
current student of mine! Here are some ways I notify my students
about coupons and free giveaways:

My newsletter, which you can sign up for at

[Link] (it comes with a free 6-week intro to
machine learning course)
My Twitter, [Link]

My Facebook page, [Link] (don’t

forget to hit “like”!)
