
1.4 Advanced Model Architectures


Monday, 22 April 2024, 13:52

1-Introduction to Deep Learning with Keras


1.4. Advanced Model Architectures
1.4.1. Tensors, layers, and autoencoders
Now that you know how to tune your models, it's time to better understand how they
work internally and to explore newer network architectures.

Accessing Keras layers


Model layers are easily accessible: we just call layers on a built model and index the layer we want. From a chosen layer we can print its inputs, outputs, and weights. Inputs and outputs are TensorFlow tensor objects of a given shape, while weights are TensorFlow variable objects, which are simply tensors whose values change as the neural network learns the best weights.
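A minimal sketch of this (using a small hypothetical Sequential model rather than the course's dataset):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Small illustrative model
model = Sequential()
model.add(Dense(3, input_shape=(2,), activation='relu'))
model.add(Dense(1, activation='sigmoid'))

first_layer = model.layers[0]   # access a layer by its index
print(first_layer.input)        # input tensor of the layer
print(first_layer.output)       # output tensor of the layer
print(first_layer.weights)      # list of TensorFlow variables (kernel and bias)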

What are tensors?


Tensors are the main data structures used in deep learning: inputs, outputs, and transformations in neural networks are all represented using tensors and tensor multiplication. A tensor is a multi-dimensional array of numbers. A 2-dimensional tensor is a matrix, and a 3-dimensional tensor is an array of matrices.
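As a quick illustration with plain TensorFlow constants (the values are arbitrary):

import tensorflow as tf

vector = tf.constant([1.0, 2.0, 3.0])                  # 1-dimensional tensor
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])         # 2-dimensional tensor (a matrix)
cube = tf.constant([[[1.0], [2.0]], [[3.0], [4.0]]])   # 3-dimensional tensor (an array of matrices)
print(matrix.shape)   # (2, 2)
print(cube.shape)     # (2, 2, 1)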

Keras backend
If we import the Keras backend, we can build a function that takes an input tensor from a given layer and returns the output tensor from another (or the same) layer. TensorFlow is the backend Keras uses in this course, but it could be any other, such as Theano. To define the function with our backend K we give it a list of inputs and a list of outputs, even if we just want one input and one output. We can then call it on a tensor with the same shape as the input layer given during its definition. If the weights of the layers between our inputs and outputs change, the function's output for the same input will change as well. We can use this to see how the output of certain layers changes as weights are adjusted during training; we will check this in the exercises!
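A rough sketch of such a backend function, reusing the small model defined earlier (the layer choice is illustrative):

import numpy as np
import tensorflow.keras.backend as K

# Function from the input of the first layer to the output of the first layer
inp = model.layers[0].input
out = model.layers[0].output
inp_to_out = K.function([inp], [out])

# Call it on data with the same shape as the input layer: here, 5 samples with 2 features
print(inp_to_out([np.random.rand(5, 2)]))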

Autoencoders!
Autoencoders are models that aim to produce outputs identical to their inputs.

This task alone wouldn't be very useful, but since we decrease the number of neurons along the way, we effectively make the network learn to compress its inputs into a small set of neurons.

Autoencoder use cases


This makes autoencoders useful for tasks such as: Dimensionality reduction, since we can obtain a lower-dimensional representation of our inputs. De-noising, since an autoencoder trained with clean data and then fed noisy data can decode back a good representation of the input without the noise. Anomaly detection, since an autoencoder trained to map inputs to outputs on normal data will fail to give accurate outputs when passed strange values, and we can measure this failure as a loss. Many other applications can also benefit from this architecture.

Building a simple autoencoder


To make an autoencoder that maps one hundred inputs to one hundred outputs, encoding the inputs into a layer of 4 neurons, we would do the following: instantiate a sequential model, add a dense layer of 4 neurons with an input_shape of 100, and end with an output layer of 100 neurons. We use the sigmoid activation because we assume our outputs take values between 0 and 1, and we finish by compiling the model with the adam optimizer and binary_crossentropy loss, since we used sigmoid.
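Put together, a sketch of this 100-4-100 autoencoder (the hidden-layer activation is an assumption; the text only fixes the sigmoid output):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

autoencoder = Sequential()
# Encode the 100 inputs into a bottleneck of 4 neurons
autoencoder.add(Dense(4, input_shape=(100,), activation='relu'))
# Decode back to 100 outputs in the 0-1 range
autoencoder.add(Dense(100, activation='sigmoid'))

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')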

Breaking it into an encoder


Once you've built and trained your autoencoder you might want to encode your inputs. To do this, you just build a new model that uses only the first layer of your previously trained autoencoder. This new model's predictions return the 4 numbers given by the 4 neurons of the hidden layer; it effectively returns a 4-number encoding of each observation in the input dataset.
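A sketch of that encoder, reusing the trained autoencoder above (X stands for any input array with 100 features per row):

from tensorflow.keras.models import Sequential

# New model consisting only of the trained bottleneck layer
encoder = Sequential()
encoder.add(autoencoder.layers[0])

# Returns a 4-number encoding for each observation in X
encodings = encoder.predict(X)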

Exercise: It's a flow of tensors


If you have already built a model, you can use model.layers and tensorflow.keras.backend to build functions that, provided with a valid input tensor, return the corresponding output tensor.
This is a useful tool when we want to obtain the output of a network at an intermediate
layer.
For instance, if you get the input and output from the first layer of a network, you can
build an inp_to_out function that returns the result of carrying out forward propagation
through only the first layer for a given input tensor.
So that's what you're going to do right now!

X_test from the Banknote Authentication dataset and its model are preloaded.
Type model.summary() in the console to check it.

Neural separation
Put on your gloves because you're going to perform brain surgery!
Neurons learn by updating their weights to output values that help them better
distinguish between the different output classes in your dataset. You will make use of
the inp_to_out() function you just built to visualize the output of two neurons in the first
layer of the Banknote Authentication model as it learns.
The model you built in chapter 2 is ready for you to use, just like X_test and y_test.
Paste show_code(plot) in the console if you want to check plot().
You're performing heavy-duty work; once all is done, click through the graphs to watch the separation live!

Exercise: Building an autoencoder


Autoencoders have several interesting applications, such as anomaly detection and image denoising. They aim at producing an output identical to their input. The input is compressed into a lower-dimensional space (encoded), and the model then learns to decode it back to its original form.
You will encode and decode the MNIST dataset of handwritten digits. The hidden layer will encode a 32-dimensional representation of each image, which originally consists of 784 pixels (28 x 28). The autoencoder will essentially learn to turn the original 784-pixel image into a compressed 32-number representation and to use that encoding to bring back the original 784-pixel image.
The Sequential model and Dense layers are ready for you to use.
Let's build an autoencoder!


De-noising like an autoencoder


Okay, you have just built an autoencoder model. Let's see how it handles a more challenging task.
First, you will build a model that encodes images, and you will check how different digits are represented with show_encodings(). To build the encoder you will make use of your autoencoder, which has already been trained. You will just use the first half of the network, which contains the input and the bottleneck output. That way, you will obtain a 32-number output representing the encoded version of the input image.
Then, you will apply your autoencoder to noisy images from MNIST; it should be able to clean up the noisy artifacts.
X_test_noise, containing these noisy digits, is loaded in your workspace.

Apply the power of the autoencoder!

1.4.2. Intro to CNNs


Let's introduce Convolutional Neural Networks, a different type of network that has led
to a lot of advances in computer vision, as well as in many other areas.

How do they work?

A convolutional model uses convolutional layers. A convolution is a simple
mathematical operation that preserves spatial relationships. When applied to images it
can detect relevant areas of interest like edges, corners, vertical lines, etc.

It consists of applying a filter, also known as a kernel, of a given size. In this example, we are applying a 3 by 3 kernel. We center the kernel on each pixel as we slide over the image, multiplying the kernel values by the underlying pixel values and summing the results at each location. This effectively computes a new image in which certain characteristics are amplified depending on the filter used. The secret sauce of CNNs resides in letting the network itself find the best filter values and combine them to achieve a given task.
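A hand-written numerical sketch of this operation (a toy grayscale image and an illustrative horizontal-edge kernel):

import numpy as np

image = np.random.rand(28, 28)            # toy grayscale image
kernel = np.array([[-1, -1, -1],
                   [ 0,  0,  0],
                   [ 1,  1,  1]])          # illustrative edge-detecting filter

output = np.zeros((26, 26))                # a valid 3x3 convolution shrinks each side by 2
for i in range(26):
    for j in range(26):
        # multiply the kernel by the 3x3 patch under it and sum the result
        output[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)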

Typical architectures
For a classification problem with many possible classes, CNNs tend to become very deep. Architectures consist of stacks of convolutional layers interleaved with other layers known as pooling layers, which we won't cover here. Convolutional layers perform feature learning; we then flatten their outputs into a one-dimensional vector and pass it to fully connected layers that carry out classification.

Input shape to convolutional neural networks


Images are 3D tensors: they have width, height, and depth. The depth is given by the color channels. If we use black and white images we will just have one channel, so the depth will be 1.


How to build a simple convolutional net in Keras?


To build a CNN in Keras we first import the Conv2D and Flatten layers from tensorflow.keras.layers. We instantiate our model and add a convolutional layer. This first convolutional layer has 32 filters, which means it will learn 32 different convolutional masks. These masks will be 3 by 3 squares, as defined by the kernel_size. For 28 by 28 black and white images with only one channel, we use an input shape of (28, 28, 1). We can use any activation, as usual. We then add another convolutional layer and finish by flattening the 2D output into a one-dimensional vector with the Flatten layer, followed by an output dense layer.
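A sketch matching that description (the second layer's filter count and the 10-class output are assumptions):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten, Dense

model = Sequential()
# 32 filters of size 3x3 over 28x28 single-channel images
model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(28, 28, 1)))
# A second convolutional layer (16 filters chosen arbitrarily)
model.add(Conv2D(16, kernel_size=3, activation='relu'))
# Flatten the 2D feature maps into a one-dimensional vector
model.add(Flatten())
# Output layer; 10 classes assumed (e.g. the digits 0-9)
model.add(Dense(10, activation='softmax'))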

Deep convolutional models


ResNet50 is a 50-layer-deep model that performs well on the ImageNet dataset, a huge dataset of more than 14 million images. ResNet50 can distinguish between 1000 different classes. This model would take too long to train on a regular computer, but Keras makes it easy for us to use. We just need to prepare the image we want to classify for the model, predict on the processed image, and decode the predictions!
Pre-processing images for ResNet50
To use pre-trained models to classify images, we first have to adapt these images so that they can be understood by the model. To prepare images for ResNet50 we would do the following. First, import image from tensorflow.keras.preprocessing and preprocess_input from tensorflow.keras.applications.resnet50. We then load our image with load_img, providing the target size, which for this particular model is 224 by 224. We turn the image into a numpy array with img_to_array, expand the dimensions of the array, and preprocess the input in the same way the training images were preprocessed.
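A sketch of those preprocessing steps (the file name is a placeholder):

import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input

# Load the image at the size ResNet50 expects
img = image.load_img('dog.png', target_size=(224, 224))
img_array = image.img_to_array(img)               # numpy array of shape (224, 224, 3)
img_expanded = np.expand_dims(img_array, axis=0)  # add the batch dimension -> (1, 224, 224, 3)
img_ready = preprocess_input(img_expanded)        # same preprocessing as the training images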

We import ResNet50 and decode_predictions, load the model with ImageNet pre-trained weights, predict on our image, and decode the predictions; that is, we get the predicted classes with the highest probabilities.
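Continuing the sketch above with the prediction step:

from tensorflow.keras.applications.resnet50 import ResNet50, decode_predictions

model = ResNet50(weights='imagenet')     # load the ImageNet pre-trained weights
preds = model.predict(img_ready)         # probabilities over the 1000 classes
print(decode_predictions(preds, top=3))  # most probable labeled classes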

What is going on inside a convnet?


Inside a CNN we can check how the different filters activate in response to an input
image. We will explore this in the exercises!


Exercise: Building a CNN model


Building a CNN model in Keras isn't much more difficult than building any of the models
you've already built throughout the course! You just need to make use of convolutional
layers.
You're going to build a shallow convolutional model that classifies the MNIST digits
dataset. The same one you de-noised with your autoencoder! The images are 28 x 28
pixels and just have one channel, since they are black and white pictures.
Go ahead and build this small convolutional model!

Looking at convolutions
Inspecting the activations of a convolutional layer is a cool thing. You have
to do it at least once in your lifetime!
To do so, you will build a new model with the Keras Model object, which takes in a list of inputs and a list of outputs. The output you will provide to this new model is the output of the first convolutional layer when given an MNIST digit as the input image.
The convolutional model you built in the previous exercise has already been
trained for you. It can now correctly classify MNIST handwritten images.
You can check it with model.summary() in the console.
Let's look at the convolutional masks that were learned in the first
convolutional layer of this model!
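A hedged sketch of such an activation model, assuming model refers to the small MNIST convolutional network built earlier (not the ResNet50 from the previous sketch) and that a preloaded digit is stored in digit with shape (1, 28, 28, 1):

from tensorflow.keras.models import Model

# New model mapping the network's input to the first convolutional layer's output
first_layer_model = Model(inputs=model.layers[0].input,
                          outputs=model.layers[0].output)

activations = first_layer_model.predict(digit)
print(activations.shape)   # e.g. (1, 26, 26, 32): one 26x26 map per learned filter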

Preparing your input image


The original ResNet50 model was trained with images of size 224 x 224 pixels and a number of preprocessing operations, like the subtraction of the mean pixel value of the training set from all training images. You need to pre-process the images you want to predict on in the same way.
When predicting on a single image, you need it to fit the model's input shape, which in this case looks like this: (batch_size, width, height, channels). np.expand_dims with parameter axis=0 adds the batch-size dimension, indicating that a single image will be passed to predict. This batch-size dimension value is 1, since we are only predicting on one image.
You will go over these preprocessing steps as you prepare this dog's (named Ivy) image
into one that can be classified by ResNet50.


Using a real world model


Okay, so Ivy's picture is ready to be used by ResNet50; it is stored in img_ready.
ResNet50 is a model trained on the ImageNet dataset that is able to distinguish between 1000 different labeled objects. It is a deep model with 50 layers; you can explore its architecture in 3D online.
ResNet50 and decode_predictions have both been imported
from tensorflow.keras.applications.resnet50 for you.
It's time to use this trained model to find out Ivy's breed!

1.4.3. Intro to LSTMs
It's time to briefly introduce Long Short Term Memory networks, also known as LSTMs.

What are RNNs?


LSTMs are a type of recurrent neural network, RNN for short. A simple RNN is a neural
network that can use past predictions in order to infer new ones. This allows us to solve
problems where there is a dependence on past inputs.

What are LSTMs?


LSTM neurons are pretty complex; they are actually called units or cells. They have an internal state that is passed between units, which you can see as a memory of past steps. A unit receives the internal state, the output from the previous unit, and a new input at time t. It then updates the state and produces a new output that is returned, as well as passed as an input to the following unit.

LSTM units perform several operations. They learn what to ignore, what to keep, and how to select the most important pieces of past information in order to predict the future. They tend to work better than simple RNNs for most problems.


LSTMs + Text
Let's go over an example of how to use LSTMs with text data to predict the next word in a sentence!

Neural networks can only deal with numbers, not text. We need to transform each
unique word into a number. Then these numbers can be used as inputs to an
embedding layer.

Embedding layers learn to represent words as vectors of a predetermined size. These vectors encode meaning and are used by subsequent layers.

Sequence preparation
We first define some text and choose a sequence length. With a sequence length of 3 we will end up feeding our model two words, and it will predict the third one. We split the text into words with the split method. We then need to turn these words into consecutive lines of 3 words each, which we can do by looping from seq_len to the number of words + 1 and storing each line.
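A minimal sketch of this sequence preparation, with an illustrative sentence:

text = 'keras makes deep learning fun'
seq_len = 3

words = text.split()   # split the text into words
lines = []
for i in range(seq_len, len(words) + 1):
    # each line holds seq_len consecutive words, joined back into a string
    lines.append(' '.join(words[i - seq_len:i]))

print(lines)   # ['keras makes deep', 'makes deep learning', 'deep learning fun']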

Text preparation in Keras


After that, we turn our text sequences into numbers. We import the Keras Tokenizer from the preprocessing text module, instantiate it, fit it on the lines, and then turn those lines into numeric sequences. The tokenizer object stores the word-to-number mapping in two dictionaries: index_word and word_index. The index_word dictionary maps each index to its encoded word; we can use it to decode our outputs, mapping numbers back to words.
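A sketch of that tokenization, continuing with the lines built above:

from tensorflow.keras.preprocessing.text import Tokenizer

tokenizer = Tokenizer()
tokenizer.fit_on_texts(lines)                     # learn the word-to-number mapping
sequences = tokenizer.texts_to_sequences(lines)   # each 3-word line becomes 3 integers

print(sequences)
print(tokenizer.index_word)   # maps each index back to its word
print(tokenizer.word_index)   # maps each word to its index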


Building an LSTM model


Our data is ready to be processed, so we can now build the LSTM model. We start by importing the Dense, LSTM, and Embedding layers from tensorflow.keras.layers. We then store the vocab_size, since we will use it when defining our layers. The vocab_size is the length of the tokenizer's dictionary plus one; the plus one accounts for 0, an integer reserved for special characters, since, as we saw, our dictionary starts at 1, not 0. We add an embedding layer whose input_dim is the vocab_size variable; it will turn our word numbers into 8-dimensional vectors, and we need to declare the input_length so that our model understands that two words will be passed simultaneously as a sequence. We end by adding an LSTM layer of 8 units, a hidden layer, and an output layer with softmax and as many outputs as there are possible words.
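Put together, a sketch of this model (the hidden layer's size and the activations are assumptions where the text does not fix them; the compile settings follow the adam/crossentropy advice given later):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Embedding

vocab_size = len(tokenizer.index_word) + 1   # +1 for the reserved index 0

model = Sequential()
# Turn each of the 2 input word indices into an 8-dimensional vector
model.add(Embedding(input_dim=vocab_size, output_dim=8, input_length=2))
model.add(LSTM(8))                                   # LSTM layer of 8 units
model.add(Dense(8, activation='relu'))               # hidden layer (size assumed)
model.add(Dense(vocab_size, activation='softmax'))   # one output per possible word

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])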

Exercise: Text prediction with LSTMs


During the following exercises you will build a toy LSTM model that is able to predict the next word using a small text dataset. This dataset consists of cleaned quotes from The Lord of the Rings movies. You can find them in the text variable.
You will turn this text into sequences of length 4 and make use of the Keras Tokenizer to prepare the features and labels for your model!
The Keras Tokenizer is already imported for you to use. It assigns a unique number to each unique word and stores the mappings in a dictionary. This is important since the model deals with numbers, but we will later want to decode the output numbers back into words.
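A hedged sketch of how the features and labels could be prepared, assuming sequences is the list of length-4 integer sequences produced by the tokenizer (variable names are illustrative, not the exercise's exact scaffolding):

import numpy as np
from tensorflow.keras.utils import to_categorical

seq_array = np.array(sequences)
X = seq_array[:, :3]                                          # first three words are the features
y = to_categorical(seq_array[:, 3], num_classes=vocab_size)   # 4th word, one-hot encoded as the label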

Exercise: Build your LSTM model


You've already prepared your sequences of text. It's time to build your LSTM model!
Remember that your sequences had 4 words each; your model will be trained on the first three words of each sequence to predict the 4th one. You are going to use an Embedding layer that will essentially learn to turn words into meaningful vectors. These vectors will then be passed to a simple LSTM layer. The output is a Dense layer with as many neurons as there are words in the vocabulary and a softmax activation. This is because we want to obtain the most probable next word out of all possible words.
The size of the vocabulary (the number of unique words) is stored in vocab_size.


That's a nice-looking model you've built! You'll see that this model is powerful enough to learn text relationships; we aren't using a lot of text in this tiny example, and our sequences are quite short. This model is trained as usual: you would just need to compile it with an optimizer like adam and use a crossentropy loss. This is because we have modeled this next-word prediction task as a classification problem with all the unique words in our vocabulary as candidate classes.

Exercise: Decode your predictions


Your LSTM model has already been trained (details in the previous exercise success message) so that you don't have to wait. It's time to define a function that decodes its predictions. The trained model will be passed as a default parameter to this function.
Since you are predicting with a model that uses the softmax function, numpy's argmax() can be used to obtain the index/position of the most probable next word out of the output vector of probabilities.
The tokenizer you previously created and fitted is loaded for you. You will be making use of its internal index_word dictionary to turn the model's next-word prediction (which is an integer) into the actual written word it represents.
You're very close to experimenting with your model!
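A minimal sketch of such a decoding function, following the exercise description (the function name and signature are illustrative):

import numpy as np

def predict_text(test_text, model=model):
    # Turn the input words into their integer encoding
    encoded = tokenizer.texts_to_sequences([test_text])[0]
    encoded = np.array(encoded).reshape(1, -1)

    # argmax picks the most probable next word from the softmax output
    pred_index = np.argmax(model.predict(encoded), axis=-1)[0]

    # Map the predicted integer back to its written word
    return tokenizer.index_word[pred_index]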
