
St. Johns College of Engineering & Technology


Yerrakota, Yemmiganur – 518 360, Kurnool (Dist) A.P.

Deep Learning - LAB MANUAL (R20)

Name of Student: __________________________


Roll No : __________________________
Year/Semester : __________________________
Branch : ___________________________

CERTIFICATE

Department of _________________________________________________________

Certified that this is the bonafide record of the work done by


Sri ___________________________________ of ________________ B.Tech in
_______________________________________ Laboratory

Date: Head of the Department Staff-in-Charge

Regd No:

Submitted for the Practical Examination held on_______________________

Internal Examiner External Examiner



LIST OF EXPERIMENTS

S.NO    NAME OF THE EXPERIMENT    DATE    PAGE NO    SIGNATURE

Experiment 1 – Introduction of Keras

Keras is an open-source, high-level neural network library written in Python that can run on top of Theano, TensorFlow, or CNTK. It was developed by Francois Chollet, a Google engineer. It is designed to be user-friendly, extensible, and modular in order to facilitate faster experimentation with deep neural networks. It supports not only Convolutional Networks and Recurrent Networks individually but also their combination.

Keras does not handle low-level computations itself; instead, it delegates them to a backend library. Keras acts as a high-level API wrapper over the low-level API of the backend, which lets it run on TensorFlow, CNTK, or Theano.

At launch it had over 4,800 contributors, and its user base has since grown to around 250,000 developers, roughly doubling every year. Big companies like Microsoft, Google, NVIDIA, and Amazon have actively contributed to the development of Keras. It enjoys strong industry adoption and is used in production at popular firms such as Netflix, Uber, Google, and Expedia.

Specialties of Keras

o Focus on user experience has always been a major part of Keras.
o It has large adoption in the industry.
o It is multi-backend and supports multiple platforms, which helps developers collaborate and reuse code across environments.
o The Keras research community works closely with the production community.
o Its concepts are easy to grasp.
o It supports fast prototyping.
o It runs seamlessly on CPU as well as GPU.
o It provides the freedom to design any architecture, which can later be exposed as an API for the project.
o It is very simple to get started with.
o Easy production deployment of models is what makes Keras special.

Keras user experience

1. Keras is an API designed for humans – Keras follows best practices to reduce cognitive load: it keeps models consistent and the corresponding APIs simple.
2. Not designed for machines – Keras provides clear feedback when an error occurs, which minimizes the number of user actions needed for the most common use cases.
3. Easy to learn and use.
4. Highly flexible – Keras gives developers high flexibility by integrating with low-level deep learning frameworks such as TensorFlow or Theano, which ensures that anything written in the base framework can be used from Keras.

Multi-backend and multi-platform support in Keras

Keras can be used from R as well as Python, and the same code can be run with TensorFlow, Theano, CNTK, or MXNet as per the requirement. Keras models can be run on CPU, NVIDIA GPU, AMD GPU, TPU, etc. Producing models with Keras is simple because it supports deployment with TensorFlow Serving, GPU acceleration in the browser (WebKeras, Keras.js), Android (TF, TF Lite), iOS (native CoreML), and Raspberry Pi.

Keras Backend

Keras is a model-level library: it helps in developing deep learning models by offering high-level building blocks. Low-level computations such as tensor products and convolutions are not handled by Keras itself; they are delegated to a specialized, well-optimized tensor manipulation library that serves as the backend engine. Rather than tying itself to a single tensor library, Keras allows different backend engines to be plugged in.

Keras supports three backend engines, which are as follows:

o TensorFlow
TensorFlow is a Google product, which is one of the most famous deep learning tools
widely used in the research area of machine learning and deep neural network. It came
into the market on 9th November 2015 under the Apache License 2.0. It is built in such a
way that it can easily run on multiple CPUs and GPUs as well as on mobile operating
systems. It consists of various wrappers in distinct languages such as Java, C++, or
Python.

o Theano
Theano was developed at the University of Montreal, Quebec, Canada, by the MILA group. It is an open-source Python library that is widely used for performing mathematical operations on multi-dimensional arrays, building on SciPy and NumPy. It utilizes GPUs for faster computation and efficiently computes gradients by building symbolic graphs automatically. It is well suited to numerically unstable expressions, because it can detect them and recompute them with more stable algorithms.

o CNTK
The Microsoft Cognitive Toolkit is Microsoft's open-source deep learning framework. It provides all the basic building blocks required to form a neural network. Models are trained using C++ or Python, but C# or Java can be used to load a trained model for making predictions.
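You can check which backend engine Keras is using from Python. The following is a minimal sketch, assuming the standalone multi-backend Keras package (with tf.keras the backend is always TensorFlow); in multi-backend Keras the engine can also be switched through the keras.json configuration file or the KERAS_BACKEND environment variable.

import os
# Optionally request a backend before importing Keras (multi-backend Keras only).
os.environ.setdefault("KERAS_BACKEND", "tensorflow")

from keras import backend as K
print(K.backend())   # prints the name of the active backend engine, e.g. "tensorflow"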

Advantages of Keras

Keras offers the following advantages:

o It is very easy to understand and enables faster deployment of network models.
o It has huge community support in the market, as most AI companies are keen on using it.
o It supports multiple backends, which means you can use any one of TensorFlow, CNTK, and Theano as the backend according to your requirement.
o Because deployment is easy, it also supports multiple platforms. Keras models can be deployed on:
1. iOS with CoreML
2. Android with TensorFlow Android
3. Web browsers with .js support (Keras.js, TensorFlow.js)
4. Cloud engines
5. Raspberry Pi
o It supports data parallelism, which means a Keras model can be trained on multiple GPUs at once, speeding up training and allowing huge amounts of data to be processed.

Disadvantages of Keras

o The main disadvantage is that Keras provides its own pre-configured layers; if you want to create a custom low-level layer, Keras itself cannot express it, because it does not expose low-level APIs. It only provides a high-level API running on top of the backend engine (TensorFlow, Theano, or CNTK).

Prerequisite

This Keras tutorial is intended for both beginners and professionals, to help them understand the fundamental concepts of Keras. After completing this tutorial, you will have a moderate level of expertise from which you can take yourself to the next level.

Experiment 2 – Installing Keras and packages in Keras

To install Keras, you will need the Anaconda Distribution, which is supported by a company called Continuum Analytics. Anaconda is a free, open-source distribution of the Python and R languages. It is platform-independent, which means it can be installed on any operating system such as macOS, Windows, or Linux, as per the user's requirement. It comes with more than 1,500 Python/R packages that are useful for developing deep learning as well as machine learning models.

It provides an easy Python installation along with several tools such as Jupyter Notebook, the Anaconda Prompt, Spyder, etc. Once installed, it automatically sets up Python with some of its basic IDEs and libraries, providing as much convenience as it can to its user.

Following are the steps that illustrate Keras installation:

Step1: Download Anaconda Python

To download Anaconda, you can either open your favorite browser and type "Download Anaconda Python" in the search bar, or simply follow the link given below.

https://www.anaconda.com/distribution/#download-section

Click on the very first link, and you will be directed to Anaconda's download page, as shown below:

You will notice that Anaconda is available for various operating systems such as Windows, macOS, and Linux. You can download it by clicking on the available option for your OS. It will offer you a Python 2.7 and a Python 3.7 version; since Python 3.7 is the latest, download it by clicking on its download option. The download will start automatically.

Step2: Install Anaconda Python

After the download is finished, go to the download folder and click on Anaconda's .exe file (Anaconda3-2019.03-Windows-x86_64.exe). The setup window for the Anaconda installation will open; click on Next, as shown below:

After clicking on Next, a License Agreement window opens; click on I Agree to move ahead with the installation.

Next, you will get two options in the window; click on the first option, followed by clicking on Next.

Once you are done with the installation, click on Next.



Click on Finish after the installation is completed to end the process.

Step3: Create Environment

Now that you are done with installing Anaconda, you have to create a new conda environment where you will install all the modules needed to build your models.

Run the Anaconda Prompt as an Administrator: search for "Anaconda Prompt" in the search bar, right-click on it, and select the first option, which says Run as administrator.

After you click on it, you will see that your anaconda prompt has opened, and it will look like the
image given below.

Next, you will need to create an environment. To do so, write the following command in the Anaconda Prompt and press Enter. Here deeplearning is the name of the environment, but you can use any name of your choice.

1. conda create --name deeplearning

From the image given above, you can see that it asks you to confirm the package plan and environment location; type y and press Enter.

So, you can see in the above image that you have successfully created an environment. The next step is to activate the environment you just created. To activate it, write the following:

1. activate deeplearning

From the above image, you can see that you are now inside this environment.
Next, you have to install Keras, which you can do with the command below.

1. conda install -c anaconda keras

You can see that it asks you to install the required packages, so proceed by typing y.

From the above image, you can see that the installation completed successfully.

Since this is a new environment, you need to perform a few more installations to avoid the error ModuleNotFoundError: No module named 'keras' when importing Keras from other tools.

You have to run two more important commands, because jupyter and spyder are not preinstalled when you create a new environment.

First, run the command for jupyter, which is as follows:

1. conda install jupyter

Again, it will ask you to install some packages, so proceed by typing y.

You can see in the above image that it has been installed successfully.

Next, do the same for spyder.

1. conda install spyder

Since you are installing it for the very first time, it will again ask you for y/n; simply proceed by typing y as you did before.

You can see that the installation has completed successfully.

You will also need to install matplotlib for visualization. Again, the same procedure is carried out.

1. conda install matplotlib

It will ask you for y/n; type y to proceed further.

You can see that you have successfully installed matplotlib.

Lastly, install pandas; again the procedure is the same.

1. conda install pandas

Proceed by typing y.

From the image given above, you can see that it has also been installed successfully.
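As a quick check that the new environment is working, the short script below (a minimal sketch; run it from the deeplearning environment, for example in Jupyter or Spyder) imports the packages installed above and prints their versions:

# Sanity check for the newly created environment.
# If any import fails, re-run the corresponding "conda install ..." command above.
import keras
import matplotlib
import pandas as pd

print("Keras:", keras.__version__)
print("matplotlib:", matplotlib.__version__)
print("pandas:", pd.__version__)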

Experiment 3 - Train the model to add two numbers and report the result

With the advent of Deep Learning, there have been huge successes on perceptual problems. In this guide, for the sake of simplicity and ease of understanding, we will recast simple arithmetic addition as such a learning problem and then try to predict the values through the trained model.

In this guide, we are going to use the Keras library, which is made available as part of the TensorFlow library.

Data Tensors

Getting the data into the proper shape is perhaps the most important aspect of any machine learning model, and that holds true here as well. The program below (data_creation.py) creates the training and test sets for the addition problem.

import numpy as np
train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
print(train_data)
for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])
test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])
Let's analyze the above program:
import numpy as np
train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])

In the above three lines, we import the NumPy library and create the train_data and train_targets data sets. train_data is the array that holds the two numbers that are going to be added, while train_targets is the vector that holds the sum of the two. train_data is initialized with 1.0 and 1.0 as the two numbers. This is a very simple program, so you will see the same number repeated (1.0); this pattern is repeated across the entire train and test data sets, that is, the same number (i) is added to itself.

for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])
The above lines grow the train_data array and train_targets vector by looping over a counter (i) that starts at 3 and goes up to 10000 with a step of 2. This is what train_data looks like:

Output
[[1.000e+00 1.000e+00]
 [3.000e+00 3.000e+00]
 [5.000e+00 5.000e+00]
 ...
 [9.995e+03 9.995e+03]
 [9.997e+03 9.997e+03]
 [9.999e+03 9.999e+03]]

train_targets:

Output
[2.0000e+00 6.0000e+00 1.0000e+01 ... 1.9990e+04 1.9994e+04 1.9998e+04]

test_data and test_targets are created in a similar fashion, with one difference: the loop goes up to 8000 with a step of 4.

test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])

test_data:

Output
[[2.000e+00 2.000e+00]
 [4.000e+00 4.000e+00]
 [8.000e+00 8.000e+00]
 ...
 [7.988e+03 7.988e+03]
 [7.992e+03 7.992e+03]
 [7.996e+03 7.996e+03]]

test_targets:

Output
[4.0000e+00 8.0000e+00 1.6000e+01 ... 1.5976e+04 1.5984e+04 1.5992e+04]

Developing a Neural Network for Addition Using Keras

Keras is an API specification that can be used to drive various deep learning libraries, e.g. TensorFlow, Theano, etc. Note that Keras itself does not implement the numerical computation; it is a high-level API that runs on top of other deep learning libraries. The problem we are attempting to solve is a regression problem, where the output can be a continuum of values rather than a fixed set of classes. The program below creates a deep learning model, trains it using the training set we created in the data_creation.py program, tests it using the test set created in the same program, and finally uses the trained model to predict values.

import tensorflow as tf
from tensorflow import keras
import numpy as np
import data_creation as dc

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(1)
])

model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])

model.fit(dc.train_data, dc.train_targets, epochs=10, batch_size=1)

test_loss, test_acc = model.evaluate(dc.test_data, dc.test_targets)
print('Test accuracy:', test_acc)
a = np.array([[2000,3000],[4,5]])
print(model.predict(a))

Let's analyze the above program by breaking it into small chunks:

import tensorflow as tf
from tensorflow import keras
import numpy as np
import data_creation as dc

The above lines import the TensorFlow, Keras, and NumPy libraries into the program. The data_creation.py program that we created earlier is also imported and bound to the name dc, so all the train and test data sets we created can now be referenced through dc. For example, to use the contents of train_data, all the user has to do is access dc.train_data.

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(1)
])

The above code creates the actual deep learning model. It initializes the model as a stack of layers (keras.Sequential) and then flattens the input array to a vector (keras.layers.Flatten(input_shape=(2,))). The flattening step also happens to be the first layer of the neural network. The second and third layers of the network consist of 20 nodes each, and the activation function we are using is relu (rectified linear unit). Other activation functions such as softmax can also be used. The last (fourth) layer is the output layer. Since we expect only one output value (a single predicted value, as this is a regression model), we have just one output node in this model (keras.layers.Dense(1)).
The architecture of the model depends, to a large extent, on the problem we are trying to solve. The model created above will not work very well for classification problems, such as image classification.

model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])
The above code compiles the network. The optimizer we are using is adam, a momentum-based optimizer that helps prevent the model from getting stuck in local minima. The loss function we are using is mse (mean squared error), which measures the squared difference between the predicted values and the actual values. We also monitor another metric, mae (mean absolute error).
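For intuition, both quantities can be computed directly with NumPy; this small example is only an illustration and is not part of the experiment's program:

import numpy as np

y_true = np.array([2.0, 6.0, 10.0])   # actual sums
y_pred = np.array([2.5, 5.0, 10.0])   # model predictions

mse = np.mean((y_pred - y_true) ** 2)    # mean squared error (the training loss)
mae = np.mean(np.abs(y_pred - y_true))   # mean absolute error (the monitored metric)
print(mse, mae)                          # 0.4166..., 0.5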
model.fit(dc.train_data, dc.train_targets, epochs=10, batch_size=1)

This is where the actual training of the network happens. The training set is fed to the network 10 times (epochs). The number of epochs needs to be chosen carefully: too few epochs may leave the network under-trained, while too many may lead to overfitting, wherein the network works well on the training data but not on the test data set.

test_loss, test_acc = model.evaluate(dc.test_data, dc.test_targets)
print('Test accuracy:', test_acc)

The above code evaluates the trained model on the test data set and then prints the test metric (here labelled "Test accuracy"; for this regression model it is actually the mean absolute error).

a = np.array([[2000,3000],[4,5]])
print(model.predict(a))
Once the model has been trained and tested, we can use it to predict values by supplying new inputs. In this case, we supply the two input pairs (2000,3000) and (4,5), and the output from the model is printed.
Output
Epoch 1/10
5000/5000 [==============================] - 5s 997us/sample - loss: 1896071.4827 - mean_absolute_error: 219.0276
Epoch 2/10
5000/5000 [==============================] - 5s 956us/sample - loss: 492.9092 - mean_absolute_error: 3.8202
Epoch 3/10
5000/5000 [==============================] - 5s 1ms/sample - loss: 999.7580 - mean_absolute_error: 7.1740
Epoch 4/10
5000/5000 [==============================] - 5s 1ms/sample - loss: 731.0374 - mean_absolute_error: 6.0325
Epoch 5/10
5000/5000 [==============================] - 5s 935us/sample - loss: 648.6434 - mean_absolute_error: 7.5037
Epoch 6/10
5000/5000 [==============================] - 5s 942us/sample - loss: 603.1096 - mean_absolute_error: 7.7574
Epoch 7/10
5000/5000 [==============================] - 5s 1ms/sample - loss: 596.2445 - mean_absolute_error: 5.1727
Epoch 8/10
5000/5000 [==============================] - 5s 924us/sample - loss: 685.5327 - mean_absolute_error: 4.9312
Epoch 9/10
5000/5000 [==============================] - 5s 931us/sample - loss: 1895.0845 - mean_absolute_error: 5.7679
Epoch 10/10
5000/5000 [==============================] - 5s 996us/sample - loss: 365.9733 - mean_absolute_error: 2.7120
2000/2000 [==============================] - 0s 42us/sample - loss: 5.8080 - mean_absolute_error: 2.0810
Test accuracy: 2.0810156
[[5095.9385 ]
 [   9.108022]]
As can be seen, the value predicted for the input pair (2000,3000) is 5095.9385 and for the input pair (4,5) it is 9.108022. These predictions can be improved by changing the number of epochs, adding layers, or increasing the number of nodes in a layer.
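For example, a somewhat wider and longer-trained variant of the same model could look like the sketch below. This reuses keras, tf and the dc data module from the program above; the layer sizes and epoch count are illustrative assumptions, not results recorded in this manual.

# Illustrative variant: more units per layer and more epochs.
# Whether it actually improves the predictions must be checked by re-running the evaluation.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(dc.train_data, dc.train_targets, epochs=30, batch_size=32)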

import numpy as np
import tensorflow as tf
from tensorflow import keras

train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
print(train_data)
for i in range(3,10000,2):
    train_data = np.append(train_data,[[i,i]],axis=0)
    train_targets = np.append(train_targets,[i+i])

test_data = np.array([[2.0,2.0]])
test_targets = np.array([4.0])
for i in range(4,8000,4):
    test_data = np.append(test_data,[[i,i]],axis=0)
    test_targets = np.append(test_targets,[i+i])

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(2,)),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(20, activation=tf.nn.relu),
    keras.layers.Dense(1)
])

model.compile(optimizer='adam',
              loss='mse',
              metrics=['mae'])
model.fit(train_data, train_targets, epochs=10, batch_size=1)

test_loss, test_acc = model.evaluate(test_data, test_targets)

print('Test accuracy:', test_acc)
a = np.array([[2000,3000],[4,5]])
print(model.predict(a))

[[1. 1.]]
Epoch 1/10
5000/5000 [==============================] - 8s 1ms/step - loss: 742813.1875 - mae: 105.6519
Epoch 2/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1772.8480 - mae: 6.0134
Epoch 3/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1884.9642 - mae: 8.9911
Epoch 4/10
5000/5000 [==============================] - 8s 2ms/step - loss: 1049.1685 - mae: 10.6520
Epoch 5/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1018.8793 - mae: 6.2299
Epoch 6/10
5000/5000 [==============================] - 8s 2ms/step - loss: 1276.4749 - mae: 5.4312
Epoch 7/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1943.9398 - mae: 8.8076
Epoch 8/10
5000/5000 [==============================] - 7s 1ms/step - loss: 3522.0959 - mae: 8.8434
Epoch 9/10
5000/5000 [==============================] - 7s 1ms/step - loss: 707.0856 - mae: 5.1806
Epoch 10/10
5000/5000 [==============================] - 7s 1ms/step - loss: 1415.4739 - mae: 8.1555
63/63 [==============================] - 0s 2ms/step - loss: 1182.1486 - mae: 29.7238
Test accuracy: 29.723848342895508
1/1 [==============================] - 0s 93ms/step
[[5440.4873 ]
 [   9.6786175]]

Experiment 4 - Train the model to multiply two matrices and report the result using Keras

import numpy as np

from keras.models import Sequential

from keras.layers.core import Dense

# Set seed for reproducibility

np.random.seed(1)

# the four different states of the XOR gate

training_data = np.array([[0,0],[0,1],[1,0],[1,1]], "float32")

# the four expected results in the same order

target_data = np.array([[0],[1],[1],[0]], "float32")

model = Sequential()

model.add(Dense(4, input_dim=2, activation='relu'))

model.add(Dense(1, activation='sigmoid'))

model.compile(loss='mean_squared_error',optimizer='adam',metrics=['binary_accuracy'])

history = model.fit(training_data, target_data, epochs=1000, verbose=1)

# decimal output

print('decimal output:\n'+str(model.predict(training_data)))

# rounded output

print('rounded output:\n'+str(model.predict(training_data).round()))

Epoch 1/1000
1/1 [==============================] - 0s 473ms/step - loss: 0.2552 - binary_accuracy: 0.7500
Epoch 2/1000
1/1 [==============================] - 0s 6ms/step - loss: 0.2550 - binary_accuracy: 0.7500
Epoch 3/1000
1/1 [==============================] - 0s 5ms/step - loss: 0.2547 - binary_accuracy: 0.7500
Epoch 989/1000
1/1 [==============================] - 0s 5ms/step - loss: 0.0967 - binary_accuracy: 1.0000
Epoch 990/1000
1/1 [==============================] - 0s 4ms/step - loss: 0.0966 - binary_accuracy: 1.0000
Epoch 991/1000
1/1 [==============================] - 0s 5ms/step - loss: 0.0964 - binary_accuracy: 1.0000
Epoch 992/1000
1/1 [==============================] - 0s 4ms/step - loss: 0.0963 - binary_accuracy: 1.0000
Epoch 993/1000
1/1 [==============================] - 0s 4ms/step - loss: 0.0962 - binary_accuracy: 1.0000
Epoch 994/1000
1/1 [==============================] - 0s 5ms/step - loss: 0.0960 - binary_accuracy: 1.0000
Epoch 995/1000
1/1 [==============================] - 0s 4ms/step - loss: 0.0959 - binary_accuracy: 1.0000
Epoch 996/1000
1/1 [==============================] - 0s 5ms/step - loss: 0.0957 - binary_accuracy: 1.0000
Epoch 997/1000
1/1 [==============================] - 0s 4ms/step - loss: 0.0956 - binary_accuracy: 1.0000
Epoch 998/1000
1/1 [==============================] - 0s 5ms/step - loss: 0.0954 - binary_accuracy: 1.0000
Epoch 999/1000
1/1 [==============================] - 0s 4ms/step - loss: 0.0953 - binary_accuracy: 1.0000
Epoch 1000/1000
1/1 [==============================] - 0s 5ms/step - loss: 0.0951 - binary_accuracy: 1.0000
1/1 [==============================] - 0s 63ms/step
decimal output:
[[0.34885803]
 [0.7179798 ]
 [0.68972814]
 [0.28727812]]
1/1 [==============================] - 0s 24ms/step
rounded output:
[[0.]
 [1.]
 [1.]
 [0.]]
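Note that the listing above actually trains a small XOR classifier. For the multiplication task stated in the experiment title, a regression setup analogous to Experiment 3 can be used; the following is only a hedged sketch (the data range, layer sizes and number of epochs are illustrative assumptions, not recorded results):

# Sketch: learn to multiply two numbers (regression), analogous to the addition model in Experiment 3.
import numpy as np
import tensorflow as tf
from tensorflow import keras

rng = np.random.default_rng(1)
x_train = rng.uniform(0.0, 10.0, size=(10000, 2))   # pairs of factors
y_train = x_train[:, 0] * x_train[:, 1]             # their products

model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(2,)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(x_train, y_train, epochs=20, batch_size=32, verbose=0)

print(model.predict(np.array([[3.0, 4.0], [6.0, 7.0]])))   # should be close to 12 and 42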

Experiment 5 – Train the model to print the prime numbers using Keras

import numpy as np
from keras.layers import Dense, Dropout, Activation
from keras.layers.advanced_activations import PReLU
from keras.models import Sequential
from matplotlib import pyplot as plt

seed = 7
np.random.seed(seed)

num_digits = 14  # binary encode numbers
max_number = 2 ** num_digits

def prime_list():
    counter = 0
    primes = [2, 3]
    for n in range(5, max_number, 2):
        is_prime = True
        for i in range(1, len(primes)):
            counter += 1
            if primes[i] ** 2 > n:
                break
            counter += 1
            if n % primes[i] == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(n)
    return primes

primes = prime_list()

def prime_encode(i):
    if i in primes:
        return 1
    else:
        return 0

def bin_encode(i):
    return [i >> d & 1 for d in range(num_digits)]

def create_dataset():
    x, y = [], []
    for i in range(102, max_number):
        x.append(bin_encode(i))
        y.append(prime_encode(i))
    return np.array(x), y

x_train, y_train = create_dataset()

model = Sequential()
model.add(Dense(units=100, input_dim=num_digits))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=50))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=25))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=1))
model.add(Activation("sigmoid"))

model.compile(optimizer='RMSprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train, epochs=1000, batch_size=128,
                    validation_split=0.1)

# predict
errors, correct = 0, 0
tp, fn, fp = 0, 0, 0
for i in range(2, 101):
    x = bin_encode(i)
    y = model.predict(np.array(x).reshape(-1, num_digits))
    if y[0][0] >= 0.5:
        pred = 1
    else:
        pred = 0
    obs = prime_encode(i)
    print(i, obs, pred, y[0][0])
    if pred == obs:
        correct += 1
    else:
        errors += 1
    if obs == 1 and pred == 1:
        tp += 1
    if obs == 1 and pred == 0:
        fn += 1
    if obs == 0 and pred == 1:
        fp += 1

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_score = 2 * precision * recall / (precision + recall)
print("Errors :", errors, " Correct :", correct, "F score :", f_score)

def plot_history(history):
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.legend(['loss', 'val_loss'], loc='upper right')
    plt.savefig('RMSprop_more')

plot_history(history)

Output

Errors : 9 Correct : 90 F score : 0.8235294117647058

2 1 0 3.60083e-08
3 1 0 0.0372073
4 0 0 8.21077e-06
5 1 1 0.617928
6 0 0 3.97566e-15
7 1 1 0.826378
8 0 0 0.000872908
9 0 0 0.0204348
10 0 0 1.00783e-09
11 1 0 0.187609
12 0 0 5.53888e-08
13 1 1 0.796401
14 0 0 3.67647e-15
15 0 0 0.029023
16 0 0 4.0449e-07
17 1 1 0.843016
18 0 0 4.10257e-14
19 1 1 0.917604
20 0 0 4.3736e-15
21 0 0 0.0141451
22 0 0 6.49346e-25
23 1 1 0.593597
24 0 0 4.45055e-08
25 0 1 0.735933
26 0 0 4.97732e-18

27 0 0 0.0958722
28 0 0 1.80154e-16
29 1 1 0.722513
30 0 0 2.00777e-28
31 1 1 0.774054
32 0 0 3.93779e-05
33 0 0 0.118341
34 0 0 1.88295e-11
35 0 0 0.480108
36 0 0 3.0609e-07
37 1 1 0.847888
38 0 0 3.42833e-18
39 0 0 0.0514646
40 0 0 5.82673e-07
41 1 1 0.726771
42 0 0 3.72693e-11
43 1 1 0.861872
44 0 0 5.71867e-14
45 0 0 0.18657
46 0 0 7.03075e-16
47 1 1 0.654062
48 0 0 1.30385e-10
49 0 1 0.923631
50 0 0 1.30955e-17
51 0 0 0.190215
52 0 0 6.45953e-19
53 1 1 0.558284
54 0 0 1.83163e-29
55 0 0 0.287756
56 0 0 3.29105e-11
57 0 0 0.292637
58 0 0 3.57044e-23
59 1 0 0.152102
60 0 0 1.80104e-22
61 1 1 0.858877
62 0 0 1.92684e-32
63 0 0 0.27367
64 0 0 1.74397e-09
65 0 1 0.727574
66 0 0 1.33752e-20
67 1 1 0.891129
68 0 0 1.47396e-17
69 0 0 0.346057
70 0 0 5.27672e-27
71 1 1 0.932053
72 0 0 4.04155e-10
73 1 1 0.879374
74 0 0 1.4077e-18
75 0 0 0.0290487
76 0 0 6.39801e-17
77 0 1 0.629597
78 0 0 1.54139e-30
79 1 1 0.791511
80 0 0 7.56631e-21
81 0 0 0.0438443
82 0 0 4.24787e-30
83 1 1 0.596353
84 0 0 6.45592e-32
85 0 0 0.431211
86 0 0 0.0
87 0 0 0.00903795
88 0 0 9.54647e-23
89 1 1 0.827787
90 0 0 2.43897e-31
91 0 1 0.746695
92 0 0 8.37092e-37
93 0 0 0.0384408
94 0 0 0.0
95 0 0 0.3743

96 0 0 7.28071e-13
97 1 1 0.888417
98 0 0 3.04541e-25
99 0 0 0.0649973
100 0 0 1.59478e-18

6. Recurrent Neural Network

a. NumPy implementation of a simple recurrent neural network (see the sketch after this list)

b. Create a recurrent layer in Keras

c. Prepare IMDB data for the movie review classification problem.

d. Train the model with embedding and simple RNN layers.

e. Plot the results
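For part (a), a simple RNN forward pass can be written in plain NumPy as below. This is a minimal illustrative sketch with random data and randomly initialized weights; it is independent of the Keras listing that follows.

# Part (a): NumPy sketch of a simple RNN forward pass.
# At each timestep: state = tanh(W . input_t + U . state + b)
import numpy as np

timesteps, input_features, output_features = 100, 32, 64

inputs = np.random.random((timesteps, input_features))   # one input vector per timestep
state_t = np.zeros((output_features,))                    # initial state: all zeros

W = np.random.random((output_features, input_features))
U = np.random.random((output_features, output_features))
b = np.random.random((output_features,))

successive_outputs = []
for input_t in inputs:
    output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)
    successive_outputs.append(output_t)
    state_t = output_t                                     # the output becomes the next state

final_output_sequence = np.stack(successive_outputs, axis=0)
print(final_output_sequence.shape)                         # (100, 64)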

import numpy as np

import matplotlib.pyplot as plt

import tensorflow as tf

from tensorflow.keras.datasets import imdb

from tensorflow.keras.preprocessing.sequence import pad_sequences

imdb.load_data(num_words=20000)

(x_train,y_train),(x_test,y_test)=imdb.load_data(num_words=20000)

x_train=pad_sequences(x_train,maxlen=100)

x_test=pad_sequences(x_test,maxlen=100)

vocab_size=20000

emed_size=128

from tensorflow.keras import Sequential

from tensorflow.keras.layers import LSTM, Dropout, Dense, Embedding

model=Sequential()

model.add(Embedding(vocab_size,emed_size,input_shape=(x_train.shape[1],)))

model.add(LSTM(units=60,activation='tanh'))

model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])

model.summary()

history=model.fit(x_train,y_train,epochs=5,batch_size=128,validation_data=(x_test,y_test))

epoch_range=[1,2,3,4,5]

plt.plot(epoch_range, history.history['accuracy'])

plt.plot(epoch_range, history.history['val_accuracy'])

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 100, 128) 2560000

lstm (LSTM) (None, 60) 45360

=================================================================
Total params: 2,605,360
Trainable params: 2,605,360
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
196/196 [==============================] - 38s 170ms/step - loss: 1.7606 -
accuracy: 0.0078 - val_loss: 1.1591 - val_accuracy: 0.1327
Epoch 2/5
196/196 [==============================] - 32s 164ms/step - loss: 1.1524 -
accuracy: 0.1252 - val_loss: 1.1365 - val_accuracy: 0.0272
Epoch 3/5
196/196 [==============================] - 32s 162ms/step - loss: 0.9810 -
accuracy: 0.0443 - val_loss: 0.9238 - val_accuracy: 0.0105
Epoch 4/5
196/196 [==============================] - 32s 165ms/step - loss: 0.7770 -
accuracy: 0.0857 - val_loss: 0.9253 - val_accuracy: 0.1010
Epoch 5/5
196/196 [==============================] - 32s 165ms/step - loss: 0.6973 -
accuracy: 0.0709 - val_loss: 1.0435 - val_accuracy: 0.0549

7. Consider temperature-forecast as one the example for recurrent neural network and
implement the following.
a. Inspect the data of the weather dataset
b. Parsing the data
c. Plotting the temperature timeseries
d. Plotting the first 10 days of the temperature timeseries

Now that we have an insight into RNNs, let us develop an RNN model that can provide a 4-day temperature forecast based on 30 days of historical temperature data. Google Colab was used to implement this code and Spyder for the visualizations; there are many other tools that you can use according to your preference.

You can download a historical weather dataset from here; also feel free to use any weather dataset of your choice that has temperature data.

Let's load the dataset and see the first few rows:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# import dataset from data.csv file
dataset = pd.read_csv('data.csv')
dataset = dataset.dropna(subset=["Temperature"])
dataset = dataset.reset_index(drop=True)
training_set = dataset.iloc[:, 4:5].values

We include only the temperature column, since we are going to forecast temperature, and drop all rows that have no value or contain NaN.
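For parts (c) and (d), the temperature time series can be inspected with matplotlib. A minimal sketch, assuming one temperature reading per day (adjust the slice if the sampling rate of your dataset differs):

# Part (c): plot the full temperature time series.
plt.plot(dataset["Temperature"].values)
plt.title("Temperature timeseries")
plt.xlabel("observation")
plt.ylabel("temperature")
plt.show()

# Part (d): plot only the first 10 days (assumes one reading per day).
plt.plot(dataset["Temperature"].values[:10])
plt.title("Temperature timeseries - first 10 days")
plt.xlabel("day")
plt.ylabel("temperature")
plt.show()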

Next, we have to apply feature scaling to normalize the temperature into the range 0 to 1.

# Feature Scaling
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)

We will create a training set such that for every 30 days of input we provide the next 4 days of temperature as output. In other words, the input to our RNN is 30 days of temperature data and the output is a 4-day temperature forecast.

x_train = []
y_train = []
n_future = 4   # next 4 days temperature forecast
n_past = 30    # past 30 days
for i in range(0, len(training_set_scaled) - n_past - n_future + 1):
    x_train.append(training_set_scaled[i : i + n_past, 0])
    y_train.append(training_set_scaled[i + n_past : i + n_past + n_future, 0])
x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

x_train contains the 30 previous temperature readings before a given day and y_train contains the temperatures of the 4 days after it. Since x_train and y_train start out as Python lists, we convert them to NumPy arrays before fitting the model.

Now that our training data is ready, let's proceed to build an RNN model for forecasting the weather.

1. First, we import the Keras Sequential model from keras.models and the Keras layers we need, i.e. LSTM, Dense, Dropout and Bidirectional. You can refer to the Keras documentation for more info on Keras models and layers here.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Bidirectional
# Fitting the RNN to the training set. Read the Keras callbacks docs for more info on callbacks.

2. Let us define the layers in our RNN. We create a sequential model by adding layers one after another with Sequential(). The first layer is a Bidirectional LSTM with 30 memory units; return_sequences=True means the full output sequence is passed on to the next layer, and input_shape describes the structure of the input. With a Bidirectional LSTM the layer sees the sequence in both the forward (past) and the backward (future) direction simultaneously. We add 3 hidden LSTM layers and an output Dense layer with a linear activation function that outputs the 4-day temperature forecast. Finally, we fit the RNN model on our training data.

regressor = Sequential()
regressor.add(Bidirectional(LSTM(units=30, return_sequences=True,
                                 input_shape=(x_train.shape[1], 1))))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=30))
regressor.add(Dropout(0.2))
regressor.add(Dense(units=n_future, activation='linear'))
regressor.compile(optimizer='adam', loss='mean_squared_error', metrics=['acc'])
regressor.fit(x_train, y_train, epochs=500, batch_size=32)

Note: The Adam optimizer is used because it is computationally efficient.

3. Create test data to test the model's performance.

# read the test dataset
testdataset = pd.read_csv('data (12).csv')
# get only the temperature column
testdataset = testdataset.iloc[:30, 3:4].values
real_temperature = pd.read_csv('data (12).csv')
real_temperature = real_temperature.iloc[30:, 3:4].values
testing = sc.transform(testdataset)
testing = np.array(testing)
testing = np.reshape(testing, (testing.shape[1], testing.shape[0], 1))

4. Now that we have our test data ready, we can test our RNN model.

predicted_temperature = regressor.predict(testing)
predicted_temperature = sc.inverse_transform(predicted_temperature)
predicted_temperature = np.reshape(predicted_temperature,
                                   (predicted_temperature.shape[1], predicted_temperature.shape[0]))

The output from the model is in normalized form, so to get the actual temperature values we apply inverse_transform() to predicted_temperature and then reshape it.

Let's compare the predicted and real temperatures. As we can see, the model performs well on the given test data.

real_temperature
array([[82.], [82.], [83.], [83.]])

predicted_temperature
array([[83.76233 ], [83.957565], [83.70461 ], [83.6326 ]])

If we forecast the temperature for a month and visualize it, we get the following results.

[Figure: Forecast of temperature over a month]



8. Long short-term memory network

a. Implement LSTM using LSTM layer in keras

b. Train and evaluate using reversed sequences for IMDB data

c. Train and evaluate a bidirectional LSTM for IMDB data

a. Implement LSTM using LSTM layer in keras

tf.keras.layers.LSTM(
units,
activation="tanh",
recurrent_activation="sigmoid",
use_bias=True,
kernel_initializer="glorot_uniform",
recurrent_initializer="orthogonal",
bias_initializer="zeros",
unit_forget_bias=True,
kernel_regularizer=None,
recurrent_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
recurrent_constraint=None,
bias_constraint=None,
dropout=0.0,
recurrent_dropout=0.0,
return_sequences=False,
return_state=False,
go_backwards=False,
stateful=False,
time_major=False,
unroll=False,
**kwargs
)

Based on available runtime hardware and constraints, this layer will choose different
implementations (cuDNN-based or pure-TensorFlow) to maximize the performance. If a GPU is
available and all the arguments to the layer meet the requirement of the cuDNN kernel (see
below for details), the layer will use a fast cuDNN implementation.

The requirements to use the cuDNN implementation are:

1. activation == tanh
2. recurrent_activation == sigmoid
3. recurrent_dropout == 0
4. unroll is False
5. use_bias is True
6. Inputs, if use masking, are strictly right-padded.
7. Eager execution is enabled in the outermost context.

For example:

>>> inputs = tf.random.normal([32, 10, 8])
>>> lstm = tf.keras.layers.LSTM(4)
>>> output = lstm(inputs)
>>> print(output.shape)
(32, 4)
>>> lstm = tf.keras.layers.LSTM(4, return_sequences=True, return_state=True)
>>> whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
>>> print(whole_seq_output.shape)
(32, 10, 4)
>>> print(final_memory_state.shape)
(32, 4)
>>> print(final_carry_state.shape)
(32, 4)
Arguments

o units: Positive integer, dimensionality of the output space.
o activation: Activation function to use. Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (ie. "linear" activation: a(x) = x).
o recurrent_activation: Activation function to use for the recurrent step. Default: sigmoid (sigmoid). If you pass None, no activation is applied (ie. "linear" activation: a(x) = x).
o use_bias: Boolean (default True), whether the layer uses a bias vector.
o kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs. Default: glorot_uniform.
o recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. Default: orthogonal.
o bias_initializer: Initializer for the bias vector. Default: zeros.
o unit_forget_bias: Boolean (default True). If True, add 1 to the bias of the forget gate at initialization. Setting it to true will also force bias_initializer="zeros". This is recommended in Jozefowicz et al..
o kernel_regularizer: Regularizer function applied to the kernel weights matrix. Default: None.
o recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix. Default: None.
o bias_regularizer: Regularizer function applied to the bias vector. Default: None.
o activity_regularizer: Regularizer function applied to the output of the layer (its "activation"). Default: None.
o kernel_constraint: Constraint function applied to the kernel weights matrix. Default: None.
o recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix. Default: None.
o bias_constraint: Constraint function applied to the bias vector. Default: None.
o dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Default: 0.
o recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Default: 0.
o return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.
o return_state: Boolean. Whether to return the last state in addition to the output. Default: False.
o go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence.
o stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
o time_major: The shape format of the inputs and outputs tensors. If True, the inputs and outputs will be in shape [timesteps, batch, feature], whereas in the False case, it will be [batch, timesteps, feature]. Using time_major = True is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.
o unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.

Call arguments

o inputs: A 3D tensor with shape [batch, timesteps, feature].
o mask: Binary tensor of shape [batch, timesteps] indicating whether a given timestep should be masked (optional, defaults to None). An individual True entry indicates that the corresponding timestep should be utilized, while a False entry indicates that the corresponding timestep should be ignored.
o training: Python boolean indicating whether the layer should behave in training mode or in inference mode. This argument is passed to the cell when calling it. This is only relevant if dropout or recurrent_dropout is used (optional, defaults to None).
o initial_state: List of initial state tensors to be passed to the first call of the cell (optional, defaults to None, which causes creation of zero-filled initial state tensors).

b. Train and evaluate using reversed sequences for IMDB data

o Sometimes, a sequence is better used in reversed order. In those cases, you can simply reverse a vector x using the Python syntax x[::-1] before using it to train your LSTM network (a short sketch of this variant follows the bidirectional example below).
o Sometimes, neither the forward nor the reversed order works perfectly, but combining them gives better results. In this case, you will need a bidirectional LSTM network.
o A bidirectional LSTM network is simply two separate LSTM networks: one is fed the forward sequence and the other the reversed sequence. The outputs of the two LSTM networks are then concatenated before being fed to the subsequent layers of the network. In Keras, you can use the Bidirectional() wrapper to clone an LSTM layer for forward and backward input and concatenate their outputs. For example,

model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(Bidirectional(LSTM(100, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(1, activation='sigmoid'))

o Since you created not one but two LSTMs with 100 units each, this network will take twice the amount of time to train. Depending on the problem, this additional cost may be justified.
o The full code listing, with the bidirectional LSTM added to the previous example, is given below for completeness.

# LSTM with dropout for sequence classification in the IMDB dataset
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Bidirectional
from tensorflow.keras.layers import Embedding
from tensorflow.keras.preprocessing import sequence
# fix random seed for reproducibility
tf.random.set_seed(7)
# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(Bidirectional(LSTM(100, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

o Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.
o Running this example provides the following output.

Epoch 1/3
391/391 [==============================] - 405s 1s/step - loss: 0.4960 - accuracy: 0.7532
Epoch 2/3
391/391 [==============================] - 439s 1s/step - loss: 0.3075 - accuracy: 0.8744
Epoch 3/3
391/391 [==============================] - 430s 1s/step - loss: 0.2551 - accuracy: 0.9014
Accuracy: 87.69%
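For item (b) itself, training on reversed sequences rather than bidirectional input only requires reversing each padded review before fitting. The sketch below reuses the variables from the listing above (X_train, X_test, y_train, y_test, top_words, embedding_vecor_length, max_review_length); it is an illustration, not a recorded run:

# Item (b) sketch: train and evaluate a plain LSTM on reversed sequences.
# [:, ::-1] reverses each padded review along the time axis.
X_train_rev = X_train[:, ::-1]
X_test_rev = X_test[:, ::-1]

rev_model = Sequential()
rev_model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
rev_model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
rev_model.add(Dense(1, activation='sigmoid'))
rev_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
rev_model.fit(X_train_rev, y_train, epochs=3, batch_size=64)
scores = rev_model.evaluate(X_test_rev, y_test, verbose=0)
print("Accuracy on reversed sequences: %.2f%%" % (scores[1] * 100))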

c. Train and evaluate a bidirectional LSTM for IMDB data

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
max_features = 20000 # Only consider the top 20k words
maxlen = 200 # Only consider the first 200 words of each movie review

Build the model

# Input for variable-length sequences of integers


inputs = keras.Input(shape=(None,), dtype="int32")
# Embed each integer in a 128-dimensional vector
x = layers.Embedding(max_features, 128)(inputs)
# Add 2 bidirectional LSTMs
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
x = layers.Bidirectional(layers.LSTM(64))(x)
# Add a classifier
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.summary()

Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, None)] 0
_________________________________________________________________
embedding (Embedding) (None, None, 128) 2560000
_________________________________________________________________
bidirectional (Bidirectional (None, None, 128) 98816
_________________________________________________________________
bidirectional_1 (Bidirection (None, 128) 98816
_________________________________________________________________
dense (Dense) (None, 1) 129
=================================================================
Total params: 2,757,761
Trainable params: 2,757,761
Non-trainable params: 0
_________________________________________________________________

Load the IMDB movie review sentiment data

(x_train, y_train), (x_val, y_val) = keras.datasets.imdb.load_data( num_words=max_features)


print(len(x_train), "Training sequences")
print(len(x_val), "Validation sequences")
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_val = keras.preprocessing.sequence.pad_sequences(x_val, maxlen=maxlen)
25000 Training sequences
25000 Validation sequences

Train and evaluate the model

model.compile("adam", "binary_crossentropy", metrics=["accuracy"])


model.fit(x_train, y_train, batch_size=32, epochs=2, validation_data=(x_val, y_val))
Epoch 1/2
782/782 [==============================] - 220s 281ms/step - loss: 0.4117 - accuracy:
0.8083 - val_loss: 0.6497 - val_accuracy: 0.6983
Epoch 2/2
726/782 [==========================>...] - ETA: 11s - loss: 0.3170 - accuracy: 0.8683

10. Convolutional Neural Networks

a. Preparing the IMDB data
b. Train and evaluate a simple 1D convnet on IMDB data
c. Train and evaluate a simple 1D convnet on temperature prediction data (see the sketch at the end of this experiment)
import numpy as np   # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline
import string
import os

mother_directory = os.getcwd()
data_directory = "../input/imdb-movie-ratings-sentiment-analysis/movie.csv"
data = pd.read_csv(data_directory)

# Find the length (in words) of the longest review.
max_length = 0
for sentence in data["text"]:
    new_sentence = sentence.translate(str.maketrans("", "", string.punctuation))
    length = len(new_sentence.split(" "))
    if length > max_length:
        max_length = length

def tokenize_the_data_from_pandas(dataframe, column_name):
    # This function returns a tokens dictionary that includes every unique word as a key
    # and a unique integer for each key.
    # Here we clean the strings of punctuation.
    import string
    for sent in range(len(data[column_name])):
        example_sentence = data[column_name].iloc[sent]
        new_sentence = example_sentence.translate(str.maketrans("", "", string.punctuation))
        data[column_name].iloc[sent] = new_sentence
    # Here we create a dictionary that encodes each word as an integer so the word has a
    # numeric representation for the deep neural network.
    tokens = {}
    for sent in range(len(data[column_name])):
        example_sentence = data[column_name].iloc[sent]
        values = example_sentence.split(" ")
        for word in values:
            tokens[word] = 0

    names = list(tokens.keys())
    for num in range(len(names)):
        tokens[names[num]] = num + 1

    return tokens

tokens = tokenize_the_data_from_pandas(data, column_name="text")
len(tokens.keys())

# This is a 1D convolutional model for the task.
model = tf.keras.Sequential()
# This embedding layer maps each integer word index to a dense 200-dimensional vector.
model.add(tf.keras.layers.Embedding(190020, 200, input_length=2470))
model.add(tf.keras.layers.Conv1D(32, 7, activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
model.add(tf.keras.layers.Conv1D(64, 7, activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
model.add(tf.keras.layers.Conv1D(64, 7, activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
# model.add(tf.keras.layers.GlobalMaxPooling1D())
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# X_train, y_train, X_test and y_test are the tokenized, padded reviews and their labels
# (the train/test split step is not shown in this listing).
model.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test))
Epoch 1/5
2022-01-26 13:16:46.737730: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded
cuDNN version 8005
1125/1125 [==============================] - 37s 26ms/step - loss: 0.3360 - accuracy:
0.8362 - val_loss: 0.2162 - val_accuracy: 0.9087
Epoch 2/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0945 - accuracy:
0.9680 - val_loss: 0.2506 - val_accuracy: 0.9013
Epoch 3/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0161 - accuracy:
0.9954 - val_loss: 0.4001 - val_accuracy: 0.8938
Epoch 4/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0096 - accuracy:
0.9965 - val_loss: 0.4988 - val_accuracy: 0.8920
Epoch 5/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0107 - accuracy:
0.9967 - val_loss: 0.5361 - val_accuracy: 0.8915
Out[14]:
<keras.callbacks.History at 0x7f560b67b890>
# Predicting on the test set to investigate further.
y_pred = model.predict(X_test[:])
# We turn our network's output into binary values.
y_pred[y_pred > 0.5] = 1
y_pred[y_pred < 0.5] = 0
Truth = 0
Falset = 0

for pik in range(len(y_pred)):
    result = y_pred[pik] == y_test[pik]
    if False in result:
        Falset += 1
    else:
        Truth += 1
    # print(y_pred[pik] == y_test[pik])

print(Truth, Falset)
3566 434
print("accuracy:", Truth / (Truth + Falset))
accuracy: 0.8915
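For part (c), the same 1D-convolution idea can be applied to the windowed temperature data prepared in Experiment 7. The following is only a hedged sketch: it assumes x_train of shape (samples, 30, 1) and y_train of shape (samples, 4) as built there, tf imported as above, and the layer sizes are illustrative.

# Part (c) sketch: a simple 1D convnet regressor on the 30-day temperature windows from Experiment 7.
conv_regressor = tf.keras.Sequential([
    tf.keras.layers.Conv1D(32, 5, activation="relu", input_shape=(30, 1)),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(32, 5, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(4)   # 4-day temperature forecast (regression output, so no sigmoid)
])
conv_regressor.compile(optimizer="adam", loss="mse", metrics=["mae"])
conv_regressor.fit(x_train, y_train, epochs=20, batch_size=32, validation_split=0.1)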

11. Develop a traditional LSTM for a sequence classification problem.

You can quickly develop a small LSTM for the IMDB problem and achieve good accuracy.

Let's start by importing the classes and functions required for this model and initializing the random number generator to a constant value, to ensure you can easily reproduce the results.

import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Embedding
from tensorflow.keras.preprocessing import sequence
# fix random seed for reproducibility
tf.random.set_seed(7)
You need to load the IMDB dataset, constraining it to the top 5,000 words. The dataset is also split into train (50%) and test (50%) sets.

# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
Next, you need to truncate and pad the input sequences so they are all the same length for modeling. The model will learn that the zero values carry no information. The sequences differ in length in terms of content, but same-length vectors are required to perform the computation in Keras.

# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
You can now define, compile, and fit your LSTM model.

The first layer is the Embedding layer, which uses 32-length vectors to represent each word. The next layer is the LSTM layer with 100 memory units. Finally, because this is a classification problem, a Dense output layer with a single neuron and a sigmoid activation function is used to make 0 or 1 predictions for the two classes (good and bad).

Because it is a binary classification problem, log loss is used as the loss function (binary_crossentropy in Keras). The efficient Adam optimization algorithm is used. The model is fit for only a few epochs because it quickly overfits the problem. A large batch size of 64 reviews is used to space out weight updates.

# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=64)
Once fit, you can estimate the performance of the model on unseen reviews.

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

For completeness, here is the full code listing for this LSTM network on the IMDB dataset.

# LSTM for sequence classification in the IMDB dataset
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Embedding
from tensorflow.keras.preprocessing import sequence
# fix random seed for reproducibility
tf.random.set_seed(7)
# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
Running this example produces the following output.

Epoch 1/3
391/391 [==============================] - 124s 316ms/step - loss: 0.4525 - accuracy: 0.7794
Epoch 2/3
391/391 [==============================] - 124s 318ms/step - loss: 0.3117 - accuracy: 0.8706
Epoch 3/3
391/391 [==============================] - 126s 323ms/step - loss: 0.2526 - accuracy: 0.9003
Accuracy: 86.83%
