Deep Learning Manual
Deep Learning Manual
CERTIFICATE
Department of _________________________________________________________
Regd No:
LIST OF EXPERIMENTS
Keras is an open-source high-level Neural Network library, which is written in Python is capable
enough to run on Theano, TensorFlow, or CNTK. It was developed by one of the Google
engineers, Francois Chollet. It is made user-friendly, extensible, and modular for facilitating
faster experimentation with deep neural networks. It not only supports Convolutional Networks
and Recurrent Networks individually but also their combination.
It cannot handle low-level computations, so it makes use of the Backend library to resolve it.
The backend library act as a high-level API wrapper for the low-level API, which lets it run on
TensorFlow, CNTK, or Theano.
Initially, it had over 4800 contributors during its launch, which now has gone up to 250,000
developers. It has a 2X growth ever since every year it has grown. Big companies like Microsoft,
Google, NVIDIA, and Amazon have actively contributed to the development of Keras. It has an
amazing industry interaction, and it is used in the development of popular firms likes Netflix,
Uber, Google, Expedia, etc.
Specialties of Keras
1. Keras is an API designed for humans – Best practices are followed by Keras to
decrease cognitive load, ensures that the models are consistent, and the corresponding
APIs are simple.
2. Not designed for machines – Keras provides clear feedback upon the occurrence of any
error that minimizes the number of user actions for the majority of the common use cases.
3. Easy to learn and use.
4. Highly Flexible – Keras provide high flexibility to all of its developers by integrating
low-level deep learning languages such as TensorFlow or Theano, which ensures that
anything written in the base language can be implemented in Keras.
5
Keras can be developed in R as well as Python, such that the code can be run with TensorFlow,
Theano, CNTK, or MXNet as per the requirement. Keras can be run on CPU, NVIDIA GPU,
AMD GPU, TPU, etc. It ensures that producing models with Keras is really simple as it totally
supports to run with TensorFlow serving, GPU acceleration (WebKeras, Keras.js), Android (TF,
TF Lite), iOS (Native CoreML) and Raspberry Pi. Play Video
Keras Backend
Keras being a model-level library helps in developing deep learning models by offering high-
level building blocks. All the low-level computations such as products of Tensor, convolutions,
etc. are not handled by Keras itself, rather they depend on a specialized tensor manipulation
library that is well optimized to serve as a backend engine. Keras has managed it so perfectly that
instead of incorporating one single library of tensor and performing operations related to that
particular library, it offers plugging of different backend engines into Keras.
o TensorFlow
TensorFlow is a Google product, which is one of the most famous deep learning tools
widely used in the research area of machine learning and deep neural network. It came
into the market on 9th November 2015 under the Apache License 2.0. It is built in such a
way that it can easily run on multiple CPUs and GPUs as well as on mobile operating
systems. It consists of various wrappers in distinct languages such as Java, C++, or
Python.
o Theano
Theano was developed at the University of Montreal, Quebec, Canada, by the MILA
group. It is an open-source python library that is widely used for performing
mathematical operations on multi-dimensional arrays by incorporating scipy and numpy.
It utilizes GPUs for faster computation and efficiently computes the gradients by building
symbolic graphs automatically. It has come out to be very suitable for unstable
expressions, as it first observes them numerically and then computes them with more
stable algorithms.
o CNTK
Microsoft Cognitive Toolkit is deep learning's open-source framework. It consists of all
the basic building blocks, which are required to form a neural network. The models are
trained using C++ or Python, but it incorporates C# or Java to load the model for making
predictions.
Advantages of Keras
o It is very easy to understand and incorporate the faster deployment of network models.
o It has huge community support in the market as most of the AI companies are keen on
using it.
o It supports multi backend, which means you can use any one of them among TensorFlow,
CNTK, and Theano with Keras as a backend according to your requirement.
o Since it has an easy deployment, it also holds support for cross-platform. Following are
the devices on which Keras can be deployed:
1. iOS with CoreML
2. Android with TensorFlow Android
3. Web browser with .js support
4. Cloud engine
5. Raspberry pi
o It supports Data parallelism, which means Keras can be trained on multiple GPU's at an
instance for speeding up the training time and processing a huge amount of data.
Disadvantages of Keras
o The only disadvantage is that Keras has its own pre-configured layers, and if you want to
create an abstract layer, it won't let you because it cannot handle low-level APIs. It only
supports high-level API running on the top of the backend engine (TensorFlow, Theano,
and CNTK).
Prerequisite
This Keras tutorial is made for both beginners and professionals, to help them understand the
fundamental concept of Keras. After the completion of this tutorial, you will find yourself at a
moderate level of expertise from where you can take yourself to the next level.
7
To install Keras, you will need Anaconda Distribution, which is supported by a company called
Continuum Analytics. Anaconda provides a platform for Python and R languages, which is an
open-source and free distribution. It is a platform-independent, which means that it can be
installed on any operating system such as MAC OS, Windows, and Linux as per the user's
requirement. It has come up with more than 1500 packages of Python/R that are necessary for
developing deep learning as well as machine learning models.
It provides an easy python installation with several IDE's such as Jupyter Notebook, Anaconda
prompt, Spyder, etc. Once it is installed, it will automatically install Python with some of its
basic IDE's and libraries by providing as much convenience as it can to its user.
To download Anaconda, you can either go to one of your favorite browser and type Download
Anaconda Python in the search bar or, simply follow the link given below.
https://www.anaconda.com/distribution/#download-section.
Click on the very first link, and you will get directed to the Anaconda's download page, as shown
below:
8
You will notice that Anaconda is available for various operating systems such as Windows,
MAC OS, and Linux. You can download it by clicking on the available options as per your OS.
It will offer you Python 2.7 and Python 3.7 version. Since the latest version is Python 3.7, so
download it by clicking on the download option. The downloading will automatically start after
you hit the download option.
After the download is finished, go to the download folder and click on the Anaconda's .exe file
(Anaconda3-2019.03-Windows-x86_64.exe). The setup window for the installation of Anaconda
will get open up where you have to click on Next, as shown below:
9
After clicking on the Next, it will open a License Agreement window, click on I Agree to move
ahead with the installation.
Next, you will get two options in the window; click on the first option, followed by clicking
10
on Next.
Now that you are done with installing Anaconda, you have to create a new conda environment
12
where you will be installing all your modules to build your models.
You can run Anaconda prompt as an Administrator, which you can do by searching the
Anaconda prompt in the search bar and then click right on it, followed by selecting the first
option, which says Run as administrator.
After you click on it, you will see that your anaconda prompt has opened, and it will look like the
image given below.
Next, you will need to create an environment. For which you have to write the following
13
command on the anaconda prompt and press enter. Here deeplearning specifies to the name of
the environment, but you can write anything as per your choice.
From the image given above, you can see that it is asking you for the package plan environment
location, click on y and press enter.
So, you can see in the above image that you have successfully created an environment. Now the
next step is to activate the environment that you created earlier. To activate the environment,
write the following;
14
1. activate deeplearning
From the above image, you can see that you are in this environment.
Next, you have to install the Keras, which you can simply do by using the below-given
command.
1. conda install -c anaconda keras
You can see that it is asking you to install the following packages, so proceed with typing y.
15
From the above image, you can see that you are done with the installation successfully.
Since this is a new environment so, you need to do a few installations again so as to avoid the
occurrence of error: ModuleNotFoundError: No module named 'keras' while importing Keras.
So, you have to run two of the most important commands because when you create an
environment, jupyter and spyder are not preinstalled, that is why you have to run them.
First, you will run the command for jupyter, which is as follow:
Again, it will ask you to install the following packages, so proceed with typing y.
You can see in the above image that it has been successfully installed.
Since you are doing for the very first time, so it will again ask you for y/n, so you just simply
17
From the image given above, you can see that it also has been installed successfully.
20
Experiment 3 - Train the model to add two numbers and report the result
With the advent of Deep Learning, there have been huge successes for these kinds of perceptual
problems. In this guide, for the sake of simplicity and ease of understanding, we will try to
change the simple arithmetic addition to that of a perceptual problem and then try to predict the
values through this trained model.
In this guide, we are going to use Keras library which is made available as part of the Tensorflow
library.
Data Tensors
Getting the data in proper shape is perhaps the most important aspect of any machine learning
model and it holds true here as well. The below program (data_creation.py) creates the training
and test sets for the Addition problem.
1import numpy as np
2train_data = np.array([[1.0,1.0]])
3train_targets = np.array([2.0])
4print(train_data)
5for i in range(3,10000,2):
6 train_data= np.append(train_data,[[i,i]],axis=0)
7 train_targets= np.append(train_targets,[i+i])
8test_data = np.array([[2.0,2.0]])
9test_targets = np.array([4.0])
10for i in range(4,8000,4):
11 test_data = np.append(test_data,[[i,i]],axis=0)
12 test_targets = np.append(test_targets,[i+i])
python
Let's analyze the above program:
1import numpy as np
2train_data = np.array([[1.0,1.0]])
3train_targets = np.array([2.0])
python
In the above three lines, we are importing the Numpy library and creating train_data and
train_target data sets. train_data is the array that will be used to hold the two numbers that are
going to be added while train_targets is the vector that will hold the Addition value of the two.
train_data is initialized to contain the value like 1.0 and 1.0 as two numbers. This is a very
simple program so you will see the same number repeated (1.0) and this pattern is repeated in the
entire train and test data set that is same number (i) is used to add itself.
21
1for i in range(3,10000,2):
2 train_data= np.append(train_data,[[i,i]],axis=0)
3 train_targets= np.append(train_targets,[i+i])
python
The above lines append the train_data array and train_target vector by looping over the counter
(i) that starts from 3 and goes up to 10000 with a step function of 2. This is what train_data looks
like:
Output
1[[1.000e+00 1.000e+00]
2 [3.000e+00 3.000e+00]
3 [5.000e+00 5.000e+00]
4 ...
5 [9.995e+03 9.995e+03]
6 [9.997e+03 9.997e+03]
7 [9.999e+03 9.999e+03]]
train_targets:
Output
1[2.0000e+00 6.0000e+00 1.0000e+01 ... 1.9990e+04 1.9994e+04 1.9998e+04]
test_data and test_targets are also created in a similar fashion, with one difference: it goes till
8000 with the step of 4.
1test_data = np.array([[2.0,2.0]])
2test_targets = np.array([4.0])
3for i in range(4,8000,4):
4 test_data = np.append(test_data,[[i,i]],axis=0)
5 test_targets = np.append(test_targets,[i+i])
test_data:
Output
1[[2.000e+00 2.000e+00]
2 [4.000e+00 4.000e+00]
3 [8.000e+00 8.000e+00]
4 ...
5 [7.988e+03 7.988e+03]
6 [7.992e+03 7.992e+03]
7 [7.996e+03 7.996e+03]]
test_targets:
22
Output
1[4.0000e+00 8.0000e+00 1.6000e+01 ... 1.5976e+04 1.5984e+04 1.5992e+04]
1import tensorflow as tf
2from tensorflow import keras
3import numpy as np
4import data_creation as dc
python
The above lines import the Tensorflow, Keras, and Numpy libraries in the program. Also, the
data_creation.py program that we created earlier is also imported and given a named variable as
dc. All the trained test data sets we created can now be referenced using the dc. For example, if
the user needs to use the contents of train_data then all she has to do is use dc.train_data to
access it.
1model = keras.Sequential([
2 keras.layers.Flatten(input_shape=(2,)),
3 keras.layers.Dense(20, activation=tf.nn.relu),
4 keras.layers.Dense(20, activation=tf.nn.relu),
5 keras.layers.Dense(1)
6])
python
The above code creates the actual Deep Learning model. The above model initializes a model as
a stack of layers (Keras.Sequential) and then flattens the input array to a vector
(keras.layers.Flatten(input_shape=(2,)). The flattening part also happens to be the first layer of
the neural network. The second and third layers of the network consist of 20 nodes each and the
activation function we are using is relu (rectified linear unit). Other activation functions such
as softmax can also be used. The last layer, fourth layer, is the output layer. Since we expect only
one output value (a predicted value since this is a regression model), we have just one output
node in this model (keras.layers.Dense(1)).
The architecture of the model depends, to a large extent, on the problem we are trying to solve.
The model we have created above will not work very well for the classification problems, such
as image classification.
1model.compile(optimizer='adam',
2 loss='mse',
3 metrics=['mae'])
python
The above code will be used to compile the network. The optimization function we are using
is adam which is a momentum based optimizer and prevents the model from getting stuck in
local minima. The loss function we are using is mse (mean square error). It considers the squared
difference between the predicted values and the actual values. Also, we are monitoring another
metric, mae (mean absolute error).
1model.fit(dc.train_data, dc.train_targets, epochs=10, batch_size=1)
python
24
This is where the actual training of the networks happens. The training set will be fed to the
network 10 times (epochs) for the training purpose. The epoch needs to be carefully selected as a
lesser number of epochs may lead to an under-trained network while too many epochs may lead
to overfitting, wherein the network works well on the training data but not on the test data set.
1test_loss, test_acc = model.evaluate(dc.test_data, dc.test_targets)
2print('Test accuracy:', test_acc)
python
The above code evaluates the trained model on the test data set and subsequently prints the test
accuracy value.
1a= np.array([[2000,3000],[4,5]])
2print(model.predict(a))
python
Once the model has been trained and tested we can use it to predict the values by supplying real-
world values. In this case, we are supplying the 2 sets of values (2000,30000) and (4,5) and the
output from the model is printed.
Output
1Epoch 1/10
25000/5000 [==============================] - 5s 997us/sample - loss: 1896071.4827
- mean_absolute_error: 219.0276
3
4Epoch 2/10
55000/5000 [==============================] - 5s 956us/sample - loss: 492.9092 -
mean_absolute_error: 3.8202
6
7Epoch 3/10
85000/5000 [==============================] - 5s 1ms/sample - loss: 999.7580 -
mean_absolute_error: 7.1740
9
10Epoch 4/10
115000/5000 [==============================] - 5s 1ms/sample - loss: 731.0374 -
mean_absolute_error: 6.0325
12
13Epoch 5/10
145000/5000 [==============================] - 5s 935us/sample - loss: 648.6434 -
mean_absolute_error: 7.5037
15
16Epoch 6/10
25
import numpy as np
import tensorflow as tf
from tensorflow import keras
train_data = np.array([[1.0,1.0]])
train_targets = np.array([2.0])
print(train_data)
for i in range(3,10000,2):
train_data= np.append(train_data,[[i,i]],axis=0)
train_targets= np.append(train_targets,[i+i])
test_data = np.array([[2.0,2.0]])
26
test_targets = np.array([4.0])
for i in range(4,8000,4):
test_data = np.append(test_data,[[i,i]],axis=0)
test_targets = np.append(test_targets,[i+i])
model = keras.Sequential([
keras.layers.Flatten(input_shape=(2,)),
keras.layers.Dense(20, activation=tf.nn.relu),
keras.layers.Dense(20, activation=tf.nn.relu),
keras.layers.Dense(1)
])
model.compile(optimizer='adam',
loss='mse',
metrics=['mae'])
model.fit(train_data, train_targets, epochs=10, batch_size=1)
[[1. 1.]]
Epoch 1/10
5000/5000 [==============================] - 8s 1ms/step - loss:
742813.1875 - mae: 105.6519
Epoch 2/10
5000/5000 [==============================] - 7s 1ms/step - loss:
1772.8480 - mae: 6.0134
Epoch 3/10
5000/5000 [==============================] - 7s 1ms/step - loss:
1884.9642 - mae: 8.9911
Epoch 4/10
5000/5000 [==============================] - 8s 2ms/step - loss:
1049.1685 - mae: 10.6520
Epoch 5/10
5000/5000 [==============================] - 7s 1ms/step - loss:
1018.8793 - mae: 6.2299
Epoch 6/10
5000/5000 [==============================] - 8s 2ms/step - loss:
1276.4749 - mae: 5.4312
Epoch 7/10
5000/5000 [==============================] - 7s 1ms/step - loss:
1943.9398 - mae: 8.8076
Epoch 8/10
5000/5000 [==============================] - 7s 1ms/step - loss:
3522.0959 - mae: 8.8434
Epoch 9/10
27
Experiment 4 - Train the model to multiply two matrices and report the result using keras.
import numpy as np
np.random.seed(1)
model = Sequential()
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error',optimizer='adam',metrics=['binary_accuracy'])
# decimal output
print('decimal output:\n'+str(model.predict(training_data)))
# rounded output
print('rounded output:\n'+str(model.predict(training_data).round()))
29
import numpy as np
np.random.seed(1)
model = Sequential()
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error',optimizer='adam',metrics=['binary_accuracy'])
# decimal output
print('decimal output:\n'+str(model.predict(training_data)))
# rounded output
print('rounded output:\n'+str(model.predict(training_data).round()))
30
Epoch 1/1000
Epoch 2/1000
Epoch 3/1000
Epoch 989/1000
Epoch 990/1000
Epoch 991/1000
Epoch 992/1000
Epoch 993/1000
Epoch 994/1000
Epoch 995/1000
Epoch 996/1000
Epoch 997/1000
Epoch 998/1000
Epoch 999/1000
Epoch 1000/1000
decimal output:
[[0.34885803]
[0.7179798 ]
[0.68972814]
[0.28727812]]
32
rounded output:
[[0.]
[1.]
[1.]
[0.]]
33
Experiment 5 – Train the model to print the prime numbers using Keras
import numpy as np
seed = 7
np.random.seed(seed)
max_number = 2 ** num_digits
def prime_list():
counter = 0
primes = [2, 3]
is_prime = True
counter += 1
if primes[i] ** 2 > n:
break
counter += 1
if n % primes[i] == 0:
is_prime = False
break
34
if is_prime:
primes.append(n)
return primes
primes = prime_list()
def prime_encode(i):
if i in primes:
return 1
else:
return 0
def bin_encode(i):
def create_dataset():
x, y = [], []
x.append(bin_encode(i))
y.append(prime_encode(i))
return np.array(x), y
model = Sequential()
model.add(Dense(units=100, input_dim=num_digits))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=50))
35
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=25))
model.add(PReLU())
model.add(Dropout(rate=0.2))
model.add(Dense(units=1))
model.add(Activation("sigmoid"))
model.compile(optimizer='RMSprop',
loss='binary_crossentropy',
metrics=['accuracy'])
validation_split=0.1)
# predict
errors, correct = 0, 0
tp, fn, fp = 0, 0, 0
x = bin_encode(i)
y = model.predict(np.array(x).reshape(-1, num_digits))
pred = 1
else:
36
pred = 0
obs = prime_encode(i)
if pred == obs:
correct += 1
else:
errors += 1
tp += 1
fn += 1
fp += 1
print("Errors :", errors, " Correct :", correct, "F score :", f_score)
def plot_history(history):
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.xlabel('epoch')
plt.ylabel('loss')
37
plt.savefig('RMSprop_more')
plot_history(history)
Output
2 1 0 3.60083e
3.60083e-08
3 1 0 0.0372073
4 0 0 8.21077e
8.21077e-06
5 1 1 0.617928
6 0 0 3.97566e
3.97566e-15
7 1 1 0.826378
8 0 0 0.000872908
9 0 0 0.0204348
10 0 0 1.00783e
1.00783e-09
11 1 0 0.187609
12 0 0 5.53888e
5.53888e-08
13 1 1 0.796401
14 0 0 3.67647e
3.67647e-15
15 0 0 0.029023
16 0 0 4.0449e
4.0449e-07
17 1 1 0.843016
18 0 0 4.10257e
4.10257e-14
19 1 1 0.917604
20 0 0 4.3736e
4.3736e-15
21 0 0 0.0141451
22 0 0 6.49346e
6.49346e-25
23 1 1 0.593597
24 0 0 4.45055e
4.45055e-08
25 0 1 0.735933
26 0 0 4.97732e
4.97732e-18
38
27 0 0 0.0958722
28 0 0 1.80154e-16
29 1 1 0.722513
30 0 0 2.00777e-28
31 1 1 0.774054
32 0 0 3.93779e-05
33 0 0 0.118341
34 0 0 1.88295e-11
35 0 0 0.480108
36 0 0 3.0609e-07
37 1 1 0.847888
38 0 0 3.42833e-18
39 0 0 0.0514646
40 0 0 5.82673e-07
41 1 1 0.726771
42 0 0 3.72693e-11
43 1 1 0.861872
44 0 0 5.71867e-14
45 0 0 0.18657
46 0 0 7.03075e-16
47 1 1 0.654062
48 0 0 1.30385e-10
49 0 1 0.923631
50 0 0 1.30955e-17
51 0 0 0.190215
52 0 0 6.45953e-19
53 1 1 0.558284
54 0 0 1.83163e-29
55 0 0 0.287756
56 0 0 3.29105e-11
57 0 0 0.292637
58 0 0 3.57044e-23
59 1 0 0.152102
60 0 0 1.80104e-22
61 1 1 0.858877
62 0 0 1.92684e-32
63 0 0 0.27367
64 0 0 1.74397e-09
65 0 1 0.727574
66 0 0 1.33752e-20
67 1 1 0.891129
68 0 0 1.47396e-17
69 0 0 0.346057
70 0 0 5.27672e-27
71 1 1 0.932053
72 0 0 4.04155e-10
73 1 1 0.879374
74 0 0 1.4077e-18
75 0 0 0.0290487
76 0 0 6.39801e-17
77 0 1 0.629597
78 0 0 1.54139e-30
79 1 1 0.791511
80 0 0 7.56631e-21
81 0 0 0.0438443
82 0 0 4.24787e-30
83 1 1 0.596353
84 0 0 6.45592e-32
85 0 0 0.431211
86 0 0 0.0
87 0 0 0.00903795
88 0 0 9.54647e-23
89 1 1 0.827787
90 0 0 2.43897e-31
91 0 1 0.746695
92 0 0 8.37092e-37
93 0 0 0.0384408
94 0 0 0.0
95 0 0 0.3743
39
96 0 0 7.28071e-13
97 1 1 0.888417
98 0 0 3.04541e-25
99 0 0 0.0649973
100 0 0 1.59478e-18
40
import numpy as np
import tensorflow as tf
imdb.load_data(num_words=20000)
(x_train,y_train),(x_test,y_test)=imdb.load_data(num_words=20000)
x_train=pad_sequences(x_train,maxlen=100)
x_test=pad_sequences(x_test,maxlen=100)
vocab_size=20000
emed_size=128
model=Sequential()
model.add(Embedding(vocab_size,emed_size,input_shape=(x_train.shape[1],)))
model.add(LSTM(units=60,activation='tanh'))
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
41
model.summary()
history=model.fit(x_train,y_train,epochs=5,batch_size=128,validation_data=(x_test,y_test))
epoch_range=[1,2,3,4,5]
plt.plot(epoch_range, history.history['accuracy'])
plt.plot(epoch_range, history.history['val_accuracy'])
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 100, 128) 2560000
=================================================================
Total params: 2,605,360
Trainable params: 2,605,360
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
196/196 [==============================] - 38s 170ms/step - loss: 1.7606 -
accuracy: 0.0078 - val_loss: 1.1591 - val_accuracy: 0.1327
Epoch 2/5
196/196 [==============================] - 32s 164ms/step - loss: 1.1524 -
accuracy: 0.1252 - val_loss: 1.1365 - val_accuracy: 0.0272
Epoch 3/5
196/196 [==============================] - 32s 162ms/step - loss: 0.9810 -
accuracy: 0.0443 - val_loss: 0.9238 - val_accuracy: 0.0105
Epoch 4/5
196/196 [==============================] - 32s 165ms/step - loss: 0.7770 -
accuracy: 0.0857 - val_loss: 0.9253 - val_accuracy: 0.1010
Epoch 5/5
196/196 [==============================] - 32s 165ms/step - loss: 0.6973 -
accuracy: 0.0709 - val_loss: 1.0435 - val_accuracy: 0.0549
42
43
7. Consider temperature-forecast as one the example for recurrent neural network and
implement the following.
a. Inspect the data of the weather dataset
b. Parsing the data
c. Plotting the temperature timeseries
d. Plotting the first 10 days of the temperature timeseries
Now that we have an insight about RNN so let us begin to develop an RNN model that can
provide 4 days forecast of temperature based on 30 days of historical temperature data. I have
used Google colab to implement this code and Spyder for visualizations, there are a lot many tools
that you can use according to your preference.
You can download historical weather dataset from here, also feel free to use any weather dataset
of your choice which has temperature data.
44
Let's load the dataset and see the first few rows:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd#import dataset from data.csv file
dataset = pd.read_csv('data.csv')
dataset = dataset.dropna(subset=["Temperature"])
dataset=dataset.reset_index(drop=True)training_set = dataset.iloc[:,4:5].values
We include only the temperature column as we are going to forecast temperature and drop all the
rows that have no values or has a NaN.
Next, we will have to apply feature scaling to normalize temperature in the range 0 to 1.
#Feature Scaling
from sklearn.preprocessing import MinMaxScalersc = MinMaxScaler(feature_range=(0,1))
training_set_scaled = sc.fit_transform(training_set)
We will create a training set such that for every 30 days we will provide the next 4 days
temperature as output. In other words, input for our RNN would be 30 days temperature data and
the output would be 4 days forecast of temperature.
x_train = []
y_train = []n_future = 4 # next 4 days temperature forecast
n_past = 30 # Past 30 days for i in range(0,len(training_set_scaled)-n_past-n_future+1):
x_train.append(training_set_scaled[i : i + n_past , 0])
y_train.append(training_set_scaled[i + n_past : i + n_past + n_future , 0 ])x_train , y_train =
np.array(x_train), np.array(y_train)x_train = np.reshape(x_train, (x_train.shape[0] , x_train.shape[1],
1) )
x_train contains 30 previous temperature inputs before that day and y_train contains 4 days
temperature outputs after that day. Since x_train and y_train are lists we will have to convert them
to numpy array to fit training set to our model.
Now we are ready with our training data so let’s proceed to build an RNN model for forecasting
weather.
45
1. First, we will import keras sequential model from keras.models and keras layers ie. LSTM,
Dense and dropout. You can refer Keras documentation for more info on Keras models and
layers here
from keras.models import Sequential
from keras.layers import LSTM,Dense ,Dropout
# Fitting RNN to training set using Keras Callbacks. Read Keras callbacks docs for more info.
2. Let us define the layers in our RNN. We will create a sequential model by adding layers
sequentially using sequential(). The first layer is a Bidirectional LSTM with 30 memory
units, return_sequence=True means that the last output in the output sequence is returned and
the input_shape describes the structure of the input. With Bidirectional LSTM the output layer
gets feedback from past(forward) as well as future(backward) states simultaneously. We add 3
hidden layers and an output layer with a linear activation function that outputs 4 days temperature.
And at the last, we fit the RNN model with our training data.
regressor = Sequential()regressor.add(Bidirectional(LSTM(units=30, return_sequences=True,
input_shape = (x_train.shape[1],1) ) ))
regressor.add(Dropout(0.2))regressor.add(LSTM(units= 30 , return_sequences=True))
regressor.add(Dropout(0.2))regressor.add(LSTM(units= 30 , return_sequences=True))
regressor.add(Dropout(0.2))regressor.add(LSTM(units= 30))
regressor.add(Dropout(0.2))
regressor.add(Dense(units = n_future,activation='linear'))regressor.compile(optimizer='adam',
loss='mean_squared_error',metrics=['acc'])
regressor.fit(x_train, y_train, epochs=500,batch_size=32 )
4. Now that we have our test data ready, we can test our RNN model.
predicted_temperature = regressor.predict(testing)predicted_temperature =
46
sc.inverse_transform(predicted_temperature)predicted_temperature =
np.reshape(predicted_temperature,(predicted_temperature.shape[1],predicted_temperature.shape[0]))
The output from the model is in the normalized form, so to get the actual temperature values we
apply inverse_transform() to the predicted_temperature and then reshape it.
Let’s compare the predicted and real temperatures. As we can see the model performs well with
the given test data.
real_temperature
array([[82.], [82.], [83.], [83.]])predicted_temperature
array([[83.76233 ], [83.957565], [83.70461 ], [83.6326 ]])
If we forecast temperature for a month and visualize it we get the following results.
tf.keras.layers.LSTM(
units,
activation="tanh",
recurrent_activation="sigmoid",
use_bias=True,
kernel_initializer="glorot_uniform",
recurrent_initializer="orthogonal",
bias_initializer="zeros",
unit_forget_bias=True,
kernel_regularizer=None,
recurrent_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
recurrent_constraint=None,
bias_constraint=None,
dropout=0.0,
recurrent_dropout=0.0,
return_sequences=False,
return_state=False,
go_backwards=False,
stateful=False,
time_major=False,
unroll=False,
**kwargs
)
Based on available runtime hardware and constraints, this layer will choose different
implementations (cuDNN-based or pure-TensorFlow) to maximize the performance. If a GPU is
available and all the arguments to the layer meet the requirement of the cuDNN kernel (see
below for details), the layer will use a fast cuDNN implementation.
1. activation == tanh
48
2. recurrent_activation == sigmoid
3. recurrent_dropout == 0
4. unroll is False
5. use_bias is True
6. Inputs, if use masking, are strictly right-padded.
7. Eager execution is enabled in the outermost context.
For example:
Call arguments
Sometimes, neither the forward nor the reversed order works perfectly, but combining
them will give better results. In this case, you will need a bidirectional LSTM network.
A bidirectional LSTM network is simply two separate LSTM networks; one feeds with a
forward sequence and another with reversed sequence. Then the output of the two LSTM
networks is concatenated together before being fed to the subsequent layers of the
network. In Keras, you have the function Bidirectional() to clone an LSTM layer for
forward-backward input and concatenate their output. For example,
model = Sequential()
1
model.add(Embedding(top_words, embedding_vecor_length,
2
input_length=max_review_length))
3
model.add(Bidirectional(LSTM(100, dropout=0.2, recurrent_dropout=0.2)))
4
model.add(Dense(1, activation='sigmoid'))
Since you created not one, but two LSTMs with 100 units each, this network will take
twice the amount of time to train. Depending on the problem, this additional cost may be
justified.
The full code listing with adding the bidirectional LSTM to the last example is listed
below for completeness.
25 model.add(Dense(1, activation='sigmoid'))
26 model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
27 print(model.summary())
28 model.fit(X_train, y_train, epochs=3, batch_size=64)
29 # Final evaluation of the model
30 scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few
times and compare the average outcome.
Running this example provides the following output.
Epoch 1/3
391/391 [==============================] - 405s 1s/step - loss: 0.4960 - accuracy:
1
0.7532
2
Epoch 2/3
3
391/391 [==============================] - 439s 1s/step - loss: 0.3075 - accuracy:
4
0.8744
5
Epoch 3/3
6
391/391 [==============================] - 430s 1s/step - loss: 0.2551 - accuracy:
7
0.9014
Accuracy: 87.69%
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
max_features = 20000 # Only consider the top 20k words
maxlen = 200 # Only consider the first 200 words of each movie review
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, None)] 0
_________________________________________________________________
embedding (Embedding) (None, None, 128) 2560000
_________________________________________________________________
bidirectional (Bidirectional (None, None, 128) 98816
_________________________________________________________________
bidirectional_1 (Bidirection (None, 128) 98816
_________________________________________________________________
dense (Dense) (None, 1) 129
=================================================================
Total params: 2,757,761
Trainable params: 2,757,761
Non-trainable params: 0
_________________________________________________________________
names = list(tokens.keys())
for num in range(len(names)):
tokens[names[num]] = num+1
return tokens
tokens = tokenize_the_data_from_pandas(data,column_name="text")
len(tokens.keys())
54
model.add(tf.keras.layers.Conv1D(64,7,activation="relu"))
model.add(tf.keras.layers.MaxPooling1D(5))
#model.add(tf.keras.layers.GlobalMaxPooling1D())
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer = "adam",loss="binary_crossentropy",metrics=["accuracy"])
model.fit(X_train,y_train,epochs=5,validation_data=(X_test,y_test))
Epoch 1/5
2022-01-26 13:16:46.737730: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded
cuDNN version 8005
1125/1125 [==============================] - 37s 26ms/step - loss: 0.3360 - accuracy:
0.8362 - val_loss: 0.2162 - val_accuracy: 0.9087
Epoch 2/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0945 - accuracy:
0.9680 - val_loss: 0.2506 - val_accuracy: 0.9013
Epoch 3/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0161 - accuracy:
0.9954 - val_loss: 0.4001 - val_accuracy: 0.8938
Epoch 4/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0096 - accuracy:
0.9965 - val_loss: 0.4988 - val_accuracy: 0.8920
Epoch 5/5
1125/1125 [==============================] - 29s 26ms/step - loss: 0.0107 - accuracy:
0.9967 - val_loss: 0.5361 - val_accuracy: 0.8915
Out[14]:
<keras.callbacks.History at 0x7f560b67b890>
#Predicting the test set to investiagte further.
y_pred=model.predict(X_test[:])
#We turn our networks output into binary values
y_pred[y_pred>0.5] = 1
y_pred[y_pred< 0.5] = 0
Truth = 0
Falset = 0
55
#print(y_pred[pik] == y_test[pik])
print(Truth,Falset)
3566 434
print("accuracy:",Truth / (Truth + Falset))
accuracy: 0.8915
Let’s start by importing the classes and functions required for this model and initializing the
random number generator to a constant value to ensure you can easily reproduce the results.
1 import tensorflow as tf
2 from tensorflow.keras.datasets import imdb
3 from tensorflow.keras.models import Sequential
4 from tensorflow.keras.layers import Dense
5 from tensorflow.keras.layers import LSTM
6 from tensorflow.keras.layers import Embedding
7 from tensorflow.keras.preprocessing import sequence
8 # fix random seed for reproducibility
9 tf.random.set_seed(7)
You need to load the IMDB dataset. You are constraining the dataset to the top 5,000 words.
You will also split the dataset into train (50%) and test (50%) sets.
1 # load the dataset but only keep the top n words, zero the rest
2 top_words = 5000
3 (X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
Next, you need to truncate and pad the input sequences, so they are all the same length for
modeling. The model will learn that the zero values carry no information. The sequences are not
the same length in terms of content, but same-length vectors are required to perform the
computation in Keras.
56
The first layer is the Embedded layer that uses 32-length vectors to represent each word. The
next layer is the LSTM layer with 100 memory units (smart neurons). Finally, because this is a
classification problem, you will use a Dense output layer with a single neuron and a sigmoid
activation function to make 0 or 1 predictions for the two classes (good and bad) in the problem.
Because it is a binary classification problem, log loss is used as the loss function
(binary_crossentropy in Keras). The efficient ADAM optimization algorithm is used. The
model is fit for only two epochs because it quickly overfits the problem. A large batch size of 64
reviews is used to space out weight updates.
# create the model
1
embedding_vecor_length = 32
2
model = Sequential()
3
model.add(Embedding(top_words, embedding_vecor_length,
4
input_length=max_review_length))
5
model.add(LSTM(100))
6
model.add(Dense(1, activation='sigmoid'))
7
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
8
print(model.summary())
9
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=64)
Once fit, you can estimate the performance of the model on unseen reviews.
11 # load the dataset but only keep the top n words, zero the rest
12 top_words = 5000
13 (X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
14 # truncate and pad input sequences
15 max_review_length = 500
16 X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
17 X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
18 # create the model
19 embedding_vecor_length = 32
20 model = Sequential()
21 model.add(Embedding(top_words, embedding_vecor_length,
22 input_length=max_review_length))
23 model.add(LSTM(100))
24 model.add(Dense(1, activation='sigmoid'))
25 model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
26 print(model.summary())
27 model.fit(X_train, y_train, epochs=3, batch_size=64)
28 # Final evaluation of the model
29 scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
Note: Your results may vary given the stochastic nature of the algorithm or evaluation
procedure, or differences in numerical precision. Consider running the example a few times and
compare the average outcome.
Running this example produces the following output.
Epoch 1/3
391/391 [==============================] - 124s 316ms/step - loss: 0.4525 -
1
accuracy: 0.7794
2
Epoch 2/3
3
391/391 [==============================] - 124s 318ms/step - loss: 0.3117 -
4
accuracy: 0.8706
5
Epoch 3/3
6
391/391 [==============================] - 126s 323ms/step - loss: 0.2526 -
7
accuracy: 0.9003
Accuracy: 86.83%