Lesson 4 Deep Neural Network and Tools
TensorFlow
Deep Neural Network and Tools
Learning Objectives
When a neural network contains more than one hidden layer, it becomes a deep neural network.
[Figure: deep neural network with an input layer, hidden layers (Layer 1, Layer 2), and an output layer]
Deep Learning: Example
In a deep neural network, each layer recognizes a certain set of features based on the previous layer's output.
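A minimal sketch of how each layer builds on the previous layer's output. The weights, biases, and the sigmoid activation are hypothetical choices for illustration, not values from this lesson:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: every output node sees every input."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# Hypothetical weights for a 2-input network with two hidden layers.
hidden1_w, hidden1_b = [[0.5, -0.2], [0.3, 0.8]], [0.0, 0.1]
hidden2_w, hidden2_b = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
output_w, output_b = [[0.7, -0.3]], [0.2]

x = [1.0, 2.0]
h1 = dense(x, hidden1_w, hidden1_b)      # features computed from the raw input
h2 = dense(h1, hidden2_w, hidden2_b)     # features computed from h1
y_hat = dense(h2, output_w, output_b)    # prediction computed from h2
print(y_hat)
```

Each call to dense() consumes only the list returned by the call before it, which is the layer-by-layer dependency described above.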
[Figure: an image passes through the hidden layers and is recognized as Robert Downey Jr.]
Here the actual value is 10, and the arrow hits the circle worth 8 points.
The loss is therefore actual value − predicted value, i.e., 10 − 8 = 2.
Loss Function and its Major Categories
The loss of a deep learning model is evaluated using a loss function.
[Figure: Loss Function, divided into Regression Losses and Classification Losses]
Types of Regression Losses
Regression Losses
MSE (mean squared error) is the average of the squared differences between the actual and predicted values over N training samples.
Y (Actual Value)    Y′ (Predicted Value)    (Y − Y′)²
10.2                9.4                     0.64
MAE (mean absolute error) is the average of the absolute differences between the actual and predicted values over N training samples.
Y (Actual Value)    Y′ (Predicted Value)    |Y − Y′|
10.2                9.4                     0.8
In MSE, each error is squared, so errors larger than 1 are penalized more heavily than in MAE, while errors smaller than 1 (such as 0.8 below) shrink when squared.
Y (Actual Value)    Y′ (Predicted Value)    |Y − Y′|    (Y − Y′)²
10.2                9.4                     0.8         0.64
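A minimal sketch of both regression losses, applied to the single row from the tables above (actual 10.2, predicted 9.4). The function names are illustrative:

```python
def mse(actual, predicted):
    """Mean squared error over N samples."""
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n

def mae(actual, predicted):
    """Mean absolute error over N samples."""
    n = len(actual)
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / n

actual, predicted = [10.2], [9.4]
print(round(mse(actual, predicted), 2))  # 0.64
print(round(mae(actual, predicted), 2))  # 0.8
```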
MSE is sensitive to outliers: because each error is squared, a single large error inflates the final MSE. For example:
Y (Actual Value)    Y′ (Predicted Value)    |Y − Y′|
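The effect of a single outlier can be sketched as follows. The data values are hypothetical and chosen only so that every error is 1 except for one error of 10:

```python
def mse(actual, predicted):
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)

# Hypothetical data, not taken from the lesson.
actual = [10.0, 12.0, 11.0, 13.0]
clean_pred = [9.0, 13.0, 10.0, 14.0]     # every error is 1
outlier_pred = [9.0, 13.0, 10.0, 23.0]   # the last error is 10

print(mae(actual, clean_pred), mse(actual, clean_pred))      # 1.0 1.0
print(mae(actual, outlier_pred), mse(actual, outlier_pred))  # 3.25 25.75
```

One outlier raises MAE from 1.0 to 3.25, but raises MSE from 1.0 to 25.75.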
Classification Losses
Samsung = [1,0,0]
LG = [0,0,1]
Input: LG → Model → predicted probability distribution
Class      Predicted Probability    Actual Probability
Samsung    0.1                      0
Apple      0.3                      0
LG         0.6                      1
Cross entropy measures the distance between the two distributions.
⮚ The model gives the probability distribution P over N classes for a particular input data C.
The following formula measures the cross entropy for a single observation or input data from the example, where A is the actual distribution:
CE(C) = −Σᵢ Aᵢ · log(Pᵢ), for i = 1 … N
A(LG) = [0, 0, 1]
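A minimal sketch of the standard cross entropy for one observation. It assumes the class order Samsung, Apple, LG, the one-hot vector LG = [0, 0, 1] given earlier, the predicted probabilities 0.1, 0.3, 0.6 from the example, and the natural logarithm:

```python
import math

def cross_entropy(actual, predicted):
    """Cross entropy between a one-hot target and predicted probabilities."""
    # Terms where the actual probability is 0 contribute nothing.
    return -sum(a * math.log(p) for a, p in zip(actual, predicted) if a > 0)

actual_lg = [0, 0, 1]          # one-hot vector for LG
predicted = [0.1, 0.3, 0.6]    # model output for Samsung, Apple, LG
print(round(cross_entropy(actual_lg, predicted), 2))  # 0.51
```

Only the probability assigned to the true class matters here, so the loss is −log(0.6).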
[Figure: Cross Entropy, divided into Categorical and Binary]
Categorical Cross Entropy
Data    Actual Probability Distribution    Predicted Probability Distribution    Cross Entropy
Loss Function = (0.51 + 0.1 + 0.35 + 0.69 + 2.3 + 0.69 + 0.22) / 7 = 4.86 / 7 ≈ 0.69
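Categorical cross entropy for a whole dataset is the mean of the per-sample cross entropies. The sketch below averages the seven values listed above and also shows a batch version for one-hot targets; the function name is illustrative:

```python
import math

def categorical_cross_entropy(actual_batch, predicted_batch):
    """Mean cross entropy over a batch of one-hot targets."""
    losses = [
        -sum(a * math.log(p) for a, p in zip(actual, predicted) if a > 0)
        for actual, predicted in zip(actual_batch, predicted_batch)
    ]
    return sum(losses) / len(losses)

# The seven per-sample cross entropy values listed in the lesson.
per_sample = [0.51, 0.1, 0.35, 0.69, 2.3, 0.69, 0.22]
mean_loss = sum(per_sample) / len(per_sample)
print(round(mean_loss, 2))  # 0.69
```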
Binary Cross Entropy
⮚ Binary cross entropy uses a binary value of 0 or 1 to denote the negative and positive class respectively, when there is only one output.
⮚ If the actual output is denoted by a single variable y and the predicted probability by y′, the cross entropy for a particular data C simplifies to:
CE(C) = −[y · log(y′) + (1 − y) · log(1 − y′)]
⮚ The error in binary classification for the complete model is given by binary cross entropy, which is the mean of the cross entropy over N data:
BCE = (1/N) · Σ CE(C)
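A minimal sketch of binary cross entropy as the mean of the per-sample cross entropy. The labels and predicted probabilities are hypothetical, and the natural logarithm is assumed:

```python
import math

def binary_cross_entropy(y, y_pred):
    """Cross entropy for one sample with label y in {0, 1}."""
    return -(y * math.log(y_pred) + (1 - y) * math.log(1 - y_pred))

def mean_binary_cross_entropy(ys, y_preds):
    """Binary cross entropy of the model: mean over N samples."""
    return sum(binary_cross_entropy(y, p) for y, p in zip(ys, y_preds)) / len(ys)

# Hypothetical labels and predicted probabilities.
ys = [1, 0, 1]
y_preds = [0.9, 0.2, 0.6]
print(mean_binary_cross_entropy(ys, y_preds))
```

Predicted probabilities of exactly 0 or 1 would make the logarithm undefined, so practical implementations usually clip y_pred to a small interval inside (0, 1).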
Overconfident wrong predictions are penalized too lightly when MSE or MAE is used in classification, especially during the training phase.
⮚ Let us see how binary cross entropy, MAE, and MSE penalize such a situation.
⮚ In the example below, the two scenarios y = 1, y′ = 0.2 and y = 0, y′ = 0.8 are examples of wrong classification.
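The two wrong-classification scenarios can be compared with a short sketch. It uses the standard per-sample definitions of each loss and the natural logarithm:

```python
import math

def binary_cross_entropy(y, y_pred):
    return -(y * math.log(y_pred) + (1 - y) * math.log(1 - y_pred))

scenarios = [(1, 0.2), (0, 0.8)]  # (actual y, predicted y') from the lesson
for y, y_pred in scenarios:
    bce = binary_cross_entropy(y, y_pred)
    abs_err = abs(y - y_pred)
    sq_err = (y - y_pred) ** 2
    print(y, y_pred, round(bce, 2), round(abs_err, 2), round(sq_err, 2))
# Both scenarios print: BCE 1.61, MAE 0.8, MSE 0.64
```

Binary cross entropy assigns the largest penalty to these wrong predictions, and its penalty keeps growing without bound as the prediction approaches the wrong extreme, whereas the absolute and squared errors can never exceed 1.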
Parallelism: It is easy for the system to identify operations that can be executed in parallel.
Distributed Execution: TensorFlow can partition your program across multiple devices: CPUs, GPUs, and TPUs.
Portability: You can build a dataflow graph in Python, store it in a saved model, and restore it in a C++ program.
Why TensorFlow?
Flexibility
Parallel Computation
Large Community
TensorFlow: Parallel Computation
Python API offers flexibility to create all sorts of computations for every
neural network architecture
Linux
macOS
iOS
Android
Raspberry Pi
Windows
TensorFlow: Large Community
TFLearn is a modular and transparent deep learning library built on top of TensorFlow. It was
designed to provide a higher-level API to TensorFlow in order to facilitate and speed up
experimentation, while remaining fully transparent and compatible with it.
Features of TFLearn
Training functions are another core feature of TFLearn. In TensorFlow, there is no prebuilt API to
train a network, so TFLearn integrates a set of functions that can easily handle any neural
network training, for any number of inputs, outputs, and optimizers.
Visualization
TFLearn has the ability to manage a lot of useful logs. Currently, TFLearn supports a verbose level
to automatically manage summaries:
To save or restore a model, use the 'save' or 'load' method of the DNN model class.
Code
# Save a model
model.save('my_model.tflearn')
# Load a model
model.load('my_model.tflearn')
Weights Persistence
Retrieving a layer variable can either be done using the layer name, or directly by using 'W' or 'b' attributes
that are supercharged to the layer's returned tensor.
Code
To get or set the value of these variables, TFLearn model classes implement the get_weights and
set_weights methods:
Code
Augmentation
# Use HDF5 data model to train model
model = DNN(network)
model.fit(X, Y)
Data Preprocessing and Augmentation
Code
Layers
Built-In Operations
Extending TensorFlow: Layers
Any layer can be used with any other tensor from TensorFlow, i.e., you can use TFLearn wrappers directly
in your own TensorFlow graph.
Code
TFLearn implements a TrainOp class to represent an optimization process (i.e. backprop). It is defined as
follows:
Code
TrainOps can be fed into a Trainer class, which handles the whole training process, treating all
TrainOps together as a whole model.
Code
For more complex models, TFLearn can handle multiple optimization processes.
Code
For prediction, TFLearn implements an Evaluator class that works the same way as the Trainer. It takes a
network as its parameter and returns the predicted value.
Code
model = Evaluator(network)
model.predict(feed_dict={input_placeholder: X})
Trainer, Evaluator, and Predictor
To handle networks that have layers with different behaviors at training and testing time, such as dropout and
batch normalization:
The Trainer class uses a Boolean variable (is_training) that specifies whether the network is used for training,
testing, or predicting. This variable is stored under the tf.GraphKeys.IS_TRAINING collection, as its first
element. So, while defining such layers, this variable should be used as the operational condition:
Code
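The idea of using is_training as the operational condition can be sketched in plain Python. This is not the TFLearn API: the dropout function, its keep_prob parameter, and the rescaling convention (inverted dropout) are assumptions made for illustration:

```python
import random

def dropout(values, keep_prob, is_training, rng=random):
    """Apply dropout only when the network is in training mode."""
    if not is_training:
        # Testing or predicting: pass the activations through unchanged.
        return list(values)
    # Training: drop each value with probability 1 - keep_prob and rescale
    # the survivors so the expected activation stays the same.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

activations = [1.0, 2.0, 3.0, 4.0]
print(dropout(activations, keep_prob=0.5, is_training=True))
print(dropout(activations, keep_prob=0.5, is_training=False))  # [1.0, 2.0, 3.0, 4.0]
```

Batch normalization follows the same pattern: batch statistics are used while training and stored running statistics are used otherwise.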
Keras
Backends: TensorFlow, MXNet, CNTK, Theano
cd keras
1 Architecture Definition: Number of layers, number of nodes in layers, and activation function to be used
2 Compile: Defines the loss function and details about how optimization works
3 Fit: Finalizes the model through back propagation and optimization of weights with input data
Code
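from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# img_width and img_height must be defined before this point
model = Sequential()
# two convolution + pooling stages with 5x5 kernels
model.add(Conv2D(16, (5, 5), activation='relu',
                 input_shape=(img_width, img_height, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# flatten the feature maps and classify into 10 classes
model.add(Flatten())
model.add(Dense(1000, activation='relu'))
model.add(Dense(10, activation='softmax'))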
Compile the Model
Code
# categorical cross entropy suits the 10-class softmax output above (assumes one-hot labels)
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
⮚ The optimizer searches through different weights for the network, and the optional metrics are collected and
reported during training.
Code
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test))
Code
classes = model.predict(x_test, batch_size=128)
Simple Interface
Hybrid Frontend
Distributed Training
C++ Frontend
Cloud Partners
PyTorch Ecosystems
Language Python 2.7 Python 3.5 Python 3.6 Python 3.7 C++
Problem Statement: Given a dataset of diabetes patients with different health
parameters, build a deep learning classification model to predict the outcome.
Access:
⮚ Click on the Labs tab on the left side panel of the LMS. Copy the username and password.
⮚ Click on the Launch Lab button. On the page that appears, enter the username and
password, and click Login.
Loading Dataset
Splitting Dataset
Importing Library
Code
from keras.models import Sequential
from keras.layers import Dense
Creating the Layers
⮚ The model expects rows of data with eight variables (the input_dim=8 argument).
⮚ The first hidden layer has 12 nodes and uses the ReLU activation function.
⮚ The second hidden layer has 8 nodes and uses the ReLU activation function.
⮚ The output layer has one node and uses the sigmoid activation function.
Code
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
Compile the Model
⮚ Binary cross entropy is set as the loss function for this classification model.
Evaluate the Model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))
Predict
dataset['Outcome'].head()
Deep Learning Model with TensorFlow
Problem Statement: Create a deep learning model with the MNIST dataset to predict
handwritten digits.
Loading the Dataset
print(y_train[6])
Normalizing the Data
Data normalization is done with the tensorflow.keras.utils.normalize() function, which by default
scales the data to unit L2 norm along the chosen axis. The pixel values of the images move from the
range 0 to 255 into the range 0 to 1.
Code
⮚ The difference can be seen between the original digit and the normalized digit.
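To illustrate what this kind of normalization does, here is a plain-Python sketch of L2 normalization on one row of hypothetical pixel values. It mimics the default behavior of the library function (order 2) but is not the library code, and it is contrasted with simple division by 255:

```python
import math

def l2_normalize(row):
    """Scale a row so that its L2 norm becomes 1."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm == 0:
        norm = 1.0  # leave an all-zero row unchanged
    return [v / norm for v in row]

pixels = [0, 51, 102, 255]             # hypothetical pixel values in 0..255
normalized = l2_normalize(pixels)
rescaled = [v / 255 for v in pixels]   # the other common way to reach 0..1

print(normalized)
print(rescaled)  # [0.0, 0.2, 0.4, 1.0]
```

Both results lie between 0 and 1, but L2 normalization depends on the other values in the row, while division by 255 does not.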
Define the Model
A feed-forward sequential model is defined:
Code
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
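The softmax activation on the last layer turns the 10 raw outputs into a probability distribution over the digits. A minimal plain-Python sketch of softmax, using three hypothetical scores instead of ten for brevity:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical raw outputs of the last layer
probs = softmax(scores)
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

The predicted digit is the index of the largest probability.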
Compile the Model
Evaluate the Model
The evaluate() function returns a list with two values. The first is the loss
of the model on the dataset, and the second is the accuracy of the model
on the dataset.
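What these two values represent can be sketched in plain Python. The sketch assumes a cross entropy loss over integer class labels and an accuracy metric, which is an assumption because the compile step is not shown in this section. The probabilities and labels are hypothetical:

```python
import math

def evaluate(probabilities, labels):
    """Return [loss, accuracy] for predicted class probabilities."""
    losses = [-math.log(p[label]) for p, label in zip(probabilities, labels)]
    correct = sum(
        1 for p, label in zip(probabilities, labels) if p.index(max(p)) == label
    )
    return [sum(losses) / len(labels), correct / len(labels)]

# Hypothetical model outputs for three samples and three classes.
probabilities = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3]]
labels = [0, 1, 2]
loss, accuracy = evaluate(probabilities, labels)
print(loss, accuracy)
```

The first two samples are classified correctly and the third is not, so the accuracy is 2/3.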
Predict
Problem Statement: Build a deep learning model using Fashion-MNIST, a dataset of fashion
articles with 10 classes in which each image is 28×28 pixels.
Importing the Library
Code
show_image(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
Code
return x
Define Loss Function and Optimizer
⮚ Cross entropy is set as the loss function for this classification model.
model = Network()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.1)
Train the Network
Code
print('Finished Training')
Problem Statement: Make a deep learning model with MNIST data using Caffe2.
Loading the Data
Code
from keras.datasets.mnist import load_data
Importing the Library
Code
from caffe2.python import (
    brew,
    core,
    model_helper,
    net_drawer,
    optimizer,
    visualize,
    workspace,
)
core.GlobalInit(['caffe2', '--caffe2_log_level=0'])
print("Necessities imported!")
USE_LENET_MODEL = True
⮚ The bookkeeping part, where you just print the statistics for inspection (AddBookkeepingOperators function)
return softmax
Constructing the Model
The AddModel function allows you to switch easily from the MLP to the LeNet model.
Change USE_LENET_MODEL at the very top of the notebook and rerun the whole code.
Training the Model
Code
for i in range(total_iters):
    workspace.RunNet(train_model.net)
    accuracy[i] = workspace.blobs['accuracy']
    loss[i] = workspace.blobs['loss']
    if i % 25 == 0:
        print("Iter: {}, Loss: {}, Accuracy: {}".format(i, loss[i], accuracy[i]))
Creating the Neural Network Class
⮚ An input layer, x
⮚ An output layer, ŷ
⮚ A choice of activation function for each hidden layer, σ. In this tutorial, you’ll use a
Sigmoid activation function
Creating the Neural Network Class
Code
class NeuralNetwork:
Code
def backprop(self):
    # application of the chain rule to find the derivative of the loss function
    # with respect to weights2 and weights1
    # (the computation of d_weights1 and d_weights2 is not shown here)
    # update the weights with the derivative (slope) of the loss function
    self.weights1 += d_weights1
    self.weights2 += d_weights2
Creating the Neural Network Class
Code
def feedforward(self):
    self.layer1 = sigmoid(np.dot(self.input, self.weights1))
    self.output = sigmoid(np.dot(self.layer1, self.weights2))
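The feedforward and backpropagation steps can be sketched together in plain Python for one training sample. This is an illustrative sketch, not the lesson's class: it assumes a squared-error loss, no bias terms, four hidden nodes, and a plain gradient descent update that subtracts the gradient of the loss (which matches adding d_weights only if d_weights holds the negative gradient):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def feedforward(x, w1, w2):
    """Return (hidden activations, output) for one input row."""
    hidden = [sigmoid(sum(x[i] * w1[i][j] for i in range(len(x))))
              for j in range(len(w2))]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w2)))
    return hidden, output

def loss(x, y, w1, w2):
    return (y - feedforward(x, w1, w2)[1]) ** 2

def gradients(x, y, w1, w2):
    """Chain rule: d(loss)/d(w1) and d(loss)/d(w2) for one sample."""
    hidden, output = feedforward(x, w1, w2)
    delta_out = -2.0 * (y - output) * output * (1.0 - output)
    grad_w2 = [h * delta_out for h in hidden]
    delta_hidden = [delta_out * w2[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(w2))]
    grad_w1 = [[x[i] * delta_hidden[j] for j in range(len(w2))]
               for i in range(len(x))]
    return grad_w1, grad_w2

rng = random.Random(0)
x, y = [0.0, 1.0, 1.0], 1.0
w1 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
w2 = [rng.uniform(-1, 1) for _ in range(4)]

before = loss(x, y, w1, w2)
for _ in range(200):
    grad_w1, grad_w2 = gradients(x, y, w1, w2)
    # Gradient descent: step against the slope of the loss.
    w1 = [[w1[i][j] - grad_w1[i][j] for j in range(4)] for i in range(3)]
    w2 = [w2[j] - grad_w2[j] for j in range(4)]
after = loss(x, y, w1, w2)
print(before, after)
```

After the updates the loss on this sample is lower than before, which is the behavior the training loop relies on.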
Output of the Neural Network
Prediction Y (actual)
0.023 0
0.979 1
0.975 1
0.025 0
Key Takeaways
a. 1
b. 2
c. 3
If a neural network contains one hidden layer, it is called a shallow neural network; a shallow neural network becomes a
deep neural network when one more hidden layer is added.
Knowledge Check 2
Which of the following deep learning ecosystems does Glow, an ML compiler, belong to?
a. PyTorch
b. Keras
c. TensorFlow
Glow belongs to the PyTorch ecosystem along with Skorch, Torchbearer, and PyTorch Geometric.
Knowledge Check 3
Which of the following loss functions is best suited to build a regression model with outliers?
a. MAE
b. MSE
MAE is the suitable loss function for a regression model whose data contains outliers, while MSE suits a dataset without outliers.
Knowledge Check 4
Which of the following deep learning frameworks uses a TensorFlow backend?
a. Keras
b. PyTorch
c. Caffe
Thank You