Lesson 4 Deep Neural Network and Tools

This document discusses deep learning concepts including deep neural networks, loss functions, and tools for deep learning. It explains that a deep neural network contains more than one hidden layer and each layer recognizes features based on the previous layer's output. It also describes common loss functions for regression like mean squared error (MSE) and mean absolute error (MAE), and discusses that MSE is generally better but MAE handles outliers better. Finally, it mentions choosing appropriate loss functions for classification problems.

Uploaded by

VIJENDHER REDDY GURRAM


Deep Learning with Keras and TensorFlow
Deep Neural Network and Tools
Learning Objectives

By the end of this lesson, you will be able to:

Explain a deep neural network

Design a deep neural network step by step

Choose a loss function for a deep neural network

Describe and work with deep learning tools

Deep Neural Network
Deep Neural Network

When a neural network contains more than one hidden layer it becomes a Deep Neural Network.

[Figure: a neural network with an input layer, two hidden layers (Layer 1 and Layer 2), and an output layer]
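As a minimal sketch of what "more than one hidden layer" means in practice, the following pure-Python forward pass sends an input through two hidden layers and one output node. The weights, the ReLU and sigmoid activations, and the helper names are hand-picked assumptions for illustration only; they do not come from the slides.

```python
import math

def relu(values):
    # ReLU keeps positive values and replaces negatives with 0
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    # weights holds one list of input weights per output node
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical input and weights, chosen only to make the arithmetic easy to follow
x = [1.0, 2.0]
h1 = relu(dense(x, [[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0]))    # hidden layer 1
h2 = relu(dense(h1, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]))   # hidden layer 2
y = sigmoid(dense(h2, [[1.0, 1.0]], [0.0])[0])                # output layer

print(h1, h2, round(y, 4))
```

Each layer consumes only the previous layer's output, which is the property that the example slides rely on.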
Deep Learning: Example

In a deep neural network, each layer recognizes a certain set of features based on the previous layer's output.

[Figure: a face image passes through an input layer, three hidden layers (Layer 1, Layer 2, Layer 3), and an output layer that labels it "Robert Downey Jr."]

The first hidden layer trains on the input and identifies the edges.

The second hidden layer gets the identified edges as input and gives a combination of edges as an output.

The third layer distinguishes different facial features that lead to the image recognition of the input.
Loss Function
What Is Loss?
When a deep learning model makes a prediction, its output deviates from the actual value. The quantitative measure of this difference is called loss. For example:

[Figure: an arrow hits the ring of a target worth 8 points; the actual (target) value is 10]

Here the loss is Actual Value - Predicted Value, i.e., 10 - 8 = 2.
Loss Function and its Major Categories

A loss function makes it easy to evaluate the loss of a deep learning model. Loss functions fall into two major categories:

⮚ Regression losses

⮚ Classification losses
Types of Regression Losses

The two common regression losses are:

⮚ Mean Squared Error (MSE)

⮚ Mean Absolute Error (MAE)
Mean Squared Error (MSE)

MSE is the average squared difference between the actual and predicted values over N training examples.

MSE = Sum of Squared Errors / N

Y (Actual Value)    Y′ (Predicted Value)    (Y - Y′)²
10.2                9.4                     0.64
7.1                 6.9                     0.04
17.2                18.4                    1.44
9.5                 11.3                    3.24
11.5                11.1                    0.16

Sum                                         5.52
MSE                                         5.52 / 5 = 1.104
Mean Absolute Error (MAE)

MAE is the average absolute difference between the actual and predicted values over N training examples.

MAE = Sum of Absolute Errors / N

Y (Actual Value)    Y′ (Predicted Value)    |Y - Y′|
10.2                9.4                     0.8
7.1                 6.9                     0.2
17.2                18.4                    1.2
9.5                 11.3                    1.8
11.5                11.1                    0.4

Sum                                         4.4
MAE                                         4.4 / 5 = 0.88
MSE or MAE?

Because each error is squared, MSE penalizes large errors much more heavily than MAE does, while errors smaller than 1 contribute less to MSE than to MAE.

Y (Actual Value)    Y′ (Predicted Value)    (Y - Y′)²    |Y - Y′|
10.2                9.4                     0.64         0.8
7.1                 6.9                     0.04         0.2
17.2                18.4                    1.44         1.2
9.5                 11.3                    3.24         1.8
11.5                11.1                    0.16         0.4

Sum                                         5.52         4.4

Loss Function: MSE = 5.52 / 5 = 1.104, MAE = 4.4 / 5 = 0.88
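The two regression losses can be checked with a few lines of Python. This is a minimal sketch using the five rows of the table; the function names mse and mae are chosen here for illustration.

```python
actual = [10.2, 7.1, 17.2, 9.5, 11.5]
predicted = [9.4, 6.9, 18.4, 11.3, 11.1]

def mse(y, y_pred):
    # Mean of the squared differences
    return sum((a - p) ** 2 for a, p in zip(y, y_pred)) / len(y)

def mae(y, y_pred):
    # Mean of the absolute differences
    return sum(abs(a - p) for a, p in zip(y, y_pred)) / len(y)

print(round(mse(actual, predicted), 3))  # 1.104
print(round(mae(actual, predicted), 3))  # 0.88
```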

MSE or MAE?

MSE is sensitive to outliers: since each error is squared, a single large error inflates the final MSE. For example, with an outlier in the fourth row:

Y (Actual Value)    Y′ (Predicted Value)    (Y - Y′)²    |Y - Y′|
10.2                9.4                     0.64         0.8
7.1                 6.9                     0.04         0.2
17.2                18.4                    1.44         1.2
31.5                11.3                    408.04       20.2
11.5                11.1                    0.16         0.4

Sum                                         410.32       22.8

Loss Function: MSE = 410.32 / 5 = 82.064, MAE = 22.8 / 5 = 4.56
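The effect of the outlier can be reproduced with a short sketch that recomputes both losses from the raw rows, once with the fourth actual value at 9.5 and once at 31.5. The variable names are illustrative assumptions; the point is only that the MSE grows by a far larger factor than the MAE.

```python
def mse(y, y_pred):
    return sum((a - p) ** 2 for a, p in zip(y, y_pred)) / len(y)

def mae(y, y_pred):
    return sum(abs(a - p) for a, p in zip(y, y_pred)) / len(y)

predicted = [9.4, 6.9, 18.4, 11.3, 11.1]
clean_actual = [10.2, 7.1, 17.2, 9.5, 11.5]
outlier_actual = [10.2, 7.1, 17.2, 31.5, 11.5]   # fourth value is the outlier

# Factor by which each loss grows once the outlier is introduced
mse_growth = mse(outlier_actual, predicted) / mse(clean_actual, predicted)
mae_growth = mae(outlier_actual, predicted) / mae(clean_actual, predicted)

print(round(mse(outlier_actual, predicted), 3), round(mae(outlier_actual, predicted), 3))
print(round(mse_growth, 1), round(mae_growth, 1))
```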
MSE or MAE?
If the data has outliers, MAE is a better option than MSE. For data without outliers, MSE is preferable.

[Figure: two panels, "MAE as loss function" and "MSE as loss function"]

Types of Classification Losses

The two common classification losses are:

⮚ Cross Entropy

⮚ Hinge Loss

Cross Entropy
Cross entropy is a way to calculate the distance between two probability distributions. For example, consider a classification problem with 3 classes.

Classes: (Samsung, Apple, LG)
Output = [P(Samsung), P(Apple), P(LG)]

The class with the highest probability is the winner.

Cross Entropy
If the predicted probability distribution is not close to the actual one, the model adjusts its weights.

Output = [P(Samsung), P(Apple), P(LG)]

The actual probability distribution for each class:

Samsung = [1, 0, 0]
Apple = [0, 1, 0]
LG = [0, 0, 1]
Cross Entropy
In this scenario, cross entropy is used to calculate how far the predicted probability distribution is from the actual one.

[Figure: an LG input is fed to the model; cross entropy measures the distance between the two distributions below]

Class      Predicted Probability    Actual Probability
Samsung    0.1                      0
Apple      0.3                      0
LG         0.6                      1
Intuition behind Cross Entropy

Calculating Cross Entropy

⮚ The model gives the probability distribution over N classes for a particular input data C:

P(C) = [y1′, y2′, y3′, …, yN′]

⮚ The actual or target probability distribution of the data C is:

A(C) = [y1, y2, y3, …, yN]

⮚ Cross entropy for data C is calculated as:

CrossEntropy(A, P) = - ( y1*log(y1′) + y2*log(y2′) + y3*log(y3′) + … + yN*log(yN′) )

Calculating Cross Entropy

The formula is applied below to the single LG observation from the example, with the classes ordered (Samsung, Apple, LG) and log denoting the natural logarithm:

P(LG) = [0.1, 0.3, 0.6]

A(LG) = [0, 0, 1]

CrossEntropy(A, P) = - (0*log(0.1) + 0*log(0.3) + 1*log(0.6)) = 0.51
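The single-observation calculation can be sketched in Python as follows. The sketch assumes the natural logarithm and the class order (Samsung, Apple, LG); the function name cross_entropy is chosen here for illustration.

```python
import math

def cross_entropy(actual, predicted):
    # Terms whose target is 0 contribute nothing, so they are skipped
    # (this also avoids evaluating log(0) for a zero prediction).
    return -sum(a * math.log(p) for a, p in zip(actual, predicted) if a > 0)

predicted = [0.1, 0.3, 0.6]   # P(Samsung), P(Apple), P(LG)
actual = [0, 0, 1]            # the input is an LG device

loss = cross_entropy(actual, predicted)
print(round(loss, 2))  # 0.51
```

With a one-hot target, only the predicted probability of the true class affects the result.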

Types of Cross Entropy

Cross entropy comes in two types:

⮚ Categorical

⮚ Binary
Categorical Cross Entropy

Categorical Cross Entropy = Sum of Cross Entropy for N data / N

Terms multiplied by a target of 0 drop out, so each cross entropy reduces to the negative log of the probability predicted for the true class.

Data       Actual Distribution    Predicted Distribution    Cross Entropy
Samsung    [1, 0, 0]              [0.6, 0.3, 0.1]           -log(0.6) = 0.51
Samsung    [1, 0, 0]              [0.9, 0.1, 0]             -log(0.9) = 0.11
Apple      [0, 1, 0]              [0.2, 0.7, 0.1]           -log(0.7) = 0.36
LG         [0, 0, 1]              [0.3, 0.2, 0.5]           -log(0.5) = 0.69
Apple      [0, 1, 0]              [0.6, 0.1, 0.3]           -log(0.1) = 2.30
Samsung    [1, 0, 0]              [0.5, 0.2, 0.3]           -log(0.5) = 0.69
LG         [0, 0, 1]              [0.1, 0.1, 0.8]           -log(0.8) = 0.22

Loss Function: (0.51 + 0.11 + 0.36 + 0.69 + 2.30 + 0.69 + 0.22) / 7 = 4.88 / 7 ≈ 0.70
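As a check on the table, the categorical cross entropy is the mean of the seven per-sample values, that is, their sum divided by 7. A minimal sketch, assuming the natural logarithm and using names chosen for illustration:

```python
import math

def cross_entropy(actual, predicted):
    # Skip zero-target terms; only the true class contributes
    return -sum(a * math.log(p) for a, p in zip(actual, predicted) if a > 0)

# (actual distribution, predicted distribution) for the seven rows of the table
samples = [
    ([1, 0, 0], [0.6, 0.3, 0.1]),
    ([1, 0, 0], [0.9, 0.1, 0.0]),
    ([0, 1, 0], [0.2, 0.7, 0.1]),
    ([0, 0, 1], [0.3, 0.2, 0.5]),
    ([0, 1, 0], [0.6, 0.1, 0.3]),
    ([1, 0, 0], [0.5, 0.2, 0.3]),
    ([0, 0, 1], [0.1, 0.1, 0.8]),
]

losses = [cross_entropy(a, p) for a, p in samples]
categorical_cross_entropy = sum(losses) / len(losses)

print(round(sum(losses), 2))                 # 4.88
print(round(categorical_cross_entropy, 2))   # 0.7
```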
Binary Cross Entropy
⮚ Binary cross entropy applies when there is only one output, which takes a binary value of 0 or 1 to denote the negative and positive class respectively.

⮚ If the actual output is denoted by a single variable y and the predicted probability by y′, the cross entropy for a particular data C can be simplified as follows:

Cross Entropy(C) = - y*log(y′) when y = 1

Cross Entropy(C) = - (1-y)*log(1-y′) when y = 0

⮚ The error in binary classification for the complete model is given by binary cross entropy, which is simply the mean of the cross entropy over N data.

Binary Cross Entropy = Sum of Cross Entropy for N data / N
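The two cases above combine into a single expression, since one of the two terms is always multiplied by zero. The sketch below assumes the natural logarithm; the labels and probabilities are hypothetical values made up for illustration.

```python
import math

def binary_cross_entropy(y, y_pred):
    # For y = 1 only the first term survives; for y = 0 only the second
    return -(y * math.log(y_pred) + (1 - y) * math.log(1 - y_pred))

def mean_binary_cross_entropy(labels, probabilities):
    losses = [binary_cross_entropy(y, p) for y, p in zip(labels, probabilities)]
    return sum(losses) / len(losses)

labels = [1, 0, 1, 0]                  # hypothetical actual classes
probabilities = [0.9, 0.1, 0.8, 0.3]   # hypothetical predicted P(class 1)

bce = mean_binary_cross_entropy(labels, probabilities)
print(round(bce, 4))
```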

Cross Entropy over MSE/MAE

Overconfident wrong predictions can occur when MSE or MAE is used for classification, especially during the training phase.

[Figure: cartoon in which a model confidently tells a puzzled person, "Congrats, our model says you are pregnant."]
Cross Entropy over MSE/MAE

⮚ Let us see how binary cross entropy, MAE, and MSE penalize in such a situation.

⮚ In the example below, the two scenarios of y = 1, y′ = 0.2 and y = 0, y′ = 0.8 are examples of wrong classification.

Scenario                                 Actual y    Predicted y′    MAE                MSE                  Binary Cross Entropy
Confidently close to actual class 1      1           0.9             |1 - 0.9| = 0.1    (1 - 0.9)² = 0.01    -log(0.9) = 0.11
Confidently close to wrong class 0       1           0.2             |1 - 0.2| = 0.8    (1 - 0.2)² = 0.64    -log(0.2) = 1.61
Confidently close to actual class 0      0           0.1             |0 - 0.1| = 0.1    (0 - 0.1)² = 0.01    -log(1 - 0.1) = 0.11
Confidently close to wrong class 1       0           0.8             |0 - 0.8| = 0.8    (0 - 0.8)² = 0.64    -log(1 - 0.8) = 1.61

Binary cross entropy penalizes confident wrong predictions more severely than MAE or MSE.
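The comparison can be reproduced with a short sketch that evaluates the three losses on the four scenarios. It assumes the natural logarithm, and the variable names are illustrative.

```python
import math

# (actual y, predicted probability) for the four scenarios
scenarios = [(1, 0.9), (1, 0.2), (0, 0.1), (0, 0.8)]

rows = []
for y, p in scenarios:
    mae = abs(y - p)
    mse = (y - p) ** 2
    bce = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    rows.append((y, p, mae, mse, bce))

for y, p, mae, mse, bce in rows:
    print(f"y={y} y'={p}  MAE={mae:.2f}  MSE={mse:.2f}  BCE={bce:.2f}")
```

For both wrong classifications the binary cross entropy is the largest of the three penalties, while MSE is the smallest.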

TensorFlow
What Is TensorFlow?

⮚ A popular open source library for deep learning and machine learning

⮚ Developed by the Google Brain team and released in 2015

⮚ Used mainly for classification, perception, understanding, discovering, prediction, and creation
What Is TensorFlow?

TensorFlow uses a dataflow graph to represent your computation.

Dataflow is a common programming model for parallel computing.
Benefits of Using Graph

⮚ Parallelism: The system can easily identify operations that can be executed in parallel.

⮚ Distributed Execution: TensorFlow can partition your program across multiple devices: CPUs, GPUs, and TPUs.

⮚ Compilation: It helps to generate faster code.

⮚ Portability: You can build a dataflow graph in Python, store it in a saved model, and restore it in a C++ program.
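To illustrate the parallelism benefit, the sketch below models a tiny dataflow graph with Python's standard graphlib module (Python 3.9 or later). This is not how TensorFlow is implemented; the graph for (a + b) * (a - b) and the node names are assumptions made purely for illustration. Operations whose inputs are all available become ready at the same time and could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical dataflow graph for (a + b) * (a - b):
# each operation maps to the set of operations it depends on
graph = {
    "a": set(),
    "b": set(),
    "add": {"a", "b"},
    "sub": {"a", "b"},
    "mul": {"add", "sub"},
}

sorter = TopologicalSorter(graph)
sorter.prepare()

stages = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())   # everything runnable right now
    stages.append(ready)
    sorter.done(*ready)                  # mark the whole stage as finished

print(stages)  # [['a', 'b'], ['add', 'sub'], ['mul']]
```

Here "add" and "sub" land in the same stage because neither depends on the other.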
Why TensorFlow?

⮚ Flexibility

⮚ Parallel Computation

⮚ Multiple Environment Friendly

⮚ Large Community
TensorFlow: Parallel Computation

TensorFlow supports distributed computing.

TensorFlow: Flexibility

The Python API offers the flexibility to create all sorts of computations for every neural network architecture.

TensorFlow includes highly efficient C++ implementations of many ML operations.

TensorFlow: Multiple Environment Friendly

Runs on desktop and mobile devices such as:

Linux

macOS

iOS

Android

Raspberry Pi

Windows
TensorFlow: Large Community

Is one of the most popular open source projects on GitHub

Has a dedicated team of passionate and helpful developers

Has a growing community contributing to improve it

Installation of TensorFlow

TensorFlow 2 packages require a pip version >19.0, so upgrade pip first and then install the package:

pip install --upgrade pip

pip install tensorflow

TFLearn
What Is TFLearn?

TFLearn is a modular and transparent deep learning library built on top of TensorFlow. It was designed to provide a higher-level API to TensorFlow in order to facilitate and speed up experimentation, while remaining fully transparent and compatible with it.
Features of TFLearn

Easy to use, understand, and implement

Fast prototyping through highly modular built-in components

Full transparency over Tensorflow

Powerful helper functions to train any TensorFlow graph

Easy and clear graph visualization

Effortless device placement for using multiple CPUs or GPUs

Installation of TFLearn

For the bleeding edge version:

pip install git+https://github.com/tflearn/tflearn.git

For the latest stable version:

pip install tflearn

TFLearn Model
Layers of TFLearn

Currently available layers of TFLearn are:

Built-In Operations of TFlearn
Training of TFLearn

Training functions are another core feature of TFLearn. Low-level TensorFlow has no prebuilt API to train a network, so TFLearn integrates a set of functions that can easily handle any neural network training, for any number of inputs, outputs, and optimizers.
Visualization

TFLearn can manage a lot of useful logs. Currently, TFLearn supports a verbose level (the tensorboard_verbose argument) to automatically manage summaries:

0: Loss and Metric (Best Speed)

1: Loss, Metric, and Gradients

2: Loss, Metric, Gradients, and Weights

3: Loss, Metric, Gradients, Weights, Activations, and Sparsity (Best Visualization)

[Figure: Visualization: Loss and Accuracy]

[Figure: Visualization: Layers]
Weights Persistence

To save or restore a model, use 'save' or 'load' method of DNN model class.

Code

# Save a model
model.save('my_model.tflearn')
# Load a model
model.load('my_model.tflearn')
Weights Persistence

A layer's variables can be retrieved either by layer name or directly through the 'W' and 'b' attributes that the layer attaches to its returned tensor.

Code

# Let's create a layer
fc1 = fully_connected(input_layer, 64, name="fc_layer_1")

# Using Tensor attributes (the layer attaches its weights to the returned Tensor)
fc1_weights_var = fc1.W
fc1_biases_var = fc1.b

# Using the layer name
fc1_vars = tflearn.get_layer_variables_by_name("fc_layer_1")
fc1_weights_var = fc1_vars[0]
fc1_biases_var = fc1_vars[1]
Weights Persistence

To get or set the value of these variables, TFLearn model classes implement get_weights and set_weights methods:

Code

import numpy
import tflearn

input_data = tflearn.input_data(shape=[None, 784])
fc1 = tflearn.fully_connected(input_data, 64)
fc2 = tflearn.fully_connected(fc1, 10, activation='softmax')
net = tflearn.regression(fc2)
model = tflearn.DNN(net)

# Get weights values of fc2
model.get_weights(fc2.W)

# Assign new random weights to fc2
model.set_weights(fc2.W, numpy.random.rand(64, 10))
Fine-Tuning
While defining a model in TFLearn, you can specify which layers' weights are restored when a pre-trained model is loaded. This is handled by the restore argument of layer functions and is only available for layers with weights.

Code

# Weights will be restored by default.
fc_layer = tflearn.fully_connected(input_layer, 32)

# Weights will not be restored, if specified so (a Boolean, not the string 'False').
fc_layer = tflearn.fully_connected(input_layer, 32, restore=False)
Data Management
TFLearn supports NumPy array data. Additionally, it supports HDF5 for handling large datasets. TFLearn can directly use HDF5-formatted data:

Code

import h5py
from tflearn import DNN

# Load hdf5 dataset
h5f = h5py.File('data.h5', 'r')
X, Y = h5f['MyLargeData']

# ... define network ...

# Use HDF5 data to train the model
model = DNN(network)
model.fit(X, Y)
Data Preprocessing and Augmentation
TFLearn provides wrappers to easily handle data preprocessing and data augmentation. The TFLearn data stream is designed with computing pipelines in order to speed up training, by pre-processing data on the CPU while the GPU is performing model training.
Data Preprocessing and Augmentation

Code

# Real-time image preprocessing

img_prep = tflearn.ImagePreprocessing()
# Zero Center (With mean computed over the whole dataset)
img_prep.add_featurewise_zero_center()
# STD Normalization (With std computed over the whole dataset)
img_prep.add_featurewise_stdnorm()
Data Preprocessing and Augmentation

Code

# Real-time data augmentation

img_aug = tflearn.ImageAugmentation()
# Randomly flip an image
img_aug.add_random_flip_leftright()

# Add these methods into an 'input_data' layer
network = input_data(shape=[None, 32, 32, 3],
                     data_preprocessing=img_prep,
                     data_augmentation=img_aug)
Scopes and Weights Sharing
All layers are built over variable_op_scope, which makes it easy to share variables among multiple layers and makes TFLearn suitable for distributed training.

Code

# Define a model builder
def my_model(x):
    x = tflearn.fully_connected(x, 32, scope='fc1')
    x = tflearn.fully_connected(x, 32, scope='fc2')
    x = tflearn.fully_connected(x, 2, scope='out')
    return x

# 2 different computation graphs but sharing the same weights
with tf.device('/gpu:0'):
    # Force all Variables to reside on the CPU.
    with tf.arg_scope([tflearn.variables.variable], device='/cpu:0'):
        model1 = my_model(placeholder_X)
Scopes and Weights Sharing
All layers with inner variables support a scope argument under which their variables are placed. Layers with the same scope name share the same weights.

Code

# Reuse Variables for the next model
tf.get_variable_scope().reuse_variables()
with tf.device('/gpu:1'):
    with tf.arg_scope([tflearn.variables.variable], device='/cpu:0'):
        model2 = my_model(placeholder_X)

# Model can now be trained by multiple GPUs (see gradient averaging)

Graph Initialization
It can be useful to limit resources, or to assign more or less GPU memory, while training. To do so, a graph initializer can be used to configure a graph before it is run:

Code

tflearn.init_graph(set_seed=8888, num_cores=16, gpu_memory_fraction=0.5)

Extending TensorFlow
TFLearn is a very flexible library designed to let you use any of its components independently. A model can be succinctly built using any combination of TensorFlow operations and TFLearn built-in layers and operations. The following are the two basic areas where TensorFlow is extended:

Layers

Built-In Operations
Extending TensorFlow: Layers
Any layer can be used with any other tensor from Tensorflow, i.e. you can directly use TFLearn wrappers
into your own Tensorflow graph.

Code

# Some operations using Tensorflow.

X = tf.placeholder(shape=(None, 784), dtype=tf.float32)
net = tf.reshape(X, [-1, 28, 28, 1])

# Using TFLearn convolution layer.

net = tflearn.conv_2d(net, 32, 3, activation='relu')

# Using Tensorflow's max pooling op.

net = tf.nn.max_pool(net, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
padding='SAME')
Extending TensorFlow: Built-In Operations
TFLearn built-in operations make Tensorflow graph writing faster and more readable. These are fully
compatible with any TensorFlow expression. The following code examples show how to use them along with
pure Tensorflow API.
Trainer, Evaluator, and Predictor
TFLearn provides helper functions that can train any TensorFlow graph. They make training more convenient by introducing real-time monitoring, batch sampling, moving averages, TensorBoard logs, data feeding, etc. Any number of inputs, outputs, and optimization ops is supported.

TFLearn implements a TrainOp class to represent an optimization process (i.e. backprop). It is defined as follows:

Code

trainop = TrainOp(net=my_network, loss=loss, metric=accuracy)

Trainer, Evaluator, and Predictor

TrainOps can be fed into a Trainer class, that will handle the whole training process, considering all
TrainOp together as a whole model.

Code

model = Trainer(trainops=trainop, tensorboard_dir='/tmp/tflearn')

model.fit(feed_dict={input_placeholder: X, target_placeholder: Y})
Trainer, Evaluator, and Predictor

The Trainer class is also useful for more complex models that involve multiple optimization processes.

Code

model = Trainer(trainops=[trainop1, trainop2])

model.fit(feed_dict=[{in1: X1, label1: Y1},
                     {in2: X2, in3: X3, label2: Y2}])
Trainer, Evaluator, and Predictor

For prediction, TFLearn implements an Evaluator class that works in the same way as the Trainer. It takes a network as its parameter and returns the predicted value.

Code

model = Evaluator(network)
model.predict(feed_dict={input_placeholder: X})
Trainer, Evaluator, and Predictor
Some networks have layers that behave differently at training and testing time, such as dropout and batch normalization. To handle them:

The Trainer class uses a Boolean variable (is_training) that specifies whether the network is being used for training or for testing and predicting. This variable is stored under the tf.GraphKeys.IS_TRAINING collection, as its first element. So, while defining such layers, this variable should be used as the operational condition:

Code

# Example for Dropout:
x = ...

def apply_dropout():  # Function to apply when training mode is ON.
    return tf.nn.dropout(x, keep_prob)

is_training = tflearn.get_training_mode()  # Retrieve the is_training variable.
tf.cond(is_training, apply_dropout, lambda: x)  # Only apply dropout at training time.
What Is Keras?

⮚ A high-level neural network API, written in Python

⮚ Powerful and easy to use for developing and evaluating deep learning models

⮚ Runs seamlessly on CPU and GPU

Keras: Backends

Keras uses TensorFlow, Theano, MxNet, and CNTK (Microsoft) as backends.

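[Figure: layered stack with Keras on top, the backends (TensorFlow, MxNet, CNTK, Theano) in the middle, and the hardware (CPU, GPU, TPU) at the bottom]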

Why Use Keras?

Allows easy and fast prototyping

Supports convolutional networks, recurrent networks, and combinations of the two

Provides clear and actionable feedback for user error

Follows best practices for reducing cognitive load

Installation of Keras

Installation of Keras is done as follows:

⮚ Install Keras in virtualenv:

pip3 install keras

⮚ Install Keras from the GitHub source:

Clone Keras using git:

git clone https://github.com/keras-team/keras.git

cd to the Keras folder and run the install command:

cd keras

sudo python setup.py install

Creating a Keras Model

1 Architecture Definition: Number of layers, number of nodes in layers, and activation function to be used

2 Compile: Defines the loss function and details about how optimization works

3 Fit: Finalizes the model through back propagation and optimization of weights with input data

4 Predict: Predicts with the model prepared

Create the Model

The sequential model is a linear stack of layers.

Code

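from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
# Two convolution + pooling stages extract image features
model.add(Conv2D(16, (5, 5), activation='relu',
                 input_shape=(img_width, img_height, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(32, (5, 5), activation='relu'))
model.add(MaxPooling2D((2, 2)))

# Flatten the feature maps and classify into 10 classes
model.add(Flatten())
model.add(Dense(1000, activation='relu'))
model.add(Dense(10, activation='softmax'))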
Compile the Model

Code

model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

⮚ The loss function evaluates a set of weights. Use categorical_crossentropy for a multi-class softmax output with one-hot labels, and binary_crossentropy for a single sigmoid output.

⮚ The optimizer searches through different weights for the network.

⮚ Metrics are optional values to collect and report during training. Set metrics=['accuracy'] for a classification problem.

Fit the Model

Code

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

⮚ Trains the model on the given data

⮚ Iterates over the data in batches, for the given number of epochs

Evaluate the Model

Code

score = model.evaluate(x_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

⮚ Assesses the trained model on a data set

Predict

Code

classes=model.predict(x_test,batch_size=128)

⮚ Generates prediction on new data

Other Deep Learning Tool: PyTorch
What Is PyTorch?

⮚ A deep learning research platform that provides maximum flexibility and speed

⮚ A replacement for NumPy to use the power of GPUs

⮚ A product of Facebook's artificial intelligence team
Features of PyTorch
The features of PyTorch are as follows:

Simple Interface

Hybrid Frontend

Distributed Training

Native ONNX Support

C++ Frontend

Cloud Partners
PyTorch Ecosystems

⮚ Glow, an ML compiler that increases the performance of deep learning platforms

⮚ PyTorch Geometric, a library for deep learning on irregular input data

⮚ Skorch, a high-level library that provides full scikit-learn compatibility

⮚ Torchbearer, a library for advanced visualizations
Installation of PyTorch

The options offered when installing PyTorch are:

PyTorch Build: Stable (1.2), Preview (Nightly)
Your OS: Linux, Mac, Windows
Package: Conda, Pip, LibTorch, Source
Language: Python 2.7, Python 3.5, Python 3.6, Python 3.7, C++
CUDA: 9.2, 10.0, None

Run this command:

conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
Deep Learning Model with Keras

Problem Statement: Given a data set of diabetes patients with different health parameters, build a deep learning classification model that predicts the outcome for each patient.

Access:

⮚ Click on the Labs tab on the left side panel of the LMS. Copy the username and password.

⮚ Click on the Launch Lab button. On the page that appears, enter the username and
password, and click Login.
Loading Dataset

Steps: Processing the Data, Define the Model, Compile the Model, Fit the Model, Evaluate the Model, Predict

The dataset is first read using the pandas library, and its first five rows are printed.

Code

import pandas as pd

# Read the diabetes dataset and show its first five rows
dataset = pd.read_csv('diabetes.csv')
dataset.head()
Splitting Dataset

The dataset is split into input and output.

Code

# The first eight columns are the input features
X = dataset.iloc[:, 0:8]
# The ninth column is the output label
y = dataset.iloc[:, 8]
X.head()
Splitting Dataset

The output column is inspected in the same way.

Code

y.head()
Importing Library

The model is sequential and the layers are defined with the Dense class.

Code

from keras.models import Sequential
from keras.layers import Dense
Creating the Layers

Layer structure of the model:

⮚ The model expects rows of data with eight variables (the input_dim=8 argument).

⮚ The first hidden layer has 12 nodes and uses the ReLU activation function.

⮚ The second hidden layer has 8 nodes and uses the ReLU activation function.

⮚ The output layer has one node and uses the sigmoid activation function.

Code

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
Compile the Model

⮚ Binary cross entropy is set as the loss function for this classification model.

⮚ The optimizer is the Adam algorithm.

⮚ The metric is set to accuracy.

Code

model.compile(loss='binary_crossentropy',
              optimizer='adam', metrics=['accuracy'])
Fit the Model

Training occurs over epochs and each epoch is split into batches.

Code

model.fit(X, y, epochs=60, batch_size=10)
Evaluate the Model

The evaluate() function returns the loss and the accuracy. This demo uses only the second value, the accuracy, so the loss is discarded.

Code

_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))
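Predict

The predict_classes() method generates class predictions for the input. Recent Keras releases no longer provide it, so the code below thresholds the sigmoid output of predict() at 0.5, which gives the same class labels.

Code

# Threshold the predicted probabilities at 0.5 to get class labels
predictions = (model.predict(X) > 0.5).astype('int32')
predictions[0:5]

# Compare with the actual labels
dataset['Outcome'].head()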
Deep Learning Model with TensorFlow

Problem Statement: Create a deep learning model with MNIST dataset to predict the
handwritten digits.

Access:

⮚ Click on the Labs tab on the left side of the LMS panel. Copy the username and password.

⮚ Click on the Launch Lab button. On the page that appears, enter the username and
password, and click Login.
Loading the Dataset

Import tensorflow and load MNIST, a dataset of images of the handwritten digits 0 to 9, each with dimension 28x28.

Code

import tensorflow as tf

mnist_data = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist_data.load_data()
Visualizing the Data

The digit at position 6 is displayed using the matplotlib library, and its label is printed.

Code

import matplotlib.pyplot as plt

plt.imshow(x_train[6], cmap=plt.cm.binary)
plt.show()

print(y_train[6])
Normalizing the Data

Data normalization is done with the tf.keras.utils.normalize() function, which scales the values along the given axis to unit L2 norm. The pixel values of the images, originally in the range 0 to 255, end up in the range 0 to 1.

Code

x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)

⮚ The difference can be seen between the original digit and the normalized digit.

[Figure: the original digit next to the normalized digit]
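To make the normalization step concrete, the sketch below applies L2 normalization to one short, made-up row of pixel values in pure Python and compares it with plain division by 255. It assumes that tf.keras.utils.normalize() divides by the L2 norm along the chosen axis, as its documentation describes; the helper name l2_normalize and the sample row are hypothetical.

```python
import math

def l2_normalize(row):
    # Divide every value by the Euclidean length of the row
    norm = math.sqrt(sum(v * v for v in row))
    if norm == 0:
        return [0.0 for _ in row]   # an all-zero row stays all zero
    return [v / norm for v in row]

row = [0, 51, 255, 102]                      # hypothetical pixel values in 0..255

l2_scaled = l2_normalize(row)                # unit-length row, values within 0..1
max_scaled = [v / 255 for v in row]          # simple 0..1 rescaling, for comparison

print([round(v, 3) for v in l2_scaled])
print(max_scaled)
```

Both approaches keep non-negative pixels inside the range 0 to 1, but they are not the same transformation: L2 normalization depends on the other values in the row.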
Define the Model

A feed-forward sequential model is defined:

⮚ The input is flattened from 28×28 to 1×784.

⮚ Two densely connected layers are applied, with the rectified linear unit (ReLU) as the activation function.

⮚ The final layer has 10 nodes, one node for each digit. Its activation function is softmax, which is suitable when the desired output is a probability distribution over n different classes.

Code

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())

model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))

model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
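The softmax activation in the final layer turns the raw scores of the output nodes into a probability distribution. A minimal pure-Python sketch, with made-up scores for three classes rather than ten:

```python
import math

def softmax(scores):
    # Subtracting the maximum keeps exp() numerically stable
    # without changing the result.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]                 # hypothetical raw outputs of the last layer
probs = softmax(scores)
predicted_class = probs.index(max(probs))

print([round(p, 3) for p in probs], predicted_class)
```

The probabilities sum to 1, and the class with the highest score keeps the highest probability.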
Compile the Model

⮚ Sparse categorical cross entropy is set as the loss function for this classification model.

⮚ The optimizer is the Adam algorithm, which is straightforward to use and gives efficient results.

⮚ The metric is set to accuracy.

Code

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
Fit the Model

Training occurs over epochs and each epoch is split into batches.

Code

model.fit(x_train, y_train, epochs=3)
Evaluate the Model

The evaluate() function returns a list with two values. The first is the loss of the model on the dataset, and the second is the accuracy of the model on the dataset.

Code

loss, accuracy = model.evaluate(x_test, y_test)
print(loss)
print(accuracy)
Predict

The predict() function returns, for each input, the predicted probability of every digit. The digit with the highest probability is the prediction.

Code

import numpy as np

predictions = model.predict(x_test)
# Index of the highest probability is the predicted digit
print(np.argmax(predictions[1]))

# Show the corresponding test image for comparison
plt.imshow(x_test[1], cmap=plt.cm.binary)
plt.show()
Deep Learning Model with PyTorch

Problem Statement: Build a deep learning model using Fashion-MNIST, a dataset of fashion articles with 10 classes in which each image is 28×28 pixels.

Access:

⮚ Click on the Labs tab on the left side of the LMS panel. Copy the username and password.

⮚ Click on the Launch Lab button. On the page that appears, enter the username and
password, and click Login.
Importing the Library

Steps: Processing the Data, Building the Network, Define Loss Function and Optimizer, Train the Network, Evaluate the Model

Code

import torch
import torchvision
import torchvision.transforms as transforms
Creating Image Normalizer

An object is created using the Compose() function to normalize the image data.

Code

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
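With a mean of 0.5 and a standard deviation of 0.5, the normalizer maps pixel values from the range 0 to 1 (as produced by ToTensor()) to the range -1 to 1. The pure-Python sketch below shows that arithmetic and its inverse on single values; the function names are illustrative, not part of torchvision.

```python
def normalize(pixel, mean=0.5, std=0.5):
    # Same arithmetic as Normalize: (value - mean) / std
    return (pixel - mean) / std

def unnormalize(value):
    # Inverse of the mapping above for mean = std = 0.5
    return value / 2 + 0.5

print(normalize(0.0), normalize(0.5), normalize(1.0))   # -1.0 0.0 1.0
print(unnormalize(normalize(0.25)))                     # 0.25
```

The inverse mapping is what an image needs before it is displayed again.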
Creating Data Loader

Creating data loader object for training data:

Code

traindata = torchvision.datasets.FashionMNIST(root='./data',
                                              train=True,
                                              download=True,
                                              transform=transform)

trainloader = torch.utils.data.DataLoader(traindata,
                                          batch_size=4,
                                          shuffle=True)
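Creating Data Loader

Creating data loader object for testing data, followed by a helper function that displays an image:

Code

import matplotlib.pyplot as plt
import numpy as np

# Assumed to mirror the training loader, with train=False
testdata = torchvision.datasets.FashionMNIST(root='./data', train=False,
                                             download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testdata, batch_size=4, shuffle=False)

def show_image(image):
    image = image / 2 + 0.5  # unnormalize
    np_img = image.numpy()
    # Move the channel axis last, as matplotlib expects
    plt.imshow(np.transpose(np_img, (1, 2, 0)))
    plt.show()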
Visualizing the Image Data

Code

# Fashion-MNIST class names, in label order
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot')

# Take one batch of four images from the training loader
dataiter = iter(trainloader)
images, labels = next(dataiter)

show_image(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
Building the Network

Code

import torch.nn as nn
import torch.nn.functional as F

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(x.shape[0], -1)  # flatten each 28x28 image to 784 values
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        x = F.relu(x)
        x = self.fc3(x)
        # Return raw scores (logits): nn.CrossEntropyLoss applies log-softmax itself
        return x
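As a quick check on the size of this network, each nn.Linear(in, out) layer holds in × out weights plus out biases. The sketch below counts them for the three layers above. count_parameters is an illustrative helper, not a PyTorch function.

```python
layer_sizes = [784, 128, 64, 10]  # input, two hidden layers, output


def count_parameters(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out
    return total


print(count_parameters(layer_sizes))  # 109386
```

Most of the parameters sit in the first layer, because the flattened 28×28 input has 784 values.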
Define Loss Function and Optimizer

⮚ Cross entropy is set as the loss function for this classification model.

⮚ The optimizer is the Adam algorithm.

Code

from torch import optim

model = Network()
criterion = nn.CrossEntropyLoss()
# lr=0.1 is large for Adam (its default is 0.001); reduce it if the loss does not decrease
optimizer = optim.Adam(model.parameters(), lr=0.1)
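PyTorch's nn.CrossEntropyLoss applies log-softmax internally, so it is meant to receive raw scores (logits) rather than probabilities. The plain-Python sketch below reproduces that computation for one sample and shows what happens if softmax is applied a second time. The helper names are illustrative.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def cross_entropy(scores, target):
    """Negative log-probability of the target class, computed from raw scores."""
    return -log_softmax(scores)[target]


logits = [10.0, 0.0, 0.0]  # a confident, correct prediction for class 0
probs = [math.exp(v) for v in log_softmax(logits)]

loss_from_logits = cross_entropy(logits, target=0)
loss_from_probs = cross_entropy(probs, target=0)  # softmax applied twice

print(loss_from_logits, loss_from_probs)
```

When probabilities are passed instead of logits, the loss cannot approach zero even for a perfect prediction, which slows or stalls training.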
Train the Network

Code

for epoch in range(5):
    running_loss = 0.0
    for inputs, labels in trainloader:
        output = model(inputs)
        loss = criterion(output, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    # Average loss over all batches in this epoch
    print(f"loss: {running_loss/len(trainloader)}")

print('Finished Training')
Evaluate the Model

Code

correct = 0
total = 0

# testloader is the data loader built over the test split
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
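The accuracy computation above picks the highest-scoring class for each sample and counts how often it equals the label. A minimal plain-Python sketch of the same idea, using hypothetical scores rather than real model outputs:

```python
def accuracy(scores, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


# Hypothetical scores for 4 samples and 3 classes.
scores = [
    [2.1, 0.3, -1.0],
    [0.2, 1.5, 0.1],
    [0.9, 0.8, 0.7],
    [-0.5, 0.0, 1.2],
]
labels = [0, 1, 2, 2]

print(accuracy(scores, labels))  # 0.75
```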
Deep Learning Model with Caffe2

Problem Statement: Make a deep learning model with MNIST data using Caffe2.

Access:

⮚ Click on the Labs tab on the left side of the LMS panel. Copy the username and password.

⮚ Click on the Launch Lab button. On the page that appears, enter the username and
password, and click Login.
Loading the Data

Code

from keras.datasets.mnist import load_data

(trainX, trainy), (testX, testy) = load_data()

print('Train', trainX.shape, trainy.shape)
print('Test', testX.shape, testy.shape)
Importing the Library

Code

import os
import shutil

import numpy as np
from matplotlib import pyplot

from caffe2.python import (
    brew,
    core,
    model_helper,
    net_drawer,
    optimizer,
    visualize,
    workspace,
)

core.GlobalInit(['caffe2', '--caffe2_log_level=0'])
print("Necessities imported!")

USE_LENET_MODEL = True
Loading the Data

Code

def DownloadResource(url, path):
    '''Downloads resources from s3 by url and unzips them to
    the provided path'''
    import io, requests, zipfile
    print("Downloading... {} to {}".format(url, path))
    r = requests.get(url, stream=True)
    # io.BytesIO replaces the Python 2 StringIO module
    z = zipfile.ZipFile(io.BytesIO(r.content))
    z.extractall(path)
Loading the Data

Set up the paths for the necessary directories.

Code

current_folder = os.path.join(os.path.expanduser('~'),
                              'caffe2_notebooks')
data_folder = os.path.join(current_folder, 'tutorial_data',
                           'mnist')
root_folder = os.path.join(current_folder, 'tutorial_files',
                           'tutorial_mnist')
db_missing = False
Loading the Data

Check if the training LMDB exists in the data folder.

Code

if os.path.exists(os.path.join(data_folder, "mnist-train-nchw-lmdb")):
    print("lmdb train db found!")
else:
    db_missing = True
Loading the Data

Check if the testing LMDB exists in the data folder.

Code

if os.path.exists(os.path.join(data_folder, "mnist-test-nchw-lmdb")):
    print("lmdb test db found!")
else:
    db_missing = True
Loading the Data

Attempt to download the databases if either of them is missing.

Code

if db_missing:
    print("one or both of the MNIST lmdb dbs not found!!")
    db_url = "http://download.caffe2.ai/databases/mnist-lmdb.zip"
    try:
        DownloadResource(db_url, data_folder)
    except Exception as ex:
        print("Failed to download dataset. Please download it manually from {}".format(db_url))
        print("Unzip it and place the two database folders here: {}".format(data_folder))
        raise ex
Loading the Data

Clean up the statistics from any of the old runs.

Code

if os.path.exists(root_folder):
    print("Looks like you ran this before, so we need to cleanup those old files...")
    shutil.rmtree(root_folder)

os.makedirs(root_folder)
workspace.ResetWorkspace(root_folder)
Constructing the Model

For the sake of modularity, we will separate the construction of the model into different parts:

⮚ The data input part (AddInput function)

⮚ The main computation part (AddModel function)

⮚ The training part, where gradient operators, the optimization algorithm, etc. are added (AddTrainingOperators function)

⮚ The bookkeeping part, where you just print the statistics for inspection (AddBookkeepingOperators function)
Constructing the Model

The AddInput function loads the data from a database.

Code

def AddInput(model, batch_size, db, db_type):
    # Load the data and labels from the database
    data_uint8, label = model.TensorProtosDBInput(
        [], ["data_uint8", "label"], batch_size=batch_size,
        db=db, db_type=db_type)
    # Cast the data from uint8 to float
    data = model.Cast(data_uint8, "data", to=core.DataType.FLOAT)
    # Scale pixel values from [0, 255] to [0, 1)
    data = model.Scale(data, data, scale=float(1./256))
    # The input does not need gradients in the backward pass
    data = model.StopGradient(data, data)
    return data, label
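The Cast and Scale steps in AddInput amount to converting 8-bit pixel values to floats and multiplying by 1/256. The NumPy sketch below shows the same arithmetic on a small, hypothetical batch. It does not use Caffe2 and is only meant to illustrate the resulting value range.

```python
import numpy as np

# Hypothetical batch of two 2x2 single-channel images stored as 8-bit integers.
data_uint8 = np.array([[[[0, 64], [128, 255]]],
                       [[[255, 0], [32, 16]]]], dtype=np.uint8)

data = data_uint8.astype(np.float32)   # Cast: uint8 -> float
data = data * np.float32(1.0 / 256)    # Scale: [0, 255] -> [0, 1)

print(data.shape)               # (2, 1, 2, 2), i.e. NCHW order
print(data.min(), data.max())   # smallest and largest scaled values
```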
Constructing the Model

When the flag USE_LENET_MODEL is false, the MLP model definition is used.

Code

def AddMLPModel(model, data):
    size = 28 * 28 * 1
    sizes = [size, size * 2, size * 2, 10]
    layer = data
    for i in range(len(sizes) - 1):
        layer = brew.fc(model, layer, 'dense_{}'.format(i),
                        dim_in=sizes[i], dim_out=sizes[i + 1])
        layer = brew.relu(model, layer, 'relu_{}'.format(i))
    softmax = brew.softmax(model, layer, 'softmax')
    return softmax
Constructing the Model

When the flag USE_LENET_MODEL is true, the LeNet model definition is used.

Code

def AddLeNetModel(model, data):
    conv1 = brew.conv(model, data, 'conv1', dim_in=1, dim_out=20, kernel=5)
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    conv2 = brew.conv(model, pool1, 'conv2', dim_in=20, dim_out=50, kernel=5)
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    fc3 = brew.fc(model, pool2, 'fc3', dim_in=50 * 4 * 4, dim_out=500)
    relu3 = brew.relu(model, fc3, 'relu3')
    pred = brew.fc(model, relu3, 'pred', dim_in=500, dim_out=10)
    softmax = brew.softmax(model, pred, 'softmax')
    return softmax
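The value dim_in=50 * 4 * 4 of the first fully connected layer follows from the image size after the two convolution and pooling stages. The sketch below traces that size, assuming the convolutions use no padding and a stride of 1. conv_out is an illustrative helper, not a Caffe2 function.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


size = 28                                   # MNIST images are 28x28
size = conv_out(size, kernel=5)             # conv1 -> 24
size = conv_out(size, kernel=2, stride=2)   # pool1 -> 12
size = conv_out(size, kernel=5)             # conv2 -> 8
size = conv_out(size, kernel=2, stride=2)   # pool2 -> 4

channels = 50                               # dim_out of conv2
flattened = channels * size * size
print(size, flattened)                      # 4 800
```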
Constructing the Model

The AddModel function allows you to switch easily between the MLP and LeNet models. Change USE_LENET_MODEL at the very top of the notebook and rerun the whole code.

Code

def AddModel(model, data):
    if USE_LENET_MODEL:
        return AddLeNetModel(model, data)
    else:
        return AddMLPModel(model, data)
Constructing the Model

The AddAccuracy function adds an accuracy operator to the model. It uses the softmax scores and the input training labels.

Code

def AddAccuracy(model, softmax, label):
    accuracy = brew.accuracy(model, [softmax, label], "accuracy")
    return accuracy
Constructing the Model

Add training operators.

Code

def AddTrainingOperators(model, softmax, label):
    # Cross entropy between the prediction and the label
    xent = model.LabelCrossEntropy([softmax, label], 'xent')
    # Average loss over the batch
    loss = model.AveragedLoss(xent, "loss")
    AddAccuracy(model, softmax, label)
    # Add gradient operators for the loss
    model.AddGradientOperators([loss])
    # SGD with a step learning-rate policy
    optimizer.build_sgd(
        model,
        base_learning_rate=0.1,
        policy="step",
        stepsize=1,
        gamma=0.999,
    )
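With policy="step", stepsize=1, and gamma=0.999, the learning rate shrinks slightly at every iteration. The sketch below assumes the usual definition of a step policy, lr = base_lr × gamma^floor(iteration / stepsize). step_learning_rate is an illustrative helper and does not call Caffe2.

```python
def step_learning_rate(iteration, base_lr=0.1, stepsize=1, gamma=0.999):
    """Step policy: multiply the rate by gamma once every `stepsize` iterations."""
    return base_lr * gamma ** (iteration // stepsize)


for it in (0, 1, 100, 199):
    print(it, round(step_learning_rate(it), 5))
```

Under this assumption, the rate after 200 iterations is still roughly 80% of its starting value, so the decay is gentle.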
Constructing the Model

Add bookkeeping operators.

Code

def AddBookkeepingOperators(model):
    # Print the accuracy and loss to a file
    model.Print('accuracy', [], to_file=1)
    model.Print('loss', [], to_file=1)
    # Summarize each parameter and its gradient
    for param in model.params:
        model.Summarize(param, [], to_file=1)
        model.Summarize(model.param_to_grad[param], [], to_file=1)
Training the Model

Code

arg_scope = {"order": "NCHW"}
train_model = model_helper.ModelHelper(name="mnist_train",
                                       arg_scope=arg_scope)
data, label = AddInput(
    train_model, batch_size=64,
    db=os.path.join(data_folder, 'mnist-train-nchw-lmdb'),
    db_type='lmdb')
softmax = AddModel(train_model, data)
AddTrainingOperators(train_model, softmax, label)
AddBookkeepingOperators(train_model)
Testing the Model

Code

test_model = model_helper.ModelHelper(
    name="mnist_test", arg_scope=arg_scope,
    init_params=False)
data, label = AddInput(
    test_model, batch_size=100,
    db=os.path.join(data_folder, 'mnist-test-nchw-lmdb'),
    db_type='lmdb')
softmax = AddModel(test_model, data)
AddAccuracy(test_model, softmax, label)
Deploying the Model

Code

deploy_model = model_helper.ModelHelper(
    name="mnist_deploy", arg_scope=arg_scope,
    init_params=False)
AddModel(deploy_model, "data")

Running the Training

Code

workspace.RunNetOnce(train_model.param_init_net)
workspace.CreateNet(train_model.net, overwrite=True)
total_iters = 200
accuracy = np.zeros(total_iters)
loss = np.zeros(total_iters)

for i in range(total_iters):
    workspace.RunNet(train_model.net)
    accuracy[i] = workspace.blobs['accuracy']
    loss[i] = workspace.blobs['loss']
    if i % 25 == 0:
        print("Iter: {}, Loss: {}, Accuracy: {}".format(i, loss[i], accuracy[i]))
Running the Training

Code

pyplot.plot(loss, 'b')
pyplot.plot(accuracy, 'r')
pyplot.title("Summary of Training Run")
pyplot.xlabel("Iteration")
pyplot.legend(('Loss', 'Accuracy'), loc='upper right')

Deep Learning Model with Python

Problem Statement: Build a deep learning model in plain Python, without using a deep learning framework.

Access:

⮚ Click on the Labs tab on the left side of the LMS panel. Copy the username and password.

⮚ Click on the Launch Lab button. On the page that appears, enter the username and
password, and click Login.
Creating the Neural Network Class

A neural network consists of the following components:

⮚ An input layer, x

⮚ An arbitrary number of hidden layers

⮚ An output layer, ŷ

⮚ A set of weights and biases between each layer, W and b

⮚ A choice of activation function for each hidden layer, σ. In this tutorial, you’ll use a
sigmoid activation function
Creating the Neural Network Class

Code

import numpy as np

class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(y.shape)
Creating the Neural Network Class

Adding backpropagation of the loss function to the NeuralNetwork class

Code

    def backprop(self):
        # application of the chain rule to find the derivative of the loss function
        # with respect to weights2 and weights1
        d_weights2 = np.dot(self.layer1.T,
                            2 * (self.y - self.output) * sigmoid_derivative(self.output))
        d_weights1 = np.dot(self.input.T,
                            np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output),
                                   self.weights2.T) * sigmoid_derivative(self.layer1))

        # update the weights with the derivative (slope) of the loss function
        self.weights1 += d_weights1
        self.weights2 += d_weights2
Creating the Neural Network Class

We are creating a feedforward function in the NeuralNetwork class.

Code

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))
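The pieces above can be assembled into one runnable script. The sketch below is a minimal version under stated assumptions: the slides do not show the sigmoid helpers or the training data, so sigmoid, sigmoid_derivative (written in terms of the sigmoid output), the loss method, the random seed, and the 4-sample dataset are all supplied here for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_derivative(s):
    # Derivative of the sigmoid expressed in terms of its output s = sigmoid(x)
    return s * (1.0 - s)


class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # Chain rule for the sum-of-squares loss
        delta_out = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
        d_weights2 = np.dot(self.layer1.T, delta_out)
        d_weights1 = np.dot(self.input.T,
                            np.dot(delta_out, self.weights2.T)
                            * sigmoid_derivative(self.layer1))
        self.weights1 += d_weights1
        self.weights2 += d_weights2

    def loss(self):
        # Sum of squared errors (assumed loss; matches the gradients above)
        return float(np.sum((self.y - self.output) ** 2))


np.random.seed(0)  # fixed seed so repeated runs match
# Hypothetical training data (the slides do not show theirs).
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

net = NeuralNetwork(X, y)
net.feedforward()
initial_loss = net.loss()

for _ in range(1500):
    net.feedforward()
    net.backprop()

net.feedforward()
final_loss = net.loss()
print(initial_loss, final_loss)
print(net.output.round(3))
```

Each iteration runs a forward pass followed by one weight update, so 1500 iterations correspond to 1500 full passes over the four samples.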
Output of the Neural Network

The loss of neural network after 1500 iterations:

Output of the Neural Network

Prediction after 1500 iterations:

Prediction    Y (actual)
0.023         0
0.979         1
0.975         1
0.025         0
Key Takeaways

You are now able to:

Explain and define deep learning models

Determine a suitable loss function for deep learning models

Apply Python to create a deep learning model without any deep learning framework

Build a deep neural network with the Keras, Caffe, PyTorch, and TensorFlow deep learning frameworks
Knowledge Check

Knowledge Check 1

What is the minimum number of hidden layers a neural network must have to qualify as a deep neural network?

a. 1

b. 2

c. 3

d. All of the above

The correct answer is b

If a neural network contains one hidden layer, it is called a shallow neural network. A shallow neural network becomes a deep neural network when one more hidden layer is added.
Knowledge Check 2

Which of the following deep learning ecosystems does Glow, an ML compiler, belong to?

a. PyTorch

b. Keras

c. TensorFlow

d. None of the above

The correct answer is a

Glow belongs to the PyTorch ecosystem, along with Skorch, Torchbearer, and PyTorch Geometric.
Knowledge Check 3

Which of the following loss functions is best suited to build a regression model on data with outliers?

a. MAE

b. MSE

c. Both MAE and MSE

d. None of the above

The correct answer is a

MAE is the suitable loss function for a regression model on data with outliers, while MSE is suitable for a dataset without outliers.
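The difference can be seen on a small example. The sketch below uses hypothetical values and compares both losses with and without a single outlier in the actual values.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values; the second "actual" list contains one outlier (31.5).
predicted = [9.4, 6.9, 11.3, 11.1]
actual_clean = [10.2, 7.1, 9.5, 11.5]
actual_outlier = [10.2, 7.1, 31.5, 11.5]

print(mse(actual_clean, predicted), mae(actual_clean, predicted))
print(mse(actual_outlier, predicted), mae(actual_outlier, predicted))
```

Because MSE squares each error, the single outlier inflates it far more than it inflates MAE, which is why MAE is the more robust choice when outliers are present.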
Knowledge Check 4

Which of the following deep learning frameworks uses a TensorFlow backend?

a. Keras

b. PyTorch

c. Caffe

d. None of the above

The correct answer is a

Keras can use TensorFlow as its backend, along with Theano and MXNet.

Chars74k Image Classification

Problem Statement: Build a deep learning convolutional neural network to recognize characters using the Chars74k dataset.

Objective: Build a neural network-based classification model to recognize characters using the following specifications:
Use 4 convolution layers with a 3×3 kernel and ReLU as the activation function.
Add max pooling layers after every other convolution layer, and 2 hidden layers with dropout.

Access: Click on the Labs tab on the left side of the LMS panel. Copy or
note the username and password that are generated. Click on the Launch
Lab button. On the page that appears, enter the username and password
in the respective fields, and click Login.
Thank You

