Introduction To DL With TensorFlow

The document provides an introduction to deep learning and TensorFlow. It discusses different machine learning algorithms and techniques used in deep learning, such as convolutional neural networks. It also demonstrates deep learning applications such as facial recognition and classifying handwritten digits from the MNIST dataset.

Uploaded by

Upma Gandhi

Introduction to Deep Learning

with TensorFlow
Jian Tao
Spring 2020 HPRC Short Course
03/27/2020
Schedule
● Part I. Deep Learning (70 mins)

● Break (10 mins)

● Part II. Intro to TensorFlow (70 mins)


GitHub Repository for the Webinars
https://github.com/jtao/dswebinar
Jupyter Notebook and JupyterLab

Jupyter Notebook JupyterLab


Google Colaboratory
Search GitHub user: jtao/dswebinar
Part I. Deep Learning
Deep Learning
by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
http://www.deeplearningbook.org/

Animation of Neural Networks


by Grant Sanderson
https://www.3blue1brown.com/
Relationship of AI, ML, and DL
● Artificial Intelligence (AI) is anything about man-made intelligence exhibited by machines.
● Machine Learning (ML) is an approach to achieve AI.
● Deep Learning (DL) is one technique to implement ML.
(Figure: nested circles, with Deep Learning inside Machine Learning inside Artificial Intelligence)
Machine Learning
Traditional Modeling
(Figure: Data + Scientific Model → Computer → Prediction)

Machine Learning (Supervised Learning)
(Figure: Sample Data + Expected Output → Computer → Model; then Data + Model → Computer → Prediction)
Types of ML Algorithms
● Supervised Learning
○ trained with labeled data; including regression and classification problems.
● Unsupervised Learning
○ trained with unlabeled data; clustering and association rule learning problems.
● Reinforcement Learning
○ no training data; stochastic Markov decision process; robotics and self-driving cars.
(Figure: Machine Learning branches into Supervised Learning, Unsupervised Learning, and Reinforcement Learning)
Supervised Learning
When both input variables - X and output variables - Y are known, one can
approximate the mapping function from X to Y.

(Figure: Step 1, Training: Training Data → ML Algorithm → Model. Step 2, Testing: Test Data → Model.)
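
The two steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy data (y ≈ 2x + 1 with small noise), the 40/10 split, and the helper names `fit_line` and `mean_squared_error` are hypothetical and do not come from the slides.

```python
import random

# Hypothetical toy data: y is roughly 2*x + 1 with a little noise.
random.seed(0)
xs = [i / 10 for i in range(50)]
data = [(x, 2.0 * x + 1.0 + random.uniform(-0.1, 0.1)) for x in xs]

# Shuffle, then hold out 10 of the 50 points for testing.
random.shuffle(data)
train, test = data[:40], data[40:]

def fit_line(points):
    """Step 1 (training): least-squares fit of y = a*x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    a = cov / var
    return a, mean_y - a * mean_x

def mean_squared_error(points, a, b):
    """Step 2 (testing): average squared error on points not used for training."""
    return sum((a * x + b - y) ** 2 for x, y in points) / len(points)

a, b = fit_line(train)
test_mse = mean_squared_error(test, a, b)
print(f"a={a:.2f}, b={b:.2f}, test MSE={test_mse:.4f}")
```

The learned mapping is judged only on the held-out test points, which is the purpose of Step 2.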


Unsupervised Learning
Unsupervised learning applies when only the input variables X are known and the training data is neither classified nor labeled. It is usually used for clustering problems.

(Figure: Data → Class 1, Class 2, Class 3)
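
As a rough illustration of clustering unlabeled data into three classes, here is a minimal k-means sketch on scalar values. The slides do not name a clustering algorithm, so k-means, the sample values, and the starting centers are all assumptions made for this example.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on scalars: assign each value to its nearest center, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[k] for k, c in enumerate(clusters)]
    return centers, clusters

# Unlabeled data (hypothetical) and three initial guesses for the centers.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.1, 8.8, 9.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 4.0, 10.0])
print(clusters)  # three groups found without any labels
```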
Reinforcement Learning
When the input variables are only available via interacting with the
environment, reinforcement learning can be used to train an "agent".

(Image Credit: Wikipedia.org) (Image Credit: deeplearning4j.org)


Why Deep Learning?

● Limitations of traditional machine learning algorithms


○ not good at handling high dimensional data.
○ difficult to do feature extraction and object recognition.

● Advantages of deep learning


○ DL is computationally expensive, but it is capable of
handling high dimensional data.
○ feature extraction is done automatically.
What is Deep Learning?
Deep learning is a class of machine learning algorithms that:
● use a cascade of multiple layers of nonlinear processing units
for feature extraction and transformation. Each successive
layer uses the output from the previous layer as input.
● learn in supervised (e.g., classification) and/or unsupervised
(e.g., pattern analysis) manners.
● learn multiple levels of representations that correspond to
different levels of abstraction; the levels form a hierarchy of
concepts.
(Source: Wikipedia)
Artificial Neural Network
Input Hidden Layers Output

(Image Credit: Wikipedia)


Inputs and Outputs
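(Figure: a 256 × 256 matrix is fed into the DL model, which outputs a 4-element vector. A second diagram maps the elements 1 to 6 of a set X onto the elements of a set Y.)

With deep learning, we are searching for a surjective (or onto) function f from a set X to a set Y.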
Learning Principle
(Figure sequence over three slides: inputs x1, x2, ..., xn from the dataset are fed to the network, and the output/prediction is compared with the target output. The error, the difference between the two, is 5, 15, and 2.5 on the successive slides.)
Supervised Deep Learning with Neural Networks
(Figure: Input → Hidden Layers → Output)
From one layer to the next
(Figure: inputs X1, X2, X3 with weights W1, W2, W3 feed the output Y3.)
f is the activation function, Wi is the weight, and bi is the bias.
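
The formula on this slide did not survive extraction. A common form, assumed here rather than taken from the slide, is Y = f(W1·X1 + W2·X2 + W3·X3 + b). The sketch below computes one such output; the sigmoid activation and the numeric values are illustrative assumptions.

```python
import math

def sigmoid(z):
    """One possible activation function f (an assumption; the slide does not pick one)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through the activation f."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

x = [1.0, 2.0, 3.0]     # X1, X2, X3
w = [0.5, -0.25, 0.1]   # W1, W2, W3
b = 0.2                 # bias of this neuron
y = neuron(x, w, b)     # weighted sum is 0.5, so y = sigmoid(0.5)
print(round(y, 4))
```

A full layer repeats this computation once per output neuron, each with its own weights and bias.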
Training - Minimizing the Loss
(Figure: inputs X1, X2, X3 with parameters (W1, b1), (W2, b2), (W3, b3) produce the output Y2, which enters the loss L.)
The loss function L is defined with regard to the weights and biases.
The weight update is computed by moving a step in the opposite direction of the cost gradient.
Iterate until L stops decreasing.
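
The loss formula from this slide was not recoverable, so the sketch below assumes a mean squared error loss on a single linear unit y = w·x + b. The data, learning rate, and stopping tolerance are hypothetical values chosen for illustration.

```python
# Toy data generated from y = 3*x + 1 (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 1.0 for x in xs]

def loss(w, b):
    """Mean squared error L(w, b) over the data set (assumed loss)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradients(w, b):
    """Partial derivatives dL/dw and dL/db of the loss above."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

w, b = 0.0, 0.0
learning_rate = 0.05
history = [loss(w, b)]
for step in range(5000):
    dw, db = gradients(w, b)
    # Move a step in the opposite direction of the gradient.
    w -= learning_rate * dw
    b -= learning_rate * db
    history.append(loss(w, b))
    # Stop once L has effectively stopped decreasing.
    if history[-2] - history[-1] < 1e-12:
        break

print(f"w={w:.3f}, b={b:.3f}, final loss={history[-1]:.2e}")
```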


Convolution in 2D

(Image Credit: Applied Deep Learning | Arden Dertat)


Convolution Kernel

(Image Credit: Applied Deep Learning | Arden Dertat)


Convolution on Image

Image Credit: Deep Learning Methods for Vision | CVPR 2012 Tutorial
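
To make the sliding-kernel picture concrete, here is a minimal pure-Python sketch of a 2D convolution with no padding and stride 1. The 5×5 image and 3×3 kernel are illustrative values, not data recovered from the slides. As in most CNN libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def conv2d(image, kernel):
    """'Valid' 2D convolution as used in CNNs (no padding, stride 1, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```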
Activation Functions

Image Credit: towardsdatascience.com


Introducing Non Linearity (ReLU)

Image Credit: Deep Learning Methods for Vision | CVPR 2012 Tutorial
Max Pooling

(Image Credit: Applied Deep Learning | Arden Dertat)


Pooling - Max-Pooling and Sum-Pooling

Image Credit: Deep Learning Methods for Vision | CVPR 2012 Tutorial
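
Max-pooling and sum-pooling differ only in how each window is reduced, which the following sketch shows with non-overlapping 2×2 windows. The helper name `pool2d` and the sample feature map are assumptions made for illustration.

```python
def pool2d(feature_map, size=2, reduce=max):
    """Non-overlapping pooling: apply `reduce` to each size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            reduce(feature_map[i + di][j + dj]
                   for di in range(size) for dj in range(size))
            for j in range(0, cols - size + 1, size)
        ]
        for i in range(0, rows - size + 1, size)
    ]

feature_map = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
max_pooled = pool2d(feature_map, reduce=max)  # [[6, 8], [3, 4]]
sum_pooled = pool2d(feature_map, reduce=sum)  # [[13, 21], [8, 8]]
print(max_pooled, sum_pooled)
```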
CNN Implementation - Drop Out
Dropout is used to prevent overfitting. A neuron is temporarily
“dropped” or disabled with probability P during training.

(Image Credit: Applied Deep Learning | Arden Dertat)
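
A minimal sketch of the idea, assuming the common "inverted dropout" variant in which surviving activations are rescaled by 1/(1 − P) so their expected value is unchanged. The rescaling is an assumption on top of what the slide states; the function name and sample values are hypothetical.

```python
import random

def dropout(activations, p, training=True, rng=random):
    """Zero each activation with probability p during training; rescale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return list(activations)  # dropout is disabled at inference time
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]

rng = random.Random(42)  # seeded generator for a repeatable demo
activations = [1.0] * 10000
dropped = dropout(activations, p=0.5, rng=rng)
fraction_zeroed = dropped.count(0.0) / len(dropped)
print(f"fraction dropped: {fraction_zeroed:.3f}")  # close to P = 0.5
```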


CNN Implementation - Data Augmentation (DA)

DA helps to populate artificial training instances from the existing training data sets.

(Image Credit: Applied Deep Learning | Arden Dertat)
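
As a small illustration, the sketch below derives extra training instances from one image using flips and a 90-degree rotation. The slides do not list which transformations are used, so these three are assumptions, and the 2×2 "image" is a toy stand-in.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in image[::-1]]

def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def augment(image):
    """Return the original image plus three transformed copies."""
    return [image, flip_horizontal(image), flip_vertical(image), rotate_90(image)]

image = [
    [1, 2],
    [3, 4],
]
augmented = augment(image)
print(len(augmented), "training instances from 1 original")
```

Each transformed copy keeps the label of the original image, so the training set grows without new labeling work.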


Convolutional Neural Networks
A convolutional neural network (CNN, or ConvNet) is a class of deep, feed-forward
artificial neural networks that explicitly assumes that the inputs are images, which allows
us to encode certain properties into the architecture.

(Image Credit: https://becominghuman.ai)


Deep Learning for Facial Recognition

(Image Credit: www.edureka.co)


MNIST - Introduction
● MNIST (Modified National Institute of Standards and Technology) is a database of handwritten digits, distributed by Yann LeCun.
● A training set of 60,000 examples, and a test set of 10,000 examples.
● 28x28 pixels each.
● Widely used for research and
educational purposes.
(Image Credit: Wikipedia)
MNIST - CNN Visualization

(Image Credit: http://scs.ryerson.ca/~aharley/vis/)


Part II. Introduction to TensorFlow

TensorFlow Official Website


http://www.tensorflow.org

A Brief History of TensorFlow
TensorFlow is an end-to-end FOSS (free and open source software) library for dataflow and differentiable programming. TensorFlow is one of the most popular programming frameworks for building machine learning applications.
● Google Brain built DistBelief in 2011 for internal usage.
● TensorFlow 1.0.0 was released on Feb 11, 2017.
● TensorFlow 2.0 was announced in Jan 2019 and released in Sep 2019.
TensorFlow, Keras, and PyTorch

● TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem to build and deploy ML powered applications.
● Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation.
● PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment.
Google Trends for Popular ML Frameworks

(Image Credit: https://trends.google.com/)


Programming Environment
In TF 2.0, tf.keras is the recommended high-level API.

(Image Credit: tensorflow.org)


A Connected Pipeline for the Flow of Tensors

(Image Credit: Plumber Game by Mobiloids)


What is a Tensor in TensorFlow?

● TensorFlow uses a tensor data structure to represent all data. Think of a TensorFlow tensor as an n-dimensional array or list. A tensor has a static type, a rank, and a shape.

Name     Rank   Tensor
Scalar   0      5
Vector   1      [1 2 3]
Matrix   2      [[1 2 3 4],
                 [5 6 7 8]]
Tensor   3      ...
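
The rank, shape, and type of the tensors in the table can be inspected directly. This is a small sketch that assumes TensorFlow 2.x is installed; the variable names are illustrative.

```python
import tensorflow as tf

scalar = tf.constant(5)                             # rank 0
vector = tf.constant([1, 2, 3])                     # rank 1
matrix = tf.constant([[1, 2, 3, 4], [5, 6, 7, 8]])  # rank 2

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    # shape.rank is the number of dimensions; as_list() gives the size of each one.
    print(name, "rank:", t.shape.rank, "shape:", t.shape.as_list(), "dtype:", t.dtype.name)
```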
TensorFlow Data Types

Basic TensorFlow data types include:


● int[8|16|32|64], float[16|32|64], double
● bool
● string

With tf.cast(), the data type of a tensor or variable can be converted.
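
A short sketch of tf.cast(), assuming TensorFlow 2.x. Note that casting from float to integer drops the fractional part, so information can be lost.

```python
import tensorflow as tf

x = tf.constant([1.7, -2.3, 3.9])             # float32 by default
as_int = tf.cast(x, tf.int32)                 # fractional part is dropped
as_double = tf.cast(as_int, tf.float64)       # widen back to a floating-point type

tf.print(as_int)      # [1 -2 3]
print(as_double.dtype.name)
```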
Hello World with TensorFlow

import tensorflow as tf

v = tf.constant("Hello World!")

tf.print(v)
TensorFlow Constants
TensorFlow provides several operations to generate constant tensors.

import tensorflow as tf

x = tf.constant(1, tf.int32)
zeros = tf.zeros([2, 3], tf.int32)
ones = tf.ones([2, 3], tf.int32)
y = x *(zeros + ones + ones)

tf.print(y)
TensorFlow Variables
TensorFlow variables can represent shared, persistent state manipulated by
your program. Weights and biases are usually stored in variables.

import tensorflow as tf

W = tf.Variable(tf.random.normal([2, 2], stddev=0.1), name="W")
b = tf.Variable(tf.zeros(shape=(2,)), name="b")
Machine Learning Workflow with tf.keras

Step 1. Prepare Train Data: The preprocessed data set needs to be shuffled and split into training and testing data.
Step 2. Define Model: A model could be defined with the tf.keras Sequential model for a linear stack of layers or the tf.keras functional API for complex networks.
Step 3. Training Configuration: The configuration of the training process requires the specification of an optimizer, a loss function, and a list of metrics.
Step 4. Train Model: The training begins by calling the fit function. The number of epochs and batch size need to be set. The measurement metrics need to be evaluated.
tf.keras Built-in Datasets
● tf.keras provides many popular reference datasets that could be used
for demonstrating and testing deep neural network models. To name a
few,
○ Boston Housing (regression)
○ CIFAR100 (classification of 100 image labels)
○ MNIST (classification of 10 digits)
○ Fashion-MNIST (classification of 10 fashion categories)
○ Reuters News (multiclass text classification)

● The built-in datasets could be easily read in for training purpose. E.g.,

from tensorflow.keras.datasets import boston_housing


(x_train, y_train), (x_test, y_test) = boston_housing.load_data()
Prepare Datasets for tf.keras
In order to train a deep neural network model with Keras, the input data sets need to be cleaned, balanced, transformed, scaled, and split.
● Balance the classes. Unbalanced classes will interfere with training.
● Transform the categorical variables into one-hot encoded variables.
● Extract the X (variables) and y (targets) values for the training and testing datasets.
● Scale/normalize the variables.
● Shuffle and split the dataset into training and testing datasets.

One-hot encoding
Dog  Cat  Horse
1    0    0
0    1    0
0    0    1

Numerical encoding
Dog  Cat  Horse
1    2    3
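
Three of the preparation steps (one-hot encoding, scaling, and shuffle/split) can be sketched in plain Python as follows. The helper names, the sample heights, and the 25% test fraction are hypothetical; in practice a library such as NumPy or scikit-learn would usually do this work.

```python
import random

def one_hot(labels):
    """Map each categorical label to a one-hot vector (classes in order of first appearance)."""
    classes = list(dict.fromkeys(labels))
    index = {c: i for i, c in enumerate(classes)}
    return classes, [[1 if index[l] == i else 0 for i in range(len(classes))] for l in labels]

def min_max_scale(column):
    """Scale a numeric column linearly into the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

def shuffle_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows, then split them into (training, testing) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

labels = ["Dog", "Cat", "Horse", "Cat"]
heights = [150.0, 160.0, 170.0, 190.0]  # a made-up numeric feature

classes, y = one_hot(labels)            # Dog -> [1, 0, 0], Cat -> [0, 1, 0], Horse -> [0, 0, 1]
x = min_max_scale(heights)              # [0.0, 0.25, 0.5, 1.0]
train, test = shuffle_split(list(zip(x, y)))
print(classes, y, x, len(train), len(test))
```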
Create a tf.keras Model
● Layers are the fundamental building blocks of tf.keras models.
● The Sequential model is a linear stack of layers.
● A Sequential model can be created by passing a list of layer instances to the constructor, or built up with the .add() method.
● The input shape/dimension of the first layer needs to be set.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation

model = Sequential([
    Dense(64, activation='relu', input_dim=20),
    Dense(10, activation='softmax')
])

(Figure: Input → Hidden Layers → Output)
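
The same two-layer network can also be built incrementally with the .add() method. This sketch assumes TensorFlow 2.x and declares the input shape with tf.keras.Input instead of input_dim; that is a stylistic choice made here, not something the slide prescribes.

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(tf.keras.Input(shape=(20,)))      # 20 input features
model.add(Dense(64, activation="relu"))     # hidden layer: 20*64 + 64 = 1344 parameters
model.add(Dense(10, activation="softmax"))  # output layer: 64*10 + 10 = 650 parameters

print(model.count_params())  # 1994
```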
Compile a tf.keras Model
The compile method of a Keras model configures the learning
process before the model is trained. The following 3 arguments need
to be set (the optimizer and loss function are required).
● An optimizer: Adam, AdaGrad, SGD, RMSprop, etc.
● A loss function: mean_squared_error, mean_absolute_error,
mean_squared_logarithmic_error, categorical_crossentropy,
kullback_leibler_divergence, etc.
● A list of measurement metrics: accuracy, binary_accuracy,
categorical_accuracy, etc.
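
A minimal compile() call for a 10-class classifier might look like the sketch below, assuming TensorFlow 2.x. The particular choices (Adam, categorical cross-entropy, accuracy) are one reasonable combination from the lists above, not a recommendation from the slides.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Configure the learning process before training.
model.compile(
    optimizer="adam",                  # an optimizer
    loss="categorical_crossentropy",   # a loss function (expects one-hot targets)
    metrics=["accuracy"],              # a list of measurement metrics
)
print(type(model.optimizer).__name__)
```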
Train and Evaluate a tf.keras Model
tf.keras models are trained on NumPy arrays of input data and labels.
● The training is done with the fit() function of the model class. In the fit function, the following two hyperparameters can be set:
○ number of epochs
○ batch size
● evaluate() function returns the loss value & metrics values for the model in test mode.
● summary() function prints out the network architecture.

Model: "sequential_1"
_______________________________________________
Layer (type)        Output Shape        Param #
=============================================
dense_11 (Dense)    (None, 64)          1344
_______________________________________________
dense_12 (Dense)    (None, 10)          650
=============================================
Total params: 1,994
Trainable params: 1,994
Non-trainable params: 0
_______________________________________________
None
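
The sketch below runs fit() and evaluate() end to end, assuming TensorFlow 2.x and NumPy. The data is random noise generated only to exercise the API, so the reported accuracy is meaningless; the array sizes, epoch count, and batch size are arbitrary.

```python
import numpy as np
import tensorflow as tf

# Hypothetical random data: 20 features per sample, 10 one-hot encoded classes.
rng = np.random.default_rng(0)
x_train = rng.random((200, 20)).astype("float32")
y_train = tf.keras.utils.to_categorical(rng.integers(0, 10, size=200), num_classes=10)
x_test = rng.random((50, 20)).astype("float32")
y_test = tf.keras.utils.to_categorical(rng.integers(0, 10, size=50), num_classes=10)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Train: the number of epochs and the batch size are the two hyperparameters set here.
history = model.fit(x_train, y_train, epochs=2, batch_size=32, verbose=0)

# Evaluate in test mode: returns the loss followed by the compiled metrics.
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"test loss={test_loss:.3f}, test accuracy={test_accuracy:.3f}")

model.summary()  # prints the network architecture
```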
Make Predictions and More
After the model is trained,
● predict() function of the model class could be used to
generate output predictions for the input samples.
● get_weights() function returns a list of all weight tensors in
the model, as NumPy arrays.
● to_json() returns a representation of the model as a JSON
string. Note that the representation does not include the
weights, only the architecture.
● save_weights(filepath) saves the weights of the model as an
HDF5 file.
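
Three of these calls are sketched below on an untrained model, assuming TensorFlow 2.x and NumPy. Because the model is untrained and the inputs are random, the predictions only demonstrate shapes and types.

```python
import json

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# predict(): one row of 10 class probabilities per input sample.
samples = np.random.default_rng(0).random((3, 20)).astype("float32")
probabilities = model.predict(samples, verbose=0)

# get_weights(): kernel and bias arrays for each Dense layer, as NumPy arrays.
weight_shapes = [w.shape for w in model.get_weights()]

# to_json(): architecture only, no weights.
architecture = json.loads(model.to_json())

print(probabilities.shape, weight_shapes, type(architecture).__name__)
```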
Hands-on Session #1
Getting Started with TensorFlow
Hands-on Session #2
Classify Handwritten Digits with
TensorFlow