
TensorFlow All-Around

The document outlines an agenda for a TensorFlow All-Around event at SUTD. It includes an introduction to Python from 10:00 am to 12:30 pm, followed by lunch and networking. From 1:30 pm to 3:30 pm there will be introductions to machine learning and TensorFlow 2.0, with a tea break in between. The event will conclude at 6:00 pm.

Uploaded by

Venkat

TensorFlow All-Around @ SUTD

13 September 2019
Outline

10:00 am Introduction to Python
12:30 pm Lunch and Networking
01:30 pm Introduction to Machine Learning
03:30 pm Tea Break
04:00 pm Introduction to TensorFlow 2.0
06:00 pm End

2
Introduction to Python
Instructions
● Please move towards the front/center
● Try to find seats near power plugs
● Try to sit nearer to the aisles
● WIFI: SUTD_Guest
You can find today's material at http://bit.ly/tf_slides.
Pigeonhole (to ask questions): https://pigeonhole.at/TENSORFLOW

3
Part 1
Introduction to Python
10:00 am - 12:30 pm

http://bit.ly/tf_slides

4
Outline

1. Preface
2. Python Basics
3. Python for Scientific Computing
4. Some additional tips (might skip depending on time)

http://bit.ly/tf_slides

5
Preface

Computers are useful but dumb:


● Useful to solve problems
● Dumb because you need to give them instructions ("programming")
● They will execute every instruction given, even bad ones!

6
Preface

● Programming in Python/XXX is just a tool for problem solving


● Programming is not going to be easy
○ Expect things to change all the time

○ Ask questions about things and Google for answers

○ Consistent learning and application is key!

7
Preface

● In this ~2h session, we are going to learn the basics of programming using Python.
● During the "practical" parts, feel free to raise your hand and ask any of
the TAs for help, especially if you run into technical issues
● For conceptual questions, you are encouraged to ask on Pigeonhole:
https://pigeonhole.at/TENSORFLOW
● Slides: http://bit.ly/tf_slides
8
What is Python?

● A programming language known for simplicity, yet
capable of building extremely complicated systems
● Open source project managed by the Python Software
Foundation
● Widely used by many people and companies for a
large variety of tasks, and usage is growing!

9
Python vs other programming languages

[Figure: programming languages placed on a scale from "Easier" to "Harder"]
10
Major Python Users

● Google (various tools and products)


“Python where we can, C++ where we must.” [1, 2]
● Instagram (most of the backend)
“Do the simple thing first.” [1, 2]
● Spotify, Uber, Netflix, Dropbox, and many other companies

11
Python Advantages

● Clear and readable syntax


● Intuitive
● Natural expression
● Powerful
● Popular & Relevant

12
What does it mean to
"know" Python?

13
"Knowing English"

● The elements of English language


● The syntax of English grammar
● How to read sentences
● How to write meaningful sentences
Not trivial!
● How to combine sentences into a coherent passage

14
"Knowing Python"

● The elements of Python language


● The syntax of Python grammar
● How to read Python statements
● How to write meaningful Python statements
Not trivial!
● How to combine statements into a coherent program

15
How to get there?

● Know the elements/syntax/programming structure


○ This "crash course" will attempt to cover most of the basics

● The tools to help you


○ We'll show you some of them

● Consistent learning and application


○ Internet is your best friend :)

16
Tools for Programming in Python

● Simple editor for beginners:
IDLE, Thor
● More complex IDE-like environments:
VS Code, PyCharm (Pro edition free for students)
● Jupyter Notebook

17
Jupyter Notebook

● Web-based development
environment
● File format is the
IPython notebook (.ipynb)

18
Jupyter Notebook

● Combines text, images, code, and
output into a
computational narrative
● Preferred tool for data scientists
to do rapid iteration and share
results

19
Google Colab

Free hosted Jupyter notebooks

https://colab.research.google.com

20
Python Basics

● Statements
● Objects
● Functions
● Loops & Control Flow
● Object-oriented Programming

21
Python Statements

● Instructions that a Python interpreter can execute are called statements.


● Statements are usually contained within one line of code.
● (Refer to Notebook later)
● We'll go through the slides first and come back to the notebook

22
Python Objects

● Almost everything in Python is an Object


● Common types
○ Text Type: str

○ Numeric Types: int, float

○ Sequence Types: list, tuple

○ Mapping Type: dict
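
A small sketch of the common types listed above. The variable names and values are made up for illustration; only the built-in types themselves come from the slide.

```python
# Common built-in types named on the slide (values are illustrative).
name = "Ada"                            # str   (text type)
age = 36                                # int   (numeric type)
height_m = 1.68                         # float (numeric type)
scores = [90, 85, 77]                   # list  (mutable sequence)
point = (3, 4)                          # tuple (immutable sequence)
profile = {"name": name, "age": age}    # dict  (mapping: key -> value)

# type() reports the class of any object.
type_names = [type(v).__name__
              for v in (name, age, height_m, scores, point, profile)]
print(type_names)

# Lists can be changed in place; tuples cannot.
scores.append(100)
try:
    point[0] = 10
except TypeError as err:
    print("tuples are immutable:", err)
```

Everything here is an object, so each value carries its own type and methods (for example `scores.append`).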

23
Python Functions

● A function is a block of code which only runs when it is called.


● You can pass data, known as arguments, into a function.
○ Sometimes arguments are referred to as parameters.

● A function can return data as a result.

24
Python Functions

def function(arg_1, arg_2, arg_3=default_value):
    # arg_1, arg_2: arguments
    # arg_3: argument with default value
    # do something
    return some_data
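
To make the template concrete, here is a minimal runnable sketch. The function name `describe` and its parameters are hypothetical, chosen only to show positional arguments and arguments with default values.

```python
def describe(name, greeting="Hello", punctuation="!"):
    """Return a greeting; `greeting` and `punctuation` have default values."""
    return f"{greeting}, {name}{punctuation}"

default_call = describe("SUTD")                      # uses both defaults
keyword_call = describe("SUTD", greeting="Welcome")  # overrides one default by name
positional_call = describe("SUTD", "Hi", "?")        # overrides defaults by position

print(default_call, keyword_call, positional_call, sep="\n")
```

Arguments without defaults must come first in the definition; arguments with defaults can be skipped or passed by keyword.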

25
Python Control Flow

● If-elif-else
○ Used in combination with logic to determine code path

● Try-except
○ Used to handle errors in code

● Loops
○ For-loop and while-loop
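
The three control-flow tools above can be sketched in a few lines. The helper names `classify` and `safe_divide` are invented for this example.

```python
def classify(value):
    """Pick a code path with if / elif / else."""
    if value < 0:
        return "negative"
    elif value == 0:
        return "zero"
    else:
        return "positive"

def safe_divide(a, b):
    """Handle an error with try / except instead of crashing."""
    try:
        return a / b
    except ZeroDivisionError:
        return None

# for-loop: visit each item of a sequence.
labels = []
for v in (-2, 0, 5):
    labels.append(classify(v))

# while-loop: repeat while a condition holds.
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n -= 1

print(labels, countdown, safe_divide(1, 0))
```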

26
Python Logic

● True and False


● Equivalence operator: ==
○ not True == False

● Used to check if conditions are met. E.g.:


○ 1 == 2 -> False

○ 2 == 2 -> True
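
A short sketch of these comparisons in code. One detail worth noting is an assumption-free fact of Python's grammar: `not` binds more loosely than `==`, so `not True == False` is read as `not (True == False)`.

```python
a = (1 == 2)                 # False
b = (2 == 2)                 # True

c = (not True) == False      # True: `not True` is False, and False == False
d = not True == False        # also True, but parsed as not (True == False)

either = a or b              # True if at least one side is True
both = a and b               # True only if both sides are True

print(a, b, c, d, either, both)
```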

27
Notebook: Python Basics

● (Go to Notebook)
● http://bit.ly/tf_slides -> Slide 28

28
Checkpoint
Next: object oriented programming in Python

Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

29
Object Oriented Programming

● Object-oriented Programming, or OOP for short, is a programming
paradigm
● It provides a means of structuring programs so that properties and
behaviors are bundled into individual objects.

30
Object Oriented Programming

● An object could represent:


○ A type e.g. person

○ With properties (attributes) such as, age, address, etc.

○ With behaviors (methods! aka functions) like walking, talking, breathing, and running.

31
Python Classes

● Classes are used to create new user-defined data structures that
encapsulate attributes and methods.
● In the case of an animal, we could create an Animal() class to track
properties about the Animal like the name and age.
● It may help to think of a class as an idea for how something should be
defined.
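
A minimal sketch of the Animal() class described above. The method names `describe` and `birthday` are hypothetical additions for illustration; the slide only mentions the name and age attributes.

```python
class Animal:
    """A user-defined type bundling attributes (name, age) with methods."""

    def __init__(self, name, age):
        # Attributes: data stored on each individual object.
        self.name = name
        self.age = age

    def describe(self):
        # Method: behavior that uses the object's own attributes.
        return f"{self.name} is {self.age} years old"

    def birthday(self):
        self.age += 1
        return self.age

cat = Animal("Tom", 3)     # the class is the "idea"; `cat` is one concrete object
other = Animal("Jerry", 1)
cat.birthday()
summary = cat.describe()
print(summary)
```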

32
Inheritance

● Inheritance is the process by which one class takes on the attributes and
methods of another.
● Newly formed classes are called child classes, and the classes that child
classes are derived from are called parent classes.
● It’s important to note that child classes override or extend the functionality
(e.g., attributes and behaviors) of parent classes.
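
The override/extend idea can be sketched as follows. The `Dog` child class and its methods are invented for this example.

```python
class Animal:
    """Parent class."""

    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"

    def describe(self):
        return f"{self.name} is an animal"

class Dog(Animal):
    """Child class: inherits everything from Animal."""

    def speak(self):
        # Override: replaces the parent's behavior.
        return f"{self.name} says woof"

    def fetch(self):
        # Extend: adds behavior the parent does not have.
        return f"{self.name} fetches the ball"

dog = Dog("Rex")
print(dog.describe())   # inherited unchanged from Animal
print(dog.speak())      # overridden in Dog
print(dog.fetch())      # only available on Dog
```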

33
Object Oriented Programming

● Notebook
● http://bit.ly/tf_slides -> Slide 34

34
Checkpoint
Next: scientific computing in Python

Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

35
Scientific Computing

● Scientific Computing is one of the most important uses of programming,
and spans many fields

36
Progress is "gated" by computation ability

The angular shape of the Have Blue prototype and production F-117 aircraft is a direct result of the lack
of computational ability to simulate more complex geometry
37
Scientific Computing

● Scientific Computing is one of the most important uses of programming,
and spans many fields
● The use of Python is common in some fields, primarily when machine
learning is involved

38
Climate Modelling (TF/Python @ Exascale)

Exascale Deep Learning for Climate Analytics (Kurth et al., 2018)
https://arxiv.org/abs/1810.01993
https://www.youtube.com/watch?v=e0QK5glozC8

39
Universe n-body simulation

Learning to predict the cosmological structure formation (He et al., 2019)
https://www.pnas.org/content/116/28/13825

40
Scaling in Scientific Computing

In scientific computing, we are often concerned about
● Running problems faster
● Running larger problems
Hence we get this notion of scaling:
● Strong scaling: how to run a fixed-size problem faster with more processors
● Weak scaling: how to run a proportionally larger problem (more tasks) with more processors

41
Python for Scientific Computing

● Arrays in Python
● Google Colab tricks
● Some other widely used libraries:
○ Matplotlib (Plotting)

○ OpenCV (Image Processing)

○ SciPy

42
Arrays

● An array is a data structure consisting of a collection of elements, each
identified by at least one array index or key.
● Python List is somewhat similar conceptually, but is not an array!
● In Python, the most popular array format is the NumPy array.

43
Array vs Python List

44
Numpy (TLDR version)

● Numpy gives you efficient arrays in Python


○ Python lists are poor in terms of memory and speed

● Numpy allows you to write faster code with arrays
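
As a rough illustration of the difference, the sketch below computes the same sum of squares with a Python list and with a NumPy array. The array size is arbitrary, and no timing numbers are claimed here; measuring them (for example with `%timeit` in a notebook) is left to the reader.

```python
import numpy as np

values = list(range(1_000))

# Pure Python: the interpreter loops over boxed int objects one at a time.
squares_list = [v * v for v in values]
total_list = sum(squares_list)

# NumPy: one typed, contiguous buffer; the loop runs in compiled code.
arr = np.arange(1_000, dtype=np.int64)
total_numpy = int(np.sum(arr * arr))

print(total_list, total_numpy)
print("bytes used by the array data:", arr.nbytes)  # 1000 elements * 8 bytes
```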

45
Numpy

Numpy is the fundamental package for scientific computing with Python.


It contains, among other things:

● a powerful N-dimensional array object


● sophisticated (broadcasting) functions
● tools for integrating C/C++ and Fortran code
● useful linear algebra, Fourier transform, and random number capabilities

46
Numpy

The main feature of NumPy is an ndarray object.

● All array elements have to be the same type (usually float or integer);
● Array elements can be accessed, sliced, and manipulated in much the same way
as list elements;
● Arrays can be N-dimensional;
● The number of elements in the array is fixed;
● Shape of the array can be changed.
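
The ndarray properties above can be checked directly. This is a small sketch with made-up data; it also shows broadcasting, mentioned on the previous slide.

```python
import numpy as np

a = np.arange(12, dtype=np.float64)   # 12 elements, all of the same type
first_three = a[:3]                   # slicing works much like a list

grid = a.reshape(3, 4)                # same 12 elements, new 2-D shape
row_means = grid.mean(axis=1)         # one mean per row -> shape (3,)

# Broadcasting: a (3, 4) array minus a (3, 1) array, no explicit loop.
centered = grid - row_means[:, np.newaxis]

# The element count is fixed: 12 elements cannot become a 5 x 5 array.
try:
    a.reshape(5, 5)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(grid.shape, grid.dtype, centered[0], reshape_failed)
```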

47
Brief History of Array Computing in Python

● < 2006: Various incompatible libraries (Numeric, numarray etc.), the "Dark Ages"
● 2006: Numpy & Numpy API. Numpy provides a standard format and API for array computing in Python
● 2008: Pandas, for manipulating numerical tables and time series
● Theano, for manipulating and evaluating mathematical expressions, especially matrix-valued ones
● 2012: Numba, a compiler that translates a subset of Python and NumPy into fast machine code using LLVM
● 2015: Dask, which scales Numpy workflows to multiple cores or machines

48
Numpy

● Python vs Numpy Comparison:
https://colab.research.google.com/drive/1fEGWzxcdWyBNYleyUUX7FT-Mn906oSAw
● Good visual explanations (look at this in your spare time!):
http://jalammar.github.io/visual-numpy/
● Numpy 100 exercises (look at this in your spare time!):
https://github.com/rougier/numpy-100

49
Checkpoint
Next: "extras" - we'll skip if there's no time

Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

50
Google Colab Tricks

● Overview of Colab Features:
https://colab.research.google.com/notebooks/basic_features_overview.ipynb
● Loading Data:
https://colab.research.google.com/notebooks/io.ipynb

51
Plotting with Matplotlib

● Notebook:
https://colab.research.google.com/notebooks/charts.ipynb

52
Images in Python and OpenCV

● Notebook:
https://colab.research.google.com/github/OpenSUTD/machine-learning-workshop/blob/master/labs/Lab%201D%20-%20Images.ipynb

53
SciPy

Python-based ecosystem of open-source software for mathematics, science,
and engineering, with modules for

● statistics, optimization;
● integration, interpolation;
● linear algebra, solving non-linear equations;
● Fourier transforms;
● ODE solvers, special functions;
● signal and image processing.
54
Checkpoint
Next: Introduction to Machine Learning

Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

55
Part 2
Introduction to Machine Learning
01:30 pm - 03:30 pm

Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

56
Preface

● This is a ~2h session that aims to give a crash course on deep learning
basics. Where appropriate, additional links are provided for you to learn
more on your own (highly encouraged!)
● For conceptual questions, you are encouraged to ask on Pigeonhole.
Please do not ask the TAs conceptual questions (they will redirect you to Pigeonhole).
You are unlikely to be the only person with that question, and asking on Pigeonhole
lets everybody benefit and get their question answered.

57
Preface

● Slides: http://bit.ly/tf_slides
● Pigeonhole: https://pigeonhole.at/TENSORFLOW

58
What is Machine Learning?

● Machine learning allows us to solve problems
without explicitly knowing the solution
● "Automate creating rules to solve problems"
● "Data is the new source code"

59
Key reasons to use Machine Learning

● We cannot reasonably create "rules" to solve a problem


● We want to approximate ("curve fit") certain difficult problems

60
What does ML look like
Traditional Programming: INPUT + RULES -> Program / Algorithm -> RESULTS

Machine Learning
[Training] INPUT + RESULTS -> Algorithm -> RULES (store as parameters)
[Inference] INPUT -> Algorithm + rules -> RESULTS

61
What is Machine Learning?

● Interesting slide deck (~2h) by a Googler:
https://docs.google.com/presentation/d/1kSuQyW5DTnkVaZEjGYCkfOxvzCqGEFzWBy4e9Uedd9k/edit
● Serves as a good complement to today's material
● Highly encourage you to set aside ~2h or so, go through the entirety of this,
and Google new concepts and terms that you don't understand!

62
Exponential Growth of ML

● ArXiv papers about ML (2009 - 2017)
● Google codebase (2012 - 2017)
● FLOPS to train a model (2013 - 2019, log scale!)

63
Exponential Growth of ML

● 3× Engineers (Facebook Ranking system)
● 30× Compute Resource Used (Facebook)
● 3× Data Used for ML (Facebook)
https://venturebeat.com/2019/07/11/facebook-vp-ai-has-a-compute-dependency-problem/

64
Machine Learning Algorithms

● Clustering, Curve fitting etc. algorithms


● Many can be found in the Scikit-Learn (CPU) or cuML (GPU) and other
libraries

65
66
Machine Learning Algorithms

● Clustering, Curve fitting etc. algorithms


● Many can be found in the Scikit-Learn (CPU) or cuML (GPU) and other
libraries
● Today, we will focus on a subset of ML known as Deep Learning

67
Deep Learning (DL)

● A subset of ML that deals with deep neural networks


● A key idea in deep learning is allowing the computer to build complex
concepts (representations) out of simpler concepts
● This is also known as learning a hierarchy of representations

68
Deep Learning

● Deep Learning is very hyped now because
we have had great success applying it to many
previously "unsolved problems"
(e.g. ImageNet).
● These problems are mainly perception
problems on unstructured data

69
Deep Learning vs Neuroscience

● "Deep Learning is inspired by how the brain works"


● Deep Learning is very loosely inspired by some theories of how the brain
works, but these have little practical implication on how we use DL
● "Artificial neuron" bears very little similarity to biological neuron

70
Deep Learning vs Neuroscience

71
Applications of Deep Learning

Computer Vision

● Robotics & Autonomous Vehicles


● Intelligent Video Analytics

Natural Language Processing

● Social Media Analytics


● Chatbots & Conversational Agents

72
Autonomous Driving

73
74
Breaking down Deep Learning

Components

● Model ("deep neural network")


● Dataset
● Training procedure (gradient descent)

75
DL Model

● Models are designed to reflect a certain inductive bias (assumption)
○ We'll highlight this for the models we show later

● Many different types of DNN architectures (a popular avenue for exploration)


○ Multilayer Perceptron (MLP) / fully connected network

○ Convolutional Neural Networks (CNN)

○ Recurrent Neural Network (RNN)

○ Transformer

76
DL Model

● Let's walk through a short summary of different types of DNNs

77
Multilayer Perceptron
● A hierarchy of relatively simple models
can model something more complex

$y = \sigma\left(\sum_i w_i x_i + b\right)$
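
The formula above describes a single neuron; stacking layers of them gives a multilayer perceptron. The sketch below is only illustrative: it assumes σ is the logistic sigmoid, and the weights are hand-picked rather than trained.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, assumed here as the activation sigma."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """One neuron: y = sigma(sum_i w_i * x_i + b)."""
    return sigmoid(np.dot(w, x) + b)

def mlp(x, layers):
    """Apply a list of (W, b) layers in order; W has shape (n_out, n_in)."""
    h = x
    for W, b in layers:
        h = sigmoid(W @ h + b)   # every row of W is one neuron
    return h

x = np.array([1.0, 2.0])
w = np.array([0.5, -0.25])
y = neuron(x, w, b=0.0)          # 0.5*1 - 0.25*2 + 0 = 0, and sigmoid(0) = 0.5

# A hierarchy: 2 inputs -> 3 hidden neurons -> 1 output neuron.
layers = [(np.ones((3, 2)), np.zeros(3)),
          (np.ones((1, 3)), np.zeros(1))]
out = mlp(x, layers)
print(y, out)
```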

78
Convolutional Neural Network

● Relative positions of pixels (or any value) of an image (or other inputs)
encode a certain meaning (feature)
○ Relative positions -> certain "formations" of pixels

○ Relative, not absolute! -> translational invariance

● We use layers of convolutional filters to create feature maps from inputs
○ http://setosa.io/ev/image-kernels/
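
A hedged sketch of what one convolutional filter does: it slides a small kernel over the image and records how strongly each patch matches. The function `conv2d_valid` and the tiny edge-detecting kernel are made up for illustration (no padding, stride 1, and no kernel flip, which is how deep learning libraries usually define "convolution").

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Kernel that responds where the right pixel is brighter than the left one.
kernel = np.array([[-1.0, 1.0]])

image = np.zeros((4, 6))
image[:, 3:] = 1.0               # vertical edge between columns 2 and 3

shifted = np.zeros((4, 6))
shifted[:, 4:] = 1.0             # same edge, moved one pixel to the right

fm = conv2d_valid(image, kernel)
fm_shifted = conv2d_valid(shifted, kernel)
print(fm[0], fm_shifted[0])
```

The same filter finds the edge in both images; only the location of the response in the feature map moves, which is the "relative, not absolute" idea from the slide.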

79
Problem
The object is the same, but the
pixels are completely different!

80
Convolutional Neural Network

81
Convolutional Neural Network

● Relative positions of pixels (or any value) encode a certain meaning
● We use convolutional filters to create feature maps from inputs
● https://tensorspace.org/html/playground/alexnet.html
(large file warning)
● In-depth lecture: https://www.youtube.com/watch?v=bNb2fEVKeEo

82
Computer Vision

83
Recurrent Neural Network

● There is a "time dependency" in modelling sequential data (e.g. text).


● Complete information at time step t requires accumulated knowledge from
previous time steps.
● In-depth lecture: https://www.youtube.com/watch?v=iWea12EAu6U
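
The "accumulated knowledge" idea can be sketched with a basic recurrent cell. This assumes the common textbook update h_t = tanh(W_x x_t + W_h h_{t-1} + b); the weights below are hand-picked, not trained, and real RNN layers (e.g. LSTM, GRU) are more elaborate.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a simple RNN over a sequence and return every hidden state."""
    h = np.zeros(W_h.shape[0])           # hidden state starts empty
    states = []
    for x_t in inputs:
        # New state mixes the current input with the previous state.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
        states.append(h)
    return np.stack(states)

W_x = np.array([[1.0], [0.5]])           # input size 1 -> hidden size 2
W_h = np.array([[0.5, 0.0], [0.0, 0.5]])
b = np.zeros(2)

signal = np.array([[1.0], [0.0], [0.0]])   # something happens only at t = 0
silence = np.zeros((3, 1))                 # nothing ever happens

states_signal = rnn_forward(signal, W_x, W_h, b)
states_silence = rnn_forward(silence, W_x, W_h, b)
print(states_signal[-1], states_silence[-1])
```

The last two inputs are identical in both sequences, yet the final hidden states differ, because the state still carries information from the first time step.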

84
Recurrent Neural Network

Outputs

Hidden
State
(changes
over time)

Inputs

85
Transformer

● Dependencies between elements of a sequence are not encoded by
relative position
○ Great to model things like language and music

● Uses primarily self-attention based architecture
● In-depth lecture: https://www.youtube.com/watch?v=5vcj8kSwBCY
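
A minimal sketch of self-attention, assuming the scaled dot-product form from the original Transformer paper. The projection matrices are random stand-ins for learned weights, and multi-head attention and positional encodings are left out.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Every element of the sequence attends to every other element."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise similarity
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                   # 5 sequence elements, 8 features
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))

out, weights = self_attention(X, W_q, W_k, W_v)

# Reordering the inputs only reorders the outputs: the operation itself
# does not depend on where an element sits in the sequence.
perm = [4, 3, 2, 1, 0]
out_perm, _ = self_attention(X[perm], W_q, W_k, W_v)
print(out.shape, weights.shape)
```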

86
Dataset

● The larger, the better. Literally always.


○ Deep Learning Scaling is Predictable, Empirically (Hestness et al., 2017)
https://arxiv.org/abs/1712.00409

● Huge datasets like ImageNet kickstarted the DL revolution

87
Training

Two major components required:

1. Metric to measure performance


2. Optimization algorithm

88
Training Metric

● Value of loss function - quantification of error produced by model


○ We'll see examples of metrics provided by TF during the next segment

● The training algorithm will attempt to minimise this error with respect to
the model parameters

89
Training Algorithm

Core algorithm for DL training


● Backpropagation (computes the gradients) combined with gradient descent (updates the parameters)
● Iteratively change parameters to minimise the loss function
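
Gradient descent can be sketched on a one-parameter model. The toy data, the model y = w * x, and the learning rate are all made up for illustration; here the gradient is written by hand, whereas a deep learning library would obtain it through backpropagation.

```python
def mse_loss(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_grad(w, xs, ys):
    """Derivative of the loss with respect to the parameter w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]        # generated by y = 2x, so the best w is 2

w = 0.0                     # initial guess
learning_rate = 0.05
losses = []
for step in range(50):
    losses.append(mse_loss(w, xs, ys))
    # Move the parameter a small step against the gradient.
    w -= learning_rate * mse_grad(w, xs, ys)

print(w, losses[0], losses[-1])
```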

90
Training Algorithm

● Adaptive vs "vanilla" versions

● Different empirical performance

● All attempt to converge to some good
minima on the loss landscape

91
Training Algorithm (Forward Pass)

Data -> Model -> Prediction
Prediction + Ground Truth -> Loss
92
Training Algorithm (Backward Pass)

Loss -> Model
minimise loss w.r.t. (model parameters)

93
Training Algorithm Visualisation

● https://playground.tensorflow.org/

94
Checkpoint
Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

95
Downsides of Deep Learning

More often than not, in the real world:

● You don't have enough (good) data


● Results are empirical (not theoretical)
○ Not easy to understand why something works, or doesn't work

● You don't have enough compute, so you can't experiment fast enough

96
97
98
99
Hardware Requirements for DL

● Deep Learning involves lots of
parallel computations and matrix multiplications
● This makes it suitable for hardware accelerators
such as GPUs
● There are DL-specific hardware accelerators
starting to appear (e.g. TPU)

100
NVIDIA DGX SuperPOD (#22 on TOP500)

101
Google Cloud TPU v3 Pod

102
Hardware Requirements for DL

● The good news is that for most "simple" tasks, 1 GPU is enough,
especially if you use transfer learning
● Throughout this workshop series, we will be using Google Colab, which
provides a free GPU for up to 12 hours
● Generally, you don't want to run things locally on a laptop,
even a high-end gaming laptop!

103
NVIDIA Tensor Core GPUs
Delivering cutting-edge Deep Learning performance

NVIDIA Tesla T4
Single Precision: 8.1 TFLOPS
Mixed Precision: 65 TFLOPS
300GB/s VRAM (GDDR6)
Ultra-efficient 70W

Free for education/research use on Google Colab

104
NVIDIA Tensor Core GPUs
Delivering cutting-edge Deep Learning performance

NVIDIA Tesla T4
(~ low power RTX 2070)

Can be 25% faster than GTX 1080 Ti in training speed

45% more VRAM

Free for education/research use on Google Colab

105
Tip: Don't be a hero

● Use Google Colab


○ No need to set up environment

○ Powerful GPU/TPU for free (faster than your laptop GPU)

● Simpler is better
○ Keep up to date, but there's no need to chase the latest "state of the art"

○ Start simple, add on complexity one step at a time

106
Checkpoint
Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

107
Part 3
Introduction to TensorFlow 2.0
04:00 pm - 06:00 pm

Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

108
Keras in one slide

● Keras is a super simple Python library for building deep learning models
● Keras provides 99% of the building blocks you need
○ Layers, Activations
○ Optimizers
○ Nice extras
■ Datasets
■ Callbacks

● Keras is widely used in industry and research

109
TensorFlow in one slide

● TensorFlow is a Google "open source" library designed for graph execution
for deep learning research and production at scale
● High level Python API, low level C++ API, everything in between
○ "Next-generation" Swift API in development

● Train models and deploy on a variety of platforms
● High degree of performance optimization from both Google and vendors

110
tf.keras

● TensorFlow is the "most popular" library used for deep learning


○ Most people absolutely hate the API of TensorFlow 1.x

○ But TensorFlow ecosystem for research/development/deployment is good

● For TensorFlow 2.0, the official high-level API is now Keras


This is known as tf.keras,
and we will use this throughout our workshop for all code examples

111
● TensorFlow Extended (TFX)
● TensorFlow Data Validation: Library for exploring and validating machine learning data
● TensorFlow Transform: Library for preprocessing and manipulating data with TensorFlow
● TensorFlow Model Analysis: Evaluate models using a variety of metrics and techniques
● TensorFlow Model Optimization: Suite of tools for optimizing ML models for deployment and execution
● TensorFlow Serving: Scalable, high-performance serving system for models
● TensorFlow Lattice: Components for interpolated look-up tables in TensorFlow
● TensorFlow Federated: Machine learning framework for decentralized data
● TensorFlow Privacy: Training and analysing models with differential privacy
● TensorFlow Probability: Probabilistic reasoning and statistical analysis in TensorFlow
● Mesh TensorFlow: Large-scale distributed model parallelism toolkit for TensorFlow
● TensorFlow Hub: Library for transfer learning by reusing parts of TensorFlow models
● TensorFlow Datasets: Provides various public datasets in the form of tf.data.Datasets
● TensorFlow Agents: Efficient batched reinforcement learning in TensorFlow
● TensorFlow Graphics: Differentiable graphics layers for TensorFlow
● TensorFlow Ranking: Components for Learning-to-Rank (LTR) techniques in TensorFlow
112
Tensor
a multidimensional array (aka ndarray)

Flow
a computational graph

113
import tensorflow as tf

# Load Dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

# Define Model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile into optimized graph
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train Model
model.fit(x_train, y_train, epochs=5)

114
tf.keras

● "Low Bar"
Easy to piece together cutting edge models that work
● "High Ceiling"
Write custom pieces and training loops
Do cutting edge research work
https://www.tensorflow.org/beta/tutorials/quickstart/advanced

115
A bit more about Keras
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

116
Dense layer - a fully connected layer

117
Dense layer - a fully connected layer

activation
function

weights
bias
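
To make "weights" and "bias" concrete, the sketch below counts the trainable parameters of the two Dense layers in the MNIST model from the earlier slide. The helper `dense_params` is made up for this example; the layer sizes (28 x 28 input, 128 and 10 units) come from the model definition.

```python
def dense_params(n_inputs, n_units):
    """A fully connected layer has one weight per (input, unit) pair
    plus one bias per unit."""
    return n_inputs * n_units + n_units

flatten_out = 28 * 28                      # Flatten turns 28 x 28 into 784 values
hidden = dense_params(flatten_out, 128)    # Dense(128)
output = dense_params(128, 10)             # Dense(10)
total = hidden + output                    # Flatten and Dropout add no parameters

print(hidden, output, total)
```

Calling `model.summary()` on the Keras model should report the same per-layer counts.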

118
A bit more about Keras
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

119
Activation Functions
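
For reference, here is a plain-Python sketch of the two activations used in the model above ('relu' and 'softmax'), plus the sigmoid. These are the standard textbook definitions written out by hand, not the Keras implementations.

```python
import math

def relu(z):
    """Rectified linear unit: pass positives through, clamp negatives to 0."""
    return max(0.0, z)

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(zs)                                # subtract max for stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(relu(-3.0), relu(2.5), sigmoid(0.0), probs)
```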

120
A bit more about Keras
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

121
Dropout
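
A rough sketch of what a dropout layer does, written in plain Python. It assumes the "inverted dropout" scheme, where values that survive are scaled up by 1 / (1 - rate) during training and nothing is changed at inference time; the function name and seed are made up for the example.

```python
import random

def dropout(values, rate, training=True, rng=None):
    """Randomly zero a fraction `rate` of the values while training."""
    if not training or rate == 0.0:
        return list(values)                  # inference: pass through unchanged
    rng = rng or random.Random()
    keep_prob = 1.0 - rate
    # Kept values are scaled up so the expected sum stays the same.
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]

activations = [1.0] * 10_000
dropped = dropout(activations, rate=0.2, rng=random.Random(0))
zero_fraction = dropped.count(0.0) / len(dropped)
unchanged = dropout(activations, rate=0.2, training=False)

print(round(zero_fraction, 2), max(dropped))
```

With `Dropout(0.2)` as in the model above, roughly one in five activations is zeroed on each training step, which discourages the network from relying on any single unit.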

122
A bit more about Keras
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

123
Keras Optimizers

● SGD(lr=0.01, momentum=0.0, decay=0.0)


○ Vanilla, usually quite slow, might give the best final result eventually

● Adam(lr=0.001, decay=0.0)
○ Good first choice, just pick this

● RMSprop(lr=0.001, decay=0.0)
○ Maybe good choice for RNNs. Try Adam, and then this, and compare.

124
Keras Optimizers
● Learning Rate (lr)
○ Too small == no fitting; too large == diverge/explode

○ Usually problem specific; Adam is less sensitive to LR

● Decay
○ Shrinks the learning rate after every step (not epoch!!); the classic Keras optimizers use lr / (1 + decay × step)

● Other parameters
○ Usually Optimizers have some other parameters. Unless you know why, there is no real
reason to tweak those. "Sane defaults" are set.
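
To get a feel for how a per-step decay shrinks the learning rate, here is a small sketch. It assumes the time-based schedule lr / (1 + decay * step) applied by the classic Keras optimizers' `decay` argument; the function name and the numbers are illustrative only.

```python
def decayed_lr(initial_lr, decay, step):
    """Learning rate after `step` updates under time-based decay (assumed form)."""
    return initial_lr / (1.0 + decay * step)

no_decay = decayed_lr(0.01, decay=0.0, step=1000)        # stays at 0.01
halved = decayed_lr(0.01, decay=0.001, step=1000)        # 0.01 / 2
schedule = [decayed_lr(0.01, 0.001, s) for s in (0, 100, 1000, 10000)]

print(no_decay, halved, schedule)
```

Because the decay is applied per step, a dataset with many batches per epoch reduces the learning rate much faster than the epoch count alone suggests.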
125
A bit more about Keras
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

126
Loss

Class Predictions Ground Truth

● Loss functions measure the difference between class predictions and the ground
truth values that you supply (supervised learning)
● The smaller the error, the better the prediction

127
Loss

● binary_crossentropy
○ Possible to have multiple correct labels (e.g. tweet is both sad and angry)
Check if each category (individually) is correct

● categorical_crossentropy
○ Only one category is correct (e.g. picture is a cat, not a dog)
Check if the category is correct (only one out of many)

● mean_squared_error
○ Compare input and output for differences (e.g. image reconstruction)
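
The loss functions above can be written out by hand to see what they measure. This is a plain-Python sketch of the standard formulas, not the Keras implementations; the clipping constant `EPS` is an assumption added to avoid log(0). The sparse variant, used in the MNIST model earlier, takes an integer label instead of a one-hot vector.

```python
import math

EPS = 1e-7   # assumed small constant to keep log() away from 0

def clip(p):
    return min(max(p, EPS), 1.0 - EPS)

def binary_crossentropy(y_true, y_pred):
    """Each category is judged on its own (several labels may be 1)."""
    terms = [-(t * math.log(clip(p)) + (1 - t) * math.log(1 - clip(p)))
             for t, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)

def categorical_crossentropy(one_hot, probs):
    """Exactly one category is correct, given as a one-hot vector."""
    return -sum(t * math.log(clip(p)) for t, p in zip(one_hot, probs))

def sparse_categorical_crossentropy(label, probs):
    """Same loss, but the correct category is given as an integer index."""
    return -math.log(clip(probs[label]))

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

probs = [0.1, 0.8, 0.1]
cce = categorical_crossentropy([0, 1, 0], probs)
sparse = sparse_categorical_crossentropy(1, probs)
bce = binary_crossentropy([1, 0], [0.9, 0.1])
mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])
print(cce, sparse, bce, mse)
```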

128
Dataset: MNIST

● Handwritten digits

● 28 x 28, single colour channel

● "Hello world"

● Generally not treated as significant
anymore - simple models can easily solve
this challenge

129
Dataset: CIFAR

● 32 x 32 colour images of real-world
objects

● Much, much harder than MNIST

● Often used as a benchmark for
"okay, this is a model that works"

● Harder CIFAR100 version also exists

130
Checkpoint
Slides: http://bit.ly/tf_slides
Pigeonhole: https://pigeonhole.at/TENSORFLOW

131
Code Walkthrough

1. Colab - Checking for GPU


2. Colab - MNIST
3. Colab - Image Classification
http://bit.ly/tf_slides -> slide 132

132
