
Building Models with Keras
UNIT-II
Keras
• A Python package (Python 2.7-3.6)
• Sits on top of TensorFlow or Theano (Theano development has stopped)
• High-level neural network API
• Runs seamlessly on CPU and GPU
• Open source with a user manual (https://keras.io/)
• Fewer lines of code required to build/run a model
TensorFlow
• Inherits the data flow graph concept from Theano
• A Python (3.5-3.7) package / C++ library
• Runs on CPU or NVIDIA CUDA GPU
• End-to-end platform for machine/deep learning
• Multi-platform (desktop, web via TF.js, mobile via TF Lite)
• Open source with a user manual (https://www.tensorflow.org/)
• More lines of code required to build/run a model
NVIDIA CUDA Toolkit
• C/C++ library
• A parallel computing platform for NVIDIA GPUs
• Most deep learning researchers rely on GPU-accelerated computing/applications
• Not open source (https://developer.nvidia.com/cuda-zone)
• CPU vs GPU comparison: TensorFlow training a CNN model on CIFAR-10 images
Anaconda3 Installation
• Download (https://www.anaconda.com/distribution/)
• Installation (https://docs.anaconda.com/anaconda/install/)
• Restart required
TensorFlow/Keras Installation
• Start the Anaconda Navigator
  • Windows: Start -> All Programs -> Anaconda3 -> Anaconda Navigator
  • Linux: type "anaconda-navigator" in the Linux terminal
• Install TensorFlow and Keras
  • Environments -> choose All
  • type "tensorflow"
  • CPU based: tensorflow (choose 1.14) and keras (2.2.4), then apply
  • GPU based:
    • CUDA Compute Capability >= 3.0, preferably >= 3.7
    • tensorflow-gpu (choose 1.14) and keras-gpu (2.2.4), then apply
Installation Confirmed
• TensorFlow test code:

import tensorflow as tf

# build a tiny graph and run it in a session (TF 1.x-style API)
sess = tf.compat.v1.Session()
a = tf.compat.v1.constant(1)
b = tf.compat.v1.constant(2)
print(sess.run(a + b))

• Expect to get the answer 3
Installation Confirmed
• Keras requires a backend setting for Windows users:
  • https://keras.io/backend/
• Setting in keras.json:
  "backend": "tensorflow"
• Keras test code:

import keras

• Expect to see:
Using TensorFlow backend
Keras Models
• Two main types of models are available
  • The Sequential model (easy to learn, high-level API)
    • A linear stack of layers
    • You need to specify the input shape it should expect (input dimension)
    • https://keras.io/getting-started/sequential-model-guide/
  • The Model class used with the functional API (similar to TensorFlow 2.0)
    • https://keras.io/models/about-keras-models/
    • https://keras.io/getting-started/functional-api-guide/
Keras Sequential Model
• Define a sequential model:

model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))

• Compilation:

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

• Training:

model.fit(data, one_hot_labels, epochs=10, batch_size=32)

• Prediction:

Y = model.predict(X)
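A minimal end-to-end sketch of the same model on random stand-in data (the array shapes and the choice of categorical_crossentropy for the 10-way softmax output are illustrative assumptions, not from the original slide):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation

# random stand-in data: 1000 samples with 784 features, 10 one-hot classes
data = np.random.random((1000, 784))
one_hot_labels = np.eye(10)[np.random.randint(0, 10, size=1000)]

model = Sequential()
model.add(Dense(32, input_dim=784))   # first layer declares the input dimension
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, one_hot_labels, epochs=10, batch_size=32)
Y = model.predict(data)               # class probabilities, shape (1000, 10)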
Keras: Layers
• Input:
  input_img = Input(shape=(rows, cols, channels))
• Dense:
  x = Dense(num_of_units, activation='activation_function')
• Conv2D:
  x = Conv2D(num_of_filters, kernel_size, strides,
             activation='activation_function', padding='type_of_padding')
• MaxPool2D:
  x = MaxPool2D(pool_size)
• Flatten:
  x = Flatten()
• Dropout:
  x = Dropout(value_of_dropout)
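A short sketch wiring the layers above into a model with the functional API (the 28x28x1 input shape and filter counts are illustrative assumptions):

from keras.models import Model
from keras.layers import Input, Conv2D, MaxPool2D, Flatten, Dropout, Dense

input_img = Input(shape=(28, 28, 1))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPool2D((2, 2))(x)
x = Flatten()(x)
x = Dropout(0.5)(x)
x = Dense(10, activation='softmax')(x)
model = Model(inputs=input_img, outputs=x)   # maps input_img to the output tensor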
Keras: Optimizers
• SGD
• RMSProp
• AdaGrad
• Adam
• …
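Each optimizer can be passed to compile() either as a string (default settings) or as a configured object. A sketch, assuming model is an already-built Keras model (in Keras 2.2.4 the learning-rate argument is lr):

from keras.optimizers import SGD, RMSprop, Adam

opt = SGD(lr=0.01, momentum=0.9)   # an explicit object gives control of the hyperparameters
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
# equivalently, with default settings: model.compile(optimizer='adam', ...)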
Keras: Activation Functions
• Sigmoid
• Tanh
• ReLU
• LeakyReLU
• ELU
• Softmax
• …
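Sigmoid, tanh, relu, and softmax can be given as activation strings, but the parameterized activations (LeakyReLU, PReLU, ELU) are also available as layers. A sketch with an assumed layer width:

from keras.models import Sequential
from keras.layers import Dense, LeakyReLU

model = Sequential()
model.add(Dense(64, input_dim=100))   # no activation set here
model.add(LeakyReLU(alpha=0.3))       # applied as a separate layer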
Keras: Cost Functions
• Mean Squared Error ('mse')
• Binary Cross-Entropy ('binary_crossentropy')
• Kullback-Leibler Divergence ('kullback_leibler_divergence')
• …
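A rough guide to matching the loss string to the task, assuming model is already defined (the pairings below are conventional choices, not from the slide):

model.compile(optimizer='adam', loss='mse')                  # regression
model.compile(optimizer='adam', loss='binary_crossentropy')  # binary / multi-label targets
model.compile(optimizer='adam', loss='kullback_leibler_divergence')  # comparing distributions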
Keras: Defining the architecture
There are two ways to define the architecture: the Sequential model and the functional API (the original slides showed side-by-side code for each; a sketch follows).
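A hedged sketch of the same two-layer network defined both ways (the 100-dimensional input and layer sizes are illustrative):

from keras.models import Sequential, Model
from keras.layers import Dense, Input

# 1) Sequential: a linear stack of layers
seq = Sequential()
seq.add(Dense(64, activation='relu', input_dim=100))
seq.add(Dense(10, activation='softmax'))

# 2) Functional API: explicitly connect tensors
inp = Input(shape=(100,))
h = Dense(64, activation='relu')(inp)
out = Dense(10, activation='softmax')(h)
func = Model(inputs=inp, outputs=out)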
Exercises
• Exercise 1:
  Define the network architecture following the LeNet-5 model (a sketch is given below).
• Exercise 2:
  Evaluate the network performance in terms of accuracy with respect to changes in:
  1. Learning rate: 0.1 and 0.001
  2. Activation functions: ReLU and Sigmoid
  3. Dropout values: 0.25, 0.5, and 0.75
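A minimal sketch of one common LeNet-5 variant for Exercise 1, assuming 32x32 grayscale inputs such as padded MNIST (the classic network uses tanh activations and average pooling; max pooling is a frequent modern substitution):

from keras.models import Sequential
from keras.layers import Conv2D, AveragePooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(6, (5, 5), activation='tanh', input_shape=(32, 32, 1)))  # C1
model.add(AveragePooling2D((2, 2)))                                       # S2
model.add(Conv2D(16, (5, 5), activation='tanh'))                          # C3
model.add(AveragePooling2D((2, 2)))                                       # S4
model.add(Flatten())
model.add(Dense(120, activation='tanh'))                                  # C5
model.add(Dense(84, activation='tanh'))                                   # F6
model.add(Dense(10, activation='softmax'))                                # output
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])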
Layers: activation options and weight penalties
• linear
• sigmoid
• tanh
• relu
• PReLU
• LeakyReLU
• SReLU
• L1 weight penalty
• L2 weight penalty
Convolution arithmetic animations: https://github.com/vdumoulin/conv_arithmetic
• border_mode = 'valid', no strides: subsample=(1,1)
• border_mode = 'same', no strides: subsample=(1,1)
• border_mode = 'valid', 2x2 strides: subsample=(2,2)
• border_mode = 'same', 2x2 strides: subsample=(2,2)
(border_mode and subsample are Keras 1.x argument names; in Keras 2 they are padding and strides.)
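A small sketch showing how padding and strides change the output shape (filter counts and input shape are illustrative):

from keras.models import Sequential
from keras.layers import Conv2D

m = Sequential()
m.add(Conv2D(1, (3, 3), padding='valid', strides=(1, 1),
             input_shape=(5, 5, 1)))                       # 5x5 -> 3x3 (no padding)
m.add(Conv2D(1, (3, 3), padding='same', strides=(2, 2)))   # 3x3 -> 2x2 (padded, stride 2)
m.summary()                                                # prints each layer's output shape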
Initializations
• For a discussion of weight initializations, see the Keras documentation; a usage sketch follows.
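A short sketch of setting a layer's weight initializer explicitly (the layer size, Glorot-uniform choice, and seed are illustrative):

from keras.layers import Dense
from keras import initializers

layer = Dense(64,
              kernel_initializer=initializers.glorot_uniform(seed=42),
              bias_initializer='zeros')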
Optimizers
• SGD
• RMSProp
• Adam
• Tune the learning rate!
Metrics and history
• Use "metrics" in compile() to specify what you want recorded in the training history
• It is up to you to save it!
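A sketch of capturing and saving the history, assuming a compiled model and training arrays x_train/y_train (in Keras 2.2.4 the accuracy key is 'acc'):

import json

history = model.fit(x_train, y_train, epochs=10, batch_size=32,
                    validation_split=0.2)
print(history.history['acc'])        # per-epoch training accuracy
print(history.history['val_loss'])   # per-epoch validation loss

# fit() does not persist the history; save it yourself
with open('history.json', 'w') as f:
    json.dump({k: [float(v) for v in vals]      # cast numpy floats for JSON
               for k, vals in history.history.items()}, f)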
Saving and loading weights
Saving and loading a model
Loading a pre-trained model: first and second approaches
(The original slides showed code for each case; a hedged sketch follows.)
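A minimal sketch covering the cases above, assuming model is a compiled Keras model and using illustrative file names:

from keras.models import load_model

# weights only: the architecture must be rebuilt in code before loading
model.save_weights('weights.h5')
model.load_weights('weights.h5')

# whole model: architecture + weights + optimizer state in one HDF5 file
model.save('model.h5')
restored = load_model('model.h5')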
Challenges in Tuning Hyperparameters
• Solving a problem with deep learning often follows a pipeline that includes feature engineering, model selection, training by tuning hyperparameters, and validation.
• Hyperparameters (HPs) can be divided into two categories:
  • Training-related: learning rate, batch size, dropout rate, and epoch count
  • Model-design-related: model structure, regularization, and activation functions
(Source: N. Shawki, "On Automating Hyperparameter Optimization for Deep Learning Applications")


• Due to the number of hyperparameters involved, it is nearly impossible to explore all possible combinations.

• Autotuning is an active research area that uses automated search techniques to find an optimal solution.

• A few popular autotuning algorithms are Grid Search, Random Search, Bayesian Optimization, and Gradient-based Optimization.

• Keras Tuner uses random search for finding a generalized solution.

https://keras.io/keras_tuner/

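A minimal random-search sketch with Keras Tuner (note: the modern keras_tuner package targets tf.keras rather than the standalone Keras 2.2.4 used earlier; the search ranges, the data arrays x_train/y_train, and the trial count are illustrative assumptions):

import keras_tuner as kt
from tensorflow import keras

def build_model(hp):
    # the tuner calls this once per trial with a fresh set of hyperparameters
    model = keras.Sequential()
    model.add(keras.layers.Flatten(input_shape=(28, 28)))
    model.add(keras.layers.Dense(hp.Int('units', min_value=32, max_value=256, step=32),
                                 activation='relu'))
    model.add(keras.layers.Dropout(hp.Choice('dropout', [0.25, 0.5])))
    model.add(keras.layers.Dense(10, activation='softmax'))
    model.compile(optimizer=keras.optimizers.Adam(hp.Choice('lr', [1e-2, 1e-3])),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

tuner = kt.RandomSearch(build_model, objective='val_accuracy', max_trials=10)
tuner.search(x_train, y_train, epochs=5, validation_split=0.2)
best_model = tuner.get_best_models(num_models=1)[0]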


Hyperparameter Tuning
• The process of selecting hyperparameters is a complex optimization problem.

• Grid search of the hyperparameter space is a popular method that is simple to implement and parallelize, and provides insight into the search space.

• Ongoing research suggests that automated random search optimization is a more efficient alternative that often yields models as good as or better than manual methods, due to its ability to search larger configuration spaces.
• The problem of hyperparameter tuning ($\lambda^{*}$) can be expressed as:

$\lambda^{*} = \operatorname*{argmin}_{\lambda \in \Lambda} \Psi(\lambda) = \operatorname*{argmin}_{\lambda \in \Lambda} \mathbb{E}_{x \sim G}\left[ L(x; A_{\lambda}) \right]$

where $\lambda$ = the hyperparameters, $A$ = the learning algorithm, $\Lambda$ = the search space, $\Psi$ = the hyperparameter response function, $L$ = the loss function, and $G$ = the ground-truth distribution.


DGX A100 OVERVIEW
Presenters: Charlie Boyle, Chris Lamb, Rajeev Jayavant



SOLVING THE INFLEXIBILITY OF AI INFRASTRUCTURE
Not Optimized, Complex to Manage, Difficult to Scale Predictably
(Original diagram: separate TRAINING, ANALYTICS, and INFERENCE clusters)
• Inflexible infrastructure silos that were never meant for the pace of AI
• Constrained workload placement due to system-level characteristics
• Non-uniform performance across the data center
• Unable to adapt to dynamic workload demands
• Constrained capacity planning
https://www.youtube.com/watch?v=MY7jZGZw9vA
https://www.youtube.com/watch?v=ZevjEbu8N3E


DGX A100: THE UNIVERSAL AI SYSTEM
• One system for every AI workload: performance meets utility – analytics, AI training, and inference all in one
• Integrated access to unmatched AI expertise: fast-track AI transformation with DGXpert know-how and experience
• Game-changing performance for innovators: fastest time-to-solution with the world's first 5 PFLOPS AI system, built on NVIDIA A100
• Unmatched data center scalability: build leadership-class infrastructure that scales to keep ahead of demand


ONE SYSTEM FOR ALL AI INFRASTRUCTURE
AI Infrastructure Re-Imagined, Optimized, and Ready for Enterprise AI-at-Scale
• Flexible AI infrastructure that adapts to the pace of the enterprise
• One universal building block for the AI data center
• Uniform, consistent performance across the data center
• Any workload on any node – any time
• Limitless capacity planning with predictably great performance at scale
Analytics  Training  Inference: any job | any size | any node | anytime


GAME-CHANGING PERFORMANCE FOR INNOVATORS
• 9x Mellanox ConnectX-6 VPI HDR InfiniBand/200Gb Ethernet
  • 450GB/sec bi-directional bandwidth
• Dual 64-core AMD CPUs and 1TB system memory
  • 3.2x more cores to power the most intensive AI jobs
• 8x NVIDIA A100 GPUs with 320GB total GPU memory
  • 12 NVLinks/GPU, 600GB/sec GPU-to-GPU bi-directional bandwidth
• 6x second-generation NVSwitches
  • 4.8TB/sec bi-directional bandwidth, 2x more than the previous-generation NVSwitch
• 15TB Gen4 NVMe SSDs
  • 25GB/sec peak bandwidth, 2x faster than Gen3 NVMe SSDs


DGX A100 PERFORMANCE
(Original slide: three bar charts)
• Training (NLP: BERT-Large): 216 sequences/s on V100 (FP32) vs. 1289 sequences/s on DGX A100 (TF32) – 6x
• Inference (peak compute): 58 TOPS on a CPU server vs. 10 PetaOPS on DGX A100 – 172x
• Analytics (PageRank): 3000x CPU servers vs. a cluster of 4x DGX A100 – 688B graph edges/s
Footnotes: BERT pre-training throughput using PyTorch, (2/3) Phase 1 (seq len 128) and (1/3) Phase 2 (seq len 512); V100: DGX-1 with 8x V100 using FP32; DGX A100: 8x A100 using TF32. CPU server: 2x Intel Platinum 8280 using INT8; DGX A100 inference: 8x A100 using INT8 with structural sparsity. PageRank: published Common Crawl data set, 128B edges, 2.6TB graph.
NEW FEATURES
DGX A100: NEW A100 GPUS AND 2X FASTER NVSWITCH
5 PetaFLOPS AI Performance
• Eight new A100 Tensor Core GPUs with 320GB total HBM2
• Twelve NVLinks per GPU, 2x more than V100
  • 600GB/s bi-directional bandwidth between any GPU pair
  • ~10x PCIe Gen4 bandwidth with next-gen NVLink
• All GPUs fully connected with six next-gen NVSwitches
  • 4.8TB/s bi-directional bandwidth
  • In one second we could transfer 426 hours of HD video, or download HD video to 80K smartphones simultaneously


CONSOLIDATING DIFFERENT WORKLOADS ON DGX A100
One Platform for Training, Inference and Data Analytics
• 4x A100s: DL training
• 2x A100s: data analytics
• 2x A100s: inferencing in MIG mode
(Original diagram: MIG instances 1-7 and 8-14, each running a TensorRT (TRT) inference workload)


UNMATCHED SCALABILITY WITH MELLANOX NETWORKING
Highest Network Throughput for Data and Clustering
• Cluster networking: eight Mellanox single-port ConnectX-6
  • Supporting HDR/HDR100/EDR InfiniBand default or 200GigE
  • 450GB/sec total peak bandwidth
• Data/storage networking: one Mellanox dual-port ConnectX-6
  • Supporting 200/100/50/40/25/10Gb Ethernet default or HDR/HDR100/EDR InfiniBand
  • One optional dual-port CX-6 available as an add-on
• All I/O is now PCIe Gen4, a 2x performance increase over Gen3
• Scale up multiple DGX A100 nodes with the Mellanox Quantum switch, the world's smartest network switch
THE WORLD'S MOST SECURE AI SYSTEM FOR ENTERPRISE
Built-In Security: Multi-layered Defense for AI Infrastructure
DGX A100 delivers the most robust security posture for your AI enterprise:
• Secure boot
• Self-Encrypted Drives (SED) to protect data at rest
• Secure firmware update of the GPU board, CPU board, and BMC
INTRODUCING: NVIDIA DGXpert
With Every DGX System – Your Trusted Navigator in AI Transformation
• 14,000+ AI-fluent experts
• DESIGN | PLAN | BUILD | TEST | DEPLOY | OPERATE | MONITOR
• With you every step of the way – included with every DGX system
DGX: DELIVERING AI FOR BUSINESS
Backed by 1000's of Data Scientists, Engineers & SATURNV
• Plan: system sizing, network design, secure AI guidance
• Deploy: HPL system testing, cluster tools setup, system runbook services
• Optimize: DLI for new features, app code reviews, technology upgrades
• AI workflow lifecycle management: data ingestion, data analytics, management
• Software stack: MAGLEV AI workflow SW, DL SW optimized with TF32 Tensor Core acceleration, HPC SW, data analytics optimized with RAPIDS, data management
• Highest-performance systems


NVIDIA DGX SUPERPOD WITH DGX A100
Unmatched data center scalability – deployed in under 3 weeks
• Leadership-class AI infrastructure
• The blueprint for AI power and scale using DGX A100
• Infused with the expertise of NVIDIA's AI practitioners
• Designed to solve the previously unsolvable
• Configurations start at 20 systems
NVIDIA DGX SuperPOD deployed in SATURNV:
• 1,120 A100 GPUs
• 140 DGX A100 systems
• 170 Mellanox 200G HDR switches
• 4 PB of high-performance storage
• 700 PFLOPS of power to train the previously unsolvable
MORE THAN A SERVER – NVIDIA'S COMMITMENT TO DELIVERING AI SUCCESS
Backed by a Global Team of DGXperts
• 14,000+ "AI-fluent" practitioners with a decade of experience
• Backed by SATURNV – the world's largest DGX proving ground
• Fully optimized: full-stack solution, optimized at every layer – data, algorithms, models + compute, storage, networking, and more
• Field-proven: thousands of deployed AI systems and customers
• Unmatched A100 | Universal AI Data Center Platform | AI-fluent Talent | World's First at Scale