Deep Learning Hands-On

This slide deck gives an overview of deep learning and of training a deep neural network (DNN) in Python. It covers problem definition; training a DNN (data analysis, architecture engineering, optimization, and training); improving the DNN (analysis capabilities, data augmentation, and monitoring the layers' training); and finally open source packages: hardware options, Python frameworks such as Theano, and effort estimation. Overall it is a high-level, hands-on tutorial on deep learning concepts and on training DNNs in Python.


Hands-on Deep Learning in Python

Imry Kissos
Deep Learning Meetup
TLV, August 2015
Outline
● Problem Definition
● Training a DNN
● Improving the DNN
● Open Source Packages
● Summary
Problem Definition

[Figure] A deep convolution network maps a face image to facial landmark positions.¹

¹ http://danielnouri.org/notes/2014/12/17/using-convolutional-neural-nets-to-detect-facial-keypoints-tutorial/
Tutorial
● Goal: detect facial landmarks on (normal) face images
● Data set provided by Dr. Yoshua Bengio
● Tutorial code available:
  https://github.com/dnouri/kfkd-tutorial/blob/master/kfkd.py
Flow

[Diagram] Train a general model, then specialist models ("Nose Tip", "Mouth Corners", ...), then predict points on the test set.
Flow

[Diagram] Train images + train points → fit → trained net.
Flow

[Diagram] Test images → trained net → predicted points.
Python Deep Learning Framework

The stack, from high level to low level:

● nolearn - wrapper around Lasagne
● Lasagne - Theano extension for deep learning
● Theano - define, optimize, and evaluate mathematical expressions
● CUDA/cuDNN - efficient GPU computation for DNNs

HW support: GPU & CPU
OS: Linux, OS X, Windows
Training a Deep Neural Network
1. Data Analysis
2. Architecture Engineering
3. Optimization
4. Training the DNN

Training a Deep Neural Network
1. Data Analysis
a. Exploration + Validation
b. Pre-Processing
c. Batch and Split
2. Architecture Engineering
3. Optimization
4. Training the DNN
Data Exploration + Validation

Data:
● 7K gray-scale images of detected faces
● 96x96 pixels per image
● 15 landmarks per image (?)

Data validation:
● Some landmarks are missing
Pre-Processing

● Data normalization
● Shuffle the train data
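A minimal loading and pre-processing sketch in the spirit of the tutorial's kfkd.py (the file name and scaling constants follow the Kaggle facial-keypoints data; treat them as assumptions):

```python
import numpy as np
from pandas import read_csv
from sklearn.utils import shuffle

df = read_csv('training.csv')  # assumed path to the Kaggle training data
# each row stores the image as a space-separated pixel string
df['Image'] = df['Image'].apply(lambda im: np.fromstring(im, sep=' '))
df = df.dropna()  # keep only rows where all landmarks are present

X = np.vstack(df['Image'].values).astype(np.float32) / 255.0  # pixels -> [0, 1]
y = df[df.columns[:-1]].values.astype(np.float32)
y = (y - 48) / 48  # 96x96 images: coordinates -> [-1, 1]

X, y = shuffle(X, y, random_state=42)  # shuffle the train data
```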
Batch
[Diagram] One epoch's data is split into train batches, a validation batch, and a test batch. The train/validation/test splits are constant across epochs.
Train / Validation Split

For classification, the train/validation split should preserve the class proportions (a stratified split).
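Recent scikit-learn can produce such a stratified split directly; a sketch (this tutorial itself is a regression problem, so it uses a plain split):

```python
from sklearn.model_selection import train_test_split

# stratify=y keeps the class proportions equal in train and validation
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
```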
Training a Deep Neural Network
1. Data Analysis
2. Architecture Engineering
a. Layers Definition
b. Layers Implementation
3. Optimization
4. Training

Architecture

[Diagram] Input image X → Conv → Pool → Dense → Output landmark coordinates Y.
Layers Definition

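The slide shows a screenshot of the layer definition; a minimal sketch of a nolearn NeuralNet matching the Conv → Pool → Dense → Output architecture (the filter counts and unit sizes are assumptions in the spirit of kfkd.py):

```python
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet

net = NeuralNet(
    layers=[
        ('input',  layers.InputLayer),
        ('conv1',  layers.Conv2DLayer),
        ('pool1',  layers.MaxPool2DLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
    ],
    # layer parameters, addressed as '<layer name>_<parameter>':
    input_shape=(None, 1, 96, 96),            # batches of 96x96 gray-scale images
    conv1_num_filters=32, conv1_filter_size=(3, 3),
    pool1_pool_size=(2, 2),
    hidden_num_units=500,
    output_num_units=30,                      # 15 landmarks x (x, y)
    output_nonlinearity=None,                 # linear output for regression
    # optimization:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,
    regression=True,                          # squared-error objective
    max_epochs=400,
    verbose=1,
)
```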
Activation Function

ReLU

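ReLU keeps positive inputs and zeroes out negative ones; a one-line NumPy sketch:

```python
import numpy as np

def relu(x):
    # rectified linear unit: elementwise max(0, x)
    return np.maximum(0, x)
```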
Dense Layer

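The slide's screenshot is not reproduced; a dense (fully connected) layer computes an affine map followed by a nonlinearity σ:

```latex
h = \sigma(W x + b)
```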
Dropout

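The dropout figures are not reproduced. Dropout randomly zeroes a fraction p of the activations during training, which reduces overfitting; a minimal "inverted dropout" sketch (the 1/(1-p) rescaling convention is an assumption, though it matches Lasagne's DropoutLayer default):

```python
import numpy as np

def dropout(h, p=0.5, train=True):
    # inverted dropout: drop with probability p, rescale survivors by 1/(1-p)
    if not train:
        return h  # at test time the layer is the identity
    mask = (np.random.rand(*h.shape) > p) / (1.0 - p)
    return h * mask
```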
Training a Deep Neural Network
1. Data Analysis
2. Architecture Engineering
3. Optimization
a. Back Propagation
b. Objective
c. SGD
d. Updates
e. Convergence Tuning
4. Training the DNN
Back Propagation
Forward Path

[Diagram] Input X → Conv → Dense → output points.
Back Propagation
Forward Path

[Diagram] Input X → Conv → Dense → output points, which are compared against the training points.
Back Propagation
Backward Path

[Diagram] The error gradient propagates backward from the output through the Dense and Conv layers.
Back Propagation
Update

[Diagram] For all layers, the weights are updated using the back-propagated gradients.
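The update shown is presumably the standard gradient step; for each layer's weights W, with learning rate η and objective E:

```latex
W \leftarrow W - \eta \, \frac{\partial E}{\partial W}
```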
Objective

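The slide's screenshot is not reproduced; with regression=True, nolearn's NeuralNet uses a squared-error objective (the learning curves later in the deck report its square root, the RMSE):

```latex
E = \frac{1}{N} \sum_{i=1}^{N} \lVert y_i - \hat{y}_i \rVert^2,
\qquad \mathrm{RMSE} = \sqrt{E}
```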
S.G.D.

Updates the network after each (mini-)batch.

Karpathy, "Babysitting the learning process": the weights/updates magnitude ratio should be around 1e3 (i.e., updates should be roughly 1e-3 of the weights).


Optimization - Updates

[Animation by Alec Radford] Comparing the behavior of different update rules on a loss surface.
Adjusting Learning Rate & Momentum

Both are adjusted linearly in the epoch number: the learning rate is decayed while the momentum is increased.
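A sketch of the tutorial's AdjustVariable callback, which moves a hyperparameter linearly from start to stop over the epochs:

```python
import numpy as np

class AdjustVariable(object):
    def __init__(self, name, start=0.03, stop=0.0001):
        self.name = name          # e.g. 'update_learning_rate' or 'update_momentum'
        self.start, self.stop = start, stop
        self.ls = None

    def __call__(self, nn, train_history):
        if self.ls is None:
            # linear schedule from start to stop over all epochs
            self.ls = np.linspace(self.start, self.stop, nn.max_epochs)
        epoch = train_history[-1]['epoch']
        new_value = np.float32(self.ls[epoch - 1])
        getattr(nn, self.name).set_value(new_value)  # a shared Theano variable
```

It is hooked in via, e.g., on_epoch_finished=[AdjustVariable('update_learning_rate', 0.03, 0.0001), AdjustVariable('update_momentum', 0.9, 0.999)], with the two hyperparameters passed to NeuralNet as Theano shared variables so that set_value works.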
Convergence Tuning

● Stops according to the validation loss
● Returns the best weights
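A sketch of the tutorial's EarlyStopping callback (also hooked into on_epoch_finished): it stops when the validation loss has not improved for `patience` epochs and rolls back to the best weights:

```python
import numpy as np

class EarlyStopping(object):
    def __init__(self, patience=100):
        self.patience = patience
        self.best_valid = np.inf
        self.best_valid_epoch = 0
        self.best_weights = None

    def __call__(self, nn, train_history):
        current_valid = train_history[-1]['valid_loss']
        current_epoch = train_history[-1]['epoch']
        if current_valid < self.best_valid:
            self.best_valid = current_valid
            self.best_valid_epoch = current_epoch
            self.best_weights = nn.get_all_params_values()  # remember the best weights
        elif self.best_valid_epoch + self.patience < current_epoch:
            print("Early stopping; best valid loss was {:.6f} at epoch {}.".format(
                self.best_valid, self.best_valid_epoch))
            nn.load_params_from(self.best_weights)  # roll back to the best weights
            raise StopIteration()                   # tells fit() to stop
```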
Training a Deep Neural Network
1. Data Analysis
2. Architecture Engineering
3. Optimization
4. Training the DNN
a. Fit
b. Fine Tune Pre-Trained
c. Learning Curves

Fit

Loop over train batches:
    forward + backprop

Loop over validation batches:
    forward only
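Schematically, one epoch inside fit looks as follows (pseudocode, not the actual nolearn internals; iterate_minibatches, train_fn, and eval_fn are hypothetical helpers):

```python
for epoch in range(max_epochs):
    # train batches: forward pass + backprop + weight update
    for Xb, yb in iterate_minibatches(X_train, y_train, batch_size):
        train_loss = train_fn(Xb, yb)
    # validation batches: forward pass only, no weight update
    for Xb, yb in iterate_minibatches(X_valid, y_valid, batch_size):
        valid_loss = eval_fn(Xb, yb)
```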
Fine Tune Pre-Trained
● Change the output layer
● Load the pre-trained weights
● Fine-tune the specialist net
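A sketch of the specialist fine-tuning step in the spirit of the tutorial's fit_specialists (general_net, pretrained_weights, and y_group are assumptions standing in for the tutorial's variables):

```python
from sklearn.base import clone

specialist = clone(general_net)                  # fresh copy of the net's configuration
specialist.output_num_units = y_group.shape[1]   # e.g. 4 outputs for two mouth corners
specialist.load_params_from(pretrained_weights)  # copies weights where layer shapes match
specialist.fit(X, y_group)                       # fine-tune on this landmark group only
```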
Learning Curves
[Plot] Train/validation RMSE vs. epochs, looping over the 6 specialist nets.
Learning Curves Analysis
[Plots] RMSE vs. epochs for two nets. Net 1: convergence, with some jittering of the curves. Net 2: overfitting - the train RMSE keeps falling while the validation RMSE rises.
Part 1 Summary
Training a DNN:

[Diagram] Data analysis → architecture engineering → optimization → training.
Part 1 End
Break
Part 2
Beyond Training
Outline
● Problem Definition
● Motivation
● Training a DNN
● Improving the DNN
● Open Source Packages
● Summary

Beyond Training
1. Improving the DNN
a. Analysis Capabilities
b. Augmentation
c. Forward - Backward Path
d. Monitor Layers’ Training
2. Open Source Packages
3. Summary

Improving the DNN
Very tempting:
● >1M images
● >1M parameters
● Large gap: theory ↔ practice

⇒ Brute-force experiments?!
Analysis Capabilities
1. Theoretical explanation
   e.g., dropout and augmentation decrease overfitting
2. Empirical claims about phenomena
   e.g., normalization improves convergence
3. Numerical understanding
   e.g., exploding / vanishing updates
Reduce Overfitting

[Plot] Net 2 overfits: its validation RMSE rises with the epochs while the train RMSE keeps falling.

Solution: data augmentation.
Data Augmentation

[Images] Horizontal flip; small perturbations.
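A minimal horizontal-flip sketch in the spirit of the tutorial's FlipBatchIterator (it assumes the [-1, 1] coordinate scaling from the pre-processing step):

```python
import numpy as np

def flip_half_of_batch(Xb, yb):
    # flip a random half of the images left-right
    bs = Xb.shape[0]
    idx = np.random.choice(bs, bs // 2, replace=False)
    Xb[idx] = Xb[idx, :, :, ::-1]   # mirror the image columns
    yb[idx, ::2] = -yb[idx, ::2]    # negate the x-coordinates (scaled to [-1, 1])
    # note: left/right landmark pairs must also be swapped
    # (the tutorial keeps an explicit flip_indices list for this)
    return Xb, yb
```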
Advanced Augmentation

http://benanne.github.io/2015/03/17/plankton.html
Convergence Challenges
[Plots] RMSE vs. epochs under two failure modes: a normalization problem and a data error.

Need to monitor the forward + backward path.
Forward - Backward Path
Forward: the layers' activations.

Backward: the gradients w.r.t. the parameters.
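In Theano/Lasagne this monitoring amounts to compiling one function that returns both; a sketch (output_layer, conv1_layer, and the batch variables are assumptions):

```python
import numpy as np
import theano
import theano.tensor as T
import lasagne

X_sym = T.tensor4('X')   # batch of images
y_sym = T.matrix('y')    # target landmark coordinates

prediction = lasagne.layers.get_output(output_layer, X_sym)
loss = lasagne.objectives.squared_error(prediction, y_sym).mean()

monitor = theano.function(
    [X_sym, y_sym],
    [lasagne.layers.get_output(conv1_layer, X_sym),  # forward: activations
     T.grad(loss, conv1_layer.W)])                   # backward: gradient w.r.t. weights

activations, grad_W = monitor(X_batch, y_batch)
print(np.abs(activations).mean(), np.abs(grad_W).max())
```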
Monitor Layers’ Training
nolearn - visualize.py

Monitor Layers’ Training

X. Glorot, Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks":
"Monitoring activation and gradients across layers and training iterations is a powerful investigation tool"

This is easy to monitor in the Theano framework.
Weight Initialization matters (1)
Layer 1 gradients are close to zero - vanishing gradients.
Weight Initialization matters (2)
The network returns close-to-zero values for all inputs.
Monitoring Activation
Objective plateaus are sometimes seen when training neural networks. Here, for most epochs the network returns a close-to-zero output for all inputs; such plateaus can often be explained by saturation.


Monitoring weights/update ratio
[Plots] Max of the weights of Conv1 (ticks at 1e-1, 2e-1, 3e-1) and max of the updates of Conv1 (ticks at 1e-3, 2e-3, 3e-3), each plotted against the epoch.

http://cs231n.github.io/neural-networks-3/#baby
Beyond Training
1. Improving the DNN
2. Open Source Packages
a. Hardware and OS
b. Python Framework
c. Deep Learning Open Source Packages
d. Effort Estimation
3. Summary

Hardware and OS
● Amazon cloud GPU: AWS Lasagne GPU setup; spot instances ~ $0.0031 per GPU instance hour
● IBM cloud GPU:
  http://www-03.ibm.com/systems/platformcomputing/products/symphony/gpuharvesting.html
● Your Linux machine's GPU:
  pip install -r https://raw.githubusercontent.com/dnouri/kfkd-tutorial/master/requirements.txt
● Windows install:
  http://deeplearning.net/software/theano/install_windows.html#install-windows
Starting Tips
● Sanity checks:
  ○ DNN architecture: "overfit a tiny subset of data" (Karpathy)
  ○ Check that increasing the regularization increases the loss
● Use a pre-trained VGG as a baseline
● Start with ~3 conv layers of ~16 filters each - quickly iterate
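The first sanity check can be scripted directly; a sketch, assuming net is a nolearn NeuralNet as defined earlier:

```python
from sklearn.base import clone

# a sound architecture should drive the train loss near zero on a tiny subset;
# if it cannot overfit 20 samples, suspect a bug before tuning anything else
tiny = clone(net)
tiny.max_epochs = 200
tiny.fit(X[:20], y[:20])
```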
Python

● Rich eco-system
● State-of-the-art
● Easy to port from prototype to production

Podcast: http://www.reversim.com/2015/10/277-scientific-python.html
Python Deep Learning Frameworks

Keras, pylearn2, OpenDeep, Lasagne - all share a common base (Theano).


Tips from Deep Learning Packages
● Torch: code organization
● Caffe: separation of configuration ↔ code
● NeuralNet → YAML text format for defining an experiment's configuration
Deep Learning Open Source Packages

Open source progresses rapidly → it is impossible to predict the industry's standard.

● Caffe - for applications
● Torch and Theano - for research on deep learning itself

http://fastml.com/torch-vs-theano/

[Diagram] The packages span a spectrum from white box to black box.
Disruptive Effort Estimation
[Chart] Effort estimation: feature engineering vs. deep learning.

Deep learning still requires algorithmic expertise.


Summary
● Dove into Training a DNN
● Presented Analysis Capabilities
● Reviewed Open Source Packages

References
● Hinton's Coursera course, Neural Networks for Machine Learning:
  https://www.coursera.org/course/neuralnets
● Technion Deep Learning course:
  http://moodle.technion.ac.il/course/view.php?id=4128
● Oxford Deep Learning course:
  https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu
● CS231n: Convolutional Neural Networks for Visual Recognition:
  http://cs231n.github.io/
● Deep Learning book:
  http://www.iro.umontreal.ca/~bengioy/dlbook/
● Montreal Deep Learning summer school:
  http://videolectures.net/deeplearning2015_montreal/
Questions?

[Figure] The deep convolution regression network.