
Introduction To Deep Learning: Internet of Things Group

This document provides an introduction to deep learning and neural networks. It discusses how neural networks are used to solve computer vision problems and are powering technologies like Tesla's autopilot system. It also covers the history of neural networks, how they are trained using gradient descent and backpropagation, and common neural network layers like convolutional layers, fully connected layers, and activation layers.

Uploaded by

Charles Nicollas

Introduction to Deep Learning

Anna Petrovicheva
IOTG Computer Vision

Internet of Things Group 1


Agenda

1. Neural Networks overview
2. Math engine
3. Neural Network layers
4. Solving Computer Vision problems
5. How to train a network



Deep Learning systems in the real world

Image credit: DeepMind, Prisma, Yayvo, Google Translate, Redmond Pie, TechRepublic, Brit

*Other names and brands may be claimed as the property of others

Tesla autopilot

Image credit: Autopilot Full Self Driving Demonstration Nov 18 2016 Realtime Speed

Brief history
● 1965: first idea
● AI winter
● 1998: LeNet-5
● 2000’s: “The biggest issue of this paper, is that it relies on neural networks”
● 2012: groundbreaking results in ImageNet contest
○ Old algorithms
○ Big dataset
○ Compute power

● 2012-now: wide adoption



Artificial Neural Network
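[Figure: network diagram. A neuron combines inputs v1, v2, v3 with parameters w1, w2, w3 into a new value vnew; neurons are grouped into layers; the network maps an input image to an output label ("dog")]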
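
A minimal sketch of what a single neuron in the diagram computes, assuming the usual weighted sum of inputs (bias and activation are left out here; the numeric values are made up for illustration):

```python
import numpy as np

def neuron(v, w):
    """Weighted sum of inputs v with parameters w (no bias, no activation)."""
    return float(np.dot(w, v))

v = np.array([1.0, 2.0, 3.0])     # inputs v1, v2, v3 (made-up values)
w = np.array([0.5, -1.0, 0.25])   # parameters w1, w2, w3 (made-up values)
v_new = neuron(v, w)              # 0.5*1 - 1.0*2 + 0.25*3
```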



Training
Start: parameters are random

[Figure: network output label: "cat"]

Goal: find good parameters W = (w1, w2, … , wm)



Finding parameters
● W = (w1, w2, … , wm) - point in multidimensional space
○ Modern nets: tens to hundreds of millions of parameters
● Use W in network → get corresponding prediction error
○ Wstart: high prediction error
○ Woptimal: low prediction error
● Goal: get from Wstart to Woptimal

[Figure: prediction error surface over parameters w1 and w2, with points Wstart and Woptimal]



Gradient descent
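W1 = Wstart − α * F’(Wstart)
F - prediction error as a function of W, F’ - its gradient
α - learning rate
Too small: long training
Too large: training diverges

[Figure: prediction error surface over parameters w1 and w2, with the first step from Wstart to W1 towards Woptimal]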



Gradient descent
W1 = Wstart − α * F’(Wstart)
W2 = W1 − α * F’(W1)
W3 = W2 − α * F’(W2)
W4 = W3 − α * F’(W3)
W5 = W4 − α * F’(W4)

[Figure: prediction error surface over parameters w1 and w2, with successive steps from Wstart through W1 towards Woptimal]
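
A minimal sketch of gradient descent on a two-parameter problem, stepping against the gradient of the error at every iteration. The quadratic error function, its minimum at (1, -2), the starting point and the three learning rates are assumptions chosen only to show the "too small" and "too large" cases:

```python
import numpy as np

def error(w):
    # Hypothetical prediction error with its minimum at w = (1, -2)
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2

def gradient(w):
    # Analytic gradient of the error above
    return np.array([2.0 * (w[0] - 1.0), 2.0 * (w[1] + 2.0)])

def gradient_descent(w_start, alpha, steps):
    w = np.asarray(w_start, dtype=float)
    for _ in range(steps):
        w = w - alpha * gradient(w)   # step against the gradient
    return w

w_start = np.array([5.0, 5.0])
w_good = gradient_descent(w_start, alpha=0.1, steps=100)      # converges
w_slow = gradient_descent(w_start, alpha=0.001, steps=100)    # too small: barely moves
w_diverged = gradient_descent(w_start, alpha=1.1, steps=100)  # too large: error grows
```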



Non-convex task
● May get stuck in local minima
● Solution depends on initial point
State-of-the-art opinion:
● Local minima are not the biggest problem
● “Like person driving a car in a really confusing city”

[Figure: non-convex prediction error surface over parameters w1 and w2, with points Woptimal and Wlocal]



Stochastic gradient descent
Gradient descent:
▪ Take all data points (= the whole dataset)
▪ Compute the derivative with respect to parameters over all points
▪ Make a step in the opposite direction

Dataset is too big:
▪ Too much time to compute
▪ Does not fit in memory (RAM)

Stochastic Gradient Descent:
▪ Use a random subset of data (new each iteration)
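
A minimal sketch of stochastic gradient descent, assuming a toy linear model y = w*x + b with mean squared error. The synthetic dataset, batch size and learning rate are made-up values; the point is only that each step uses a fresh random subset instead of the whole dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: y = 3*x + 1 plus a little noise
x = rng.uniform(-1.0, 1.0, size=1000)
y = 3.0 * x + 1.0 + 0.01 * rng.normal(size=1000)

w, b = 0.0, 0.0     # parameters to learn
alpha = 0.1         # learning rate
batch_size = 32     # size of the random subset

for step in range(2000):
    # New random subset of the data on every iteration
    idx = rng.choice(len(x), size=batch_size, replace=False)
    xb, yb = x[idx], y[idx]
    residual = w * xb + b - yb
    grad_w = 2.0 * np.mean(residual * xb)   # derivative of the batch error w.r.t. w
    grad_b = 2.0 * np.mean(residual)        # derivative of the batch error w.r.t. b
    w -= alpha * grad_w
    b -= alpha * grad_b
```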



Backpropagation algorithm
● Cost function estimates prediction error
● Layers compute derivative with respect to parameters
● Parameter derivative is sent to Stochastic Gradient Descent
● SGD outputs parameter update for the next iteration
● Next iteration - new parameters, new data from dataset

[Figure: the forward pass through parameters w1, w2, w3 produces a prediction ("cat"); the backward pass propagates the error e and yields derivatives w’1, w’2, w’3; SGD turns them into the parameter update ΔW]
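
A minimal sketch of one forward pass, one backward pass and one SGD update. The tiny two-layer network with a tanh hidden layer, the squared-error cost and all numeric values are assumptions for illustration, not the network from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)           # one input sample (made-up)
target = 1.0                     # desired output (made-up)
W1 = rng.normal(size=(4, 3))     # first layer parameters
w2 = rng.normal(size=4)          # second layer parameters

def forward(W1, w2):
    h = np.tanh(W1 @ x)              # hidden layer with tanh activation
    y = w2 @ h                       # network output
    cost = 0.5 * (y - target) ** 2   # cost function estimates prediction error
    return h, y, cost

# Forward pass
h, y, cost = forward(W1, w2)

# Backward pass: chain rule from the cost back to every parameter
e = y - target                       # d cost / d y
grad_w2 = e * h                      # d cost / d w2
grad_h = e * w2                      # d cost / d h
grad_pre = grad_h * (1.0 - h ** 2)   # through tanh: tanh'(z) = 1 - tanh(z)^2
grad_W1 = np.outer(grad_pre, x)      # d cost / d W1

# SGD turns the derivatives into a parameter update
alpha = 0.1
W1_new = W1 - alpha * grad_W1
w2_new = w2 - alpha * grad_w2
```

Comparing a derivative with a finite-difference estimate of the cost is a common way to check a backward pass like this one.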



Neural network layers



Convolutional layer
● Local connectivity
● Convolves channels too
● Each convolutional layer has many different filters
● Each filter detects specific feature
○ Edges, colors
● General data transform tool
● Can have bias b

Example 3×3 filter:

1 0 1
0 1 0
1 0 1
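
A minimal sketch of what one filter of a convolutional layer computes, using the 3×3 filter shown on the slide. It assumes a single-channel image, stride 1 and no padding (real layers also sum over input channels), and the 4×4 input is made up:

```python
import numpy as np

def conv2d(image, kernel, bias=0.0):
    """Slide the kernel over the image (stride 1, no padding, one channel)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]           # local connectivity
            out[i, j] = np.sum(patch * kernel) + bias   # weighted sum plus bias b
    return out

kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]], dtype=float)
image = np.arange(16, dtype=float).reshape(4, 4)   # made-up 4x4 input
result = conv2d(image, kernel)
```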

Image credit: Visualizing Neural Networks In Virtual Space



Convolutional layer

Can represent any linear image filter (for example blur, sharpening, edge detection)

Goal: find suitable parameters

Takes about 95% of computations in a typical network
Image credit: OpenCV documentation



Convolutional layer filters

AlexNet 1st convolution filters


● Detect lines
● Detect color patterns
Further layers:
● Growing level of abstraction
○ “Face neuron”

Image credit: CS231n: Convolutional Neural Networks for Visual Recognition



Fully connected layer

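● About 95 % of parameters in a typical network
● “Classic” layer
● Usually used before the final classifier

[Figure: fully connected layer. Every input v1 … vn is connected to every output fc1 … fcm through weights w11 … wnm, with biases b1 … bm]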
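
A minimal sketch of a fully connected layer, assuming the indexing suggested by the diagram labels: weight w_ij connects input v_i to output fc_j, and b_j is the bias of output j. The sizes and values are made up:

```python
import numpy as np

def fully_connected(v, W, b):
    """fc[j] = sum over i of W[i, j] * v[i], plus b[j]."""
    return v @ W + b

v = np.array([1.0, 2.0, 3.0])     # n = 3 inputs
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # n x m weights
b = np.array([0.5, -0.5])         # m = 2 biases
fc = fully_connected(v, W, b)

# Every input is connected to every output, so the parameter count is n*m + m
num_parameters = W.size + b.size
```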



Activation layer

● Applied after all convolution and fully connected layers
● Analogous to biological neuron mechanism
○ Neuron firing rate



Activation layers
● Original idea: Heaviside step function
○ Fire / not fire
○ Non-differentiable at 0, zero gradient elsewhere -> cannot use backpropagation
● Approximation: sigmoid / tanh
○ Approximate step function
○ Differentiable
○ Saturate and kill gradients
● Used almost everywhere: Rectified Linear Unit (ReLU)
○ Accelerates convergence in training
○ Does not saturate for positive inputs

[Figure: plots of the Heaviside step function, tanh, sigmoid and ReLU]
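
A minimal sketch of the activation functions named above together with their derivatives, using the standard textbook definitions. Evaluating the derivatives at a large input shows the saturation of sigmoid and tanh next to the constant gradient of ReLU:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is 1 for positive inputs and 0 otherwise
    return (x > 0).astype(float)

x = np.array([-10.0, 0.0, 10.0])
saturated_sigmoid = sigmoid_grad(x)   # close to zero at both ends
saturated_tanh = tanh_grad(x)         # close to zero at both ends
relu_slopes = relu_grad(x)            # stays 1 for large positive inputs
```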



Pooling layer
● Types:
○ Average pooling
○ Max pooling
● Reduces data dimensionality
○ Fewer parameters
○ Fewer computations
○ Controls overfitting

Example of 2×2 max pooling:

 0 -1  0  2
 1  1 -1  1     max pooling     1 2
 1  0  3  0    ------------->   2 3
-1  2  0  1
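
A minimal sketch of 2×2 pooling with stride 2, applied to the 4×4 example from the slide. It assumes the height and width are even; the reshape trick is just one compact way to group the pixels into 2×2 blocks:

```python
import numpy as np

def pool_2x2(x, reduce_fn):
    """Split x into non-overlapping 2x2 blocks and reduce each block."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)   # blocks[i, a, j, b] = x[2i+a, 2j+b]
    return reduce_fn(blocks, axis=(1, 3))

x = np.array([[ 0, -1,  0, 2],
              [ 1,  1, -1, 1],
              [ 1,  0,  3, 0],
              [-1,  2,  0, 1]])

pooled = pool_2x2(x, np.max)      # max pooling
averaged = pool_2x2(x, np.mean)   # average pooling
```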



Typical feed-forward neural network

No cycles

Activation after each convolution / FC

Pooling after several convolution blocks

[Figure: VGG16 topology]

Image credit: Feature Evaluation of Deep Convolutional Neural Networks for Object Recognition and Detection



Solving Computer Vision with Deep Learning



Image classification

● Predicts category of image
● Backbone extracts features
● Classification head outputs probabilities of each category

[Figure: image → backbone → classification head → probabilities: dog 0.7, cat 0.2, bird 0.1]



Softmax layer + cross-entropy loss

[Figure: softmax layer]

Cross-entropy loss:

label          dog   cat   bird   cross-entropy loss
ground truth   1     0     0
algorithm 1    0.2   0.6   0.2    - ((ln(0.2) * 1) + (ln(0.6) * 0) + (ln(0.2) * 0)) = 1.6
algorithm 2    0.5   0.4   0.1    - ((ln(0.5) * 1) + (ln(0.4) * 0) + (ln(0.1) * 0)) = 0.69
algorithm 3    0.8   0.1   0.1    - ((ln(0.8) * 1) + (ln(0.1) * 0) + (ln(0.1) * 0)) = 0.22
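
A minimal sketch of a softmax layer and the cross-entropy loss, reproducing the three loss values from the table. The softmax shown is the standard definition (with the usual max-subtraction for numerical stability), and the raw scores passed to it are made up:

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # improves numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)

def cross_entropy(ground_truth, probabilities):
    """Minus the sum of ground_truth * ln(probabilities)."""
    return float(-np.sum(ground_truth * np.log(probabilities)))

ground_truth = np.array([1.0, 0.0, 0.0])   # dog, cat, bird
predictions = {
    "algorithm 1": np.array([0.2, 0.6, 0.2]),
    "algorithm 2": np.array([0.5, 0.4, 0.1]),
    "algorithm 3": np.array([0.8, 0.1, 0.1]),
}
losses = {name: cross_entropy(ground_truth, p) for name, p in predictions.items()}

probabilities = softmax(np.array([2.0, 1.0, 0.1]))   # made-up raw scores
```

The better the predicted probability of the true class, the lower the loss, which is why algorithm 3 scores best.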



ImageNet

Greatest driver of Deep Learning and image classification
More than 1 million images
1000 classes
▪ 120 dog breeds

The ImageNet 2017 contest is the last one



● Before 2012: non-Deep Learning methods
● 2012: AlexNet
● 2014: VGG, GoogLeNet
● 2015: ResNet



ResNet topology
Won ImageNet 2015 image classification contest
Key advantage: residual connection
▪ Better convergence in parameter space

Outperforms human accuracy in image classification
▪ Source: Andrej Karpathy blog

ResNet-like topologies are state-of-the-art
▪ Top accuracy in many Computer Vision tasks

Very deep
▪ 50 / 101 / 152-layer variants
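
A minimal sketch of a residual connection: the block outputs its input plus a learned correction F(x). Here F is assumed to be two small linear layers with a ReLU in between, a simplification of the convolutional blocks in the ResNet paper; all sizes and values are made up:

```python
import numpy as np

def residual_block(x, W1, W2):
    """y = x + F(x), with F(x) = relu(x @ W1) @ W2 (simplified)."""
    f = np.maximum(0.0, x @ W1) @ W2
    return x + f   # the skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1 = 0.1 * rng.normal(size=(4, 8))
W2 = 0.1 * rng.normal(size=(8, 4))
y = residual_block(x, W1, W2)

# With all-zero parameters the block passes its input through unchanged
identity = residual_block(x, np.zeros((4, 8)), np.zeros((8, 4)))
```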
Image credit: Deep Residual Learning for Image Recognition



Typical Deep Learning algorithm for Computer Vision
Requirement: big datasets for the task exist
Typical solution: input → backbone → task-specific layers → output

Backbone: AlexNet, VGG, GoogLeNet, ResNet and others
▪ Without softmax head
▪ Extracts representative features
▪ Pretrained on ImageNet



Object detection

[Figure: image → backbone → detection head → detected objects: elephant, tree, tree]

Backbone: VGG, Inception, ResNet
Detection head: Faster R-CNN, R-FCN, SSD
Image credit: Savanna



Object detection

Image credit: YOLO v2


Semantic segmentation

● Generate mask of objects of each class on image
○ Road
○ Pedestrian
○ ...
● Classification of each pixel
● Datasets
○ General case
○ Road scenarios
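
A minimal sketch of per-pixel classification, assuming the network has already produced one score map per class. The class names, the 2×2 image size and the scores are made up; the class mask is the index of the highest score at each pixel:

```python
import numpy as np

class_names = ["road", "pedestrian"]   # hypothetical classes

# scores[c, i, j] = score of class c at pixel (i, j), made-up values
scores = np.array([
    [[0.9, 0.2],
     [0.4, 0.6]],   # road
    [[0.1, 0.8],
     [0.6, 0.4]],   # pedestrian
])

mask = np.argmax(scores, axis=0)                      # class index for each pixel
road_mask = mask == class_names.index("road")         # binary mask of one class
```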

Image credit: DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs



Semantic segmentation

Image credit: Feature Space Optimization for Semantic Video Segmentation - CityScapes Demo 02



Instance segmentation
Mask for each object + category of object, combining:
▪ Semantic segmentation
▪ Object detection

State-of-the-art: Mask R-CNN



Generative Adversarial Networks
● Generator network generates sample
● Discriminator network tries to distinguish real samples from generated ones
○ Bank-counterfeiter task
● Trained GAN:
○ Good generator of new objects
○ Good estimator of object quality
● Many tasks can be formulated as a GAN
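
A minimal sketch of the two competing objectives, assuming the commonly used binary cross-entropy formulation of GAN training with the non-saturating generator loss (the slides do not state which loss is meant). The discriminator outputs below are made-up probabilities that a sample is real:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and generated samples near 0."""
    return float(-np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake)))

def generator_loss(d_fake):
    """Low when the discriminator scores generated samples near 1."""
    return float(-np.mean(np.log(d_fake)))

# Discriminator that easily spots the generated samples
d_loss_sharp = discriminator_loss(np.array([0.9, 0.9]), np.array([0.1, 0.1]))
g_loss_caught = generator_loss(np.array([0.1, 0.1]))

# Discriminator that cannot tell real from generated (outputs 0.5 everywhere)
d_loss_fooled = discriminator_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
g_loss_fooling = generator_loss(np.array([0.5, 0.5]))
```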


Image credit: Stability of Generative Adversarial Networks



GAN for image generation

[Figure panels: September 2016, March 2017]


Image credit: BEGAN: Boundary Equilibrium Generative Adversarial Networks



GAN for image generation

Image credit: BEGAN: Boundary Equilibrium Generative Adversarial Networks



GAN for image generation from caption

Image credit: StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks



GAN for Super Resolution

[Figure: 4x super resolution. Panels: original image, bicubic interpolation, SRGAN]


Image credit: Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network



GAN for image to image translation

Image credit: Image-to-Image Translation with Conditional Adversarial Nets



GAN for image to image translation

Image credit: CycleGAN



How to train a network



Understand state-of-the-art

● Google Scholar, arXiv papers
● Datasets, benchmarks
● Existing implementations, open repositories



Prepare dataset
● Neural Networks demand big datasets
○ ImageNet: 1.4 million images
○ MS COCO: 300 thousand images
● Data augmentation
○ Cropping
○ Flipping
○ Brightness / contrast
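
A minimal sketch of the three augmentations listed above, written with plain NumPy on an image stored as a height × width × channels array of 8-bit values. The random image, the crop size and the brightness and contrast values are made up; training pipelines usually take these from a library rather than hand-written code:

```python
import numpy as np

def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.integers(0, image.shape[0] - crop_h + 1)
    left = rng.integers(0, image.shape[1] - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]

def horizontal_flip(image):
    """Mirror the image left to right."""
    return image[:, ::-1]

def adjust_brightness_contrast(image, brightness, contrast):
    """Scale pixel values by contrast, shift by brightness, keep the 0..255 range."""
    out = contrast * image.astype(float) + brightness
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)   # made-up image

crop = random_crop(image, 24, 24, rng)
flipped = horizontal_flip(image)
brighter = adjust_brightness_contrast(image, brightness=30.0, contrast=1.2)
```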



Prepare dataset
Small amount of real-life data: add train-val split

[Diagram labels: overfitting, generalization]

High error on   What to try
Train           Bigger model; train longer; other architecture
Train-Val       More data; more regularization; other architecture
Validation      Get more data similar to test; other architecture
Test            More validation data

Andrew Ng. Nuts and Bolts of Applying Deep Learning


Iterative experiments
● Overfit 1 sample first
● Put all results in a table
● What to vary:
▪ Backbone
▪ Task-specific layers and loss
▪ Data augmentation
▪ Optimization parameters
– Learning rate value and policy
– Regularization

Image credit: Speed/accuracy trade-offs for modern convolutional object detectors



Accuracy evaluation

● Compare with state-of-the-art
● Analyze accuracy dynamics while training

[Figure: two plots of train and val accuracy (0.5 to 1.0) versus iterations, titled "Typical good training" and "Overfitting"]



Choose accuracy metric

● Single accuracy metric
○ Comparable results

Example:

          Accuracy   Performance
Model 1   98 %       2 seconds
Model 2   93 %       0.5 second

● Accuracy: optimizing metric
● Time: satisficing metric
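
A minimal sketch of choosing between the two models from the example with one optimizing and one satisficing metric: keep only models that are fast enough, then take the most accurate of them. The time limits passed in are hypothetical, since the slides do not give a threshold:

```python
models = [
    {"name": "Model 1", "accuracy": 0.98, "seconds": 2.0},
    {"name": "Model 2", "accuracy": 0.93, "seconds": 0.5},
]

def choose_model(models, max_seconds):
    """Best accuracy (optimizing) among models within the time limit (satisficing)."""
    acceptable = [m for m in models if m["seconds"] <= max_seconds]
    if not acceptable:
        return None   # no model satisfies the time requirement
    return max(acceptable, key=lambda m: m["accuracy"])

strict_choice = choose_model(models, max_seconds=1.0)    # hypothetical 1 second limit
relaxed_choice = choose_model(models, max_seconds=3.0)   # hypothetical 3 second limit
```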



General tips
● Neural Networks can solve vision problems that a human can solve in about 1 second
● Open source repositories rarely work out of the box
● Find your way to learn about new DL research

[Figure: papers submitted to arXiv categories cs.AI, cs.LG, cs.CV, cs.CL, cs.NE, stat.ML over time]

Image credit: Andrej Karpathy’s blog @ Medium


