Machine Learning & Deep Learning Overview (AIST)
Deep learning
Hoang Van Nam
MICA Institute - HUST
Agenda
• Introduction
• Machine Learning
• Deep Learning
• CNNs
• Discussion
Introduction
Artificial Intelligence (AI)
• What is Artificial Intelligence (AI)?
• Using computers to solve problems or make automated decisions
• For tasks that, when done by humans, typically require intelligence
Timeline of Intelligent Machines
• 1952: Machine playing checkers (Arthur Samuel)
• 1979: Stanford Cart
• 1997: Deep Blue beats Kasparov
• 2012: Google NN recognizes cats in YouTube videos
• 2016: DeepMind wins at Go
Limits of Artificial Intelligence
• “Strong” Artificial Intelligence ✘
• Computers thinking at a level that meets or surpasses people
• Computers engaging in abstract reasoning & thinking
• This is not what we have today
• There is no evidence that we are close to Strong AI
Hand-Coding Business Rules
A human programmer inspects historical purchase data (the input) and writes the business rules (the output) by hand:

Date        Age   Gender   Purchased Items
3/1/2017    30    M        Toy
1/3/2017    40    M        Books
....        ....  ....     ....

• Rule 1: 15 < Age < 30
• Rule 2: Bought Toy = 'Y', Last Purchase < 30 days
• Rule 3: Gender = 'M', Bought Toy = 'Y'
• Rule 4: ........
• Rule 5: ........

Problems with Hand-Designed Rules
• Scalability
• Adaptability
• Closed loop
Option 2 - Learn The Business Rules From Data
Instead of hand-writing rules, treat historical purchase data (the training data) as input X with known outcomes Y, and learn a function f with f(X) = Y' such that Y' ≈ Y. The trained model then fills in the output for new, unseen data:

Input - new, unseen data:
Age   Gender   Items
35    F        ?   (to be predicted)
39    M        Toy (predicted output)
Machine learning as programming
We Call This Approach Machine Learning
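As a sketch of this approach, a decision-tree learner can induce purchase rules directly from data; the scikit-learn call is real, but the training rows below are invented for illustration:

```python
# A minimal sketch of learning business rules from data with scikit-learn.
# The training rows are made up; they are not the slide's actual dataset.
from sklearn.tree import DecisionTreeClassifier

# Historical purchase data: [age, gender (0=F, 1=M)] -> purchased item
X_train = [[30, 1], [40, 1], [22, 0], [35, 0], [28, 1]]
y_train = ["Toy", "Books", "Toy", "Books", "Toy"]

model = DecisionTreeClassifier().fit(X_train, y_train)  # learn f(X) = Y', Y' ~ Y

# New, unseen customers: the model predicts the item instead of a human
# writing Rule 1, Rule 2, ... by hand.
print(model.predict([[39, 1], [35, 0]]))
```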
Why Use Machine Learning?
• Use ML when you can’t code it
  • Complex tasks where deterministic solutions don’t suffice
  • E.g. recognizing speech/images
• Use ML when you can’t scale it
  • Replace repetitive tasks needing human-like expertise
  • E.g. recommendations, spam, fraud detection, machine translation
• Use ML when you have to adapt/personalize
  • E.g. recommendation and personalization
Supervised Learning – How Machines Learn
e.g. photo classification and tagging (“It is a cat.”); human intervention and validation are required.
• Training: labeled training data goes into the machine learning algorithm; each prediction is compared with its label and the model is adjusted.
• Inference: a new input goes through the trained model to produce a prediction.
Literature Review on ML
Learning hierarchical representations through deep supervised, unsupervised, and reinforcement learning.
• Bellman 1957: Dynamic Programming
• Baum 1966: Hidden Markov Models (HMM)
• Dempster 1977: Expectation-Maximization (EM)
• Good Old-Fashioned Artificial Intelligence (GOFAI)
• SVM, Kernel-SVM
Model Training
• Split the data: 70% for training, 30% held out for testing.
• Train a trial model on the 70% training split.
• Evaluate the trial model on the held-out 30% test split.
• Performance measurement: compute metrics such as accuracy from the evaluation results.
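The split/train/evaluate loop above can be sketched with scikit-learn; the iris dataset stands in for the purchase data, and the classifier choice is arbitrary:

```python
# A minimal sketch of the 70/30 split, training, and accuracy measurement.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)          # stand-in training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

trial_model = DecisionTreeClassifier().fit(X_train, y_train)  # train on 70%
predictions = trial_model.predict(X_test)                     # evaluate on 30%
print("Accuracy:", accuracy_score(y_test, predictions))
```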
Deep Learning
What is Deep Learning?
• Deep Learning is a subfield of machine learning
concerned with algorithms inspired by the structure
and function of the brain called artificial neural
networks.
Traditional Machine Learning Algorithms
[Figure: model performance vs. amount of data; deep learning keeps improving with more data while traditional machine learning algorithms plateau]
Sample Deep Learning Use Cases
[Figure: example use cases, e.g. natural language processing]
The Neuron
• Input: a vector of training data x = (x0, x1, ..., xn)
• Output: a linear function of the inputs, ⟨w, x⟩ + b, passed through a nonlinearity σ that transforms the output into the desired range of values
• Training: learn the weights w and the bias b by minimizing the loss

f(x) = σ(⟨w, x⟩ + b)
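A minimal sketch of this neuron in NumPy, assuming a sigmoid for the nonlinearity σ (any squashing function would do):

```python
import numpy as np

def sigmoid(z):
    # nonlinearity: squashes the linear output into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])    # input vector x
w = np.array([0.1, 0.4, -0.2])    # weights w (learned during training)
b = 0.05                          # bias b (learned during training)

output = sigmoid(np.dot(w, x) + b)  # f(x) = sigma(<w, x> + b)
print(output)
```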
Human Brain Neuron
[Figure: a biological neuron, with inputs (dendrites) and an output (axon)]
Neural Network
Neurons are organized into layers: an input layer, hidden layers, and an output layer, with weighted connections (w10, w11, w12, w13, ...) between consecutive layers. In the forward pass, an input X = (X0, ..., Xn) flows through Hidden Layer 1 and Hidden Layer 2 to the output neuron, producing a prediction. During training the prediction is compared against the label (e.g. the digit 4), and the mismatch is measured as an error/loss.

Input Layer → Hidden Layer 1 → Hidden Layer 2 → Output Layer
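A sketch of the forward pass for this layout in NumPy; the layer sizes, random weights, ReLU activations, and softmax cross-entropy loss are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(8)                       # input layer (8 features)

# small random weights for two hidden layers and an output layer
W1, b1 = 0.1 * rng.standard_normal((16, 8)), np.zeros(16)
W2, b2 = 0.1 * rng.standard_normal((16, 16)), np.zeros(16)
W3, b3 = 0.1 * rng.standard_normal((10, 16)), np.zeros(10)

h1 = np.maximum(0, W1 @ x + b1)                  # hidden layer 1 (ReLU)
h2 = np.maximum(0, W2 @ h1 + b2)                 # hidden layer 2 (ReLU)
scores = W3 @ h2 + b3                            # output layer: one score per class

label = 4                                        # ground-truth class
loss = -scores[label] + np.log(np.sum(np.exp(scores)))  # softmax cross-entropy
print(loss)
```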
Neural Network – Backpropagation
The error/loss at the output is propagated backwards through the network, and each weight is updated (w10 → w'10, w11 → w'11, ...) in the direction that reduces the loss.

Input Layer → Hidden Layer 1 → Hidden Layer 2 → Output Layer (the loss flows backwards and the weights are updated)
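The weight-update step can be illustrated on a single neuron with a squared-error loss; all numbers below are made up:

```python
import numpy as np

x = np.array([1.0, 2.0])    # input
y = 4.0                     # label
w = np.array([0.5, -0.3])   # weights to be updated
b = 0.1                     # bias
lr = 0.05                   # learning rate

for step in range(3):
    y_pred = np.dot(w, x) + b        # forward pass
    loss = 0.5 * (y_pred - y) ** 2   # error/loss
    grad = y_pred - y                # dLoss/dy_pred
    w -= lr * grad * x               # backpropagate: dLoss/dw = grad * x
    b -= lr * grad                   # dLoss/db = grad
    print(step, loss)                # the loss shrinks each step
```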
NNs & DL: Neural Networks
Naming conventions:
◦ “N-layer network”: the count does not include the input layer
◦ Also called “Artificial Neural Networks” (ANN) or “Multi-Layer Perceptrons” (MLP)
Output layer: normally has no activation function (equivalently, a linear identity activation); it outputs scores (e.g. probabilities in the range 0-1 for classification).
Sizing neural networks: by the number of neurons, or more commonly by the number of parameters.
DL: an NN-like model with many such stages.
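For example, counting the parameters of a small 2-layer network (sizes chosen arbitrarily):

```python
# Sizing a network by parameter count: 3 inputs, a hidden layer of 4
# neurons, and 2 output neurons (illustrative sizes).
n_in, n_hidden, n_out = 3, 4, 2

weights = n_in * n_hidden + n_hidden * n_out   # 3*4 + 4*2 = 20
biases = n_hidden + n_out                      # 4 + 2 = 6
print(weights + biases)                        # 26 learnable parameters
```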
NNs & DL: Neural Networks Variations
NNs & DL: Neural Network Drawbacks
◦ Local minima: “Who is Afraid of Non-Convex Loss Functions?” (Yann LeCun)
◦ Unsupervised learning
◦ No-memory networks: addressed by recurrent nets, LSTM
◦ Computational cost of conv layers: addressed by GPUs
◦ Memory bottleneck:
  ◦ Network compression (SqueezeNet)
  ◦ Model re-design
NNs & DL: DL vs Traditional
NNs & DL: Use Cases
1. Data security: malware prediction, detecting abnormal data-access behaviour
2. Personal security: speeding up screening, spotting things human screeners miss
3. Financial trading: stock market prediction
4. Healthcare: cancer prediction
5. Marketing personalization: targeting audiences
6. Fraud detection: spotting potential cases of fraud
7. Recommendations: Amazon, Netflix
8. Online search
9. NLP
10. Smart cars
And so on...
NNs & DL: DL vs Traditional
NNs & DL: Number and Size of Layers
◦ Ratio of weight updates to weight magnitudes
◦ First-layer visualizations
◦ Choice of solver
http://cs231n.github.io/neural-networks-3/#loss
NNs & DL: Training Networks
Introduction to CNNs: How the brain’s visual system works
Introduction to CNNs: Neural networks
Introduction to CNNs: Image Convolution
Introduction to CNNs: Convolution Layer
Convolution as a neural layer:
◦ Goal: not to use predefined kernels, but instead to learn data-specific kernels.
Introductionto CNNs: ConvolutionLayer
n Convolutional layers are locallyconnected
u A filter/kernel/window slides on the image or the
previous map
u The position of the filter explicitlyprovides
information for localizing
n Convolutional layers share weightsspatially:
translation-invariant
u Translation-invariant: a translated region will produce
the same response at the correspondingly translated
position
u A local pattern’s convolutional response can be re-
used by different candidate regions
n Convolutional layers can be applied toimages of
any sizes, yielding proportionally-sizedoutputs
Convolution: Principle
Cross-correlation: computing a series of dot products and putting them into an output vector.
Convolution: the same as cross-correlation, but with the kernel flipped.
Key feature: shift-invariant and linear, hence simple.
Differences:
◦ Convolution is associative: F * (G * I) = (F * G) * I
◦ Cross-correlation matches a template to an image
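The flip relationship can be checked numerically with SciPy (arbitrary values):

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [-1.0, 2.0]])

conv = convolve2d(image, kernel, mode="valid")
corr = correlate2d(image, np.flip(kernel), mode="valid")
print(np.allclose(conv, corr))  # True: convolution = cross-correlation
                                # with a flipped kernel
```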
How to Calculate Convolution
Complexity: O(w · h · Fw · Fh) for a w × h image and an Fw × Fh filter.
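A naive sliding-window implementation that makes the O(w·h·Fw·Fh) cost explicit (cross-correlation form, “valid” output size; shapes are assumptions):

```python
import numpy as np

def conv2d_naive(image, kernel):
    h, w = image.shape
    fh, fw = kernel.shape
    out = np.zeros((h - fh + 1, w - fw + 1))
    for i in range(out.shape[0]):             # slide over rows ...
        for j in range(out.shape[1]):         # ... and over columns
            window = image[i:i + fh, j:j + fw]
            out[i, j] = np.sum(window * kernel)   # one dot product per position
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0                # 3x3 averaging filter
print(conv2d_naive(image, kernel))
```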
Introduction to CNNs: HOG by Convolutional Layers
CNNs: FC and ReLU Layers
Fully-connected (FC) layers:
◦ As in a regular neural network: full connections to all activations in the previous layer
◦ Common use: predicting a label
◦ FC-to-CONV conversion: an FC layer with K=4096 on a 7×7×512 input volume can be equivalently expressed as a CONV layer with F=7, P=0, S=1, K=4096
◦ The filter size is then exactly the size of the input volume, so the output is simply 1×1×4096
ReLU (Rectified Linear Units):
◦ Applies the non-saturating activation function f(x) = max(0, x)
◦ Increases the nonlinear properties of the decision function without affecting the receptive fields of the convolution layer
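The FC-to-CONV equivalence can be sketched in PyTorch (the framework choice is ours, not the slide’s); copying the FC weights into the conv filters makes both layers compute the same values:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 512, 7, 7)                  # input volume 7x7x512

fc = nn.Linear(512 * 7 * 7, 4096)              # FC layer with K=4096
conv = nn.Conv2d(512, 4096, kernel_size=7, padding=0, stride=1)  # F=7,P=0,S=1

# Reuse the FC weights as conv filters of exactly the input volume's size.
with torch.no_grad():
    conv.weight.copy_(fc.weight.view(4096, 512, 7, 7))
    conv.bias.copy_(fc.bias)

out_fc = fc(x.flatten(1))                      # shape (1, 4096)
out_conv = conv(x)                             # shape (1, 4096, 1, 1)
print(torch.allclose(out_fc, out_conv.flatten(1), atol=1e-4))  # True
```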
Introduction to CNNs: Convolutional Neural Network
CNNs: Learning
CNNs: Transfer Learning
Very few people train from scratch (random initialization):
◦ Few datasets are of sufficient size compared to ImageNet (1.2M images, 1000 classes)
◦ Training on ImageNet takes 2-3 weeks on a modern GPU (e.g. a Titan X)
VGGNet, from Karen Simonyan and Andrew Zisserman: the runner-up in ILSVRC 2014.
◦ Showed that the depth of the network is a critical component of good performance
◦ Two well-known architectures: VGG-16, VGG-19
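A minimal transfer-learning sketch with torchvision’s pretrained VGG-16; the 10-class head and frozen feature extractor are illustrative choices:

```python
import torch.nn as nn
from torchvision import models

model = models.vgg16(weights="IMAGENET1K_V1")  # weights pretrained on ImageNet

for param in model.features.parameters():      # freeze the convolutional layers
    param.requires_grad = False

# Replace the final FC layer (4096 -> 1000) with a new head for our task.
model.classifier[6] = nn.Linear(4096, 10)      # 10 classes: an assumption

# Train only the new head, e.g.:
# optimizer = torch.optim.SGD(model.classifier[6].parameters(), lr=1e-3)
```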
CNNs: Applications
References
1. CS231n: Convolutional Neural Networks for Visual Recognition, Stanford University
2. Jürgen Schmidhuber, 2015: Deep Learning in Neural Networks: An Overview
3. Yann LeCun: Unsupervised Learning: The Next Frontier in AI
Useful Links
• Understanding LSTM
• YOLO: real-time object detection
• Visualize CNN
• Google DeepDream