2017 - Binary Convolutional Neural Network On RRAM - PPT
Tianqi Tang, Lixue Xia, Boxun Li, Yu Wang, Huazhong Yang
Outline
• Experimental Results
– Comparison between “BCNN on RRAM” and “Multi-bit CNN on RRAM”
– Recognition Accuracy under Device Variation
– Area and Energy Cost Saving
• Conclusion
Convolutional Neural Network
• Popular in the Recent Ten Years, Good Performance on a Wide Range of Applications
– Generalized Recognition (CNN): Object Detection & Localization, Image Caption
– Specialized Recognition (CNN): Lane Detection & Vehicle Detection, Pedestrian Detection, Face Recognition
– Beyond Vision Tasks (CNN + other): Speech Recognition (CNN + RNN), Natural Language Processing (CNN + LSTM), Chess & Go (CNN + Reinforcement Learning)
Convolutional Neural Network
• Winners of the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC)
– Task: Classification & Localization with Provided Data

Year         2011     2012         2013      2014    2015        2016
Team         XRCE     SuperVision  Clarifai  VGG     MSRA        Trimps-Soushen
Model        Not CNN  AlexNet      ZF        VGG-16  ResNet-152  Ensemble
Err (Top-5)  25.8%    16.4%        11.7%     7.4%    3.57%       2.99%
CNN Operations
• CNN operations: Fully-Connected (Neuron), Convolution, Pooling, Normalization
• DNN operations: Fully-Connected (Neuron) only, a subset of the CNN operations
Convolutional Neural Network
• CNN on RRAM? The full operation set: Fully-Connected (Neuron), Convolution, Pooling, Normalization
• DNN on RRAM has been demonstrated: PRIME [ISCA 2016], ISAAC [ISCA 2016]
Convolutional Neural Network
• One-to-One Mapping: every layer of the network gets its own RRAM circuit (a mapping sketch follows below)
[Figure: pipeline from Input Image through convolution/pooling layers and FC Layers (N+1) … (N+M) to the Recognition Result]
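Mapping a convolution layer onto a crossbar reduces to a matrix-vector product. As a minimal Python sketch (the standard im2col-style view, assumed here rather than taken from the slides), each kernel is flattened into one row of the crossbar weight matrix and each sliding window into an input vector:

```python
import numpy as np

def conv_as_matvec(kernels, window):
    """One convolution position as a crossbar-style matrix-vector product:
    each kernel becomes one row of the weight matrix (the crossbar array),
    the input window becomes a column vector (the input voltages)."""
    c_out = kernels.shape[0]
    w_mat = kernels.reshape(c_out, -1)   # (C_out, k*k*C_in) weight matrix
    x_vec = window.reshape(-1)           # (k*k*C_in,) flattened input patch
    return w_mat @ x_vec                 # one pixel of each output map

kernels = np.sign(np.random.randn(20, 3, 5, 5))  # 20 binary 5x5 kernels, C_in = 3
window = np.random.randn(3, 5, 5)                # one input patch
assert conv_as_matvec(kernels, window).shape == (20,)
```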
BCNN on RRAM
• Binary CNN
• Training Workflow: Binarize while Training (a minimal sketch follows below)
• BinaryNet [arXiv:1602.02830]
• XNOR-Net [Rastegari, ECCV 2016]
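A minimal sketch of the binarize-while-training idea (following BinaryNet's scheme, not the authors' code): the forward pass uses the sign of the weights, while gradients update a real-valued master copy via the straight-through estimator:

```python
import numpy as np

def binarize(w_real):
    """Forward pass: deterministic binarization to +1/-1."""
    return np.where(w_real >= 0.0, 1.0, -1.0)

def straight_through_grad(grad_wb, w_real, clip=1.0):
    """Backward pass: pass the binary weights' gradient straight through,
    zeroed where |w| exceeds the clip range (BinaryNet's estimator)."""
    return grad_wb * (np.abs(w_real) <= clip)

# One training step: the forward pass only ever sees w_b,
# but the update is applied to the real-valued master weights.
w = 0.1 * np.random.randn(128, 128)   # real-valued master weights
w_b = binarize(w)                     # binary weights used for inference
grad = np.random.randn(*w.shape)      # stand-in for a backprop gradient
w -= 0.01 * straight_through_grad(grad, w)
```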
BCNN on RRAM: Convolver Circuit
• Column Splitting: a matrix wider than the crossbar is split along the output columns
• Row Splitting: a matrix taller than the crossbar is split along the input rows, and the partial sums are accumulated (see the tiling sketch below)
[Figure: matrix-multiplication diagrams of column splitting and of row splitting with partial-sum accumulation]
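A minimal sketch of the two splitting schemes (my illustration, assuming the 128×128 crossbar size from the experimental setup): column splitting slices the outputs across tiles, row splitting slices the inputs and accumulates partial sums:

```python
import numpy as np

XBAR = 128  # crossbar size used in the experiments (128x128)

def crossbar_matvec(w, x, xbar=XBAR):
    """Emulate y = W @ x on xbar-by-xbar RRAM tiles.
    Column splitting: each group of tiles computes a slice of the outputs.
    Row splitting: tiles along the input dimension yield partial sums
    that are accumulated afterwards (the '+' signs in the figure)."""
    n_out, n_in = w.shape
    y = np.zeros(n_out)
    for r in range(0, n_out, xbar):        # column splitting (output slices)
        for c in range(0, n_in, xbar):     # row splitting (partial sums)
            tile = w[r:r + xbar, c:c + xbar]   # one crossbar's conductances
            y[r:r + xbar] += tile @ x[c:c + xbar]
    return y

w = np.sign(np.random.randn(300, 500))  # binary weight matrix (+1/-1)
x = np.random.randn(500)
assert np.allclose(crossbar_matvec(w, x), w @ x)
```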
BCNN on RRAM: Line Buffer & Pipeline
• Sliding Window
– It is unnecessary to buffer the whole input feature map.
– The convolver circuit can wake (A) from sleep (S) once input data covering the convolution kernel size has arrived.
– A line-buffer structure is introduced to cache and fetch the intermediate feature map (a buffering sketch follows below).
[Figure: C_in line buffers sit between consecutive convolver circuits; under CONV_EN control, a MUX fetches an h × w window per input channel (h · w · C_in values) from the W-wide feature map, passing the output of the previous convolver circuit to the next]
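A minimal line-buffer sketch (my illustration, not the paper's circuit): only the kernel-height rows of the feature map are cached, and a window is emitted as soon as it is complete, which is exactly when the convolver may wake from sleep:

```python
import numpy as np
from collections import deque

def line_buffer_windows(feature_map, k):
    """Stream an H x W feature map row by row through k line buffers and
    yield each k x k window as soon as it is complete -- the point at
    which the convolver circuit can wake (A) from sleep (S)."""
    H, W = feature_map.shape
    rows = deque(maxlen=k)        # only k lines cached, not the whole map
    for r in range(H):
        rows.append(feature_map[r])
        if len(rows) < k:
            continue              # not enough data yet: convolver sleeps (S)
        block = np.stack(rows)    # the k x W slice currently buffered
        for c in range(W - k + 1):
            yield r - k + 1, c, block[:, c:c + k]   # awake (A): emit window

fmap = np.arange(36, dtype=float).reshape(6, 6)
assert len(list(line_buffer_windows(fmap, 3))) == (6 - 3 + 1) ** 2
```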
BCNN on RRAM: Line Buffer & Pipeline
[Figure: a 3×3 kernel sliding over the feature map, marking valid (✔) and invalid (✖) window positions]
Experimental Results
• Experimental Setup:
– Small Case:
• LeNet on MNIST (C5*20-S2-C5*50-S2-FC100-FC10)
• No matrix splitting required
• Studies the effect of device variation when the multi-bit/binary CNN model is mapped onto N-bit RRAM devices (a variation-model sketch follows this list)
– Large Case:
• AlexNet on ImageNet
• Matrix splitting required (4-bit ADC for the partial-sum interface)
• Estimates the area and energy cost of mapping the multi-bit/binary CNN model onto an N-bit RRAM platform
– Other Settings:
• Crossbar size: 128×128
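As an illustration of how such a study is usually set up (my assumption; the slides do not give the noise model here), weights can be quantized to the 2^N conductance levels of an N-bit device and perturbed with multiplicative lognormal variation before accuracy is re-measured:

```python
import numpy as np

def map_to_rram(w, n_bits, sigma=0.1, rng=np.random.default_rng(0)):
    """Quantize weights to 2**n_bits uniform levels in [-w_max, w_max],
    then apply lognormal device variation (sigma is an assumed variation
    strength, not a number from the paper)."""
    levels = 2 ** n_bits
    w_max = np.abs(w).max()
    idx = np.round((w + w_max) / (2 * w_max) * (levels - 1))  # 0 .. levels-1
    w_q = idx / (levels - 1) * 2 * w_max - w_max              # dequantize
    return w_q * rng.lognormal(0.0, sigma, size=w.shape)      # add variation

w = np.random.randn(100, 100)
w_multibit = map_to_rram(w, n_bits=4)          # full bit-level mode
w_binary = map_to_rram(np.sign(w), n_bits=1)   # binary mode (+1/-1 levels)
```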
Experimental Results
• With Variation:
– Full Bit-Level Mode: dynamic quantization error + variation
– Binary Mode: training error + variation
– The Binary Mode shows better robustness than the Full Bit-Level Mode.
Conclusion
Thanks for your Attention