AI ML tutorial
EE382M.20:
System-on-Chip (SoC) Design
Andreas Gerstlauer
Electrical and Computer Engineering
University of Texas at Austin
Lecture 1: Outline
• Marketing requirements
• Market focus, product description
• Cost metrics, product features
• Product requirements
• Deep learning
• Hardware acceleration
• Project description
• Deep/Convolutional Neural Networks (DNNs/CNNs)
• Object recognition
• You Only Look Once (YOLO) CNN
• Hardware and software development tasks
© 2018 A. Gerstlauer 1
EE382M.20: System-on-Chip (SoC) Design Lecture 1
Market Focus
Competition
• NVIDIA DRIVE PX
• https://www.nvidia.com/en-us/self-driving-cars
• ARM+GPU (Tegra) based solution
• Used by Tesla
Product Description
• Cost metrics
• Real-time: frames per second (FPS), reaction time
• Detection accuracy: mean average precision (mAP)
• Power/thermal: W and operating temperature (°C)
• Cost: $ or die area (mm²)
• Product features
• Supported image resolutions
• Supported detection classes
• Flexibility: dynamic, over-the-air reprogramming/updating
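The detection-accuracy metric above (mAP) can be made concrete with a small sketch. It assumes detections have already been matched against ground truth (the IoU matching step is omitted) and uses the plain, non-interpolated form of average precision; benchmark suites typically apply their own interpolation rules, so their numbers will differ. The function names and the data layout are hypothetical.

```python
def average_precision(detections, num_ground_truth):
    """Non-interpolated average precision for one class.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_true_positive) in enumerate(ranked, start=1):
        if is_true_positive:
            true_positives += 1
            # Precision at the rank of each correct detection.
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, num_ground_truth)."""
    scores = [average_precision(d, n) for d, n in per_class.values()]
    return sum(scores) / len(scores)
```

Missed objects lower the score through the division by the ground-truth count, and false positives lower it by reducing the precision at every later rank.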
Product Requirements
• High detection accuracy requires deep learning
• Convolutional Neural Network (CNN)
• Trained on large image data set
• Very computationally intensive
Neuron
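The figure for this slide is not reproduced in the text. As a minimal sketch, assuming the standard artificial neuron model (weighted sum of inputs plus a bias, passed through an activation function), a single neuron could be written as follows. The names `neuron` and `relu` are illustrative, and ReLU is chosen only because it appears in the Conv-BN-ReLU layers later in the lecture.

```python
def relu(value):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return max(0.0, value)


def neuron(inputs, weights, bias, activation=relu):
    """Compute activation(sum_i weights[i] * inputs[i] + bias)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)
```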
Convolution Operations
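Input map sliding window: $x_{c,i,j}$ ($c = 0 \ldots C-1$)
Filter of size $F_1 \times F_2$ ($f_1 = 0 \ldots F_1-1$, $f_2 = 0 \ldots F_2-1$)
Output element: $y_{i,j} = \sum_{c,f_1,f_2} w_{c,f_1,f_2} \cdot x_{c,\,i+f_1,\,j+f_2}$
[Figure: Conv-BN-ReLU layer. Input with Cin = 3 channels and Hin = 6; output with Cout = 4 channels of size Hout × Wout.]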
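The sliding-window computation on this slide can be sketched as a direct loop nest. This is an illustration under stated assumptions, not the course's reference code: stride 1, no padding, one filter per output channel (indexed here by a hypothetical `k`), and nested Python lists standing in for tensors. Batch normalization and ReLU are left out.

```python
def conv2d_direct(x, w):
    """Direct convolution, stride 1, no padding.

    x[c][i][j]:      input of shape C x H_in x W_in
    w[k][c][f1][f2]: filters of shape C_out x C x F1 x F2
    Returns y[k][i][j] of shape C_out x H_out x W_out.
    """
    channels = len(x)
    h_in, w_in = len(x[0]), len(x[0][0])
    f1_size, f2_size = len(w[0][0]), len(w[0][0][0])
    h_out = h_in - f1_size + 1
    w_out = w_in - f2_size + 1
    y = [[[0.0] * w_out for _ in range(h_out)] for _ in w]
    for k, filt in enumerate(w):              # output channel
        for i in range(h_out):                # output row
            for j in range(w_out):            # output column
                acc = 0.0
                for c in range(channels):     # input channel
                    for f1 in range(f1_size):
                        for f2 in range(f2_size):
                            acc += filt[c][f1][f2] * x[c][i + f1][j + f2]
                y[k][i][j] = acc
    return y
```

With the figure's Cin = 3 and Hin = 6 and a hypothetical 3 × 3 filter, this stride-1, unpadded form gives Hout = 6 - 3 + 1 = 4.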
[Figure: Conv-BN-ReLU layer (Cin = 3, Cout = 4, output size Hout × Wout) computed as a matrix multiplication. Step 3: perform GEMM (general matrix multiply).]
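Only the final step ("Perform GEMM") survives in the slide text. The sketch below assumes the common im2col lowering for the earlier steps: unroll each input window into a column, flatten each filter into a row, then multiply the two matrices. It also assumes stride 1 and no padding, and all function names are illustrative.

```python
def im2col(x, f1_size, f2_size):
    """Unroll sliding windows of x[c][i][j] into a matrix.

    Rows are indexed by (c, f1, f2); columns by output position (i, j).
    """
    h_out = len(x[0]) - f1_size + 1
    w_out = len(x[0][0]) - f2_size + 1
    cols = []
    for channel in x:
        for f1 in range(f1_size):
            for f2 in range(f2_size):
                cols.append([channel[i + f1][j + f2]
                             for i in range(h_out)
                             for j in range(w_out)])
    return cols, h_out, w_out


def gemm(a, b):
    """Plain matrix multiply: (M x K) times (K x N) gives (M x N)."""
    inner, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][q] for p in range(inner)) for q in range(n)]
            for row in a]


def conv2d_gemm(x, w):
    """Convolution of x[c][i][j] with w[k][c][f1][f2] via im2col + GEMM."""
    f1_size, f2_size = len(w[0][0]), len(w[0][0][0])
    cols, h_out, w_out = im2col(x, f1_size, f2_size)
    # Flatten each filter in (c, f1, f2) order to match the im2col rows.
    w_mat = [[v for chan in filt for row in chan for v in row] for filt in w]
    flat = gemm(w_mat, cols)
    # Reshape each output row back into an H_out x W_out map.
    return [[row[i * w_out:(i + 1) * w_out] for i in range(h_out)]
            for row in flat]
```

The lowering trades extra memory (each input value is copied into several columns) for a single dense matrix multiply, which is the kind of regular kernel that optimized libraries and hardware accelerators handle well.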
Project Description
Development Tasks
• ARM software development
• Compile and profile YOLO/Darknet on ARM board
• Convert floating-point to fixed-point code and check mAP
• Compile and profile fixed-point YOLO on ARM board
• Optimize software on dual-core ARM platform
• Develop hardware abstraction layer (HAL) and I/O handler
• Develop interrupt handler & driver (Linux kernel module)
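The floating-point to fixed-point conversion task above can be illustrated with a small sketch. The format is an assumption chosen for illustration (16-bit signed values with 8 fractional bits); the slides do not specify the word length or scaling used in the project, and all function names are hypothetical.

```python
def to_fixed(value, frac_bits=8, total_bits=16):
    """Quantize a float to a signed fixed-point integer with saturation."""
    scaled = round(value * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def to_float(fixed, frac_bits=8):
    """Convert a fixed-point integer back to a float."""
    return fixed / (1 << frac_bits)


def fixed_dot(weights, inputs, frac_bits=8):
    """Dot product of two fixed-point vectors.

    Each product carries 2 * frac_bits fractional bits, so the sum is
    accumulated at full precision and rescaled once at the end.
    """
    acc = 0
    for w, x in zip(weights, inputs):
        acc += w * x
    return acc >> frac_bits
```

For example, `to_float(fixed_dot([to_fixed(1.5)], [to_fixed(0.5)]))` evaluates to 0.75. Rounding and saturation introduce quantization error, which is why the task list pairs the conversion with a re-check of mAP.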