TensorRT Sample Support Guide
TABLE OF CONTENTS
www.nvidia.com
TensorRT Samples SWE-SWDOCTRT-001-SAMG_vTensorRT 6.0.1 | ii
Chapter 32. Object Detection With SSD In Python
Chapter 33. INT8 Calibration In Python
Chapter 34. Refitting An Engine In Python
Chapter 1.
INTRODUCTION
The following samples show how to use TensorRT in numerous use cases while
highlighting different capabilities of the interface.
‣ trtexec (trtexec): A tool to quickly utilize TensorRT without having to develop your own application.
‣ “Hello World” For TensorRT (sampleMNIST): Performs the basic setup and initialization of TensorRT using the Caffe parser.
‣ Building A Simple MNIST Network Layer By Layer (sampleMNISTAPI): Uses the TensorRT API to build an MNIST (handwritten digit recognition) network layer by layer, sets up weights and inputs/outputs, and then performs inference.
‣ Importing The TensorFlow Model And Running Inference (sampleUffMNIST): Imports a TensorFlow model trained on the MNIST dataset.
‣ “Hello World” For TensorRT From ONNX (sampleOnnxMNIST): Converts a model trained on the MNIST dataset in ONNX format to a TensorRT network.
‣ Building And Running GoogleNet In TensorRT (sampleGoogleNet): Shows how to import a model trained with Caffe into TensorRT, using GoogleNet as an example.
‣ Building An RNN Network Layer By Layer (sampleCharRNN): Uses the TensorRT API to build an RNN network layer by layer, sets up weights and inputs/outputs, and then performs inference.
‣ Performing Inference In INT8 Using Custom Calibration (sampleINT8): Performs INT8 calibration and inference. Calibrates a network for execution in INT8.
‣ Performing Inference In INT8 Precision (sampleINT8API): Sets per-tensor dynamic range and computation precision of a layer.
1.1. C++ Samples
You can find the C++ samples in the /usr/src/tensorrt/samples package directory
as well as on GitHub. The following C++ samples are shipped with TensorRT:
To build and run one of the C++ samples:
$ cd <samples_dir>
$ make -j4
$ cd ../bin
$ ./<sample_bin>
1.2. Python Samples
You can find the Python samples in the /usr/src/tensorrt/samples/python
package directory. The following Python samples are shipped with TensorRT:
1 This sample is located in GitHub only; this is not part of the product package.
For more information on running samples, see the README.md file included with the
sample.
Chapter 2.
APPLICATION AREAS
Recommenders
Recommender systems are used to provide product or media recommendations to
users of social networking, media content consumption and e-commerce platforms.
MLP-based Neural Collaborative Filter (NCF) recommenders employ a stack of fully-
connected or matrix multiplication layers to generate recommendations.
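Conceptually, the forward pass of such an MLP stack can be sketched in plain Python (an illustrative toy only: the embedding values, layer sizes, and weights below are made up, not taken from any sample):

```python
import math

def mlp_score(user_vec, item_vec, hidden_layers, out_w, out_b):
    """NCF-style scoring: concatenate the user and item embeddings,
    push them through fully-connected ReLU layers, then squash the
    final logit with a sigmoid into a preference score in (0, 1)."""
    x = user_vec + item_vec  # concatenation of the two embeddings
    for weights, biases in hidden_layers:
        x = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    logit = sum(w * xi for w, xi in zip(out_w, x)) + out_b
    return 1.0 / (1.0 + math.exp(-logit))

# Toy 2-d embeddings, one hidden layer of width 2, one output unit.
user = [0.2, -0.1]
item = [0.5, 0.3]
hidden = ([[0.1, 0.4, -0.2, 0.3],
           [0.2, -0.3, 0.1, 0.5]], [0.0, 0.1])
score = mlp_score(user, item, [hidden], [0.7, -0.6], 0.05)
```

Real recommender samples such as sampleMovieLens run the equivalent stack as TensorRT fully-connected layers with trained weights.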
Some examples of TensorRT recommender samples include the following:
Machine translation
Machine translation systems are used to translate text from one language to another
language. Recurrent neural networks (RNN) are one of the most popular deep learning
solutions for machine translation.
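The core recurrence behind such models can be sketched in plain Python (illustrative only; the weights below are arbitrary, and real samples such as sampleNMT use trained encoder/decoder networks):

```python
import math

def rnn_step(x, h, W, U, b):
    """One basic (Elman) RNN step: h' = tanh(W @ x + U @ h + b).
    The hidden state h carries context from earlier tokens forward."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(w_row, x)) +
                      sum(ui * hi for ui, hi in zip(u_row, h)) + bi)
            for w_row, u_row, bi in zip(W, U, b)]

# Toy 2-d inputs and hidden state; the weight values are made up.
W = [[0.5, -0.2], [0.1, 0.3]]
U = [[0.4, 0.0], [-0.1, 0.2]]
b = [0.0, 0.1]
h = [0.0, 0.0]
for x in [[1.0, 0.0], [0.0, 1.0]]:  # a two-token input sequence
    h = rnn_step(x, h, W, U, b)
```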
Some examples of TensorRT machine translation samples include the following:
Character recognition
Character recognition, especially on the MNIST dataset, is a classic machine learning
problem. The MNIST problem involves recognizing the digit that is present in an image
of a handwritten digit.
Some examples of TensorRT character recognition samples include the following:
Image classification
Image classification is the problem of identifying one or more objects present in an
image. Convolutional neural networks (CNN) are a popular choice for solving this
problem. They are typically composed of convolution and pooling layers.
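The two building blocks can be sketched in plain Python (a toy illustration with a made-up image and kernel, not code from any sample):

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most deep
    learning frameworks) of a single-channel image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(ow)] for i in range(oh)]

def max_pool2x2(fmap):
    """2x2 max pooling with stride 2: keep the strongest activation
    in each 2x2 window, halving each spatial dimension."""
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

# A 5x5 binary image and a 2x2 vertical-edge kernel give a 4x4
# feature map, which pooling reduces to 2x2.
image = [[0, 0, 1, 1, 0],
         [0, 1, 1, 0, 0],
         [1, 1, 0, 0, 1],
         [1, 0, 0, 1, 1],
         [0, 0, 1, 1, 0]]
kernel = [[1, -1],
          [1, -1]]
pooled = max_pool2x2(conv2d(image, kernel))
```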
Some examples of TensorRT image classification samples include the following:
Object detection
Object detection is one of the classic computer vision problems. The task, for a given
image, is to detect, classify and localize all objects of interest. For example, imagine
that you are developing a self-driving car and you need to do pedestrian detection -
the object detection algorithm would then, for a given image, return bounding box
coordinates for each pedestrian in an image.
There have been many advances in recent years in designing models for object detection.
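One small, ubiquitous piece of that machinery is the intersection-over-union (IoU) overlap used when matching and de-duplicating predicted boxes, sketched here as an illustrative helper (not code from any particular sample):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes, each given
    as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # 0 if no overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes offset by 5 in each direction:
# intersection 25, union 175.
score = iou((0, 0, 10, 10), (5, 5, 15, 15))
```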
Some examples of TensorRT object detection samples include the following:
Integration
Unlike the samples above, integration samples demonstrate “how to” rather than “what to do.” In some cases, the TensorRT workflow differs from the standard workflow, and the integration samples show both the workflow and the corresponding API call sequence so that developers know how to handle such cases. For example, sampleNvmedia shows how to run the TensorRT engine on a safety certified DLA, which involves NvMedia APIs.
Some examples of TensorRT integration samples include the following:
Chapter 3.
CROSS COMPILING SAMPLES FOR AARCH64
USERS
The following sections show how to cross compile TensorRT samples for AArch64 QNX,
Linux and Android platforms under x86_64 Linux.
3.1. Prerequisites
1. Install the CUDA cross-platform toolkit for the corresponding target and set the
environment variable CUDA_INSTALL_DIR.
If you are using the tar file release for the target platform, you can safely skip this step; the tar file release already includes the cross-compile libraries, so no additional packages are required.
QNX AArch64
libnvinfer-dev-cross-qnx, libnvinfer5-cross-qnx
Linux AArch64
libnvinfer-dev-cross-aarch64, libnvinfer5-cross-aarch64
Android AArch64
No Debian packages are available.
QNX AArch64:
$ export QNX_HOST=/path/to/your/qnx/toolchain/host/linux/x86_64
$ export QNX_TARGET=/path/to/your/qnx/toolchain/target/qnx7
$ cd /path/to/TensorRT/samples
$ make TARGET=qnx

Linux AArch64:
$ cd /path/to/TensorRT/samples
$ make TARGET=aarch64

Android AArch64 (first generate a standalone NDK toolchain):
$ $NDK/build/tools/make_standalone_toolchain.py \
    --arch arm64 \
    --api 26 \
    --install-dir=/path/to/my-toolchain
$ cd /path/to/TensorRT/samples
$ make TARGET=android64 ANDROID_CC=/path/to/my-toolchain/bin/aarch64-linux-android-clang++
$ cd <samples_dir>
$ make -j4
$ cd ../bin
$ ./<sample_bin>
Chapter 4.
“HELLO WORLD” FOR TENSORRT
Chapter 5.
BUILDING A SIMPLE MNIST NETWORK
LAYER BY LAYER
Chapter 6.
IMPORTING THE TENSORFLOW MODEL
AND RUNNING INFERENCE
Chapter 7.
“HELLO WORLD” FOR TENSORRT FROM
ONNX
Chapter 8.
BUILDING AND RUNNING GOOGLENET IN
TENSORRT
Chapter 9.
BUILDING AN RNN NETWORK LAYER BY
LAYER
Chapter 10.
PERFORMING INFERENCE IN INT8 USING
CUSTOM CALIBRATION
Chapter 11.
PERFORMING INFERENCE IN INT8
PRECISION
Chapter 12.
ADDING A CUSTOM LAYER TO YOUR
NETWORK IN TENSORRT
Chapter 13.
OBJECT DETECTION WITH FASTER R-CNN
Chapter 14.
OBJECT DETECTION WITH A TENSORFLOW
SSD NETWORK
Chapter 15.
MOVIE RECOMMENDATION USING NEURAL
COLLABORATIVE FILTER (NCF)
Chapter 16.
MOVIE RECOMMENDATION USING MPS
(MULTI-PROCESS SERVICE)
Chapter 17.
OBJECT DETECTION WITH SSD
Chapter 18.
“HELLO WORLD” FOR MULTILAYER
PERCEPTRON (MLP)
Chapter 19.
SPECIFYING I/O FORMATS USING THE
REFORMAT FREE I/O APIS
Chapter 20.
ADDING A CUSTOM LAYER THAT SUPPORTS
INT8 I/O TO YOUR NETWORK IN
TENSORRT
Chapter 21.
USING THE NVMEDIA API TO RUN A
TENSORRT ENGINE
sampleNvmedia is included only in the Automotive releases and therefore works only
on Standard configurations in the auto build on QNX and D5L.
Chapter 22.
DIGIT RECOGNITION WITH DYNAMIC
SHAPES IN TENSORRT
This sample demonstrates how to:
‣ Create a network with dynamic input dimensions to act as a preprocessor for the model
‣ Parse an ONNX MNIST model to create a second network
‣ Build engines for both networks
‣ Run inference using both engines
Chapter 23.
NEURAL MACHINE TRANSLATION (NMT)
USING A SEQUENCE TO SEQUENCE
(SEQ2SEQ) MODEL
Chapter 24.
OBJECT DETECTION AND INSTANCE
SEGMENTATIONS WITH A TENSORFLOW
MASK R-CNN NETWORK
This sample is available only in GitHub and is not packaged with the product.
This sample makes use of TensorRT plugins to run the Mask R-CNN model. To use these plugins, the Keras model must first be converted to a TensorFlow .pb model; this .pb model is then preprocessed and converted to a UFF model with the help of GraphSurgeon and the UFF utility.
Chapter 25.
OBJECT DETECTION WITH A TENSORFLOW
FASTER R-CNN NETWORK
This sample is available only in GitHub and is not packaged with the product.
In this sample, we provide a UFF model as a demo. In the Transfer Learning Toolkit workflow, however, a UFF model is not available: training produces a .tlt model, and tlt-export produces an .etlt model. Both are encrypted, and the Transfer Learning Toolkit user runs tlt-converter to decrypt the .etlt model and generate a TensorRT engine file in a single step. The Transfer Learning Toolkit workflow therefore consumes a TensorRT engine rather than a UFF model. Nevertheless, this sample can still serve as a demo of how to use the UFF Faster R-CNN model.
Chapter 26.
INTRODUCTION TO IMPORTING CAFFE,
TENSORFLOW AND ONNX MODELS INTO
TENSORRT USING PYTHON
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/
introductory_parser_samples/README.md file for detailed information about how
this sample works, sample code, and step-by-step instructions on how to run and verify
its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Chapter 27.
“HELLO WORLD” FOR TENSORRT USING
TENSORFLOW AND PYTHON
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/
end_to_end_tensorflow_mnist/README.md file for detailed information about how
this sample works, sample code, and step-by-step instructions on how to run and verify
its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Chapter 28.
“HELLO WORLD” FOR TENSORRT USING
PYTORCH AND PYTHON
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/network_api_pytorch_mnist/
README.md file for detailed information about how this sample works, sample code, and
step-by-step instructions on how to run and verify its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Chapter 29.
ADDING A CUSTOM LAYER TO YOUR CAFFE
NETWORK IN TENSORRT IN PYTHON
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/fc_plugin_caffe_mnist/
README.md file for detailed information about how this sample works, sample code, and
step-by-step instructions on how to run and verify its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Chapter 30.
ADDING A CUSTOM LAYER TO YOUR
TENSORFLOW NETWORK IN TENSORRT IN
PYTHON
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/uff_custom_plugin/
README.md file for detailed information about how this sample works, sample code, and
step-by-step instructions on how to run and verify its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Chapter 31.
OBJECT DETECTION WITH THE ONNX
TENSORRT BACKEND IN PYTHON
This sample is not supported on Ubuntu 14.04 and older. Additionally, the
yolov3_to_onnx.py script does not support Python 3.
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/yolov3_onnx/README.md file
for detailed information about how this sample works, sample code, and step-by-step
instructions on how to run and verify its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Chapter 32.
OBJECT DETECTION WITH SSD IN PYTHON
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/uff_ssd/README.md file for
detailed information about how this sample works, sample code, and step-by-step
instructions on how to run and verify its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Chapter 33.
INT8 CALIBRATION IN PYTHON
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/int8_caffe_mnist/README.md
file for detailed information about how this sample works, sample code, and step-by-
step instructions on how to run and verify its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Chapter 34.
REFITTING AN ENGINE IN PYTHON
Getting Started:
Refer to the /usr/src/tensorrt/samples/python/engine_refit_mnist/
README.md file for detailed information about how this sample works, sample code, and
step-by-step instructions on how to run and verify its output.
A summary of the README.md file is included in this section for your reference; however, you should always refer to the README.md within the package for the most recent documentation updates.
Notice
THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION
REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED,
STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY
DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A
PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever,
NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall
be limited in accordance with the NVIDIA terms and conditions of sale for the product.
THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED,
MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE,
AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A
SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE
(INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER
LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS
FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR
IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.
NVIDIA makes no representation or warranty that the product described in this guide will be suitable for
any specified use without further testing or modification. Testing of all parameters of each product is not
necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and
fit for the application planned by customer and to do the necessary testing for the application in order
to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect
the quality and reliability of the NVIDIA product and may result in additional or different conditions and/
or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any
default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA
product in any manner that is contrary to this guide, or (ii) customer product designs.
Other than the right for customer to use the information in this guide with the product, no other license,
either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information
in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without
alteration, and is accompanied by all associated conditions, limitations, and notices.
Trademarks
NVIDIA, the NVIDIA logo, and cuBLAS, CUDA, cuDNN, cuFFT, cuSPARSE, DALI, DIGITS, DGX, DGX-1, Jetson,
Kepler, NVIDIA Maxwell, NCCL, NVLink, Pascal, Tegra, TensorRT, and Tesla are trademarks and/or registered
trademarks of NVIDIA Corporation in the United States and other countries. Other company and product
names may be trademarks of the respective companies with which they are associated.
Copyright
© 2019 NVIDIA Corporation. All rights reserved.