
Intel® Distribution of OpenVINO™ Toolkit
Digital Courseware for Educators
Course: Deploying Deep Learning Applications

Intel® Distribution of OpenVINO™ Toolkit
Deploying Deep Learning Applications
MODULE 2: Optimization & Quantization of AI Models for Improved Performance
Notices and disclaimers

Performance varies by use, configuration, and other factors. Learn more at intel.com/PerformanceIndex .
Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available
updates. See backup for configuration details. No product or component can be absolutely secure.
Your costs and results may vary.
Intel® technologies may require enabled hardware, software, or service activation.
Intel® optimizations, for Intel® compilers or other products, may not optimize to the same degree for non-Intel products.
Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
Results have been estimated or simulated.
Intel is committed to respecting human rights and avoiding complicity in human rights abuses.
See Intel’s Global Human Rights Principles. Intel® products and software are intended only to be used in
applications that do not cause or contribute to a violation of an internationally recognized human right.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.
Other names and brands may be claimed as the property of others.

3
Module 2
Optimization & Quantization of AI
Models for Improved Performance

4
Table of Contents
• Model Optimizer
• Setting Input Shapes of a Model
• Cutting Off Parts of a Model
• Model Optimizer Optimization Techniques
• Generic Optimization
• Framework- or Topology-Specific Optimization
• Model Quantization
• Compression of a Model
• Post-Training Optimization Tool (POT)
• Benchmark Tool
• Hands-on Labs
• Exercise 1: Download a model from OMZ using OpenVINO™ Notebooks - 104-model-tools
• Exercise 2: Tiny YOLO* V3 to IR conversion using the OpenVINO™ toolkit

5
Module 2: Learning Objectives

• Recognize the importance of optimizing and tuning pre-trained models for AI inference.
• Understand the Model Optimizer, the Post-Training Optimization Tool, and their functions.
• Learn about the OpenVINO™ Intermediate Representation (IR).
• Implement the model optimization strategies: quantization and topology optimization.
• Understand the workflow and the factors to consider when using quantization for Deep Learning models.
• Work on practical projects to understand the difference between pre- and post-optimization model performance.

6
Module 2: Learning Outcomes

After completing this module, students should be able to:

• Explain why optimizing and tuning Deep Learning models for inference is necessary.
• Become acquainted with the Model Optimizer and POT tools from the OpenVINO™ toolkit and how to use them.
• Make informed technical decisions to select the best optimization strategy.
• Describe the advantages and disadvantages of various model optimization strategies.

7
Module 2: Key Questions Addressed
• Why do pre-trained Deep Learning models need further optimization?
• What exactly is the Model Optimizer? What role does it play?
• What is the Intermediate Representation (IR) used by the OpenVINO™ toolkit?
• What is quantization, and what factors need to be kept in mind while using this optimization method?
• What are the different optimization strategies available with the OpenVINO™ toolkit?
• What is the Post-training Optimization tool? How is it useful?
• How can you benchmark model performance with the OpenVINO™ toolkit?

8
Model Optimizer

9
Convert model with Model Optimizer
▪ A Python*-based tool that reads trained models and converts them to the Intermediate Representation format
▪ Optimizes for performance or space with conservative topology transformations
▪ Hardware-agnostic optimizations

The simplest way to convert a model is:
> mo --input_model <INPUT_MODEL>

To get the full list of conversion parameters available in Model Optimizer, run the following command:
> mo --help

Workflow: Get Your Model -> Run Model Optimizer -> Intermediate Representation (IR), .xml and .bin

10
Discussion Points

• What are some other reasons for pre-trained models requiring optimization?
• What are some tradeoffs that need to be kept in mind while optimizing a pre-trained
model?

11
Model Optimizer: Generic Optimization
Operations Pruning
• Drop unused operations that only matter for training.

Linear Operations Fusion
1. BatchNorm and ScaleShift decomposition
   • BatchNorm operations decompose to a Mul->Add->Mul->Add sequence.
   • ScaleShift operations decompose to a Mul->Add sequence.
2. Linear Operations Merge: merges sequences of Mul and Add operations into a single Mul->Add instance.
3. Linear Operations Fusion: fuses Mul and Add operations into Convolution or Fully Connected layers.

Linear Operations Fusion example: Caffe* ResNet-269 before MO and the resulting IR after MO.
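To make the fusion idea concrete, here is a small numerical sketch (illustrative only; it is not Model Optimizer code) of folding a BatchNorm that follows a convolution into the convolution's weights and bias:

import numpy as np

# Convolution weights/bias (out_channels, in_channels, kH, kW) and BatchNorm parameters.
W = np.random.randn(16, 3, 3, 3).astype(np.float32)
b = np.zeros(16, dtype=np.float32)
gamma, beta = np.random.rand(16), np.random.rand(16)        # BN scale and shift
mean, var, eps = np.random.rand(16), np.random.rand(16) + 0.5, 1e-5

# BN(conv(x)) = s * (W x + b - mean) + beta, with s = gamma / sqrt(var + eps),
# which is just another convolution with rescaled weights and a new bias:
s = gamma / np.sqrt(var + eps)
W_fused = W * s[:, None, None, None]
b_fused = (b - mean) * s + beta

# The separate Mul/Add nodes produced by the BN decomposition disappear from the graph,
# because the fused convolution computes exactly the same result.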

12
Setting Input Shapes
Use the CLI option --input_shape. Model Optimizer supports conversion of models with dynamic input shapes that contain undefined dimensions. However, if the shape of the data is fixed, it is recommended to set a fully defined shape for the inputs; this can benefit both performance and memory consumption.

• Example 1: Run the Model Optimizer for the TensorFlow* MobileNet model with the single input
and specify input shape [2,300,300,3].
mo --input_model MobileNet.pb --input_shape [2,300,300,3]

• Example 2: Run the Model Optimizer for the ONNX* OCR model with a pair of inputs, data and
seq_len. Then specify shapes [3,150,200,1] and [3] respectively.
mo --input_model ocr.onnx --input data,seq_len --input_shape [3,150,200,1],[3]

Learn more about Input Shape: https://intel.ly/DEh4Af

13
Cutting Off Parts of a Model
Use the CLI options --input and --output
Here are several reasons why some parts of a model may need to be removed by the Model Optimizer while converting models to the Intermediate Representation:

• The model has pre- or post-processing parts that cannot be translated.
• The model has a training part that is convenient to keep in the model but is not used during inference.
• The model contains many unsupported operations that cannot easily be implemented as custom layers.
• There is a problem with model conversion in the Model Optimizer.
• A single custom layer, or a combination of custom layers, is isolated for debugging purposes.

Example: Cutting at the End
mo --input_model inception_v1.pb --output=InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu

Example: Cutting from the Beginning
mo --input_model inception_v1.pb --input 0:InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu

Learn more about Cutting Off Parts of a Model: https://intel.ly/DE38291

14
Embedding Preprocessing Computation
Use the CLI options --mean_values, --scale, --reverse_input_channels, and --layout
• The input color channel sequence for inference can be different from the training data set.
• The Model Optimizer (MO) generates IR with additional subgraphs inserted that perform the defined preprocessing, so that it is performed on the inference device.

Run the MO on a PaddlePaddle* UNet model and apply mean and scale normalization to the input data:
mo --input_model unet.pdmodel --mean_values [123,117,104] --scale 255

Run the MO on a TensorFlow* AlexNet model and embed a reverse_input_channels preprocessing block into the IR:
mo --input_model alexnet.pb --reverse_input_channels

Run the MO on an ONNX* NASNet model to convert the layout to NCHW:
mo --input_model tf_nasnet_large.onnx --layout "nhwc->nchw"

Learn more about Embedding Preprocessing Computation: https://intel.ly/DE3z2tX

15
Framework- or Topology-Specific Optimization
Grouped Convolutions Fusing

Grouped convolution fusing is a specific optimization that applies to TensorFlow* topologies. The main idea of this optimization is to combine the convolution results for the Split outputs and then recombine them using a Concat operation in the same order as they came out of the Split.

You can read about other strategies in depth in the Model Optimizer documentation.

16
Discussion Points

• What are some changes you noticed between the original model and the optimized OpenVINO™ IR?
• Why is it essential for the Model Optimizer's optimizations to be hardware-agnostic?

17
Model Quantization
Model quantization is a method of representing Deep Learning models using less memory.
Most Deep Learning models are trained using full precision, or FP32, representation. However, research has shown that you can perform inference at lower numerical precision with minimal change in model accuracy.
At lower numerical precision, INT8 or a similar data format is used to store the weights and biases of the deep learning model.
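As an illustration of what quantization does numerically, here is a small worked sketch (not OpenVINO code; the scale/zero-point scheme shown is a common, simplified one):

import numpy as np

weights_fp32 = np.array([-1.2, -0.3, 0.0, 0.7, 1.5], dtype=np.float32)

# Map the observed FP32 range onto the 256 representable INT8 values.
lo, hi = weights_fp32.min(), weights_fp32.max()
scale = (hi - lo) / 255.0                         # size of one INT8 step in FP32 units
zero_point = int(round(-lo / scale)) - 128        # INT8 code that represents 0.0

q  = np.clip(np.round(weights_fp32 / scale) + zero_point, -128, 127).astype(np.int8)
dq = (q.astype(np.float32) - zero_point) * scale  # dequantized approximation

print(q)    # [-128  -43  -15   51  127]
print(dq)   # close to the original values; the difference is the quantization error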

Plugin | FP32 | FP16 | I8
CPU plugin | Supported and preferred | Supported | Supported
GPU plugin | Supported | Supported and preferred | Supported
VPU plugins | Not supported | Supported | Not supported
GNA plugin | Supported | Supported | Not supported
Arm® CPU plugin | Supported and preferred | Supported | Supported (partially)

In addition to accuracy considerations, you need to ensure that the hardware platform supports the data format of your quantization. The table above provides this compatibility information.
18
Compression of a Model to FP16
Use the CLI option --data_type
• Model Optimizer can convert all floating-point weights to 16-bit.
• The resulting model will occupy about half the disk space and runtime memory.
• FP16 is the recommended data type for GPU optimizations and is the only supported data type for
MYRIAD VPUs

mo --input_model INPUT_MODEL --data_type FP16

Note: FP16 compression may cause a small accuracy drop, although for the majority of models the degradation is negligible.

Learn more about FP16 compression: https://intel.ly/DE3FriH
19
Discussion Points

• Which models are not trained in INT8 precision?


• In addition to quantization, what are some other optimization techniques that you can
apply to a model after training?

20
Post-Training Optimization Tool (POT)

21
Overview of Post-Training Optimization Tool
The POT uses a conversion technique that reduces the model size to low precision without retraining.

• Improves latency with little degradation in model accuracy.
• Different optimization approaches are supported: quantization algorithms, etc.
• Available as a command line tool and as an API.

Workflow: an OpenVINO™ IR model (FP32 or FP16, .xml & .bin), a configuration (CLI/API), and a dataset go into the Post-Training Optimization Tool, which produces an optimized INT8 OpenVINO IR model.

Learn more about the POT overview: https://intel.ly/POT

22
Command Line Tool
Simplified Engine uses Default Quantization Algorithm
Command:
pot \
  -q default \
  -m <path_to_xml> \
  -w <path_to_bin> \
  --engine simplified \
  --data-source <path_to_data>

• Simplified mode is designed to make data preparation for the model optimization process easier.
• The Default Quantization algorithm is designed to do a fast and, in many cases, accurate quantization.
• It does not rely on an accuracy metric, but it provides a lot of knobs that can be used to improve the quantization result.

Workflow: an OpenVINO™ IR model (FP32 or FP16, .xml & .bin), a POT configuration (CLI), and a dataset go into the Post-Training Optimization Tool, which produces an optimized INT8 OpenVINO IR model.

Learn more about the parameters of the Default Quantization algorithm: https://intel.ly/DE3R7qr

23
Command Line Tool
Accuracy Checker Engine uses Accuracy-Aware Quantization Algorithm
• When the Default Quantization algorithm introduces a significant accuracy degradation, the Accuracy-Aware Quantization algorithm can be used to ensure that the accuracy remains within a pre-defined drop range.
• The drop range is the maximum amount of accuracy loss the developer will allow.
• This may cause a degradation in inference compute performance compared to the Default Quantization algorithm, because some layers can be reverted to the original precision.

Command:
pot \
  -q accuracy_aware \
  -m <path_to_xml> \
  -w <path_to_bin> \
  --ac-config <path_to_AC_config_yml> \
  --max-drop 0.01

Workflow: an OpenVINO™ IR model (FP32 or FP16, .xml & .bin), an Accuracy Checker configuration (.yaml), a POT configuration (CLI), and a dataset go into the Post-Training Optimization Tool, which produces an optimized INT8 OpenVINO IR model.

Learn more about the parameters of the Accuracy-Aware Quantization algorithm: https://intel.ly/DE3VDvo

24
Configuration of Accuracy Checker
• The Accuracy Checker configuration .yaml file declares the validation process.
• Every validated model must have its entry with distinct name, launcher, datasets, and other properties

models:
  - name: mobilenet-ssd
    launchers:
      - framework: openvino   # backend framework for Accuracy Checker
        adapter: ssd          # converts raw framework output to a high-level, problem-specific representation (e.g., ClassificationPrediction, DetectionPrediction)
    datasets:
      - name: VOC2007_detection
        data_source: <DATASET_PATH>
        preprocessing:        # list of preprocessing steps applied to input data
          - type: resize
            size: 300
        postprocessing:       # list of postprocessing steps
          - type: resize_prediction_boxes
        metrics:              # list of metrics that should be computed
          - type: map
            integral: 11point
            ignore_difficult: True
            presenter: print_scalar
            reference: 0.67

Learn more about Accuracy Checker: https://intel.ly/DE38L9B

25
Command Line Tool
Customized Config through Configuration File Description
The configuration .json file contains all the parameters required by POT.

Command:
pot -c mobilenet-ssd.json

Workflow: the POT configuration (a .json file read by the CLI) references the Accuracy Checker configuration (.yaml); together with the OpenVINO™ IR model (FP32 or FP16, .xml & .bin) and a dataset, it is fed to the Post-Training Optimization Tool, which produces an optimized INT8 OpenVINO IR model.

26
Sample Configuration File of POT
Logically, all parameters are divided into three groups:
• Model parameters are related to the model definition.
• Engine parameters define the parameters of the engine that is responsible for model inference and the data preparation used for optimization and evaluation.
• Compression parameters are related to the optimization algorithm.

{
  "model": {
    "model_name": "mobilenet-ssd",
    "model": "./public/mobilenet-ssd/FP32/mobilenet-ssd.xml",
    "weights": "./public/mobilenet-ssd/FP32/mobilenet-ssd.bin"
  },
  "engine": {
    "config": "./mobilenet-ssd.yaml"
  },
  "compression": {
    "algorithm": {
      "name": "AccuracyAwareQuantization",
      "params": {
        "preset": "performance",
        "stat_subset_size": 300,
        "maximal_drop": 0.01
      }
    }
  }
}

Learn more about the Configuration File: https://intel.ly/DE3AGwo

27
OpenVINO™ Notebooks
114-quantization-simplified-mode
• This tutorial demonstrates how to perform INT8 quantization with an image classification model using the Post-Training Optimization Tool Simplified Mode (part of OpenVINO).
• We use the ResNet-20 model and the CIFAR-10 dataset.
• The code in this tutorial is designed to extend to custom models and datasets. It consists of the following steps:
  • Download and prepare the ResNet-20 model and calibration dataset
  • Prepare the model for quantization
  • Compress the model using the simplified mode
  • Compare performance of the original and quantized models
  • Demonstrate the results of the optimized model

https://intel.ly/DE3R8dQ

28
POT Python* API
Default Quantization algorithm using an unannotated dataset
To use this method, you need to create a Python* script that implements a data loader and the quantization pipeline:
1. Prepare the data and dataset interface, using openvino.tools.pot.DataLoader.
2. Select the quantization parameters - the same as in the configuration .json file, but in your Python code.
3. Define and run the quantization process - from openvino.tools.pot import IEEngine, load_model, save_model, compress_model_weights, create_pipeline.

Post-Training Optimization Tool API: the user implements the DataLoader (fed by the data) and the configuration; the existing API helpers provide IEEngine (the engine), load_model() to read the OpenVINO™ FP32 IR model, the Pipeline, and save_model() to write the optimized OpenVINO INT8 IR model.

Learn more about the POT Python API for the Default Quantization algorithm: https://intel.ly/DE4bwYp

29
Code Example of Defining a DataLoader for an Image Dataset

In most cases, it is only required to implement the openvino.tools.pot.DataLoader interface, which acquires data from a dataset, applies model-specific pre-processing, and provides access by index. Any implementation should override the following methods:

• __len__(), which returns the size of the dataset.
• __getitem__(), which provides access to the data by index in the range 0 to len(self). It can also encapsulate the logic of model-specific pre-processing. The method should return data in the following format: (data, annotation).
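A minimal sketch of such a loader for a folder of images is shown below (illustrative assumptions: a flat image folder, OpenCV-based preprocessing, and a 224x224 NCHW input; none of these come from the original slide):

import os
import cv2                      # assumption: OpenCV is used for image loading/resizing
import numpy as np
from openvino.tools.pot import DataLoader

class ImageFolderLoader(DataLoader):
    """Loads images from a flat folder; illustrative sketch."""

    def __init__(self, data_source, input_size=(224, 224)):
        self._root = data_source
        self._files = sorted(os.listdir(data_source))
        self._size = input_size                     # assumed model input H x W

    def __len__(self):
        # Returns the size of the dataset
        return len(self._files)

    def __getitem__(self, index):
        # Access by index in the range 0..len(self)-1, with model-specific pre-processing;
        # returns (data, annotation)
        image = cv2.imread(os.path.join(self._root, self._files[index]))
        image = cv2.resize(image, self._size)
        image = image.transpose(2, 0, 1)[None].astype(np.float32)   # HWC -> NCHW
        return image, None       # no annotation needed for Default Quantization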

Learn more about Text and Audio DataLoaders: https://intel.ly/DE3Q4Yp

30
Code Example of Creating Pipeline
• The POT API provides its own methods to load and save model objects from the OpenVINO Intermediate Representation: load_model and save_model.
• It also has a concept of a Pipeline that sequentially applies the specified optimization methods to a model. The create_pipeline method is used to instantiate a Pipeline object.
• The code example below shows a basic quantization workflow:
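A minimal sketch of that workflow is given below (the model paths, engine settings, and algorithm parameters are illustrative assumptions that play the same role as the model, engine, and compression groups of the configuration .json file):

from openvino.tools.pot import IEEngine, load_model, save_model, compress_model_weights, create_pipeline

model_config  = {"model_name": "mobilenet-ssd",          # illustrative paths
                 "model":   "mobilenet-ssd.xml",
                 "weights": "mobilenet-ssd.bin"}
engine_config = {"device": "CPU"}
algorithms    = [{"name": "DefaultQuantization",
                  "params": {"target_device": "CPU",
                             "preset": "performance",
                             "stat_subset_size": 300}}]

model = load_model(model_config)                          # read the FP32/FP16 IR
data_loader = ImageFolderLoader("./calibration_images")   # DataLoader from the previous slide
engine = IEEngine(config=engine_config, data_loader=data_loader)

pipeline = create_pipeline(algorithms, engine)            # build the optimization pipeline
compressed_model = pipeline.run(model)                    # run quantization
compress_model_weights(compressed_model)                  # optional: smaller .bin on disk
save_model(compressed_model, save_path="./optimized")     # write the INT8 IR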
31
POT Python* API
Quantizing with Accuracy Control

• This method assumes that users have already tried Default Quantization for the same model, but it introduced a significant accuracy degradation.
• Some layers can be reverted to the original precision.

Steps:
1. Prepare the data and dataset interface - openvino.tools.pot.DataLoader.
2. Define the accuracy metric - openvino.tools.pot.Metric.
3. Select the quantization parameters - the same as in the configuration .json file, but in your Python code.
4. Define and run the quantization process - from openvino.tools.pot import load_model, save_model, compress_model_weights, create_pipeline.

Post-Training Optimization Tool API: the user implements the DataLoader (fed by the data), the Metric, and the configuration; the existing API helpers provide IEEngine (the engine), load_model() to read the OpenVINO™ FP32 IR model, the Pipeline, and save_model() to write the optimized OpenVINO INT8 IR model.

Learn more about the POT Python API for Accuracy Control: https://intel.ly/AccuracyAware

32
Code Example of Defining an Accuracy Metric
• To control accuracy during the optimization, the openvino.tools.pot.Metric interface should be implemented (a sketch follows this list). Each implementation should override the following:
• Properties:
  • value - returns the accuracy metric value for the last model output, in the format Dict[str, numpy.array].
  • avg_value - returns the average accuracy metric over the collected model results, in the format Dict[str, numpy.array].
  • higher_better - returns True if a higher value of the metric corresponds to better performance; otherwise, returns False.
• Methods:
  • update(output, annotation) - calculates and updates the accuracy metric value using the last model output and annotation.
  • reset() - resets the collected accuracy metric.
  • get_attributes() - returns a dictionary of metric attributes:
    • direction - (higher-better or higher-worse) a string parameter defining whether the metric value should be increased in accuracy-aware algorithms.
    • type - a string representation of the metric type, for example 'accuracy' or 'mean_iou'.
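A minimal sketch of such a metric is shown below (a simple top-1 accuracy; the output and annotation shapes are assumptions, not part of the original slide):

import numpy as np
from openvino.tools.pot import Metric

class Accuracy(Metric):
    """Minimal top-1 accuracy metric; illustrative sketch only."""

    def __init__(self):
        super().__init__()
        self._name = "accuracy"
        self._matches = []

    @property
    def value(self):
        # Metric value for the last model output
        return {self._name: self._matches[-1]}

    @property
    def avg_value(self):
        # Average metric value over all collected results
        return {self._name: np.mean(self._matches)}

    @property
    def higher_better(self):
        # A higher accuracy value means better performance
        return True

    def update(self, output, target):
        # Compare the predicted class of the last output with the annotation
        predicted = np.argmax(output[0])
        self._matches.append(float(predicted == target[0]))

    def reset(self):
        self._matches = []

    def get_attributes(self):
        return {self._name: {"direction": "higher-better", "type": "accuracy"}}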

Learn more about the POT Python API for Accuracy Control: https://intel.ly/DE3pYMd

33
Code Example of Quantization Workflow with Accuracy Control
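A minimal sketch of the accuracy-aware workflow is given below, reusing the DataLoader and Accuracy metric defined on the previous slides (the paths and parameters are illustrative and mirror the configuration .json file):

from openvino.tools.pot import IEEngine, load_model, save_model, compress_model_weights, create_pipeline

model_config  = {"model_name": "mobilenet-ssd",                 # illustrative paths
                 "model":   "mobilenet-ssd.xml",
                 "weights": "mobilenet-ssd.bin"}
engine_config = {"device": "CPU"}
algorithms    = [{"name": "AccuracyAwareQuantization",
                  "params": {"target_device": "CPU",
                             "preset": "performance",
                             "stat_subset_size": 300,
                             "maximal_drop": 0.01}}]            # allowed accuracy drop

# The data loader must return real annotations here so accuracy can be measured.
data_loader = ImageFolderLoader("./validation_images")
metric = Accuracy()                                             # metric from the previous slide

# Passing the metric to the engine lets the algorithm validate accuracy and revert layers.
engine = IEEngine(config=engine_config, data_loader=data_loader, metric=metric)

model = load_model(model_config)
pipeline = create_pipeline(algorithms, engine)
compressed_model = pipeline.run(model)
compress_model_weights(compressed_model)
save_model(compressed_model, save_path="./optimized")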

34
Sample Application
Quantizing Object Detection Model with Accuracy Control

▪ This example demonstrates the use of the Post-Training Optimization Tool API to quantize an object detection model in accuracy-aware mode.
▪ The MobileNetV1 FPN model from TensorFlow* for the object detection task is used for this purpose.
▪ A custom DataLoader is created to load the COCO dataset for the object detection task, and an implementation of COCO mAP is used for the model evaluation.

Three files can be found in this sample: https://intel.ly/DE3d0gzY

35
OpenVINO™ Notebooks - 105-language-quantize-bert
This tutorial demonstrates how to apply INT8 quantization to the Natural Language Processing model BERT, using the Post-Training Optimization Tool API (part of OpenVINO). We will use the HuggingFace BERT PyTorch model fine-tuned for the Microsoft Research Paraphrase Corpus (MRPC) task. The code of the tutorial is designed to be extendable to custom models and datasets.
https://intel.ly/DE3HQXJ

OpenVINO™ Notebooks - 111-detection-quantization


This tutorial shows how to quantize an object detection model using OpenVINO's Post-Training Optimization Tool API. For demonstration purposes, we use a very small dataset of 10 images showing people at an airport. The images have been resized from the original resolution of 1920x1080 to 960x540. For any real use case, a representative dataset of about 300 images should be used. The model used is person-detection-retail-0013.
https://intel.ly/DE3RaGN

36
Benchmark Tool
The benchmark app allows you to benchmark your model's throughput and latency. Performance for
a particular application can also be evaluated virtually using Intel® DevCloud for the Edge Workloads,
a remote development environment with access to Intel® hardware and the latest versions of the
Intel® Distribution of the OpenVINO™ Toolkit.

Basic Usage
The Python benchmark_app is automatically installed when you install OpenVINO Developer Tools
using PyPI. Before running benchmark_app, make sure the openvino_env virtual environment is
activated, and navigate to the directory where your model is located.
The benchmarking application works with models in the OpenVINO IR (model.xml and model.bin)
and ONNX* (model.onnx) formats. Make sure to convert your models if necessary.

To run benchmarking with default options on a model, use the following command:
benchmark_app -m model.xml
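For example, to benchmark the same model on the CPU for 15 seconds using the asynchronous API (the device and duration shown here are only illustrative; run benchmark_app -h for the full option list):
benchmark_app -m model.xml -d CPU -t 15 -api async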

37
Summary
• In this module, we learned how to optimize and quantize AI models for improved inference performance.

38
Summary
• We learned about optimizing Deep Learning Models in this module.
• To begin, we learned why pre-trained deep learning models must be optimized for
inference.
• Then we delved deep into the OpenVINO™ toolkit's optimization tools, including the
Model Optimizer, Post-Training Optimization Tool, and other supporting software.

39
Hands-on Lab

40
Hands-on Labs
Exercise 1: Download a model from OMZ using OpenVINO™ Notebooks - 104-model-tools

In this exercise, you will learn how to download a model from the Open Model Zoo, convert it to OpenVINO™ IR format, show information about the model, and benchmark the model.

https://intel.ly/104-model-tools

41
Hands-on Labs
Exercise 2: Tiny YOLO* V3 to IR conversion using OpenVINO™ toolkit

This exercise will show you how to use the Model Optimizer in the Intel® DevCloud for the Edge Workloads to convert the Tiny YOLO* V3 model to IR.

https://intel.ly/tinyyolov3-IR

42
System configuration
Parameter | System 1 | System 2
System board | Intel prototype, TGL U DDR4 SODIMM RVP | ASUSTek COMPUTER INC. / Prime Z370-A
CPU | 11th Gen Intel® Core™ i5-1145G7 @ 2.6 GHz | 8th Gen Intel® Core™ i5-8500T @ 3.0 GHz
Sockets / physical cores | 1 / 4 | 1 / 6
Hyperthreading / turbo setting | Enabled / On | NA / On
Memory | 2 x 8198 MB 3200 MT/s DDR4 | 2 x 16384 MB 2667 MT/s DDR4
OS | Ubuntu 18.04 LTS | Ubuntu 18.04 LTS
Kernel | 5.8.0-050800-generic | 5.3.0-24-generic
Software | Intel® Distribution of OpenVINO™ toolkit 2021.1.075 | Intel® Distribution of OpenVINO™ toolkit 2021.1.075
BIOS | Intel TGLIFUI1.R00.3243.A04.2006302148 | AMI, version 2401
BIOS release date | June 30, 2020 | July 12, 2019
BIOS setting | Load default settings | Load default settings, set XMP to 2667
Test date | September 9, 2020 | September 9, 2020
Precision and batch size | CPU: int8, GPU: FP16-int8, batch size: 1 | CPU: int8, GPU: FP16-int8, batch size: 1
Number of inference requests | 4 | 6
Number of execution streams | 4 | 6
Power (TDP link) | 28 W | 35 W
Price (USD, link on 02/25/2022; prices may vary) | USD 312 | USD 192

1) Memory is installed such that all primary memory slots are populated.
2) Testing by Intel as of September 9, 2020.

44
Compounding effect of hardware and software configuration
See the compounding effect

Parameter | System 1 | System 2 | System 3 | System 4
System board | Purley E63448-400, Intel® Internal Reference System | Intel® Server Board S2600STB | Intel® Server Board S2600STB | Intel® Internal Reference System
CPU | Intel® Xeon® Silver 4116 @ 2.1 GHz | Intel® Xeon® Silver 4216 CPU @ 2.10 GHz | Intel® Xeon® Silver 4216R CPU @ 2.20 GHz | Intel® Xeon® Silver 4316 CPU @ 2.30 GHz
Sockets, physical cores/socket | 2, 12 | 2, 16 | 2, 16 | 2, 20
Hyperthreading / turbo setting | Enabled / On | Enabled / On | Enabled / On | Enabled / On
Memory | 12 x 16 GB DDR4 2400 MHz | 12 x 64 GB DDR4 2400 MHz | 12 x 32 GB DDR4 2666 MHz | 16 x 32 GB DDR4 2666 MHz
OS | Ubuntu 16.04.3 LTS | Ubuntu 18.04 LTS | Ubuntu 18.04 LTS | Ubuntu 20.04 LTS
Kernel | 4.4.0-210-generic | 4.15.0-96-generic | 5.3.0-24-generic | 5.13.0-rc5-intel-next+
Software | Intel® Distribution of OpenVINO™ toolkit R5 2018 | Intel® Distribution of OpenVINO™ toolkit R3 2019 | Intel® Distribution of OpenVINO™ toolkit 2021.2 | Intel® Distribution of OpenVINO™ toolkit 2021.4.1
BIOS | Intel Corporation PLYXCRB1.86B.0616.D08.2109180410 | — | SE5C620.86B.02.01.0009.092820190230 | WLYDCRB1.SYS.0020.P93.2103190412
BIOS release date | September 18, 2021 | — | September 28, 2019 | March 19, 2021
BIOS setting | Select optimized default settings, save, and exit | Select optimized default settings, save, and exit | Select optimized default settings, change power policy to "performance," save, and exit | Select optimized default settings, change power policy to "performance," save, and exit
Test date | October 8, 2021 | September 27, 2019 | December 24, 2020 | September 6, 2021
Precision and batch size | FP32 / batch 1 | int8 / batch 1 | int8 / batch 1 | int8 / batch 1
Workload: model / image size | MobileNet-SSD / 300x300 | MobileNet-SSD / 300x300 | MobileNet-SSD / 300x300 | MobileNet-SSD / 300x300
Number of inference requests | 24 | 32 | 32 | 10
Number of execution streams | 24 | 32 | 32 | 10
Power (TDP link) | 170 W | 200 W | 250 W | 300 W
Price (USD, link on 02/25/2022; prices may vary) | USD 2,024 | USD 1,926 | USD 2,004 | USD 2,166

45
