Gpu-Applications-Catalog 2021

Applications that use NVIDIA cards for acceleration

Uploaded by

Alvaro Calandra
Copyright
© All Rights Reserved

GPU-ACCELERATED APPLICATIONS

hpc-gpu-apps-catalog-1633552-print-layout-r2.indd 1 4/5/21 10:18 AM


Test Drive the World’s
Fastest Accelerator – Free!
Take the GPU Test Drive, a free and easy way to
experience accelerated computing on GPUs. You can
run your own application or try one of the preloaded
ones, all running on a remote cluster. Try it today.
www.nvidia.com/gputestdrive


GPU‑ACCELERATED
APPLICATIONS
Accelerated computing has revolutionized a
broad range of industries with over six hundred
applications optimized for GPUs to help you
accelerate your work.

CONTENTS

Computational Finance
Climate, Weather and Ocean Modeling
Data Science and Analytics
Artificial Intelligence
    DEEP LEARNING AND MACHINE LEARNING
Public Sector and National Government
Design for Manufacturing/Construction:
    CAD/CAE/CAM
    CFD (MFG)
    CFD (RESEARCH DEVELOPMENTS)
    COMPUTATIONAL STRUCTURAL MECHANICS
    DESIGN AND VISUALIZATION
    ELECTRONIC DESIGN AUTOMATION
    INDUSTRIAL INSPECTION
Media and Entertainment
    ANIMATION, MODELING AND RENDERING
    COLOR CORRECTION AND GRAIN MANAGEMENT
    COMPOSITING, FINISHING AND EFFECTS
    (VIDEO) EDITING
    (IMAGE & PHOTO) EDITING
    ENCODING AND DIGITAL DISTRIBUTION
    ON-AIR GRAPHICS
    ON-SET, REVIEW AND STEREO TOOLS
    WEATHER GRAPHICS
Medical Imaging
Oil and Gas
Life Sciences
    BIOINFORMATICS
    MICROSCOPY
    MOLECULAR DYNAMICS
    QUANTUM CHEMISTRY
    (MOLECULAR) VISUALIZATION AND DOCKING
Research: Higher Education and Supercomputing
    NUMERICAL ANALYTICS
    PHYSICS
    SCIENTIFIC VISUALIZATION
Smart Spaces
Tools and Management
Agriculture
Business Process Optimization

Computational Finance
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Application name: Accelerated Computing Engine
Company name: Elsen
Product description: Secure, accessible, and accelerated back-testing, scenario analysis, risk analytics and real-time trading designed for easy integration and rapid development.
Supported features:
• Web-like API with native bindings for Python, R, Scala, C
• Custom models and data streams
GPU scaling: Multi-GPU, Single Node

Application name: Adaptiv Analytics
Company name: SunGard
Product description: A flexible and extensible engine for fast calculations of a wide variety of pricing and risk measures on a broad range of asset classes and derivatives.
Supported features:
• Codes in C# supported transparently, with minimal code changes
• Supports multiple backends including CUDA and OpenCL
• Switches transparently between multiple GPUs and CPUs depending on the deal support and load factors
GPU scaling: Multi-GPU, Single Node

Application name: Alea.cuBase F#
Company name: QuantAleas
Product description: F# package enabling a growing set of F# capability to run on a GPU.
Supported features:
• F# for GPU accelerators
GPU scaling: Multi-GPU, Single Node

Application name: Esther
Company name: Global Valuation
Product description: In-memory risk analytics system for OTC portfolios with a particular focus on XVA metrics and balance sheet simulations.
Supported features:
• High quality models not admitting closed form solutions
• Efficient solvers based on full matrix linear algebra powered by GPUs and Monte Carlo algorithms
GPU scaling: Multi-GPU, Single Node

Application name: Global Risk
Company name: MISYS
Product description: Regulatory compliance and enterprise-wide risk transparency package.
Supported features:
• Risk analytics
GPU scaling: Multi-GPU, Single Node

Application name: Hybridizer C#
Company name: Altimesh
Product description: Multi-target C# framework for data parallel computing.
Supported features:
• C# with translation to GPU
• Multi-Core Xeon
GPU scaling: Multi-GPU, Single Node

Application name: MACS Analytics Library
Company name: Murex
Product description: Analytics library for modeling valuation and risk for derivatives across multiple asset classes.
Supported features:
• Market standard models for all asset classes paired with the most efficient resolution methods (Monte Carlo simulations and Partial Differential Equations)
GPU scaling: Multi-GPU, Single Node

Application name: NAG
Company name: Numerical Algorithms Group
Product description: Random number generators, Brownian bridges, and PDE solvers
Supported features:
• Monte Carlo and PDE solvers
GPU scaling: Single GPU, Single Node

Application name: Oneview
Company name: Numerix
Product description: Numerix introduced GPU support for Forward Monte Carlo simulation for Capital Markets and Insurance.
Supported features:
• Equity/FX basket models with BlackScholes/Local Vol models for individual equities and FX
• Algorithms: AAD (Automatic Algebraic Differential)
• New approaches to AAD to reduce time to market for fast Price Greeks and XVA Greeks
GPU scaling: Multi-GPU, Multi-Node

Application name: O-Quant options pricing
Company name: O-Quant
Product description: Offering for risk management and complex options and derivatives pricing using GPUs.
Supported features:
• Cloud-based interface to price complex derivatives representing large baskets of equities
GPU scaling: Multi-GPU, Multi-Node

Application name: Pathwise
Company name: Aon Benfield
Product description: Specialized platform for real-time hedging, valuation, pricing and risk management.
Supported features:
• Spreadsheet-like modeling interfaces
• Python-based scripting environment
• Grid middleware
GPU scaling: Multi-GPU, Single Node

Application name: SciFinance
Company name: SciComp, Inc
Product description: Derivative pricing (SciFinance)
Supported features:
• Monte Carlo and PDE pricing models
GPU scaling: Single GPU, Single Node

Application name: Synerscope Data Visualization
Company name: Synerscope
Product description: Visual big data exploration and insight tools
Supported features:
• Graphical exploration of large network datasets including geo-spatial and temporal components
GPU scaling: Single GPU, Single Node

Application name: Volera
Company name: Hanweck Associates
Product description: Real-time options analytical engine (Volera)
Supported features:
• Real-time analytics
GPU scaling: Multi-GPU, Single Node

Application name: Xcelerit SDK
Company name: Xcelerit
Product description: Software Development Kit (SDK) to boost the performance of financial applications (e.g. Monte Carlo, finite-difference) with minimum changes to existing code.
Supported features:
• C++ programming language, cross-platform (back-end generates CUDA and optimized CPU code)
• Supports Windows and Linux operating systems
GPU scaling: Multi-GPU, Single Node
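
Several entries in this table list Monte Carlo pricing as a GPU-accelerated feature. The sketch below is a CPU-only illustration of the kind of workload such engines parallelize, assuming a single European call option under Black-Scholes dynamics. It is not taken from any vendor listed here; the function names (black_scholes_call, monte_carlo_call) and all parameter values are hypothetical. Each simulated path is independent of the others, which is what makes the method map well onto many GPU threads.

```python
import math
import random


def black_scholes_call(spot, strike, rate, vol, maturity):
    """Closed-form Black-Scholes price of a European call (reference value)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t

    def norm_cdf(x):
        # Standard normal CDF expressed through the error function
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def monte_carlo_call(spot, strike, rate, vol, maturity, n_paths, seed=0):
    """Monte Carlo estimate of the same price from simulated terminal prices."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        # Each path needs only one normal draw; paths do not interact
        z = rng.gauss(0.0, 1.0)
        terminal = spot * math.exp(drift + diffusion * z)
        payoff_sum += max(terminal - strike, 0.0)
    # Discount the average payoff back to today
    return math.exp(-rate * maturity) * payoff_sum / n_paths


if __name__ == "__main__":
    params = dict(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
    print("closed form:", black_scholes_call(**params))
    print("monte carlo:", monte_carlo_call(n_paths=100_000, **params))
```

The two printed values should agree to within the Monte Carlo sampling error, which shrinks roughly with the square root of the number of paths.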

POPULAR GPU‑ACCELERATED APPLICATIONS CATALOG  |  Apr21  |  1



Climate, Weather and Ocean Modeling
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Application name: COSMO
Company name: COSMO Consortium
Product description: Regional numerical weather prediction and climate research model
Supported features:
• Radiation only in the trunk release
• All features in the MCH branch used for operational weather forecasting
GPU scaling: Multi-GPU, Multi-Node

Application name: E3SM-EAM
Company name: US DOE
Product description: Global atmospheric model used as component to E3SM global coupled climate model.
Supported features:
• Dynamics and most physics
GPU scaling: Multi-GPU, Multi-Node

Application name: Gales
Company name: KNMI, TU Delft
Product description: Regional numerical weather prediction model
Supported features:
• Full Model
GPU scaling: Multi-GPU, Multi-Node

Application name: GRAF
Company name: IBM/TWC
Product description: New GPU-based global weather model based on MPAS from NCAR
Supported features:
• Full application
GPU scaling: Multi-GPU, Multi-Node

Application name: WRF AceCAST-WRF
Company name: TempoQuest Inc.
Product description: WRF model from NCAR now commercialized by TQI. Used for numerical weather prediction and regional climate studies. All popular aspects of WRF model are GPU developed.
Supported features:
• ARW dynamics
• 19 physics options including enough to run the full WRF model on GPUs
GPU scaling: Multi-GPU, Multi-Node

Data Science and Analytics


APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Application name: Anaconda Distribution
Company name: Anaconda
Product description: The open-source Anaconda Distribution is the easiest way to perform Python/R data science and machine learning on Linux, Windows, and Mac OS X. Developed for solo practitioners, it is the toolkit that equips you to work with thousands of open-source packages and libraries.
Supported features:
• Bindings to CUDA libraries: cuBLAS, cuFFT, cuSPARSE, cuRAND
• Sorting algorithms from the CUB and Modern GPU libraries
• Includes Numba (JIT Python compiler), Dask (Python scheduler), NumPy, SciPy
• Includes single-line install of numerous DL frameworks such as PyTorch
GPU scaling: Multi-GPU, Multi-Node

Application name: AnswerRocket
Company name: AnswerRocket
Product description: AnswerRocket leverages AI and machine learning techniques to automate the hard work of business analysis, empowering teams to generate business intelligence and advanced analysis in seconds.
Supported features:
• Pluggable machine learning models
• Ask Questions in Plain English
• Create Interactive Visualizations & Dashboards
• Provides Augmented Analytics
• Supports a wide variety of data sources
GPU scaling: Multi-GPU, Multi-Node

Application name: ArgusSearch
Company name: Planet AI
Product description: Deep Learning driven document search tool.
Supported features:
• Fast full text search engine
• Searches hand-written and text documents, including PDF
• Allows almost any arbitrary requests (Regular Expressions are supported)
• Provides a list of matches sorted by confidence
GPU scaling: Multi-GPU, Single Node

Application name: Automatic Speech Recognition
Company name: Capio
Product description: In-house and Cloud-based speech recognition technologies
Supported features:
• Real-time and offline (batch) speech recognition
• Exceptional accuracy for transcription of conversational speech
• Continuous Learning (System becomes more accurate as more data is pushed to the platform)
GPU scaling: Multi-GPU, Single Node

Application name: BlazingSQL
Company name: BlazingSQL
Product description: GPU-accelerated SQL Engine for analytics available on all major CSP and on-premise deployment.
Supported features:
• Distributed SQL Query Engine
• Supports petabyte scale applications
• Supports traditional big data formats and data stores
GPU scaling: Multi-GPU, Multi-Node

Application name: BrytlytDB
Company name: Brytlyt
Product description: In-GPU-memory database built on top of PostgreSQL
Supported features:
• GPU-Accelerated joins, aggregations, scans, etc. on PostgreSQL
• Visualization platform bundled with database is called SpotLyt.
GPU scaling: Multi-GPU, Multi-Node

Application name: CuPy
Company name: Preferred Networks
Product description: CuPy (https://github.com/cupy/cupy) is a GPU-accelerated scientific computing library for Python with a NumPy compatible interface.
Supported features:
• CUDA
• Multi-GPU support
GPU scaling: Multi-GPU, Single Node



Application name: Datalogue
Company name: Datalogue
Product description: AI powered pipelines that automatically prepare any data from any source for immediate & compliant use.
Supported features:
• Data transformation
• Ontology mapping
• Data standardization
• Data augmentation
GPU scaling: Multi-GPU, Single Node

Application name: DeepGram
Company name: Deepgram
Product description: Voice processing solution for call centers, financials and other scenarios.
Supported features:
• Speech to text and phonetic search using GPU deep learning
GPU scaling: Multi-GPU, Single Node

Application name: Driverless AI
Company name: H2O.ai
Product description: Automated Machine Learning with Feature Extraction. Essentially BI for Machine Learning and AI, with accuracy very similar to Kaggle Experts.
H2O Driverless AI is an artificial intelligence (AI) platform for automatic machine learning. Driverless AI automates some of the most difficult data science and machine learning workflows such as feature engineering, model validation, model tuning, model selection and model deployment. It aims to achieve the highest predictive accuracy, comparable to expert data scientists, but in much shorter time thanks to end-to-end automation. Driverless AI also offers automatic visualizations and machine learning interpretability (MLI). Especially in regulated industries, model transparency and explanation are just as important as predictive performance. Modeling pipelines (feature engineering and models) are exported (in full fidelity, without approximations) both as Python modules and as pure Java standalone scoring artifacts.
Supported features:
• Automated machine learning and feature extraction
• Automated statistical visualization
• Interpretability toolkit for machine learning models
GPU scaling: Multi-GPU, Single Node

Application name: GPUdb
Company name: Kinetica
Product description: Multi-GPU, Multi-Machine distributed object store providing SQL style query capability, advanced geospatial query capability, heatmap generation, and distributed rasterization services.
Supported features:
• Query against big data in real time
• No pre-indexing allows for complex, ad-hoc query chains
• Interactively explore large, streaming data sets
GPU scaling: Multi-GPU, Single Node

Application name: H2O4GPU
Company name: H2O.ai
Product description: H2O is a popular machine learning platform which offers GPU-accelerated machine learning. In addition, it offers deep learning by integrating popular deep learning frameworks.
Supported features:
• Available algorithms include Gradient Boosting Machines (GBMs), Generalized Linear Models (GLMs), K-Means clustering, SVD, PCA and XGBoost
• It can be used as a drop-in replacement for scikit-learn with support for GPUs on selected (and ever-growing) algorithms.
• A new R API brings the benefits of GPU-accelerated machine learning to the R user community. The R package is a wrapper around the H2O4GPU Python package, and the interface follows standard R conventions for modeling.
GPU scaling: Multi-GPU, Single Node

Application name: IntelligentVoice
Company name: INTELLIGENT VOICE
Product description: Far more than a transcription tool, this speech recognition software learns what is important in a telephone call, extracts information and stores a visual representation of phone calls to be combined with text/instant messaging and E-mail. Intelligent Voice’s search and alert makes it possible to tackle issues before they arise, address data security concerns and monitor physical access to data.
Supported features:
• Advanced Speech Recognition across large data sets
• JumpTo Technology, for data visualisation
• E-Discovery
• Extraction from phone calls
• IM & Email defining key phrases and emotional analysis
• Compliance, defining key conversations and interactions
GPU scaling: Multi-GPU, Single Node



Application name: Jedox
Company name: Jedox
Product description: Helps with portfolio analysis, management consolidation, liquidity controlling, cash flow statements, profit center accounting, treasury management, customer value analysis and many more applications. All accessible in a powerful web and mobile application or Excel environment.
Supported features:
• This database holds all relevant data in GPU memory
• Tesla K40 and 12 GB on-board RAM
• Scales up with multiple GPUs
• Keeps close to 100 GB of compressed data in GPU memory on a single server system
• Fast analysis, reporting, and planning
GPU scaling: Multi-GPU, Single Node

Application name: Labellio
Company name: KYOCERA Communication Systems Co
Product description: The world’s easiest deep learning web service for computer vision, allowing everyone to build their own image classifier with only a web browser.
Supported features:
• Neural net fine-tuning for image data
• Data crawling and data browsing
• Drag-and-drop style data cleansing backed by AI support
GPU scaling: Multi-GPU, Single Node

Application name: Numba
Company name: Anaconda
Product description: Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Think of it as a compiler for Python array and numerical functions that gives you the power to speed up your applications with high performance functions written directly in Python.
Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN.
You don’t need to replace the Python interpreter, run a separate compilation step, or even have a C/C++ compiler installed. Just apply one of the Numba decorators to your Python function, and Numba does the rest.
Numba generates optimized machine code from pure Python code using the LLVM compiler infrastructure. With a few simple annotations, array-oriented and math-heavy Python code can be just-in-time optimized to performance similar to C, C++ and Fortran, without having to switch languages or Python interpreters.
Numba is designed to be used with NumPy arrays and functions. Numba generates specialized code for different array data types and layouts to optimize performance. Special decorators can create universal functions that broadcast over NumPy arrays just like NumPy functions do.
Numba also works great with Jupyter notebooks for interactive computing, and with distributed execution frameworks, like Dask and Spark. With support for GPU acceleration, Numba lets you write parallel GPU algorithms entirely from Python.
Supported features:
• On-the-fly code generation (at import time or runtime, at the user’s preference)
• Native code generation for the CPU (default) and GPU hardware
• Integration with the Python scientific software stack (enabled via NumPy)
• JIT compilation of Python functions for execution on various targets (including CUDA)
GPU scaling: Multi-GPU, Single Node
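
The decorator workflow described in the Numba entry can be sketched as follows. This is a minimal CPU-only illustration, not an example from the catalog: leibniz_pi is a hypothetical function name, and the try/except fallback (a no-op stand-in for njit) is an assumption added only so the snippet still runs where Numba is not installed. With Numba present, the first call compiles the loop to machine code and later calls reuse the compiled version.

```python
import math

try:
    # Numba's nopython-mode JIT decorator
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba, fall back to a decorator
    # that returns the function unchanged (plain interpreted Python).
    def njit(func=None, **kwargs):
        if func is None:
            return lambda f: f
        return func


@njit
def leibniz_pi(n_terms):
    """Approximate pi with the Leibniz series, a simple math-heavy loop."""
    total = 0.0
    sign = 1.0
    for k in range(n_terms):
        total += sign / (2 * k + 1)
        sign = -sign
    return 4.0 * total


if __name__ == "__main__":
    approx = leibniz_pi(1_000_000)
    print("approximation:", approx)
    print("abs error:", abs(approx - math.pi))
```

The function body is ordinary Python; only the decorator line changes between the interpreted and the compiled version. Targeting a GPU uses different Numba decorators and a kernel-style function, which this sketch does not cover.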

Application name: OmniSci
Company name: OmniSci
Product description: OmniSci is a GPU-powered big data analytics and visualization platform that is hundreds of times faster than CPU in-memory systems. OmniSci uses GPUs to execute SQL queries on multi-billion row datasets and optionally render the results, all in milliseconds.
Supported features:
• Uses LLVM’s nvptx backend to generate CUDA kernels
• OpenGL- (EGL) based rendering
• Can run in a docker container using NVIDIA-docker
GPU scaling: Multi-GPU, Single Node

Application name: Polymatica
Company name: Polymatica
Product description: Analytical OLAP and Data Mining Platform
Supported features:
• Visualization, Reporting, OLAP in-memory with GPU acceleration
• Data Mining
• Machine Learning
• Predictive Analytics
GPU scaling: Multi-GPU, Multi-Node



Application name: Sqream DB
Company name: SQream Technologies
Product description: GPU accelerated SQL database engine for big data analytics. Sqream speeds SQL analytics by 100X by translating SQL queries into highly parallel algorithms run on the GPU.
Supported features:
• Up to 100TB of raw data can be stored and queried in a standard 2U server
• Inserts and analyzes hundreds of billions of records in seconds
• No indexes required
• No changes to SQL code or data science paradigms required
GPU scaling: Multi-GPU, Single Node

Application name: SynerScope
Company name: Synerscope
Product description: Big data visualization and data discovery, for combining Analytics on Analytics with IoT compute-at-the-edge smart sensors.
Supported features:
• Real-time Interaction with data
GPU scaling: Single GPU, Single Node

Application name: ZX Lib (Fuzzy Logic)
Company name: Tanay
Product description: Financial analytics and data mining library
Supported features:
• Monte Carlo simulations
• Pricing of vanilla and exotic options
• Fixed income analytics
• Data mining
GPU scaling: Multi-GPU, Single Node

Artificial Intelligence
DEEP LEARNING AND MACHINE LEARNING
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Application name: AIC
Company name: Tracxpoint
Product description: AIC (Artificial Intelligence Cart) revolutionizes the supermarket shopping experience with sensor fusion and machine learning technology.
Supported features:
• The smart IoT cart recognizes the shopper, loads their shopping list and buying patterns, suggests compatible products and provides the most valuable offer
• Recognizes the items placed in the cart and bills the customer at the end of the shopping experience with no checkout lanes
• Features Jetpack
GPU scaling: Single GPU, Single Node

Application name: AiFi Nano
Company name: AiFi Inc.
Product description: Cashier-free (like Amazon grab and go solution) and stock out retail software
Supported features:
• cuDNN
• TensorRT
• DeepStream
GPU scaling: Multi-GPU, Single Node

Application name: AI Image Labeling
Company name: Frenzy
Product description: Builds robust self-labeling training datasets for classifying exact objects and products in visual scenes at a fraction of the time and cost
Supported features:
• GPU in the cloud
GPU scaling: Multi-GPU, Single Node

Application name: AI Lifecycle
Company name: Clarifai
Product description: Clarifai brings a new level of understanding to visual content through deep learning technologies. Uses GPUs to train large neural networks to solve practical problems in advertising, media, and search across a wide variety of industries, such as automated tagging, visual search, recommendation engines, predictive maintenance, demographic analysis and more.
Supported features:
• GPU-based training and inference
• Recognizes and indexes images with predefined classifiers or custom classifiers
GPU scaling: Multi-GPU, Single Node

Application name: Allganize NLU APIs for Enterprises
Company name: Allganize, Inc.
Product description: Natural Language Understanding APIs for enterprise: Answer-bot based on documents with unstructured data (text + table), e.g., manuals, instructions, FAQ documents; Review analysis; sentiment analysis, summarizing etc. Provided as APIs.
Supported features:
• Training and inferencing using V100
GPU scaling: Multi-GPU, Multi-Node

Application name: AlphaSense
Company name: AlphaSense
Product description: PaaS for financial analysis based on public corporate information. Geared at financial analysts within financial services. Allows very fast searches of public corporate information, and allows a question-answering format (“the Google for Analyst research”)
Supported features:
• PaaS for financial analysis based on public corporate information
• Geared at financial analysts within financial services.
• Allows very fast searches of public corporate information, and allows a question-answering format (“the Google for Analyst research”)
GPU scaling: Multi-GPU, Single Node

Application name: AlwaysAI
Company name: Always AI
Product description: Easy-to-use platform to build and deploy computer vision applications for embedded devices at the edge. Apply for early access on the product link
Supported features:
• Jetson Nano
GPU scaling: Single GPU, Single Node



Application name: Anaconda Enterprise Edition
Company name: Anaconda
Product description: The end-to-end data science platform. The Anaconda enterprise platform is a comprehensive foundation for any organization that wants to use data science and machine learning to make better decisions and build differentiating solutions.
Supported features:
• Bindings to CUDA libraries: cuBLAS, cuFFT, cuSPARSE, cuRAND
• Sorting algorithms from the CUB and Modern GPU libraries
• Numba (JIT Python compiler), Dask (Python scheduler), NumPy, SciPy
• Single-line install of numerous DL frameworks such as PyTorch
GPU scaling: Multi-GPU, Single Node

Application name: Antuit Demand Planning and Forecasting
Company name: Antuit
Product description: Extracts maximum predictability from the available data. Proprietary “Dynamic Aggregation” logic with attribute-based disaggregation generates forecasts for all products, including new, slow-moving, and end-of-life. Spark and GPU clusters, along with optimized AI algorithms, provide scaling for the largest retailers. Incorporates all available demand drivers, such as price elasticities, promotional lifts, weather, and hyper-local event data.
Supported features:
• CUDA 10.1
• cuDNN 7.6
• cuBLAS 10.2
GPU scaling: Multi-GPU, Multi-Node

Application name: Apache Mahout
Company name: Apache Mahout
Product description: Mahout is building an environment for quickly creating scalable performant machine learning applications.
Supported features:
• Extremely easy to add new algorithms
• Distributed instead of single machine
GPU scaling: Multi-GPU, Multi-Node

Application name: Applica RTA
Company name: Applica
Product description: Applica RTA combines computer vision and deep-learning driven NLP to process all document types.
Supported features:
• GPU to accelerate model training, fine-tuning and inferencing
GPU scaling: Multi-GPU, Multi-Node

Application name: Artificial Intelligence Radio Transceiver (AIR-T)
Company name: Deepwave Digital
Product description: The Artificial Intelligence Radio Transceiver (AIR-T) is a software defined radio designed and developed for RF deep learning applications. The app is equipped with three signal processors including a 256 core NVIDIA Jetson TX2, a field programmable gate array (FPGA), and dual embedded CPUs.
Supported features:
• The AIR-T is designed to be an edge-compute inference engine for deep learning algorithms.
GPU scaling: N/A

Application name: ARYA.ai
Company name: ARYA.ai
Product description: Deep learning platform with end-to-end workflows for Enterprise, incorporating TensorFlow. Focuses on consumer banking and insurance industries.
Supported features:
• Deep learning
• TensorFlow
GPU scaling: Multi-GPU, Multi-Node

Application name: Aura Vision
Company name: Aura Vision
Product description: Capture unique insights from every visitor, using your existing cameras
Supported features:
• Segmented footfall
• Shopper motivation
• Product engagement
• Window display ROI
• Store utilization
• Service wait times
GPU scaling: Single GPU, Single Node

Application name: Avitas Systems - Inspection as a Service
Company name: Avitas Systems
Product description: Avitas Systems configures various multi rotor and helicopter drones with multiple sensor kits including RGB cameras, laser sensors, infrared and others collecting inspection data to meet different customer use cases. Ingests inspection data where an AI back-end turns the raw data into inspection findings such as corrosion levels, damaged/missing parts, encroaching vegetation volumes.
Supported features:
• Drone based data capture
• RGB Camera, Laser and Infrared sensing
• Deep learning driven Object detection for Inspection
• Detect corrosion levels, damaged/missing parts, encroaching vegetation volumes.
• AI workbench
• Photogrammetry
GPU scaling: Multi-GPU, Multi-Node

Application name: AWM Smart Shelf
Company name: Adroit Worldwide Media Inc.
Product description: Application for Automated Inventory Intelligence (view and track virtually in a retail environment), Content Management System (manage inventory, prices and content), LED Display (prices, promotions and advertisements at the click of a button) and Product Mapper (automate creation of planograms and auditing process)
Supported features:
• Kubernetes
• Docker
• RTX 2080
GPU scaling: Multi-GPU, Single Node

Application name: Badger Insights
Company name: Badger Technologies
Product description: Badger Technologies provides data and analytics for retail operations through automation solutions that include a fully autonomous robot to address out-of-stock, planogram compliance, and price integrity
Supported features:
• GPU accelerated
GPU scaling: Single GPU, Single Node



Application name: BIDMach
Company name: UC Berkeley
Product description: The fastest machine learning library available. Holds the record for many common machine learning algorithms.
Supported features:
• Written in Scala and supports Scala and Java interfaces
• Supports linear regression, logistic regression, SVM, LDA, K-Means and other operations
GPU scaling: Multi-GPU, Single Node

Application name: Bons.ai
Company name: Bons.ai
Product description: Bons.ai is an artificial intelligence platform which abstracts away the low-level, inner workings of machine learning systems to empower more developers to integrate richer intelligence models into their work.
Supported features:
• Easy to use programming interface. Bons.ai
• Novel programming language called Inkling
• Primary focus on reinforcement learning
GPU scaling: Multi-GPU, Single Node

Application name: Brain Frame
Company name: Aotu
Product description: BrainFrame platform provides Out-Of-The-Box Smart Vision Applications for multiple verticals. The drag-and-drop VisionCapsules system allows you to pick from a wide selection of custom algorithms to extract exactly the information you want
Supported features:
• Jetpack
• Jetson
GPU scaling: Single GPU, Single Node

Application name: Caffe2
Company name: Facebook
Product description: This is a faster framework for deep learning, forked from BVLC/caffe (master branch). Allows data-parallel via MPI.
Supported features:
• GPU cluster processing
• Mass image data
GPU scaling: Multi-GPU, Single Node

Application name: Cartwatch Checkout
Company name: Signatrix
Product description: Protect the checkout area and reduce the workload of your checkout staff
Supported features:
• Real-time alerts on theft (mis-scan) at the checkout lanes
• Featuring Jetpack and TensorRT
GPU scaling: Single GPU, Single Node

Application name: CatBoost
Company name: Yandex
Product description: CatBoost is an open-source gradient boosting library with categorical features support.
Supported features:
• Extremely fast learning on GPU
• Multi-GPU
• Multi-Node
GPU scaling: Multi-GPU, Multi-Node

Application name: Chainer
Company name: Preferred Networks, Inc.
Product description: DL framework that makes the construction of neural networks (NN) flexible and intuitive.
Supported features:
• Dynamic NN construction, which makes debugging easier
• CPU/GPU-agnostic coding, which is promoted by CuPy, a partially NumPy-compatible multidimensional array library for CUDA
• Data-dependent NN construction, which fully exploits the control flows of Python without magic
GPU scaling: Multi-GPU, Multi-Node

Application name: Checkout Intelligence
Company name: Everseen
Product description: Loss prevention solution at the POS powered by T4
Supported features:
• Mis-scan detection
• Product and ticket switching detection
• “Walk off” detection
GPU scaling: Multi-GPU, Single Node

Application name: ClearML
Company name: Allegro.AI
Product description: ClearML provides a suite of tools to streamline ML workflow, including Experiment Manager, ML-Ops and Data Management.
Supported features:
• Multi-system enterprise workflow scheduling
• Version control (e.g., the “git”) for models
• DGX-ready and available from NGC
• Open-source and paid options
• Enables reproducibility and automation
• ClearML supports MIG functionality
• TensorFlow, Keras, and PyTorch
• NVIDIA frameworks such as Clara for healthcare and medical imaging
• RAPIDS and TLT
GPU scaling: Multi-GPU, Multi-Node

Application name: CNTK
Company name: Microsoft Corp.
Product description: Microsoft Computational Network Toolkit (CNTK) is a unified computational network framework that describes deep neural networks as a series of computational steps via a directed graph.
Supported features:
• Speech Recognition
• Machine Translation
• Image Recognition
• Image Captioning
• Text Processing and Relevance
• Language Understanding
• Language Modeling
GPU scaling: Multi-GPU, Single Node
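
The CNTK entry describes a network as a series of computational steps connected by a directed graph. The sketch below illustrates that idea in plain Python and does not use the CNTK API: the Node class, the evaluate function and the node names are all hypothetical. Each node names an operation and the nodes it depends on, and evaluation walks the graph from the requested output back to the supplied inputs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One computational step: an operation applied to named inputs."""
    op: Callable[..., float]
    inputs: List[str] = field(default_factory=list)


def evaluate(graph: Dict[str, Node], output: str, feeds: Dict[str, float]) -> float:
    """Compute `output` by recursively evaluating the nodes it depends on."""
    values = dict(feeds)  # leaf values supplied by the caller

    def visit(name: str) -> float:
        if name not in values:
            node = graph[name]
            # Evaluate dependencies first, then apply this node's operation
            values[name] = node.op(*[visit(dep) for dep in node.inputs])
        return values[name]

    return visit(output)


# y = relu(w * x + b), written as three graph nodes
graph = {
    "wx": Node(op=lambda w, x: w * x, inputs=["w", "x"]),
    "z": Node(op=lambda wx, b: wx + b, inputs=["wx", "b"]),
    "y": Node(op=lambda z: max(z, 0.0), inputs=["z"]),
}

print(evaluate(graph, "y", {"w": 3.0, "x": 2.0, "b": -1.0}))  # 5.0
```

Because every step is an explicit node, a framework can inspect the same structure to schedule independent nodes in parallel or to derive gradients, neither of which this sketch attempts.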

Application name: ConundrumAI
Company name: Conundrum Industrial Limited
Product description: Conundrum, a UK-based company, develops AI solutions for predictive maintenance and optimization of industrial processes.
Supported features:
• Automated deep learning significantly speeds up building applications based on DL models
• Transfer learning boosts the performance of the applications by transferring knowledge between them
• Data based digital twins and reinforcement learning for optimization
GPU scaling: Multi-GPU, Single Node



Application name: Darwin
Company name: SparkCognition
Product description: Darwin is a machine learning product that accelerates data science at scale by automating the building and deployment of models. Based on a proprietary neuro-evolutionary algorithm, Darwin uses a combination of ML methods and genetic algorithms to arrive at a new generation of designs.
Supported features:
• Unique neuro-evolutionary algorithm on GPU
• Automated ML for model building on GPUs
• GPU accelerated PyTorch
GPU scaling: Multi-GPU, Single Node

Application name: Databricks Unified Analytics Platform
Company name: Databricks
Product description: Databricks provides a cloud-based platform designed to make big data and machine learning simple.
Supported features:
• GPU instances available with CUDA drivers included
• GPU support provided by Spark scheduler
• Integration of TensorFlow, Keras
• TensorFrames data connector
• Deep learning pipelines/workflows
• Transfer learning and image loading
GPU scaling: Multi-GPU, Multi-Node

Application name: DeepInstinct
Company name: DeepInstinct
Product description: Zero-day endpoint malware detection solution offered to enterprise markets.
Supported features:
• Zero-day threats & APT attack detection on endpoints, servers and mobile devices
GPU scaling: Multi-GPU, Single Node

Application name: Deeplearning4j
Company name: Skymind
Product description: Deeplearning4j is the most popular deep learning framework for the JVM, and includes all major neural nets such as convolutional, recurrent (LSTMs) and feedforward.
Supported features:
• Integrates with Hadoop and Spark to run distributed
• Java and Scala APIs
• Composable framework that facilitates building your own nets
• Includes ND4J, the NumPy for Java.
GPU scaling: Multi-GPU, Single Node

Application name: Dessa
Company name: Dessa
Product description: Deep Learning Platform based on TensorFlow. Allows end-to-end workflows. Targets consumer banking and insurance industries.
Supported features:
• Deep learning workflows can be built
• Based on TensorFlow
• Use cases in consumer banking and insurance
GPU scaling: Multi-GPU, Multi-Node

Application name: Dextro
Company name: Axon
Product description: Dextro’s API uses deep learning systems to analyze and categorize videos in real-time.
Supported features:
• Object and scene detection
• Machine transcription for audio
• Motion and movement detection
GPU scaling: Multi-GPU, Single Node

Application name: Dr. Retail
Company name: SkyREC Inc.
Product description: Instore data analytics
Supported features:
• TensorRT 5.1
• nvJPEG
• NVEnc
• NVDec
GPU scaling: Single GPU, Single Node

Application name: Frenzy Enterprise Solutions
Company name: Frenzy
Product description: Frenzy Enterprise Solutions provides retailers and brands with the tools to provide customers the best experience and more purchasing opportunities including Similar Product Recommendations, Inventory Tagging, Camera Search, Complementary Product Recommendations, How To Wear It, Influencer Matching
Supported features:
• GPU on the cloud
GPU scaling: Multi-GPU, Single Node

Application name: G3C.AI
Company name: Graymatics
Product description: Retail in store analytics solutions through Deep CCTV Streaming Analytics
Supported features:
• In store analytics: heat-maps, shopper tracking, dwell time, people counting, mood detection, demographics
• Featuring TensorRT and Deepstream
GPU scaling: Multi-GPU, Single Node

Application name: Gridspace
Company name: Gridspace
Product description: Voice analytics to turn streaming speech audio into useful data and service metrics. Instrumental to contact call center and work communications with powerful deep learning-driven voice analytics.
Supported features:
• Speech-to-text transcription
• Compliance
• Call grading
• Call topic modeling
• Customer service enhancement
• Customer churn prediction
GPU scaling: N/A

Application name: Insights
Company name: AnyVision
Product description: Insight delivers in-store analytics with features such as: heavy shoppers, gaze estimation, heatmaps, customer journey, and offline to online
Supported features:
• NVIDIA Tesla T4 and Jetson
GPU scaling: Multi-GPU, Single Node

Application name: Keras
Company name: Open Source
Product description: Keras is a minimalist, highly modular neural networks library, written in Python. Capable of running on top of either TensorFlow or Theano and developed with a focus on enabling fast experimentation.
Supported features:
• cuDNN version (depends on the version of TensorFlow and Theano installed with Keras)
• Supported Interfaces: Python
GPU scaling: Multi-GPU, Single Node



Malong Retail AI Malong RetailAI® Fresh solves for the • Supports T4 Multi-GPU
Fresh Technologies time-consuming and error-prone experience that • Supports DeepStream Single Node
grocery store customers today struggle with
when weighing fresh products on a
self-serve scale.
Malong Retail AI Malong For loss prevention at self-checkout and • Supports T4 Multi-GPU
Protect Technologies staffed lanes. Leveraging award-winning • Supports DeepStream Single Node
product recognition technologies, the
system accurately identifies and stops
common scan errors as they happen,
including mis-scans and ticket-switching,
while helping to protect customer privacy.
Offers industry-leading accuracy while
being massively scalable for effectively
unlimited SKUs and stores.
MatConvNet MathWorks CNNs for MathWorks MATLAB; allows you • Building Blocks Multi-GPU
to use MATLAB GPU support natively rather • Simple CNN wrapper Single Node
than writing your own CUDA code. • DagNN wrapper
• cuDNN implemented
Matroid Matroid Matroid offers a video classification service • Matroid is multi-cloud and allows its Multi-GPU
in the cloud. Matroid allows training video customers to easily switch between AWS, Multi-Node
detections on a set of images and then Azure and Google Cloud.
applying those video detections.
MetaMind Einstein Provides a deep learning API for image • GPU-based training and inference Multi-GPU
Platform recognition and text sentiment analysis. • Recognizes images and analyzes text Single Node
Services Uses either prebuilt, public, or custom • Creates and trains classifiers with tooling
classifiers. for uploading and managing datasets
MXNet Amazon MXNet is a deep learning framework • MXNet supports cuDNN v5 for GPU Multi-GPU
designed for both efficiency and flexibility acceleration Multi-Node
that allows you to mix the flavors of
symbolic programming and imperative
programming to maximize efficiency and
productivity.
Neon Intel Neon is a fast, scalable, easy-to-use Python • Training, inference and deployment of deep Multi-GPU
based deep learning framework that has learning models Single Node
been optimized down to the assembler • Processes over 442M images per day on a
level. Features a rich set of example and Titan X
pre-trained models for image, video, text,
deep reinforcement learning and speech
applications.
NVCaffe Berkeley AI The Caffe deep learning framework makes • Process over 40M images per day with a Single GPU
Research implementing state-of-the-art deep single NVIDIA K40 or Titan GPU Single Node
learning easy.
out of stock Focal Systems Deep Learning Computer Vision to track your • On-Shelf Availability Analytics per hour Multi-GPU
detection On-Shelf Availability throughout your entire • Real-time Alerts on your “never be outs” Single Node
store 100+ times a day
PaddlePaddle PaddlePaddle PaddlePaddle (Parallel Distributed Deep • Optimized math operations through SSE/ Multi-GPU
Learning) is an easy-to-use, efficient, AVX intrinsics, BLAS libraries (e.g. MKL, Single Node
flexible and scalable deep learning ATLAS, cuBLAS) or customized CPU/GPU
platform, which was originally developed kernels
by Baidu scientists and engineers for the • Highly optimized recurrent networks which
purpose of applying deep learning to many can handle variable-length sequences
products at Baidu. without padding
• Optimized local and distributed training for
models with high dimensional sparse data
Protects & Insights BriefCam Transform video into actionable • NVIDIA Tesla and Jetson Multi-GPU
intelligence. Features: video synopsis and • TensorRT Single Node
real time alerts, loss prevention, customer
engagement and tying info to POS data,
heatmaps, shopper tracking



QA Bot Pryon Challenge: QA bots are easy to build but hard • V100 Multi-GPU
to keep up to date. The last thing you want is Single Node
a bot distributing wrong answers 24/7.
Solution: With Pryon, QA bots are ridiculously
fast and easy to create and, more importantly,
easy to monitor and maintain.
Benefits:
- Real time monitoring of questions asked
- Update or add more answers directly or
by adding documents
- Process feedback easily
Retail Analytics Pilot AI Labs Retail in-store analytics for stock out • JetPack Single GPU
(cameras in shelves), demographics (age/ • Jetson TX2 Single Node
gender), shopper tracking/counting, • RTX 2080
anomaly detection, drive through solutions
and more
samadii/dem Metariver Software for computing various behaviors • Solid particle simulator, DEM solver Multi-GPU
Technology of massive solid particles of various sizes, • Multi-Physics module (Drag and Buoyancy Multi-Node
from small particles with Brownian force, Magnetic force, Coulomb force,
motion to large particles such as ore, using adhesion force, Van der Waals force,
DEM (Discrete Element Method). Brownian motion and heat effect)
• VPS (Virtual Particle System), Cluster model
• Co-simulation with MBD (Multi Body
Dynamics) solvers (ADAMS, DADS,
RecurDyn, Daful)
• Co-simulation with ANSYS Mechanical
(Flexible body).
SAS SAS SAS Machine Learning. SAS Viya Visual • Volta V100 with Tensor Cores Multi-GPU
Data Mining and Visualization suites now • TensorRT for inference on the NVIDIA Multi-Node
leverage GPU deep learning Jetson TX2 box
• RNN
• Multiple GPUs on a single SMP node
• Homogeneous and heterogeneous MPP
with synchronized Stochastic Gradient
Descent
Sentient Sentient Sentient is an AI platform company • Sentient is using GPU deep learning in Single GPU
with special focus on digital marketing, its commercially available ecommerce, Single Node
ecommerce and finance trading digital marketing and financial trading
applications. applications
• Studio.ml is a new project designed to
make AI development easier by hiding most
of the complexity
• Studio.ml runs on-premise and in the cloud
Shopic Frictionless Shopic Frictionless Shopping - using smart cart • NVIDIA Xavier NX Single GPU
Shopping Single Node
SmartCart Imagr SmartCart comprises four tiny cameras • NVIDIA Jetson, Xavier Single GPU
and an AI vision recognition system • TensorRT Single Node
Smart Skin Human engine AI-enhanced processing of 3D and 4D data. • CUDA Multi-GPU
Used to create high quality 3D characters • Hairworks Multi-Node
for interactive media (games, mobile apps, • PhysX
VFX, VR/AR and mixed reality experiences, • cuDNN
etc.) • OptiX
- automatic retopology of 3D and 4D data
using machine learning
- photogrammetry: noise-reduction and
hole-patching using machine learning
- realistic lip-sync using 4D-trained neural
network
SpaceKnow PaaS SPACEKNOW PaaS for deep learning extraction of satellite • Extracts economic activity from satellite Multi-GPU
data information targeted at Financial images using deep learning Multi-Node
Services and Defense / Intelligence. Tracks • Provides batch mode extraction
macro/micro-economic activity by applying
deep learning to satellite images.



Talkmap Talkmap NLU model training/re-training/fine-tuning • V100 Multi-GPU
for contact center operation automation • P100 Multi-Node
trained from raw transcripts to identify the • T4 GPUs
intentions automatically, complemented • cuDNN
by human annotation. Models are used for
post-call analysis, chatbot design etc.
TensorFlow Google Google’s TensorFlow is an open source • TensorFlow is flexible, portable and Multi-GPU
software library for numerical computation performant, creating an open standard for Single Node
using data flow graphs. Nodes in the exchanging research ideas and putting
graph represent mathematical operations, machine learning in products
while the graph edges represent the
multidimensional data arrays (tensors)
communicated between them.
Theano LISA Lab Theano is a symbolic expression compiler • Abstract expression graphs for transparent Multi-GPU
that powers large-scale computationally GPU acceleration Single Node
intensive scientific investigations.
The Deep North Deep North The Deep North platform includes • TensorRT Multi-GPU
Video Analytics Occupancy Management, Gesture Analysis, Single Node
platform Zone Management, Vehicle Analysis,
Dashboard and reporting
theft & safety Third Eye Labs Theft, safety and loss detection • Tesla T4 - Metropolis Multi-GPU
• Concealment detector in the aisle and Single Node
the stockroom
• Safety - social distance detector
• Checkout Theft Detector at the POS
ThermalNet Malong AI-based dual camera thermal + computer • Smart Alerts Multi-GPU
Technologies vision screening system that can be utilized • Privacy Protection Single Node
by enterprises to help people stay safe • Customizable and Flexible Deployment
during epidemics. Powered by multiple • High Performance Accuracy
world-class AI models, the system can
accurately detect and alert on potentially
dangerous temperature levels combined
with PPE, occupancy, and social distancing
compliance.
Torch7 Open Source Torch7 is an interactive development • Computational back-ends for multicore Multi-GPU
environment for machine learning and GPUs Single Node
computer vision.
TrigoVision TrigoVision Retail automation platform that provides • TensorRT Multi-GPU
seamless checkout, shoplifting prevention, Single Node
and real-time inventory updates.
Unify.ID Unify.ID Behavioral user authentication service • Identifies individuals based on unique factors Multi-GPU
such as the way they walk, type and sit Single Node
Veesion Veesion Shoplifting detection using a deep learning • Real-time shoplifting prospects alerts Multi-GPU
algorithm that continuously analyses Single Node
the content of security cameras. It
automatically detects gestures associated
with shoplifting in real-time. Sends a video
alert to a human operator who confirms the
theft and takes action.
Visual Intelligence Deep Vision Deep Vision specializes in understanding • Visual Intelligence API allows leading Single GPU
API visual content and getting the most value enterprises in verticals like e-commerce Single Node
out of data by applying visual recognition for and online auctions, media and
enterprises. entertainment and retailers, to analyze
content related to faces, brands and
context tags to perform actions like:
• Curate and organize visual content
• Search and recommend visually
• Get insights and analytics visually
Voca’s Virtual voca.ai Human-like call center conversational AI • Jasper Multi-GPU
Agent • NeMo Multi-Node



vuForecast deepVu ML/DL enabled vuForecast learns • ML (dmlc/XGBoost) + Dask for distributed Multi-GPU
from historical inventory, point of sale, training Single Node
promotions and logistics data augmented • DL (RNN/LSTM networks) + PyTorch 1.1
with DeepVu’s real-time data platform • DL (RL) + TensorFlow 1.14 and 2.0
aggregating numerous external micro
and macro economic signals to accurately
forecast future demand
Walkout walkout Autonomous check out - smart cart • NVIDIA Jetson TX2 Single GPU
Single Node
Yusp Gravity R&D Personalized recommendations for • Search solution to create a smooth product Multi-GPU
E-commerce, powered by T4 discovery experience Single Node
• Product/Content recommendation
• On-site personalization
• Search personalization
• Mobile personalization
• E-mail marketing (and push, SMS)
personalization
• Personalized ad retargeting
• Ad exchange yield optimization
Zippin Zippin Checkout-free technology offering inventory • JetPack Multi-GPU
tracking and insights to ensure the right Single Node
products are in the right place, at the right
time.
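
The throughput figures quoted in the NVCaffe and Neon entries above are stated per day. As a rough aid for comparing them with per-second rates, the conversion can be sketched as below. This assumes continuous processing over a full 24-hour day, which the catalog does not state, and the function name images_per_second is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def images_per_second(images_per_day: float) -> float:
    """Convert a per-day image count to an average per-second rate.

    Assumption: processing runs continuously for the full 24 hours.
    """
    return images_per_day / SECONDS_PER_DAY


# Figures as quoted in the catalog entries (both are stated as "over ...").
quoted = {
    "NVCaffe (single K40 or Titan GPU)": 40_000_000,
    "Neon (Titan X)": 442_000_000,
}

for name, per_day in quoted.items():
    print(f"{name}: about {images_per_second(per_day):,.0f} images/s")
```

These are averages derived from the quoted daily totals only; they say nothing about batch size, model, or precision, none of which the catalog specifies.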

Public Sector and National Government


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

Advanced Ortho DigitalGlobe Geospatial visualization • Image orthorectification Multi-GPU
Series Single Node
ArcGIS Pro ESRI Viewshed2 determines the raster surface • Viewshed2 Multi-GPU
locations visible to a set of observer • Deep Learning Multi-Node
features, using geodesic methods. • Aspect - The values of each cell in the
Transforms the elevation surface into a output raster indicate the compass
geocentric 3D coordinate system and runs direction the surface faces at that location.
3D sightlines to each transformed cell It is measured clockwise in degrees from
center. Takes advantage of Tensor Cores for 0 (due north) to 360 (again due north),
both training and inference. coming full circle.
• Slope - The output slope raster can be
calculated in two types of units, degrees or
percent (percent rise).
Blaze Terra Eternix Geospatial visualization tool • 3D visualization of geospatial data Multi-GPU
Single Node
Elcomsoft Elcomsoft High-performance distributed password • GPU acceleration for password recovery Multi-GPU
recovery software with NVIDIA GPU • 10-100x speedup for password recovery Single Node
acceleration and scalability to over 10,000
workstations.
ENVI L3Harris Inc Image Processing and Analytics • Deep learning training Multi-GPU
• Deep learning inferencing Single Node
• Image orthorectification
• Image transformation
• Atmospheric correction
• Panchromatic co-occurrence texture filter
• Video processing and analytics using
Jagwire
ERDAS Imagine Hexagon Remote sensing, photogrammetry and GIS • Gray Level Co-occurrence Matrix (GLCM) Single GPU
Geospatial toolset for the interactive, semi-automated image processing operation Single Node
and automated extraction of information from • NNDiffuse image pan sharpening operation
remotely sensed imagery and point clouds. • Deep learning capabilities using the GPU
accelerated versions of TensorFlow



Fortify Corsight AI Sureproof Facial Recognition AI For your • Smart technology that can overcome face Multi-GPU
Safety & Privacy masks & PPE Single Node
• Facial recognition in almost complete
darkness & extreme angles
• Non-discriminative algorithm that is
ethnicity neutral
• Vintage image match up to 30 years old
• Mask detection and alert on subjects not
wearing a face mask
Geomatics GXL PCI Image processing • Image orthorectification Multi-GPU
• Additional image processing Single Node
GeoWeb3d Desktop Geoweb3d Geospatial visualization of 3D and 2D data, • 3D visualization and analysis of geospatial Multi-GPU
mensuration and mission planning data Single Node
Graphistry Graphistry Graphistry is the first visual investigation • Graph reasoning Multi-GPU
platform to handle increasing • GPU-accelerated visual analytics Single Node
enterprise-scale workloads. • Visual pivoting
• Rich investigation templating
Ikena ISR MotionDSP Real-time full motion video (FMV) and • Real-time super-resolution-based video Multi-GPU
wide-area motion imagery (WAMI) enhancement enhancement on live streams Single Node
and computer-vision-based analytics • Geospatial visualization
software. • Target detection and tracking
• Fast 2-D mapping
LuciadLightspeed Hexagon Geospatial visualization and analysis • GPU-accelerated line of sight and viewshed Single GPU
Geospatial calculations Single Node
• GPU-accelerated hypsometry calculations,
including terrain slope, ridge and valley
detection, terrain orientation and azimuth
calculations
• GPU-accelerated imaging operator for
geospatially referenced imagery
Manifold Systems Manifold Full-featured GIS, vector/raster processing • Manifold surface tools Multi-GPU
Systems & analysis Single Node
OmniSIG DeepSig Inc. The OmniSIG sensor provides a new • Operates in a real-time streaming fashion Multi-GPU
class of RF sensing and awareness using • Ingests radio samples from many common Single Node
DeepSig’s pioneering application of radio interfaces
Artificial Intelligence (AI) to radio systems. • Makes use of packet formats like VITA49 or
Going beyond the capabilities of existing SDDS.
spectrum monitoring solutions, OmniSIG • Can be used from any device with a
is able to not only detect and classify browser, including mobile handsets
signals but understand the spectrum • OmniSIG software also provides its
environment to inform contextual analysis metadata output stream in JSON form for
and decision making. Compared to use by other applications
traditional approaches, OmniSIG provides
higher sensitivity and accuracy, is more
robust to harsh impairments and dynamic
spectrum environments, and requires fewer
computational resources and less dynamic range.
SNEAK OpCoast Electromagnetic signals propagation • Ray tracing, DTED and remote sensing Multi-GPU
modeling for complex urban and terrain inputs Single Node
environments.
SocetGXP BAE Systems Visual Profiler utilizes a cognitive vision • Automated 3D feature extraction from Multi-GPU
and profiling methodology (using machine LiDAR Single Node
learning algorithms and state-of-the-art • Automated feature detection from imagery
deep learning schemes) to provide using deep learning
unlimited object definition and profiling
flexibility. The Automatic Spatial Modeler
(ASM) is designed to generate 3-D point
clouds with accuracy similar to LiDAR.
Extracts 3-D objects and 3-D dense point
clouds from stereo images. Also extracts
accurate building edges and corners from
stereo images with high resolution, large
overlaps, and high dynamic range.



Terrabuilder Skyline Software PhotoMesh integrates a GPU-based, fast • 3D model building from imagery Multi-GPU
PhotoMesh algorithm, able to automatically build 3D • Building texture generation Single Node
models from simple photographs. PhotoMesh
revolutionizes the use of geospatial data by
fully automating the generation of
high-resolution, textured, 3D mesh models from
standard 2D images.
Therm-App® MD Opgal Thermal imaging device for body • Unlimited Hotspot Detection & Tracking Single GPU
Pro temperature measurement • Advanced Deep Learning Algorithm Single Node
• Linux-Based Solution
• Stand-Alone Solution
• Remote Sensor
• Quick Hotspot Detection
• Up to 20 Simultaneous Scans
• Audio & Visual Alert
Wesafe WeSmart Simple low-cost IVA solution for up to 4 • People Detection in ROI Single GPU
cameras on a Jetson Nano, performing • Night/Day, People Counting Single Node
people detection in ROI and people • Push notifications with visuals of the alerts
counting. • Simple setup, ONVIF Cameras detection.
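
The ArcGIS Pro entry above describes Aspect as a compass direction measured clockwise from 0 (due north) to 360, and Slope as expressible in degrees or in percent rise. A minimal sketch of those relationships follows. It assumes the elevation gradient at a cell is already known as dz_dx (rise per unit distance toward east) and dz_dy (rise per unit distance toward north); the function names are hypothetical, and this is not Esri's implementation, which the catalog does not describe.

```python
import math


def slope_degrees(dz_dx: float, dz_dy: float) -> float:
    """Slope as an angle: arctangent of the gradient magnitude (rise over run)."""
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


def slope_percent(dz_dx: float, dz_dy: float) -> float:
    """Slope as percent rise: 100 * rise / run."""
    return 100.0 * math.hypot(dz_dx, dz_dy)


def aspect_degrees(dz_dx: float, dz_dy: float):
    """Compass direction the surface faces, clockwise from due north.

    The surface faces downhill, i.e. along the negative gradient.
    Returns None for a flat cell (a choice made for this sketch).
    """
    if dz_dx == 0 and dz_dy == 0:
        return None
    # atan2(east component, north component) gives a bearing from north.
    return math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0


# A surface rising 1 unit per unit distance toward the east:
# 45 degree slope, 100 percent rise, facing west (270).
print(slope_degrees(1.0, 0.0), slope_percent(1.0, 0.0), aspect_degrees(1.0, 0.0))
```

The example shows why the two slope units diverge: a 45 degree slope is a 100 percent rise, and percent rise grows without bound as the surface approaches vertical.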

Design for Manufacturing/Construction: CAD/CAE/CAM


CFD (MFG)
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

Actran FFT Simulation of acoustics propagation at • Discontinuous Galerkin Method (DGM) Multi-GPU
high frequency or in huge domains such as solver Multi-Node
exhaust of turbomachines, full truck cabin
exterior acoustics, and ultrasonic parking
sensors.
ADS Flow Solver - ADSCFD, Inc. A compressible, explicit time-marching • Unstructured/Structured Meshes Multi-GPU
Code LEO CFD solver for aerospace applications. • Multigrid Accelerations Multi-Node
Capable of handling both internal and • Multiple Turbulence Models
external flows with robustness and • Rotor-stator Interfaces
accuracy
Altair AcuSolve Altair Computational Fluid Dynamics (CFD) • Linear solvers for flow, temperature, Single GPU
tool, providing users with a full range of turbulence model, and mesh movement Single Node
physical models. Simulations involving equations
flow, heat transfer, turbulence, and
non-Newtonian materials are handled with ease
by AcuSolve’s robust and scalable solver
technology.
Altair nanoFluidX Altair State-of-the-art particle-based (SPH) fluid • Extremely fast Multi-GPU
dynamics code for simulation of single and • Single and Multiphase Flows Multi-Node
multiphase flows in complex geometries • Arbitrary motion definition
with complex motion. • Time-dependent acceleration
• Inlets/outlets
• Surface tension and adhesion
• Steady-state thermal solutions through
coupling
Altair ultraFluidX Altair Simulation tool for ultra-fast prediction of • CUDA-accelerated high-fidelity flow Multi-GPU
the aerodynamic properties of passenger field computations based on the Lattice Multi-Node
and heavy-duty vehicles as well as for the Boltzmann method
evaluation of building and environmental • CUDA-aware MPI support for multi-GPU
aerodynamics. and multi-node usage
• Efficient implementation of tailor-made
automotive features, including rotating
wheels, belt systems, boundary layer
suction and porous media support
Ansys Fluent ANSYS General purpose CFD software • Linear equation solver Multi-GPU
• Radiation heat transfer model Multi-Node
• Discrete Ordinate Radiation model
Ansys Icepak ANSYS CFD software for electronics thermal • Linear Equation Solver Multi-GPU
management Multi-Node
Ansys Polyflow ANSYS CFD software for the analysis of polymer • Direct Solvers Multi-GPU
and glass processing Single Node



CharLES Cascade CharLES is a GPU-accelerated CFD • CUDA Toolkit Multi-GPU
Technologies, software application specializing in LES Multi-Node
Inc. (Large Eddy Simulations). Runs on a range
of CUDA GPUs from Kepler to Turing
architectures and scales with multiple
GPUs in a single server node as well as
scales across multiple GPUs over a cluster
of nodes.
CPFD CPFD Modeling software for simulating Fluidized • Linear equation solver for isothermal, Single GPU
Barracuda-VR and Barracuda Reactors non-reacting simulations and for thermal Single Node
reacting cases
• Discrete multi-component particle
calculations
DYVERSO Next Limit Multi-physics simulation engine for liquids • Fluid solver in RealFlow 10.5 based on Single GPU
and granular substances. Can be used to Smoothed particle hydrodynamics (SPH) Single Node
mimic behavior of rigid and soft bodies • Fluid solver in RealFlow 10.5 based on
Position based dynamics (PBD)
Fine/Open Numeca FINE/Open with OpenLabs is a powerful • Incompressible, low and high speed flows Multi-GPU
International CFD Flow Integrated Environment dedicated • Efficient preconditioned compressible Multi-Node
to complex internal and external flows. It solver with fast agglomerated multigrid
allows users to freely develop and exchange acceleration and adaptation techniques
physical models in CFD, with a new open to combine completely unstructured
approach to CFD. Complex programming hexahedral grids
tasks are avoided through the usage of an
easy meta-language.
FINE/Turbo Numeca Structured, multi-block, multi-grid CFD • Multi-grid solver Multi-GPU
International solver targeting the turbo machinery Multi-Node
industry
GeoPlat-RS GridPoint Geoplat Pro-RS is a parallel hydrodynamic • CUDA Multi-GPU
Dynamics (GPD) simulator with a flexible architecture. This • Spectral Decomposition with cuFFT library Single Node
makes it possible to reduce the time for writing
the entire simulator by 2/3 and, as a
consequence, to quickly bring new physical
processes into the algorithm.
HiFUN SANDI High Resolution Flow Solver on • HiFUN incorporates the most recent CFD Multi-GPU
Unstructured Meshes. State-of-the-art technologies, many of them homegrown Single Node
Euler/RANS solver. Super scalability on • HiFUN exhibits highly scalable parallel
massively parallel HPC platforms, with code performance with its ability to scale up to
ported using OpenACC directives for NVIDIA several thousand processors on massively
GPU. parallel computing platforms
• Capable of handling complex geometries
and flow physics arising in high lift flows
JSCAST Qualica Inc. Integrated CAE product for studying and • Solvers for mold filling and solidification Single GPU
predicting the casting process. Includes • Rendering Single Node
high precision mold filling and solidification
solvers.
midas NFX(CFD) Midas General purpose CFD software based on • Linear equation solver (Iterative Solver and Single GPU
FEM AMG Preconditioner) Single Node
MIKE 21 DHI 2D hydrological modelling of coast and • Flexible Mesh (FM) engines use GPUs. Multi-GPU
sea for simulating physical, chemical, and • Hydrodynamic and turbulence calculations Single Node
biological processes
MIKE 3 DHI 3D Modeling of Coast and Sea • Hydrodynamic part of the flexible mesh Multi-GPU
engines (MIKE 3 HD FM). Multi-Node
MIKE FLOOD DHI 1D & 2D urban, coastal, and riverine flood • Hydrodynamics Multi-GPU
modelling • 2D Overland flow Single Node
• Coupling of 1D and 2D models for complex
flooding issues
MSC Apex MSC Software Generative Design based simulation to • Ultra-fast matrix solving Multi-GPU
Generative Design create several optimized, lightweight • Accelerated computing power for part Multi-Node
designs ultra-fast and almost fully optimizations
automated



M-Star CFD M-Star General purpose CFD Multiphysics • Fluid flow & heat transfer Multi-GPU
Simulations, LLC modeling software • DEM simulation Multi-Node
• Chemical reactions
• Multi-phase flow
Numerix Zeus Custom software development in the areas • Lattice Boltzmann Method (LBM) for flow Multi-GPU
of CFD, FEA and Electromagnetics around buildings Single Node
• SPH based flow solver for simulating flow
over urban environments
Pacefish Numeric CFD application for Automotive • Transient Lattice-Boltzmann Method for Multi-GPU
Systems GmbH Aerodynamics, Pedestrian Comfort and single-phase flows Single Node
Wind Loading • Integrated fast and robust pre-processor
for complex geometries
• Local grid refinement
• uRANS (K-Omega-SST), hybrid uRANS-LES
(SST-DDES & SST-IDDES)
• LES (Smagorinsky) turbulence modeling
• Scalable up to 16 GPUs
Particleworks Prometech CFD software using MPS (Moving Particle • Explicit and Implicit methods Multi-GPU
Simulation) method for automotive, energy, Multi-Node
material, chemical processing, medical,
food, and civil engineering industries where
free surface fluid flow and fluid mixing
phenomena occur.
PowerViz Dassault Industry proven, modern post-processing • Rendering Multi-GPU
Systèmes app for EXA POWERFLOW CFD • Ray tracing Single Node
SIMULIA Corp.
ScPOST Hexagon, Cradle Postprocessor for visualizing simulation • File loading acceleration Single GPU
results from CFD analysis, MSC Nastran Single Node
and MSC Marc
Simcenter 3D Siemens Digital A unified, scalable, open and extensible • Rendering Multi-GPU
Industries environment for 3D CAE with connections • Raytracing Single Node
Software to design, 1D simulation, test, and data
management.
Simcenter Siemens Digital Integrated solution for CFD-focused • Rendering Single GPU
STAR-CCM+ Industries Multiphysics simulation Single Node
Software
Speed IT FLOW Vratis Incompressible single-phase CFD software • Finite-volume solver: SIMPLE and PISO, Single GPU
incompressible single-phase flows with Single Node
k-OmegaSST turbulence
Turbostream Turbostream CFD software for turbomachinery flows • Finite Volume explicit solver for RANS/ Multi-GPU
Ltd. URANS calculations Multi-Node
• Variable time-steps and multigrid for
convergence acceleration
zCFD Zenotech General purpose CFD solver • Turbulent flow (RANS, URANS, DDES or Multi-GPU
Simulation LES) including automatic scalable wall Single Node
Unlimited functions

CFD (RESEARCH DEVELOPMENTS)


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

ALYA Barcelona Alya is a high performance computational • Incompressible Flows Multi-GPU
Supercomputing mechanics code to solve complex coupled • Compressible Flows Multi-Node
Center (BSC) multi-physics / multi-scale problems, which • Non-linear Solid Mechanics
are mostly coming from the engineering • Species transport equations
realm. • Excitable Media
• Thermal Flows
• N-body collisions
DualSPHysics University of SPH-based CFD software • SPH model Multi-GPU
Manchester Single Node
HiPSTAR University of CFD software for compressible reacting • Explicit solver Multi-GPU
Southampton flows Single Node
and University
of Melbourne -
Sandberg



Project Chrono University of Chrono is a physics-based modelling • Robotics Multi-GPU
Wisconsin-Madison and simulation infrastructure based on a • Wheeled vehicle dynamics Multi-Node
platform-independent open-source design • Tracked vehicle dynamics
implemented in C++. Systems can be made • Nonlinear finite element analysis
of rigid and flexible/compliant parts with • Mechatronics
constraints, motors and contacts; parts • Off-road vehicle mobility
can have three-dimensional shapes for • Terramechanics
collision detection • Virtual reality
• Granular flows
• Collision detection
• Autonomous vehicles
• Seismic engineering
• Augmented reality
PyFR Imperial College General purpose CFD software for • High-order explicit solver based on flux Multi-GPU
- Vincent compressible flows reconstruction method Multi-Node
RAPTOR US DOE CFD formulation of turbulent combustion • Flow solver Multi-GPU
for fuel injector and other engine Multi-Node
applications
S3D Sandia and Oak Direct numerical simulation (DNS) solver for turbulent • Chemistry model Multi-GPU
Ridge NL combustion Multi-Node

COMPUTATIONAL STRUCTURAL MECHANICS


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

Adams MSC Software Multi-Body Dynamics simulation software • Rendering Single GPU
Single Node
Altair EDEM Altair Software for bulk material simulation that • EDEM Simulator, a DEM solver Multi-GPU
uses the Discrete Element Modeling (DEM) • Integration with Ansys and Abaqus for FEA Single Node
technology to simulate and analyze behavior for bulk material simulation
of bulk materials • Integration with Adams, Siemens and
RecurDyn for Multi-body Dynamics
• Integration with Ansys Fluent for
Particle-Fluid Systems
Altair HyperWorks Altair Comprehensive, open architecture CAE • OpenGL v3.2 Single GPU
simulation suite in the industry, offering the • OpenCL v2.0 support Single Node
best technologies to design and optimize • Anti-aliasing
high performance, weight efficient and
innovative products. It includes a full set of
modeling and visualization tools.
Altair OptiStruct Altair Industry proven, modern structural analysis • Direct solver (BCS) Single GPU
solver for linear and nonlinear problems • Eigenvalue solvers (AMSES and Lanczos) Single Node
under static and dynamic loadings. It is also • Iterative solver (PCG)
the market-leading solution for structural
design and optimization.
Amphyon AdditiveWorks Simulation-based process software for • Mechanical Process Simulation Single GPU
powder bed based, laser beam melting • Thermal Process Simulation Single Node
additive manufacturing processes
Ansys Mechanical ANSYS Simulation and analysis tool for structural • Direct and iterative solvers Multi-GPU
mechanics Multi-Node
Autodesk Nastran Autodesk Autodesk Nastran FEA software analyzes • Double Precision on GPU Multi-GPU
linear and nonlinear stress, dynamics, and Multi-Node
heat transfer characteristics of structures
and mechanical components.
GranuleWorks Prometech DEM-based advanced simulator for • Size distribution, contact force model, Multi-GPU
granular materials in pharma and powder rolling resistance model, liquid bridge force Multi-Node
metallurgy: granular material segregation, model, van der Waals force model, heat
screening, grinding, screw conveying, transfer and external force.
mixing, compaction, filling, dustproof, toner • Boundary conditions: polygon wall, inflow
transport, electrode materials filling, cliff and outflow boundary, and simulation
collapses/debris flow, etc. domain.
• Coupling with Particleworks MPS solver:
support for aeration and pumps
Helyx PEM Engys Specialised add-on solver for HELYX to simulate • Polyhedral Elements Method solver Single GPU
large numbers of solid objects in motion using Single Node
the Polyhedral Element Method (PEM)



Impetus Afea Impetus Afea Predicts large deformations of structures • Non-linear Explicit Finite-Element Solver Multi-GPU
and components exposed to extreme Single Node
loading conditions
Irazu Geomechanica Simulation and analysis tool for rock • Explicit 2D and 3D FEM and FDEM solvers Single GPU
Inc. mechanics, involving large deformations, • Coupled hydraulic, mechanical, transport, Single Node
fracturing and multi-physics phenomena. thermal and fracture processes
Marc MSC Software Simulation and analysis tool for structural • Direct sparse solver Multi-GPU
mechanics Single Node
MatDEM Nanjing MatDEM is a software for Fast GPU Matrix • Full product support on GPU Multi-GPU
University computing of Discrete Element Method. The Single Node
software implements automatic stacking
modeling, layered material, joint surface
and load settings, rich post-processing
functions and secondary development.
midas GTS NX Midas Simulation tool for geo-technical analysis • Linear equation solver (Multi Frontal Solver) Single GPU
Single Node
midas Midas Simulation and analysis tool for structural • Linear equation solver (Multi Frontal Solver) Single GPU
NFX(Structural) mechanics Single Node
MSC Nastran MSC Software Multidisciplinary structural analysis • Direct sparse solver Multi-GPU
application used to perform static, dynamic, Single Node
and thermal analysis across linear and
nonlinear domains
PERMAS-XPU INTES GmbH General purpose structural simulation • Linear Equation Solver Single GPU
software Single Node
RecurDyn FunctionBay, Inc. Multi-Flexible Body Dynamics simulation • Rendering Single GPU
software Single Node
Rocky DEM Rocky DEM Discrete Element Modeling (DEM)-based • Explicit DEM solver (dry/sticky contact Multi-GPU
particle simulation software for simulating rheologies) Single Node
behavior of bulk materials with complex • 1-way & 2-way coupling with ANSYS Fluent
particle shapes and size distributions and ANSYS Mechanical
Simcenter Nastran Siemens Digital Finite element method (FEM) solver for • Linear and nonlinear equation solver Multi-GPU
Industries computational performance, accuracy, • Frequency response module Multi-Node
Software reliability and scalability • Matrix decomposition computations
SIMULIA Dassault Realistic simulation solution (Uses Abaqus • Direct sparse solver Single GPU
3DEXPERIENCE Systèmes Standard for GPU computing) Single Node
SIMULIA Corp.
SIMULIA Abaqus/ Dassault Simulation and analysis tool for structural • Direct sparse solver Multi-GPU
Standard Systèmes mechanics • AMS Solver Multi-Node
SIMULIA Corp. • Steady State Dynamics
ThreeParticle/CAE BECKER 3D Multiphysics Discrete Element Method • GPU accelerated Smoothed Particle Single GPU
GmbH (DEM) simulation platform for bulk Hydrodynamics Single Node
materials with complex shapes and built-in • Simulate complex and real particle shapes
multi-body dynamics (MBD), Finite Element using DEM combined with SPH, FEA, MBD,
Analysis (FEA) & Smoothed Particle Wear
Hydrodynamics (SPH)

DESIGN AND VISUALIZATION


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

3D CAT.live Shenzhen Real-time rendering cloud service for 3D • Cloud XR SDK Multi-GPU
Rayvision applications. The massive GPU computing • DLSS (potential) Multi-Node
Technology Co power in the cloud is used to process heavy
Ltd image rendering calculations and stream
output to the terminal device synchronously,
thereby realizing light weight of the terminal
device and making high-quality 3D graphics
applications ubiquitous. Users can use any
common networked device to access the
3D application hosted in the 3DCAT cloud
without downloading and installing the
application. Supports almost all rendering
engines that can run on the Windows
platform, and supports the opening of
NVIDIA RTX real-time ray tracing function.



3DEXCITE DeltaGen Dassault High-end 3D visualization and realtime • Interactive ray tracing and global Multi-GPU
Systèmes interaction to help increase visual quality, illumination. Single Node
speed, and flexibility. • Integration with Siemens Teamcenter.
• Cluster support Realtime & Offline
Production Process Integration and scene
building.
• Scene Analysis, Xplore DeltaGen, SDK for
DeltaGen.
6SigmaET Future Facilities Thermal simulation software for the • Monte-Carlo ray tracing for Heat Radiation Single GPU
electronics industry. 6SigmaET’s unique • NVIDIA’s OptiX library Single Node
MLUS Computational Fluid Dynamics (CFD)
solver predicts thermal issues in complex
electronics equipment.
Abaqus/CAE Dassault Complete solution for Abaqus finite element • Rendering Multi-GPU
Systèmes modeling, visualization, and process Single Node
SIMULIA Corp. automation
Accelerad MIT Sustainable Accelerad is a free suite of programs for • Up to forty times faster using OptiX N/A
Design Lab fast and accurate lighting and daylighting • Renderings with large numbers of ambient
analysis and visualization. bounces
• Calculations over many thousands of
sensor points
• Fast simulation of annual climate-based
daylighting metrics
• AcceleradRT - Interactive interface for
real-time daylighting, glare, and visual
comfort analysis with validated accuracy.
Includes AcceleradVR, an immersive
visualization interface compatible with
most virtual reality headsets.
Additive Mfg Toolkit Dyndrite Dyndrite has developed a GPU-based • CUDA N/A
geometry kernel with CUDA. The initial
application for this kernel is an Additive
Manufacturing Toolkit which speeds up
the process of 3D printing, especially for
complex parts.
ALLPLAN Nemetschek Complete Building Information Modeling • OpenGL 4, and now moving to Vulkan Single GPU
ALLPLAN (BIM) for Architecture, Engineering, and • Vulkan for wireframe rendering already Single Node
Construction. with plan to ship full integration with
Version 2022 in September 2021
ANSA BETA CAE Multidisciplinary CAE pre-processing tool • OpenGL Single GPU
Systems for full model build up, from CAD data to • OpenCL Single Node
ready-to-run solver input file, in a single
integrated environment
Ansys Discovery ANSYS Interactive and CAD-agnostic Windows- • OpenGL-based visualization Single GPU
Live based app that gives engineers • CUDA-based Structural Stress, Modal, Single Node
instantaneous simulation results to help Fluid Dynamics, Thermal, Electrical
them explore and refine product designs Conduction and Coupled Multi-Physics
simulations
Ansys SPEOS ANSYS Physically accurate optical simulation • SPEOS Live Preview Single GPU
software dedicated to predictive illumination • 360 degrees for immersive or observer view Single Node
and optical performance of systems. High- • Optical part design
fidelity visualization of the final result, based • Optical sensors test
on unique human vision algorithm. • HUD design and analysis
• Infrared modeling
Ansys ANSYS Predictive physics-based real time • Physics-based real time lighting simulation Single GPU
VRXPERIENCE for lighting simulation with VR capabilities with VR capabilities from HMD to CAVEs Single Node
HMI and Perceived to experience and validate the impact of (multi-GPU, multi-node)
Quality your design proposition on appearance and • SPEOS Live Preview (raytracing) based
perceived quality. on CUDA/OptiX benefiting from RTX
architecture (single GPU)
• Scalable rendering capabilities, ranging
from rasterization to fully GPU ray-traced
SPEOS Live Preview



Ansys ANSYS Predictive validation of vehicle systems for • Multispectral Physics-based real time Multi-GPU
VRXPERIENCE the optimization of intelligent headlamp lighting simulation with multi-display Multi-Node
Lighting and units and sensors dedicated to ADAS capabilities (driving simulator).
Sensors and AD. Rapid and simple virtual test of
systems, relying on the unique combination
of visually realistic driving simulator, and
physics-based simulation. Real-time and
interactive driving simulator to virtually
create, test and experience future vehicle
driving in real-world like conditions.
Ansys Workbench ANSYS Industry proven, modern pre- & post- • Rendering Multi-GPU
processing app for CAE Single Node
Apex MSC Software Unified environment for virtual product • Rendering Single GPU
development Single Node
Archicad Nemetschek Complete Building Information Modeling • OpenGL based GPU rendering Single GPU
GRAPHISOFT (BIM) for Architecture, Engineering, and • Fast, efficient graphics in the viewport Single Node
Construction. • RTX photorealistic rendering with
Twinmotion, internal rendering engine
based on CineRender, and now integrating
Redshift into Archicad.
Arch-Log Luminova Japan A web service based on NVIDIA Iray and • Iray Multi-GPU
RealityServer (from migenius) for rendering • RealityServer Multi-Node
and configuring building materials. • Quadro
• DGX
AutoCAD Autodesk 2D and 3D CAD designing, drafting, • Surface, mesh and solid modeling tools, Single GPU
modeling, architectural drawing, and model documentation tools, parametric Single Node
engineering software. drawing capabilities
• OpenGL
• Native DWG support
• GRID Support.
Avatar VR NeuroDigital Haptic VR gloves for training design or • PhysX Single GPU
Technologies remote operation. Single Node
BricsCAD Hexagon PPM Building information modeling software for • Rendering Single GPU
design, construction, documentation, and Single Node
manufactured building products.
CATIA Dassault The reference CAD application for advanced • GPU OpenGL performance scaling in Single GPU
3DEXPERIENCE Systèmes engineering with batching capability R2017x Single Node
and extreme reliability, used by 80 of • VR native integration with HTC Vive in
the automotive industry and the entire R2017x
aerospace industry. • VR SLI in R2018x
• Stellar GPU in R2019x FD01
CATIA Live Dassault Realistic 3D Rendering on full CATIA 3D • Physically Based Rendering with no data Multi-GPU
Rendering Systèmes CAD model. preparation thanks to native NVIDIA Iray Single Node
Photoreal integration and interactive
realistic rendering using NVIDIA Iray IRT
Clarisse Isotropix Set dressing and layout tool with integrated • GPU accelerated interactive rendering 50-100X Single GPU
renderer faster than with CPU Single Node
• OptiX AI-accelerated de-noising
Clip Studio Paint Celsys Clip Studio Paint is a versatile digital • Accelerated processing and AI features Single GPU
painting program that is ideal for the digital Single Node
creation of comics, general illustration, and
2D animation.
Clo3D CLO Virtual 3D garment simulation and design • CUDA Single GPU
Fashion Inc Single Node
COMSOL COMSOL Multiphysics general-purpose simulation • OpenGL version 2.0 Multi-GPU
software for modeling designs, devices • DirectX version 9 Single Node
and processes in all fields of engineering,
manufacturing, and scientific research
Creo Generative PTC Creo Generative Topology Optimization • CUDA accelerated Generative Design Multi-GPU
Topology Extension (GTO) creates optimized product Single Node
Optimization designs based on your constraints and
Extension (GTO) requirements - including materials and
manufacturing processes



Creo Parametric PTC Professional 3D CAD software for product • GPU accelerated real-time engineering Single GPU
design and development, including simulation with Creo Simulation Live Single Node
parametric modeling, simulation/analysis, • Full scene anti-aliasing
and product documentation for companies • Order independent transparency
ranging from SMB to Enterprise. • Better lighting and enhanced shaded-with-
edges mode
• Immersive design environment with
realistic materials
Easy 3D Scan Cappasity 3D digitizing software that creates and • OpenCL Single GPU
embeds 3D product images into your Single Node
website, mobile and AR/VR apps, and
gives your customer a near real shopping
experience.
Enscape Enscape GmbH Renderer with Plug-in for Revit, Rhino, • Full RTX-enabled Single GPU
SketchUp, ARCHICAD, and Vectorworks • One-click to VR experience Single Node
• Design reviews for buildings
• 3D and VR visualization of CAD data for AEC
Grasshopper McNeel & Assoc. Grasshopper is a graphical algorithm editor • Fast, scalable OpenGL 3.3 pipeline Single GPU
tightly integrated with Rhino’s 3-D modeling leverages latest NVIDIA GPUs Single Node
tools. Unlike RhinoScript, Grasshopper • GPU computed shaders and memory
requires no knowledge of programming or optimizations
scripting, but still allows designers to build • Rhino 6 leverages NVIDIA RT Cores for
form generators from the simple to the Real-time ray tracing viewport mode
awe-inspiring. • Rendering engine is CYCLES, fully
integrated inside Rhino 6 now
IC.IDO ESI Group Immersive VR solution for engineering and • NV Pro Pipeline (RiX) for OpenGL rendering Multi-GPU
virtual prototyping. The Helios rendering • VRWorks SPS and VR SLI (NVLink support) Single Node
engine is highly optimized for NVIDIA GPUs. • DesignWorks, including VR Occlusion
Culling open source sample and OptiX
ImageStation Hexagon ImageStation software suite designed • Stereo Display and Viewing Single GPU
Geospatial for high-volume photogrammetry and Single Node
production mapping including aerial and
satellite triangulation, stereo feature and
digital terrain model (DTM) collection and
editing, automatic DTM and digital surface
model (DSM) generation, and orthophoto
production and editing
Inspire Studio/ Altair Inspire Studio is a high quality 3D Hybrid • NURBS modeling Single GPU
Render (formerly Modeling and Rendering environment that • PolyNURBS modeling Single Node
known as Evolve) enables industrial designers to evaluate, • OpenGL 4.5 Core
research and visualize various designs • OpenGL-based real-time high-quality
faster than ever before. Inspire Studio runs rendering
on both Mac OS X and Windows. • Interactive high-quality rendering using
Thea Render
• Production rendering using Thea Render
• Integrated “dark room” environment to
manage render queue and post-processing
of rendered images
Inventor Autodesk 3D mechanical design, documentation, and • Uses BIM for intelligent building Single GPU
product simulation. components to improve design accuracy Single Node
Iray NVIDIA A ready-to-integrate, physically-based, • Iray Interactive Multi-GPU
photorealistic rendering solution. • Iray Photoreal Multi-Node
• Iray Server
• Fast interactive ray tracing
• Physically-based, global-illumination
rendering
• Distributed cluster rendering.
Iray for 3ds Max Siemens Digital A physically-based renderer plugin for • Iray Photoreal and Iray Interactive support, Multi-GPU
Industries Autodesk 3ds Max VCA clustering, Cloud rendering, MDL Multi-Node
Software support and AI based denoising
Iray for Maya 0x1 Software A physically-based renderer plugin for • Iray Photoreal and Iray Interactive support, Multi-GPU
& Consulting Autodesk Maya. VCA clustering, Cloud rendering, MDL Multi-Node
GmbH support, AI based denoising



Iray for Rhino migenius Pty Ltd Iray plugin for Rhino • Iray Photoreal and Iray Interactive support Single GPU
• VCA clustering Single Node
• Cloud rendering
• MDL support.
Iray Server migenius Pty Ltd The scaling solution for any Iray based • Iray Photoreal and Iray Interactive support, Multi-GPU
application VCA clustering, Cloud rendering, MDL Multi-Node
support and AI-based denoising
KeyShot Luxion Physically correct real time and batch • GPU accelerated real time and batch Multi-GPU
CPU / GPU photorealistic renderer, popular rendering with NVIDIA OptiX Multi-Node
in manufacturing, AEC, and M&E • GPU accelerated AI Denoising with NVIDIA
OptiX Denoiser
• Network rendering on GPU accelerated
nodes
• Support for 30 different native file
formats, many free plugins and live linked
applications
LensMechanix Zemax LensMechanix is the best application for • Optical product teams need an easier Single GPU
mechanical engineers to package optical and faster way to get from design to Single Node
systems in CAD software. It is available manufacture
for SOLIDWORKS users and for Creo • LensMechanix is the answer
Parametric users. • LensMechanix is software for mechanical
engineers who design housing for optical
products in CAD
• With LensMechanix, mechanical engineers
can access the complete design data of
optical systems designed in OpticStudio
and start designing the mechanical
envelope right away
• They can then validate their mechanical
design and fix issues before building a
physical prototype
LumenRT Bentley Systems Easily integrate life-like digital nature into • RT Cores for real time ray tracing Single GPU
your simulated infrastructure designs, and • TensorRT for denoising Single Node
create high-impact visuals for stakeholders. • All using the DXR API
Best for very large infrastructure, i.e. 100s
of square kilometers rendering.
Medium by Adobe Adobe PC-based VR sculpting app for modeling • GLSL shaders Single GPU
& painting in Quest VR headsets. For • Vulkan Single Node
beginners as well as pros. Adobe acquired • NVENC
from Oculus in December 2019. Requires
link cable to PC.
META BETA CAE High-performance multi-disciplinary CAE • OpenGL Single GPU
Systems post-processor • OpenCL Single Node
META VR BETA CAE Powerful processing and visualization • OpenGL Single GPU
Systems environment for interaction with full-scale • OpenCL Single Node
simulation models with collaboration
capabilities
MicroStation Bentley Systems MicroStation is the world’s leading 3D • Digital Nature modeling is Full Ray Single GPU
Connect computer-aided design and visualization Tracing-enabled Single Node
software for the architecture, engineering, • Reality Modeling leveraging NVIDIA AI
construction, and operation of all acceleration
infrastructure types. Largest CAD in AEC for • GPU acceleration for Viz, Rendering,
Civil Engineering users. Simulation Bentley apps are optimized for
• Very tight collaboration with Autodesk Revit. NV Quadro RTX
• MicroStation has internal Rendering tool
called Vue, shipping with the base CAD tool.
Notch Builder 10bit FX A motion graphics and VFX tool designed • GPU accelerated graphics and effects Single GPU
by games artists and VJs. Compositing, Single Node
grading and strong inter-operability with
other packages.
NX Siemens Digital Siemens PLM Software premium design • GRID support Multi-GPU
Industries app with full Iray integration, supporting • Iray, MDL (see NX Ray Traced Studio) Multi-Node
Software multi-GPU rendering. Still CPU bound for
most tasks otherwise



OpticStudio Zemax OpticStudio combines complex physics • Share designs between OpticStudio and CAD N/A
and interactive visuals so you can analyze, packages as native files, giving mechanical
simulate, and optimize optics, lighting and engineers full access to the optical
illumination systems, and laser systems, all coordinate system and all critical dimension
within tolerance specifications. there is no need for file format conversions
which can cause loss of design data
• Simulate the impact of mechanical
components on optical performance to
uncover any issues and make informed
design decisions
• Check for, and resolve errors, before
building costly physical prototypes
Painter Corel Raster-based digital art application for • GPU accelerated brushes Single GPU
drawing, sketching and painting. Single Node
Patran MSC Software Industry proven, modern pre- & post- • Rendering Single GPU
processing app for CAE Single Node
Quark VR Quark VR QuarkVR is an ultra-fast software solution • CUDA Single GPU
which provides low-latency compression Single Node
and wireless transmission. It offloads
the heavy processing on the GPU, and is
hardware-agnostic.
QUINDOS Hexagon Coordinate metrology software • Rendering Single GPU
Manufacturing Single Node
Intelligence
RealityServer migenius Pty Ltd 3D rendering and collaborative visualization • NVIDIA Iray. Multi-GPU
and model manipulation platform based on Multi-Node
NVIDIA Iray.
Recap PRO Autodesk ReMake is a solution for converting reality • Generation of 3D meshed models from Multi-GPU
captured with photos or scans into high- laser scans or photos of an object Single Node
definition 3D meshes. These meshes can be • GPU accelerated photogrammetry process
cleaned up, fixed, edited, scaled, measured, from 2D to 3D
re-topologized, decimated, aligned, • 3D model display accelerated by GPU for
compared and optimized for downstream smooth navigation of converted models in
workflows entirely in ReMake. all display modes
REMCOM REMCOM WaveFarer is a high-fidelity radar simulator • Near-field propagation method Multi-GPU
WaveFarer for drive scenario modeling at frequencies • Targeted ray casting, dynamic scenario, Single Node
up to and beyond 100 GHz. radiation patterns from antennas
RETOMO BETA CAE New software for the generation of • OpenGL Single GPU
Systems 3D-tessellated models from CT-scan images Single Node
Review PiXYZ Imports any CAD data to prepare and • Large CAD file support with NVIDIA Pascal Single GPU
experience your content with VR. Single Pass Stereo extension integration Single Node
Revit Autodesk Building Information Modeling (BIM) for • Modeling (BIM) to design, build, and Single GPU
architecture, engineering and construction. maintain higher-quality, more energy- Single Node
efficient buildings
• GRID support
RHINO McNeel & Assoc. General purpose conceptual/industrial • Fast, scalable OpenGL 3.3 pipeline Single GPU
design software for AEC and Manufacturing leverages latest NVIDIA GPUs Single Node
industries, including CYCLES (their custom- • GPU computed shaders and memory
Renderer based on open source Blender) a optimizations
real-time ray-traced display mode that is • Rhino 6 and new RHINO 7 leverage NVIDIA
CUDA-based. RT CUDA Cores for Real-time ray tracing
viewport mode, and Tensor Cores for
Denoising
• Rendering engine is CYCLES, fully
integrated inside RHINO 7 now
Simcenter Femap Siemens Digital Engineering simulation application for • Rendering Single GPU
Industries creating, editing, and importing/re-using Single Node
Software mesh-centric finite element analysis
models of complex products or systems



Simcenter Prescan Siemens Digital Virtually validate ADAS and automated • Speed up the TIS sensor used for radar, Multi-GPU
Industries vehicle functionalities by replicating real lidar, PMD and ultrasonic sensors Multi-Node
Software world scenarios, adding sensor models, • Camera sensor and fisheye camera sensor
and interface for control systems to design
and verify algorithms for data processing,
sensor fusion, decision making and control
Simcenter STAR- Siemens Digital Immersive VR for CFD results visualization • HTC Vive virtual reality headset Single GPU
CCM+ VR Industries Single Node
Software
Simpleware Synopsys 3D image data visualization, analysis and • OpenGL Single GPU
model generation software Single Node
SketchUp Pro Trimble SketchUp, formerly Google SketchUp, now • OpenGL now but moving to DirectX 11 for Single GPU
SketchUp part of Trimble in Sunnyvale, CA. SketchUp, and DirectX 12 and Vulkan for Single Node
SketchUp is a 3D modeling computer TEKLA Structures (late 2021 and 2022)
program for a wide range of drawing • Fast, efficient graphics in the viewport
applications such as architectural, interior • RTX photorealistic rendering
design, landscape architecture, civil and • 3rd party plug-ins supported by SketchUp
mechanical engineering, film and video Pro
game design.
Solid Edge Siemens Digital SMB CAD option from Siemens • KeyShot rendering Single GPU
Industries Single Node
Software
SOLIDWORKS Dassault 3D design and product development • High performance in Shaded, Shaded Single GPU
Systèmes solution including design, simulation, cost w/ Edges, and RealView modes, FSAA Single Node
estimation, manufacturability checks, CAM, for sharp edges, Order Independent
sustainable design, and data management. Transparency
• Real time photorealistic renderings with
SOLIDWORKS Visualize, an Iray-based
application.
SOLIDWORKS Dassault Easy to use photorealistic rendering • Iray-based ray-tracing Single GPU
Visualize Systèmes software based on NVIDIA Iray • Animation support Single Node
• Network rendering
• OptiX-based Artificial Intelligence denoiser
Spotscale Spotscale 3D reconstruction algorithms are tailored • cuDNN Multi-GPU
for buildings and urban environments, Single Node
using drones to capture data.
Studio PiXYZ Interactively prepare & optimize any CAD • Large scale CAD format Single GPU
data before using your favorite staging tool. • Support for multi-CAD file standard, Single Node
prepare, optimize and heal your geometry
before experiencing it in VR
Substance Adobe Lets you easily create materials from a • DL powered material recognition Single GPU
Alchemist picture or by blending pre-existing • Material scan, edit and blend Single Node
materials, create and manage your material
libraries
Substance Adobe Material shader edition and market • RTX bakers Multi-GPU
Designer reference for procedural texture creation. • Iray viewport/rendering Single Node
Substance Painter Adobe Intuitive interactive 3D painting software • RTX bakers Multi-GPU
with physics and particle support. • Iray viewport Single Node
Sunata Siemens Digital Cloud-based thermal modeling for additive • Thermal simulation Multi-GPU
Industries manufacturing. Recommends optimal Single Node
Software parameters for the print, including print
orientation and support structures.
Teamcenter Active Siemens Digital Active Workspace is an IT-friendly client for • GRID support Single GPU
Workspace Industries Teamcenter product lifecycle management, Single Node
Software with zero-install footprint and web browser
access that provides an identical and
seamless experience on any computing or
smart device.
T-FLEX CAD Top Systems 3D and 2D parametric design, simulation, • High performance visualization Multi-GPU
photorealistic rendering • Real time photorealistic rendering Single Node
• CUDA
UE4 Epic Games Unreal Engine 4 is a suite of integrated • GPU Accelerated Rendering on OpenGL, Single GPU
tools for developers to design and build DirectX and Vulkan Single Node
games, simulations, and visualizations. • PhysX implemented



Vectorworks Nemetschek Building Information Modeling (BIM) • OpenGL based GPU rendering Multi-GPU
VECTORWORKS enabled design software for the Single Node
Architecture, Landscape, and Entertainment
industries.
Volumetric Camera Volumetric 4D capture service with high quality and • CUDA Single GPU
Systems Camera Systems realistic “holograms-in-motion” of people, • Quadro GPUs Single Node
animals, or any moving subject
Secondly, we offer “photo-realistic 3D
environment captures” using industrial
grade Leica Laser Scanners and advanced
high-resolution multi-camera systems.
VRED Autodesk VRED 3D visualization software for • Enhanced geometry behavior Multi-GPU
automotive designers and engineers to • Automotive product interoperability Single Node
create product presentations, design • Navigation in a scene
reviews, and virtual prototypes. Uses Digital • Import Alias layer structure
Prototyping to quickly visualize ideas and • Asset Manager improvements
evaluate designs. • Integrated file converter
• Analytic rendering modes
• Gap Analysis tool
• Oculus Rift support
• Animation module
• Multiple rendering modes
• Subsurface scattering
• Displacement mapping
WeViz Studio Meshroom VR Real-time rendering tool specially made • RTX real-time ray tracing Single GPU
for industrial design reviews, allowing to Single Node
import, edit materials, set up your scene
and showcase your model in real-time.
WYSIWYG Cast Software Wysiwyg is an all-in-one lighting design • GPU accelerated Shaded Views and Virtual Multi-GPU
software with fully integrated CAD, plots, Views Single Node
data, visualization and virtual show control.
Features the largest CAD library with
thousands of 3D objects you can choose
from to design your entire show.
ZLVE Zerolight Immersive customer experience with VR or • VRS and foveated rendering for VR and 3D Multi-GPU
web GPU streaming experience through AWS GPU streaming Single Node

ELECTRONIC DESIGN AUTOMATION


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

Advanced Design Keysight Simulation tool for design of RF, microwave • Transient Convolution simulation with Single GPU
System (ADS) and high speed digital circuits BSIM4 models Single Node
Altair Feko Altair Comprehensive computational • FDTD solver Multi-GPU
electromagnetics (CEM) code used widely • MoM solver Single Node
in the telecommunications, automobile, • RL-GO solver
space and defense industries to solve high- • CMA Solver
frequency problems.
Ansys HFSS ANSYS Simulation tool for modeling 3-D full-wave • Transient solver Multi-GPU
electromagnetic fields in high-frequency and • FEM solver Single Node
high-speed electronic components • OpenGL rendering
Ansys HFSS SBR+ ANSYS Simulation tool for installed antenna • High-frequency solver Multi-GPU
performance and antenna-to-antenna • OpenGL rendering Multi-Node
coupling
Ansys Maxwell ANSYS Industry-leading electromagnetic field • Eddy Current Solver Multi-GPU
simulation software for the design and Single Node
analysis of electric motors, actuators, sensors,
transformers and other electromagnetic and
electromechanical devices
Ansys Nexxim ANSYS Circuit simulation engine for RF/analog/ • AMI analysis Single GPU
mixed-signal IC design, and IBIS-AMI Single Node
analysis speedup with GPU computing.
Cadence Allegro Cadence Design EDA/ECAD tool for PCB (Printed Circuit • OpenGL extensions Multi-GPU
Systems Board) Design • Scalable Vector Graphics (SVG), Path Multi-Node
Rendering SDK



CDP D2S GPU acceleration of real-time in-line • Computational lithography simulations for Multi-GPU
enhancement of semiconductor mask synthesis on GPUs Multi-Node
manufacturing equipment such as the
NuFlare EBM-9500 and MBM-1000
mask writers.
CST MPHYSICS Dassault Multiphysics simulation including thermal, • Conjugated Heat Transfer Solver Single GPU
STUDIO Systèmes CFD, and mechanical capabilities. Tightly Single Node
SIMULIA Corp. integrated with CST’s electromagnetic
solvers.
CST STUDIO SUITE Dassault Accurate and efficient computational • Transient Solver Multi-GPU
Systèmes solution for 3D simulation of electromagnetic • Integral Equation Solver Multi-Node
SIMULIA Corp. devices in a wide range of frequencies. • Asymptotic Solver
• Multilayer Solver
EMPro KeySight Modeling and simulation environment for •F
 inite Difference Time Domain (FDTD) Multi-GPU
analyzing 3D EM effects of high speed and solver Single Node
RF/Microwave components.
JMAG | JMAG | FEA software for electromechanical design. Fast solver / High quality mesh / Advanced modeling technologies. | • EM transient solver • EM time harmonic solver • EM static solver | Multi-GPU, Single Node
REMCOM XFdtd | REMCOM | 3D EM Simulation solver. | • FDTD Solver | Multi-GPU, Multi-Node
samadii/em | Metariver Technology | Software for computing the electromagnetic field in three-dimensional space using Maxwell's equations, the governing equations that comprehensively represent electromagnetic phenomena | • Electromagnetics simulator, FEM solver (scalar FEM, vector FEM) • Electrostatics solver, Electromagnetic wave solver • Magnetostatics solver, Electric current solver, Electrodynamics solver • Co-simulation with samadii/sciv, samadii/dem and fluid flow solvers. | Multi-GPU, Multi-Node
samadii/plasma | Metariver Technology | Software for computing plasma phenomena with the PIC (Particle-in-Cell) method. Two-way coupled simulation with samadii/em and samadii/sciv. | • Plasma simulator, Charged particle motion analysis • Particle and surface reaction calculation, Field analysis, Sheath range prediction • DSMC collision module, PIC module • Co-simulation with samadii/em, Ansys Maxwell and COMSOL. | Multi-GPU, Multi-Node
SEMCAD-X | SPEAG | 3D full-wave electromagnetic and computational life sciences simulation solver | • FDTD solver | Multi-GPU, Single Node
Serenity | Lucernhammer | EM Simulation (RCS) tool | • MoM solver | Multi-GPU, Single Node
Sim4Life | ZMT Zurich MedTech AG | 3D Electromagnetics & Acoustic modeling and simulation | • Transient, Broadband, and Harmonic simulations FDTD solver • Linear and non-linear 3D full wave acoustics solvers | Multi-GPU, Single Node
Synopsys LucidShape | Synopsys | LucidShape is computer-aided lighting (CAL) design software for automotive lighting design tasks. With algorithms optimized for automotive applications, LucidShape facilitates the design of automotive forward, rear and signal lighting, and reflectors. | • Ray Tracing • Monte Carlo simulations using OptiX 6.5 and CUDA 10.2 | Single GPU, Single Node
TrueMask MDP | D2S | GPU-accelerated simulation and data preparation for mask writing. | • Simulation-based processing | Multi-GPU, Multi-Node
TrueModel | D2S | GPU-accelerated simulation and geometric checking of curvilinear shapes. | • Simulation-based processing | Multi-GPU, Multi-Node
VSim for Electromagnetics | Tech-X Corporation | Conformal FDTD for electromagnetics for a variety of material types, yielding engineering outputs that can be used for design of electromagnetic devices | • FDTD solver | Single GPU, Single Node
WIPL-D 2D Solver | WIPL-D | 2D EM modeling and simulation for long cylindrical structures | • MoM Solver • Matrix fill-in and near-field calculations | Multi-GPU, Single Node



WIPL-D Pro | WIPL-D | Solver for fast and accurate electromagnetic analysis of arbitrary composite 3D metallic and dielectric structures | • MoM (Method of Moments) Solver • DDS (Domain Decomposition Solver) | Multi-GPU, Multi-Node
WIPL-D Pro CAD | WIPL-D | Modeling and simulation environment uniting versatile, yet simple geometry modeling, with signature WIPL-D simulation accuracy | • MoM (Method of Moments) Solver | Multi-GPU, Single Node
Wireless InSite | REMCOM | Uses OptiX 4.1 for ray tracing and propagation prediction | • X3D Ray Tracer | Multi-GPU, Single Node

INDUSTRIAL INSPECTION
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Cognex VisionPro ViDi | Cognex | Deep learning-based software dedicated to industrial image analysis. Cognex ViDi Suite is a field-tested, optimized and reliable software solution based on a state-of-the-art set of algorithms in machine learning. | • Feature localization and identification • Segmentation and defect detection • Object and scene classification • Text & character recognition | Single GPU, Single Node
HALCON | MVTec Software | MVTec HALCON is the comprehensive standard software for machine vision with an integrated development environment. HALCON allows models to be trained on GPUs, and outputs trained models for inference on CPU, GPU, or Jetson. | • Deep learning - pre-trained networks optimized for latency or precision • HALCON also provides an IDE for training neural networks • Sub-pixel detection, edge detection, counting, OCR, barcode reading, 3D reconstruction from stereo | Single GPU, Single Node
IBM Visual Insights | IBM Corporation | IBM Visual Insights uses cognitive capabilities to review and analyze parts, components, and products. Identifies defects by matching patterns to images of defects that it has previously analyzed and classified. Deploy models to edge computing on production lines to facilitate rapid image capture by camera and cognitive identification of defects. Quickly assess quality inspection metrics across manufacturing processes. | • Cloud-based DL training, deployment on (spec’ed) edge server | Multi-GPU, Single Node

Media and Entertainment


ANIMATION, MODELING AND RENDERING
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

3ds Max | Autodesk | 3D modeling, animation, and rendering | • Faster interactive graphics • Availability of Arnold with AI denoising • Availability of Chaos V-Ray, Otoy Octane, Redshift, cebas finalRender third-party GPU renderers | Multi-GPU, Single Node
Altair Thea Render | Altair | Physically-based progressive spectral CPU/GPU Renderer supporting fast interactive changes and bucket rendering for high resolution images | • GPU-accelerated hybrid renderer • Advanced material layering system with subsurface scattering, displacement mapping, physical sun-sky and IES support | Multi-GPU, Single Node
ArmorPaint | Armory | ArmorPaint is software designed for physically-based texture painting. There is a standalone version, or you can use it as an Armory3D project. Draw textures directly using node-based materials and brushes. | • GPU accelerated painting processes | Single GPU, Single Node
Arnold | Autodesk | Solid Angle Arnold film and animation renderer | • RTX | Multi-GPU, Single Node
Beauty Box | Digital Anarchy | Automatic masking and skin retouching. | • GPU accelerated graphics and compute | Single GPU, Single Node
Blender | Blender Institute | 3D modeling, rendering and animation | • GPU-accelerated interactive viewport | Single GPU, Single Node
Blender Cycles | Blender Institute | GPU renderer | • CUDA-accelerated rendering • RTX-accelerated ray tracing | Multi-GPU, Single Node



Character Creator | Reallusion | Character Creator 3 is a full character creation solution for designers to easily create, import and customize stylized or realistic looking character assets for use with iClone, Maya, Blender, Unreal Engine 4, Unity or any other 3D tools. It connects industry leading pipelines into one system for 3D character generation, animation, rendering, and interactive design. | • GPU accelerated processing • Iray support | Single GPU, Single Node
Cinema 4D | Maxon | 3D modeling, animation, and rendering | • Increased model complexity at interactive rates • Support for Redshift, Chaos V-Ray, Otoy Octane and third-party GPU renderers | Single GPU, Single Node
Corona | Chaos Group | High-performance photorealistic renderer | • OptiX AI de-noising | Single GPU, Single Node
D5 Render | D5 Innovation | D5 Render, based on NVIDIA RTX GPU’s real-time ray tracing and rasterization technology, aims to bring an unprecedented real-time rendering experience to architecture and interior design. | • Real-time GPU accelerated physically based global illumination and ray tracing. | Single GPU, Single Node
Daz Studio | Daz3D | Powerful and free 3D creation software tool that is not only easy to use but rich in features and functionality. | • GPU accelerated compute • Rendering via NVIDIA Iray and OptiX | Multi-GPU, Single Node
Dimension | Adobe | 3D design tool enabling graphic designers to compose, adjust, and render photorealistic images. | • RTX ray tracing, accelerated graphics & MDL (Material Definition Language) | Multi-GPU, Single Node
EmberGen | JangaFX | A standalone real-time fluid simulation tool built specifically for real-time VFX artists with an expansive node-based system. | • GPU accelerated volumetric fluid simulations | Single GPU, Single Node
finalRender | Cebas | Plug-in for 3ds Max. Physically Based (Spectral) Wavelength Simulation. Biased + Unbiased Hybrid Rendering. Unlimited Network Rendering. | • CUDA-accelerated renderer for Autodesk 3ds Max • OptiX AI de-noising | Single GPU, Single Node
HIERO Player | Foundry | Shot management, conform and review timeline | • Fluid, interactive playback | Single GPU, Single Node
Houdini | SideFX | Procedural 3D modeling, animation and rendering | • Faster simulations | Multi-GPU, Single Node
iClone | Reallusion | iClone is the software for real-time 3D animation, blending character creation, scene design, and cinematic storytelling into a real-time engine. | • GPU accelerated ray-tracing and rendering | Single GPU, Single Node
Indigo | Glare Technology | Unbiased, physically-based renderer. | • GPU-accelerated rendering | Multi-GPU, Single Node
KATANA | Foundry | Powerful look development and lighting tool | • Faster interactive graphics | Single GPU, Single Node
Lightwave 3D | NewTek | 3D modeling, animation, and rendering | • Increased model complexity at interactive rates | Single GPU, Single Node
LuxRender | LuxRender | GPU 3D Renderer | • GPU-accelerated ray tracing | Single GPU, Single Node
MARI | Foundry | 3D paint tool that allows painting directly onto 3D models | • Faster interactive painting | Single GPU, Single Node
Mars | sheencity | Real-time architectural visualization tool with advanced features such as real-time ray tracing, DLSS, and VR. | • RTX Ray tracing • DLSS | Single GPU, Single Node
Marvelous Designer | CLO Virtual Fashion Inc | Realistic and dynamic 3D modeling software for clothes and fabric. | • GPU accelerated cloth simulations | Single GPU, Single Node
Massive | Massive | Simulation and visualization tools for autonomous agent driven animation for film, games, television, architecture and transportation. | • GPU accelerated effects | Single GPU, Single Node



Maverick Renderer | Maverick | CUDA-based GPU renderer | • CUDA-accelerated ray-tracing • OptiX 7 de-noising | Single GPU, Single Node
Maxwell | Next Limit | CUDA-accelerated interactive and final-frame renderer | • CUDA-accelerated ray-tracing • Unrestricted image resolution • OptiX de-noising | Multi-GPU, Single Node
Maya | Autodesk | 3D modeling, animation, and rendering | • Increased model complexity and larger scenes • Availability of Chaos V-Ray, Otoy Octane and Redshift third-party GPU renderers | Single GPU, Single Node
Meshroom | Czech Technical University (CTU) | Open source photogrammetry 3D software | • CUDA-accelerated depth analysis | Single GPU, Single Node
Metashape | Agisoft | Agisoft PhotoScan is a stand-alone software product that performs photogrammetric processing of digital images. Generates 3D spatial data to be used in GIS applications, cultural heritage documentation, and visual effects production, as well as for indirect measurements of objects of various scales. | • CUDA-accelerated photogrammetry solution • RTX opportunity | Multi-GPU, Single Node
MODO | Foundry | 3D modeling, animation and rendering | • Increased model complexity, larger scenes | Single GPU, Single Node
Motion Builder | Autodesk | Character animation and motion capture | • Increased model complexity at interactive rates | Single GPU, Single Node
Mudbox | Autodesk | 3D sculpting | • Increased model complexity at interactive rates | Single GPU, Single Node
NX Ray Traced Studio | Siemens Digital Industries Software | Embedded rendering feature for Siemens NX | • Iray based • MDL • AI denoising | Multi-GPU, Single Node
OctaneRender | Otoy | CUDA-accelerated GPU renderer | • GPU accelerated rendering • AI de-noising | Multi-GPU, Single Node
Realflow | Next Limit | Fluid simulation system | • GPU-accelerated simulation | Single GPU, Single Node
RealityCapture | Capturing Reality | Photogrammetry | • CUDA-accelerated, fast photogrammetry | Multi-GPU, Single Node
Redshift Renderer | Redshift | GPU-accelerated, biased renderer | • CUDA-based GPU final-frame rendering • Mac and Windows supported | Multi-GPU, Single Node
Renderman | Pixar | Leading film renderer | • OptiX AI de-noising | Single GPU, Single Node
Sculptris | Pixologic | 3D sculpting | • Increased model complexity at interactive rates | Single GPU, Single Node
Trapcode | Red Giant | Particle simulations and 3D effects for motion graphics and VFX. Now with Fluid Dynamics. | • GPU accelerated effects | Single GPU, Single Node
TurbulenceFD | Jawset | Turbulence FD is a powerful simulation tool to create smoke, fire and explosion effects. | • GPU accelerated graphics, compute and simulation | Single GPU, Single Node
Vantage | Chaos Group | Vantage is an interactive viewer that takes V-Ray scene files and uses DXR-accelerated ray tracing to display interactive scenes. It will be sold as a separate product, not bundled with V-Ray. | • RTX-accelerated, high frame-rate camera • Interactive animations • Bi-directional link to Autodesk 3ds Max • Ideal for AEC walk throughs and product design | Multi-GPU, Single Node
V-Ray GPU | Chaos Group | GPU renderer with CPU Hybrid rendering | • CUDA interactive and final-frame GPU rendering | Multi-GPU, Single Node
vRt | vRt | vRt is an open-source project that aims to offer a unified, cross-platform, Vulkan-based ray-tracing library for modern graphics cards, built against Vulkan 1.1 | • vRtC (compute-based, native, default, wide GPU support) • vRtX (NVIDIA RTX only, currently higher performance) | Multi-GPU, Single Node
WispRenderer | Breda University of Applied Sciences | General purpose high level rendering library with RTX, RTGI, HBAO+, and Ansel support. | • RTX, RTGI, HBAO+ • Ansel | Multi-GPU, Single Node



COLOR CORRECTION AND GRAIN MANAGEMENT
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

ARRI de-bayering SDK | ARRI | RAW de-bayering SDK | • De-bayering of ARRI RAW and primary color grading. | Single GPU, Single Node
Baselight | FilmLight | Color grading | • Real-time color correction | Multi-GPU, Single Node
Cinema RAW SDK | Canon | RAW de-bayering | • GPU-accelerated de-bayering | Single GPU, Single Node
Dark Energy | Cinnafilm | Application and plug-in for image enhancement | • Image de-noising and restoration • Noise reduction, de-noise and de-grain • Grain removal, image sharpening and texture management, dust busting • SDR to HDR upres | Multi-GPU, Single Node
DaVinci Resolve | Blackmagic Design | Color grading and editing | • Real-time color correction and de-noising • RTX-accelerated AI features for re-timing and image enhancement | Multi-GPU, Single Node
DeNoise AI | Topaz Labs | DeNoise AI uses machine learning to remove noise from your image while preserving detail for a crisp, clear result. Whether you are shooting with high ISO or in a low-light scenario, DeNoise will correct your image without removing any important information or patterns in your image. | • GPU accelerated effects | Single GPU, Single Node
Diamant-Film Restoration | HS-Art | Film cleanup and restoration | • CUDA accelerated optical flow, de-flicker, in-painting and over 30 filters | Multi-GPU, Single Node
Grain and Noise Reducer | Wavelet Beam | Video noise reduction | • CUDA-accelerated grain and noise reduction | Multi-GPU, Single Node
HDR Image Analyzer | AJA | A 1RU waveform, histogram, vectorscope and Nit-level HDR monitoring solution for 4K, UltraHD, 2K, and HD resolution with HDR and WCG content. | • Precise, high quality UltraHD UI for native-resolution picture display • Advanced out of gamut and out of brightness detection with error tolerance • Support for SDR (Rec.709), ST2084/PQ and HLG analysis • CIE graph, Vectorscope, Waveform, Histogram • Out of gamut false color mode to easily spot out of gamut/out of brightness pixels • Data analyzer with pixel picker • Up to 4K/UltraHD 60p over 4x 3G-SDI inputs • SDI auto signal detection • File-based error logging with timecode • Display and color processing look up table (LUT) support • Line mode to focus a region of interest onto a single horizontal or vertical line • Loop through output to broadcast monitors • Still store • Nit levels and phase metering • Built-in support for color spaces from ARRI, Canon, Panasonic, RED and Sony | Single GPU, Single Node
Magic Bullet Colorista | Red Giant | Real time, interactive, multi-layered masked color correction (video playback too!) with the Mercury Playback engine in Premiere Pro. | • GPU accelerated effects | Single GPU, Single Node
Magic Bullet Looks | Red Giant | Powerful looks and color correction for filmmakers. | • GPU accelerated compute | Single GPU, Single Node
Mist | Marquise Technologies | Mastering tool for cinema, broadcast and over-the-top content | • 100% CUDA-accelerated imaging pipeline for de-bayering, color grading, transcoding and image enhancement • Integrated Dolby Vision pipeline | Multi-GPU, Single Node
Nucoda | Digital Vision | Color grading | • GPU-accelerated color grading • Accelerated scopes, playback and rendering | Single GPU, Single Node



Pablo family | Grass Valley | Color grading and finishing | • Real time color correction | Multi-GPU, Single Node
Pablo Rio | Grass Valley | Pablo Rio is a color grading application that GV acquired when they purchased Snell. | • CUDA-accelerated color grading | Multi-GPU, Single Node
PFClean | The Pixel Farm | Image restoration and remastering | • CUDA-based image processing acceleration | Multi-GPU, Single Node
RAW Converter | ARRI | RAW de-Bayering and primary color grading | • CUDA-accelerated de-bayering and primary grading | Single GPU, Single Node
REDCINE-X PRO | Red Digital Cinema | Primary color grading | • CUDA-accelerated de-bayering and primary color grading | Single GPU, Single Node
Red Digital Cinema R3D SDK | Red Digital Cinema | Red Digital Cinema camera SDK decodes and de-bayers Red RAW camera data, and allows primary color grading. Used by many color grading and video editing applications. | • CUDA-accelerated wavelet decoding and de-bayering | Single GPU, Single Node
Scratch | Assimilate | Color grading and finishing | • Accelerated de-bayering for real-time digital finishing | Single GPU, Single Node
VFX Suite | Red Giant | VFX Suite is a complete set of visual effects and motion graphics plugins for creating professional effects. | • GPU accelerated effects | Single GPU, Single Node

COMPOSITING, FINISHING AND EFFECTS


APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

After Effects | Adobe | Motion graphics and effects | • CUDA acceleration for up to 10x faster performance on key effects plus enhanced 3D ray tracing | Single GPU, Single Node
Aura | Rowbyte | Aura is a procedural plug-in for After Effects that creates elegant geometric shapes in 3D space. It’s akin to a particle system, but instead of rendering small particles all over the place, it generates vector-like shapes (waves) that change over time, much like the classic Radiowaves plug-in. | • GPU-accelerated High Frequency Rendering | Single GPU, Single Node
Clipster | Rohde & Schwarz | Video and film player and DCI Packager | • GPU-accelerated • Video scaling • Color space conversion • Data format conversion | Multi-GPU, Single Node
Complete | CoreMelt | Visual effects plug-in | • Faster effects | Single GPU, Single Node
Continuum | Boris FX | Visual effects plug-in for creative effects, titling, and quick fixes. | • GPU accelerated effects | Single GPU, Single Node
DE:Noise | RE:Vision Effects | Reduces noise, dust, and artifacts with frame-to-frame motion tracking. Useful for low-light shoots, CG renders with ray tracing sample artifacts, and excessive film grain. | • Faster effects | Single GPU, Single Node
DEFlicker | RE:Vision Effects | Reduces flicker and artifacts in high-frame-rate and time-lapse video. | • Faster effects | Single GPU, Single Node
Element 3D | Video Copilot | Advanced 3D object & particle render engine plugin for Adobe After Effects | • GPU accelerated graphics and compute | Single GPU, Single Node
Flame Premium | Autodesk | Finishing and color grading | • Integrated toolset for 3D VFX, editorial, and color grading | Multi-GPU, Single Node
Flicker Free | Digital Anarchy | Deflicker Time Lapse, Slow Motion, and Old Video. Flicker Free is a powerful, new way to deflicker video. | • GPU accelerated effects | Single GPU, Single Node
Fusion | Blackmagic Design | Effects and compositing | • 3D tracking • Compositing • VR | Single GPU, Single Node
HIERO | Foundry | Multi-shot management tool that supports collaborative working, review and approval, quick production turnaround and delivery | • Fluid, interactive playback | Single GPU, Single Node



Imerge Pro | FXhome | Imerge Pro is layer-based image compositing software that is GPU accelerated, making performance astonishingly fast, even on high-resolution images. Create pro-level composites with unlimited layers and zero baked-in changes. Imerge Pro is the first photo editing software to keep your image data RAW and your layers self-contained. | • GPU-accelerated processing | Single GPU, Single Node
Magic Bullet Denoiser | Red Giant | Magic Bullet Denoiser III lets you reduce visible noise and grain in digital video produced by digital video cameras, camcorders, or film. | • GPU accelerated effects | Single GPU, Single Node
Magic Bullet Film | Red Giant | Gives digital footage the look of real film by emulating the entire photochemical process from the original film negative, to color grading, and finally to the print stock. | • GPU accelerated effects | Single GPU, Single Node
Magic Bullet Suite | Red Giant | Full suite of tools for color correction, finishing and film looks for filmmakers. | • GPU-accelerated processing and effects | Single GPU, Single Node
Mamba FX | SGO | High-end compositing | • Faster keying, tracking, painting and restoration | Single GPU, Single Node
MediaReactor | Drastic Technologies | Debayering and processing of raw camera files. | • GPU-accelerated compute | Single GPU, Single Node
Mighty Bake | Mighty Bake | A powerful, easy to use, all-in-one texture baking solution for any 3D artist | • GPU accelerated processing | Single GPU, Single Node
Mistika Ultima | SGO | Color grading and finishing | • Faster keying, tracking, painting and restoration, de-bayering | Single GPU, Single Node
Mistika VR | SGO | Near real-time optical flow stitching | • GPU-accelerated video stitching with manual controls • Export clips in many formats, including DPX and ProRes | Single GPU, Single Node
Mocha Pro | Boris FX | Mocha Pro is an award-winning planar tracking tool for motion tracking, rotoscoping, object removal, camera stabilization and general visual effects. | • GPU accelerated planar tracking and object removal | Single GPU, Single Node
Natron | Natron | Natron is a free and open-source node-based compositing software application. | • GPU-accelerated processing and rendering | Single GPU, Single Node
Neat Video | Absoft | Digital filter with auto-profiling tool designed to reduce visible noise and grain found in footage. | • GPU accelerated processing | Single GPU, Single Node
NUKE | Foundry | Compositing tool with 3D tracker | • GPU-accelerated BLINK processing • Faster compositing and effects | Single GPU, Single Node
Optics | Boris FX | Optics is designed to simulate optical camera filters, specialized lenses, film stocks and grain, lens flares, optical lab processes, and color correction, as well as natural light and photographic effects. First collaborative product between Sapphire and Digital Film Tools. Plugin for Photoshop and Lightroom; also has a Windows and Mac standalone application. | • GPU accelerated processing and effects | Single GPU, Single Node
PFTrack | The Pixel Farm | 3D scene creation and tracking | • CUDA-accelerated tracking | Multi-GPU, Single Node
Plexus | Rowbyte | Plexus is a plug-in designed to bring generative art closer to a non-linear program like After Effects. It lets you create, manipulate and visualize data in a procedural manner. Render the particles and create all sorts of interesting relationships between them based on various parameters using lines and triangles. | • Plexus (interacts natively with AE’s Camera) • High-quality, GPU-accelerated Depth of Field effects | Single GPU, Single Node



Rotobot | Kognat | An AI product for compositing packages which uses machine learning to generate mattes for machine-based rotoscoping. | • CUDA accelerated AI rotoscoping | Multi-GPU, Single Node
Sapphire | Boris FX | The Sapphire suite is an all-in-one solution containing hundreds of effects, presets, and workflows that are aimed at taking professional video work to the next level. | • Faster effects | Single GPU, Single Node
SilhouetteFX | Boris FX | Invaluable in post-production, Silhouette continues to bring best of class tools to the visual effects industry. As a fully featured GPU accelerated compositing system, its standout features are award winning rotoscoping and non-destructive paint as well as keying, matting, warping, morphing, and a total of 142 different nodes, all stereo enabled. | • GPU-accelerated processing and effects | Single GPU, Single Node
Silhouette Paint | Boris FX | Rotoscoping tool that allows for intensive VFX fixes, blemish cleanup, beauty effects, wire/object removal, style effects on video, and use as an artistic paint tool. It is raster-based, so it has a smaller memory footprint (fastest paint plugin on the market). Integrated with the Mocha Pro planar tracker. | • GPU accelerated processing and effects | Single GPU, Single Node
Twixtor | RE:Vision Effects | Optical flow tracking of pixel motion to synthesize new frames by warping & interpolating frames of the original sequence. Reduces artifacts & retimes frames. | • Faster effects | Single GPU, Single Node
Video Essentials | NewBlueFX | Comprehensive collection of titling, transitions and video effects. | • Faster effects | Single GPU, Single Node

(VIDEO) EDITING
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Blackmagic RAW SDK | Blackmagic Design | Blackmagic RAW is a CPU and GPU-enabled SDK for decoding and debayering Blackmagic RAW files on MacOS, Windows and Linux | • CUDA-accelerated de-coding and de-bayering | Single GPU, Single Node
Catalyst Production Suite | Sony Creative Software | 4K, Sony RAW, and HD video editing. Includes 3 applications: Browse, Prepare, Edit | • Faster effects, transitions and encoding • RAW camera de-bayering | Single GPU, Single Node
CineMatch | FilmConvert | CineMatch is a set of tools designed to help you match footage shot on different cameras to a baseline technical level - a seamless, matched timeline in Log or REC.709, ready for creative grading. | • Real-time color matching conversions with CUDA | Single GPU, Single Node
Edius Pro | Grass Valley | Video editing | • Faster effects • RAW camera de-bayering | Single GPU, Single Node
Filmora | Wondershare | Filmora is an easy-to-use and trendy video editing software that lets you empower your story and be amazed at results, regardless of your skill level. With Filmora, you can get started with any new movie project by importing and editing your video, adding special effects and transitions, and sharing your final production on social media, mobile devices, or DVDs. | • GPU-accelerated processing | Single GPU, Single Node
Gigapixel AI | Topaz Labs | Photo upscaling that uses AI to “fill in” and add new detail when enlarging photos. | • GPU accelerated effects | Single GPU, Single Node



GPUSqueeze | Multicamera Systems | GPUSqueeze is a cross-platform software library for multi-stream and ultra-high-speed video encoding, transcoding and processing using multi-GPU and distributed setups. The library uses highly optimized patent-pending algorithms to achieve maximum speed and high hardware utilization, and provides almost linear performance scaling as the number of GPUs in the system increases. | • GPU accelerated video encoding and decoding | Multi-GPU, Multi-Node
HitFilm Pro | FXhome | HitFilm Pro is an all-in-one video editor, compositor, and visual effects (VFX) software designed for filmmakers, professional video editors, and visual content producers. | • GPU accelerated effects and decoding | Single GPU, Single Node
Illustrator | Adobe | Vector graphics software for creating logos, icons, drawings, typography, and illustrations for print, web, video, and mobile devices. | • Entire canvas optimized for NVIDIA GPUs for faster pan & zoom | Single GPU, Single Node
Lightroom Classic | Adobe | Easily edit, organize, store, and share your photos. | • GPU accelerated Develop module plus new Sensei features like “Enhance Details” with NVIDIA GPU AI optimization. • Up to 600% faster than integrated GPUs with controls like Texture, Dehaze, & Sharpening • Improved editing in 1:1 view & on high-resolution displays. | Single GPU, Single Node
Lightworks | EditShare | Video editing | • Faster effects • CUDA-accelerated de-bayering | Single GPU, Single Node
Live Planet | Live Planet | Livestreaming, recording and delivery of stereoscopic 360 VR | • Real time 360 3D capture and stitch • 4K | Single GPU, Single Node
Luminar AI | Skylum | Luminar is the world’s first photo editor that adapts to your style & skill level. It is designed to make complex photo editing easy & enjoyable for everyone. Take advantage of over 300 powerful, yet simple photo editing tools that allow you to perform all kinds of image editing tasks. | • GPU accelerated processing and AI effects | Single GPU, Single Node
Media Composer | Avid | Video editing | • Faster video effects, unique stereo 3D capabilities | Single GPU, Single Node
Movavi Video Suite | Movavi | An all-in-one video maker: an editor, converter, screen recorder, and more. | • Faster conversion speed with NVIDIA CUDA | Single GPU, Single Node
MXF | Film Partners | Collaborative editing system supporting Avid Media Composer, Adobe Premiere Pro, Grass Valley Edius and Blackmagic Resolve | • NVIDIA Video Codec allowing remote GPU-accelerated production workflows | Single GPU, Single Node
Photoshop | Adobe | Photo editing to transform your images into anything you can imagine | • GPU-accelerated AI “Neural Filters” • 30+ other GPU accelerated features • Blur gallery, liquify, smart sharpen, perspective warp | Single GPU, Single Node
Pinnacle Studio | Corel | Video editing and sharing program. | • GPU accelerated compute and effects | Single GPU, Single Node
PowerDirector | CyberLink | PowerDirector delivers professional-grade video editing and production for creators of all levels. Whether you are editing in 360 degrees, Ultra HD 4K or even the latest online media formats, PowerDirector remains the definitive Windows video editing solution for anyone, whether they are beginners or professionals. | • GPU accelerated video processing and effects | Single GPU, Single Node
PowerDVD | CyberLink | CyberLink PowerDVD is a universal media player for movie discs, video files, photos and music. | • GPU accelerated encoding and decoding | Single GPU, Single Node
Premiere Pro | Adobe | Video editing software for film, TV, and the web. | • Real-time video editing & fast output rendering based on CUDA | Multi-GPU, Single Node

34  |  POPULAR GPU‑ACCELERATED APPLICATIONS CATALOG  |  Apr21


Premiere Rush Adobe Easy-to-use video editor for creating and sharing online videos.
• CUDA
• Real-time video editing
• Fast output rendering
Multi-GPU, Single Node
Sharpen AI Topaz Labs Sharpening and shake reduction software that can tell the difference between real detail and noise.
• GPU accelerated effects
• Machine Learning
Single GPU, Single Node
SmartCourtPro PlaySight Sophisticated video and analytics training technology with the latest in AI, integrations and player development tools.
• IVA
Single GPU, Single Node
Smoke Autodesk Finishing and editing
• Faster effects
Single GPU, Single Node
TotalFX NewBlueFX Comprehensive collection of Titling, Compositing, Polishing and Styling tools.
• GPU-accelerated effects
Single GPU, Single Node
Vegas Pro Magix Video editing
• Faster video effects and encoding
• Uses NVENC to encode/decode H.264 and HEVC streams
Single GPU, Single Node
Velocity Imagine Communications Video editing
• Faster effects
Single GPU, Single Node
Video Enhance AI Topaz Labs Trained on thousands of videos and combining information from multiple input video frames, Topaz Video Enhance AI will enlarge and enhance your footage up to 8K resolution with true details and motion consistency.
• GPU accelerated AI inference and processing
Single GPU, Single Node
Video Studio Corel High quality tools that build, edit, and correct video skillfully.
• GPU accelerated compute
Single GPU, Single Node
VLC Media Player VideoLAN Organization VLC is a free and open source cross-platform multimedia player and framework that plays most multimedia files as well as DVDs, Audio CDs, VCDs, and various streaming protocols.
• NV Video Codec accelerated encoding and decoding
Single GPU, Single Node
WonderLive Z Cam Cinematic VR camera with excellent image quality, stereoscopic 360-degree recording, and live streaming.
• Up to 4K output resolution equirectangular image
• Save live stitched video file
• Preview live stitched video
• RTMP live streaming output
• Supports VRWorks 360 Video SDK
Single GPU, Single Node

(IMAGE & PHOTO) EDITING


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

Adjust AI Topaz Labs Adjust AI is a one-click application that leverages the power of machine learning to intelligently enhance photos.
• GPU accelerated effects
Single GPU, Single Node
Affinity Photo Affinity A fast and precise image editing software for photography and creative professionals, from editing and retouching images, creating full-blown multi-layered compositions, to making beautiful raster paintings.
• GPU accelerated image processing
Single GPU, Single Node
Corel Draw Corel Professional vector illustration, layout, photo editing and design tools
• Faster processing of AI features
Single GPU, Single Node
Corel Photo-Paint Corel Corel PHOTO-PAINT is an advanced photo editing software that offers professional editing tools and support for PSD files, plus extensive RAW file support for over 300 types of cameras.
• Faster processing of AI features
Single GPU, Single Node
Fresco Adobe Powerful painting and drawing app that lets you create with realistic watercolors and oils
• DirectX acceleration on GPU
Single GPU, Single Node
JPEG to RAW AI Topaz Labs AI-powered conversion of JPEG to high-quality RAW for better editing. Prevent banding, remove compression artifacts, recover detail, and enhance dynamic range
• GPU accelerated processing
Single GPU, Single Node



Mask AI Topaz Labs An AI-based masking tool for photography that lets creators automatically detect and remove objects from an image.
• GPU-accelerated processing
Single GPU, Single Node
Neat Image Absoft Reduces noise, film grain, and artifacts in photos.
• GPU accelerated processing
Single GPU, Single Node
ON1 Photo Raw ON1 Professional-grade photo organizer, raw processor, layered editor, and effects app that includes everything you need in one photography application.
• GPU-accelerated processing
Single GPU, Single Node
PhotoLab DxO PhotoLab is a photo editor specializing in high-quality RAW processing and optical corrections for lens defects, along with powerful local image adjustment tools.
• GPU-accelerated processing and AI features
Single GPU, Single Node
Topaz Studio Topaz Labs Topaz Studio is an intuitive image effect toolbox with Topaz Labs’ powerful acclaimed photo enhancement technology. It works as a plugin within Lightroom, Photoshop, Affinity Photo, and others, as well as a standalone editor and host application for your other Topaz plugins.
• GPU-accelerated processing
Single GPU, Single Node

ENCODING AND DIGITAL DISTRIBUTION


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

4K Capture Utility for Windows ElGato ElGato sells capture cards and offers capture software with them. The ElGato 4K60 Pro Mk.II capture card includes an implementation of the Video Codec SDK (i.e. NVENC).
• HDR recording over HEVC
• HDR to SDR conversion
Single GPU, Single Node
Alchemist on Demand Grass Valley Video standards conversion
• GPU-accelerated video processing and encoding
Multi-GPU, Single Node
Amberfin Dalet Transcoding and video quality analysis
• GPU-accelerated video processing and encoding
Single GPU, Single Node
Aurora Tektronix Automated video quality measurement
• CUDA-accelerated video quality assessment
Single GPU, Single Node
AW-360C10 Panasonic 360-degree live camera designed for live sporting events, concerts and stadium events
• Low-latency
• Real-time 4K 360-degree stitching from four camera inputs
• Jetson TX1
Single GPU, Single Node
Content Agent Root6 Automated transcoding and workflow management
• GPU-accelerated video processing and encoding
Multi-GPU, Single Node
Core ArcVideo Video processing and transcoding Live
• Accelerated transcoding and encoding
Multi-GPU, Single Node
Daniel2 Cinegy Resolution-independent, CUDA accelerated video codec.
• 8K+ video playback faster than real time
• 3D LUT color profiles supported
• Lossless 10-, 12-, 16-bit support
• Adobe Premiere Pro plugin
Single GPU, Single Node
Discord Go Live Discord Broadcast feature that enables Discord users to broadcast their screen to a Discord channel
• NVENC
N/A
DouYu App DouYu Douyu’s streaming application
• NVENC
Single GPU, Single Node
Elemental Live Elemental Live streaming video processing and encoding
• Video encoding and video processing
Multi-GPU, Single Node
Elemental Server Elemental File-based video processing and encoding
• Video encoding and video processing
Multi-GPU, Single Node
Fast CinemaDNG Processor Fastvideo RAW video debayering, denoising and color correction completely on GPU side
• High-quality GPU-based RAW video processing up to 160 fps
• Wavelet, realtime de-noising
• Color correction features and monitoring
• Export to 16-bit TIF or 10-bit ProRes
• Full-sized video processing
• Realtime 4K, 6K, and 8K playback supported
Multi-GPU, Single Node



FAST TICO-RAW intoPIX The intoPIX TICO-RAW SDKs provide the highest quality, visually lossless codec for the optimization of your application’s infrastructure. FastTICO-RAW SDKs are perfect for all professionals looking to deploy ultra-low latency, lossless RAW encoding over parts of their workflows.
• CUDA GPU accelerated up to 10K decoding
• Lossless and low latency
• All operating systems
Single GPU, Single Node
FAST TICO-XS intoPIX The intoPIX FastTICO-XS SDKs provide the highest quality, lowest latency, visually lossless codec for the optimization of your application. FastTICO-XS SDKs are perfect for all professionals looking to deploy ultra-low latency, lossless encoding over their whole infrastructure and workflows.
• CUDA GPU accelerated HD, UHD-4K and -8K encoding / decoding
• Lossless and low latency
• All operating systems
• JPEG XS standard compliant
Single GPU, Single Node
Handbrake Handbrake HandBrake is an open-source, GPL-licensed, multiplatform, multithreaded video transcoder.
• GPU accelerated encoding
Single GPU, Single Node
HuYa App HuYa Huya’s streaming app
• NVENC
Single GPU, Single Node
JPEG2000 Codec Comprimato JPEG2000 encoding and decoding for DCP, IMF, video editing, broadcast contribution, and archiving.
• Faster-than-real-time UltraHD / 4K
• Lossy and mathematically lossless
• High-bit-depth (HDR)
• Uses NVENC to encode/decode multiple H.264 and HEVC streams
Multi-GPU, Single Node
Lightspeed Live Telestream Enterprise-class live streaming system that can ingest, encode, package and deploy multiple sources to multiple destinations. System utilizes the latest technologies to deliver pristine quality and exceptional processing speed. Video processing and transcoding can be accelerated with GPU for up to 9x speed improvements.
• Video processing and transcoding
Multi-GPU, Single Node
Live ArcVideo High-density, real-time video processing and encoding.
• Accelerated broadcast encoding with NVIDIA CUDA and NVENC
Multi-GPU, Single Node
Logitech Capture Logitech Logitech’s app to control their webcam
• NVENC
Single GPU, Single Node
Medialooks SDK Medialooks MFormats SDK provides complete control over the video pipeline
• NVIDIA Video Codec used for accelerated encoding and decoding
Single GPU, Single Node
Media Transcoding in the Cloud Ribbon Communications Industry-leading SBC media transcoding scaling capabilities in virtual and cloud deployments using NVIDIA GPUs to increase performance and decrease cost per transcoded session. Expanded SBC and PSX support for SIP Recording (SIPRec) allows enterprises and call centers to conduct up to four (4) simultaneous recordings of sessions via secure, encrypted technology. Expanded capabilities for Virtual Network Functions (VNF) instantiation with the ability to instantiate Ribbon PSX VNF aligned with the Open Network Automation Platform (ONAP) framework. Enhancements for operational efficiencies that allow CSPs to reduce configuration complexity and improve ease of use. Enhanced security across all products to deliver more restrictive access, reduction in possible network exposure and additional encryption.
• Ribbon’s Session Border Controller Release 7.0 now supports GPUs, enabling greater performance and scale for media transcoding, at cost-effective price points, in cloud and virtualized environments.
• Ribbon’s Centralized Policy and Routing (PSX) can be instantiated as a Virtual Network Function (VNF) aligned with the ONAP architecture.
• Enterprises now have increased capacity for up to four (4) concurrent SIP Recording (SIPRec) sessions, enabling recorded data to be used for multiple purposes simultaneously, such as real-time analytics for call center agents, recordings for corporate compliance and back-up, and lawful intercept
• The Insight Element Management System (EMS) has an improved user interface for ease of use and offers improved provisioning and management processes
Single GPU, Single Node
Multiplatform Transcoder ERLAB Video processing and encoding software
• Pre-processing, encoding, decoding, post-processing and delivery
Single GPU, Single Node



mxfSPEEDRAIL MOG Technologies Baseband broadcast news and sports production video ingest product line that allows editing of growing files during ingest.
• NVIDIA Video codec used for encoding for higher channel density
• CUDA RAW de-coding, de-bayering, and video re-sizing and re-sampling
Single GPU, Single Node
OBS Studio Open Broadcaster Software Free and open source software for video recording and live streaming optimized for NVIDIA video encoder
• NVENC
Single GPU, Single Node
Piko TV Kizil Electronik Linear broadcast encoder
• H.264 and HEVC 4K encoding for broadcast channels
Single GPU, Single Node
PixelStrings Cinnafilm Cloud-based image processing Platform-as-a-Service (PaaS) delivering high-quality, automated video conversion and frame optimization
• Motion-compensated frame rate conversion
• High-quality de-interlacing
• Texture-aware scaling
• De-grain/re-grain to any film look
• De-noise/re-texture to limit banding
• Reverse telecine/pulldown pattern correction
• Interlace artifact and dust removal
• Runtime retiming
Multi-GPU, Single Node
Skywatch MOG Technologies Video and broadcast production management system for collecting audio/video usage and metadata.
• NVIDIA Video codec used for encoding for higher channel density
• CUDA RAW de-coding, de-bayering, and video re-sizing and re-sampling
Single GPU, Single Node
Smart Render Editor Nablet H.264 and HEVC video encoding using NV Video Codec
• Accelerated, high-density video encoding
Single GPU, Single Node
Smart Render SDK Nablet Video de-noising, de-interlacing, JPEG 2000 encoding and video fingerprinting
• CUDA accelerated video processing
• NVIDIA Video codec
Single GPU, Single Node
Speech Quality transformed using Neural Network Computing BabbleLabs BabbleLabs has just launched broad production availability of our commercial speech API, web service, and phone mobile apps for iPhone and Android. These services clean up video and audio recordings to make the speech much easier to understand. The apps work on existing videos as well as new audio and video recorded inside the app.
• Real time encoding/decoding of audio
• Video signals
Single GPU, Single Node
StreamLabs OBS StreamLabs Branch of the OBS Studio project that adds a custom UI, integrates plugins, and a plugin store
• NVENC
Single GPU, Single Node
Tachyon Cinnafilm Standards conversion
• Video processing and frame rate conversion
• Standards conversions and transcoding
• SD to UHD, telecine correction, and frame rate normalization
Multi-GPU, Single Node
Tornado Marquise Technologies Transcoding engine for IMF and DCP facilities
• Image re-sizing up to 8K
• Color space conversion: 601/709, REC 2020, DCI XYZ, ACES 1.0
• De-bayering: ARRIRAW, DNG, RED R3D, SONY F65, F55 RAW, Phantom flex 4K, Canon C500
• Mezzanine: ProRes 444, Avid DNxHD 444, XDCAM, AVC Intra, AS-11 DPP, IMF
• Uncompressed: DPX, TIFF, OpenEXR
Single GPU, Single Node
Transkoder Colorfront Encoding and transcoding for DCP and IMF mastering
• JPEG2000 encoding and decoding
• 32-bit floating point processing on multiple GPUs
• MXF wrapping, accelerated checksums and AES encryption and decryption
• IMF/IMP and DCI/DCP package authoring, editing, transwrapping
Multi-GPU, Single Node
Twitch Studio Twitch.tv Broadcasting app focused on beginners
• NVENC
• Multi-video Codec support
Single GPU, Single Node



Vantage LightSpeed Telestream Enterprise-class live streaming system that can ingest, encode, package and deploy multiple sources to multiple destinations. System utilizes the latest technologies to deliver pristine quality and exceptional processing speed. Video processing and transcoding can be accelerated with GPU for up to 9x speed improvements.
• Video transcoding and processing
Multi-GPU, Single Node
Viarte Isovideo Video standards conversion
• CUDA-accelerated video processing and encoding
Multi-GPU, Single Node
VidiCert Joanneum Research Video and film quality assurance
• CUDA accelerated video quality analysis
• GPU-accelerated noise, grain and dust detection/removal
Multi-GPU, Single Node
Wormhole Cinnafilm Time alteration
• Retiming and motion compensation
• Super slow motion and run length adjustment
• Commercial insertion, audio retiming, and caption retiming
Single GPU, Single Node
Wowza Streaming Engine Transcoder Wowza H.264 video encoding
• NVENC accelerated video encoding
Single GPU, Single Node
XSplit Broadcaster SplitmediaLabs, Ltd. Broadcast app for recording and streaming, now including a lightweight video editor
• NVENC
• Record
• Stream
N/A
XSplit Gamecaster SplitmediaLabs, Ltd. Simplified broadcast app for recording and streaming, now including a lightweight video editor
• NVENC
Single GPU, Single Node

ON-AIR GRAPHICS
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

Air Cinegy Broadcast play-out server
• Real-time on-air graphics
• NVIDIA Video Codec for accelerated encoding and decoding HD and HEVC
Single GPU, Single Node
Aximmetry Aximmetry Aximmetry’s solutions cover all aspects of advanced broadcast presentation: tracked virtual sets, Augmented Reality (AR), interactive touch screen displays, data-driven graphics, virtual product placement, and audience interaction via second-screen devices.
• DirectX 11 3D Rendering, Post Processing and Compositing
• NVENC encoding in H.264/H.265
• TXAA
• GameWorks: Screen-Space Ambient Occlusion
• GameWorks: Depth of Field
Single GPU, Single Node
Brodcaast Dscript 3D Monarch 3D on-air graphics
• Real-time rendering
Single GPU, Single Node
Camino AJT Systems Camino is a powerful 3D rendering system for live-to-air broadcast graphics, capable of up to 4K character generation. Camino’s high end features, with excellent ease of use, combine to deliver an exceptional system for your broadcast graphics requirements.
• Camino’s real-time graphics overlay can be applied to tickertapes, scoreboards, schedule boards, program junctions, and TV show promotions
• Graphics overlay may be done via predefined templates, which may then be populated with live data during playout
• Makes real-time rendering of data-driven graphics possible in news and sports events
• 4K, 1080p, 720p and SD Support
• NTSC and PAL Support
• Graphics, Clips and 3D Objects Importer
• 2D and 3D Primitives
• Real-Time Key-Frame Animations
• Real-Time 3D Scene Lighting
• Timeline-Based Audio Support
• Data Mapping to External Sources
• Transition Logic
• Automation Controller Support
• Stereoscopic 3D rendering
Single GPU, Single Node
Capture Cinegy Video ingest
• Uses NVENC to encode/decode multiple H.264 and HEVC streams
Single GPU, Single Node



Clarity Pixel Power On-air graphics
• Real-time rendering
Single GPU, Single Node
Click Effects PRIME ChyronHego Click Effects PRIME is an audiovisual content control and delivery solution for live sports & entertainment productions.
• Real-time graphics rendering
Single GPU, Single Node
Cube Dalet On-air Graphics
• Real-time graphics rendering
Single GPU, Single Node
Designer Disguise Designer is the ultimate software to visualize, design, and sequence projects wherever you are, from concept all the way through to showtime.
• Real-time graphics rendering
• Synchronized video playback
• Projection Mapping
Single GPU, Single Node
eStudio Brainstorm Virtual sets and motion graphics
• Real-time rendering
• RTX accelerated ray-tracing optional Epic Unreal Engine
Single GPU, Single Node
InfinitySet Brainstorm Realistic virtual sets
• Real-time RTX ray tracing through UE4
• HDR I/O
• Physically-based rendering
• RTX accelerated ray-tracing optional Epic Unreal Engine
Single GPU, Single Node
KAIROS Panasonic The IT/IP platform ‘KAIROS’ is a live video production platform developed based on a new concept and innovative architecture. It incorporates proprietary, ground-breaking software to maximize the CPU and GPU capacities for video processing.
• Realtime playout
• CUDA and NVENC
• Rivermax SMPTE 2110
• GPU Accelerated Video
Single GPU, Single Node
Livebook GFX AJT Systems The LiveBook is designed to fit every production environment and facilitate evolving work flows. Whether you are broadcasting over IP, or using SDI for internal or downstream keying, the LiveBook will be able to adapt to your environment.
• Graphics solution for compact live sports productions
Multi-GPU, Single Node
Mosaic ChyronHego On-air graphics
• Real-time rendering
Single GPU, Single Node
Multiviewers Evertz Broadcast multiviewer
• Uses NVENC H.264 and HEVC encoding and decoding
Single GPU, Single Node
Nexio Channelbrand Imagine Communications On-air graphics
• Real-time rendering
Multi-GPU, Single Node
Nexio G8 Imagine Communications On-air graphics
• Real-time rendering
Single GPU, Single Node
Nexio TitleOne Imagine Communications On-air graphics
• Real-time rendering
Single GPU, Single Node
Pixotope The Future Group All-in-one, real-time virtual production system with integrated Unreal Engine photorealistic rendering. Open software-based solution for rapidly creating virtual studios, augmented reality (AR), and on-air graphics. Offers a real-time WYSIWYG editor, a virtual set auto-generation tool, its own powerful internal chroma keyer, and user-designed custom control panels.
• Real-time rendering
• RTX accelerated ray-tracing
Single GPU, Single Node
PRIME ChyronHego PRIME Graphics Platform is the next generation of pioneering real-time graphics solutions, helping broadcasters create engaging visuals for all types of programming.
• Real-time graphics rendering
Single GPU, Single Node
Reality Engine Zero Density Photorealistic virtual studio solution for the broadcast industry, powered by Epic Unreal Engine 4.24. Uses the Mellanox Rivermax API.
• RTX-accelerated ray-tracing with Unreal Engine
• Node-based compositing system designed for real-time production
• Image quality is achieved on NVIDIA GPUs through deferred rendering methods, unique anti-aliasing technology, and advanced features such as depth of field, motion blur, light maps, screen space reflections and refraction
Single GPU, Single Node



Titler Pro NewBlueFX Create elegant video titles or 3D motion graphics.
• GPU-accelerated graphics
Single GPU, Single Node
tOG RT Software On-air graphics
• Real-time rendering
Single GPU, Single Node
Type Cinegy On-air Graphics
• Real-time graphics rendering
Single GPU, Single Node
Vertigo Grass Valley On-air Graphics
• Real-time rendering
Single GPU, Single Node
Virtuoso Monarch Virtual sets and motion graphics
• Real-time rendering
Single GPU, Single Node
Viz Engine vizrt On-air graphics and virtual sets
• Real-time graphics rendering
Single GPU, Single Node
Wasp3D - CG Wasp3D On-air graphics and virtual sets
• Real-time graphics rendering
Single GPU, Single Node

ON-SET, REVIEW AND STEREO TOOLS


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

4kScope Drastic Technologies 4kScope software provides a real time, professional quality signal analysis tool for on set, production, post production, and research and development environments.
• GPU accelerated effects and compute
Single GPU, Single Node
8KScope Drastic Technologies Real time, professional quality signal analysis tool for on set, production, post production, and research and development environments.
• GPU-accelerated effects and compute
Single GPU, Single Node
Cortex Dailies MTI Film Review, color grading and transcoding on set
• CUDA accelerated grading and transcoding
Multi-GPU, Single Node
Fluid 4K Review BlueFish444 Review and approval of 4K content
• Real-time video review
Single GPU, Single Node
ICE Marquise Technologies IMF reference video player
• RAW data support for ARRIRAW, DNG, RED R3D, SONY F65, F55 RAW, Phantom flex 4K and Canon C500
• HDR content encoded in Dolby Vision, HDR10, HDR10+ or HLG
• Uncompressed formats support: DPX, TIFF and OpenEXR
Single GPU, Single Node
Net-X-Code Drastic Technologies Net-X-Code is a distributed capture and conversion system: IP Capture, Control, Convert and Output for server level.
• GPU accelerated compute
Single GPU, Single Node
NewBlue Stream NewBlueFX NewBlue Stream is a lightweight streaming and broadcast solution paired with dynamic, data-driven graphics
• GPU-accelerated processing, encoding and decoding
Single GPU, Single Node
On-Set Dailies Colorfront Review, color grading and transcoding on set
• Real-time review
• NV Video Codec encoding and transcoding
Multi-GPU, Single Node
Previzion Lightcraft On-set virtual production
• Real-time, virtual set production
Single GPU, Single Node
VideoQC Drastic Technologies videoQC is a suite of video and audio analysis and playback tools with both visual and automated quality checking tools. It takes the media coming into your facility, performs a series of automated tests on video, audio and metadata values against a template, then analyzes the audio and video.
• GPU accelerated effects and compute
Single GPU, Single Node

WEATHER GRAPHICS
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

Max Weather WSI Weather graphics
• Real-time graphics
Single GPU, Single Node
Metacast ChyronHego Weather graphics
• Real-time graphics
Single GPU, Single Node
MeteoEarth MeteoGraphics Weather graphics
• Real-time graphics
Single GPU, Single Node



Medical Imaging
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

3D Slicer 3D Slicer 3D Slicer is an open-source software platform for medical image informatics, image processing, and three-dimensional visualization. Slicer brings free, powerful cross-platform processing tools to physicians, researchers, and the general public.
• NVIDIA Clara AI-assisted Annotation
• Supports multi organs, from head to toe
• Multi-modality imaging (MRI, CT, US, nuclear medicine, and microscopy)
• Bidirectional interface for devices
Single GPU, Single Node
aidoc Aidoc Medical AI based decision support software analyzing medical imaging to provide solutions for detecting acute abnormalities across the body, helping radiologists prioritize life threatening cases and expedite patient care. Agnostic to PACS and RIS systems
• Classification and segmentation using deep learning on top of any PACS platform
Single GPU, Single Node
AI-LAB American College of Radiology ACR AI-LAB offers radiologists tools designed to help them learn the basics of AI and participate directly in the creation, validation and use of health care AI. It accelerates the development and adoption of artificial intelligence (AI) in clinical practice, empowering radiologists to create AI tools at their own institutions, to meet their own patient needs.
• AI models for diagnostic imaging
• AI models tailored to their local patient population
• Patient data protection
Single GPU, Single Node
deepflow Helmholtz Zentrum München Deep learning tool for reconstructing cell cycle and disease progression from flow cytometry data.
• Shows that deep convolutional neural networks combined with nonlinear dimension reduction enable reconstructing biological processes based on raw image data
• Demonstrates this by reconstructing the cell cycle of Jurkat cells and disease progression in diabetic retinopathy
• In further analysis of Jurkat cells, detects and separates a subpopulation of dead cells in an unsupervised manner
• In classifying discrete cell cycle stages, reaches a sixfold reduction in error rate compared to a recent approach based on boosting on image features
• In contrast to previous methods, deep learning based predictions are fast enough for on-the-fly analysis in an imaging flow cytometer
• Uses MXNet, cv2, numpy, python3
Single GPU, Single Node
EBM AI Workflow EBM Technologies EBM AI Workflow is a software platform for seamless data annotation, training, and advanced visualization and deployment of AI-based medical imaging applications. EBM AI workflow and NVIDIA Clara combine the power of AI and edge computing to retain critical processing tasks on devices at the point of care, enabling healthcare professionals, physicians and specialists to make instantaneous, life-saving predictions and emergency responses.
• Pre-trained models for inference and AI-assisted annotation
• Automatic image analysis
• EBM PACS viewer
• FDA approved APP(UDE)
• XAnnotation APPs
Multi-GPU, Multi-Node
Ibex Decision Support IBEX IBEX runs DL on prostate cancer digital pathology to find any potential cancerous areas
• Combines data from digitized glass slides and electronic medical records to reveal underlying patterns
• Extracts valuable clinical insights that can transform how pathology and oncology are practiced and propel them into the information age
Single GPU, Single Node



iNtuition Terarecon, Inc. Intuition offers AI-driven advanced 3D and 4D medical imaging post-processing and visualization.
• Volumetric Navigation, CT and MRI Suites
• Interventional Radiology
• EVAR / TAVR Planning
• Body Fusion
• Maxillo-Facial
• iGENTLE noise reduction
• Lung / Liver Segmentation
• Mitral Valve (TMVR) Workflow
• Lung Density Analysis-II
• Intuition AI Adapter
• Eureka Clinical AI Platform framework
• Explorer UX/UI, and AI algorithm runtime licenses
Multi-GPU, Multi-Node
LVO Viz.ai Automatically identifies suspected LVOs on CTA imaging in your network and alerts your on-call stroke physician within minutes
• Real-Time Specialist Notifications
• AI-Powered LVO Detection
• Automated Maximum Intensity Projections (MIP)
Single GPU, Single Node
MITK German Cancer Research Center Free open-source software system for development of interactive medical image processing software
• Interactive segmentation of slices in image volumes, including interactive region growing and easy correction, interpolation of missing slices, surface generation, and volumetry
• Point-based registration of medical image volumes, which matches two images based on two corresponding sets of points; rigid registration of images by combination of the ITK registration objects (transforms, optimizers, metrics, etc.)
• Measurement of distances and angles; GPU-based volume visualization with easy-to-modify transfer functions; movie generation (Windows only)
• Deformable Registration
Single GPU, Single Node
OHIF Open Health Imaging Foundation OHIF is a framework for building medical imaging web applications that uses React. The code is modular, using React components and a plug-in model, making it possible to add new tools and workflows into the basic viewer UI.
• Integrated AI-assisted Annotation with NVIDIA Clara Plugin
• Retrieve and load medical images from most sources and formats
• Render sets in 2D, 3D, and reconstructed representations
• Allows for the manipulation, annotation, and serialization of observations
• Supports internationalization, OpenID Connect, offline use, hotkeys
Single GPU, Single Node
PowerGrid University of Illinois Urbana-Champaign Provides iterative non-Cartesian MRI reconstruction
• GPU accelerated implementations of the non-uniform FFT and Discrete Fourier Transform
• MPI is used to enable using multiple GPUs in one or several machines
• Iterative reconstruction using physics-based model to correct for unwanted effects, such as field inhomogeneity and patient motion
Multi-GPU, Single Node
Proprio Proprio Proprio’s multi-camera system is based on a networked camera array, depth sensing, and light field technology, letting surgeons operate with access to all the data they need. Offers training based on captured real cases in a safe and collaborative environment.
• CUDA
Single GPU, Single Node



Rad AI Follow-up RAD AI Rad AI provides communication and • Communicates and tracks follow-up Multi-GPU
tracking of follow-up recommendations recommendations Multi-Node
for incidentalomas (such as for pulmonary • Integrates with a health system’s existing
nodules and lung cancer screening workflow
programs) that are top of mind for • Appropriate follow-up imaging is
improving patient safety. Ensuring performed on a timely basis
that these follow-ups are performed
improves the overall quality of patient care
and reduces patient morbidity/mortality,
while creating new imaging revenue for the
health system and generating value from
additional downstream services.
Rad AI Impressions RAD AI Rad AI automatically generates customized • Automatic report impressions, Multi-GPU
report impressions that save radiologists an • Customized to your language Multi-Node
average of more than 60 minutes per day. AI • Fleischner, Lung-RADS and TI-RADS
automatically generates report impressions, • Seamless integration
customized to each radiologist’s exact
language and style, for more than 90% of
imaging modalities.
RadiAnt Medixant RadiAnt DICOM Viewer provides basic • Fluid zooming and panning, brightness Single GPU
tools for the manipulation and and contrast adjustments, negative mode, Single Node
measurement of images preset window settings for computed
tomography (lung, bone, etc.)
• Ability to rotate (90, 180 degrees) or
flip (horizontal and vertical) images;
segment length; mean, minimum, and
maximum parameter values (e.g., density in
Hounsfield units in computed tomography)
within a circle/ellipse and its area; angle
value (normal and Cobb angle)
• Pen tool for freehand drawing
Radiology Assist Zebra Imaging Receives imaging scans from various • Classification and segmentation on top of Single GPU
modalities and automatically analyzes any PACS platform Single Node
them for a number of different clinical
findings. Findings are provided in real time
to radiologists or other physicians and
hospital systems as needed.
Radlogics Virtual RadLogics Software platform imports any DICOM- • Real-time analytics on medical imaging Single GPU
Resident compatible study directly from the modality Single Node
or the PACS. The software platform
provides APIs for image analysis algorithms
to incorporate search, measurement, and
other findings into the radiologist’s existing
PACS and reporting system as a preliminary
report.
Vitrea® Vital Images Vitrea provides advanced visualization tools • Interface designed for viewing in the Multi-GPU
to a range of medical specialists (including reading room Multi-Node
radiologists, cardiologists, oncologists and • Improved clinical outcomes with clinical
other specialists) so that they can visualize workflows and partner applications
patient images and communicate with each • Increased efficiency with a consistent user
other efficiently on a course of action. Vitrea interface and experience for all modalities
is a crucial tool for clinical decision support • Easy-to-deploy thin-client solution does not
and enabling physicians to communicate require specialized software to reside on
effectively about a common patient, and client computers.
specialists rely on its detailed 2D, 3D and
4D images for confident analysis in critical
scenarios.



XNAT Radiologics XNAT is an open source imaging informatics • Upload data using DICOM image data and Single GPU
platform developed by the Neuroinformatics metadata Single Node
Research Group at Washington University. • Organize and share data within user-
It facilitates common management, defined projects securely
productivity, and quality assurance tasks • Visualize and download using an embedded
for imaging and associated data. XNAT is medical image viewer that supports a
extensible and can be used to support a number of common medical imaging
wide range of imaging-based projects. formats
• Secure and manage access to data using a
tiered architecture
• Search and explore large data sets and
create and share customized search
patterns
• Process data using pipelines that allow
for the programming and automation of
complex workflows
xvision Augmedics Augmented reality guidance system • Transparent AR display N/A
for surgery that allows surgeons to see the • Tracking system
patient’s anatomy through skin and
tissue as if they had ‘x-ray vision’ and to
accurately guide instruments and implants
during spine procedures

Oil and Gas


APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

6X Ridgeway Kite Reservoir Simulation on Tesla • CUDA Simulation Parallelization Single GPU
Single Node
AISight for SCADA BRS Labs Proactive integrity management and real- • 24/7 real-time analysis and alerting Multi-GPU
time precursor alerts for enhanced SCADA • Scales to thousands of sensors across Single Node
operations in oil and gas. remote and geographically dispersed
locations
• Historical analysis and trend
reports
AxRTM Acceleware Reverse Time Migration Software • CUDA-accelerated libraries for building Multi-GPU
RTM software Multi-Node
DecisionSpace Halliburton E&P platform for geoscience, well planning, • CUDA acceleration of fault extraction Multi-GPU
(Landmark) drilling and earth modeling. Single Node
Echelon Stone Ridge Full-featured reservoir simulator designed • Fully GPU-accelerated reservoir model Multi-GPU
Technology from inception for GPU (Supported features) • Dual-perm, dual-porosity, pressure-varying Multi-Node
perm and porosity
• Eclipse-compatible input deck
GeoDepth Emerson Seismic Interpretation Suite • CUDA-accelerated RTM Multi-GPU
Multi-Node
Geoteric Geoteric Seismic interpretation • Attribute calculations Multi-GPU
• Geobodies extraction Single Node
Graydient S Giant Grey Machine learning anomaly detection for • Proactive integrity management and real- Multi-GPU
(SCADA) large scale industrial data. time precursor alerts for enhanced SCADA Single Node
operations in oil and gas
• 24/7 real-time analysis and alerting scaling
to thousands of sensors across remote and
geographically dispersed locations
HUESpace Bluware Library SDK toolkit for creating applications • CUDA acceleration for compression Multi-GPU
for seismic compression and seismic/ • Large-scale visualization Single Node
geospatial imaging and interpretation.
InsightEarth CGG Seismic Interpretation Suite • OpenCL acceleration for AFE Multi-GPU
• 3D curvature attributes Single Node
Omega2 RTM Schlumberger Seismic processing • Multiple algorithms (RTM, etc.) Multi-GPU
Multi-Node



PumaFlow IFP Beicip-Franlab Reservoir simulation • GPU-accelerated linear solver Multi-GPU
Single Node
Roxar RMS Emerson Reservoir modeling • Multi-GPU capabilities via HUESpace Multi-GPU
Single Node
RTM Tsunami Seismic processing • RTM algorithm Multi-GPU
Multi-Node
Seismic City RTM Seismic City RTM Seismic Processing • CUDA acceleration Multi-GPU
Multi-Node
SKUA Emerson Reservoir modeling • Faults, Horizons and Flow Simulation Grid Multi-GPU
Single Node
tNavigator Rock Flow tNavigator Solver is a software package, • CUDA Multi-GPU
Dynamics (RFD) offered as a single executable, which • Pascal/Volta architecture Multi-Node
lets users build static and dynamic reservoir • Multi-GPU
models, run dynamic simulations, calculate
PVT properties of fluids, build surface
network models, calculate lifting tables, and
perform extended uncertainty analysis as
part of one integrated workflow.
VoxelGeo Emerson Seismic Interpretation Package • Multi-GPU volume rendering Multi-GPU
• Horizon-flattening Single Node
• Attribute calculations

Life Sciences
BIOINFORMATICS
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

Arioc Johns Hopkins High-throughput read alignment with GPU- • Single-end alignment, paired-end Multi-GPU
University accelerated exploration of the seed-and- alignment Single Node
extend search space. • Output in SAM or database-ready binary
formats
• Multiple GPU implementation
AtacWorks NVIDIA AtacWorks is a deep learning toolkit for • Coverage track denoising Multi-GPU
coverage track denoising and peak calling • Retraining Single Node
from low-coverage or low-quality ATAC-Seq
data.
BarraCUDA University of Sequence mapping software • Alignment of short sequencing reads Multi-GPU
Cambridge • Alignment of indels with gap openings and Multi-Node
Metabolic extensions.
Research Labs
BEAGLE-lib Open Source BEAGLE is a high-performance library • Evaluation of likelihood for sequence Multi-GPU
that can perform the core calculations at evolution on trees and arbitrary models Single Node
the heart of most Bayesian and Maximum (e.g. nucleotide, amino acid, codon)
Likelihood phylogenetics packages. Makes • Speed-ups (over CPU-only version):
use of highly-parallel processors such as nucleotide model = up to 25x, codon model
those in graphics cards (GPUs) found in = up to 50x.
many PCs.
Campaign SimTK An open-source library of GPU-accelerated • K-means Multi-GPU
data clustering algorithms and tools. • Kps-means Multi-Node
• K-medoids
• K-centers
• Hierarchical clustering
• Self-organizing map
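As an illustration of the first algorithm in the list above, here is a minimal CPU sketch of k-means (Lloyd's algorithm) in plain Python. It is not Campaign's GPU code; the function name `kmeans` and the naive "first k points" initialization are assumptions made for this example.

```python
def kmeans(points, k, iterations=20):
    """Cluster equal-length coordinate tuples into k groups with Lloyd's algorithm.

    Illustrative sketch only: centers start at the first k points (an
    assumption), and a fixed number of iterations replaces a convergence test.
    """
    centers = [tuple(points[i]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each point joins its nearest center.
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its members.
        for j, members in enumerate(clusters):
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers
```

The assignment step is independent for every point, which is why this family of algorithms maps well onto GPUs.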
Clara Genomics NVIDIA Clara Genomics Analysis is a GPU- • CUDA-based libraries: partial order Multi-GPU
Analysis accelerated library for biological sequence alignment (cudapoa) Single Node
analysis. • Global aligner (cudaaligner)
• Mapper (cudamapper)
CUDASW++ Open Source Open source software for Smith-Waterman • Parallel search of Smith-Waterman Multi-GPU
protein database searches on GPUs. database. Single Node
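For readers unfamiliar with Smith-Waterman database search, the following plain-Python sketch scores a query against each database sequence and ranks the results. It is not CUDASW++ code: the linear gap penalty, the scoring values, and the names `smith_waterman_score` and `search_database` are assumptions chosen for illustration (protein searches normally use a substitution matrix such as BLOSUM and affine gaps).

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

def search_database(query, database):
    """Return (score, sequence) pairs sorted from best to worst match."""
    scored = [(smith_waterman_score(query, seq), seq) for seq in database]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Each database sequence is scored independently, so a GPU implementation can process many sequences at once.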



CUSHAW Open Source Parallelized short read aligner • Parallel, accurate long read aligner for Multi-GPU
large genomes Single Node
f5c University of An optimised re-implementation of the • Methylated cytosine base and frequency Single GPU
New South call-methylation and eventalign modules detection Single Node
Wales in Nanopolish. Given a set of basecalled • Event alignment
Nanopore reads and the raw signals, f5c
call-methylation detects the methylated
cytosine and f5c eventalign aligns raw
nanopore DNA signals (events) to the base-
called read. f5c can optionally utilise NVIDIA
graphics cards for acceleration.
G-BLASTN Hong Kong GPU-accelerated nucleotide alignment tool • Blastn and megablast modes of NCBI- Single GPU
Baptist based on the widely used NCBI-BLAST. BLAST Single Node
University
GHOST-Z GPU Akiyama Sequence homology search tool. • Shotgun metagenome analysis. Multi-GPU
Laboratory, Multi-Node
Tokyo Institute of
Technology
GPU-Blast Carnegie Mellon Local search with fast k-tuple heuristic • Protein alignment according to BLASTP Single GPU
University Single Node
mCUDA-MEME Open Source Ultrafast scalable motif discovery algorithm • Scalable motif discovery algorithm based Multi-GPU
based on MEME. on MEME Single Node
MUMmer GPU Open Source MUMmer GPU is a high-throughput local • Aligns multiple query sequences against Single GPU
sequence alignment program a reference sequence in parallel Single Node
NVBIO Open Source NVBIO is an open source C++ library • Data structures, algorithms Multi-GPU
of reusable components designed to • Utility routines useful for building complex Single Node
accelerate bioinformatics applications using computational genomics applications on
CUDA. CPU-GPU systems
NVBowtie Open Source A largely complete implementation of the • Good coverage of Bowtie2 features Multi-GPU
Bowtie2 aligner on top of NVBIO. • Comparable quality results Single Node
Parabricks NVIDIA Parabricks provides 30-50 times faster • BWA-MEM, STAR, HaplotypeCaller, CNVKit, Multi-GPU
secondary analysis of sequencer-generated Mutect2, DeepVariant, ImportGVCF, Select Single Node
FASTQ files to variant call files (VCFs). Variants, Genotype GVCF, Mark, Sort,
Parabricks has accelerated the standard BQSR, Merge, VQSR, Variant Filtration,
secondary analyses such as GATK4 and Google’s CNNScore, and many quality checking
DeepVariant to generate equivalent results, tools.
while increasing throughput significantly.
PEANUT Open Source Read mapper that maps DNA or RNA sequence • Achieves supreme sensitivity and speed Single GPU
reads to a known reference genome. compared to current state-of-the-art Single Node
read mappers like BWA-MEM, Bowtie2,
and RazerS3
• Reports either only the best hits or
all hits
Racon University of Racon is intended as a standalone • It supports data produced by both Pacific Single GPU
Zagreb, Faculty consensus module to correct raw contigs Biosciences and Oxford Nanopore Single Node
of Electrical generated by rapid assembly methods Technologies. Racon can be used as a
Engineering and which do not include a consensus step. polishing tool after the assembly with either
Computing The goal of Racon is to generate genomic Illumina data or data produced by third-
consensus which is of similar or better generation sequencing. The type of data
quality compared to the output generated input is automatically detected. Racon
by assembly methods which employ both takes as input only three files: contigs in
error correction and consensus steps, FASTA/FASTQ format, reads in FASTA/FASTQ
while providing a speedup of several times format and overlaps/alignments between
compared to those methods. the reads and the contigs in MHAP/PAF/SAM
format. Output is a set of polished contigs
in FASTA format printed to stdout. All input
files can be compressed with gzip (which
will have an impact on parsing time).
• Racon can also be used as a read error-
correction tool. In this scenario, the MHAP/
PAF/SAM file needs to contain pairwise
overlaps between reads including dual
overlaps.



racon-gpu Open Source Racon is intended as a standalone • Racon can be used as a polishing tool after Single GPU
consensus module to correct raw contigs the assembly with either Illumina data Single Node
generated by rapid assembly methods or data produced by third-generation
which do not include a consensus step. sequencing
The goal of Racon is to generate genomic • The type of data inputted is automatically
consensus which is of similar or better detected.
quality compared to the output generated • Racon takes as input only three files: contigs
by assembly methods which employ both in FASTA/FASTQ format, reads in FASTA/
error correction and consensus steps, FASTQ format and overlaps/alignments
while providing a speedup of several times between the reads and the contigs in MHAP/
compared to those methods. It supports PAF/SAM format. Output is a set of polished
data produced by both Pacific Biosciences contigs in FASTA format printed to stdout.
and Oxford Nanopore Technologies. All input files can be compressed with gzip
(which will have an impact on parsing time).
• Racon can also be used as a read error-
correction tool. In this scenario, the MHAP/
PAF/SAM file needs to contain pairwise
overlaps between reads including dual
overlaps.
REACTA Open Source A modified version of GCTA with improved • GRM creation Multi-GPU
computational performance, support • REML analysis Single Node
for Graphics Processing Units (GPUs), • Regional Heritability (including multi-GPU)
and additional features. The purpose of
REACTA is to quantify the contribution of
genetic variation to phenotypic variation for
complex traits.
SeqNFind Accelerated SeqNFind is a powerful tool suite that • Hardware and software for reference Multi-GPU
Technology addresses the need for complete and assembly, blast, SW, HMM, de novo Single Node
Laboratories accurate alignments of many small assembly
sequences against entire genomes utilizing
a unique hardware/software cluster system
for facilitating bioinformatics research in
Next Generation sequencing and genomic
comparisons.
SOAP3 Genomics GPU-based software for aligning short • Short read alignment tool that is not Multi-GPU
reads with a reference sequence. Finds all heuristic based Multi-Node
alignments with k mismatches, where k is • Reports all answers
chosen from 0 to 3.
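The idea of reporting every alignment with at most k mismatches can be sketched as a brute-force scan. This is only a conceptual illustration: SOAP3 itself is index-based and GPU-accelerated, and the function name `find_alignments` is hypothetical.

```python
def find_alignments(read, reference, k):
    """Return (position, mismatches) for every placement of `read` on
    `reference` with at most k mismatches (no insertions or deletions).

    Brute-force sketch; the 0..3 range mirrors the catalog description.
    """
    if not 0 <= k <= 3:
        raise ValueError("k must be between 0 and 3")
    hits = []
    for start in range(len(reference) - len(read) + 1):
        mismatches = 0
        for r, g in zip(read, reference[start:start + len(read)]):
            if r != g:
                mismatches += 1
                if mismatches > k:
                    break  # too many mismatches, abandon this position
        else:
            # Loop finished without exceeding k: record the hit.
            hits.append((start, mismatches))
    return hits
```

Because no heuristic pruning is applied, every qualifying position is returned, which matches the "reports all answers" behavior described in the entry.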
SOAP3-dp The University of SOAP3-dp is an ultra-fast GPU-based tool • Burrows-Wheeler Transform Multi-GPU
Hong Kong for short read alignment via index-assisted • Dynamic programming Single Node
dynamic programming.
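The Burrows-Wheeler Transform named above is the basis of many read-alignment indexes. A minimal sketch follows, assuming a `$` sentinel that does not occur in the text; it uses the naive sorted-rotations construction, not the index structures SOAP3-dp actually builds, and the function names are illustrative.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler Transform via sorted rotations (naive construction)."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)

def inverse_bwt(transformed, sentinel="$"):
    """Recover the original text by repeatedly prepending and re-sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]
```

Production aligners build the transform from a suffix array and query it with an FM-index rather than materializing all rotations.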
Synomics Studio Row Analytics Multi-Omics Biomarker Network Discovery • Multi-SNP association studies (GWAS Multi-GPU
and Validation. Synomics Studio is a new, studies with up to 30 SNPs/SNVs in Single Node
highly scalable analysis platform that combination)
enables researchers and clinicians to • Configurable number of cycles of fully
discover novel associations between random permutation for validation of SNP
multiple genotypic, phenotypic and clinical networks
attributes of their patients and their disease • Speed-up on GPU = 170x vs multi-core
risk/therapy responses. CPU alone (further speed-up available
on multi-GPU and NVLink devices)
• Representative performance for 15,000
case:controls, 200,000 SNPs
• 2 SNP associations found and validated in
12 mins on single 20 core IBM POWER8NVL
with 4x Tesla P100 GPU
• 17 SNP associations found and validated in
6 days on single 20 core IBM POWER8NVL
with 4x Tesla P100 GPU
UGene Unipro Open source Smith-Waterman for SSE/CUDA, • Fast short read alignment Multi-GPU
Suffix array based repeats finder and dotplot. Single Node
WideLM Open Source Fits numerous linear models to a fixed • Parallel linear regression on multiple Multi-GPU
design and response. similarly-shaped models Single Node



MICROSCOPY
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

ANNA-PALM Institut Pasteur Accelerating Single Molecule Localization • Uses a much smaller number of low Single GPU
Microscopy with Deep Learning: ANNA- resolution frames than other methods Single Node
PALM is a computational method that can • Processing by localization algorithms
reconstruct super-resolution images from results in a sparse localization image using
sparse single molecule localization data a neural network previously trained on
and/or widefield images. ANNA-PALM conventional PALM images
can produce high quality super-resolution • Inputs sparse image and outputs a super-
images from data obtained in much shorter resolution image
acquisition time than standard single • Runs well on GPU due to acceleration
molecule localization microscopy. By available in TensorFlow
strongly reducing acquisition time, ANNA-
PALM facilitates super-resolution imaging
of large numbers of cells (high throughput
imaging), large samples, and live cells.
Appion New York Appion is a “pipeline” for processing and • The underlying packages integrated into Single GPU
Structural analysis of EM images. Appion is integrated Appion include MotionCor2, Gctf, EMAN, Single Node
Biology Center with Leginon data acquisition but can also Spider, Frealign, Imagic, XMIPP, IMOD,
be used stand-alone after uploading images ProTomo, ACE, CTFFind and CTFTilt,
(either digital or scanned micrographs) findEM, DogPicker, TiltPicker, RMeasure,
or particle stacks using a set of provided EM-BFACTOR, and Chimera.
tools. Appion consists of a web based user
interface linked to a set of python scripts
that control several underlying integrated
processing packages. All data input and
output within Appion is managed using
tightly integrated SQL databases. The goal
is to have all control of the processing
pipeline managed from a web based user
interface and all output from the processing
presented using web based viewing tools.
BioEM Max Planck GPU-accelerated computing of Bayesian • BioEM can use CUDA for the cross- Multi-GPU
Institute inference of electron microscopy images. correlation step, which essentially consists Single Node
of an image multiplication in Fourier space
and a Fourier back-transformation
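The cross-correlation step described above (a multiplication in Fourier space followed by a back-transformation) can be sketched on the CPU in plain Python. This is not BioEM code: it uses a naive O(N²) discrete Fourier transform on tiny images, circular boundary conditions, and hypothetical helper names, purely to show the structure of the computation.

```python
import cmath

def dft(seq, inverse=False):
    """Naive 1D discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [
        sum(x * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m, x in enumerate(seq))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out

def dft2(image, inverse=False):
    """2D transform: 1D transforms over rows, then over columns."""
    rows = [dft(row, inverse) for row in image]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def cross_correlate(reference, image):
    """Circular cross-correlation: corr[dy][dx] = sum ref[y][x] * img[y+dy][x+dx]."""
    f_ref, f_img = dft2(reference), dft2(image)
    # Multiplication in Fourier space (conjugate of the reference spectrum).
    product = [[a.conjugate() * b for a, b in zip(ra, rb)] for ra, rb in zip(f_ref, f_img)]
    # Fourier back-transformation gives the correlation map.
    return [[v.real for v in row] for row in dft2(product, inverse=True)]

def best_shift(corr):
    """Return the (dy, dx) index of the correlation peak."""
    return max(
        ((dy, dx) for dy in range(len(corr)) for dx in range(len(corr[0]))),
        key=lambda idx: corr[idx[0]][idx[1]],
    )
```

A GPU implementation would replace the naive transform with a fast Fourier transform library, but the multiply-then-back-transform pattern is the same.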
crYOLO Max Planck Novel automated particle picking software • Part of the image processing workflow in Multi-GPU
Institute for based on the deep learning object detection SPHIRE. Single Node
Molecular system ‘You Only Look Once’ (YOLO).
Physiology CrYOLO is available as a standalone program
at http://sphire.mpg.de/ and will be
part of the image processing workflow in
SPHIRE.
cryoSPARC cryoSPARC CryoSPARC is an easy to use software tool • Ab-initio reconstruction Multi-GPU
that enables rapid, unbiased structure • Heterogeneous reconstruction Multi-Node
discovery of proteins and molecular • High-speed and high resolution refinement
complexes from cryo-EM data. of 3D protein structures implemented on
GPUs
• Multiple simultaneous jobs on multiple
GPUs
Dynamo Center for Dynamo is a software environment for • Dynamo provides workflows all the way Single GPU
Cellular subtomogram averaging of cryo-EM data. from tomograms to averages and classes. Single Node
Imaging and • In a full workflow, you would organize
Nano Analytics tomograms in catalogues, use them to
(C-CINA), pick particles and create alignment and
Biozentrum, classification projects to be run on different
University of computing environments
Basel • Requires CUDA Toolkit version 7.5 or
higher and a CUDA driver compatible with
your actual GPU device



EMAN2 Baylor College EMAN2 is the successor to EMAN1. It is a All EMAN2 programs, including GUI Single GPU
of Medicine broadly based greyscale scientific image programs, are written in the easy-to-learn Single Node
processing suite with a primary focus on Python scripting language. This permits
processing data from transmission electron knowledgeable end-users to customize any
microscopes. EMAN’s original purpose was of the code with unprecedented ease. If you
performing single particle reconstructions aren’t an advanced user, you can still make
(3-D volumetric models from 2-D cryo-EM use of the integrated GUI and all of EMAN2’s
images) at the highest possible resolution, command-line programs.
but the suite now also offers support for
single particle cryo-ET, and tools useful in
many other subdisciplines such as helical
reconstruction, 2-D crystallography and
whole-cell tomography. Image processing
in a suite like EMAN differs from consumer
image processing packages like Photoshop
in that pixels in images are represented
as floating-point numbers rather than
small (8-16 bit) integers. In addition, image
compression is avoided entirely, and there is
a focus on quantitative analysis rather than
qualitative image display.
emClarity Benjamin Himes emClarity is a collection of GPU-accelerated • Subtomogram averaging Multi-GPU
software developed to enable determination • Very high resolution single particle analysis Single Node
of biological structures at resolutions better • Hybrid electron microscopy.
than 1 nm from heterogeneous specimens
imaged by cryo-electron tomography.
Gautomatch MRC Laboratory Gautomatch is a GPU-accelerated program • Fast: typically 1.5-2.0 s with 15 templates, Single GPU
of Molecular for accurate, fast, flexible and fully using a good GPU (e.g. GTX 980, Titan X) Single Node
Biology automatic particle picking from cryo-EM • Fully automatic with a simple command on
micrographs with or without templates. entire data sets
• Convenient and easy to use
• Flexible: with or without template, suitable
for both basic and advanced users
• Compatible with Relion/EMAN
• Background correction: automatically corrects
the gradient background that affects the
picking
• Rejection of ice/carbon: automatically
detects non-particle areas and rejects them
• Post-optimization: scripts available to re-
filter the coordinates after picking within
seconds
• Accuracy: the user’s satisfaction is the only
‘gold standard’ criterion
GCTF MRC Laboratory Corrects contrast transfer function effects • CUDA Single GPU
of Molecular in electron microscope optics Single Node
Biology
Huygens Scientific Huygens Products: Greatly improve your • Deconvolution of volumetric images and Multi-GPU
Volume Imaging microscope images time series from widefield, confocal, light Single Node
sheet, super-resolution STED microscopes
and more
• Chromatic aberration and cross-talk
correction, image stabilization and stitching
• Visualization, tracking, colocalization and
object analysis
• Multi-GPU and cluster support



IMOD University of IMOD is a set of image processing, modeling • ctfphaseflip : Corrects tilt series for Single GPU
Colorado and display programs used for tomographic microscope CTF by phase flipping Single Node
reconstruction and for 3D reconstruction • gputilttest : Test whether a GPU is reliable
of EM serial sections and optical sections. for computing reconstructions with the tilt
Contains tools for assembling and aligning program
data within multiple types and sizes of • 3dmod : Model editing and image display
image stacks, viewing 3-D data from any program. 3dmod can display three-
orientation, and modeling and display of the dimensional graphic data sets in many views
image files. simultaneously, can model these data sets,
and can display models and graphic data in
3-D. The views include a slice through the
3D volume, a projection of a sub-volume and
orthogonal views with contour overlays.
• xyzproj : Project 3-dimensional data at a
series of tilts around the X, Y, or Z axis.
ITK Kitware The National Library of Medicine Insight • Library is used by ParaView, VTK, and many Single GPU
Segmentation and Registration Toolkit (ITK), other software distributions Single Node
or Insight Toolkit, is an open-source, cross- • Many capabilities for multi-dimensional
platform C++ toolkit for segmentation and image processing and extraction tools
registration. Segmentation is the process • Most recent GPU acceleration of FFTs
of identifying and classifying data found in a using cuFFT (cuFFTW) and matrix math
digitally sampled representation. Typically the accelerated through CUDA-enabled Eigen3
sampled representation is an image acquired
from such medical instrumentation as CT
or MRI scanners. Registration is the task
of aligning or developing correspondences
between data. For example, in the medical
environment, a CT scan may be aligned with an
MRI scan in order to combine the information
contained in both.
Leginon New York Leginon is a system designed for automated • A Leginon application is an image acquisition Single GPU
Structural collection of images from a transmission process that is built of several smaller Single Node
Biology Center electron microscope. pieces called ‘nodes’
• Nodes can be applications
• Some of these are GPU-accelerated
applications such as Topaz, Relion, and
MotionCor2
Microvolution Microvolution Nearly instantaneous 3D deconvolution & • 3D deconvolution for fluorescence Single GPU
up to 200 times faster. microscopy Single Node
• Written for use only on GPUs
• Multi-GPU support
MotionCor2 UCSF A multi-GPU program that corrects • Overall, MotionCor2 is extremely robust, Multi-GPU
beam-induced sample motion on dose and sufficiently accurate at correcting local Single Node
fractionated movie stacks. Implements motions so that the very time-consuming
a robust iterative alignment algorithm and computationally intensive particle
that delivers precise measurement and polishing in RELION can be skipped.
correction of both global and non-uniform • Works on a wide range of data sets,
local motions at single-pixel level across including cryo-tomographic tilt series
the whole frame. Suitable for both single-
particle and tomographic images.
PSSR Waitt Advanced Deep Learning-Based Point-Scanning • Pre-trained models for Single GPU
Biophotonics Super-Resolution Imaging allows point- • PSSR for Electron Microscopy (EM) Single Node
Center Core scanning super-resolution (PSSR) imaging • PSSR single frame (PSSR-SF) for mitoTracker
and facilitates point-scanning image • PSSR multiframe (PSSR-MF) for mitoTracker
acquisition with otherwise unattainable • PSSR for neuronal mitochondria
resolution, speed, and sensitivity.
RELION MRC Laboratory RELION (for REgularised LIkelihood • Image classification and high-resolution Multi-GPU
of Molecular OptimisatioN, pronounced rely-on) is a refinement accelerated up to 40-fold Single Node
Biology stand-alone computer program that • Template-based particle selection
employs an empirical Bayesian approach to accelerated almost 1000-fold
refinement of (multiple) 3D reconstructions • Reduced memory requirements
or 2D class averages in electron cryo- • High-resolution cryo-EM structure
microscopy (cryo-EM). determination in a matter of days on a single
workstation



Thunder Tsinghua THUNDER is particle-filter-based cryo-EM • Both image classification and Multi-GPU
University image processing software for high-resolution refinement accelerated up Multi-Node
analyzing cryo-EM images to obtain to 40-fold
a 3D model. • Template-based particle selection
accelerated almost 1000-fold
• Reduced memory requirements
• High-resolution cryo-EM structure
determination in a matter of days on a single
workstation
Tomviz Kitware Tomviz enables 3D characterization of • 3D tomographic data processing, Single GPU
materials at the nano- and meso-scale, visualization, and analysis of Single Node
tailored for visualizing electron tomography • Python
data. It utilizes the large quantities • Windows
of memory and processing resources • Mac OS
required to render, manipulate, and analyze • Linux
voluminous 3D tomograms.
Topaz Tristan Bepler A pipeline for particle detection in cryo- • Deep learning for cryo-EM data particle Single GPU
electron microscopy images using picking Single Node
convolutional neural networks trained from • Uses CUDA and PyTorch
positive and unlabeled examples.
Warp Max Planck Warp integrates novel algorithms for frame • CUDA-enabled processing for electron Single GPU
Institute for alignment, defocus estimation, particle microscopy Single Node
Biophysical picking and tomographic reconstruction in • TensorFlow (v1.10)
Chemistry a rich user interface. Enables data quality • CUDA kernels: backprojection, CTF,
monitoring in real time, data analysis at deconvolution, FFT, tomography
microscope level and obtains high-resolution refinement, and others
structures before data collection is over.

MOLECULAR DYNAMICS
APPLICATION NAME COMPANY NAME PRODUCT DESCRIPTION SUPPORTED FEATURES GPU SCALING

ACEMD Acellera Ltd GPU simulation of molecular mechanics • Written for use only on GPUs. Multi-GPU
force fields, implicit and explicit solvent Multi-Node
AMBER University of Suite of programs to simulate molecular • PMEMD Explicit Solvent and GB Implicit Multi-GPU
California at San dynamics on biomolecules. Solvent Single Node
Francisco
CHARMM Harvard MD package to simulate molecular • Implicit (5x) Multi-GPU
University dynamics on biomolecules. • Explicit (2x) Single Node
• Solvent via OpenMM, now ported natively
to GPUs
Colvars Temple Software module for molecular simulation • LAMMPS, NAMD, VMD Multi-GPU
University and analysis that provides a high- • GPU support Multi-Node
performance implementation of sampling
algorithms defined on a reduced space of
continuously differentiable functions (aka
collective variables)
The module itself implements a variety of
functions and algorithms, including free-
energy estimators based on thermodynamic
forces, non-equilibrium work and
probability distributions
Computational Crystallography Toolbox | Lawrence Berkeley Laboratories | Open source component of the PHENIX system to advance automation of macromolecular structure determination. Useful for small-molecule crystallography and even general scientific applications. | • GPU acceleration for scattering and general purpose math via CUDA and cuFFT | Multi-GPU, Single Node
DeePMD-kit | Princeton University | DeePMD-kit is a package written in Python/C++, designed to minimize the effort required to build deep-learning-based models of interatomic potential energy and force fields and to perform molecular dynamics (MD). Addresses the accuracy-versus-efficiency dilemma in molecular simulations. Applications of DeePMD-kit span from finite molecules to extended systems and from metallic systems to chemically bonded systems. | • TensorFlow • High-performance classical MD and quantum (path-integral) MD packages • Deep Potential series models • MPI and GPU support | Multi-GPU, Single Node



DeepSite | Acellera Ltd | DeepSite is a protein binding pocket predictor based on deep neural networks. Allows you to upload your structure in PDB format, monitor the progress of your job and visualize the results with a modern WebGL viewer. | • Deep learning • Machine learning • Drug discovery in a web interface | Single GPU, Single Node
DESMOND | David E. Shaw Research | High-speed molecular dynamics simulations of biological systems. | • The code uses novel parallel algorithms and numerical techniques to achieve high performance and accuracy | Multi-GPU, Single Node
ESPResSo | ESPResSo | Highly versatile software package for performing and analyzing scientific molecular dynamics many-particle simulations of coarse-grained atomistic or bead-spring models as they are used in soft-matter research in physics, chemistry and molecular biology. | • Hydrodynamic / Electrokinetic forces • P3M electrostatics | Multi-GPU, Single Node
FEP+ | Schrödinger, Inc. | Molecular Dynamics (MD) and Free Energy Perturbation (FEP) calculations occur on time scales that are computationally demanding to simulate. A key factor in determining whether a simulation will take days, hours, or minutes to run is the hardware being used. The advent of GPU computing has opened the door to a new world of computationally intensive simulations that would not have been possible even a few years ago. Desmond's high-performance Molecular Dynamics code, together with continuously improving computer hardware technologies, is helping scientists push the boundaries of discovery further than ever before. The use of MD simulations to impact drug discovery has now been attained in FEP+, due to the confluence of hardware and software development along with the formulation of sufficiently accurate theoretical methods and models. | • Optimization of the FEP+ algorithm to take full advantage of the Desmond GPU MD engine, enabling 2 to 4 ligands to be scored per day on a multi-GPU server. | Multi-GPU, Multi-Node
Folding@Home | Stanford University | A distributed computing project that studies protein folding, misfolding, aggregation, and related diseases. | • Powerful distributed computing molecular dynamics system • Implicit solvent and folding | Multi-GPU, Single Node
Galamost | CAS-CIAC | GALAMOST is a project that employs high-performance computational techniques to accelerate molecular simulation by fully utilizing the computational power of NVIDIA GPUs. Enables the investigation of polymeric systems on large temporal and spatial scales at a very low cost. | • Full Molecular Simulation on GPU | Multi-GPU, Multi-Node
GALAMOST | ChangChun CHINA | GALAMOST is a package that employs high-performance computational techniques on many-core processors to accelerate molecular dynamics simulations. The package is written in CUDA and C++ specifically for running on NVIDIA GPUs and focuses on large-scale simulations of soft matter. | • General molecular dynamics • Dissipative particle dynamics (DPD) • Brownian dynamics (BD) • Coarse-graining molecular dynamics (CGMD) • Reaction model • Anisotropic particle models • MD-SCF • DNA 3SPN model • Rigid body method • Stretching method | Single GPU, Single Node
Genesis | Diamond Visionics | GenesisRTX is an advanced high-fidelity runtime rendering engine which eliminates the need for traditional off-line database compiling or formatting. | • Powerful parallelization for hybrid (CPU+GPU) systems • Full electrostatics with PME • Large (1-100 million atoms) biological systems | Multi-GPU, Single Node
GENESIS | RIKEN | GENESIS (GENeralized-Ensemble SImulation System) is a software package for molecular dynamics simulations and trajectory analyses. | • Powerful parallelization for hybrid (CPU+GPU) systems • Full electrostatics with PME • Large (1-100 million atoms) biological systems | Multi-GPU, Single Node



GPUgrid.net | Acellera Ltd | A distributed computing project that uses GPUs for molecular simulations. | • High-performance all-atom biomolecular simulations • Explicit solvent and binding | Multi-GPU, Single Node
GROMACS | KTH Royal Institute of Technology | Simulation of biochemical molecules with complicated bond interactions | • Implicit (5x) • Explicit (2x) Solvent | Multi-GPU, Single Node
HALMD | HALMD | Large-scale simulations of simple and complex liquids. | • Simple fluids and binary mixtures (pair potentials, high-precision NVE and NVT, dynamic correlations) | Single GPU, Single Node
HOOMD-Blue | University of Michigan | Particle dynamics package written from the ground up for GPUs. | • Written for use only on GPUs | Multi-GPU, Single Node
HTMD | Acellera Ltd | High throughput molecular dynamics simulations. | • Available via Conda and GitHub • ACEMD • PMEMD • NAMD • GROMACS • AMBER • CHARMM force fields • Adaptive sampling, Markov State Models, visualization, protein preparation and ligand parameterization | Multi-GPU, Single Node
LAMMPS | Sandia National Lab | Classical molecular dynamics package | • Lennard-Jones • Gay-Berne • Tersoff | Multi-GPU, Multi-Node
MELD | University of Calgary | OpenMM plugin written for GPUs. | • Integrative approach to combine physics and information • Orders of magnitude faster protein folding than brute force MD | Multi-GPU, Single Node
MOLECULAR OPERATING ENVIRONMENT | Chemical Computing Group ULC | Calculate and Analyze pH-Dependent Protein Properties. MOEsaic Session Sharing and Project Customization. Determine Conformation Population from NMR NOE Data. Predict Relative Binding Energies with AMBER Thermodynamic Integration. | • GPU Accelerated 3D Stereo Graphics • AMBER GPU accelerated support | Single GPU, Single Node
myPresto | N2PC/AIST/JBIC, Japan | Open Source Computational Drug Discovery Suite. | • High performance virtual screening by MD binding • Free energy calculation | Multi-GPU, Multi-Node
NAMD | University of Illinois at Urbana-Champaign | Designed for high-performance simulation of large molecular systems. | • Full electrostatics with PME and most simulation features • 100M atom capable | Multi-GPU, Single Node
OpenMM | Stanford University | Library and application for molecular dynamics for HPC with GPUs. | • Molecular Dynamics toolkit • Extensible and growing • Implicit and explicit solvent, custom forces | Multi-GPU, Single Node
PolyFTS | University of California at Santa Barbara | Classical molecular simulation code for studying polymer self-assembly and thermodynamics. | • Uses auxiliary fields as the fundamental simulation degrees of freedom • Uses cuFFT extensively (~80%) • CUDA code is ~20% • Multi CPU or single GPU per job • 1x = Ivy Bridge E5-2690 CPU, all 10 cores • 3-8x on K40 or K80 (utilizing 1/2 of the K80) | Single GPU, Single Node
SOP-GPU | SOP-GPU | SOP-GPU package for the Self Organized Polymer Model fully implemented on a GPU. A scientific software package designed to perform Langevin Dynamics Simulations of the mechanical or thermal unfolding, and mechanical indentation, of large biomolecular systems on the experimental subsecond (millisecond-to-second) timescale. | • Langevin dynamics simulations using the coarse-grained Self Organized Polymer (SOP) model • Multiple simulation trajectories can be performed simultaneously on a single GPU • Calpha and Calpha-Cbeta models • Simulations of protein forced unfolding • Novel simulations of nanoindentation in silico • Support for hydrodynamic interactions • Up to ~100 ms of simulation time per day • Systems of up to 1,000,000 amino acids (on GPUs with 6GB or greater memory) | Single GPU, Single Node
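
Several entries in the table above (LAMMPS, for example) list Lennard-Jones pair potentials among their GPU-accelerated features. As a rough illustration of what such a kernel evaluates for a single pair of particles, the sketch below computes the standard 12-6 Lennard-Jones energy and force in plain Python. The function name and the reduced units (epsilon = sigma = 1) are assumptions made for illustration; the code is not taken from any of the listed packages.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Return (energy, force) of the 12-6 Lennard-Jones potential at distance r.

    V(r) = 4*epsilon*((sigma/r)**12 - (sigma/r)**6)
    F(r) = -dV/dr = 24*epsilon*(2*(sigma/r)**12 - (sigma/r)**6) / r
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    sr12 = sr6 * sr6
    energy = 4.0 * epsilon * (sr12 - sr6)
    force = 24.0 * epsilon * (2.0 * sr12 - sr6) / r
    return energy, force


# The potential crosses zero at r = sigma and reaches its minimum (-epsilon)
# at r = 2**(1/6) * sigma, where the force vanishes.
r_min = 2.0 ** (1.0 / 6.0)
energy_at_min, force_at_min = lennard_jones(r_min)
```

A production MD code evaluates this expression for very many pairs per time step, which is why it maps well onto GPUs.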



QUANTUM CHEMISTRY
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Abinit | ABINIT | Finds the total energy, charge density and electronic structure of systems made of electrons and nuclei within DFT. | • Local Hamiltonian • Non-local Hamiltonian • LOBPCG algorithm • Diagonalization/orthogonalization | Multi-GPU, Single Node
ACES 4 | University of Florida | New SIA/aces4 development: a new super instruction architecture with interface applications for quantum chemistry (aces4). | • Integrating scheduling GPU into SIAL programming language and SIP runtime environment | Multi-GPU, Single Node
ACES III | University of Florida | ACES III takes the best features of parallel implementations of quantum chemistry methods for electronic structure. | • Integrating scheduling GPU into SIAL programming language and SIP runtime environment. | Multi-GPU, Multi-Node
ADF | Software for Chemistry & Materials | Density Functional Theory (DFT) software package that enables first-principles electronic structure calculations. | • Geometry optimizations and frequency calculations with GGA functionals. | Multi-GPU, Single Node
BigDFT | BigDFT | Implements density functional theory by solving the Kohn-Sham equations describing the electrons in a material. | • Daubechies wavelets | Multi-GPU, Multi-Node
BrianQC | StreamNovation Ltd. | BrianQC is a software product in the field of quantum chemistry. It accelerates features of Q-Chem 5.0 or later. Optimized for simulating large molecules and tested up to 20,000 Cartesian Gaussian basis functions. Has full support of s, p, d, f and g-type orbitals. Full support for NVIDIA GPU architectures (Kepler, Maxwell, Pascal, Volta) with double precision accuracy on 64-bit Linux operating systems. Aims to speed up Q-Chem for every calculation that uses Coulomb or Exchange integrals over Gaussian basis functions or their first analytic derivative (including HF-SCF, DFT, SCF geom. opt, DFT geom. opt for most functionals, etc.) | • The range of NVIDIA architectures supported by BrianQC has been expanded. In addition to GPUs powered by Kepler, Maxwell and Pascal, BrianQC now supports the NVIDIA Tesla V100 GPU as well • Compatible with features of Q-Chem 5.0 or later • Optimized for simulating large molecules • Tested up to 20,000 Cartesian Gaussian basis functions • Full support of s, p, d, f and g-type orbitals • Full support for NVIDIA GPU architectures (Kepler, Maxwell, Pascal). Double precision accuracy • Runs on 64-bit Linux operating systems • Speeds up Q-Chem for every calculation that uses Coulomb or Exchange integrals over Gaussian basis functions or their first analytic derivative (including HF-SCF, DFT, SCF geom. opt, DFT geom. opt for most functionals, etc.) | Multi-GPU, Single Node
CP2K | CP2K | Program to perform atomistic and molecular simulations of solid state, liquid, molecular and biological systems. | • DBCSR (sparse matrix multiply library) | Multi-GPU, Multi-Node
GAMESS-UK | Open Source | The general purpose ab initio molecular electronic structure program for performing SCF-, DFT- and MCSCF-gradient calculations. | • (ss|ss) type integrals within calculations using Hartree-Fock ab initio methods and density functional theory • Supports organics and inorganics | Multi-GPU, Multi-Node
GAMESS-US | Ames Laboratory/Iowa State University | Computational chemistry suite used to simulate atomic and molecular electronic structure. | • Libqc with Rys Quadrature Algorithm • Hartree-Fock • MP2 and CCSD | Multi-GPU, Multi-Node
Gaussian | Gaussian, Inc. | Predicts energies, molecular structures, and vibrational frequencies of molecular systems. | • Joint NVIDIA, PGI and Gaussian collaboration | Multi-GPU, Single Node
GPAW | GPAW | Real-space grid DFT code written in C and Python. | • Electrostatic Poisson equation • Orthonormalizing of vectors • Residual minimization method (RMM-DIIS) | Multi-GPU, Multi-Node
gWL-LSMS | ORNL | Materials code for investigating the effects of temperature on magnetism. | • Generalized Wang-Landau method | Multi-GPU, Multi-Node
LATTE | Open Source | Density matrix computations | • CU_BLAS • SP2 Algorithm | Multi-GPU, Single Node



libxc | TDDFT | Libxc is a library of exchange-correlation functionals for density-functional theory, providing a portable, well-tested and reliable set of exchange and correlation functionals that can be used by all the ETSF codes and also other codes. | • GPU acceleration for quantum chemistry • LDA, GGA, hybrids and mGGA • Python 3 and C interfaces | Multi-GPU, Single Node
LSDalton | LSDalton | Linear-scaling HF and DFT code suitable for large molecular systems, now also with some CCSD capabilities. Tensor Algebra Library Routines for Shared Memory Systems, which is being used to GPU-accelerate three (3) CAAR codes: NWChem, LSDALTON and DIRAC. | • (T) correction to the CCSD energy • RI-MP2 energy/gradient (in development) • CCSD energy (in development) • GPU-based ERI generator (in development) | Multi-GPU, Single Node
MAPS | Scienomics | MAPS CLASSICAL & MESOSCALE simulation toolkit contains world-class simulation engines such as LAMMPS, CHAMELEON, TOWHEE, NAMD. Includes a collection of ready-to-use workflows and a rich Force-Field library. | • Typical calculations that can be executed include molecular dynamics simulations and Monte Carlo simulations, structure relaxation in periodic or molecular systems using both classical and quantum mechanics tools • Trajectory can be generated and then later analyzed using the appropriate tools • Additional simulations can be performed using PC-SAFT and related methods for thermodynamics modeling | Single GPU, Single Node
MOLCAS | MOLCAS | Methods for calculating general electronic structures in molecular systems in both ground and excited states. | • CU_BLAS | Multi-GPU, Single Node
MOPAC2012 | MOPAC | Semiempirical Quantum Chemistry | • Pseudodiagonalization • Full diagonalization • Density matrix assembling via MAGMA libraries | Single GPU, Single Node
NWChem | NWChem | NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters. | • Triples part of Reg-CCSD(T) • CCSD and EOMCCSD task schedulers | Multi-GPU, Single Node
NWChemEX | Pacific Northwest National Laboratories | NWChemEx targets developing high-performance computational models for the production of advanced biofuels and other bioproducts | • GPU acceleration • Libraries like libxc | Single GPU, Single Node
Octopus | Harvard University | Used for ab initio virtual experimentation and quantum chemistry calculations. | • Full GPU support for ground-state, real-time calculations • Kohn-Sham Hamiltonian • Orthogonalization • Subspace diagonalization • Poisson solver • Time propagation • DFT application | Single GPU, Single Node
PEtot | Lawrence Berkeley Laboratories | First principles materials code that computes the behavior of the electron structures of materials. | • Density functional theory (DFT) plane wave pseudopotential calculations | Multi-GPU, Single Node



QBox | University of California Davis | Qbox is a C++/MPI scalable parallel implementation of first-principles molecular dynamics (FPMD) based on the plane-wave, pseudopotential formalism. Designed for operation on large parallel computers. | • The availability of double precision graphics cards provides an opportunity to speed up electronic structure computations. We modify the Qbox code to utilize Fermi GPUs on the Keeneland platform • We use the CUFFT library to speed up Fourier transforms and perform asynchronous communication to cut down the cost of data transfers • The modified code is used in simulations of a 64-molecule water system with an 85 Ry plane wave energy cutoff • Preliminary results show a 2-3 times speedup in the calculation of the charge density and in the application of the Hamiltonian operator to the wave function • We present these findings as well as further speedups measured in other parts of the code. http://eslab.ucdavis.edu/software/qbox http://keeneland.gatech.edu | Single GPU, Single Node
Q-CHEM | Q-Chem Inc. | Computational chemistry package designed for HPC clusters. | • Various features including RI-MP2 | Single GPU, Single Node
QMCPACK | QMCPACK | QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. | • Main features | Multi-GPU, Multi-Node
Quantum Espresso | Quantum Espresso Foundation | An integrated suite of computer codes for electronic structure calculations and materials modeling at the nanoscale. | • PWscf package: linear algebra (matrix multiply) • Explicit computational kernels • 3D FFTs | Multi-GPU, Multi-Node
QUICK | Michigan State University | QUICK is a GPU-enabled ab initio quantum chemistry software package. | • Running Hartree-Fock and DFT energy on GPU • Supports s, p, d, f orbitals on energy calculation • HF gradient with s, p, d orbital support • GPU-based ERI generator | Multi-GPU, Single Node
RESCU | Hongzhiwei technology | RESCU is KS-DFT calculation software that can study very large systems with only a small computer. Offers a new, extremely powerful, parallel, high-efficiency KS-DFT self-consistent calculation method. | • Parallel high-efficiency processing: KS-DFT | Multi-GPU, Single Node
RMG | North Carolina State University | RMG is a density functional theory (DFT) based electronic structure code that uses real space grids to represent wavefunctions, charge densities, and ionic potentials. Designed for scalability and runs successfully on systems with thousands of nodes (including GPU nodes) and hundreds of thousands of CPU cores. | • Supports 10k+ GPU nodes • Multipetaflops capable • Handles thousands of atoms with full DFT precision • Supports multiple GPUs per node • Fully open source • Installation support • Cray XE6/XK7 | Multi-GPU, Single Node
TAL-SH | Oak Ridge National Lab | Tensor Algebra Library Routines for Shared Memory Systems accelerates three (3) CAAR codes: NWChem, LSDALTON and DIRAC. | • Tensor Algebra Library for Shared Memory Computers: Nodes equipped with multicore CPU, NVIDIA GPU, and Intel Xeon Phi (in progress) | Multi-GPU, Multi-Node
TeraChem | PetaChem LLC | Quantum chemistry software designed to run on NVIDIA GPUs. | • Full GPU-based solution; Performance compared to GAMESS CPU version | Multi-GPU, Single Node
VASP | University of Vienna | Complex package for performing ab-initio quantum-mechanical molecular dynamics (MD) simulations using pseudopotentials or the projector-augmented wave method and a plane wave basis set | • Blocked Davidson (ALGO = NORMAL & FAST) • RMM-DIIS (ALGO = VERYFAST & FAST) • K-Points and optimization for critical step in exact exchange calculations | Multi-GPU, Multi-Node



(MOLECULAR) VISUALIZATION AND DOCKING
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Amira | Thermo Fisher Scientific | A multifaceted software platform for visualizing, manipulating, and understanding life science and biomedical data. | • 3D visualization of volumetric data and surfaces | Single GPU, Single Node
AUTODOCK | Scripps | The AutoDock Suite is a growing collection of methods for computational docking and virtual screening, for use in structure-based drug discovery and exploration of the basic mechanisms of biomolecular structure and function. | • OpenCL-accelerated version of AutoDock4.2.6 • AutoDock GPU • ADADELTA | Single GPU, Single Node
BINDSURF | Universidad Catolica de Murcia | A virtual screening methodology that uses GPUs to determine protein binding sites. | • Allows fast processing of large ligand databases | Single GPU, Single Node
BUDE | Bristol University Docking Station | Molecular docking program | • Empirical Free Energy Force field | Single GPU, Single Node
FastROCS | OpenEye Scientific Software, Inc. | Molecule shape comparison application | • Real-time shape similarity searching/comparison | Multi-GPU, Multi-Node
Interactive Molecule Visualizer | University of Illinois | Experimental interactive molecule visualizer based on a ray-tracing engine. | • High quality images and ease of interaction • Latest GPU computing acceleration techniques • Natural user interfaces such as Kinect and Wiimotes | Single GPU, Single Node
MEGADOCK | Akiyama Laboratory, Tokyo Institute of Technology | MEGADOCK is fast protein-protein docking software for cases where more acceleration is demanded, such as an interactome prediction composed of millions of protein pairs. | • MEGADOCK-GPU on 12 CPU cores • 3 GPU calculation speed 37.0 times faster than MEGADOCK on 1 CPU core • Novel docking software facilitating the application of docking techniques to assist large-scale protein interaction network analyses | Multi-GPU, Single Node
Molegro Virtual Docker 6 | QIAGEN | Method for performing high accuracy flexible molecular docking. | • Energy grid computation • Pose evaluation • Guided differential evolution | Single GPU, Single Node
PIPER Protein Docking | Boston University | Protein-protein docking program | • Molecule docking | Single GPU, Single Node
PyMol | Schrödinger, Inc. | User-sponsored molecular visualization system on an open-source foundation. | • Lines: 460% increase • Cartoons: 1246% increase • Surface: 1746% increase • Spheres: 753% increase • Ribbon: 426% increase | Single GPU, Single Node
VEGA ZZ | University of California, San Francisco | Molecular Modeling Toolkit | • Virtual logP • Molecular surface values | Single GPU, Single Node
VMD | University of Illinois | Visualization and analysis of large biomolecular systems in 3-D graphics. | • High quality rendering • Large structures (100M atoms) • Analysis and visualization tasks • Multiple GPU support for display of molecular orbitals | Multi-GPU, Single Node
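
The PyMol entry above reports its GPU gains as percentage increases. The catalog does not say what is being measured, so the sketch below assumes the figures describe an increase in rendering performance over a baseline; under that assumption an increase of p percent corresponds to a speedup factor of 1 + p/100. The helper name is hypothetical.

```python
def increase_to_speedup(percent_increase):
    """Convert a percentage increase over a baseline into a speedup factor."""
    return 1.0 + percent_increase / 100.0


# Figures as listed for PyMol in the table above.
pymol_increases = {
    "Lines": 460,
    "Cartoons": 1246,
    "Surface": 1746,
    "Spheres": 753,
    "Ribbon": 426,
}

speedups = {
    mode: round(increase_to_speedup(pct), 2)
    for mode, pct in pymol_increases.items()
}

for mode, factor in speedups.items():
    print(f"{mode}: {factor}x")
```

Read this way, a "460% increase" for line rendering means roughly 5.6 times the baseline performance, not 4.6 times.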

Research: Higher Education and Supercomputing


NUMERICAL ANALYTICS
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

ArrayFire | ArrayFire | ArrayFire helps organizations develop high-performance computing solutions on modern computational platforms. Specializes in machine learning and computer vision. Uses CUDA and OpenCL programming, code acceleration and optimization, and software design. | • Vector Algorithms • Image Processing • Computer Vision • Signal Processing • Linear Algebra • Statistics | Multi-GPU, Single Node



Eigen | Eigen | Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms. | • CUDA-enabled linear algebra • Eigen solver, reduction, random, etc. | Single GPU, Single Node
Julia | Julia Computing | Julia delivers dramatic improvements in simplicity, speed, scalability, capacity, and productivity to solve massive computational problems quickly and accurately, making it the preferred language for big data analytics. | • NVIDIA CUDA via Julia CUDA JIT plugin architecture • Parallelism and distributed computation • Lightweight "green" threading (coroutines) • Unicode, including but not limited to UTF-8 • Call C • Lisp-like macros and other metaprogramming facilities | Multi-GPU, Multi-Node
Mathematica | Wolfram | A symbolic technical computing language and development environment. | • Development environment for CUDA and OpenCL • GPU acceleration for Wolfram Finance Platform | Multi-GPU, Single Node
MATLAB | MathWorks | GPU acceleration for MATLAB (high-level technical computing language). | • Acceleration for 200+ most used MATLAB functions • Acceleration of more than 500 most parallelizable MATLAB functions • Accelerated Signal Processing toolkit • Accelerated Image Processing toolkit • Accelerated Communications Systems toolkit • Available via an NGC container | Multi-GPU, Single Node
NMath Premium | NMath | GPU-accelerated math and statistics for .NET. Automatically detects the presence of a CUDA-enabled GPU at runtime and seamlessly redirects appropriate computations to it. | • Automatically offloads computations to the GPU. | Single GPU, Single Node
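
The NMath Premium entry above describes detecting a CUDA-enabled GPU at runtime and redirecting appropriate computations to it. The sketch below illustrates that general dispatch pattern in Python. It is not NMath's API (NMath is a .NET library), and every name, the `nvidia-smi` detection heuristic, and the size threshold used to decide what counts as "appropriate" are assumptions made for illustration.

```python
import shutil


def cuda_gpu_present():
    """Crude stand-in for GPU detection (assumption): is `nvidia-smi` on PATH?"""
    return shutil.which("nvidia-smi") is not None


def choose_backend(problem_size, gpu_present, min_gpu_size=100_000):
    """Pick 'gpu' only when a GPU exists and the problem is large enough
    to plausibly amortize host-to-device transfer costs."""
    if gpu_present and problem_size >= min_gpu_size:
        return "gpu"
    return "cpu"


def dot(a, b, gpu_present=None):
    """Dot product that also reports which backend the dispatcher selected."""
    if gpu_present is None:
        gpu_present = cuda_gpu_present()
    backend = choose_backend(len(a), gpu_present)
    # Both branches compute on the CPU in this sketch; a real library would
    # launch a GPU kernel when backend == "gpu".
    result = sum(x * y for x, y in zip(a, b))
    return backend, result
```

The caller never names a device. Small inputs stay on the CPU even when a GPU is present, because transfer overhead can outweigh the gain.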

PHYSICS
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

AWP | AWP | The Anelastic Wave Propagation code, AWP-ODC, independently simulates the dynamic rupture and wave propagation that occur during an earthquake. Dynamic rupture produces friction, traction, slip, and slip rate information on the fault. The moment function is constructed from this fault data and used to initialize wave propagation. | • 3D Finite Difference Computation | Single GPU, Single Node
BQCD | USQCD | Lattice quantum chromodynamics application, used for nuclear and high energy physics calculations. | • Wilson-clover fermion linear solver | Multi-GPU, Single Node
CADISHI | Max Planck Institute | CADISHI is a software package that enables scientists to compute (Euclidean) distance histograms efficiently. Any sets of objects that have 3D Cartesian coordinates may be used as input, for example, atoms in molecular dynamics datasets or galaxies in astrophysical contexts. | • Highly tuned CPU and GPU kernels • Python engine for throughput computing | Multi-GPU, Single Node
CASTRO | CASTRO | A multicomponent compressible hydrodynamic code for astrophysical flows including self-gravity, nuclear reactions and radiation. CASTRO uses an Eulerian grid and incorporates adaptive mesh refinement (AMR). | • Gravitational Field Solver | Multi-GPU, Single Node
Changa | CHANGA | Astrophysics code that performs collisionless N-body simulations: cosmological simulations with periodic boundary conditions in comoving coordinates, or simulations of isolated stellar systems. | • Gravitational Model has been accelerated using CUDA | Single GPU, Single Node
Chemora | CHEMORA | Chemora is a system for performing simulations of systems described by differential equations running on accelerated computational clusters. | • Chemora embeds the equations' computational kernels into dynamically compiled loop nests shaped for input size and GPU structure | Multi-GPU, Single Node



Cholla | Cholla | Computational Hydrodynamics On ParaLLel Architectures for Astrophysics | • Models the Euler equations on a static mesh and evolves the fluid properties of thousands of cells simultaneously using GPUs • It can update over ten million cells per GPU-second while using an exact Riemann solver and PPM reconstruction, allowing computation of astrophysical simulations with physically interesting grid resolutions (>256^3) on a single device; calculations can be extended onto multiple devices with nearly ideal scaling beyond 64 GPUs | Multi-GPU, Single Node
Chroma | USQCD | Lattice Quantum Chromodynamics (LQCD) | • Wilson-clover fermions • Krylov solvers • Domain-decomposition | Multi-GPU, Multi-Node
CPS | USQCD | Lattice quantum chromodynamics application, used for nuclear and high energy physics calculations. | • Wilson, domain-wall and Möbius fermion linear solvers | Multi-GPU, Single Node
CPS (GRID) | USQCD | CPS is developed for lattice QCD and written in C++, with some machine-specific assembly routines. It is being developed by members of Columbia University and Brookhaven National Laboratory. The CPS consists of code to build a library which can be statically linked to your code to create an executable. CPS has optimized codes for QCDOC, IBM Blue Gene machines, and builds for scalar machines or parallel machines with QMP. | • CUDA is supported • The GRID code from Edinburgh is currently being optimized. | Multi-GPU, Multi-Node
CST PARTICLE STUDIO | Dassault Systèmes SIMULIA Corp. | Self-consistent simulation of charged particles in electromagnetic fields | • Particle-in-Cell Solver | Multi-GPU, Multi-Node
GADGET | Max Planck Institute | A code for cosmological simulations of structure formation. | • MPI | Multi-GPU, Multi-Node
GAMER | Open Source | A GPU-accelerated Adaptive Mesh Refinement Code for astrophysical applications. Currently the code solves hydrodynamics with self-gravity. | • Adaptive mesh refinement (AMR). Hydrodynamics with self-gravity • A variety of GPU-accelerated hydrodynamic and Poisson solvers • Hybrid OpenMP/MPI/GPU parallelization • Concurrent CPU/GPU execution for performance optimization. Hilbert space-filling curve for load balance | Multi-GPU, Single Node
GENE | GENE | GENE (Gyrokinetic Electromagnetic Numerical Experiment) is an open source plasma microturbulence code which can be used to efficiently compute gyroradius-scale fluctuations and the resulting transport coefficients in magnetized fusion/astrophysical plasmas. | • Basic Modeling | Multi-GPU, Multi-Node
GPU-AH | Universidade do Porto | Developed at Centro de Astrofisica e Astronomia da Universidade do Porto, GPU-AH simulates the evolution of a network of line-like topological defects (Abelian-Higgs cosmic strings) in a cosmic context. | • Calculates average network density and velocity | Single GPU, Single Node
GPUwalls | Universidade do Porto | Developed at Centro de Astrofisica e Astronomia da Universidade do Porto, GPUwalls simulates the evolution of a network of the simplest topological defect (domain walls) in a cosmic context. | • Calculates average network density and velocity | Single GPU, Single Node
GTC | University of California Irvine (UC Irvine) | Gyrokinetic Plasma Fusion for Modeling a Tokamak reactor | • NVLINK | Multi-GPU, Multi-Node



GTC Irvine | University of California Irvine (UC Irvine) | The gyrokinetic toroidal code (GTC) is a massively parallel, particle-in-cell code for turbulence simulation in support of the burning plasma experiment ITER, the crucial next step in the quest for fusion energy. GTC is the production code for the multi-institutional US Department of Energy (DOE) Scientific Discovery through Advanced Computing (SciDAC) project, GSEP Center (Gyrokinetic Simulation of Energetic Particle Turbulence and Transport), and a DOE INCITE project that was awarded 35M hours of CPU time for 2011. Currently maintained at UC Irvine, GTC was the first fusion code to reach the teraflop in production simulations (in 2001, on the Seaborg computer at NERSC) and the petaflop (in 2008, on the Jaguar computer at ORNL). GTC simulation of the turbulence self-regulation by zonal flows was published in a 1998 Science paper, which has received the most citations for any magnetic fusion research paper published since 1996. | • PUSHe, Collision and Poisson Solver | Multi-GPU, Multi-Node
Application: GTC-P
Company: Princeton Plasma Physics Lab
Description: A development code for optimization of plasma physics. Full science and data sets are included, but in a simplified form to allow performance testing and tuning.
Supported features:
• Optimized with CUDA
• OpenACC development underway
GPU scaling: Multi-GPU, Single Node
Application: HACC
Company: HACC
Description: Simulates N-body astrophysics. The HACC (Hardware/Hybrid Accelerated Cosmology Code) framework exploits the diverse landscape of computing architectures at the largest scales of problem size, obtaining high scalability and sustained performance. Developed to satisfy the science requirements of cosmological surveys, HACC melds particle and grid methods using a novel algorithmic structure that flexibly maps across architectures, including CPU/GPU, multi/many-core, and Blue Gene systems. We demonstrate the success of HACC on two very different machines, the CPU/GPU system Titan and the BG/Q systems Sequoia and Mira, attaining unprecedented levels of scalable performance. We demonstrate strong and weak scaling on Titan, obtaining up to 99.2% parallel efficiency, evolving 1.1 trillion particles.
Supported features:
• This code has been optimized with CUDA and runs in full production mode
GPU scaling: Multi-GPU, Single Node
Application: HAMR GPU
Company: HAMR
Description: GPU-accelerated general relativistic magnetohydrodynamic application.
Supported features:
• Active galactic nuclei model that assumes a radiatively inefficient, sub-Eddington-rate torus
• Axisymmetric ideal MHD
• Viscosity and resistivity through use of Riemann solver (HLL)
• Density floors to mass load the jet
• Uses grids that can resolve the substructure of the jet over 5 orders of magnitude
GPU scaling: Multi-GPU, Single Node
Application: MAESTRO
Company: MAESTRO
Description: A low Mach number stellar hydrodynamics code that can be used to simulate long-time, low-speed flows that would be prohibitively expensive to model using traditional compressible codes.
Supported features:
• Gravitational Field Solver
GPU scaling: Multi-GPU, Single Node
Application: MILC
Company: USQCD
Description: Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the strong force to create larger particles like protons and neutrons.
Supported features:
• Staggered fermions
• Krylov solvers
• Gauge-link fattening
GPU scaling: Multi-GPU, Multi-Node



Application: NekCEM
Company: ANL
Description: A high-fidelity, open-source electromagnetics solver based on spectral element and spectral element discontinuous Galerkin methods, written in Fortran and C.
Supported features:
• The OpenACC implementation covers all solution routines for the Maxwell equation solver in NekCEM, including a highly tuned element-by-element operator evaluation and a GPUDirect gather-scatter kernel to effect nearest-neighbor flux exchanges
GPU scaling: Multi-GPU, Multi-Node
Application: ORB5
Company: EPFL
Description: ORB5 is a global, gyrokinetic, Lagrangian, Particle-In-Cell (PIC), finite element, electromagnetic model.
Supported features:
• Plasma and background magnetic geometry: axisymmetric ideal MHD equilibria computed with the CHEASE code [9]
• Kinetic electrons, or various approximate models (hybrid-trapped or adiabatic)
• Intra- and inter-species linearized collision operators
• Electromagnetic perturbations, with the cancellation problem solved using enhanced control variates and a "pullback" scheme
GPU scaling: Multi-GPU, Multi-Node
Application: OSIRIS
Company: UCLA Plasma Physics Group
Description: Simulates plasma physics, including laser interaction.
Supported features:
• 2 dimensions of the particle push have been optimized with CUDA
• Additional optimization is being planned with OpenACC
GPU scaling: Multi-GPU, Single Node
Application: PIConGPU
Company: HZDR
Description: A relativistic Particle-in-Cell code that describes the dynamics of a plasma by computing the motion of electrons and ions subject to the Maxwell-Vlasov equation.
Supported features:
• Simulation of laser-particle acceleration and relativistic plasma physics
GPU scaling: Multi-GPU, Multi-Node
Application: PPM
Company: PPM
Description: The piecewise parabolic method is a higher-order extension of Godunov's method which uses spatial interpolation and allows for a steeper representation of discontinuities, particularly contact discontinuities.
Supported features:
• Turbulent, compressible mixing of gases in the context of stars near the ends of their lives and also in inertial confinement fusion
GPU scaling: Single GPU, Single Node
Application: QUDA
Company: USQCD
Description: Library for Lattice QCD calculations using GPUs.
Supported features:
• QUDA supports the following fermion formulations: Wilson, Wilson-clover, Twisted mass, Improved staggered (asqtad or HISQ) and Domain wall
GPU scaling: Multi-GPU, Single Node
Application: RAMSES
Company: CEA
Description: Simulates astrophysical problems on different scales (e.g. star formation, galaxy dynamics, cosmological structure formation).
Supported features:
• GPU acceleration
• Radiative transfer for reionization
• Hydrodynamic solver using AMR
GPU scaling: Multi-GPU, Multi-Node
Application: samadii/sciv
Company: Metariver Technology
Description: Software for computing flow fields in high-vacuum conditions using the DSMC (Direct Simulation Monte Carlo) method. It simulates gas flow as molecular particles, including the interactions between the gas and surface boundaries.
Supported features:
• DSMC simulator, gas dynamics solver
• OLED & Semiconductor deposition and etching analysis, Vacuum field analysis
• PDL (Pixel Define Layer) growth analysis
• Deposition mask toolkits, Wall growth, Chemical reaction
GPU scaling: Multi-GPU, Multi-Node
Application: XGC
Company: PPPL
Description: Simulates edge effects for MHD plasma physics.
Supported features:
• The particle push portion has been optimized with CUDA and is being fully optimized with OpenACC and CUDA
GPU scaling: Multi-GPU, Multi-Node

SCIENTIFIC VISUALIZATION
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

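Application: Animator
Company: GNS
Description: Industry proven, modern post-processing app for CAE.
Supported features:
• Rendering
GPU scaling: Multi-GPU, Single Node
Application: Ansys EnSight
Company: ANSYS
Description: Industry proven post-processing app for CAE.
Supported features:
• Rendering
• Ray tracing
GPU scaling: Multi-GPU, Single Node
Application: FieldView
Company: Intelligent Light
Description: Visualization application for CFD.
Supported features:
• Rendering
GPU scaling: Single GPU, Single Node
Application: HVR (LCSE, U of Minnesota)
Company: University of Minnesota
Description: Interactive volume rendering application.
Supported features:
• Volume rendering
GPU scaling: Multi-GPU, Single Node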
Application: IndeX
Company: NVIDIA
Description: Interactive distributed volumetric compute and visualization framework.
Supported features:
• Parallel distributed 3D rendering of dense or sparse volumes
• Accurate ray casting or ray tracing at high resolution of full size datasets
• Plug-in to ParaView also available
GPU scaling: Multi-GPU, Multi-Node
Application: Inside Explorer
Company: Interspectral
Description: Interactive and intuitive software with volumetric rendering and 3D visualization of real captured data.
Supported features:
• vGPU
GPU scaling: Single GPU, Single Node



Application: ParaView
Company: Kitware
Description: Scalable data analysis and visualization application. One of the main visualization tools at HPC sites.
Supported features:
• Rendering and analysis tasks
• Plugin for NVIDIA IndeX
• OptiX rendering backend
• CUDA accelerated filters (data transformation routines)
GPU scaling: Multi-GPU, Multi-Node
Application: Pix4Dmapper
Company: Pix4D
Description: This professional photogrammetry software uses images to generate point clouds, digital surface and terrain models, orthomosaics, textured models and more. It is most often used by geospatial professionals such as surveyors and civil engineers.
Supported features:
• GPU accelerated processing
GPU scaling: Single GPU, Single Node
Application: SPECFEM3D
Company: CIG
Description: There are two modules/apps in the SPECFEM family: GLOBE and CARTESIAN. The global model, GLOBE, is a former Gordon Bell Award winning code and is used for global inversion. It is also part of the CAAR effort (although that effort is mostly focused on workflow rather than the actual model). The regional model, CARTESIAN, is the app used for seismic simulations, earthquake models, submarine acoustics, etc. In addition to being used as a community app, SPECFEM3D is also used as a proxy app for proprietary codes.
Supported features:
• OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library
• Simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra (structured or not)
GPU scaling: Multi-GPU, Single Node
Application: Tecplot
Company: Tecplot
Description: General purpose scientific visualization software for Aerodynamics, O&G, Internal Combustion and Geoscience applications.
Supported features:
• Rendering
GPU scaling: Single GPU, Single Node
Application: VisIt
Company: LLNL
Description: Scalable data analysis and visualization application.
Supported features:
• Rendering and analysis tasks
GPU scaling: Multi-GPU, Single Node
Application: vl3 (Argonne National Lab)
Company: Argonne National Lab
Description: Large dataset visualization in the cosmology, astrophysics, and biosciences fields.
Supported features:
• Volume rendering of particles
GPU scaling: Multi-GPU, Single Node

Smart Spaces
APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Application: AI-NVR
Company: IronYun
Description: Search in video and real-time intrusion detection.
Supported features:
• Search amongst 1000s of videos for interesting activities or attributes
GPU scaling: Single GPU, Single Node
Application: Alert
Company: Irvine Sensors
Description: Alert provides people counting and intrusion detection.
Supported features:
• People counting
• Intrusion detection
GPU scaling: Single GPU, Single Node
Application: Arvas
Company: VI Dimensions
Description: ARVAS is an Intelligent Video Analytics solution that uses advanced statistical modelling based on deep machine learning technology to detect anomalies. This automated approach enables more accurate detection of complex risk patterns that would otherwise escape human analysts and cause high false-alarm rates.
Supported features:
• Anomaly detection features: break-ins, robbery, rioting, floods, accidents, fights, arson, fire, maintenance and vandalism
GPU scaling: Single GPU, Single Node
Application: BioSurveillance NEXT, BioFinder
Company: Herta Security
Description: Real-time facial recognition and forensic alerts against multiple watchlists.
Supported features:
• Supports crowded scenes and difficult lighting
• Faster than real-time analysis
• Partial face concealment
GPU scaling: Multi-GPU, Single Node



Application: Cezurity EVO
Company: Cezurity
Description: Event Observer (EvO) is an engine for detecting malicious activity on user computers, featuring a centralized detection engine, event chains, context, and real-time analysis. Cezurity Cloud is a cloud-based technology for detecting malware that has the flexibility to fit into diverse solutions. Different information can be sent and processed by the server, depending on the needs of each product or solution. For example, Cezurity Cloud is currently used as a subsystem to supply data for the Cezurity EvO detection engine. Cezurity Cloud helps the Anti-Virus Scanner to detect malware. In addition, the technology is used for monitoring and analyzing changes in our APT-D solution designed to detect persistent threats against corporate networks.
Supported features:
• CUDA
GPU scaling: Multi-GPU, Single Node
Application: Cylance
Company: Cylance
Description: Advanced AI-based endpoint malware detection.
Supported features:
• Endpoint malware detection solution
• GPU deep learning technology
GPU scaling: Multi-GPU, Single Node
Application: FaceControl
Company: VOCORD
Description: Detects and recognizes the faces of people freely passing by cameras, provides instant alerts for people on a watchlist, recognizes age and gender, counts people by faces, and tags newcomers and regular visitors. The system uses deep neural network algorithms and performs recognition with extremely high accuracy in field applications.
Supported features:
• Non-cooperative biometric facial recognition system
• ALPR
• Video analytics and pattern recognition
• Video processing and video enhancement
GPU scaling: Multi-GPU, Multi-Node
Application: Glueck Media; Glueck Analytics
Company: Glueck
Description: Deep Learning/Machine Learning based Computer Vision technology that enables understanding of how humans feel and perceive the environment around them, focusing on face and people analytics.
Supported features:
• Facial Expression
• Age Estimation
• Gender
• Ethnicity
• Multi Face Tracking
• Attention Time
GPU scaling: Multi-GPU, Single Node
Application: Ikena Forensic, Ikena Spotlight
Company: MotionDSP
Description: Real-time (render-less) super-resolution-based video enhancement and redaction software for forensic analysts and law enforcement professionals.
Supported features:
• Multi-filter, render-less video reconstruction (super-resolution, stabilization, light/color correction)
• Automatic tracking for redaction of video from body cameras, CCTV and other sources
GPU scaling: Multi-GPU, Single Node
Application: iMotionFocus
Company: iCetana
Description: Intelligent analysis of video on 1,000+ camera streams to significantly filter and reduce the camera streams requiring an operator view.
Supported features:
• GPU accelerated machine learning
• Identifies abnormal activity within video streams
GPU scaling: Multi-GPU, Single Node
Application: IZA500G On Edge Processing ALPR System
Company: Inex/Zamir
Description: The IZA500G with processing-on-edge combines two sensors (OV and LPR), a quad core processor, and ALPR software in a single housing, delivering crystal clear images, automatically recognized license plate data, GPS coordinates, and streaming video.
Supported features:
• Operating Distance: 9-19 ft (3-6 m); 16-32 ft (5-10 m)
• Vehicle Speed Range: 0-120 mph (0-193 km/h)
• Field of View: 12 ft (3.66 m)
GPU scaling: Single GPU, Single Node
Application: Nodeflux IVA
Company: Nodeflux
Description: Nodeflux IVA products and services cover a wide range of sectors, including but not limited to smart city, defense and security, traffic management, toll management, store analytics (wholesale and retail), asset and facilities management, advertising, and transportation.
Supported features:
• Face recognition
• License plate recognition
• Traffic violation detection
• Traffic monitoring and flood monitoring
GPU scaling: Multi-GPU, Single Node
Application: OpenALPR
Company: OpenALPR
Description: Automatic license plate and vehicle make/model/year recognition software applied to video streams from IP cameras.
Supported features:
• High accuracy license plate character recognition spanning North America, Europe, United Kingdom, Australia, Korea, Singapore and Brazil
• APIs and source code available for embedded applications and web services
GPU scaling: Multi-GPU, Single Node



Application: Operating Room Efficiency
Company: Artisight, Inc.
Description: Artisight's Operating Room Efficiency solution improves operating room productivity with an intelligent sensor network and machine learning algorithms. It delivers real-time access to the actionable data needed to improve your operating room productivity while ensuring HIPAA compliance. Deep-learning prediction helps reduce costs, improve productivity, increase profitability and provide clinicians with a safer, more efficient operating room environment.
Supported features:
• Independently validated de-identification protocols
• Machine learning algorithms
• Intelligent cameras and Bluetooth sensors
• Advanced interoperability
• Highly granular analytics
GPU scaling: Multi-GPU, Multi-Node
Application: Patient Location Tracking
Company: Artisight, Inc.
Description: An IoT sensor network for healthcare. Artisight's intelligent solutions improve organizational operations and financial performance. Designed by physicians, AI scientists and operational experts, Artisight's patient location tracking system uses data in a HIPAA-compliant platform to solve the challenges of moving people efficiently around and through a hospital system.
Supported features:
• Independently validated de-identification protocols
• Machine learning algorithms
• Intelligent cameras and Bluetooth sensors
• Advanced interoperability
• Highly granular analytics
GPU scaling: Multi-GPU, Multi-Node
Application: Recotraffic; Recosecure; Recohospital
Company: Recogine
Description: Intelligent Transportation Systems covering complex multi-modal surface transportation solutions at the regional, sub-regional, corridor and small-area levels using deep computer vision technologies.
Supported features:
• Traffic Data Collection
• Incident Detection
• Integrated Management
• Vehicle Classification and supporting related applications
GPU scaling: Multi-GPU, Single Node
Application: SenDISA Platform
Company: SenSen Networks
Description: SenSen provides Video-IoT data analytics software solutions targeted at increasing revenue and reducing the cost of operations for customers. SenSen software can process and fuse data from cameras and other sensors like GPS, Radar, and Lidar in real time for parking guidance, parking enforcement, speed enforcement, traffic data analytics and road safety applications. Casinos use SenSen solutions for table game analytics and customer analytics. SenSen solutions are also used in retail, security and tolling applications.
Supported features:
• Intelligent Transportation: parking enforcement
• Casino game table analytics
GPU scaling: Single GPU, Single Node
Application: Syndex Pro
Company: BriefCam
Description: Improves security and operations by turning video data into useful information. Based on Video Synopsis technology, Syndex Pro allows users to review hours of video in minutes, while applying search filters for achieving accurate results and faster time-to-target. Data can be processed on-demand or in real time to support a wide range of use cases.
Supported features:
• Review hours of video in minutes
• Search in Video
GPU scaling: Single GPU, Single Node
Application: Telemonitoring
Company: Artisight, Inc.
Description: Artisight's Telemonitoring solution uses a constellation of thousands of intelligent pan, tilt, and zoom cameras with two-way audio to allow for the simultaneous monitoring of multiple patients from a single workstation. It provides constant visual and verbal contact with patients while reducing personal protective equipment consumption, as well as front-line workers' exposure to the virus.
Supported features:
• Monitor up to 12 patients per screen, with 6 screens per station
• High definition 1080p video
• 2-way audio with push-to-talk functionality
• Intuitive on-screen controls for responsive pan, tilt, and zoom
• Privacy screen for patient and staff autonomy
GPU scaling: Multi-GPU, Multi-Node



Application: Telesitting
Company: Artisight, Inc.
Description: With Artisight's Intelligent Telesitting solution, your hospital can provide safe, accurate remote patient monitoring around the clock. Intelligent Telesitting allows a single staff member to remotely monitor multiple patients simultaneously, providing better oversight of each patient. Not only does this dramatically decrease staffing costs, it also provides more comprehensive information in real time to help avoid costly falls.
Supported features:
• Machine learning algorithms that prevent falls and pressure ulcers
• Automated bed capacity management and throughput coordination
• Multiple video feeds on one screen, and multiple tabs per browser
• Bi-directional audio with HD pan-tilt-zoom cameras
• System available in mobile or fixed ceiling versions
GPU scaling: Multi-GPU, Multi-Node
Application: Tera, Tera+, Tera Vortex
Company: SmartCow
Description: Embedded and backend video analytics for real-time insights from your security and service-related monitoring systems.
Supported features:
• Automatic number plate recognition
• Traffic Management
• Smart Car Parking Policy
• Accident Detection
GPU scaling: Multi-GPU, Single Node
Application: Thermal Screening
Company: Artisight, Inc.
Description: Thermal imaging eliminates the obstacles associated with manual screening and maintains the safety of your screening staff. Our thermal imaging camera can screen thousands of people every hour, and its flexible viewing options mean you'll spend less on staffing. It's easy to configure, requires minimal training for operation and is accurate to within +/-0.3 degrees Celsius.
Supported features:
• Dynamic temperature adjustment based on ambient humidity and temperature
• Intuitive multi-touch and slider-based interface
• Machine learning algorithms
• Wi-Fi access gateway processes and broadcasts
• Encrypted video feeds for enhanced stability, security, and privacy
• Bluetooth integration for fully autonomous screening
GPU scaling: Multi-GPU, Multi-Node
Application: XRVision, IoP
Company: XRVision
Description: Face Recognition and Video Analytics for Uncontrolled, Crowded and In Motion Environments.
Supported features:
• Face Recognition and Video Analytics
• Smart City, Public Safety, Transportation Analytics, Retail Analytics, Ordinance and Environment Safety
GPU scaling: Multi-GPU, Single Node

Tools and Management


APPLICATION NAME | COMPANY NAME | PRODUCT DESCRIPTION | SUPPORTED FEATURES | GPU SCALING

Application: Acrobat
Company: Adobe
Description: Apps & web services to view, create, manipulate, print and manage files in PDF (Portable Document Format).
Supported features:
• AI inference & training in the cloud
GPU scaling: Single GPU, Single Node
Application: Altair Access
Company: Altair
Description: A simple, powerful, and consistent portal for submitting and monitoring jobs on remote clusters and clouds, and for remote visualization. Brings high-end 3D visualization datacenter hardware right to the user.
Supported features:
• 3D Remote Visualization
• High-fidelity collaboration
• Integrated with Altair PBS Professional for scheduling and control of GPU use and accounting
GPU scaling: Multi-GPU, Multi-Node
Application: Altair Grid Engine
Company: Univa
Description: Altair Grid Engine is a leading distributed resource management system for optimizing workloads and resources in thousands of data centers. Improves performance, productivity and efficiency. Optimizes throughput and performance of applications, containers, and services while maximizing shared compute resources across on-premises, hybrid, and cloud infrastructures.
Supported features:
• NVIDIA CUDA
• OpenACC
• OpenCL plus MPI hybrid apps
• Optimizes scheduling with resource-mapped GPUs
• Manages GPU apps within or without Docker containers
• Obtain visibility with CUDA-specific metrics for GPU monitors and reports
• Extend on-premises deployments to incorporate cloud-based GPU instances
GPU scaling: Multi-GPU, Multi-Node
Application: Altair PBS Professional
Company: Altair
Description: PBS Professional is a fast, powerful workload manager designed to improve productivity, optimize utilization and efficiency, and simplify administration for clusters, clouds, and supercomputers. It supports everything from the biggest HPC workloads to millions of small, high-throughput jobs. PBS Professional automates job scheduling, management, monitoring, and reporting, and it's the trusted solution for complex Top500 systems as well as smaller clusters.
Supported features:
• GPU auto discovery
• Specify GPU count per CPU
• Specify GPU type
• GPU/CPU affinity
• GPU awareness and equality in accounting, quotas, and fair share
• GPU/CPU syntax/scheduling equivalence
• Specify memory use per GPU
• Add-on/integration project
• NVIDIA Data Center GPU Management (DCGM)
• Open source and commercial versions
GPU scaling: Multi-GPU, Multi-Node



Application: Arm Forge (formerly Allinea)
Company: Arm
Description: Build reliable and optimized code for the right results on multiple server and HPC architectures, with support for the latest compilers and C++11 standards as well as NVIDIA GPU hardware. Arm Forge combines Arm DDT, the leading debugger for time-saving high performance application debugging, Arm MAP, the trusted performance profiler for invaluable optimization advice across native and Python HPC codes, and Arm Performance Reports for advanced reporting capabilities. Arm Forge Professional (DDT & MAP) provides all you need to debug, profile and optimize for high performance, from single threads through to complex parallel HPC and scientific codes with MPI, OpenACC, OpenMP, threads or NVIDIA CUDA applications.
Supported features:
• Cross Platform: Moving to a new architecture or system is challenging enough without having to learn a new tool chain at the same time. Arm DDT, MAP and Performance Reports run everywhere: on your own laptop, the latest supercomputer, and tomorrow's upcoming architectures
• Automatically detect memory bugs, profile behavior and see advanced performance metrics at all scales on Arm 64-bit, Intel Xeon, Intel Xeon Phi, NVIDIA GPUs, and OpenPOWER
• Fast Debug: Arm DDT is the debugger of choice for developing C++, C or Fortran parallel and threaded applications on CPUs, GPUs and Intel Xeon Phi
• Its powerful intuitive graphical interface helps you easily detect memory bugs and divergent behavior at all scales, making Arm DDT the number one debugger in research, industry and academia
• Low-overhead Profiling: Profile your code without distorting application behavior. Arm MAP is Arm Forge's scalable low-overhead profiler of C++, C, Fortran and Python with no instrumentation or code changes required. It helps developers accelerate their code by revealing the causes of slow performance
• From multicore Linux workstations to the largest supercomputers, you can profile realistic test cases with typically less than 5% runtime overhead
• Short Learning Curve: Arm DDT offers a powerful intuitive GUI that sets the standard for multi-process and multi-threaded debugging
• Complex software debugging is made simple whether you're working on a PC or offline, with the help of zero-click variable comparisons, built-in memory debugging, and powerful array visualizations for today's increasingly parallel processors, clusters, and supercomputers
• Wide Issue Coverage: Arm MAP exposes a wide set of performance indicators, including MPI metrics, PAPI counters, IO metrics, energy metrics and even your own custom metrics
• Profile computation (with self and child and call tree representations over time), thread activity (to identify over-subscribed cores and sleeping threads that waste CPU time for OpenMP and pthreads), instruction types, as well as synchronization and I/O performance
• Single and Multi Threaded Profiling: Arm MAP profiles parallel, multithreaded, and single threaded C, C++, Fortran, F90 and Python codes, providing in-depth analysis and bottleneck pinpointing to the source line
• Unlike most profilers, it can profile pthreads, OpenMP or MPI for parallel and threaded code, including communication and workload imbalance issues for MPI and multi-process codes
GPU scaling: Multi-GPU, Multi-Node
Application: Artec Leo
Company: Artec 3D
Description: A smart 3D scanner that enables you to see your object projected in 3D directly on the HD display.
Supported features:
• JetPack
• TX2
GPU scaling: Single GPU, Single Node



Application: Bright Cluster Manager
Company: Bright Computing
Description: Bright Cluster Manager lets you administer clusters as a single entity, provisioning the servers, GPUs, operating system, and workload manager from a unified interface.
Supported features:
• Powerful Cluster Management Shell (CMSH)
• NVIDIA libraries, CUDA, OpenCL, OpenACC, CUDA-aware libraries, NCCL, and CUB
• Linux distributions: RHEL and derivatives, SUSE SLES and Ubuntu LTS
• GPU-enabled Kubernetes and Singularity for running containers
GPU scaling: Multi-GPU, Multi-Node
Application: CMake
Company: Kitware
Description: CMake is a cross-platform build tool for controlling the software compilation process using simple platform- and compiler-independent configuration files. Generates native makefiles and workspaces that can be used in the compiler of choice. Integrates with CDash to provide a comprehensive suite of tools.
Supported features:
• Color output for make
• Progress output for make
• Incremental linking support with Visual Studio 8 and 9 and manifests
• Supports out-of-tree builds
• Auto-rerun of cmake if any cmake input files change (works with Visual Studio 8 and 9 using IDE macros)
• Auto depend information for C++, C, and Fortran
GPU scaling: Multi-GPU, Multi-Node
Application: ELPA
Company: Max Planck Institute
Description: The publicly available ELPA library provides highly efficient and highly scalable direct eigensolvers for symmetric matrices. Though designed especially for PetaFlop/s applications solving large problem sizes on massively parallel supercomputers, ELPA eigensolvers have also proven to be very efficient for smaller matrices.
Supported features:
• Improved one-step ScaLAPACK-type solver ELPA1
• Novel two-step solver ELPA2
GPU scaling: Multi-GPU, Multi-Node
Application: HPCToolkit
Company: Rice University
Description: HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to the nation's largest supercomputers. Provides support for analyzing a program's execution cost, inefficiency, and scaling characteristics both within and across nodes of a parallel system.
Supported features:
• Coarse-grain mode: collect multiple metrics in a single run
• GPU kernel metrics
• Synchronization metrics
• Memory copy metrics
• Memory allocation metrics
• Less than 2× overhead
• Fine-grain mode: collect GPU PC samples
• 8 PC sampling shortcomings
• Introduces up to 20× overhead
• Serialized GPU kernel executions
GPU scaling: Multi-GPU, Multi-Node
Application: IBM Spectrum LSF
Company: IBM Corporation
Description: A comprehensive workload management solution that simplifies HPC with an enhanced user and administrator experience, reliability and performance at scale. Great for big data, cognitive, GPU machine learning and containerized workloads.
Supported features:
• Enforcement of GPU allocations via cgroups
• Exclusive allocation and round robin shared mode allocation
• CPU-GPU affinity
• Boost control
• Power management
• Multi-Process Service (MPS) support
• NVIDIA Volta and DCGM support
GPU scaling: Multi-GPU, Multi-Node
Application: MAGMA
Company: ICL - University of Tennessee Knoxville
Description: MAGMA provides a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current "Multicore+GPU" systems.
Supported features:
• Linear system solvers
• Eigenvalue problem solvers
• Auxiliary BLAS
• Batched LA
• Sparse LA
• CPU/GPU Interface
• Multiple precision support
• Non-GPU-resident factorizations
• Multicore and multi-GPU support
• MAGMA Analytics/DNN
• LAPACK testing
• Linux
• Windows
• Mac OS
• Support for NVIDIA A100, V100, T4, P100 GPUs
GPU scaling: Multi-GPU, Single Node



Application: PAPI
Company: ICL - University of Tennessee Knoxville
Description: PAPI provides the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events.
Supported features:
• Standard API on most modern microprocessors
• Small set of registers that count Events
• Events-monitoring
• Correlation between source/object code and underlying architecture
• Please refer to the PAPI News page for the latest on GPU support: https://icl.utk.edu/papi/news/index.html
GPU scaling: Multi-GPU, Multi-Node
Application: Parallware Trainer
Company: Appentra Solutions
Description: Parallware Trainer is an interactive, real-time code editor with features that facilitate the learning, usage, and implementation of parallel programming by understanding how and why sections of code can be parallelized. Users are actively involved in learning parallel programming through observation, comparison, and hands-on experimentation. Parallware Trainer provides support for widely used parallel programming strategies using OpenMP and OpenACC with execution on multicore processors and GPUs.
Supported features:
• Interactive, real-time editor GUI that shows you how and where to implement parallelism.
• Assists in the parallelization of code using OpenMP and OpenACC.
• Transparent local/remote execution and benchmarking.
• Support for the C programming language. Full Fortran support coming soon.
• Detailed report of opportunities for parallelism discovered in your code.
• Support for multiple compilers including GCC, Intel and PGI.
• Benefits:
• Faster, more effective learning.
• Reduced learning curve.
• All-in-one learning tool for parallel programming.
• Immediate use of parallel programming.
• Support for multicore processors and GPUs.
GPU scaling: N/A
Application: SLURM
Company: SchedMD
Description: Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
Supported features:
• GPU support
• GPGPUs
• Military grade security
• Heterogeneous platform
• Flexible plugin framework
GPU scaling: Multi-GPU, Multi-Node
Application: STRIVR
Company: StriVR
Description: STRIVR offers an end-to-end Immersive Learning platform that revolutionizes the way people and businesses train, learn, and perform.
Supported features:
• VRWorks 360 Video
GPU scaling: Single GPU, Single Node



Application name: TAU - Tuning and Analysis Utilities
Company name: University of Oregon
Product description: TAU Performance System is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, and Python. TAU (Tuning and Analysis Utilities) is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements as well as event-based sampling. All C++ language features are supported including templates and namespaces. The API also provides selection of profiling groups for organizing and controlling instrumentation. The instrumentation can be inserted in the source code using an automatic instrumentor tool based on the Program Database Toolkit (PDT), dynamically using DyninstAPI, at runtime in the Java Virtual Machine, or manually using the instrumentation API.
TAU's profile visualization tool, paraprof, provides graphical displays of all the performance analysis results, in aggregate and single node/context/thread forms. The user can quickly identify sources of performance bottlenecks in the application using the graphical interface. In addition, TAU can generate event traces that can be displayed with the Vampir, Paraver or JumpShot trace visualization tools.
Supported features:
• Instrumentation
• PerfDMF
• Paraprof
• Load Profiles
• Metric Window
• Thread Windows
• Communication Matrix
• 3D Visualization
• Derived Metrics
• Selective Instrumentation
• PerfExplorer
• Cluster Analysis
• Correlation Analysis
• Scalability Chart
• Preset Charts
• Custom Charts
• Visualizations
• Eclipse Introduction
• Selective Instrumentation
• Instrumenting Java
• Configuration Manager
GPU scaling: Multi-GPU, Multi-Node
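The TAU entry describes two ideas that a short example can make concrete: instrumenting functions to gather performance data, and selecting profiling groups to control which instrumentation is active. The sketch below illustrates those ideas in plain Python. It is not TAU's API; every name in it (`instrument`, `enable_groups`, `reset_profile`, `profile`, and the two sample functions) is hypothetical and chosen only for illustration.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical illustration of function-level instrumentation with
# profiling groups. None of these names come from TAU.
_enabled_groups = {"compute", "io"}
profile = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def enable_groups(groups):
    """Replace the set of profiling groups that are currently recorded."""
    _enabled_groups.clear()
    _enabled_groups.update(groups)


def reset_profile():
    """Discard all measurements collected so far."""
    profile.clear()


def instrument(group):
    """Count and time each call of the wrapped function if its group is enabled."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if group not in _enabled_groups:
                # Group disabled: run the function without measuring it.
                return func(*args, **kwargs)
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                entry = profile[func.__name__]
                entry["calls"] += 1
                entry["seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@instrument("compute")
def square_sum(n):
    """Sample workload: sum of squares below n."""
    return sum(i * i for i in range(n))


@instrument("io")
def format_result(value):
    """Sample workload: format a value for display."""
    return f"result={value}"
```

Calling `enable_groups({"compute"})` before running the program would record calls to `square_sum` and leave `format_result` unmeasured, which mirrors the kind of selective instrumentation the entry lists as a feature. A real profiler would also track threads, nodes, and call paths, which this sketch leaves out.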
Application name: Torque / Moab
Company name: Adaptive Computing
Product description: Moab HPC Suite is a workload and resource orchestration platform that automates the scheduling, managing, monitoring, and reporting of HPC workloads at massive scale. TORQUE provides control over batch jobs and distributed computing resources. It is an advanced open-source product based on the original PBS project and incorporates the best of both community and professional development.
Supported features:
• Requests and schedules GPUs based on GPU location in NUMA systems
• Collects and reports metrics and status information
• Sets GPU mode at job run time
GPU scaling: Multi-GPU, Multi-Node
Application name: TotalView
Company name: Perforce
Product description: TotalView is the leading dynamic analysis and debugging tool designed to handle complex CPU and GPU based multi-threaded, multi-process and multi-node cluster applications. TotalView supports the latest CUDA SDKs, NVIDIA GPU hardware, Linux x86-64, Arm64, and OpenPower platforms and applications utilizing MPI and OpenMP technologies.
Supported features:
• OpenACC directives
• CUDA running directly on the latest NVIDIA GPUs
• Linux and GPU device thread visibility
• CUDA function calls, host pinned memory regions and CUDA contexts
• Handling CUDA functions inline and on the stack
• Command line interface (CLI) commands for CUDA functions
• MPI applications on CUDA-accelerated clusters
GPU scaling: Multi-GPU, Multi-Node
Application name: Vampir
Company name: TU Dresden
Product description: Easy-to-use framework that enables developers to quickly display and analyze arbitrary program behavior at any level of detail. The tool suite implements optimized event analysis algorithms and customizable displays that enable fast and interactive rendering of very complex performance monitoring data.
Supported features:
• NVIDIA CUDA
• CUPTI
• CUDA libraries
GPU scaling: Multi-GPU, Multi-Node



Agriculture
APPLICATION NAME   COMPANY NAME   PRODUCT DESCRIPTION   SUPPORTED FEATURES   GPU SCALING

Application name: Taranis
Company name: Taranis
Product description: Taranis provides a platform for discovering various crop health issues, helping farmers take care of both land and crops and making sure they get the best of their yield.
Supported features:
• Report plant population to farmers
• Detect when a weed emerges in the field and constitutes a potential threat
• Calculate amounts of nutrients in vegetation, water content in the soil, plant temperature
• Identify and categorize the top relevant diseases for prevalent crops
GPU scaling: Multi-GPU, Multi-Node

Business Process Optimization


APPLICATION NAME   COMPANY NAME   PRODUCT DESCRIPTION   SUPPORTED FEATURES   GPU SCALING

Application name: Automated checkout
Company name: Focal Systems
Product description: Focal's Product Recognition eliminates barcode scanning entirely at the cashier and achieves 99% accuracy on thousands of products.
Supported features:
• cuDNN
• TensorRT
GPU scaling: Multi-GPU, Single Node
Application name: DataX.AI
Company name: CrowdANALYTIX
Product description: Cloud-based, crowd-sourced analytics services that create an online retail product catalog, onboarding SKUs in minutes instead of through the manual process of tagging, providing product info and removing the human error involved.
Supported features:
• cuDNN
GPU scaling: Single GPU, Single Node
Application name: Helix
Company name: Maxerience
Product description: CPG product training platform: creates digital copies of products right at the production line in a matter of minutes, and creates an AI model in less than 30 minutes!
Supported features:
• TensorRT
GPU scaling: Single GPU, Single Node
Application name: Part Finder Kiosk
Company name: Slyce
Product description: A visual search and image recognition solution for retailers and brands.
Supported features:
• Scan an item in real time and direct the customer to the item's location in store
• Find a replacement or additional info
• Feature Jetpack
GPU scaling: Single GPU, Single Node
Application name: Peak Trading Out Of Stock
Company name: BeMyEye
Product description: Out of Stock (OOS) and Almost OOS (AOOS) crowdsourcing solutions for retailers.
Supported features:
• Product recognition on the cloud
GPU scaling: Single GPU, Single Node
Application name: Perfect Shelf
Company name: BeMyEye
Product description: Track Hypermarkets, Supermarkets, Discounters, Managed Convenience and Chemists, using a unique blend of IR technologies and crowdsourcing, to provide you with on-shelf sales fundamental data across an entire category.
Supported features:
• Real time inferencing on the cloud
• SKU recognition
GPU scaling: Single GPU, Single Node
Application name: Predictive Pricing
Company name: Evo Pricing
Product description: Market-driven optimal prices based on demand, competition, product features and customer feedback.
Supported features:
• GPU on the cloud
GPU scaling: Multi-GPU, Single Node
Application name: Third Wave Automation
Company name: Third Wave Automation
Product description: Automation, cloud robotics, and machine learning technology for material-handling forklift automation in a warehouse.
Supported features:
• GeForce 2080 Ti
GPU scaling: Single GPU, Single Node

For more information on GPU-accelerated applications, please visit www.nvidia.com/teslaapps



© 2021 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, and CUDA, are trademarks and/or registered trademarks of
NVIDIA Corporation. All company and product names are trademarks or registered trademarks of the respective owners with which
they are associated. Features, pricing, availability, and specifications are all subject to change without notice. Apr21