Intel OpenMP Webinar
Intel® Corporation
Optimization Notice
Copyright © 2018, Intel Corporation. All rights reserved. 2
*Other names and brands may be claimed as the property of others.
Task-based parallelism
Advantages of task-based parallelism
• Makes parallelization efficient for irregular and runtime-dependent
execution
• Promotes higher level thinking
• Improves load balancing
Tasks with dependencies
• Fall into two categories: explicit and implicit
• Extend the expressiveness of task-based parallel programming
• Reduce the need for global synchronization mechanisms such as task barriers
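As a minimal sketch of how task dependencies can stand in for a global barrier, the following uses OpenMP* depend clauses, assuming a compiler with OpenMP 4.0 or later support. The function name run_pipeline and the variables a, b, and c are illustrative and do not come from the webinar. If OpenMP is not enabled at compile time, the pragmas are ignored and the code runs serially with the same result.

```cpp
#include <cassert>

// Hypothetical example: three tasks where the third consumes the
// results of the first two. The depend clauses express the ordering,
// so no taskwait or barrier is needed between producers and consumer.
int run_pipeline() {
    int a = 0, b = 0, c = 0;

    #pragma omp parallel
    #pragma omp single
    {
        // Producer tasks: independent of each other, may run concurrently.
        #pragma omp task shared(a) depend(out: a)
        a = 1;

        #pragma omp task shared(b) depend(out: b)
        b = 2;

        // Consumer task: the runtime starts it only after both producers finish.
        #pragma omp task shared(a, b, c) depend(in: a, b) depend(out: c)
        c = a + b;
    }   // implicit barrier at the end of the region: all tasks are complete here

    assert(c == 3);  // sanity check on the dependence ordering
    return c;
}
```

The only synchronization between the three tasks is the dependence on a and b; the two producers are free to run in either order or at the same time.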
Applications often contain multiple levels of parallelism
[Figure: levels of parallelism in an application. Labels: "Task Parallelism/Message Passing" and "fork-join"; two levels are marked "Visible in FGA".]
Asynchronous task graphs (implicit vs. explicit)
[Figure: side-by-side panels labeled OpenMP* and Threading Building Blocks (TBB)]
Challenges with asynchronous task graphs
Intel® Advisor – Flow Graph Analyzer
Toolbar: supports basic file and editing operations, visualization, and analytics that operate on the graph or performance traces
Workflows and UI features
Workflows: Create, Debug, Visualize and Analyze
Design mode
• Allows you to create a graph topology interactively
• Validate the graph and explore what-if scenarios
• Add C/C++ code to the node body
• Export C++ code using Threading Building Blocks (TBB) flow graph API
Analysis mode
• Compile your application (with tracing enabled)
• Capture execution traces during the application run
• Visualize/analyze in Flow Graph Analyzer
Creating Asynchronous Task-graphs
Intel® Advisor – Flow Graph Analyzer (Design mode)
Graph Creation
Interactive Canvas
Code Generation
Intel® Advisor – Flow Graph Analyzer (Design mode)
Serialization
GraphML* file format – uses extensions
C/C++ code generated from the graph
Intel® Advisor – Flow Graph Analyzer (Design mode)
Compiling and collecting traces
PATH must be updated so that fgtrun.bat and fgt2xml.exe can be run from the command line
>cl hello_world.cpp /O2 /DTBB_USE_THREADING_TOOLS ... /link tbb.lib /OUT:hello_world.exe
>set FGT_ROOT=<installation-directory>\fga\fgt
>set INTEL_LIBITTNOTIFY64=<installation-directory>\fga\fgt\windows\bin\intel64\<vc-version>\fgt.dll
>hello_world.exe
Understanding Graph Execution
Examining the trace data: what’s possible?
The “hello” node appears in all views, each of which
represents different information.
Examining the trace data: correlation
The “hello” node appears in all views, each of which
represents different information.
Examining the trace data through Trace Playback
Playback of execution traces to
see how data is flowing through
the graph.
Examining the trace data: node view
Node view captures all execution
traces for a given node and
presents them in a single swim lane
for that node.
Examining the trace data with data analysis
How do we know which instance
of the Hello task is in response to
which input message?
Examining the trace data with data analysis, cont.
It is harder to track data in dependency graphs because the data ID cannot be
propagated from one node to the next
• continue_node requires an input of type continue_msg
    continue_node<continue_msg> hello( hello_world_g0, []( const continue_msg & ) {
        cout << "Hello ";
    } );
We are going to convert the Hello World example to use function_node instead
so that we can send the ID from one node to the next
Examining the trace data with data analysis, cont.
An experimental data-tracking
feature lets you see which
task instance corresponds to
which input.
Understanding the performance
A simulation example
Example: performance analysis
A complex graph was created
programmatically.
Example: identifying problem areas
What was run and how much was
run?
Example: identifying problem areas, cont.
Clicking on the node takes you to
the node in the graph
visualization
Example: critical path
Analysis features
1. Critical Path
2. Rule-check
Critical Path
Critical path analysis reduces the complexity of large graphs by isolating a small set of nodes to analyze and tune for performance
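To make the idea concrete, here is a minimal sketch that computes a critical path as the longest weighted path through a directed acyclic graph. It illustrates the general technique, not Flow Graph Analyzer's implementation; the Graph and CriticalPath types, the critical_path function, and the assumption that nodes are numbered in topological order are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical graph representation: node i has execution time weight[i]
// and edges to each node in succ[i]. Nodes are assumed to be numbered in
// topological order (every edge goes from a lower to a higher index).
struct Graph {
    std::vector<double> weight;
    std::vector<std::vector<std::size_t>> succ;
};

struct CriticalPath {
    double length;                   // total time along the path
    std::vector<std::size_t> nodes;  // node indices from source to sink
};

CriticalPath critical_path(const Graph& g) {
    const std::size_t n = g.weight.size();
    assert(g.succ.size() == n);
    if (n == 0) return {0.0, {}};

    // finish[i] = longest time from any source to the end of node i.
    std::vector<double> finish(g.weight);
    std::vector<std::size_t> pred(n, n);  // n means "no predecessor"

    for (std::size_t u = 0; u < n; ++u) {
        for (std::size_t v : g.succ[u]) {
            assert(v > u && v < n);  // topological numbering assumption
            if (finish[u] + g.weight[v] > finish[v]) {
                finish[v] = finish[u] + g.weight[v];
                pred[v] = u;
            }
        }
    }

    // The critical path ends at the node with the latest finish time.
    const std::size_t end = static_cast<std::size_t>(
        std::max_element(finish.begin(), finish.end()) - finish.begin());

    // Walk the predecessor links back to the source, then reverse.
    CriticalPath result{finish[end], {}};
    for (std::size_t v = end; v != n; v = pred[v]) result.nodes.push_back(v);
    std::reverse(result.nodes.begin(), result.nodes.end());
    return result;
}
```

For a diamond-shaped graph with node times 1, 5, 2, 1 and edges 0→1, 0→2, 1→3, 2→3, this returns the path 0, 1, 3 with length 7. Shortening node 2 would not reduce total run time, which is why tuning effort is best spent on the nodes the path isolates.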
What else can we look at?
Example: performance analysis
Analysis features
1. Critical Path
2. Rule-check
Rule check
What does it look like in FGA?
Applications often contain multiple levels of parallelism
[Figure: levels of parallelism in an application. Label: "Task Parallelism/Message Passing"; two levels are marked "Visible in FGA".]
Fork-join parallelism: tbb::parallel_for
Flow Graph Analyzer captures the execution task graph
for a fork-join construct and provides additional
analytics about the construct:
1. Imbalance
2. Efficiency
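The slides do not define how Flow Graph Analyzer computes these two metrics. As a rough sketch under commonly used definitions (an assumption, not the tool's formulas), imbalance can be expressed as the busiest thread's time divided by the mean thread time, and efficiency as the mean thread time divided by the maximum. The ForkJoinStats type and the fork_join_stats function are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct ForkJoinStats {
    double imbalance;   // max / mean busy time; 1.0 means perfectly balanced
    double efficiency;  // mean / max busy time; 1.0 means no thread sat idle
};

// busy_time[i] is the time thread i spent executing loop iterations
// within one fork-join region (for example, one parallel_for call).
// Times are assumed to be non-negative.
ForkJoinStats fork_join_stats(const std::vector<double>& busy_time) {
    assert(!busy_time.empty());
    const double max_time =
        *std::max_element(busy_time.begin(), busy_time.end());
    const double mean_time =
        std::accumulate(busy_time.begin(), busy_time.end(), 0.0) /
        static_cast<double>(busy_time.size());
    if (max_time == 0.0) return {1.0, 1.0};  // no work recorded: treat as balanced
    return {max_time / mean_time, mean_time / max_time};
}
```

With per-thread busy times of 4, 1, 1, and 2, the mean is 2 and the maximum is 4, so this sketch reports an imbalance of 2.0 and an efficiency of 0.5: the region takes as long as its slowest thread while the others sit idle for half of it on average.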
Multi-level parallelism: graph level + fork-join
Multi-level parallelism in OpenMP*
Top-level shows just one entity, which is a parallel region in this OpenMP* example
Double-click here on the parallel region node to see the activity within the region
Download through Intel® Advisor package
Intel® Advisor – Flow Graph Analyzer
https://software.intel.com/en-us/articles/getting-started-with-flow-graph-analyzer
Summary
Asynchronous task graphs improve the efficiency of irregular and runtime-dependent
execution
• TBB and OpenMP* provide mechanisms to program in this manner
Flow Graph Analyzer helps you create, debug, visualize, and analyze such
graphs
• Critical path analysis is crucial in reducing the complexity of the analysis
problem to a handful of nodes
• Runtime-specific analyses, such as the lightweight policy analysis for TBB,
target additional performance improvements
Resources
CPUs, GPUs, FPGAs: Managing the alphabet soup with Intel Threading
Building Blocks
https://software.intel.com/en-us/videos/cpus-gpus-fpgas-managing-the-alphabet-soup-with-intel-threading-building-blocks
Legal Disclaimer & Optimization Notice
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance
tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any
change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully
evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete
information visit www.intel.com/benchmarks.
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY
INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS
ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS
FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY
RIGHT.
Copyright © 2018, Intel Corporation. All rights reserved. Intel, the Intel logo, Pentium, Xeon, Core, VTune, OpenVINO, Cilk, are trademarks of
Intel Corporation or its subsidiaries in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel
microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the
availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent
optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture
are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the
specific instruction sets covered by this notice.
Notice revision #20110804