OpenMP Application Programming Interface
Version 5.1
Copyright © 1997-2020 OpenMP Architecture Review Board.
Permission to copy without fee all or part of this material is granted, provided the OpenMP
Architecture Review Board copyright notice and the title of this document appear. Notice is
given that copying is by permission of the OpenMP Architecture Review Board.
Contents
2 Directives 37
2.1 Directive Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.1.1 Fixed Source Form Directives . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.1.2 Free Source Form Directives . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.1.3 Stand-Alone Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.1.4 Array Shaping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.1.5 Array Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.1.6 Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.2 Conditional Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.2.1 Fixed Source Form Conditional Compilation Sentinels . . . . . . . . . . . . 52
2.2.2 Free Source Form Conditional Compilation Sentinel . . . . . . . . . . . . . 53
2.3 Variant Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.3.1 OpenMP Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.3.2 Context Selectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.3.3 Matching and Scoring Context Selectors . . . . . . . . . . . . . . . . . . . 59
2.3.4 Metadirectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.3.5 Declare Variant Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.3.6 dispatch Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
2.4 Internal Control Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.4.1 ICV Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2.4.2 ICV Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
2.4.3 Modifying and Retrieving ICV Values . . . . . . . . . . . . . . . . . . . . 77
2.4.4 How ICVs are Scoped . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.4.4.1 How the Per-Data Environment ICVs Work . . . . . . . . . . . . . . . 81
2.4.5 ICV Override Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.5 Informational and Utility Directives . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.5.1 requires Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.5.2 Assume Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
2.5.3 nothing Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
2.5.4 error Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
2.6 parallel Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
2.6.1 Determining the Number of Threads for a parallel Region . . . . . . . . 96
2.6.2 Controlling OpenMP Thread Affinity . . . . . . . . . . . . . . . . . . . . . 98
2.12.6 Task Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
2.13 Memory Management Directives . . . . . . . . . . . . . . . . . . . . . . . . . . 177
2.13.1 Memory Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
2.13.2 Memory Allocators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
2.13.3 allocate Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
2.13.4 allocate Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
2.14 Device Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
2.14.1 Device Initialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
2.14.2 target data Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
2.14.3 target enter data Construct . . . . . . . . . . . . . . . . . . . . . . . 191
2.14.4 target exit data Construct . . . . . . . . . . . . . . . . . . . . . . . 193
2.14.5 target Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
2.14.6 target update Construct . . . . . . . . . . . . . . . . . . . . . . . . . 205
2.14.7 Declare Target Directive . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
2.15 Interoperability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
2.15.1 interop Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
2.15.2 Interoperability Requirement Set . . . . . . . . . . . . . . . . . . . . . . . 220
2.16 Combined Constructs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
2.16.1 Parallel Worksharing-Loop Construct . . . . . . . . . . . . . . . . . . . . . 221
2.16.2 parallel loop Construct . . . . . . . . . . . . . . . . . . . . . . . . . 222
2.16.3 parallel sections Construct . . . . . . . . . . . . . . . . . . . . . . 223
2.16.4 parallel workshare Construct . . . . . . . . . . . . . . . . . . . . . 224
2.16.5 Parallel Worksharing-Loop SIMD Construct . . . . . . . . . . . . . . . . . 225
2.16.6 parallel masked Construct . . . . . . . . . . . . . . . . . . . . . . . . 226
2.16.7 masked taskloop Construct . . . . . . . . . . . . . . . . . . . . . . . . 228
2.16.8 masked taskloop simd Construct . . . . . . . . . . . . . . . . . . . . 229
2.16.9 parallel masked taskloop Construct . . . . . . . . . . . . . . . . . 230
2.16.10 parallel masked taskloop simd Construct . . . . . . . . . . . . . . 231
2.16.11 teams distribute Construct . . . . . . . . . . . . . . . . . . . . . . . 233
2.16.12 teams distribute simd Construct . . . . . . . . . . . . . . . . . . . 234
2.16.13 Teams Distribute Parallel Worksharing-Loop Construct . . . . . . . . . . . 235
2.16.14 Teams Distribute Parallel Worksharing-Loop SIMD Construct . . . . . . . . 236
2.16.15 teams loop Construct . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
2.21 Data Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
2.21.1 Data-Sharing Attribute Rules . . . . . . . . . . . . . . . . . . . . . . . . . 302
2.21.1.1 Variables Referenced in a Construct . . . . . . . . . . . . . . . . . . . 302
2.21.1.2 Variables Referenced in a Region but not in a Construct . . . . . . . . . 306
2.21.2 threadprivate Directive . . . . . . . . . . . . . . . . . . . . . . . . . 307
2.21.3 List Item Privatization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
2.21.4 Data-Sharing Attribute Clauses . . . . . . . . . . . . . . . . . . . . . . . . 315
2.21.4.1 default Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
2.21.4.2 shared Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
2.21.4.3 private Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
2.21.4.4 firstprivate Clause . . . . . . . . . . . . . . . . . . . . . . . . . 318
2.21.4.5 lastprivate Clause . . . . . . . . . . . . . . . . . . . . . . . . . . 321
2.21.4.6 linear Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
2.21.5 Reduction Clauses and Directives . . . . . . . . . . . . . . . . . . . . . . . 325
2.21.5.1 Properties Common to All Reduction Clauses . . . . . . . . . . . . . . 326
2.21.5.2 Reduction Scoping Clauses . . . . . . . . . . . . . . . . . . . . . . . . 331
2.21.5.3 Reduction Participating Clauses . . . . . . . . . . . . . . . . . . . . . 332
2.21.5.4 reduction Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
2.21.5.5 task_reduction Clause . . . . . . . . . . . . . . . . . . . . . . . 335
2.21.5.6 in_reduction Clause . . . . . . . . . . . . . . . . . . . . . . . . . 335
2.21.5.7 declare reduction Directive . . . . . . . . . . . . . . . . . . . . 336
2.21.6 Data Copying Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
2.21.6.1 copyin Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
2.21.6.2 copyprivate Clause . . . . . . . . . . . . . . . . . . . . . . . . . . 343
2.21.7 Data-Mapping Attribute Rules, Clauses, and Directives . . . . . . . . . . . 345
2.21.7.1 map Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
2.21.7.2 Pointer Initialization for Device Data Environments . . . . . . . . . . . 356
2.21.7.3 defaultmap Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
2.21.7.4 declare mapper Directive . . . . . . . . . . . . . . . . . . . . . . . 358
2.22 Nesting of Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
3.4 Teams Region Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
3.4.1 omp_get_num_teams . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
3.4.2 omp_get_team_num . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
3.4.3 omp_set_num_teams . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
3.4.4 omp_get_max_teams . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
3.4.5 omp_set_teams_thread_limit . . . . . . . . . . . . . . . . . . . . 400
3.4.6 omp_get_teams_thread_limit . . . . . . . . . . . . . . . . . . . . 401
3.5 Tasking Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
3.5.1 omp_get_max_task_priority . . . . . . . . . . . . . . . . . . . . . 402
3.5.2 omp_in_final . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
3.6 Resource Relinquishing Routines . . . . . . . . . . . . . . . . . . . . . . . . . . 404
3.6.1 omp_pause_resource . . . . . . . . . . . . . . . . . . . . . . . . . . 404
3.6.2 omp_pause_resource_all . . . . . . . . . . . . . . . . . . . . . . . 406
3.7 Device Information Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
3.7.1 omp_get_num_procs . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
3.7.2 omp_set_default_device . . . . . . . . . . . . . . . . . . . . . . . 408
3.7.3 omp_get_default_device . . . . . . . . . . . . . . . . . . . . . . . 408
3.7.4 omp_get_num_devices . . . . . . . . . . . . . . . . . . . . . . . . . . 409
3.7.5 omp_get_device_num . . . . . . . . . . . . . . . . . . . . . . . . . . 410
3.7.6 omp_is_initial_device . . . . . . . . . . . . . . . . . . . . . . . . 411
3.7.7 omp_get_initial_device . . . . . . . . . . . . . . . . . . . . . . . 411
3.8 Device Memory Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
3.8.1 omp_target_alloc . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
3.8.2 omp_target_free . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
3.8.3 omp_target_is_present . . . . . . . . . . . . . . . . . . . . . . . . 416
3.8.4 omp_target_is_accessible . . . . . . . . . . . . . . . . . . . . . . 417
3.8.5 omp_target_memcpy . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
3.8.6 omp_target_memcpy_rect . . . . . . . . . . . . . . . . . . . . . . . 419
3.8.7 omp_target_memcpy_async . . . . . . . . . . . . . . . . . . . . . . 422
3.8.8 omp_target_memcpy_rect_async . . . . . . . . . . . . . . . . . . 424
3.8.9 omp_target_associate_ptr . . . . . . . . . . . . . . . . . . . . . . 426
3.8.10 omp_target_disassociate_ptr . . . . . . . . . . . . . . . . . . . 429
3.8.11 omp_get_mapped_ptr . . . . . . . . . . . . . . . . . . . . . . . . . . 430
4 OMPT Interface 471
4.1 OMPT Interfaces Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
4.2 Activating a First-Party Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
4.2.1 ompt_start_tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
4.2.2 Determining Whether a First-Party Tool Should be Initialized . . . . . . . . 473
4.2.3 Initializing a First-Party Tool . . . . . . . . . . . . . . . . . . . . . . . . . 474
4.2.3.1 Binding Entry Points in the OMPT Callback Interface . . . . . . . . . . 475
4.2.4 Monitoring Activity on the Host with OMPT . . . . . . . . . . . . . . . . . 476
4.2.5 Tracing Activity on Target Devices with OMPT . . . . . . . . . . . . . . . 478
4.3 Finalizing a First-Party Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
4.4 OMPT Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
4.4.1 Tool Initialization and Finalization . . . . . . . . . . . . . . . . . . . . . . 485
4.4.2 Callbacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
4.4.3 Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
4.4.3.1 Record Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
4.4.3.2 Native Record Kind . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
4.4.3.3 Native Record Abstract Type . . . . . . . . . . . . . . . . . . . . . . . 487
4.4.3.4 Record Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
4.4.4 Miscellaneous Type Definitions . . . . . . . . . . . . . . . . . . . . . . . . 489
4.4.4.1 ompt_callback_t . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
4.4.4.2 ompt_set_result_t . . . . . . . . . . . . . . . . . . . . . . . . . 490
4.4.4.3 ompt_id_t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
4.4.4.4 ompt_data_t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
4.4.4.5 ompt_device_t . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
4.4.4.6 ompt_device_time_t . . . . . . . . . . . . . . . . . . . . . . . . 492
4.4.4.7 ompt_buffer_t . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
4.4.4.8 ompt_buffer_cursor_t . . . . . . . . . . . . . . . . . . . . . . . 493
4.4.4.9 ompt_dependence_t . . . . . . . . . . . . . . . . . . . . . . . . . 493
4.4.4.10 ompt_thread_t . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
4.4.4.11 ompt_scope_endpoint_t . . . . . . . . . . . . . . . . . . . . . . 494
4.4.4.12 ompt_dispatch_t . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
4.4.4.13 ompt_sync_region_t . . . . . . . . . . . . . . . . . . . . . . . . 495
4.4.4.14 ompt_target_data_op_t . . . . . . . . . . . . . . . . . . . . . . 496
4.5.2.13 ompt_callback_sync_region_t . . . . . . . . . . . . . . . . . 523
4.5.2.14 ompt_callback_mutex_acquire_t . . . . . . . . . . . . . . . 525
4.5.2.15 ompt_callback_mutex_t . . . . . . . . . . . . . . . . . . . . . . 526
4.5.2.16 ompt_callback_nest_lock_t . . . . . . . . . . . . . . . . . . . 527
4.5.2.17 ompt_callback_flush_t . . . . . . . . . . . . . . . . . . . . . . 528
4.5.2.18 ompt_callback_cancel_t . . . . . . . . . . . . . . . . . . . . . 529
4.5.2.19 ompt_callback_device_initialize_t . . . . . . . . . . . . 530
4.5.2.20 ompt_callback_device_finalize_t . . . . . . . . . . . . . . 531
4.5.2.21 ompt_callback_device_load_t . . . . . . . . . . . . . . . . . 532
4.5.2.22 ompt_callback_device_unload_t . . . . . . . . . . . . . . . 533
4.5.2.23 ompt_callback_buffer_request_t . . . . . . . . . . . . . . . 533
4.5.2.24 ompt_callback_buffer_complete_t . . . . . . . . . . . . . . 534
4.5.2.25 ompt_callback_target_data_op_emi_t and
ompt_callback_target_data_op_t . . . . . . . . . . . . . . . 535
4.5.2.26 ompt_callback_target_emi_t and
ompt_callback_target_t . . . . . . . . . . . . . . . . . . . . . 538
4.5.2.27 ompt_callback_target_map_emi_t and
ompt_callback_target_map_t . . . . . . . . . . . . . . . . . . 540
4.5.2.28 ompt_callback_target_submit_emi_t and
ompt_callback_target_submit_t . . . . . . . . . . . . . . . 542
4.5.2.29 ompt_callback_control_tool_t . . . . . . . . . . . . . . . . 544
4.5.2.30 ompt_callback_error_t . . . . . . . . . . . . . . . . . . . . . . 545
4.6 OMPT Runtime Entry Points for Tools . . . . . . . . . . . . . . . . . . . . . . . 546
4.6.1 Entry Points in the OMPT Callback Interface . . . . . . . . . . . . . . . . . 547
4.6.1.1 ompt_enumerate_states_t . . . . . . . . . . . . . . . . . . . . 547
4.6.1.2 ompt_enumerate_mutex_impls_t . . . . . . . . . . . . . . . . 548
4.6.1.3 ompt_set_callback_t . . . . . . . . . . . . . . . . . . . . . . . 549
4.6.1.4 ompt_get_callback_t . . . . . . . . . . . . . . . . . . . . . . . 550
4.6.1.5 ompt_get_thread_data_t . . . . . . . . . . . . . . . . . . . . . 551
4.6.1.6 ompt_get_num_procs_t . . . . . . . . . . . . . . . . . . . . . . . 552
4.6.1.7 ompt_get_num_places_t . . . . . . . . . . . . . . . . . . . . . . 552
4.6.1.8 ompt_get_place_proc_ids_t . . . . . . . . . . . . . . . . . . . 553
4.6.1.9 ompt_get_place_num_t . . . . . . . . . . . . . . . . . . . . . . . 554
5.3 OMPD Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580
5.3.1 Size Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580
5.3.2 Wait ID Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580
5.3.3 Basic Value Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581
5.3.4 Address Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581
5.3.5 Frame Information Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582
5.3.6 System Device Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . 582
5.3.7 Native Thread Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583
5.3.8 OMPD Handle Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583
5.3.9 OMPD Scope Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 584
5.3.10 ICV ID Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
5.3.11 Tool Context Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
5.3.12 Return Code Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
5.3.13 Primitive Type Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
5.4 OMPD Third-Party Tool Callback Interface . . . . . . . . . . . . . . . . . . . . . 587
5.4.1 Memory Management of OMPD Library . . . . . . . . . . . . . . . . . . . 588
5.4.1.1 ompd_callback_memory_alloc_fn_t . . . . . . . . . . . . . . 588
5.4.1.2 ompd_callback_memory_free_fn_t . . . . . . . . . . . . . . . 589
5.4.2 Context Management and Navigation . . . . . . . . . . . . . . . . . . . . . 590
5.4.2.1 ompd_callback_get_thread_context_for_thread_id
_fn_t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
5.4.2.2 ompd_callback_sizeof_fn_t . . . . . . . . . . . . . . . . . . . 591
5.4.3 Accessing Memory in the OpenMP Program or Runtime . . . . . . . . . . . 592
5.4.3.1 ompd_callback_symbol_addr_fn_t . . . . . . . . . . . . . . . 592
5.4.3.2 ompd_callback_memory_read_fn_t . . . . . . . . . . . . . . . 594
5.4.3.3 ompd_callback_memory_write_fn_t . . . . . . . . . . . . . . 595
5.4.4 Data Format Conversion: ompd_callback_device_host_fn_t . . . 596
5.4.5 ompd_callback_print_string_fn_t . . . . . . . . . . . . . . . . 598
5.4.6 The Callback Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
5.5 OMPD Tool Interface Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
5.5.1 Per OMPD Library Initialization and Finalization . . . . . . . . . . . . . . 600
5.5.1.1 ompd_initialize . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
5.5.1.2 ompd_get_api_version . . . . . . . . . . . . . . . . . . . . . . . 602
5.5.8 Display Control Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
5.5.8.1 ompd_get_display_control_vars . . . . . . . . . . . . . . . 626
5.5.8.2 ompd_rel_display_control_vars . . . . . . . . . . . . . . . 627
5.5.9 Accessing Scope-Specific Information . . . . . . . . . . . . . . . . . . . . 628
5.5.9.1 ompd_enumerate_icvs . . . . . . . . . . . . . . . . . . . . . . . 628
5.5.9.2 ompd_get_icv_from_scope . . . . . . . . . . . . . . . . . . . . 629
5.5.9.3 ompd_get_icv_string_from_scope . . . . . . . . . . . . . . . 630
5.5.9.4 ompd_get_tool_data . . . . . . . . . . . . . . . . . . . . . . . . 631
5.6 Runtime Entry Points for OMPD . . . . . . . . . . . . . . . . . . . . . . . . . . 632
5.6.1 Beginning Parallel Regions . . . . . . . . . . . . . . . . . . . . . . . . . . 633
5.6.2 Ending Parallel Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
5.6.3 Beginning Task Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634
5.6.4 Ending Task Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634
5.6.5 Beginning OpenMP Threads . . . . . . . . . . . . . . . . . . . . . . . . . . 635
5.6.6 Ending OpenMP Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . 635
5.6.7 Initializing OpenMP Devices . . . . . . . . . . . . . . . . . . . . . . . . . 636
5.6.8 Finalizing OpenMP Devices . . . . . . . . . . . . . . . . . . . . . . . . . . 636
Index 681
List of Figures
2.1 Determining the schedule for a Worksharing-Loop . . . . . . . . . . . . . . . . 134
List of Tables
2.1 ICV Initial Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
2.2 Ways to Modify and to Retrieve ICV Values . . . . . . . . . . . . . . . . . . . . . 77
2.3 Scopes of ICVs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.4 ICV Override Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.5 schedule Clause kind Values . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
2.6 schedule Clause modifier Values . . . . . . . . . . . . . . . . . . . . . . . . . 131
2.7 ompt_callback_task_create Callback Flags Evaluation . . . . . . . . . . 165
2.8 Predefined Memory Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
2.9 Allocator Traits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
2.10 Predefined Allocators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
2.11 Implicitly Declared C/C++ reduction-identifiers . . . . . . . . . . . . . . . . . . . 326
2.12 Implicitly Declared Fortran reduction-identifiers . . . . . . . . . . . . . . . . . . . 327
2.13 Map-Type Decay of Map Type Combinations . . . . . . . . . . . . . . . . . . . . 360
4.1 OMPT Callback Interface Runtime Entry Point Names and Their Type Signatures . 477
4.2 Callbacks for which ompt_set_callback Must Return ompt_set_always 479
4.3 Callbacks for which ompt_set_callback May Return Any Non-Error Code . . 480
4.4 OMPT Tracing Interface Runtime Entry Point Names and Their Type Signatures . . 482
1 1 Overview of the OpenMP API
2 The collection of compiler directives, library routines, and environment variables that this
3 document describes collectively define the specification of the OpenMP Application Program
4 Interface (OpenMP API) for parallelism in C, C++ and Fortran programs.
5 This specification provides a model for parallel programming that is portable across architectures
6 from different vendors. Compilers from numerous vendors support the OpenMP API. More
7 information about the OpenMP API can be found at the following web site
8 http://www.openmp.org
9 The directives, library routines, environment variables, and tool support that this document defines
10 allow users to create, to manage, to debug and to analyze parallel programs while permitting
11 portability. The directives extend the C, C++ and Fortran base languages with single program
12 multiple data (SPMD) constructs, tasking constructs, device constructs, worksharing constructs,
13 and synchronization constructs, and they provide support for sharing, mapping and privatizing data.
14 The functionality to control the runtime environment is provided by library routines and
15 environment variables. Compilers that support the OpenMP API often include command line
16 options to enable or to disable interpretation of some or all OpenMP directives.
17 1.1 Scope
18 The OpenMP API covers only user-directed parallelization, wherein the programmer explicitly
19 specifies the actions to be taken by the compiler and runtime system in order to execute the program
20 in parallel. OpenMP-compliant implementations are not required to check for data dependences,
21 data conflicts, race conditions, or deadlocks. Compliant implementations also are not required to
22 check for any code sequences that cause a program to be classified as non-conforming. Application
23 developers are responsible for correctly using the OpenMP API to produce a conforming program.
24 The OpenMP API does not cover compiler-generated automatic parallelization.
1 1.2 Glossary
2 1.2.1 Threading Concepts
21 base language A programming language that serves as the foundation of the OpenMP specification.
22 COMMENT: See Section 1.7 for a listing of current base languages for
23 the OpenMP API.
24 base program A program written in a base language.
25 preprocessed code For C/C++, a sequence of preprocessing tokens that result from the first six phases of
26 translation, as defined by the base language.
27 program order An ordering of operations performed by the same thread as determined by the
28 execution sequence of operations specified by the base language.
24 inactive parallel region A parallel region that is executed by a team of only one thread.
25 active target region A target region that is executed on a device other than the device that encountered
26 the target construct.
27 inactive target region A target region that is executed on the same device that encountered the target
28 construct.
29 sequential part All code encountered during the execution of an initial task region that is not part of
30 a parallel region corresponding to a parallel construct or a task region
31 corresponding to a task construct.
32 COMMENTS:
33 A sequential part is enclosed by an implicit parallel region.
10 canonical loop nest A loop nest that complies with the rules and restrictions defined in Section 2.11.1.
11 loop-associated directive An OpenMP executable directive for which the associated user code must be a
12 canonical loop nest.
13 associated loop A loop from a canonical loop nest that is controlled by a given loop-associated
14 directive.
15 loop nest depth For a canonical loop nest, the maximal number of loops, including the outermost
16 loop, that can be associated with a loop-associated directive.
17 logical iteration space For a loop-associated directive, the sequence 0, . . . , N − 1 where N is the number of
18 iterations of the loops associated with the directive. The logical numbering denotes
19 the sequence in which the iterations would be executed if the set of associated loops
20 were executed sequentially.
21 logical iteration An iteration from the associated loops of a loop-associated directive, designated by a
22 logical number from the logical iteration space of the associated loops.
23 logical iteration vector space For a loop-associated directive with n associated nested loops, the set
24 of n-tuples (i1, . . . , in). For the k-th associated loop, from outermost to innermost, ik is its
25 logical iteration number as if it were the only associated loop.
26 logical iteration vector An iteration from the associated nested loops of a loop-associated directive, where n
27 is the number of associated loops, designated by an n-tuple from the logical iteration
28 vector space of the associated loops.
29 lexicographic order The total order of two logical iteration vectors ωa = (i1, . . . , in) and
30 ωb = (j1, . . . , jn), denoted by ωa ≤lex ωb, where either ωa = ωb or
31 ∃m ∈ {1, . . . , n} such that im < jm and ik = jk for all k ∈ {1, . . . , m − 1}.
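Note – The following non-normative C example illustrates the preceding terms for a pair of associated loops; the function and variable names are chosen only for illustration:

void work(float *a, int n, int m)
{
    // The two associated loops below have n*m logical iterations, numbered
    // 0, ..., n*m-1 in the logical iteration space. The corresponding logical
    // iteration vectors are the pairs (i1, i2) with 0 <= i1 < n and 0 <= i2 < m,
    // visited in lexicographic order if the loops were executed sequentially.
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            a[i * m + j] = 0.0f;
}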
2 task A specific instance of executable code and its data environment that the OpenMP
3 implementation can schedule for execution by threads.
4 task region A region consisting of all code encountered during the execution of a task.
5 COMMENT: A parallel region consists of one or more implicit task
6 regions.
7 implicit task A task generated by an implicit parallel region or generated when a parallel
8 construct is encountered during execution.
9 binding implicit task The implicit task of the current thread team assigned to the encountering thread.
10 explicit task A task that is not an implicit task.
11 initial task An implicit task associated with an implicit parallel region.
12 current task For a given thread, the task corresponding to the task region in which it is executing.
13 encountering task For a given region, the current task of the encountering thread.
14 child task A task is a child task of its generating task region. A child task region is not part of
15 its generating task region.
16 sibling tasks Tasks that are child tasks of the same task region.
17 descendent task A task that is the child task of a task region or of one of its descendent task regions.
18 task completion A condition that is satisfied when a thread reaches the end of the executable code that
19 is associated with the task and any allow-completion event that is created for the task
20 has been fulfilled.
21 COMMENT: Completion of the initial task that is generated when the
22 program begins occurs at program exit.
23 task scheduling point A point during the execution of the current task region at which it can be suspended
24 to be resumed later; or the point of task completion, after which the executing thread
25 may switch to a different task region.
26 task switching The act of a thread switching from the execution of one task to another task.
27 tied task A task that, when its task region is suspended, can be resumed only by the same
28 thread that was executing it before suspension. That is, the task is tied to that thread.
29 untied task A task that, when its task region is suspended, can be resumed by any thread in the
30 team. That is, the task is not tied to any thread.
20 target task A mergeable and untied task that is generated by a device construct or a call to a
21 device memory routine and that coordinates activity between the current device and
22 the target device.
23 taskgroup set A set of tasks that are logically grouped by a taskgroup region.
2 variable A named data storage block, for which the value can be defined and redefined during
3 the execution of a program.
4 COMMENT: An array element or structure element is a variable that is
5 part of another variable.
6 scalar variable For C/C++, a scalar variable, as defined by the base language.
7 For Fortran, a scalar variable with intrinsic type, as defined by the base language,
8 excluding character type.
9 aggregate variable A variable, such as an array or structure, composed of other variables.
10 array section A designated subset of the elements of an array that is specified using a subscript
11 notation that can select more than one element.
12 array item An array, an array section, or an array element.
13 shape-operator For C/C++, an array shaping operator that reinterprets a pointer expression as an
14 array with one or more specified dimensions.
15 implicit array For C/C++, the set of array elements of non-array type T that may be accessed by
16 applying a sequence of [] operators to a given pointer that is either a pointer to type T
17 or a pointer to a multidimensional array of elements of type T.
18 For Fortran, the set of array elements for a given array pointer.
19 COMMENT: For C/C++, the implicit array for pointer p with type T
20 (*)[10] consists of all accessible elements p[i][j], for all i and j=0,1,...,9.
21 base pointer For C/C++, an lvalue pointer expression that is used by a given lvalue expression or
22 array section to refer indirectly to its storage, where the lvalue expression or array
23 section is part of the implicit array for that lvalue pointer expression.
24 For Fortran, a data pointer that appears last in the designator for a given variable or
25 array section, where the variable or array section is part of the pointer target for that
26 data pointer.
27 COMMENT: For the array section
28 (*p0).x0[k1].p1->p2[k2].x1[k3].x2[4][0:n], where identifiers pi have a
29 pointer type declaration and identifiers xi have an array type declaration,
30 the base pointer is: (*p0).x0[k1].p1->p2.
31 named pointer For C/C++, the base pointer of a given lvalue expression or array section, or the base
32 pointer of one of its named pointers.
33 For Fortran, the base pointer of a given variable or array section, or the base pointer
34 of one of its named pointers.
13 simply contiguous array section An array section that statically can be determined to have contiguous
14 storage or that, in Fortran, has the CONTIGUOUS attribute.
19 supported active levels of parallelism An implementation-defined maximum number of active parallel
20 regions that may enclose any region of code in the program.
21 OpenMP API support Support of at least one active level of parallelism.
19 tool Code that can observe and/or modify the execution of an application.
20 first-party tool A tool that executes in the address space of the program that it is monitoring.
21 third-party tool A tool that executes as a separate process from the process that it is monitoring and
22 potentially controlling.
23 activated tool A first-party tool that successfully completed its initialization.
24 event A point of interest in the execution of a thread.
25 native thread A thread defined by an underlying thread implementation.
26 tool callback A function that a tool provides to an OpenMP implementation to invoke when an
27 associated event occurs.
28 registering a callback Providing a tool callback to an OpenMP implementation.
10 The flush properties that define whether a flush operation is a strong flush, a release flush, or an
11 acquire flush are not mutually disjoint. A flush operation may be a strong flush and a release flush;
12 it may be a strong flush and an acquire flush; it may be a release flush and an acquire flush; or it
13 may be all three.
15 Note – Since flush operations by themselves cannot prevent data races, explicit flush operations are
16 only useful in combination with non-sequentially consistent atomic directives.
24 1.5.2 OMPD
25 The OMPD interface is intended for third-party tools, which run as separate processes. An
26 OpenMP implementation must provide an OMPD library that can be dynamically loaded and used
27 by a third-party tool. A third-party tool, such as a debugger, uses the OMPD library to access
28 OpenMP state of a program that has begun execution. OMPD defines the following:
29 • An interface that an OMPD library exports, which a tool can use to access OpenMP state of a
30 program that has begun execution;
31 • A callback interface that a tool provides to the OMPD library so that the library can use it to
32 access the OpenMP state of a program that has begun execution; and
6 Some text is for information only, and is not part of the normative specification. Such text is
7 designated as a note or comment, like this:
1 Restrictions
2 The following restrictions apply to OpenMP directives:
C / C++
C
3 • A declarative directive may not be used in place of a substatement in a selection statement, in
4 place of the loop body in an iteration statement, or in place of the statement that follows a label.
C
C++
5 • A declarative directive may not be used in place of a substatement in a selection statement or
6 iteration statement, or in place of the statement that follows a label.
C++
C / C++
Fortran
7 • OpenMP directives, except simd and declarative directives, may not appear in pure procedures.
8 • OpenMP directives may not appear in the WHERE, FORALL or DO CONCURRENT constructs.
Fortran
12 Where directive-name is the name of the directive and, when specified in the syntax of the directive,
13 any directive-level arguments enclosed in parentheses.
15 Note – In the following example, depobj(o) is the directive-name:
16 #pragma omp depobj(o) depend(inout: d)
18 Each #pragma directive starts with #pragma omp. The remainder of the directive follows the
19 conventions of the C and C++ standards for compiler directives. In particular, white space can be
20 used before and after the #, and sometimes white space must be used to separate the words in a
21 directive. Preprocessing tokens following #pragma omp are subject to macro replacement.
22 Some OpenMP directives may be composed of consecutive #pragma directives if specified in
23 their syntax.
C / C++
3 [[ omp :: directive( directive-name[[,] clause[[,] clause]... ] ) ]]
4 or
5 [[ using omp : directive( directive-name[[,] clause[[,] clause]... ] ) ]]
6 The above two forms are interchangeable for any OpenMP directive. Some OpenMP directives may
7 be composed of consecutive attribute specifiers if specified in their syntax. Any two consecutive
8 attribute specifiers may be reordered or expressed as a single attribute specifier, as permitted by the
9 base language, without changing the behavior of the OpenMP directive.
10 Some directives may have additional forms that use the attribute syntax.
11 Multiple attributes on the same statement are allowed. A directive that uses the attribute syntax
12 cannot be applied to the same statement as a directive that uses the pragma syntax. For any
13 directive that has a paired end directive, including those with a begin and end pair, both directives
14 must use either the attribute syntax or the pragma syntax. Attribute directives that apply to the same
15 statement are unordered. An ordering can be imposed with the sequence attribute, which is
16 specified as follows:
17 [[ omp :: sequence( [omp::]directive-attr [, [omp::]directive-attr]... ) ]]
18 where directive-attr is any attribute in the omp namespace, optionally specified with an omp::
19 namespace qualifier, which may be another sequence attribute.
20 The application of multiple attributes in a sequence attribute is ordered as if each directive had
21 been written as a #pragma directive on subsequent lines.
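Note – The following non-normative C++ sketch shows the attribute syntax and the sequence attribute; the function and its arguments are assumptions of the example, and the second form has the same effect as writing the two directives as #pragma directives on consecutive lines:

void saxpy(int n, float a, float *x, float *y)
{
    // A combined construct expressed with a single directive attribute:
    [[omp::directive(parallel for)]]
    for (int i = 0; i < n; i++)
        y[i] += a * x[i];

    // Two directives ordered with the sequence attribute:
    [[omp::sequence(directive(parallel), directive(for))]]
    for (int i = 0; i < n; i++)
        y[i] += a * x[i];
}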
C++
C / C++
1 Directives are case-sensitive.
2 Each of the expressions used in the OpenMP syntax inside of the clauses must be a valid
3 assignment-expression of the base language unless otherwise specified.
C / C++
C++
4 Directives may not appear in constexpr functions or in constant expressions.
C++
Fortran
5 OpenMP directives for Fortran are specified as follows:
6 sentinel directive-name [clause[ [,] clause]...]
7 All OpenMP compiler directives must begin with a directive sentinel. The format of a sentinel
8 differs between fixed form and free form source files, as described in Section 2.1.1 and
9 Section 2.1.2.
10 Directives are case insensitive. Directives cannot be embedded within continued statements, and
11 statements cannot be embedded within directives.
12 Each of the expressions used in the OpenMP syntax inside of the clauses must be a valid expression
13 of the base language unless otherwise specified.
14 In order to simplify the presentation, free form is used for the syntax of OpenMP directives for
15 Fortran in the remainder of this document, except as noted.
Fortran
16 A directive may be categorized as one of the following: a metadirective, a declarative directive, an
17 executable directive, an informational directive, or a utility directive.
18 Only one directive-name can be specified per directive (note that this includes combined directives,
19 see Section 2.16). The order in which clauses appear on directives is not significant. Clauses on
20 directives may be repeated as needed, subject to the restrictions listed in the description of each
21 clause or the directives on which they can appear.
22 Some clauses accept a list, an extended-list, or a locator-list. A list consists of a comma-separated
23 collection of one or more list items. An extended-list consists of a comma-separated collection of
24 one or more extended list items. A locator-list consists of a comma-separated collection of one or
25 more locator list items.
C / C++
26 A list item is a variable or an array section. An extended list item is a list item or a function name. A
27 locator list item is any lvalue expression including variables, an array section, or a reserved locator.
C / C++
29 The reserved locator omp_all_memory is a reserved identifier that denotes a list item treated as
30 having storage that corresponds to the storage of all other objects in memory.
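Note – As a non-normative illustration, the following C sketch uses omp_all_memory as a locator list item in a depend clause; init_all_data and consume are assumed user functions:

extern void init_all_data(void);        // assumed user function
extern void consume(const double *p);   // assumed user function

void generate_tasks(double *p)
{
    // The first task is treated as if it writes to all storage, so the second
    // task, which depends on p[0], is not scheduled until the first completes.
    #pragma omp task depend(out: omp_all_memory)
    init_all_data();

    #pragma omp task depend(in: p[0])
    consume(p);
}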
31 Some directives have an associated structured block or a structured block sequence.
C / C++
1 A structured block sequence that consists of more than one statement may appear only for
2 executable directives that explicitly allow it. The corresponding compound statement obtained by
3 enclosing the sequence in { and } must be a structured block and the structured block sequence
4 then should be considered to be a structured block with all of its restrictions.
C / C++
5 A structured block:
6 • may contain infinite loops where the point of exit is never reached;
7 • may halt due to an IEEE exception;
C / C++
8 • may contain calls to exit(), _Exit(), quick_exit(), abort() or functions with a
9 _Noreturn specifier (in C) or a noreturn attribute (in C/C++);
10 • may be an expression statement, iteration statement, selection statement, or try block, provided
11 that the corresponding compound statement obtained by enclosing it in { and } would be a
12 structured block; and
C / C++
Fortran
13 • may contain STOP or ERROR STOP statements.
Fortran
14 Restrictions
15 Restrictions to structured blocks are as follows:
16 • Entry to a structured block must not be the result of a branch.
17 • The point of exit cannot be a branch out of the structured block.
C / C++
18 • The point of entry to a structured block must not be a call to setjmp.
19 • longjmp must not violate the entry/exit criteria.
C / C++
C++
20 • throw must not violate the entry/exit criteria.
21 • co_await, co_yield and co_return must not violate the entry/exit criteria.
C++
11 Sentinels must start in column 1 and appear as a single word with no intervening characters.
12 Fortran fixed form line length, white space, continuation, and column rules apply to the directive
13 line. Initial directive lines must have a space or a zero in column 6, and continuation directive lines
14 must have a character other than a space or a zero in column 6.
15 Comments may appear on the same line as a directive. The exclamation point initiates a comment
16 when it appears after column 6. The comment extends to the end of the source line and is ignored.
17 If the first non-blank character after the directive sentinel of an initial or continuation directive line
18 is an exclamation point, the line is ignored.
20 Note – In the following example, the three formats for specifying the directive are equivalent (the
21 first line represents the position of the first 9 columns):
22 c23456789
23 !$omp parallel do shared(a,b,c)
25 c$omp parallel do
26 c$omp+shared(a,b,c)
28 c$omp paralleldoshared(a,b,c)
1 2.1.2 Free Source Form Directives
2 The following sentinel is recognized in free form source files:
3 !$omp
4 The sentinel can appear in any column as long as it is preceded only by white space. It must appear
5 as a single word with no intervening white space. Fortran free form line length, white space, and
6 continuation rules apply to the directive line. Initial directive lines must have a space after the
7 sentinel. Continued directive lines must have an ampersand (&) as the last non-blank character on
8 the line, prior to any comment placed inside the directive. Continuation directive lines can have an
9 ampersand after the directive sentinel with optional white space before and after the ampersand.
10 Comments may appear on the same line as a directive. The exclamation point (!) initiates a
11 comment. The comment extends to the end of the source line and is ignored. If the first non-blank
12 character after the directive sentinel is an exclamation point, the line is ignored.
13 One or more blanks or horizontal tabs are optional to separate adjacent keywords in
14 directive-names unless otherwise specified.
16 Note – In the following example the three formats for specifying the directive are equivalent (the
17 first line represents the position of the first 9 columns):
18 !23456789
19 !$omp parallel do &
20 !$omp shared(a,b,c)
22 !$omp parallel &
23 !$omp&do shared(a,b,c)
25 !$omp paralleldo shared(a,b,c)
Fortran
4 Description
5 Stand-alone directives do not have any associated executable user code. Instead, they represent
6 executable statements that typically do not have succinct equivalent statements in the base language.
7 Some restrictions limit the placement of a stand-alone directive within a program. A stand-alone
8 directive may be placed only at a point where a base language executable statement is allowed.
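Note – The following non-normative C sketch illustrates conforming and non-conforming placements of a stand-alone directive:

void f(int x)
{
    #pragma omp taskwait       // conforming: a statement is allowed here

    if (x) {
        #pragma omp taskwait   // conforming: inside a compound statement
    }

    /* Non-conforming: a stand-alone directive may not be the substatement
       of a selection statement:
       if (x)
           #pragma omp taskwait
    */
}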
C / C++
9 Restrictions
10 Restrictions to stand-alone directives are as follows:
C
11 • A stand-alone directive may not be used in place of a substatement in a selection statement, in
12 place of the loop body in an iteration statement, or in place of the statement that follows a label.
C
C++
13 • A stand-alone directive may not be used in place of a substatement in a selection statement or
14 iteration statement, or in place of the statement that follows a label.
C++
1 Restrictions
2 Restrictions to the shape-operator are as follows:
3 • The type T must be a complete type.
4 • The shape-operator can appear only in clauses for which it is explicitly allowed.
5 • The result of a shape-operator must be a named array of a list item.
6 • The type of the expression upon which a shape-operator is applied must be a pointer type.
C++
7 • If the type T is a reference to a type T’, then the type will be considered to be T’ for all purposes
8 of the designated array.
C++
C / C++
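Note – As a non-normative sketch of one possible use of the shape-operator, the following example reinterprets a pointer as a two-dimensional array so that an array section of it can be specified in a motion clause of a target update directive; the function, its parameters, and the choice of clause are assumptions of the example:

void update_boundary(double *p, int nx, int ny)
{
    // ([nx][ny])p reinterprets the pointer p as an nx x ny array; the
    // array section then selects the first row of that shaped array.
    #pragma omp target update from( (([nx][ny])p)[0:1][0:ny] )
}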
1 Assume a is declared to be a 1-dimensional array with dimension size 11. The first two examples
2 are equivalent, and the third and fourth examples are equivalent. The fifth example specifies a stride
3 of 2 and therefore is not contiguous.
4 Assume b is declared to be a pointer to a 2-dimensional array with dimension sizes 10 and 10. The
5 sixth example refers to all elements of the 2-dimensional array given by b[10]. The seventh
6 example is a zero-length array section.
7 Assume c is declared to be a 3-dimensional array with dimension sizes 50, 50, and 50. The eighth
8 example is contiguous, while the ninth and tenth examples are not contiguous.
9 The final four examples show array sections that are formed from more general base expressions.
10 The following are examples that are non-conforming array sections:
11 s[:10].x
12 p[:10]->y
13 *(xp[:10])
14 For all three examples, a base language operator is applied in an undefined manner to an array
15 section. The only operator that may be applied to an array section is a subscript operator for which
16 the array section appears as the postfix expression.
C / C++
Fortran
19 Fortran has built-in support for array sections although some restrictions apply to their use in
20 OpenMP directives, as enumerated in the following section.
Fortran
21 Restrictions
22 Restrictions to array sections are as follows:
23 • An array section can appear only in clauses for which it is explicitly allowed.
24 • A stride expression may not be specified unless otherwise stated.
C / C++
25 • An element of an array section with a non-zero size must have a complete type.
26 • The base expression of an array section must have an array or pointer type.
27 • If a consecutive sequence of array subscript expressions appears in an array section, and the first
28 subscript expression in the sequence uses the extended array section syntax defined in this
29 section, then only the last subscript expression in the sequence may select array elements that
30 have a pointer type.
C / C++
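Note – The following non-normative C example shows conforming array sections in a map clause; the declarations are assumptions of the example:

void section_example(double *b)   // b is assumed to point to at least 10 elements
{
    int a[11];
    double c[4] = { 0.0 };
    // a[0:6] designates the six elements a[0] ... a[5];
    // b[2:8] designates the eight elements b[2] ... b[9];
    // c[:]   designates all of c (the length defaults to the array size).
    #pragma omp target map(tofrom: a[0:6], b[2:8], c[:])
    {
        a[0] = 1;
        c[0] = b[2];
    }
}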
9 2.1.6 Iterators
10 Iterators are identifiers that expand to multiple values in the clause on which they appear.
11 The syntax of the iterator modifier is as follows:
12 iterator(iterators-definition)
17 where:
18 • identifier is a base language identifier.
C / C++
19 • iterator-type is a type name.
C / C++
Fortran
20 • iterator-type is a type specifier.
Fortran
21 • range-specification is of the form begin:end[:step], where begin and end are expressions for
22 which their types can be converted to iterator-type and step is an integral expression.
C / C++
1 In an iterator-specifier, if the iterator-type is not specified then that iterator is of int type.
C / C++
Fortran
2 In an iterator-specifier, if the iterator-type is not specified then that iterator has default integer type.
Fortran
3 In a range-specification, if the step is not specified its value is implicitly defined to be 1.
4 An iterator only exists in the context of the clause in which it appears. An iterator also hides all
5 accessible symbols with the same name in the context of the clause.
6 The use of a variable in an expression that appears in the range-specification causes an implicit
7 reference to the variable in all enclosing constructs.
C / C++
8 The values of the iterator are the set of values i0 , . . . , iN −1 where:
9 • i0 = (iterator-type) begin,
10 • ij = (iterator-type) (ij−1 + step), where j ≥ 1 and
11 • if step > 0,
12 – i0 < (iterator-type) end,
13 – iN −1 < (iterator-type) end, and
14 – (iterator-type) (iN −1 + step) ≥ (iterator-type) end;
15 • if step < 0,
16 – i0 > (iterator-type) end,
17 – iN −1 > (iterator-type) end, and
18 – (iterator-type) (iN −1 + step) ≤ (iterator-type) end.
C / C++
Fortran
19 The values of the iterator are the set of values i1 , . . . , iN where:
20 • i1 = begin,
21 • ij = ij−1 + step, where j ≥ 2 and
22 • if step > 0,
23 – i1 ≤ end,
24 – iN ≤ end, and
13 Restrictions
14 Restrictions to iterators are as follows:
15 • An expression that contains an iterator identifier can only appear in clauses that explicitly allow
16 expressions that contain iterators.
17 • The iterator-type must not declare a new type.
C / C++
18 • The iterator-type must be an integral or pointer type.
19 • The iterator-type must not be const qualified.
C / C++
Fortran
20 • The iterator-type must be an integer type.
Fortran
21 • If the step expression of a range-specification equals zero, the behavior is unspecified.
22 • Each iterator identifier can only be defined once in an iterators-definition.
23 • Iterators cannot appear in the range-specification.
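Note – The following non-normative C sketch shows an iterator modifier in a depend clause; consume is an assumed user function:

extern void consume(float *v, int n);   // assumed user function

void producer(float *v, int n)
{
    // The iterator it takes the values 0, 1, ..., n-1, so the depend clause
    // expands to n separate "in" dependences, one for each element v[it].
    #pragma omp task depend(iterator(it = 0:n), in: v[it])
    consume(v, n);
}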
1 2.2 Conditional Compilation
2 In implementations that support a preprocessor, the _OPENMP macro name is defined to have the
3 decimal value yyyymm where yyyy and mm are the year and month designations of the version of
4 the OpenMP API that the implementation supports.
5 If a #define or a #undef preprocessing directive in user code defines or undefines the
6 _OPENMP macro name, the behavior is unspecified.
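Note – In C/C++, the _OPENMP macro can be tested as in the following non-normative example:

#include <stdio.h>

int main(void)
{
#ifdef _OPENMP
    printf("Compiled by an OpenMP-compliant implementation: %d\n", _OPENMP);
#else
    printf("Compiled without OpenMP support\n");
#endif
    return 0;
}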
Fortran
7 The OpenMP API requires Fortran lines to be compiled conditionally, as described in the following
8 sections.
12 To enable conditional compilation, a line with a conditional compilation sentinel must satisfy the
13 following criteria:
14 • The sentinel must start in column 1 and appear as a single word with no intervening white space;
15 • After the sentinel is replaced with two spaces, initial lines must have a space or zero in column 6
16 and only white space and numbers in columns 1 through 5;
17 • After the sentinel is replaced with two spaces, continuation lines must have a character other than
18 a space or zero in column 6 and only white space in columns 1 through 5.
19 If these criteria are met, the sentinel is replaced by two spaces. If these criteria are not met, the line
20 is left unchanged.
22 Note – In the following example, the two forms for specifying conditional compilation in fixed
23 source form are equivalent (the first line represents the position of the first 9 columns):
24 c23456789
25 !$ 10 iam = omp_get_thread_num() +
26 !$ & index
28 #ifdef _OPENMP
29 10 iam = omp_get_thread_num() +
30 & index
31 #endif
4 To enable conditional compilation, a line with a conditional compilation sentinel must satisfy the
5 following criteria:
6 • The sentinel can appear in any column but must be preceded only by white space;
7 • The sentinel must appear as a single word with no intervening white space;
8 • Initial lines must have a space after the sentinel;
9 • Continued lines must have an ampersand as the last non-blank character on the line, prior to any
10 comment appearing on the conditionally compiled line.
11 Continuation lines can have an ampersand after the sentinel, with optional white space before and
12 after the ampersand. If these criteria are met, the sentinel is replaced by two spaces. If these criteria
13 are not met, the line is left unchanged.
15 Note – In the following example, the two forms for specifying conditional compilation in free
16 source form are equivalent (the first line represents the position of the first 9 columns):
17 c23456789
18 !$ iam = omp_get_thread_num() + &
19 !$& index
21 #ifdef _OPENMP
22 iam = omp_get_thread_num() + &
23 index
24 #endif
Fortran
1 Traits are categorized as name-list traits, clause-list traits, non-property traits and extension traits. This categorization determines the
2 syntax that is used to match the trait, as defined in Section 2.3.2.
3 The construct set is composed of the directive names, each being a trait, of all enclosing constructs
4 at that point in the program up to a target construct. Combined and composite constructs are
5 added to the set as distinct constructs in the same nesting order specified by the original construct.
6 Whether the dispatch construct is added to the construct set is implementation defined. If it is
7 added, it will only be added for the target-call of the associated code. The set is ordered by nesting
8 level in ascending order. Specifically, the ordering of the set of constructs is c1 , . . . , cN , where c1 is
9 the construct at the outermost nesting level and cN is the construct at the innermost nesting level. In
10 addition, if the point in the program is not enclosed by a target construct, the following rules are
11 applied in order:
12 1. For procedures with a declare simd directive, the simd trait is added to the beginning of the
13 set as c1 for any generated SIMD versions so the total size of the set is increased by 1.
14 2. For procedures that are determined to be function variants by a declare variant directive, the
15 selectors c1 , . . . , cM of the construct selector set are added in the same order to the
16 beginning of the set as c1 , . . . , cM so the total size of the set is increased by M .
17 3. For device routines, the target trait is added to the beginning of the set as c1 for any versions of
18 the procedure that are generated for target regions so the total size of the set is increased by 1.
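Note – As a non-normative illustration of the construct set, at the point indicated in the following C code the construct set is {target, teams, parallel}:

void compute(float *x, int n)
{
    #pragma omp target map(tofrom: x[0:n])
    #pragma omp teams num_teams(1)
    #pragma omp parallel
    {
        // At this point the construct set is {target, teams, parallel}:
        // c1 = target (outermost), c2 = teams, c3 = parallel (innermost).
        #pragma omp for
        for (int i = 0; i < n; i++)
            x[i] += 1.0f;
    }
}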
19 The simd trait is a clause-list trait that is defined with properties that match the clauses accepted by
20 the declare simd directive with the same name and semantics. The simd trait defines at least the
21 simdlen property and one of the inbranch or notinbranch properties. Traits in the construct set
22 other than simd are non-property traits.
23 The device set includes traits that define the characteristics of the device being targeted by the
24 compiler at that point in the program. For each target device that the implementation supports, a
25 target_device set exists that defines the characteristics of that device. At least the following traits
26 must be defined for the device and all target_device sets:
27 • The kind(kind-name-list) trait specifies the general kind of the device. The following kind-name
28 values are defined:
29 – host, which specifies that the device is the host device;
30 – nohost, which specifies that the device is not the host device; and
31 – the values defined in the OpenMP Additional Definitions document.
32 • The isa(isa-name-list) trait specifies the Instruction Set Architectures supported by the device.
33 The accepted isa-name values are implementation defined.
34 • The arch(arch-name-list) trait specifies the architectures supported by the device. The accepted
35 arch-name values are implementation defined.
36 The kind, isa and arch traits in the device and target_device sets are name-list traits.
1 trait-property-name
2 or
3 trait-property-clause
4 or
5 trait-property-expression
6 or
7 trait-property-extension
9 trait-property-clause:
10 clause
12 trait-property-name:
13 identifier
14 or
15 string-literal
17 trait-property-expression:
18 scalar-expression (for C/C++)
19 or
20 scalar-logical-expression (for Fortran)
21 or
22 scalar-integer-expression (for Fortran)
23
24 trait-score:
25 score(score-expression)
26
27 trait-property-extension:
28 trait-property-name
29 or
30 identifier(trait-property-extension[, trait-property-extension[, ...]])
31 or
32 constant integer expression
33 For trait selectors that correspond to name-list traits, each trait-property should be a
34 trait-property-name. For any value that is a valid identifier, both the identifier and the
35 corresponding string literal (for C/C++) or the corresponding char-literal-constant (for Fortran)
36 are considered representations of the same value.
37 For trait selectors that correspond to clause-list traits, each trait-property should be a
38 trait-property-clause. The syntax is the same as for the matching OpenMP clause.
39 The construct selector set defines the construct traits that should be active in the OpenMP
40 context. The following selectors can be defined in the construct set: target; teams;
41 parallel; for (in C/C++); do (in Fortran); simd and dispatch. Each trait-property of the
42 simd selector is a trait-property-clause. The syntax is the same as for a valid clause of the
C++
1 Each occurrence of the this pointer in an expression in a context selector that appears in the
2 match clause of a declare variant directive is treated as an expression that is the address of
3 the object on which the associated base function is invoked.
C++
4 Implementations can allow further selectors to be specified. Each specified trait-property for these
5 implementation-defined selectors should be trait-property-extension. Implementations can ignore
6 specified selectors that are not those described in this section.
7 Restrictions
8 Restrictions to context selectors are as follows:
9 • Each trait-property can only be specified once in a trait-selector other than the construct
10 selector set.
11 • Each trait-set-selector-name can only be specified once.
12 • Each trait-selector-name can only be specified once.
13 • A trait-score cannot be specified in traits from the construct, device or
14 target_device trait-selector-sets.
15 • A score-expression must be a non-negative constant integer expression.
16 • The expression of a device_num trait must evaluate to a non-negative integer value that is less
17 than or equal to the value of omp_get_num_devices().
18 • A variable or procedure that is referenced in an expression that appears in a context selector must
19 be visible at the location of the directive on which the selector appears unless the directive is a
20 declare variant directive and the variable is an argument of the associated base function.
21 • If trait-property any is specified in the kind trait-selector of the device or
22 target_device selector set, no other trait-property may be specified in the same selector.
23 • For a trait-selector that corresponds to a name-list trait, at least one trait-property must be
24 specified.
25 • For a trait-selector that corresponds to a non-property trait, no trait-property may be specified.
26 • For the requires selector of the implementation selector set, at least one trait-property
27 must be specified.
1 2.3.4 Metadirectives
2 Summary
3 A metadirective is a directive that can specify multiple directive variants of which one may be
4 conditionally selected to replace the metadirective based on the enclosing OpenMP context.
5 Syntax
C / C++
6 The syntax of a metadirective is as follows:
7 #pragma omp metadirective [clause[ [,] clause] ... ] new-line
8 or
9 #pragma omp begin metadirective [clause[ [,] clause] ... ] new-line
10 stmt(s)
11 #pragma omp end metadirective
17 or
18 !$omp begin metadirective [clause[ [,] clause] ... ]
19 stmt(s)
20 !$omp end metadirective
1 associated with the directive variant. If the nothing directive is selected to replace the
2 begin metadirective directive, its paired end metadirective directive is ignored.
3 Restrictions
4 Restrictions to metadirectives are as follows:
5 • The directive variant appearing in a when or default clause must not specify a
6 metadirective, begin metadirective, or end metadirective directive.
C / C++
7 • The directive variant that appears in a when or default clause must not specify a
8 begin declare variant or end declare variant directive.
C / C++
9 • The context selector that appears in a when clause must not specify any properties for the simd
10 selector.
11 • Replacement of the metadirective with the directive variant associated with any of the dynamic
12 replacement candidates must result in a conforming OpenMP program.
13 • Insertion of user code at the location of a metadirective must be allowed if the first dynamic
14 replacement candidate does not have a static context selector.
15 • All items must be executable directives if the first dynamic replacement candidate does not have
16 a static context selector.
17 • Any directive variant that is specified by a when or default clause on a
18 begin metadirective directive must be an OpenMP directive that has a paired
19 end directive or must be the nothing directive, and the begin metadirective directive
20 must have a paired end metadirective directive.
21 • The default clause may appear at most once on a metadirective.
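Note – The following non-normative sketch shows a metadirective inside a function that is compiled both for the host and for non-host devices; the variant is chosen according to the kind trait of the device being targeted by the compiler at that point. The function and the chosen directive variants are illustrative only.

#pragma omp declare target
void axpy(int n, double a, const double *x, double *y)
{
    // When compiled for a non-host device, the simd directive variant is
    // selected; for the host, a parallel worksharing-loop is used instead.
    #pragma omp metadirective \
        when(device={kind(nohost)}: simd) \
        default(parallel for)
    for (int i = 0; i < n; i++)
        y[i] += a * x[i];
}
#pragma omp end declare target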
6 Syntax
C / C++
7 The syntax of the declare variant directive is as follows:
8 #pragma omp declare variant(variant-func-id) clause [[[,] clause] ... ] new-line
9 [#pragma omp declare variant(variant-func-id) clause [[[,] clause] ... ] new-line
10 [ ... ]]
11 function definition or declaration
12 or
13 #pragma omp begin declare variant clause new-line
14 declaration-definition-seq
15 #pragma omp end declare variant new-line
25 and where variant-func-id is the name of a function variant that is either a base language identifier
26 or, for C++, a template-id.
C / C++
Fortran
1 The syntax of the declare variant directive is as follows:
2 !$omp declare variant([base-proc-name:]variant-proc-name) clause [[[,] clause] ... ]
13 and where variant-proc-name is the name of a function variant that is a base language identifier.
Fortran
14 Description
15 The declare variant directive declares a base function to have the specified function variant. The
16 context selector in the match clause is associated with the variant.
C / C++
17 The begin declare variant directive associates the context selector in the match clause
18 with each function definition in declaration-definition-seq.
19 For the purpose of call resolution, each function definition that appears between a
20 begin declare variant directive and its paired end declare variant directive is a
21 function variant for an assumed base function, with the same name and a compatible prototype, that
22 is declared elsewhere without an associated declare variant directive.
23 If a declare variant directive appears between a begin declare variant directive and its
24 paired end declare variant directive the effective context selectors of the outer directive are
25 appended to the context selector of the inner directive to form the effective context selector of the
26 inner directive. If a trait-set-selector is present on both directives, the trait-selector list of the outer
27 directive is appended to the trait-selector list of the inner directive after equivalent trait-selectors
28 have been removed from the outer list. Restrictions that apply to explicitly specified context
29 selectors also apply to effective context selectors constructed through this process.
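Note – A non-normative C sketch of the begin declare variant form described above; the arch value "gen_arch" is a placeholder, since accepted arch-name values are implementation defined.

// Base function, declared elsewhere without an associated declare variant directive.
void saxpy(int n, float a, const float *x, float *y);

// Each function defined in this range becomes a function variant of the base
// function with the same name and a compatible prototype, usable whenever the
// implementation-defined architecture matches.
#pragma omp begin declare variant match(device={arch("gen_arch")})
void saxpy(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
#pragma omp end declare variant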
30 The symbol name of a function definition that appears between a begin declare variant
31 directive and its paired end declare variant directive shall be determined through the base
1 construct converts its pointer list items into device pointers. If the argument cannot be converted
2 into a device pointer then the NULL value will be passed as the argument.
3 If the adjust-op modifier is nothing, the argument is passed to the selected variant without being
4 modified.
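Note – A non-normative sketch of the adjust_args clause; the function names are illustrative. The need_device_ptr adjust-op applies only when the variant is selected for a dispatch region, which is why the dispatch selector appears in the match clause.

void base_copy(int n, double *p);        // base function
void dev_copy(int n, double *p);         // variant that expects a device pointer

// When dev_copy is selected for a dispatch region, the argument p is
// converted to a device pointer before the call; if it cannot be converted,
// NULL is passed instead.
#pragma omp declare variant(dev_copy) match(construct={dispatch}) \
        adjust_args(need_device_ptr: p)
void base_copy(int n, double *p);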
5 If an append_args clause is present on the matching directive then additional arguments are
6 passed in the call. The arguments are constructed according to any specified append-op modifiers
7 and are passed in the same order in which they are specified in the append_args clause.
C / C++
8 The interop operation constructs an argument of type omp_interop_t from the
9 interoperability requirement set of the encountering task.
C / C++
Fortran
10 The interop operation constructs an argument of type omp_interop_kind from the
11 interoperability requirement set of the encountering task.
Fortran
12 The argument is constructed as if an interop construct with an init clause of interop-types was
13 specified. If the interoperability requirement set contains one or more properties that could be used
14 as clauses for an interop construct of the interop-type type, the behavior is as if the
15 corresponding clauses would also be part of the aforementioned interop construct and those
16 properties will be removed from the interoperability requirement set.
17 This argument is destroyed after the call to the selected variant returns, as if an interop construct
18 with a destroy clause was used with the same clauses that were used to initialize the argument.
19 Any differences that the specific OpenMP context requires in the prototype of the variant from the
20 base function prototype are implementation defined.
C
21 For the declare variant directive, any expressions in the match clause are interpreted as if
22 they appeared in the scope of arguments of the base function.
C
23 Different declare variant directives may be specified for different declarations of the same base
24 function.
C++
25 The function variant is determined by base language standard name lookup rules ([basic.lookup])
26 of variant-func-id using the argument types at the call site after implementation-defined changes
27 have been made according to the OpenMP context.
28 For the declare variant directive, the variant-func-id and any expressions in the match
29 clause are interpreted as if they appeared at the scope of the trailing return type of the base
30 function.
C++
5 Restrictions
6 Restrictions to the declare variant directive are as follows:
7 • Calling functions that a declare variant directive determined to be a function variant directly in
8 an OpenMP context that is different from the one that the construct selector set of the context
9 selector specifies is non-conforming.
10 • If a function is determined to be a function variant through more than one declare variant
11 directive then the construct selector set of their context selectors must be the same.
12 • A function determined to be a function variant may not be specified as a base function in another
13 declare variant directive.
14 • All variables that are referenced in an expression that appears in the context selector of a match
15 clause must be accessible at a call site to the base function according to the base language rules.
16 • At most one match clause can appear on a declare variant directive.
17 • At most one append_args clause can appear on a declare variant directive.
18 • Each argument can only appear in a single adjust_args clause for each declare variant
19 directive.
20 • An adjust_args clause or append_args clause can only be specified if the dispatch
21 selector of the construct selector set appears in the match clause.
C / C++
22 • The type of the function variant must be compatible with the type of the base function after the
23 implementation-defined transformation for its OpenMP context.
24 • Only the match clause can appear on a begin declare variant directive.
25 • The match clause of a begin declare variant directive may not contain a simd
26 trait-selector-name.
27 • Matching pairs of begin declare variant and end declare variant directives shall
28 either encompass disjoint source ranges or they shall be perfectly nested.
C / C++
C++
1 • The declare variant directive cannot be specified for a virtual, defaulted or deleted function.
2 • The declare variant directive cannot be specified for a constructor or destructor.
3 • The declare variant directive cannot be specified for an immediate function.
4 • The function that a declare variant directive determined to be a function variant may not be an
5 immediate function.
6 • A match clause that appears on a begin declare variant directive must not contain a
7 dynamic context selector that references the this pointer.
8 • If an expression in the context selector that appears in a match clause references the this
9 pointer, the base function must be a non-static member function.
C++
Fortran
10 • base-proc-name must not be a generic name, an entry name, the name of a procedure pointer, a
11 dummy procedure or a statement function.
12 • If base-proc-name is omitted then the declare variant directive must appear in an interface
13 block or the specification part of a procedure.
14 • Any declare variant directive must appear in the specification part of a subroutine
15 subprogram, function subprogram, or interface body to which it applies.
16 • If the directive is specified for a procedure that is declared via a procedure declaration statement,
17 the base-proc-name must be specified.
18 • The procedure base-proc-name must have an accessible explicit interface at the location of the
19 directive.
20 • Each argument that appears in a need_device_ptr adjust-op must be of type C_PTR in the
21 dummy argument declaration.
Fortran
22 Cross References
23 • OpenMP Context Specification, see Section 2.3.1.
24 • Context Selectors, see Section 2.3.2.
4 Syntax
C / C++
5 The syntax of the dispatch construct is as follows:
6 #pragma omp dispatch [clause[ [,] clause] ... ] new-line
7 expression-stmt
1 and where clause is one of the following:
2 device(scalar-integer-expression)
3 depend([depend-modifier,] dependence-type : locator-list)
4 nowait
5 novariants(scalar-logical-expression)
6 nocontext(scalar-logical-expression)
7 is_device_ptr(list)
Fortran
8 Binding
9 The binding task set for a dispatch region is the generating task. The dispatch region binds
10 to the region of the generating task.
11 Description
12 When a novariants clause is present on the dispatch construct, and the novariants
13 clause expression evaluates to true, no function variant will be selected for the target-call even if
14 one would be selected normally. The use of a variable in a novariants clause expression of a
15 dispatch construct causes an implicit reference to the variable in all enclosing constructs.
16 The novariants clause expression is evaluated in the enclosing context.
17 When a nocontext clause is present on the dispatch construct, and the nocontext clause
18 expression evaluates to true, the dispatch construct is not added to the construct set of the
19 OpenMP context. The use of a variable in a nocontext clause expression of a dispatch
20 construct causes an implicit reference to the variable in all enclosing constructs.
21 The nocontext clause expression is evaluated in the enclosing context.
22 The is_device_ptr clause indicates that its list items are device pointers. For each list item
23 specified in the clause, an is_device_ptr property for that list item is added to the
24 interoperability requirement set. Support for device pointers created outside of OpenMP,
25 specifically outside of any OpenMP mechanism that returns a device pointer, is implementation
26 defined.
27 If one or more depend clauses are present on the dispatch construct, they are added as depend
28 properties of the interoperability requirement set. If a nowait clause is present on the dispatch
29 construct the nowait property is added to the interoperability requirement set.
30 This construct creates an explicit task, as if the task construct was used, that surrounds the
31 associated code. Properties added to the interoperability requirement set can be removed by the
32 effect of other directives (see Section 2.15.2) before the task is created. If the interoperability
33 requirement set contains one or more depend properties, the behavior is as if those properties were
34 applied to the task construct as depend clauses. If the interoperability requirement set does not
35 contain the nowait property then the task will also be an included task.
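Note – A non-normative sketch of the dispatch construct with the novariants clause; update is an illustrative base function assumed to have a declared function variant for dispatch regions.

void update(int n, double *x);   // base function with a declared function variant

void caller(int n, double *x, int use_base)
{
    // When use_base evaluates to true, no function variant is selected for
    // the target-call even though one would otherwise be chosen.
    #pragma omp dispatch novariants(use_base)
    update(n, x);
}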
3 Restrictions
4 Restrictions to the dispatch construct are as follows:
5 • At most one novariants clause can appear on a dispatch directive.
6 • At most one nocontext clause can appear on a dispatch directive.
7 • At most one nowait clause can appear on a dispatch directive.
8 • A list item that appears in an is_device_ptr clause must be a valid device pointer for the
9 device data environment.
C++
10 • The target-call expression can only be a direct call.
C++
Fortran
11 • target-call must be a procedure name.
12 • target-call must not be a procedure pointer.
13 • A list item that appears in an is_device_ptr clause must be of type C_PTR.
Fortran
14 Cross References
15 • declare variant directive, see Section 2.3.5.
16 • Interoperability requirement set, see Section 2.15.2.
1 2.4.1 ICV Descriptions
2 The following ICVs store values that affect the operation of parallel regions.
3 • dyn-var - controls whether dynamic adjustment of the number of threads is enabled for
4 encountered parallel regions. One copy of this ICV exists per data environment.
5 • nthreads-var - controls the number of threads requested for encountered parallel regions.
6 One copy of this ICV exists per data environment.
7 • thread-limit-var - controls the maximum number of threads that participate in the contention
8 group. One copy of this ICV exists per data environment.
9 • max-active-levels-var - controls the maximum number of nested active parallel regions
10 when the innermost parallel region is generated by a given task. One copy of this ICV exists
11 per data environment.
12 • place-partition-var - controls the place partition available to the execution environment for
13 encountered parallel regions. One copy of this ICV exists per implicit task.
14 • active-levels-var - the number of nested active parallel regions that enclose a given task such
15 that all of the parallel regions are enclosed by the outermost initial task region on the device
16 on which the task executes. One copy of this ICV exists per data environment.
17 • levels-var - the number of nested parallel regions that enclose a given task such that all of
18 the parallel regions are enclosed by the outermost initial task region on the device on which
19 the task executes. One copy of this ICV exists per data environment.
20 • bind-var - controls the binding of OpenMP threads to places. When binding is requested, the
21 variable indicates that the execution environment is advised not to move threads between places.
22 The variable can also provide default thread affinity policies. One copy of this ICV exists per
23 data environment.
24 The following ICVs store values that affect the operation of worksharing-loop regions.
25 • run-sched-var - controls the schedule that is used for worksharing-loop regions when the
26 runtime schedule kind is specified. One copy of this ICV exists per data environment.
27 • def-sched-var - controls the implementation defined default scheduling of worksharing-loop
28 regions. One copy of this ICV exists per device.
29 The following ICVs store values that affect program execution.
30 • stacksize-var - controls the stack size for threads that the OpenMP implementation creates. One
31 copy of this ICV exists per device.
32 • wait-policy-var - controls the desired behavior of waiting threads. One copy of this ICV exists
33 per device.
34 • display-affinity-var - controls whether to display thread affinity. One copy of this ICV exists for
35 the whole program.
1 • def-allocator-var - controls the memory allocator to be used by memory allocation routines,
2 directives and clauses when a memory allocator is not specified by the user. One copy of this
3 ICV exists per implicit task.
4 The following ICVs store values that affect the operation of teams regions.
5 • nteams-var - controls the number of teams requested for encountered teams regions. One copy
6 of this ICV exists per device.
7 • teams-thread-limit-var - controls the maximum number of threads that participate in each
8 contention group created by a teams construct. One copy of this ICV exists per device.
6 Description
7 • Each device has its own ICVs.
8 • The initial value of dyn-var is implementation defined if the implementation supports dynamic
9 adjustment of the number of threads; otherwise, the initial value is false.
1 • The value of the nthreads-var ICV is a list.
2 • The value of the bind-var ICV is a list.
3 The host and non-host device ICVs are initialized before any OpenMP API construct or OpenMP
4 API routine executes. After the initial values are assigned, the values of any OpenMP environment
5 variables that were set by the user are read and the associated ICVs are modified accordingly. If no
6 <device> number is specified on the device-specific environment variable then the value is applied
7 to all non-host devices.
8 Cross References
9 • OMP_SCHEDULE environment variable, see Section 6.1.
10 • OMP_NUM_THREADS environment variable, see Section 6.2.
11 • OMP_DYNAMIC environment variable, see Section 6.3.
12 • OMP_PROC_BIND environment variable, see Section 6.4.
13 • OMP_PLACES environment variable, see Section 6.5.
14 • OMP_STACKSIZE environment variable, see Section 6.6.
15 • OMP_WAIT_POLICY environment variable, see Section 6.7.
16 • OMP_MAX_ACTIVE_LEVELS environment variable, see Section 6.8.
17 • OMP_NESTED environment variable, see Section 6.9.
18 • OMP_THREAD_LIMIT environment variable, see Section 6.10.
19 • OMP_CANCELLATION environment variable, see Section 6.11.
20 • OMP_DISPLAY_AFFINITY environment variable, see Section 6.13.
21 • OMP_AFFINITY_FORMAT environment variable, see Section 6.14.
22 • OMP_DEFAULT_DEVICE environment variable, see Section 6.15.
23 • OMP_MAX_TASK_PRIORITY environment variable, see Section 6.16.
24 • OMP_TARGET_OFFLOAD environment variable, see Section 6.17.
25 • OMP_TOOL environment variable, see Section 6.18.
26 • OMP_TOOL_LIBRARIES environment variable, see Section 6.19.
27 • OMP_DEBUG environment variable, see Section 6.21.
28 • OMP_ALLOCATOR environment variable, see Section 6.22.
29 • OMP_NUM_TEAMS environment variable, see Section 6.23.
30 • OMP_TEAMS_THREAD_LIMIT environment variable, see Section 6.24.
4 Description
5 • The value of the nthreads-var ICV is a list. The runtime call omp_set_num_threads sets
6 the value of the first element of this list, and omp_get_max_threads retrieves the value of
7 the first element of this list.
1 • The value of the bind-var ICV is a list. The runtime call omp_get_proc_bind retrieves the
2 value of the first element of this list.
3 • Detailed values in the place-partition-var ICV are retrieved using the runtime calls
4 omp_get_partition_num_places, omp_get_partition_place_nums,
5 omp_get_place_num_procs, and omp_get_place_proc_ids.
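Note – A non-normative sketch of the first item above: omp_set_num_threads modifies the first element of the nthreads-var list and omp_get_max_threads reads it back.

#include <omp.h>
#include <stdio.h>

int main(void)
{
    omp_set_num_threads(4);                 // sets the first element of nthreads-var
    printf("%d\n", omp_get_max_threads());  // prints 4
    return 0;
}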
6 Cross References
7 • thread_limit clause of the teams construct, see Section 2.7.
8 • thread_limit clause of the target construct, see Section 2.14.5.
9 • omp_set_num_threads routine, see Section 3.2.1.
10 • omp_get_num_threads routine, see Section 3.2.2.
11 • omp_get_max_threads routine, see Section 3.2.3.
12 • omp_get_thread_num routine, see Section 3.2.4.
13 • omp_set_dynamic routine, see Section 3.2.6.
14 • omp_get_dynamic routine, see Section 3.2.7.
15 • omp_get_cancellation routine, see Section 3.2.8.
16 • omp_set_nested routine, see Section 3.2.9.
17 • omp_set_schedule routine, see Section 3.2.11.
18 • omp_get_schedule routine, see Section 3.2.12.
19 • omp_get_thread_limit routine, see Section 3.2.13.
20 • omp_get_supported_active_levels, see Section 3.2.14.
21 • omp_set_max_active_levels routine, see Section 3.2.15.
22 • omp_get_max_active_levels routine, see Section 3.2.16.
23 • omp_get_level routine, see Section 3.2.17.
24 • omp_get_active_level routine, see Section 3.2.20.
25 • omp_get_proc_bind routine, see Section 3.3.1.
26 • omp_get_place_num_procs routine, see Section 3.3.3.
27 • omp_get_place_proc_ids routine, see Section 3.3.4.
28 • omp_get_partition_num_places routine, see Section 3.3.6.
29 • omp_get_partition_place_nums routine, see Section 3.3.7.
30 • omp_set_affinity_format routine, see Section 3.3.8.
1 Description
2 • One copy of each ICV with device scope exists per device.
3 • Each data environment has its own copies of ICVs with data environment scope.
4 • Each implicit task has its own copy of ICVs with implicit task scope.
5 Calls to OpenMP API routines retrieve or modify data environment scoped ICVs in the data
6 environment of their binding tasks.
1 2.4.5 ICV Override Relationships
2 Table 2.4 shows the override relationships among construct clauses and ICVs. The table only lists
3 ICVs that can be overwritten by a clause.
ICV                        clause
nthreads-var               num_threads
run-sched-var              schedule
def-sched-var              schedule
bind-var                   proc_bind
def-allocator-var          allocator
nteams-var                 num_teams
teams-thread-limit-var     thread_limit
4 Description
5 • The num_threads clause overrides the value of the first element of the nthreads-var ICV.
6 • If a schedule clause specifies a modifier then that modifier overrides any modifier that is
7 specified in the run-sched-var ICV.
8 • If bind-var is not set to false then the proc_bind clause overrides the value of the first element
9 of the bind-var ICV; otherwise, the proc_bind clause has no effect.
10 Cross References
11 • parallel construct, see Section 2.6.
12 • proc_bind clause, Section 2.6.
13 • num_threads clause, see Section 2.6.1.
14 • num_teams clause, see Section 2.7.
15 • thread_limit clause, see Section 2.7.
16 • Worksharing-loop construct, see Section 2.11.4.
17 • schedule clause, see Section 2.11.4.1.
8 Syntax
C / C++
9 The syntax of the requires directive is as follows:
10 #pragma omp requires clause[ [ [,] clause] ... ] new-line
1 Description
2 The requires directive specifies features that an implementation must support for correct
3 execution. The behavior that a requirement clause specifies may override the normal behavior
4 specified elsewhere in this document. Whether an implementation supports the feature that a given
5 requirement clause specifies is implementation defined.
6 The requires directive specifies requirements for the execution of all code in the current
7 compilation unit.
9 Note – Use of this directive makes your code less portable. Users should be aware that not all
10 devices or implementations support all requirements.
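Note – A non-normative sketch of a requires directive; it assumes an implementation that supports the unified_shared_memory requirement and, as required below, the directive appears lexically before any device constructs.

#pragma omp requires unified_shared_memory

void saxpy(int n, float a, const float *x, float *y)
{
    // Host and device share one address space, so no map clauses are needed.
    #pragma omp target teams distribute parallel for
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}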
21 Restrictions
22 The restrictions to the requires directive are as follows:
23 • Each of the clauses can appear at most once on the directive.
24 • All atomic_default_mem_order clauses that appear on a requires directive in the
25 same compilation unit must specify the same parameter.
26 • A requires directive with a unified_address, unified_shared_memory, or
27 reverse_offload clause must appear lexically before any device constructs or device
28 routines.
29 • A requires directive may not appear lexically after a context selector in which any clause of
30 the requires directive is used.
31 • A requires directive with any of the following clauses must appear in all compilation units of
32 a program that contain device constructs or device routines or in none of them:
33 – reverse_offload
34 – unified_address
35 – unified_shared_memory
1 • The requires directive with atomic_default_mem_order clause may not appear
2 lexically after any atomic construct on which memory-order-clause is not specified.
C
3 • The requires directive may only appear at file scope.
C
C++
4 • The requires directive may only appear at file or namespace scope.
C++
Fortran
5 • The requires directive must appear in the specification part of a program unit, after any USE
6 statement, any IMPORT statement, and any IMPLICIT statement, unless the directive appears
7 by referencing a module and each clause already appeared with the same parameters in the
8 specification part of the program unit.
Fortran
14 Syntax
C / C++
15 The syntax of the assume directive is as follows:
16 #pragma omp assumes clause[ [ [,] clause] ... ] new-line
17 or
18 #pragma omp begin assumes clause[ [ [,] clause] ... ] new-line
19 declaration-definition-seq
20 #pragma omp end assumes new-line
21 or
22 #pragma omp assume clause[ [ [,] clause] ... ] new-line
23 structured-block
9 or
10 !$omp assume clause[ [ [,] clause] ... ]
11 loosely-structured-block
12 !$omp end assume
13 or
14 !$omp assume clause[ [ [,] clause] ... ]
15 strictly-structured-block
16 [!$omp end assume]
1 Description
2 The assume directive gives the implementation additional information about the expected
3 properties of the program that can optionally be used to optimize the implementation. An
4 implementation may ignore this information without altering the behavior of the program.
5 The scope of the assumes directive is the code executed and reached from the current compilation
6 unit. The scope of the assume directive is the code executed in the corresponding region or in any
7 region that is nested in the corresponding region.
C / C++
8 The scope of the begin assumes directive is the code that is executed and reached from any of
9 the declared functions in declaration-definition-seq.
C / C++
10 The absent and contains clauses accept a list of one or more directive names that may match
11 a construct that is encountered within the scope of the directive. An encountered construct matches
12 the directive name if it has the same name as one of the specified directive names or if it is a
13 combined or composite construct for which a constituent construct has the same name as one of the
14 specified directive names.
15 When the absent clause appears on an assume directive, the application guarantees that no
16 constructs that match a listed directive name are encountered in the scope of the assume directive.
17 When the contains clause appears on an assume directive, the application provides a hint that
18 constructs that match the listed directive names are likely to be encountered in the scope of the
19 assume directive.
20 When the holds clause appears on an assume directive, the application guarantees that the listed
21 expression evaluates to true in the scope of the directive. The effect of the clause does not include
22 an evaluation of the expression that is observable.
23 The no_openmp clause guarantees that no OpenMP related code is executed in the scope of the
24 directive.
25 The no_openmp_routines clause guarantees that no explicit OpenMP runtime library calls are
26 executed in the scope of the directive.
27 The no_parallelism clause guarantees that no OpenMP tasks (explicit or implicit) will be
28 generated and that no SIMD constructs will be executed in the scope of the directive.
29 Implementers are allowed to include additional implementation-defined assumption clauses. All
30 implementation-defined assumptions should begin with ext_. Assumption names that do not start
31 with ext_ are reserved.
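Note – A non-normative sketch of the assume directive; the guarantees are assertions made by the application, and the implementation may ignore them without changing program behavior.

void axpy(int n, double a, const double *x, double *y)
{
    // The application guarantees that no target or teams constructs are
    // encountered in this region and that n > 0 holds throughout it.
    #pragma omp assume absent(target, teams) holds(n > 0)
    {
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }
}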
20 Syntax
C / C++
21 The syntax of the nothing directive is as follows:
22 #pragma omp nothing new-line
C / C++
Fortran
23 The syntax of the nothing directive is as follows:
24 !$omp nothing
Fortran
1 Description
2 The nothing directive has no effect on the execution of the OpenMP program.
3 Cross References
4 • Metadirectives, see Section 2.3.4.
10 Syntax
C / C++
11 The syntax of the error directive is as follows:
12 #pragma omp error [clause[ [,] clause] ... ] new-line
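Note – A non-normative sketch of the error directive; it assumes the at, severity and message clauses named in the restrictions below, requesting a fatal error at execution time.

void check(int n)
{
    if (n <= 0) {
        // Raises a runtime error with the given message when this point
        // in the program is reached.
        #pragma omp error at(execution) severity(fatal) message("n must be positive")
    }
}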
25 Tool Callbacks
26 A thread dispatches a registered ompt_callback_error callback for each occurrence of a
27 runtime-error event in the context of the encountering task. This callback has the type signature
28 ompt_callback_error_t.
29 Restrictions
30 Restrictions to the error directive are as follows:
31 • At most one at clause can appear on the directive.
32 • At most one severity clause can appear on the directive.
33 • At most one message clause can appear on the directive.
34 Cross References
35 • Stand-alone directives, see Section 2.1.3.
36 • ompt_callback_error_t, see Section 4.5.2.30.
1 2.6 parallel Construct
2 Summary
3 The parallel construct creates a team of OpenMP threads that execute the region.
4 Syntax
C / C++
5 The syntax of the parallel construct is as follows:
6 #pragma omp parallel [clause[ [,] clause] ... ] new-line
7 structured-block
21 Binding
22 The binding thread set for a parallel region is the encountering thread. The encountering thread
23 becomes the primary thread of the new team.
24 Description
25 When a thread encounters a parallel construct, a team of threads is created to execute the
26 parallel region (see Section 2.6.1 for more information about how the number of threads in the
27 team is determined, including the evaluation of the if and num_threads clauses). The thread
28 that encountered the parallel construct becomes the primary thread of the new team, with a
29 thread number of zero for the duration of the new parallel region. All threads in the new team,
30 including the primary thread, execute the region. Once the team is created, the number of threads in
31 the team remains constant for the duration of that parallel region.
1 The optional proc_bind clause, described in Section 2.6.2, specifies the mapping of OpenMP
2 threads to places within the current place partition, that is, within the places listed in the
3 place-partition-var ICV for the implicit task of the encountering thread.
4 Within a parallel region, thread numbers uniquely identify each thread. Thread numbers are
5 consecutive whole numbers ranging from zero for the primary thread up to one less than the
6 number of threads in the team. A thread may obtain its own thread number by a call to the
7 omp_get_thread_num library routine.
8 A set of implicit tasks, equal in number to the number of threads in the team, is generated by the
9 encountering thread. The structured block of the parallel construct determines the code that
10 will be executed in each implicit task. Each task is assigned to a different thread in the team and
11 becomes tied. The task region of the task that the encountering thread is executing is suspended and
12 each thread in the team executes its implicit task. Each thread can execute a path of statements that
13 is different from that of the other threads.
14 The implementation may cause any thread to suspend execution of its implicit task at a task
15 scheduling point, and to switch to execution of any explicit task generated by any of the threads in
16 the team, before eventually resuming execution of the implicit task (for more details see
17 Section 2.12).
18 An implicit barrier occurs at the end of a parallel region. After the end of a parallel region,
19 only the primary thread of the team resumes execution of the enclosing task region.
20 If a thread in a team that is executing a parallel region encounters another parallel
21 directive, it creates a new team, according to the rules in Section 2.6.1, and it becomes the primary
22 thread of that new team.
23 If execution of a thread terminates while inside a parallel region, execution of all threads in all
24 teams terminates. The order of termination of threads is unspecified. All work done by a team prior
25 to any barrier that the team has passed in the program is guaranteed to be complete. The amount of
26 work done by each thread after the last barrier that it passed and before it terminates is unspecified.
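Note – A non-normative sketch of the parallel construct; it requests a team of four threads and each implicit task prints its thread number.

#include <omp.h>
#include <stdio.h>

int main(void)
{
    #pragma omp parallel num_threads(4)
    {
        // Every thread of the new team, including the primary thread
        // (thread number 0), executes this block.
        printf("thread %d of %d\n", omp_get_thread_num(), omp_get_num_threads());
    }   // implicit barrier; only the primary thread resumes afterwards
    return 0;
}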
6 Tool Callbacks
7 A thread dispatches a registered ompt_callback_parallel_begin callback for each
8 occurrence of a parallel-begin event in that thread. The callback occurs in the task that encounters
9 the parallel construct. This callback has the type signature
10 ompt_callback_parallel_begin_t. In the dispatched callback,
11 (flags & ompt_parallel_team) evaluates to true.
12 A thread dispatches a registered ompt_callback_implicit_task callback with
13 ompt_scope_begin as its endpoint argument for each occurrence of an implicit-task-begin
14 event in that thread. Similarly, a thread dispatches a registered
15 ompt_callback_implicit_task callback with ompt_scope_end as its endpoint
16 argument for each occurrence of an implicit-task-end event in that thread. The callbacks occur in
17 the context of the implicit task and have type signature ompt_callback_implicit_task_t.
18 In the dispatched callback, (flags & ompt_task_implicit) evaluates to true.
19 A thread dispatches a registered ompt_callback_parallel_end callback for each
20 occurrence of a parallel-end event in that thread. The callback occurs in the task that encounters
21 the parallel construct. This callback has the type signature
22 ompt_callback_parallel_end_t.
23 A thread dispatches a registered ompt_callback_thread_begin callback for the
24 native-thread-begin event in that thread. The callback occurs in the context of the thread. The
25 callback has type signature ompt_callback_thread_begin_t.
26 A thread dispatches a registered ompt_callback_thread_end callback for the
27 native-thread-end event in that thread. The callback occurs in the context of the thread. The
28 callback has type signature ompt_callback_thread_end_t.
29 Restrictions
30 Restrictions to the parallel construct are as follows:
31 • A program must not depend on any ordering of the evaluations of the clauses of the parallel
32 directive, or on any side effects of the evaluations of the clauses.
33 • At most one if clause can appear on the directive.
34 • At most one proc_bind clause can appear on the directive.
35 • At most one num_threads clause can appear on the directive. The num_threads
36 expression must evaluate to a positive integer value.
C++
1 • A throw executed inside a parallel region must cause execution to resume within the same
2 parallel region, and the same thread that threw the exception must catch it.
C++
3 Cross References
4 • OpenMP execution model, see Section 1.3.
5 • num_threads clause, see Section 2.6.
6 • proc_bind clause, see Section 2.6.2.
7 • allocate clause, see Section 2.13.4.
8 • if clause, see Section 2.18.
9 • default, shared, private, firstprivate, and reduction clauses, see
10 Section 2.21.4.
11 • copyin clause, see Section 2.21.6.
12 • omp_get_thread_num routine, see Section 3.2.4.
13 • ompt_scope_begin and ompt_scope_end, see Section 4.4.4.11.
14 • ompt_callback_thread_begin_t, see Section 4.5.2.1.
15 • ompt_callback_thread_end_t, see Section 4.5.2.2.
16 • ompt_callback_parallel_begin_t, see Section 4.5.2.3.
17 • ompt_callback_parallel_end_t, see Section 4.5.2.4.
18 • ompt_callback_implicit_task_t, see Section 4.5.2.11.
Algorithm 2.1
6 let ThreadsBusy be the number of OpenMP threads currently executing in this contention group;
7 if an if clause exists
8 then let IfClauseValue be the value of the if clause expression;
9 else let IfClauseValue = true;
10 if a num_threads clause exists
11 then let ThreadsRequested be the value of the num_threads clause expression;
12 else let ThreadsRequested = value of the first element of nthreads-var;
13 let ThreadsAvailable = (thread-limit-var - ThreadsBusy + 1);
14 if (IfClauseValue = false)
15 then number of threads = 1;
16 else if (active-levels-var ≥ max-active-levels-var)
17 then number of threads = 1;
18 else if (dyn-var = true) and (ThreadsRequested ≤ ThreadsAvailable)
19 then 1 ≤ number of threads ≤ ThreadsRequested;
20 else if (dyn-var = true) and (ThreadsRequested > ThreadsAvailable)
21 then 1 ≤ number of threads ≤ ThreadsAvailable;
22 else if (dyn-var = false) and (ThreadsRequested ≤ ThreadsAvailable)
23 then number of threads = ThreadsRequested;
24 else if (dyn-var = false) and (ThreadsRequested > ThreadsAvailable)
25 then behavior is implementation defined;
2 Note – Since the initial value of the dyn-var ICV is implementation defined, programs that depend
3 on a specific number of threads for correct execution should explicitly disable dynamic adjustment
4 of the number of threads.
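Note – A non-normative sketch of the recommendation above: dynamic adjustment is explicitly disabled before requesting a specific team size.

#include <omp.h>

void needs_four_threads(void)
{
    omp_set_dynamic(0);            // set dyn-var to false
    #pragma omp parallel num_threads(4)
    {
        // With dyn-var false and enough threads available, the team has
        // exactly 4 threads; otherwise the behavior is implementation defined.
    }
}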
6 Cross References
7 • nthreads-var, dyn-var, thread-limit-var, and max-active-levels-var ICVs, see Section 2.4.
8 • parallel construct, see Section 2.6.
9 • num_threads clause, see Section 2.6.
10 • if clause, see Section 2.18.
1 2.7 teams Construct
2 Summary
3 The teams construct creates a league of initial teams and the initial thread in each team executes
4 the region.
5 Syntax
C / C++
6 The syntax of the teams construct is as follows:
7 #pragma omp teams [clause[ [,] clause] ... ] new-line
8 structured-block
23 or
24 !$omp teams [clause[ [,] clause] ... ]
25 strictly-structured-block
26 [!$omp end teams]
11 Binding
12 The binding thread set for a teams region is the encountering thread.
13 Description
14 When a thread encounters a teams construct, a league of teams is created. Each team is an initial
15 team, and the initial thread in each team executes the teams region.
16 If the num_teams clause is present, lower-bound is the specified lower bound and upper-bound is
17 the specified upper bound on the number of teams requested. If a lower bound is not specified, the
18 lower bound is set to the specified upper bound. The number of teams created is implementation
19 defined, but it will be greater than or equal to the lower bound and less than or equal to the upper
20 bound.
21 If the num_teams clause is not specified and the value of the nteams-var ICV is greater than zero,
22 the number of teams created is less than or equal to the value of the nteams-var ICV. Otherwise, the
23 number of teams created is implementation defined, but it will be greater than or equal to 1.
24 A thread may obtain the number of teams created by the construct with a call to the
25 omp_get_num_teams routine.
26 If a thread_limit clause is not present on the teams construct, but the construct is closely
27 nested inside a target construct on which the thread_limit clause is specified, the behavior
28 is as if that thread_limit clause is also specified for the teams construct.
29 As described in Section 2.4.4.1, the teams construct limits the number of threads that may
30 participate in a contention group initiated by each team by setting the value of the thread-limit-var
31 ICV for the initial task to an implementation defined value greater than zero. If the
32 thread_limit clause is specified, the number of threads will be less than or equal to the value
33 specified in the clause. Otherwise, if the teams-thread-limit-var ICV is greater than zero, the
34 number of threads will be less than or equal to that value.
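Note – A non-normative sketch of the num_teams lower/upper bound form and the thread_limit clause described above; the map clauses are illustrative.

void vadd(int n, const float *a, const float *b, float *c)
{
    // Requests between 2 and 8 teams; each team's contention group is
    // limited to at most 64 threads.
    #pragma omp target teams distribute parallel for \
            num_teams(2:8) thread_limit(64) \
            map(to: a[:n], b[:n]) map(from: c[:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}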
30 Tool Callbacks
31 A thread dispatches a registered ompt_callback_parallel_begin callback for each
32 occurrence of a teams-begin event in that thread. The callback occurs in the task that encounters the
33 teams construct. This callback has the type signature
34 ompt_callback_parallel_begin_t. In the dispatched callback,
35 (flags & ompt_parallel_league) evaluates to true.
17 Restrictions
18 Restrictions to the teams construct are as follows:
19 • A program that branches into or out of a teams region is non-conforming.
20 • A program must not depend on any ordering of the evaluations of the clauses of the teams
21 directive, or on any side effects of the evaluation of the clauses.
22 • At most one thread_limit clause can appear on the directive. The thread_limit
23 expression must evaluate to a positive integer value.
24 • At most one num_teams clause can appear on the directive. The lower-bound and upper-bound
25 specified in the num_teams clause must evaluate to positive integer values.
26 • A teams region must be strictly nested within the implicit parallel region that surrounds the
27 whole OpenMP program or a target region. If a teams region is nested inside a target
28 region, the corresponding target construct must not contain any statements, declarations or
29 directives outside of the corresponding teams construct.
30 • distribute, distribute simd, distribute parallel worksharing-loop, distribute parallel
31 worksharing-loop SIMD, parallel regions, including any parallel regions arising from
32 combined constructs, omp_get_num_teams() regions, and omp_get_team_num()
33 regions are the only OpenMP regions that may be strictly nested inside the teams region.
19 Syntax
C / C++
20 The syntax of the masked construct is as follows:
21 #pragma omp masked [ filter(integer-expression) ] new-line
22 structured-block
C / C++
Fortran
23 The syntax of the masked construct is as follows:
24 !$omp masked [ filter(scalar-integer-expression) ]
25 loosely-structured-block
26 !$omp end masked
8 Binding
9 The binding thread set for a masked region is the current team. A masked region binds to the
10 innermost enclosing parallel region.
11 Description
12 Only the threads of the team that executes the binding parallel region that the filter clause
13 selects participate in the execution of the structured block of a masked region. Other threads in the
14 team do not execute the associated structured block. No implied barrier occurs either on entry to, or
15 exit from, the masked construct.
16 If a filter clause is present on the construct and the parameter specifies the thread number of the
17 current thread in the current team then the current thread executes the associated structured block.
18 If the filter clause is not present, the construct behaves as if the parameter is a constant integer
19 expression that evaluates to zero, so that only the primary thread executes the associated structured
20 block. The use of a variable in a filter clause expression causes an implicit reference to the
21 variable in all enclosing constructs. The result of evaluating the parameter of the filter clause
22 may vary across threads.
23 If more than one thread in the team executes the structured block of a masked region, the
24 structured block must include any synchronization required to ensure that data races do not occur.
25 The master construct, which has been deprecated, has identical semantics to the masked
26 construct with no filter clause present.
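Note – A non-normative sketch of the masked construct; with filter(1), only the thread whose number in the current team is 1 executes the block.

void log_progress(void)
{
    #pragma omp parallel
    {
        // Only thread 1 of the team executes the block; no implied barrier
        // occurs on entry to or exit from the masked construct.
        #pragma omp masked filter(1)
        {
            // e.g. print or record progress information
        }
    }
}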
8 Restrictions
9 Restrictions to the masked construct are as follows:
C++
10 • A throw executed inside a masked region must cause execution to resume within the same
11 masked region, and the same thread that threw the exception must catch it.
C++
12 Cross References
13 • parallel construct, see Section 2.6.
14 • ompt_scope_begin and ompt_scope_end, see Section 4.4.4.11.
15 • ompt_callback_masked_t, see Section 4.5.2.12.
20 Syntax
C / C++
21 The syntax of the scope construct is as follows:
22 #pragma omp scope [clause[ [,] clause] ... ] new-line
23 structured-block
5 or
6 !$omp scope [clause[ [,] clause] ... ]
7 strictly-structured-block
8 [!$omp end scope [nowait]]
12 Binding
13 The binding thread set for a scope region is the current team. A scope region binds to the
14 innermost enclosing parallel region. Only the threads of the team that executes the binding parallel
15 region participate in the execution of the structured block and the implied barrier of the scope
16 region if the barrier is not eliminated by a nowait clause.
17 Description
18 All encountering threads will execute the structured block associated with the scope construct.
19 An implicit barrier occurs at the end of a scope region if the nowait clause is not specified.
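Note – A non-normative sketch of the scope construct with a reduction clause; every thread of the team executes the block and the partial sums are combined at the implicit barrier that ends the scope region. do_work is an illustrative function.

#include <omp.h>

double do_work(int thread_num);   // illustrative per-thread computation

double team_sum(void)
{
    double sum = 0.0;
    #pragma omp parallel shared(sum)
    {
        #pragma omp scope reduction(+: sum)
        {
            sum += do_work(omp_get_thread_num());
        }
    }
    return sum;
}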
25 Tool Callbacks
26 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
27 as its endpoint argument and ompt_work_scope as its wstype argument for each occurrence of a
28 scope-begin event in that thread. Similarly, a thread dispatches a registered
29 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
30 ompt_work_scope as its wstype argument for each occurrence of a scope-end event in that
31 thread. The callbacks occur in the context of the implicit task. The callbacks have type signature
32 ompt_callback_work_t.
12 Syntax
C / C++
13 The syntax of the sections construct is as follows:
14 #pragma omp sections [clause[ [,] clause] ... ] new-line
15 {
16 [#pragma omp section new-line]
17 structured-block-sequence
18 [#pragma omp section new-line
19 structured-block-sequence]
20 ...
21 }
15 Binding
16 The binding thread set for a sections region is the current team. A sections region binds to
17 the innermost enclosing parallel region. Only the threads of the team that executes the binding
18 parallel region participate in the execution of the structured block sequences and the implied
19 barrier of the sections region if the barrier is not eliminated by a nowait clause.
20 Description
21 Each structured block sequence in the sections construct is preceded by a section directive
22 except possibly the first sequence, for which a preceding section directive is optional.
23 The method of scheduling the structured block sequences among the threads in the team is
24 implementation defined.
25 An implicit barrier occurs at the end of a sections region if the nowait clause is not specified.
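Note – A non-normative sketch of the sections construct; each structured block sequence is executed once by one of the threads of the team, and the scheduling is implementation defined.

void read_input(void);
void init_tables(void);

void setup(void)
{
    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            read_input();      // executed by one thread

            #pragma omp section
            init_tables();     // possibly executed by a different thread
        }   // implicit barrier unless nowait is specified
    }
}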
12 Restrictions
13 Restrictions to the sections construct are as follows:
14 • Orphaned section directives are prohibited. That is, the section directives must appear
15 within the sections construct and must not be encountered elsewhere in the sections
16 region.
17 • The code enclosed in a sections construct must be a structured block sequence.
18 • Only a single nowait clause can appear on a sections directive.
C++
19 • A throw executed inside a sections region must cause execution to resume within the same
20 section of the sections region, and the same thread that threw the exception must catch it.
C++
21 Cross References
22 • allocate clause, see Section 2.13.4.
23 • private, firstprivate, lastprivate, and reduction clauses, see Section 2.21.4.
24 • ompt_scope_begin and ompt_scope_end, see Section 4.4.4.11.
25 • ompt_work_sections, see Section 4.4.4.15.
26 • ompt_callback_work_t, see Section 4.5.2.5.
27 • ompt_callback_dispatch_t, see Section 4.5.2.6.
7 Syntax
C / C++
8 The syntax of the single construct is as follows:
9 #pragma omp single [clause[ [,] clause] ... ] new-line
10 structured-block
21 or
22 !$omp single [clause[ [,] clause] ... ]
23 strictly-structured-block
24 [!$omp end single [end_clause[ [,] end_clause] ... ]]
9 Description
10 Only one of the encountering threads will execute the structured block associated with the single
11 construct. The method of choosing a thread to execute the structured block each time the team
12 encounters the construct is implementation defined. An implicit barrier occurs at the end of a
13 single region if the nowait clause is not specified.
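Note – A non-normative sketch of the single construct with the copyprivate clause mentioned in the restrictions below; one thread initializes its private copy of a variable and that value is broadcast to the corresponding copies of the other threads.

void broadcast_config(void)
{
    int config = 0;
    #pragma omp parallel firstprivate(config)
    {
        #pragma omp single copyprivate(config)
        {
            config = 42;   // e.g. obtained from input by exactly one thread
        }
        // after the construct, every thread's copy of config is 42
    }
}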
19 Tool Callbacks
20 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
21 as its endpoint argument for each occurrence of a single-begin event in that thread. Similarly, a
22 thread dispatches a registered ompt_callback_work callback with ompt_scope_end as its
23 endpoint argument for each occurrence of a single-end event in that thread. For each of these
24 callbacks, the wstype argument is ompt_work_single_executor if the thread executes the
25 structured block associated with the single region; otherwise, the wstype argument is
26 ompt_work_single_other. The callback has type signature ompt_callback_work_t.
27 Restrictions
28 Restrictions to the single construct are as follows:
29 • The copyprivate clause must not be used with the nowait clause.
30 • At most one nowait clause can appear on a single construct.
C++
31 • A throw executed inside a single region must cause execution to resume within the same
32 single region, and the same thread that threw the exception must catch it.
C++
14 Syntax
15 The syntax of the workshare construct is as follows:
16 !$omp workshare
17 loosely-structured-block
18 !$omp end workshare [nowait]
19 or
20 !$omp workshare
21 strictly-structured-block
22 [!$omp end workshare [nowait]]
23 Binding
24 The binding thread set for a workshare region is the current team. A workshare region binds
25 to the innermost enclosing parallel region. Only the threads of the team that executes the
26 binding parallel region participate in the execution of the units of work and the implied barrier
27 of the workshare region if the barrier is not eliminated by a nowait clause.
1 Description
2 An implicit barrier occurs at the end of a workshare region if a nowait clause is not specified.
3 An implementation of the workshare construct must insert any synchronization that is required
4 to maintain standard Fortran semantics. For example, the effects of one statement within the
5 structured block must appear to occur before the execution of succeeding statements, and the
6 evaluation of the right hand side of an assignment must appear to complete prior to the effects of
7 assigning to the left hand side.
8 The statements in the workshare construct are divided into units of work as follows:
9 • For array expressions within each statement, including transformational array intrinsic functions
10 that compute scalar values from arrays:
11 – Evaluation of each element of the array expression, including any references to elemental
12 functions, is a unit of work.
13 – Evaluation of transformational array intrinsic functions may be freely subdivided into any
14 number of units of work.
15 • For an array assignment statement, the assignment of each element is a unit of work.
16 • For a scalar assignment statement, the assignment operation is a unit of work.
17 • For a WHERE statement or construct, the evaluation of the mask expression and the masked
18 assignments are each a unit of work.
19 • For a FORALL statement or construct, the evaluation of the mask expression, expressions
20 occurring in the specification of the iteration space, and the masked assignments are each a unit
21 of work.
22 • For an atomic construct, the atomic operation on the storage location designated as x is a unit
23 of work.
24 • For a critical construct, the construct is a single unit of work.
25 • For a parallel construct, the construct is a unit of work with respect to the workshare
26 construct. The statements contained in the parallel construct are executed by a new thread
27 team.
28 • If none of the rules above apply to a portion of a statement in the structured block, then that
29 portion is a unit of work.
30 The transformational array intrinsic functions are MATMUL, DOT_PRODUCT, SUM, PRODUCT,
31 MAXVAL, MINVAL, COUNT, ANY, ALL, SPREAD, PACK, UNPACK, RESHAPE, TRANSPOSE,
32 EOSHIFT, CSHIFT, MINLOC, and MAXLOC.
33 It is unspecified how the units of work are assigned to the threads that execute a workshare
34 region.
1 If an array expression in the block references the value, association status, or allocation status of
2 private variables, the value of the expression is undefined, unless the same value would be
3 computed by every thread.
4 If an array assignment, a scalar assignment, a masked array assignment, or a FORALL assignment
5 assigns to a private variable in the block, the result is unspecified.
6 The workshare directive causes the sharing of work to occur only in the workshare construct,
7 and not in the remainder of the workshare region.
13 Tool Callbacks
14 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
15 as its endpoint argument and ompt_work_workshare as its wstype argument for each
16 occurrence of a workshare-begin event in that thread. Similarly, a thread dispatches a registered
17 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
18 ompt_work_workshare as its wstype argument for each occurrence of a workshare-end event
19 in that thread. The callbacks occur in the context of the implicit task. The callbacks have type
20 signature ompt_callback_work_t.
21 Restrictions
22 Restrictions to the workshare construct are as follows:
23 • The only OpenMP constructs that may be closely nested inside a workshare construct are the
24 atomic, critical, and parallel constructs.
25 • Base language statements that are encountered inside a workshare construct but that are not
26 enclosed within a parallel construct that is nested inside the workshare construct must
27 consist of only the following:
28 – array assignments
29 – scalar assignments
30 – FORALL statements
31 – FORALL constructs
32 – WHERE statements
33 – WHERE constructs
7 Cross References
8 • parallel construct, see Section 2.6.
9 • critical construct, see Section 2.19.1.
10 • atomic construct, see Section 2.19.7.
11 • ompt_scope_begin and ompt_scope_end, see Section 4.4.4.11.
12 • ompt_work_workshare, see Section 4.4.4.15.
13 • ompt_callback_work_t, see Section 4.5.2.5.
Fortran
1 or
Fortran
2 DO [ label ] var = lb , ub [ , incr ]
3 [intervening-code]
4 loop-body
5 [intervening-code]
6 [ label ] END DO
13 or
14 generated-canonical-loop
17 or
C / C++
18 {
19 [intervening-code]
20 loop-body
21 [intervening-code]
22 }
C / C++
1 or
Fortran
2 BLOCK
3 [intervening-code]
4 loop-body
5 [intervening-code]
6 END BLOCK
Fortran
7 or if none of the previous productions match
8 final-loop-body
10 generated-canonical-loop A generated loop from a loop transformation construct that has canonical
11 loop nest form and for which the loop body matches loop-body.
12 intervening-code A structured block sequence that does not contain OpenMP directives or calls to the
13 OpenMP runtime API in its corresponding region, referred to as intervening code. If
14 intervening code is present, then a loop at the same depth within the loop nest is not a
15 perfectly nested loop.
C / C++
16 It must not contain iteration statements, continue statements or break statements
17 that apply to the enclosing loop.
C / C++
Fortran
18 It must not contain loops, array expressions, CYCLE statements or EXIT statements.
Fortran
19 final-loop-body A structured block that terminates the scope of loops in the loop nest. If the loop nest
20 is associated with a loop-associated directive, loops in this structured block cannot be
21 associated with that directive.
C / C++
1 init-expr One of the following:
2 var = lb
3 integer-type var = lb
C
4 pointer-type var = lb
C
C++
5 random-access-iterator-type var = lb
C++
1 a1 * var-outer - a2
2 a2 - a1 * var-outer
3 var-outer * a1
4 var-outer * a1 + a2
5 a2 + var-outer * a1
6 var-outer * a1 - a2
7 a2 - var-outer * a1
8 where var-outer is of an integer type.
9 lb and ub are loop bounds. A loop for which lb or ub refers to var-outer is a
10 non-rectangular loop. If var is of an integer type, var-outer must be of an integer
11 type with the same signedness and bit precision as the type of var.
12 The coefficient in a loop bound is 0 if the bound does not refer to var-outer. If a loop
13 bound matches a form in which a1 appears, the coefficient is -a1 if the product of
14 var-outer and a1 is subtracted from a2, and otherwise the coefficient is a1. For other
15 matched forms where a1 does not appear, the coefficient is −1 if var-outer is
16 subtracted from a2, and otherwise the coefficient is 1.
17 a1, a2, incr Integer expressions that are loop invariant with respect to the outermost loop of the
18 loop nest.
19 If the loop is associated with a loop-associated directive, the expressions are
20 evaluated before the construct formed from that directive.
21 var-outer The loop iteration variable of a surrounding loop in the loop nest.
C++
22 range-decl A declaration of a variable as defined by the base language for range-based for
23 loops.
24 range-expr An expression that is valid as defined by the base language for range-based for
25 loops. It must be invariant with respect to the outermost loop of the loop nest and the
26 iterator derived from it must be a random access iterator.
C++
15 Restrictions
16 Restrictions to canonical loop nests are as follows:
C / C++
17 • If test-expr is of the form var relational-op b and relational-op is < or <= then incr-expr must
18 cause var to increase on each iteration of the loop. If test-expr is of the form var relational-op b
19 and relational-op is > or >= then incr-expr must cause var to decrease on each iteration of the
20 loop. Increase and decrease are defined according to the order induced by relational-op.
21 • If test-expr is of the form ub relational-op var and relational-op is < or <= then incr-expr must
22 cause var to decrease on each iteration of the loop. If test-expr is of the form ub relational-op
23 var and relational-op is > or >= then incr-expr must cause var to increase on each iteration of the
24 loop. Increase and decrease are defined according to the order induced by relational-op.
25 • If relational-op is != then incr-expr must cause var to always increase by 1 or always decrease
26 by 1 and the increment must be a constant expression.
27 • final-loop-body must not contain any break statement that would cause the termination of the
28 innermost loop.
C / C++
Fortran
29 • final-loop-body must not contain any EXIT statement that would cause the termination of the
30 innermost loop.
Fortran
8 Cross References
9 • Loop transformation constructs, see Section 2.11.9.
10 • threadprivate directive, see Section 2.21.2.
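As a non-normative illustration of a canonical loop nest, the following C fragment shows a rectangular outer loop and a non-rectangular inner loop whose upper bound refers to the outer loop iteration variable; the array c and the bound n are hypothetical:

    for (int i = 0; i < n; i++)        // outer canonical loop
      for (int j = 0; j <= i; j++)     // inner loop is non-rectangular: its bound refers to var-outer i
        c[i][j] = 0.0;                 // loop body with no intervening code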
26 Syntax
27 The syntax of the order clause is as follows:
28 order([ order-modifier :]concurrent)
4 Description
5 The order clause specifies an expected order of execution for the iterations of the associated loops
6 of a loop-associated directive. The specified order must be concurrent.
7 The order clause is part of the schedule specification for the purpose of determining its
8 consistency with other schedules (see Section 2.11.2).
9 If the order clause specifies concurrent, the logical iterations of the associated loops may
10 execute in any order, including concurrently.
11 If order-modifier is not unconstrained, the behavior is as if the reproducible modifier is
12 present.
13 The specified schedule is reproducible if the reproducible modifier is present.
14 Restrictions
15 Restrictions to the order clause are as follows:
16 • The only constructs that may be encountered inside a region that corresponds to a construct with
17 an order clause that specifies concurrent are the loop construct, the parallel
18 construct, the simd construct, and combined constructs for which the first construct is a
19 parallel construct.
20 • A region that corresponds to a construct with an order clause that specifies concurrent may
21 not contain calls to procedures that contain OpenMP directives.
22 • A region that corresponds to a construct with an order clause that specifies concurrent may
23 not contain calls to the OpenMP Runtime API.
24 • If a threadprivate variable is referenced inside a region that corresponds to a construct with an
25 order clause that specifies concurrent, the behavior is unspecified.
26 • At most one order clause may appear on a construct.
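As a non-normative illustration, the following C fragment combines a worksharing-loop construct with an order clause that specifies concurrent; a, b, c and n are hypothetical:

    #pragma omp parallel for order(concurrent)
    for (int i = 0; i < n; i++)
      a[i] = b[i] + c[i];   // iterations may execute in any order, including concurrently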
5 where loop-nest is a canonical loop nest and clause is one of the following:
6 private(list)
7 firstprivate(list)
8 lastprivate([lastprivate-modifier:]list)
9 linear(list[:linear-step])
10 reduction([reduction-modifier,]reduction-identifier:list)
11 schedule([modifier [, modifier]:]kind[, chunk_size])
12 collapse(n)
13 ordered[(n)]
14 nowait
15 allocate([allocator:]list)
16 order([order-modifier:]concurrent)
C / C++
Fortran
17 The syntax of the worksharing-loop construct is as follows:
18 !$omp do [clause[ [,] clause] ... ]
19 loop-nest
20 [!$omp end do [nowait]]
21 where loop-nest is a canonical loop nest and clause is one of the following:
22 private(list)
23 firstprivate(list)
24 lastprivate([lastprivate-modifier:]list)
25 linear(list[:linear-step])
26 reduction([reduction-modifier,]reduction-identifier:list)
27 schedule([modifier [, modifier]:]kind[, chunk_size])
28 collapse(n)
4 If an end do directive is not specified, an end do directive is assumed at the end of the do-loops.
Fortran
5 Binding
6 The binding thread set for a worksharing-loop region is the current team. A worksharing-loop
7 region binds to the innermost enclosing parallel region. Only the threads of the team executing
8 the binding parallel region participate in the execution of the loop iterations and the implied
9 barrier of the worksharing-loop region when that barrier is not eliminated by a nowait clause.
10 Description
11 An implicit barrier occurs at the end of a worksharing-loop region if a nowait clause is not
12 specified.
13 The collapse and ordered clauses may be used to specify the number of loops from the loop
14 nest that are associated with the worksharing-loop construct. If specified, their parameters must be
15 constant positive integer expressions.
16 The collapse clause specifies the number of loops that are collapsed into a logical iteration
17 space that is then divided according to the schedule clause. If the collapse clause is omitted,
18 the behavior is as if a collapse clause with a parameter value of one was specified.
19 If the ordered clause is specified with parameter n then the n outer loops from the associated
20 loop nest form a doacross loop nest. The parameter of the ordered clause does not affect how the
21 logical iteration space is divided.
22 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
23 range-decl of each associated loop has the value that it would have if the set of the associated loops
24 was executed sequentially. The schedule clause specifies how iterations of these associated
25 loops are divided into contiguous non-empty subsets, called chunks, and how these chunks are
26 distributed among threads of the team. Each thread executes its assigned chunks in the context of
27 its implicit task. The iterations of a given chunk are executed in sequential order by the assigned
28 thread. The chunk_size expression is evaluated using the original list items of any variables that are
29 made private in the worksharing-loop construct. Whether, in what order, or how many times, any
30 side effects of the evaluation of this expression occur is unspecified. The use of a variable in a
31 schedule clause expression of a worksharing-loop construct causes an implicit reference to the
32 variable in all enclosing constructs.
33 See Section 2.11.4.1 for details of how the schedule for a worksharing-loop region is determined.
34 The schedule kind can be one of those specified in Table 2.5.
static When kind is static, iterations are divided into chunks of size chunk_size,
and the chunks are assigned to the threads in the team in a round-robin
fashion in the order of the thread number. Each chunk contains chunk_size
iterations, except for the chunk that contains the sequentially last iteration,
which may have fewer iterations.
When no chunk_size is specified, the iteration space is divided into chunks
that are approximately equal in size, and at most one chunk is distributed to
each thread. The size of the chunks is unspecified in this case.
dynamic When kind is dynamic, the iterations are distributed to threads in the team
in chunks. Each thread executes a chunk of iterations, then requests another
chunk, until no chunks remain to be distributed.
Each chunk contains chunk_size iterations, except for the chunk that contains
the sequentially last iteration, which may have fewer iterations.
When no chunk_size is specified, it defaults to 1.
guided When kind is guided, the iterations are assigned to threads in the team in
chunks. Each thread executes a chunk of iterations, then requests another
chunk, until no chunks remain to be assigned.
monotonic When the monotonic modifier is specified then each thread executes
the chunks that it is assigned in increasing logical iteration order.
nonmonotonic When the nonmonotonic modifier is specified then chunks are
assigned to threads in any order and the behavior of an application that
depends on any execution order of the chunks is unspecified.
simd When the simd modifier is specified and the loop is associated with
a SIMD construct, the chunk_size for all chunks except the first and
last chunks is new_chunk_size = ⌈chunk_size / simd_width⌉ * simd_width,
where simd_width is an implementation-defined value.
The first chunk will have at least new_chunk_size iterations except if
it is also the last chunk. The last chunk may have fewer iterations than
new_chunk_size. If the simd modifier is specified and the loop is not
associated with a SIMD construct, the modifier is ignored.
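As a non-normative illustration of the schedule clause and the kinds listed above, the following C fragment requests dynamic scheduling with a chunk size of 4; f, x, y and n are hypothetical:

    #pragma omp parallel
    {
      #pragma omp for schedule(dynamic, 4)
      for (int i = 0; i < n; i++)
        y[i] = f(x[i]);   // chunks of 4 iterations are handed out to threads on request
    }

With schedule(static) and no chunk_size, the iteration space would instead be divided into approximately equal chunks, at most one per thread.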
8 Tool Callbacks
9 A thread dispatches a registered ompt_callback_work callback with ompt_scope_begin
10 as its endpoint argument and ompt_work_loop as its wstype argument for each occurrence of a
11 ws-loop-begin event in that thread. Similarly, a thread dispatches a registered
12 ompt_callback_work callback with ompt_scope_end as its endpoint argument and
13 ompt_work_loop as its wstype argument for each occurrence of a ws-loop-end event in that
14 thread. The callbacks occur in the context of the implicit task. The callbacks have type signature
15 ompt_callback_work_t.
16 A thread dispatches a registered ompt_callback_dispatch callback for each occurrence of a
17 ws-loop-iteration-begin event in that thread. The callback occurs in the context of the implicit task.
18 The callback has type signature ompt_callback_dispatch_t.
24 Cross References
25 • ICVs, see Section 2.4.
[Figure: Determining the schedule for a worksharing-loop region. If no schedule clause is present, the schedule kind given by the def-sched-var ICV is used. If a schedule clause is present and its kind value is runtime, the schedule kind given by the run-sched-var ICV is used. Otherwise, the schedule kind specified in the schedule clause is used.]
7 Syntax
C / C++
8 The syntax of the simd construct is as follows:
9 #pragma omp simd [clause[ [,] clause] ... ] new-line
10 loop-nest
17 where loop-nest is a canonical loop nest and clause is one of the following:
18 if([ simd :] scalar-logical-expression)
19 safelen(length)
20 simdlen(length)
21 linear(list[ : linear-step])
22 aligned(list[ : alignment])
23 nontemporal(list)
24 private(list)
25 lastprivate([ lastprivate-modifier:] list)
26 reduction([ reduction-modifier,]reduction-identifier : list)
27 collapse(n)
28 order([ order-modifier :]concurrent)
29 If an end simd directive is not specified, an end simd directive is assumed at the end of the
30 do-loops.
Fortran
4 Description
5 The simd construct enables the execution of multiple iterations of the associated loops
6 concurrently by using SIMD instructions.
7 The collapse clause may be used to specify how many loops are associated with the simd
8 construct. The collapse clause specifies the number of loops that are collapsed into a logical
9 iteration space that is then executed with SIMD instructions. The parameter of the collapse
10 clause must be a constant positive integer expression. If the collapse clause is omitted, the
11 behavior is as if a collapse clause with a parameter value of one was specified.
12 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
13 range-decl of each associated loop has the value that it would have if the set of the associated loops
14 was executed sequentially. The number of iterations that are executed concurrently at any given
15 time is implementation defined. Each concurrent iteration will be executed by a different SIMD
16 lane. Each set of concurrent iterations is a SIMD chunk. Lexical forward dependences in the
17 iterations of the original loop must be preserved within each SIMD chunk, unless an order clause
18 that specifies concurrent is present.
19 The safelen clause specifies that no two concurrent iterations within a SIMD chunk can have a
20 distance in the logical iteration space that is greater than or equal to the value given in the clause.
21 The parameter of the safelen clause must be a constant positive integer expression. The
22 simdlen clause specifies the preferred number of iterations to be executed concurrently, unless an
23 if clause is present and evaluates to false, in which case the preferred number of iterations to be
24 executed concurrently is one. The parameter of the simdlen clause must be a constant positive
25 integer expression.
26 If an order clause is present then the semantics are as described in Section 2.11.3.
C / C++
27 The aligned clause declares that the object to which each list item points is aligned to the
28 number of bytes expressed in the optional parameter of the aligned clause.
C / C++
Fortran
29 The aligned clause declares that the location of each list item is aligned to the number of bytes
30 expressed in the optional parameter of the aligned clause.
Fortran
31 The optional parameter of the aligned clause, alignment, must be a constant positive integer
32 expression. If no optional parameter is specified, implementation-defined default alignments for
33 SIMD instructions on the target platforms are assumed.
34 The nontemporal clause specifies that accesses to the storage locations to which the list items
35 refer have low temporal locality across the iterations in which those storage locations are accessed.
7 Cross References
8 • Canonical loop nest form, see Section 2.11.1.
9 • order clause, see Section 2.11.3.
10 • if clause, see Section 2.18.
11 • Data-sharing attribute clauses, see Section 2.21.4.
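As a non-normative illustration of the simd construct, the following C fragment vectorizes a dot product; x, y and n are hypothetical, and x and y are assumed to be 64-byte aligned float pointers:

    float dot = 0.0f;
    #pragma omp simd reduction(+:dot) simdlen(8) aligned(x, y : 64)
    for (int i = 0; i < n; i++)
      dot += x[i] * y[i];   // iterations execute in SIMD chunks; partial sums are reduced into dot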
18 Syntax
C / C++
19 The syntax of the worksharing-loop SIMD construct is as follows:
20 #pragma omp for simd [clause[ [,] clause] ... ] new-line
21 loop-nest
22 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the for
23 or simd directives with identical meanings and restrictions.
C / C++
5 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the simd
6 or do directives, with identical meanings and restrictions.
7 If an end do simd directive is not specified, an end do simd directive is assumed at the end of
8 the do-loops.
Fortran
9 Description
10 The worksharing-loop SIMD construct will first distribute the logical iterations of the associated
11 loops across the implicit tasks of the parallel region in a manner consistent with any clauses that
12 apply to the worksharing-loop construct. Each resulting chunk of iterations will then be converted
13 to a SIMD loop in a manner consistent with any clauses that apply to the simd construct.
16 Tool Callbacks
17 This composite construct dispatches the same callbacks as the worksharing-loop construct.
18 Restrictions
19 All restrictions to the worksharing-loop construct and the simd construct apply to the
20 worksharing-loop SIMD construct. In addition, the following restrictions apply:
21 • No ordered clause with a parameter can be specified.
22 • A list item may appear in a linear or firstprivate clause, but not in both.
23 Cross References
24 • Canonical loop nest form, see Section 2.11.1.
25 • Worksharing-loop construct, see Section 2.11.4.
26 • simd construct, see Section 2.11.5.1.
27 • Data-sharing attribute clauses, see Section 2.21.4.
8 Syntax
C / C++
9 The syntax of the declare simd directive is as follows:
10 #pragma omp declare simd [clause[ [,] clause] ... ] new-line
11 [#pragma omp declare simd [clause[ [,] clause] ... ] new-line]
12 [ ... ]
13 function definition or declaration
9 Restrictions
10 Restrictions to the declare simd directive are as follows:
11 • Each argument can appear in at most one uniform or linear clause.
12 • At most one simdlen clause can appear in a declare simd directive.
13 • Either inbranch or notinbranch may be specified, but not both.
14 • When a linear-step expression is specified in a linear clause it must be either a constant integer
15 expression or an integer-typed parameter that is specified in a uniform clause on the directive.
16 • The function or subroutine body must be a structured block.
17 • The execution of the function or subroutine, when called from a SIMD loop, cannot result in the
18 execution of an OpenMP construct except for an ordered construct with the simd clause or an
19 atomic construct.
20 • The execution of the function or subroutine cannot have any side effects that would alter its
21 execution for concurrent iterations of a SIMD chunk.
22 • A program that branches into or out of the function is non-conforming.
C / C++
23 • If the function has any declarations, then the declare simd directive for any declaration that
24 has one must be equivalent to the one specified for the definition. Otherwise, the result is
25 unspecified.
26 • The function cannot contain calls to the longjmp or setjmp functions.
C / C++
C
27 • The type of list items appearing in the aligned clause must be array or pointer.
C
21 Cross References
22 • linear clause, see Section 2.21.4.6.
23 • reduction clause, see Section 2.21.5.4.
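As a non-normative illustration, the following C fragment declares a SIMD-callable function and calls it from a SIMD loop; the names axpy, scale, v, w and n are hypothetical:

    #pragma omp declare simd uniform(scale, v) linear(i : 1) notinbranch
    float axpy(float scale, const float *v, int i)
    {
      return scale * v[i];   // no side effects, so concurrent execution across SIMD lanes is safe
    }

    #pragma omp simd
    for (int i = 0; i < n; i++)
      w[i] += axpy(2.0f, v, i);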
5 where loop-nest is a canonical loop nest and clause is one of the following:
6 private(list)
7 firstprivate(list)
8 lastprivate(list)
9 collapse(n)
10 dist_schedule(kind[, chunk_size])
11 allocate([allocator :]list)
12 order([ order-modifier :]concurrent)
C / C++
Fortran
13 The syntax of the distribute construct is as follows:
14 !$omp distribute [clause[ [,] clause] ... ]
15 loop-nest
16 [!$omp end distribute]
17 where loop-nest is a canonical loop nest and clause is one of the following:
18 private(list)
19 firstprivate(list)
20 lastprivate(list)
21 collapse(n)
22 dist_schedule(kind[, chunk_size])
23 allocate([allocator :]list)
24 order([ order-modifier :]concurrent)
4 Description
5 The distribute construct is associated with a loop nest consisting of one or more loops that
6 follow the directive.
7 The collapse clause may be used to specify how many loops are associated with the
8 distribute construct. The parameter of the collapse clause must be a constant positive
9 integer expression. If the collapse clause is omitted, the behavior is as if a collapse clause
10 with a parameter value of one was specified.
11 No implicit barrier occurs at the end of a distribute region. To avoid data races the original list
12 items that are modified due to lastprivate or linear clauses should not be accessed between
13 the end of the distribute construct and the end of the teams region to which the
14 distribute binds.
15 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
16 range-decl of each associated loop has the value that it would have if the set of the associated loops
17 was executed sequentially.
18 If the dist_schedule clause is specified, kind must be static. If specified, iterations are
19 divided into chunks of size chunk_size. These chunks are assigned to the initial teams of the league
20 in a round-robin fashion in the order of the initial team number. When chunk_size is not specified,
21 the iteration space is divided into chunks that are approximately equal in size, and at most one
22 chunk is distributed to each initial team of the league.
23 When the dist_schedule clause is not specified, the schedule is implementation defined.
24 If an order clause is present then the semantics are as described in Section 2.11.3.
25 The schedule is reproducible if one of the following conditions is true:
26 • The order clause is present and uses the reproducible modifier; or
27 • The dist_schedule clause is specified with static as the kind parameter.
28 Programs can only depend on which team executes a particular iteration if the schedule is
29 reproducible. Schedule reproducibility is also used for determining its consistency with other
30 schedules (see Section 2.11.2).
9 Restrictions
10 Restrictions to the distribute construct are as follows:
11 • The distribute construct inherits the restrictions of the worksharing-loop construct.
12 • Each distribute region must be encountered by the initial threads of all initial teams in a
13 league or by none at all.
14 • The sequence of the distribute regions encountered must be the same for every initial thread
15 of every initial team in a league.
16 • The region that corresponds to the distribute construct must be strictly nested inside a
17 teams region.
18 • A list item may appear in a firstprivate or lastprivate clause, but not in both.
19 • At most one dist_schedule clause can appear on the directive.
20 • If the dist_schedule clause is present then none of the associated loops may be
21 non-rectangular loops.
22 Cross References
23 • teams construct, see Section 2.7
24 • Canonical loop nest form, see Section 2.11.1.
25 • order clause, see Section 2.11.3.
26 • Worksharing-loop construct, see Section 2.11.4.
27 • tile construct, see Section 2.11.9.1.
28 • ompt_work_distribute, see Section 4.4.4.15.
29 • ompt_callback_work_t, see Section 4.5.2.5.
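As a non-normative illustration, the following C fragment distributes the iterations of a loop across the initial teams of a league; a, b and n are hypothetical:

    #pragma omp teams num_teams(4)
    #pragma omp distribute dist_schedule(static, 256)
    for (int i = 0; i < n; i++)
      a[i] = 2.0 * b[i];   // chunks of 256 iterations are assigned to the teams round-robin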
6 Syntax
C / C++
7 The syntax of the distribute simd construct is as follows:
8 #pragma omp distribute simd [clause[ [,] clause] ... ] new-line
9 loop-nest
10 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
11 distribute or simd directives with identical meanings and restrictions.
C / C++
Fortran
12 The syntax of the distribute simd construct is as follows:
13 !$omp distribute simd [clause[ [,] clause] ... ]
14 loop-nest
15 [!$omp end distribute simd]
16 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
17 distribute or simd directives with identical meanings and restrictions.
18 If an end distribute simd directive is not specified, an end distribute simd directive is
19 assumed at the end of the do-loops.
Fortran
20 Description
21 The distribute simd construct will first distribute the logical iterations of the associated loops
22 across the initial tasks of the teams region in a manner consistent with any clauses that apply to
23 the distribute construct. Each resulting chunk of iterations will then be converted to a SIMD
24 loop in a manner consistent with any clauses that apply to the simd construct.
27 Tool Callbacks
28 This composite construct dispatches the same callbacks as the distribute construct.
7 Cross References
8 • Canonical loop nest form, see Section 2.11.1.
9 • simd construct, see Section 2.11.5.1.
10 • distribute construct, see Section 2.11.6.1.
11 • Data-sharing attribute clauses, see Section 2.21.4.
17 Syntax
C / C++
18 The syntax of the distribute parallel worksharing-loop construct is as follows:
19 #pragma omp distribute parallel for [clause[ [,] clause] ... ] new-line
20 loop-nest
21 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
22 distribute or parallel worksharing-loop directives with identical meanings and restrictions.
C / C++
Fortran
23 The syntax of the distribute parallel worksharing-loop construct is as follows:
24 !$omp distribute parallel do [clause[ [,] clause] ... ]
25 loop-nest
26 [!$omp end distribute parallel do]
27 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
28 distribute or parallel worksharing-loop directives with identical meanings and restrictions.
29 If an end distribute parallel do directive is not specified, an end distribute
30 parallel do directive is assumed at the end of the do-loops.
Fortran
10 Tool Callbacks
11 This composite construct dispatches the same callbacks as the distribute and parallel
12 worksharing-loop constructs.
13 Restrictions
14 All restrictions to the distribute and parallel worksharing-loop constructs apply to the
15 distribute parallel worksharing-loop construct. In addition, the following restrictions apply:
16 • No ordered clause can be specified.
17 • No linear clause can be specified.
18 • The conditional modifier must not appear in a lastprivate clause.
19 Cross References
20 • Canonical loop nest form, see Section 2.11.1.
21 • distribute construct, see Section 2.11.6.1.
22 • Parallel worksharing-loop construct, see Section 2.16.1.
23 • Data-sharing attribute clauses, see Section 2.21.4.
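As a non-normative illustration, the following C fragment uses the composite construct so that the chunks produced by distribute are further divided among the threads of each team's parallel region; a, b, s and n are hypothetical:

    #pragma omp teams
    #pragma omp distribute parallel for
    for (int i = 0; i < n; i++)
      a[i] = s * b[i];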
6 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
7 distribute or parallel worksharing-loop SIMD directives with identical meanings and
8 restrictions.
C / C++
Fortran
9 The syntax of the distribute parallel worksharing-loop SIMD construct is as follows:
10 !$omp distribute parallel do simd [clause[ [,] clause] ... ]
11 loop-nest
12 [!$omp end distribute parallel do simd]
13 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
14 distribute or parallel worksharing-loop SIMD directives with identical meanings and
15 restrictions.
16 If an end distribute parallel do simd directive is not specified, an end distribute
17 parallel do simd directive is assumed at the end of the do-loops.
Fortran
18 Description
19 The distribute parallel worksharing-loop SIMD construct will first distribute the logical iterations
20 of the associated loops across the initial tasks of the teams region in a manner consistent with any
21 clauses that apply to the distribute construct. Each resulting chunk of iterations will then
22 execute as if part of a parallel worksharing-loop SIMD region in a manner consistent with any
23 clauses that apply to the parallel worksharing-loop SIMD construct.
27 Tool Callbacks
28 This composite construct dispatches the same callbacks as the distribute and parallel
29 worksharing-loop SIMD constructs.
13 Cross References
14 • Canonical loop nest form, see Section 2.11.1.
15 • distribute construct, see Section 2.11.6.1.
16 • Parallel worksharing-loop SIMD construct, see Section 2.16.5.
17 • Data-sharing attribute clauses, see Section 2.21.4.
22 Syntax
C / C++
23 The syntax of the loop construct is as follows:
24 #pragma omp loop [clause[ [,] clause] ... ] new-line
25 loop-nest
26 where loop-nest is a canonical loop nest and clause is one of the following:
27 bind(binding)
28 collapse(n)
29 order([ order-modifier :]concurrent)
30 private(list)
11 where loop-nest is a canonical loop nest and clause is one of the following:
12 bind(binding)
13 collapse(n)
14 order([ order-modifier :]concurrent)
15 private(list)
16 lastprivate(list)
17 reduction([default ,]reduction-identifier : list)
22 If an end loop directive is not specified, an end loop directive is assumed at the end of the
23 do-loops.
Fortran
14 Description
15 The loop construct is associated with a loop nest that consists of one or more loops that follow the
16 directive. The directive asserts that the iterations may execute in any order, including concurrently.
17 The collapse clause may be used to specify how many loops are associated with the loop
18 construct. The parameter of the collapse clause must be a constant positive integer expression.
19 If the collapse clause is omitted, the behavior is as if a collapse clause with a parameter
20 value of one was specified. The collapse clause specifies the number of loops that are collapsed
21 into a logical iteration space.
22 At the beginning of each logical iteration, the loop iteration variable or the variable declared by
23 range-decl of each associated loop has the value that it would have if the set of the associated loops
24 was executed sequentially.
25 Each logical iteration is executed once per instance of the loop region that is encountered by the
26 binding thread set.
27 If an order clause is present then the semantics are as described in Section 2.11.3. If the order
28 clause is not present, the behavior is as if an order clause that specifies concurrent appeared
29 on the construct.
30 The set of threads that may execute the iterations of the loop region is the binding thread set. Each
31 iteration is executed by one thread from this set.
32 If the loop region binds to a teams region, the threads in the binding thread set may continue
33 execution after the loop region without waiting for all logical iterations of the associated loops to
34 complete. The iterations are guaranteed to complete before the end of the teams region.
35 If the loop region does not bind to a teams region, all logical iterations of the associated loops
36 must complete before the encountering threads continue execution after the loop region.
37 For the purpose of determining its consistency with other schedules (see Section 2.11.2), the
38 schedule is defined by the implicit order clause.
3 Restrictions
4 Restrictions to the loop construct are as follows:
5 • At most one collapse clause can appear on a loop directive.
6 • A list item may not appear in a lastprivate clause unless it is the loop iteration variable of a
7 loop that is associated with the construct.
8 • If a loop construct is not nested inside another OpenMP construct and it appears in a procedure,
9 the bind clause must be present.
10 • If a loop region binds to a teams or parallel region, it must be encountered by all threads in
11 the binding thread set or by none of them.
12 • At most one bind clause can appear on a loop directive.
13 • If the bind clause is present on the loop construct and binding is teams then the
14 corresponding loop region must be strictly nested inside a teams region.
15 • If the bind clause, with teams specified as binding, is present on the loop construct and the
16 corresponding loop region executes on a non-host device then the behavior of a reduction
17 clause that appears on the construct is unspecified if the construct is not nested inside a teams
18 construct.
19 • If the bind clause is present on the loop construct and binding is parallel then the
20 behavior is unspecified if the corresponding loop region is closely nested inside a simd region.
21 Cross References
22 • The single construct, see Section 2.10.2.
23 • Canonical loop nest form, see Section 2.11.1.
24 • order clause, see Section 2.11.3.
25 • The Worksharing-Loop construct, see Section 2.11.4.
26 • SIMD directives, see Section 2.11.5.
27 • distribute construct, see Section 2.11.6.1.
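As a non-normative illustration, the following C fragment binds a loop region to the innermost enclosing parallel region; a and n are hypothetical:

    #pragma omp parallel
    #pragma omp loop bind(parallel)
    for (int i = 0; i < n; i++)
      a[i] = 2 * a[i];   // iterations may run in any order across the threads of the team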
11 and where the containing loop body belongs to the innermost loop that is associated with the
12 directive of an enclosing for, for simd, or simd construct.
C / C++
Fortran
13 The syntax of the scan directive and the loop body that contains it is as follows:
14 structured-block-sequence
15 !$omp scan clause
16 structured-block-sequence
20 and where the containing loop body belongs to the innermost loop that is associated with the
21 directive of an enclosing do, do simd, or simd construct.
Fortran
7 Restrictions
8 Restrictions to the scan directive are as follows:
9 • Exactly one scan directive must be associated with a given worksharing-loop, worksharing-loop
10 SIMD, or simd directive on which a reduction clause with the inscan modifier is present.
11 • The loops that are associated with the directive to which the scan directive is associated must
12 all be perfectly nested.
13 • A list item that appears in the inclusive or exclusive clause must appear in a
14 reduction clause with the inscan modifier on the associated worksharing-loop,
15 worksharing-loop SIMD, or simd construct.
16 • Cross-iteration dependences across different logical iterations must not exist, except for
17 dependences for the list items specified in an inclusive or exclusive clause.
18 • Intra-iteration dependences from a statement in the structured block sequence that precedes a
19 scan directive to a statement in the structured block sequence that follows a scan directive
20 must not exist, except for dependences for the list items specified in an inclusive or
21 exclusive clause.
22 • The private copy of list items that appear in the inclusive or exclusive clause may not be
23 modified in the scan phase.
24 Cross References
25 • Worksharing-loop construct, see Section 2.11.4.
26 • simd construct, see Section 2.11.5.1.
27 • Worksharing-loop SIMD construct, see Section 2.11.5.2.
28 • reduction clause, see Section 2.21.5.4.
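As a non-normative illustration, the following C fragment computes an inclusive prefix sum with a scan directive; a, b and n are hypothetical:

    int sum = 0;
    #pragma omp parallel for reduction(inscan, +:sum)
    for (int i = 0; i < n; i++) {
      sum += a[i];                      // input phase
      #pragma omp scan inclusive(sum)
      b[i] = sum;                       // scan phase reads the inclusive prefix sum
    }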
7 Cross References
8 • Canonical loop nest form, see Section 2.11.1.
12 Syntax
C / C++
13 The syntax of the tile construct is as follows:
14 #pragma omp tile sizes(size-list) new-line
15 loop-nest
16 where loop-nest is a canonical loop nest and size-list is a list s1, ..., sn of positive integer
17 expressions.
C / C++
Fortran
18 The syntax of the tile construct is as follows:
19 !$omp tile sizes(size-list)
20 loop-nest
21 [!$omp end tile]
22 where loop-nest is a canonical loop nest and size-list is a list s1, ..., sn of positive integer
23 expressions.
24 If an end tile directive is not specified, an end tile directive is assumed at the end of the
25 do-loops.
Fortran
13 The floor loops iterate over all tiles T_{α1,...,αn} ∈ F in lexicographic order with respect to their
14 indices (α1, ..., αn), and the tile loops iterate over the iterations in T_{α1,...,αn} in the lexicographic
15 order of the corresponding iteration vectors. An implementation may reorder the sequential
16 execution of two iterations if at least one is from a partial tile and if their respective logical iteration
17 vectors in loop-nest do not have a product order relation.
18 Restrictions
19 Restrictions to the tile construct are as follows:
20 • The depth of the associated loop nest must be greater than or equal to n.
21 • All loops that are associated with the construct must be perfectly nested.
22 • No loop that is associated with the construct may be a non-rectangular loop.
23 Cross References
24 • Canonical loop nest form, see Section 2.11.1.
25 • Worksharing-loop construct, see Section 2.11.4.
26 • distribute construct, see Section 2.11.6.1.
27 • taskloop construct, see Section 2.12.2.
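As a non-normative illustration, the following C fragment tiles a perfectly nested, rectangular loop nest into 32x32 tiles; a, b, c, n and m are hypothetical:

    #pragma omp tile sizes(32, 32)
    for (int i = 0; i < n; i++)
      for (int j = 0; j < m; j++)
        c[i][j] = a[i][j] + b[i][j];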
4 Syntax
C / C++
5 The syntax of the unroll construct is as follows:
6 #pragma omp unroll [clause] new-line
7 loop-nest
8 where loop-nest is a canonical loop nest and clause is one of the following:
9 full
10 partial[(unroll-factor)]
16 where loop-nest is a canonical loop nest and clause is one of the following:
17 full
18 partial[(unroll-factor)]
22 Description
23 The unroll construct controls the outermost loop of the loop nest.
24 When the full clause is specified, the associated loop is fully unrolled – it is replaced with n
25 instances of its loop body, one for each logical iteration of the associated loop and in the order of its
26 logical iterations. The construct is replaced by a structured block that only contains the n loop body
27 instances.
8 Restrictions
9 Restrictions to the unroll construct are as follows:
10 • If the full clause is specified, the iteration count of the loop must be a compile-time constant.
11 Cross References
12 • Canonical loop nest form, see Section 2.11.1.
13 • tile construct, see Section 2.11.9.1.
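As a non-normative illustration, the following C fragment requests partial unrolling of the outermost loop by a factor of 4; a and n are hypothetical:

    #pragma omp unroll partial(4)
    for (int i = 0; i < n; i++)
      a[i] = a[i] + 1.0;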
18 Syntax
C / C++
19 The syntax of the task construct is as follows:
20 #pragma omp task [clause[ [,] clause] ... ] new-line
21 structured-block
15 or
16 !$omp task [clause[ [,] clause] ... ]
17 strictly-structured-block
18 [!$omp end task]
7 Binding
8 The binding thread set of the task region is the current team. A task region binds to the
9 innermost enclosing parallel region.
10 Description
11 The task construct is a task generating construct. When a thread encounters a task construct, an
12 explicit task is generated from the code for the associated structured block. The data environment
13 of the task is created according to the data-sharing attribute clauses on the task construct, per-data
14 environment ICVs, and any defaults that apply. The data environment of the task is destroyed when
15 the execution code of the associated structured block is completed.
16 The encountering thread may immediately execute the task, or defer its execution. In the latter case,
17 any thread in the team may be assigned the task. Completion of the task can be guaranteed using
18 task synchronization constructs and clauses. If a task construct is encountered during execution
19 of an outer task, the generated task region that corresponds to this construct is not a part of the
20 outer task region unless the generated task is an included task.
21 If a detach clause is present on a task construct a new allow-completion event is created and
22 connected to the completion of the associated task region. The original event-handle is updated
23 to represent that allow-completion event before the task data environment is created. The
24 event-handle is considered as if it was specified on a firstprivate clause. The use of a
25 variable in a detach clause expression of a task construct causes an implicit reference to the
26 variable in all enclosing constructs.
27 If no detach clause is present on a task construct the generated task is completed when the
28 execution of its associated structured block is completed. If a detach clause is present on a task
29 construct, the task is completed when the execution of its associated structured block is completed
30 and the allow-completion event is fulfilled.
31 When an if clause is present on a task construct and the if clause expression evaluates to false,
32 an undeferred task is generated, and the encountering thread must suspend the current task region,
33 for which execution cannot be resumed until execution of the structured block that is associated
34 with the generated task is completed. The use of a variable in an if clause expression of a task
35 construct causes an implicit reference to the variable in all enclosing constructs.
9 Tool Callbacks
10 A thread dispatches a registered ompt_callback_task_create callback for each occurrence
11 of a task-create event in the context of the encountering task. This callback has the type signature
12 ompt_callback_task_create_t and the flags argument indicates the task types shown in
13 Table 2.7.
14 Restrictions
15 Restrictions to the task construct are as follows:
16 • A program must not depend on any ordering of the evaluations of the clauses of the task
17 directive, or on any side effects of the evaluations of the clauses.
18 • At most one if clause can appear on the directive.
19 • At most one final clause can appear on the directive.
20 • At most one priority clause can appear on the directive.
21 • At most one detach clause can appear on the directive.
22 • If a detach clause appears on the directive, then a mergeable clause cannot appear on the
23 same directive.
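As a non-normative illustration, the following C fragment generates one explicit task per list node; node_t, head and process are hypothetical:

    #pragma omp parallel
    #pragma omp single
    {
      for (node_t *p = head; p != NULL; p = p->next) {
        #pragma omp task firstprivate(p)
        process(p);                  // each node may be processed by any thread in the team
      }
    }                                // all generated tasks complete at the implicit barrier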
17 Syntax
C / C++
18 The syntax of the taskloop construct is as follows:
19 #pragma omp taskloop [clause[[,] clause] ...] new-line
20 loop-nest
21 where loop-nest is a canonical loop nest and clause is one of the following:
22 if([ taskloop :] scalar-expression)
23 shared(list)
24 private(list)
25 firstprivate(list)
26 lastprivate(list)
27 reduction([default ,]reduction-identifier : list)
28 in_reduction(reduction-identifier : list)
15 where loop-nest is a canonical loop nest and clause is one of the following:
16 if([ taskloop :] scalar-logical-expression)
17 shared(list)
18 private(list)
19 firstprivate(list)
20 lastprivate(list)
21 reduction([default ,]reduction-identifier : list)
22 in_reduction(reduction-identifier : list)
23 default(data-sharing-attribute)
24 grainsize([strict:]grain-size)
25 num_tasks([strict:]num-tasks)
26 collapse(n)
27 final(scalar-logical-expr)
28 priority(priority-value)
29 untied
30 mergeable
3 If an end taskloop directive is not specified, an end taskloop directive is assumed at the end
4 of the do-loops.
Fortran
5 Binding
6 The binding thread set of the taskloop region is the current team. A taskloop region binds to
7 the innermost enclosing parallel region.
8 Description
9 The taskloop construct is a task generating construct. When a thread encounters a taskloop
10 construct, the construct partitions the iterations of the associated loops into explicit tasks for
11 parallel execution. The data environment of each generated task is created according to the
12 data-sharing attribute clauses on the taskloop construct, per-data environment ICVs, and any
13 defaults that apply. The order of the creation of the loop tasks is unspecified. Programs that rely on
14 any execution order of the logical iterations are non-conforming.
15 By default, the taskloop construct executes as if it was enclosed in a taskgroup construct
16 with no statements or directives outside of the taskloop construct. Thus, the taskloop
17 construct creates an implicit taskgroup region. If the nogroup clause is present, no implicit
18 taskgroup region is created.
19 If a reduction clause is present, the behavior is as if a task_reduction clause with the
20 same reduction operator and list items was applied to the implicit taskgroup construct that
21 encloses the taskloop construct. The taskloop construct executes as if each generated task
22 was defined by a task construct on which an in_reduction clause with the same reduction
23 operator and list items is present. Thus, the generated tasks are participants of the reduction defined
24 by the task_reduction clause that was applied to the implicit taskgroup construct.
25 If an in_reduction clause is present, the behavior is as if each generated task was defined by a
26 task construct on which an in_reduction clause with the same reduction operator and list
27 items is present. Thus, the generated tasks are participants of a reduction previously defined by a
28 reduction scoping clause.
29 If a grainsize clause is present, the number of logical iterations assigned to each generated task
30 is greater than or equal to the minimum of the value of the grain-size expression and the number of
31 logical iterations, but less than two times the value of the grain-size expression. If the grainsize
32 clause has the strict modifier, the number of logical iterations assigned to each generated task is
33 equal to the value of the grain-size expression, except for the generated task that contains the
34 sequentially last iteration, which may have fewer iterations. The parameter of the grainsize
35 clause must be a positive integer expression.
36 If num_tasks is specified, the taskloop construct creates as many tasks as the minimum of the
37 num-tasks expression and the number of logical iterations. Each task must have at least one logical
iteration.
16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_work callback for each occurrence of a
18 taskloop-begin and taskloop-end event in that thread. The callback occurs in the context of the
19 encountering task. The callback has type signature ompt_callback_work_t. The callback
20 receives ompt_scope_begin or ompt_scope_end as its endpoint argument, as appropriate,
21 and ompt_work_taskloop as its wstype argument.
22 A thread dispatches a registered ompt_callback_dispatch callback for each occurrence of a
23 taskloop-iteration-begin event in that thread. The callback occurs in the context of the encountering
24 task. The callback has type signature ompt_callback_dispatch_t.
25 Restrictions
26 Restrictions to the taskloop construct are as follows:
27 • If a reduction clause is present, the nogroup clause must not be specified.
28 • The same list item cannot appear in both a reduction and an in_reduction clause.
29 • At most one grainsize clause can appear on the directive.
30 • At most one num_tasks clause can appear on the directive.
31 • Neither the grainsize clause nor the num_tasks clause may appear on the directive if any
32 of the associated loops is a non-rectangular loop.
33 • The grainsize clause and num_tasks clause are mutually exclusive and may not appear on
34 the same taskloop directive.
35 • At most one collapse clause can appear on the directive.
4 Cross References
5 • Canonical loop nest form, see Section 2.11.1.
6 • tile construct, see Section 2.11.9.1.
7 • task construct, Section 2.12.1.
8 • if clause, see Section 2.18.
9 • taskgroup construct, Section 2.19.6.
10 • Data-sharing attribute clauses, Section 2.21.4.
11 • ompt_scope_begin and ompt_scope_end, see Section 4.4.4.11.
12 • ompt_work_taskloop, see Section 4.4.4.15.
13 • ompt_callback_work_t, see Section 4.5.2.5.
14 • ompt_callback_dispatch_t, see Section 4.5.2.6.
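As a non-normative illustration, the following C fragment partitions the loop iterations into tasks of at least 1024 iterations each; x, y, a and n are hypothetical:

    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop grainsize(1024)
    for (long i = 0; i < n; i++)
      y[i] = a * x[i] + y[i];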
20 Syntax
C / C++
21 The syntax of the taskloop simd construct is as follows:
22 #pragma omp taskloop simd [clause[[,] clause] ...] new-line
23 loop-nest
24 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
25 taskloop or simd directives with identical meanings and restrictions.
C / C++
5 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
6 taskloop or simd directives with identical meanings and restrictions.
7 If an end taskloop simd directive is not specified, an end taskloop simd directive is
8 assumed at the end of the do-loops.
Fortran
9 Binding
10 The binding thread set of the taskloop simd region is the current team. A taskloop simd
11 region binds to the innermost enclosing parallel region.
12 Description
13 The taskloop simd construct first distributes the iterations of the associated loops across tasks
14 in a manner consistent with any clauses that apply to the taskloop construct. The resulting tasks
15 are then converted to a SIMD loop in a manner consistent with any clauses that apply to the simd
16 construct, except for the collapse clause. For the purposes of each task’s conversion to a SIMD
17 loop, the collapse clause is ignored and the effect of any in_reduction clause is as if a
18 reduction clause with the same reduction operator and list items is present on the simd
19 construct.
22 Tool Callbacks
23 This composite construct dispatches the same callbacks as the taskloop construct.
24 Restrictions
25 Restrictions to the taskloop simd construct are as follows:
26 • The restrictions for the taskloop and simd constructs apply.
27 • The conditional modifier may not appear in a lastprivate clause.
28 • If any if clause on the directive includes a directive-name-modifier then all if clauses on the
29 directive must include a directive-name-modifier.
30 • At most one if clause without a directive-name-modifier can appear on the directive.
31 • At most one if clause with the taskloop directive-name-modifier can appear on the directive.
32 • At most one if clause with the simd directive-name-modifier can appear on the directive.
10 Syntax
C / C++
11 The syntax of the taskyield construct is as follows:
12 #pragma omp taskyield new-line
C / C++
Fortran
13 The syntax of the taskyield construct is as follows:
14 !$omp taskyield
Fortran
15 Binding
16 A taskyield region binds to the current task region. The binding thread set of the taskyield
17 region is the current team.
18 Description
19 The taskyield region includes an explicit task scheduling point in the current task region.
20 Cross References
21 • Task scheduling, see Section 2.12.6.
14 Note – Task scheduling points dynamically divide task regions into parts. Each part is executed
15 uninterrupted from start to end. Different parts of the same task region are executed in the order in
16 which they are encountered. In the absence of task synchronization constructs, the order in which a
17 thread executes parts of different schedulable tasks is unspecified.
18 A program must behave correctly and consistently with all conceivable scheduling sequences that
19 are compatible with the rules above.
20 For example, if threadprivate storage is accessed (explicitly in the source code or implicitly
21 in calls to library routines) in one part of a task region, its value cannot be assumed to be preserved
22 into the next part of the same task region if another schedulable task exists that modifies it.
23 As another example, if a lock acquire and release happen in different parts of a task region, no
24 attempt should be made to acquire the same lock in any part of another task that the executing
25 thread may schedule. Otherwise, a deadlock is possible. A similar situation can occur when a
26 critical region spans multiple parts of a task and another schedulable task contains a
27 critical region with the same name.
28 The use of threadprivate variables and the use of locks or critical sections in an explicit task with an
29 if clause must take into account that when the if clause evaluates to false, the task is executed
30 immediately, without regard to Task Scheduling Constraint 2.
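As a non-normative illustration of the note above, a task that must wait for a lock can use taskyield to expose a task scheduling point so that the executing thread may run other tasks while it waits; the routine name is hypothetical:

    #include <omp.h>

    void spin_until_locked(omp_lock_t *lock)
    {
      while (!omp_test_lock(lock)) {
        #pragma omp taskyield        // task scheduling point: other tasks may be executed here
      }
    }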
8 Cross References
9 • ompt_callback_task_schedule_t, see Section 4.5.2.10.
20 Restrictions
21 Restrictions to OpenMP memory spaces are as follows:
22 • Variables in the omp_const_mem_space memory space may not be written.
Fortran
17 If any operation of the base language causes a reallocation of a variable that is allocated with a
18 memory allocator then that memory allocator will be used to deallocate the current memory and to
19 allocate the new memory. For allocated allocatable components of such variables, the allocator that
20 will be used for the deallocation and allocation is unspecified.
Fortran
11 Syntax
C / C++
12 The syntax of the allocate directive is as follows:
13 #pragma omp allocate(list) [clause[ [,] clause] ... ] new-line
21 or
22 !$omp allocate[(list)] [clause[ [,] clause] ... ]
23 [!$omp allocate[(list)] [clause[ [,] clause] ... ]
24 [...]]
25 allocate-stmt
26 where allocate-stmt is a Fortran ALLOCATE statement and clause is one of the following:
27 allocator(allocator)
28 align(alignment)
3 Description
4 The storage for each list item that appears in the allocate directive is provided by an allocation
5 through a memory allocator. If no allocator clause is specified then the memory allocator
6 specified by the def-allocator-var ICV is used. If the allocator clause is specified, the memory
7 allocator specified in the clause is used. If the align clause is specified then the allocation of each
8 list item is byte aligned to at least the maximum of the alignment required by the base language for
9 the type of that list item, the alignment trait of the allocator and the alignment value of the
10 align clause. If the align clause is not specified then the allocation of each list item is byte
11 aligned to at least the maximum of the alignment required by the base language for the type of that
12 list item and the alignment trait of the allocator.
13 The scope of this allocation is that of the list item in the base language. At the end of the scope for a
14 given list item the memory allocator used to allocate that list item deallocates the storage.
Fortran
15 If the directive is associated with an allocate-stmt, the allocate-stmt allocates all list items that
16 appear in the directive list using the specified memory allocator. If no list items are specified then
17 all variables that are listed by the allocate-stmt and are not listed in an allocate directive
18 associated with the statement are allocated with the specified memory allocator.
Fortran
19 For allocations that arise from this directive the null_fb value of the fallback allocator trait
20 behaves as if the abort_fb value had been specified.
21 Restrictions
22 Restrictions to the allocate directive are as follows:
23 • At most one allocator clause may appear on the directive.
24 • At most one align clause may appear on the directive.
25 • A variable that is part of another variable (as an array or structure element) cannot appear in a
26 declarative allocate directive.
27 • A declarative allocate directive must appear in the same scope as the declarations of each of
28 its list items and must follow all such declarations.
29 • A declared variable may appear as a list item in at most one declarative allocate directive in a
30 given compilation unit.
31 • At most one allocator clause can appear on the allocate directive.
8 Cross References
9 • def-allocator-var ICV, see Section 2.4.1.
10 • Memory allocators, see Section 2.13.2.
11 • omp_allocator_handle_t and omp_allocator_handle_kind, see Section 3.13.1.
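As a non-normative illustration, the following C fragment uses a declarative allocate directive with a predefined allocator; the routine and the variable scratch are hypothetical:

    #include <omp.h>

    void compute(void)
    {
      double scratch[1024];
      #pragma omp allocate(scratch) allocator(omp_low_lat_mem_alloc) align(64)
      scratch[0] = 0.0;              // storage for scratch is provided by the low-latency allocator
    }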
16 Syntax
17 The syntax of the allocate clause is one of the following:
18 allocate([allocator:] list)
19 allocate(allocate-modifier [, allocate-modifier]: list)
23 where alignment is a constant positive integer expression with a value that is a power of two; and
C / C++
24 where allocator is an expression of the omp_allocator_handle_t type.
C / C++
Fortran
25 where allocator is an integer expression of the omp_allocator_handle_kind kind.
Fortran
16 Restrictions
17 Restrictions to the allocate clause are as follows:
18 • At most one allocator allocate-modifier may be specified on the clause.
19 • At most one align allocate-modifier may be specified on the clause.
20 • For any list item that is specified in the allocate clause on a directive, a data-sharing attribute
21 clause that may create a private copy of that list item must be specified on the same directive.
22 • For task, taskloop or target directives, allocation requests to memory allocators with the
23 trait access set to thread result in unspecified behavior.
24 • allocate clauses that appear on a target construct or on constructs in a target region
25 must specify an allocator expression unless a requires directive with the
26 dynamic_allocators clause is present in the same compilation unit.
27 Cross References
28 • def-allocator-var ICV, see Section 2.4.1.
29 • Memory allocators, see Section 2.13.2.
30 • List Item Privatization, see Section 2.21.3.
31 • omp_allocator_handle_t and omp_allocator_handle_kind, see Section 3.13.1.
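As a non-normative illustration, the following C fragment requests that the private copies of a list item be allocated by a predefined high-bandwidth allocator; tmp and work are hypothetical, and tmp also appears in a privatizing data-sharing clause on the same directive, as the restrictions above require:

    #include <omp.h>

    double tmp;
    #pragma omp parallel private(tmp) allocate(omp_high_bw_mem_alloc : tmp)
    {
      tmp = work(omp_get_thread_num());   // each thread's private tmp is allocated by the
    }                                      // high-bandwidth memory allocator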
14 Tool Callbacks
15 A thread dispatches a registered ompt_callback_device_initialize callback for each
16 occurrence of a device-initialize event in that thread. This callback has type signature
17 ompt_callback_device_initialize_t.
18 A thread dispatches a registered ompt_callback_device_load callback for each occurrence
19 of a device-load event in that thread. This callback has type signature
20 ompt_callback_device_load_t.
21 A thread dispatches a registered ompt_callback_device_unload callback for each
22 occurrence of a device-unload event in that thread. This callback has type signature
23 ompt_callback_device_unload_t.
24 A thread dispatches a registered ompt_callback_device_finalize callback for each
25 occurrence of a device-finalize event in that thread. This callback has type signature
26 ompt_callback_device_finalize_t.
27 Restrictions
28 Restrictions to OpenMP device initialization are as follows:
29 • No thread may offload execution of an OpenMP construct to a device until a dispatched
30 ompt_callback_device_initialize callback completes.
31 • No thread may offload execution of an OpenMP construct to a device after a dispatched
32 ompt_callback_device_finalize callback occurs.
10 Syntax
C / C++
11 The syntax of the target data construct is as follows:
12 #pragma omp target data clause[ [ [,] clause] ... ] new-line
13 structured-block
C / C++
Fortran
The syntax of the target data construct is as follows:
!$omp target data clause[ [ [,] clause] ... ]
loosely-structured-block
!$omp end target data
or
!$omp target data clause[ [ [,] clause] ... ]
strictly-structured-block
[!$omp end target data]
Fortran
7 Binding
8 The binding task set for a target data region is the generating task. The target data region
9 binds to the region of the generating task.
10 Description
11 When a target data construct is encountered, the encountering task executes the region. If no
12 device clause is present, the behavior is as if the device clause appeared with an expression
13 equal to the value of the default-device-var ICV. When an if clause is present and the if clause
14 expression evaluates to false, the target device is the host. Variables are mapped for the extent of the
15 region, according to any data-mapping attribute clauses, from the data environment of the
16 encountering task to the device data environment.
17 If a list item that appears in a use_device_addr clause has corresponding storage in the device
18 data environment, references to the list item in the associated structured block are converted into
19 references to the corresponding list item. If the list item is not a mapped list item, it is assumed to
20 be accessible on the target device. Inside the structured block, the list item has a device address and
21 its storage may not be accessible from the host device. The list items that appear in a
22 use_device_addr clause may include array sections.
C / C++
23 If a list item in a use_device_addr clause is an array section that has a base pointer, the effect
24 of the clause is to convert the base pointer to a pointer that is local to the structured block and that
25 contains the device address. This conversion may be elided if the list item was not already mapped.
26 If a list item that appears in a use_device_ptr clause is a pointer to an object that is mapped to
27 the device data environment, references to the list item in the associated structured block are
28 converted into references to a device pointer that is local to the structured block and that refers to
29 the device address of the corresponding object. If the list item does not point to a mapped object, it
30 must contain a valid device address for the target device, and the list item references are instead
31 converted to references to a local device pointer that refers to this device address.
C / C++
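The following C sketch, intended only as an illustration, maps an array for the extent of a target data region and uses use_device_ptr so that the enclosed call receives the device address; foreign_device_axpy is a hypothetical foreign-library entry point.

    #pragma omp target data map(tofrom: a[0:n]) use_device_ptr(a)
    {
        /* inside this block, a refers to the device address of the mapped array */
        foreign_device_axpy(n, 2.0, a);   /* hypothetical call into a foreign runtime */
    }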
19 Tool Callbacks
20 The tool callbacks dispatched when entering a target data region are the same as the tool
21 callbacks dispatched when encountering a target enter data construct, as described in
22 Section 2.14.3.
23 The tool callbacks dispatched when exiting a target data region are the same as the tool
24 callbacks dispatched when encountering a target exit data construct, as described in
25 Section 2.14.4.
26 Restrictions
27 Restrictions to the target data construct are as follows:
28 • A program must not depend on any ordering of the evaluations of the clauses of the
29 target data directive, except as explicitly stated for map clauses and for map clauses relative
30 to use_device_ptr and use_device_addr clauses, or on any side effects of the
31 evaluations of the clauses.
32 • At most one device clause can appear on the directive. The device clause expression must
33 evaluate to a non-negative integer value that is less than or equal to the value of
34 omp_get_num_devices().
25 Cross References
• default-device-var, see Section 2.4.1.
27 • if clause, see Section 2.18.
28 • map clause, see Section 2.21.7.1.
29 • omp_get_num_devices routine, see Section 3.7.4.
5 Syntax
C / C++
6 The syntax of the target enter data construct is as follows:
7 #pragma omp target enter data [clause[ [,] clause] ... ] new-line
22 Binding
23 The binding task set for a target enter data region is the generating task, which is the target
24 task generated by the target enter data construct. The target enter data region binds
25 to the corresponding target task region.
26 Tool Callbacks
27 Callbacks associated with events for target tasks are the same as for the task construct defined in
28 Section 2.12.1; (flags & ompt_task_target) always evaluates to true in the dispatched
29 callback.
30 A thread dispatches a registered ompt_callback_target or
31 ompt_callback_target_emi callback with ompt_scope_begin as its endpoint
32 argument and ompt_target_enter_data or ompt_target_enter_data_nowait if
33 the nowait clause is present as its kind argument for each occurrence of a target-enter-data-begin
34 event in that thread in the context of the target task on the host. Similarly, a thread dispatches a
35 registered ompt_callback_target or ompt_callback_target_emi callback with
ompt_scope_end as its endpoint argument and ompt_target_enter_data or ompt_target_enter_data_nowait if the nowait clause is present as its kind argument for each occurrence of a target-enter-data-end event in that thread in the context of the target task on the host. These callbacks have type signature ompt_callback_target_t or ompt_callback_target_emi_t, respectively.
5 Restrictions
6 Restrictions to the target enter data construct are as follows:
7 • A program must not depend on any ordering of the evaluations of the clauses of the
8 target enter data directive, or on any side effects of the evaluations of the clauses.
9 • At least one map clause must appear on the directive.
10 • At most one device clause can appear on the directive. The device clause expression must
11 evaluate to a non-negative integer value that is less than or equal to the value of
12 omp_get_num_devices().
13 • At most one if clause can appear on the directive.
14 • A map-type must be specified in all map clauses and must be either to or alloc.
15 • At most one nowait clause can appear on the directive.
16 Cross References
17 • default-device-var, see Section 2.4.1.
18 • task, see Section 2.12.1.
19 • task scheduling constraints, see Section 2.12.6.
20 • target data, see Section 2.14.2.
21 • target exit data, see Section 2.14.4.
22 • if clause, see Section 2.18.
23 • map clause, see Section 2.21.7.1.
24 • omp_get_num_devices routine, see Section 3.7.4.
25 • ompt_callback_target_t and ompt_callback_target_emi_t callback type, see
26 Section 4.5.2.26.
18 Binding
19 The binding task set for a target exit data region is the generating task, which is the target
20 task generated by the target exit data construct. The target exit data region binds to
21 the corresponding target task region.
22 Description
23 When a target exit data construct is encountered, the list items in the map clauses are
24 unmapped from the device data environment according to the map clause semantics.
25 The target exit data construct is a task generating construct. The generated task is a target
26 task. The generated task region encloses the target exit data region.
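A typical, non-normative pairing of the two constructs in C is sketched below: target enter data maps the array into the device data environment, an intervening target region reuses the mapped storage, and target exit data copies the result back and removes the mapping; the array name is hypothetical.

    #pragma omp target enter data map(to: a[0:n])

    #pragma omp target             /* a is already present on the device */
    for (int i = 0; i < n; i++)
        a[i] = 2.0 * a[i];

    #pragma omp target exit data map(from: a[0:n])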
21 Tool Callbacks
22 Callbacks associated with events for target tasks are the same as for the task construct defined in
23 Section 2.12.1; (flags & ompt_task_target) always evaluates to true in the dispatched
24 callback.
25 A thread dispatches a registered ompt_callback_target or
26 ompt_callback_target_emi callback with ompt_scope_begin as its endpoint
27 argument and ompt_target_exit_data or ompt_target_exit_data_nowait if the
28 nowait clause is present as its kind argument for each occurrence of a target-exit-data-begin
29 event in that thread in the context of the target task on the host. Similarly, a thread dispatches a
30 registered ompt_callback_target or ompt_callback_target_emi callback with
31 ompt_scope_end as its endpoint argument and ompt_target_exit_data or
32 ompt_target_exit_data_nowait if the nowait clause is present as its kind argument for
33 each occurrence of a target-exit-data-end event in that thread in the context of the target task on the
34 host. These callbacks have type signature ompt_callback_target_t or
35 ompt_callback_target_emi_t, respectively.
13 Cross References
14 • default-device-var, see Section 2.4.1.
15 • task, see Section 2.12.1.
16 • task scheduling constraints, see Section 2.12.6.
17 • target data, see Section 2.14.2.
18 • target enter data, see Section 2.14.3.
19 • if clause, see Section 2.18.
20 • map clause, see Section 2.21.7.1.
21 • omp_get_num_devices routine, see Section 3.7.4.
22 • ompt_callback_target_t and ompt_callback_target_emi_t callback type, see
23 Section 4.5.2.26.
5 Syntax
C / C++
6 The syntax of the target construct is as follows:
7 #pragma omp target [clause[ [,] clause] ... ] new-line
8 structured-block
5 or
6 !$omp target [clause[ [,] clause] ... ]
7 strictly-structured-block
8 [!$omp end target]
4 Description
5 The target construct provides a superset of the functionality provided by the target data
6 directive, except for the use_device_ptr and use_device_addr clauses.
7 The functionality added to the target directive is the inclusion of an executable region to be
8 executed on a device. That is, the target directive is an executable directive.
9 The target construct is a task generating construct. The generated task is a target task. The
10 generated task region encloses the target region. The device clause determines the device on
11 which the target region executes.
12 All clauses are evaluated when the target construct is encountered. The data environment of the
13 target task is created according to the data-sharing and data-mapping attribute clauses on the
14 target construct, per-data environment ICVs, and any default data-sharing attribute rules that
15 apply to the target construct. If a variable or part of a variable is mapped by the target
16 construct and does not appear as a list item in an in_reduction clause on the construct, the
17 variable has a default data-sharing attribute of shared in the data environment of the target task.
18 Assignment operations associated with mapping a variable (see Section 2.21.7.1) occur when the
19 target task executes.
20 As described in Section 2.4.4.1, the target construct limits the number of threads that may
21 participate in a contention group initiated by the initial thread by setting the value of the
22 thread-limit-var ICV for the initial task to an implementation defined value greater than zero. If the
23 thread_limit clause is specified, the number of threads will be less than or equal to the value
24 specified in the clause.
25 If a device clause in which the device_num device-modifier appears is present on the
26 construct, the device clause expression specifies the device number of the target device. If
27 device-modifier does not appear in the clause, the behavior of the clause is as if device-modifier is
28 device_num.
29 If a device clause in which the ancestor device-modifier appears is present on the target
30 construct and the device clause expression evaluates to 1, execution of the target region occurs
31 on the parent device of the enclosing target region. If the target construct is not encountered
32 in a target region, the current device is treated as the parent device. The encountering thread
33 waits for completion of the target region on the parent device before resuming. For any list item
34 that appears in a map clause on the same construct, if the corresponding list item exists in the device
35 data environment of the parent device, it is treated as if it has a reference count of positive infinity.
36 If no device clause is present, the behavior is as if the device clause appears without a
37 device-modifier and with an expression equal to the value of the default-device-var ICV.
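For illustration only, the following C sketch offloads a vector addition to the default device; the input arrays are mapped to and the result from the device, and, because no nowait clause appears, the encountering thread does not proceed until the target region completes. The array names are hypothetical.

    #pragma omp target map(to: x[0:n], y[0:n]) map(from: z[0:n])
    {
        for (int i = 0; i < n; i++)
            z[i] = x[i] + y[i];
    }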
21 Tool Callbacks
22 Callbacks associated with events for target tasks are the same as for the task construct defined in
23 Section 2.12.1; (flags & ompt_task_target) always evaluates to true in the dispatched
24 callback.
25 A thread dispatches a registered ompt_callback_target or
26 ompt_callback_target_emi callback with ompt_scope_begin as its endpoint
27 argument and ompt_target or ompt_target_nowait if the nowait clause is present as its
28 kind argument for each occurrence of a target-begin event in that thread in the context of the target
29 task on the host. Similarly, a thread dispatches a registered ompt_callback_target or
30 ompt_callback_target_emi callback with ompt_scope_end as its endpoint argument
and ompt_target or ompt_target_nowait if the nowait clause is present as its kind argument for each occurrence of a target-end event in that thread in the context of the target task on the host. These callbacks have type signature ompt_callback_target_t or ompt_callback_target_emi_t, respectively.
13 Restrictions
14 Restrictions to the target construct are as follows:
15 • If a target update, target data, target enter data, or target exit data
16 construct is encountered during execution of a target region, the behavior is unspecified.
17 • The result of an omp_set_default_device, omp_get_default_device, or
18 omp_get_num_devices routine called within a target region is unspecified.
19 • The effect of an access to a threadprivate variable in a target region is unspecified.
20 • If a list item in a map clause is a structure element, any other element of that structure that is
21 referenced in the target construct must also appear as a list item in a map clause.
22 • A list item cannot appear in both a map clause and a data-sharing attribute clause on the same
23 target construct.
24 • A variable referenced in a target region but not the target construct that is not declared in
25 the target region must appear in a declare target directive.
26 • At most one defaultmap clause for each category can appear on the directive.
27 • At most one nowait clause can appear on the directive.
28 • At most one if clause can appear on the directive.
29 • A map-type in a map clause must be to, from, tofrom or alloc.
30 • A list item that appears in an is_device_ptr clause must be a valid device pointer for the
31 device data environment.
32 • A list item that appears in a has_device_addr clause must have a valid device address for
33 the device data environment.
34 • A list item may not be specified in both an is_device_ptr clause and a
35 has_device_addr clause on the directive.
23 Syntax
C / C++
24 The syntax of the target update construct is as follows:
25 #pragma omp target update clause[ [ [,] clause] ... ] new-line
22 Binding
23 The binding task set for a target update region is the generating task, which is the target task
24 generated by the target update construct. The target update region binds to the
25 corresponding target task region.
9 Tool Callbacks
10 Callbacks associated with events for target tasks are the same as for the task construct defined in
11 Section 2.12.1; (flags & ompt_task_target) always evaluates to true in the dispatched
12 callback.
13 A thread dispatches a registered ompt_callback_target or
14 ompt_callback_target_emi callback with ompt_scope_begin as its endpoint
15 argument and ompt_target_update or ompt_target_update_nowait if the nowait
16 clause is present as its kind argument for each occurrence of a target-update-begin event in that
17 thread in the context of the target task on the host. Similarly, a thread dispatches a registered
18 ompt_callback_target or ompt_callback_target_emi callback with
19 ompt_scope_end as its endpoint argument and ompt_target_update or
20 ompt_target_update_nowait if the nowait clause is present as its kind argument for each
21 occurrence of a target-update-end event in that thread in the context of the target task on the host.
22 These callbacks have type signature ompt_callback_target_t or
23 ompt_callback_target_emi_t, respectively.
24 Restrictions
25 Restrictions to the target update construct are as follows:
26 • A program must not depend on any ordering of the evaluations of the clauses of the
27 target update directive, or on any side effects of the evaluations of the clauses.
28 • Each of the motion-modifier modifiers can appear at most once on a motion clause.
29 • At least one motion-clause must be specified.
30 • A list item can only appear in a to or from clause, but not in both.
31 • A list item in a to or from clause must have a mappable type.
32 • At most one device clause can appear on the directive. The device clause expression must
33 evaluate to a non-negative integer value that is less than or equal to the value of
34 omp_get_num_devices().
35 • At most one if clause can appear on the directive.
36 • At most one nowait clause can appear on the directive.
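A non-normative C sketch of the motion clauses follows: to copies host values to the corresponding device storage and from copies them back, while the enclosing target data region keeps the storage mapped; initialize_on_host is a hypothetical helper.

    #pragma omp target data map(alloc: a[0:n])
    {
        initialize_on_host(a, n);                 /* hypothetical helper */
        #pragma omp target update to(a[0:n])      /* host -> device */

        #pragma omp target
        for (int i = 0; i < n; i++)
            a[i] = 2.0 * a[i];

        #pragma omp target update from(a[0:n])    /* device -> host */
    }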
19 Syntax
C / C++
20 The syntax of the declare target directive is as follows:
21 #pragma omp declare target new-line
22 declaration-definition-seq
23 #pragma omp end declare target new-line
24 or
25 #pragma omp declare target (extended-list) new-line
26 or
27 #pragma omp declare target clause[ [ [,] clause] ... ] new-line
10 where invoked-by-fptr is a constant boolean expression that evaluates to true or false at compile
11 time.
C / C++
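The delimited C/C++ form is illustrated by the following non-normative sketch, which makes a global table and a helper function available in device code so that a subsequent target region can call the function; all names are hypothetical.

    #pragma omp declare target
    int lookup_table[256];
    int lookup(int i) { return lookup_table[i & 255]; }
    #pragma omp end declare target

    int query(void)
    {
        int result;
        #pragma omp target map(from: result)
        result = lookup(42);
        return result;
    }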
Fortran
12 The syntax of the declare target directive is as follows:
13 !$omp declare target (extended-list)
14 or
15 !$omp declare target [clause[ [,] clause] ... ]
17 Tool Callbacks
18 A thread dispatches a registered ompt_callback_target_data_op callback, or a registered
19 ompt_callback_target_data_op_emi callback with ompt_scope_beginend as its
20 endpoint argument for each occurrence of a target-global-data-op event in that thread. These
21 callbacks have type signature ompt_callback_target_data_op_t or
22 ompt_callback_target_data_op_emi_t, respectively.
23 Restrictions
24 Restrictions to the declare target directive are as follows:
25 • A threadprivate variable cannot appear in the directive.
26 • A variable declared in the directive must have a mappable type.
27 • The same list item must not appear multiple times in clauses on the same directive.
28 • The same list item must not explicitly appear in both a to clause on one declare target directive
29 and a link clause on another declare target directive.
30 • If the directive has a clause, it must contain at least one to clause or at least one link clause.
31 • A variable for which nohost is specified may not appear in a link clause.
32 • At most one indirect clause can be specified on the directive.
33 • At most one device_type clause can be specified on the directive.
10 Cross References
11 • target data construct, see Section 2.14.2.
12 • target construct, see Section 2.14.5.
13 • ompt_callback_target_data_op_t or
14 ompt_callback_target_data_op_emi_t callback type, see Section 4.5.2.25.
15 2.15 Interoperability
16 An OpenMP implementation may interoperate with one or more foreign runtime environments
through the use of the interop construct that is described in this section, the interop operation for a declared variant function, and the interoperability routines that are available through the OpenMP Runtime API.
C / C++
20 The implementation must provide foreign-runtime-id values that are enumerators of type
21 omp_interop_fr_t and that correspond to the supported foreign runtime environments.
C / C++
Fortran
22 The implementation must provide foreign-runtime-id values that are named integer constants with
23 kind omp_interop_fr_kind and that correspond to the supported foreign runtime
24 environments.
Fortran
25 Each foreign-runtime-id value provided by an implementation will be available as
26 omp_ifr_name, where name is the name of the foreign runtime environment. Available names
27 include those that are listed in the OpenMP Additional Definitions document;
28 implementation-defined names may also be supported. The value of omp_ifr_last is defined as
29 one greater than the value of the highest supported foreign-runtime-id value that is listed in the
30 aforementioned document.
9 Syntax
10 In the following syntax, interop-type is the type of interoperability information being requested or
11 used by the interop construct, and action-clause is a clause that indicates the action to take with
12 respect to that interop object.
C / C++
13 The syntax of the interop construct is as follows:
14 #pragma omp interop clause[ [ [,] clause] ... ] new-line
23 where interop-var is a variable of type omp_interop_t, and interop-type is one of the following:
24 target
25 targetsync
where preference-list is a comma-separated list for which each item is a foreign-runtime-id, which
23 is a base language string literal or a compile-time constant integral expression. Allowed values for
24 foreign-runtime-id include the names (as string literals) and integer values specified in the OpenMP
25 Additional Definitions document and the corresponding omp_ifr_name integer constants of kind
26 omp_interop_fr_kind; implementation-defined values may also be supported.
Fortran
27 Binding
28 The binding task set for an interop region is the generating task. The interop region binds to
29 the region of the generating task.
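The following C sketch, provided only as an illustration, initializes an interop object with the targetsync interop-type, retrieves the foreign synchronization handle through the interoperability routines of Section 3.12, and then destroys the object; enqueue_foreign_work is a hypothetical foreign-runtime call.

    #include <omp.h>

    extern void enqueue_foreign_work(void *);     /* hypothetical foreign call */

    void interop_example(void)
    {
        omp_interop_t obj = omp_interop_none;
        int rc;

        #pragma omp interop init(targetsync: obj) device(0)

        /* obtain the foreign runtime's synchronization object, e.g. a stream or queue */
        void *foreign_sync = omp_get_interop_ptr(obj, omp_ipr_targetsync, &rc);
        if (rc == omp_irc_success)
            enqueue_foreign_work(foreign_sync);

        #pragma omp interop destroy(obj)
    }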
4 Restrictions
5 Restrictions to the interop construct are as follows:
6 • At least one action-clause must appear on the directive.
7 • Each interop-type may be specified on an action-clause at most once.
8 • The interop-var passed to init or destroy must be non-const.
9 • A depend clause can only appear on the directive if a targetsync interop-type is present or
10 the interop-var was initialized with the targetsync interop-type.
11 • Each interop-var may be specified for at most one action-clause of each interop construct.
12 • At most one device clause can appear on the directive. The device clause expression must
13 evaluate to a non-negative integer value less than or equal to the value returned by
14 omp_get_num_devices().
15 • At most one nowait clause can appear on the directive.
16 Cross References
17 • depend clause, see Section 2.19.11.
18 • Interoperability routines, see Section 3.12.
13 Syntax
C / C++
14 The syntax of the parallel worksharing-loop construct is as follows:
15 #pragma omp parallel for [clause[ [,] clause] ... ] new-line
16 loop-nest
17 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
18 parallel or for directives, except the nowait clause, with identical meanings and restrictions.
C / C++
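A short, non-normative C example of the combined construct; the schedule and reduction clauses are accepted because they are valid on the for directive, and the array name is hypothetical.

    double sum = 0.0;
    #pragma omp parallel for schedule(static) reduction(+: sum)
    for (int i = 0; i < n; i++)
        sum += a[i];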
Fortran
20 The syntax of the parallel worksharing-loop construct is as follows:
21 !$omp parallel do [clause[ [,] clause] ... ]
22 loop-nest
23 [!$omp end parallel do]
24 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
25 parallel or do directives, with identical meanings and restrictions.
26 If an end parallel do directive is not specified, an end parallel do directive is assumed at
27 the end of the loop-nest.
Fortran
4 Restrictions
5 The restrictions for the parallel construct and the worksharing-loop construct apply.
6 Cross References
7 • parallel construct, see Section 2.6.
8 • Canonical loop nest form, see Section 2.11.1.
9 • Worksharing-loop construct, see Section 2.11.4.
10 • Data attribute clauses, see Section 2.21.4.
15 Syntax
C / C++
16 The syntax of the parallel loop construct is as follows:
17 #pragma omp parallel loop [clause[ [,] clause] ... ] new-line
18 loop-nest
19 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
20 parallel or loop directives, with identical meanings and restrictions.
C / C++
Fortran
21 The syntax of the parallel loop construct is as follows:
22 !$omp parallel loop [clause[ [,] clause] ... ]
23 loop-nest
24 [!$omp end parallel loop]
25 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
26 parallel or loop directives, with identical meanings and restrictions.
27 If an end parallel loop directive is not specified, an end parallel loop directive is
28 assumed at the end of the loop-nest.
Fortran
4 Restrictions
5 The restrictions for the parallel construct and the loop construct apply.
6 Cross References
7 • parallel construct, see Section 2.6.
8 • Canonical loop nest form, see Section 2.11.1.
9 • loop construct, see Section 2.11.7.
10 • Data attribute clauses, see Section 2.21.4.
15 Syntax
C / C++
16 The syntax of the parallel sections construct is as follows:
17 #pragma omp parallel sections [clause[ [,] clause] ... ] new-line
18 {
19 [#pragma omp section new-line]
20 structured-block-sequence
21 [#pragma omp section new-line
22 structured-block-sequence]
23 ...
24 }
25 where clause can be any of the clauses accepted by the parallel or sections directives,
26 except the nowait clause, with identical meanings and restrictions.
C / C++
9 where clause can be any of the clauses accepted by the parallel or sections directives, with
10 identical meanings and restrictions.
Fortran
11 Description
C / C++
12 The semantics are identical to explicitly specifying a parallel directive immediately followed
13 by a sections directive.
C / C++
Fortran
14 The semantics are identical to explicitly specifying a parallel directive immediately followed
15 by a sections directive, and an end sections directive immediately followed by an
16 end parallel directive.
Fortran
17 Restrictions
18 The restrictions for the parallel construct and the sections construct apply.
19 Cross References
20 • parallel construct, see Section 2.6.
21 • sections construct, see Section 2.10.1.
22 • Data attribute clauses, see Section 2.21.4.
Fortran
6 or
7 !$omp parallel workshare [clause[ [,] clause] ... ]
8 strictly-structured-block
9 [!$omp end parallel workshare]
10 where clause can be any of the clauses accepted by the parallel directive, with identical
11 meanings and restrictions.
12 Description
13 The semantics are identical to explicitly specifying a parallel directive immediately followed
14 by a workshare directive, and an end workshare directive immediately followed by an
15 end parallel directive.
16 Restrictions
17 The restrictions for the parallel construct and the workshare construct apply.
18 Cross References
19 • parallel construct, see Section 2.6.
20 • workshare construct, see Section 2.10.3.
21 • Data attribute clauses, see Section 2.21.4.
Fortran
26 Syntax
C / C++
27 The syntax of the parallel worksharing-loop SIMD construct is as follows:
28 #pragma omp parallel for simd [clause[ [,] clause] ... ] new-line
29 loop-nest
30 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
31 parallel or for simd directives, except the nowait clause, with identical meanings and
32 restrictions.
C / C++
5 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
6 parallel or do simd directives, with identical meanings and restrictions.
7 If an end parallel do simd directive is not specified, an end parallel do simd directive
8 is assumed at the end of the loop-nest.
Fortran
9 Description
10 The semantics of the parallel worksharing-loop SIMD construct are identical to explicitly
11 specifying a parallel directive immediately followed by a worksharing-loop SIMD directive.
12 Restrictions
13 The restrictions for the parallel construct and the worksharing-loop SIMD construct apply
14 except for the following explicit modifications:
15 • If any if clause on the directive includes a directive-name-modifier then all if clauses on the
16 directive must include a directive-name-modifier.
17 • At most one if clause without a directive-name-modifier can appear on the directive.
18 • At most one if clause with the parallel directive-name-modifier can appear on the directive.
19 • At most one if clause with the simd directive-name-modifier can appear on the directive.
20 Cross References
21 • parallel construct, see Section 2.6.
22 • Canonical loop nest form, see Section 2.11.1.
23 • Worksharing-loop SIMD construct, see Section 2.11.5.2.
24 • if clause, see Section 2.18.
25 • Data attribute clauses, see Section 2.21.4.
5 where clause can be any of the clauses accepted by the parallel or masked directives, with
6 identical meanings and restrictions.
C / C++
Fortran
7 The syntax of the parallel masked construct is as follows:
8 !$omp parallel masked [clause[ [,] clause] ... ]
9 loosely-structured-block
10 !$omp end parallel masked
11 or
12 !$omp parallel masked [clause[ [,] clause] ... ]
13 strictly-structured-block
14 [!$omp end parallel masked]
15 where clause can be any of the clauses accepted by the parallel or masked directives, with
16 identical meanings and restrictions.
Fortran
17 The parallel master construct, which has been deprecated, has identical syntax to the
18 parallel masked construct other than the use of parallel master as the directive name.
19 Description
20 The semantics are identical to explicitly specifying a parallel directive immediately followed
21 by a masked directive.
22 Restrictions
23 The restrictions for the parallel construct and the masked construct apply.
24 Cross References
25 • parallel construct, see Section 2.6.
26 • masked construct, see Section 2.8.
27 • Data attribute clauses, see Section 2.21.4.
5 Syntax
C / C++
6 The syntax of the masked taskloop construct is as follows:
7 #pragma omp masked taskloop [clause[ [,] clause] ... ] new-line
8 loop-nest
9 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
10 masked or taskloop directives with identical meanings and restrictions.
C / C++
Fortran
11 The syntax of the masked taskloop construct is as follows:
12 !$omp masked taskloop [clause[ [,] clause] ... ]
13 loop-nest
14 [!$omp end masked taskloop]
15 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
16 masked or taskloop directives with identical meanings and restrictions.
17 If an end masked taskloop directive is not specified, an end masked taskloop directive is
18 assumed at the end of the loop-nest.
Fortran
19 The master taskloop construct, which has been deprecated, has identical syntax to the
20 masked taskloop construct other than the use of master taskloop as the directive name.
21 Description
22 The semantics are identical to explicitly specifying a masked directive immediately followed by a
23 taskloop directive.
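A non-normative C sketch: one thread of the team, filtered by the masked semantics, generates the taskloop tasks, which the threads of the team may then execute; process is a hypothetical per-iteration function and grainsize is one of the taskloop clauses accepted here.

    #pragma omp parallel
    #pragma omp masked taskloop grainsize(64)
    for (int i = 0; i < n; i++)
        process(i);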
24 Restrictions
25 The restrictions for the masked and taskloop constructs apply.
26 Cross References
27 • masked construct, see Section 2.8.
28 • Canonical loop nest form, see Section 2.11.1.
29 • taskloop construct, see Section 2.12.2.
30 • Data attribute clauses, see Section 2.21.4.
5 Syntax
C / C++
6 The syntax of the masked taskloop simd construct is as follows:
7 #pragma omp masked taskloop simd [clause[ [,] clause] ... ] new-line
8 loop-nest
9 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
10 masked or taskloop simd directives with identical meanings and restrictions.
C / C++
Fortran
11 The syntax of the masked taskloop simd construct is as follows:
12 !$omp masked taskloop simd [clause[ [,] clause] ... ]
13 loop-nest
14 [!$omp end masked taskloop simd]
15 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
16 masked or taskloop simd directives with identical meanings and restrictions.
17 If an end masked taskloop simd directive is not specified, an end masked
18 taskloop simd directive is assumed at the end of the loop-nest.
Fortran
19 The master taskloop simd construct, which has been deprecated, has identical syntax to the
20 masked taskloop simd construct other than the use of master taskloop simd as the
21 directive name.
22 Description
23 The semantics are identical to explicitly specifying a masked directive immediately followed by a
24 taskloop simd directive.
25 Restrictions
26 The restrictions for the masked and taskloop simd constructs apply.
27 Cross References
28 • masked construct, see Section 2.8.
29 • Canonical loop nest form, see Section 2.11.1.
30 • taskloop simd construct, see Section 2.12.3.
31 • Data attribute clauses, see Section 2.21.4.
8 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
9 parallel or masked taskloop directives, except the in_reduction clause, with identical
10 meanings and restrictions.
C / C++
Fortran
11 The syntax of the parallel masked taskloop construct is as follows:
12 !$omp parallel masked taskloop [clause[ [,] clause] ... ]
13 loop-nest
14 [!$omp end parallel masked taskloop]
15 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
16 parallel or masked taskloop directives, except the in_reduction clause, with identical
17 meanings and restrictions.
18 If an end parallel masked taskloop directive is not specified, an
19 end parallel masked taskloop directive is assumed at the end of the loop-nest.
Fortran
20 The parallel master taskloop construct, which has been deprecated, has identical syntax
21 to the parallel masked taskloop construct other than the use of
22 parallel master taskloop as the directive name.
23 Description
24 The semantics are identical to explicitly specifying a parallel directive immediately followed
25 by a masked taskloop directive.
9 Cross References
10 • parallel construct, see Section 2.6.
11 • Canonical loop nest form, see Section 2.11.1.
12 • masked taskloop construct, see Section 2.16.7.
13 • if clause, see Section 2.18.
14 • Data attribute clauses, see Section 2.21.4.
15 • in_reduction clause, see Section 2.21.5.6.
20 Syntax
C / C++
21 The syntax of the parallel masked taskloop simd construct is as follows:
22 #pragma omp parallel masked taskloop simd [clause[ [,] clause] ... ] new-line
23 loop-nest
24 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
25 parallel or masked taskloop simd directives, except the in_reduction clause, with
26 identical meanings and restrictions.
C / C++
5 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
6 parallel or masked taskloop simd directives, except the in_reduction clause, with
7 identical meanings and restrictions.
8 If an end parallel masked taskloop simd directive is not specified, an end parallel
9 masked taskloop simd directive is assumed at the end of the loop-nest.
Fortran
10 The parallel master taskloop simd construct, which has been deprecated, has identical
11 syntax to the parallel masked taskloop simd construct other than the use of
12 parallel master taskloop simd as the directive name.
13 Description
14 The semantics are identical to explicitly specifying a parallel directive immediately followed
15 by a masked taskloop simd directive.
16 Restrictions
17 The restrictions for the parallel construct and the masked taskloop simd construct apply
18 except for the following explicit modifications:
19 • If any if clause on the directive includes a directive-name-modifier then all if clauses on the
20 directive must include a directive-name-modifier.
21 • At most one if clause without a directive-name-modifier can appear on the directive.
22 • At most one if clause with the parallel directive-name-modifier can appear on the directive.
23 • At most one if clause with the taskloop directive-name-modifier can appear on the directive.
24 • At most one if clause with the simd directive-name-modifier can appear on the directive.
25 Cross References
26 • parallel construct, see Section 2.6.
27 • Canonical loop nest form, see Section 2.11.1.
28 • masked taskloop simd construct, see Section 2.16.8.
29 • if clause, see Section 2.18.
30 • Data attribute clauses, see Section 2.21.4.
31 • in_reduction clause, see Section 2.21.5.6.
5 Syntax
C / C++
6 The syntax of the teams distribute construct is as follows:
7 #pragma omp teams distribute [clause[ [,] clause] ... ] new-line
8 loop-nest
9 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
10 teams or distribute directives with identical meanings and restrictions.
C / C++
Fortran
11 The syntax of the teams distribute construct is as follows:
12 !$omp teams distribute [clause[ [,] clause] ... ]
13 loop-nest
14 [!$omp end teams distribute]
15 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
16 teams or distribute directives with identical meanings and restrictions.
17 If an end teams distribute directive is not specified, an end teams distribute
18 directive is assumed at the end of the loop-nest.
Fortran
19 Description
20 The semantics are identical to explicitly specifying a teams directive immediately followed by a
21 distribute directive.
22 Restrictions
23 The restrictions for the teams and distribute constructs apply.
24 Cross References
25 • teams construct, see Section 2.7.
26 • Canonical loop nest form, see Section 2.11.1.
27 • distribute construct, see Section 2.11.6.1.
28 • Data attribute clauses, see Section 2.21.4.
5 Syntax
C / C++
6 The syntax of the teams distribute simd construct is as follows:
7 #pragma omp teams distribute simd [clause[ [,] clause] ... ] new-line
8 loop-nest
9 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
10 teams or distribute simd directives with identical meanings and restrictions.
C / C++
Fortran
11 The syntax of the teams distribute simd construct is as follows:
12 !$omp teams distribute simd [clause[ [,] clause] ... ]
13 loop-nest
14 [!$omp end teams distribute simd]
15 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
16 teams or distribute simd directives with identical meanings and restrictions.
17 If an end teams distribute simd directive is not specified, an end teams
18 distribute simd directive is assumed at the end of the loop-nest.
Fortran
19 Description
20 The semantics are identical to explicitly specifying a teams directive immediately followed by a
21 distribute simd directive.
22 Restrictions
23 The restrictions for the teams and distribute simd constructs apply.
24 Cross References
25 • teams construct, see Section 2.7.
26 • Canonical loop nest form, see Section 2.11.1.
27 • distribute simd construct, see Section 2.11.6.2.
28 • Data attribute clauses, see Section 2.21.4.
6 Syntax
C / C++
7 The syntax of the teams distribute parallel worksharing-loop construct is as follows:
8 #pragma omp teams distribute parallel for \
9 [clause[ [,] clause] ... ] new-line
10 loop-nest
11 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
12 teams or distribute parallel for directives with identical meanings and restrictions.
C / C++
Fortran
13 The syntax of the teams distribute parallel worksharing-loop construct is as follows:
14 !$omp teams distribute parallel do [clause[ [,] clause] ... ]
15 loop-nest
16 [!$omp end teams distribute parallel do]
17 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
18 teams or distribute parallel do directives with identical meanings and restrictions.
19 If an end teams distribute parallel do directive is not specified, an end teams
20 distribute parallel do directive is assumed at the end of the loop-nest.
Fortran
21 Description
22 The semantics are identical to explicitly specifying a teams directive immediately followed by a
23 distribute parallel worksharing-loop directive.
24 Restrictions
25 The restrictions for the teams and distribute parallel worksharing-loop constructs apply.
26 Cross References
27 • teams construct, see Section 2.7.
28 • Canonical loop nest form, see Section 2.11.1.
29 • Distribute parallel worksharing-loop construct, see Section 2.11.6.3.
30 • Data attribute clauses, see Section 2.21.4.
7 Syntax
C / C++
8 The syntax of the teams distribute parallel worksharing-loop SIMD construct is as follows:
9 #pragma omp teams distribute parallel for simd \
10 [clause[ [,] clause] ... ] new-line
11 loop-nest
12 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
13 teams or distribute parallel for simd directives with identical meanings and
14 restrictions.
C / C++
Fortran
15 The syntax of the teams distribute parallel worksharing-loop SIMD construct is as follows:
16 !$omp teams distribute parallel do simd [clause[ [,] clause] ... ]
17 loop-nest
18 [!$omp end teams distribute parallel do simd]
19 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
20 teams or distribute parallel do simd directives with identical meanings and restrictions.
21 If an end teams distribute parallel do simd directive is not specified, an end teams
22 distribute parallel do simd directive is assumed at the end of the loop-nest.
Fortran
23 Description
24 The semantics are identical to explicitly specifying a teams directive immediately followed by a
25 distribute parallel worksharing-loop SIMD directive.
26 Restrictions
27 The restrictions for the teams and distribute parallel worksharing-loop SIMD constructs apply.
10 Syntax
C / C++
11 The syntax of the teams loop construct is as follows:
12 #pragma omp teams loop [clause[ [,] clause] ... ] new-line
13 loop-nest
14 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
15 teams or loop directives with identical meanings and restrictions.
C / C++
Fortran
16 The syntax of the teams loop construct is as follows:
17 !$omp teams loop [clause[ [,] clause] ... ]
18 loop-nest
19 [!$omp end teams loop]
20 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
21 teams or loop directives with identical meanings and restrictions.
22 If an end teams loop directive is not specified, an end teams loop directive is assumed at the
23 end of the loop-nest.
Fortran
24 Description
25 The semantics are identical to explicitly specifying a teams directive immediately followed by a
26 loop directive.
27 Restrictions
28 The restrictions for the teams and loop constructs apply.
10 Syntax
C / C++
11 The syntax of the target parallel construct is as follows:
12 #pragma omp target parallel [clause[ [,] clause] ... ] new-line
13 structured-block
14 where clause can be any of the clauses accepted by the target or parallel directives, except
15 for copyin, with identical meanings and restrictions.
C / C++
Fortran
16 The syntax of the target parallel construct is as follows:
17 !$omp target parallel [clause[ [,] clause] ... ]
18 loosely-structured-block
19 !$omp end target parallel
20 or
21 !$omp target parallel [clause[ [,] clause] ... ]
22 strictly-structured-block
23 [!$omp end target parallel]
24 where clause can be any of the clauses accepted by the target or parallel directives, except
25 for copyin, with identical meanings and restrictions.
Fortran
26 Description
27 The semantics are identical to explicitly specifying a target directive immediately followed by a
28 parallel directive.
9 Cross References
10 • parallel construct, see Section 2.6.
11 • target construct, see Section 2.14.5.
12 • if clause, see Section 2.18.
13 • Data attribute clauses, see Section 2.21.4.
14 • copyin clause, see Section 2.21.6.1.
19 Syntax
C / C++
20 The syntax of the target parallel worksharing-loop construct is as follows:
21 #pragma omp target parallel for [clause[ [,] clause] ... ] new-line
22 loop-nest
23 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
24 target or parallel for directives, except for copyin, with identical meanings and
25 restrictions.
C / C++
5 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
6 target or parallel do directives, except for copyin, with identical meanings and
7 restrictions.
8 If an end target parallel do directive is not specified, an end target parallel do
9 directive is assumed at the end of the loop-nest.
Fortran
10 Description
11 The semantics are identical to explicitly specifying a target directive immediately followed by a
12 parallel worksharing-loop directive.
13 Restrictions
14 The restrictions for the target and parallel worksharing-loop constructs apply except for the
15 following explicit modifications:
16 • If any if clause on the directive includes a directive-name-modifier then all if clauses on the
17 directive must include a directive-name-modifier.
18 • At most one if clause without a directive-name-modifier can appear on the directive.
19 • At most one if clause with the parallel directive-name-modifier can appear on the directive.
20 • At most one if clause with the target directive-name-modifier can appear on the directive.
21 Cross References
22 • Canonical loop nest form, see Section 2.11.1.
23 • target construct, see Section 2.14.5.
24 • Parallel Worksharing-Loop construct, see Section 2.16.1.
25 • if clause, see Section 2.18.
26 • Data attribute clauses, see Section 2.21.4.
27 • copyin clause, see Section 2.21.6.1.
5 Syntax
C / C++
6 The syntax of the target parallel worksharing-loop SIMD construct is as follows:
7 #pragma omp target parallel for simd \
8 [clause[[,] clause] ... ] new-line
9 loop-nest
10 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
11 target or parallel for simd directives, except for copyin, with identical meanings and
12 restrictions.
C / C++
Fortran
13 The syntax of the target parallel worksharing-loop SIMD construct is as follows:
14 !$omp target parallel do simd [clause[ [,] clause] ... ]
15 loop-nest
16 [!$omp end target parallel do simd]
17 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
18 target or parallel do simd directives, except for copyin, with identical meanings and
19 restrictions.
20 If an end target parallel do simd directive is not specified, an end target parallel
21 do simd directive is assumed at the end of the loop-nest.
Fortran
22 Description
23 The semantics are identical to explicitly specifying a target directive immediately followed by a
24 parallel worksharing-loop SIMD directive.
10 Cross References
11 • Canonical loop nest form, see Section 2.11.1.
12 • target construct, see Section 2.14.5.
13 • Parallel worksharing-loop SIMD construct, see Section 2.16.5.
14 • if clause, see Section 2.18.
15 • Data attribute clauses, see Section 2.21.4.
16 • copyin clause, see Section 2.21.6.1.
21 Syntax
C / C++
22 The syntax of the target parallel loop construct is as follows:
23 #pragma omp target parallel loop [clause[ [,] clause] ... ] new-line
24 loop-nest
25 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
26 target or parallel loop directives, except for copyin, with identical meanings and
27 restrictions.
C / C++
5 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
6 target or parallel loop directives, except for copyin, with identical meanings and
7 restrictions.
8 If an end target parallel loop directive is not specified, an end target parallel
9 loop directive is assumed at the end of the loop-nest. nowait may not be specified on an
10 end target parallel loop directive.
Fortran
11 Description
12 The semantics are identical to explicitly specifying a target directive immediately followed by a
13 parallel loop directive.
14 Restrictions
15 The restrictions for the target and parallel loop constructs apply except for the following
16 explicit modifications:
17 • If any if clause on the directive includes a directive-name-modifier then all if clauses on the
18 directive must include a directive-name-modifier.
19 • At most one if clause without a directive-name-modifier can appear on the directive.
20 • At most one if clause with the parallel directive-name-modifier can appear on the directive.
21 • At most one if clause with the target directive-name-modifier can appear on the directive.
22 Cross References
23 • Canonical loop nest form, see Section 2.11.1.
24 • target construct, see Section 2.14.5.
25 • parallel loop construct, see Section 2.16.2.
26 • if clause, see Section 2.18.
27 • Data attribute clauses, see Section 2.21.4.
28 • copyin clause, see Section 2.21.6.1.
5 Syntax
C / C++
6 The syntax of the target simd construct is as follows:
7 #pragma omp target simd [clause[ [,] clause] ... ] new-line
8 loop-nest
9 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
10 target or simd directives with identical meanings and restrictions.
C / C++
Fortran
11 The syntax of the target simd construct is as follows:
12 !$omp target simd [clause[ [,] clause] ... ]
13 loop-nest
14 [!$omp end target simd]
15 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
16 target or simd directives with identical meanings and restrictions.
17 If an end target simd directive is not specified, an end target simd directive is assumed at
18 the end of the loop-nest.
Fortran
19 Description
20 The semantics are identical to explicitly specifying a target directive immediately followed by a
21 simd directive.
22 Restrictions
23 The restrictions for the target and simd constructs apply except for the following explicit
24 modifications:
25 • If any if clause on the directive includes a directive-name-modifier then all if clauses on the
26 directive must include a directive-name-modifier.
27 • At most one if clause without a directive-name-modifier can appear on the directive.
28 • At most one if clause with the target directive-name-modifier can appear on the directive.
29 • At most one if clause with the simd directive-name-modifier can appear on the directive.
11 Syntax
C / C++
12 The syntax of the target teams construct is as follows:
13 #pragma omp target teams [clause[ [,] clause] ... ] new-line
14 structured-block
15 where clause can be any of the clauses accepted by the target or teams directives with identical
16 meanings and restrictions.
C / C++
Fortran
17 The syntax of the target teams construct is as follows:
18 !$omp target teams [clause[ [,] clause] ... ]
19 loosely-structured-block
20 !$omp end target teams
21 or
22 !$omp target teams [clause[ [,] clause] ... ]
23 strictly-structured-block
24 [!$omp end target teams]
25 where clause can be any of the clauses accepted by the target or teams directives with identical
26 meanings and restrictions.
Fortran
27 Description
28 The semantics are identical to explicitly specifying a target directive immediately followed by a
29 teams directive.
3 Cross References
4 • teams construct, see Section 2.7.
5 • target construct, see Section 2.14.5.
6 • Data attribute clauses, see Section 2.21.4.
11 Syntax
C / C++
12 The syntax of the target teams distribute construct is as follows:
13 #pragma omp target teams distribute [clause[ [,] clause] ... ] new-line
14 loop-nest
15 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
16 target or teams distribute directives with identical meanings and restrictions.
C / C++
Fortran
17 The syntax of the target teams distribute construct is as follows:
18 !$omp target teams distribute [clause[ [,] clause] ... ]
19 loop-nest
20 [!$omp end target teams distribute]
21 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
22 target or teams distribute directives with identical meanings and restrictions.
23 If an end target teams distribute directive is not specified, an end target teams
24 distribute directive is assumed at the end of the loop-nest.
Fortran
25 Description
26 The semantics are identical to explicitly specifying a target directive immediately followed by a
27 teams distribute directive.
28 Restrictions
29 The restrictions for the target and teams distribute constructs apply.
10 Syntax
C / C++
11 The syntax of the target teams distribute simd construct is as follows:
12 #pragma omp target teams distribute simd \
13 [clause[ [,] clause] ... ] new-line
14 loop-nest
15 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
16 target or teams distribute simd directives with identical meanings and restrictions.
C / C++
Fortran
17 The syntax of the target teams distribute simd construct is as follows:
18 !$omp target teams distribute simd [clause[ [,] clause] ... ]
19 loop-nest
20 [!$omp end target teams distribute simd]
21 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
22 target or teams distribute simd directives with identical meanings and restrictions.
23 If an end target teams distribute simd directive is not specified, an end target
24 teams distribute simd directive is assumed at the end of the loop-nest.
Fortran
25 Description
26 The semantics are identical to explicitly specifying a target directive immediately followed by a
27 teams distribute simd directive.
9 Cross References
10 • Canonical loop nest form, see Section 2.11.1.
• target construct, see Section 2.14.5.
12 • teams distribute simd construct, see Section 2.16.12.
13 • if clause, see Section 2.18.
14 • Data attribute clauses, see Section 2.21.4.
19 Syntax
C / C++
20 The syntax of the target teams loop construct is as follows:
21 #pragma omp target teams loop [clause[ [,] clause] ... ] new-line
22 loop-nest
23 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
24 target or teams loop directives with identical meanings and restrictions.
C / C++
5 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
6 target or teams loop directives with identical meanings and restrictions.
7 If an end target teams loop directive is not specified, an end target teams loop
8 directive is assumed at the end of the loop-nest.
Fortran
9 Description
10 The semantics are identical to explicitly specifying a target directive immediately followed by a
11 teams loop directive.
12 Restrictions
13 The restrictions for the target and teams loop constructs apply.
14 Cross References
15 • Canonical loop nest form, see Section 2.11.1.
16 • target construct, see Section 2.14.5.
17 • Teams loop construct, see Section 2.16.15.
18 • Data attribute clauses, see Section 2.21.4.
6 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
7 target or teams distribute parallel for directives with identical meanings and
8 restrictions.
C / C++
Fortran
9 The syntax of the target teams distribute parallel worksharing-loop construct is as follows:
10 !$omp target teams distribute parallel do [clause[ [,] clause] ... ]
11 loop-nest
12 [!$omp end target teams distribute parallel do]
13 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
14 target or teams distribute parallel do directives with identical meanings and
15 restrictions.
16 If an end target teams distribute parallel do directive is not specified, an
17 end target teams distribute parallel do directive is assumed at the end of the
18 loop-nest.
Fortran
19 Description
20 The semantics are identical to explicitly specifying a target directive immediately followed by a
21 teams distribute parallel worksharing-loop directive.
22 Restrictions
23 The restrictions for the target and teams distribute parallel worksharing-loop constructs apply
24 except for the following explicit modifications:
25 • If any if clause on the directive includes a directive-name-modifier then all if clauses on the
26 directive must include a directive-name-modifier.
27 • At most one if clause without a directive-name-modifier can appear on the directive.
28 • At most one if clause with the parallel directive-name-modifier can appear on the directive.
29 • At most one if clause with the target directive-name-modifier can appear on the directive.
13 Syntax
C / C++
14 The syntax of the target teams distribute parallel worksharing-loop SIMD construct is as follows:
15 #pragma omp target teams distribute parallel for simd \
16 [clause[ [,] clause] ... ] new-line
17 loop-nest
18 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
19 target or teams distribute parallel for simd directives with identical meanings and
20 restrictions.
C / C++
Fortran
21 The syntax of the target teams distribute parallel worksharing-loop SIMD construct is as follows:
22 !$omp target teams distribute parallel do simd [clause[ [,] clause] ... ]
23 loop-nest
24 [!$omp end target teams distribute parallel do simd]
25 where loop-nest is a canonical loop nest and clause can be any of the clauses accepted by the
26 target or teams distribute parallel do simd directives with identical meanings and
27 restrictions.
28 If an end target teams distribute parallel do simd directive is not specified, an
29 end target teams distribute parallel do simd directive is assumed at the end of the
30 loop-nest.
Fortran
4 Restrictions
5 The restrictions for the target and teams distribute parallel worksharing-loop SIMD constructs
6 apply except for the following explicit modifications:
7 • If any if clause on the directive includes a directive-name-modifier then all if clauses on the
8 directive must include a directive-name-modifier.
9 • At most one if clause without a directive-name-modifier can appear on the directive.
10 • At most one if clause with the parallel directive-name-modifier can appear on the directive.
11 • At most one if clause with the target directive-name-modifier can appear on the directive.
12 • At most one if clause with the simd directive-name-modifier can appear on the directive.
13 Cross References
14 • Canonical loop nest form, see Section 2.11.1.
15 • target construct, see Section 2.14.5.
16 • Teams distribute parallel worksharing-loop SIMD construct, see Section 2.16.14.
17 • if clause, see Section 2.18.
18 • Data attribute clauses, see Section 2.21.4.
31 2.18 if Clause
32 Summary
33 The semantics of an if clause are described in the section on the construct to which it applies. The
34 if clause directive-name-modifier names the associated construct to which an expression applies,
35 and is particularly useful for composite and combined constructs.
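The following non-normative sketch (not part of the specification text) shows directive-name-modifiers on a combined construct; the routine axpy and its arguments are chosen for illustration.

    void axpy(int n, double a, double *x, double *y, int use_device)
    {
        /* Each if expression governs only the constituent construct named by
           its directive-name-modifier: offloading depends on use_device, and
           the parallel region depends on the trip count. */
        #pragma omp target teams distribute parallel for \
                if(target: use_device) if(parallel: n > 10000) \
                map(to: x[0:n]) map(tofrom: y[0:n])
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }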
5 or
6 !$omp critical [(name) [[,] hint(hint-expression)] ]
7 strictly-structured-block
8 [!$omp end critical [(name)]]
9 where hint-expression is a constant expression that evaluates to a scalar value with kind
10 omp_sync_hint_kind and a value that is a valid synchronization hint (as described
11 in Section 2.19.12).
Fortran
12 Binding
13 The binding thread set for a critical region is all threads in the contention group.
14 Description
15 The region that corresponds to a critical construct is executed as if only a single thread at a
16 time among all threads in the contention group enters the region for execution, without regard to the
17 teams to which the threads belong. An optional name may be used to identify the critical
18 construct. All critical constructs without a name are considered to have the same unspecified
19 name.
C / C++
20 Identifiers used to identify a critical construct have external linkage and are in a name space
21 that is separate from the name spaces used by labels, tags, members, and ordinary identifiers.
C / C++
Fortran
22 The names of critical constructs are global entities of the program. If a name conflicts with
23 any other entity, the behavior of the program is unspecified.
Fortran
24 The threads of a contention group execute the critical region as if only one thread of the
25 contention group executes the critical region at a time. The critical construct enforces
26 these execution semantics with respect to all critical constructs with the same name in all
27 threads in the contention group.
28 If present, the hint clause gives the implementation additional information about the expected
29 runtime properties of the critical region that can optionally be used to optimize the
30 implementation. The presence of a hint clause does not affect the isolation guarantees provided
31 by the critical construct. If no hint clause is specified, the effect is as if
32 hint(omp_sync_hint_none) had been specified.
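As a non-normative sketch (the shared variable counter and the name hit_counter are chosen for illustration), the following code serializes updates through a named critical construct with a synchronization hint:

    #include <omp.h>

    int counter = 0;

    void count_hits(int hits)
    {
        #pragma omp parallel
        {
            /* Only one thread in the contention group executes the region at
               a time; the hint does not weaken that isolation guarantee. */
            #pragma omp critical (hit_counter) hint(omp_sync_hint_contended)
            counter += hits;
        }
    }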
8 Tool Callbacks
9 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
10 occurrence of a critical-acquiring event in that thread. This callback has the type signature
11 ompt_callback_mutex_acquire_t.
12 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
13 occurrence of a critical-acquired event in that thread. This callback has the type signature
14 ompt_callback_mutex_t.
15 A thread dispatches a registered ompt_callback_mutex_released callback for each
16 occurrence of a critical-released event in that thread. This callback has the type signature
17 ompt_callback_mutex_t.
18 The callbacks occur in the task that encounters the critical construct. The callbacks should receive
19 ompt_mutex_critical as their kind argument if practical, but a less specific kind is
20 acceptable.
21 Restrictions
22 Restrictions to the critical construct are as follows:
23 • Unless the effect is as if hint(omp_sync_hint_none) was specified, the critical
24 construct must specify a name.
25 • If the hint clause is specified, each of the critical constructs with the same name must
26 have a hint clause for which the hint-expression evaluates to the same value.
C++
27 • A throw executed inside a critical region must cause execution to resume within the same
28 critical region, and the same thread that threw the exception must catch it.
C++
Fortran
29 • If a name is specified on a critical directive, the same name must also be specified on the
30 end critical directive.
31 • If no name appears on the critical directive, no name can appear on the end critical
32 directive.
Fortran
10 Syntax
C / C++
11 The syntax of the barrier construct is as follows:
12 #pragma omp barrier new-line
C / C++
Fortran
13 The syntax of the barrier construct is as follows:
14 !$omp barrier
Fortran
15 Binding
16 The binding thread set for a barrier region is the current team. A barrier region binds to the
17 innermost enclosing parallel region.
18 Description
19 All threads of the team that is executing the binding parallel region must execute the barrier
20 region and complete execution of all explicit tasks bound to this parallel region before any are
21 allowed to continue execution beyond the barrier.
22 The barrier region includes an implicit task scheduling point in the current task region.
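A non-normative sketch of these semantics; the routines do_phase_one and do_phase_two are assumed for illustration.

    #include <omp.h>

    void do_phase_one(int tid);   /* assumed user routines */
    void do_phase_two(int tid);

    void run_phases(void)
    {
        #pragma omp parallel
        {
            do_phase_one(omp_get_thread_num());
            /* Every thread of the team completes phase one (including any
               explicit tasks bound to the region) before any thread continues. */
            #pragma omp barrier
            do_phase_two(omp_get_thread_num());
        }
    }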
7 Tool Callbacks
8 A thread dispatches a registered ompt_callback_sync_region callback with
9 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_begin
10 as its endpoint argument for each occurrence of an explicit-barrier-begin event. Similarly, a thread
11 dispatches a registered ompt_callback_sync_region callback with
12 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_end as
13 its endpoint argument for each occurrence of an explicit-barrier-end event. These callbacks occur
14 in the context of the task that encountered the barrier construct and have type signature
15 ompt_callback_sync_region_t.
16 A thread dispatches a registered ompt_callback_sync_region_wait callback with
17 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_begin
18 as its endpoint argument for each occurrence of an explicit-barrier-wait-begin event. Similarly, a
19 thread dispatches a registered ompt_callback_sync_region_wait callback with
20 ompt_sync_region_barrier_explicit as its kind argument and ompt_scope_end as
21 its endpoint argument for each occurrence of an explicit-barrier-wait-end event. These callbacks
22 occur in the context of the task that encountered the barrier construct and have type signature
23 ompt_callback_sync_region_t.
24 A thread dispatches a registered ompt_callback_cancel callback with
25 ompt_cancel_detected as its flags argument for each occurrence of a cancellation event in
26 that thread. The callback occurs in the context of the encountering task. The callback has type
27 signature ompt_callback_cancel_t.
28 Restrictions
29 Restrictions to the barrier construct are as follows:
30 • Each barrier region must be encountered by all threads in a team or by none at all, unless
31 cancellation has been requested for the innermost enclosing parallel region.
32 • The sequence of worksharing regions and barrier regions encountered must be the same for
33 every thread in a team.
22 Tool Callbacks
23 A thread dispatches a registered ompt_callback_sync_region callback for each implicit
24 barrier begin and end event. Similarly, a thread dispatches a registered
25 ompt_callback_sync_region_wait callback for each implicit barrier wait-begin and
26 wait-end event. All callbacks for implicit barrier events execute in the context of the encountering
27 task and have type signature ompt_callback_sync_region_t.
28 For the implicit barrier at the end of a worksharing construct, the kind argument is
29 ompt_sync_region_barrier_implicit_workshare. For the implicit barrier at the end
30 of a parallel region, the kind argument is
31 ompt_sync_region_barrier_implicit_parallel. For an extra barrier added by an
32 OpenMP implementation, the kind argument is
33 ompt_sync_region_barrier_implementation. For a barrier at the end of a teams
34 region, the kind argument is ompt_sync_region_barrier_teams.
5 Restrictions
6 Restrictions to implicit barriers are as follows:
7 • If a thread is in the state ompt_state_wait_barrier_implicit_parallel, a call to
8 ompt_get_parallel_info may return a pointer to a copy of the data object associated
9 with the parallel region rather than a pointer to the associated data object itself. Writing to the
10 data object returned by ompt_get_parallel_info when a thread is in the
11 ompt_state_wait_barrier_implicit_parallel state results in unspecified behavior.
12 Cross References
13 • ompt_scope_begin and ompt_scope_end, see Section 4.4.4.11.
14 • ompt_sync_region_barrier_implementation,
15 ompt_sync_region_barrier_implicit_parallel,
16 ompt_sync_region_barrier_teams, and
17 ompt_sync_region_barrier_implicit_workshare, see Section 4.4.4.13.
18 • ompt_cancel_detected, see Section 4.4.4.25.
19 • ompt_callback_sync_region_t, see Section 4.5.2.13.
20 • ompt_callback_cancel_t, see Section 4.5.2.18.
12 Binding
13 The taskwait region binds to the current task region. The binding thread set of the taskwait
14 region is the current team.
15 Description
16 If no depend clause is present on the taskwait construct, the current task region is suspended
17 at an implicit task scheduling point associated with the construct. The current task region remains
18 suspended until all child tasks that it generated before the taskwait region complete execution.
19 If one or more depend clauses are present on the taskwait construct and the nowait clause is
20 not also present, the behavior is as if these clauses were applied to a task construct with an empty
21 associated structured block that generates a mergeable and included task. Thus, the current task
22 region is suspended until the predecessor tasks of this task complete execution.
23 If one or more depend clauses are present on the taskwait construct and the nowait clause is
24 also present, the behavior is as if these clauses were applied to a task construct with an empty
25 associated structured block that generates a task for which execution may be deferred. Thus, all
26 predecessor tasks of this task must complete execution before any subsequently generated task that
27 depends on this task starts its execution.
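The following non-normative sketch contrasts a dependence-based taskwait with the tasks it does not wait for; the routines produce_x, produce_y, and consume are assumed for illustration.

    double produce_x(void);      /* assumed user routines */
    double produce_y(void);
    void   consume(double v);

    void pipeline(void)
    {
        double x, y;
        #pragma omp parallel
        #pragma omp single
        {
            #pragma omp task depend(out: x) shared(x)
            x = produce_x();

            #pragma omp task depend(out: y) shared(y)
            y = produce_y();

            /* Waits only for the predecessor task that writes x; the task
               that produces y may still be executing afterwards. */
            #pragma omp taskwait depend(in: x)
            consume(x);
        }
    }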
16 Tool Callbacks
17 A thread dispatches a registered ompt_callback_sync_region callback with
18 ompt_sync_region_taskwait as its kind argument and ompt_scope_begin as its
19 endpoint argument for each occurrence of a taskwait-begin event in the task that encounters the
20 taskwait construct. Similarly, a thread dispatches a registered
21 ompt_callback_sync_region callback with ompt_sync_region_taskwait as its
22 kind argument and ompt_scope_end as its endpoint argument for each occurrence of a
23 taskwait-end event in the task that encounters the taskwait construct. These callbacks occur in
24 the task that encounters the taskwait construct and have the type signature
25 ompt_callback_sync_region_t.
26 A thread dispatches a registered ompt_callback_sync_region_wait callback with
27 ompt_sync_region_taskwait as its kind argument and ompt_scope_begin as its
28 endpoint argument for each occurrence of a taskwait-wait-begin event. Similarly, a thread
29 dispatches a registered ompt_callback_sync_region_wait callback with
30 ompt_sync_region_taskwait as its kind argument and ompt_scope_end as its endpoint
31 argument for each occurrence of a taskwait-wait-end event. These callbacks occur in the context of
32 the task that encounters the taskwait construct and have type signature
33 ompt_callback_sync_region_t.
34 A thread dispatches a registered ompt_callback_task_create callback for each occurrence
35 of a taskwait-init event in the context of the encountering task. This callback has the type signature
36 ompt_callback_task_create_t. In the dispatched callback, (flags &
37 ompt_task_taskwait) always evaluates to true. If the nowait clause is not present,
38 (flags & ompt_task_undeferred) also evaluates to true.
5 Restrictions
6 Restrictions to the taskwait construct are as follows:
7 • The mutexinoutset dependence-type may not appear in a depend clause on a taskwait
8 construct.
9 • If the dependence-type of a depend clause is depobj then the dependence objects cannot
10 represent dependences of the mutexinoutset dependence type.
11 • The nowait clause may only appear on a taskwait directive if the depend clause is present.
12 • At most one nowait clause can appear on a taskwait directive.
13 Cross References
14 • task construct, see Section 2.12.1.
15 • Task scheduling, see Section 2.12.6.
16 • depend clause, see Section 2.19.11.
17 • ompt_scope_begin and ompt_scope_end, see Section 4.4.4.11.
18 • ompt_sync_region_taskwait, see Section 4.4.4.13.
19 • ompt_callback_sync_region_t, see Section 4.5.2.13.
24 Syntax
C / C++
25 The syntax of the taskgroup construct is as follows:
26 #pragma omp taskgroup [clause[[,] clause] ...] new-line
27 structured-block
5 or
6 !$omp taskgroup [clause [ [,] clause] ...]
7 strictly-structured-block
8 [!$omp end taskgroup]
12 Binding
13 The binding task set of a taskgroup region is all tasks of the current team that are generated in
14 the region. A taskgroup region binds to the innermost enclosing parallel region.
15 Description
16 When a thread encounters a taskgroup construct, it starts executing the region. All child tasks
17 generated in the taskgroup region and all of their descendants that bind to the same parallel
18 region as the taskgroup region are part of the taskgroup set associated with the taskgroup
19 region.
20 An implicit task scheduling point occurs at the end of the taskgroup region. The current task is
21 suspended at the task scheduling point until all tasks in the taskgroup set complete execution.
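A non-normative sketch: the taskgroup set includes the child task and all of its descendants, so the encountering thread waits for the whole subtree; the type node and the routine process are assumed for illustration.

    struct node { struct node *left, *right; };
    void process(struct node *p);            /* assumed user routine */

    void traverse(struct node *p)
    {
        #pragma omp taskgroup
        {
            #pragma omp task
            {
                process(p);
                if (p->left) {
                    #pragma omp task         /* descendant: joins the taskgroup set */
                    traverse(p->left);
                }
                if (p->right) {
                    #pragma omp task
                    traverse(p->right);
                }
            }
        } /* task scheduling point: waits for all tasks in the taskgroup set */
    }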
19 Cross References
20 • Task scheduling, see Section 2.12.6.
21 • task_reduction clause, see Section 2.21.5.5.
22 • ompt_scope_begin and ompt_scope_end, see Section 4.4.4.11.
23 • ompt_sync_region_taskgroup, see Section 4.4.4.13.
24 • ompt_callback_sync_region_t, see Section 4.5.2.13.
30 Syntax
31 In the following syntax, atomic-clause is a clause that indicates the semantics for which atomicity is
32 enforced, and memory-order-clause is a clause that indicates the memory ordering behavior of the
33 construct. Specifically, atomic-clause is one of the following:
34 read
35 write
36 update
C / C++
13 The syntax of the atomic construct is:
14 #pragma omp atomic [clause[[,] clause] ... ] new-line
15 statement
6 – or cond-update-stmt, a conditional update statement that has one of the following forms:
7 if(expr ordop x) { x = expr; }
8 if(x ordop expr) { x = expr; }
9 if(x == e) { x = d; }
10 • If the capture clause is present, statement can have one of the following forms:
11 v = expr-stmt
12 { v = x; expr-stmt }
13 { expr-stmt v = x; }
16 or
17 !$omp atomic [clause[[[,] clause] ... ] [,]] capture [[,] clause [[[,] clause] ... ]]
18 statement
19 capture-statement
20 [!$omp end atomic]
21 or
22 !$omp atomic [clause[[[,] clause] ... ] [,]] capture [[,] clause [[[,] clause] ... ]]
23 capture-statement
24 statement
25 [!$omp end atomic]
16 – or, if the capture clause is also present and statement is not preceded or followed by
17 capture-statement, statement has this form:
18 if (x == e) then
19 x = d
20 else
21 v = x
22 end if
15 Binding
16 If the size of x is 8, 16, 32, or 64 bits and x is aligned to a multiple of its size, the binding thread set
17 for the atomic region is all threads on the device. Otherwise, the binding thread set for the
18 atomic region is all threads in the contention group. atomic regions enforce exclusive access
19 with respect to other atomic regions that access the same storage location x among all threads in
20 the binding thread set without regard to the teams to which the threads belong.
21 Description
22 If atomic-clause is not present on the construct, the behavior is as if the update clause is specified.
23 The atomic construct with the read clause results in an atomic read of the location designated
24 by x.
25 The atomic construct with the write clause results in an atomic write of the location designated
26 by x.
27 The atomic construct with the update clause results in an atomic update of the location
28 designated by x using the designated operator or intrinsic. Only the read and write of the location
29 designated by x are performed mutually atomically. The evaluation of expr or expr-list need not be
30 atomic with respect to the read or write of the location designated by x. No task scheduling points
31 are allowed between the read and the write of the location designated by x.
32 If the capture clause is present, the atomic update is an atomic captured update — an atomic
33 update to the location designated by x using the designated operator or intrinsic while also
34 capturing the original or final value of the location designated by x with respect to the atomic
35 update. The original or final value of the location designated by x is written in the location
designated by v.
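A non-normative C sketch of an atomic update and an atomic captured update of a shared counter n (the variable names are chosen for illustration):

    int n = 0;                               /* shared counter */

    void bump(void)
    {
        #pragma omp parallel
        {
            int old;

            #pragma omp atomic update
            n += 1;                          /* atomic read-modify-write of n */

            #pragma omp atomic capture
            { old = n; n += 1; }             /* old receives the value of n just
                                                before this thread's update */
            (void)old;                       /* captured value is available to this thread */
        }
    }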
22 Note – Allowing for spurious failure by specifying a weak clause can result in performance gains
23 on some systems when using compare-and-swap in a loop. For cases where a single
24 compare-and-swap would otherwise be sufficient, using a loop over a weak compare-and-swap is
25 unlikely to improve performance.
27 If memory-order-clause is present, or implicitly provided by a requires directive, it specifies the
28 effective memory ordering and otherwise the effective memory ordering is relaxed. If the fail
29 clause is present, its parameter overrides the effective memory ordering used if the comparison for
30 an atomic conditional update fails.
31 The atomic construct may be used to enforce memory consistency between threads, based on the
32 guarantees provided by Section 1.4.6. A strong flush on the location designated by x is performed
33 on entry to and exit from the atomic operation, ensuring that the set of all atomic operations applied
34 to the same location in a race-free program has a total completion order. If the write or update
35 clause is specified, the atomic operation is not an atomic conditional update for which the
36 comparison fails, and the effective memory ordering is release, acq_rel, or seq_cst, the
37 strong flush on entry to the atomic operation is also a release flush. If the read or update clause
38 is specified and the effective memory ordering is acquire, acq_rel, or seq_cst then the
strong flush on exit from the atomic operation is also an acquire flush.
31 Tool Callbacks
32 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
33 occurrence of an atomic-acquiring event in that thread. This callback has the type signature
34 ompt_callback_mutex_acquire_t.
35 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
36 occurrence of an atomic-acquired event in that thread. This callback has the type signature
37 ompt_callback_mutex_t.
6 Restrictions
7 Restrictions to the atomic construct are as follows:
8 • OpenMP constructs may not be encountered during execution of an atomic region.
9 • At most one atomic-clause may appear on the construct.
10 • At most one memory-order-clause may appear on the construct.
11 • At most one hint clause may appear on the construct.
12 • At most one capture clause may appear on the construct.
13 • At most one compare clause may appear on the construct.
14 • If a capture or compare clause appears on the construct then atomic-clause must be
15 update.
16 • At most one fail clause may appear on the construct.
17 • At most one weak clause may appear on the construct.
18 • If atomic-clause is read then memory-order-clause must not be release.
19 • If atomic-clause is write then memory-order-clause must not be acquire.
20 • The weak clause may only appear if the resulting atomic operation is an atomic conditional
21 update for which the comparison tests for equality.
C / C++
22 • All atomic accesses to the storage locations designated by x throughout the program are required
23 to have a compatible type.
24 • The fail clause may only appear if the resulting atomic operation is an atomic conditional
25 update.
C / C++
Fortran
26 • All atomic accesses to the storage locations designated by x throughout the program are required
27 to have the same type and type parameters.
28 • The fail clause may only appear if the resulting atomic operation is an atomic conditional
29 update or an atomic update where intrinsic-procedure-name is either MAX or MIN.
Fortran
19 Syntax
C / C++
20 The syntax of the flush construct is as follows:
21 #pragma omp flush [memory-order-clause] [(list)] new-line
8 Binding
9 The binding thread set for a flush region is all threads in the device-set of its flush operation.
10 Execution of a flush region affects the memory and it affects the temporary view of memory of
11 the encountering thread. It does not affect the temporary view of other threads. Other threads on
12 devices in the device-set must themselves execute a flush operation in order to be guaranteed to
13 observe the effects of the flush operation of the encountering thread.
14 Description
15 If neither memory-order-clause nor a list appears on the flush construct then the behavior is as if
16 memory-order-clause is seq_cst.
17 A flush construct with the seq_cst clause, executed on a given thread, operates as if all data
18 storage blocks that are accessible to the thread are flushed by a strong flush operation. A flush
19 construct with a list applies a strong flush operation to the items in the list, and the flush operation
20 does not complete until the operation is complete for all specified list items. An implementation
21 may implement a flush construct with a list by ignoring the list and treating it the same as a
22 flush construct with the seq_cst clause.
23 If no list items are specified, the flush operation has the release and/or acquire flush properties:
24 • If memory-order-clause is seq_cst or acq_rel, the flush operation is both a release flush
25 and an acquire flush.
26 • If memory-order-clause is release, the flush operation is a release flush.
27 • If memory-order-clause is acquire, the flush operation is an acquire flush.
18 Note – Use of a flush construct with a list is extremely error prone and users are strongly
19 discouraged from attempting it. The following examples illustrate the ordering properties of the
20 flush operation. In the following incorrect pseudocode example, the programmer intends to prevent
21 simultaneous execution of the protected section by the two threads, but the program does not work
22 properly because it does not enforce the proper ordering of the operations on variables a and b.
23 Any shared data accessed in the protected section is not guaranteed to be current or consistent
24 during or after the protected section. The atomic notation in the pseudocode in the following two
25 examples indicates that the accesses to a and b are atomic write and atomic read operations.
26 Otherwise both examples would contain data races and automatically result in unspecified behavior.
27 The flush operations are strong flushes that are applied to the specified flush lists.
      thread 1                        thread 2
      atomic(b = 1)                   atomic(a = 1)
      flush(b)                        flush(a)
      flush(a)                        flush(b)
      atomic(tmp = a)                 atomic(tmp = b)
      if (tmp == 0) then              if (tmp == 0) then
         protected section               protected section
      end if                          end if
2 The problem with this example is that operations on variables a and b are not ordered with respect
3 to each other. For instance, nothing prevents the compiler from moving the flush of b on thread 1 or
4 the flush of a on thread 2 to a position completely after the protected section (assuming that the
5 protected section on thread 1 does not reference b and the protected section on thread 2 does not
6 reference a). If either re-ordering happens, both threads can simultaneously execute the protected
7 section.
8 The following pseudocode example correctly ensures that the protected section is executed by only
9 one thread at a time. Execution of the protected section by neither thread is considered correct in
10 this example. This occurs if both flushes complete prior to either thread executing its if statement.
      Correct example:
      a = b = 0
      thread 1                        thread 2
      atomic(b = 1)                   atomic(a = 1)
      flush(a,b)                      flush(a,b)
      atomic(tmp = a)                 atomic(tmp = b)
      if (tmp == 0) then              if (tmp == 0) then
         protected section               protected section
      end if                          end if
12 The compiler is prohibited from moving the flush at all for either thread, ensuring that the
13 respective assignment is complete and the data is flushed before the if statement is executed.
3 Tool Callbacks
4 A thread dispatches a registered ompt_callback_flush callback for each occurrence of a
5 flush event in that thread. This callback has the type signature ompt_callback_flush_t.
6 Restrictions
7 Restrictions to the flush construct are as follows:
8 • If a memory-order-clause is specified, list items must not be specified on the flush directive.
9 Cross References
10 • ompt_callback_flush_t, see Section 4.5.2.17.
8 Syntax
C / C++
9 The syntax of the ordered construct is as follows:
10 #pragma omp ordered [clause[ [,] clause] ] new-line
11 structured-block
15 or
16 #pragma omp ordered clause [[[,] clause] ... ] new-line
24 or
25 !$omp ordered [clause[ [,] clause] ]
26 strictly-structured-block
27 [!$omp end ordered]
7 Binding
8 The binding thread set for an ordered region is the current team. An ordered region binds to
9 the innermost enclosing simd or worksharing-loop SIMD region if the simd clause is present, and
10 otherwise it binds to the innermost enclosing worksharing-loop region. ordered regions that bind
11 to different regions execute independently of each other.
12 Description
13 If no clause is specified, the ordered construct behaves as if the threads clause had been
14 specified. If the threads clause is specified, the threads in the team that is executing the
15 worksharing-loop region execute ordered regions sequentially in the order of the loop iterations.
16 If any depend clauses are specified then those clauses specify the order in which the threads in the
17 team execute ordered regions. If the simd clause is specified, the ordered regions
18 encountered by any thread will execute one at a time in the order of the loop iterations.
19 When the thread that is executing the first iteration of the loop encounters an ordered construct,
20 it can enter the ordered region without waiting. When a thread that is executing any subsequent
21 iteration encounters an ordered construct without a depend clause, it waits at the beginning of
22 the ordered region until execution of all ordered regions that belong to all previous iterations
23 has completed. When a thread that is executing any subsequent iteration encounters an ordered
24 construct with one or more depend(sink:vec) clauses, it waits until its dependences on all
25 valid iterations specified by the depend clauses are satisfied before it completes execution of the
26 ordered region. A specific dependence is satisfied when a thread that is executing the
27 corresponding iteration encounters an ordered construct with a depend(source) clause.
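A non-normative sketch of the depend form (a doacross loop); the arrays a, b, c and the bound m are assumed, with c[0] initialized by the caller.

    void prefix_combine(int m, const double *a, double *b, double *c)
    {
        #pragma omp parallel for ordered(1)
        for (int i = 1; i < m; i++) {
            b[i] = a[i] * a[i];                    /* independent work, runs in parallel */

            #pragma omp ordered depend(sink: i-1)  /* wait for iteration i-1 */
            c[i] = c[i-1] + b[i];                  /* loop-carried dependence honored */
            #pragma omp ordered depend(source)     /* signal completion of iteration i */
        }
    }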
5 Tool Callbacks
6 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
7 occurrence of an ordered-acquiring event in that thread. This callback has the type signature
8 ompt_callback_mutex_acquire_t.
9 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
10 occurrence of an ordered-acquired event in that thread. This callback has the type signature
11 ompt_callback_mutex_t.
12 A thread dispatches a registered ompt_callback_mutex_released callback with
13 ompt_mutex_ordered as the kind argument if practical, although a less specific kind may be
14 used, for each occurrence of an ordered-released event in that thread. This callback has the type
15 signature ompt_callback_mutex_t and occurs in the task that encounters the ordered
16 construct.
17 A thread dispatches a registered ompt_callback_dependences callback with all vector
18 entries listed as ompt_dependence_type_sink in the deps argument for each occurrence of a
19 doacross-sink event in that thread. A thread dispatches a registered
20 ompt_callback_dependences callback with all vector entries listed as
21 ompt_dependence_type_source in the deps argument for each occurrence of a
22 doacross-source event in that thread. These callbacks have the type signature
23 ompt_callback_dependences_t.
24 Restrictions
25 Restrictions to the ordered construct are as follows:
26 • At most one threads clause can appear on an ordered construct.
27 • At most one simd clause can appear on an ordered construct.
28 • At most one depend(source) clause can appear on an ordered construct.
29 • The construct that corresponds to the binding region of an ordered region must not specify a
30 reduction clause with the inscan modifier.
31 • Either depend(sink:vec) clauses or depend(source) clauses may appear on an
32 ordered construct, but not both.
33 • The worksharing-loop or worksharing-loop SIMD region to which an ordered region
34 corresponding to an ordered construct without a depend clause binds must have an
35 ordered clause without the parameter specified on the corresponding worksharing-loop or
36 worksharing-loop SIMD directive.
5 Syntax
C / C++
6 The syntax of the depobj construct is as follows:
7 #pragma omp depobj(depobj) clause new-line
22 Description
23 A depobj construct with a depend clause present sets the state of depobj to initialized. The
24 depobj is initialized to represent the dependence that the depend clause specifies.
25 A depobj construct with a destroy clause present changes the state of the depobj to
26 uninitialized.
27 A depobj construct with an update clause present changes the dependence type of the
28 dependence represented by depobj to the one specified by the update clause.
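A non-normative C sketch of the depobj life cycle (the variable names are chosen for illustration):

    #include <omp.h>

    double x;                                  /* variable the dependence refers to */

    void update_x(void)
    {
        omp_depend_t d;

        #pragma omp depobj(d) depend(inout: x) /* d is now in the initialized state */

        #pragma omp task depend(depobj: d)     /* behaves like depend(inout: x) */
        x += 1.0;

        #pragma omp taskwait
        #pragma omp depobj(d) destroy          /* d returns to the uninitialized state */
    }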
13 Cross References
14 • depend clause, see Section 2.19.11.
19 Syntax
20 The syntax of the depend clause is as follows:
21 depend([depend-modifier,]dependence-type : locator-list)
5 or
6 depend(dependence-type : vec)
9 and where vec is the iteration vector, which has the form:
10 x1 [± d1], x2 [± d2], ..., xn [± dn]
11 where n is the value specified by the ordered clause in the worksharing-loop directive, xi denotes
12 the loop iteration variable of the i-th nested loop associated with the worksharing-loop directive,
13 and di is a constant non-negative integer.
14 Description
15 Task dependences are derived from the dependence-type of a depend clause and its list items
16 when dependence-type is in, out, inout, mutexinoutset or inoutset. When the
17 dependence-type is depobj, the task dependences are derived from the dependences represented
18 by the depend objects specified in the depend clause as if the depend clauses of the depobj
19 constructs were specified in the current construct.
20 The storage location of a list item matches the storage location of another list item if they have the
21 same storage location, or if any of the list items is omp_all_memory.
22 For the in dependence-type, if the storage location of at least one of the list items matches the
23 storage location of a list item appearing in a depend clause with an out, inout,
24 mutexinoutset, or inoutset dependence-type on a construct from which a sibling task was
25 previously generated, then the generated task will be a dependent task of that sibling task.
26 For the out and inout dependence-types, if the storage location of at least one of the list items
27 matches the storage location of a list item appearing in a depend clause with an in, out, inout,
28 mutexinoutset, or inoutset dependence-type on a construct from which a sibling task was
29 previously generated, then the generated task will be a dependent task of that sibling task.
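As a non-normative illustration of the in and out rules above (the routines fill and drain and the buffer buf are assumed):

    void fill(double *buf);                    /* assumed user routines */
    void drain(const double *buf);

    void produce_consume(double *buf)
    {
        #pragma omp parallel
        #pragma omp single
        {
            #pragma omp task depend(out: buf[0])   /* producer */
            fill(buf);

            #pragma omp task depend(in: buf[0])    /* consumer: dependent task of the producer */
            drain(buf);
        }
    }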
30 For the mutexinoutset dependence-type, if the storage location of at least one of the list items
31 matches the storage location of a list item appearing in a depend clause with an in, out, inout,
32 or inoutset dependence-type on a construct from which a sibling task was previously generated,
33 then the generated task will be a dependent task of that sibling task.
34 If a list item appearing in a depend clause with a mutexinoutset dependence-type on a task
35 generating construct matches a list item appearing in a depend clause with a mutexinoutset
28 Note – An iteration vector vec that does not indicate a lexicographically earlier iteration may cause
29 a deadlock.
7 Tool Callbacks
8 A thread dispatches the ompt_callback_dependences callback for each occurrence of the
9 task-dependences event to announce its dependences with respect to the list items in the depend
10 clause. This callback has type signature ompt_callback_dependences_t.
11 A thread dispatches the ompt_callback_task_dependence callback for a task-dependence
12 event to report a dependence between a predecessor task (src_task_data) and a dependent task
13 (sink_task_data). This callback has type signature ompt_callback_task_dependence_t.
14 Restrictions
15 Restrictions to the depend clause are as follows:
16 • List items, other than reserved locators, used in depend clauses of the same task or sibling tasks
17 must indicate identical storage locations or disjoint storage locations.
18 • List items used in depend clauses cannot be zero-length array sections.
19 • The omp_all_memory reserved locator can only be used in a depend clause with an out or
20 inout dependence-type.
21 • Array sections cannot be specified in depend clauses with the depobj dependence type.
22 • List items used in depend clauses with the depobj dependence type must be depend objects
23 in the initialized state.
C / C++
24 • List items used in depend clauses with the depobj dependence type must be expressions of
25 the omp_depend_t type.
26 • List items that are expressions of the omp_depend_t type can only be used in depend
27 clauses with the depobj dependence type.
C / C++
Fortran
28 • A common block name cannot appear in a depend clause.
29 • List items used in depend clauses with the depobj dependence type must be integer
30 expressions of the omp_depend_kind kind.
Fortran
3 Restrictions
4 Restrictions to the synchronization hints are as follows:
5 • The hints omp_sync_hint_uncontended and omp_sync_hint_contended cannot
6 be combined.
7 • The hints omp_sync_hint_nonspeculative and omp_sync_hint_speculative
8 cannot be combined.
9 The restrictions for combining multiple values of omp_sync_hint apply equally to the
10 corresponding values of omp_lock_hint and to expressions that mix the two types.
11 Cross References
12 • critical construct, see Section 2.19.1.
13 • atomic construct, see Section 2.19.7.
14 • omp_init_lock_with_hint and omp_init_nest_lock_with_hint, see
15 Section 3.9.2.
21 Syntax
C / C++
22 The syntax of the cancel construct is as follows:
23 #pragma omp cancel construct-type-clause [ [,] if-clause] new-line
10 and if-clause is
11 if([ cancel :] scalar-logical-expression)
Fortran
12 Binding
13 The binding thread set of the cancel region is the current team. The binding region of the
14 cancel region is the innermost enclosing region of the type corresponding to the
15 construct-type-clause specified in the directive (that is, the innermost parallel, sections,
16 worksharing-loop, or taskgroup region).
17 Description
18 The cancel construct activates cancellation of the binding region only if the cancel-var ICV is
19 true, in which case the cancel construct causes the encountering task to continue execution at the
20 end of the binding region if construct-type-clause is parallel, for, do, or sections. If the
21 cancel-var ICV is true and construct-type-clause is taskgroup, the encountering task continues
22 execution at the end of the current task region. If the cancel-var ICV is false, the cancel
23 construct is ignored.
24 Threads check for active cancellation only at cancellation points that are implied at the following
25 locations:
26 • cancel regions;
27 • cancellation point regions;
28 • barrier regions;
29 • at the end of a worksharing-loop construct with a nowait clause and for which the same list
30 item appears in both firstprivate and lastprivate clauses; and
31 • implicit barrier regions.
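A non-normative sketch of worksharing-loop cancellation (effective only when the cancel-var ICV is true, for example via OMP_CANCELLATION=true); the routine check_item is assumed for illustration.

    int check_item(int i);                     /* assumed user routine; nonzero on error */

    void scan_items(int m)
    {
        #pragma omp parallel
        #pragma omp for
        for (int i = 0; i < m; i++) {
            if (check_item(i) != 0) {
                #pragma omp cancel for         /* activates cancellation of the loop region */
            }
            #pragma omp cancellation point for /* other threads observe cancellation here */
        }
    }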
15 Note – If one thread activates cancellation and another thread encounters a cancellation point, the
16 order of execution between the two threads is non-deterministic. Whether the thread that
17 encounters a cancellation point detects the activated cancellation depends on the underlying
18 hardware and operating system.
20 When cancellation of tasks is activated through a cancel construct with the taskgroup
21 construct-type-clause, the tasks that belong to the taskgroup set of the innermost enclosing
22 taskgroup region will be canceled. The task that encountered that construct continues execution
23 at the end of its task region, which implies completion of that task. Any task that belongs to the
24 innermost enclosing taskgroup and has already begun execution must run to completion or until
25 a cancellation point is reached. Upon reaching a cancellation point and if cancellation is active, the
26 task continues execution at the end of its task region, which implies the task’s completion. Any task
27 that belongs to the innermost enclosing taskgroup and that has not begun execution may be
28 discarded, which implies its completion.
29 When cancellation is active for a parallel, sections, or worksharing-loop region, each
30 thread of the binding thread set resumes execution at the end of the canceled region if a cancellation
31 point is encountered. If the canceled region is a parallel region, any tasks that have been
32 created by a task or a taskloop construct and their descendant tasks are canceled according to
33 the above taskgroup cancellation semantics. If the canceled region is a sections or
34 worksharing-loop region, no task cancellation occurs.
C++
35 The usual C++ rules for object destruction are followed when cancellation is performed.
C++
9 Note – The programmer is responsible for releasing locks and other synchronization data
10 structures that might cause a deadlock when a cancel construct is encountered and blocked
11 threads cannot be canceled. The programmer is also responsible for ensuring proper
12 synchronizations to avoid deadlocks that might arise from cancellation of OpenMP regions that
13 contain OpenMP synchronization constructs.
18 Tool Callbacks
19 A thread dispatches a registered ompt_callback_cancel callback for each occurrence of a
20 cancel event in the context of the encountering task. This callback has type signature
21 ompt_callback_cancel_t; (flags & ompt_cancel_activated) always evaluates to
22 true in the dispatched callback; (flags & ompt_cancel_parallel) evaluates to true in the
23 dispatched callback if construct-type-clause is parallel;
24 (flags & ompt_cancel_sections) evaluates to true in the dispatched callback if
25 construct-type-clause is sections; (flags & ompt_cancel_loop) evaluates to true in the
26 dispatched callback if construct-type-clause is for or do; and
27 (flags & ompt_cancel_taskgroup) evaluates to true in the dispatched callback if
28 construct-type-clause is taskgroup.
29 A thread dispatches a registered ompt_callback_cancel callback with the ompt_data_t
30 associated with the discarded task as its task_data argument and
31 ompt_cancel_discarded_task as its flags argument for each occurrence of a
32 discarded-task event. The callback occurs in the context of the task that discards the task and has
33 type signature ompt_callback_cancel_t.
29 Cross References
30 • cancel-var ICV, see Section 2.4.1.
31 • if clause, see Section 2.18.
32 • cancellation point construct, see Section 2.20.2.
33 • omp_get_cancellation routine, see Section 3.2.8.
34 • omp_cancel_flag_t enumeration type, see Section 4.4.4.25.
35 • ompt_callback_cancel_t, see Section 4.5.2.18.
6 Syntax
C / C++
7 The syntax of the cancellation point construct is as follows:
8 #pragma omp cancellation point construct-type-clause new-line
21 Binding
22 The binding thread set of the cancellation point construct is the current team. The binding
23 region of the cancellation point region is the innermost enclosing region of the type
24 corresponding to the construct-type-clause specified in the directive (that is, the innermost
25 parallel, sections, worksharing-loop, or taskgroup region).
17 Tool Callbacks
18 A thread dispatches a registered ompt_callback_cancel callback for each occurrence of a
19 cancel event in the context of the encountering task. This callback has type signature
20 ompt_callback_cancel_t; (flags & ompt_cancel_detected) always evaluates to true
21 in the dispatched callback; (flags & ompt_cancel_parallel) evaluates to true in the
22 dispatched callback if construct-type-clause of the encountered cancellation point
23 construct is parallel; (flags & ompt_cancel_sections) evaluates to true in the
24 dispatched callback if construct-type-clause of the encountered cancellation point
25 construct is sections; (flags & ompt_cancel_loop) evaluates to true in the dispatched
26 callback if construct-type-clause of the encountered cancellation point construct is for or
27 do; and (flags & ompt_cancel_taskgroup) evaluates to true in the dispatched callback if
28 construct-type-clause of the encountered cancellation point construct is taskgroup.
29 Restrictions
30 Restrictions to the cancellation point construct are as follows:
31 • A cancellation point construct for which construct-type-clause is taskgroup must be
32 closely nested inside a task or taskloop construct, and the cancellation point region
33 must be closely nested inside a taskgroup region.
34 • A cancellation point construct for which construct-type-clause is sections must be
35 closely nested inside a sections or section construct.
36 • A cancellation point construct for which construct-type-clause is neither sections nor
37 taskgroup must be closely nested inside an OpenMP construct that matches the type specified
38 in construct-type-clause.
14 Syntax
C / C++
15 The syntax of the threadprivate directive is as follows:
16 #pragma omp threadprivate(list) new-line
C / C++
Fortran
The syntax of the threadprivate directive is as follows:
!$omp threadprivate(list)
21 where list is a comma-separated list of named variables and named common blocks. Common
22 block names must appear between slashes.
Fortran
23 Description
24 Unless otherwise specified, each copy of a threadprivate variable is initialized once, in the manner
25 specified by the program, but at an unspecified point in the program prior to the first reference to
26 that copy. The storage of all copies of a threadprivate variable is freed according to how static
27 variables are handled in the base language, but at an unspecified point in the program.
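A non-normative sketch (the variable name event_count is chosen for illustration): each thread references its own copy, initialized once before its first reference.

    static int event_count = 0;
    #pragma omp threadprivate(event_count)

    void record_event(void)
    {
        #pragma omp parallel
        event_count++;                         /* updates the encountering thread's copy */
    }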
25 Cross References
26 • dyn-var ICV, see Section 2.4.
27 • Number of threads used to execute a parallel region, see Section 2.6.1.
28 • order clause, see Section 2.11.3.
29 • copyin clause, see Section 2.21.6.1.
3 Restrictions
4 The following restrictions apply to any list item that is privatized unless otherwise stated for a given
5 data-sharing attribute clause:
C
6 • A variable that is part of another variable (as an array or structure element) cannot be privatized.
C
C++
7 • A variable that is part of another variable (as an array or structure element) cannot be privatized
8 except if the data-sharing attribute clause is associated with a construct within a class non-static
9 member function and the variable is an accessible data member of the object for which the
10 non-static member function is invoked.
11 • A variable of class type (or array thereof) that is privatized requires an accessible, unambiguous
12 default constructor for the class type.
C++
C / C++
13 • A variable that is privatized must not have a const-qualified type unless it is of class type with
14 a mutable member. This restriction does not apply to the firstprivate clause.
15 • A variable that is privatized must not have an incomplete type or be a reference to an incomplete
16 type.
C / C++
Fortran
17 • A variable that is part of another variable (as an array or structure element) cannot be privatized.
18 • Variables that appear in namelist statements, in variable format expressions, and in expressions
19 for statement function definitions, may not be privatized.
20 • Pointers with the INTENT(IN) attribute may not be privatized. This restriction does not apply
21 to the firstprivate clause.
22 • A private variable must not be coindexed or appear as an actual argument to a procedure where
23 the corresponding dummy argument is a coarray.
24 • Assumed-size arrays may not be privatized in a target, teams, or distribute construct.
Fortran
21 Syntax
22 The syntax of the default clause is as follows:
23 default(data-sharing-attribute)
14 Restrictions
15 Restrictions to the default clause are as follows:
16 • Only a single default clause may be specified on a parallel, task, taskloop or
17 teams directive.
22 Syntax
23 The syntax of the shared clause is as follows:
24 shared(list)
25 Description
26 All references to a list item within a task refer to the storage area of the original variable at the point
27 the directive was encountered.
28 The programmer must ensure, by adding proper synchronization, that storage shared by an explicit
29 task region does not reach the end of its lifetime before the explicit task region completes its
30 execution.
17 Restrictions
18 Restrictions to the shared clause are as follows:
C
19 • A variable that is part of another variable (as an array or structure element) cannot appear in a
20 shared clause.
C
C++
21 • A variable that is part of another variable (as an array or structure element) cannot appear in a
22 shared clause except if the shared clause is associated with a construct within a class
23 non-static member function and the variable is an accessible data member of the object for which
24 the non-static member function is invoked.
C++
Fortran
25 • A variable that is part of another variable (as an array, structure element or type parameter
26 inquiry) cannot appear in a shared clause.
Fortran
4 Syntax
5 The syntax of the private clause is as follows:
6 private(list)
7 Description
8 The private clause specifies that its list items are to be privatized according to Section 2.21.3.
9 Each task or SIMD lane that references a list item in the construct receives only one new list item,
10 unless the construct has one or more associated loops and an order clause that specifies
11 concurrent is also present.
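A non-normative sketch (the arrays a and b and the bound m are assumed): each thread references its own copy of tmp, and the original tmp is not accessed inside the construct.

    void square_all(int m, const double *a, double *b)
    {
        double tmp = -1.0;                     /* original list item, untouched inside */

        #pragma omp parallel for private(tmp)
        for (int i = 0; i < m; i++) {
            tmp = a[i] * a[i];                 /* write the private copy before reading it */
            b[i] = tmp;
        }
    }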
12 Restrictions
13 Restrictions to the private clause are as specified in Section 2.21.3.
14 Cross References
15 • List Item Privatization, see Section 2.21.3.
21 Syntax
22 The syntax of the firstprivate clause is as follows:
23 firstprivate(list)
24 Description
25 The firstprivate clause provides a superset of the functionality provided by the private
26 clause.
Fortran
27 The list items that appear in a firstprivate clause may include named constants.
Fortran
10 Description
11 The lastprivate clause provides a superset of the functionality provided by the private
12 clause.
13 A list item that appears in a lastprivate clause is subject to the private clause semantics
14 described in Section 2.21.4.3. In addition, when a lastprivate clause without the
15 conditional modifier appears on a directive and the list item is not an iteration variable of one
16 of the associated loops, the value of each new list item from the sequentially last iteration of the
17 associated loops, or the lexically last section construct, is assigned to the original list item.
18 When the conditional modifier appears on the clause or the list item is an iteration variable of
19 one of the associated loops, if sequential execution of the loop nest would assign a value to the list
20 item then the original list item is assigned the value that the list item would have after sequential
21 execution of the loop nest.
C / C++
22 For an array of elements of non-array type, each element is assigned to the corresponding element
23 of the original array.
C / C++
Fortran
24 If the original list item does not have the POINTER attribute, its update occurs as if by intrinsic
25 assignment unless it has a type bound procedure as a defined assignment.
26 If the original list item has the POINTER attribute, its update occurs as if by pointer assignment.
Fortran
27 When the conditional modifier does not appear on the lastprivate clause, any list item
28 that is not an iteration variable of the associated loops and that is not assigned a value by the
29 sequentially last iteration of the loops, or by the lexically last section construct, has an
30 unspecified value after the construct. When the conditional modifier does not appear on the
31 lastprivate clause, a list item that is the iteration variable of an associated loop and that would
32 not be assigned a value during sequential execution of the loop nest has an unspecified value after
33 the construct. Unassigned subcomponents also have unspecified values after the construct.
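A non-normative sketch (the array a and the bound m, assumed to be at least 1, are placeholders): after the construct, x holds the value it would have after sequential execution of the loop.

    double last_value(int m, const double *a)
    {
        double x = 0.0;

        #pragma omp parallel for lastprivate(x)
        for (int i = 0; i < m; i++)
            x = 2.0 * a[i];                    /* value from iteration m-1 is kept */

        return x;                              /* as if the loop had run sequentially */
    }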
6 Syntax
C
7 The syntax of the linear clause is as follows:
8 linear(linear-list[ : linear-step])
27 Restrictions
28 Restrictions to the linear clause are as follows:
29 • The linear-step expression must be invariant during the execution of the region that corresponds
30 to the construct.
31 • Only a loop iteration variable of a loop that is associated with the construct may appear as a
32 list-item in a linear clause if a reduction clause with the inscan modifier also appears
33 on the construct.
C / C++
Fortran
1 Table 2.12 lists each reduction-identifier that is implicitly declared for numeric and logical types
2 and its semantic initializer value. The actual initializer value is that value as expressed in the data
3 type of the reduction list item.
Fortran
1 In the above tables, omp_in and omp_out correspond to two identifiers that refer to storage of the
2 type of the list item. If the list item is an array or array section, the identifiers to which omp_in
3 and omp_out correspond each refer to an array element. omp_out holds the final value of the
4 combiner operation.
5 Any reduction-identifier that is defined with the declare reduction directive is also valid. In
6 that case, the initializer and combiner of the reduction-identifier are specified by the
7 initializer-clause and the combiner in the declare reduction directive.
8 Description
9 A reduction clause specifies a reduction-identifier and one or more list items.
10 The reduction-identifier specified in a reduction clause must match a previously declared
11 reduction-identifier of the same name and type for each of the list items. This match is done by
12 means of a name lookup in the base language.
13 The list items that appear in a reduction clause may include array sections.
C++
14 If the type is a derived class, then any reduction-identifier that matches its base classes is also a
15 match, if no specific match for the type has been specified.
16 If the reduction-identifier is not an id-expression, then it is implicitly converted to one by
17 prepending the keyword operator (for example, + becomes operator+).
18 If the reduction-identifier is qualified then a qualified name lookup is used to find the declaration.
19 If the reduction-identifier is unqualified then an argument-dependent name lookup must be
20 performed using the type of each list item.
C++
21 If a list item is an array or array section, it will be treated as if a reduction clause would be applied
22 to each separate element of the array section.
23 If a list item is an array section, the elements of any copy of the array section will be stored
24 contiguously.
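A non-normative sketch of a predefined + reduction (the array a and the bound m are assumed): each implicit task updates a private copy initialized with the + initializer value, and the copies are combined into the original list item.

    double sum(int m, const double *a)
    {
        double s = 0.0;

        #pragma omp parallel for reduction(+: s)
        for (int i = 0; i < m; i++)
            s += a[i];                         /* each thread updates its private copy */

        return s;                              /* combined result is in the original s */
    }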
13 Tool Callbacks
14 A thread dispatches a registered ompt_callback_reduction callback with
15 ompt_sync_region_reduction in its kind argument and ompt_scope_begin as its
16 endpoint argument for each occurrence of a reduction-begin event in that thread. Similarly, a thread
17 dispatches a registered ompt_callback_reduction callback with
18 ompt_sync_region_reduction in its kind argument and ompt_scope_end as its
19 endpoint argument for each occurrence of a reduction-end event in that thread. These callbacks
20 occur in the context of the task that performs the reduction and have the type signature
21 ompt_callback_sync_region_t.
22 Restrictions
23 Restrictions common to reduction clauses are as follows:
24 • Any number of reduction clauses can be specified on the directive, but a list item (or any array
25 element in an array section) can appear only once in reduction clauses for that directive.
26 • For a reduction-identifier declared in a declare reduction directive, the directive must
27 appear before its use in a reduction clause.
28 • If a list item is an array section or an array element, its base expression must be a base language
29 identifier.
30 • If a list item is an array section, it must specify contiguous storage and it cannot be a zero-length
31 array section.
32 • If a list item is an array section or an array element, accesses to the elements of the array outside
33 the specified array section or array element result in unspecified behavior.
16 Syntax
17 The syntax of the reduction clause is as follows:
18 reduction([ reduction-modifier,]reduction-identifier : list)
24 Description
25 The reduction clause is a reduction scoping clause and a reduction participating clause, as
26 described in Section 2.21.5.2 and Section 2.21.5.3.
27 If reduction-modifier is not present or the default reduction-modifier is present, the behavior is
28 as follows. For parallel, scope and worksharing constructs, one or more private copies of
29 each list item are created for each implicit task, as if the private clause had been used. For the
30 simd construct, one or more private copies of each list item are created for each SIMD lane, as if
31 the private clause had been used. For the taskloop construct, private copies are created
32 according to the rules of the reduction scoping clauses. For the teams construct, one or more
33 private copies of each list item are created for the initial task of each team in the league, as if the
34 private clause had been used. For the loop construct, private copies are created and used in the
24 Restrictions
25 Restrictions to the reduction clause are as follows:
26 • All restrictions common to all reduction clauses, which are listed in Section 2.21.5.1, apply to
27 this clause.
28 • A list item that appears in a reduction clause of a worksharing construct must be shared in
29 the parallel region to which a corresponding worksharing region binds.
30 • A list item that appears in a reduction clause of a scope construct must be shared in the
31 parallel region to which a corresponding scope region binds.
32 • If an array section or an array element appears as a list item in a reduction clause of a
33 worksharing construct, scope construct or loop construct for which the corresponding region
34 binds to a parallel region, all threads that participate in the reduction must specify the same
35 storage location.
36 • A list item that appears in a reduction clause with the inscan reduction-modifier must
37 appear as a list item in an inclusive or exclusive clause on a scan directive enclosed by
38 the construct.
4 Syntax
5 The syntax of the task_reduction clause is as follows:
6 task_reduction(reduction-identifier : list)
8 Description
9 The task_reduction clause is a reduction scoping clause, as described in 2.21.5.2.
10 For each list item, the number of copies is unspecified. Any copies associated with the reduction
11 are initialized before they are accessed by the tasks that participate in the reduction. After the end
12 of the region, the original list item contains the result of the reduction.
13 Restrictions
14 Restrictions to the task_reduction clause are as follows:
15 • All restrictions common to all reduction clauses, which are listed in Section 2.21.5.1, apply to
16 this clause.
20 Syntax
21 The syntax of the in_reduction clause is as follows:
22 in_reduction(reduction-identifier : list)
24 Description
25 The in_reduction clause is a reduction participating clause, as described in Section 2.21.5.3.
26 For a given list item, the in_reduction clause defines a task to be a participant in a task
27 reduction that is defined by an enclosing region for a matching list item that appears in a
28 task_reduction clause or a reduction clause with task as the reduction-modifier, where
29 either:
30 1. The matching list item has the same storage location as the list item in the in_reduction
31 clause; or
32 2. A private copy, derived from the matching list item, that is used to perform the task reduction
33 has the same storage location as the list item in the in_reduction clause.
9 Restrictions
10 Restrictions to the in_reduction clause are as follows:
11 • All restrictions common to all reduction clauses, which are listed in Section 2.21.5.1, apply to
12 this clause.
13 • A list item that appears in a task_reduction clause or a reduction clause with task as
14 the reduction-modifier that is specified on a construct that corresponds to a region in which the
15 region of the participating task is closely nested must match each list item. The construct that
16 corresponds to the innermost enclosing region that meets this condition must specify the same
17 reduction-identifier for the matching list item as the in_reduction clause.
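As a non-normative illustration, the sketch below combines a task_reduction clause on a taskgroup construct (the reduction scoping clause) with in_reduction clauses on the participating tasks; the array a and its length n are assumptions of the example.
C / C++
#include <omp.h>

int sum_tasks(const int *a, int n)
{
    int res = 0;
    #pragma omp parallel
    #pragma omp single
    {
        /* The taskgroup region defines the task reduction over res. */
        #pragma omp taskgroup task_reduction(+: res)
        {
            for (int i = 0; i < n; i++) {
                /* Each task participates in the enclosing task reduction. */
                #pragma omp task in_reduction(+: res) firstprivate(i)
                res += a[i];
            }
        }
        /* After the taskgroup region, res holds the reduction result. */
    }
    return res;
}
C / C++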
23 Syntax
C
24 The syntax of the declare reduction directive is as follows:
25 #pragma omp declare reduction(reduction-identifier : typename-list :
26 combiner )[initializer-clause] new-line
27 where:
28 • reduction-identifier is either a base language identifier or one of the following operators: +, -, *,
29 &, |, ^, && or ||
30 • typename-list is a list of type names
31 • combiner is an expression
32 • initializer-clause is initializer(initializer-expr) where initializer-expr is
33 omp_priv = initializer or function-name(argument-list)
C
4 where:
5 • reduction-identifier is either an id-expression or one of the following operators: +, -, *, &, |, ^,
6 && or ||
7 • typename-list is a list of type names
8 • combiner is an expression
9 • initializer-clause is initializer(initializer-expr) where initializer-expr is
10 omp_priv initializer or function-name(argument-list)
C++
Fortran
11 The syntax of the declare reduction directive is as follows:
12 !$omp declare reduction(reduction-identifier : type-list : combiner)
13 [initializer-clause]
14 where:
15 • reduction-identifier is either a base language identifier, or a user-defined operator, or one of the
16 following operators: +, -, *, .and., .or., .eqv., .neqv., or one of the following intrinsic
17 procedure names: max, min, iand, ior, ieor.
18 • type-list is a list of type specifiers that must not be CLASS(*) or an abstract type
19 • combiner is either an assignment statement or a subroutine name followed by an argument list
20 • initializer-clause is initializer(initializer-expr), where initializer-expr is
21 omp_priv = expression or subroutine-name(argument-list)
Fortran
22 Description
23 User-defined reductions can be defined using the declare reduction directive. The
24 reduction-identifier and the type identify the declare reduction directive. The
25 reduction-identifier can later be used in a reduction clause that uses variables of the type or
26 types specified in the declare reduction directive. If the directive specifies several types then
27 the behavior is as if a declare reduction directive was specified for each type.
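For example (non-normative), a user-defined reduction-identifier named maxabs might be declared for type double and then used in a reduction clause; the identifier, combiner and initializer below are illustrative choices only.
C / C++
#include <math.h>
#include <omp.h>

/* omp_out, omp_in and omp_priv are the special variables available in the
   combiner and initializer expressions of a declare reduction directive. */
#pragma omp declare reduction(maxabs : double : omp_out = fmax(omp_out, omp_in)) initializer(omp_priv = 0.0)

double max_abs(const double *v, int n)
{
    double m = 0.0;
    #pragma omp parallel for reduction(maxabs : m)
    for (int i = 0; i < n; i++)
        m = fmax(m, fabs(v[i]));
    return m;
}
C / C++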
6 Syntax
7 The syntax of the copyin clause is as follows:
8 copyin(list)
9 Description
C / C++
10 The copy is performed after the team is formed and prior to the execution of the associated
11 structured block. For variables of non-array type, the copy is by copy assignment. For an array of
12 elements of non-array type, each element is copied as if by assignment from an element of the array
13 of the primary thread to the corresponding element of the array of all other threads.
C / C++
C++
14 For class types, the copy assignment operator is invoked. The order in which copy assignment
15 operators for different variables of the same class type are invoked is unspecified.
C++
Fortran
16 The copy is performed, as if by assignment, after the team is formed and prior to the execution of
17 the associated structured block.
18 On entry to any parallel region, each thread’s copy of a variable that is affected by a copyin
19 clause for the parallel region will acquire the type parameters, allocation, association, and
20 definition status of the copy of the primary thread, according to the following rules:
21 • If the original list item has the POINTER attribute, each copy receives the same association
22 status as that of the copy of the primary thread as if by pointer assignment.
23 • If the original list item does not have the POINTER attribute, each copy becomes defined with
24 the value of the copy of the primary thread as if by intrinsic assignment unless the list item has a
25 type bound procedure as a defined assignment. If the original list item that does not have the
26 POINTER attribute has the allocation status of unallocated, each copy will have the same status.
27 • If the original list item is unallocated or unassociated, each copy inherits the declared type
28 parameters and the default type parameter values from the original list item.
Fortran
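A non-normative C sketch of the copyin clause with a threadprivate variable follows; the initial value 42 is arbitrary.
C / C++
#include <omp.h>
#include <stdio.h>

int counter;
#pragma omp threadprivate(counter)

int main(void)
{
    counter = 42;               /* value set in the primary thread's copy */

    /* copyin copies the primary thread's copy of counter into the copy of
       every other thread in the team after the team is formed. */
    #pragma omp parallel copyin(counter)
    printf("thread %d sees counter = %d\n", omp_get_thread_num(), counter);

    return 0;
}
C / C++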
13 Cross References
14 • parallel construct, see Section 2.6.
15 • threadprivate directive, see Section 2.21.2.
23 Syntax
24 The syntax of the copyprivate clause is as follows:
25 copyprivate(list)
26 Note – The copyprivate clause is an alternative to using a shared variable for the value when
27 providing such a shared variable would be difficult (for example, in a recursion requiring a different
28 variable at each level).
29
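A non-normative sketch of the pattern described in the note: one thread produces a value inside a single construct and copyprivate broadcasts it to the private copies of the other threads (the value 3.14f stands in for real input).
C / C++
#include <omp.h>
#include <stdio.h>

void broadcast_value(void)
{
    float x;
    #pragma omp parallel private(x)
    {
        /* Only one thread executes the single region; on exit, its private
           copy of x is copied to the private copies of all other threads. */
        #pragma omp single copyprivate(x)
        x = 3.14f;                        /* e.g., read from a file */

        printf("thread %d has x = %f\n", omp_get_thread_num(), x);
    }
}
C / C++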
15 Cross References
16 • parallel construct, see Section 2.6.
17 • threadprivate directive, see Section 2.21.2.
18 • private clause, see Section 2.21.4.3.
6 Syntax
7 The syntax of the map clause is as follows:
8 map([[map-type-modifier[,] [map-type-modifier[,] ...]] map-type: ] locator-list)
22 Description
23 The list items that appear in a map clause may include array sections and structure elements.
24 The map-type and map-type-modifier specify the effect of the map clause, as described below.
25 For a given construct, the effect of a map clause with the to, from, or tofrom map-type is
26 ordered before the effect of a map clause with the alloc, release, or delete map-type. If a
27 map clause with a present map-type-modifier appears in a map clause, then the effect of the
28 clause is ordered before all other map clauses that do not have the present modifier.
29 If the mapper map-type-modifier is not present, the behavior is as if the mapper(default)
30 modifier was specified. The map behavior of a list item in a map clause is modified by a visible
31 user-defined mapper (see Section 2.21.7.4) if the mapper has the same mapper-identifier as the
32 mapper-identifier in the mapper map-type-modifier and is specified for a type that matches the
33 type of the list item. The effect of the mapper is to remove the list item from the map clause, if the
7 Note – If the effect of the map clauses on a construct would assign the value of an original list
8 item to a corresponding list item more than once, then an implementation is allowed to ignore
9 additional assignments of the same value to the corresponding list item.
10
11 In all cases on entry to the region, concurrent reads or updates of any part of the corresponding list
12 item must be synchronized with any update of the corresponding list item that occurs as a result of
13 the map clause to avoid data races.
14 If the map clause appears on a target, target data, or target exit data construct and a
15 corresponding list item of the original list item is not present in the device data environment on exit
16 from the region then the list item is ignored. Alternatively, if the map clause appears on a target,
17 target data, or target exit data construct and a corresponding list item of the original list
18 item is present in the device data environment on exit from the region, then the following sequence
19 of steps occurs as if performed as a single atomic operation:
20 1. If the map-type is not delete and the reference count of the corresponding list item is finite
21 and was not already decremented because of the effect of a map clause on the construct then:
22 a) The reference count of the corresponding list item is decremented by one;
23 2. If the map-type is delete and the reference count of the corresponding list item is finite then:
24 a) The reference count of the corresponding list item is set to zero;
25 3. If the map-type is from or tofrom and if the reference count of the corresponding list item is
26 zero or the always map-type-modifier is present then:
C / C++
27 a) For each part of the list item that is an attached pointer, that part of the original list item will
28 have the value that it had at the point immediately prior to the effect of the map clause; and
C / C++
Fortran
29 a) For each part of the list item that is an attached pointer, that part of the original list item, if
30 associated, will be associated with the same pointer target with which it was associated at
31 the point immediately prior to the effect of the map clause; and
Fortran
6 Note – If the effect of the map clauses on a construct would assign the value of a corresponding
7 list item to an original list item more than once, then an implementation is allowed to ignore
8 additional assignments of the same value to the original list item.
9
10 In all cases on exit from the region, concurrent reads or updates of any part of the original list item
11 must be synchronized with any update of the original list item that occurs as a result of the map
12 clause to avoid data races.
13 If a single contiguous part of the original storage of a list item with an implicit data-mapping
14 attribute has corresponding storage in the device data environment prior to a task encountering the
15 construct that is associated with the map clause, only that part of the original storage will have
16 corresponding storage in the device data environment as a result of the map clause.
17 If a list item with an implicit data-mapping attribute does not have any corresponding storage in the
18 device data environment prior to a task encountering the construct associated with the map clause,
19 and one or more contiguous parts of the original storage are either list items or base pointers to list
20 items that are explicitly mapped on the construct, only those parts of the original storage will have
21 corresponding storage in the device data environment as a result of the map clauses on the
22 construct.
C / C++
23 If a new list item is created then a new list item of the same type, with automatic storage duration, is
24 allocated for the construct. The size and alignment of the new list item are determined by the static
25 type of the variable. This allocation occurs if the region references the list item in any statement.
26 Initialization and assignment of the new list item are through bitwise copy.
C / C++
Fortran
27 If a new list item is created then a new list item of the same type, type parameter, and rank is
28 allocated. The new list item inherits all default values for the type parameters from the original list
29 item. The value of the new list item becomes that of the original list item in the map initialization
30 and assignment.
31 If the allocation status of an original list item that has the ALLOCATABLE attribute is changed
32 while a corresponding list item is present in the device data environment, the allocation status of the
33 corresponding list item is unspecified until the list item is again mapped with an always modifier
34 on entry to a target, target data or target enter data region.
Fortran
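A non-normative sketch of typical map clauses on a target construct: x is only read on the device, while y is read, written and copied back. The combined construct and the saxpy computation are illustrative.
C / C++
#include <omp.h>

void saxpy(int n, float a, const float *x, float *y)
{
    /* map(to:...) copies x[0:n] to the device on entry; map(tofrom:...)
       copies y[0:n] on entry and copies it back on exit. */
    #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
C / C++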
8 Tool Callbacks
9 A thread dispatches a registered ompt_callback_target_map or
10 ompt_callback_target_map_emi callback for each occurrence of a target-map event in
11 that thread. The callback occurs in the context of the target task and has type signature
12 ompt_callback_target_map_t or ompt_callback_target_map_emi_t,
13 respectively.
14 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
15 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
16 event in that thread. Similarly, a thread dispatches a registered
17 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
18 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
19 type signature ompt_callback_target_data_op_emi_t.
20 A thread dispatches a registered ompt_callback_target_data_op callback for each
21 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
22 target task and has type signature ompt_callback_target_data_op_t.
23 Restrictions
24 Restrictions to the map clause are as follows:
25 • Each of the map-type-modifier modifiers can appear at most once in the map clause.
C / C++
26 • List items of the map clauses on the same construct must not share original storage unless they
27 are the same lvalue expression or array section.
C / C++
28 • If a list item is an array section, it must specify contiguous storage.
29 • If an expression that is used to form a list item in a map clause contains an iterator identifier, the
30 list item instances that would result from different values of the iterator must not have the same
31 containing array and must not have base pointers that share original storage.
17 Cross References
18 • Array sections, see Section 2.1.5.
19 • Iterators, see Section 2.1.6.
20 • declare mapper directive, see Section 2.21.7.4.
21 • ompt_callback_target_data_op_t or
22 ompt_callback_target_data_op_emi_t callback type, see Section 4.5.2.25.
23 • ompt_callback_target_map_t or ompt_callback_target_map_emi_t callback
24 type, see Section 4.5.2.27.
10 Syntax
11 The syntax of the defaultmap clause is as follows:
12 defaultmap(implicit-behavior[:variable-category])
C / C++
22 and variable-category is one of:
23 scalar
24 aggregate
25 pointer
C / C++
6 Description
7 The defaultmap clause sets the implicit data-mapping attribute for all variables referenced in the
8 construct. If variable-category is specified, the effect of the defaultmap clause is as follows:
9 • If variable-category is scalar, all scalar variables of non-pointer type (for C/C++) or all
10 non-pointer, non-allocatable scalar variables (for Fortran) that have an implicitly determined data-mapping or data-sharing
11 attribute will have a data-mapping or data-sharing attribute specified by implicit-behavior.
12 • If variable-category is aggregate or allocatable, all aggregate or allocatable variables
13 that have an implicitly determined data-mapping or data-sharing attribute will have a
14 data-mapping or data-sharing attribute specified by implicit-behavior.
15 • If variable-category is pointer, all variables of pointer type or with the POINTER attribute
16 that have implicitly determined data-mapping or data-sharing attributes will have a data-mapping
17 or data-sharing attribute specified by implicit-behavior.
18 If no variable-category is specified in the clause then implicit-behavior specifies the implicitly
19 determined data-mapping or data-sharing attribute for all variables referenced in the construct. If
20 implicit-behavior is none, each variable referenced in the construct that does not have a
21 predetermined data-sharing attribute and does not appear in a to or link clause on a declare
22 target directive must be listed in a data-mapping attribute clause, a data-sharing attribute clause
23 (including a data-sharing attribute clause on a combined construct where target is one of the
24 constituent constructs), an is_device_ptr clause or a has_device_addr clause. If
25 implicit-behavior is default, then the clause has no effect for the variables in the category
26 specified by variable-category. If implicit-behavior is present, each variable referenced in the
27 construct in the category specified by variable-category is treated as if it had been listed in a map
28 clause with the map-type of alloc and map-type-modifier of present.
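As a non-normative illustration, the following sketch uses defaultmap to change the implicit data-mapping attribute of scalars on a target construct (by default they would be firstprivate); the checksum computation is illustrative only.
C / C++
#include <omp.h>

double scale(int n, double s, double *v)
{
    double checksum = 0.0;
    /* defaultmap(tofrom: scalar) maps n, s and checksum tofrom instead of
       treating them as firstprivate, so the device update of checksum is
       visible after the region. */
    #pragma omp target map(tofrom: v[0:n]) defaultmap(tofrom: scalar)
    for (int i = 0; i < n; i++) {
        v[i] *= s;
        checksum += v[i];
    }
    return checksum;
}
C / C++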
21 Description
22 User-defined mappers can be defined using the declare mapper directive. The type and an
23 optional mapper-identifier uniquely identify the mapper for use in a map clause or motion clause
24 later in the program. The visibility and accessibility of this declaration are the same as those of a
25 variable declared at the same location in the program.
26 If mapper-identifier is not specified, the behavior is as if mapper-identifier is default.
27 The variable declared by var is available for use in all map clauses on the directive, and no part of
28 the variable to be mapped is mapped by default.
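A non-normative sketch of a user-defined mapper for a hypothetical structure type vec_t: whenever a vec_t is mapped, the buffer its data member points to is mapped as well.
C / C++
#include <omp.h>

typedef struct {
    int     n;
    double *data;   /* points to n doubles */
} vec_t;

/* Default mapper for vec_t: map the structure and attach data[0:n]. */
#pragma omp declare mapper(vec_t v) map(v, v.data[0:v.n])

double first_element_on_device(vec_t a)
{
    double r;
    /* map(tofrom: a) is modified by the visible default mapper for vec_t. */
    #pragma omp target map(tofrom: a) map(from: r)
    r = a.data[0];
    return r;
}
C / C++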
19 Cross References
20 • target update construct, see Section 2.14.6.
21 • map clause, see Section 2.21.7.1.
Fortran
5 true means a logical value of .TRUE. and false means a logical value of .FALSE..
Fortran
Fortran
6 Restrictions
7 The following restrictions apply to all OpenMP runtime library routines:
8 • OpenMP runtime library routines may not be called from PURE or ELEMENTAL procedures.
9 • OpenMP runtime library routines may not be called in DO CONCURRENT constructs.
Fortran
17 3.2.1 omp_set_num_threads
18 Summary
19 The omp_set_num_threads routine affects the number of threads to be used for subsequent
20 parallel regions that do not specify a num_threads clause, by setting the value of the first
21 element of the nthreads-var ICV of the current task.
22 Format
C / C++
23 void omp_set_num_threads(int num_threads);
C / C++
Fortran
24 subroutine omp_set_num_threads(num_threads)
25 integer num_threads
Fortran
4 Binding
5 The binding task set for an omp_set_num_threads region is the generating task.
6 Effect
7 The effect of this routine is to set the value of the first element of the nthreads-var ICV of the
8 current task to the value specified in the argument.
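A non-normative usage sketch: the routine sets nthreads-var, which the next parallel region without a num_threads clause uses as its request.
C / C++
#include <omp.h>
#include <stdio.h>

int main(void)
{
    omp_set_num_threads(4);     /* sets nthreads-var of the current task */

    /* No num_threads clause, so up to 4 threads are requested; the team
       may still be smaller (see Section 2.6.1). */
    #pragma omp parallel
    {
        #pragma omp single
        printf("team size: %d\n", omp_get_num_threads());
    }
    return 0;
}
C / C++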
9 Cross References
10 • nthreads-var ICV, see Section 2.4.
11 • parallel construct and num_threads clause, see Section 2.6.
12 • Determining the number of threads for a parallel region, see Section 2.6.1.
13 • omp_get_num_threads routine, see Section 3.2.2.
14 • omp_get_max_threads routine, see Section 3.2.3.
15 • OMP_NUM_THREADS environment variable, see Section 6.2.
16 3.2.2 omp_get_num_threads
17 Summary
18 The omp_get_num_threads routine returns the number of threads in the current team.
19 Format
C / C++
20 int omp_get_num_threads(void);
C / C++
Fortran
21 integer function omp_get_num_threads()
Fortran
22 Binding
23 The binding region for an omp_get_num_threads region is the innermost enclosing
24 parallel region.
25 Effect
26 The omp_get_num_threads routine returns the number of threads in the team that is executing
27 the parallel region to which the routine region binds. If called from the sequential part of a
28 program, this routine returns 1.
7 3.2.3 omp_get_max_threads
8 Summary
9 The omp_get_max_threads routine returns an upper bound on the number of threads that
10 could be used to form a new team if a parallel construct without a num_threads clause were
11 encountered after execution returns from this routine.
12 Format
C / C++
13 int omp_get_max_threads(void);
C / C++
Fortran
14 integer function omp_get_max_threads()
Fortran
15 Binding
16 The binding task set for an omp_get_max_threads region is the generating task.
17 Effect
18 The value returned by omp_get_max_threads is the value of the first element of the
19 nthreads-var ICV of the current task. This value is also an upper bound on the number of threads
20 that could be used to form a new team if a parallel region without a num_threads clause were
21 encountered after execution returns from this routine.
22
23 Note – The return value of the omp_get_max_threads routine can be used to allocate
24 sufficient storage dynamically for all threads in the team formed at the subsequent active
25 parallel region.
26
9 3.2.4 omp_get_thread_num
10 Summary
11 The omp_get_thread_num routine returns the thread number, within the current team, of the
12 calling thread.
13 Format
C / C++
14 int omp_get_thread_num(void);
C / C++
Fortran
15 integer function omp_get_thread_num()
Fortran
16 Binding
17 The binding thread set for an omp_get_thread_num region is the current team. The binding
18 region for an omp_get_thread_num region is the innermost enclosing parallel region.
19 Effect
20 The omp_get_thread_num routine returns the thread number of the calling thread, within the
21 team that is executing the parallel region to which the routine region binds. The thread number
22 is an integer between 0 and one less than the value returned by omp_get_num_threads,
23 inclusive. The thread number of the primary thread of the team is 0. The routine returns 0 if it is
24 called from the sequential part of a program.
25
26 Note – The thread number may change during the execution of an untied task. The value returned
27 by omp_get_thread_num is not generally useful during the execution of such a task region.
28
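A non-normative sketch using the thread number for a manual cyclic decomposition of an initialization loop, as an alternative to a worksharing-loop construct.
C / C++
#include <omp.h>

void zero_init(double *a, int n)
{
    #pragma omp parallel
    {
        int tid  = omp_get_thread_num();    /* 0 .. team size - 1 */
        int nthr = omp_get_num_threads();   /* size of the current team */
        for (int i = tid; i < n; i += nthr)
            a[i] = 0.0;
    }
}
C / C++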
8 3.2.5 omp_in_parallel
9 Summary
10 The omp_in_parallel routine returns true if the active-levels-var ICV is greater than zero;
11 otherwise, it returns false.
12 Format
C / C++
13 int omp_in_parallel(void);
C / C++
Fortran
14 logical function omp_in_parallel()
Fortran
15 Binding
16 The binding task set for an omp_in_parallel region is the generating task.
17 Effect
18 The effect of the omp_in_parallel routine is to return true if the current task is enclosed by an
19 active parallel region, and the parallel region is enclosed by the outermost initial task
20 region on the device; otherwise it returns false.
21 Cross References
22 • active-levels-var, see Section 2.4.
23 • parallel construct, see Section 2.6.
24 • omp_get_num_threads routine, see Section 3.2.2.
25 • omp_get_active_level routine, see Section 3.2.20.
6 Format
C / C++
7 void omp_set_dynamic(int dynamic_threads);
C / C++
Fortran
8 subroutine omp_set_dynamic(dynamic_threads)
9 logical dynamic_threads
Fortran
10 Binding
11 The binding task set for an omp_set_dynamic region is the generating task.
12 Effect
13 For implementations that support dynamic adjustment of the number of threads, if the argument to
14 omp_set_dynamic evaluates to true, dynamic adjustment is enabled for the current task;
15 otherwise, dynamic adjustment is disabled for the current task. For implementations that do not
16 support dynamic adjustment of the number of threads, this routine has no effect: the value of
17 dyn-var remains false.
18 Cross References
19 • dyn-var ICV, see Section 2.4.
20 • Determining the number of threads for a parallel region, see Section 2.6.1.
21 • omp_get_num_threads routine, see Section 3.2.2.
22 • omp_get_dynamic routine, see Section 3.2.7.
23 • OMP_DYNAMIC environment variable, see Section 6.3.
24 3.2.7 omp_get_dynamic
25 Summary
26 The omp_get_dynamic routine returns the value of the dyn-var ICV, which determines whether
27 dynamic adjustment of the number of threads is enabled or disabled.
4 Binding
5 The binding task set for an omp_get_dynamic region is the generating task.
6 Effect
7 This routine returns true if dynamic adjustment of the number of threads is enabled for the current
8 task; it returns false, otherwise. If an implementation does not support dynamic adjustment of the
9 number of threads, then this routine always returns false.
10 Cross References
11 • dyn-var ICV, see Section 2.4.
12 • Determining the number of threads for a parallel region, see Section 2.6.1.
13 • omp_set_dynamic routine, see Section 3.2.6.
14 • OMP_DYNAMIC environment variable, see Section 6.3.
15 3.2.8 omp_get_cancellation
16 Summary
17 The omp_get_cancellation routine returns the value of the cancel-var ICV, which
18 determines if cancellation is enabled or disabled.
19 Format
C / C++
20 int omp_get_cancellation(void);
C / C++
Fortran
21 logical function omp_get_cancellation()
Fortran
22 Binding
23 The binding task set for an omp_get_cancellation region is the whole program.
3 Cross References
4 • cancel-var ICV, see Section 2.4.1.
5 • cancel construct, see Section 2.20.1.
6 • OMP_CANCELLATION environment variable, see Section 6.11.
11 Format
C / C++
12 void omp_set_nested(int nested);
C / C++
Fortran
13 subroutine omp_set_nested(nested)
14 logical nested
Fortran
15 Binding
16 The binding task set for an omp_set_nested region is the generating task.
17 Effect
18 If the argument to omp_set_nested evaluates to true, the value of the max-active-levels-var
19 ICV is set to the number of active levels of parallelism that the implementation supports; otherwise,
20 if the value of max-active-levels-var is greater than 1 then it is set to 1. This routine has been
21 deprecated.
22 Cross References
23 • max-active-levels-var ICV, see Section 2.4.
24 • Determining the number of threads for a parallel region, see Section 2.6.1.
25 • omp_get_nested routine, see Section 3.2.10.
26 • omp_set_max_active_levels routine, see Section 3.2.15.
27 • omp_get_max_active_levels routine, see Section 3.2.16.
28 • OMP_NESTED environment variable, see Section 6.9.
5 Format
C / C++
6 int omp_get_nested(void);
C / C++
Fortran
7 logical function omp_get_nested()
Fortran
8 Binding
9 The binding task set for an omp_get_nested region is the generating task.
10 Effect
11 This routine returns true if max-active-levels-var is greater than 1 and greater than active-levels-var
12 for the current task; it returns false, otherwise. If an implementation does not support nested
13 parallelism, this routine always returns false. This routine has been deprecated.
14 Cross References
15 • max-active-levels-var ICV, see Section 2.4.
16 • Determining the number of threads for a parallel region, see Section 2.6.1.
17 • omp_set_nested routine, see Section 3.2.9.
18 • omp_set_max_active_levels routine, see Section 3.2.15.
19 • omp_get_max_active_levels routine, see Section 3.2.16.
20 • OMP_NESTED environment variable, see Section 6.9.
21 3.2.11 omp_set_schedule
22 Summary
23 The omp_set_schedule routine affects the schedule that is applied when runtime is used as
24 schedule kind, by setting the value of the run-sched-var ICV.
6 Constraints on Arguments
7 The first argument passed to this routine can be one of the valid OpenMP schedule kinds (except for
8 runtime) or any implementation-specific schedule. The C/C++ header file (omp.h) and the
9 Fortran include file (omp_lib.h) and/or Fortran module file (omp_lib) define the valid
10 constants. The valid constants must include the following, which can be extended with
11 implementation-specific values:
C / C++
12 typedef enum omp_sched_t {
13 // schedule kinds
14 omp_sched_static = 0x1,
15 omp_sched_dynamic = 0x2,
16 omp_sched_guided = 0x3,
17 omp_sched_auto = 0x4,
18
19 // schedule modifier
20 omp_sched_monotonic = 0x80000000u
21 } omp_sched_t;
C / C++
19 Binding
20 The binding task set for an omp_set_schedule region is the generating task.
21 Effect
22 The effect of this routine is to set the value of the run-sched-var ICV of the current task to the
23 values specified in the two arguments. The schedule is set to the schedule kind that is specified by
24 the first argument kind. It can be any of the standard schedule kinds or any other
25 implementation-specific one. For the schedule kinds static, dynamic, and guided the
26 chunk_size is set to the value of the second argument, or to the default chunk_size if the value of the
27 second argument is less than 1; for the schedule kind auto the second argument has no meaning;
28 for implementation-specific schedule kinds, the values and associated meanings of the second
29 argument are implementation defined.
30 Each of the schedule kinds can be combined with the omp_sched_monotonic modifier by
31 using the + or | operators in C/C++ or the + operator in Fortran. If the schedule kind is combined
32 with the omp_sched_monotonic modifier, the schedule is modified as if the monotonic
33 schedule modifier was specified. Otherwise, the schedule modifier is nonmonotonic.
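A non-normative sketch: select a monotonic dynamic schedule with chunk size 8 for loops that specify schedule(runtime); the cast keeps the combined value of type omp_sched_t.
C / C++
#include <omp.h>

void square_all(double *a, int n)
{
    omp_set_schedule((omp_sched_t)(omp_sched_dynamic | omp_sched_monotonic), 8);

    #pragma omp parallel for schedule(runtime)
    for (int i = 0; i < n; i++)
        a[i] = a[i] * a[i];
}
C / C++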
6 3.2.12 omp_get_schedule
7 Summary
8 The omp_get_schedule routine returns the schedule that is applied when the runtime schedule
9 is used.
10 Format
C / C++
11 void omp_get_schedule(omp_sched_t *kind, int *chunk_size);
C / C++
Fortran
12 subroutine omp_get_schedule(kind, chunk_size)
13 integer (kind=omp_sched_kind) kind
14 integer chunk_size
Fortran
15 Binding
16 The binding task set for an omp_get_schedule region is the generating task.
17 Effect
18 This routine returns the run-sched-var ICV in the task to which the routine binds. The first
19 argument kind returns the schedule to be used. It can be any of the standard schedule kinds as
20 defined in Section 3.2.11, or any implementation-specific schedule kind. The second argument
21 chunk_size returns the chunk size to be used, or a value less than 1 if the default chunk size is to be
22 used, if the returned schedule kind is static, dynamic, or guided. The value returned by the
23 second argument is implementation defined for any other schedule kinds.
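A non-normative query sketch that reads back run-sched-var:
C / C++
#include <omp.h>
#include <stdio.h>

int main(void)
{
    omp_sched_t kind;
    int chunk;

    omp_get_schedule(&kind, &chunk);   /* reads run-sched-var of the current task */
    printf("runtime schedule: kind=%d chunk_size=%d\n", (int)kind, chunk);
    return 0;
}
C / C++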
24 Cross References
25 • run-sched-var ICV, see Section 2.4.
26 • Determining the schedule of a worksharing-loop, see Section 2.11.4.1.
27 • omp_set_schedule routine, see Section 3.2.11.
28 • OMP_SCHEDULE environment variable, see Section 6.1.
5 Format
C / C++
6 int omp_get_thread_limit(void);
C / C++
Fortran
7 integer function omp_get_thread_limit()
Fortran
8 Binding
9 The binding task set for an omp_get_thread_limit region is the generating task.
10 Effect
11 The omp_get_thread_limit routine returns the value of the thread-limit-var ICV.
12 Cross References
13 • thread-limit-var ICV, see Section 2.4.
14 • omp_get_num_threads routine, see Section 3.2.2.
15 • OMP_NUM_THREADS environment variable, see Section 6.2.
16 • OMP_THREAD_LIMIT environment variable, see Section 6.10.
17 3.2.14 omp_get_supported_active_levels
18 Summary
19 The omp_get_supported_active_levels routine returns the number of active levels of
20 parallelism supported by the implementation.
21 Format
C / C++
22 int omp_get_supported_active_levels(void);
C / C++
Fortran
23 integer function omp_get_supported_active_levels()
Fortran
4 Effect
5 The omp_get_supported_active_levels routine returns the number of active levels of
6 parallelism supported by the implementation. The max-active-levels-var ICV may not have a value
7 that is greater than this number. The value returned by the
8 omp_get_supported_active_levels routine is implementation defined, but it must be
9 greater than 0.
10 Cross References
11 • max-active-levels-var ICV, see Section 2.4.
12 • omp_set_max_active_levels routine, see Section 3.2.15.
13 • omp_get_max_active_levels routine, see Section 3.2.16.
14 3.2.15 omp_set_max_active_levels
15 Summary
16 The omp_set_max_active_levels routine limits the number of nested active parallel
17 regions when a new nested parallel region is generated by the current task by setting the
18 max-active-levels-var ICV.
19 Format
C / C++
20 void omp_set_max_active_levels(int max_levels);
C / C++
Fortran
21 subroutine omp_set_max_active_levels(max_levels)
22 integer max_levels
Fortran
23 Constraints on Arguments
24 The value of the argument passed to this routine must evaluate to a non-negative integer, otherwise
25 the behavior of this routine is implementation defined.
26 Binding
27 The binding task set for an omp_set_max_active_levels region is the generating task.
7 Cross References
8 • max-active-levels-var ICV, see Section 2.4.
9 • parallel construct, see Section 2.6.
10 • omp_get_supported_active_levels routine, see Section 3.2.14.
11 • omp_get_max_active_levels routine, see Section 3.2.16.
12 • OMP_MAX_ACTIVE_LEVELS environment variable, see Section 6.8.
13 3.2.16 omp_get_max_active_levels
14 Summary
15 The omp_get_max_active_levels routine returns the value of the max-active-levels-var
16 ICV, which determines the maximum number of nested active parallel regions when the innermost
17 parallel region is generated by the current task.
18 Format
C / C++
19 int omp_get_max_active_levels(void);
C / C++
Fortran
20 integer function omp_get_max_active_levels()
Fortran
21 Binding
22 The binding task set for an omp_get_max_active_levels region is the generating task.
23 Effect
24 The omp_get_max_active_levels routine returns the value of the max-active-levels-var
25 ICV. The current task may only generate an active parallel region if the returned value is greater
26 than the value of the active-levels-var ICV.
7 3.2.17 omp_get_level
8 Summary
9 The omp_get_level routine returns the value of the levels-var ICV.
10 Format
C / C++
11 int omp_get_level(void);
C / C++
Fortran
12 integer function omp_get_level()
Fortran
13 Binding
14 The binding task set for an omp_get_level region is the generating task.
15 Effect
16 The effect of the omp_get_level routine is to return the number of nested parallel regions
17 (whether active or inactive) that enclose the current task such that all of the parallel regions are
18 enclosed by the outermost initial task region on the current device.
19 Cross References
20 • levels-var ICV, see Section 2.4.
21 • parallel construct, see Section 2.6.
22 • omp_get_active_level routine, see Section 3.2.20.
23 • OMP_MAX_ACTIVE_LEVELS environment variable, see Section 6.8.
5 Format
C / C++
6 int omp_get_ancestor_thread_num(int level);
C / C++
Fortran
7 integer function omp_get_ancestor_thread_num(level)
8 integer level
Fortran
9 Binding
10 The binding thread set for an omp_get_ancestor_thread_num region is the encountering
11 thread. The binding region for an omp_get_ancestor_thread_num region is the innermost
12 enclosing parallel region.
13 Effect
14 The omp_get_ancestor_thread_num routine returns the thread number of the ancestor at a
15 given nest level of the current thread or the thread number of the current thread. If the requested
16 nest level is outside the range of 0 and the nest level of the current thread, as returned by the
17 omp_get_level routine, the routine returns -1.
18
23 Cross References
24 • parallel construct, see Section 2.6.
25 • omp_get_num_threads routine, see Section 3.2.2.
26 • omp_get_thread_num routine, see Section 3.2.4.
27 • omp_get_level routine, see Section 3.2.17.
28 • omp_get_team_size routine, see Section 3.2.19.
5 Format
C / C++
6 int omp_get_team_size(int level);
C / C++
Fortran
7 integer function omp_get_team_size(level)
8 integer level
Fortran
9 Binding
10 The binding thread set for an omp_get_team_size region is the encountering thread. The
11 binding region for an omp_get_team_size region is the innermost enclosing parallel
12 region.
13 Effect
14 The omp_get_team_size routine returns the size of the thread team to which the ancestor or
15 the current thread belongs. If the requested nest level is outside the range of 0 and the nest
16 level of the current thread, as returned by the omp_get_level routine, the routine returns -1.
17 Inactive parallel regions are regarded like active parallel regions executed with one thread.
18
19 Note – When the omp_get_team_size routine is called with a value of level=0, the routine
20 always returns 1. If level=omp_get_level(), the routine has the same effect as the
21 omp_get_num_threads routine.
22
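A non-normative sketch with two levels of nested parallelism; the num_threads values are arbitrary.
C / C++
#include <omp.h>
#include <stdio.h>

int main(void)
{
    omp_set_max_active_levels(2);      /* allow two active levels */

    #pragma omp parallel num_threads(2)
    #pragma omp parallel num_threads(3)
    {
        #pragma omp single
        printf("level %d: ancestor at level 1 is thread %d of a team of %d\n",
               omp_get_level(),
               omp_get_ancestor_thread_num(1),
               omp_get_team_size(1));
    }
    return 0;
}
C / C++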
23 Cross References
24 • omp_get_num_threads routine, see Section 3.2.2.
25 • omp_get_level routine, see Section 3.2.17.
26 • omp_get_ancestor_thread_num routine, see Section 3.2.18.
27 3.2.20 omp_get_active_level
28 Summary
29 The omp_get_active_level routine returns the value of the active-levels-var ICV.
4 Binding
5 The binding task set for an omp_get_active_level region is the generating task.
6 Effect
7 The effect of the omp_get_active_level routine is to return the number of nested active
8 parallel regions enclosing the current task such that all of the parallel regions are enclosed
9 by the outermost initial task region on the current device.
10 Cross References
11 • active-levels-var ICV, see Section 2.4.
12 • omp_set_max_active_levels routine, see Section 3.2.15.
13 • omp_get_max_active_levels routine, see Section 3.2.16.
14 • omp_get_level routine, see Section 3.2.17.
15 • OMP_MAX_ACTIVE_LEVELS environment variable, see Section 6.8.
18 3.3.1 omp_get_proc_bind
19 Summary
20 The omp_get_proc_bind routine returns the thread affinity policy to be used for the
21 subsequent nested parallel regions that do not specify a proc_bind clause.
22 Format
C / C++
23 omp_proc_bind_t omp_get_proc_bind(void);
C / C++
2 Constraints on Arguments
3 The value returned by this routine must be one of the valid affinity policy kinds. The C/C++ header
4 file (omp.h) and the Fortran include file (omp_lib.h) and/or Fortran module file (omp_lib)
5 define the valid constants. The valid constants must include the following:
C / C++
6 typedef enum omp_proc_bind_t {
7 omp_proc_bind_false = 0,
8 omp_proc_bind_true = 1,
9 omp_proc_bind_primary = 2,
10 omp_proc_bind_master = omp_proc_bind_primary, // (deprecated)
11 omp_proc_bind_close = 3,
12 omp_proc_bind_spread = 4
13 } omp_proc_bind_t;
C / C++
Fortran
14 integer (kind=omp_proc_bind_kind), &
15 parameter :: omp_proc_bind_false = 0
16 integer (kind=omp_proc_bind_kind), &
17 parameter :: omp_proc_bind_true = 1
18 integer (kind=omp_proc_bind_kind), &
19 parameter :: omp_proc_bind_primary = 2
20 integer (kind=omp_proc_bind_kind), &
21 parameter :: omp_proc_bind_master = &
22 omp_proc_bind_primary ! (deprecated)
23 integer (kind=omp_proc_bind_kind), &
24 parameter :: omp_proc_bind_close = 3
25 integer (kind=omp_proc_bind_kind), &
26 parameter :: omp_proc_bind_spread = 4
Fortran
27 Binding
28 The binding task set for an omp_get_proc_bind region is the generating task.
29 Effect
30 The effect of this routine is to return the value of the first element of the bind-var ICV of the current
31 task. See Section 2.6.2 for the rules that govern the thread affinity policy.
7 3.3.2 omp_get_num_places
8 Summary
9 The omp_get_num_places routine returns the number of places available to the execution
10 environment in the place list.
11 Format
C / C++
12 int omp_get_num_places(void);
C / C++
Fortran
13 integer function omp_get_num_places()
Fortran
14 Binding
15 The binding thread set for an omp_get_num_places region is all threads on a device. The
16 effect of executing this routine is not related to any specific region corresponding to any construct
17 or API routine.
18 Effect
19 The omp_get_num_places routine returns the number of places in the place list. This value is
20 equivalent to the number of places in the place-partition-var ICV in the execution environment of
21 the initial task.
22 Cross References
23 • place-partition-var ICV, see Section 2.4.
24 • Controlling OpenMP thread affinity, see Section 2.6.2.
25 • omp_get_place_num routine, see Section 3.3.5.
26 • OMP_PLACES environment variable, see Section 6.5.
5 Format
C / C++
6 int omp_get_place_num_procs(int place_num);
C / C++
Fortran
7 integer function omp_get_place_num_procs(place_num)
8 integer place_num
Fortran
9 Binding
10 The binding thread set for an omp_get_place_num_procs region is all threads on a device.
11 The effect of executing this routine is not related to any specific region corresponding to any
12 construct or API routine.
13 Effect
14 The omp_get_place_num_procs routine returns the number of processors associated with
15 the place numbered place_num. The routine returns zero when place_num is negative, or is greater
16 than or equal to the value returned by omp_get_num_places().
17 Cross References
18 • place-partition-var ICV, see Section 2.4.
19 • Controlling OpenMP thread affinity, see Section 2.6.2.
20 • omp_get_num_places routine, see Section 3.3.2.
21 • omp_get_place_proc_ids routine, see Section 3.3.4.
22 • OMP_PLACES environment variable, see Section 6.5.
23 3.3.4 omp_get_place_proc_ids
24 Summary
25 The omp_get_place_proc_ids routine returns the numerical identifiers of the processors
26 available to the execution environment in the specified place.
27 Format
C / C++
28 void omp_get_place_proc_ids(int place_num, int *ids);
C / C++
8 Effect
9 The omp_get_place_proc_ids routine returns the numerical identifiers of each processor
10 associated with the place numbered place_num. The numerical identifiers are non-negative and
11 their meaning is implementation defined. The numerical identifiers are returned in the array ids and
12 their order in the array is implementation defined. The array must be sufficiently large to contain
13 omp_get_place_num_procs(place_num) integers; otherwise, the behavior is unspecified.
14 The routine has no effect when place_num has a negative value or a value greater than or equal to
15 omp_get_num_places().
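A non-normative sketch that enumerates the place list and the processor identifiers of each place:
C / C++
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int nplaces = omp_get_num_places();
    for (int p = 0; p < nplaces; p++) {
        int nprocs = omp_get_place_num_procs(p);
        if (nprocs <= 0) {
            printf("place %d: (no processors)\n", p);
            continue;
        }
        int *ids = (int *)malloc((size_t)nprocs * sizeof(int));
        omp_get_place_proc_ids(p, ids);       /* fills ids[0..nprocs-1] */
        printf("place %d:", p);
        for (int i = 0; i < nprocs; i++)
            printf(" %d", ids[i]);
        printf("\n");
        free(ids);
    }
    return 0;
}
C / C++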
16 Cross References
17 • place-partition-var ICV, see Section 2.4.
18 • Controlling OpenMP thread affinity, see Section 2.6.2.
19 • omp_get_num_places routine, see Section 3.3.2.
20 • omp_get_place_num_procs routine, see Section 3.3.3.
21 • OMP_PLACES environment variable, see Section 6.5.
22 3.3.5 omp_get_place_num
23 Summary
24 The omp_get_place_num routine returns the place number of the place to which the
25 encountering thread is bound.
26 Format
C / C++
27 int omp_get_place_num(void);
C / C++
Fortran
28 integer function omp_get_place_num()
Fortran
3 Effect
4 When the encountering thread is bound to a place, the omp_get_place_num routine returns the
5 place number associated with the thread. The returned value is between 0 and one less than the
6 value returned by omp_get_num_places(), inclusive. When the encountering thread is not
7 bound to a place, the routine returns -1.
8 Cross References
9 • place-partition-var ICV, see Section 2.4.
10 • Controlling OpenMP thread affinity, see Section 2.6.2.
11 • omp_get_num_places routine, see Section 3.3.2.
12 • OMP_PLACES environment variable, see Section 6.5.
13 3.3.6 omp_get_partition_num_places
14 Summary
15 The omp_get_partition_num_places routine returns the number of places in the place
16 partition of the innermost implicit task.
17 Format
C / C++
18 int omp_get_partition_num_places(void);
C / C++
Fortran
19 integer function omp_get_partition_num_places()
Fortran
20 Binding
21 The binding task set for an omp_get_partition_num_places region is the encountering
22 implicit task.
23 Effect
24 The omp_get_partition_num_places routine returns the number of places in the
25 place-partition-var ICV.
6 3.3.7 omp_get_partition_place_nums
7 Summary
8 The omp_get_partition_place_nums routine returns the list of place numbers
9 corresponding to the places in the place-partition-var ICV of the innermost implicit task.
10 Format
C / C++
11 void omp_get_partition_place_nums(int *place_nums);
C / C++
Fortran
12 subroutine omp_get_partition_place_nums(place_nums)
13 integer place_nums(*)
Fortran
14 Binding
15 The binding task set for an omp_get_partition_place_nums region is the encountering
16 implicit task.
17 Effect
18 The omp_get_partition_place_nums routine returns the list of place numbers that
19 correspond to the places in the place-partition-var ICV of the innermost implicit task. The array
20 must be sufficiently large to contain omp_get_partition_num_places() integers;
21 otherwise, the behavior is unspecified.
22 Cross References
23 • place-partition-var ICV, see Section 2.4.
24 • Controlling OpenMP thread affinity, see Section 2.6.2.
25 • omp_get_partition_num_places routine, see Section 3.3.6.
26 • OMP_PLACES environment variable, see Section 6.5.
5 Format
C / C++
6 void omp_set_affinity_format(const char *format);
C / C++
Fortran
7 subroutine omp_set_affinity_format(format)
8 character(len=*),intent(in) :: format
Fortran
9 Binding
10 When called from a sequential part of the program, the binding thread set for an
11 omp_set_affinity_format region is the encountering thread. When called from within any
12 parallel or teams region, the binding thread set (and binding region, if required) for the
13 omp_set_affinity_format region is implementation defined.
14 Effect
15 The effect of omp_set_affinity_format routine is to copy the character string specified by
16 the format argument into the affinity-format-var ICV on the current device.
17 This routine has the described effect only when called from a sequential part of the program. When
18 called from within a parallel or teams region, the effect of this routine is implementation
19 defined.
20 Cross References
21 • Controlling OpenMP thread affinity, see Section 2.6.2.
22 • omp_get_affinity_format routine, see Section 3.3.9.
23 • omp_display_affinity routine, see Section 3.3.10.
24 • omp_capture_affinity routine, see Section 3.3.11.
25 • OMP_DISPLAY_AFFINITY environment variable, see Section 6.13.
26 • OMP_AFFINITY_FORMAT environment variable, see Section 6.14.
5 Format
C / C++
6 size_t omp_get_affinity_format(char *buffer, size_t size);
C / C++
Fortran
7 integer function omp_get_affinity_format(buffer)
8 character(len=*),intent(out) :: buffer
Fortran
9 Binding
10 When called from a sequential part of the program, the binding thread set for an
11 omp_get_affinity_format region is the encountering thread. When called from within any
12 parallel or teams region, the binding thread set (and binding region, if required) for the
13 omp_get_affinity_format region is implementation defined.
14 Effect
C / C++
15 The omp_get_affinity_format routine returns the number of characters in the
16 affinity-format-var ICV on the current device, excluding the terminating null byte ('\0') and, if
17 size is non-zero, writes the value of the affinity-format-var ICV on the current device to buffer
18 followed by a null byte. If the return value is larger than or equal to size, the affinity format specification
19 is truncated, with the terminating null byte stored to buffer[size-1]. If size is zero, nothing is
20 stored and buffer may be NULL.
C / C++
Fortran
21 The omp_get_affinity_format routine returns the number of characters that are required to
22 hold the affinity-format-var ICV on the current device and writes the value of the
23 affinity-format-var ICV on the current device to buffer. If the return value is larger than
24 len(buffer), the affinity format specification is truncated.
Fortran
25 If the buffer argument does not conform to the specified format then the result is implementation
26 defined.
8 3.3.10 omp_display_affinity
9 Summary
10 The omp_display_affinity routine prints the OpenMP thread affinity information using the
11 format specification provided.
12 Format
C / C++
13 void omp_display_affinity(const char *format);
C / C++
Fortran
14 subroutine omp_display_affinity(format)
15 character(len=*),intent(in) :: format
Fortran
16 Binding
17 The binding thread set for an omp_display_affinity region is the encountering thread.
18 Effect
19 The omp_display_affinity routine prints the thread affinity information of the current
20 thread in the format specified by the format argument, followed by a new-line. If the format is
21 NULL (for C/C++) or a zero-length string (for Fortran and C/C++), the value of the
22 affinity-format-var ICV is used. If the format argument does not conform to the specified format
23 then the result is implementation defined.
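A non-normative sketch combining omp_set_affinity_format and omp_display_affinity; the format fields used here (%0.3n for the thread number, %A for the thread affinity, %H for the host name) are illustrative choices from the affinity format specification.
C / C++
#include <omp.h>
#include <stddef.h>

int main(void)
{
    omp_set_affinity_format("tid=%0.3n affinity=%A host=%H");

    #pragma omp parallel
    omp_display_affinity(NULL);   /* NULL: use the affinity-format-var ICV */

    return 0;
}
C / C++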
8 3.3.11 omp_capture_affinity
9 Summary
10 The omp_capture_affinity routine prints the OpenMP thread affinity information into a
11 buffer using the format specification provided.
12 Format
C / C++
13 size_t omp_capture_affinity(
14 char *buffer,
15 size_t size,
16 const char *format
17 );
C / C++
Fortran
18 integer function omp_capture_affinity(buffer,format)
19 character(len=*),intent(out) :: buffer
20 character(len=*),intent(in) :: format
Fortran
21 Binding
22 The binding thread set for an omp_capture_affinity region is the encountering thread.
23 Effect
C / C++
24 The omp_capture_affinity routine returns the number of characters in the entire thread
25 affinity information string excluding the terminating null byte ('\0') and, if size is non-zero, writes
26 the thread affinity information of the current thread in the format specified by the format argument
27 into the character string buffer followed by a null byte. If the return value is larger than or equal to
28 size, the thread affinity information string is truncated, with the terminating null byte stored to
29 buffer[size-1]. If size is zero, nothing is stored and buffer may be NULL. If the format is NULL or
30 a zero-length string, the value of the affinity-format-var ICV is used.
C / C++
9 Cross References
10 • Controlling OpenMP thread affinity, see Section 2.6.2.
11 • omp_set_affinity_format routine, see Section 3.3.8.
12 • omp_get_affinity_format routine, see Section 3.3.9.
13 • omp_display_affinity routine, see Section 3.3.10.
14 • OMP_DISPLAY_AFFINITY environment variable, see Section 6.13.
15 • OMP_AFFINITY_FORMAT environment variable, see Section 6.14.
19 3.4.1 omp_get_num_teams
20 Summary
21 The omp_get_num_teams routine returns the number of initial teams in the current teams
22 region.
23 Format
C / C++
24 int omp_get_num_teams(void);
C / C++
Fortran
25 integer function omp_get_num_teams()
Fortran
3 Effect
4 The effect of this routine is to return the number of initial teams in the current teams region. The
5 routine returns 1 if it is called from outside of a teams region.
6 Cross References
7 • teams construct, see Section 2.7.
8 • omp_get_team_num routine, see Section 3.4.2.
9 3.4.2 omp_get_team_num
10 Summary
11 The omp_get_team_num routine returns the initial team number of the calling thread.
12 Format
C / C++
13 int omp_get_team_num(void);
C / C++
Fortran
14 integer function omp_get_team_num()
Fortran
15 Binding
16 The binding task set for an omp_get_team_num region is the generating task.
17 Effect
18 The omp_get_team_num routine returns the initial team number of the calling thread. The
19 initial team number is an integer between 0 and one less than the value returned by
20 omp_get_num_teams(), inclusive. The routine returns 0 if it is called outside of a teams
21 region.
22 Cross References
23 • teams construct, see Section 2.7.
24 • omp_get_num_teams routine, see Section 3.4.1.
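A non-normative sketch in which each team's initial thread reports its team number; it assumes the OpenMP 5.1 rule that a teams construct may appear outside a target region when strictly nested in the implicit parallel region of the program.
C / C++
#include <omp.h>
#include <stdio.h>

int main(void)
{
    #pragma omp teams num_teams(4)
    printf("initial thread of team %d of %d\n",
           omp_get_team_num(), omp_get_num_teams());
    return 0;
}
C / C++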
6 Format
C / C++
7 void omp_set_num_teams(int num_teams);
C / C++
Fortran
8 subroutine omp_set_num_teams(num_teams)
9 integer num_teams
Fortran
10 Constraints on Arguments
11 The value of the argument passed to this routine must evaluate to a positive integer, or else the
12 behavior of this routine is implementation defined.
13 Binding
14 The binding task set for an omp_set_num_teams region is the generating task.
15 Effect
16 The effect of this routine is to set the value of the nteams-var ICV of the current task to the value
17 specified in the argument.
18 Restrictions
19 Restrictions to the omp_set_num_teams routine are as follows:
20 • The routine may not be called from within a parallel region that is not the implicit parallel region
21 that surrounds the whole OpenMP program.
22 Cross References
23 • nteams-var ICV, see Section 2.4.
24 • teams construct and num_teams clause, see Section 2.7.
25 • omp_get_num_teams routine, see Section 3.4.1.
26 • omp_get_max_teams routine, see Section 3.4.4.
27 • OMP_NUM_TEAMS environment variable, see Section 6.23.
6 Format
C / C++
7 int omp_get_max_teams(void);
C / C++
Fortran
8 integer function omp_get_max_teams()
Fortran
9 Binding
10 The binding task set for an omp_get_max_teams region is the generating task.
11 Effect
12 The value returned by omp_get_max_teams is the value of the nteams-var ICV of the current
13 task. This value is also an upper bound on the number of teams that can be created by a teams
14 construct without a num_teams clause that is encountered after execution returns from this
15 routine.
16 Cross References
17 • nteams-var ICV, see Section 2.4.
18 • teams construct and num_teams clause, see Section 2.7.
19 • omp_get_num_teams routine, see Section 3.4.1.
20 • omp_set_num_teams routine, see Section 3.4.3.
21 3.4.5 omp_set_teams_thread_limit
22 Summary
23 The omp_set_teams_thread_limit routine defines the maximum number of OpenMP
24 threads that can participate in each contention group created by a teams construct.
25 Format
C / C++
26 void omp_set_teams_thread_limit(int thread_limit);
C / C++
6 Binding
7 The binding task set for an omp_set_teams_thread_limit region is the generating task.
8 Effect
9 The omp_set_teams_thread_limit routine sets the value of the teams-thread-limit-var
10 ICV to the value of the thread_limit argument.
11 If the value of thread_limit exceeds the number of OpenMP threads that an implementation
12 supports for each contention group created by a teams construct, the value of the
13 teams-thread-limit-var ICV will be set to the number that is supported by the implementation.
14 Restrictions
15 Restrictions to the omp_set_teams_thread_limit routine are as follows:
16 • The routine may not be called from within a parallel region other than the implicit parallel region
17 that surrounds the whole OpenMP program.
18 Cross References
19 • teams-thread-limit-var ICV, see Section 2.4.
20 • teams construct and thread_limit clause, see Section 2.7.
21 • omp_get_teams_thread_limit routine, see Section 3.4.6.
22 • OMP_TEAMS_THREAD_LIMIT environment variable, see Section 6.24.
23 3.4.6 omp_get_teams_thread_limit
24 Summary
25 The omp_get_teams_thread_limit routine returns the maximum number of OpenMP
26 threads available to participate in each contention group created by a teams construct.
27 Format
C / C++
28 int omp_get_teams_thread_limit(void);
C / C++
4 Effect
5 The omp_get_teams_thread_limit routine returns the value of the teams-thread-limit-var
6 ICV.
7 Cross References
8 • teams-thread-limit-var ICV, see Section 2.4.
9 • teams construct and thread_limit clause, see Section 2.7.
10 • omp_set_teams_thread_limit routine, see Section 3.4.5.
11 • OMP_TEAMS_THREAD_LIMIT environment variable, see Section 6.24.
14 3.5.1 omp_get_max_task_priority
15 Summary
16 The omp_get_max_task_priority routine returns the maximum value that can be specified
17 in the priority clause.
18 Format
C / C++
19 int omp_get_max_task_priority(void);
C / C++
Fortran
20 integer function omp_get_max_task_priority()
Fortran
21 Binding
22 The binding thread set for an omp_get_max_task_priority region is all threads on the
23 device. The effect of executing this routine is not related to any specific region that corresponds to
24 any construct or API routine.
4 Cross References
5 • max-task-priority-var, see Section 2.4.
6 • task construct, see Section 2.12.1.
7 3.5.2 omp_in_final
8 Summary
9 The omp_in_final routine returns true if the routine is executed in a final task region;
10 otherwise, it returns false.
11 Format
C / C++
12 int omp_in_final(void);
C / C++
Fortran
13 logical function omp_in_final()
Fortran
14 Binding
15 The binding task set for an omp_in_final region is the generating task.
16 Effect
17 omp_in_final returns true if the enclosing task region is final. Otherwise, it returns false.
18 Cross References
19 • task construct, see Section 2.12.1.
3 3.6.1 omp_pause_resource
4 Summary
5 The omp_pause_resource routine allows the runtime to relinquish resources used by OpenMP
6 on the specified device.
7 Format
C / C++
8 int omp_pause_resource(
9 omp_pause_resource_t kind,
10 int device_num
11 );
C / C++
Fortran
12 integer function omp_pause_resource(kind, device_num)
13 integer (kind=omp_pause_resource_kind) kind
14 integer device_num
Fortran
15 Constraints on Arguments
16 The first argument passed to this routine can be one of the valid OpenMP pause kinds, or any
17 implementation-specific pause kind. The C/C++ header file (omp.h) and the Fortran include file
18 (omp_lib.h) and/or Fortran module file (omp_lib) define the valid constants. The valid
19 constants must include the following, which can be extended with implementation-specific values:
20 Format
C / C++
21 typedef enum omp_pause_resource_t {
22 omp_pause_soft = 1,
23 omp_pause_hard = 2
24 } omp_pause_resource_t;
C / C++
Fortran
25 integer (kind=omp_pause_resource_kind), parameter :: &
26 omp_pause_soft = 1
27 integer (kind=omp_pause_resource_kind), parameter :: &
28 omp_pause_hard = 2
Fortran
4 Binding
5 The binding task set for an omp_pause_resource region is the whole program.
6 Effect
7 The omp_pause_resource routine allows the runtime to relinquish resources used by OpenMP
8 on the specified device.
9 If successful, the omp_pause_hard value results in a hard pause for which the OpenMP state is
10 not guaranteed to persist across the omp_pause_resource call. A hard pause may relinquish
11 any data allocated by OpenMP on a given device, including data allocated by memory routines for
12 that device as well as data present on the device as a result of a declare target directive or
13 target data construct. A hard pause may also relinquish any data associated with a
14 threadprivate directive. When relinquished and when applicable, base language appropriate
15 deallocation/finalization is performed. When relinquished and when applicable, mapped data on a
16 device will not be copied back from the device to the host.
17 If successful, the omp_pause_soft value results in a soft pause for which the OpenMP state is
18 guaranteed to persist across the call, with the exception of any data associated with a
19 threadprivate directive, which may be relinquished across the call. When relinquished and
20 when applicable, base language appropriate deallocation/finalization is performed.
21
22 Note – A hard pause may relinquish more resources, but may resume processing OpenMP regions
23 more slowly. A soft pause allows OpenMP regions to restart more quickly, but may relinquish fewer
24 resources. An OpenMP implementation will reclaim resources as needed for OpenMP regions
25 encountered after the omp_pause_resource region. Since a hard pause may unmap data on the
26 specified device, appropriate data mapping is required before using data on the specified device
27 after the omp_pause_resource region.
28
30 Tool Callbacks
31 If the tool is not allowed to interact with the specified device after encountering this call, then the
32 runtime must call the tool finalizer for that device.
33 Restrictions
34 Restrictions to the omp_pause_resource routine are as follows:
35 • The omp_pause_resource region may not be nested in any explicit OpenMP region.
36 • The routine may only be called when all explicit tasks have finalized execution.
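A minimal, non-normative sketch of a soft pause on the default device follows; whether any resources are actually relinquished is implementation defined, and the call is made outside of any explicit OpenMP region as the restrictions above require.

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        int dev = omp_get_default_device();

        /* Request that reclaimable resources on dev be released; OpenMP state
           other than threadprivate data must persist across a soft pause. */
        if (omp_pause_resource(omp_pause_soft, dev) != 0)
            printf("omp_pause_resource did not succeed\n");

        /* Regions encountered after this point reclaim resources as needed. */
        return 0;
    }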
7 3.6.2 omp_pause_resource_all
8 Summary
9 The omp_pause_resource_all routine allows the runtime to relinquish resources used by
10 OpenMP on all devices.
11 Format
C / C++
12 int omp_pause_resource_all(omp_pause_resource_t kind);
C / C++
Fortran
13 integer function omp_pause_resource_all(kind)
14 integer (kind=omp_pause_resource_kind) kind
Fortran
15 Binding
16 The binding task set for an omp_pause_resource_all region is the whole program.
17 Effect
18 The omp_pause_resource_all routine allows the runtime to relinquish resources used by
19 OpenMP on all devices. It is equivalent to calling the omp_pause_resource routine once for
20 each available device, including the host device.
21 The argument kind passed to this routine can be one of the valid OpenMP pause kinds as defined in
22 Section 3.6.1, or any implementation-specific pause kind.
23 Tool Callbacks
24 If the tool is not allowed to interact with a given device after encountering this call, then the
25 runtime must call the tool finalizer for that device.
5 Cross References
6 • target construct, see Section 2.14.5.
7 • Declare target directive, see Section 2.14.7.
8 • To pause resources on a specific device only, see Section 3.6.1.
12 3.7.1 omp_get_num_procs
13 Summary
14 The omp_get_num_procs routine returns the number of processors available to the device.
15 Format
C / C++
16 int omp_get_num_procs(void);
C / C++
Fortran
17 integer function omp_get_num_procs()
Fortran
18 Binding
19 The binding thread set for an omp_get_num_procs region is all threads on a device. The effect
20 of executing this routine is not related to any specific region corresponding to any construct or API
21 routine.
22 Effect
23 The omp_get_num_procs routine returns the number of processors that are available to the
24 device at the time the routine is called. This value may change between the time that it is
25 determined by the omp_get_num_procs routine and the time that it is read in the calling
26 context due to system actions outside the control of the OpenMP implementation.
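Because the returned value is only a snapshot, a program should treat it as a hint, as in the following non-normative sketch.

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        /* Processors currently available to this device; the value may change
           due to system actions outside the control of OpenMP. */
        int nprocs = omp_get_num_procs();

        #pragma omp parallel num_threads(nprocs)
        #pragma omp single
        printf("running %d threads on %d available processors\n",
               omp_get_num_threads(), nprocs);
        return 0;
    }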
6 3.7.2 omp_set_default_device
7 Summary
8 The omp_set_default_device routine controls the default target device by assigning the
9 value of the default-device-var ICV.
10 Format
C / C++
11 void omp_set_default_device(int device_num);
C / C++
Fortran
12 subroutine omp_set_default_device(device_num)
13 integer device_num
Fortran
14 Binding
15 The binding task set for an omp_set_default_device region is the generating task.
16 Effect
17 The effect of this routine is to set the value of the default-device-var ICV of the current task to the
18 value specified in the argument. When called from within a target region the effect of this
19 routine is unspecified.
20 Cross References
21 • default-device-var, see Section 2.4.
22 • target construct, see Section 2.14.5.
23 • omp_get_default_device, see Section 3.7.3.
24 • OMP_DEFAULT_DEVICE environment variable, see Section 6.15.
25 3.7.3 omp_get_default_device
26 Summary
27 The omp_get_default_device routine returns the default target device.
4 Binding
5 The binding task set for an omp_get_default_device region is the generating task.
6 Effect
7 The omp_get_default_device routine returns the value of the default-device-var ICV of the
8 current task. When called from within a target region the effect of this routine is unspecified.
9 Cross References
10 • default-device-var, see Section 2.4.
11 • target construct, see Section 2.14.5.
12 • omp_set_default_device, see Section 3.7.2.
13 • OMP_DEFAULT_DEVICE environment variable, see Section 6.15.
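A non-normative sketch that saves, changes, and restores the default-device-var ICV around a target region; the function run_on_device is assumed for illustration only.

    #include <omp.h>

    void run_on_device(int dev)
    {
        int saved = omp_get_default_device();  /* current default-device-var */
        omp_set_default_device(dev);           /* later target regions use dev */

        #pragma omp target    /* no device clause: the default device is used */
        {
            /* ... offloaded work ... */
        }

        omp_set_default_device(saved);         /* restore the previous value */
    }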
14 3.7.4 omp_get_num_devices
15 Summary
16 The omp_get_num_devices routine returns the number of non-host devices available for
17 offloading code or data.
18 Format
C / C++
19 int omp_get_num_devices(void);
C / C++
Fortran
20 integer function omp_get_num_devices()
Fortran
21 Binding
22 The binding task set for an omp_get_num_devices region is the generating task.
5 Cross References
6 • target construct, see Section 2.14.5.
7 • omp_get_default_device, see Section 3.7.3.
8 • omp_get_device_num, see Section 3.7.5.
9 3.7.5 omp_get_device_num
10 Summary
11 The omp_get_device_num routine returns the device number of the device on which the
12 calling thread is executing.
13 Format
C / C++
14 int omp_get_device_num(void);
C / C++
Fortran
15 integer function omp_get_device_num()
Fortran
16 Binding
17 The binding task set for an omp_get_device_num region is the generating task.
18 Effect
19 The omp_get_device_num routine returns the device number of the device on which the
20 calling thread is executing. When called on the host device, it will return the same value as the
21 omp_get_initial_device routine.
22 Cross References
23 • target construct, see Section 2.14.5.
24 • omp_get_default_device, see Section 3.7.3.
25 • omp_get_num_devices, see Section 3.7.4.
26 • omp_get_initial_device routine, see Section 3.7.7.
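The following non-normative sketch launches one target region per available device and reports, from inside each region, the device number on which it executed.

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        int num_dev = omp_get_num_devices();

        for (int dev = 0; dev < num_dev; dev++) {
            int where = -1;
            #pragma omp target device(dev) map(from: where)
            where = omp_get_device_num();   /* device executing this region */
            printf("region for device %d ran on device %d\n", dev, where);
        }
        return 0;
    }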
5 Format
C / C++
6 int omp_is_initial_device(void);
C / C++
Fortran
7 logical function omp_is_initial_device()
Fortran
8 Binding
9 The binding task set for an omp_is_initial_device region is the generating task.
10 Effect
11 The effect of this routine is to return true if the current task is executing on the host device;
12 otherwise, it returns false.
13 Cross References
14 • omp_get_initial_device routine, see Section 3.7.7.
15 • Device memory routines, see Section 3.8.
16 3.7.7 omp_get_initial_device
17 Summary
18 The omp_get_initial_device routine returns a device number that represents the host
19 device.
20 Format
C / C++
21 int omp_get_initial_device(void);
C / C++
Fortran
22 integer function omp_get_initial_device()
Fortran
23 Binding
24 The binding task set for an omp_get_initial_device region is the generating task.
5 Cross References
6 • target construct, see Section 2.14.5.
7 • omp_is_initial_device routine, see Section 3.7.6.
8 • Device memory routines, see Section 3.8.
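A non-normative sketch that detects whether a target region was offloaded or executed on the host device:

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        int on_host = 1;

        #pragma omp target map(from: on_host)
        on_host = omp_is_initial_device();  /* true only on the host device */

        if (on_host)
            printf("target region ran on the host (device %d)\n",
                   omp_get_initial_device());
        else
            printf("target region was offloaded\n");
        return 0;
    }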
12 3.8.1 omp_target_alloc
13 Summary
14 The omp_target_alloc routine allocates memory in a device data environment and returns a
15 device pointer to that memory.
16 Format
C / C++
17 void* omp_target_alloc(size_t size, int device_num);
C / C++
Fortran
18 type(c_ptr) function omp_target_alloc(size, device_num) bind(c)
19 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
20 integer(c_size_t), value :: size
21 integer(c_int), value :: device_num
Fortran
22 Constraints on Arguments
23 The device_num argument must be greater than or equal to zero and less than or equal to the result
24 of omp_get_num_devices().
25 Binding
26 The binding task set for an omp_target_alloc region is the generating task, which is the target
27 task generated by the call to the omp_target_alloc routine.
18 Tool Callbacks
19 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
20 ompt_scope_begin as its endpoint argument for each occurrence of a
21 target-data-allocation-begin event in that thread. Similarly, a thread dispatches a registered
22 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
23 argument for each occurrence of a target-data-allocation-end event in that thread. These callbacks
24 have type signature ompt_callback_target_data_op_emi_t.
25 A thread dispatches a registered ompt_callback_target_data_op callback for each
26 occurrence of a target-data-allocation-begin event in that thread. The callback occurs in the context
27 of the target task and has type signature ompt_callback_target_data_op_t.
28 Restrictions
29 Restrictions to the omp_target_alloc routine are as follows.
30 • Freeing the storage returned by omp_target_alloc with any routine other than
31 omp_target_free results in unspecified behavior.
32 • When called from within a target region the effect is unspecified.
10 3.8.2 omp_target_free
11 Summary
12 The omp_target_free routine frees the device memory allocated by the
13 omp_target_alloc routine.
14 Format
C / C++
15 void omp_target_free(void *device_ptr, int device_num);
C / C++
Fortran
16 subroutine omp_target_free(device_ptr, device_num) bind(c)
17 use, intrinsic :: iso_c_binding, only : c_ptr, c_int
18 type(c_ptr), value :: device_ptr
19 integer(c_int), value :: device_num
Fortran
20 Constraints on Arguments
21 A program that calls omp_target_free with a non-null pointer that does not have a value
22 returned from omp_target_alloc is non-conforming. The device_num argument must be
23 greater than or equal to zero and less than or equal to the result of omp_get_num_devices().
24 Binding
25 The binding task set for an omp_target_free region is the generating task, which is the target
26 task generated by the call to the omp_target_free routine.
13 Tool Callbacks
14 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
15 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-free-begin
16 event in that thread. Similarly, a thread dispatches a registered
17 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
18 argument for each occurrence of a target-data-free-end event in that thread. These callbacks have
19 type signature ompt_callback_target_data_op_emi_t.
20 A thread dispatches a registered ompt_callback_target_data_op callback for each
21 occurrence of a target-data-free-begin event in that thread. The callback occurs in the context of the
22 target task and has type signature ompt_callback_target_data_op_t.
23 Restrictions
24 Restrictions to the omp_target_free routine are as follows.
25 • When called from within a target region the effect is unspecified.
26 Cross References
27 • target construct, see Section 2.14.5.
28 • omp_get_num_devices routine, see Section 3.7.4.
29 • omp_target_alloc routine, see Section 3.8.1.
30 • ompt_callback_target_data_op_t or
31 ompt_callback_target_data_op_emi_t callback type, see Section 4.5.2.25.
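A non-normative sketch that allocates device memory, uses it inside a target region through an is_device_ptr clause, and releases it with omp_target_free; the size N is assumed for illustration only.

    #include <omp.h>
    #include <stdio.h>
    #define N 1024

    int main(void)
    {
        int dev = omp_get_default_device();
        double *d_buf = (double *)omp_target_alloc(N * sizeof(double), dev);
        if (d_buf == NULL) {
            printf("device allocation failed\n");
            return 1;
        }

        #pragma omp target device(dev) is_device_ptr(d_buf)
        for (int i = 0; i < N; i++)
            d_buf[i] = 2.0 * i;

        /* Storage from omp_target_alloc must be freed with omp_target_free. */
        omp_target_free(d_buf, dev);
        return 0;
    }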
5 Format
C / C++
6 int omp_target_is_present(const void *ptr, int device_num);
C / C++
Fortran
7 integer(c_int) function omp_target_is_present(ptr, device_num) &
8 bind(c)
9 use, intrinsic :: iso_c_binding, only : c_ptr, c_int
10 type(c_ptr), value :: ptr
11 integer(c_int), value :: device_num
Fortran
12 Constraints on Arguments
13 The value of ptr must be a valid host pointer or NULL (or C_NULL_PTR, for Fortran). The
14 device_num argument must be greater than or equal to zero and less than or equal to the result of
15 omp_get_num_devices().
16 Binding
17 The binding task set for an omp_target_is_present region is the encountering task.
18 Effect
19 The omp_target_is_present routine returns true if device_num refers to the host device or
20 if ptr refers to storage that has corresponding storage in the device data environment of device
21 device_num. Otherwise, the routine returns false.
22 Restrictions
23 Restrictions to the omp_target_is_present routine are as follows.
24 • When called from within a target region the effect is unspecified.
25 Cross References
26 • target construct, see Section 2.14.5.
27 • map clause, see Section 2.21.7.1.
28 • omp_get_num_devices routine, see Section 3.7.4.
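A non-normative sketch that queries, before and inside a target data region, whether a host array has corresponding storage on the default device; x and N are assumed for illustration only.

    #include <omp.h>
    #include <stdio.h>
    #define N 256

    int main(void)
    {
        double x[N];
        int dev = omp_get_default_device();

        printf("before region: present = %d\n", omp_target_is_present(x, dev));

        #pragma omp target data map(tofrom: x)
        {
            /* x now has corresponding storage in the device data environment. */
            printf("inside region: present = %d\n",
                   omp_target_is_present(x, dev));
        }
        return 0;
    }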
5 Format
C / C++
6 int omp_target_is_accessible( const void *ptr, size_t size,
7 int device_num);
C / C++
Fortran
8 integer(c_int) function omp_target_is_accessible( &
9 ptr, size, device_num) bind(c)
10 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
11 type(c_ptr), value :: ptr
12 integer(c_size_t), value :: size
13 integer(c_int), value :: device_num
Fortran
14 Constraints on Arguments
15 The value of ptr must be a valid host pointer or NULL (or C_NULL_PTR, for Fortran). The
16 device_num argument must be greater than or equal to zero and less than or equal to the result of
17 omp_get_num_devices().
18 Binding
19 The binding task set for an omp_target_is_accessible region is the encountering task.
20 Effect
21 This routine returns true if the storage of size bytes starting at the address given by ptr is accessible
22 from device device_num. Otherwise, it returns false.
23 Restrictions
24 Restrictions to the omp_target_is_accessible routine are as follows.
25 • When called from within a target region the effect is unspecified.
26 Cross References
27 • target construct, see Section 2.14.5.
28 • omp_get_num_devices routine, see Section 3.7.4.
5 Format
C / C++
6 int omp_target_memcpy(
7 void *dst,
8 const void *src,
9 size_t length,
10 size_t dst_offset,
11 size_t src_offset,
12 int dst_device_num,
13 int src_device_num
14 );
C / C++
Fortran
15 integer(c_int) function omp_target_memcpy(dst, src, length, &
16 dst_offset, src_offset, dst_device_num, src_device_num) bind(c)
17 use, intrinsic :: iso_c_binding, only : c_ptr, c_int, c_size_t
18 type(c_ptr), value :: dst, src
19 integer(c_size_t), value :: length, dst_offset, src_offset
20 integer(c_int), value :: dst_device_num, src_device_num
Fortran
21 Constraints on Arguments
22 Each device pointer specified must be valid for the device on the same side of the copy. The
23 dst_device_num and src_device_num arguments must be greater than or equal to zero and less than
24 or equal to the result of omp_get_num_devices().
25 Binding
26 The binding task set for an omp_target_memcpy region is the generating task, which is the
27 target task generated by the call to the omp_target_memcpy routine.
28 Effect
29 This routine copies length bytes of memory at offset src_offset from src in the device data
30 environment of device src_device_num to dst starting at offset dst_offset in the device data
31 environment of device dst_device_num.
32 The omp_target_memcpy routine executes as if part of a target task that is generated by the call
33 to the routine and that is an included task.
34 The return value is zero on success and non-zero on failure. This routine contains a task scheduling
35 point.
6 Tool Callbacks
7 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
8 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
9 event in that thread. Similarly, a thread dispatches a registered
10 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
11 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
12 type signature ompt_callback_target_data_op_emi_t.
13 A thread dispatches a registered ompt_callback_target_data_op callback for each
14 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
15 target task and has type signature ompt_callback_target_data_op_t.
16 Restrictions
17 Restrictions to the omp_target_memcpy routine are as follows.
18 • When called from within a target region the effect is unspecified.
19 Cross References
20 • target construct, see Section 2.14.5.
21 • omp_get_num_devices routine, see Section 3.7.4.
22 • ompt_callback_target_data_op_t or
23 ompt_callback_target_data_op_emi_t callback type, see Section 4.5.2.25.
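A non-normative sketch that copies a host buffer into storage obtained from omp_target_alloc and back again, using omp_get_initial_device for the host-side device number; N is assumed for illustration only.

    #include <omp.h>
    #include <string.h>
    #define N 128

    int main(void)
    {
        double h_src[N], h_dst[N];
        for (int i = 0; i < N; i++) h_src[i] = i;

        int host = omp_get_initial_device();
        int dev  = omp_get_default_device();
        double *d_buf = (double *)omp_target_alloc(N * sizeof(double), dev);
        if (d_buf == NULL) return 1;

        /* host -> device and device -> host; all offsets are zero. */
        omp_target_memcpy(d_buf, h_src, N * sizeof(double), 0, 0, dev, host);
        omp_target_memcpy(h_dst, d_buf, N * sizeof(double), 0, 0, host, dev);

        omp_target_free(d_buf, dev);
        return memcmp(h_src, h_dst, sizeof(h_src)) != 0;
    }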
24 3.8.6 omp_target_memcpy_rect
25 Summary
26 The omp_target_memcpy_rect routine copies a rectangular subvolume from a
27 multi-dimensional array to another multi-dimensional array. The omp_target_memcpy_rect
28 routine performs a copy between any combination of host and device pointers.
22 Tool Callbacks
23 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
24 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
25 event in that thread. Similarly, a thread dispatches a registered
26 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
27 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
28 type signature ompt_callback_target_data_op_emi_t.
29 A thread dispatches a registered ompt_callback_target_data_op callback for each
30 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
31 target task and has type signature ompt_callback_target_data_op_t.
32 Restrictions
33 Restrictions to the omp_target_memcpy_rect routine are as follows.
34 • When called from within a target region the effect is unspecified.
6 3.8.7 omp_target_memcpy_async
7 Summary
8 The omp_target_memcpy_async routine asynchronously performs a copy between any
9 combination of host and device pointers.
10 Format
C / C++
11 int omp_target_memcpy_async(
12 void *dst,
13 const void *src,
14 size_t length,
15 size_t dst_offset,
16 size_t src_offset,
17 int dst_device_num,
18 int src_device_num,
19 int depobj_count,
20 omp_depend_t *depobj_list
21 );
C / C++
Fortran
22 integer(c_int) function omp_target_memcpy_async(dst, src, length, &
23 dst_offset, src_offset, dst_device_num, src_device_num, &
24 depobj_count, depobj_list) bind(c)
25 use, intrinsic :: iso_c_binding, only : c_ptr, c_int, c_size_t
26 type(c_ptr), value :: dst, src
27 integer(c_size_t), value :: length, dst_offset, src_offset
28 integer(c_int), value :: dst_device_num, src_device_num, depobj_count
29 integer(omp_depend_kind), optional :: depobj_list(*)
Fortran
30 Constraints on Arguments
31 Each device pointer specified must be valid for the device on the same side of the copy. The
32 dst_device_num and src_device_num arguments must be greater than or equal to zero and less than
33 or equal to the result of omp_get_num_devices().
4 Effect
5 This routine performs an asynchronous memory copy where length bytes of memory at offset
6 src_offset from src in the device data environment of device src_device_num are copied to dst
7 starting at offset dst_offset in the device data environment of device dst_device_num.
8 The omp_target_memcpy_async routine executes as if part of a target task that is generated
9 by the call to the routine and for which execution may be deferred.
10 Task dependences are expressed with zero or more omp_depend_t objects. The dependences are
11 specified by passing the number of omp_depend_t objects followed by an array of
12 omp_depend_t objects. The generated target task is not a dependent task if the program passes
13 in a count of zero for depobj_count. depobj_list is ignored if the value of depobj_count is zero.
14 The routine returns zero if successful. Otherwise, it returns a non-zero value. The routine contains
15 a task scheduling point.
Fortran
16 The omp_target_memcpy_async routine requires an explicit interface and so might not be
17 provided in omp_lib.h.
Fortran
21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
23 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
24 event in that thread. Similarly, a thread dispatches a registered
25 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
26 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
27 type signature ompt_callback_target_data_op_emi_t.
28 A thread dispatches a registered ompt_callback_target_data_op callback for each
29 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
30 target task and has type signature ompt_callback_target_data_op_t.
31 Restrictions
32 Restrictions to the omp_target_memcpy_async routine are as follows.
33 • When called from within a target region the effect is unspecified.
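A non-normative sketch of an asynchronous copy with no dependences (depobj_count is zero); a taskwait in the encountering task waits for the generated target task before the device data is freed. N is assumed for illustration only.

    #include <omp.h>
    #define N 128

    int main(void)
    {
        double h_src[N];
        for (int i = 0; i < N; i++) h_src[i] = i;

        int host = omp_get_initial_device();
        int dev  = omp_get_default_device();
        double *d_buf = (double *)omp_target_alloc(N * sizeof(double), dev);
        if (d_buf == NULL) return 1;

        /* Deferred host -> device copy; no depend objects are passed. */
        omp_target_memcpy_async(d_buf, h_src, N * sizeof(double), 0, 0,
                                dev, host, 0, NULL);

        /* Wait for the generated target task to complete. */
        #pragma omp taskwait

        omp_target_free(d_buf, dev);
        return 0;
    }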
7 3.8.8 omp_target_memcpy_rect_async
8 Summary
9 The omp_target_memcpy_rect_async routine asynchronously performs a copy between
10 any combination of host and device pointers.
11 Format
C / C++
12 int omp_target_memcpy_rect_async(
13 void *dst,
14 const void *src,
15 size_t element_size,
16 int num_dims,
17 const size_t *volume,
18 const size_t *dst_offsets,
19 const size_t *src_offsets,
20 const size_t *dst_dimensions,
21 const size_t *src_dimensions,
22 int dst_device_num,
23 int src_device_num,
24 int depobj_count,
25 omp_depend_t *depobj_list
26 );
C / C++
Fortran
27 integer(c_int) function omp_target_memcpy_rect_async(dst, src, &
28 element_size, num_dims, volume, dst_offsets, src_offsets, &
29 dst_dimensions, src_dimensions, dst_device_num, src_device_num, &
30 depobj_count, depobj_list) bind(c)
31 use, intrinsic :: iso_c_binding, only : c_ptr, c_int, c_size_t
32 type(c_ptr), value :: dst, src
33 integer(c_size_t), value :: element_size
34 integer(c_int), value :: num_dims, dst_device_num, src_device_num, &
35 depobj_count
16 Effect
17 This routine copies a rectangular subvolume of src, in the device data environment of device
18 src_device_num, to dst, in the device data environment of device dst_device_num. The volume is
19 specified in terms of the size of an element, number of dimensions, and constant arrays of length
20 num_dims. The maximum number of dimensions supported is at least three; support for higher
21 dimensionality is implementation defined. The volume array specifies the length, in number of
22 elements, to copy in each dimension from src to dst. The dst_offsets (src_offsets) parameter
23 specifies the offset from the origin of dst (src), in elements. The dst_dimensions
24 (src_dimensions) parameter specifies the length of each dimension of dst (src).
25 The omp_target_memcpy_rect_async routine executes as if part of a target task that is
26 generated by the call to the routine and for which execution may be deferred.
27 Task dependences are expressed with zero or more omp_depend_t objects. The dependences are
28 specified by passing the number of omp_depend_t objects followed by an array of
29 omp_depend_t objects. The generated target task is not a dependent task if the program passes
30 in a count of zero for depobj_count. depobj_list is ignored if the value of depobj_count is zero.
31 The routine returns zero if successful. Otherwise, it returns a non-zero value. The routine contains
32 a task scheduling point.
33 An application can determine the number of inclusive dimensions supported by an implementation
34 by passing NULL pointers (or C_NULL_PTR, for Fortran) for both dst and src. The routine then
returns the number of dimensions that the implementation supports for the specified device numbers,
and no copy operation is performed.
8 Tool Callbacks
9 A thread dispatches a registered ompt_callback_target_data_op_emi callback with
10 ompt_scope_begin as its endpoint argument for each occurrence of a target-data-op-begin
11 event in that thread. Similarly, a thread dispatches a registered
12 ompt_callback_target_data_op_emi callback with ompt_scope_end as its endpoint
13 argument for each occurrence of a target-data-op-end event in that thread. These callbacks have
14 type signature ompt_callback_target_data_op_emi_t.
15 A thread dispatches a registered ompt_callback_target_data_op callback for each
16 occurrence of a target-data-op-end event in that thread. The callback occurs in the context of the
17 target task and has type signature ompt_callback_target_data_op_t.
18 Restrictions
19 Restrictions to the omp_target_memcpy_rect_async routine are as follows.
20 • When called from within a target region the effect is unspecified.
21 Cross References
22 • target construct, see Section 2.14.5.
23 • Depend objects, see Section 2.19.10.
24 • omp_get_num_devices routine, see Section 3.7.4.
25 • ompt_callback_target_data_op_t or
26 ompt_callback_target_data_op_emi_t callback type, see Section 4.5.2.25.
27 3.8.9 omp_target_associate_ptr
28 Summary
29 The omp_target_associate_ptr routine maps a device pointer, which may be returned
30 from omp_target_alloc or implementation-defined runtime routines, to a host pointer.
15 Constraints on Arguments
16 The value of device_ptr must be a valid pointer to device memory for the device denoted by
17 the value of device_num. The device_num argument must be greater than or equal to zero and less
18 than or equal to the result of omp_get_num_devices().
19 Binding
20 The binding task set for an omp_target_associate_ptr region is the generating task, which
21 is the target task generated by the call to the omp_target_associate_ptr routine.
22 Effect
23 The omp_target_associate_ptr routine associates a device pointer in the device data
24 environment of device device_num with a host pointer such that when the host pointer appears in a
25 subsequent map clause, the associated device pointer is used as the target for data motion
26 associated with that host pointer. The device_offset parameter specifies the offset into device_ptr
27 that is used as the base address for the device side of the mapping. The reference count of the
28 resulting mapping will be infinite. After being successfully associated, the buffer to which the
29 device pointer points is invalidated and accessing data directly through the device pointer results in
30 unspecified behavior. The pointer can be retrieved for other uses by disassociating it with the
31 omp_target_disassociate_ptr routine.
32 The omp_target_associate_ptr routine executes as if part of a target task that is generated
33 by the call to the routine and that is an included task.
34 The routine returns zero if successful. Otherwise it returns a non-zero value.
12 Tool Callbacks
13 A thread dispatches a registered ompt_callback_target_data_op callback, or a registered
14 ompt_callback_target_data_op_emi callback with ompt_scope_beginend as its
15 endpoint argument for each occurrence of a target-data-associate event in that thread. These
16 callbacks have type signature ompt_callback_target_data_op_t or
17 ompt_callback_target_data_op_emi_t, respectively.
18 Restrictions
19 Restrictions to the omp_target_associate_ptr routine are as follows.
20 • When called from within a target region the effect is unspecified.
21 Cross References
22 • target construct, see Section 2.14.5.
23 • map clause, see Section 2.21.7.1.
24 • omp_get_num_devices routine, see Section 3.7.4.
25 • omp_target_alloc routine, see Section 3.8.1.
26 • omp_target_is_present routine, see Section 3.8.3.
27 • omp_target_disassociate_ptr routine, see Section 3.8.10.
28 • omp_get_mapped_ptr routine, see Section 3.8.11.
29 • ompt_callback_target_data_op_t or
30 ompt_callback_target_data_op_emi_t callback type, see Section 4.5.2.25.
5 Format
C / C++
6 int omp_target_disassociate_ptr(const void *ptr, int device_num);
C / C++
Fortran
7 integer(c_int) function omp_target_disassociate_ptr(ptr, &
8 device_num) bind(c)
9 use, intrinsic :: iso_c_binding, only : c_ptr, c_int
10 type(c_ptr), value :: ptr
11 integer(c_int), value :: device_num
Fortran
12 Constraints on Arguments
13 The device_num argument must be greater than or equal to zero and less than or equal to the result
14 of omp_get_num_devices().
15 Binding
16 The binding task set for an omp_target_disassociate_ptr region is the generating task,
17 which is the target task generated by the call to the omp_target_disassociate_ptr routine.
18 Effect
19 The omp_target_disassociate_ptr routine removes the associated device data on device
20 device_num from the presence table for host pointer ptr. A call to this routine on a pointer that is
21 not NULL (or C_NULL_PTR, for Fortran) and does not have associated data on the given device
22 results in unspecified behavior. The reference count of the mapping is reduced to zero, regardless of
23 its current value.
24 The omp_target_disassociate_ptr routine executes as if part of a target task that is
25 generated by the call to the routine and that is an included task.
26 The routine returns zero if successful. Otherwise it returns a non-zero value.
27 After a call to omp_target_disassociate_ptr, the contents of the device buffer are
28 invalidated.
Fortran
29 The omp_target_disassociate_ptr routine requires an explicit interface and so might not
30 be provided in omp_lib.h.
Fortran
4 Tool Callbacks
5 A thread dispatches a registered ompt_callback_target_data_op callback, or a registered
6 ompt_callback_target_data_op_emi callback with ompt_scope_beginend as its
7 endpoint argument for each occurrence of a target-data-disassociate event in that thread. These
8 callbacks have type signature ompt_callback_target_data_op_t or
9 ompt_callback_target_data_op_emi_t, respectively.
10 Restrictions
11 Restrictions to the omp_target_disassociate_ptr routine are as follows.
12 • When called from within a target region the effect is unspecified.
13 Cross References
14 • target construct, see Section 2.14.5.
15 • omp_get_num_devices routine, see Section 3.7.4.
16 • omp_target_associate_ptr routine, see Section 3.8.9.
17 • ompt_callback_target_data_op_t or
18 ompt_callback_target_data_op_emi_t callback type, see Section 4.5.2.25.
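A non-normative sketch that associates storage from omp_target_alloc with a host array, maps the array in a target region, and then disassociates and frees the storage; N and the zero device_offset are assumed for illustration only.

    #include <omp.h>
    #define N 64

    int main(void)
    {
        double x[N];
        int dev = omp_get_default_device();

        void *d_x = omp_target_alloc(N * sizeof(double), dev);
        if (d_x == NULL) return 1;

        /* Subsequent maps of x on dev use d_x as the device-side storage. */
        if (omp_target_associate_ptr(x, d_x, N * sizeof(double), 0, dev) != 0)
            return 1;

        /* The always modifier forces copy-back because the association has an
           infinite reference count. */
        #pragma omp target device(dev) map(always, tofrom: x)
        for (int i = 0; i < N; i++)
            x[i] = i;

        omp_target_disassociate_ptr(x, dev);
        omp_target_free(d_x, dev);
        return 0;
    }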
19 3.8.11 omp_get_mapped_ptr
20 Summary
21 The omp_get_mapped_ptr routine returns the device pointer that is associated with a host
22 pointer for a given device.
23 Format
C / C++
24 void * omp_get_mapped_ptr(const void *ptr, int device_num);
C / C++
Fortran
25 type(c_ptr) function omp_get_mapped_ptr(ptr, &
26 device_num) bind(c)
27 use, intrinsic :: iso_c_binding, only : c_ptr, c_int
28 type(c_ptr), value :: ptr
29 integer(c_int), value :: device_num
Fortran
4 Binding
5 The binding task set for an omp_get_mapped_ptr region is the encountering task.
6 Effect
7 The omp_get_mapped_ptr routine returns the associated device pointer on device device_num.
8 A call to this routine for a pointer that is not NULL (or C_NULL_PTR, for Fortran) and does not
9 have an associated pointer on the given device returns a NULL pointer.
10 The routine returns NULL (or C_NULL_PTR, for Fortran) if unsuccessful. Otherwise it returns the
11 device pointer, which is ptr if device_num is the value returned by
12 omp_get_initial_device().
Fortran
13 The omp_get_mapped_ptr routine requires an explicit interface and so might not be provided
14 in omp_lib.h.
Fortran
17 Restrictions
18 Restrictions to the omp_get_mapped_ptr routine are as follows.
19 • When called from within a target region the effect is unspecified.
20 Cross References
21 • omp_get_num_devices routine, see Section 3.7.4.
22 • omp_get_initial_device routine, see Section 3.7.7.
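A non-normative sketch that retrieves the device pointer that corresponds to a mapped host array, for example to hand it to a native device library; x and N are assumed for illustration only.

    #include <omp.h>
    #include <stdio.h>
    #define N 64

    int main(void)
    {
        double x[N];
        int dev = omp_get_default_device();

        #pragma omp target enter data map(to: x) device(dev)

        void *d_x = omp_get_mapped_ptr(x, dev);
        if (d_x == NULL)
            printf("x has no associated device pointer on device %d\n", dev);
        else
            printf("device copy of x is at %p\n", d_x);

        #pragma omp target exit data map(delete: x) device(dev)
        return 0;
    }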
27 Binding
28 The binding thread set for all lock routine regions is all threads in the contention group. As a
29 consequence, for each OpenMP lock, the lock routine effects relate to all tasks that call the routines,
30 without regard to the teams in the contention group to which the threads that execute those tasks
31 belong.
24 Restrictions
25 Restrictions to OpenMP lock routines are as follows:
26 • The use of the same OpenMP lock in different contention groups results in unspecified behavior.
4 Format
C / C++
5 void omp_init_lock(omp_lock_t *lock);
6 void omp_init_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
7 subroutine omp_init_lock(svar)
8 integer (kind=omp_lock_kind) svar
9
10 subroutine omp_init_nest_lock(nvar)
11 integer (kind=omp_nest_lock_kind) nvar
Fortran
12 Constraints on Arguments
13 A program that accesses a lock that is not in the uninitialized state through either routine is
14 non-conforming.
15 Effect
16 The effect of these routines is to initialize the lock to the unlocked state; that is, no task owns the
17 lock. In addition, the nesting count for a nestable lock is set to zero.
22 Tool Callbacks
23 A thread dispatches a registered ompt_callback_lock_init callback with
24 omp_sync_hint_none as the hint argument and ompt_mutex_lock as the kind argument
25 for each occurrence of a lock-init event in that thread. Similarly, a thread dispatches a registered
26 ompt_callback_lock_init callback with omp_sync_hint_none as the hint argument
27 and ompt_mutex_nest_lock as the kind argument for each occurrence of a nest-lock-init
28 event in that thread. These callbacks have the type signature
29 ompt_callback_mutex_acquire_t and occur in the task that encounters the routine.
30 Cross References
31 • ompt_callback_mutex_acquire_t, see Section 4.5.2.14.
7 Format
C / C++
8 void omp_init_lock_with_hint(
9 omp_lock_t *lock,
10 omp_sync_hint_t hint
11 );
12 void omp_init_nest_lock_with_hint(
13 omp_nest_lock_t *lock,
14 omp_sync_hint_t hint
15 );
C / C++
Fortran
16 subroutine omp_init_lock_with_hint(svar, hint)
17 integer (kind=omp_lock_kind) svar
18 integer (kind=omp_sync_hint_kind) hint
19
20 subroutine omp_init_nest_lock_with_hint(nvar, hint)
21 integer (kind=omp_nest_lock_kind) nvar
22 integer (kind=omp_sync_hint_kind) hint
Fortran
23 Constraints on Arguments
24 A program that accesses a lock that is not in the uninitialized state through either routine is
25 non-conforming.
26 The second argument passed to these routines (hint) is a hint as described in Section 2.19.12.
27 Effect
28 The effect of these routines is to initialize the lock to the unlocked state and, optionally, to choose a
29 specific lock implementation based on the hint. After initialization no task owns the lock. In
30 addition, the nesting count for a nestable lock is set to zero.
6 Tool Callbacks
7 A thread dispatches a registered ompt_callback_lock_init callback with the same value
8 for its hint argument as the hint argument of the call to omp_init_lock_with_hint and
9 ompt_mutex_lock as the kind argument for each occurrence of a lock-init-with-hint event in
10 that thread. Similarly, a thread dispatches a registered ompt_callback_lock_init callback
11 with the same value for its hint argument as the hint argument of the call to
12 omp_init_nest_lock_with_hint and ompt_mutex_nest_lock as the kind argument
13 for each occurrence of a nest-lock-init-with-hint event in that thread. These callbacks have the type
14 signature ompt_callback_mutex_acquire_t and occur in the task that encounters the
15 routine.
16 Cross References
17 • Synchronization Hints, see Section 2.19.12.
18 • ompt_callback_mutex_acquire_t, see Section 4.5.2.14.
23 Format
C / C++
24 void omp_destroy_lock(omp_lock_t *lock);
25 void omp_destroy_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
26 subroutine omp_destroy_lock(svar)
27 integer (kind=omp_lock_kind) svar
28
29 subroutine omp_destroy_nest_lock(nvar)
30 integer (kind=omp_nest_lock_kind) nvar
Fortran
4 Effect
5 The effect of these routines is to change the state of the lock to uninitialized.
10 Tool Callbacks
11 A thread dispatches a registered ompt_callback_lock_destroy callback with
12 ompt_mutex_lock as the kind argument for each occurrence of a lock-destroy event in that
13 thread. Similarly, a thread dispatches a registered ompt_callback_lock_destroy callback
14 with ompt_mutex_nest_lock as the kind argument for each occurrence of a nest-lock-destroy
15 event in that thread. These callbacks have the type signature ompt_callback_mutex_t and
16 occur in the task that encounters the routine.
17 Cross References
18 • ompt_callback_mutex_t, see Section 4.5.2.15.
23 Format
C / C++
24 void omp_set_lock(omp_lock_t *lock);
25 void omp_set_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
26 subroutine omp_set_lock(svar)
27 integer (kind=omp_lock_kind) svar
28
29 subroutine omp_set_nest_lock(nvar)
30 integer (kind=omp_nest_lock_kind) nvar
Fortran
5 Effect
6 Each of these routines has an effect equivalent to suspension of the task that is executing the routine
7 until the specified lock is available.
8
9 Note – The semantics of these routines is specified as if they serialize execution of the region
10 guarded by the lock. However, implementations may implement them in other ways provided that
11 the isolation properties are respected so that the actual execution delivers a result that could arise
12 from some serialization.
13
14 A simple lock is available if it is unlocked. Ownership of the lock is granted to the task that
15 executes the routine.
16 A nestable lock is available if it is unlocked or if it is already owned by the task that executes the
17 routine. The task that executes the routine is granted, or retains, ownership of the lock, and the
18 nesting count for the lock is incremented.
30 Tool Callbacks
31 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
32 occurrence of a lock-acquire or nest-lock-acquire event in that thread. This callback has the type
33 signature ompt_callback_mutex_acquire_t.
34 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
35 occurrence of a lock-acquired or nest-lock-acquired event in that thread. This callback has the type
36 signature ompt_callback_mutex_t.
7 Cross References
8 • ompt_callback_mutex_acquire_t, see Section 4.5.2.14.
9 • ompt_callback_mutex_t, see Section 4.5.2.15.
10 • ompt_callback_nest_lock_t, see Section 4.5.2.16.
14 Format
C / C++
15 void omp_unset_lock(omp_lock_t *lock);
16 void omp_unset_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
17 subroutine omp_unset_lock(svar)
18 integer (kind=omp_lock_kind) svar
19
20 subroutine omp_unset_nest_lock(nvar)
21 integer (kind=omp_nest_lock_kind) nvar
Fortran
22 Constraints on Arguments
23 A program that accesses a lock that is not in the locked state or that is not owned by the task that
24 contains the call through either routine is non-conforming.
25 Effect
26 For a simple lock, the omp_unset_lock routine causes the lock to become unlocked.
27 For a nestable lock, the omp_unset_nest_lock routine decrements the nesting count, and
28 causes the lock to become unlocked if the resulting nesting count is zero.
29 For either routine, if the lock becomes unlocked, and if one or more task regions were effectively
30 suspended because the lock was unavailable, the effect is that one task is chosen and given
31 ownership of the lock.
9 Tool Callbacks
10 A thread dispatches a registered ompt_callback_mutex_released callback with
11 ompt_mutex_lock as the kind argument for each occurrence of a lock-release event in that
12 thread. Similarly, a thread dispatches a registered ompt_callback_mutex_released
13 callback with ompt_mutex_nest_lock as the kind argument for each occurrence of a
14 nest-lock-release event in that thread. These callbacks have the type signature
15 ompt_callback_mutex_t and occur in the task that encounters the routine.
16 A thread dispatches a registered ompt_callback_nest_lock callback with
17 ompt_scope_end as its endpoint argument for each occurrence of a nest-lock-held event in that
18 thread. This callback has the type signature ompt_callback_nest_lock_t.
19 Cross References
20 • ompt_callback_mutex_t, see Section 4.5.2.15.
21 • ompt_callback_nest_lock_t, see Section 4.5.2.16.
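A non-normative sketch of the simple-lock life cycle (initialize, acquire and release around a guarded update, destroy):

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        omp_lock_t lck;
        int counter = 0;

        omp_init_lock(&lck);        /* lock starts in the unlocked state */

        #pragma omp parallel
        {
            omp_set_lock(&lck);     /* suspends until the lock is available */
            counter++;              /* guarded update */
            omp_unset_lock(&lck);   /* a waiting task, if any, acquires it */
        }

        omp_destroy_lock(&lck);     /* lock returns to the uninitialized state */
        printf("counter = %d\n", counter);
        return 0;
    }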
26 Format
C / C++
27 int omp_test_lock(omp_lock_t *lock);
28 int omp_test_nest_lock(omp_nest_lock_t *lock);
C / C++
Fortran
29 logical function omp_test_lock(svar)
30 integer (kind=omp_lock_kind) svar
31
32 integer function omp_test_nest_lock(nvar)
33 integer (kind=omp_nest_lock_kind) nvar
Fortran
5 Effect
6 These routines attempt to set a lock in the same manner as omp_set_lock and
7 omp_set_nest_lock, except that they do not suspend execution of the task that executes the
8 routine.
9 For a simple lock, the omp_test_lock routine returns true if the lock is successfully set;
10 otherwise, it returns false.
11 For a nestable lock, the omp_test_nest_lock routine returns the new nesting count if the lock
12 is successfully set; otherwise, it returns zero.
24 Tool Callbacks
25 A thread dispatches a registered ompt_callback_mutex_acquire callback for each
26 occurrence of a lock-test or nest-lock-test event in that thread. This callback has the type signature
27 ompt_callback_mutex_acquire_t.
28 A thread dispatches a registered ompt_callback_mutex_acquired callback for each
29 occurrence of a lock-test-acquired or nest-lock-test-acquired event in that thread. This callback has
30 the type signature ompt_callback_mutex_t.
31 A thread dispatches a registered ompt_callback_nest_lock callback with
32 ompt_scope_begin as its endpoint argument for each occurrence of a nest-lock-owned event in
33 that thread. This callback has the type signature ompt_callback_nest_lock_t.
34 The above callbacks occur in the task that encounters the lock function. The kind argument of these
35 callbacks is ompt_mutex_test_lock when the events arise from an omp_test_lock
36 region, while it is ompt_mutex_test_nest_lock when the events arise from an
37 omp_test_nest_lock region.
7 3.10.1 omp_get_wtime
8 Summary
9 The omp_get_wtime routine returns elapsed wall clock time in seconds.
10 Format
C / C++
11 double omp_get_wtime(void);
C / C++
Fortran
12 double precision function omp_get_wtime()
Fortran
13 Binding
14 The binding thread set for an omp_get_wtime region is the encountering thread. The routine’s
15 return value is not guaranteed to be consistent across any set of threads.
16 Effect
17 The omp_get_wtime routine returns a value equal to the elapsed wall clock time in seconds
18 since some time-in-the-past. The actual time-in-the-past is arbitrary, but it is guaranteed not to
19 change during the execution of the application program. The time returned is a per-thread time, so
20 it is not required to be globally consistent across all threads that participate in an application.
21 3.10.2 omp_get_wtick
22 Summary
23 The omp_get_wtick routine returns the precision of the timer used by omp_get_wtime.
24 Format
C / C++
25 double omp_get_wtick(void);
C / C++
5 Effect
6 The omp_get_wtick routine returns a value equal to the number of seconds between successive
7 clock ticks of the timer used by omp_get_wtime.
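A non-normative timing sketch; both calls are made by the same thread because the returned time is per-thread.

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        double start = omp_get_wtime();

        #pragma omp parallel
        {
            /* ... work to be timed ... */
        }

        double elapsed = omp_get_wtime() - start;     /* seconds */
        printf("elapsed %.6f s (timer resolution %.3e s)\n",
               elapsed, omp_get_wtick());
        return 0;
    }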
10 Binding
11 The binding thread set for all event routine regions is the encountering thread.
12 3.11.1 omp_fulfill_event
13 Summary
14 This routine fulfills and destroys an OpenMP event.
15 Format
C / C++
16 void omp_fulfill_event(omp_event_handle_t event);
C / C++
Fortran
17 subroutine omp_fulfill_event(event)
18 integer (kind=omp_event_handle_kind) event
Fortran
19 Constraints on Arguments
20 A program that calls this routine on an event that was already fulfilled is non-conforming. A
21 program that calls this routine with an event handle that was not created by the detach clause is
22 non-conforming.
23 Effect
24 The effect of this routine is to fulfill the event associated with the event handle argument. The effect
25 of fulfilling the event will depend on how the event was created. The event is destroyed and cannot
26 be accessed after calling this routine, and the event handle becomes unassociated with any event.
4 Tool Callbacks
5 A thread dispatches a registered ompt_callback_task_schedule callback with NULL as its
6 next_task_data argument while the argument prior_task_data binds to the detached task for each
7 occurrence of a task-fulfill event. If the task-fulfill event occurs before the detached task finished the
8 execution of the associated structured-block, the callback has ompt_task_early_fulfill as
9 its prior_task_status argument; otherwise the callback has ompt_task_late_fulfill as its
10 prior_task_status argument. This callback has type signature
11 ompt_callback_task_schedule_t.
12 Cross References
13 • detach clause, see Section 2.12.1.
14 • ompt_callback_task_schedule_t, see Section 4.5.2.10.
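A non-normative sketch in which a task created with a detach clause completes only after the generating task fulfills its event; in a real program the call to omp_fulfill_event would typically be made from a completion callback of some external asynchronous interface.

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        omp_event_handle_t ev;
        int done = 0;

        #pragma omp parallel num_threads(2)
        #pragma omp single
        {
            /* The structured block may finish immediately, but the task only
               completes once ev is fulfilled. */
            #pragma omp task detach(ev) shared(done)
            done = 1;

            omp_fulfill_event(ev);   /* ev becomes unassociated afterwards */

            #pragma omp taskwait     /* now the detached task can complete */
            printf("done = %d\n", done);
        }
        return 0;
    }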
C / C++
29 Binding
30 The binding task set for all interoperability routine regions is the generating task.
1 3.12.1 omp_get_num_interop_properties
2 Summary
3 The omp_get_num_interop_properties routine retrieves the number of
4 implementation-defined properties available for an omp_interop_t object.
5 Format
6 int omp_get_num_interop_properties(const omp_interop_t interop);
7 Effect
8 The omp_get_num_interop_properties routine returns the number of
9 implementation-defined properties available for interop. The total number of properties available
10 for interop is the returned value minus omp_ipr_first.
11 Cross References
12 • interop construct, see Section 2.15.1.
13 3.12.2 omp_get_interop_int
14 Summary
15 The omp_get_interop_int routine retrieves an integer property from an omp_interop_t
16 object.
17 Format
18 omp_intptr_t omp_get_interop_int(const omp_interop_t interop,
19 omp_interop_property_t property_id,
20 int *ret_code);
1 Effect
2 The omp_get_interop_int routine returns the requested integer property, if available, and
3 zero if an error occurs or no value is available.
4 If the interop is omp_interop_none, an empty error occurs.
5 If the property_id is smaller than omp_ipr_first or not smaller than
6 omp_get_num_interop_properties(interop), an out of range error occurs.
7 If the requested property value is not convertible into an integer value, a type error occurs.
8 If a non-null pointer is passed to ret_code, an omp_interop_rc_t value that indicates the
9 return code is stored in the object to which ret_code points. If an error occurred, the stored value
10 will be negative and it will match the error as defined in Table 3.2. On success, zero will be stored.
11 If no error occurred but no meaningful value can be returned, omp_irc_no_value, which is
12 one, will be stored.
13 Restrictions
14 Restrictions to the omp_get_interop_int routine are as follows:
15 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
16 Cross References
17 • interop construct, see Section 2.15.1.
18 • omp_get_num_interop_properties routine, see Section 3.12.1.
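A non-normative sketch that creates an interop object with the interop construct and queries its device-number property; whether a foreign runtime is available, and therefore whether a meaningful value is returned, is implementation defined.

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        omp_interop_t iobj = omp_interop_none;
        int rc;

        #pragma omp interop init(targetsync: iobj)

        omp_intptr_t dev = omp_get_interop_int(iobj, omp_ipr_device_num, &rc);
        if (rc == omp_irc_success)
            printf("foreign runtime device number: %ld\n", (long)dev);
        else
            printf("property not available: %s\n",
                   omp_get_interop_rc_desc(iobj, rc));

        #pragma omp interop destroy(iobj)
        return 0;
    }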
19 3.12.3 omp_get_interop_ptr
20 Summary
21 The omp_get_interop_ptr routine retrieves a pointer property from an omp_interop_t
22 object.
23 Format
24 void* omp_get_interop_ptr(const omp_interop_t interop,
25 omp_interop_property_t property_id,
26 int *ret_code);
27 Effect
28 The omp_get_interop_ptr routine returns the requested pointer property, if available, and
29 NULL if an error occurs or no value is available.
30 If the interop is omp_interop_none, an empty error occurs.
31 If the property_id is smaller than omp_ipr_first or not smaller than
32 omp_get_num_interop_properties(interop), an out of range error occurs.
33 If the requested property value is not convertible into a pointer value, a type error occurs.
6 Restrictions
7 Restrictions to the omp_get_interop_ptr routine are as follows:
8 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
9 • Memory referenced by the pointer returned by the omp_get_interop_ptr routine is
10 managed by the OpenMP implementation and should not be freed or modified.
11 Cross References
12 • interop construct, see Section 2.15.1.
13 • omp_get_num_interop_properties routine, see Section 3.12.1.
14 3.12.4 omp_get_interop_str
15 Summary
16 The omp_get_interop_str routine retrieves a string property from an omp_interop_t
17 object.
18 Format
19 const char* omp_get_interop_str(const omp_interop_t interop,
20 omp_interop_property_t property_id,
21 int *ret_code);
22 Effect
23 The omp_get_interop_str routine returns the requested string property as a C string, if
24 available, and NULL if an error occurs or no value is available.
25 If the interop is omp_interop_none, an empty error occurs.
26 If the property_id is smaller than omp_ipr_first or not smaller than
27 omp_get_num_interop_properties(interop), an out of range error occurs.
28 If the requested property value is not convertible into a string value, a type error occurs.
29 If a non-null pointer is passed to ret_code, an omp_interop_rc_t value that indicates the
30 return code is stored in the object to which the ret_code points. If an error occurred, the stored
31 value will be negative and it will match the error as defined in Table 3.2. On success, zero will be
32 stored. If no error occurred but no meaningful value can be returned, omp_irc_no_value,
33 which is one, will be stored.
1 Restrictions
2 Restrictions to the omp_get_interop_str routine are as follows:
3 • The behavior of the routine is unspecified if an invalid omp_interop_t object is provided.
4 • Memory referenced by the pointer returned by the omp_get_interop_str routine is
5 managed by the OpenMP implementation and should not be freed or modified.
6 Cross References
7 • interop construct, see Section 2.15.1.
8 • omp_get_num_interop_properties routine, see Section 3.12.1.
9 3.12.5 omp_get_interop_name
10 Summary
11 The omp_get_interop_name routine retrieves a property name from an omp_interop_t
12 object.
13 Format
14 const char* omp_get_interop_name(const omp_interop_t interop,
15 omp_interop_property_t property_id);
17 Effect
18 The omp_get_interop_name routine returns the name of the property identified by
19 property_id as a C string.
20 Property names for non-implementation defined properties are listed in Table 3.1.
21 If the property_id is smaller than omp_ipr_first or not smaller than
22 omp_get_num_interop_properties(interop), NULL is returned.
23 Restrictions
24 Restrictions to the omp_get_interop_name routine are as follows:
25 • The behavior of the routine is unspecified if an invalid object is provided.
26 • Memory referenced by the pointer returned by the omp_get_interop_name routine is
27 managed by the OpenMP implementation and should not be freed or modified.
28 Cross References
29 • interop construct, see Section 2.15.1.
30 • omp_get_num_interop_properties routine, see Section 3.12.1.
1 3.12.6 omp_get_interop_type_desc
2 Summary
3 The omp_get_interop_type_desc routine retrieves a description of the type of a property
4 associated with an omp_interop_t object.
5 Format
6 const char* omp_get_interop_type_desc(const omp_interop_t interop,
7 omp_interop_property_t property_id);
9 Effect
10 The omp_get_interop_type_desc routine returns a C string that describes the type of the
11 property identified by property_id in human-readable form. The string may contain a valid C type
12 declaration, possibly followed by a description or name of the type.
13 If interop has the value omp_interop_none, NULL is returned.
14 If the property_id is smaller than omp_ipr_first or not smaller than
15 omp_get_num_interop_properties(interop), NULL is returned.
16 Restrictions
17 Restrictions to the omp_get_interop_type_desc routine are as follows:
18 • The behavior of the routine is unspecified if an invalid object is provided.
19 • Memory referenced by the pointer returned from the omp_get_interop_type_desc
20 routine is managed by the OpenMP implementation and should not be freed or modified.
21 Cross References
22 • interop construct, see Section 2.15.1.
23 • omp_get_num_interop_properties routine, see Section 3.12.1.
24 3.12.7 omp_get_interop_rc_desc
25 Summary
26 The omp_get_interop_rc_desc routine retrieves a description of the return code associated
27 with an omp_interop_t object.
28 Format
29 const char* omp_get_interop_rc_desc(const omp_interop_t interop,
30 omp_interop_rc_t ret_code);
31 Effect
32 The omp_get_interop_rc_desc routine returns a C string that describes the return code
33 ret_code in human-readable form.
7 Cross References
8 • interop construct, see Section 2.15.1.
9 • omp_get_num_interop_properties routine, see Section 3.12.1.
C / C++
1 parameter :: omp_atv_false = 0
2 integer(kind=omp_alloctrait_val_kind), &
3 parameter :: omp_atv_true = 1
4 integer(kind=omp_alloctrait_val_kind), &
5 parameter :: omp_atv_contended = 3
6 integer(kind=omp_alloctrait_val_kind), &
7 parameter :: omp_atv_uncontended = 4
8 integer(kind=omp_alloctrait_val_kind), &
9 parameter :: omp_atv_serialized = 5
10 integer(kind=omp_alloctrait_val_kind), &
11 parameter :: omp_atv_sequential = &
12 omp_atv_serialized ! (deprecated)
13 integer(kind=omp_alloctrait_val_kind), &
14 parameter :: omp_atv_private = 6
15 integer(kind=omp_alloctrait_val_kind), &
16 parameter :: omp_atv_all = 7
17 integer(kind=omp_alloctrait_val_kind), &
18 parameter :: omp_atv_thread = 8
19 integer(kind=omp_alloctrait_val_kind), &
20 parameter :: omp_atv_pteam = 9
21 integer(kind=omp_alloctrait_val_kind), &
22 parameter :: omp_atv_cgroup = 10
23 integer(kind=omp_alloctrait_val_kind), &
24 parameter :: omp_atv_default_mem_fb = 11
25 integer(kind=omp_alloctrait_val_kind), &
26 parameter :: omp_atv_null_fb = 12
27 integer(kind=omp_alloctrait_val_kind), &
28 parameter :: omp_atv_abort_fb = 13
29 integer(kind=omp_alloctrait_val_kind), &
30 parameter :: omp_atv_allocator_fb = 14
31 integer(kind=omp_alloctrait_val_kind), &
32 parameter :: omp_atv_environment = 15
33 integer(kind=omp_alloctrait_val_kind), &
34 parameter :: omp_atv_nearest = 16
35 integer(kind=omp_alloctrait_val_kind), &
36 parameter :: omp_atv_blocked = 17
37 integer(kind=omp_alloctrait_val_kind), &
38 parameter :: omp_atv_interleaved = 18
39
40 ! omp_alloctrait might not be provided in omp_lib.h.
41 type omp_alloctrait
42 integer(kind=omp_alloctrait_key_kind) key
43 integer(kind=omp_alloctrait_val_kind) value
5 3.13.2 omp_init_allocator
6 Summary
7 The omp_init_allocator routine initializes an allocator and associates it with a memory
8 space.
9 Format
C / C++
10 omp_allocator_handle_t omp_init_allocator (
11 omp_memspace_handle_t memspace,
12 int ntraits,
13 const omp_alloctrait_t traits[]
14 );
C / C++
Fortran
15 integer(kind=omp_allocator_handle_kind) &
16 function omp_init_allocator ( memspace, ntraits, traits )
17 integer(kind=omp_memspace_handle_kind),intent(in) :: memspace
18 integer,intent(in) :: ntraits
19 type(omp_alloctrait),intent(in) :: traits(*)
Fortran
20 Constraints on Arguments
21 The memspace argument must be one of the predefined memory spaces defined in Table 2.8.
22 If the ntraits argument is greater than zero then the traits argument must specify at least that many
23 traits. If it specifies fewer than ntraits traits, the behavior is unspecified.
24 Binding
25 The binding thread set for an omp_init_allocator region is all threads on a device. The
26 effect of executing this routine is not related to any specific region that corresponds to any construct
27 or API routine.
12 Restrictions
13 The restrictions to the omp_init_allocator routine are as follows:
14 • The use of an allocator returned by this routine on a device other than the one on which it was
15 created results in unspecified behavior.
16 • Unless a requires directive with the dynamic_allocators clause is present in the same
17 compilation unit, using this routine in a target region results in unspecified behavior.
18 Cross References
19 • Memory Spaces, see Section 2.13.1.
20 • Memory Allocators, see Section 2.13.2.
21 3.13.3 omp_destroy_allocator
22 Summary
23 The omp_destroy_allocator routine releases all resources used by the allocator handle.
24 Format
C / C++
25 void omp_destroy_allocator (omp_allocator_handle_t allocator);
C / C++
Fortran
26 subroutine omp_destroy_allocator ( allocator )
27 integer(kind=omp_allocator_handle_kind),intent(in) :: allocator
Fortran
28 Constraints on Arguments
29 The allocator argument must not represent a predefined memory allocator.
5 Effect
6 The omp_destroy_allocator routine releases all resources used to implement the allocator
7 handle.
8 If allocator is omp_null_allocator then this routine will have no effect.
9 Restrictions
10 The restrictions to the omp_destroy_allocator routine are as follows:
11 • Accessing any memory allocated by the allocator after this call results in unspecified behavior.
12 • Unless a requires directive with the dynamic_allocators clause is present in the same
13 compilation unit, using this routine in a target region results in unspecified behavior.
14 Cross References
15 • Memory Allocators, see Section 2.13.2.
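A non-normative sketch that initializes an allocator on the default memory space with an alignment trait and a null fallback, allocates through it, and destroys it.

    #include <omp.h>

    int main(void)
    {
        omp_alloctrait_t traits[2] = {
            { omp_atk_alignment, 64 },              /* 64-byte alignment      */
            { omp_atk_fallback,  omp_atv_null_fb }  /* return NULL on failure */
        };

        omp_allocator_handle_t al =
            omp_init_allocator(omp_default_mem_space, 2, traits);
        if (al == omp_null_allocator) return 1;

        double *p = (double *)omp_alloc(1024 * sizeof(double), al);
        if (p != NULL) {
            p[0] = 3.14;
            omp_free(p, al);    /* release before destroying the allocator */
        }

        /* Accessing memory allocated by al after this call is unspecified. */
        omp_destroy_allocator(al);
        return 0;
    }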
16 3.13.4 omp_set_default_allocator
17 Summary
18 The omp_set_default_allocator routine sets the default memory allocator to be used by
19 allocation calls, allocate directives and allocate clauses that do not specify an allocator.
20 Format
C / C++
21 void omp_set_default_allocator (omp_allocator_handle_t allocator);
C / C++
Fortran
22 subroutine omp_set_default_allocator ( allocator )
23 integer(kind=omp_allocator_handle_kind),intent(in) :: allocator
Fortran
24 Constraints on Arguments
25 The allocator argument must be a valid memory allocator handle.
26 Binding
27 The binding task set for an omp_set_default_allocator region is the binding implicit task.
4 Cross References
5 • def-allocator-var ICV, see Section 2.4.
6 • Memory Allocators, see Section 2.13.2.
7 • omp_alloc routine, see Section 3.13.6.
8 3.13.5 omp_get_default_allocator
9 Summary
10 The omp_get_default_allocator routine returns a handle to the memory allocator to be
11 used by allocation calls, allocate directives and allocate clauses that do not specify an
12 allocator.
13 Format
C / C++
14 omp_allocator_handle_t omp_get_default_allocator (void);
C / C++
Fortran
15 integer(kind=omp_allocator_handle_kind)&
16 function omp_get_default_allocator ()
Fortran
17 Binding
18 The binding task set for an omp_get_default_allocator region is the binding implicit task.
19 Effect
20 The effect of this routine is to return the value of the def-allocator-var ICV of the binding implicit
21 task.
22 Cross References
23 • def-allocator-var ICV, see Section 2.4.
24 • Memory Allocators, see Section 2.13.2.
25 • omp_alloc routine, see Section 3.13.6.
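Note – The following C sketch is a non-normative illustration of how the def-allocator-var ICV interacts with allocation calls that pass omp_null_allocator; the allocator handle a is assumed to have been created with omp_init_allocator.
C / C++
#include <omp.h>

// Sketch: route default allocations through allocator `a`, then restore the
// previous default allocator.
void with_default_allocator(omp_allocator_handle_t a)
{
    omp_allocator_handle_t prev = omp_get_default_allocator();
    omp_set_default_allocator(a);

    // omp_alloc with omp_null_allocator uses the def-allocator-var ICV,
    // which now designates `a`.
    double *buf = (double *)omp_alloc(1024 * sizeof(double),
                                      omp_null_allocator);
    omp_free(buf, omp_null_allocator);

    omp_set_default_allocator(prev);
}
C / C++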
5 Format
C
6 void *omp_alloc(size_t size, omp_allocator_handle_t allocator);
7 void *omp_aligned_alloc(
8 size_t alignment,
9 size_t size,
10 omp_allocator_handle_t allocator);
C
C++
11 void *omp_alloc(
12 size_t size,
13 omp_allocator_handle_t allocator=omp_null_allocator
14 );
15 void *omp_aligned_alloc(
16 size_t alignment,
17 size_t size,
18 omp_allocator_handle_t allocator=omp_null_allocator
19 );
C++
Fortran
20 type(c_ptr) function omp_alloc(size, allocator) bind(c)
21 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t
22 integer(c_size_t), value :: size
23 integer(omp_allocator_handle_kind), value :: allocator
24
25 type(c_ptr) function omp_aligned_alloc(alignment, &
26 size, allocator) bind(c)
27 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t
28 integer(c_size_t), value :: alignment, size
29 integer(omp_allocator_handle_kind), value :: allocator
Fortran
8 Binding
9 The binding task set for an omp_alloc or omp_aligned_alloc region is the generating task.
10 Effect
11 The omp_alloc and omp_aligned_alloc routines request a memory allocation of size bytes
12 from the specified memory allocator. If the allocator argument is omp_null_allocator the
13 memory allocator used by the routines will be the one specified by the def-allocator-var ICV of the
14 binding implicit task. Upon success they return a pointer to the allocated memory. Otherwise, the
15 behavior that the fallback trait of the allocator specifies will be followed.
16 If size is 0, omp_alloc and omp_aligned_alloc will return NULL (or, C_NULL_PTR, for
17 Fortran).
18 Memory allocated by omp_alloc will be byte-aligned to at least the maximum of the alignment
19 required by malloc and the alignment trait of the allocator.
20 Memory allocated by omp_aligned_alloc will be byte-aligned to at least the maximum of the
21 alignment required by malloc, the alignment trait of the allocator and the alignment argument
22 value.
Fortran
23 The omp_alloc and omp_aligned_alloc routines require an explicit interface and so might
24 not be provided in omp_lib.h.
Fortran
25 Cross References
26 • Memory allocators, see Section 2.13.2.
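Note – The following C sketch is a non-normative illustration of the alignment behavior described above; the 128-byte alignment is an arbitrary example, and the requested size is a multiple of that alignment (mirroring the constraint stated for omp_aligned_calloc in Section 3.13.8).
C / C++
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <omp.h>

// Sketch: request n 128-byte chunks from the default allocator; the size
// passed to omp_aligned_alloc is a multiple of the requested alignment.
void *alloc_chunks(size_t n)
{
    void *p = omp_aligned_alloc(128, n * 128, omp_null_allocator);
    if (p != NULL)
        assert(((uintptr_t)p % 128) == 0); // at least the requested alignment
    return p;
}
C / C++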
27 3.13.7 omp_free
28 Summary
29 The omp_free routine deallocates previously allocated memory.
30 Format
C
31 void omp_free (void *ptr, omp_allocator_handle_t allocator);
C
9 Binding
10 The binding task set for an omp_free region is the generating task.
11 Effect
12 The omp_free routine deallocates the memory to which ptr points. The ptr argument must have
13 been returned by an OpenMP allocation routine. If the allocator argument is specified it must be
14 the memory allocator to which the allocation request was made. If the allocator argument is
15 omp_null_allocator the implementation will determine that value automatically.
16 If ptr is NULL (or, C_NULL_PTR, for Fortran), no operation is performed.
Fortran
17 The omp_free routine requires an explicit interface and so might not be provided in
18 omp_lib.h.
Fortran
19 Restrictions
20 The restrictions to the omp_free routine are as follows:
21 • Using omp_free on memory that was already deallocated or that was allocated by an allocator
22 that has already been destroyed with omp_destroy_allocator results in unspecified
23 behavior.
24 Cross References
25 • Memory allocators, see Section 2.13.2.
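Note – The following C sketch is a non-normative illustration of the pairing rule above; the allocator handle a is assumed to have been created with omp_init_allocator.
C / C++
#include <omp.h>

// Sketch: memory obtained from allocator `a` may be released either by
// naming `a` explicitly or by passing omp_null_allocator and letting the
// implementation determine the allocator.
void alloc_and_free(omp_allocator_handle_t a)
{
    int *p = (int *)omp_alloc(100 * sizeof(int), a);
    omp_free(p, a);                   // explicit allocator argument
    int *q = (int *)omp_alloc(100 * sizeof(int), a);
    omp_free(q, omp_null_allocator);  // allocator determined automatically
}
C / C++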
5 Format
C
6 void *omp_calloc(
7 size_t nmemb,
8 size_t size,
9 omp_allocator_handle_t allocator
10 );
11 void *omp_aligned_calloc(
12 size_t alignment,
13 size_t nmemb,
14 size_t size,
15 omp_allocator_handle_t allocator
16 );
C
C++
17 void *omp_calloc(
18 size_t nmemb,
19 size_t size,
20 omp_allocator_handle_t allocator=omp_null_allocator
21 );
22 void *omp_aligned_calloc(
23 size_t alignment,
24 size_t nmemb,
25 size_t size,
26 omp_allocator_handle_t allocator=omp_null_allocator
27 );
C++
11 Constraints on Arguments
12 Unless dynamic_allocators appears on a requires directive in the same compilation unit,
13 omp_calloc and omp_aligned_calloc invocations that appear in target regions must
14 not pass omp_null_allocator as the allocator argument, which must be a constant expression
15 that evaluates to one of the predefined memory allocator values.
16 The alignment argument to omp_aligned_calloc must be a power of two and the size
17 argument must be a multiple of alignment.
18 Binding
19 The binding task set for an omp_calloc or omp_aligned_calloc region is the generating
20 task.
21 Effect
22 The omp_calloc and omp_aligned_calloc routines request a memory allocation from the
23 specified memory allocator for an array of nmemb elements each of which has a size of size bytes.
24 If the allocator argument is omp_null_allocator the memory allocator used by the routines
25 will be the one specified by the def-allocator-var ICV of the binding implicit task. Upon success
26 they return a pointer to the allocated memory. Otherwise, the behavior that the fallback trait of
27 the allocator specifies will be followed. Any memory allocated by these routines will be set to zero
28 before returning.
29 If either nmemb or size is 0, omp_calloc and omp_aligned_calloc will return NULL (or, C_NULL_PTR, for Fortran).
30 Memory allocated by omp_calloc will be byte-aligned to at least the maximum of the alignment
31 required by malloc and the alignment trait of the allocator.
32 Memory allocated by omp_aligned_calloc will be byte-aligned to at least the maximum of
33 the alignment required by malloc, the alignment trait of the allocator and the alignment
34 argument value.
3 Cross References
4 • Memory allocators, see Section 2.13.2.
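Note – The following C sketch is a non-normative illustration of a zero-initialized allocation through the default allocator.
C / C++
#include <stddef.h>
#include <omp.h>

// Sketch: allocate a zero-filled array of n doubles from the default
// allocator; alignment follows malloc and the allocator's alignment trait.
double *zeroed_vector(size_t n)
{
    return (double *)omp_calloc(n, sizeof(double), omp_null_allocator);
}
C / C++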
5 3.13.9 omp_realloc
6 Summary
7 The omp_realloc routine deallocates previously allocated memory and requests a memory
8 allocation from a memory allocator.
9 Format
C
10 void *omp_realloc(
11 void *ptr,
12 size_t size,
13 omp_allocator_handle_t allocator,
14 omp_allocator_handle_t free_allocator
15 );
C
C++
16 void *omp_realloc(
17 void *ptr,
18 size_t size,
19 omp_allocator_handle_t allocator=omp_null_allocator,
20 omp_allocator_handle_t free_allocator=omp_null_allocator
21 );
C++
Fortran
22 type(c_ptr) &
23 function omp_realloc(ptr, size, allocator, free_allocator) bind(c)
24 use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t
25 type(c_ptr), value :: ptr
26 integer(c_size_t), value :: size
27 integer(omp_allocator_handle_kind), value :: allocator, free_allocator
Fortran
6 Binding
7 The binding task set for an omp_realloc region is the generating task.
8 Effect
9 The omp_realloc routine deallocates the memory to which ptr points and requests a new
10 memory allocation of size bytes from the specified memory allocator. If the free_allocator
11 argument is specified, it must be the memory allocator to which the previous allocation request was
12 made. If the free_allocator argument is omp_null_allocator the implementation will
13 determine that value automatically. If the allocator argument is omp_null_allocator the
14 behavior is as if the memory allocator that allocated the memory to which ptr argument points is
15 passed to the allocator argument. Upon success it returns a (possibly moved) pointer to the
16 allocated memory and the contents of the new object shall be the same as that of the old object
17 prior to deallocation, up to the minimum size of old allocated size and size. Any bytes in the new
18 object beyond the old allocated size will have unspecified values. If the allocation failed, the
19 behavior that the fallback trait of the allocator specifies will be followed.
20 If ptr is NULL (or, C_NULL_PTR, for Fortran), omp_realloc will behave the same as
21 omp_alloc with the same size and allocator arguments.
22 If size is 0, omp_realloc will return NULL (or, C_NULL_PTR, for Fortran) and the old
23 allocation will be deallocated.
24 If size is not 0, the old allocation will be deallocated if and only if the function returns a non-NULL
25 value (or, a non-C_NULL_PTR value, for Fortran).
26 Memory allocated by omp_realloc will be byte-aligned to at least the maximum of the
27 alignment required by malloc and the alignment trait of the allocator.
Fortran
28 The omp_realloc routine requires an explicit interface and so might not be provided in
29 omp_lib.h.
Fortran
30 Restrictions
31 The restrictions to the omp_realloc routine are as follows:
32 • The ptr argument must have been returned by an OpenMP allocation routine.
33 • Using omp_realloc on memory that was already deallocated or that was allocated by an
34 allocator that has already been destroyed with omp_destroy_allocator results in
35 unspecified behavior.
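Note – The following C sketch is a non-normative illustration of the reallocation rules above; the allocator handle a is assumed to have been created with omp_init_allocator and to have been used for the original allocation.
C / C++
#include <stddef.h>
#include <omp.h>

// Sketch: grow a buffer within allocator `a`. The old contents are preserved
// up to the smaller of the old and new sizes; if size != 0 and the request
// fails under a null fallback, NULL is returned and `old` remains allocated.
void *grow_buffer(void *old, size_t new_size, omp_allocator_handle_t a)
{
    return omp_realloc(old, new_size, a, a);
}
C / C++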
6 Format
C / C++
7 int omp_control_tool(int command, int modifier, void *arg);
C / C++
Fortran
8 integer function omp_control_tool(command, modifier)
9 integer (kind=omp_control_tool_kind) command
10 integer modifier
Fortran
11 Constraints on Arguments
12 The following enumeration type defines four standard commands. Table 3.3 describes the actions
13 that these commands request from a tool.
C / C++
14 typedef enum omp_control_tool_t {
15 omp_control_tool_start = 1,
16 omp_control_tool_pause = 2,
17 omp_control_tool_flush = 3,
18 omp_control_tool_end = 4
19 } omp_control_tool_t;
C / C++
Fortran
20 integer (kind=omp_control_tool_kind), &
21 parameter :: omp_control_tool_start = 1
22 integer (kind=omp_control_tool_kind), &
23 parameter :: omp_control_tool_pause = 2
24 integer (kind=omp_control_tool_kind), &
25 parameter :: omp_control_tool_flush = 3
26 integer (kind=omp_control_tool_kind), &
27 parameter :: omp_control_tool_end = 4
Fortran
TABLE 3.3 (Command / Action): the actions that the standard commands request from a tool
4 Binding
5 The binding task set for an omp_control_tool region is the generating task.
6 Effect
7 An OpenMP program may use omp_control_tool to pass commands to a tool. An application
8 can use omp_control_tool to request that a tool starts or restarts data collection when a code
9 region of interest is encountered, that a tool pauses data collection when leaving the region of
10 interest, that a tool flushes any data that it has collected so far, or that a tool ends data collection.
11 Additionally, omp_control_tool can be used to pass tool-specific commands to a particular
12 tool.
13 The following types correspond to return values from omp_control_tool:
C / C++
14 typedef enum omp_control_tool_result_t {
15 omp_control_tool_notool = -2,
16 omp_control_tool_nocallback = -1,
17 omp_control_tool_success = 0,
18 omp_control_tool_ignored = 1
19 } omp_control_tool_result_t;
C / C++
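Note – The following C sketch is a non-normative illustration of passing standard commands to a tool; the modifier value 0 and the NULL argument are placeholders with no tool-specific meaning.
C / C++
#include <stddef.h>
#include <omp.h>

// Sketch: bracket a region of interest with start/pause commands and check
// whether an active tool handled the request.
void measured_region(void)
{
    int rc = omp_control_tool(omp_control_tool_start, 0, NULL);
    if (rc == omp_control_tool_notool) {
        // no tool is active; the commands below will also be ignored
    }

    // ... region of interest ...

    omp_control_tool(omp_control_tool_pause, 0, NULL);
}
C / C++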
21 Tool Callbacks
22 A thread dispatches a registered ompt_callback_control_tool callback for each
23 occurrence of a tool-control event. The callback executes in the context of the call that occurs in the
24 user program and has type signature ompt_callback_control_tool_t. The callback may
25 return any non-negative value, which will be returned to the application by the OpenMP
26 implementation as the return value of the omp_control_tool call that triggered the callback.
27 Arguments passed to the callback are those passed by the user to omp_control_tool. If the
28 call is made in Fortran, the tool will be passed NULL as the third argument to the callback. If any of
29 the four standard commands is presented to a tool, the tool will ignore the modifier and arg
30 argument values.
31 Restrictions
32 Restrictions on access to the state of an OpenMP first-party tool are as follows:
33 • An application may access the tool state modified by an OMPT callback only by using
34 omp_control_tool.
8 Format
C / C++
9 void omp_display_env(int verbose);
C / C++
Fortran
10 subroutine omp_display_env(verbose)
11 logical,intent(in) :: verbose
Fortran
12 Binding
13 The binding thread set for an omp_display_env region is the encountering thread.
14 Effect
15 Each time the omp_display_env routine is invoked, the runtime system prints the OpenMP
16 version number and the initial values of the ICVs associated with the environment variables
17 described in Chapter 6. The displayed values are the values of the ICVs after they have been
18 modified according to the environment variable settings and before the execution of any OpenMP
19 construct or API routine.
20 The display begins with "OPENMP DISPLAY ENVIRONMENT BEGIN", followed by the
21 _OPENMP version macro (or the openmp_version named constant for Fortran) and ICV values,
22 in the format NAME ’=’ VALUE. NAME corresponds to the macro or environment variable name,
23 optionally prepended with a bracketed DEVICE. VALUE corresponds to the value of the macro or
24 ICV associated with this environment variable. Values are enclosed in single quotes. DEVICE
25 corresponds to the device on which the value of the ICV is applied. The display is terminated with
26 "OPENMP DISPLAY ENVIRONMENT END".
27 For the OMP_NESTED environment variable, the printed value is true if the max-active-levels-var
28 ICV is initialized to a value greater than 1; otherwise the printed value is false. The OMP_NESTED
29 environment variable has been deprecated.
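Note – The following C sketch is a non-normative illustration; a nonzero argument requests the verbose form of the display.
C / C++
#include <omp.h>

int main(void)
{
    // Display the OpenMP version number and the initial ICV values that
    // result from the environment variable settings.
    omp_display_env(1);
    return 0;
}
C / C++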
16 Cross References
17 • OMP_DISPLAY_ENV environment variable, see Section 6.12.
21 4.2.1 ompt_start_tool
22 Summary
23 In order to use the OMPT interface provided by an OpenMP implementation, a tool must implement
24 the ompt_start_tool function, through which the OpenMP implementation initializes the tool.
17 Description of Arguments
18 The argument omp_version is the value of the _OPENMP version macro associated with the
19 OpenMP API implementation. This value identifies the OpenMP API version that an OpenMP
20 implementation supports, which specifies the version of the OMPT interface that it supports.
21 The argument runtime_version is a version string that unambiguously identifies the OpenMP
22 implementation.
23 Constraints on Arguments
24 The argument runtime_version must be an immutable string that is defined for the lifetime of a
25 program execution.
26 Effect
27 If a tool returns a non-null pointer to an ompt_start_tool_result_t structure, an OpenMP
28 implementation will call the tool initializer specified by the initialize field in this structure before
29 beginning execution of any OpenMP construct or completing execution of any environment routine
30 invocation; the OpenMP implementation will call the tool finalizer specified by the finalize field in
31 this structure when the OpenMP implementation shuts down.
32 Cross References
33 • ompt_start_tool_result_t, see Section 4.4.1.
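Note – The following C sketch is a non-normative illustration of a minimal first-party tool; the initializer and finalizer shown here do nothing beyond keeping the OMPT interface active.
C / C++
#include <omp-tools.h>

static int my_initialize(ompt_function_lookup_t lookup,
                         int initial_device_num, ompt_data_t *tool_data)
{
    (void)lookup; (void)initial_device_num; (void)tool_data;
    return 1; // non-zero keeps the OMPT interface active
}

static void my_finalize(ompt_data_t *tool_data) { (void)tool_data; }

ompt_start_tool_result_t *ompt_start_tool(unsigned int omp_version,
                                          const char *runtime_version)
{
    (void)omp_version; (void)runtime_version;
    static ompt_start_tool_result_t result = { &my_initialize,
                                               &my_finalize, { 0 } };
    return &result; // returning NULL would leave the tool inactive
}
C / C++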
[FIGURE 4.1: First-Party Tool Activation Flow Chart — the runtime searches for a tool and calls ompt_start_tool; if the return value r is non-null, the tool becomes active and r->initialize is called; if r is NULL or no tool is found, the OMPT interface becomes inactive.]
20 Cross References
21 • tool-libraries-var ICV, see Section 2.4.
22 • tool-var ICV, see Section 2.4.
23 • ompt_start_tool function, see Section 4.2.1.
24 • ompt_start_tool_result_t type, see Section 4.4.1.
8 Cross References
9 • ompt_start_tool function, see Section 4.2.1.
10 • ompt_start_tool_result_t type, see Section 4.4.1.
11 • ompt_initialize_t type, see Section 4.5.1.1.
12 • ompt_callback_thread_begin_t type, see Section 4.5.2.1.
13 • ompt_enumerate_states_t type, see Section 4.6.1.1.
14 • ompt_enumerate_mutex_impls_t type, see Section 4.6.1.2.
15 • ompt_set_callback_t type, see Section 4.6.1.3.
16 • ompt_function_lookup_t type, see Section 4.6.3.
3 Cross References
4 • ompt_enumerate_states_t type, see Section 4.6.1.1.
5 • ompt_enumerate_mutex_impls_t type, see Section 4.6.1.2.
6 • ompt_set_callback_t type, see Section 4.6.1.3.
7 • ompt_get_callback_t type, see Section 4.6.1.4.
8 • ompt_get_thread_data_t type, see Section 4.6.1.5.
9 • ompt_get_num_procs_t type, see Section 4.6.1.6.
10 • ompt_get_num_places_t type, see Section 4.6.1.7.
11 • ompt_get_place_proc_ids_t type, see Section 4.6.1.8.
12 • ompt_get_place_num_t type, see Section 4.6.1.9.
13 • ompt_get_partition_place_nums_t type, see Section 4.6.1.10.
14 • ompt_get_proc_id_t type, see Section 4.6.1.11.
15 • ompt_get_state_t type, see Section 4.6.1.12.
16 • ompt_get_parallel_info_t type, see Section 4.6.1.13.
17 • ompt_get_task_info_t type, see Section 4.6.1.14.
18 • ompt_get_task_memory_t type, see Section 4.6.1.15.
19 • ompt_get_target_info_t type, see Section 4.6.1.16.
20 • ompt_get_num_devices_t type, see Section 4.6.1.17.
21 • ompt_get_unique_id_t type, see Section 4.6.1.18.
22 • ompt_finalize_tool_t type, see Section 4.6.1.19.
23 • ompt_function_lookup_t type, see Section 4.6.3.
Entry point string name                    Type signature
"ompt_enumerate_states"                    ompt_enumerate_states_t
"ompt_enumerate_mutex_impls"               ompt_enumerate_mutex_impls_t
"ompt_set_callback"                        ompt_set_callback_t
"ompt_get_callback"                        ompt_get_callback_t
"ompt_get_thread_data"                     ompt_get_thread_data_t
"ompt_get_num_places"                      ompt_get_num_places_t
"ompt_get_place_proc_ids"                  ompt_get_place_proc_ids_t
"ompt_get_place_num"                       ompt_get_place_num_t
"ompt_get_partition_place_nums"            ompt_get_partition_place_nums_t
"ompt_get_proc_id"                         ompt_get_proc_id_t
"ompt_get_state"                           ompt_get_state_t
"ompt_get_parallel_info"                   ompt_get_parallel_info_t
"ompt_get_task_info"                       ompt_get_task_info_t
"ompt_get_task_memory"                     ompt_get_task_memory_t
"ompt_get_num_devices"                     ompt_get_num_devices_t
"ompt_get_num_procs"                       ompt_get_num_procs_t
"ompt_get_target_info"                     ompt_get_target_info_t
"ompt_get_unique_id"                       ompt_get_unique_id_t
"ompt_finalize_tool"                       ompt_finalize_tool_t
22 Cross References
23 • ompt_set_result_t type, see Section 4.4.4.2.
24 • ompt_set_callback_t type, see Section 4.6.1.3.
25 • ompt_get_callback_t type, see Section 4.6.1.4.
TABLE 4.2: Callbacks for which ompt_set_callback must return ompt_set_always
Callback name
ompt_callback_thread_begin
ompt_callback_thread_end
ompt_callback_parallel_begin
ompt_callback_parallel_end
ompt_callback_task_create
ompt_callback_task_schedule
ompt_callback_implicit_task
ompt_callback_target
ompt_callback_target_emi
ompt_callback_target_data_op
ompt_callback_target_data_op_emi
ompt_callback_target_submit
ompt_callback_target_submit_emi
ompt_callback_control_tool
ompt_callback_device_initialize
ompt_callback_device_finalize
ompt_callback_device_load
ompt_callback_device_unload
TABLE 4.3: Callbacks for which ompt_set_callback may return any non-error code
Callback name
ompt_callback_sync_region_wait
ompt_callback_mutex_released
ompt_callback_dependences
ompt_callback_task_dependence
ompt_callback_work
ompt_callback_master (deprecated)
ompt_callback_masked
ompt_callback_target_map
ompt_callback_target_map_emi
ompt_callback_sync_region
ompt_callback_reduction
ompt_callback_lock_init
ompt_callback_lock_destroy
ompt_callback_mutex_acquire
ompt_callback_mutex_acquired
ompt_callback_nest_lock
ompt_callback_flush
ompt_callback_cancel
ompt_callback_dispatch
34 Restrictions
35 Restrictions on tracing activity on devices are as follows:
36 • Implementation-defined names must not start with the prefix ompt_, which is reserved for the
37 OpenMP specification.
23 Cross References
24 • ompt_finalize_t callback type, see Section 4.5.1.2
9 Format
C / C++
10 typedef struct ompt_start_tool_result_t {
11 ompt_initialize_t initialize;
12 ompt_finalize_t finalize;
13 ompt_data_t tool_data;
14 } ompt_start_tool_result_t;
C / C++
15 Restrictions
16 Restrictions to the ompt_start_tool_result_t type are as follows:
17 • The initialize and finalize callback pointer values in an ompt_start_tool_result_t
18 structure that ompt_start_tool returns must be non-null.
19 Cross References
20 • ompt_start_tool function, see Section 4.2.1.
21 • ompt_data_t type, see Section 4.4.4.4.
22 • ompt_initialize_t callback type, see Section 4.5.1.1.
23 • ompt_finalize_t callback type, see Section 4.5.1.2.
24 4.4.2 Callbacks
25 Summary
26 The ompt_callbacks_t enumeration type indicates the integer codes used to identify OpenMP
27 callbacks when registering or querying them.
7 Format
C / C++
8 typedef enum ompt_record_t {
9 ompt_record_ompt = 1,
10 ompt_record_native = 2,
11 ompt_record_invalid = 3
12 } ompt_record_t;
C / C++
17 Format
C / C++
18 typedef enum ompt_record_native_t {
19 ompt_record_native_info = 1,
20 ompt_record_native_event = 2
21 } ompt_record_native_t;
C / C++
24 Format
C / C++
25 typedef struct ompt_record_ompt_t {
26 ompt_callbacks_t type;
27 ompt_device_time_t time;
28 ompt_id_t thread_id;
29 ompt_id_t target_id;
30 union {
31 ompt_record_thread_begin_t thread_begin;
32 ompt_record_parallel_begin_t parallel_begin;
33 ompt_record_parallel_end_t parallel_end;
34 ompt_record_work_t work;
35 ompt_record_dispatch_t dispatch;
36 ompt_record_task_create_t task_create;
23 Restrictions
24 Restrictions to the ompt_record_ompt_t type are as follows:
25 • If type is set to ompt_callback_thread_end_t then the value of record is undefined.
28 4.4.4.1 ompt_callback_t
29 Summary
30 Pointers to tool callback functions with different type signatures are passed to the
31 ompt_set_callback runtime entry point and returned by the ompt_get_callback
32 runtime entry point. For convenience, these runtime entry points expect all type signatures to be
33 cast to a dummy type ompt_callback_t.
3 4.4.4.2 ompt_set_result_t
4 Summary
5 The ompt_set_result_t enumeration type corresponds to values that the
6 ompt_set_callback, ompt_set_trace_ompt and ompt_set_trace_native
7 runtime entry points return.
8 Format
C / C++
9 typedef enum ompt_set_result_t {
10 ompt_set_error = 0,
11 ompt_set_never = 1,
12 ompt_set_impossible = 2,
13 ompt_set_sometimes = 3,
14 ompt_set_sometimes_paired = 4,
15 ompt_set_always = 5
16 } ompt_set_result_t;
C / C++
17 Description
18 Values of ompt_set_result_t may indicate several possible outcomes. The
19 ompt_set_error value indicates that the associated call failed. Otherwise, the value indicates
20 when an event may occur and, when appropriate, dispatching a callback event leads to the
21 invocation of the callback. The ompt_set_never value indicates that the event will never occur
22 or that the callback will never be invoked at runtime. The ompt_set_impossible value
23 indicates that the event may occur but that tracing of it is not possible. The
24 ompt_set_sometimes value indicates that the event may occur and, for an
25 implementation-defined subset of associated event occurrences, will be traced or the callback will
26 be invoked at runtime. The ompt_set_sometimes_paired value indicates the same result as
27 ompt_set_sometimes and, in addition, that a callback with an endpoint value of
28 ompt_scope_begin will be invoked if and only if the same callback with an endpoint value of
29 ompt_scope_end will also be invoked sometime in the future. The ompt_set_always value
30 indicates that, whenever an associated event occurs, it will be traced or the callback will be invoked.
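Note – The following C sketch is a non-normative illustration of interpreting these values; set_callback is assumed to have been obtained through the lookup function that is passed to the tool initializer, and on_thread_begin is a hypothetical tool callback.
C / C++
#include <omp-tools.h>

static void register_thread_begin(ompt_set_callback_t set_callback,
                                  ompt_callback_thread_begin_t on_thread_begin)
{
    int r = set_callback(ompt_callback_thread_begin,
                         (ompt_callback_t)on_thread_begin);
    if (r == ompt_set_error) {
        // the registration itself failed
    } else if (r == ompt_set_never) {
        // the event will never be dispatched at runtime
    }
    // ompt_set_sometimes, ompt_set_sometimes_paired and ompt_set_always
    // indicate progressively stronger dispatch guarantees.
}
C / C++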
7 4.4.4.3 ompt_id_t
8 Summary
9 The ompt_id_t type is used to provide various identifiers to tools.
10 Format
C / C++
11 typedef uint64_t ompt_id_t;
C / C++
12 Description
13 When tracing asynchronous activity on devices, identifiers enable tools to correlate target regions
14 and operations that the host initiates with associated activities on a target device. In addition,
15 OMPT provides identifiers to refer to parallel regions and tasks that execute on a device. These
16 various identifiers are of type ompt_id_t.
17 ompt_id_none is defined as an instance of type ompt_id_t with the value 0.
18 Restrictions
19 Restrictions to the ompt_id_t type are as follows:
20 • Identifiers created on each device must be unique from the time an OpenMP implementation is
21 initialized until it is shut down. Identifiers for each target region and target data operation
22 instance that the host device initiates must be unique over time on the host. Identifiers for parallel
23 and task region instances that execute on a device must be unique over time within that device.
4 Format
C / C++
5 typedef union ompt_data_t {
6 uint64_t value;
7 void *ptr;
8 } ompt_data_t;
C / C++
9 Description
10 The ompt_data_t type represents data that is reserved for tool use and that is related to a thread
11 or to a parallel or task region. When an OpenMP implementation creates a thread or an instance of
12 a parallel, teams, task, or target region, it initializes the associated ompt_data_t object with
13 the value ompt_data_none, which is an instance of the type with the data and pointer fields
14 equal to 0.
15 4.4.4.5 ompt_device_t
16 Summary
17 The ompt_device_t opaque object type represents a device.
18 Format
C / C++
19 typedef void ompt_device_t;
C / C++
20 4.4.4.6 ompt_device_time_t
21 Summary
22 The ompt_device_time_t type represents raw device time values.
23 Format
C / C++
24 typedef uint64_t ompt_device_time_t;
C / C++
5 4.4.4.7 ompt_buffer_t
6 Summary
7 The ompt_buffer_t opaque object type is a handle for a target buffer.
8 Format
C / C++
9 typedef void ompt_buffer_t;
C / C++
10 4.4.4.8 ompt_buffer_cursor_t
11 Summary
12 The ompt_buffer_cursor_t opaque type is a handle for a position in a target buffer.
13 Format
C / C++
14 typedef uint64_t ompt_buffer_cursor_t;
C / C++
15 4.4.4.9 ompt_dependence_t
16 Summary
17 The ompt_dependence_t type represents a task dependence.
18 Format
C / C++
19 typedef struct ompt_dependence_t {
20 ompt_data_t variable;
21 ompt_dependence_type_t dependence_type;
22 } ompt_dependence_t;
C / C++
6 Cross References
7 • ompt_dependence_type_t type, see Section 4.4.4.23.
8 4.4.4.10 ompt_thread_t
9 Summary
10 The ompt_thread_t enumeration type defines the valid thread type values.
11 Format
C / C++
12 typedef enum ompt_thread_t {
13 ompt_thread_initial = 1,
14 ompt_thread_worker = 2,
15 ompt_thread_other = 3,
16 ompt_thread_unknown = 4
17 } ompt_thread_t;
C / C++
18 Description
19 Any initial thread has thread type ompt_thread_initial. All OpenMP threads that are not
20 initial threads have thread type ompt_thread_worker. A thread that an OpenMP
21 implementation uses but that does not execute user code has thread type ompt_thread_other.
22 Any thread that is created outside an OpenMP implementation and that is not an initial thread has
23 thread type ompt_thread_unknown.
24 4.4.4.11 ompt_scope_endpoint_t
25 Summary
26 The ompt_scope_endpoint_t enumeration type defines valid scope endpoint values.
27 Format
C / C++
28 typedef enum ompt_scope_endpoint_t {
29 ompt_scope_begin = 1,
30 ompt_scope_end = 2,
31 ompt_scope_beginend = 3
32 } ompt_scope_endpoint_t;
C / C++
4 Format
C / C++
5 typedef enum ompt_dispatch_t {
6 ompt_dispatch_iteration = 1,
7 ompt_dispatch_section = 2
8 } ompt_dispatch_t;
C / C++
9 4.4.4.13 ompt_sync_region_t
10 Summary
11 The ompt_sync_region_t enumeration type defines the valid synchronization region kind
12 values.
13 Format
C / C++
14 typedef enum ompt_sync_region_t {
15 ompt_sync_region_barrier = 1, // deprecated
16 ompt_sync_region_barrier_implicit = 2, // deprecated
17 ompt_sync_region_barrier_explicit = 3,
18 ompt_sync_region_barrier_implementation = 4,
19 ompt_sync_region_taskwait = 5,
20 ompt_sync_region_taskgroup = 6,
21 ompt_sync_region_reduction = 7,
22 ompt_sync_region_barrier_implicit_workshare = 8,
23 ompt_sync_region_barrier_implicit_parallel = 9,
24 ompt_sync_region_barrier_teams = 10
25 } ompt_sync_region_t;
C / C++
4 Format
C / C++
5 typedef enum ompt_target_data_op_t {
6 ompt_target_data_alloc = 1,
7 ompt_target_data_transfer_to_device = 2,
8 ompt_target_data_transfer_from_device = 3,
9 ompt_target_data_delete = 4,
10 ompt_target_data_associate = 5,
11 ompt_target_data_disassociate = 6,
12 ompt_target_data_alloc_async = 17,
13 ompt_target_data_transfer_to_device_async = 18,
14 ompt_target_data_transfer_from_device_async = 19,
15 ompt_target_data_delete_async = 20
16 } ompt_target_data_op_t;
C / C++
17 4.4.4.15 ompt_work_t
18 Summary
19 The ompt_work_t enumeration type defines the valid work type values.
20 Format
C / C++
21 typedef enum ompt_work_t {
22 ompt_work_loop = 1,
23 ompt_work_sections = 2,
24 ompt_work_single_executor = 3,
25 ompt_work_single_other = 4,
26 ompt_work_workshare = 5,
27 ompt_work_distribute = 6,
28 ompt_work_taskloop = 7,
29 ompt_work_scope = 8
30 } ompt_work_t;
C / C++
4 Format
C / C++
5 typedef enum ompt_mutex_t {
6 ompt_mutex_lock = 1,
7 ompt_mutex_test_lock = 2,
8 ompt_mutex_nest_lock = 3,
9 ompt_mutex_test_nest_lock = 4,
10 ompt_mutex_critical = 5,
11 ompt_mutex_atomic = 6,
12 ompt_mutex_ordered = 7
13 } ompt_mutex_t;
C / C++
14 4.4.4.17 ompt_native_mon_flag_t
15 Summary
16 The ompt_native_mon_flag_t enumeration type defines the valid native monitoring flag
17 values.
18 Format
C / C++
19 typedef enum ompt_native_mon_flag_t {
20 ompt_native_data_motion_explicit = 0x01,
21 ompt_native_data_motion_implicit = 0x02,
22 ompt_native_kernel_invocation = 0x04,
23 ompt_native_kernel_execution = 0x08,
24 ompt_native_driver = 0x10,
25 ompt_native_runtime = 0x20,
26 ompt_native_overhead = 0x40,
27 ompt_native_idleness = 0x80
28 } ompt_native_mon_flag_t;
C / C++
4 Format
C / C++
5 typedef enum ompt_task_flag_t {
6 ompt_task_initial = 0x00000001,
7 ompt_task_implicit = 0x00000002,
8 ompt_task_explicit = 0x00000004,
9 ompt_task_target = 0x00000008,
10 ompt_task_taskwait = 0x00000010,
11 ompt_task_undeferred = 0x08000000,
12 ompt_task_untied = 0x10000000,
13 ompt_task_final = 0x20000000,
14 ompt_task_mergeable = 0x40000000,
15 ompt_task_merged = 0x80000000
16 } ompt_task_flag_t;
C / C++
17 Description
18 The ompt_task_flag_t enumeration type defines valid task type values. The least significant
19 byte provides information about the general classification of the task. The other bits represent
20 properties of the task.
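Note – The following C sketch is a non-normative illustration of splitting a flags word into the classification byte and the property bits described above.
C / C++
#include <omp-tools.h>

static void classify_task(int flags)
{
    int kind       = flags & 0xFF; // initial, implicit, explicit, target, ...
    int undeferred = (flags & ompt_task_undeferred) != 0;
    int untied     = (flags & ompt_task_untied)     != 0;
    int final_task = (flags & ompt_task_final)      != 0;
    (void)kind; (void)undeferred; (void)untied; (void)final_task;
}
C / C++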
21 4.4.4.19 ompt_task_status_t
22 Summary
23 The ompt_task_status_t enumeration type indicates the reason that a task was switched
24 when it reached a task scheduling point.
25 Format
C / C++
26 typedef enum ompt_task_status_t {
27 ompt_task_complete = 1,
28 ompt_task_yield = 2,
29 ompt_task_cancel = 3,
30 ompt_task_detach = 4,
31 ompt_task_early_fulfill = 5,
32 ompt_task_late_fulfill = 6,
33 ompt_task_switch = 7,
34 ompt_taskwait_complete = 8
35 } ompt_task_status_t;
C / C++
16 4.4.4.20 ompt_target_t
17 Summary
18 The ompt_target_t enumeration type defines the valid target type values.
19 Format
C / C++
20 typedef enum ompt_target_t {
21 ompt_target = 1,
22 ompt_target_enter_data = 2,
23 ompt_target_exit_data = 3,
24 ompt_target_update = 4,
25
26 ompt_target_nowait = 9,
27 ompt_target_enter_data_nowait = 10,
28 ompt_target_exit_data_nowait = 11,
29 ompt_target_update_nowait = 12
30 } ompt_target_t;
C / C++
4 Format
C / C++
5 typedef enum ompt_parallel_flag_t {
6 ompt_parallel_invoker_program = 0x00000001,
7 ompt_parallel_invoker_runtime = 0x00000002,
8 ompt_parallel_league = 0x40000000,
9 ompt_parallel_team = 0x80000000
10 } ompt_parallel_flag_t;
C / C++
11 Description
12 The ompt_parallel_flag_t enumeration type defines valid invoker values, which indicate
13 how an outlined function is invoked.
14 The value ompt_parallel_invoker_program indicates that the outlined function
15 associated with implicit tasks for the region is invoked directly by the application on the primary
16 thread for a parallel region.
17 The value ompt_parallel_invoker_runtime indicates that the outlined function
18 associated with implicit tasks for the region is invoked by the runtime on the primary thread for a
19 parallel region.
20 The value ompt_parallel_league indicates that the callback is invoked due to the creation of
21 a league of teams by a teams construct.
22 The value ompt_parallel_team indicates that the callback is invoked due to the creation of a
23 team of threads by a parallel construct.
4 Format
C / C++
5 typedef enum ompt_target_map_flag_t {
6 ompt_target_map_flag_to = 0x01,
7 ompt_target_map_flag_from = 0x02,
8 ompt_target_map_flag_alloc = 0x04,
9 ompt_target_map_flag_release = 0x08,
10 ompt_target_map_flag_delete = 0x10,
11 ompt_target_map_flag_implicit = 0x20
12 } ompt_target_map_flag_t;
C / C++
13 4.4.4.23 ompt_dependence_type_t
14 Summary
15 The ompt_dependence_type_t enumeration type defines the valid task dependence type
16 values.
17 Format
C / C++
18 typedef enum ompt_dependence_type_t {
19 ompt_dependence_type_in = 1,
20 ompt_dependence_type_out = 2,
21 ompt_dependence_type_inout = 3,
22 ompt_dependence_type_mutexinoutset = 4,
23 ompt_dependence_type_source = 5,
24 ompt_dependence_type_sink = 6,
25 ompt_dependence_type_inoutset = 7
26 } ompt_dependence_type_t;
C / C++
4 Format
C / C++
5 typedef enum ompt_severity_t {
6 ompt_warning = 1,
7 ompt_fatal = 2
8 } ompt_severity_t;
C / C++
9 4.4.4.25 ompt_cancel_flag_t
10 Summary
11 The ompt_cancel_flag_t enumeration type defines the valid cancel flag values.
12 Format
C / C++
13 typedef enum ompt_cancel_flag_t {
14 ompt_cancel_parallel = 0x01,
15 ompt_cancel_sections = 0x02,
16 ompt_cancel_loop = 0x04,
17 ompt_cancel_taskgroup = 0x08,
18 ompt_cancel_activated = 0x10,
19 ompt_cancel_detected = 0x20,
20 ompt_cancel_discarded_task = 0x40
21 } ompt_cancel_flag_t;
C / C++
22 4.4.4.26 ompt_hwid_t
23 Summary
24 The ompt_hwid_t opaque type is a handle for a hardware identifier for a target device.
25 Format
C / C++
26 typedef uint64_t ompt_hwid_t;
C / C++
6 Cross References
7 • ompt_record_abstract_t type, see Section 4.4.3.3.
8 4.4.4.27 ompt_state_t
9 Summary
10 If the OMPT interface is in the active state then an OpenMP implementation must maintain thread
11 state information for each thread. The thread state maintained is an approximation of the
12 instantaneous state of a thread.
13 Format
C / C++
14 A thread state must be one of the values of the enumeration type ompt_state_t or an
15 implementation-defined state value of 512 or higher.
16 typedef enum ompt_state_t {
17 ompt_state_work_serial = 0x000,
18 ompt_state_work_parallel = 0x001,
19 ompt_state_work_reduction = 0x002,
20
21 ompt_state_wait_barrier = 0x010, // deprecated
23 ompt_state_wait_barrier_implicit_parallel = 0x011,
24 ompt_state_wait_barrier_implicit_workshare = 0x012,
25 ompt_state_wait_barrier_implicit = 0x013, // deprecated
27 ompt_state_wait_barrier_explicit = 0x014,
28 ompt_state_wait_barrier_implementation = 0x015,
29 ompt_state_wait_barrier_teams = 0x016,
30
31 ompt_state_wait_taskwait = 0x020,
32 ompt_state_wait_taskgroup = 0x021,
33
34 ompt_state_wait_mutex = 0x040,
35 ompt_state_wait_lock = 0x041,
36 ompt_state_wait_critical = 0x042,
37 ompt_state_wait_atomic = 0x043,
38 ompt_state_wait_ordered = 0x044,
25 4.4.4.28 ompt_frame_t
26 Summary
27 The ompt_frame_t type describes procedure frame information for an OpenMP task.
28 Format
C / C++
29 typedef struct ompt_frame_t {
30 ompt_data_t exit_frame;
31 ompt_data_t enter_frame;
32 int exit_frame_flags;
33 int enter_frame_flags;
34 } ompt_frame_t;
C / C++
25 Note – A monitoring tool that uses asynchronous sampling can observe values of exit_frame and
26 enter_frame at inconvenient times. Tools must be prepared to handle ompt_frame_t objects
27 observed just prior to when their field values will be set or cleared.
28
29 4.4.4.29 ompt_frame_flag_t
30 Summary
31 The ompt_frame_flag_t enumeration type defines valid frame information flags.
19 4.4.4.30 ompt_wait_id_t
20 Summary
21 The ompt_wait_id_t type describes wait identifiers for an OpenMP thread.
22 Format
C / C++
23 typedef uint64_t ompt_wait_id_t;
C / C++
24 Description
25 Each thread maintains a wait identifier of type ompt_wait_id_t. When a task that a thread
26 executes is waiting for mutual exclusion, the wait identifier of the thread indicates the reason that
27 the thread is waiting. A wait identifier may represent a critical section name, a lock, a program
28 variable accessed in an atomic region, or a synchronization object that is internal to an OpenMP
29 implementation. When a thread is not in a wait state then the value of the wait identifier of the
30 thread is undefined.
31 ompt_wait_id_none is defined as an instance of type ompt_wait_id_t with the value 0.
5 Restrictions
6 • Tool callbacks may not use OpenMP directives or call any runtime library routines described in
7 Section 3.
8 • Tool callbacks must exit by either returning to the caller or aborting.
13 Format
C / C++
14 typedef int (*ompt_initialize_t) (
15 ompt_function_lookup_t lookup,
16 int initial_device_num,
17 ompt_data_t *tool_data
18 );
C / C++
19 Description
20 To use the OMPT interface, an implementation of ompt_start_tool must return a non-null
21 pointer to an ompt_start_tool_result_t structure that contains a pointer to a tool
22 initializer function with type signature ompt_initialize_t. An OpenMP implementation will
23 call the initializer after fully initializing itself but before beginning execution of any OpenMP
24 construct or runtime library routine.
25 The initializer returns a non-zero value if it succeeds; otherwise the OMPT interface state changes
26 to inactive as described in Section 4.2.3.
27 Description of Arguments
28 The lookup argument is a callback to an OpenMP runtime routine that must be used to obtain a
29 pointer to each runtime entry point in the OMPT interface. The initial_device_num argument
30 provides the value of omp_get_initial_device(). The tool_data argument is a pointer to
31 the tool_data field in the ompt_start_tool_result_t structure that ompt_start_tool
32 returned.
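Note – The following C sketch is a non-normative illustration of an initializer that uses the lookup argument to obtain the ompt_set_callback entry point; on_thread_begin is a hypothetical tool callback.
C / C++
#include <omp-tools.h>

static ompt_set_callback_t set_callback;

static void on_thread_begin(ompt_thread_t thread_type,
                            ompt_data_t *thread_data)
{
    (void)thread_type; (void)thread_data; // placeholder callback body
}

static int my_initialize(ompt_function_lookup_t lookup,
                         int initial_device_num, ompt_data_t *tool_data)
{
    (void)initial_device_num; (void)tool_data;
    set_callback = (ompt_set_callback_t)lookup("ompt_set_callback");
    if (set_callback == NULL)
        return 0; // returning zero makes the OMPT interface inactive
    set_callback(ompt_callback_thread_begin,
                 (ompt_callback_t)on_thread_begin);
    return 1;
}
C / C++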
7 4.5.1.2 ompt_finalize_t
8 Summary
9 A tool implements a finalizer with the type signature ompt_finalize_t to finalize its use of the
10 OMPT interface.
11 Format
C / C++
12 typedef void (*ompt_finalize_t) (
13 ompt_data_t *tool_data
14 );
C / C++
15 Description
16 To use the OMPT interface, an implementation of ompt_start_tool must return a non-null
17 pointer to an ompt_start_tool_result_t structure that contains a non-null pointer to a tool
18 finalizer with type signature ompt_finalize_t. An OpenMP implementation must call the tool
19 finalizer after the last OMPT event as the OpenMP implementation shuts down.
20 Description of Arguments
21 The tool_data argument is a pointer to the tool_data field in the
22 ompt_start_tool_result_t structure returned by ompt_start_tool.
23 Cross References
24 • ompt_start_tool function, see Section 4.2.1.
25 • ompt_start_tool_result_t type, see Section 4.4.1.
26 • ompt_data_t type, see Section 4.4.4.4.
8 Cross References
9 • ompt_id_t type, see Section 4.4.4.3.
10 • ompt_data_t type, see Section 4.4.4.4.
11 4.5.2.1 ompt_callback_thread_begin_t
12 Summary
13 The ompt_callback_thread_begin_t type is used for callbacks that are dispatched when
14 native threads are created.
15 Format
C / C++
16 typedef void (*ompt_callback_thread_begin_t) (
17 ompt_thread_t thread_type,
18 ompt_data_t *thread_data
19 );
C / C++
20 Trace Record
C / C++
21 typedef struct ompt_record_thread_begin_t {
22 ompt_thread_t thread_type;
23 } ompt_record_thread_begin_t;
C / C++
24 Description of Arguments
25 The thread_type argument indicates the type of the new thread: initial, worker, or other. The
26 binding of the thread_data argument is the new thread.
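Note – The following C sketch is a non-normative illustration of a thread-begin callback; my_thread_info_t is a hypothetical tool-defined record attached through the thread_data binding.
C / C++
#include <stdlib.h>
#include <omp-tools.h>

typedef struct { ompt_thread_t type; unsigned long events; } my_thread_info_t;

static void on_thread_begin(ompt_thread_t thread_type,
                            ompt_data_t *thread_data)
{
    my_thread_info_t *info = (my_thread_info_t *)calloc(1, sizeof(*info));
    if (info != NULL)
        info->type = thread_type;
    thread_data->ptr = info; // tool-owned state for this native thread
}
C / C++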
7 4.5.2.2 ompt_callback_thread_end_t
8 Summary
9 The ompt_callback_thread_end_t type is used for callbacks that are dispatched when
10 native threads are destroyed.
11 Format
C / C++
12 typedef void (*ompt_callback_thread_end_t) (
13 ompt_data_t *thread_data
14 );
C / C++
15 Description of Arguments
16 The binding of the thread_data argument is the thread that will be destroyed.
17 Cross References
18 • parallel construct, see Section 2.6.
19 • teams construct, see Section 2.7.
20 • Initial task, see Section 2.12.5.
21 • ompt_record_ompt_t type, see Section 4.4.3.4.
22 • ompt_data_t type, see Section 4.4.4.4.
23 4.5.2.3 ompt_callback_parallel_begin_t
24 Summary
25 The ompt_callback_parallel_begin_t type is used for callbacks that are dispatched
26 when a parallel or teams region starts.
7 4.5.2.4 ompt_callback_parallel_end_t
8 Summary
9 The ompt_callback_parallel_end_t type is used for callbacks that are dispatched when a
10 parallel or teams region ends.
11 Format
C / C++
12 typedef void (*ompt_callback_parallel_end_t) (
13 ompt_data_t *parallel_data,
14 ompt_data_t *encountering_task_data,
15 int flags,
16 const void *codeptr_ra
17 );
C / C++
18 Trace Record
C / C++
19 typedef struct ompt_record_parallel_end_t {
20 ompt_id_t parallel_id;
21 ompt_id_t encountering_task_id;
22 int flags;
23 const void *codeptr_ra;
24 } ompt_record_parallel_end_t;
C / C++
25 Description of Arguments
26 The binding of the parallel_data argument is the parallel or teams region that is ending.
27 The binding of the encountering_task_data argument is the encountering task.
28 The flags argument indicates whether the execution of the region is inlined into the application or
29 invoked by the runtime and also whether it is a parallel or teams region. Values for flags are a
30 disjunction of elements in the enum ompt_parallel_flag_t.
7 Cross References
8 • parallel construct, see Section 2.6.
9 • teams construct, see Section 2.7.
10 • ompt_data_t type, see Section 4.4.4.4.
11 • ompt_parallel_flag_t type, see Section 4.4.4.21.
12 4.5.2.5 ompt_callback_work_t
13 Summary
14 The ompt_callback_work_t type is used for callbacks that are dispatched when worksharing
15 regions, loop-related regions, taskloop regions and scope regions begin and end.
16 Format
C / C++
17 typedef void (*ompt_callback_work_t) (
18 ompt_work_t wstype,
19 ompt_scope_endpoint_t endpoint,
20 ompt_data_t *parallel_data,
21 ompt_data_t *task_data,
22 uint64_t count,
23 const void *codeptr_ra
24 );
C / C++
25 Trace Record
C / C++
26 typedef struct ompt_record_work_t {
27 ompt_work_t wstype;
28 ompt_scope_endpoint_t endpoint;
29 ompt_id_t parallel_id;
30 ompt_id_t task_id;
31 uint64_t count;
32 const void *codeptr_ra;
33 } ompt_record_work_t;
C / C++
20 Cross References
21 • Worksharing constructs, see Section 2.10.
22 • Loop-related directives, see Section 2.11.
23 • Worksharing-Loop construct, see Section 2.11.4.
24 • taskloop construct, see Section 2.12.2.
25 • ompt_data_t type, see Section 4.4.4.4.
26 • ompt_scope_endpoint_t type, see Section 4.4.4.11.
27 • ompt_work_t type, see Section 4.4.4.15.
28 4.5.2.6 ompt_callback_dispatch_t
29 Summary
30 The ompt_callback_dispatch_t type is used for callbacks that are dispatched when a
31 thread begins to execute a section or loop iteration.
26 Cross References
27 • sections and section constructs, see Section 2.10.1.
28 • Worksharing-loop construct, see Section 2.11.4.
29 • taskloop construct, see Section 2.12.2.
30 • ompt_data_t type, see Section 4.4.4.4.
31 • ompt_dispatch_t type, see Section 4.4.4.12.
5 Format
C / C++
6 typedef void (*ompt_callback_task_create_t) (
7 ompt_data_t *encountering_task_data,
8 const ompt_frame_t *encountering_task_frame,
9 ompt_data_t *new_task_data,
10 int flags,
11 int has_dependences,
12 const void *codeptr_ra
13 );
C / C++
14 Trace Record
C / C++
15 typedef struct ompt_record_task_create_t {
16 ompt_id_t encountering_task_id;
17 ompt_id_t new_task_id;
18 int flags;
19 int has_dependences;
20 const void *codeptr_ra;
21 } ompt_record_task_create_t;
C / C++
22 Description of Arguments
23 The binding of the encountering_task_data argument is the encountering task.
24 The encountering_task_frame argument points to the frame object associated with the encountering
25 task. Accessing the frame object after the callback has returned can cause a data race.
26 The binding of the new_task_data argument is the generated task.
27 The flags argument indicates the kind of task (explicit or target) that is generated. Values for flags
28 are a disjunction of elements in the ompt_task_flag_t enumeration type.
29 The has_dependences argument is true if the generated task has dependences and false otherwise.
30 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
31 runtime routine implements the region associated with a callback that has type signature
32 ompt_callback_task_create_t then codeptr_ra contains the return address of the call to
33 that runtime routine. If the implementation of the region is inlined then codeptr_ra contains the
3 Cross References
4 • task construct, see Section 2.12.1.
5 • Initial task, see Section 2.12.5.
6 • ompt_data_t type, see Section 4.4.4.4.
7 • ompt_task_flag_t type, see Section 4.4.4.18.
8 • ompt_frame_t type, see Section 4.4.4.28.
9 4.5.2.8 ompt_callback_dependences_t
10 Summary
11 The ompt_callback_dependences_t type is used for callbacks that are related to
12 dependences and that are dispatched when new tasks are generated and when ordered constructs
13 are encountered.
14 Format
C / C++
15 typedef void (*ompt_callback_dependences_t) (
16 ompt_data_t *task_data,
17 const ompt_dependence_t *deps,
18 int ndeps
19 );
C / C++
20 Trace Record
C / C++
21 typedef struct ompt_record_dependences_t {
22 ompt_id_t task_id;
23 ompt_dependence_t dep;
24 int ndeps;
25 } ompt_record_dependences_t;
C / C++
26 Description of Arguments
27 The binding of the task_data argument is the generated task for a depend clause on a task construct,
28 the target task for a depend clause on a target construct or for a depend object in an
29 asynchronous runtime routine, or the encountering implicit task for a depend clause of the ordered
30 construct.
8 Cross References
9 • ordered construct, see Section 2.19.9.
10 • depend clause, see Section 2.19.11.
11 • ompt_data_t type, see Section 4.4.4.4.
12 • ompt_dependence_t type, see Section 4.4.4.9.
13 4.5.2.9 ompt_callback_task_dependence_t
14 Summary
15 The ompt_callback_task_dependence_t type is used for callbacks that are dispatched
16 when unfulfilled task dependences are encountered.
17 Format
C / C++
18 typedef void (*ompt_callback_task_dependence_t) (
19 ompt_data_t *src_task_data,
20 ompt_data_t *sink_task_data
21 );
C / C++
22 Trace Record
C / C++
23 typedef struct ompt_record_task_dependence_t {
24 ompt_id_t src_task_id;
25 ompt_id_t sink_task_id;
26 } ompt_record_task_dependence_t;
C / C++
27 Description of Arguments
28 The binding of the src_task_data argument is a running task with an outgoing dependence.
29 The binding of the sink_task_data argument is a task with an unsatisfied incoming dependence.
4 4.5.2.10 ompt_callback_task_schedule_t
5 Summary
6 The ompt_callback_task_schedule_t type is used for callbacks that are dispatched when
7 task scheduling decisions are made.
8 Format
C / C++
9 typedef void (*ompt_callback_task_schedule_t) (
10 ompt_data_t *prior_task_data,
11 ompt_task_status_t prior_task_status,
12 ompt_data_t *next_task_data
13 );
C / C++
14 Trace Record
C / C++
15 typedef struct ompt_record_task_schedule_t {
16 ompt_id_t prior_task_id;
17 ompt_task_status_t prior_task_status;
18 ompt_id_t next_task_id;
19 } ompt_record_task_schedule_t;
C / C++
20 Description of Arguments
21 The prior_task_status argument indicates the status of the task that arrived at a task scheduling
22 point.
23 The binding of the prior_task_data argument is the task that arrived at the scheduling point.
24 The binding of the next_task_data argument is the task that is resumed at the scheduling point.
25 This argument is NULL if the callback is dispatched for a task-fulfill event or if the callback signals
26 completion of a taskwait construct.
27 Cross References
28 • Task scheduling, see Section 2.12.6.
29 • ompt_data_t type, see Section 4.4.4.4.
30 • ompt_task_status_t type, see Section 4.4.4.19.
5 Format
C / C++
6 typedef void (*ompt_callback_implicit_task_t) (
7 ompt_scope_endpoint_t endpoint,
8 ompt_data_t *parallel_data,
9 ompt_data_t *task_data,
10 unsigned int actual_parallelism,
11 unsigned int index,
12 int flags
13 );
C / C++
14 Trace Record
C / C++
15 typedef struct ompt_record_implicit_task_t {
16 ompt_scope_endpoint_t endpoint;
17 ompt_id_t parallel_id;
18 ompt_id_t task_id;
19 unsigned int actual_parallelism;
20 unsigned int index;
21 int flags;
22 } ompt_record_implicit_task_t;
C / C++
23 Description of Arguments
24 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
25 scope.
26 The binding of the parallel_data argument is the current parallel or teams region. For the
27 implicit-task-end and the initial-task-end events, this argument is NULL.
28 The binding of the task_data argument is the implicit task that executes the structured block of the
29 parallel or teams region.
30 The actual_parallelism argument indicates the number of threads in the parallel region or the
31 number of teams in the teams region. For initial tasks that are not closely nested in a teams
32 construct, this argument is 1. For the implicit-task-end and the initial-task-end events, this
33 argument is 0.
5 Cross References
6 • parallel construct, see Section 2.6.
7 • teams construct, see Section 2.7.
8 • ompt_data_t type, see Section 4.4.4.4.
9 • ompt_scope_endpoint_t enumeration type, see Section 4.4.4.11.
10 4.5.2.12 ompt_callback_masked_t
11 Summary
12 The ompt_callback_masked_t type is used for callbacks that are dispatched when masked
13 regions start and end.
14 Format
C / C++
15 typedef void (*ompt_callback_masked_t) (
16 ompt_scope_endpoint_t endpoint,
17 ompt_data_t *parallel_data,
18 ompt_data_t *task_data,
19 const void *codeptr_ra
20 );
C / C++
21 Trace Record
C / C++
22 typedef struct ompt_record_masked_t {
23 ompt_scope_endpoint_t endpoint;
24 ompt_id_t parallel_id;
25 ompt_id_t task_id;
26 const void *codeptr_ra;
27 } ompt_record_masked_t;
C / C++
12 Cross References
13 • masked construct, see Section 2.8.
14 • ompt_data_t type, see Section 4.4.4.4.
15 • ompt_scope_endpoint_t type, see Section 4.4.4.11.
16 4.5.2.13 ompt_callback_sync_region_t
17 Summary
18 The ompt_callback_sync_region_t type is used for callbacks that are dispatched when
19 barrier regions, taskwait regions, and taskgroup regions begin and end and when waiting
20 begins and ends for them as well as for when reductions are performed.
21 Format
C / C++
22 typedef void (*ompt_callback_sync_region_t) (
23 ompt_sync_region_t kind,
24 ompt_scope_endpoint_t endpoint,
25 ompt_data_t *parallel_data,
26 ompt_data_t *task_data,
27 const void *codeptr_ra
28 );
C / C++
22 Cross References
23 • barrier construct, see Section 2.19.2.
24 • Implicit barriers, see Section 2.19.3.
25 • taskwait construct, see Section 2.19.5.
26 • taskgroup construct, see Section 2.19.6.
27 • Properties common to all reduction clauses, see Section 2.21.5.1.
28 • ompt_data_t type, see Section 4.4.4.4.
29 • ompt_scope_endpoint_t type, see Section 4.4.4.11.
30 • ompt_sync_region_t type, see Section 4.4.4.13.
6 Format
C / C++
7 typedef void (*ompt_callback_mutex_acquire_t) (
8 ompt_mutex_t kind,
9 unsigned int hint,
10 unsigned int impl,
11 ompt_wait_id_t wait_id,
12 const void *codeptr_ra
13 );
C / C++
14 Trace Record
C / C++
15 typedef struct ompt_record_mutex_acquire_t {
16 ompt_mutex_t kind;
17 unsigned int hint;
18 unsigned int impl;
19 ompt_wait_id_t wait_id;
20 const void *codeptr_ra;
21 } ompt_record_mutex_acquire_t;
C / C++
22 Description of Arguments
23 The kind argument indicates the kind of mutual exclusion event.
24 The hint argument indicates the hint that was provided when initializing an implementation of
25 mutual exclusion. If no hint is available when a thread initiates acquisition of mutual exclusion, the
26 runtime may supply omp_sync_hint_none as the value for hint.
27 The impl argument indicates the mechanism chosen by the runtime to implement the mutual
28 exclusion.
29 The wait_id argument indicates the object being awaited.
30 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
31 runtime routine implements the region associated with a callback that has type signature
32 ompt_callback_mutex_acquire_t then codeptr_ra contains the return address of the call
33 to that runtime routine. If the implementation of the region is inlined then codeptr_ra contains the
34 return address of the invocation of the callback. If attribution to source code is impossible or
35 inappropriate, codeptr_ra may be NULL.
8 4.5.2.15 ompt_callback_mutex_t
9 Summary
10 The ompt_callback_mutex_t type is used for callbacks that indicate important
11 synchronization events.
12 Format
C / C++
13 typedef void (*ompt_callback_mutex_t) (
14 ompt_mutex_t kind,
15 ompt_wait_id_t wait_id,
16 const void *codeptr_ra
17 );
C / C++
18 Trace Record
C / C++
19 typedef struct ompt_record_mutex_t {
20 ompt_mutex_t kind;
21 ompt_wait_id_t wait_id;
22 const void *codeptr_ra;
23 } ompt_record_mutex_t;
C / C++
24 Description of Arguments
25 The kind argument indicates the kind of mutual exclusion event.
26 The wait_id argument indicates the object being awaited.
27 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
28 runtime routine implements the region associated with a callback that has type signature
29 ompt_callback_mutex_t then codeptr_ra contains the return address of the call to that
30 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
31 address of the invocation of the callback. If attribution to source code is impossible or
32 inappropriate, codeptr_ra may be NULL.
11 4.5.2.16 ompt_callback_nest_lock_t
12 Summary
13 The ompt_callback_nest_lock_t type is used for callbacks that indicate that a thread that
14 owns a nested lock has performed an action related to the lock but has not relinquished ownership
15 of it.
16 Format
C / C++
17 typedef void (*ompt_callback_nest_lock_t) (
18 ompt_scope_endpoint_t endpoint,
19 ompt_wait_id_t wait_id,
20 const void *codeptr_ra
21 );
C / C++
22 Trace Record
C / C++
23 typedef struct ompt_record_nest_lock_t {
24 ompt_scope_endpoint_t endpoint;
25 ompt_wait_id_t wait_id;
26 const void *codeptr_ra;
27 } ompt_record_nest_lock_t;
C / C++
28 Description of Arguments
29 The endpoint argument indicates that the callback signals the beginning of a scope or the end of a
30 scope.
31 The wait_id argument indicates the object being awaited.
13 4.5.2.17 ompt_callback_flush_t
14 Summary
15 The ompt_callback_flush_t type is used for callbacks that are dispatched when flush
16 constructs are encountered.
17 Format
C / C++
18 typedef void (*ompt_callback_flush_t) (
19 ompt_data_t *thread_data,
20 const void *codeptr_ra
21 );
C / C++
22 Trace Record
C / C++
23 typedef struct ompt_record_flush_t {
24 const void *codeptr_ra;
25 } ompt_record_flush_t;
C / C++
26 Description of Arguments
27 The binding of the thread_data argument is the executing thread.
28 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
29 runtime routine implements the region associated with a callback that has type signature
30 ompt_callback_flush_t then codeptr_ra contains the return address of the call to that
31 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
32 address of the invocation of the callback. If attribution to source code is impossible or
33 inappropriate, codeptr_ra may be NULL.
4 4.5.2.18 ompt_callback_cancel_t
5 Summary
6 The ompt_callback_cancel_t type is used for callbacks that are dispatched for cancellation,
7 cancel and discarded-task events.
8 Format
C / C++
9 typedef void (*ompt_callback_cancel_t) (
10 ompt_data_t *task_data,
11 int flags,
12 const void *codeptr_ra
13 );
C / C++
14 Trace Record
C / C++
15 typedef struct ompt_record_cancel_t {
16 ompt_id_t task_id;
17 int flags;
18 const void *codeptr_ra;
19 } ompt_record_cancel_t;
C / C++
20 Description of Arguments
21 The binding of the task_data argument is the task that encounters a cancel construct, a
22 cancellation point construct, or a construct defined as having an implicit cancellation
23 point.
24 The flags argument, defined by the ompt_cancel_flag_t enumeration type, indicates whether
25 cancellation is activated by the current task, or detected as being activated by another task. The
26 construct that is being canceled is also described in the flags argument. When several constructs are
27 detected as being concurrently canceled, each corresponding bit in the argument will be set.
28 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
29 runtime routine implements the region associated with a callback that has type signature
30 ompt_callback_cancel_t then codeptr_ra contains the return address of the call to that
31 runtime routine. If the implementation of the region is inlined then codeptr_ra contains the return
32 address of the invocation of the callback. If attribution to source code is impossible or
33 inappropriate, codeptr_ra may be NULL.
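For illustration, a callback with this type signature might test individual bits of flags, as in the following sketch; the tool_cancel name is a placeholder and the flag names are those of the ompt_cancel_flag_t enumeration.
C / C++
#include <omp-tools.h>

// Minimal sketch of a callback with type signature ompt_callback_cancel_t;
// each bit of flags is tested against ompt_cancel_flag_t enumerators.
static void tool_cancel(ompt_data_t *task_data, int flags,
                        const void *codeptr_ra) {
  int activated_here = (flags & ompt_cancel_activated) != 0;
  if (flags & ompt_cancel_parallel)       { /* a parallel region is canceled  */ }
  if (flags & ompt_cancel_taskgroup)      { /* a taskgroup region is canceled */ }
  if (flags & ompt_cancel_discarded_task) { /* a task was discarded           */ }
  (void)task_data; (void)codeptr_ra; (void)activated_here;
}
C / C++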
3 4.5.2.19 ompt_callback_device_initialize_t
4 Summary
5 The ompt_callback_device_initialize_t type is used for callbacks that initialize
6 device tracing interfaces.
7 Format
C / C++
8 typedef void (*ompt_callback_device_initialize_t) (
9 int device_num,
10 const char *type,
11 ompt_device_t *device,
12 ompt_function_lookup_t lookup,
13 const char *documentation
14 );
C / C++
15 Description
16 Registration of a callback with type signature ompt_callback_device_initialize_t for
17 the ompt_callback_device_initialize event enables asynchronous collection of a trace
18 for a device. The OpenMP implementation invokes this callback after OpenMP is initialized for the
19 device but before execution of any OpenMP construct is started on the device.
20 Description of Arguments
21 The device_num argument identifies the logical device that is being initialized.
22 The type argument is a character string that indicates the type of the device. A device type string is
23 a semicolon-separated character string that includes at a minimum the vendor and model name of
24 the device. These names may be followed by a semicolon-separated sequence of properties that
25 describe the hardware or software of the device.
26 The device argument is a pointer to an opaque object that represents the target device instance.
27 Functions in the device tracing interface use this pointer to identify the device that is being
28 addressed.
29 The lookup argument points to a runtime callback that a tool must use to obtain pointers to runtime
30 entry points in the device’s OMPT tracing interface. If a device does not support tracing then
31 lookup is NULL.
32 The documentation argument is a string that describes how to use any device-specific runtime entry
33 points that can be obtained through the lookup argument. This documentation string may be a
pointer to external documentation, or it may be inline descriptions that include names and type signatures for any device-specific runtime entry points that are available through the lookup argument.
4 Constraints on Arguments
5 The type and documentation arguments must be immutable strings that are defined for the lifetime
6 of program execution.
7 Effect
8 A device initializer must fulfill several duties. First, the type argument should be used to determine
9 if any special knowledge about the hardware and/or software of a device is employed. Second, the
10 lookup argument should be used to look up pointers to runtime entry points in the OMPT tracing
11 interface for the device. Finally, these runtime entry points should be used to set up tracing for the
12 device.
13 Initialization of tracing for a target device is described in Section 4.2.5.
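For illustration, a device initializer might resemble the following sketch; the tool_* names are placeholders, and the choice to enable tracing of all OMPT events is an assumption of the sketch.
C / C++
#include <stdio.h>
#include <omp-tools.h>

// Buffer callbacks defined elsewhere by the tool
// (see Sections 4.5.2.23 and 4.5.2.24).
extern void tool_buffer_request(int device_num, ompt_buffer_t **buffer,
                                size_t *bytes);
extern void tool_buffer_complete(int device_num, ompt_buffer_t *buffer,
                                 size_t bytes, ompt_buffer_cursor_t begin,
                                 int buffer_owned);

static void tool_device_initialize(int device_num, const char *type,
                                   ompt_device_t *device,
                                   ompt_function_lookup_t lookup,
                                   const char *documentation) {
  if (lookup == NULL)            // the device does not support tracing
    return;

  // Obtain the tracing entry points of the device by name through lookup;
  // a complete tool would also look up and save ompt_get_record_ompt and
  // ompt_advance_buffer_cursor for use in its buffer completion callback.
  ompt_set_trace_ompt_t set_trace_ompt =
      (ompt_set_trace_ompt_t)lookup("ompt_set_trace_ompt");
  ompt_start_trace_t start_trace =
      (ompt_start_trace_t)lookup("ompt_start_trace");
  if (set_trace_ompt == NULL || start_trace == NULL)
    return;

  set_trace_ompt(device, 1, 0);  // enable OMPT trace records for all events
  start_trace(device, tool_buffer_request, tool_buffer_complete);
  printf("tracing started for device %d (%s)\n", device_num, type);
  (void)documentation;
}
C / C++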
14 Cross References
15 • ompt_function_lookup_t type, see Section 4.6.3.
16 4.5.2.20 ompt_callback_device_finalize_t
17 Summary
The ompt_callback_device_finalize_t type is used for callbacks that finalize device tracing interfaces.
20 Format
C / C++
21 typedef void (*ompt_callback_device_finalize_t) (
22 int device_num
23 );
C / C++
24 Description of Arguments
25 The device_num argument identifies the logical device that is being finalized.
26 Description
27 A registered callback with type signature ompt_callback_device_finalize_t is
28 dispatched for a device immediately prior to finalizing the device. Prior to dispatching a finalization
29 callback for a device on which tracing is active, the OpenMP implementation stops tracing on the
30 device and synchronously flushes all trace records for the device that have not yet been reported.
31 These trace records are flushed through one or more buffer completion callbacks with type
32 signature ompt_callback_buffer_complete_t as needed prior to the dispatch of the
33 callback with type signature ompt_callback_device_finalize_t.
3 4.5.2.21 ompt_callback_device_load_t
4 Summary
5 The ompt_callback_device_load_t type is used for callbacks that the OpenMP runtime
6 invokes to indicate that it has just loaded code onto the specified device.
7 Format
C / C++
8 typedef void (*ompt_callback_device_load_t) (
9 int device_num,
10 const char *filename,
11 int64_t offset_in_file,
12 void *vma_in_file,
13 size_t bytes,
14 void *host_addr,
15 void *device_addr,
16 uint64_t module_id
17 );
C / C++
18 Description of Arguments
19 The device_num argument specifies the device.
20 The filename argument indicates the name of a file in which the device code can be found. A NULL
21 filename indicates that the code is not available in a file in the file system.
22 The offset_in_file argument indicates an offset into filename at which the code can be found. A
23 value of -1 indicates that no offset is provided.
24 ompt_addr_none is defined as a pointer with the value ~0.
25 The vma_in_file argument indicates a virtual address in filename at which the code can be found. A
26 value of ompt_addr_none indicates that a virtual address in the file is not available.
27 The bytes argument indicates the size of the device code object in bytes.
28 The host_addr argument indicates the address at which a copy of the device code is available in
29 host memory. A value of ompt_addr_none indicates that a host code address is not available.
30 The device_addr argument indicates the address at which the device code has been loaded in device
31 memory. A value of ompt_addr_none indicates that a device code address is not available.
32 The module_id argument is an identifier that is associated with the device code object.
3 4.5.2.22 ompt_callback_device_unload_t
4 Summary
5 The ompt_callback_device_unload_t type is used for callbacks that the OpenMP
6 runtime invokes to indicate that it is about to unload code from the specified device.
7 Format
C / C++
8 typedef void (*ompt_callback_device_unload_t) (
9 int device_num,
10 uint64_t module_id
11 );
C / C++
12 Description of Arguments
13 The device_num argument specifies the device.
14 The module_id argument is an identifier that is associated with the device code object.
15 Cross References
16 • Device directives, see Section 2.14.
17 4.5.2.23 ompt_callback_buffer_request_t
18 Summary
19 The ompt_callback_buffer_request_t type is used for callbacks that are dispatched
20 when a buffer to store event records for a device is requested.
21 Format
C / C++
22 typedef void (*ompt_callback_buffer_request_t) (
23 int device_num,
24 ompt_buffer_t **buffer,
25 size_t *bytes
26 );
C / C++
27 Description
28 A callback with type signature ompt_callback_buffer_request_t requests a buffer to
29 store trace records for the specified device. A buffer request callback may set *bytes to 0 if it does
30 not provide a buffer. If a callback sets *bytes to 0, further recording of events for the device is
31 disabled until the next invocation of ompt_start_trace. This action causes the device to drop
32 future trace records until recording is restarted.
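For illustration, a minimal buffer request callback might simply allocate a fixed-size buffer, as in the following sketch; the buffer size and the tool_buffer_request name are placeholders.
C / C++
#include <stdlib.h>
#include <omp-tools.h>

#define TOOL_BUFFER_BYTES (1024 * 1024)   // illustrative buffer size

void tool_buffer_request(int device_num, ompt_buffer_t **buffer,
                         size_t *bytes) {
  *buffer = (ompt_buffer_t *)malloc(TOOL_BUFFER_BYTES);
  // Setting *bytes to 0 would disable further recording for the device.
  *bytes = (*buffer != NULL) ? TOOL_BUFFER_BYTES : 0;
  (void)device_num;
}
C / C++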
5 Cross References
6 • ompt_buffer_t type, see Section 4.4.4.7.
7 4.5.2.24 ompt_callback_buffer_complete_t
8 Summary
9 The ompt_callback_buffer_complete_t type is used for callbacks that are dispatched
10 when devices will not record any more trace records in an event buffer and all records written to the
11 buffer are valid.
12 Format
C / C++
13 typedef void (*ompt_callback_buffer_complete_t) (
14 int device_num,
15 ompt_buffer_t *buffer,
16 size_t bytes,
17 ompt_buffer_cursor_t begin,
18 int buffer_owned
19 );
C / C++
20 Description
21 A callback with type signature ompt_callback_buffer_complete_t provides a buffer that
22 contains trace records for the specified device. Typically, a tool will iterate through the records in
23 the buffer and process them.
24 The OpenMP implementation makes these callbacks on a thread that is not an OpenMP primary or
25 worker thread.
26 The callee may not delete the buffer if the buffer_owned argument is 0.
27 The buffer completion callback is not required to be async signal safe.
28 Description of Arguments
29 The device_num argument indicates the device for which the buffer contains events.
30 The buffer argument is the address of a buffer that was previously allocated by a buffer request
31 callback.
32 The bytes argument indicates the full size of the buffer.
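For illustration, a buffer completion callback might iterate over the records in the buffer with the ompt_get_record_ompt and ompt_advance_buffer_cursor entry points, as in the following sketch; the tool_* names are placeholders, and the sketch assumes that the tool saved the device pointer and these entry points in its device initializer.
C / C++
#include <stdio.h>
#include <stdlib.h>
#include <omp-tools.h>

// Assumed to have been saved by the tool's device initializer.
extern ompt_device_t *tool_device;
extern ompt_get_record_ompt_t tool_get_record_ompt;
extern ompt_advance_buffer_cursor_t tool_advance_buffer_cursor;

void tool_buffer_complete(int device_num, ompt_buffer_t *buffer, size_t bytes,
                          ompt_buffer_cursor_t begin, int buffer_owned) {
  ompt_buffer_cursor_t current = begin;
  do {
    ompt_record_ompt_t *rec = tool_get_record_ompt(buffer, current);
    if (rec != NULL)
      printf("device %d: record type %d at device time %llu\n", device_num,
             (int)rec->type, (unsigned long long)rec->time);
  } while (tool_advance_buffer_cursor(tool_device, buffer, bytes,
                                      current, &current));
  if (buffer_owned)   // the callee may delete the buffer only if it owns it
    free(buffer);
}
C / C++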
7 Cross References
8 • ompt_buffer_t type, see Section 4.4.4.7.
9 • ompt_buffer_cursor_t type, see Section 4.4.4.8.
16 Format
C / C++
17 typedef void (*ompt_callback_target_data_op_emi_t) (
18 ompt_scope_endpoint_t endpoint,
19 ompt_data_t *target_task_data,
20 ompt_data_t *target_data,
21 ompt_id_t *host_op_id,
22 ompt_target_data_op_t optype,
23 void *src_addr,
24 int src_device_num,
25 void *dest_addr,
26 int dest_device_num,
27 size_t bytes,
28 const void *codeptr_ra
29 );
29 Note – An OpenMP implementation may aggregate program variables and data operations upon
30 them. For instance, an OpenMP implementation may synthesize a composite to represent multiple
31 scalars and then allocate, free, or copy this composite as a whole rather than performing data
32 operations on each scalar individually. Thus, callbacks may not be dispatched as separate data
33 operations on each variable.
34
22 Restrictions
23 Restrictions to the ompt_callback_target_data_op_emi and
24 ompt_callback_target_data_op callbacks are as follows:
25 • These callbacks must not be registered at the same time.
26 Cross References
27 • map clause, see Section 2.21.7.1.
28 • ompt_id_t type, see Section 4.4.4.3.
29 • ompt_data_t type, see Section 4.4.4.4.
30 • ompt_scope_endpoint_t type, see Section 4.4.4.11.
31 • ompt_target_data_op_t type, see Section 4.4.4.14.
6 Format
C / C++
7 typedef void (*ompt_callback_target_emi_t) (
8 ompt_target_t kind,
9 ompt_scope_endpoint_t endpoint,
10 int device_num,
11 ompt_data_t *task_data,
12 ompt_data_t *target_task_data,
13 ompt_data_t *target_data,
14 const void *codeptr_ra
15 );
16 typedef void (*ompt_callback_target_t) (
17 ompt_target_t kind,
18 ompt_scope_endpoint_t endpoint,
19 int device_num,
20 ompt_data_t *task_data,
21 ompt_id_t target_id,
22 const void *codeptr_ra
23 );
C / C++
24 Trace Record
C / C++
25 typedef struct ompt_record_target_t {
26 ompt_target_t kind;
27 ompt_scope_endpoint_t endpoint;
28 int device_num;
29 ompt_id_t task_id;
30 ompt_id_t target_id;
31 const void *codeptr_ra;
32 } ompt_record_target_t;
C / C++
16 Restrictions
17 Restrictions to the ompt_callback_target_emi and ompt_callback_target callbacks
18 are as follows:
19 • These callbacks must not be registered at the same time.
20 Cross References
21 • target data construct, see Section 2.14.2.
22 • target enter data construct, see Section 2.14.3.
23 • target exit data construct, see Section 2.14.4.
24 • target construct, see Section 2.14.5.
25 • target update construct, see Section 2.14.6.
26 • ompt_id_t type, see Section 4.4.4.3.
27 • ompt_data_t type, see Section 4.4.4.4.
28 • ompt_scope_endpoint_t type, see Section 4.4.4.11.
29 • ompt_target_t type, see Section 4.4.4.20.
6 Format
C / C++
7 typedef void (*ompt_callback_target_map_emi_t) (
8 ompt_data_t *target_data,
9 unsigned int nitems,
10 void **host_addr,
11 void **device_addr,
12 size_t *bytes,
13 unsigned int *mapping_flags,
14 const void *codeptr_ra
15 );
16 typedef void (*ompt_callback_target_map_t) (
17 ompt_id_t target_id,
18 unsigned int nitems,
19 void **host_addr,
20 void **device_addr,
21 size_t *bytes,
22 unsigned int *mapping_flags,
23 const void *codeptr_ra
24 );
C / C++
25 Trace Record
C / C++
26 typedef struct ompt_record_target_map_t {
27 ompt_id_t target_id;
28 unsigned int nitems;
29 void **host_addr;
30 void **device_addr;
31 size_t *bytes;
32 unsigned int *mapping_flags;
33 const void *codeptr_ra;
34 } ompt_record_target_map_t;
C / C++
13 Description of Arguments
14 The binding of the target_data argument is the target region.
15 The nitems argument indicates the number of data mappings that this callback reports.
16 The host_addr argument indicates an array of host data addresses.
17 The device_addr argument indicates an array of device data addresses.
18 The bytes argument indicates an array of sizes of data.
19 The mapping_flags argument indicates the kind of data mapping. Flags for a mapping include one
20 or more values specified by the ompt_target_map_flag_t type.
21 The codeptr_ra argument relates the implementation of an OpenMP region to its source code. If a
22 runtime routine implements the region associated with a callback that has type signature
23 ompt_callback_target_map_t or ompt_callback_target_map_emi_t then
24 codeptr_ra contains the return address of the call to that runtime routine. If the implementation of
25 the region is inlined then codeptr_ra contains the return address of the invocation of the callback. If
26 attribution to source code is impossible or inappropriate, codeptr_ra may be NULL.
27 Restrictions
Restrictions to the ompt_callback_target_map_emi and ompt_callback_target_map callbacks are as follows:
30 • These callbacks must not be registered at the same time.
31 Cross References
32 • target data construct, see Section 2.14.2.
33 • target enter data construct, see Section 2.14.3.
34 • target exit data construct, see Section 2.14.4.
35 • target construct, see Section 2.14.5.
12 Format
C / C++
13 typedef void (*ompt_callback_target_submit_emi_t) (
14 ompt_scope_endpoint_t endpoint,
15 ompt_data_t *target_data,
16 ompt_id_t *host_op_id,
17 unsigned int requested_num_teams
18 );
19 typedef void (*ompt_callback_target_submit_t) (
20 ompt_id_t target_id,
21 ompt_id_t host_op_id,
22 unsigned int requested_num_teams
23 );
C / C++
24 Trace Record
C / C++
25 typedef struct ompt_record_target_kernel_t {
26 ompt_id_t host_op_id;
27 unsigned int requested_num_teams;
28 unsigned int granted_num_teams;
29 ompt_device_time_t end_time;
30 } ompt_record_target_kernel_t;
C / C++
5 Description of Arguments
6 The endpoint argument indicates that the callback signals the beginning or end of a scope.
7 The binding of the target_data argument is the target region.
The host_op_id argument points to a tool-controlled integer value, which identifies an initial task on a target device.
10 The requested_num_teams argument is the number of teams that the host requested to execute the
11 kernel. The actual number of teams that execute the kernel may be smaller and generally will not be
12 known until the kernel begins to execute on the device.
13 If ompt_set_trace_ompt has configured the device to trace kernel execution then the device
will log an ompt_record_target_kernel_t record in a trace. The fields in the record are as
15 follows:
• The host_op_id field contains a tool-controlled identifier that can be used to correlate an
17 ompt_record_target_kernel_t record with its associated
18 ompt_callback_target_submit_emi or ompt_callback_target_submit
19 callback on the host;
20 • The requested_num_teams field contains the number of teams that the host requested to execute
21 the kernel;
22 • The granted_num_teams field contains the number of teams that the device actually used to
23 execute the kernel;
24 • The time when the initial task began execution on the device is recorded in the time field of an
25 enclosing ompt_record_t structure; and
26 • The time when the initial task completed execution on the device is recorded in the end_time
27 field.
28 Restrictions
29 Restrictions to the ompt_callback_target_submit_emi and
30 ompt_callback_target_submit callbacks are as follows:
31 • These callbacks must not be registered at the same time.
6 4.5.2.29 ompt_callback_control_tool_t
7 Summary
8 The ompt_callback_control_tool_t type is used for callbacks that dispatch tool-control
9 events.
10 Format
C / C++
11 typedef int (*ompt_callback_control_tool_t) (
12 uint64_t command,
13 uint64_t modifier,
14 void *arg,
15 const void *codeptr_ra
16 );
C / C++
17 Trace Record
C / C++
18 typedef struct ompt_record_control_tool_t {
19 uint64_t command;
20 uint64_t modifier;
21 const void *codeptr_ra;
22 } ompt_record_control_tool_t;
C / C++
23 Description
24 Callbacks with type signature ompt_callback_control_tool_t may return any
25 non-negative value, which will be returned to the application as the return value of the
26 omp_control_tool call that triggered the callback.
27 Description of Arguments
28 The command argument passes a command from an application to a tool. Standard values for
29 command are defined by omp_control_tool_t in Section 3.14.
30 The modifier argument passes a command modifier from an application to a tool.
31 The command and modifier arguments may have tool-specific values. Tools must ignore command
32 values that they are not designed to handle.
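For illustration, a tool-control callback might dispatch on the standard command values, as in the following sketch; the tool_control name is a placeholder.
C / C++
#include <stdint.h>
#include <omp.h>
#include <omp-tools.h>

int tool_control(uint64_t command, uint64_t modifier, void *arg,
                 const void *codeptr_ra) {
  switch (command) {
  case omp_control_tool_start: /* start or restart monitoring */ break;
  case omp_control_tool_pause: /* pause monitoring            */ break;
  case omp_control_tool_flush: /* flush buffered results      */ break;
  case omp_control_tool_end:   /* end monitoring              */ break;
  default:                     /* ignore unhandled commands   */ break;
  }
  (void)modifier; (void)arg; (void)codeptr_ra;
  return 0;   // returned to the application by omp_control_tool
}
C / C++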
9 Constraints on Arguments
10 Tool-specific values for command must be ≥ 64.
11 Cross References
12 • Tool control routine and types, see Section 3.14.
13 4.5.2.30 ompt_callback_error_t
14 Summary
15 The ompt_callback_error_t type is used for callbacks that dispatch runtime-error events.
16 Format
C / C++
17 typedef void (*ompt_callback_error_t) (
18 ompt_severity_t severity,
19 const char *message,
20 size_t length,
21 const void *codeptr_ra
22 );
C / C++
23 Trace Record
C / C++
24 typedef struct ompt_record_error_t {
25 ompt_severity_t severity;
26 const char *message;
27 size_t length;
28 const void *codeptr_ra;
29 } ompt_record_error_t;
C / C++
30 Description
31 A thread dispatches a registered ompt_callback_error_t callback when an error directive
32 is encountered for which the at(execution) clause is specified.
11 Cross References
12 • error directive, see Section 2.5.4.
13 • ompt_severity_t enumeration type, see Section 4.4.4.24.
28 Binding
29 The binding thread set for each of the entry points in this section is the encountering thread unless
30 otherwise specified. The binding task set is the task executing on the encountering thread.
31 Restrictions
32 Restrictions on OMPT runtime entry points are as follows:
33 • OMPT runtime entry points must not be called from a signal handler on a native thread before a
34 native-thread-begin or after a native-thread-end event.
35 • OMPT device runtime entry points must not be called after a device-finalize event for that device.
6 4.6.1.1 ompt_enumerate_states_t
7 Summary
8 The ompt_enumerate_states_t type is the type signature of the
9 ompt_enumerate_states runtime entry point, which enumerates the thread states that an
10 OpenMP implementation supports.
11 Format
C / C++
12 typedef int (*ompt_enumerate_states_t) (
13 int current_state,
14 int *next_state,
15 const char **next_state_name
16 );
C / C++
17 Description
18 An OpenMP implementation may support only a subset of the states defined by the
19 ompt_state_t enumeration type. An OpenMP implementation may also support
20 implementation-specific states. The ompt_enumerate_states runtime entry point, which has
21 type signature ompt_enumerate_states_t, enables a tool to enumerate the supported thread
22 states.
23 When a supported thread state is passed as current_state, the runtime entry point assigns the next
24 thread state in the enumeration to the variable passed by reference in next_state and assigns the
25 name associated with that state to the character pointer passed by reference in next_state_name.
26 Whenever one or more states are left in the enumeration, the ompt_enumerate_states
27 runtime entry point returns 1. When the last state in the enumeration is passed as current_state,
28 ompt_enumerate_states returns 0, which indicates that the enumeration is complete.
29 Description of Arguments
30 The current_state argument must be a thread state that the OpenMP implementation supports. To
31 begin enumerating the supported states, a tool should pass ompt_state_undefined as
32 current_state. Subsequent invocations of ompt_enumerate_states should pass the value
33 assigned to the variable that was passed by reference in next_state to the previous call.
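For illustration, a tool might enumerate every supported state with a loop such as the following sketch; the print_supported_states name is a placeholder.
C / C++
#include <stdio.h>
#include <omp-tools.h>

// enumerate_states is assumed to have been obtained through the OMPT
// lookup function during tool initialization.
void print_supported_states(ompt_enumerate_states_t enumerate_states) {
  int state = ompt_state_undefined;
  const char *state_name;
  // Each call yields the next supported state; a return value of 0 ends
  // the enumeration.
  while (enumerate_states(state, &state, &state_name))
    printf("state 0x%x: %s\n", state, state_name);
}
C / C++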
7 Constraints on Arguments
8 Any string returned through the next_state_name argument must be immutable and defined for the
9 lifetime of program execution.
10 Cross References
11 • ompt_state_t type, see Section 4.4.4.27.
12 4.6.1.2 ompt_enumerate_mutex_impls_t
13 Summary
14 The ompt_enumerate_mutex_impls_t type is the type signature of the
15 ompt_enumerate_mutex_impls runtime entry point, which enumerates the kinds of mutual
16 exclusion implementations that an OpenMP implementation employs.
17 Format
C / C++
18 typedef int (*ompt_enumerate_mutex_impls_t) (
19 int current_impl,
20 int *next_impl,
21 const char **next_impl_name
22 );
C / C++
23 Description
24 Mutual exclusion for locks, critical sections, and atomic regions may be implemented in
25 several ways. The ompt_enumerate_mutex_impls runtime entry point, which has type
26 signature ompt_enumerate_mutex_impls_t, enables a tool to enumerate the supported
27 mutual exclusion implementations.
28 When a supported mutex implementation is passed as current_impl, the runtime entry point assigns
29 the next mutex implementation in the enumeration to the variable passed by reference in next_impl
30 and assigns the name associated with that mutex implementation to the character pointer passed by
31 reference in next_impl_name.
32 Whenever one or more mutex implementations are left in the enumeration, the
33 ompt_enumerate_mutex_impls runtime entry point returns 1. When the last mutex
3 Description of Arguments
4 The current_impl argument must be a mutex implementation that an OpenMP implementation
5 supports. To begin enumerating the supported mutex implementations, a tool should pass
6 ompt_mutex_impl_none as current_impl. Subsequent invocations of
7 ompt_enumerate_mutex_impls should pass the value assigned to the variable that was
8 passed in next_impl to the previous call.
9 The value ompt_mutex_impl_none is reserved to indicate an invalid mutex implementation.
10 ompt_mutex_impl_none is defined as an integer with the value 0.
11 The next_impl argument is a pointer to an integer in which ompt_enumerate_mutex_impls
12 returns the value of the next mutex implementation in the enumeration.
13 The next_impl_name argument is a pointer to a character string pointer in which
14 ompt_enumerate_mutex_impls returns a string that describes the next mutex
15 implementation.
16 Constraints on Arguments
17 Any string returned through the next_impl_name argument must be immutable and defined for the
18 lifetime of a program execution.
19 Cross References
20 • ompt_mutex_t type, see Section 4.4.4.16.
21 4.6.1.3 ompt_set_callback_t
22 Summary
23 The ompt_set_callback_t type is the type signature of the ompt_set_callback runtime
24 entry point, which registers a pointer to a tool callback that an OpenMP implementation invokes
25 when a host OpenMP event occurs.
26 Format
C / C++
27 typedef ompt_set_result_t (*ompt_set_callback_t) (
28 ompt_callbacks_t event,
29 ompt_callback_t callback
30 );
C / C++
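For illustration, a tool might register a callback during initialization as in the following sketch; the tool_cancel and register_tool_callbacks names are placeholders, and set_callback is assumed to have been obtained with lookup("ompt_set_callback").
C / C++
#include <omp-tools.h>

// A callback with type signature ompt_callback_cancel_t defined by the tool.
extern void tool_cancel(ompt_data_t *task_data, int flags,
                        const void *codeptr_ra);

void register_tool_callbacks(ompt_set_callback_t set_callback) {
  ompt_set_result_t rc =
      set_callback(ompt_callback_cancel, (ompt_callback_t)tool_cancel);
  if (rc == ompt_set_never) {
    // The implementation will never dispatch the cancel callback.
  }
}
C / C++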
24 4.6.1.4 ompt_get_callback_t
25 Summary
26 The ompt_get_callback_t type is the type signature of the ompt_get_callback runtime
27 entry point, which retrieves a pointer to a registered tool callback routine (if any) that an OpenMP
28 implementation invokes when a host OpenMP event occurs.
29 Format
C / C++
30 typedef int (*ompt_get_callback_t) (
31 ompt_callbacks_t event,
32 ompt_callback_t *callback
33 );
C / C++
18 4.6.1.5 ompt_get_thread_data_t
19 Summary
20 The ompt_get_thread_data_t type is the type signature of the
21 ompt_get_thread_data runtime entry point, which returns the address of the thread data
22 object for the current thread.
23 Format
C / C++
24 typedef ompt_data_t *(*ompt_get_thread_data_t) (void);
C / C++
25 Description
26 Each OpenMP thread can have an associated thread data object of type ompt_data_t. The
27 ompt_get_thread_data runtime entry point, which has type signature
28 ompt_get_thread_data_t, retrieves a pointer to the thread data object, if any, that is
29 associated with the current thread. A tool may use a pointer to an OpenMP thread’s data object that
30 ompt_get_thread_data retrieves to inspect or to modify the value of the data object. When
31 an OpenMP thread is created, its data object is initialized with value ompt_data_none.
32 This runtime entry point is async signal safe.
33 Cross References
34 • ompt_data_t type, see Section 4.4.4.4.
6 Format
C / C++
7 typedef int (*ompt_get_num_procs_t) (void);
C / C++
8 Binding
9 The binding thread set is all threads on the host device.
10 Description
11 The ompt_get_num_procs runtime entry point, which has type signature
12 ompt_get_num_procs_t, returns the number of processors that are available on the host
13 device at the time the routine is called. This value may change between the time that it is
14 determined and the time that it is read in the calling context due to system actions outside the
15 control of the OpenMP implementation.
16 This runtime entry point is async signal safe.
17 4.6.1.7 ompt_get_num_places_t
18 Summary
19 The ompt_get_num_places_t type is the type signature of the ompt_get_num_places
20 runtime entry point, which returns the number of places currently available to the execution
21 environment in the place list.
22 Format
C / C++
23 typedef int (*ompt_get_num_places_t) (void);
C / C++
24 Binding
25 The binding thread set is all threads on a device.
26 Description
27 The ompt_get_num_places runtime entry point, which has type signature
28 ompt_get_num_places_t, returns the number of places in the place list. This value is
29 equivalent to the number of places in the place-partition-var ICV in the execution environment of
30 the initial task.
31 This runtime entry point is async signal safe.
4 4.6.1.8 ompt_get_place_proc_ids_t
5 Summary
The ompt_get_place_proc_ids_t type is the type signature of the ompt_get_place_proc_ids runtime entry point, which returns the numerical
8 identifiers of the processors that are available to the execution environment in the specified place.
9 Format
C / C++
10 typedef int (*ompt_get_place_proc_ids_t) (
11 int place_num,
12 int ids_size,
13 int *ids
14 );
C / C++
15 Binding
16 The binding thread set is all threads on a device.
17 Description
18 The ompt_get_place_proc_ids runtime entry point, which has type signature
19 ompt_get_place_proc_ids_t, returns the numerical identifiers of each processor that is
20 associated with the specified place. These numerical identifiers are non-negative, and their meaning
21 is implementation defined.
22 Description of Arguments
23 The place_num argument specifies the place that is being queried.
24 The ids argument is an array in which the routine can return a vector of processor identifiers in the
25 specified place.
26 The ids_size argument indicates the size of the result array that is specified by ids.
27 Effect
28 If the ids array of size ids_size is large enough to contain all identifiers then they are returned in ids
29 and their order in the array is implementation defined. Otherwise, if the ids array is too small, the
30 values in ids when the function returns are unspecified. The routine always returns the number of
31 numerical identifiers of the processors that are available to the execution environment in the
32 specified place.
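For illustration, a tool might first query the number of identifiers and then retrieve them, as in the following sketch; the print_place_procs name is a placeholder, and passing a zero-sized ids array for the initial query is an assumption of the sketch.
C / C++
#include <stdio.h>
#include <stdlib.h>
#include <omp-tools.h>

// get_place_proc_ids is assumed to have been obtained through the OMPT
// lookup function.
void print_place_procs(ompt_get_place_proc_ids_t get_place_proc_ids,
                       int place_num) {
  int count = get_place_proc_ids(place_num, 0, NULL);  // query the count only
  if (count <= 0)
    return;
  int *ids = (int *)malloc((size_t)count * sizeof(int));
  if (ids != NULL && get_place_proc_ids(place_num, count, ids) == count) {
    for (int i = 0; i < count; i++)
      printf("place %d: processor %d\n", place_num, ids[i]);
  }
  free(ids);
}
C / C++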
6 Format
C / C++
7 typedef int (*ompt_get_place_num_t) (void);
C / C++
8 Description
9 When the current thread is bound to a place, ompt_get_place_num returns the place number
10 associated with the thread. The returned value is between 0 and one less than the value returned by
11 ompt_get_num_places, inclusive. When the current thread is not bound to a place, the routine
12 returns -1.
13 This runtime entry point is async signal safe.
14 4.6.1.10 ompt_get_partition_place_nums_t
15 Summary
16 The ompt_get_partition_place_nums_t type is the type signature of the
17 ompt_get_partition_place_nums runtime entry point, which returns a list of place
18 numbers that correspond to the places in the place-partition-var ICV of the innermost implicit task.
19 Format
C / C++
20 typedef int (*ompt_get_partition_place_nums_t) (
21 int place_nums_size,
22 int *place_nums
23 );
C / C++
24 Description
25 The ompt_get_partition_place_nums runtime entry point, which has type signature
26 ompt_get_partition_place_nums_t, returns a list of place numbers that correspond to
27 the places in the place-partition-var ICV of the innermost implicit task.
28 This runtime entry point is async signal safe.
5 Effect
6 If the place_nums array of size place_nums_size is large enough to contain all identifiers then they
7 are returned in place_nums and their order in the array is implementation defined. Otherwise, if the
8 place_nums array is too small, the values in place_nums when the function returns are unspecified.
9 The routine always returns the number of places in the place-partition-var ICV of the innermost
10 implicit task.
11 Cross References
12 • place-partition-var ICV, see Section 2.4.
13 • OMP_PLACES environment variable, see Section 6.5.
14 4.6.1.11 ompt_get_proc_id_t
15 Summary
16 The ompt_get_proc_id_t type is the type signature of the ompt_get_proc_id runtime
17 entry point, which returns the numerical identifier of the processor of the current thread.
18 Format
C / C++
19 typedef int (*ompt_get_proc_id_t) (void);
C / C++
20 Description
21 The ompt_get_proc_id runtime entry point, which has type signature
22 ompt_get_proc_id_t, returns the numerical identifier of the processor of the current thread.
23 A defined numerical identifier is non-negative, and its meaning is implementation defined. A
24 negative number indicates a failure to retrieve the numerical identifier.
25 This runtime entry point is async signal safe.
26 4.6.1.12 ompt_get_state_t
27 Summary
28 The ompt_get_state_t type is the type signature of the ompt_get_state runtime entry
29 point, which returns the state and the wait identifier of the current thread.
17 Description of Arguments
18 The wait_id argument is a pointer to an opaque handle that is available to receive the value of the
19 wait identifier of the thread. If wait_id is not NULL then the entry point assigns the value of the
20 wait identifier of the thread to the object to which wait_id points. If the returned state is not one of
the specified wait states then the value of the opaque object to which wait_id points is undefined after
22 the call.
23 Constraints on Arguments
24 The argument passed to the entry point must be a reference to a variable of the specified type or
25 NULL.
26 Cross References
27 • ompt_state_t type, see Section 4.4.4.27.
28 • ompt_wait_id_t type, see Section 4.4.4.30.
29 • ompt_enumerate_states_t type, see Section 4.6.1.1.
30 4.6.1.13 ompt_get_parallel_info_t
31 Summary
32 The ompt_get_parallel_info_t type is the type signature of the
33 ompt_get_parallel_info runtime entry point, which returns information about the parallel
34 region, if any, at the specified ancestor level for the current execution context.
27 Description of Arguments
28 The ancestor_level argument specifies the parallel region of interest by its ancestor level. Ancestor
29 level 0 refers to the innermost parallel region; information about enclosing parallel regions may be
30 obtained using larger values for ancestor_level.
31 The parallel_data argument returns the parallel data if the argument is not NULL.
32 The team_size argument returns the team size if the argument is not NULL.
8 Constraints on Arguments
9 While argument ancestor_level is passed by value, all other arguments to the entry point must be
10 pointers to variables of the specified types or NULL.
11 Cross References
12 • ompt_data_t type, see Section 4.4.4.4.
13 4.6.1.14 ompt_get_task_info_t
14 Summary
15 The ompt_get_task_info_t type is the type signature of the ompt_get_task_info
16 runtime entry point, which returns information about the task, if any, at the specified ancestor level
17 in the current execution context.
18 Format
C / C++
19 typedef int (*ompt_get_task_info_t) (
20 int ancestor_level,
21 int *flags,
22 ompt_data_t **task_data,
23 ompt_frame_t **task_frame,
24 ompt_data_t **parallel_data,
25 int *thread_num
26 );
C / C++
27 Description
28 During execution, an OpenMP thread may be executing an OpenMP task. Additionally, the stack of
29 the thread may contain procedure frames that are associated with suspended OpenMP tasks or
30 OpenMP runtime system routines. To obtain information about any task on the stack of the current
31 thread, a tool uses the ompt_get_task_info runtime entry point, which has type signature
32 ompt_get_task_info_t.
33 Ancestor level 0 refers to the active task; information about other tasks with associated frames
34 present on the stack in the current execution context may be queried at higher ancestor levels.
13 Description of Arguments
14 The ancestor_level argument specifies the task region of interest by its ancestor level. Ancestor
15 level 0 refers to the active task; information about ancestor tasks found in the current execution
16 context may be queried at higher ancestor levels.
17 The flags argument returns the task type if the argument is not NULL.
18 The task_data argument returns the task data if the argument is not NULL.
19 The task_frame argument returns the task frame pointer if the argument is not NULL.
20 The parallel_data argument returns the parallel data if the argument is not NULL.
21 The thread_num argument returns the thread number if the argument is not NULL.
22 Effect
23 If the runtime entry point returns 0 or 1, no argument is modified. Otherwise,
24 ompt_get_task_info has the following effects:
25 • If a non-null value was passed for flags then the value returned in the integer to which flags
26 points represents the type of the task at the specified level; possible task types include initial,
27 implicit, explicit, and target tasks;
28 • If a non-null value was passed for task_data then the value that is returned in the object to which
29 it points is a pointer to a data word that is associated with the task at the specified level;
30 • If a non-null value was passed for task_frame then the value that is returned in the object to
31 which task_frame points is a pointer to the ompt_frame_t structure that is associated with the
32 task at the specified level;
33 • If a non-null value was passed for parallel_data then the value that is returned in the object to
34 which parallel_data points is a pointer to a data word that is associated with the parallel region
35 that contains the task at the specified level or, if the task at the specified level is an initial task,
NULL; and
• If a non-null value was passed for thread_num then the value that is returned in the integer to which thread_num points is the number of the thread in the team that is executing the parallel region to which the task at the specified level binds.
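For illustration, a tool might walk the task stack from the active task upward as in the following sketch; the print_task_stack name is a placeholder.
C / C++
#include <stdio.h>
#include <omp-tools.h>

// get_task_info is assumed to have been obtained through the OMPT lookup
// function.
void print_task_stack(ompt_get_task_info_t get_task_info) {
  for (int level = 0;; level++) {
    int flags = 0, thread_num = -1;
    ompt_data_t *task_data = NULL, *parallel_data = NULL;
    ompt_frame_t *frame = NULL;
    int rc = get_task_info(level, &flags, &task_data, &frame,
                           &parallel_data, &thread_num);
    if (rc == 0)   // no task exists at this ancestor level
      break;
    if (rc == 2)   // a task exists and its information is available
      printf("level %d: flags=0x%x thread_num=%d\n", level, flags, thread_num);
  }
}
C / C++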
4 Constraints on Arguments
5 While argument ancestor_level is passed by value, all other arguments to
6 ompt_get_task_info must be pointers to variables of the specified types or NULL.
7 Cross References
8 • ompt_data_t type, see Section 4.4.4.4.
9 • ompt_task_flag_t type, see Section 4.4.4.18.
10 • ompt_frame_t type, see Section 4.4.4.28.
11 4.6.1.15 ompt_get_task_memory_t
12 Summary
13 The ompt_get_task_memory_t type is the type signature of the
14 ompt_get_task_memory runtime entry point, which returns information about memory ranges
15 that are associated with the task.
16 Format
C / C++
17 typedef int (*ompt_get_task_memory_t)(
18 void **addr,
19 size_t *size,
20 int block
21 );
C / C++
22 Description
23 During execution, an OpenMP thread may be executing an OpenMP task. The OpenMP
24 implementation must preserve the data environment from the creation of the task for the execution
25 of the task. The ompt_get_task_memory runtime entry point, which has type signature
26 ompt_get_task_memory_t, provides information about the memory ranges used to store the
27 data environment for the current task.
28 Multiple memory ranges may be used to store these data. The block argument supports iteration
29 over these memory ranges.
30 The ompt_get_task_memory runtime entry point returns 1 if more memory ranges are
31 available, and 0 otherwise. If no memory is used for a task, size is set to 0. In this case, addr is
32 unspecified.
33 This runtime entry point is async signal safe.
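For illustration, a tool might iterate over the memory ranges of the current task as in the following sketch; the print_task_memory name is a placeholder.
C / C++
#include <stdio.h>
#include <omp-tools.h>

// get_task_memory is assumed to have been obtained through the OMPT lookup
// function.
void print_task_memory(ompt_get_task_memory_t get_task_memory) {
  void *addr = NULL;
  size_t size = 0;
  int block = 0;
  int more;
  do {
    more = get_task_memory(&addr, &size, block);
    if (size > 0)   // size is 0 if no memory is used for this block
      printf("block %d: %p, %zu bytes\n", block, addr, size);
    block++;
  } while (more);
}
C / C++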
6 4.6.1.16 ompt_get_target_info_t
7 Summary
8 The ompt_get_target_info_t type is the type signature of the
9 ompt_get_target_info runtime entry point, which returns identifiers that specify a thread’s
10 current target region and target operation ID, if any.
11 Format
C / C++
12 typedef int (*ompt_get_target_info_t) (
13 uint64_t *device_num,
14 ompt_id_t *target_id,
15 ompt_id_t *host_op_id
16 );
C / C++
17 Description
18 The ompt_get_target_info entry point, which has type signature
19 ompt_get_target_info_t, returns 1 if the current thread is in a target region and 0
20 otherwise. If the entry point returns 0 then the values of the variables passed by reference as its
21 arguments are undefined.
22 If the current thread is in a target region then ompt_get_target_info returns information
23 about the current device, active target region, and active host operation, if any.
24 This runtime entry point is async signal safe.
25 Description of Arguments
26 The device_num argument returns the device number if the current thread is in a target region.
27 The target_id argument returns the target region identifier if the current thread is in a target
28 region.
29 If the current thread is in the process of initiating an operation on a target device (for example,
30 copying data to or from an accelerator or launching a kernel), then host_op_id returns the identifier
31 for the operation; otherwise, host_op_id returns ompt_id_none.
3 Cross References
4 • ompt_id_t type, see Section 4.4.4.3.
5 4.6.1.17 ompt_get_num_devices_t
6 Summary
7 The ompt_get_num_devices_t type is the type signature of the
8 ompt_get_num_devices runtime entry point, which returns the number of available devices.
9 Format
C / C++
10 typedef int (*ompt_get_num_devices_t) (void);
C / C++
11 Description
12 The ompt_get_num_devices runtime entry point, which has type signature
13 ompt_get_num_devices_t, returns the number of devices available to an OpenMP program.
14 This runtime entry point is async signal safe.
15 4.6.1.18 ompt_get_unique_id_t
16 Summary
17 The ompt_get_unique_id_t type is the type signature of the ompt_get_unique_id
18 runtime entry point, which returns a unique number.
19 Format
C / C++
20 typedef uint64_t (*ompt_get_unique_id_t) (void);
C / C++
21 Description
22 The ompt_get_unique_id runtime entry point, which has type signature
23 ompt_get_unique_id_t, returns a number that is unique for the duration of an OpenMP
24 program. Successive invocations may not result in consecutive or even increasing numbers.
25 This runtime entry point is async signal safe.
23 4.6.2.1 ompt_get_device_num_procs_t
24 Summary
25 The ompt_get_device_num_procs_t type is the type signature of the
26 ompt_get_device_num_procs runtime entry point, which returns the number of processors
27 currently available to the execution environment on the specified device.
28 Format
C / C++
29 typedef int (*ompt_get_device_num_procs_t) (
30 ompt_device_t *device
31 );
C / C++
7 Description of Arguments
8 The device argument is a pointer to an opaque object that represents the target device instance. The
9 pointer to the device instance object is used by functions in the device tracing interface to identify
10 the device being addressed.
11 Cross References
12 • ompt_device_t type, see Section 4.4.4.5.
13 4.6.2.2 ompt_get_device_time_t
14 Summary
15 The ompt_get_device_time_t type is the type signature of the
16 ompt_get_device_time runtime entry point, which returns the current time on the specified
17 device.
18 Format
C / C++
19 typedef ompt_device_time_t (*ompt_get_device_time_t) (
20 ompt_device_t *device
21 );
C / C++
22 Description
23 Host and target devices are typically distinct and run independently. If host and target devices are
24 different hardware components, they may use different clock generators. For this reason, a common
25 time base for ordering host-side and device-side events may not be available.
26 The ompt_get_device_time runtime entry point, which has type signature
27 ompt_get_device_time_t, returns the current time on the specified device. A tool can use
28 this information to align time stamps from different devices.
29 Description of Arguments
30 The device argument is a pointer to an opaque object that represents the target device instance. The
31 pointer to the device instance object is used by functions in the device tracing interface to identify
32 the device being addressed.
4 4.6.2.3 ompt_translate_time_t
5 Summary
6 The ompt_translate_time_t type is the type signature of the ompt_translate_time
7 runtime entry point, which translates a time value that is obtained from the specified device to a
8 corresponding time value on the host device.
9 Format
C / C++
10 typedef double (*ompt_translate_time_t) (
11 ompt_device_t *device,
12 ompt_device_time_t time
13 );
C / C++
14 Description
15 The ompt_translate_time runtime entry point, which has type signature
16 ompt_translate_time_t, translates a time value obtained from the specified device to a
17 corresponding time value on the host device. The returned value for the host time has the same
18 meaning as the value returned from omp_get_wtime.
19
Note – The accuracy of time translations may degrade if they are not performed promptly after a device time value is received and if either the host or the device varies its clock speed. Prompt translation of device times to host times is recommended.
23
24 Description of Arguments
25 The device argument is a pointer to an opaque object that represents the target device instance. The
26 pointer to the device instance object is used by functions in the device tracing interface to identify
27 the device being addressed.
28 The time argument is a time from the specified device.
29 Cross References
30 • omp_get_wtime routine, see Section 3.10.1.
31 • ompt_device_t type, see Section 4.4.4.5.
32 • ompt_device_time_t type, see Section 4.4.4.6.
6 Format
C / C++
7 typedef ompt_set_result_t (*ompt_set_trace_ompt_t) (
8 ompt_device_t *device,
9 unsigned int enable,
10 unsigned int etype
11 );
C / C++
12 Description of Arguments
13 The device argument points to an opaque object that represents the target device instance. Functions
14 in the device tracing interface use this pointer to identify the device that is being addressed.
15 The etype argument indicates the events to which the invocation of ompt_set_trace_ompt
16 applies. If the value of etype is 0 then the invocation applies to all events. If etype is positive then it
17 applies to the event in ompt_callbacks_t that matches that value.
18 The enable argument indicates whether tracing should be enabled or disabled for the event or events
19 that the etype argument specifies. A positive value for enable indicates that recording should be
20 enabled; a value of 0 for enable indicates that recording should be disabled.
21 Restrictions
22 Restrictions on the ompt_set_trace_ompt runtime entry point are as follows:
23 • The entry point must not return ompt_set_sometimes_paired.
24 Cross References
25 • Tracing activity on target devices with OMPT, see Section 4.2.5.
26 • ompt_callbacks_t type, see Section 4.4.2.
27 • ompt_set_result_t type, see Section 4.4.4.2.
28 • ompt_device_t type, see Section 4.4.4.5.
6 Format
C / C++
7 typedef ompt_set_result_t (*ompt_set_trace_native_t) (
8 ompt_device_t *device,
9 int enable,
10 int flags
11 );
C / C++
12 Description
13 This interface is designed for use by a tool that cannot directly use native control functions for the
14 device. If a tool can directly use the native control functions then it can invoke native control
15 functions directly using pointers that the lookup function associated with the device provides and
16 that are described in the documentation string that is provided to the device initializer callback.
17 Description of Arguments
18 The device argument points to an opaque object that represents the target device instance. Functions
19 in the device tracing interface use this pointer to identify the device that is being addressed.
20 The enable argument indicates whether this invocation should enable or disable recording of events.
21 The flags argument specifies the kinds of native device monitoring to enable or to disable. Each
22 kind of monitoring is specified by a flag bit. Flags can be composed by using logical or to combine
23 enumeration values from type ompt_native_mon_flag_t.
24 To start, to pause, to flush, or to stop tracing for a specific target device associated with device, a
25 tool invokes the ompt_start_trace, ompt_pause_trace, ompt_flush_trace, or
26 ompt_stop_trace runtime entry point for the device.
27 Restrictions
28 Restrictions on the ompt_set_trace_native runtime entry point are as follows:
29 • The entry point must not return ompt_set_sometimes_paired.
30 Cross References
31 • Tracing activity on target devices with OMPT, see Section 4.2.5.
32 • ompt_set_result_t type, see Section 4.4.4.2.
33 • ompt_device_t type, see Section 4.4.4.5.
5 Format
C / C++
6 typedef int (*ompt_start_trace_t) (
7 ompt_device_t *device,
8 ompt_callback_buffer_request_t request,
9 ompt_callback_buffer_complete_t complete
10 );
C / C++
11 Description
12 A device’s ompt_start_trace runtime entry point, which has type signature
13 ompt_start_trace_t, initiates tracing on the device. Under normal operating conditions,
14 every event buffer provided to a device by a tool callback is returned to the tool before the OpenMP
15 runtime shuts down. If an exceptional condition terminates execution of an OpenMP program, the
16 OpenMP runtime may not return buffers provided to the device.
17 An invocation of ompt_start_trace returns 1 if the command succeeds and 0 otherwise.
18 Description of Arguments
19 The device argument points to an opaque object that represents the target device instance. Functions
20 in the device tracing interface use this pointer to identify the device that is being addressed.
21 The request argument specifies a tool callback that supplies a buffer in which a device can deposit
22 events.
23 The complete argument specifies a tool callback that is invoked by the OpenMP implementation to
24 empty a buffer that contains event records.
25 Cross References
26 • ompt_device_t type, see Section 4.4.4.5.
27 • ompt_callback_buffer_request_t callback type, see Section 4.5.2.23.
28 • ompt_callback_buffer_complete_t callback type, see Section 4.5.2.24.
29 4.6.2.7 ompt_pause_trace_t
30 Summary
31 The ompt_pause_trace_t type is the type signature of the ompt_pause_trace runtime
32 entry point, which pauses or restarts activity tracing on a specific device.
11 Description of Arguments
12 The device argument points to an opaque object that represents the target device instance. Functions
13 in the device tracing interface use this pointer to identify the device that is being addressed.
14 The begin_pause argument indicates whether to pause or to resume tracing. To resume tracing,
zero should be supplied for begin_pause; to pause tracing, any other value should be supplied.
16 Cross References
17 • ompt_device_t type, see Section 4.4.4.5.
18 4.6.2.8 ompt_flush_trace_t
19 Summary
20 The ompt_flush_trace_t type is the type signature of the ompt_flush_trace runtime
21 entry point, which causes all pending trace records for the specified device to be delivered.
22 Format
C / C++
23 typedef int (*ompt_flush_trace_t) (
24 ompt_device_t *device
25 );
C / C++
26 Description
27 A device’s ompt_flush_trace runtime entry point, which has type signature
28 ompt_flush_trace_t, causes the OpenMP implementation to issue a sequence of zero or more
29 buffer completion callbacks to deliver all trace records that have been collected prior to the flush.
30 An invocation of ompt_flush_trace returns 1 if the command succeeds and 0 otherwise.
4 Cross References
5 • ompt_device_t type, see Section 4.4.4.5.
6 4.6.2.9 ompt_stop_trace_t
7 Summary
8 The ompt_stop_trace_t type is the type signature of the ompt_stop_trace runtime entry
9 point, which stops tracing for a device.
10 Format
C / C++
11 typedef int (*ompt_stop_trace_t) (
12 ompt_device_t *device
13 );
C / C++
14 Description
15 A device’s ompt_stop_trace runtime entry point, which has type signature
16 ompt_stop_trace_t, halts tracing on the device and requests that any pending trace records
17 are flushed. An invocation of ompt_stop_trace returns 1 if the command succeeds and 0
18 otherwise.
19 Description of Arguments
20 The device argument points to an opaque object that represents the target device instance. Functions
21 in the device tracing interface use this pointer to identify the device that is being addressed.
22 Cross References
23 • ompt_device_t type, see Section 4.4.4.5.
24 4.6.2.10 ompt_advance_buffer_cursor_t
25 Summary
26 The ompt_advance_buffer_cursor_t type is the type signature of the
27 ompt_advance_buffer_cursor runtime entry point, which advances a trace buffer cursor to
28 the next record.
24 4.6.2.11 ompt_get_record_type_t
25 Summary
26 The ompt_get_record_type_t type is the type signature of the
27 ompt_get_record_type runtime entry point, which inspects the type of a trace record.
28 Format
C / C++
29 typedef ompt_record_t (*ompt_get_record_type_t) (
30 ompt_buffer_t *buffer,
31 ompt_buffer_cursor_t current
32 );
C / C++
10 Description of Arguments
11 The buffer argument indicates a trace buffer.
12 The current argument is an opaque buffer cursor.
13 Cross References
14 • ompt_record_t type, see Section 4.4.3.1.
15 • ompt_buffer_t type, see Section 4.4.4.7.
16 • ompt_buffer_cursor_t type, see Section 4.4.4.8.
17 4.6.2.12 ompt_get_record_ompt_t
18 Summary
19 The ompt_get_record_ompt_t type is the type signature of the
20 ompt_get_record_ompt runtime entry point, which obtains a pointer to an OMPT trace
21 record from a trace buffer associated with a device.
22 Format
C / C++
23 typedef ompt_record_ompt_t *(*ompt_get_record_ompt_t) (
24 ompt_buffer_t *buffer,
25 ompt_buffer_cursor_t current
26 );
C / C++
27 Description
28 A device’s ompt_get_record_ompt runtime entry point, which has type signature
29 ompt_get_record_ompt_t, returns a pointer that may point to a record in the trace buffer, or
30 it may point to a record in thread local storage in which the information extracted from a record was
31 assembled. The information available for an event depends upon its type.
32 The return value of the ompt_record_ompt_t type includes a field of a union type that can
33 represent information for any OMPT event record type. Another call to the runtime entry point may
34 overwrite the contents of the fields in a record returned by a prior invocation.
4 Cross References
5 • ompt_record_ompt_t type, see Section 4.4.3.4.
6 • ompt_device_t type, see Section 4.4.4.5.
7 • ompt_buffer_cursor_t type, see Section 4.4.4.8.
8 4.6.2.13 ompt_get_record_native_t
9 Summary
10 The ompt_get_record_native_t type is the type signature of the
11 ompt_get_record_native runtime entry point, which obtains a pointer to a native trace
12 record from a trace buffer associated with a device.
13 Format
C / C++
14 typedef void *(*ompt_get_record_native_t) (
15 ompt_buffer_t *buffer,
16 ompt_buffer_cursor_t current,
17 ompt_id_t *host_op_id
18 );
C / C++
19 Description
20 A device’s ompt_get_record_native runtime entry point, which has type signature
ompt_get_record_native_t, returns a pointer that may point into the specified
22 trace buffer, or into thread local storage in which the information extracted from a trace record was
23 assembled. The information available for a native event depends upon its type. If the function
24 returns a non-null result, it will also set the object to which host_op_id points to a host-side
25 identifier for the operation that is associated with the record. A subsequent call to
26 ompt_get_record_native may overwrite the contents of the fields in a record returned by a
27 prior invocation.
28 Description of Arguments
29 The buffer argument indicates a trace buffer.
30 The current argument is an opaque buffer cursor.
31 The host_op_id argument is a pointer to an identifier that is returned by the function. The entry
32 point sets the identifier to which host_op_id points to the value of a host-side identifier for an
33 operation on a target device that was created when the operation was initiated by the host.
5 4.6.2.14 ompt_get_record_abstract_t
6 Summary
7 The ompt_get_record_abstract_t type is the type signature of the
8 ompt_get_record_abstract runtime entry point, which summarizes the context of a native
9 (device-specific) trace record.
10 Format
C / C++
11 typedef ompt_record_abstract_t *(*ompt_get_record_abstract_t) (
12 void *native_record
13 );
C / C++
14 Description
15 An OpenMP implementation may execute on a device that logs trace records in a native
16 (device-specific) format that a tool cannot interpret directly. The
17 ompt_get_record_abstract runtime entry point of a device, which has type signature
18 ompt_get_record_abstract_t, translates a native trace record into a standard form.
19 Description of Arguments
20 The native_record argument is a pointer to a native trace record.
21 Cross References
22 • ompt_record_abstract_t type, see Section 4.4.3.3.
27 Format
C / C++
28 typedef void (*ompt_interface_fn_t) (void);
29
30 typedef ompt_interface_fn_t (*ompt_function_lookup_t) (
31 const char *interface_function_name
32 );
C / C++
17 Description of Arguments
18 The interface_function_name argument is a C string that represents the name of a runtime entry
19 point.
20 Cross References
21 • Tool initializer for a device’s OMPT tracing interface, see Section 4.2.5.
22 • Tool initializer for the OMPT callback interface, see Section 4.5.1.1.
23 • Entry points in the OMPT callback interface, see Table 4.1 for a list and Section 4.6.1 for
24 detailed definitions.
25 • Entry points in the OMPT tracing interface, see Table 4.4 for a list and Section 4.6.2 for detailed
26 definitions.
location. The location may, but need not, be a function; it can, for example, simply be a label.
2 However, the names of the locations must have external C linkage.
19 Cross References
20 • Activating a first-party tool, see Section 4.2.
21 • OMP_DEBUG environment variable, see Section 6.21.
22 5.2.2 ompd_dll_locations
23 Summary
24 The ompd_dll_locations global variable points to the locations of OMPD libraries that are
25 compatible with the OpenMP implementation.
26 Format
C
27 extern const char **ompd_dll_locations;
C
23 Cross References
24 • ompd_dll_locations_valid global variable, see Section 5.2.3.
25 5.2.3 ompd_dll_locations_valid
26 Summary
27 The OpenMP runtime notifies third-party tools that ompd_dll_locations is valid by allowing
28 execution to pass through a location that the symbol ompd_dll_locations_valid identifies.
29 Format
C
30 void ompd_dll_locations_valid(void);
C
15 Format
C / C++
16 typedef uint64_t ompd_size_t;
C / C++
20 Format
C / C++
21 typedef uint64_t ompd_wait_id_t;
C / C++
22 Description
The values and meaning of the ompd_wait_id_t type are the same as those defined for the ompt_wait_id_t type.
25 Cross References
26 • ompt_wait_id_t type, see Section 4.4.4.30.
4 Format
C / C++
5 typedef uint64_t ompd_addr_t;
6 typedef int64_t ompd_word_t;
7 typedef uint64_t ompd_seg_t;
C / C++
8 Description
9 The ompd_addr_t type represents an address in an OpenMP process with an unsigned integer type.
10 The ompd_word_t type represents a data word from the OpenMP runtime with a signed integer
11 type. The ompd_seg_t type represents a segment value with an unsigned integer type.
15 Format
C / C++
16 typedef struct ompd_address_t {
17 ompd_seg_t segment;
18 ompd_addr_t address;
19 } ompd_address_t;
C / C++
20 Description
21 The ompd_address_t type is a structure that OMPD uses to specify device addresses, which
22 may or may not be segmented. For non-segmented architectures, ompd_segment_none is used
23 in the segment field of ompd_address_t; it is an instance of the ompd_seg_t type that has the
24 value 0.
4 Format
C / C++
5 typedef struct ompd_frame_info_t {
6 ompd_address_t frame_address;
7 ompd_word_t frame_flag;
8 } ompd_frame_info_t;
C / C++
9 Description
10 The ompd_frame_info_t type is a structure that OMPD uses to specify frame information.
11 The frame_address field of ompd_frame_info_t identifies a frame. The frame_flag field of
12 ompd_frame_info_t indicates what type of information is provided in frame_address. The
values and their meaning are the same as those defined for the ompt_frame_flag_t enumeration type.
14 Cross References
15 • ompt_frame_t type, see Section 4.4.4.28.
19 Format
C / C++
20 typedef uint64_t ompd_device_t;
C / C++
21 Description
22 OpenMP runtimes may utilize different underlying devices, each represented by a device identifier.
23 The device identifiers can vary in size and format and, thus, are not explicitly represented in the
24 OMPD interface. Instead, a device identifier is passed across the interface via its
25 ompd_device_t kind, its size in bytes and a pointer to where it is stored. The OMPD library and
26 the third-party tool use the ompd_device_t kind to interpret the format of the device identifier
27 that is referenced by the pointer argument. Each different device identifier kind is represented by a
28 unique unsigned 64-bit integer value.
29 Recommended values of ompd_device_t kinds are defined in the ompd-types.h header file,
30 which is available on http://www.openmp.org/.
4 Format
C / C++
typedef uint64_t ompd_thread_id_t;
C / C++
6 Description
7 OpenMP runtimes may use different native thread implementations. Native thread identifiers for
8 these implementations can vary in size and format and, thus, are not explicitly represented in the
9 OMPD interface. Instead, a native thread identifier is passed across the interface via its
10 ompd_thread_id_t kind, its size in bytes and a pointer to where it is stored. The OMPD
11 library and the third-party tool use the ompd_thread_id_t kind to interpret the format of the
12 native thread identifier that is referenced by the pointer argument. Each different native thread
13 identifier kind is represented by a unique unsigned 64-bit integer value.
14 Recommended values of ompd_thread_id_t kinds, and formats for some corresponding native
15 thread identifiers, are defined in the ompd-types.h header file, which is available on
16 http://www.openmp.org/.
21 Format
C / C++
typedef struct _ompd_aspace_handle   ompd_address_space_handle_t;
typedef struct _ompd_thread_handle   ompd_thread_handle_t;
typedef struct _ompd_parallel_handle ompd_parallel_handle_t;
typedef struct _ompd_task_handle     ompd_task_handle_t;
C / C++
26 Description
27 OMPD uses handles for address spaces (ompd_address_space_handle_t), threads
28 (ompd_thread_handle_t), parallel regions (ompd_parallel_handle_t), and tasks
29 (ompd_task_handle_t). Each operation of the OMPD interface that applies to a particular
30 address space, thread, parallel region or task must explicitly specify a corresponding handle. A
31 handle for an entity is constant while the entity itself is alive. Handles are defined by the OMPD
32 library and are opaque to the third-party tool.
8 Format
C / C++
typedef enum ompd_scope_t {
  ompd_scope_global = 1,
  ompd_scope_address_space = 2,
  ompd_scope_thread = 3,
  ompd_scope_parallel = 4,
  ompd_scope_implicit_task = 5,
  ompd_scope_task = 6
} ompd_scope_t;
C / C++
17 Description
18 The ompd_scope_t type identifies OpenMP scopes, including those related to parallel regions
19 and tasks. When used in an OMPD interface function call, the scope type and the OMPD handle
20 must match according to Table 5.1.
4 Format
C / C++
typedef uint64_t ompd_icv_id_t;
C / C++
Description
The ompd_icv_id_t type identifies OpenMP implementation ICVs. ompd_icv_undefined is an instance of this type with the value 0.
12 Format
C / C++
typedef struct _ompd_aspace_cont ompd_address_space_context_t;
typedef struct _ompd_thread_cont ompd_thread_context_t;
C / C++
15 Description
16 A third-party tool uniquely defines an address space context to identify the address space for the
17 process that it is monitoring. Similarly, it uniquely defines a thread context to identify a native
18 thread of the process that it is monitoring. These contexts are opaque to the OMPD library.
22 Format
C / C++
typedef enum ompd_rc_t {
  ompd_rc_ok = 0,
  ompd_rc_unavailable = 1,
  ompd_rc_stale_handle = 2,
  ompd_rc_bad_input = 3,
  ompd_rc_error = 4,
  ompd_rc_unsupported = 5,
  ompd_rc_needs_state_tracking = 6,
17 Cross References
18 • ompd_callback_sizeof_fn_t type, see Section 5.4.2.2.
24 5.4.1.1 ompd_callback_memory_alloc_fn_t
25 Summary
26 The ompd_callback_memory_alloc_fn_t type is the type signature of the callback routine
27 that the third-party tool provides to the OMPD library to allocate memory.
28 Format
C
typedef ompd_rc_t (*ompd_callback_memory_alloc_fn_t) (
  ompd_size_t nbytes,
  void **ptr
);
C
33 Description
34 The ompd_callback_memory_alloc_fn_t type is the type signature of the memory
35 allocation callback routine that the third-party tool provides. The OMPD library may call the
36 ompd_callback_memory_alloc_fn_t callback function to allocate memory.
9 Cross References
10 • ompd_size_t type, see Section 5.3.1.
11 • ompd_rc_t type, see Section 5.3.12.
12 5.4.1.2 ompd_callback_memory_free_fn_t
13 Summary
14 The ompd_callback_memory_free_fn_t type is the type signature of the callback routine
15 that the third-party tool provides to the OMPD library to deallocate memory.
16 Format
C
typedef ompd_rc_t (*ompd_callback_memory_free_fn_t) (
  void *ptr
);
C
20 Description
21 The ompd_callback_memory_free_fn_t type is the type signature of the memory
22 deallocation callback routine that the third-party tool provides. The OMPD library may call the
23 ompd_callback_memory_free_fn_t callback function to deallocate memory that was
24 obtained from a prior call to the ompd_callback_memory_alloc_fn_t callback function.
25 Description of Arguments
26 The ptr argument is the address of the block to be deallocated.
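For illustration only, a tool might back these two callbacks with the C heap, as in the following sketch; the return codes chosen for the failure cases are assumptions.
C
#include <stdlib.h>

/* Sketch of tool-provided memory callbacks backed by malloc/free. */
static ompd_rc_t tool_alloc(ompd_size_t nbytes, void **ptr)
{
  if (ptr == NULL)
    return ompd_rc_bad_input;
  *ptr = malloc((size_t)nbytes);
  return (*ptr != NULL) ? ompd_rc_ok : ompd_rc_error;  /* error code is an assumption */
}

static ompd_rc_t tool_free(void *ptr)
{
  free(ptr);
  return ompd_rc_ok;
}
C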
9 5.4.2.1 ompd_callback_get_thread_context_for_thread_id
10 _fn_t
11 Summary
12 The ompd_callback_get_thread_context_for_thread_id_fn_t is the type
13 signature of the callback routine that the third-party tool provides to the OMPD library to map a
14 native thread identifier to a third-party tool thread context.
15 Format
C
typedef ompd_rc_t
(*ompd_callback_get_thread_context_for_thread_id_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_id_t kind,
  ompd_size_t sizeof_thread_id,
  const void *thread_id,
  ompd_thread_context_t **thread_context
);
C
24 Description
25 The ompd_callback_get_thread_context_for_thread_id_fn_t is the type
26 signature of the context mapping callback routine that the third-party tool provides. This callback
27 maps a native thread identifier to a third-party tool thread context. The native thread identifier is
28 within the address space that address_space_context identifies. The OMPD library can use the
29 thread context, for example, to access thread local storage.
30 Description of Arguments
31 The address_space_context argument is an opaque handle that the third-party tool provides to
32 reference an address space. The kind, sizeof_thread_id, and thread_id arguments represent a native
33 thread identifier. On return, the thread_context argument provides an opaque handle that maps a
34 native thread identifier to a third-party tool thread context.
8 Restrictions
9 Restrictions on routines that use
10 ompd_callback_get_thread_context_for_thread_id_fn_t are as follows:
11 • The provided thread_context must be valid until the OMPD library returns from the OMPD
12 third-party tool interface routine.
13 Cross References
14 • ompd_size_t type, see Section 5.3.1.
15 • ompd_thread_id_t type, see Section 5.3.7.
16 • ompd_address_space_context_t type, see Section 5.3.11.
17 • ompd_thread_context_t type, see Section 5.3.11.
18 • ompd_rc_t type, see Section 5.3.12.
19 5.4.2.2 ompd_callback_sizeof_fn_t
20 Summary
21 The ompd_callback_sizeof_fn_t type is the type signature of the callback routine that the
22 third-party tool provides to the OMPD library to determine the sizes of the primitive types in an
23 address space.
24 Format
C
typedef ompd_rc_t (*ompd_callback_sizeof_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_device_type_sizes_t *sizes
);
C
29 Description
30 The ompd_callback_sizeof_fn_t is the type signature of the type-size query callback
31 routine that the third-party tool provides. This callback provides the sizes of the basic primitive
32 types for a given address space.
7 Cross References
8 • ompd_address_space_context_t type, see Section 5.3.11.
9 • ompd_rc_t type, see Section 5.3.12.
10 • ompd_device_type_sizes_t type, see Section 5.3.13.
11 • ompd_callbacks_t type, see Section 5.4.6.
17 5.4.3.1 ompd_callback_symbol_addr_fn_t
18 Summary
19 The ompd_callback_symbol_addr_fn_t type is the type signature of the callback that the
20 third-party tool provides to look up the addresses of symbols in an OpenMP program.
21 Format
C
typedef ompd_rc_t (*ompd_callback_symbol_addr_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_context_t *thread_context,
  const char *symbol_name,
  ompd_address_t *symbol_addr,
  const char *file_name
);
C
29 Description
30 The ompd_callback_symbol_addr_fn_t is the type signature of the symbol-address query
31 callback routine that the third-party tool provides. This callback looks up addresses of symbols
32 within a specified address space.
33 Restrictions
34 Restrictions on routines that use the ompd_callback_symbol_addr_fn_t type are as
35 follows:
36 • The address_space_context argument must be non-null.
37 • The symbol that the symbol_name argument specifies must be defined.
7 5.4.3.2 ompd_callback_memory_read_fn_t
8 Summary
9 The ompd_callback_memory_read_fn_t type is the type signature of the callback that the
10 third-party tool provides to read data (read_memory) or a string (read_string) from an OpenMP
11 program.
12 Format
C
typedef ompd_rc_t (*ompd_callback_memory_read_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_context_t *thread_context,
  const ompd_address_t *addr,
  ompd_size_t nbytes,
  void *buffer
);
C
20 Description
21 The ompd_callback_memory_read_fn_t is the type signature of the read callback routines
22 that the third-party tool provides.
23 The read_memory callback copies a block of data from addr within the address space given by
24 address_space_context to the third-party tool buffer.
25 The read_string callback copies a string to which addr points, including the terminating null byte
26 (’\0’), to the third-party tool buffer. At most nbytes bytes are copied. If a null byte is not among
27 the first nbytes bytes, the string placed in buffer is not null-terminated.
28 Description of Arguments
29 The address from which the data are to be read in the OpenMP program that
30 address_space_context specifies is given by addr. The nbytes argument is the number of bytes to
31 be transferred. The thread_context argument is optional for global memory access, and in that case
32 should be NULL. If it is non-null, thread_context identifies the thread-specific context for the
33 memory access for the purpose of accessing thread local storage.
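The following sketch shows the general shape of a read_memory callback for a debugger that already has its own primitive for reading target memory; debugger_read_target_memory, the way the context arguments are interpreted, and the error handling are all hypothetical.
C
/* Sketch only: debugger_read_target_memory() is a hypothetical
 * primitive of the hosting debugger, not part of OMPD. */
extern int debugger_read_target_memory(ompd_address_space_context_t *asc,
                                        ompd_thread_context_t *tc,
                                        ompd_seg_t segment,
                                        ompd_addr_t address,
                                        ompd_size_t nbytes,
                                        void *buffer);

static ompd_rc_t tool_read_memory(ompd_address_space_context_t *asc,
                                  ompd_thread_context_t *tc,  /* may be NULL for global memory */
                                  const ompd_address_t *addr,
                                  ompd_size_t nbytes,
                                  void *buffer)
{
  if (asc == NULL || addr == NULL || buffer == NULL)
    return ompd_rc_bad_input;
  if (debugger_read_target_memory(asc, tc, addr->segment, addr->address,
                                  nbytes, buffer) != 0)
    return ompd_rc_error;   /* error code choice is an assumption */
  return ompd_rc_ok;
}
C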
12 Cross References
13 • ompd_size_t type, see Section 5.3.1.
14 • ompd_address_t type, see Section 5.3.4.
15 • ompd_address_space_context_t type, see Section 5.3.11.
16 • ompd_thread_context_t type, see Section 5.3.11.
17 • ompd_rc_t type, see Section 5.3.12.
18 • ompd_callback_device_host_fn_t type, see Section 5.4.4.
19 • ompd_callbacks_t type, see Section 5.4.6.
20 5.4.3.3 ompd_callback_memory_write_fn_t
21 Summary
22 The ompd_callback_memory_write_fn_t type is the type signature of the callback that
23 the third-party tool provides to write data to an OpenMP program.
24 Format
C
typedef ompd_rc_t (*ompd_callback_memory_write_fn_t) (
  ompd_address_space_context_t *address_space_context,
  ompd_thread_context_t *thread_context,
  const ompd_address_t *addr,
  ompd_size_t nbytes,
  const void *buffer
);
C
5 Description of Arguments
6 The address to which the data are to be written in the OpenMP program that address_space_context
7 specifies is given by addr. The nbytes argument is the number of bytes to be transferred. The
8 thread_context argument is optional for global memory access, and in that case should be NULL. If
9 it is non-null then thread_context identifies the thread-specific context for the memory access for
10 the purpose of accessing thread local storage.
11 The data to be written are passed through buffer, which is allocated and owned by the OMPD
12 library. The contents of the buffer are unstructured, raw bytes. The OMPD library must arrange for
13 any transformations such as byte-swapping that may be necessary (see Section 5.4.4) to render the
14 data into a form that is compatible with the OpenMP runtime.
18 Cross References
19 • ompd_size_t type, see Section 5.3.1.
20 • ompd_address_t type, see Section 5.3.4.
21 • ompd_address_space_context_t type, see Section 5.3.11.
22 • ompd_thread_context_t type, see Section 5.3.11.
23 • ompd_rc_t type, see Section 5.3.12.
24 • ompd_callback_device_host_fn_t type, see Section 5.4.4.
25 • ompd_callbacks_t type, see Section 5.4.6.
16 Description of Arguments
17 The address_space_context argument specifies the OpenMP address space that is associated with
18 the data. The input argument is the source buffer and the output argument is the destination buffer.
19 The unit_size argument is the size of each of the elements to be converted. The count argument is
20 the number of elements to be transformed.
21 The OMPD library allocates and owns the input and output buffers. It must ensure that the buffers
22 have the correct size and are eventually deallocated when they are no longer needed.
26 Cross References
27 • ompd_size_t type, see Section 5.3.1.
28 • ompd_address_space_context_t type, see Section 5.3.11.
29 • ompd_rc_t type, see Section 5.3.12.
30 • ompd_callbacks_t type, see Section 5.4.6.
5 Format
C
typedef ompd_rc_t (*ompd_callback_print_string_fn_t) (
  const char *string,
  int category
);
C
Description
The OMPD library may call the ompd_callback_print_string_fn_t callback function to emit output, such as logging or debug information. The third-party tool may set the ompd_callback_print_string_fn_t callback function to NULL to prevent the OMPD library from emitting output. The OMPD library must not write to file descriptors that it did not open.
16 Description of Arguments
17 The string argument is the null-terminated string to be printed. No conversion or formatting is
18 performed on the string.
19 The category argument is the implementation-defined category of the string to be printed.
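A minimal tool-side implementation might simply forward the string to the tool's diagnostic stream, as in the following sketch, which ignores category.
C
#include <stdio.h>

/* Sketch: forward OMPD library output to the tool's diagnostic stream. */
static ompd_rc_t tool_print_string(const char *string, int category)
{
  (void)category;         /* how categories are handled is up to the tool */
  fputs(string, stderr);  /* no conversion or formatting is performed */
  return ompd_rc_ok;
}
C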
23 Cross References
24 • ompd_rc_t type, see Section 5.3.12.
25 • ompd_callbacks_t type, see Section 5.4.6.
C
16 Description
17 The set of callbacks that the OMPD library must use is collected in the ompd_callbacks_t
18 structure. An instance of this type is passed to the OMPD library as a parameter to
19 ompd_initialize (see Section 5.5.1.1). Each field points to a function that the OMPD library
20 must use either to interact with the OpenMP program or for memory operations.
21 The alloc_memory and free_memory fields are pointers to functions the OMPD library uses to
22 allocate and to release dynamic memory.
23 The print_string field points to a function that prints a string.
24 The architecture on which the OMPD library and third-party tool execute may be different from the
25 architecture on which the OpenMP program that is being examined executes. The sizeof_type field
26 points to a function that allows the OMPD library to determine the sizes of the basic integer and
27 pointer types that the OpenMP program uses. Because of the potential differences in the targeted
28 architectures, the conventions for representing data in the OMPD library and the OpenMP program
29 may be different. The device_to_host field points to a function that translates data from the
30 conventions that the OpenMP program uses to those that the third-party tool and OMPD library
31 use. The reverse operation is performed by the function to which the host_to_device field points.
32 The symbol_addr_lookup field points to a callback that the OMPD library can use to find the
33 address of a global or thread local storage symbol. The read_memory, read_string and
34 write_memory fields are pointers to functions for reading from and writing to global memory or
35 thread local storage in the OpenMP program.
36 The get_thread_context_for_thread_id field is a pointer to a function that the OMPD library can
37 use to obtain a thread context that corresponds to a native thread identifier.
7 5.5.1.1 ompd_initialize
8 Summary
9 The ompd_initialize function initializes the OMPD library.
10 Format
C
ompd_rc_t ompd_initialize(
  ompd_word_t api_version,
  const ompd_callbacks_t *callbacks
);
C
15 Description
16 A tool that uses OMPD calls ompd_initialize to initialize each OMPD library that it loads.
17 More than one library may be present in a third-party tool, such as a debugger, because the tool
18 may control multiple devices, which may use different runtime systems that require different
19 OMPD libraries. This initialization must be performed exactly once before the tool can begin to
20 operate on an OpenMP process or core file.
21 Description of Arguments
22 The api_version argument is the OMPD API version that the tool requests to use. The tool may call
23 ompd_get_api_version to obtain the latest OMPD API version that the OMPD library
24 supports.
25 The tool provides the OMPD library with a set of callback functions in the callbacks input
26 argument which enables the OMPD library to allocate and to deallocate memory in the tool’s
27 address space, to lookup the sizes of basic primitive types in the device, to lookup symbols in the
28 device, and to read and to write memory in the device.
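As a sketch of how a tool might tie these pieces together, the following fragment fills in the callback table described in Section 5.4.6 and initializes one OMPD library. The field names come from that section; the tool_* callback implementations (some sketched earlier in this chapter, the rest hypothetical) and the requested api_version value are assumptions.
C
/* Sketch: populate the callback table and initialize one OMPD library. */
static const ompd_callbacks_t callbacks = {
  .alloc_memory = tool_alloc,
  .free_memory  = tool_free,
  .print_string = tool_print_string,
  .sizeof_type  = tool_sizeof_type,                  /* hypothetical */
  .symbol_addr_lookup = tool_symbol_addr,            /* hypothetical */
  .read_memory  = tool_read_memory,
  .read_string  = tool_read_string,                  /* hypothetical */
  .write_memory = tool_write_memory,                 /* hypothetical */
  .device_to_host = tool_device_to_host,             /* hypothetical */
  .host_to_device = tool_host_to_device,             /* hypothetical */
  .get_thread_context_for_thread_id = tool_thread_context_for_id  /* hypothetical */
};

ompd_rc_t tool_start_ompd(void)
{
  /* 202011 is the _OPENMP value for OpenMP 5.1; requesting it here
   * is an assumption made for this sketch. */
  return ompd_initialize(202011, &callbacks);
}
C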
5 5.5.1.2 ompd_get_api_version
6 Summary
7 The ompd_get_api_version function returns the OMPD API version.
8 Format
C
ompd_rc_t ompd_get_api_version(ompd_word_t *version);
C
10 Description
11 The tool may call the ompd_get_api_version function to obtain the latest OMPD API
12 version number of the OMPD library. The OMPD API version number is equal to the value of the
13 _OPENMP macro defined in the associated OpenMP implementation, if the C preprocessor is
14 supported. If the associated OpenMP implementation compiles Fortran codes without the use of a
15 C preprocessor, the OMPD API version number is equal to the value of the Fortran integer
16 parameter openmp_version.
17 Description of Arguments
18 The latest version number is returned into the location to which the version argument points.
21 Cross References
22 • ompd_rc_t type, see Section 5.3.12.
23 5.5.1.3 ompd_get_version_string
24 Summary
25 The ompd_get_version_string function returns a descriptive string for the OMPD library
26 version.
27 Format
C
ompd_rc_t ompd_get_version_string(const char **string);
C
6 Description of Arguments
7 A pointer to a descriptive version string is placed into the location to which the string output
8 argument points. The OMPD library owns the string that the OMPD library returns; the tool must
9 not modify or release this string. The string remains valid for as long as the library is loaded. The
10 ompd_get_version_string function may be called before ompd_initialize (see
11 Section 5.5.1.1). Accordingly, the OMPD library must not use heap or stack memory for the string.
12 The signatures of ompd_get_api_version (see Section 5.5.1.2) and
13 ompd_get_version_string are guaranteed not to change in future versions of the API. In
14 contrast, the type definitions and prototypes in the rest of the API do not carry the same guarantee.
15 Therefore a tool that uses OMPD should check the version of the API of the loaded OMPD library
16 before it calls any other function of the API.
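A defensive start-up sequence based on this guarantee might look like the following sketch; the minimum acceptable version used in the check is an assumption.
C
#include <stdio.h>

/* Sketch: verify library compatibility before any other OMPD call. */
static ompd_rc_t tool_check_ompd_library(void)
{
  ompd_word_t version = 0;
  const char *descr = NULL;
  ompd_rc_t rc = ompd_get_api_version(&version);
  if (rc != ompd_rc_ok)
    return rc;
  if (version < 202011)            /* assumption: this tool requires OpenMP 5.1 or later */
    return ompd_rc_unsupported;
  if (ompd_get_version_string(&descr) == ompd_rc_ok && descr != NULL)
    fputs(descr, stderr);          /* informational only; do not modify or free the string */
  return ompd_rc_ok;
}
C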
19 Cross References
20 • ompd_rc_t type, see Section 5.3.12.
21 5.5.1.4 ompd_finalize
22 Summary
23 When the tool is finished with the OMPD library it should call ompd_finalize before it
24 unloads the library.
25 Format
C
ompd_rc_t ompd_finalize(void);
C
27 Description
28 The call to ompd_finalize must be the last OMPD call that the tool makes before it unloads the
29 library. This call allows the OMPD library to free any resources that it may be holding.
30 The OMPD library may implement a finalizer section, which executes as the library is unloaded
31 and therefore after the call to ompd_finalize. During finalization, the OMPD library may use
32 the callbacks that the tool provided earlier during the call to ompd_initialize.
5 Cross References
6 • ompd_rc_t type, see Section 5.3.12.
12 Format
C
ompd_rc_t ompd_process_initialize(
  ompd_address_space_context_t *context,
  ompd_address_space_handle_t **handle
);
C
17 Description
18 A tool calls ompd_process_initialize to obtain an address space handle when it initializes
19 a session on a live process or core file. On return from ompd_process_initialize, the tool
20 owns the address space handle, which it must release with
21 ompd_rel_address_space_handle. The initialization function must be called before any
22 OMPD operations are performed on the OpenMP process or core file. This call allows the OMPD
23 library to confirm that it can handle the OpenMP process or core file that context identifies.
24 Description of Arguments
25 The context argument is an opaque handle that the tool provides to address an address space. On
26 return, the handle argument provides an opaque handle to the tool for this address space, which the
27 tool must release when it is no longer needed.
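The overall lifetime of an address space handle might therefore look like the following sketch; tool_examine_process is a hypothetical placeholder for the tool's own queries.
C
/* Sketch: begin and end an OMPD session on one process or core file. */
extern ompd_rc_t tool_examine_process(ompd_address_space_handle_t *addr_space);

static ompd_rc_t tool_run_session(ompd_address_space_context_t *context)
{
  ompd_address_space_handle_t *addr_space = NULL;
  ompd_rc_t rc = ompd_process_initialize(context, &addr_space);
  if (rc != ompd_rc_ok)
    return rc;                                /* the library cannot handle this process */

  rc = tool_examine_process(addr_space);      /* query threads, parallel regions, ICVs, ... */

  ompd_rel_address_space_handle(addr_space);  /* the tool owns the handle and must release it */
  return rc;
}
C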
6 5.5.2.2 ompd_device_initialize
7 Summary
8 A tool calls ompd_device_initialize to obtain an address space handle for a device that has
9 at least one active target region.
10 Format
C
ompd_rc_t ompd_device_initialize(
  ompd_address_space_handle_t *process_handle,
  ompd_address_space_context_t *device_context,
  ompd_device_t kind,
  ompd_size_t sizeof_id,
  void *id,
  ompd_address_space_handle_t **device_handle
);
C
19 Description
20 A tool calls ompd_device_initialize to obtain an address space handle for a device that has
21 at least one active target region. On return from ompd_device_initialize, the tool owns the
22 address space handle.
23 Description of Arguments
24 The process_handle argument is an opaque handle that the tool provides to reference the address
25 space of the OpenMP process or core file. The device_context argument is an opaque handle that
26 the tool provides to reference a device address space. The kind, sizeof_id, and id arguments
27 represent a device identifier. On return the device_handle argument provides an opaque handle to
28 the tool for this address space.
7 5.5.2.3 ompd_rel_address_space_handle
8 Summary
9 A tool calls ompd_rel_address_space_handle to release an address space handle.
10 Format
C
ompd_rc_t ompd_rel_address_space_handle(
  ompd_address_space_handle_t *handle
);
C
14 Description
15 When the tool is finished with the OpenMP process address space handle it should call
16 ompd_rel_address_space_handle to release the handle, which allows the OMPD library
17 to release any resources that it has related to the address space.
18 Description of Arguments
19 The handle argument is an opaque handle for the address space to be released.
20 Restrictions
21 Restrictions to the ompd_rel_address_space_handle routine are as follows:
22 • An address space context must not be used after the corresponding address space handle is
23 released.
26 Cross References
27 • ompd_address_space_handle_t type, see Section 5.3.8.
28 • ompd_rc_t type, see Section 5.3.12.
10 Format
C
ompd_rc_t ompd_get_omp_version(
  ompd_address_space_handle_t *address_space,
  ompd_word_t *omp_version
);
C
15 Description
16 The tool may call the ompd_get_omp_version function to obtain the version of the OpenMP
17 API that is associated with the address space.
18 Description of Arguments
19 The address_space argument is an opaque handle that the tool provides to reference the address
20 space of the OpenMP process or device.
21 Upon return, the omp_version argument contains the version of the OpenMP runtime in the
22 _OPENMP version macro format.
25 Cross References
26 • ompd_address_space_handle_t type, see Section 5.3.8.
27 • ompd_rc_t type, see Section 5.3.12.
5 Format
C
ompd_rc_t ompd_get_omp_version_string(
  ompd_address_space_handle_t *address_space,
  const char **string
);
C
10 Description
11 After initialization, the tool may call the ompd_get_omp_version_string function to obtain
12 the version of the OpenMP API that is associated with an address space.
13 Description of Arguments
14 The address_space argument is an opaque handle that the tool provides to reference the address
15 space of the OpenMP process or device. A pointer to a descriptive version string is placed into the
16 location to which the string output argument points. After returning from the call, the tool owns the
17 string. The OMPD library must use the memory allocation callback that the tool provides to
18 allocate the string storage. The tool is responsible for releasing the memory.
21 Cross References
22 • ompd_address_space_handle_t type, see Section 5.3.8.
23 • ompd_rc_t type, see Section 5.3.12.
6 Format
C
ompd_rc_t ompd_get_thread_in_parallel(
  ompd_parallel_handle_t *parallel_handle,
  int thread_num,
  ompd_thread_handle_t **thread_handle
);
C
12 Description
13 A successful invocation of ompd_get_thread_in_parallel returns a pointer to a thread
14 handle in the location to which thread_handle points. This call yields meaningful results only
15 if all OpenMP threads in the parallel region are stopped.
16 Description of Arguments
17 The parallel_handle argument is an opaque handle for a parallel region and selects the parallel
18 region on which to operate. The thread_num argument selects the thread, the handle of which is to
19 be returned. On return, the thread_handle argument is an opaque handle for the selected thread.
25 Restrictions
26 Restrictions on the ompd_get_thread_in_parallel function are as follows:
27 • The value of thread_num must be a non-negative integer smaller than the team size that was
28 provided as the team-size-var ICV from ompd_get_icv_from_scope.
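For example, a tool might walk the team as in the following sketch, assuming it has already obtained the team size via the team-size-var ICV; icv_team_size_id is a hypothetical identifier previously returned by ompd_enumerate_icvs.
C
/* Sketch: visit every thread of a stopped parallel region. */
static ompd_rc_t tool_visit_team(ompd_parallel_handle_t *par,
                                 ompd_icv_id_t icv_team_size_id /* hypothetical */)
{
  ompd_word_t team_size = 0;
  ompd_rc_t rc = ompd_get_icv_from_scope(par, ompd_scope_parallel,
                                         icv_team_size_id, &team_size);
  if (rc != ompd_rc_ok)
    return rc;

  for (ompd_word_t i = 0; i < team_size; i++) {
    ompd_thread_handle_t *thread = NULL;
    rc = ompd_get_thread_in_parallel(par, (int)i, &thread);
    if (rc != ompd_rc_ok)
      return rc;
    /* ... inspect the thread, for example with ompd_get_state ... */
    ompd_rel_thread_handle(thread);           /* the tool owns the thread handle */
  }
  return ompd_rc_ok;
}
C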
29 Cross References
30 • ompd_parallel_handle_t type, see Section 5.3.8.
31 • ompd_thread_handle_t type, see Section 5.3.8.
32 • ompd_rc_t type, see Section 5.3.12.
33 • ompd_get_icv_from_scope routine, see Section 5.5.9.2.
4 Format
C
ompd_rc_t ompd_rel_thread_handle(
  ompd_thread_handle_t *thread_handle
);
C
8 Description
9 Thread handles are opaque to tools, which therefore cannot release them directly. Instead, when the
10 tool is finished with a thread handle it must pass it to ompd_rel_thread_handle for disposal.
11 Description of Arguments
12 The thread_handle argument is an opaque handle for a thread to be released.
15 Cross References
16 • ompd_thread_handle_t type, see Section 5.3.8.
17 • ompd_rc_t type, see Section 5.3.12.
18 5.5.5.4 ompd_thread_handle_compare
19 Summary
20 The ompd_thread_handle_compare function allows tools to compare two thread handles.
21 Format
C
ompd_rc_t ompd_thread_handle_compare(
  ompd_thread_handle_t *thread_handle_1,
  ompd_thread_handle_t *thread_handle_2,
  int *cmp_value
);
C
27 Description
28 The internal structure of thread handles is opaque to a tool. While the tool can easily compare
29 pointers to thread handles, it cannot determine whether handles of two different addresses refer to
30 the same underlying thread. The ompd_thread_handle_compare function compares thread
31 handles.
5 Description of Arguments
6 The thread_handle_1 and thread_handle_2 arguments are opaque handles for threads. On return
7 the cmp_value argument is set to a signed integer value.
10 Cross References
11 • ompd_thread_handle_t type, see Section 5.3.8.
12 • ompd_rc_t type, see Section 5.3.12.
13 5.5.5.5 ompd_get_thread_id
14 Summary
The ompd_get_thread_id function maps an OMPD thread handle to a native thread identifier.
16 Format
C
ompd_rc_t ompd_get_thread_id(
  ompd_thread_handle_t *thread_handle,
  ompd_thread_id_t kind,
  ompd_size_t sizeof_thread_id,
  void *thread_id
);
C
23 Description
24 The ompd_get_thread_id function maps an OMPD thread handle to a native thread identifier.
25 Description of Arguments
26 The thread_handle argument is an opaque thread handle. The kind argument represents the native
27 thread identifier. The sizeof_thread_id argument represents the size of the native thread identifier.
28 On return, the thread_id argument is a buffer that represents a native thread identifier.
7 Cross References
8 • ompd_size_t type, see Section 5.3.1.
9 • ompd_thread_id_t type, see Section 5.3.7.
10 • ompd_thread_handle_t type, see Section 5.3.8.
11 • ompd_rc_t type, see Section 5.3.12.
17 Format
C
ompd_rc_t ompd_get_curr_parallel_handle(
  ompd_thread_handle_t *thread_handle,
  ompd_parallel_handle_t **parallel_handle
);
C
22 Description
23 The ompd_get_curr_parallel_handle function enables the tool to obtain a pointer to the
24 parallel handle for the current parallel region that is associated with an OpenMP thread. This call is
25 meaningful only if the associated thread is stopped. The parallel handle is owned by the tool and it
26 must be released by calling ompd_rel_parallel_handle.
27 Description of Arguments
28 The thread_handle argument is an opaque handle for a thread and selects the thread on which to
29 operate. On return, the parallel_handle argument is set to a handle for the parallel region that the
30 associated thread is currently executing, if any.
5 Cross References
6 • ompd_thread_handle_t type, see Section 5.3.8.
7 • ompd_parallel_handle_t type, see Section 5.3.8.
8 • ompd_rc_t type, see Section 5.3.12.
9 • ompd_rel_parallel_handle routine, see Section 5.5.6.4.
10 5.5.6.2 ompd_get_enclosing_parallel_handle
11 Summary
12 The ompd_get_enclosing_parallel_handle function obtains a pointer to the parallel
13 handle for an enclosing parallel region.
14 Format
C
ompd_rc_t ompd_get_enclosing_parallel_handle(
  ompd_parallel_handle_t *parallel_handle,
  ompd_parallel_handle_t **enclosing_parallel_handle
);
C
19 Description
20 The ompd_get_enclosing_parallel_handle function enables a tool to obtain a pointer
21 to the parallel handle for the parallel region that encloses the parallel region that
22 parallel_handle specifies. This call is meaningful only if at least one thread in the parallel
23 region is stopped. A pointer to the parallel handle for the enclosing region is returned in the
24 location to which enclosing_parallel_handle points. After the call, the tool owns the handle; the
25 tool must release the handle with ompd_rel_parallel_handle when it is no longer required.
26 Description of Arguments
27 The parallel_handle argument is an opaque handle for a parallel region that selects the parallel
28 region on which to operate. On return, the enclosing_parallel_handle argument is set to a handle
29 for the parallel region that encloses the selected parallel region.
9 5.5.6.3 ompd_get_task_parallel_handle
10 Summary
11 The ompd_get_task_parallel_handle function obtains a pointer to the parallel handle for
12 the parallel region that encloses a task region.
13 Format
C
ompd_rc_t ompd_get_task_parallel_handle(
  ompd_task_handle_t *task_handle,
  ompd_parallel_handle_t **task_parallel_handle
);
C
18 Description
19 The ompd_get_task_parallel_handle function enables a tool to obtain a pointer to the
20 parallel handle for the parallel region that encloses the task region that task_handle specifies. This
21 call is meaningful only if at least one thread in the parallel region is stopped. A pointer to the
22 parallel regions handle is returned in the location to which task_parallel_handle points. The tool
23 owns that parallel handle, which it must release with ompd_rel_parallel_handle.
Description of Arguments
The task_handle argument is an opaque handle that selects the task on which to operate. On return, the task_parallel_handle argument is set to a handle for the parallel region that encloses the selected task.
27 Description of Return Codes
28 This routine must return any of the general return codes listed at the beginning of Section 5.5.
29 Cross References
30 • ompd_task_handle_t type, see Section 5.3.8.
31 • ompd_parallel_handle_t type, see Section 5.3.8.
32 • ompd_rc_t type, see Section 5.3.12.
33 • ompd_rel_parallel_handle routine, see Section 5.5.6.4.
4 Format
C
ompd_rc_t ompd_rel_parallel_handle(
  ompd_parallel_handle_t *parallel_handle
);
C
8 Description
9 Parallel region handles are opaque so tools cannot release them directly. Instead, a tool must pass a
10 parallel region handle to the ompd_rel_parallel_handle function for disposal when
11 finished with it.
12 Description of Arguments
13 The parallel_handle argument is an opaque handle to be released.
16 Cross References
17 • ompd_parallel_handle_t type, see Section 5.3.8.
18 • ompd_rc_t type, see Section 5.3.12.
19 5.5.6.5 ompd_parallel_handle_compare
20 Summary
21 The ompd_parallel_handle_compare function compares two parallel region handles.
22 Format
C
ompd_rc_t ompd_parallel_handle_compare(
  ompd_parallel_handle_t *parallel_handle_1,
  ompd_parallel_handle_t *parallel_handle_2,
  int *cmp_value
);
C
12 Description of Arguments
13 The parallel_handle_1 and parallel_handle_2 arguments are opaque handles that correspond to
14 parallel regions. On return the cmp_value argument points to a signed integer value that indicates
15 how the underlying parallel regions compare.
18 Cross References
19 • ompd_parallel_handle_t type, see Section 5.3.8.
20 • ompd_rc_t type, see Section 5.3.12.
26 Format
C
ompd_rc_t ompd_get_curr_task_handle(
  ompd_thread_handle_t *thread_handle,
  ompd_task_handle_t **task_handle
);
C
6 Description of Arguments
7 The thread_handle argument is an opaque handle that selects the thread on which to operate. On
8 return, the task_handle argument points to a location that points to a handle for the task that the
9 thread is currently executing.
14 Cross References
15 • ompd_thread_handle_t type, see Section 5.3.8.
16 • ompd_task_handle_t type, see Section 5.3.8.
17 • ompd_rc_t type, see Section 5.3.12.
18 • ompd_rel_task_handle routine, see Section 5.5.7.5.
19 5.5.7.2 ompd_get_generating_task_handle
20 Summary
21 The ompd_get_generating_task_handle function obtains a pointer to the task handle of
22 the generating task region.
23 Format
C
ompd_rc_t ompd_get_generating_task_handle(
  ompd_task_handle_t *task_handle,
  ompd_task_handle_t **generating_task_handle
);
C
28 Description
29 The ompd_get_generating_task_handle function obtains a pointer to the task handle for
30 the task that encountered the OpenMP task construct that generated the task represented by
31 task_handle. The generating task is the OpenMP task that was active when the task specified by
32 task_handle was created. This call is meaningful only if the thread that is executing the task that
33 task_handle specifies is stopped. The generating task handle must be released with
34 ompd_rel_task_handle.
9 Cross References
10 • ompd_task_handle_t type, see Section 5.3.8.
11 • ompd_rc_t type, see Section 5.3.12.
12 • ompd_rel_task_handle routine, see Section 5.5.7.5.
13 5.5.7.3 ompd_get_scheduling_task_handle
14 Summary
15 The ompd_get_scheduling_task_handle function obtains a task handle for the task that
16 was active at a task scheduling point.
17 Format
C
ompd_rc_t ompd_get_scheduling_task_handle(
  ompd_task_handle_t *task_handle,
  ompd_task_handle_t **scheduling_task_handle
);
C
22 Description
23 The ompd_get_scheduling_task_handle function obtains a task handle for the task that
24 was active when the task that task_handle represents was scheduled. This call is meaningful only if
25 the thread that is executing the task that task_handle specifies is stopped. The scheduling task
26 handle must be released with ompd_rel_task_handle.
27 Description of Arguments
28 The task_handle argument is an opaque handle for a task and selects the task on which to operate.
29 On return, the scheduling_task_handle argument points to a location that points to a handle for the
30 task that is still on the stack of execution on the same thread and was deferred in favor of executing
31 the selected task.
5 Cross References
6 • ompd_task_handle_t type, see Section 5.3.8.
7 • ompd_rc_t type, see Section 5.3.12.
8 • ompd_rel_task_handle routine, see Section 5.5.7.5.
9 5.5.7.4 ompd_get_task_in_parallel
10 Summary
11 The ompd_get_task_in_parallel function obtains handles for the implicit tasks that are
12 associated with a parallel region.
13 Format
C
ompd_rc_t ompd_get_task_in_parallel(
  ompd_parallel_handle_t *parallel_handle,
  int thread_num,
  ompd_task_handle_t **task_handle
);
C
19 Description
20 The ompd_get_task_in_parallel function obtains handles for the implicit tasks that are
21 associated with a parallel region. A successful invocation of ompd_get_task_in_parallel
22 returns a pointer to a task handle in the location to which task_handle points. This call yields
23 meaningful results only if all OpenMP threads in the parallel region are stopped.
24 Description of Arguments
25 The parallel_handle argument is an opaque handle that selects the parallel region on which to
26 operate. The thread_num argument selects the implicit task of the team to be returned. The
27 thread_num argument is equal to the thread-num-var ICV value of the selected implicit task. On
28 return, the task_handle argument points to a location that points to an opaque handle for the
29 selected implicit task.
Restrictions
Restrictions on the ompd_get_task_in_parallel function are as follows:
• The value of thread_num must be a non-negative integer that is smaller than the team size, which is the value of the team-size-var ICV that ompd_get_icv_from_scope returns.
10 Cross References
11 • ompd_parallel_handle_t type, see Section 5.3.8.
12 • ompd_task_handle_t type, see Section 5.3.8.
13 • ompd_rc_t type, see Section 5.3.12.
14 • ompd_get_icv_from_scope routine, see Section 5.5.9.2.
15 5.5.7.5 ompd_rel_task_handle
16 Summary
The ompd_rel_task_handle function releases a task handle.
18 Format
C
ompd_rc_t ompd_rel_task_handle(
  ompd_task_handle_t *task_handle
);
C
22 Description
23 Task handles are opaque to tools; thus tools cannot release them directly. Instead, when a tool is
24 finished with a task handle it must use the ompd_rel_task_handle function to release it.
25 Description of Arguments
26 The task_handle argument is an opaque task handle to be released.
29 Cross References
30 • ompd_task_handle_t type, see Section 5.3.8.
31 • ompd_rc_t type, see Section 5.3.12.
27 5.5.7.7 ompd_get_task_function
28 Summary
The ompd_get_task_function function returns the entry point of the code that corresponds to the body of a task.
31 Format
C
ompd_rc_t ompd_get_task_function (
  ompd_task_handle_t *task_handle,
  ompd_address_t *entry_point
);
C
4 Description of Arguments
5 The task_handle argument is an opaque handle that selects the task on which to operate. On return,
6 the entry_point argument is set to an address that describes the beginning of application code that
7 executes the task region.
10 Cross References
11 • ompd_address_t type, see Section 5.3.4.
12 • ompd_task_handle_t type, see Section 5.3.8.
13 • ompd_rc_t type, see Section 5.3.12.
14 5.5.7.8 ompd_get_task_frame
15 Summary
16 The ompd_get_task_frame function extracts the frame pointers of a task.
17 Format
C
ompd_rc_t ompd_get_task_frame (
  ompd_task_handle_t *task_handle,
  ompd_frame_info_t *exit_frame,
  ompd_frame_info_t *enter_frame
);
C
23 Description
24 An OpenMP implementation maintains an ompt_frame_t object for every implicit or explicit
25 task. The ompd_get_task_frame function extracts the enter_frame and exit_frame fields of
26 the ompt_frame_t object of the task that task_handle identifies.
27 Description of Arguments
28 The task_handle argument specifies an OpenMP task. On return, the exit_frame argument points to
29 an ompd_frame_info_t object that has the frame information with the same semantics as the
30 exit_frame field in the ompt_frame_t object that is associated with the specified task. On return,
31 the enter_frame argument points to an ompd_frame_info_t object that has the frame
32 information with the same semantics as the enter_frame field in the ompt_frame_t object that is
33 associated with the specified task.
3 Cross References
4 • ompt_frame_t type, see Section 4.4.4.28.
5 • ompd_address_t type, see Section 5.3.4.
6 • ompd_frame_info_t type, see Section 5.3.5.
7 • ompd_task_handle_t type, see Section 5.3.8.
8 • ompd_rc_t type, see Section 5.3.12.
9 5.5.7.9 ompd_enumerate_states
10 Summary
11 The ompd_enumerate_states function enumerates thread states that an OpenMP
12 implementation supports.
13 Format
C
ompd_rc_t ompd_enumerate_states (
  ompd_address_space_handle_t *address_space_handle,
  ompd_word_t current_state,
  ompd_word_t *next_state,
  const char **next_state_name,
  ompd_word_t *more_enums
);
C
21 Description
22 An OpenMP implementation may support only a subset of the states that the ompt_state_t
23 enumeration type defines. In addition, an OpenMP implementation may support
24 implementation-specific states. The ompd_enumerate_states call enables a tool to
25 enumerate the thread states that an OpenMP implementation supports.
26 When the current_state argument is a thread state that an OpenMP implementation supports, the
27 call assigns the value and string name of the next thread state in the enumeration to the locations to
28 which the next_state and next_state_name arguments point.
29 On return, the third-party tool owns the next_state_name string. The OMPD library allocates
30 storage for the string with the memory allocation callback that the tool provides. The tool is
31 responsible for releasing the memory.
32 On return, the location to which the more_enums argument points has the value 1 whenever one or
33 more states are left in the enumeration. On return, the location to which the more_enums argument
34 points has the value 0 when current_state is the last state in the enumeration.
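A typical enumeration loop is sketched below; the use of ompt_state_undefined as the starting current_state, the header that provides it, and the release of each name string with free (matching the malloc-based allocation callback sketched earlier) are assumptions.
C
#include <stdio.h>
#include <stdlib.h>
#include <omp-tools.h>   /* assumption: provides ompt_state_undefined */

/* Sketch: list every thread state that the OpenMP implementation supports. */
static ompd_rc_t tool_list_states(ompd_address_space_handle_t *addr_space)
{
  ompd_word_t current = ompt_state_undefined;   /* assumed starting point */
  for (;;) {
    ompd_word_t next = 0, more = 0;
    const char *name = NULL;
    ompd_rc_t rc = ompd_enumerate_states(addr_space, current, &next,
                                         &name, &more);
    if (rc != ompd_rc_ok)
      return rc;
    if (!more)
      break;                 /* current was the last supported state */
    printf("state %lld = %s\n", (long long)next, name);
    free((void *)name);      /* storage came from the tool's allocation callback */
    current = next;
  }
  return ompd_rc_ok;
}
C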
14 Cross References
15 • ompt_state_t type, see Section 4.4.4.27.
16 • ompd_address_space_handle_t type, see Section 5.3.8.
17 • ompd_rc_t type, see Section 5.3.12.
18 5.5.7.10 ompd_get_state
19 Summary
20 The ompd_get_state function obtains the state of a thread.
21 Format
C
ompd_rc_t ompd_get_state (
  ompd_thread_handle_t *thread_handle,
  ompd_word_t *state,
  ompd_wait_id_t *wait_id
);
C
27 Description
28 The ompd_get_state function returns the state of an OpenMP thread.
Description of Arguments
The thread_handle argument identifies the thread. On return, the state argument represents the state of that thread, expressed as one of the values that ompd_enumerate_states returns. If the wait_id argument is non-null, then on return it points to the wait identifier of the thread. If the thread state is not one of the specified wait states, the value to which wait_id points is undefined.
3 Cross References
4 • ompd_wait_id_t type, see Section 5.3.2.
5 • ompd_thread_handle_t type, see Section 5.3.8.
6 • ompd_rc_t type, see Section 5.3.12.
7 • ompd_enumerate_states routine, see Section 5.5.7.9.
13 Format
C
ompd_rc_t ompd_get_display_control_vars (
  ompd_address_space_handle_t *address_space_handle,
  const char * const **control_vars
);
C
18 Description
19 The ompd_get_display_control_vars function returns a NULL-terminated vector of
20 NULL-terminated strings of name/value pairs of control variables that have user controllable
21 settings and are important to the operation or performance of an OpenMP runtime system. The
22 control variables that this interface exposes include all OpenMP environment variables, settings
23 that may come from vendor or platform-specific environment variables, and other settings that
24 affect the operation or functioning of an OpenMP runtime.
25 The format of the strings is "icv-name=icv-value".
26 On return, the third-party tool owns the vector and the strings. The OMPD library must satisfy the
27 termination constraints; it may use static or dynamic memory for the vector and/or the strings and is
28 unconstrained in how it arranges them in memory. If it uses dynamic memory then the OMPD
29 library must use the allocate callback that the tool provides to ompd_initialize. The tool must
30 use the ompd_rel_display_control_vars function to release the vector and the strings.
31 Description of Arguments
32 The address_space_handle argument identifies the address space. On return, the control_vars
33 argument points to the vector of display control variables.
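The returned vector can be consumed with a simple loop, as in the following sketch.
C
#include <stdio.h>

/* Sketch: print every "icv-name=icv-value" pair and release the vector. */
static ompd_rc_t tool_dump_control_vars(ompd_address_space_handle_t *addr_space)
{
  const char * const *vars = NULL;
  ompd_rc_t rc = ompd_get_display_control_vars(addr_space, &vars);
  if (rc != ompd_rc_ok)
    return rc;
  for (const char * const *p = vars; *p != NULL; p++)
    printf("%s\n", *p);
  return ompd_rel_display_control_vars(&vars);
}
C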
3 Cross References
4 • ompd_address_space_handle_t type, see Section 5.3.8.
5 • ompd_rc_t type, see Section 5.3.12.
6 • ompd_initialize routine, see Section 5.5.1.1.
7 • ompd_rel_display_control_vars routine, see Section 5.5.8.2.
8 5.5.8.2 ompd_rel_display_control_vars
9 Summary
The ompd_rel_display_control_vars function releases a list of name/value pairs of OpenMP control variables that was previously acquired with ompd_get_display_control_vars.
12 Format
C
ompd_rc_t ompd_rel_display_control_vars (
  const char * const **control_vars
);
C
16 Description
17 The third-party tool owns the vector and strings that ompd_get_display_control_vars
18 returns. The tool must call ompd_rel_display_control_vars to release the vector and the
19 strings.
20 Description of Arguments
21 The control_vars argument is the vector of display control variables to be released.
24 Cross References
25 • ompd_rc_t type, see Section 5.3.12.
26 • ompd_get_display_control_vars routine, see Section 5.5.8.1.
5 Cross References
6 • ompd_address_space_handle_t type, see Section 5.3.8.
7 • ompd_scope_t type, see Section 5.3.9.
8 • ompd_icv_id_t type, see Section 5.3.10.
9 • ompd_rc_t type, see Section 5.3.12.
10 5.5.9.2 ompd_get_icv_from_scope
11 Summary
12 The ompd_get_icv_from_scope function returns the value of an ICV.
13 Format
C
ompd_rc_t ompd_get_icv_from_scope (
  void *handle,
  ompd_scope_t scope,
  ompd_icv_id_t icv_id,
  ompd_word_t *icv_value
);
C
20 Description
21 The ompd_get_icv_from_scope function provides access to the ICVs that
22 ompd_enumerate_icvs identifies.
23 Description of Arguments
24 The handle argument provides an OpenMP scope handle. The scope argument specifies the kind of
25 scope provided in handle. The icv_id argument specifies the ID of the requested ICV. On return,
26 the icv_value argument points to a location with the value of the requested ICV.
27 Constraints on Arguments
28 The provided handle must match the scope as defined in Section 5.3.10.
29 The provided scope must match the scope for icv_id as requested by ompd_enumerate_icvs.
8 Cross References
9 • ompd_address_space_handle_t type, see Section 5.3.8.
10 • ompd_thread_handle_t type, see Section 5.3.8.
11 • ompd_parallel_handle_t type, see Section 5.3.8.
12 • ompd_task_handle_t type, see Section 5.3.8.
13 • ompd_scope_t type, see Section 5.3.9.
14 • ompd_icv_id_t type, see Section 5.3.10.
15 • ompd_rc_t type, see Section 5.3.12.
16 • ompd_enumerate_icvs routine, see Section 5.5.9.1.
17 5.5.9.3 ompd_get_icv_string_from_scope
18 Summary
19 The ompd_get_icv_string_from_scope function returns the value of an ICV.
20 Format
C
ompd_rc_t ompd_get_icv_string_from_scope (
  void *handle,
  ompd_scope_t scope,
  ompd_icv_id_t icv_id,
  const char **icv_string
);
C
27 Description
28 The ompd_get_icv_string_from_scope function provides access to the ICVs that
29 ompd_enumerate_icvs identifies.
24 5.5.9.4 ompd_get_tool_data
25 Summary
26 The ompd_get_tool_data function provides access to the OMPT data variable stored for each
27 OpenMP scope.
28 Format
C
ompd_rc_t ompd_get_tool_data(
  void* handle,
  ompd_scope_t scope,
  ompd_word_t *value,
  ompd_address_t *ptr
);
C
5 Description of Arguments
6 The handle argument provides an OpenMP scope handle. The scope argument specifies the kind of
7 scope provided in handle. On return, the value argument points to the value field of the
8 ompt_data_t union stored for the selected scope. On return, the ptr argument points to the ptr
9 field of the ompt_data_t union stored for the selected scope.
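For example, a tool that wants the OMPT data associated with a particular task might use it as in the following sketch.
C
#include <stdio.h>

/* Sketch: read the ompt_data_t contents stored for one task scope. */
static ompd_rc_t tool_read_task_data(ompd_task_handle_t *task)
{
  ompd_word_t value = 0;
  ompd_address_t ptr;
  ompd_rc_t rc = ompd_get_tool_data(task, ompd_scope_task, &value, &ptr);
  if (rc == ompd_rc_ok)
    printf("tool data: value=%lld ptr=0x%llx\n",
           (long long)value, (unsigned long long)ptr.address);
  return rc;
}
C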
14 Cross References
15 • ompt_data_t type, see Section 4.4.4.4.
16 • ompd_address_space_handle_t type, see Section 5.3.8.
17 • ompd_thread_handle_t type, see Section 5.3.8.
18 • ompd_parallel_handle_t type, see Section 5.3.8.
19 • ompd_task_handle_t type, see Section 5.3.8.
20 • ompd_scope_t type, see Section 5.3.9.
21 • ompd_rc_t type, see Section 5.3.12.
5 Format
C
void ompd_bp_parallel_begin(void);
C
7 Description
8 The OpenMP implementation must execute ompd_bp_parallel_begin at every
9 parallel-begin event. At the point that the implementation reaches
10 ompd_bp_parallel_begin, the binding for ompd_get_curr_parallel_handle is the
11 parallel region that is beginning and the binding for ompd_get_curr_task_handle is the
12 task that encountered the parallel construct.
13 Cross References
14 • parallel construct, see Section 2.6.
15 • ompd_get_curr_parallel_handle routine, see Section 5.5.6.1.
16 • ompd_get_curr_task_handle routine, see Section 5.5.7.1.
21 Format
C
void ompd_bp_parallel_end(void);
C
23 Description
24 The OpenMP implementation must execute ompd_bp_parallel_end at every parallel-end
25 event. At the point that the implementation reaches ompd_bp_parallel_end, the binding for
26 ompd_get_curr_parallel_handle is the parallel region that is ending and the binding
27 for ompd_get_curr_task_handle is the task that encountered the parallel construct.
28 After execution of ompd_bp_parallel_end, any parallel_handle that was acquired for the
29 parallel region is invalid and should be released.
10 Format
C
void ompd_bp_task_begin(void);
C
12 Description
13 The OpenMP implementation must execute ompd_bp_task_begin immediately before starting
14 execution of a structured-block that is associated with a non-merged task. At the point that the
15 implementation reaches ompd_bp_task_begin, the binding for
16 ompd_get_curr_task_handle is the task that is scheduled to execute.
17 Cross References
18 • ompd_get_curr_task_handle routine, see Section 5.5.7.1.
23 Format
C
void ompd_bp_task_end(void);
C
7 Cross References
8 • ompd_get_curr_task_handle routine, see Section 5.5.7.1.
9 • ompd_rel_task_handle routine, see Section 5.5.7.5.
13 Format
C
void ompd_bp_thread_begin(void);
C
15 Description
16 The OpenMP implementation must execute ompd_bp_thread_begin at every
17 native-thread-begin and initial-thread-begin event. This execution occurs before the thread starts
18 the execution of any OpenMP region.
19 Cross References
20 • parallel construct, see Section 2.6.
21 • Initial task, see Section 2.12.5.
25 Format
C
void ompd_bp_thread_end(void);
C
6 Cross References
7 • parallel construct, see Section 2.6.
8 • Initial task, see Section 2.12.5.
9 • ompd_rel_thread_handle routine, see Section 5.5.5.3.
14 Format
C
void ompd_bp_device_begin(void);
C
16 Description
17 When initializing a device for execution of a target region, the implementation must execute
18 ompd_bp_device_begin. This execution occurs before the work associated with any OpenMP
19 region executes on the device.
20 Cross References
21 • Device Initialization, see Section 2.14.1.
25 Format
C
void ompd_bp_device_end(void);
C
6 Cross References
7 • Device Initialization, see Section 2.14.1.
8 • ompd_rel_address_space_handle routine, see Section 5.5.2.3.
• bash-like shells:
export OMP_SCHEDULE="dynamic"
18 As defined following Table 2.1 in Section 2.4.2, device-specific environment variables extend many
19 of the environment variables defined in this chapter. If the corresponding environment variable for
20 a specific device number, including the host device, is set, then the setting for that environment
21 variable is used to set the value of the associated ICV of the device with the corresponding device
22 number. If the corresponding environment variable that includes the _DEV suffix but no device
23 number is set, then the setting of that environment variable is used to set the value of the associated
24 ICV of any non-host device for which the device-number-specific corresponding environment
25 variable is not set. In all cases the setting of an environment variable for which a device number is
26 specified takes precedence.
27 Restrictions
28 Restrictions to device-specific environment variables are as follows:
29 • Device-specific environment variables must not correspond to environment variables that
30 initialize ICVs with global scope.
1 6.1 OMP_SCHEDULE
2 The OMP_SCHEDULE environment variable controls the schedule kind and chunk size of all loop
3 directives that have the schedule kind runtime, by setting the value of the run-sched-var ICV.
4 The value of this environment variable takes the form:
5 [modifier:]kind[, chunk]
6 where
7 • modifier is one of monotonic or nonmonotonic;
8 • kind is one of static, dynamic, guided, or auto;
9 • chunk is an optional positive integer that specifies the chunk size.
10 If the modifier is not present, the modifier is set to monotonic if kind is static; for any other
11 kind it is set to nonmonotonic.
12 If chunk is present, white space may be on either side of the “,”. See Section 2.11.4 for a detailed
13 description of the schedule kinds.
14 The behavior of the program is implementation defined if the value of OMP_SCHEDULE does not
15 conform to the above format.
Examples:
setenv OMP_SCHEDULE "guided,4"
setenv OMP_SCHEDULE "dynamic"
setenv OMP_SCHEDULE "nonmonotonic:dynamic,4"
20 Cross References
21 • run-sched-var ICV, see Section 2.4.
22 • Worksharing-Loop construct, see Section 2.11.4.
23 • Parallel worksharing-loop construct, see Section 2.16.1.
24 • omp_set_schedule routine, see Section 3.2.11.
25 • omp_get_schedule routine, see Section 3.2.12.
26 6.2 OMP_NUM_THREADS
27 The OMP_NUM_THREADS environment variable sets the number of threads to use for parallel
28 regions by setting the initial value of the nthreads-var ICV. See Section 2.4 for a comprehensive set
29 of rules about the interaction between the OMP_NUM_THREADS environment variable, the
num_threads clause, the omp_set_num_threads library routine and dynamic adjustment of the number of threads.
15 Cross References
16 • nthreads-var ICV, see Section 2.4.
17 • num_threads clause, see Section 2.6.
18 • omp_set_num_threads routine, see Section 3.2.1.
19 • omp_get_num_threads routine, see Section 3.2.2.
20 • omp_get_max_threads routine, see Section 3.2.3.
21 • omp_get_team_size routine, see Section 3.2.19.
22 6.3 OMP_DYNAMIC
23 The OMP_DYNAMIC environment variable controls dynamic adjustment of the number of threads
24 to use for executing parallel regions by setting the initial value of the dyn-var ICV.
25 The value of this environment variable must be one of the following:
26 true | false
27 If the environment variable is set to true, the OpenMP implementation may adjust the number of
28 threads to use for executing parallel regions in order to optimize the use of system resources. If
29 the environment variable is set to false, the dynamic adjustment of the number of threads is
30 disabled. The behavior of the program is implementation defined if the value of OMP_DYNAMIC is
31 neither true nor false.
3 Cross References
4 • dyn-var ICV, see Section 2.4.
5 • omp_set_dynamic routine, see Section 3.2.6.
6 • omp_get_dynamic routine, see Section 3.2.7.
7 6.4 OMP_PROC_BIND
8 The OMP_PROC_BIND environment variable sets the initial value of the bind-var ICV. The value
9 of this environment variable is either true, false, or a comma separated list of primary,
10 master (master has been deprecated), close, or spread. The values of the list set the thread
11 affinity policy to be used for parallel regions at the corresponding nested level.
12 If the environment variable is set to false, the execution environment may move OpenMP threads
13 between OpenMP places, thread affinity is disabled, and proc_bind clauses on parallel
14 constructs are ignored.
15 Otherwise, the execution environment should not move OpenMP threads between OpenMP places,
16 thread affinity is enabled, and the initial thread is bound to the first place in the place-partition-var
17 ICV prior to the first active parallel region. An initial thread that is created by a teams construct is
18 bound to the first place in its place-partition-var ICV before it begins execution of the associated
19 structured block.
20 If the environment variable is set to true, the thread affinity policy is implementation defined but
21 must conform to the previous paragraph. The behavior of the program is implementation defined if
22 the value in the OMP_PROC_BIND environment variable is not true, false, or a comma
23 separated list of primary, master (master has been deprecated), close, or spread. The
24 behavior is also implementation defined if an initial thread cannot be bound to the first place in the
25 place-partition-var ICV.
26 If the OMP_PROC_BIND environment variable is set to a comma-separated list of more than one
27 element, it also sets the max-active-levels-var ICV to the number of active levels of parallelism
28 that the implementation supports. The value of the
29 max-active-levels-var ICV may be overridden by setting OMP_MAX_ACTIVE_LEVELS or
30 OMP_NESTED. See Section 6.8 and Section 6.9 for details.
31 Examples:
32 setenv OMP_PROC_BIND false
33 setenv OMP_PROC_BIND "spread, spread, close"
5 6.5 OMP_PLACES
6 The OMP_PLACES environment variable sets the initial value of the place-partition-var ICV. A list
7 of places can be specified in the OMP_PLACES environment variable. The value of OMP_PLACES
8 can be one of two types of values: either an abstract name that describes a set of places or an
9 explicit list of places described by non-negative numbers.
10 The OMP_PLACES environment variable can be defined using an explicit ordered list of
11 comma-separated places. A place is defined by an unordered set of comma-separated non-negative
12 numbers enclosed by braces, or a non-negative number. The meaning of the numbers and how the
13 numbering is done are implementation defined. Generally, the numbers represent the smallest unit
14 of execution exposed by the execution environment, typically a hardware thread.
15 Intervals may also be used to define places. Intervals can be specified using the <lower-bound> :
16 <length> : <stride> notation to represent the following list of numbers: “<lower-bound>,
17 <lower-bound> + <stride>, ..., <lower-bound> + (<length> - 1)*<stride>.” When <stride> is
18 omitted, a unit stride is assumed. Intervals can specify numbers within a place as well as sequences
19 of places.
20 An exclusion operator “!” can also be used to exclude the number or place immediately following
21 the operator.
22 Alternatively, the abstract names listed in Table 6.1 should be understood by the execution and
23 runtime environment. The precise definitions of the abstract names are implementation defined. An
24 implementation may also add abstract names as appropriate for the target platform.
25 The abstract name may be appended by a positive number in parentheses to denote the length of the
26 place list to be created, that is abstract_name(num-places). When requesting fewer places than
27 available on the system, the determination of which resources of type abstract_name are to be
28 included in the place list is implementation defined. When requesting more resources than
29 available, the length of the place list is implementation defined.
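For illustration, assuming that threads is one of the abstract names in Table 6.1, the following settings each define a list of four places; the first uses an abstract name with a length, the second an explicit list of sets, the third intervals of numbers within each place, and the fourth an interval of places:
setenv OMP_PLACES "threads(4)"
setenv OMP_PLACES "{0,1,2,3},{4,5,6,7},{8,9,10,11},{12,13,14,15}"
setenv OMP_PLACES "{0:4},{4:4},{8:4},{12:4}"
setenv OMP_PLACES "{0:4}:4:4"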
8 where each of the last three definitions corresponds to the same 4 places including the smallest
9 units of execution exposed by the execution environment numbered, in turn, 0 to 3, 4 to 7, 8 to 11,
10 and 12 to 15.
11 Cross References
12 • place-partition-var, see Section 2.4.
13 • Controlling OpenMP thread affinity, see Section 2.6.2.
14 • omp_get_num_places routine, see Section 3.3.2.
15 • omp_get_place_num_procs routine, see Section 3.3.3.
16 • omp_get_place_proc_ids routine, see Section 3.3.4.
17 • omp_get_place_num routine, see Section 3.3.5.
18 • omp_get_partition_num_places routine, see Section 3.3.6.
19 • omp_get_partition_place_nums routine, see Section 3.3.7.
20 6.6 OMP_STACKSIZE
21 The OMP_STACKSIZE environment variable controls the size of the stack for threads created by
22 the OpenMP implementation, by setting the value of the stacksize-var ICV. The environment
23 variable does not control the size of the stack for an initial thread.
24 The value of this environment variable takes the form:
25 size | sizeB | sizeK | sizeM | sizeG
26 where:
27 • size is a positive integer that specifies the size of the stack for threads that are created by the
28 OpenMP implementation.
29 • B, K, M, and G are letters that specify whether the given size is in Bytes, Kilobytes (1024 Bytes),
30 Megabytes (1024 Kilobytes), or Gigabytes (1024 Megabytes), respectively. If one of these letters
31 is present, white space may occur between size and the letter.
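For example, the following settings request a stack of 10 Megabytes and a stack of 2000500 Bytes, respectively, for threads created by the OpenMP implementation:
setenv OMP_STACKSIZE "10M"
setenv OMP_STACKSIZE "2000500B"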
12 Cross References
13 • stacksize-var ICV, see Section 2.4.
14 6.7 OMP_WAIT_POLICY
15 The OMP_WAIT_POLICY environment variable provides a hint to an OpenMP implementation
16 about the desired behavior of waiting threads by setting the wait-policy-var ICV. A compliant
17 OpenMP implementation may or may not abide by the setting of the environment variable.
18 The value of this environment variable must be one of the following:
19 active | passive
20 The active value specifies that waiting threads should mostly be active, consuming processor
21 cycles, while waiting. An OpenMP implementation may, for example, make waiting threads spin.
22 The passive value specifies that waiting threads should mostly be passive, not consuming
23 processor cycles, while waiting. For example, an OpenMP implementation may make waiting
24 threads yield the processor to other threads or go to sleep.
25 The details of the active and passive behaviors are implementation defined.
26 The behavior of the program is implementation defined if the value of OMP_WAIT_POLICY is
27 neither active nor passive.
28 Examples:
29 setenv OMP_WAIT_POLICY ACTIVE
30 setenv OMP_WAIT_POLICY active
31 setenv OMP_WAIT_POLICY PASSIVE
32 setenv OMP_WAIT_POLICY passive
33 Cross References
34 • wait-policy-var ICV, see Section 2.4.
8 Cross References
9 • max-active-levels-var ICV, see Section 2.4.
10 • omp_set_max_active_levels routine, see Section 3.2.15.
11 • omp_get_max_active_levels routine, see Section 3.2.16.
26 Cross References
27 • max-active-levels-var ICV, see Section 2.4.
28 • omp_set_nested routine, see Section 3.2.9.
29 • omp_get_team_size routine, see Section 3.2.19.
30 • OMP_MAX_ACTIVE_LEVELS environment variable, see Section 6.8.
7 Cross References
8 • thread-limit-var ICV, see Section 2.4.
9 • omp_get_thread_limit routine, see Section 3.2.13.
10 6.11 OMP_CANCELLATION
11 The OMP_CANCELLATION environment variable sets the initial value of the cancel-var ICV.
12 The value of this environment variable must be one of the following:
13 true|false
14 If the environment variable is set to true, the effects of the cancel construct and of cancellation
15 points are enabled and cancellation is activated. If the environment variable is set to false,
16 cancellation is disabled and the cancel construct and cancellation points are effectively ignored.
17 The behavior of the program is implementation defined if OMP_CANCELLATION is set to neither
18 true nor false.
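For example, the following setting activates cancellation:
setenv OMP_CANCELLATION true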
19 Cross References
20 • cancel-var, see Section 2.4.1.
21 • cancel construct, see Section 2.20.1.
22 • cancellation point construct, see Section 2.20.2.
23 • omp_get_cancellation routine, see Section 3.2.8.
24 6.12 OMP_DISPLAY_ENV
25 The OMP_DISPLAY_ENV environment variable instructs the runtime to display the information as
26 described in the omp_display_env routine section (Section 3.15).
27 The value of the OMP_DISPLAY_ENV environment variable may be set to one of these values:
28 true | false | verbose
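For example, the following setting instructs the runtime to display this information:
setenv OMP_DISPLAY_ENV true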
11 Cross References
12 • omp_display_env routine, see Section 3.15.
13 6.13 OMP_DISPLAY_AFFINITY
14 The OMP_DISPLAY_AFFINITY environment variable instructs the runtime to display formatted
15 affinity information for all OpenMP threads in the parallel region upon entering the first parallel
16 region and when any change occurs in the information accessible by the format specifiers listed in
17 Table 6.2. If affinity of any thread in a parallel region changes then thread affinity information for
18 all threads in that region is displayed. If the thread affinity for each respective parallel region at
19 each nesting level has already been displayed and the thread affinity has not changed, then the
20 information is not displayed again. Thread affinity information for threads in the same parallel
21 region may be displayed in any order.
22 The value of the OMP_DISPLAY_AFFINITY environment variable may be set to one of these
23 values:
24 true | false
25 The true value instructs the runtime to display the OpenMP thread affinity information, using
26 the format setting defined in the affinity-format-var ICV.
27 The runtime does not display the OpenMP thread affinity information when the value of the
28 OMP_DISPLAY_AFFINITY environment variable is false or undefined. For all values of the
29 environment variable other than true or false, the display action is implementation defined.
30 Example:
31 setenv OMP_DISPLAY_AFFINITY TRUE
32 The above example causes an OpenMP implementation to display OpenMP thread affinity
33 information during execution of the program, in a format given by the affinity-format-var ICV. The
34 following is a sample output:
3 Cross References
4 • Controlling OpenMP thread affinity, see Section 2.6.2.
5 • omp_set_affinity_format routine, see Section 3.3.8.
6 • omp_get_affinity_format routine, see Section 3.3.9.
7 • omp_display_affinity routine, see Section 3.3.10.
8 • omp_capture_affinity routine, see Section 3.3.11.
9 • OMP_AFFINITY_FORMAT environment variable, see Section 6.14.
10 6.14 OMP_AFFINITY_FORMAT
11 The OMP_AFFINITY_FORMAT environment variable sets the initial value of the
12 affinity-format-var ICV which defines the format when displaying OpenMP thread affinity
13 information.
14 The value of this environment variable is case sensitive and leading and trailing whitespace is
15 significant.
16 The value of this environment variable is a character string that may contain as substrings one or
17 more field specifiers, in addition to other characters. The format of each field specifier is
18 %[[[0].] size ] type
19 where an individual field specifier must contain the percent symbol (%) and a type. The type can be
20 a single character short name or its corresponding long name delimited with curly braces, such as
21 %n or %{thread_num}. A literal percent is specified as %%. Field specifiers can be provided in
22 any order.
23 The 0 modifier indicates that leading zeros are added to the output, following any indication
24 of sign or base. The . modifier indicates that the output should be right justified when size is specified.
25 By default, output is left justified. The minimum field length is size, which is a decimal digit string
26 with a non-zero first digit. If no size is specified, the actual length needed to print the field will be
27 used. If the 0 modifier is used with type of A, {thread_affinity}, H, {host}, or a type that
28 is not printed as a number, the result is unspecified. Any other characters in the format string that
29 are not part of a field specifier will be included literally in the output.
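For illustration, the following is a possible format setting; it assumes the short names L (nesting level), n (thread number), A (thread affinity) and H (host) from Table 6.2:
setenv OMP_AFFINITY_FORMAT "Thread Affinity: %0.3L %n %A %H"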
7 The above example causes an OpenMP implementation to display OpenMP thread affinity
8 information in the following form:
9 Thread Affinity: 001 0 0-1,16-17 nid003
10 Thread Affinity: 001 1 2-3,18-19 nid003
11 Cross References
12 • Controlling OpenMP thread affinity, see Section 2.6.2.
13 • omp_set_affinity_format routine, see Section 3.3.8.
5 6.15 OMP_DEFAULT_DEVICE
6 The OMP_DEFAULT_DEVICE environment variable sets the device number to use in device
7 constructs by setting the initial value of the default-device-var ICV.
8 The value of this environment variable must be a non-negative integer value.
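For example, the following setting selects the device with device number 1 as the default device; the specific number is illustrative only:
setenv OMP_DEFAULT_DEVICE 1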
9 Cross References
10 • default-device-var ICV, see Section 2.4.
11 • Device directives, see Section 2.14.
12 6.16 OMP_MAX_TASK_PRIORITY
13 The OMP_MAX_TASK_PRIORITY environment variable controls the use of task priorities by
14 setting the initial value of the max-task-priority-var ICV. The value of this environment variable
15 must be a non-negative integer.
16 Example:
17 % setenv OMP_MAX_TASK_PRIORITY 20
18 Cross References
19 • max-task-priority-var ICV, see Section 2.4.
20 • Tasking Constructs, see Section 2.12.
21 • omp_get_max_task_priority routine, see Section 3.5.1.
22 6.17 OMP_TARGET_OFFLOAD
23 The OMP_TARGET_OFFLOAD environment variable sets the initial value of the target-offload-var
24 ICV. The value of the OMP_TARGET_OFFLOAD environment variable must be one of the
25 following:
26 mandatory | disabled | default
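For example, the following setting sets the target-offload-var ICV to mandatory:
setenv OMP_TARGET_OFFLOAD mandatory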
8 Cross References
9 • target-offload-var ICV, see Section 2.4.
10 • Device Directives, see Section 2.14.
11 • Device Memory Routines, see Section 3.8.
12 6.18 OMP_TOOL
13 The OMP_TOOL environment variable sets the tool-var ICV, which controls whether an OpenMP
14 runtime will try to register a first party tool.
15 The value of this environment variable must be one of the following:
16 enabled | disabled
17 If OMP_TOOL is set to any value other than enabled or disabled, the behavior is unspecified.
18 If OMP_TOOL is not defined, the default value for tool-var is enabled.
19 Example:
20 % setenv OMP_TOOL enabled
21 Cross References
22 • tool-var ICV, see Section 2.4.
23 • OMPT Interface, see Chapter 4.
24 6.19 OMP_TOOL_LIBRARIES
25 The OMP_TOOL_LIBRARIES environment variable sets the tool-libraries-var ICV to a list of tool
26 libraries that are considered for use on a device on which an OpenMP implementation is being
27 initialized. The value of this environment variable must be a list of names of dynamically-loadable
28 libraries, separated by an implementation specific, platform typical separator. Whether the value of
29 this environment variable is case sensitive is implementation defined.
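For illustration, on a platform whose typical separator is a colon, a setting might name two hypothetical tool libraries:
setenv OMP_TOOL_LIBRARIES "libtoolA.so:libtoolB.so"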
11 Cross References
12 • tool-libraries-var ICV, see Section 2.4.
13 • OMPT Interface, see Chapter 4.
14 • ompt_start_tool routine, see Section 4.2.1.
15 6.20 OMP_TOOL_VERBOSE_INIT
16 The OMP_TOOL_VERBOSE_INIT environment variable sets the tool-verbose-init-var ICV, which
17 controls whether an OpenMP implementation will verbosely log the registration of a tool.
18 The value of this environment variable must be one of the following:
19 disabled | stdout | stderr | <filename>
20 If OMP_TOOL_VERBOSE_INIT is set to any value other than disabled, stdout, or stderr
21 (compared case insensitively), the value is interpreted as a filename and the OpenMP runtime will
22 try to log to a file whose name begins with that prefix. If the value is interpreted as a filename, whether it is case sensitive is
23 implementation defined. If opening the logfile fails, the output will be redirected to stderr. If
24 OMP_TOOL_VERBOSE_INIT is not defined, the default value for tool-verbose-init-var is
25 disabled. Support for logging to stdout or stderr is implementation defined. Unless
26 tool-verbose-init-var is disabled, the OpenMP runtime will log the steps of the tool activation
27 process defined in Section 4.2.2 to a file with a name that is constructed using the provided
28 filename prefix. The format and detail of the log is implementation defined. At a minimum, the log
29 will contain the following:
30 • either that tool-var is disabled, or
31 • an indication that a tool was available in the address space at program launch, or
32 • the path name of each tool in OMP_TOOL_LIBRARIES that is considered for dynamic loading,
33 whether dynamic loading was successful, and whether the ompt_start_tool function is
34 found in the loaded library.
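For example, the first of the following settings directs the registration log to stderr, while the second uses the hypothetical prefix tool_init.log for the constructed log file name:
setenv OMP_TOOL_VERBOSE_INIT stderr
setenv OMP_TOOL_VERBOSE_INIT tool_init.log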
7 Cross References
8 • tool-verbose-init-var ICV, see Section 2.4.
9 • OMPT Interface, see Chapter 4.
10 6.21 OMP_DEBUG
11 The OMP_DEBUG environment variable sets the debug-var ICV, which controls whether an
12 OpenMP runtime collects information that an OMPD library may need to support a tool.
13 The value of this environment variable must be one of the following:
14 enabled | disabled
15 If OMP_DEBUG is set to any value other than enabled or disabled then the behavior is
16 implementation defined.
17 Example:
18 % setenv OMP_DEBUG enabled
19 Cross References
20 • debug-var ICV, see Section 2.4.
21 • OMPD Interface, see Chapter 5.
22 • Enabling the Runtime for OMPD, see Section 5.2.1.
23 6.22 OMP_ALLOCATOR
24 The OMP_ALLOCATOR environment variable sets the initial value of the def-allocator-var ICV
25 that specifies the default allocator for allocation calls, directives and clauses that do not specify an
26 allocator.
27 The following grammar describes the values accepted for the OMP_ALLOCATOR environment
28 variable.
1 value can be an integer only if the trait accepts a numerical value; for the fb_data trait the value
2 can only be predef-allocator. If the value of this environment variable is not a predefined allocator,
3 then a new allocator with the given predefined memory space and optional traits is created and set
4 as the def-allocator-var ICV. If the new allocator cannot be created, the def-allocator-var ICV will
5 be set to omp_default_mem_alloc.
6 Example:
7 setenv OMP_ALLOCATOR omp_high_bw_mem_alloc
8 setenv OMP_ALLOCATOR omp_large_cap_mem_space:alignment=16,\
9 pinned=true
10 setenv OMP_ALLOCATOR omp_high_bw_mem_space:pool_size=1048576,\
11 fallback=allocator_fb,fb_data=omp_low_lat_mem_alloc
12 Cross References
13 • def-allocator-var ICV, see Section 2.4.
14 • Memory allocators, see Section 2.13.2.
15 • omp_set_default_allocator routine, see Section 3.13.4.
16 • omp_get_default_allocator routine, see Section 3.13.5.
17 • omp_alloc and omp_aligned_alloc routines, see Section 3.13.6.
18 • omp_calloc and omp_aligned_calloc routines, see Section 3.13.8.
19 6.23 OMP_NUM_TEAMS
20 The OMP_NUM_TEAMS environment variable sets the maximum number of teams created by a
21 teams construct by setting the nteams-var ICV.
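For example, the following setting limits the number of teams created by a teams construct to at most four; the specific value is illustrative only:
setenv OMP_NUM_TEAMS 4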
4 Cross References
5 • nteams-var ICV, see Section 2.4.
6 • omp_get_max_teams routine, see Section 3.4.4.
7 6.24 OMP_TEAMS_THREAD_LIMIT
8 The OMP_TEAMS_THREAD_LIMIT environment variable sets the maximum number of OpenMP
9 threads to use in each contention group created by a teams construct by setting the
10 teams-thread-limit-var ICV.
11 The value of this environment variable must be a positive integer. The behavior of the program is
12 implementation defined if the requested value of OMP_TEAMS_THREAD_LIMIT is greater than
13 the number of threads that an implementation can support, or if the value is not a positive integer.
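For example, the following setting limits each contention group created by a teams construct to at most eight OpenMP threads:
setenv OMP_TEAMS_THREAD_LIMIT 8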
14 Cross References
15 • teams-thread-limit-var ICV, see Section 2.4.
16 • omp_get_teams_thread_limit routine, see Section 3.4.6.
6 Chapter 1:
7 • Processor: A hardware unit that is implementation defined (see Section 1.2.1).
8 • Device: An implementation defined logical execution engine (see Section 1.2.1).
9 • Device pointer: an implementation defined handle that refers to a device address (see
10 Section 1.2.6).
11 • Supported active levels of parallelism: The maximum number of active parallel regions that
12 may enclose any region of code in the program is implementation defined (see Section 1.2.7).
13 • Memory model: The minimum size at which a memory update may also read and write back
14 adjacent variables that are part of another variable (as array or structure elements) is
15 implementation defined but is no larger than required by the base language. The manner in which
16 a program can obtain the referenced device address from a device pointer, outside the
17 mechanisms specified by OpenMP, is implementation defined (see Section 1.4.1).
18 Chapter 2:
19 • OpenMP context: Whether the dispatch construct is added to the construct set, the accepted
20 isa-name values for the isa trait, the accepted arch-name values for the arch trait, and the
21 accepted extension-name values for the extension trait are implementation defined (see
22 Section 2.3.1).
23 • Metadirectives: The number of times that each expression of the context selector of a when
24 clause is evaluated is implementation defined (see Section 2.3.4).
25 • Declare variant directive: If two replacement candidates have the same score, their order is
26 implementation defined. The number of times each expression of the context selector of a match
27 clause is evaluated is implementation defined. For calls to constexpr base functions that are
28 evaluated in constant expressions, whether any variant replacement occurs is implementation
29 defined. Any differences that the specific OpenMP context requires in the prototype of the
30 variant from the base function prototype are implementation defined (see Section 2.3.5).
1 • Internal control variables: The initial values of dyn-var, nthreads-var, run-sched-var,
2 def-sched-var, bind-var, stacksize-var, wait-policy-var, thread-limit-var, max-active-levels-var,
3 place-partition-var, affinity-format-var, default-device-var, num-procs-var and def-allocator-var
4 are implementation defined (see Section 2.4.2).
5 • requires directive: Support for any feature specified by a requirement clause on a
6 requires directive is implementation defined (see Section 2.5.1).
7 • Dynamic adjustment of threads: Providing the ability to adjust the number of threads
8 dynamically is implementation defined (see Section 2.6.1).
9 • Thread affinity: For the close thread affinity policy, if the number T of threads in the team is greater than the number P of places and P does not divide T evenly,
10 the exact number of threads in a particular place is implementation defined. For the spread
11 thread affinity policy, if T > P and P does not divide T evenly, the exact number of threads in a
12 particular subpartition is implementation defined. The determination of whether the affinity
13 request can be fulfilled is implementation defined. If not, the mapping of threads in the team to
14 places is implementation defined (see Section 2.6.2).
15 • teams construct: The number of teams that are created is implementation defined; it is greater
16 than or equal to the lower bound and less than or equal to the upper bound values of the
17 num_teams clause if that clause is specified, or it is less than or equal to the value of the nteams-var ICV if
18 its value is greater than zero. Otherwise, it is greater than or equal to 1. The maximum number of
19 threads that participate in the contention group that each team initiates is implementation defined
20 if no thread_limit clause is specified on the construct. The assignment of the initial threads
21 to places and the values of the place-partition-var and default-device-var ICVs for each initial
22 thread are implementation defined (see Section 2.7).
23 • sections construct: The method of scheduling the structured blocks among threads in the
24 team is implementation defined (see Section 2.10.1).
25 • single construct: The method of choosing a thread to execute the structured block each time
26 the team encounters the construct is implementation defined (see Section 2.10.2).
27 • Canonical loop nest form: The particular integer type used to compute the iteration count for
28 the collapsed loop is implementation defined (see Section 2.11.1).
29 • Worksharing-loop directive: The effect of the schedule(runtime) clause when the
30 run-sched-var ICV is set to auto is implementation defined. The value of simd_width for the
31 simd schedule modifier is implementation defined (see Section 2.11.4).
32 • simd construct: The number of iterations that are executed concurrently at any given time is
33 implementation defined. If the alignment parameter is not specified in the aligned clause, the
34 default alignments for the SIMD instructions are implementation defined (see Section 2.11.5.1).
35 • declare simd directive: If the parameter of the simdlen clause is not a constant positive
36 integer expression, the number of concurrent arguments for the function is implementation
37 defined. If the alignment parameter of the aligned clause is not specified, the default
38 alignments for SIMD instructions are implementation defined (see Section 2.11.5.3).
8 Chapter 3:
C / C++
9 • Runtime library definitions: The enum types for omp_allocator_handle_t,
10 omp_event_handle_t, omp_interop_type_t and omp_memspace_handle_t are
11 implementation defined. The integral or pointer type for omp_interop_t is implementation
12 defined (see Section 3.1).
C / C++
Fortran
13 • Runtime library definitions: Whether the include file omp_lib.h or the module omp_lib
14 (or both) is provided is implementation defined. Whether the omp_lib.h file provides
15 derived-type definitions or those routines that require an explicit interface is implementation
16 defined. Whether any of the OpenMP runtime library routines that take an argument are
17 extended with a generic interface so arguments of different KIND type can be accommodated is
18 implementation defined (see Section 3.1).
Fortran
19 • omp_set_num_threads routine: If the argument is not a positive integer the behavior is
20 implementation defined (see Section 3.2.1).
21 • omp_set_schedule routine: For implementation-specific schedule kinds, the values and
22 associated meanings of the second argument are implementation defined (see Section 3.2.11).
23 • omp_get_schedule routine: The value returned by the second argument is implementation
24 defined for any schedule kinds other than static, dynamic and guided (see Section 3.2.12).
25 • omp_get_supported_active_levels routine: The number of active levels of
26 parallelism supported by the implementation is implementation defined, but must be greater than
27 0 (see Section 3.2.14).
28 • omp_set_max_active_levels routine: If the argument is not a non-negative integer then
29 the behavior is implementation defined (see Section 3.2.15).
23 Chapter 4:
24 • ompt_callback_sync_region_wait, ompt_callback_mutex_released,
25 ompt_callback_dependences, ompt_callback_task_dependence,
26 ompt_callback_work, ompt_callback_master (deprecated),
27 ompt_callback_masked, ompt_callback_target_map,
28 ompt_callback_target_map_emi, ompt_callback_sync_region,
29 ompt_callback_reduction, ompt_callback_lock_init,
30 ompt_callback_lock_destroy, ompt_callback_mutex_acquire,
31 ompt_callback_mutex_acquired, ompt_callback_nest_lock,
32 ompt_callback_flush, ompt_callback_cancel and
33 ompt_callback_dispatch tool callbacks: If a tool attempts to register a callback with the
34 string name using the runtime entry point ompt_set_callback (see Table 4.3), whether the
35 registered callback may never, sometimes or always invoke this callback for the associated events
36 is implementation defined (see Section 4.2.4).
24 Chapter 5:
25 • ompd_callback_print_string_fn_t callback function: The value of category is
26 implementation defined (see Section 5.4.5).
27 • ompd_parallel_handle_compare operation: The means by which parallel region
28 handles are ordered is implementation defined (see Section 5.5.6.5).
29 • ompd_task_handle_compare operation: The means by which task handles are ordered is
30 implementation defined (see Section 5.5.7.6).
31 Chapter 6:
32 • OMP_SCHEDULE environment variable: If the value does not conform to the specified format
33 then the behavior of the program is implementation defined (see Section 6.1).
34 • OMP_NUM_THREADS environment variable: If any value of the list specified leads to a number
35 of threads that is greater than the implementation can support, or if any value is not a positive
36 integer, then the behavior of the program is implementation defined (see Section 6.2).
1 B.2 Version 5.0 to 5.1 Differences
2 • Full support of C11, C++11, C++14, C++17, C++20 and Fortran 2008 was completed (see
3 Section 1.7).
4 • Various changes throughout the specification were made to provide initial support of Fortran
5 2018 (see Section 1.7).
6 • The OpenMP directive syntax was extended to include C++ attribute specifiers (see Section 2.1).
7 • The omp_all_memory reserved locator was added (see Section 2.1), and the depend clause
8 was extended to allow its use (see Section 2.19.11).
9 • The target_device trait set was added to the OpenMP Context (see Section 2.3.1), and the
10 target_device selector set was added to context selectors (see Section 2.3.2).
11 • For C/C++, the declare variant directive was extended to support elision of preprocessed code
12 and to allow enclosed function definitions to be interpreted as variant functions (see
13 Section 2.3.5).
14 • The declare variant directive was extended with new clauses (adjust_args and
15 append_args) that support adjustment of the interface between the original function and its
16 variants (see Section 2.3.5).
17 • The dispatch construct was added to allow users to control when variant substitution happens
18 and to define additional information that can be passed as arguments to the function variants (see
19 Section 2.3.6).
20 • To support device-specific ICV settings the environment variable syntax was extended to support
21 device-specific variables (see Section 2.4.2 and Section 6).
22 • The assume directive was added to allow users to specify invariants (see Section 2.5.2).
23 • To support clarity in metadirectives, the nothing directive was added (see Section 2.5.3).
24 • To allow users to control the compilation process and runtime error actions, the error directive
25 was added (see Section 2.5.4).
26 • The masked construct was added to support restricting execution to a specific thread (see
27 Section 2.8).
28 • The scope directive was added to support reductions without requiring a parallel or
29 worksharing region (see Section 2.9).
30 • Loop transformation constructs were added (see Section 2.11.9).
31 • The grainsize and num_tasks clauses for the taskloop construct were extended with a
32 strict modifier to ensure a deterministic distribution of logical iterations to tasks (see
33 Section 2.12.2).