A Tutorial On Parallel Computing On Shared Memory Systems
History:
•In the early 1990s, vendors of shared-memory machines supplied similar, directive-based Fortran programming extensions.
•The user would augment a serial Fortran program with directives specifying which loops were to be parallelized.
•The compiler would be responsible for automatically parallelizing such loops across the shared-memory processors. Implementations were all
functionally similar but were diverging.
•The first attempt at a standard was ANSI X3H5 in 1994. It was never adopted, since distributed memory machines were becoming popular.
•However, not long after this, newer shared memory machine architectures started to become prevalent, and interest resumed.
•The OpenMP standard specification started in the spring of 1997, taking over where ANSI X3H5 had left off.
Shared Memory Model
• OpenMP is designed for multi-processor/multi-core, shared memory
machines. The underlying architecture can be shared memory Uniform
Memory Access (UMA) or Non-Uniform Memory Access (NUMA).
A typical Fortran worksharing construct, combining a parallel region, a parallel DO, and a sum reduction:
!$omp parallel shared(n) private(i,j,prime)
!$omp do reduction(+:sum)
…Code body…
!$omp end do
!$omp end parallel
Other solutions: Synchronization constructs
• CRITICAL directive:
1) The CRITICAL directive specifies a region of code that must
be executed by only one thread at a time.
2) If a thread is currently executing inside a CRITICAL region
and another thread reaches that CRITICAL region and
attempts to execute it, the second thread will block until the
first thread exits that CRITICAL region.
SINGLE, MASTER, BARRIER: examples
Some practical examples
• A matrix multiplication code: naïve parallelization example.
• A more generic approach for solutions on shared and
distributed memory systems – domain decomposition:
Gauss-Seidel.