Parallel Computing
Parallel Computers
Parallel processing/computing:
Speed-up
s(p) = T1/Tp
T1 = Tseq + Tpar
Tseq -- execution time of sequential part
Tpar -- execution time of parallelisable part
Execution on p processors
Tp = Tseq + Tpar/p
s(p) = (Tseq + Tpar)/(Tseq + Tpar/p)
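The speed-up model above can be evaluated numerically. The following is a minimal sketch in Python; the function names and the sample times (10 s sequential, 90 s parallelisable) are illustrative assumptions, not values from the slides.

```python
def execution_time(t_seq, t_par, p):
    """Tp = Tseq + Tpar/p: only the parallelisable part is divided among p processors."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return t_seq + t_par / p


def speedup(t_seq, t_par, p):
    """s(p) = T1/Tp with T1 = Tseq + Tpar."""
    t1 = t_seq + t_par
    return t1 / execution_time(t_seq, t_par, p)


# Hypothetical workload: 10 s sequential part, 90 s parallelisable part.
for p in (1, 2, 4, 8, 16):
    print(p, round(speedup(10.0, 90.0, p), 2))
```

Under this model the speed-up can never exceed T1/Tseq, however many processors are added, because the sequential part is not shortened.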
Shared memory parallel computers
Rpeak,p = p · Rpeak,1
Examples:
• Cray J32
– f = 100 MHz, µpr = 2, Rpeak,1 = 200 Mflops
– p = 32, Rpeak,32 = 6.4 Gflops
• NEC SX-5
– f = 250 MHz, µpr = 32, Rpeak,1 = 8 Gflops
– p = 16, Rpeak,16 = 128 Gflops
– n = 32 nodes (NUMA), Rpeak,32×16 ≈ 4 Tflops
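The figures in these examples can be reproduced with a short calculation. This sketch assumes that µpr is the number of floating-point results per clock cycle, which is what the listed numbers imply (f × µpr = Rpeak,1). The function name `peak_mflops` and its parameters are hypothetical.

```python
def peak_mflops(f_mhz, results_per_cycle, p=1, nodes=1):
    """Theoretical peak in Mflops: clock rate x results per cycle x processors x nodes.

    Assumes results_per_cycle corresponds to the slides' µpr.
    """
    return f_mhz * results_per_cycle * p * nodes


# Cray J32
print(peak_mflops(100, 2))          # one processor
print(peak_mflops(100, 2, p=32))    # 32 processors

# NEC SX-5
print(peak_mflops(250, 32))                     # one processor
print(peak_mflops(250, 32, p=16))               # one 16-processor node
print(peak_mflops(250, 32, p=16, nodes=32))     # 32 nodes
```

The last value is 4,096,000 Mflops, which the slides round to 4 Tflops. These are peak rates; sustained benchmark performance is lower.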
Benchmark performance
• Benchmark
http://www.top500.org/
Interconnection structures for parallel computers
Bisection or cross-section bandwidth
• Torus.
• Twisted torus.
Binary tree
Trees
• Remote communication -- O(log(n)).
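The logarithmic cost of remote communication can be illustrated by counting hops. This sketch assumes one particular layout that the slides do not specify: processors sit at the leaves of a complete binary tree, are numbered 0 to n-1 from left to right, and n is a power of two. The function names are hypothetical.

```python
def tree_hops(i, j):
    """Links traversed between leaves i and j of a complete binary tree.

    A message climbs until it reaches the lowest common ancestor, then descends.
    The number of levels to climb equals the bit length of i XOR j.
    """
    return 2 * (i ^ j).bit_length()


def worst_case_hops(n):
    """Longest leaf-to-leaf path for n leaves (n a power of two): 2 * log2(n)."""
    return 2 * (n - 1).bit_length()


print(tree_hops(0, 1))        # siblings: up one level, down one level
print(tree_hops(0, 7))        # opposite ends of an 8-leaf tree
print(worst_case_hops(1024))  # grows with log2(n), not with n
```

Doubling the number of processors adds only two hops to the longest path, which is the O(log(n)) behaviour noted above.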