TLP
¨ Announcement
¤ Homework 4 is due on Dec. 11th
¨ This lecture
¤ Thread level parallelism (TLP)
¤ Parallel architectures for exploiting TLP
n Hardware multithreading
n Symmetric multiprocessors
n Chip multiprocessing
Flynn’s Taxonomy
¨ Forms of computer architectures
Data Stream    Instruction Stream: Single                                        Instruction Stream: Multiple
Single         Single-Instruction, Single Data (SISD)                            Multiple-Instruction, Single Data (MISD)
Multiple       Single-Instruction, Multiple Data (SIMD), e.g. vector processors  Multiple-Instruction, Multiple Data (MIMD), e.g. multiprocessors
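The two-by-two classification above can be expressed as a small lookup. This is a minimal sketch; the function name `flynn_class` and its boolean parameters are hypothetical, introduced only for illustration.

```python
def flynn_class(multiple_instruction_streams: bool, multiple_data_streams: bool) -> str:
    """Return the Flynn class name (illustrative helper, not from the lecture)."""
    instr = "M" if multiple_instruction_streams else "S"
    data = "M" if multiple_data_streams else "S"
    return f"{instr}I{data}D"


if __name__ == "__main__":
    # One instruction stream, many data streams: the vector-processor case.
    print(flynn_class(False, True))
    # Many instruction streams, many data streams: the multiprocessor case.
    print(flynn_class(True, True))
```

The multiprocessors discussed in the rest of this lecture fall in the MIMD quadrant.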
Basics of Threads
¨ A thread is a single sequential flow of control within a
program, including its instructions and state
¤ The register state is called the thread context
¨ A program may be single- or multi-threaded
¤ A single-threaded program can handle only one task at a
time
¨ Modern operating systems perform multitasking by context
switching: the old thread’s context is written back to
memory and the context of a new thread is loaded
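The context switch described above can be sketched as a toy model. This is a simplification under stated assumptions: the `Thread` class, the `cpu_registers` dictionary, and the register names `pc` and `r0` are hypothetical, and a real OS saves and restores context in kernel code rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    name: str
    # Saved register state (the "thread context"), held in memory while not running.
    context: dict = field(default_factory=lambda: {"pc": 0, "r0": 0})


def context_switch(cpu_registers: dict, old: Thread, new: Thread) -> None:
    """Write the old thread's registers back to memory, then load the new thread's."""
    old.context = dict(cpu_registers)   # save the outgoing context
    cpu_registers.clear()
    cpu_registers.update(new.context)   # restore the incoming context


def run_one_step(cpu_registers: dict) -> None:
    """Pretend to execute one instruction: advance the PC and bump a counter register."""
    cpu_registers["pc"] += 4
    cpu_registers["r0"] += 1


if __name__ == "__main__":
    a, b = Thread("A"), Thread("B")
    cpu = dict(a.context)        # thread A runs first
    run_one_step(cpu)
    run_one_step(cpu)
    context_switch(cpu, a, b)    # A is descheduled, B starts
    run_one_step(cpu)
    context_switch(cpu, b, a)    # A resumes exactly where it stopped
    print(a.context, b.context, cpu)
```

Each thread sees its own register values after being rescheduled, which is what lets a single CPU interleave several threads.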
Thread Level Parallelism (TLP)
¨ Users often run multiple applications at the same time
¤ Example: piping applications in Linux
n gunzip -c foo.gz | grep bar | perl some-script.pl
[Figure: simultaneous multithreading, fine-grained multithreading, multiprocessing]
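The shell pipeline above runs its three stages concurrently, each consuming the previous stage's output. A minimal sketch of the same structure with threads and queues follows. The stage functions and the upper-casing step are hypothetical stand-ins for gunzip, grep, and the Perl script, not their actual behavior, and the shell uses separate processes rather than threads.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream


def producer(lines, out_q):
    """Stand-in for `gunzip -c`: emit lines one at a time."""
    for line in lines:
        out_q.put(line)
    out_q.put(DONE)


def filter_stage(in_q, out_q, needle):
    """Stand-in for `grep`: forward only lines containing `needle`."""
    while (item := in_q.get()) is not DONE:
        if needle in item:
            out_q.put(item)
    out_q.put(DONE)


def consumer(in_q, results):
    """Stand-in for the final script: here it only upper-cases each line."""
    while (item := in_q.get()) is not DONE:
        results.append(item.upper())


def run_pipeline(lines, needle):
    """Run the three stages as concurrent threads connected by FIFO queues."""
    q1, q2, results = queue.Queue(), queue.Queue(), []
    stages = [
        threading.Thread(target=producer, args=(lines, q1)),
        threading.Thread(target=filter_stage, args=(q1, q2, needle)),
        threading.Thread(target=consumer, args=(q2, results)),
    ]
    for t in stages:
        t.start()
    for t in stages:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(["foo bar", "baz", "bar none"], "bar"))
```

Because each stage is its own flow of control, a machine with several hardware contexts can overlap the stages instead of running them one after another.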
Symmetric Multiprocessors
¨ Multiple CPU chips share the same memory
¨ From the OS’s point of view
¤ All of the CPUs have equal compute capabilities
¤ The main memory is equally accessible by all of the CPU chips
¨ The OS runs each thread on one of the CPUs
¨ Every CPU chip has its own power distribution and cooling system
[Figure: applications running on an OS over CPU 0 to CPU 3; example system: AMD Opteron]
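Because every CPU in an SMP looks the same to the OS, a ready thread can be placed on any of them. The toy assignment below sketches that idea. The function `assign_threads`, the thread names, and the round-robin policy are assumptions for illustration, not how any particular OS schedules.

```python
def assign_threads(thread_names, num_cpus):
    """Round-robin placement: all CPUs are equal, so any CPU may run any thread."""
    if num_cpus <= 0:
        raise ValueError("need at least one CPU")
    placement = {cpu: [] for cpu in range(num_cpus)}
    for i, name in enumerate(thread_names):
        placement[i % num_cpus].append(name)
    return placement


if __name__ == "__main__":
    # Five hypothetical threads spread over four identical CPUs.
    print(assign_threads(["t0", "t1", "t2", "t3", "t4"], 4))
```

No CPU is special in this placement, which mirrors the symmetric view the OS has of the machine.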
Chip Multiprocessors
¨ Can be viewed as a simple SMP on a single chip
¨ CPUs are now called cores
¤ One thread per core
¨ Shared higher-level caches
¤ Typically the last level
¤ Lower latency
¤ Improved bandwidth
[Figure: Core 0, Core 1, …, Core 3 above a shared cache]