Hardware Multithreading

This document discusses hardware multithreading and its main variants. It defines a thread as a flow of execution through a process's code, with its own program counter. Threads come in two kinds: user-level threads, managed in user space, and kernel-level threads, managed by the operating system kernel. Multithreading allows a CPU to execute multiple threads concurrently. There are three main types of hardware multithreading: coarse-grain, which switches threads on expensive operations; fine-grain, which switches every cycle; and simultaneous multithreading (SMT), which exploits instruction-level and thread-level parallelism at the same time. Multithreading improves processor resource utilisation but can degrade single-thread performance and increase design complexity.


HARDWARE MULTITHREADING
JAHANGIR ABBAS 15091519-091

SAAD MATEEN 15091519-098

SHAFAQAT ALI 15091519-137


What is a Thread?
A thread is a flow of execution through a process's code, with its own program counter that keeps track of which instruction to execute next, its own registers that hold its current working variables, and its own stack that contains the execution history.
A thread is also called a lightweight process.
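For a concrete picture, here is a minimal C++ sketch (illustration only, not part of the slide material): two software threads run the same function, each as a separate flow of execution with its own program counter and stack.

    #include <iostream>
    #include <thread>

    // Each thread executes this function independently, with its own
    // program counter, registers and stack.
    void count_to(int limit, const char* name) {
        for (int i = 1; i <= limit; ++i)
            std::cout << name << ": " << i << '\n';
    }

    int main() {
        std::thread t1(count_to, 3, "thread-1");  // first flow of execution
        std::thread t2(count_to, 3, "thread-2");  // second flow of execution
        t1.join();   // wait for both flows to finish
        t2.join();
        return 0;
    }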
Types of Thread
Threads are implemented in the following two ways:
• User Level Threads − user-managed threads, handled in user space
• Kernel Level Threads − threads managed by the operating system kernel (the core of the operating system)
Multithreading
In computer architecture, multithreading is the ability of a central processing unit (CPU), or a single core in a multi-core processor, to provide multiple threads of execution concurrently, supported by the operating system.
• What are the differences between software multithreading and hardware multithreading?
Software: OS support for several concurrent threads
– Large number of threads (effectively unlimited)
– ‘Heavy’ context switching
Hardware: CPU support for several instruction flows
– Limited number of threads (typically 2 or 4)
– ‘Light’/‘immediate’ context switching
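For example, standard C++ can report how many hardware threads the CPU exposes (a small, self-contained sketch):

    #include <iostream>
    #include <thread>

    int main() {
        // Number of concurrent hardware threads the CPU can run; with SMT
        // this is typically 2x the core count. May return 0 if unknown.
        unsigned n = std::thread::hardware_concurrency();
        std::cout << "Hardware threads: " << n << '\n';
        return 0;
    }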
MULTITHREADING TYPES
• Coarse-grain multithreading
• Fine-grain multithreading
• Simultaneous Multi-Threading (SMT)
Coarse-grain Multithreading
• Threads are switched upon ‘expensive’ operations
• A single thread runs until a costly stall
– E.g. a second-level cache miss
• Another thread starts during the stall of the first
– Pipeline fill time requires several cycles!
• Does not cover short stalls
• Less likely to slow the execution of a single thread (smaller latency impact)
• Needs hardware support
– A PC and register file for each thread
– Little other hardware
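A toy software model of this switch-on-stall policy (illustration only, with made-up instruction latencies; not a description of real hardware):

    #include <cstdio>
    #include <vector>

    // Toy model: each thread is a list of instruction latencies; a latency
    // of 10+ cycles stands in for a costly stall such as an L2 cache miss.
    struct HwThread {
        std::vector<int> latencies;
        std::size_t pc;   // per-thread program counter (replicated on chip)
        bool done() const { return pc >= latencies.size(); }
    };

    int main() {
        std::vector<HwThread> threads = {
            {{1, 1, 12, 1}, 0},  // thread 0 stalls on its third instruction
            {{1, 1, 1, 1}, 0},   // thread 1 never stalls
        };
        std::size_t cur = 0, left = 8;  // 8 instructions in total
        while (left > 0) {
            if (threads[cur].done()) { cur = (cur + 1) % threads.size(); continue; }
            int cost = threads[cur].latencies[threads[cur].pc];
            std::printf("thread %zu issues instruction %zu (%d cycles)\n",
                        cur, threads[cur].pc, cost);
            ++threads[cur].pc;
            --left;
            if (cost >= 10)                        // coarse-grain policy:
                cur = (cur + 1) % threads.size();  // switch only on a costly stall
        }
        return 0;
    }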
Fine-grain Multithreading
• Threads are switched every single cycle among the ‘ready’ threads
• Two or more threads interleave instructions
– In a round-robin fashion
– Stalled threads are skipped
• Needs hardware support
– A separate PC and register file for each thread
– Hardware to control the alternating pattern
• Naturally hides delays
– Data hazards, cache misses
– The pipeline runs with rare stalls
• Does not make full use of a multi-issue architecture
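A toy software model of the cycle-by-cycle round-robin policy, skipping stalled threads (illustration only, with made-up stall durations):

    #include <cstdio>
    #include <vector>

    // Toy model: fine-grain multithreading issues from a different thread
    // every cycle, round-robin, skipping threads that are stalled
    // (a thread is ready again once the cycle reaches stalled_until).
    struct HwThread {
        int next_instr;
        int stalled_until;
    };

    int main() {
        std::vector<HwThread> threads = {{0, 0}, {0, 0}};
        std::size_t rr = 0;  // round-robin pointer
        for (int cycle = 0; cycle < 8; ++cycle) {
            for (std::size_t tried = 0; tried < threads.size(); ++tried) {
                std::size_t id = (rr + tried) % threads.size();
                if (cycle < threads[id].stalled_until) continue;  // skip stalled
                std::printf("cycle %d: thread %zu issues instruction %d\n",
                            cycle, id, threads[id].next_instr++);
                if (threads[id].next_instr == 3)      // pretend a cache miss here
                    threads[id].stalled_until = cycle + 4;
                rr = (id + 1) % threads.size();       // switch every cycle
                break;
            }
        }
        return 0;
    }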
Simultaneous Multi-Threading
• The main idea is to exploit instruction-level parallelism and thread-level parallelism at the same time
• In a superscalar processor, issue instructions from different threads in the same cycle
– Schedule as many ‘ready’ instructions as possible
– Operand reading and result saving become much more complex
Simultaneous Multi-Threading
• Let’s look simply at instruction issue:
[Diagram: instructions a–e of Thread A and M–R of Thread B flowing through the IF ID EX MEM WB pipeline stages over cycles 1–10. Run alone, each thread loses cycles to instruction-cache misses (ICM); under SMT the core issues as many ready instructions as possible from both threads each cycle, filling one thread's stall slots with instructions from the other.]
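A toy software model of SMT issue (illustration only, assuming a two-slot issue width and made-up stalls): each cycle the core fills its issue slots with ready instructions from any thread.

    #include <cstdio>
    #include <vector>

    // Toy model: an SMT core with two issue slots per cycle fills them with
    // ready instructions from any thread; a stalled thread simply
    // contributes nothing that cycle.
    struct HwThread {
        int next_instr;
        int stalled_until;
    };

    int main() {
        const int kIssueWidth = 2;
        std::vector<HwThread> threads = {{0, 0}, {0, 0}};
        for (int cycle = 0; cycle < 6; ++cycle) {
            int slots = kIssueWidth;
            for (std::size_t id = 0; id < threads.size() && slots > 0; ++id) {
                while (slots > 0 && cycle >= threads[id].stalled_until) {
                    std::printf("cycle %d: slot filled by thread %zu, instr %d\n",
                                cycle, id, threads[id].next_instr++);
                    --slots;
                    if (threads[id].next_instr == 3)   // pretend an I-cache miss
                        threads[id].stalled_until = cycle + 3;
                }
            }
        }
        return 0;
    }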
SMT ISSUES WITH IN-ORDER PROCESSORS
• Asymmetric pipeline stall
– One part of the pipeline stalls; we want the other pipeline to continue
• Overtaking – non-stalled threads should make progress
• What happens if a ready thread misses in the cache?
SMT issues with in-order processors
• Cache misses – abort the instruction (and the instructions in its shadow, if it is a D-cache miss) upon a cache miss
• Most existing implementations are for out-of-order, register-renamed architectures (akin to Tomasulo’s algorithm)
– E.g. PowerPC, Intel Hyper-Threading
SIMULTANEOUS MULTI-THREADING
• Extracts the most parallelism from instructions and threads
• Implemented mostly in out-of-order processors, because they are the only ones able to exploit that much parallelism
• Has a significant hardware overhead
• Replicate (and MUX) thread state (registers, TLBs, etc.)
• Operand reading and result saving increase datapath complexity
• A per-thread instruction handling/scheduling engine in out-of-order implementations
BENEFITS OF HW MT
• Multithreading techniques improve the utilisation of processor resources and, hence, the overall performance
• If the different threads are accessing the same input data they may be using the same regions of memory
• Cache efficiency improves in these cases
DISADVANTAGES OF HW MT
• Single-thread performance may be degraded when compared to a single-thread CPU
• Multiple threads interfere with each other
• Shared caches mean that, effectively, each thread uses only a fraction of the whole cache
• Thrashing may exacerbate this issue
• Thread scheduling at the hardware level adds high complexity to the processor design
• Thread state, managing priorities, OS-level information, …
Some Advanced Uses of Multithreading
SPECULATIVE EXECUTION
• When reaching a conditional branch we could spawn 2 threads
– One runs the true path
– The other runs the false path
• Once we know which one is correct, kill the other thread
• The effects of control hazards are alleviated
• Supported by current OoO CPUs
– But not as a fully-fledged thread
– Can reach several levels of nested conditions
– Requires memory support (e.g. reorder buffers)
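A rough software analogy of the idea (hardware speculation squashes wrong-path instructions inside the core; std::async tasks are only a stand-in):

    #include <future>
    #include <iostream>

    // Software analogy only: start both sides of a branch, then keep the
    // result of the path that turns out to be correct.
    int true_path()  { return 1; }   // placeholder work for the taken branch
    int false_path() { return 2; }   // placeholder work for the not-taken branch

    int main(int argc, char**) {
        auto t = std::async(std::launch::async, true_path);
        auto f = std::async(std::launch::async, false_path);
        bool condition = (argc > 1);                 // the branch resolves later
        int result = condition ? t.get() : f.get();  // keep the correct result
        // The other result is simply discarded (its destructor waits for it);
        // real hardware would squash the wrong-path instructions instead.
        std::cout << result << '\n';
        return 0;
    }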
MEMORY PREFETCHING
• Compile applications into two threads
– One runs the whole application (the original thread)
– The other thread (the scout thread) only contains the memory accesses
• The scout thread runs ahead and fetches memory in advance
– Ensures data will already be in the cache when the original thread needs it
– The cache hit rate increases
• Synchronization is needed
– The scout has to run far enough ahead that the memory delay is hidden
– But not too far ahead, so that it does not replace useful data in the cache
– Beware thrashing!!!
[Diagram: memory accesses of the single-threaded version are mostly cache misses (xCM); with the scout thread running ahead, the original thread sees mostly cache hits (xCH).]
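A rough software sketch of a scout thread, assuming GCC or Clang for the __builtin_prefetch intrinsic (illustration only; it omits the throttling a real scout would need):

    #include <thread>
    #include <vector>

    // A scout thread runs ahead of the main loop and only touches memory,
    // pulling data into the shared cache before the original thread needs it.
    long sum_with_scout(const std::vector<long>& data) {
        std::thread scout([&data] {
            // Memory accesses only; a real scout would also throttle itself
            // so it stays ahead without evicting useful data (thrashing).
            for (std::size_t i = 0; i < data.size(); i += 8)
                __builtin_prefetch(&data[i]);
        });
        long total = 0;
        for (std::size_t i = 0; i < data.size(); ++i)  // the original computation
            total += data[i];
        scout.join();
        return total;
    }

    int main() {
        std::vector<long> data(1 << 20, 1L);
        return sum_with_scout(data) == (1 << 20) ? 0 : 1;
    }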
SLIPSTREAMING
• Compile sequential applications into two threads
– One runs the application itself (the original thread)
– The slipstream thread only contains the critical path of the application
• The slipstream thread runs ahead and passes its results back
– The delay of slow operations (e.g. floating-point division) is improved
• Synchronization and communication among the threads is needed
– Requires extra hardware to deal with this ‘special’ behaviour
• Could be used in multicore processors as well
[Diagram: the single-threaded version compared with the original thread plus the slipstream thread, which executes only the critical (not the non-critical) instructions.]
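A rough software analogy of slipstreaming (illustration only): a helper task computes a critical-path value ahead of time and passes the result back while the main thread does non-critical work.

    #include <future>
    #include <iostream>

    // Software analogy only: the "slipstream" task computes just a
    // critical-path value (an expensive floating-point division) ahead of
    // time; the original thread overlaps it with non-critical work.
    int main() {
        double a = 1.0e9, b = 7.0;
        std::future<double> critical =
            std::async(std::launch::async, [a, b] { return a / b; });
        double non_critical = 0.0;
        for (int i = 0; i < 1000; ++i) non_critical += i;     // other work
        std::cout << critical.get() + non_critical << '\n';   // result passed back
        return 0;
    }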
MULTITHREADING SUMMARY
• A cost-effective way of finding additional parallelism for the CPU pipeline
• Available in x86, Itanium, Power and SPARC
• Intel Hyper-Threading (SMT)
• PowerPC uses SMT
• UltraSPARC T1/T2 used fine-grain, later models used SMT
• SPARC64 VI used coarse-grain, later models moved to SMT
• Presents each additional hardware thread as an additional virtual CPU to the operating system
• A multiprocessor OS is required
THANK YOU
Any Questions?
