L 5 Multicore

Multithreading allows multiple threads to run simultaneously by sharing processor resources. It exploits explicit parallelism through thread-level parallelism (TLP). There are different types of multithreading including fine-grained, coarse-grained, and simultaneous multithreading. Multicore processors contain two or more independent processor cores on a single chip to improve performance through parallel execution.

Uploaded by

Lekshmi

Multithreading

• Thread: a process-like unit of execution with its own
instructions and data
• It may be a part of a parallel program or
represent an independent program on its own.
• Multithreading: the execution of multiple
threads simultaneously.
• ILP exploits implicit parallelism, while TLP
exploits explicit parallelism
Multithreading

• Multiple threads share the functional units
of a single processor in an overlapping fashion
• The processor must duplicate the per-thread state:
– Separate registers
– PC
– Page table
• Memory is shared through the virtual memory mechanism
• H/W must support thread switching
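The duplicated per-thread state can be pictured as one small record per hardware thread. The following is a minimal sketch, not a model of any real processor: the names `ThreadContext` and `MultithreadedCore` are hypothetical, and the count of 32 registers is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadContext:
    """State the hardware keeps separately for each thread (illustrative)."""
    pc: int = 0                                   # program counter
    registers: list = field(default_factory=lambda: [0] * 32)  # assumed size
    page_table_base: int = 0                      # per-thread page table


class MultithreadedCore:
    """Hypothetical core holding one context per hardware thread."""

    def __init__(self, num_threads):
        self.contexts = [ThreadContext() for _ in range(num_threads)]
        self.active = 0

    def switch_to(self, thread_id):
        # Switching only selects another context; nothing is saved or copied,
        # because each thread's state already lives in its own copy.
        self.active = thread_id

    @property
    def current(self):
        return self.contexts[self.active]


core = MultithreadedCore(num_threads=2)
core.current.pc = 100      # thread 0 advances
core.switch_to(1)          # thread 1 still has its own, untouched PC
print(core.current.pc)
```

Because each thread owns its registers, PC, and page table, a switch in this sketch is just an index change, which is why duplicating the state makes fast switching possible.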
Multithreading Classification
• Fine-grained multithreading
• Coarse-grained multithreading
• Simultaneous Multithreading
Fine grain Multithreading
• Switches between threads on each instruction
• Execution of multiple threads is interleaved.
• Interleaving is done in a round-robin fashion
• CPU must be able to switch threads on every
clock cycle
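The round-robin interleaving can be sketched as a tiny simulation. This is an illustrative model under stated assumptions: one instruction issues per cycle, stalled threads are skipped, and the function name `fine_grained_schedule` and its input format are hypothetical.

```python
def fine_grained_schedule(stall_cycles, num_cycles):
    """Return the thread id issued in each cycle (None if every thread stalls).

    stall_cycles: one set per thread holding the cycles in which it is stalled.
    """
    num_threads = len(stall_cycles)
    schedule = []
    next_thread = 0  # round-robin pointer
    for cycle in range(num_cycles):
        issued = None
        # Try each thread once, starting at the round-robin pointer.
        for offset in range(num_threads):
            candidate = (next_thread + offset) % num_threads
            if cycle not in stall_cycles[candidate]:
                issued = candidate
                next_thread = (candidate + 1) % num_threads
                break
        schedule.append(issued)
    return schedule


# Thread 1 is stalled in cycles 1 and 2; threads 0 and 2 never stall.
stalls = [set(), {1, 2}, set()]
print(fine_grained_schedule(stalls, 6))  # [0, 2, 0, 1, 2, 0]
```

In this run the stall of thread 1 costs no issue cycles, because another thread fills every slot, but each individual thread issues only about every second or third cycle.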
Fine grain Multithreading
• Advantage:
– it can hide the throughput losses that arise from
both short and long stalls.
• Disadvantage:
– it slows down the execution of the individual
threads.
Coarse-grained multithreading
• Switches threads only on costly stalls
– Ex: level two cache misses
• Alternative to fine grained multithreading
• CPU with coarse-grained multithreading issues
instructions from a single thread
• Advantage:
– Is less likely to slow down the execution of an
individual thread, since instructions from other threads
are issued only when a thread encounters a costly stall
Coarse-grained multithreading
• Drawback:
– limited in its ability to overcome throughput losses
– especially from shorter stalls
• when a stall occurs, the pipeline must be
emptied or frozen
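The cost of emptying the pipeline on a switch can be sketched with a small simulation. This is a toy model, not a description of a specific CPU: the name `coarse_grained_schedule` is hypothetical, and the `refill_penalty` parameter (idle cycles after a switch) is an assumed value.

```python
def coarse_grained_schedule(stalled, num_cycles, refill_penalty=1):
    """Return per-cycle issue: a thread id, or None for an idle cycle.

    stalled: one set per thread holding the cycles in which it suffers a
             costly stall (for example an L2 cache miss).
    """
    num_threads = len(stalled)
    schedule = []
    current = 0
    bubble = 0  # idle cycles left while the pipeline refills
    for cycle in range(num_cycles):
        if bubble > 0:
            schedule.append(None)
            bubble -= 1
            continue
        if cycle in stalled[current]:
            # Costly stall: switch to the next thread that can run, if any.
            for offset in range(1, num_threads):
                candidate = (current + offset) % num_threads
                if cycle not in stalled[candidate]:
                    current = candidate
                    bubble = refill_penalty
                    break
            schedule.append(None)  # the cycle that detected the stall is lost
            continue
        schedule.append(current)
    return schedule


# Thread 0 misses in cycles 3-6; thread 1 never stalls.
print(coarse_grained_schedule([{3, 4, 5, 6}, set()], 8))
# [0, 0, 0, None, None, 1, 1, 1]
```

The two idle cycles around the switch are the start-up cost; for a stall shorter than that cost, switching would not pay off, which is the drawback listed above.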
Simultaneous Multithreading
• It exploits TLP at the same time as it exploits ILP
• SMT builds on the fact that multiple-issue processors often have
more functional unit parallelism available than a single thread can use
• SMT uses the concepts like
– Multiple-issue
– Register Renaming
– Data forwarding
– Static scheduling
– Dynamic scheduling
Example
• superscalar with no multithreading support
• superscalar with coarse-grained multithreading
• superscalar with fine-grained multithreading
• superscalar with simultaneous multithreading.
Example
• Horizontal dimension represents the instruction issue
capability in each clock cycle.
• The vertical dimension represents a sequence of clock
cycles.
• Empty box indicates that the corresponding issue slot is
unused in that clock cycle
Example
• Superscalar without MT:
– Exploits ILP
– No Multithreading facility
– Large number of idle processor cycles.
• Coarse Grain MT:
– Long stalls are partially hidden by switching
to another thread
– Since thread switching occurs only when there is a stall,
there are likely to be some fully idle cycles
Example
• Fine-grained MT:
– The interleaving of threads eliminates fully empty slots
– Only one thread issues instructions in a given clock cycle
– ILP limitations still lead to a significant number of idle slots
within individual clock cycles.
• SMT:
– ILP and TLP are both exploited
– Multiple threads use the issue slots in a single clock cycle
– Ideally few issue slots stay idle; usage is limited by imbalances
in resource needs and availability across threads
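The comparison of issue-slot usage can be reproduced with a toy count. This is a sketch under strong assumptions: the number of issuable instructions per thread per cycle is given as input and does not carry over between cycles, coarse-grained MT is omitted, and the name `used_slots` and the sample data are hypothetical.

```python
def used_slots(ready, width, policy):
    """Count issue slots filled over all cycles.

    ready:  ready[t][c] is how many instructions thread t could issue in cycle c.
    width:  issue slots available per clock cycle.
    policy: "superscalar", "fine" (fine-grained MT), or "smt".
    """
    num_threads = len(ready)
    num_cycles = len(ready[0])
    used = 0
    for cycle in range(num_cycles):
        if policy == "superscalar":
            # Only thread 0 ever runs.
            used += min(width, ready[0][cycle])
        elif policy == "fine":
            # One thread per cycle, chosen round-robin.
            thread = cycle % num_threads
            used += min(width, ready[thread][cycle])
        elif policy == "smt":
            # Any thread may fill any slot in the same cycle.
            total = sum(thread_ready[cycle] for thread_ready in ready)
            used += min(width, total)
        else:
            raise ValueError(f"unknown policy: {policy}")
    return used


ready = [[2, 0, 1, 0],   # thread 0
         [1, 2, 0, 3]]   # thread 1
for policy in ("superscalar", "fine", "smt"):
    print(policy, used_slots(ready, width=4, policy=policy), "of 16 slots")
```

With this made-up input the counts are 3, 8, and 9 of 16 slots: fine-grained MT removes the fully empty cycles, and SMT additionally fills slots within a cycle from both threads.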
Multithreading
• Advantages:
– If a thread gets a lot of cache misses, the other
threads can continue using the computing
resources.
– If several threads work on the same set of data,
better cache usage and synchronization can be achieved
– If a thread cannot use all the computing
resources, running other threads lets those
resources be used.
Multithreading
• Disadvantages:
– Multiple threads can interfere with each other
when sharing h/w resources such as caches and TLBs.
– H/W support for Multithreading is more visible to
S/W.
• Applications:
– Used in server side applications
CMP Processors
• Integrates two or more independent cores into a
single package composed of a single IC.
• A single component with two or more independent
processors
• Every functional unit of a processor is duplicated
• Ex:
– A dual-core processor contains two cores
– A quad-core processor contains four cores
CMP processors
• CMP stands for Chip Multiprocessing
• CMP instantiates multiple processor cores on a single
chip
• CMP will have:
– A separate L1 instruction cache and data cache per
on-chip CPU.
– An optional unified shared L2 cache.
CMP Processors

• Chip Multithreading (CMT) = Chip Multiprocessing + Hardware
Multithreading
• Chip multiprocessing is the ability to handle multiple s/w
threads on multiple cores of one chip.
• CMT is the ability of the processor to process multiple software
threads & support simultaneous hardware threads of execution
CMP Processors
• CMPs are now the only way to build high-performance
microprocessors, for several reasons:
– Large uniprocessors are no longer scaling in
performance
– CMPs support efficient sharing of
hardware resources such as pipelines, caches and
predictors.
– CMPs are well suited to server
workloads
Multicore
• Why Multicore ?
– Difficult to push single-core clock frequency even higher
– Many new applications are multithreaded
– General trend in computer architecture is a shift towards more parallelism
Multicore
• Multi-core is a design in which a single physical processor
contains the core logic of more than one processor.
• The design takes several such processor “cores” and packages
them as a single physical processor
• The goal of this design is to run tasks simultaneously and
achieve greater performance
Multithreading, Hyper-Threading, or Multi-Core

• Multithreading:
– Programs are made up of execution threads
– Threads are sequences of related instructions
– In the early days, most programs consisted of a single
thread
– Operating systems in those days could run only one
such program at a time
– Innovations in operating systems later allowed
multiple programs to run simultaneously
Multithreading, Hyper-Threading, or Multi-Core

• Hyper-Threading:
– Two programs can run simultaneously on one processor
without being swapped in and out
– The operating system recognizes one processor as two possible
execution pipelines.
– The performance boost of HT Technology is limited by the
availability of shared resources to the two executing threads
– HT Technology cannot achieve the throughput of two distinct
processors because of contention for shared resources
Multithreading, Hyper-Threading, or Multi-Core

• Multicore:
– Contains two or more distinct cores in the same physical package
– Each core has its own execution pipeline
– Each core has the resources required to run without blocking
resources needed by the other software threads.
– The multi-core design enables two or more cores to run at somewhat
slower speeds and at much lower temperatures
– The combined throughput of these cores delivers processing power
greater than the maximum available today on single-core
processors and at a much lower level of power consumption
Advantages
• Occupies less space on PCB
• Higher throughput
• Consume less power
• Cache coherency can be improved, since coherence signals stay on chip
• Performs more operations/sec at a lower clock frequency
– Ex: the 16-core MIT RAW processor operating
at 425 MHz performs 100 times as many
operations as a Pentium III at 600 MHz.
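The MIT RAW example can be restated as work per clock cycle. The arithmetic below is only a sketch: it takes the 100x figure from the slide at face value, assumes operations per second equal operations per cycle times clock frequency, and uses a hypothetical helper name `per_cycle_ratio`.

```python
def per_cycle_ratio(ops_ratio, freq_a_mhz, freq_b_mhz):
    """Ratio of operations per clock cycle of chip A to chip B.

    ops/sec = ops/cycle * frequency, so
    (ops/cycle of A) / (ops/cycle of B) = ops_ratio * freq_b / freq_a.
    """
    return ops_ratio * freq_b_mhz / freq_a_mhz


# Figures quoted on the slide: 100x the operations, 425 MHz vs 600 MHz.
chip_ratio = per_cycle_ratio(ops_ratio=100, freq_a_mhz=425, freq_b_mhz=600)
core_ratio = chip_ratio / 16  # spread over the 16 cores
print(f"per chip: {chip_ratio:.1f}x per cycle, per core: {core_ratio:.1f}x")
```

Under these assumptions the 16-core chip would complete roughly 141 times as many operations per cycle as the single core, which shows that the gain comes from parallel work per cycle rather than from clock speed.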
Multicore applications
• Database servers
• Web servers
• Compilers
• Multimedia Applications
• Scientific Applications
• General applications with TLP as opposed to ILP
– Downloading s/w while running antivirus s/w
– Editing a photo while recording a TV show.
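Two independent tasks of this kind can be expressed as separate software threads. This is a minimal sketch: the functions `download_software` and `scan_for_viruses` are hypothetical placeholders, and `time.sleep` stands in for the real work. Note that in standard CPython builds the global interpreter lock keeps threads from executing Python bytecode on several cores at once, so this shows the structure of thread-level parallelism rather than a multicore speedup.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def download_software():
    time.sleep(0.1)  # placeholder for network transfer
    return "download finished"


def scan_for_viruses():
    time.sleep(0.1)  # placeholder for scanning files
    return "scan finished"


def run_both():
    # Each task gets its own thread; the OS may place them on different
    # cores or hardware threads.
    with ThreadPoolExecutor(max_workers=2) as pool:
        download = pool.submit(download_software)
        scan = pool.submit(scan_for_viruses)
        return download.result(), scan.result()


print(run_both())
```

Neither task depends on the other's result, which is what makes this explicit, thread-level parallelism rather than instruction-level parallelism found inside one program.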
