
Computer Architecture

Lecture 13
Instruction-level Parallelism
And Superscalar Processors
Instructor: Sultana Jahan Soheli
Assistant Professor, ICE, NSTU
Reference Books
• Computer Organization and Architecture:
Designing for Performance, by William Stallings
(8th Edition)
– Any later edition is fine
Superscalar Architecture
• A processor architecture in which common instructions—integer and
floating-point arithmetic, loads, stores, and conditional
branches—can be initiated simultaneously and executed independently
• Refers to a machine that is designed to improve the performance of
the execution of scalar instructions
– In most applications, the bulk of the operations are on scalar
quantities
Superscalar Architecture
• Deals with the ability to execute instructions independently and
concurrently in different pipelines
Superpipelined Architecture
• Superpipelining exploits the fact that many pipeline stages perform
tasks that require less than half a clock cycle
• Thus, a doubled internal clock speed allows the performance of two
tasks in one external clock cycle
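The timing contrast between these organizations can be illustrated with a small sketch. This is a simplified cycle-counting model, not a real pipeline: it assumes an idealized k-stage pipeline with no hazards or stalls, and the parameters n (instruction count), k (stage count), and m (degree of parallelism) are illustrative.

```python
import math

def base_cycles(n, k):
    """Simple k-stage pipeline: one instruction enters per cycle,
    so the first finishes after k cycles and each later one adds 1."""
    return k + (n - 1)

def superscalar_cycles(n, k, m):
    """Degree-m superscalar: up to m instructions enter per cycle,
    so the drain term counts groups of m instead of single instructions."""
    return k + math.ceil(n / m) - 1

def superpipelined_cycles(n, k, m):
    """Degree-m superpipelining: each stage is subdivided m ways, so a
    new instruction enters every 1/m external cycle; counted here in
    external clock cycles."""
    return k + (n - 1) / m
```

For example, with a 4-stage pipeline and 6 instructions, the model gives 9 external cycles for the base pipeline, 6 for a degree-2 superscalar machine, and 6.5 for a degree-2 superpipelined machine, showing why both approaches outperform the base design and why their totals are close to each other.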
Superscalar vs. Superpipelined
Limitations
• The superscalar approach depends on the ability to execute multiple
instructions in parallel
• The term instruction-level parallelism refers to the degree to which, on
average, the instructions of a program can be executed in parallel
• A combination of compiler-based optimization and hardware
techniques can be used to maximize instruction-level parallelism
• Obstacles for achieving parallelism:
– True data dependency
– Procedural dependency
– Resource conflicts
– Output dependency
– Anti-dependency
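The first, fourth, and fifth obstacles in the list above are register data dependencies, and they can be classified mechanically from each instruction's destination and source registers. The sketch below is a minimal illustration under an assumed encoding of an instruction as a (destination, sources) pair; procedural dependencies and resource conflicts involve control flow and hardware resources, so they are not modeled here.

```python
def dependency(first, second):
    """Classify how `second` depends on `first`, where `first` comes
    earlier in program order. Each instruction is a tuple
    (dest_register, set_of_source_registers)."""
    d1, s1 = first
    d2, s2 = second
    deps = []
    if d1 in s2:
        # second reads a value that first writes
        deps.append("true (read-after-write)")
    if d1 == d2:
        # both write the same register; writes must not be reordered
        deps.append("output (write-after-write)")
    if d2 in s1:
        # second overwrites a register that first still needs to read
        deps.append("anti (write-after-read)")
    return deps or ["none"]
```

For instance, `r4 := r3 + r5` has a true dependency on a preceding `r3 := r1 + r2`, while two instructions writing `r3` have an output dependency even if neither reads the other's result.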
Effect of Dependencies
Design Issues
• Instruction-Level Parallelism and Machine Parallelism
– Instruction-level parallelism exists when instructions in a
sequence are independent and thus can be executed in
parallel by overlapping
– The degree of instruction-level parallelism is determined
by the frequency of true data dependencies and
procedural dependencies in the code
– These factors, in turn, are dependent on the instruction set
architecture and on the application
– Also determined by operation latency: the time until the
result of an instruction is available for use as an operand in
a subsequent instruction
Design Issues
• Machine parallelism is a measure of the ability
of the processor to take advantage of
instruction-level parallelism
– Determined by the number of instructions that
can be fetched and executed at the same time
(the number of parallel pipelines) and
– by the speed and sophistication of the
mechanisms that the processor uses to find
independent instructions
Design Issues
• Instruction Issue Policy: The processor must also
be able to identify instruction-level parallelism
and coordinate the fetching, decoding, and
execution of instructions in parallel
• Instruction issue refers to the process of initiating
instruction execution in the processor’s
functional units
• Instruction issue policy refers to the protocol
used to issue instructions
Design Issues
• Superscalar instruction issue policies fall into
the following categories:
– In-order issue with in-order completion
– In-order issue with out-of-order completion
– Out-of-order issue with out-of-order completion
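The effect of the completion policy can be seen in a toy single-issue model, sketched below. It assumes one instruction issues per cycle and that in-order completion forces results to retire in program order; the latency values and the one-issue-per-cycle restriction are simplifying assumptions for illustration, not a full superscalar model.

```python
def completion_times(latencies, in_order_completion):
    """Toy model: instruction i issues at cycle i (0-based) and, absent
    other constraints, finishes at issue + latency. With in-order
    completion, an instruction may not retire before its predecessor."""
    finishes = []
    for i, lat in enumerate(latencies):
        finish = i + lat
        if in_order_completion and finishes:
            # results must retire in program order, one per cycle at most
            finish = max(finish, finishes[-1] + 1)
        finishes.append(finish)
    return finishes
```

With latencies [3, 1, 1], out-of-order completion lets the two short instructions finish at cycles 2 and 3 while the long one is still in flight, whereas in-order completion delays them to cycles 4 and 5 behind the long instruction.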
Design Issues
• Register Renaming: When out-of-order
techniques are used, the values in registers
cannot be fully known at each point in time
just from a consideration of the sequence of
instructions dictated by the program
• Solution: Duplication of resources
– Also referred to as register renaming
• When a new register value is created, a new
register is allocated for that value
Design Issues
• Subsequent instructions that access that value
as a source operand in that register must go
through a renaming process:
– The register references in those instructions must
be revised to refer to the register containing the
needed value
– The same original register reference in several
different instructions may refer to different actual
registers, if different values are intended
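This renaming process amounts to keeping a table that maps each architectural register name to its latest physical copy, allocating a fresh physical register on every write. The sketch below is an illustrative toy (the `Renamer` class and the `pN` naming are assumptions, not a real processor's scheme), ignoring physical-register reclamation.

```python
class Renamer:
    """Toy register renamer: every architectural write allocates a fresh
    physical register; reads map to the latest physical copy."""

    def __init__(self):
        self.next_phys = 0
        self.map = {}  # architectural name -> current physical name

    def _fresh(self):
        name = f"p{self.next_phys}"
        self.next_phys += 1
        return name

    def read(self, arch):
        # A register read before any write gets its own initial copy
        if arch not in self.map:
            self.map[arch] = self._fresh()
        return self.map[arch]

    def write(self, arch):
        self.map[arch] = self._fresh()
        return self.map[arch]

    def rename(self, dest, sources):
        # Sources are looked up under the OLD mapping, then dest is renamed
        srcs = [self.read(s) for s in sources]
        return (self.write(dest), srcs)
```

Renaming the sequence R3 := R3 op R5; R4 := R3 + 1; R3 := R5 + 1; R7 := R3 op R4 makes the third instruction write a different physical register (p4) than the first (p2), so the output dependency on R3 between the first and third instructions, and the anti-dependency between the second and third, both disappear; only the true dependencies remain.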
Design Issues
• Superscalar Execution
Design Issues
• Superscalar Implementation Issues:
a) Instruction fetch strategies that simultaneously fetch multiple
instructions, often by predicting the outcomes of, and fetching
beyond, conditional branch instructions
– These functions require the use of multiple pipeline fetch and
decode stages, and branch prediction logic
b) Logic for determining true dependencies involving register values,
and mechanisms for communicating these values to where they
are needed during execution
c) Mechanisms for initiating, or issuing, multiple instructions in
parallel
d) Resources for parallel execution of multiple instructions, including
multiple pipelined functional units and memory hierarchies
capable of simultaneously servicing multiple memory references
e) Mechanisms for committing the process state in correct order
Thank you!
