Computer Architecture Unit 3



Unit 3
Parallelism
Parallelism in computer architecture is a method of breaking down tasks
into smaller parts that can be processed simultaneously by multiple
processors. This increases the speed and efficiency of computation.
Types of parallelism:
i) Instruction-level parallelism (ILP)
ii) Task parallelism
iii) Multiple instruction, multiple data (MIMD)

INSTRUCTION LEVEL PARALLELISM


Basic concepts
➢ Instruction Level Parallelism (ILP) is a measure of how many of the
operations in a computer program can be performed simultaneously.
➢ It is used to refer to an architecture in which multiple operations can
be performed in parallel within a particular process, which has its own set of
resources – address space, registers, identifiers, state, and program
counters.
➢ It refers to the compiler design techniques and processors designed to
execute operations, like memory load and store, integer addition, and
float multiplication, in parallel to improve the performance of the
processors.
➢ It is a family of processor and compiler design techniques that speed up
execution by causing individual machine operations, such as memory
loads and stores, integer additions and floating-point multiplications,
to execute in parallel.
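
As a hedged illustration (not part of the original notes; the variable names are invented), the short Python sketch below contrasts operations with no mutual dependencies, which a processor exploiting ILP could overlap, with a dependent chain that must complete serially no matter how much hardware is available.

    # Hypothetical values, used only for this illustration.
    x, y, m, n = 2.5, 4.0, 3, 7

    # Independent operations: neither result feeds the other,
    # so a processor with ILP could execute both in the same cycle.
    a = x * y      # floating-point multiplication
    b = m + n      # integer addition

    # Dependent chain: each operation needs the previous result,
    # so these must execute one after another regardless of ILP.
    c = x * y
    d = c + n      # needs c
    e = d * 2.0    # needs d

    print(a, b, e)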
Classification of ILP Architectures:
• Sequential Architecture: Here, the program is not expected to
explicitly convey any information regarding parallelism to the hardware,
as in superscalar architecture.

• Dependence Architectures: The program explicitly conveys
information regarding dependencies between operations, as in dataflow
architecture.
• Independence Architecture: Here, the program gives information
regarding which operations are independent of one another so that they
can be executed in parallel.
Advantages of Instruction-Level Parallelism:
• Improved Performance: ILP can significantly improve the
performance of processors by allowing multiple instructions to be
executed simultaneously or out-of-order. This can lead to faster
program execution and better system throughput.
• Efficient Resource Utilization: ILP can help to efficiently utilize
processor resources by allowing multiple instructions to be executed at
the same time. This can help to reduce resource wastage and increase
efficiency.
• Reduced Instruction Dependency: ILP can help to reduce the number
of instruction dependencies, which can limit the amount of instruction-
level parallelism that can be exploited. This can help to improve
performance and reduce bottlenecks.
Disadvantages of Instruction-Level Parallelism
• Increased Complexity: Implementing ILP can be complex and
requires additional hardware resources, which can increase the
complexity and cost of processors.
• Data Dependency: Data dependency can limit the amount of
instruction-level parallelism that can be exploited. This can lead to
lower performance and reduced throughput.
• Reduced Energy Efficiency: ILP can reduce the energy efficiency of
processors by requiring additional hardware resources and increasing
instruction overhead. This can increase power consumption and result
in higher energy costs.

Techniques for increasing ILP


Increasing ILP can improve processor performance. Here are some
techniques for increasing ILP:
1. Pipelining:
• Break down the instruction execution process into a series of stages.
• Each stage can process a different instruction simultaneously (a timing sketch follows this list).
2. Superscalar Execution:
• Execute multiple instructions simultaneously using multiple execution
units.
• Requires complex hardware to manage instruction dependencies.
3. Out-of-Order Execution (OoOE):
• Execute instructions out of their original order to minimize
dependencies.
• Requires complex hardware to manage instruction dependencies.
4. Speculative Execution:
• Execute instructions before it is known whether they are actually
needed.
• Can improve performance by reducing dependencies.
5. Register Renaming:
• Rename registers to avoid dependencies between instructions.
• Allows for more instructions to be executed simultaneously.
6. VLIW:
• VLIW stands for Very Long Instruction Word (discussed in detail later in this unit).
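
As a rough, hedged sketch of the pipelining idea from item 1 above (not taken from the notes; the numbers are assumed), the Python snippet below compares ideal execution times with and without a pipeline, using the standard formulas: n*k cycles without pipelining versus k + (n - 1) cycles with a k-stage pipeline, assuming no stalls or hazards.

    def unpipelined_cycles(n_instructions: int, k_stages: int) -> int:
        # Each instruction occupies the whole processor for k cycles.
        return n_instructions * k_stages

    def pipelined_cycles(n_instructions: int, k_stages: int) -> int:
        # The first instruction takes k cycles to fill the pipeline;
        # after that one instruction completes every cycle (no stalls assumed).
        return k_stages + (n_instructions - 1)

    n, k = 100, 5
    print("without pipeline:", unpipelined_cycles(n, k), "cycles")   # 500
    print("with pipeline:   ", pipelined_cycles(n, k), "cycles")     # 104
    print("speedup: %.2fx" % (unpipelined_cycles(n, k) / pipelined_cycles(n, k)))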

SUPERSCALAR ARCHITECTURE
Definition: The main principle of the superscalar approach is that it executes
instructions independently in different instruction pipelines, which leads to
parallel processing and thereby speeds up instruction processing.

❖ Superscalar refers to a machine that is designed to improve the
performance of the execution of scalar instructions.
❖ Superscalar design contrasts with the intent of vector processors, because in
most applications the bulk of the operations are on scalar quantities.
❖ In a superscalar processor, instructions execute independently in
different pipelines, so throughput increases.
❖ It is a CPU that implements a form of parallelism called ILP within a
single processor.
❖ More commonly used in RISC.
❖ In a superscalar processor, there are multiple functional units, each of
which is implemented as a pipeline.
❖ Each pipeline consists of multiple stages to handle multiple
instructions at a time, which supports parallel execution of instructions.
❖ Superscalar processors are much faster than scalar processors because
the CPU can execute multiple instructions per clock cycle, which
increases throughput.
❖ A scalar processor works on one or two data items at a time, while in a
superscalar processor each instruction still processes one data item; but
because there are multiple execution units within the CPU, multiple
instructions can process separate data items concurrently.
❖ A superscalar processor typically fetches multiple instructions at a time
& tries to find nearby instructions that are independent of one another &
can therefore be executed in parallel.
❖ If there is any dependency between the input & output of instructions, then
those instructions cannot be executed in parallel.
❖ Some unnecessary dependencies are eliminated by using additional
registers.

As an example, consider a processor with two execution units: one for
integer and one for floating-point operations. The instruction fetch unit is
capable of reading two instructions at a time and storing them in an
instruction queue. In each cycle, the dispatch unit retrieves and decodes up
to two instructions from the front of the queue. If there is one integer
instruction, one floating-point instruction, and no hazards, both instructions
are dispatched in the same clock cycle.
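
The toy Python simulation below is a hedged sketch (not taken from the notes; the instruction labels are invented) of the dispatch rule just described: each cycle it issues, in program order, at most one integer and one floating-point instruction from the front of the queue, so a well-mixed stream finishes in roughly half the cycles a single-issue machine would need.

    from collections import deque

    def dispatch_cycles(instructions):
        """Count cycles when at most one 'int' and one 'fp' instruction
        can be dispatched per cycle, in program order."""
        queue = deque(instructions)
        cycles = 0
        while queue:
            cycles += 1
            used = set()
            # Look at up to two instructions at the front of the queue;
            # stop if the next one needs a unit already used this cycle.
            while queue and len(used) < 2 and queue[0] not in used:
                used.add(queue.popleft())
        return cycles

    mixed = ["int", "fp", "int", "fp", "int", "fp"]
    ints_only = ["int"] * 6
    print(dispatch_cycles(mixed))      # 3 cycles: one int + one fp each cycle
    print(dispatch_cycles(ints_only))  # 6 cycles: only one integer unit available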
Advantages:
• The compiler can avoid many hazards through judicious selection and
ordering of instructions.

• In general, high performance is achieved if the compiler is able to
arrange program instructions to take maximum advantage of the
available hardware units.
Limitations:
• A superscalar processor depends on the ability to execute multiple
instructions in parallel.
• A combination of compiler-based optimization & hardware techniques
can be used to maximize instruction level parallelism.

SUPERPIPELINED ARCHITECTURE
Definition: An alternative approach to achieving greater performance
(throughput) is referred to as superpipelining.
❖ In a superpipelined processor, many pipeline stages perform tasks that
require less than half a clock cycle, so an internal clock running at double
the speed can perform two tasks in one external clock cycle, doubling the
number of instructions executed per unit time.
❖ A superpipelined architecture is one that makes use of more, smaller
pipeline stages in an attempt to shorten the clock period. With more stages,
more instructions can be in the pipeline at the same time, increasing
parallelism (throughput); that is, instructions are overlapped.

Such a processor issues two instructions per clock cycle & is capable of
executing two instances of each stage in parallel, so that no stage has to be
idle at any time.

❖ The number of instructions being processed at a given time depends
on the number of pipeline stages, commonly termed the pipeline depth.
❖ Some designers use a maximum of eight pipeline stages.
Benefits: Increases the level of parallelism and the number of instructions
executed per unit time.
Limitations: The speed of execution is limited by the slowest pipeline stage.
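
As a hedged back-of-the-envelope sketch (the timing figures are assumed, not from the notes), the Python lines below show why splitting the same work into more, shorter stages raises ideal throughput, and also why the slowest stage is the limit: the clock period is set by the longest stage delay, so halving the stage length roughly doubles the ideal instruction completion rate.

    # Total work per instruction, in nanoseconds (assumed figure).
    total_work_ns = 10.0

    def ideal_throughput(n_stages: int) -> float:
        """Instructions completed per nanosecond, ignoring stalls and latch
        overhead. The clock period equals the longest (here: equal) stage delay."""
        stage_delay = total_work_ns / n_stages
        return 1.0 / stage_delay

    print(ideal_throughput(5))    # 0.5 instructions/ns with 5 stages
    print(ideal_throughput(10))   # 1.0 instructions/ns with 10 shorter stages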

VLIW
❖ Very long instruction word (VLIW) refers to instruction set
architectures that are designed to exploit instruction-level
parallelism (ILP).
❖ The limitations of the superscalar processor become prominent as
instruction scheduling grows more complex. Issues such as exploiting the
intrinsic parallelism in the instruction stream, hardware complexity, cost,
and branch instructions are addressed by a different instruction set
architecture called the Very Long Instruction Word (VLIW), or VLIW
machines, in which the compiler rather than the hardware schedules
independent operations.

Advantages:
• Reduces hardware complexity.
• Reduces power consumption.
• Simplifies decoding and instruction issues.
• Increases potential clock rate.
• Functional units are assigned work corresponding to their position in the
instruction packet by the compiler.
Disadvantages:
• Complex compilers are required.
• Increased program code size.
• Larger memory bandwidth and register-file bandwidth.
• Unscheduled events, for example a cache miss, could lead to a stall that
stalls the entire processor.
• Unfilled operation slots in a VLIW instruction waste memory space and
instruction bandwidth.
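
A small, hedged Python sketch of the VLIW idea (illustrative only; the slot names and operation strings are invented): the compiler packs independent operations into fixed slots of one long instruction word, one slot per functional unit, and fills any slot it cannot use with a no-op, which is exactly why unfilled slots waste space and bandwidth, as noted in the last disadvantage above.

    # One "very long instruction word" as a fixed set of slots, one per functional unit.
    SLOTS = ("int_alu", "fp_mul", "load_store")

    def pack_word(ops):
        """Place independent operations into their slots; empty slots become no-ops."""
        word = {slot: "nop" for slot in SLOTS}
        for slot, op in ops:
            word[slot] = op
        return word

    # The compiler has found three independent operations for this cycle:
    word1 = pack_word([("int_alu", "add r1, r2, r3"),
                       ("fp_mul", "mul f1, f2, f3"),
                       ("load_store", "load r4, 0(r5)")])

    # Only one independent operation was found, so two slots are wasted as no-ops:
    word2 = pack_word([("int_alu", "add r6, r1, r4")])

    print(word1)
    print(word2)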
