Code Scheduling in Compiler Design
Code scheduling is a compiler optimization technique used to improve instruction-level parallelism (ILP) and
overall program execution speed by reordering instructions without changing the program's output. It aims to
minimize stalls caused by data dependencies, control dependencies, and resource conflicts.
Types of Code Scheduling
1. Local Scheduling
Optimizes instruction ordering within a single basic block (a straight-line sequence of code with no branches except at the end).
Limited scope but effective in reducing pipeline stalls.
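As a rough source-level illustration (real local schedulers reorder machine instructions; the function and variable names here are invented for the sketch), moving an independent statement between a load and its first use can hide the load latency inside a single basic block:

/* Before scheduling: the value loaded into t is used immediately,
 * so a pipelined CPU would stall waiting for the load to complete. */
int sum_before(const int *p, int a, int b) {
    int t = *p;       /* load from memory                  */
    int u = t + 1;    /* uses t right away -> likely stall */
    int v = a * b;    /* independent of the load           */
    return u + v;
}

/* After scheduling: the independent multiply fills the load's delay;
 * the result is identical because v never reads t. */
int sum_after(const int *p, int a, int b) {
    int t = *p;       /* load from memory           */
    int v = a * b;    /* useful work while t loads  */
    int u = t + 1;    /* t is ready by now          */
    return u + v;
}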
2. Global Scheduling
Optimizes instruction scheduling across multiple basic blocks, considering control flow structures such
as loops and branches.
More complex but can achieve better performance improvements than local scheduling.
Global Scheduling Techniques
1. Trace Scheduling
Selects a "trace" (a frequently executed path through the program) and schedules instructions optimally along that path.
Uses speculative execution by moving instructions above branches.
Example:
if (A > B) {
    C = D + E;
}
F = G + H;
Before Scheduling:
CMP A, B
JLE ELSE
ADD C, D, E // Only executes if A > B
ELSE:
ADD F, G, H
After Scheduling:
ADD F, G, H // Moved before the branch
CMP A, B
JLE ELSE
ADD C, D, E
2. Software Pipelining
Mainly used for loops, it schedules instructions across multiple iterations to keep functional units busy.
Example:
for (i = 0; i < N; i++) {
    A[i] = B[i] + C[i];
}
Without Software Pipelining (each iteration runs its operations strictly in sequence):
Load B[i]
Load C[i]
ADD A[i], B[i], C[i]
Store A[i]
With Software Pipelining (the next iteration's loads overlap with the current computation):
Load B[0]                 // Prologue: start the first iteration's loads early
Load C[0]
LOOP: Load B[i+1]         // Load the next iteration's operands...
Load C[i+1]               // ...while the current iteration is being computed
ADD A[i], B[i], C[i]
Store A[i]                // Then branch back to LOOP with i = i + 1
3. Percolation Scheduling
Moves instructions across basic blocks, even past branches, while ensuring correctness.
Used in Very Long Instruction Word (VLIW) architectures.
Example:
If an instruction does not depend on the branch outcome (its operands are already available), it can be percolated upward and executed before the branch condition is evaluated; a source-level sketch is shown below.
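As a rough source-level sketch (percolation actually operates on the compiler's intermediate representation, and the variable names here are illustrative), an operation whose operands are ready and whose result is not needed on the other path can be moved above the branch so it can issue alongside the comparison, for example in a VLIW slot:

/* Before percolation: y = a + b sits below the branch even though
 * it does not depend on the comparison. */
int percolate_before(int x, int a, int b) {
    int z;
    if (x > 0) {
        int y = a + b;   /* independent of the branch condition */
        z = y * 2;
    } else {
        z = 0;
    }
    return z;
}

/* After percolation: y = a + b is hoisted above the branch.  This is
 * safe because y is not used on the else path; computing it there is
 * merely wasted (speculative) work, not a change in the result. */
int percolate_after(int x, int a, int b) {
    int y = a + b;       /* percolated above the branch */
    int z;
    if (x > 0) {
        z = y * 2;
    } else {
        z = 0;
    }
    return z;
}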
Conclusion
Global scheduling techniques like trace scheduling, software pipelining, and percolation scheduling improve
execution efficiency by reordering instructions across multiple basic blocks. These techniques are crucial for
optimizing modern high-performance computing applications.
Parallelism in Computing
Parallelism refers to the ability to execute multiple tasks or instructions simultaneously, improving
computational speed and efficiency. It is widely used in modern computing architectures to maximize
performance.
Types of Parallelism
1. Instruction-Level Parallelism (ILP)
Multiple machine instructions from the same program execute simultaneously, for example in pipelined or superscalar processors; the code-scheduling techniques above exist largely to expose this kind of parallelism.
2. Data Parallelism
Distributes data across multiple processors or cores to execute the same operation in parallel.
Example: Vectorized operations in GPUs for matrix multiplication (a thread-based sketch follows this list).
3. Task Parallelism
Different tasks (or functions) are executed simultaneously on different cores or processors. Tasks may operate on different data or perform different functions.
Example: A web server handling multiple user requests at the same time.
Example: A web browser rendering a webpage while downloading a file in the background.
4. Thread-Level Parallelism
Multiple threads within a program execute independently but may share data. Common in multi-threaded applications running on multi-core CPUs.
Example: A video editing application rendering multiple effects on different cores.
5. Pipeline Parallelism
Different stages of a computation run concurrently on different data items, like stations on an assembly line.
6. Distributed Parallelism
Work is divided among multiple machines that cooperate over a network, as in clusters and other large-scale distributed systems.
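As a minimal sketch of data parallelism on a multi-core CPU (using POSIX threads; the array size, thread count, and function names are chosen only for this example), two threads apply the same A[i] = B[i] + C[i] operation to different halves of the data:

/* Build with: cc -pthread data_parallel.c */
#include <pthread.h>
#include <stdio.h>

#define N 8

static double B[N] = {1, 2, 3, 4, 5, 6, 7, 8};
static double C[N] = {8, 7, 6, 5, 4, 3, 2, 1};
static double A[N];

/* Each thread applies the same operation to its own slice of the
 * data: that is data parallelism. */
struct slice { int start, end; };

static void *add_slice(void *arg) {
    struct slice *s = (struct slice *)arg;
    for (int i = s->start; i < s->end; i++)
        A[i] = B[i] + C[i];
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    struct slice lo = {0, N / 2}, hi = {N / 2, N};

    pthread_create(&t1, NULL, add_slice, &lo);   /* first half  */
    pthread_create(&t2, NULL, add_slice, &hi);   /* second half */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    for (int i = 0; i < N; i++)
        printf("%.0f ", A[i]);
    printf("\n");
    return 0;
}

Task parallelism looks the same mechanically, except that each thread would run a different function rather than the same one on a different slice of the data.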
Conclusion
Parallelism is essential for modern computing, ranging from bit-level optimizations to large-scale distributed
computing. Different types and levels of parallelism are combined to maximize performance across different
hardware architectures.
Data Dependencies
A data dependency constrains the order in which instructions can safely execute. The main kinds are described below.
1. True Dependency (Read After Write)
Occurs when an instruction needs the result of a previous instruction before it can execute.
Example:
A = B + C # Instruction 1
D = A * E # Instruction 2 (depends on A)
2. Anti-Dependency (Write After Read)
Occurs when an instruction writes to a register or memory location that a previous instruction reads from.
Example:
A = B + C # Instruction 1 (reads B)
B = D * E # Instruction 2 (writes to B)
Here, Instruction 2 writes to B, but Instruction 1 needs to read B first. If executed out of order, it could
lead to incorrect results.
3. Output Dependency (Write After Write)
Occurs when two instructions write to the same register or memory location.
Example:
A = B + C # Instruction 1 (writes to A)
A = D * E # Instruction 2 (writes to A)
Here, if Instruction 2 executes before Instruction 1 finishes, it overwrites the value of A, leading to
incorrect results.
4. Control Dependency
Occurs when whether an instruction executes at all depends on the outcome of a branch.
Example:
if (X > Y) {
    A = B + C; # Dependent on branch outcome
}
D = E * F;
Here, A = B + C executes only if X > Y, meaning its execution is dependent on the branch instruction.
Techniques to Handle Data Dependencies
1. Forwarding (Bypassing)
Passes a result directly between pipeline stages instead of waiting for it to be written back and read again.
Example: Forwarding the result of A = B + C directly to the next instruction instead of waiting for it to be written to and then read back from the register file.
2. Register Renaming
Removes false dependencies (the anti- and output dependencies above) by giving each write its own physical register; a source-level sketch appears after this list.
3. Loop Unrolling
Replicates the loop body several times so that fewer branches execute and more independent instructions are available to schedule; a sketch appears after this list.
4. Speculative Execution
Executes instructions before knowing the final branch decision and rolls back if needed.
Example: Modern CPUs use branch prediction to guess outcomes and execute instructions ahead of
time.
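Register renaming is done by hardware or during register allocation, but its effect can be sketched at the source level: giving each write its own name removes the output dependency from the earlier A = B + C; A = D * E example (the names A1 and A2 are illustrative):

/* Before renaming, both statements wrote A, so they had to complete
 * in order (write after write).  After renaming, each result lives in
 * its own register and the two operations may run in any order. */
int renamed(int B, int C, int D, int E) {
    int A1 = B + C;   /* first write to "A", now in its own register */
    int A2 = D * E;   /* second write to "A", now independent of A1  */
    (void)A1;         /* any instruction that read the old A here
                         would simply be pointed at A1 instead       */
    return A2;        /* the value A holds at the end of the original */
}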
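And a sketch of loop unrolling, reusing the earlier A[i] = B[i] + C[i] loop (the unroll factor of 4 is arbitrary, and a real compiler would also emit cleanup code for trip counts that are not a multiple of 4):

/* Original loop: one add, one compare, and one branch per element. */
void add_arrays(float *A, const float *B, const float *C, int n) {
    for (int i = 0; i < n; i++)
        A[i] = B[i] + C[i];
}

/* Unrolled by 4: the loop overhead is paid once per four elements,
 * and the four independent adds give the scheduler more instructions
 * to overlap.  Assumes n is a multiple of 4 for brevity. */
void add_arrays_unrolled(float *A, const float *B, const float *C, int n) {
    for (int i = 0; i < n; i += 4) {
        A[i]     = B[i]     + C[i];
        A[i + 1] = B[i + 1] + C[i + 1];
        A[i + 2] = B[i + 2] + C[i + 2];
        A[i + 3] = B[i + 3] + C[i + 3];
    }
}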
Conclusion
Data dependencies impact parallel execution and pipeline performance. Techniques like forwarding, register
renaming, and speculative execution help mitigate these dependencies, improving overall efficiency.