Lecture 5 Bsit
Superblocks:
Superblocks are regions of code with only one entry point and one or more exit points. A widely used alternative to the trace is the superblock (Hwu et al. 1993);
by construction, superblocks have only one entrance at the top and no side
entrances. In contrast, a trace may have both side entrances and side exits. For
example, in Figure 4.21, the trace ADEG contains a side exit from block D and
two side entrances into block G. As we shall see in Section 4.5.3, disallowing
side entrances simplifies later code optimizations.
It might first appear that if superblocks are formed using a method like the
one described for traces, the result is relatively small superblocks; for example,
in Figure 4.22a, ADE, BC, F, and G form a complete set of superblocks. These
blocks are smaller than the traces and in some cases may be too small to
provide many opportunities for optimizations. However, larger superblocks
can be formed by allowing some basic blocks to appear more than once. This is
illustrated in the accompanying figure, where larger superblocks have been formed. Here,
the superblock ADEG contains the most common sequence of basic blocks
(according to the profile information given in Figure 4.12). Now, because
block G in superblock ADEG could otherwise be reached only via a side entrance, block
G is replicated for the superblocks that contain BCG and FG. The process of
replicating code that appears at the end of a superblock in order to form other
superblocks is referred to as tail duplication.
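Tail duplication can be sketched in a few lines. The following is an illustrative Python sketch (the function name and the renaming convention are assumptions, not from the text): blocks already placed in one superblock are replicated, here simply by renaming, when they appear again in a later superblock, so no superblock needs a side entrance.

```python
def tail_duplicate(superblocks):
    """Give each superblock a private copy of any basic block that
    already appears in an earlier superblock (replication is modeled
    by appending a prime to the block name)."""
    seen = set()
    result = []
    for sb in superblocks:
        copy = []
        for block in sb:
            if block in seen:
                # Replicate the shared tail block instead of creating
                # a side entrance into an existing superblock.
                copy.append(block + "'")
            else:
                seen.add(block)
                copy.append(block)
        result.append(copy)
    return result

# The hot paths from the ADEG / BCG / FG example above:
hot_paths = [["A", "D", "E", "G"], ["B", "C", "G"], ["F", "G"]]
print(tail_duplicate(hot_paths))
# [['A', 'D', 'E', 'G'], ['B', 'C', "G'"], ['F', "G'"]]
```

Block G is duplicated for the BCG and FG superblocks, matching the example in the text.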
Starting Points:
A superblock should start at a heavily used basic block. Consequently,
as code is initially being emulated, either through interpretation or
simple basic block translation, profile information is collected in order to determine
those heavily used basic blocks where superblock formation should be
started. For this purpose, there are two methods for determining profile points.
One is simply to profile all basic blocks. Another is to use heuristics based on
program structure to select a narrower set of good candidate start points and
then to profile only at those points.
Another heuristic is to use an exit arc from an existing superblock. These arcs
are good candidates because, by definition, the existing superblocks are known
to be hot, and some exit points will also be hot (although perhaps somewhat
less so than the original basic block).
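The profile-all-blocks method can be sketched as a counter per basic block with a hotness threshold. This is a minimal sketch with assumed names (`HOT_THRESHOLD`, `on_block_executed`); real systems tune the threshold and may profile only selected candidate points, as described above.

```python
from collections import Counter

HOT_THRESHOLD = 50  # assumed value; tuned per system in practice

profile = Counter()

def on_block_executed(block, start_points):
    """Called by the interpreter (or basic block translator) each time
    a basic block is entered; a block becomes a superblock start point
    once its execution count crosses the threshold."""
    profile[block] += 1
    if profile[block] == HOT_THRESHOLD:
        start_points.add(block)  # hot enough: begin superblock formation here

starts = set()
for b in ["A"] * 60 + ["B"] * 10:   # simulated execution trace
    on_block_executed(b, starts)
print(starts)   # {'A'} -- only A crossed the threshold
```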
Continuation:
After a superblock is begun at an initial basic block, the next consideration is which subsequent blocks should be collected and added as the superblock is grown. This can be done using either node or edge profile information. There are two basic heuristics for using this information: one is most-frequently-used and the other is most-recently-used.
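The two continuation heuristics can be contrasted in a short sketch (function names and the `edge_counts` / `last_taken` data shapes are assumptions): most-frequently-used consults accumulated edge profile counts, while most-recently-used simply follows the successor that execution took last time.

```python
def next_block_mfu(block, edge_counts):
    """Most-frequently-used: pick the successor with the highest
    profiled edge count out of `block` (None if no successors)."""
    succs = {dst: n for (src, dst), n in edge_counts.items() if src == block}
    return max(succs, key=succs.get) if succs else None

def next_block_mru(block, last_taken):
    """Most-recently-used: follow the successor that execution took
    the last time `block` ran; no counts are needed."""
    return last_taken.get(block)

edge_counts = {("D", "E"): 90, ("D", "F"): 10}
print(next_block_mfu("D", edge_counts))    # E (the hotter edge)
print(next_block_mru("D", {"D": "F"}))     # F (the last path taken)
```

The two heuristics can disagree, as here: the profile favors E, but the most recent execution went to F.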
Stopping Points:
1. The start point of the same superblock is reached. This indicates the
closing of a loop that was started with this superblock. In some systems,
superblock formation can continue even after a loop is closed, which in
effect leads to dynamic loop unrolling.
2. A start point of some other superblock is reached. When this occurs,
superblock formation stops and the two superblocks can be linked together
(Section 2.7).
3. A superblock has reached some maximum length. This maximum length
may vary from a few tens to hundreds of instructions. A reason for having
a maximum length is that it will keep code expansion in check. Because a
basic block can be used in more than one superblock, there may be multiple
copies of a given basic block. The longer superblocks grow, the more basic
block replication there will be.
4. When using the most-frequently-used heuristic, there are no more
candidate basic blocks that have reached the candidate threshold.
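The four stopping rules above can be combined into a single superblock-growing loop. The sketch below assumes its names (`grow_superblock`, `hottest_successor` standing in for either continuation heuristic, `threshold_ok` for the candidate threshold of rule 4) and, for brevity, measures the length limit in blocks rather than instructions.

```python
MAX_LEN = 8  # rule 3: maximum superblock length (blocks here, for brevity)

def grow_superblock(start, hottest_successor, other_starts, threshold_ok):
    """Grow a superblock from `start`, stopping per the four rules."""
    sb = [start]
    block = start
    while True:
        nxt = hottest_successor(block)
        if nxt is None or not threshold_ok(nxt):
            break                   # rule 4: no hot candidate remains
        if nxt == start:
            break                   # rule 1: closed a loop on our own start
        if nxt in other_starts:
            break                   # rule 2: reached another superblock's start
        if len(sb) >= MAX_LEN:
            break                   # rule 3: length limit reached
        sb.append(nxt)
        block = nxt
    return sb

succ = {"A": "D", "D": "E", "E": "G", "G": "A"}.get
print(grow_superblock("A", succ, other_starts=set(), threshold_ok=lambda b: True))
# ['A', 'D', 'E', 'G'] -- growth stops when the loop back to A closes
```

At rule 2 the stop point is also where the two superblocks would be linked together; at rule 1 a system could instead continue growing, yielding the dynamic loop unrolling mentioned above.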
Tree Groups:
Although traces and superblocks (as well as dynamic basic blocks) are the
most commonly used units for translation and optimization, there are other
possibilities. Traces and superblocks are based on the principle that conditional
branches are predominantly decided one way. However, there are
some branches for which this is not the case. For example, almost
20% of branches have taken-versus-not-taken ratios between 30–70 and 70–30, and
almost 10% are split roughly 50–50. For branches that tend to split
their decisions, a superblock or trace side exit is frequently taken. When this
happens, there is often overhead involved in compensation code.
Optimization Framework:
We begin with traces and superblocks and consider ways of optimizing code within these large
translation blocks. Dynamic optimizations are performed in addition to any optimizations the
original compiler may have done. Because optimization is being performed
at run time, however, there are new optimization opportunities that may not
have been available to the static compiler. In general, these new opportunities
involve optimizations along frequently followed paths that cross basic block
boundaries.
Code Reordering:
An important optimization performed in a number of virtual machine applications
is code reordering. In many microarchitectures, performance is affected
by the order in which instructions are issued and executed. The most significant
examples are simple pipelined microarchitectures that execute instructions
strictly in program order. This was done in many of the early RISC processors
and is still done in a number of embedded processors.
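On such in-order pipelines, a classic reordering is to move an independent instruction into the slot between a load and its first use, hiding the load-use latency. The following is an illustrative sketch only (the instruction tuple format, dependence checks, and function name are all assumptions, not a description of any particular system):

```python
def fill_load_delay(insns):
    """Instructions are (op, dest, [sources]) tuples. If insns[i] is a
    load whose result is used immediately by insns[i+1], and insns[i+2]
    is independent of both, swap i+1 and i+2 so the dependent use is
    pushed one slot later and the pipeline does not stall."""
    out = list(insns)
    i = 0
    while i + 2 < len(out):
        op, dst, _ = out[i]
        _, ndst, nsrcs = out[i + 1]
        _, cdst, csrcs = out[i + 2]
        if (op == "load" and dst in nsrcs       # load-use stall at i+1
                and dst not in csrcs            # candidate doesn't read the load
                and ndst not in csrcs           # ...nor the stalled result
                and cdst not in nsrcs           # no anti-dependence
                and cdst != dst and cdst != ndst):  # no output clobber
            out[i + 1], out[i + 2] = out[i + 2], out[i + 1]
        i += 1
    return out

before = [
    ("load", "r1", ["r2"]),        # r1 = MEM[r2]
    ("add",  "r3", ["r1", "r4"]),  # uses r1 immediately: load-use stall
    ("sub",  "r5", ["r6", "r7"]),  # independent: can fill the delay slot
]
print(fill_load_delay(before))
# [('load', 'r1', ['r2']), ('sub', 'r5', ['r6', 'r7']), ('add', 'r3', ['r1', 'r4'])]
```

The dependence checks matter: the candidate may be moved earlier only if it neither reads nor clobbers any register the two preceding instructions produce or consume.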
Code Optimizations:
There are a number of optimizations that can be applied within translation
blocks to reduce execution time. Even if the original source binary code was
optimized when it was produced, additional optimization opportunities are
often present in the dynamic environment. For example, superblock formation
removes control flow join points, creating a locally different control flow than
in the original code. Partial procedure inlining converts what would originally
be interprocedural analysis into intrablock analysis.
Compatibility Issues:
An optimization is safe with respect to traps if, after the optimization is
performed, every trap in the original source ISA code is detected in the translated
target code and either traps to a handler in the runtime or branches
directly to the runtime. Furthermore, the precise architected state at that point
must be recoverable. There are no hard-and-fast rules for determining exactly
which optimizations are safe, but optimizations that do not remove trapping
operations typically tend to be safe. For example, copy-propagation, constant-propagation,
and constant-folding optimizations are usually safe. There may
be some end cases where compatibility becomes an issue, for example, if constant
folding happens to result in overflow. This case can be determined at
optimization time and can be disabled in those rare cases where it occurs.
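The overflow check described above can be sketched directly: fold a constant add only when the result fits the source ISA's integer range, so any trap the original code would raise is preserved. This is a minimal sketch assuming a 32-bit signed source ISA; the function name is an assumption.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def try_fold_add(a, b):
    """Return the folded constant, or None to disable the optimization
    when folding would hide a potential overflow in the source code."""
    result = a + b
    if INT32_MIN <= result <= INT32_MAX:
        return result   # safe: same value, no trap in either version
    return None         # unsafe: detected at optimization time, keep the add

print(try_fold_add(1, 2))             # 3
print(try_fold_add(INT32_MAX, 1))     # None -- folding disabled
```

Because both operands are constants, the overflow case is fully decidable at optimization time, which is exactly why the rare unsafe cases can simply be disabled.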
Inter-superblock Optimizations:
All the optimizations discussed thus far have a scope that is restricted to a
single superblock. During the optimization process only a single superblock is
buffered and analyzed. The register state at the time the superblock is entered
and exited is precisely consistent with the register state at the same points
in the source code. Even though superblocks are relatively large and provide
good optimization opportunities, there may be room for additional optimizations
that span superblock boundaries. One solution is to use tree groups, as
described above. Another solution is to optimize across superblocks
incrementally.
At the time two superblocks are linked together, they can both be reexamined
and reoptimized, based on the new knowledge of control flow and a
more complete context. In theory, this could be a complete reoptimization and
reformulation of both superblocks that are being linked. However, this would
lead to additional optimization overhead and could affect the recovery of precise
state (i.e., superblock exits may no longer be precise exception points with
a consistent register mapping). Furthermore, the original superblocks are optimized
to take into account the most common control flow path, and modifying
some of these optimizations to account for early exit paths may be counterproductive.
Consequently, it is probably a better approach to stick with the original
superblock optimizations and optimize across superblocks only at the “seams.”
Instruction-Set-Specific Optimizations:
Each instruction set has its own features and quirks that could lead to special
optimizations that are instruction-set specific.