ROSE Compiler Framework Print Version
Wikibooks.org
March 17, 2013
On the 28th of April 2012 the contents of the English as well as German Wikibooks and Wikipedia
projects were licensed under Creative Commons Attribution-ShareAlike 3.0 Unported license. A
URI to this license is given in the list of figures. If this document is a derived work
from the contents of one of these projects and the content was still licensed by the project under
this license at the time of derivation, this document has to be licensed under the same, a similar or a
compatible license, as stated in section 4b of the license. The list of contributors is included in chapter
Contributors. The licenses GPL, LGPL and GFDL are included in chapter Licenses,
since this book and/or parts of it may or may not be licensed under one or more of these
licenses, and thus require inclusion of these licenses. The licenses of the figures are given in the list of
figures. This PDF was generated by the LaTeX typesetting software. The LaTeX source
code is included as an attachment (source.7z.txt) in this PDF file. To extract the source from the
PDF file, we recommend the use of the http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
utility or clicking the paper clip attachment symbol on the lower left of your PDF viewer and selecting
Save Attachment. After extracting it from the PDF file you have to rename it to source.7z. To
uncompress the resulting archive we recommend the use of http://www.7-zip.org/. The LaTeX
source itself was generated by a program written by Dirk Hünniger, which is freely available under
an open source license from http://de.wikibooks.org/wiki/Benutzer:Dirk_Huenniger/wb2pdf.
This distribution also contains a configured version of the pdflatex compiler with all necessary
packages and fonts needed to compile the LaTeX source included in this PDF file.
Contents
2 ROSE's Documentations
3 Obtaining ROSE
3.1 Git Repositories
3.2 Virtual Machine Image
3.3 git 1.7.10 or later for github.com
3.4 EDG source code
3.5 EDG tarball
4 Installation
4.1 Platform Requirement
4.2 Software Requirement
4.3 ./build
4.4 configure
4.5 make
4.6 make check
4.7 make install
4.8 set environment variables
4.9 try out a rose translator
4.10 Trouble shooting
6 ROSE tools
6.1 prerequisites
6.2 identityTranslator
6.3 AST dot graph generators
6.4 call graph generator
6.5 Control flow graph generator
6.6 TODO
9 Program Translation
9.1 Documentation
9.2 Expected behavior of a ROSE Translator
9.3 SageBuilder and SageInterface
9.4 Steps for writing translators
9.5 Order to traverse AST
9.6 Example translators
9.7 Trouble shooting
10 Program Analysis
10.1 control flow graph
10.2 Virtual Function Analysis
10.3 Def-use analysis
10.4 Pointer Analysis
10.5 SSA
10.6 Side Effect Analysis
10.7 Generic Dataflow Framework
10.8 Dependence analysis
12 Program Optimizations
13 ROSE Projects
13.1 minitermite
14 Developer's Guide
14.1 Basic skills for ROSE developers
14.2 Valued Contributions
14.3 Milestones for a ROSE developers
14.4 Termination checklist
14.5 code review
14.6 Working from a Lab machine
15 Workflow
15.1 Motivation and Goals
15.2 Development Guide
15.3 High Level Workflow
15.4 Proposing Workflow Changes
15.5 Reviewing Workflow Change Proposals
16 Coding Standard
16.1 What to Expect and What to Avoid
16.2 Git Convention
16.3 Design Document
16.4 Testing
16.5 Programming Languages
16.6 Naming Conventions
16.7 Directories
16.8 Files
16.9 README
16.10 Source Code Documentation
16.11 Functions
16.12 Comments
16.13 Coding
16.14 Classes
16.15 Statements
16.16 Expressions
16.17 AST Translators
16.18 References
20 How-tos
20.1 How to write a How-to
20.2 How to incrementally work on a project
20.3 How to create a translator
20.4 Sample translators
20.5 How to build your translator
20.6 How to create a cross-language translator
20.7 How to set up the makefile for a translator
20.8 How to debug a translator
20.9 How to add a new project directory
20.10 How to fix a bug
20.11 How to add a ROSE commandline option
22 Testing
22.1 make check rules
22.2 Benchmarks
22.3 Modena Jt++ Test Suite
22.4 Jenkins
23 Git
23.1 Introduction
23.2 git 1.7.10 or later for github.com
23.3 Converting from a Subversion user
23.4 Git Convention
23.5 Push
23.6 Rebase
23.7 References
24 Lattices
24.1 Introduction
24.2 Poset
24.3 Lattice Definition
24.4 Infinite vs. Finite lattices
24.5 Example: Bit vector Lattices
24.6 Monotonic Functions
24.7 Examples
24.8 Lattice Tuples
24.9 integer value: ICP
24.10 Relevance to data flow analysis
29 Sandbox
29.1 How to create a new page
29.2 How to do XYZ in wiki?
29.3 How to add comments which are only visible to editor, not readers of a page?
29.4 Syntax highlighting
29.5 Math formula
30 Contributors
31 Licenses
31.1 GNU GENERAL PUBLIC LICENSE
31.2 GNU Free Documentation License
31.3 GNU Lesser General Public License
1 About the Book
1.1 Goal
The goal of this book is to provide community documentation with extensive and
up-to-date instructional information about how to use the open-source ROSE compiler
framework (http://en.wikipedia.org/wiki/ROSE%20%28compiler%20framework%29), developed at
Lawrence Livermore National Laboratory
(http://en.wikipedia.org/wiki/Lawrence%20Livermore%20National%20Laboratory).
While the ROSE project website (http://www.rosecompiler.org) already has a variety
of official documentation, a wikibook for ROSE allows anybody to contribute
instructional information about this software.
Again, please note that this wikibook is not the official documentation of ROSE. It is a
community effort with contributions from people just like you.
If you want to contribute, first make sure your contributions are relevant
to this wikibook about ROSE.
• Welcomed contributions:
  • Fix typos and grammar of existing pages to improve quality, clarity, and readability.
  • Add new pages about ROSE-specific tutorials, how-tos, FAQs, and workflows.
  • Start discussions on the Discussion tab of an existing page about suggestions for
    how things can be done better than the current practice.
• What will not be kept: copy-and-paste of general guidelines for doing things. Please
  just summarize them in the ROSE-relevant wikibook page and give a reference (URL) to the source.
Once you are certain of the relevance of your contributions, please read how to make one example
contribution:
• http://en.wikibooks.org/wiki/ROSE_Compiler_Framework/How_to_write_a_How-to
• You can test the waters of wikibook editing using
  http://en.wikibooks.org/wiki/ROSE_Compiler_Framework/Sandbox
• Occasionally, you may want to insert figures into a wiki page. You can do this by
  first uploading the file through Left menu -> Toolbox -> Upload file.
• The upload link will direct you to Media Commons; more information is at
  http://en.wikipedia.org/wiki/Wikipedia:Wikimedia_Commons#Embedding_Wikicommons.27_media_in_Wikipedia_articles
• Bottom line: make sure your contributions are visible in the print version of this book
  and are logically consistent with the rest of the content.
  • Link: http://en.wikibooks.org/wiki/ROSE_Compiler_Framework/Print_version
• Thank you!
1.2.1 Conventions
<source lang="<language>">
<Code goes here...>
</source>
(Enclosing code in a <pre></pre> block is also okay, but the highlighted code block is
preferred.)
• Headings: The first word of a heading title should be capitalized; every other word
should be in lowercase, where applicable.
If you want to be notified of changes to this book, Wikibooks provides email notifications
for changes to wiki pages that you explicitly choose to watch
(http://en.wikibooks.org/wiki/WATCH%23Watching_pages).
To use this feature:
1. Create an account with Wikibooks:
http://en.wikibooks.org/w/index.php?title=Special:UserLogin&returnto=Main+Page&type=signup
2. Log in to Wikibooks and set your preferences (top right corner of the web page) for both
email notifications and your watchlist:
• Email notification settings
  • Preferences -> User profile -> E-mail notifications -> E-mail me when a page on my
    watchlist is changed (check this option)
• Define your watchlist
  • Preferences -> Watchlist -> Advanced options -> select the options you want,
    such as "Add pages I edit to my watchlist" and "Add pages I create to my watchlist"
  • You can also individually watch and unwatch any wiki page by clicking the star on
    the page's tab list (after View history)
Caveat: we do not know whether Wikibooks lets users watch an entire book. So far, you
have to do this one page at a time, for example by editing each page at some point.
1. What exactly is "BookCat" for? It is a category tag automatically added by wiki robot
scripts.
2. Should "BookCat" be at the end of the document? Any position in the page should be
fine. Having it at the top may be better so it won't be accidentally deleted when we add new
things at the bottom.
2 ROSE's Documentations
3 Obtaining ROSE
ROSE's source files are managed by git, a distributed revision control and source code
management system. There are several ways to download the source tree:
HTTP 401 means unauthorized access. If you are not prompted for your username and password in
the shell, you need to use X forwarding so that the authentication window can pop up:
machine github.llnl.gov
login <username>
password <password>
• Public Repositories
It can take quite some time to install ROSE for the first time. A virtual machine image is
provided with an Ubuntu 10.04 OS and ROSE already installed.
You can download it and run it using VMware Player:
• http://www.rosecompiler.org/Ubuntu-ROSE-Demo.tar.gz
• Demonstration user account (sudo user in Ubuntu):
• account: demo
• password: password
• Warning: The file is quite large at 4.8 GB
More information is in the chapter Virtual machine image (Chapter 5).
GitHub requires git 1.7.10 or later to avoid HTTPS cloning errors, as mentioned at
https://help.github.com/articles/https-cloning-errors
Ubuntu 10.04's package repository contains git 1.7.0.4, so you need to build a later version of
git from source. You still need the older, packaged git to fetch the latest git sources.
Install all prerequisite packages needed to build git from source files (assuming you have already
installed the GNU toolchain with the GCC compiler, make, etc.).
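The git version requirement above can be checked with a few lines of shell before deciding whether to build git from source. The following is a minimal sketch, not part of ROSE: the helper name version_ge is made up for this example, and it assumes that `git --version` prints the version number as its third word.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= dotted version B.
# Fields are compared numerically, so 1.7.10 counts as newer than 1.7.0.4.
# (version_ge is a hypothetical helper written for this example.)
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = x[i] + 0; yi = y[i] + 0    # missing fields count as 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

required=1.7.10
if command -v git >/dev/null 2>&1; then
    # Assumption: output looks like "git version X.Y.Z"
    installed=$(git --version | awk '{print $3}')
    if version_ge "$installed" "$required"; then
        echo "git $installed is new enough for github.com"
    else
        echo "git $installed is older than $required; a newer git is needed"
    fi
else
    echo "git is not installed"
fi
```

The comparison is done field by field because a plain string comparison would rank 1.7.0.4 and 1.7.10 incorrectly.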
3.4 EDG source code
If you have an EDG license, we can provide you with ROSE's EDG source code. The original,
official EDG source code does NOT work with ROSE since we have modified EDG to better
serve our purposes.
Note: We provide you with a snapshot of our Git revision controlled ROSE-EDG source
code repository. This way, you can more easily contribute your EDG modifications back
into ROSE.
1. Send your EDG (research) license to two ROSE staff members, just in case one is on
vacation or on travel.
2. Provide ROSE staff with a drop-off location for the EDG source code (ssh or ftp server,
etc.)
3. Once you receive the EDG source code, you have two options:
3.4.1 As a submodule
a. Use ROSE-EDG as a submodule (assuming you have ROSE's Git source tree):
This is the recommended way to use the EDG Git repository we provide. It assumes
that you work in a local Git clone of ROSE ($ROSE).
Edit the submodule URL in $ROSE/.gitmodules to point to your ROSE-EDG repository:
[submodule "src/frontend/CxxFrontend/EDG"]
path = src/frontend/CxxFrontend/EDG
- url = ../ROSE-EDG.git
+ url = <path/to/your/ROSE-EDG.git>
-[submodule "projects/vulnerabilitySeeding"]
- path = projects/vulnerabilitySeeding
- url = ../vulnerabilitySeeding.git
$ cd $ROSE
$ git submodule init
$ git submodule update
The commands above check out a version of the EDG submodule and save it into
$ROSE/src/frontend/CxxFrontend/EDG.
For details on submodules, see http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html
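As an alternative to editing .gitmodules by hand, the same change can be scripted with git config, which can read and write any file in git's configuration format. This is only a sketch: the function names point_edg_submodule and drop_submodule are hypothetical, and the repository path in the usage comment is a placeholder.

```shell
#!/bin/sh
# point_edg_submodule GITMODULES URL
# Set the URL of the EDG submodule in the given .gitmodules file.
# (Hypothetical helper; the submodule name is taken from the listing above.)
point_edg_submodule() {
    git config --file "$1" \
        submodule.src/frontend/CxxFrontend/EDG.url "$2"
}

# drop_submodule GITMODULES NAME
# Remove a whole [submodule "NAME"] section from the given file.
drop_submodule() {
    git config --file "$1" --remove-section "submodule.$2"
}

# Example usage (placeholder path, run from a ROSE clone):
#   point_edg_submodule "$ROSE/.gitmodules" /path/to/your/ROSE-EDG.git
#   drop_submodule "$ROSE/.gitmodules" projects/vulnerabilitySeeding
#   cd "$ROSE" && git submodule init && git submodule update
```

Scripting the edit keeps the change repeatable when a fresh clone has to be prepared more than once.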
3.4.2 As a Drop-in
b. As a Drop-in
Move the ROSE-EDG tarball into its correct location within the ROSE source tree:
$ROSE/src/frontend/EDG
Warning: This method may not work, because EDG is a submodule of ROSE and therefore
requires version synchronization between the two. For example, the latest version of ROSE
may not use the latest version of ROSE's EDG.
4. In ROSE, run the $ROSE/build script from the top-level of the ROSE source tree, i.e.
$ROSE. This script bootstraps Autotools, including the Makefile.ams in the EDG source
tree.
5. Configure and build ROSE: Normally, during this process ROSE would attempt to
download an EDG binary tarball for you, but since you have the source code, this step will
be skipped.
3.5.1 Process
If you don't have access to the EDG source code, a packaged EDG binary tarball is downloaded
automatically during the ROSE build process. The download is triggered
during make in $ROSE_BUILD/src/frontend/CxxFrontend.
The EDG binary version is a binary compatibility signature computed from your version
of ROSE. You can check this version by running $ROSE/scripts/bincompat-sig, for
example:
$ ./scripts/bincompat-sig
7b1930fafc929de85182ee1a14c86758
$ ./scripts/bincompat-sig
Unable to find a remote tracking a canonical repository. Please add a
canonical repository as a remote and ensure it is up to date. Currently
configured remotes are:
If you do, simply add ".git" to the end of your origin's URL path. In our example, this
translates to:
https://github.com/rose-compiler/rose.git
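Appending the suffix can also be scripted. The sketch below uses a hypothetical helper, with_git_suffix, that leaves a URL untouched when it already ends in .git; the commented command shows one way the result might be applied to the origin remote.

```shell
#!/bin/sh
# with_git_suffix URL: print URL with a trailing ".git", adding it if missing.
# (Hypothetical helper written for this example.)
with_git_suffix() {
    url=${1%/}                      # drop one trailing slash, if any
    case "$url" in
        *.git) printf '%s\n' "$url" ;;
        *)     printf '%s\n' "$url.git" ;;
    esac
}

with_git_suffix https://github.com/rose-compiler/rose
# prints: https://github.com/rose-compiler/rose.git

# In a real clone, the result could be applied with:
#   git remote set-url origin \
#       "$(with_git_suffix "$(git config --get remote.origin.url)")"
```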
4 Installation
ROSE is released as an open source software package. Users are expected to compile and
install the software.
ROSE is portable to Linux and Mac OS X on IA-32 and x86-64 platforms. In particular,
ROSE developers often use the following development environments:
• Red Hat Enterprise Linux 5.6 or its open source equivalent CentOS (http://www.centos.org/) 5.6
• Ubuntu 10.04.4 LTS. Later versions of Ubuntu are NOT supported because of the GCC
versions supported by ROSE.
• Mac OS X 10.5 and 10.6
• libxml2-devel
• sqlite
• texlive-full, need for building LaTeX docs
./configure --prefix=/home/usera/opt/boost-1.35.0
make
make install
Ignore warnings like: Unicode/ICU support for Boost.Regex?... not found.
For versions 1.39 and 1.47: create the boost installation directory first.
In the boost source tree:
• ./bootstrap.sh --prefix=your_boost_install_path
• ./bjam -j4 install --prefix=your_boost_install_path --libdir=your_boost_install_path/lib
Remember to export LD_LIBRARY_PATH for the installed boost library, for example:
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/leo/opt/boost_1.45.0_inst/lib
export PATH LD_LIBRARY_PATH
4.3 ./build
In general, it is better to rebuild the configure file in the top level source directory of ROSE.
Just type:
rose_sourcetree>./build
4.4 configure
The next step is to run configure in a separate build tree. ROSE will complain if you try
to build it within its source directory.
There are many configuration options. You can see the full list of options by typing
../rose_sourcetree/configure --help. Only --prefix and --with-boost are required as the
minimum options.
mkdir buildrose
cd buildrose
../rose_sourcetree/configure --prefix=/home/user/opt/rose_tux284 \
    --with-boost=/home/user/opt/boost-1.36.0/
ROSE's configure turns on debugging options by default, so the generated object files should
already contain debugging information.
Additional useful configure options
• Specify where GCC's OpenMP runtime library libgomp.a is located. Only GCC 4.4's
gomp library should be used in order to have OpenMP 3.0 support.
• --with-gomp_omp_runtime_library=/usr/apps/gcc/4.4.1/lib/
4.5 make
cd buildrose
make -j4
This builds all of ROSE, including librose.so, tutorials, projects, tests, and so on. -j4
tells make to use four processes for the build. You can use a larger number if your
machine supports more concurrent processes. Still, the entire process will take hours to
finish.
For most users, building librose.so should be enough for most of their work. In this case,
just type
Optionally, you can type make check to make sure the compiled ROSE passes all its shipped
tests. This again takes hours, since it goes through all make check rules within the projects,
tutorial, and tests directories.
To save time, you can run only part of the tests under a selected directory, such as buildrose/tests.
After "make", it is recommended to run "make install" so that ROSE's library (librose.so), headers
(rose.h), and some prebuilt ROSE-based tools are installed under the installation
path specified with --prefix.
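The steps from ./build through make install can be collected into one script. The following is a minimal sketch under assumptions: the variables ROSE_SRC, ROSE_BUILD, ROSE_PREFIX, BOOST_DIR, and DRY_RUN and the helpers step and build_rose are names invented for this example, and all default paths are placeholders. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Placeholder locations; adjust them to your system.
ROSE_SRC=${ROSE_SRC:-$HOME/rose_sourcetree}     # ROSE source tree
ROSE_BUILD=${ROSE_BUILD:-$HOME/buildrose}       # separate build tree
ROSE_PREFIX=${ROSE_PREFIX:-$HOME/opt/rose}      # installation path
BOOST_DIR=${BOOST_DIR:-$HOME/opt/boost}         # boost installation
DRY_RUN=${DRY_RUN:-1}    # 1: only print the commands; 0: execute them

# step CMD...: print the command, and run it unless DRY_RUN is 1.
step() {
    echo "+ $*"
    if [ "$DRY_RUN" = 0 ]; then
        "$@" || exit 1
    fi
}

build_rose() {
    step cd "$ROSE_SRC"
    step ./build                      # regenerate the configure script
    step mkdir -p "$ROSE_BUILD"
    step cd "$ROSE_BUILD"             # configure outside the source tree
    step "$ROSE_SRC/configure" --prefix="$ROSE_PREFIX" --with-boost="$BOOST_DIR"
    step make -j4
    step make install
}

build_rose
```

Keeping the dry run as the default makes it easy to review the exact commands before starting a build that takes hours.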
After the installation, you should set up some standard environment variables so you can
use rose. For bash, the following is an example:
ROSE_INS=/home/userx/opt/rose_installation_tree
PATH=$PATH:$ROSE_INS/bin
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ROSE_INS/lib
# Don't forget to export variables !!!
export PATH LD_LIBRARY_PATH
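Note that with the plain assignment above, an empty LD_LIBRARY_PATH yields a value that starts with a colon, and the dynamic loader treats such an empty entry as the current directory. A slightly more defensive variant, sketched here with a hypothetical helper named append_path, avoids the empty entry and does not add the same directory twice.

```shell
#!/bin/sh
# append_path CURRENT NEW: print CURRENT with NEW appended, ':'-separated.
# No separator is emitted when CURRENT is empty, and NEW is not added twice.
# (append_path is a hypothetical helper written for this example.)
append_path() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "${1:+$1:}$2" ;;
    esac
}

# Same example installation path as above; override ROSE_INS as needed.
ROSE_INS=${ROSE_INS:-/home/userx/opt/rose_installation_tree}
PATH=$(append_path "$PATH" "$ROSE_INS/bin")
LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" "$ROSE_INS/lib")
export PATH LD_LIBRARY_PATH
```

Because the helper skips directories that are already present, the snippet can be sourced repeatedly from a shell startup file without growing the variables.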
There are quite a few pre-built ROSE translators installed under $ROSE_INS/bin.
You can try identityTranslator, which just parses the input code, generates the AST, and unparses
it back to source code:
identityTranslator -c helloWorld.c
It should generate an output file named rose_helloWorld.c, which should just look like your
input code.
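A quick way to try this is sketched below. The file contents and the temporary directory are only illustrative, and the script falls back to a message when identityTranslator is not on the PATH.

```shell
#!/bin/sh
# Create a small input file in a scratch directory (illustrative content).
workdir=$(mktemp -d)
cat > "$workdir/helloWorld.c" <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("Hello, world!\n");
    return 0;
}
EOF

if command -v identityTranslator >/dev/null 2>&1; then
    # Same invocation as shown above; the unparsed file is expected
    # next to the input as rose_helloWorld.c.
    (cd "$workdir" && identityTranslator -c helloWorld.c && ls -l rose_helloWorld.c)
else
    echo "identityTranslator is not on PATH; check ROSE_INS and the exported variables"
fi
```

Comparing helloWorld.c with rose_helloWorld.c (for example with diff) then shows how closely the unparsed code follows the input.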
4.10 Trouble shooting
If you do not have the EDG frontend source code, ROSE's build system will automatically
attempt to download an appropriate EDG binary using wget during the build process (i.e.
make -C src/frontend/CxxFrontend).
The EDG binaries are platform-specific and have historically been a cause of issues, e.g.
Autoconf detecting the wrong host/build/platform types. One possible remedy to these problems
is to use the Autoconf build and host options
(http://sources.redhat.com/autobook/autobook/autobook_266.html; see also
http://sources.redhat.com/autobook/autobook/autobook_261.html#SEC261):
1. Check what build system Autoconf thinks you have:
$ ./config/config.guess
x86_64-unknown-linux-gnu
$ $ROSE/configure [--build|--host|--target|...]
Hi Justin,
/Users/ma23/ROSE/configure --with-CXX_DEBUG=-ggdb3 \
    --with-CXX_WARNINGS=-Wall \
    --with-boost=/Users/ma23/Desktop/ROSE/boost/BOOST_INSTALL \
    --with-gfortran=/Users/ma23/Desktop/macports/bin/gfortran-mp-4.4 \
    --with-alternate_backend_fortran_compiler=gfortran-mp-4.4 \
    GFORTRAN_PATH=/Users/ma23/Desktop/macports/bin/gfortran-mp-4.4 \
    --build=x86_64-apple-darwin10
At last, make :)
Thanks:)
Regards,
Hongyi Ma
5 Virtual machine image
A test translator
• /home/demo/myTranslator
Some dot graphs of a simple function. Typing "run.sh file.dot" displays a dot file.
• /home/demo/dotGraphs
You have to install VMware Player on your machine to use the virtual machine image.
Go to http://www.vmware.com/go/downloadplayer/
Select the right bundle for your platform. For example: VMware-Player-4.0.4-744019.i386.txt
After downloading (assuming you are using Ubuntu 10.04):
• chmod a+x VMware-Player-4.0.4-744019.i386.txt
• sudo ./VMware-Player-4.0.4-744019.i386.txt
• follow the GUI to finish the installation
To start VMware Player, go to Menu -> Applications -> System Tools -> VMware Player.
After downloading the tar.gz package and extracting it into a directory, use VMware Player to open
the configuration file in that directory.
We used Ubuntu 10.04 LTS as a host machine to create the virtual machine image.
uname -a
Linux 8core-ubuntu 2.6.32-41-generic-pae #91-Ubuntu SMP Wed Jun 13 12:00:09 UTC 2012 i686 GNU/Linux
cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.4 LTS"
5.2.2 Configurations
VMware player has been installed onto the host machine, as described above.
Basic configuration for the virtual machine
Hardware
• Memory : 2 GB
• Processors: 2
• Hard Disk size: 15 GB: We would like to keep it small while having enough space for
users.
• 5GB is used for Ubuntu system files and
• 10GB for the demonstration user's home directory
• Network Adapter: NAT: share the host's IP address
OS
• OS: Ubuntu 10.04 LTS
• Demonstration user account (sudo user in Ubuntu):
• account: demo
• password: password
• screen size: 1280x960 (4:3)
Download Ubuntu 10.04 LTS from http://releases.ubuntu.com/lucid/. We currently use the
i386 desktop ISO as the starting point:
• http://releases.ubuntu.com/lucid/ubuntu-10.04.4-desktop-i386.iso
Here are some general guidelines for creating a new virtual machine. Following these exact
steps is not required, although it is recommended to ensure a consistent user experience
with the ROSE VMs.
Please make sure you document the whole process in its entirety.
These steps must be performed within the VM (guest OS):
1. Install the prerequisite software (see Chapter 4, Installation) using the platform's software
package manager. Only as a last resort should you manually install software. Use the platform's
default software versions if possible. (Use bash as the default login shell.)
$ export ROSE_HOME=${HOME}/development/projects/rose
$ export ROSE_SOURCE=${HOME}/development/projects/rose/src
$ export ROSE_INSTALL=${HOME}/development/opt/rose
$ mkdir -p "$ROSE_HOME"
$ mkdir -p "$ROSE_INSTALL"
$ cd "$ROSE_HOME"
$ git clone https://github.com/rose-compiler/rose "$ROSE_SOURCE"
$ cd "$ROSE_SOURCE"
$ make
$ make install
6 ROSE tools
6.1 prerequisites
You have to install ROSE first, by typing configure, make, make install, etc.
You also have to set the environment variables properly before you can call ROSE tools
from command line.
For example: if the installation path (or --prefix path in configure) is /home/opt/rose/install,
you can have the following script to set the environment variables using bash:
ROSE_INS=/home/opt/rose/install
export ROSE_INS
PATH=$ROSE_INS/bin:$PATH
export PATH
LD_LIBRARY_PATH=$ROSE_INS/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH
6.2 identityTranslator
6.2.1 Uses
identityTranslator.c
#include "rose.h"
int main(int argc, char *argv[]){
// Build the AST used by ROSE
SgProject *project = frontend(argc, argv);
// Generate source code from AST and call the vendor's compiler
return backend(project);
}
6.2.3 Limitations
Due to limitations of the frontends and the internal processing, identityTranslator cannot
generate output that is 100% identical to the input file.
Some notable changes it may introduce include:
• "int a, b, c;" is transformed into three SgVariableDeclaration statements.
• Macros are expanded.
• Extra parentheses are added around constants of typedef types (e.g. c=Typedef_Example(12);
is translated in the output to c = Typedef_Example((12));).
• NULL is converted to 0.
Tools to generate an AST graph in dot format. There are two versions:
• dotGenerator: a simple AST graph generator showing essential nodes and edges
• dotGeneratorWholeASTGraph: a whole AST graph generator showing more details. It provides filter
options to show/hide certain AST information.
command line:
dotGeneratorWholeASTGraph yourcode.c
dotGeneratorWholeASTGraph --help
-rose:help                                          show this help message
-rose:dotgraph:asmFileFormatFilter           [0|1]  Disable or enable asmFileFormat filter
-rose:dotgraph:asmTypeFilter                 [0|1]  Disable or enable asmType filter
-rose:dotgraph:binaryExecutableFormatFilter  [0|1]  Disable or enable binaryExecutableFormat filter
-rose:dotgraph:commentAndDirectiveFilter     [0|1]  Disable or enable commentAndDirective filter
-rose:dotgraph:ctorInitializerListFilter     [0|1]  Disable or enable ctorInitializerList filter
-rose:dotgraph:defaultFilter                 [0|1]  Disable or enable default filter
-rose:dotgraph:defaultColorFilter            [0|1]  Disable or enable defaultColor filter
-rose:dotgraph:edgeFilter                    [0|1]  Disable or enable edge filter
-rose:dotgraph:expressionFilter              [0|1]  Disable or enable expression filter
-rose:dotgraph:fileInfoFilter                [0|1]  Disable or enable fileInfo filter
-rose:dotgraph:frontendCompatibilityFilter   [0|1]  Disable or enable frontendCompatibility filter
-rose:dotgraph:symbolFilter                  [0|1]  Disable or enable symbol filter
-rose:dotgraph:emptySymbolTableFilter        [0|1]  Disable or enable emptySymbolTable filter
-rose:dotgraph:typeFilter                    [0|1]  Disable or enable type filter
-rose:dotgraph:variableDeclarationFilter     [0|1]  Disable or enable variableDeclaration filter
-rose:dotgraph:variableDefinitionFilter      [0|1]  Disable or enable variableDefinitionFilter filter
-rose:dotgraph:noFilter                      [0|1]  Disable or enable no filtering
Current filter flags' values are:
m_asmFileFormat = 0
m_asmType = 0
m_binaryExecutableFormat = 0
m_commentAndDirective = 1
m_ctorInitializer = 0
m_default = 1
m_defaultColor = 1
m_edge = 1
m_emptySymbolTable = 0
m_expression = 0
m_fileInfo = 1
m_frontendCompatibility = 0
m_symbol = 0
m_type = 0
m_variableDeclaration = 0
m_variableDefinition = 0
m_noFilter = 0
ROSE tools
Command line:
buildCallGraph -c yourprogram.cpp
Command line:
virtualCFG -c yourprogram.c
6.6 TODO
Refactor the tools into a dedicated rose/tools directory, so that they are always built and
available by default, with minimal dependencies on other settings such as which languages are
turned on or off (when applicable, of course).
Our current idea is to separate translators used as examples or tutorials from
translators used to create end-user tools.
• Tutorial translators should NOT be installed as tools by default. Their purpose
is to be included in the Manual or Tutorial pdf files to illustrate something to developers by
example. These examples should be concise and to the point.
• Translators used to build end-user tools, on the other hand, should meet a much higher
standard and accept command options for different, even advanced, features. These
translators can be very sophisticated since they do not have the page limitation that tutorial
examples have.
7 Supported Programming Languages
ROSE supports a wide range of mainstream programming languages, with different degrees
of maturity. The list of supported languages includes:
• C and C++: based on the EDG C++ frontend1 version 3.3.
• An ongoing effort is to upgrade the EDG frontend to the more recent version 4.4.
• Another ongoing effort is to use clang as an alternative, open-source C/C++ frontend.
• Fortran 77/95/2003: based on the Open Fortran Parser2
• OpenMP 3.0: based on ROSE's own parsing and translation support for both C/C++
and Fortran OpenMP programs.
• UPC 1.1: also based on the EDG 3.3 frontend
7.1 OpenMP
ROSE supports OpenMP 3.0 for C/C++ (and limited Fortran support).
• The ROSE manual has a chapter (Chapter 12 OpenMP Support) explaining the details: pdf3
• A paper describing the unique features of the ROSE OpenMP implementation has been published: pdf4
• Frontend parsing source files (ompparser.yy and ompFortranParser.C) are located under
https://github.com/rose-compiler/rose/tree/master/src/frontend/SageIII
• The transformation of OpenMP into threaded code is implemented in omp_lowering.cpp, under
https://github.com/rose-compiler/rose/blob/master/src/midend/programTransformation/ompLowering
• The OpenMP runtime interface is defined in libxomp.h and xomp.c under the same
ompLowering directory mentioned above
Experimental OpenMP Accelerator Model Implementation
• OpenMP Accelerator Model Implementation5
7.2 UPC
1 http://www.edg.com/index.php?location=c_frontend
2 http://fortran-parser.sourceforge.net/
3 http://rosecompiler.org/ROSE_UserManual/ROSE-UserManual.pdf
4 http://rosecompiler.org/ROSE_ResearchPapers/2010-06-AROSEBasedOpenMP3.0ResearchCompiler-IWOMP.pdf
5 http://en.wikibooks.org/wiki/ROSE%20Compiler%20Framework%2FOpenMP%20Acclerator%20Model%20Implementation
• The supported version is limited by the EDG 3.3 frontend, which only supports UPC
1.1.1 (the __UPC_VERSION__ string is defined as 200310L). ROSE currently uses EDG 3.3,
which originally supported only UPC 1.0. We merged the UPC 1.1.1 support from EDG 3.10
into our EDG 3.3 frontend. We have also added the work required to support UPC 1.2.
Documentation:
• Chapter 13 UPC Support, of the ROSE manual http://rosecompiler.org/ROSE_UserManual/ROSE-UserManual.pdf
Tests: make check rule under
• rose/tests/CompileTests/UPC_tests
An example UPC-to-C translator: roseupcc
• Not full featured. It is only intended to serve as a starting point for anybody who is
interested in, or funded for, implementing UPC in ROSE
• roseupcc is located in ROSE/projects/UpcTranslation
• Documented in Section 13.5, An Example UPC-to-C Translator Using ROSE, of the ROSE manual
7.3 CUDA
ROSE has an experimental connection to EDG 4.0, which helps us support CUDA.
To enable parsing CUDA codes, please use the following configuration options:
7.4 OpenCL
8 Abstract Syntax Tree (Intermediate Representation)
The main intermediate representation of ROSE is its abstract syntax tree (AST).
We provide a set of sanity checks for the AST. We use them to make sure the AST is consistent.
It is also highly recommended that ROSE developers add a sanity check after their AST
transformation is done. This is a higher standard than merely unparsing the AST to
compilable code: it is common for an AST to unparse correctly but then fail the sanity
check.
The recommended sanity check is
• AstTests::runAllTests(project); from src/midend/astDiagnostics. Internally, it calls the
following checks:
• TestAstForProperlyMangledNames
• TestAstCompilerGeneratedNodes
• AstTextAttributesHandling
• AstCycleTest
• TestAstTemplateProperties
• TestAstForProperlySetDefiningAndNondefiningDeclarations
• TestAstSymbolTables
• TestAstAccessToDeclarations
• TestExpressionTypes
• TestMangledNames::test()
• TestParentPointersInMemoryPool::test()
• TestChildPointersInMemoryPool::test()
• TestMappingOfDeclarationsInMemoryPoolToSymbols::test()
• TestLValueExpressions
• TestMultiFileConsistancy::test() //2009
• TestAstAccessToDeclarations::test(*i); // named type test
Some other checking functions exist elsewhere in the code base, but they should be merged into
AstTests::runAllTests(project)
• FixSgProject(*project); //in Qing's AST interface
• Utility::sanityCheck(SgProject* )
• Utility::consistencyCheck(SgProject*) // SgFile*
#see it
which run.sh
~/64home/opt/zgrviewer-0.8.2/run.sh
run.sh ttt.c_WholeAST.dot
Just call SgNode::unparseToString(). You can call it on any SgLocatedNode within the
AST to dump the text form of a partial AST.
In addition to nodes and edges, the ROSE AST may carry attributes that hold
preprocessing information such as #include or #if .. #else. They
are attached before, after, or within a nearby AST node (only one that has source location
information).
An example translator traverses the input code's AST and dumps information, which may
include preprocessing information. For example:
exampleTranslators/defaultTranslator/preprocessingInfoDumper -c main.cxx
-----------------------------------------------
Found an IR node with preprocessing Info attached:
(memory address: 0x2b7e1852c7d0 Sage type: SgFunctionDeclaration) in file
/export/tmp.liao6/workspace/userSupport/main.cxx (line 3 column 1)
-------------PreprocessingInfo #0 ----------- :
classification = CpreprocessorIncludeDeclaration:
String format = #include "all_headers.h"
AST Construction
SageBuilder and SageInterface namespaces provide functions to create ASTs and manipulate
them.
9 Program Translation
With its high level intermediate representation, ROSE is suitable for building source-to-
source translators. This is achieved by re-structuring the AST of the input source code,
then unparsing the transformed AST to the output source code.
9.1 Documentation
A translator built using ROSE is designed to act like a compiler (gcc, g++, gfortran, etc.,
depending on the input file types).
Users of the translator therefore only need to change the build system for the input files to use the
translator instead of the original compiler.
The official guide for restructuring or constructing an AST highly recommends using helper
functions from the SageBuilder and SageInterface namespaces to create AST pieces and move
them around. These helper functions try to be stable across low-level changes and to be smart
enough to transparently set many edges and maintain symbol tables.
Users who want lower-level control can directly invoke the member functions
of AST nodes and symbol tables to explicitly manipulate edges and symbols in the AST,
but this process is very tedious and error-prone.
Some builder functions may not be provided yet, especially for C++ constructs
such as template declarations. We are actively working on this. In the meantime, you can
directly use the new operator and other member functions as a workaround.
Generic steps:
A naive pre-order traversal is not suitable for building a translator, since the translator may
change the nodes that the traversal is expected to visit later on. Conceptually, this is
similar to the C++ iterator invalidation problem.
To transform the AST safely, it is recommended to use a reverse iterator over the statement list
generated by a pre-order traversal. This is different from the list generated by a post-order
traversal.
For example, assuming we have a subtree parent <child 1, child 2>:
• Pre-order traversal generates the list: parent, child 1, child 2
• Post-order traversal generates the list: child 1, child 2, parent
• A reverse iterator over the pre-order list gives: child 2, child 1, parent. Transforming
in this order is the safest in our experience.
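The three orders can be reproduced on a toy tree. The sketch below is only an illustration under stated assumptions: Node, preOrder, postOrder, reversePreOrder and names are hypothetical helpers written for this example and are not part of the ROSE API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy tree node (an assumption for this sketch, not a ROSE SgNode).
struct Node {
    std::string name;
    std::vector<Node*> children;
};

// Pre-order: visit the parent first, then the children from left to right.
void preOrder(Node* n, std::vector<Node*>& out) {
    assert(n != nullptr);
    out.push_back(n);
    for (Node* c : n->children) preOrder(c, out);
}

// Post-order: visit the children from left to right, then the parent.
void postOrder(Node* n, std::vector<Node*>& out) {
    assert(n != nullptr);
    for (Node* c : n->children) postOrder(c, out);
    out.push_back(n);
}

// Build the pre-order list first, then walk it backwards. The list is fixed
// before any transformation starts, and later siblings are processed first.
std::vector<Node*> reversePreOrder(Node* root) {
    std::vector<Node*> list;
    preOrder(root, list);
    return std::vector<Node*>(list.rbegin(), list.rend());
}

// Join the node names of a list for display or comparison.
std::string names(const std::vector<Node*>& v) {
    std::string s;
    for (Node* n : v) {
        if (!s.empty()) s += ", ";
        s += n->name;
    }
    return s;
}
```

For the subtree parent <child 1, child 2>, names(reversePreOrder(&parent)) yields the children in reverse order followed by the parent, which differs from the post-order list only in the order of the siblings.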
Trouble shooting
SageBuilder::pushScopeStack(body);
SgAssignOp* sao = isSgAssignOp(sgn);
if(!sao)
return;
10 Program Analysis
The virtual control flow graph (vcfg) is generated on the fly when needed, so
there is never a mismatch between the ROSE AST and its corresponding control flow graph. The
downside is that the same vcfg is regenerated each time it is needed, which can be a
performance bottleneck.
Facts
• Documentation: virtual CFG is documented in Chapter 19 Virtual CFG of ROSE
tutorial pdf1
• Source Files:
• src/frontend/SageIII/virtualCFG/virtualCFG.h
• src/frontend/SageIII/virtualCFG/virtualCFG.C // not only gives definitions for virtualCFG.h,
but also extends AST node support in VirtualCFG
• src/ROSETTA/Grammar/Statement.code // prototypes of member functions for
SgStatement nodes, etc.
• src/ROSETTA/Grammar/Expression.code // prototypes of member functions for
SgExpression nodes, etc.
• src/ROSETTA/Grammar/Support.code // prototypes of member functions for
SgInitialized(LocatedNode) nodes, etc.
• src/ROSETTA/Grammar/Common.code // prototypes of member functions for other
nodes, etc.
• src/frontend/SageIII/virtualCFG/memberFunctions.C // implementation of virtual
CFG related member functions for each AST node
1 http://www.rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf
SgOmpClauseBodyStatement::cfgIndexForEnd() const {
  if( idx == 0 )
  {
    makeEdge( getNodeJustBeforeInContainer( this ), CFGNode( this, idx ), result );
  }
  else
  {
control flow graph
return result; }
  {
    makeEdge( CFGNode( this ,idx), getNodeJustAfterInContainer( this ), result );
  }
  else
  {
    if( idx == this->get_clauses().size() )
    {
      // connect variable clauses first, parallel body last
      makeEdge( CFGNode( this, idx ), this->get_body()->cfgForBeginning(), result );
    }
    else
    {
      if( idx < this->get_clauses().size() ) // connect variable clauses first, parallel body last
      {
        makeEdge( CFGNode( this, idx ), this->get_clauses()[idx]->cfgForBeginning(), result );
      }
      else
      {
        ROSE_ASSERT( !"Bad index for SgOmpClauseBodyStatement" );
      }
    }
  }
  return result; }
Virtual Function Analysis
Because of the performance concerns with the virtual control flow graph, we developed a static
version that persists in memory like a regular graph.
Facts:
• Documentation: 19.7 Static CFG of ROSE tutorial pdf2
• Test Directory: rose/tests/CompileTests/staticCFG_tests
Facts:
• Documentation: 19.8 Static, Interprocedural CFGs of ROSE tutorial pdf3
• Test Directory: rose/tests/CompileTests/staticCFG_tests
Facts
• Original contributor: Faizur from UTSA, done in Summer 2011
• Code: at src/midend/programAnalysis/VirtualFunctionAnalysis.
• Implemented with the techniques used in the following paper: "Interprocedural Pointer
Alias Analysis - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.2382".
The paper reduces virtual function resolution to a pointer aliasing problem.
It employs flow-sensitive interprocedural dataflow analysis to solve the aliasing
problem, using compact representation graphs to represent the alias relations.
• There are some test files in the roseTests folder of the ROSE repository. According to the
contributor, the implementation supports function pointers as well as code that is spread
across different files (header files etc.).
• Documentation: Chapter 24 Dataflow Analysis based Virtual Function Analysis, of ROSE
tutorial pdf
VariableRenaming v(project);
v.run();
v.getReachingDefsAtNode(...);
2 http://www.rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf
3 http://www.rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf
https://mailman.nersc.gov/pipermail/rose-public/2010-September/000390.html
> 3. Say I want to query whether two pointer variables alias and I have
> SGNodes to their declarations. How do I get the AstNodePtr needed to
> invoke the may_alias(AstInterface&, const AstNodePtr&, const
> AstNodePtr&) function? Or maybe I should rather invoke the version of
> may_alias that takes two strings (varnames)?
AstNodePtrImpl
object, i.e., do AstNodePtrImpl(x), as illustrated inside the ()
operator of TestPtrAnal in steensgaardTest2.C.
-Qing Yi
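#include <stdio.h>
#include <stdlib.h>

void func(void) {
  int* pointer;
  int* aliasPointer;
  pointer = malloc(sizeof(int));
  aliasPointer = pointer;    /* both names now refer to the same memory */
  *aliasPointer = 42;        /* a write through one alias ... */
  printf("%d\n", *pointer);  /* ... is visible through the other */
}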
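To make the aliasing in func() concrete, here is a minimal unification-based (Steensgaard-style) sketch. It is an assumption-laden toy: ToyAliasAnalysis, assignAlloc, assignCopy and mayAlias are hypothetical names invented for this example, and the code is not the ROSE may_alias interface discussed in the email above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy unification-based alias sketch. All names are hypothetical.
class ToyAliasAnalysis {
    std::vector<int> parent_;   // union-find parent of each abstract location
    std::vector<int> pointee_;  // location a class points to, or -1 if unknown
    std::map<std::string, int> varLoc_;

    int fresh() {
        int id = static_cast<int>(parent_.size());
        parent_.push_back(id);
        pointee_.push_back(-1);
        return id;
    }
    int find(int x) {
        while (parent_[x] != x) x = parent_[x];
        return x;
    }
    int loc(const std::string& var) {
        std::map<std::string, int>::iterator it = varLoc_.find(var);
        if (it != varLoc_.end()) return it->second;
        int id = fresh();
        varLoc_[var] = id;
        return id;
    }
    // Representative of the location that l points to (created on demand).
    int pointeeOf(int l) {
        l = find(l);
        if (pointee_[l] < 0) {
            int target = fresh();  // allocate first: fresh() may grow pointee_
            pointee_[l] = target;
        }
        return find(pointee_[l]);
    }
    // Merge two location classes and, recursively, what they point to.
    void unify(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        int pa = pointee_[a], pb = pointee_[b];
        parent_[a] = b;
        if (pb < 0) pointee_[b] = pa;
        else if (pa >= 0) unify(pa, pb);
    }

public:
    // Models "p = malloc(...)": p points to some heap location.
    void assignAlloc(const std::string& p) { pointeeOf(loc(p)); }
    // Models "p = q": the targets of p and q are merged.
    void assignCopy(const std::string& p, const std::string& q) {
        int tp = pointeeOf(loc(p));
        int tq = pointeeOf(loc(q));
        unify(tp, tq);
    }
    // Two pointers may alias if their targets are in the same class.
    bool mayAlias(const std::string& p, const std::string& q) {
        int tp = pointeeOf(loc(p));
        int tq = pointeeOf(loc(q));
        return find(tp) == find(tq);
    }
};
```

Feeding the two assignments of func() into this sketch (assignAlloc for pointer, then assignCopy from pointer to aliasPointer) makes mayAlias report the pair as possible aliases, while an unrelated pointer stays separate. The analysis is flow-insensitive, so it is less precise than the flow-sensitive techniques mentioned earlier.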
10.5 SSA
ROSE has implemented an SSA form. Some discussions on the mailing list: link4 .
Rice branch has an implementation of array SSA. We are waiting for their commits to be
pushed into Jenkins. --Liao5 (discuss6 • contribs7 ) 18:17, 19 June 2012 (UTC)
4 https://mailman.nersc.gov/pipermail/rose-public/2012-March/001496.html
5 http://en.wikibooks.org/wiki/User%3ALiao
6 http://en.wikibooks.org/wiki/User%20talk%3ALiao
7 http://en.wikibooks.org/wiki/Special%3AContributions%2FLiao
Quick Facts
• The algorithm is based on the paper: K. D. Cooper and K. Kennedy. 1988. Interprocedural
side-effect analysis in linear time. In Proceedings of the ACM SIGPLAN 1988 conference
on Programming Language design and Implementation (PLDI '88), R. L. Wexelblat (Ed.).
ACM, New York, NY, USA, 57-66.
• Source Code: src/midend/programAnalysis/sideEffectAnalysis
• Tests: tests/roseTests/programAnalysisTests/sideEffectAnalysisTests
Over the course of the ROSE project, we have accumulated several versions of dataflow analysis. They
are painful to maintain and use because they
• duplicate the iterative fixed-point algorithm,
• are scattered across different directories, and
• use different representations for results.
An ongoing effort is to consolidate all dataflow analysis work within a single framework.
Quick facts
• Original author: Greg Bronevetsky
• Code reviewer: Chunhua Liao
• Documentation:
• Source codes: files under ./src/midend/programAnalysis/genericDataflow
• Tests: tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests
• Currently implemented analyses
• Dominator analysis: dominatorAnalysis.h dominatorAnalysis.C
• Live/dead variable analysis, or liveness analysis: liveDeadVarAnalysis.h liveDeadVarAnalysis.C
• Constant propagation: constantPropagation.h constantPropagation.C (TODO: the files need to
be moved into src/ from /tests)
See more at Generic Dataflow Framework8
TODO: it turns out the interface work has not been merged into our master branch, so the following
instructions do not apply!
The interface for the dependence graph can be found in DependencyGraph.h. The underlying
representation is in DepGraph.h. BGL is required to access the graph.
8 Chapter 11
Dependence analysis
Here9 are 6 examples attached to this email. In deptest.C, there are also some macros to
enable more accurate analysis.
If USE_IVS is defined, induction variable substitution is performed. If USE_FUNCTION
is defined, the dependence analysis can take a user-specified function side-effect
interface. If neither of them is defined, a normal dependence
analysis is performed and the graph is built.
9 https://mailman.nersc.gov/pipermail/rose-public/2012-May/001620.html
11 Generic Dataflow Framework
11.1 Introduction
Over the course of the ROSE project, we have accumulated several versions of dataflow analysis. They
are painful to maintain and use because they
• duplicate the iterative fixed-point algorithm,
• are scattered across different directories,
• use different representations for results, and
• have different levels of maturity and robustness.
An ongoing effort is to consolidate all dataflow analysis work within a single framework.
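The iterative fixed-point algorithm that each of those analyses duplicates can be written once and parameterized by a transfer function. The sketch below shows the idea on a toy graph; ToyCfg, Fact and solveForward are hypothetical names for this illustration and do not correspond to classes in genericDataflow. It assumes a forward "may" analysis whose meet is set union and whose transfer function is monotone.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <set>
#include <vector>

// Hypothetical toy CFG: node i has the successor list succ[i].
struct ToyCfg {
    std::vector<std::vector<int>> succ;
};

using Fact = std::set<int>;  // a dataflow fact: here, a set of integer ids

// Forward analysis: IN[n] is the union of OUT[p] over all predecessors p and
// OUT[n] = transfer(n, IN[n]). A worklist is processed until nothing changes.
std::vector<Fact> solveForward(
        const ToyCfg& cfg, int entry,
        const std::function<Fact(int, const Fact&)>& transfer) {
    const std::size_t n = cfg.succ.size();
    assert(entry >= 0 && static_cast<std::size_t>(entry) < n);
    std::vector<Fact> in(n), out(n);
    std::vector<bool> visited(n, false);
    std::deque<int> worklist{entry};
    while (!worklist.empty()) {
        int cur = worklist.front();
        worklist.pop_front();
        Fact newOut = transfer(cur, in[cur]);
        bool changed = !visited[cur] || newOut != out[cur];
        visited[cur] = true;
        out[cur] = newOut;
        if (!changed) continue;  // fixed point reached for this node
        for (int s : cfg.succ[cur]) {
            // Meet: merge this node's OUT into the successor's IN.
            in[s].insert(out[cur].begin(), out[cur].end());
            worklist.push_back(s);
        }
    }
    return out;
}
```

A concrete analysis then only supplies the transfer function. For example, a transfer function that adds the node's own id to the incoming set computes, for every node, the set of nodes that can reach it, including across loops.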
Quick facts
• original author: Greg Bronevetsky
• code gatekeeper: Chunhua Liao
• Documentation:
• Chapter 18 Generic Dataflow Analysis Framework, of the ROSE tutorial pdf1, git commit2
• This wikibook page
• source codes: files under ./src/midend/programAnalysis/genericDataflow
• tests: tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests
List
• Constant Propagation3
• dominator analysis: dominatorAnalysis.h dominatorAnalysis.C
• live/dead variable analysis, or liveness analysis: liveDeadVarAnalysis.h liveDeadVarAnalysis.C
• Pointer Analysis4
1 http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf
2 http://github.com/rose-compiler/rose/commit/f4d5292dad1a68ee13cd9b38834efe4813db92ec
3 http://en.wikibooks.org/wiki/ROSE%20Compiler%20Framework%2FConstant%20Propagation
4 http://en.wikibooks.org/wiki/ROSE%20Compiler%20Framework%2FPointer%20Analysis
Function and nodeState are two required parameters to run data flow analysis:
They are stored together inside FunctionState //functionState.h
functionState.h
genericDataflow/cfgUtils/CallGraphTraverse.h
11.3.1 function
11.3.2 NodeFact
};
11.3.3 NodeState
Stores information about multiple analyses and their corresponding lattices for a given node
(CFG node ??)
./src/midend/programAnalysis/genericDataflow/state/nodeState.h
It also provides static functions to
• initialize NodeState for all DataflowNode
• retrieve NodeState for a given DataflowNode
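The shape of such a per-node store can be sketched as follows. This is a simplified stand-in written for illustration: ToyLattice, ToyAnalysis and ToyNodeState are hypothetical, nodes are identified by a plain integer, and the real NodeState in nodeState.h is considerably richer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the framework's Lattice and Analysis classes.
struct ToyLattice { std::string value; };
struct ToyAnalysis { std::string name; };

// Per-node storage: each analysis owns one vector of lattices above the node
// and one vector of lattices below it.
class ToyNodeState {
    std::map<const ToyAnalysis*, std::vector<ToyLattice*>> above_;
    std::map<const ToyAnalysis*, std::vector<ToyLattice*>> below_;

public:
    void setLattices(const ToyAnalysis* a,
                     const std::vector<ToyLattice*>& above,
                     const std::vector<ToyLattice*>& below) {
        assert(a != nullptr);
        above_[a] = above;
        below_[a] = below;
    }
    // Read-only access; at() throws std::out_of_range for an unknown analysis.
    const std::vector<ToyLattice*>& getLatticeAbove(const ToyAnalysis* a) const {
        return above_.at(a);
    }
    const std::vector<ToyLattice*>& getLatticeBelow(const ToyAnalysis* a) const {
        return below_.at(a);
    }
    // Static lookup: one state object per node id, created on first use.
    static ToyNodeState& getNodeState(int nodeId) {
        static std::map<int, ToyNodeState> states;
        return states[nodeId];
    }
};
```

Keying the lattices by analysis pointer is what lets several analyses share one node state without interfering with each other.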
Function, nodeState and FunctionState
class NodeState
{
  // internal types: map between analysis and set of lattices
  // the facts that are true at this node, for each analysis that
  // may be interested in the current node
  NodeFactMap facts;
  // static interfaces
  // returns the map containing all the lattices from above the node that are owned by the given analysis
  // (read-only access)
  const std::vector<Lattice*>& getLatticeAbove(const Analysis* analysis) const;
  // returns the map containing all the lattices from below the node that are owned by the given analysis
  // (read-only access)
  const std::vector<Lattice*>& getLatticeBelow(const Analysis* analysis) const;
11.3.4 FunctionState
./src/midend/programAnalysis/genericDataflow/state/functionState.h
A pair of Function and NodeState.
class FunctionState
{
  friend class CollectFunctions;
public:
  Function func;
  NodeState state;
  // The lattices that describe the value of the function's return variables
  NodeState retState;
private:
  static std::set<FunctionState*> allDefinedFuncs;
  static std::set<FunctionState*> allFuncs;
  static bool allFuncsComputed;
public:
  FunctionState(Function &func):
    func(func),
    state(/*func.get_declaration()->cfgForBeginning()*/)
  {}
// We should use this interface --------------
/*************************************
*** UnstructuredPassInterAnalysis ***
*************************************/
void UnstructuredPassInterAnalysis::runAnalysis()
{
  // call a static function to get all function states
  set<FunctionState*> allFuncs = FunctionState::getAllDefinedFuncs();
NodeState* state)
{
  DataflowNode funcCFGStart = cfgUtils::getFuncStartCFG(func.get_definition(),filter);
  DataflowNode funcCFGEnd = cfgUtils::getFuncEndCFG(func.get_definition(), filter);
  if(analysisDebugLevel>=2)
    Dbg::dbg << "UnstructuredPassIntraAnalysis::runAnalysis() function "<<func.get_name().getString()<<"()\n";
11.4 Lattices
11.4.1 Basics
11.4.4 LiveVarsLattice
Transfer Function
variables.
...
};
// computes the meet of this and that and saves the result in this
// returns true if this causes this to change and false otherwise
bool LiveVarsLattice::meetUpdate(Lattice* that_arg)
{
bool modified = false;
LiveVarsLattice* that =
dynamic_cast<LiveVarsLattice*>(that_arg);
return modified;
}
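The body of meetUpdate is elided in the fragment above. A minimal self-contained sketch of the idea follows, assuming that the meet for liveness is set union, which is the usual choice. ToyLiveVarsLattice is a hypothetical simplification: variables are plain strings rather than varID objects, and this is not the implementation in liveDeadVarAnalysis.C.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical simplified lattice holding a set of live variable names.
class ToyLiveVarsLattice {
public:
    std::set<std::string> liveVars;

    // Computes the meet (assumed here to be set union) of this and that,
    // saves the result in this, and returns true only if this changed.
    bool meetUpdate(const ToyLiveVarsLattice& that) {
        bool modified = false;
        for (const std::string& v : that.liveVars) {
            // insert() returns a pair whose second member is true only when
            // the element was not already present.
            if (liveVars.insert(v).second) modified = true;
        }
        return modified;
    }
};
```

The boolean result matters to the driver: a node's neighbours only need to be revisited when the meet actually changed the lattice, which is what lets the iteration terminate.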
Basics: see Data_flow_analysis#flow.2Ftransfer_function6
• IN = sum (union) of OUT over all predecessors
• OUT = GEN + (IN - KILL)
A transfer function describes the impact of program constructs on the current lattices (how to change the current lattices).
• lattices: store IN and OUT information
• additional data members are necessary to store the GEN and KILL sets inside the transfer
function.
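The two equations can be written down directly with sets. The following sketch uses hypothetical helper names (VarSet, meetPredecessors, transfer) and string labels for facts; it only illustrates the equations and is not how the framework's transfer visitors are structured.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using VarSet = std::set<std::string>;

// IN = union of OUT over all predecessors.
VarSet meetPredecessors(const std::vector<VarSet>& predOuts) {
    VarSet in;
    for (const VarSet& out : predOuts) in.insert(out.begin(), out.end());
    return in;
}

// OUT = GEN + (IN - KILL).
VarSet transfer(const VarSet& in, const VarSet& gen, const VarSet& kill) {
    VarSet out = gen;
    for (const std::string& fact : in) {
        if (kill.count(fact) == 0) out.insert(fact);  // survives the node
    }
    return out;
}
```

As a usage example in the style of reaching definitions: if two predecessors deliver {d1, d2} and {d3}, and the node generates d4 while killing d1, then IN is {d1, d2, d3} and OUT is {d2, d3, d4}.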
class hierarchy:
6 http://en.wikibooks.org/wiki/Data_flow_analysis%23flow.2Ftransfer_function
public:
};
};
void
ConstantPropagationAnalysisTransfer::visit(SgIntVal *sgn)
{
ROSE_ASSERT(sgn != NULL);
ConstantPropagationLattice* resLat = getLattice(sgn);
ROSE_ASSERT(resLat != NULL);
resLat->setValue(sgn->get_value());
resLat->setLevel(ConstantPropagationLattice::constantValue);
}
Functions to convert a program point to GEN and KILL sets. For liveness analysis7:
• Kill (s) = {variables being defined in s}
• Gen (s) = {variables being used in s}
OUT = IN - KILL + GEN
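For a single assignment statement, these definitions can be sketched as below. ToyAssign, killOf, genOf and liveBefore are hypothetical names for this example; the incoming set is the set of variables live after the statement, because liveness runs backward.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using VarSet = std::set<std::string>;

// Hypothetical toy statement "lhs = f(rhsVars...)".
struct ToyAssign {
    std::string lhs;
    std::vector<std::string> rhsVars;
};

// Kill(s) = variables defined in s.
VarSet killOf(const ToyAssign& s) {
    VarSet kill;
    kill.insert(s.lhs);
    return kill;
}

// Gen(s) = variables used in s.
VarSet genOf(const ToyAssign& s) {
    return VarSet(s.rhsVars.begin(), s.rhsVars.end());
}

// OUT = IN - KILL + GEN, where IN is the set of variables live after s.
// KILL is removed before GEN is added, so "x = x + 1" keeps x live.
VarSet liveBefore(const ToyAssign& s, const VarSet& liveAfter) {
    VarSet out = liveAfter;
    for (const std::string& v : killOf(s)) out.erase(v);
    for (const std::string& v : genOf(s)) out.insert(v);
    return out;
}
```

For the statement c = a with {c} live afterwards, the result is {a}. The order of the two steps is the design point: adding GEN last ensures a variable that is both read and written by the same statement stays live before it.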
7 http://en.wikibooks.org/wiki/liveness%20analysis
{
LiveVarsLattice* liveLat; // the result of this analysis
bool modified;
// Expressions that are assigned by the current operation
std::set<SgExpression*> assignedExprs; // KILL () set
// Variables that are assigned by the current operation
std::set<varID> assignedVars;
// Variables that are used/read by the current operation
std::set<varID> usedVars; // GEN () set
public:
LiveDeadVarsTransfer(const Function &f, const DataflowNode &n, NodeState &s, const std::vector<Lattice*> &d, funcSideEffectUses *fseu_)
  : IntraDFTransferVisitor(f, n, s, d), indent(" "),
    liveLat(dynamic_cast<LiveVarsLattice*>(*(dfInfo.begin()))),
    modified(false), fseu(fseu_)
{
  if(liveDeadAnalysisDebugLevel>=1) Dbg::dbg << indent << "liveLat="<<liveLat->str(indent + " ")<<std::endl;
  // Make sure that all the lattice is initialized
  liveLat->initialize();
}
bool finish();
// operating on different AST nodes
void visit(SgExpression *);
void visit(SgInitializedName *);
void visit(SgReturnStmt *);
void visit(SgExprStatement *);
void visit(SgCaseOptionStmt *);
void visit(SgIfStmt *);
void visit(SgForStatement *);
void visit(SgWhileStmt *);
void visit(SgDoWhileStmt *);
}
public:
liveLat->isLiveVar(SgExpr2Var(sgn->get_lhs_operand()))))
{ */
ldva.used(sgn->get_rhs_operand());
}
...
}
(gdb) bt
#0 LDVAExpressionTransfer::visit (this=0x7fffffffcea0, sgn=0xa20320)
   at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/simpleAnalyses/liveDeadVarAnalysis.C:228
#1 0x00002aaaac3d9968 in SgAssignOp::accept (this=0xa20320, visitor=...) at Cxx_Grammar.C:143069
#2 0x00002aaaadc61c04 in LiveDeadVarsTransfer::visit (this=0xaf9e00, sgn=0xa20320)
   at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/simpleAnalyses/liveDeadVarAnalysis.C:384
#3 0x00002aaaadbbaef0 in ROSE_VisitorPatternDefaultBase::visit (this=0xaf9e00, variable_SgBinaryOp=0xa20320)
   at ../../../src/frontend/SageIII/Cxx_Grammar.h:316006
#4 0x00002aaaadbba04a in ROSE_VisitorPatternDefaultBase::visit (this=0xaf9e00, variable_SgAssignOp=0xa20320)
   at ../../../src/frontend/SageIII/Cxx_Grammar.h:315931
#5 0x00002aaaac3d9968 in SgAssignOp::accept (this=0xa20320, visitor=...) at Cxx_Grammar.C:143069
#6 0x00002aaaadbcca0a in IntraUniDirectionalDataflow::runAnalysis (this=0x7fffffffd9f0, func=..., fState=0xafbd18,
   analyzeDueToCallers=true, calleesUpdated=...)
   at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/analysis/dataflow.C:282
#7 0x00002aaaadbbf444 in IntraProceduralDataflow::runAnalysis (this=0x7fffffffda00, func=..., state=0xafbd18)
   at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/analysis/dataflow.h:74
#8 0x00002aaaadbb0966 in UnstructuredPassInterDataflow::runAnalysis (this=0x7fffffffda50)
   at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/analysis/analysis.C:467
#9 0x000000000040381a in main (argc=2, argv=0x7fffffffdba8)
   at ../../../../../sourcetree/tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests/liveDeadVarAnalysisTest.C:101
The generic dataflow framework works on the virtual control flow graph in ROSE.
The raw virtual CFG may not be desirable for all kinds of analyses, since it can contain
many administrative nodes that are not relevant to a given problem.
The framework therefore provides a filter parameter to the Analysis class. A default filter is
used unless you specify your own filter.
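The effect of such a filter can be illustrated on a toy node list. Everything in this sketch is an assumption made for illustration: Kind, ToyCfgNode, keepNode and applyFilter are hypothetical, each AST node is assumed to contribute several indexed CFG nodes, and the default case simply keeps the node.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical node kinds and a toy CFG node: one AST node contributes
// CFG nodes with indices 0..lastIndex.
enum class Kind { InitializedName, FunctionCall, BasicBlock, ExprStatement, Other };

struct ToyCfgNode {
    Kind kind;
    unsigned index;
    unsigned lastIndex;
};

// A filter is a predicate that says whether an analysis should see a node.
bool keepNode(const ToyCfgNode& n) {
    switch (n.kind) {
        case Kind::InitializedName:
        case Kind::FunctionCall:
            return n.index == n.lastIndex;  // keep only the last CFG node
        case Kind::BasicBlock:
        case Kind::ExprStatement:
            return n.index == 0;            // keep only the first CFG node
        default:
            return true;                    // assumption: keep everything else
    }
}

// Apply any filter, so a user-defined predicate can replace the default one.
std::vector<ToyCfgNode> applyFilter(
        const std::vector<ToyCfgNode>& nodes,
        const std::function<bool(const ToyCfgNode&)>& filter) {
    std::vector<ToyCfgNode> kept;
    std::copy_if(nodes.begin(), nodes.end(), std::back_inserter(kept), filter);
    return kept;
}
```

Passing the predicate as a parameter mirrors the idea of a default filter that can be overridden: an analysis that wants to see every administrative node could pass a predicate that always returns true.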
Analysis Driver
switch (node->variantT())
{
  // Keep the last index for initialized names. This way the def of the variable doesn't propagate to its assign initializer.
  case V_SgInitializedName:
    return (cfgn == node->cfgForEnd());
  // For function calls, we only keep the last node. The function is actually called after all its parameters are evaluated.
  case V_SgFunctionCallExp:
    return (cfgn == node->cfgForEnd());
  // For basic blocks and other "container" nodes, keep the node that appears before the contents are executed
  case V_SgBasicBlock:
  case V_SgExprStatement:
  case V_SgCommaOpExp:
    return (cfgn == node->cfgForBeginning());
UnstructuredPassInterDataflow ciipd_ldva(&ldva);
ciipd_ldva.runAnalysis();
....
}
Key function:
};
protected:
bool propagateStateToNextNode (
  const std::vector<Lattice*>& curNodeState, DataflowNode curDFNode, int nodeIndex,
  const std::vector<Lattice*>& nextNodeState, DataflowNode nextDFNode);
std::vector<DataflowNode> gatherDescendants(std::vector<DataflowEdge> edges,
  DataflowNode (DataflowEdge::*edgeFn)() const);
IntraBWDataflow()
{}
VirtualCFG::dataflow*
getInitialWorklist(const Function &func, bool firstVisit,
  bool analyzeDueToCallers, const set<Function> &calleesUpdated,
  NodeState *fState);
};
Used to initialize the lattices/facts for CFG nodes. It is an analysis by itself, run as an
unstructured pass.
if(analysisDebugLevel>=2)
  Dbg::dbg << "UnstructuredPassIntraAnalysis::runAnalysis() function "<<func.get_name().getString()<<"()\n";
public:
InitDataflowState(IntraProceduralDataflow* dfAnalysis/*,
std::vector<Lattice*> &initState*/)
{
this->dfAnalysis = dfAnalysis;
}
11.7.3 worklist
public:
virtual void operator ++ (int);
};
if(pushAllChildren)
{
  // find its followers (either successors or predecessors, depending on value of fwDir), push back
  // those that have not yet been visited
  vector<DataflowEdge> nextE;
  if(fwDir)
    nextE = cur.outEdges();
  else
    nextE = cur.inEdges();
  for(vector<DataflowEdge>::iterator it=nextE.begin(); it!=nextE.end(); it++)
  {
    DataflowNode nextN((*it).target()/* need to put something here because DataflowNodes don't have a default constructor*/);
    if(fwDir) nextN = (*it).target();
    else nextN = (*it).source();
    "<"<<nextN.getNode()->class_name()<<" | "<<nextN.getNode()<<" | "<<nextN.getNode()->unparseToString()<<">, "<<
    "visited="<<(visited.find(nextN) != visited.end())<<
    " remaining="<<isRemaining(nextN)<<"\n";*/
    remainingNodes.push_back(nextN);
  }
}
}
if(remainingNodes.size()>0)
{
  // take the next node from the front of the list and mark it as visited
  //visited[remainingNodes.front()] = true;
  visited.insert(remainingNodes.front());
}
}
}
...
for(vector<NodeState*>::const_iterator itS = nodeStates.begin(); itS!=nodeStates.end(); )
{
state = *itS;
boost::shared_ptr<IntraDFTransferVisitor> transferVisitor = getTransferVisitor(func, n, *state, dfInfoPost);
sgn->accept(*transferVisitor);
modified = transferVisitor->finish() || modified;
This step has proved to be essential for propagating information along the path. It cannot be
commented out!
??? It is not clear how this step differs from the step before (Meet Before () /
Meet After)
if(analysisDebugLevel>=1) {
  if(modified)
  {
    Dbg::dbg << " Next node's in-data modified. Adding..."<<endl;
    int j=0;
    for(itN = nextNodeState.begin(); itN != nextNodeState.end(); itN++, j++)
    {
      Dbg::dbg << " Propagated: Lattice "<<j<<": \n "<<(*itN)->str(" ")<<endl;
    }
  }
  else
    Dbg::dbg << " No modification on this node"<<endl;
}
return modified;
}
Backward intra-procedural dataflow analysis, e.g. liveness analysis (information flows backward
from a use to the definition)
• class IntraBWDataflow : public IntraUniDirectionalDataflow
protected:
funcSideEffectUses* fseu;
public:
LiveDeadVarsAnalysis(SgProject *project, funcSideEffectUses* fseu=NULL);
// Generates the initial lattice state for the given dataflow node, in the given function, with the given NodeState
Inter-procedural analysis
boost::shared_ptr<IntraDFTransferVisitor> getTransferVisitor(const
Function& func, const DataflowNode& n,
};
Key: transfer function that is applied to call sites to perform the appropriate state transfers
across function boundaries.
11.8.2 InterProceduralDataflow
InterProceduralDataflow::InterProceduralDataflow(IntraProceduralDataflow* intraDataflowAnalysis) :
  InterProceduralAnalysis((IntraProceduralAnalysis*)intraDataflowAnalysis)
TODO: begin and end func definition issue is mentioned inside of this
UnstructuredPassInterDataflow(IntraProceduralDataflow* intraDataflowAnalysis)
  : InterProceduralAnalysis((IntraProceduralAnalysis*)intraDataflowAnalysis),
    InterProceduralDataflow(intraDataflowAnalysis)
{}
How to use one analysis
void runAnalysis();
};
11.8.4 ContextInsensitiveInterProceduralDataflow
TODO
Direct call: runs the intra-procedural analysis on the given function. Returns true if
the function's NodeState is modified as a result and false otherwise.
state - the function's NodeState
int main(int argc, char* argv[])
{
  SgProject* project = frontend(argc,argv);
  initAnalysis(project);
  // basis analysis
  LiveDeadVarsAnalysis ldva(project);
  // wrap it inside the unstructured inter-procedural data flow
  UnstructuredPassInterDataflow ciipd_ldva(&ldva);
  ciipd_ldva.runAnalysis();
  .....
Sample code:
// Initialize vars to hold all the variables and expressions that are live at DataflowNode n
//void getAllLiveVarsAt(LiveDeadVarsAnalysis* ldva, const DataflowNode& n, const NodeState& state, set<varID>& vars, string indent)
void getAllLiveVarsAt(LiveDeadVarsAnalysis* ldva, const NodeState& state, set<varID>& vars, string indent)
{
  LiveVarsLattice* liveLAbove = dynamic_cast<LiveVarsLattice*>(*(state.getLatticeAbove(ldva).begin()));
  LiveVarsLattice* liveLBelow = dynamic_cast<LiveVarsLattice*>(*(state.getLatticeBelow(ldva).begin()));
11.10 Testing
int a =1,b,c;
#pragma rose [LiveVarsLattice: liveVars=[flag, a, b]]
if (flag == 0) // flag is only read here, not written!
c = a;
else
c = b;
return c;
}
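The expected result in the pragma above (flag, a and b live just before the if statement) can be reproduced by hand with a small backward liveness computation. The sketch below is a stand-alone toy and does not use the ROSE framework: ToyStmt, ifElseExample and computeLiveIn are hypothetical names, and the five-node CFG is a manual encoding of the test code.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using VarSet = std::set<std::string>;

// Hypothetical toy CFG node: variables defined and used, plus successors.
struct ToyStmt {
    VarSet defs;
    VarSet uses;
    std::vector<int> succ;
};

// Manual encoding of the test code:
//   0: a = 1   1: if (flag == 0)   2: c = a   3: c = b   4: return c
std::vector<ToyStmt> ifElseExample() {
    std::vector<ToyStmt> cfg(5);
    cfg[0].defs = {"a"};                        cfg[0].succ = {1};
    cfg[1].uses = {"flag"};                     cfg[1].succ = {2, 3};
    cfg[2].defs = {"c"}; cfg[2].uses = {"a"};   cfg[2].succ = {4};
    cfg[3].defs = {"c"}; cfg[3].uses = {"b"};   cfg[3].succ = {4};
    cfg[4].uses = {"c"};
    return cfg;
}

// Backward liveness: the variables live before node n are
// (union of live-before sets of its successors - defs(n)) + uses(n).
// All nodes are re-evaluated until no set changes (a fixed point).
std::vector<VarSet> computeLiveIn(const std::vector<ToyStmt>& cfg) {
    std::vector<VarSet> liveIn(cfg.size());
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = cfg.size(); i-- > 0;) {
            VarSet live;
            for (int s : cfg[i].succ) live.insert(liveIn[s].begin(), liveIn[s].end());
            for (const std::string& d : cfg[i].defs) live.erase(d);
            live.insert(cfg[i].uses.begin(), cfg[i].uses.end());
            if (live != liveIn[i]) {
                liveIn[i] = live;
                changed = true;
            }
        }
    }
    return liveIn;
}
```

In this encoding, the set computed for node 1 (the if statement) contains a, b and flag, matching the pragma. At node 0 the variable a is no longer live, because the declaration defines it.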
Turn it on
liveDeadAnalysisDebugLevel = 1;
analysisDebugLevel = 1;
firefox index.html
How to read the trace file: start from the beginning: information is ordered based on the
CFG nodes visited. The order could be forward or backward order. Check if the order is
correct first, then for each node visited
==================================
Copying incoming Lattice 0:
[LiveVarsLattice: liveVars=[b]]
To outgoing Lattice 0:
[LiveVarsLattice: liveVars=[]]
==================================
Transferring the outgoing Lattice ...
liveLat=[LiveVarsLattice: liveVars=[b]]
Dead Expression
usedVars=<>
assignedVars=<>
assignedExprs=<>
#usedVars=0 #assignedExprs=0
Transferred: outgoing Lattice 0:
[LiveVarsLattice: liveVars=[b]]
transferred, modified=0
==================================
Propagating/Merging the outgoing Lattice to all descendant nodes
...
Descendants (1):
~~~~~~~~~~~~
Descendant: 0x2b9e8c47f010[SgIfStmt | if(flag == 0) c = a;else c = b;]
if-stmt
------------------
Descendants (1): // from c =a back to if-stmt (next node)
~~~~~~~~~~~~
Descendant: 0x2ac8bb95c010[SgIfStmt | if(flag == 0) c = a;else c = b;]
------------------
Descendants (1): // from c = b --> if-stmt
~~~~~~~~~~~~
Descendant: 0x2ac8bb95c010[SgIfStmt | if(flag == 0) c = a;else c = b;]
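The copy, transfer, and merge steps in the trace can be illustrated without ROSE. The following is a minimal standalone sketch of a backward liveness step, assuming the textbook equation live-in = use ∪ (live-out − def). The names VarSet, transferLiveness, and mergeLiveness are hypothetical and are not part of the ROSE API.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a lattice of live variable names.
using VarSet = std::set<std::string>;

// Backward transfer function for one CFG node:
// live-in = use ∪ (live-out − def).
VarSet transferLiveness(const VarSet& liveOut, const VarSet& used, const VarSet& assigned)
{
    VarSet liveIn = liveOut;
    for (const std::string& name : assigned) {
        liveIn.erase(name);               // an assignment kills the variable
    }
    liveIn.insert(used.begin(), used.end());  // a read makes the variable live
    return liveIn;
}

// Merge at a branch: a variable is live before the branch if it is
// live on at least one of the paths that follow it.
VarSet mergeLiveness(const VarSet& first, const VarSet& second)
{
    VarSet merged = first;
    merged.insert(second.begin(), second.end());
    return merged;
}
```

Applied backward to the test code above (return c, then c = a and c = b, then the condition flag == 0), these two functions produce the set {a, b, flag} before the if statement, which agrees with the liveVars=[flag, a, b] annotation in the test pragma.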
A class, analysisStatesToDOT, is provided to generate a CFG dot graph with lattice information.
//AnalysisDebuggingUtils.C
namespace Dbg
{
//....
void dotGraphGenerator (::Analysis *a)
{
::analysisStatesToDOT eas(a);
IntraAnalysisResultsToDotFiles upia_eas(eas);
upia_eas.runAnalysis();
}
} // namespace Dbg
// Liao, 12/6/2011
#include "rose.h"
#include <list>
#include <sstream>
#include <iostream>
#include <fstream>
#include <string>
#include <map>
//-----------------------------------------------------------
int
main( int argc, char * argv[] )
{
SgProject* project = frontend(argc, argv);
initAnalysis(project);
LiveDeadVarsAnalysis ldva(project);
UnstructuredPassInterDataflow ciipd_ldva(&ldva);
ciipd_ldva.runAnalysis();
// Output the dot graph *********************
Dbg::dotGraphGenerator (&ldva);
return 0;
}
11.12 TODO
• Hard to use the generated lattices since many temporary expression objects are generated
in lattices. But often users do not care about them (constant propagation, pointer
analysis)
• to see the problem: go to [build64/tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests]
• run make check
• see the dot graph dump of an analysis: run.sh test_ptr4.C_main_0x2b41e651c038_cfg.dot
12 Program Optimizations
13 ROSE Projects
This page serves as a quick guide to the major directories under rose/projects:
Parsing
• pragmaParsing: An example translator using the parsing building blocks provided by
ROSE to parse pragmas
Translations:
• autoTuning: a project to use ROSE's parameterized translators to facilitate empirical
tuning (or autotuning)
• DataFaultTolerance: a project to use source-to-source translation to make application
resilient to memory faults
• extractMPISkeleton: extract MPI communication skeletons
• Fortran_to_C : A Fortran to C language translator
Static Analysis
• compass: a static analysis tool to find errors in applications
Dynamic Analysis
• RTED: runtime error detection using compiler instrumentation of library calls.
Binary Analysis:
• BinaryCloneDetection: detect similarities between binary executables.
• CloneDetection:
Optimizations of high-level abstractions
• arrayOptimization: optimizations based on array abstractions
• autoParallelization: A translator which can automatically insert OpenMP directives into
serial code, based on dependence analysis and optionally semantics of abstractions.
Parallel Programming Models:
• mint: a directive based programming model for GPUs
• OpenMP_Translator: the first version of OpenMP implementation using ROSE. Not
recommended for production use, kept just as an example.
• UpcTranslation: a preliminary example project demonstrating how ROSE can be used
to create a UPC compiler
13.1 minitermite
Problem: A student added some new IR nodes to ROSE. She is having trouble getting
make to pass for minitermite.
Solution: projects/minitermite/HOWTO_ADD_NEW_SGNODE_VARIANTS
14 Developer's Guide
These are some basic skills that ROSE developers should have, or acquire:
• Shell programming: Bash (Bourne Again Shell) is the default shell for ROSE.
• Unix commands: grep, find, ssh, etc.
• C++ programming: be conscious of applying consistent coding-style conventions and
writing code that will be maintainable when you leave
• Debugging: GDB will be invaluable to make sure your code works as expected
• Git - Source code management (SCM): get familiar with the basics of Git:
http://git-scm.com/
• Build systems: GNU Autotools (autoconf, automake), GNU Make, GNU libtool
• CMake: (primarily so you won't break our existing Windows port)
• LaTeX: Document your work in ROSE/docs
• ROSE Documentation: Be familiar with ROSE documents (tutorials, installation, and
developer guides): http://rosecompiler.org/documents.html. This also includes the
project's Doxygen documentation.
• Compilers: ROSE is a compiler project, after all. Take some compiler courses!
• Read free online course materials related to compilers
• Keep learning topics related to your projects
References
• http://www.mediawiki.org/wiki/Git/Tutorial (a very good Git tutorial)
• http://eagain.net/articles/git-for-computer-scientists/
• optimizations
• build system
• Bug fixes: passing code review and Jenkins (in the future, Klocwork, Coverity, etc.
analysis tools)
• reported by users on SciDAC outreach center's bug tracker
• found by ourselves, reported on github.com or redmine
• Documentation: write new ones, and improve existing ones
• how ROSE works
• Tutorial, manual, FAQ, etc.
• project documentation
• design/architecture/api documents,
• workflow documentation, etc
• System administration: Maintain and improve workflow components (mostly not student's
work, but suggestions are welcomed)
• Website: rosecompiler.org
• Git repository
• Project management: Redmine
• Code review: Github enterprise
• Jenkins: Continuous integration, improving testings
Research:
• Publications: technical reports, papers, presentations, posters
• If you have finished your presentation, please upload your slides to the Files tab of your
relevant Redmine project. The .pptx format is required since other people may want to edit
the slides in the future.
Proposal:
• write collaborative proposals
Feedback: we are continually looking for ways to improve our workflow, but there's always
more that we can do
• General struggles (administratively or implementation-wise)
• General improvement/enhancement ideas for both the software and the people
Having worked with a number of interns, we have roughly identified the following milestones
for a ROSE developer:
• Development environment: pick a platform of your choice (Linux or Mac OS), and get
familiar with that specific platform (shell, editors, environment variable setting, etc.)
• Physical location: locations MATTER! Sit close to the people you should interact with often.
Make your desk/office accessible to others. A physically isolated office or desk may have
a very negative impact on your productivity.
• Installing ROSE: being able to smoothly configure, compile, and install ROSE
• Build system: being able to add a project (first skeleton) into ROSE by modifying
Makefile.am, etc.
14.6.1 Toolchain
$ ls /nfs/apps
apr        bin       etc   grace     java     mpc      neon     pygobject  sqlite      toolworks.old
asciidoc   binutils  flex  graphviz  libtool  mpfr     openssh  python     src         totalview
asymptote  blender   gcc   hdf5      m4       mpich    perl     qt         subversion  upc
autoconf   doc++     git   insure++  maple    mpich2   pgi      rdesktop   swig        visit
automake   doxygen   gmp   intel     matlab   mplayer  psi      ruby       texinfo     xemacs
The root of most of these tools contains a setup.sh file which you can source. This will
correctly set up your library path ($LD_LIBRARY_PATH) and program path ($PATH):
GCC
$ source /nfs/apps/gcc/4.5.0/setup.sh
This GCC setup.sh file should also source MPFR and GMP, but if not, please do it
manually:
$ source /nfs/apps/mpfr/3.0.0/setup.sh
$ source /nfs/apps/gmp/4.3.2/setup.sh
If you fail to properly source these dependencies, you may encounter this error:
/nfs/apps/gcc/4.3.2/libexec/gcc/x86_64-unknown-linux-gnu/4.3.2/f951:
error while loading shared libraries: libmpfr.so.1: cannot open
shared object file: No such file or directory
15 Workflow
Developing a big, sophisticated project entails many challenges. To mitigate some of these
challenges, we have adopted several best practices: incremental development, code review,
and continuous integration.
• Iterative and Incremental software development for early results, controllable risks, and
better engagement of stakeholders
• Code review for consistency, maintainability, usability, and quality
• Continuous Integration for automated testing, easy release, and scalable collaboration
Incremental development means developing new functionality in small steps, where the
resulting code at each step is a useful improvement over the previous state. Contrast this
with developing an entire feature fully elaborated, with no points along the way at which
it is externally usable.
Each ROSE developer is expected to push his/her work at least once every three weeks.
Major benefits of doing things incrementally
• You can have intermediate results along the path. So your sponsors will sleep better.
• You will get feedback early and frequently about whether you are heading in the right direction.
• Your work will be tested and merged often into the master branch, reducing the risk of
merge conflicts.
See more tips about How to incrementally work on a project1
Continuous integration means incorporating changes from work in progress into a shared
mainline as frequently as possible, in order to identify incompatible changes and newly
introduced bugs as early as possible. The integrated changes need not be particular
increments of functionality as far as the rest of the system is concerned.
In other words, incremental development is about making one's work valuable as early as
possible, and potentially about getting a better sense of what direction it should take, while
continuous integration is about reducing the risks that result from codebase divergence as
multiple people do development in parallel.
The question of whether to conditionalize new code is an interesting one. By doing so, one
narrows the scope of continuous integration to just checking for surface incompatibilities in
merging the changed code. Without actually running the new code against the existing tests,
the early detection of introduced bugs is lost. In exchange, multiple people working in the
same part of the codebase become less likely to step on each other's toes, because the relevant
code changes are distributed more rapidly.
See more at Continuous Integration3
15.3.2 Design
15.3.3 Implementation
• Project-Specific Tasks
• Private Issue Tracking
• Private Documentation
• Using redmine's wiki
• Github:
• Internal (http://github.llnl.gov/): for code review only,
• External (https://github.com/rose-compiler/rose): public code hosting, public
issue tracking for general ROSE bugs and features.
• "Rosebot" to automate Github workflow: preliminary testing, policies (git-hooks),
automatically add reviewers, etc.
15.3.4 Testing
15.3.5 Documentation
15.3.6 Publicity
Major workflow improvements and changes should be thoroughly tested and reviewed by
staff members before deployment, since they may have a profound impact on the project.
How to propose a workflow change
• Submit a ticket on github.com's rose-public/rose issue tracker. In the ticket, provide the
following information:
• What is it: Explain what change is proposed
• Why the changes: the long-term benefits for our productivity and quality of work
• The cost of the changes: learning curve, maintainability, purchase cost
• Optimize
• Optimize our workflow so that we can produce higher-quality work while using less time
and fewer other resources.
• Address what is slowing us down or distracting us.
• Simplify daily life. Consider what we can eliminate or automate using the proposed
workflow improvements.
• It is counterproductive to improve workflow by adding more hoops/steps/clicks
into daily work.
• Improve:
• Allows the improvement of the quality of work incrementally:
• Accepting incremental improvements is more realistic than asking for perfection in the
first try.
• Workflow should allow quick new contributions and fast revision of existing
contributions
• Automate:
• Additions to the workflow should be automated as much as possible.
• Preserve:
• It must preserve existing work:
• No creation of anything from scratch
• Does it interact well with existing workflow
• Is there a way to convert existing code/documents into the new form
• Simplicity:
• The more software tools we depend on, the harder it is to use and maintain our workflow.
Similarly, the more formats/standards we enforce, the harder it is for developers to do
their daily work.
• Adopting new required software components and new required technical formats/standards
in our workflow should be very carefully reviewed for the associated long-term
benefits and costs. Long-term means a range of 5 to 10 years and is not tied to a
temporary thing we use now.
• Preference of major contributors: Whoever contributes the most should have a little
more say.
• Documentation: We require major changes to be documented and reviewed before
deployment. Writing things down can help us clarify details and solicit wider comments
(instead of being limited to face-to-face meetings).
16 Coding Standard
This page documents the current recommended practice of how we should write code within
the ROSE project. It also serves as a guideline for our code review1 process.
New code should follow the conventions described in this document from the very beginning.
Updates to existing code that follows a different coding style should only be performed if
you are the maintainer of the code.
The order of sections in this coding standard follows a top-down approach: big things first,
then drill down to fine-grained details.
We use the coding standard to reflect the principal things we value in all contributions to ROSE:
• Documentation: What are the commits about? Is this reflected in commit messages,
README, source comments, or LaTeX files within the same commits?
• Style: Is the coding style consistent with the required and recommended formats? Is the
code clean and pleasant and easy to read?
• Interface: Does the code have a clean and simple interface for its users?
• Algorithm: Does the code have sufficient comments about what algorithm is used? Is
the algorithm correct and efficient (space and time complexity)?
• Implementation: Does the implementation correctly implement the documented
algorithms?
• Testing: Does the code have an accompanying test translator and input to ensure the
contributions do what they are supposed to do?
• Is Jenkins configured to trigger these tests? Local tests on a developer's workstation
do not count.
"Nearly every software engineer has, at some point, been exploited by someone who used
coding standards as a power play. Dogmatism over minutia is the purview of the
intellectually weak. Don't be like them. These are those who can't contribute in any meaningful way,
who can't actually improve the value of the software product, so instead of exposing their
incompetence through silence, they blather with zeal about nits. They can't add value
in the substance of the software, so they argue over form. Just because "they" do
that doesn't mean coding standards are bad, however.
Another emotional reaction against coding standards is caused by coding standards set
by individuals with obsolete skills. For example, someone might set today's standards
based on what programming was like N decades ago when the standards setter was writing
code. Such impositions generate an attitude of mistrust for coding standards. As above, if
you have been forced to endure an unfortunate experience like this, don't let it sour you to
the whole point and value of coding standards. It doesn't take a very large organization to
find there is value in having consistency, since different programmers can edit the same code
without constantly reorganizing each others' code in a tug-of-war over the "best" coding
standard."
This is not a place to write down new ideas, concepts, or suggestions to be used in the
future. If you have suggestions, put them on the discussion tab (link2) of this page.
We do welcome suggestions for improvements and changes so we can do things faster and
better.
• For suggestions, please follow the procedure defined in Proposing_Workflow_Changes3
• The suggestions will be reviewed by the criteria defined in
Reviewing_Workflow_Change_Proposals4
2 http://en.wikibooks.org/wiki/Talk:ROSE_Compiler_Framework/Coding_Standard
3 Chapter 15.4 on page 89
4 Chapter 15.5 on page 90
Git Convention
Before you commit your local changes, you MUST ensure that you have correctly configured
your author and email information (on all of your machines). Having a recognizable and
consistent name and email will make it easier for us to evaluate the contributions that you've
made to our project.
Guidelines:
• Name: You MUST use your official name you commonly use for work/business, not
nickname or alias which cannot be easily recognized by co-workers, managers, or sponsors.
• Email: You MUST use your email commonly used for work. It can be either your
company email or your personal email (gmail) if you DO commonly use that personal
email for business purpose.
To check if your author and email are configured correctly:
$ git config user.name
$ git config user.email
Alternatively, you can just type the following to list all your current git configuration
variables and values, including name and email information.
$ git config -l
It is important to have concise and accurate commit messages to help code reviewers do
their work.
Example commit message, excerpt from link5
* More documentation for the new memory cell, memory state, and X86
5 https://github.com/rose-compiler/rose/commit/801c53d81526e2eae7a68e0eab1a9f21b9892ab2
• (Required) Summary: the first line of the commit message is a one line summary (<50
words) of the commit. Start the summary with a topic, enclosed in parentheses, to
indicate the project, feature, bugfix, etc. that this commit represents.
• (Optional) Use a bullet-list (using an asterisk, *) for each item to elaborate on the commit
Also see http://spheredev.org/wiki/Git_for_the_lazy#Writing_good_commit_messages.
16.3.1 Overview
"The software design document is a written contract between you, your team, your project
manager and your client. When you document your assumptions, decisions and risks, it gives
the team members and stakeholders an opportunity to agree or to ask for clarifications and
modifications. Once the software design document is approved by the appropriate parties,
it becomes a baseline for limiting changes in the scope of the project." - How to Write a
Software Design Document | eHow.com6
We are still in the process of defining the requirements for design documents, but preliminarily,
here are the initial rules for writing a design document for a ROSE module (an analysis,
transformation, optimization, etc.).
(We thank Professor Vivek Sarkar7 at Rice University8 for his insightful comments for
some of the initial design document requirements.)
16.3.2 Guideline
• All new ROSE analyses, transformations, and optimizations must have an accompanying
design document, to be peer-reviewed, before the actual implementation begins.
• Be specific enough that someone with ROSE skills who is not the original designer could
(in principle) implement the design just by looking at the document.
• It's to be expected that different developers will make different low-level choices about
data structures, etc
If the requirements document is the "why" of the software, then the technical design document
is the "how to". For simplicity, we put both requirements and design into a single document
for now. We allow a separate requirements analysis document if necessary.
6 http://www.ehow.com/how_6734245_write-software-design-document.html#ixzz22E1xFTCS
7 http://www.cs.rice.edu/~vs3/home/Vivek_Sarkar.html
8 http://www.rice.edu/
Design Document
The purpose of writing the technical design document is to guide developers in implementing
(and fulfilling) the requirements of the software--it's the software's blueprint.
16.3.4 Format
16.3.5 Content
Major Sections
• Overview
• Explain the motivation and goal of the module: what does this module do, the goal,
the problem to address, etc.
• Requirement analysis: what is required for this module
• Define the interface: namespace, function names, parameters, return values. How
others can call this module and obtain the returned results
• Performance requirement: time and space complexity
• Scope of input/test codes: what types of languages to be supported, the constructs of
a language to be supported, the benchmarks to be used
• Design considerations
• Assumptions
• Constraints
• Tradeoffs and limitations: why this algorithm, what are the priorities, etc.
• Non-standard elements: Definitions of any non-standard symbols, shapes, acronyms,
and unique terms in the document
• Game plan: How each requirement will be achieved
• Internal software workflow
• Diagrams: logical structure and logical processing steps: MUST have a UML diagram
or PowerPoint diagram
• Pseudo code: MUST have pseudo code to describe key data structures and high-level
algorithm steps
• Example: Must illustrate the designed algorithm by using at least one sample input
code to go through the important intermediate results of the algorithm.
• Error, alarm and warning messages, optional
• Performance: MUST have complexity analysis. Estimate the time and space complexity
of this module so users can know what to expect
• Reliability (Optional)
• Related work: cite relevant work in textbooks and papers
16.3.7 References
16.3.8 TODO
16.4 Testing
Rules
• All contributions MUST have the accompanying test translator and input files to demon-
strate the contributions work as expected.
• All tests MUST be triggered by the "make check" rule
• All tests should have self-verification to make sure the correct results are generated
• All tests MUST be activated by at least one of the integration tests of Jenkins (the
test jobs used to check if something can be merged into our central repository's master
branch)
• This will ensure that no future commits can break your contributions.
Programming Languages
The order of sub-sections reflects a top-down approach for how things are added during the
development cycle: from directory --> file --> namespace --> etc.
16.6.1 General
• Language: all names should be written in English since it is the preferred language for
development, internationally
• fileName; // NOT: filNavn
Avoid ambiguous abbreviations: strike a good balance between clarity for the user and productivity.
Abbreviations and acronyms should NOT be uppercase when used as part of a name
• exportHtmlSource(); // NOT: exportHTMLSource();
16.6.3 File/Directory
Case:
• camelCase like fileName.hpp: This is consistent with existing names used in ROSE
File Extension:
• Header files: .h or .hpp
• Source files: .cpp or .cxx
• .C should be avoided so that the code works on file systems that do not distinguish
between lower and upper case.
16.6.4 Namespaces
16.6.5 Types
9 http://rosecompiler.org/ROSE_HTML_Reference/namespaces.html
10 http://en.wikibooks.org/wiki/ROSE%20Compiler%20Framework%2FROSE%20API
Naming Conventions
16.6.6 Variables
• Length: variables with a large scope should have long names, variables with a small
scope can have short names
• Temporary variables used for temporary storage (e.g. loop indices) are best kept short.
A programmer reading such variables should be able to assume that their values are not used
outside of a few lines of code. Common scratch variables for integers are i, j, k, m, n.
Optionally, you can use ii, jj, kk, mm, and nn, which are easier to highlight when looking
for indexing bugs.
• Case: camelCase, that is, mixed case starting with a lowercase letter, as in functionDecl
• Variables purposely start with a lowercase letter, in contrast to the uppercase first letter
used for types. The first letter alone therefore shows whether a name is a variable or
a type.
Booleans
Negated boolean variable names must be avoided. The problem arises when such a name is
used in conjunction with the logical negation operator as this results in a double negative.
It is not immediately apparent what !isNotFound means.
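A minimal sketch of the difference at the call site. The function names isFound and describeLookup are hypothetical and chosen only for illustration.

```cpp
#include <string>
#include <vector>

// Positive name: the call site reads without a double negative.
bool isFound(const std::vector<std::string>& names, const std::string& target)
{
    for (const std::string& name : names) {
        if (name == target) {
            return true;
        }
    }
    return false;
}

// Hypothetical caller, for illustration only.
std::string describeLookup(const std::vector<std::string>& names, const std::string& target)
{
    // Reads naturally as "if not found". With a negated name this test
    // would be !isNotFound(...), which the reader has to untangle first.
    if (!isFound(names, target)) {
        return "missing";
    }
    return "present";
}
```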
Collections
Plural form should be used on names representing a collection of objects. This enhances
readability since the name gives the user an immediate clue as to the type of the variable
and the operations that can be performed on its elements.
For example,
vector<Point> points;
int values[];
Constants
Named constants (including enumeration values): MUST be all uppercase using underscore
to separate words.
For example:
In general, the use of such constants should be minimized. In many cases implementing the
value as a method is a better choice:
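A minimal sketch of both conventions. All names here (MAX_ITERATIONS, PI, Color, Solver) are hypothetical and serve only as illustration.

```cpp
// Named constants and enumeration values: all uppercase,
// with underscores separating the words.
const int MAX_ITERATIONS = 25;
const double PI = 3.14159265358979;

enum Color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

// Alternative: expose the value through a method, so callers do not
// depend on a constant and the value can later be computed or configured.
class Solver {   // hypothetical class
public:
    int getMaxIterations() const { return 25; }   // NOT: MAX_ITERATIONS = 25
};
```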
Generic
Generic variables should have the same name as their type. This reduces complexity by
reducing the number of terms and names used. Also makes it easy to deduce the type given
a variable name only. If for some reason this convention doesn't seem to fit it is a strong
indication that the type name is badly chosen.
Non-generic variables have a role. These variables can often be named by combining role
and type:
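A minimal sketch of the two cases, assuming hypothetical types Topic and Point and hypothetical functions describeTopic and computeHorizontalSpan.

```cpp
#include <string>

struct Topic { std::string title; };   // hypothetical types
struct Point { double x; double y; };

// Generic variable: named after its type.
std::string describeTopic(const Topic& topic)   // NOT: const Topic& value
{
    return "Topic: " + topic.title;
}

// Non-generic variables: role combined with type.
double computeHorizontalSpan(const Point& startingPoint, const Point& endPoint)
{
    return endPoint.x - startingPoint.x;
}
```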
Globals
Private class variables should have underscore suffix. Apart from its name and its type, the
scope of a variable is its most important feature. Indicating class scope by using underscore
makes it easy to distinguish class variables from local scratch variables.
For example,
class SomeClass {
private:
  int length_;
};
An issue is whether the underscore should be added as a prefix or as a suffix. Both practices
are commonly used, but the latter is recommended because it seems to best preserve the
readability of the name. A side effect of the underscore naming convention is that it
nicely resolves the problem of finding reasonable variable names for setter methods and
constructors:
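A minimal sketch, using a hypothetical Line class: because the member carries the suffix, the parameter can keep the natural name.

```cpp
class Line {   // hypothetical class
public:
    // The constructor parameter and the member do not clash.
    explicit Line(int length) : length_(length) {}

    // The setter parameter can use the plain name as well.
    void setLength(int length) { length_ = length; }
    int getLength() const { return length_; }

private:
    int length_;
};
```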
Names representing methods or functions MUST be verbs, written in mixed case
starting with lower case. Functions (methods that return a value) should be named after
what they return, and procedures (void methods) after what they do.
• e.g. getName(), computeTotalWidth(), isEmpty()
A method name should avoid duplicated object name.
• e.g. line.getLength(); // NOT: line.getLineLength();
The latter seems natural in the class declaration, but proves superfluous in use, as shown in
the example.
The terms get and set must be used where an attribute is accessed directly.
• e.g: employee.getName(); employee.setName(name); matrix.getElement(2, 4);
matrix.setElement(2, 4, value);
The term compute can be used in methods where something is computed.
• e.g: valueSet->computeAverage(); matrix->computeInverse()
This gives the reader an immediate clue that the operation is potentially time-consuming and
that, if it is used repeatedly, caching the result may be worthwhile. Consistent use of the
term enhances readability.
The term find can be used in methods where something is looked up.
• e.g.: vertex.findNearestVertex(); matrix.findMinElement();
This gives the reader an immediate clue that this is a simple lookup method with a minimum of
computation involved. Consistent use of the term enhances readability.
The term initialize can be used where an object or a concept is established.
• e.g: printer.initializeFontSet();
The American spelling initialize should be preferred over the British initialise. The
abbreviation init should be avoided.
The prefix is should be used for boolean variables and methods.
• e.g: isSet, isVisible, isFinished, isFound, isOpen
There are a few alternatives to the is prefix that fit better in some situations. These are
the has, can and should prefixes:
• bool hasLicense();
• bool canEvaluate();
• bool shouldSort();
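The method naming rules above can be seen together in one small class. This is a sketch only; ValueSet and all of its members are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

class ValueSet {   // hypothetical class
public:
    // get/set: the attribute is accessed directly.
    void setValues(const std::vector<double>& values) { values_ = values; }
    const std::vector<double>& getValues() const { return values_; }

    // is/has: boolean queries.
    bool isEmpty() const { return values_.empty(); }
    bool hasNegativeValue() const
    {
        for (double value : values_) {
            if (value < 0.0) return true;
        }
        return false;
    }

    // compute: something is calculated, callers may want to cache the result.
    // Returns 0.0 for an empty set.
    double computeAverage() const
    {
        if (values_.empty()) return 0.0;
        double sum = 0.0;
        for (double value : values_) sum += value;
        return sum / static_cast<double>(values_.size());
    }

    // find: a simple lookup. Returns the index of the smallest element,
    // or 0 if the set is empty.
    std::size_t findMinIndex() const
    {
        std::size_t minIndex = 0;
        for (std::size_t i = 1; i < values_.size(); ++i) {
            if (values_[i] < values_[minIndex]) minIndex = i;
        }
        return minIndex;
    }

private:
    std::vector<double> values_;
};
```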
Parameters should be separated by a single space character, with no leading or trailing
spaces in the parameters list:
• YES: void foo(int x, int y)
• NO: void foo ( int x,int y )
16.7 Directories
16.7.2 Layout
TODO: big picture about where to put things within the ROSE git repository.
For each project directory under ./projects, it is our convention to have subdirectories for
different files
• README: must have this
• ./src: for all your source files
• ./include: for all your headers if you don't want to put them all into ./src
• ./tests: for your test input files
16.8 Files
A single file should contain one logical unit, or feature. Keep it modular!
16.8.3 Indentation
Avoid tabs for your code indentation, except in cases where tabs (\t) are required, e.g.
Makefiles.
Two or four spaces are recommended for code indentation.
a[i] = 0;
Indentation of 1 is too small to emphasize the logical layout of the code. Indentation larger
than 4 makes deeply nested code difficult to read and increases the chance that the lines
must be split.
16.8.4 Characters
File name:
• must be camelCase: such as fileName.h or fileName.hpp
• avoid file_name.h
Suffix
• For C header files: Use .h
• For C++ header files: Use .h or .hpp
Must have
• preprocessing directives (include guards) that prevent the header from being included
more than once, for example
#ifndef _HEADER_FILE_X_H_
#define _HEADER_FILE_X_H_
#endif //_HEADER_FILE_X_H_
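A minimal sketch of the guard at work, simulating a double inclusion inside one file. The macro name HEADER_FILE_X_H and the Point type are assumptions for illustration; the leading underscore is left out in this sketch because C++ reserves identifiers that begin with an underscore followed by an uppercase letter.

```cpp
// First "inclusion" of the header text: the macro is not yet defined,
// so the body is compiled and the macro is set.
#ifndef HEADER_FILE_X_H
#define HEADER_FILE_X_H
struct Point { int x; int y; };   // hypothetical header content
#endif // HEADER_FILE_X_H

// Second "inclusion" of the same text: the macro is already defined,
// so the body is skipped and Point is not defined twice.
#ifndef HEADER_FILE_X_H
#define HEADER_FILE_X_H
struct Point { int x; int y; };
#endif // HEADER_FILE_X_H
```

Without the guard, the second copy of the struct definition would be a redefinition error in the same translation unit.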
• this will pollute the global scope for each .cpp file which includes this header. using
namespace should only be used by .cpp files. More explanations are at link11 and
link212
• function definitions
References:
• http://www.parashift.com/c++-faq/hdr-file-ext.html
16.9 README
All major directories within the ROSE git repository should have a README file.
• projects/projectXYZ MUST have a README file.
File name should be README
what to avoid
• README.txt
• readme
11 http://www.parashift.com/c++-faq/using-namespace-std.html
12 http://www.possibility.com/Cpp/CppCodingStandard.html#dgdu
16.9.2 Format
Format of README
• text format with clear sections and bullets
• optionally, you can use styles defined by w:Markdown13
16.9.3 Examples
The source code14 of ROSE is documented15 using the Doxygen documentation system16 .
13 http://en.wikipedia.org/wiki/Markdown
14 https://github.com/rose-compiler/rose
15 http://www.rosecompiler.org/ROSE_HTML_Reference/index.html
16 http://www.stack.nl/~dimitri/doxygen/
Source Code Documentation
• English only
• Use valid Doxygen syntax (see "Examples" below)
• Make the code readable for a person who reads your code for the first time:
• Document key concepts, algorithms, and functionalities
• Cover your project, file, class/namespace, functions, and variables.
• State your input and output clearly, specifically the meaning of the input or output
• Users are more likely to use your code if they don't have to think about what the
output means or what the input should be
• Clever is often synonymous with obfuscated; avoid this form of cleverness in coding.
TODO, not ready yet
• Test your documentation by generating it on your machine and then manually inspecting
it to confirm its correctness
TODO: Generating Local Documentation
This sometimes does not work, since a configuration file determines which directories
are scanned to generate the web reference HTML files.
16.10.3 Examples
Single Line
Multiple lines
/**
... text..
*/
/**
*
* ... text..
*
*/
/*******************************//**
* text
*********************************/
/////////////////////////////////////
/// ... text <= 80 columns in length
//////////////////////////////////////
Doxygen can generate a brief comment for a function and optionally show detailed comments
if users click on the function.
Here are the options to support combined single-line and multiple-line source comments.
Option 1:
/**
* \brief Brief description.
* Brief description continued.
*
* [Optional detailed description starts here.]
*/
Option 2:
/**
\brief Brief description.
Brief description continued.
---
Single line comment followed by multiple line comments:
You may extend an existing single-line comment with a multiple-line comment (Option 1
or 2). For example:
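One possible illustration, written for this rewrite rather than taken from the ROSE sources: a single-line `///` brief comment followed directly by a multiple-line block in the Option 1 style. The function countPositive() and its parameter are hypothetical; \p, \param and \return are standard Doxygen commands.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/// Count how many elements of \p values are strictly positive.
/**
 * The vector is scanned once from front to back; zero and negative
 * entries are skipped.
 *
 * \param values  input values; may be empty
 * \return number of elements greater than zero
 */
std::size_t countPositive(const std::vector<int>& values)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < values.size(); i++)
    {
        if (values[i] > 0)
            count++;
    }
    return count;
}
```

Doxygen should show the first line as the brief description in member listings and the block as the detailed description, assuming a default configuration.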
16.11 Functions
Rules
• Except for simple functions like getXX() and setXX(), every function should have at
least a one-line comment explaining what it does.
• Avoid global functions and global variables. Try to put them into a namespace.
• A function should not have more than 100 lines of code. Please refactor big functions
into smaller, separate functions.
• Limit unconditional printf() calls so your translator does not print hundreds of lines of
unnecessary text output when processing multiple input files.
• Use an if condition to control printf() for debugging purposes, such as
"if ( SgProject::get_verbose() > 0 )"
• The beginning of a function should perform sanity checks on the function
parameters.
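The rules above can be sketched in a small self-contained function. This is only an illustration: the namespace demo, the variable verboseLevel (a stand-in for SgProject::get_verbose(), so that the example compiles without ROSE), and the function average() are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

namespace demo  // hypothetical namespace; keeps names out of the global scope
{
    // Stand-in for SgProject::get_verbose(); an assumption for this sketch.
    int verboseLevel = 0;

    /// Return the average of the first \p count entries of \p data.
    double average(const double* data, std::size_t count)
    {
        // Sanity checks on the parameters come first.
        assert(data != NULL);
        assert(count > 0);

        double sum = 0.0;
        for (std::size_t i = 0; i < count; i++)
            sum += data[i];

        // Debugging output is printed only when verbosity was requested.
        if (verboseLevel > 0)
            printf("average(): processed %lu values\n",
                   static_cast<unsigned long>(count));

        return sum / static_cast<double>(count);
    }
}
```

With verboseLevel left at 0, the function prints nothing, which is the default behavior the rules ask for.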
16.12 Comments
Rules
• Please follow Doxygen style comments
• Please explain in sufficient detail how your function works and the steps in the algorithm.
• Reviewers will read your commented information to understand your algorithm and
then read your code to see if the code implements the algorithm correctly and efficiently.
16.13 Coding
Correctly implement the designed/documented algorithms. Future users won't have time to
read your code directly to discern what it does.
Code should be efficient in terms of both time and space (memory) complexity.
Please be aware that your translator may handle thousands of statements with even more
AST nodes.
Be aware that people other than you may use your code or develop it further. Please make
this as easy as possible.
16.14 Classes
Name the class after what it is. If you can't think of what it is, that is a clue you have not
thought through the design well enough.
• A class name should be a noun.
Compound names of over three words are a clue your design may be confusing various
entities in your system. Revisit your design. Try a CRC card session to see if your objects
have more responsibilities than they should.
All sections (public, protected, private) should be identified explicitly. Not applicable sections
should be left out.
Structs are kept in C++ for compatibility with C only, and avoiding them increases the
readability of the code by reducing the number of constructs used. Use a class instead.
16.15 Statements
16.15.1 Loops
Only loop control statements may be included in the for() construction, nothing else is
allowed.
//Correct
sum = 0;
for (i = 0; i < 100; i++)
  sum += value[i];
//Incorrect
for (i = 0, sum = 0; i < 100; i++)
  sum += value[i];
This increases maintainability and readability. It also allows future developers to make a
clear distinction of what controls and what is contained in the loop.
Loop variables should be initialized immediately before the loop.
Type conversions must always be done explicitly. Never rely on implicit type conversion.
//Correct
floatValue = static_cast<float>(intValue);
//Incorrect
floatValue = intValue;
This way, the programmer indicates awareness of the different types involved and that
the mix is intentional.
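A minimal sketch of why the explicit form matters, using a hypothetical helper meanOf(): without the casts, total / count would be an integer division (7 / 2 gives 3), and the fractional part would be lost before the result is converted to float.

```cpp
#include <cassert>

/// Return total / count as a floating-point value.
float meanOf(int total, int count)
{
    assert(count != 0);
    // Both operands are converted explicitly, so the division is done in float.
    return static_cast<float>(total) / static_cast<float>(count);
}
```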
16.15.3 Conditionals
if (isDone)
  doCleanup();
// NOT: if (isDone) doCleanup();
This is for debugging purposes. When writing on a single line, it is not apparent whether
the test is really true or not.
There must be a space separating the keyword if from the condition statement (isDone).
if (isDone)
  ^ space
Complex conditional expressions must be avoided. Introduce temporary boolean
variables instead.
//recommended way
bool isFinished = (elementNo < 0) || (elementNo > maxElement);
bool isRepeatedEntry = elementNo == lastElement;
if (isFinished || isRepeatedEntry) { : }
All screen output MUST be put into an if statement so that it is executed conditionally,
controlled either by the verbose level or by another debugging option.
Output MUST NOT be printed by default.
TODO: this can be enforced by a simple Compass checker in the future.
16.15.5 switch
Carefully differentiate
• things which are known to be safe to ignore and
• things which are not yet handled by the current implementation.
switch(type->variantT())
{
  case V_SgTypeDouble:
    {
      ...
    }
    break;
  case V_SgTypeInt:
    {
      ...
    }
    break;
  case V_SgTypeFloat: // things which are known to be allowed to be ignored.
    break;
  default:
    {
      //Things which are not yet explicitly handled
      cerr<<"warning, unhandled node type: "<<
        type->class_name()<<endl;
    }
}
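The same pattern can be tried without a ROSE installation. The sketch below replaces the ROSE variants with a hypothetical enum TypeKind and a hypothetical function classify(); only the structure (handled cases, an explicitly ignored case, and a warning in the default branch) mirrors the rule above.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical stand-in for the ROSE type variants used in the text.
enum TypeKind { KindDouble, KindInt, KindFloat, KindChar };

/// Classify \p kind as "handled", "ignored", or "unhandled".
std::string classify(TypeKind kind)
{
    std::string result;
    switch (kind)
    {
        case KindDouble:
        case KindInt:
            result = "handled";
            break;
        case KindFloat:  // known to be safe to ignore
            result = "ignored";
            break;
        default:
            // Not yet handled by this implementation: warn instead of staying silent.
            std::cerr << "warning, unhandled type kind: " << kind << std::endl;
            result = "unhandled";
            break;
    }
    return result;
}
```

Keeping the ignored case separate from the default branch means a newly added kind triggers the warning instead of being skipped silently.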
16.15.6 assert
You are encouraged to use assertions often to explicitly express and guarantee the
assumptions made in the code.
Please use ROSE_ASSERT() or assert().
For each assertion, you MUST add a printf or cerr message indicating where in the code
the failure occurred and what went wrong, so users can immediately know the cause of the
assertion failure without going through a debugger.
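A minimal sketch of an assertion paired with a diagnostic message, using plain assert() so that it compiles without ROSE. The function checkedDivide() is a hypothetical example, not part of the ROSE API.

```cpp
#include <cassert>
#include <iostream>

/// Divide \p numerator by \p denominator; the denominator must be non-zero.
int checkedDivide(int numerator, int denominator)
{
    if (denominator == 0)
    {
        // Say where and what went wrong before the assertion aborts the program.
        std::cerr << __FILE__ << ":" << __LINE__
                  << ": checkedDivide(): denominator is zero (numerator = "
                  << numerator << ")" << std::endl;
    }
    assert(denominator != 0);
    return numerator / denominator;
}
```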
16.16 Expressions
All ROSE-based translators should call AstTests::runAllTests(project) after all
transformations are done to make sure the translated AST is correct.
This is a higher standard than merely unparsing to compilable code. It is common
for an AST to unparse correctly but fail the sanity check.
More information is at Sanity_check17
16.18 References
We list some external resources that influenced the definition of ROSE's coding standard:
• http://www.possibility.com/Cpp/CppCodingStandard.html
• Sutter and Alexandrescu, C++ Coding Standards, 220 pgs, Addison-Wesley, 2005, ISBN
0-321-11358-6.
• http://www.parashift.com/c++-faq/coding-standards.html
• http://geosoft.no/development/cppstyle.html/
• http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
17 Code Review Process
17.1 Motivation
17.2 Goals
1 http://www.phabricator.com/docs/phabricator/article/User_Guide_Review_vs_Audit.html#advantages-of-review
• enforce policies for consistent usability and maintainability of ROSE code: documented
and tested
• avoid reinventing the wheel and eliminate unnecessary redundancy
• safeguard the code: disallow subversive attempts to disable or remove regression
tests
17.3 Software
We are currently testing Github Enterprise2 and looking into the possibility of leveraging
Redmine3 for internal code review.
In the past, we have looked at Google's Gerrit code review system4.
17.3.1 Github
Releases: https://enterprise.github.com/releases
Support: https://support.enterprise.github.com
rosebot
(Under development)
An automated pull request analyzer to perform various tasks:
• Automatically add reviewers to Pull Requests based on hierarchical configuration
• "Pre-receive hook" analyses: file sizes, quantity of files, proprietary source, etc.
• more...
Read these tips and guidelines before sending a request for code review.
Please go to Coding Standard5 for the complete guideline. Here we only summarize some key
points.
Your code should be written in a way that makes it easily maintainable and reviewable:
2 https://enterprise.github.com/dashboard
3 http://www.redmine.org/
4 http://code.google.com/p/gerrit/
5 Chapter 16 on page 91
• write easy-to-understand code; avoid exotic techniques that nobody can easily
understand.
• add sufficient documentation (source-code comments, README, etc.) to aid the
understandability of your code. Your documentation should cover
• why you do this (motivation)
• how you do it (design and/or algorithm)
• where the associated tests are (works as expected)
• before submitting your code for review, make sure
• you have merged with the latest central repository's master branch without conflicts
• your working copy can pass local tests via: make, make check, and make distcheck
• you have fixed all compiler warnings in your code whenever possible
• submit a logical unit of work (one or more commits): something coherent like a bug fix,
an improvement of documentation, or an intermediate stage toward a big new feature.
• balance code submissions with a good ratio of [lines of code] and [complexity of code]. A
good balance makes the reviewer's life easier.
• the time needed to review your code should not exceed 1 hour. Please avoid pushing
thousands of lines at a time.
• Please also avoid submitting trivial changes (fixing a typo, commenting out a single line,
etc.) for review.
6 https://help.github.com/articles/generating-ssh-keys
Developer Checklist
5. Configure Auto-syncs: Contact the Jenkins administrator (too1 and liao6) to have
your repository added to a white-list of repositories to be synced whenever new commits are
integrated into ROSE's official master branch.
6. Setup polling job: Contact the Jenkins administrator (too1 and liao6) to have your
Github repository polled for new changes on the master branch. When new changes are
detected, your master branch will be pushed to the central repository (and added to the
Jenkins testing queue) as <oun>-reviewed-rc.
• have a local git repo to do your work and submit local commits. You have two choices:
• clone it from /nfs/casc/overture/rose/rose.git as we usually did before
• clone your fork on github.llnl.gov to a local repo (only HTTPS is supported via LC)
Note: You may encounter SSL certificate problems. If you do, simply disable SSL verification
in cURL using either export GIT_SSL_NO_VERIFY=false or configuring git:
• don't use branches; use separate git repositories for each of your tasks, so the
status/progress of one task won't interfere with other tasks.
• When ready to push your commits, synchronize with the latest rose-compiler/master to
resolve merge conflicts, etc.
• type: git pull origin master # this should always work since master branches on
github.llnl.gov are automatically kept up-to-date
• make sure your local changes can pass 1) make -j8, 2) make check -j8, and 3) make
distcheck -j8
• push your commits to a non-master branch of your fork (like bugfix-rc, featurex-rc,
work-status, etc.). You have total freedom to create branches in your forked repo, with
any names you like
• It is encouraged to push your work to a remote branch with a -status suffix, which will
trigger a pre-screening Jenkins job:
http://hudson-rose-30:8080/view/Status/job/S0-pre-screening-before-code-review/.
This is often useful to make sure your pushes can pass a minimal set of make check rules,
including your own, before reviewers spend time reading your code. Reviewers can also
see both your code and your code's actions.
• add a pull (merge) request to merge bugfix-rc into your own fork's master,
• please note that the default pull request will use rose-compiler/rose's master
as the base branch (destination of the merge). Please change it to be
your own fork's master branch instead.
• Also make sure the source (head) branch of the pull (merge) request is the one you
want (bugfix-rc in this example)
• Double-check that the diff tab of your pull request only shows the differences you
made, without other things brought in from the central repo. If it shows more, your own
repo's master is out of sync with the central repo's master. Notify the system admin (too1)
about the problem or manually fix it using the troubleshooting section of this page.
• notify a reviewer that you have a pull request (requesting to merge your bugfix-rc into
your master branch)
• You can assign the pull request to the reviewer so an email notification will be
automatically sent to the reviewer
• Or you can add discussion within the pull request using @revieweraccount. NOTE:
please only click "Comment on this issue" once and manually refresh the web page.
Github Enterprise has a bug that prevents it from automatically showing the newly added
comment (bug 79, see footnote 7).
• Or you can just email the reviewer
• waiting for the reviewer's feedback:
7 https://github.com/rose-compiler/rose/issues/79
Reviewer Checklist
• Be familiar with the current Coding Standard8 as a general guideline to perform the code
review.
• Allocate up to 1 hour at a time to review approximately 500-1000 lines of code: a longer
time may not pay off due to the attention span limits of human brains
17.5.2 Commenting
Reviewer comments should be clearly delimited into these three well-defined sections:
8 Chapter 16 on page 91
9 Chapter 16 on page 91
1. Mandatory: the details of the comment must be implemented in a new commit and
added to the Pull Request before the code review can be completed.
2. Recommended: the details of the comment could represent a best-practice or, simply, it
could be intended to provide some insight to the developer that they may have not thought
about.
Both Mandatory and Recommended can be accompanied by the keyword Nitpick:
3. Nitpick: the details of the comment represent a fix that usually involves a
spelling/grammatical or coding style correction. The main purpose of the nitpick indication is to let
the developer know that you're not trying to be on their case and make their life difficult,
but an error is an error, or there's a better way to do something.
17.5.3 Decisions
Who should review what
6. Most importantly, give the feedback quickly. While tactful is better (and you should
learn from past mistakes), you can always apologize for a poorly-delivered comment
with a quick followup. Don't just leave negative feedback to someone else or hope they
aren't persistent enough to make their contribution stick."
Ideally, every ROSE contributor should participate in code review as a reviewer at some
point so the benefits of peer review can be fully realized.
However, due to the limited access to our internal Github Enterprise server, we currently
have a centralized review process in which ROSE staff members (liao6, too1) serve as the
default code reviewers. They are responsible for either reviewing the code themselves or
delegating the review to other developers who either have better knowledge of the
contributions or should be aware of them.
We are actively looking at better options and will gradually expand the pool of reviewers so
the reviewing step won't become a bottleneck.
TODO: use rosebot to automatically assign reviewers according to a hierarchical
configuration of the source tree.
• Judging code by whether it's what the reviewer would have written
• Given a problem, there are usually a dozen different ways to solve it. And given a
solution, there are a million ways to render it as code.
• Degenerating into nitpicks:
• perfectionism may hurt progress. We should allow some non-critical improvements
to be done in the next version/commits.
• Feeling obligated to say something critical: it is perfectly fine to say "looks good, pass"
• Delay in review: we should not rush it, but we should keep in mind that somebody is
waiting for the review to be done to move forward
17.8 Criticism
Code reviews often degenerate into nitpicks. Brainstorming and design reviews tend to be more
productive.
• This makes sense: the earlier we catch problems, the better. Design happens earlier, so
design should be reviewed. The same idea applies to requirement analysis.
• To mitigate this risk, we now have rules for design document10 in our coding standard.
17.9 Troubleshooting
$ cd ~/Development/projects/rose
$ git clone [email protected]:<user_oun>/rose.git
Cloning into ROSE...
remote: Counting objects: 216579, done.
remote: Compressing objects: 100% (55675/55675), done.
remote: Total 216579 (delta 159850), reused 211131 (delta 155786)
Receiving objects: 100% (216579/216579), 296.41 MiB | 35.65 MiB/s, done.
Resolving deltas: 100% (159850/159850), done.
In rare cases, your repository's master branch cannot be automatically synchronized. This is
most likely due to merge conflicts. You will receive an error message through an automated
email, resembling the following (last updated on 7/24/2012):
To [email protected]:lin32/rose.git
! [rejected] origin/master -> master (non-fast forward)
error: failed to push some refs to '[email protected]:lin32/rose.git'
---
Past Software Experience
Please simply follow the email's instructions to force the update of your Github's master
branch.
In short:
• Gerrit's user interface is not user-friendly: it is complex and therefore more confusing,
especially when compared to Github's Pull Request mechanism for code review.
• Gerrit's remote API was not mature enough to handle our workflow. Additionally, we
had to hack several things to make it even slightly suit our needs. On the other hand, Github
has a great remote API which is easily accessible through Ruby scripting, a very popular
language in the domain of web interfaces and development.
• Gerrit is not as popular as Github, which is important for our project to gain traction.
Also, more people are familiar with Github, which makes it easier for them to use.
17.11 TODO
• TOP-PRIORITY: add pre-screening Jenkins job before manual code review kicks in
• Research, install, and test Facebook's Phabricator: http://phabricator.org/
See Continuous_Integration#Connection_to_Code_Review11
17.13 References
• http://www.mediawiki.org/wiki/Git/Tutorial
• http://www.mediawiki.org/wiki/Code_review_guide
• http://www.possibility.com/wiki/index.php?title=CodeReviews
• http://scientopia.org/blogs/goodmath/2011/07/06/things-everyone-should-do-code-review/
• http://stackoverflow.com/questions/3730527/workflow-for-github-based-code-review
• http://stackoverflow.com/questions/4262693/what-to-look-for-in-a-code-review
• LLNL Internal URL: http://github.llnl.gov/
• http://www.processimpact.com/articles/revu_sins.html Seven Deadly Sins of
Software Reviews
18 Continuous Integration
Figure 4 ROSE Continuous integration using Git and Jenkins (Code Review Omitted for
simpler explanation)
18.1 Motivation
18.2 Overview
The ROSE project uses a workflow that automates the central principles of continuous
integration1 in order to make integrating the work from different developers a non-event.
Because the integration process merges into ROSE only the changes that pass all tests,
we encourage all developers to stay in sync with the latest version.
A high-level overview of the development model used by ROSE developers:
• Step 1: Taking advantage of the distributed source code repositories based on git, each
developer should first clone his/her own repository from our central git repository (or its
mirrors/clones/forks).
• Step 2: Then a feature or a bugfix can be developed in isolation within the private
repository. He can create any number of private branches. Each branch should relate to
a feature that this developer is working on and be relatively short-lived. The developer
can commit changes to the private repository without maintaining an active connection
to the shared repository.
• Step 3: When work is finished and locally tested, he can pull the latest commits from the
central repo's master branch
• Step 4: He then can push all accumulated commits within the private repository to his
branch within the shared repository. We create a dedicated branch within the central
repository for each developer and establish access control of the branch so only an
authorized developer can push commits to a particular branch of the shared repository.
• Step 5-6 (automated): Any commits from a developer’s private repository will not be
immediately merged to the master branch of the shared repository.
In fact, we have access control to prevent any developer from pushing commits to the
master branch within the shared repository. A continuous integration server called Jenkins
is actively monitoring each developer’s branch within the central repository and will initiate
comprehensive commit tests upon the branch once new commits are detected. Finally,
Jenkins will merge the new commits to the master branch of the central repository if all
tests pass. If a single test fails, Jenkins will report the error and the responsible developer
should address the error in his private repository and push improved commits again.
As a result, the master branch of the central git repository is mostly stable and can be a
good candidate for our external release. On top of the master branch of the central git
repository, we also run more comprehensive release tests in Jenkins. If all the release
tests pass, an external release based on the master branch is made publicly available.
1 http://en.wikipedia.org/wiki/Continuous%20integration
Installed Software Packages
• Integration: tests used to check if the new commits can pass various "make check" rules,
compatibility tests, portability tests, configuration tests, and so on. If all tests pass, the
commits will be merged (or integrated) into the master branch of the central repository.
• Release: an additional set of tests, using external benchmarks, run against the
updated master branch of the central repository. If all tests pass, the head of the master
will be released as a stable snapshot for public file package releases (generated by "make
dist").
• Others: for informational purposes for now; not used in our production workflow.
So each push (one or more commits to a -rc branch) goes through two stages:
the Integration test stage and the Release test stage.
It is each developer's responsibility to make sure their commits can pass BOTH stages by
fixing any bugs discovered by the tests.
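The two-stage gate can be modeled in a few lines. This is a simplified sketch of the decision logic only, not the actual Jenkins configuration: the names PushOutcome, allPassed() and evaluatePush() are hypothetical, and the model treats release tests as if they ran once per push, which glosses over the fact that they run against the updated master branch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical outcome of one push to a -rc branch.
enum PushOutcome { Rejected, Integrated, Released };

/// Return true only if every entry in \p results is a pass.
bool allPassed(const std::vector<bool>& results)
{
    for (std::size_t i = 0; i < results.size(); i++)
    {
        if (!results[i])
            return false;
    }
    return true;
}

/// Model the pipeline: integration tests gate the merge, release tests gate the release.
PushOutcome evaluatePush(const std::vector<bool>& integrationTests,
                         const std::vector<bool>& releaseTests)
{
    if (!allPassed(integrationTests))
        return Rejected;    // nothing is merged into master
    if (!allPassed(releaseTests))
        return Integrated;  // merged into master, but no public snapshot
    return Released;        // merged and published as a stable snapshot
}
```

A single failing integration test is enough to keep the commits out of master, which matches the behavior described in the overview.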
It is possible to manually track how your commits are doing in the test pipeline
within Jenkins (http://hudson-rose-30:8080/), but it can be tedious and overwhelming.
So we provide a dashboard (http://sealavender:4000/) to summarize the commits to
your release candidate branch (-rc) and the pass/fail status of each integration test.
Note: It's possible that all of your testing jobs (finally) pass, but the actual integration
is not performed. This typically occurs when one of your jobs has a system failure, for
instance, so it has to be manually restarted. If you see that all of your jobs have passed,
but your work has not been integrated, please let the Jenkins administrator know.
2 http://en.wikibooks.org/wiki/ROSE%20Compiler%20Framework%2FJenkins%20Failures
In reality, most LLNL developers are now asked to push things to Github Enterprise for code
review3 first instead of directly pushing to our central git repository. The synchronization
between Github Enterprise's code review repositories and our central Git repo is
automated.
##
## Add /nfs as remote
##
## `|| true`: don't error if remote exists
##
git remote add nfs /nfs/casc/overture/ROSE/git/ROSE.git || true
git fetch nfs
##
## Push to /nfs *-rc
##
if [ -n "$(git log --oneline nfs/master..github/master)" ]; then
git push --force nfs "$GIT_BRANCH":refs/heads/oun-reviewed-rc
fi
Auto push: A Jenkins job is responsible for propagating latest central master contents to all
private repositories on github.llnl.gov
• http://hudson-rose-30:8080/job/Commit-sync-github
The Job configuration
• Source Code Management:
• Git: /nfs/casc/overture/ROSE/git/ROSE.git
• Branches to build: */master
• Build Trigger: Build after other projects are built: Commit
• Execute Shell
USERS="\
  user1\
  user2
"
18.8 TODO
High priority
• Add a pre-screening job before manual code review kicks in. The pre-screening job can
make sure the code to be reviewed compiles with minimal warning messages
and includes the required make check rules to run tests.
• enable email notification for the final results of each test:
18.9 References
• Files used to generate the figure: feel free to add new versions as new slides: link4
4 https://docs.google.com/presentation/d/1US3e9sXnjPvgRU9cyOfQgKZBHScGiCMODSsbQH80i8s/edit
19 Frequently Asked Questions (FAQ)
We collect a list of frequently asked questions about ROSE, mostly from the rose-public
mailing list link1
19.1 General
Google supports restricting a search to the scope of a URL. For example, if you have a
problem with a keyword MY PROBLEM, you can try to search the mailing list by using
the following keyword in google.com:
It can feel very frustrating when you get no responses to your questions submitted to the
[email protected] mailing list. You may wonder why the ROSE staff sometimes cannot
help either.
Here are some possible excuses:
• They are just as busy as everybody else in the research and development fields. They
may be working around the clock to meet deadlines for proposals, papers, project reviews,
deliverables, etc.
• They don't know every corner of their own compiler, given the breadth and depth of
contributions made to ROSE by collaborators, former staff members, post-docs, and
interns. Moreover, most contributions lack good documentation, something that should
be remedied in the future.
• Some questions are simply difficult and open research and development questions. They
may have no clue, either.
• They just feel lazy sometimes or are taking a thing called vacation.
Possible alternatives to have your questions answered and your problems solved in a timely
fashion:
1 https://mailman.nersc.gov/pipermail/rose-public/
2 https://mailman.nersc.gov/mailman/listinfo/rose-public
Excluding the EDG submodule and all source code comments, the core of ROSE (rose/src)
has about 674,000 lines of C/C++ source code as of July 11, 2012.
Including the tests, projects, and tutorial directories, ROSE has about 2 million lines of code.
Some details are shown below:
[rose/src]./cloc-1.56.pl .
3076 text files.
2871 unique files.
716 files ignored.
3 https://mailman.nersc.gov/mailman/listinfo/rose-public
To show top level information only (in MB): du -msl * | sort -nr
170 tests
109 projects
90 src
19 docs
16 winspecific
16 ROSE_ResearchPapers
15 binaries
7 scripts
5 LicenseInformation
4 tutorial
4 autom4te.cache
2 libltdl
2 exampleTranslators
2 configure
2 config
2 ChangeLog
709 .
250 ./.git
245 ./.git/objects
243 ./.git/objects/pack
170 ./tests
109 ./projects
90 ./src
76 ./tests/CompileTests
50 ./tests/RunTests
40 ./tests/RunTests/FortranTests
34 ./tests/RunTests/FortranTests/LANL_POP
29 ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1
27 ./src/3rdPartyLibraries
23 ./tests/roseTests
23 ./src/frontend
22 ./tests/CompileTests/Fortran_tests
21 ./tests/CompilerOptionsTests
19 ./docs
18 ./tests/CompileTests/RoseExample_tests
18 ./src/midend
18 ./docs/Rose
16 ./winspecific
16 ./ROSE_ResearchPapers
15 ./tests/CompileTests/Fortran_tests/gfortranTestSuite
15 ./binaries/samples
15 ./binaries
14 ./tests/CompileTests/Fortran_tests/gfortranTestSuite/gfortran.dg
14 ./src/roseExtensions
11 ./projects/traceAnalysis
10 ./tests/CompileTests/A++Code
10 ./tests/CompilerOptionsTests/testCpreprocessorOption
10 ./tests/CompilerOptionsTests/A++Code
10 ./src/roseExtensions/qtWidgets
10 ./src/frontend/Disassemblers
10 ./projects/symbolicAnalysisFramework
10 ./projects/SATIrE
10 ./projects/compass
9 ./winspecific/MSVS_ROSE
9 ./tests/RunTests/A++Tests
9 ./tests/roseTests/binaryTests
9 ./src/frontend/SageIII
9 ./projects/symbolicAnalysisFramework/src
9 ./docs/Rose/powerpoints
8 ./winspecific/MSVS_project_ROSETTA_empty
8 ./projects/simulator
7 ./tests/RunTests/FortranTests/LANL_POP_OLD
7 ./tests/CompileTests/Cxx_tests
7 ./src/midend/programTransformation
7 ./src/midend/programAnalysis
7 ./src/3rdPartyLibraries/libharu-2.1.0
7 ./scripts
7 ./projects/symbolicAnalysisFramework/src/mpiAnal
7 ./projects/RTC
6 ./winspecific/MSVS_ROSE/Debug
6 ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1/ncdap_test
6 ./tests/roseTests/programAnalysisTests
6 ./src/3rdPartyLibraries/ckpt
6 ./src/3rdPartyLibraries/antlr-jars
6 ./projects/SATIrE/src
5 ./tests/RunTests/FortranTests/LANL_POP/pop-distro
5 ./tests/RunTests/FortranTests/LANL_POP/netcdf-4.1.1/libcf
5 ./tests/CompileTests/ElsaTestCases
5 ./src/ROSETTA
5 ./src/3rdPartyLibraries/qrose
5 ./projects/DatalogAnalysis
5 ./projects/backstroke
5 ./LicenseInformation
5 ./docs/Rose/AstProcessing
241568 ./.git/objects/pack/pack-f366503d291fc33cb201781e641d688390e7f309.pack
13484 ./tests/CompileTests/RoseExample_tests/Cxx_Grammar.h
10240 ./projects/traceAnalysis/vmp-hw-part.trace
6324 ./tests/RunTests/FortranTests/LANL_POP_OLD/poptest.tgz
5828 ./winspecific/MSVS_ROSE/Debug/MSVS_ROSETTA.pdb
4732 ./.git/objects/pack/pack-f366503d291fc33cb201781e641d688390e7f309.idx
4488 ./binaries/samples/bgl-helloworld-mpicc
4488 ./binaries/samples/bgl-helloworld-mpixlc
4080 ./LicenseInformation/edison_group.pdf
3968 ./projects/RTC/tags
3952 ./src/frontend/Disassemblers/x86-InstructionSetReference-NZ.pdf
3908 ./tests/CompileTests/RoseExample_tests/trial_Cxx_Grammar.C
3572 ./winspecific/MSVS_project_ROSETTA_empty/MSVS_project_ROSETTA_empty.ncb
3424 ./src/frontend/Disassemblers/x86-InstructionSetReference-AM.pdf
2868 ./.git/index
2864 ./projects/compassDistribution/COMPASS_SUBMIT.tar.gz
2864 ./projects/COMPASS_SUBMIT.tar.gz
2740 ./ROSE_ResearchPapers/2007-CommunicatingSoftwareArchitectureUsingAUnifiedSingle-ViewVisualization-ICECCS.pdf
2592 ./docs/Rose/powerpoints/rose_compiler_users.pptx
2428 ./src/3rdPartyLibraries/ckpt/wrapckpt.c
2408 ./projects/DatalogAnalysis/jars/weka.jar
2220 ./scripts/graph.tar
1900 ./src/3rdPartyLibraries/antlr-jars/antlr-3.3-complete.jar
1884 ./src/3rdPartyLibraries/antlr-jars/antlr-3.2.jar
1848 ./src/midend/programTransformation/ompLowering/run_me_defs.inc
1772 ./src/3rdPartyLibraries/qrose/docs/QROSE.pdf
1732 ./tests/CompileTests/Cxx_tests/longFile.C
1724 ./src/midend/programTransformation/ompLowering/run_me_task_defs.inc
1656 ./ChangeLog
1548 ./tests/roseTests/binaryTests/yicesSemanticsExe.ans
1548 ./tests/roseTests/binaryTests/yicesSemanticsLib.ans
1480 ./ROSE_ResearchPapers/1997-ExpressionTemplatePerformanceIssues-IPPS.pdf
1408 ./docs/Rose/powerpoints/ExaCT_AllHands_March2012_ROSE.pptx
...
19.2 Compilation
Solution: add an official rose repo as an additional remote repo of your local repo
• add a canonical repository, like the one at github: git remote add official-rose
https://github.com/rose-compiler/rose.git
• git fetch official-rose // to retrieve hash numbers etc. in the canonical repository
• Now you can build rose again. It should find the canonical repo you just added and use
it to find a matching EDG binary
Question: It takes hours to compile ROSE. How can I speed up this process?
Answer:
• if you have multi-core processors, try make -j4 (run make with four processes, or
even more if you like).
• also try to build only librose.so under src/ by typing make -C src/ -j4
• Or build only the language support you are interested in by selecting it during configure, such as
• ../sourcetree/configure --enable-only-c # if you are only interested in C/C++ support
• ../sourcetree/configure --enable-only-fortran # if you are only interested in Fortran
support
• ../sourcetree/configure --help # show all other options to enable only a few languages.
https://mailman.nersc.gov/pipermail/rose-public/2011-July/001015.html
ROSE does not handle incomplete code, though this might be possible in the future. It
would be language dependent and would likely depend heavily on some of the language-specific
tools that we use internally. This is, however, not really a priority for our work. If you want
to demonstrate, for example, how some of the internal tools we are using, or alternative tools
that we could use, might handle incomplete code, this might be interesting and we could
discuss it.
For example, we are not presently using Clang, but if it handled incomplete code that might
be interesting for the future. I recall that some of the latest EDG work might handle some
incomplete code, and if that is true then that might be interesting as well. I have not
attempted to handle incomplete code with OFP, so I am not sure how well that could be
expected to work. Similarly, I don't know what the incomplete code handling capabilities of
the ECJ Java support are either. If you know the answers to any of these questions, we could
discuss this further.
I have some doubts about how much meaningful information can come from incomplete
code analysis, and so that would worry me a bit. I expect it is very language dependent
and there would likely be some constraints on the incomplete code. So understanding the
subject better would be an additional requirement for me.
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000856.html
Question: I'm trying to analyze the Linux kernel. I was not sure of the size of the code-base
that can be handled by ROSE, and could not find references as to whether it has been tried
on the Linux kernel source. As of now I'm trying to run the identity translator on the source,
and would like to know if it can be done using ROSE, and if it has been successfully tested
before.
Short answer: Not for now
Long answer: We use EDG 3.3 internally by default, and this version of EDG does not
handle the GNU-specific register modifiers used in the asm() statements of the Linux kernel
code. There might be other problems, but that was at least the one that we noticed in
previous work on this some time ago. We are working on upgrading the EDG frontend
to the more recent version 4.4.
https://mailman.nersc.gov/pipermail/rose-public/2010-November/000544.html
Not yet.
I know of a few cases where ROSE can't handle parts of Boost. In each case it is an EDG
problem where we are using an older version of EDG. We are trying to upgrade to a newer
version of EDG (4.x), but that version's use within ROSE does not include enough C++
support, so it is not ready. The C support is internally tested, but we need more time to
work on this.
19.3 AST
https://mailman.nersc.gov/pipermail/rose-public/2010-April/000144.html
Question: I want to exclude functions in #include files from my analysis/transformations
during my processing.
By default, an AST traversal may visit all AST nodes, including the ones that come from headers.
So the AST processing classes provide three functions:
• T traverse(SgNode* node, ..): traverses the full AST, including nodes which represent code
from include files
• T traverseInputFiles(SgProject* projectNode, ..): traverses the subtree of the AST which
represents the files specified on the command line
• T traverseWithinFile(SgNode* node, ..): traverses only the nodes which represent code of the
same file as the start node
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000930.html
Both the true and false bodies used to be SgBasicBlock.
Later, we decided to represent both blocked (with {...}) and single-statement (without {...})
bodies more faithfully. So they are now SgStatement (SgBasicBlock is a subclass
of SgStatement).
But it seems the documentation has not been updated to be consistent with this change.
You have to check in your code whether the body is a block or a single statement. Or you can use
the following function to ensure that all bodies are SgBasicBlock.
// A wrapper of all ensureBasicBlockAs*() above to ensure the parent of s is a scope statement
// with list of statements as children, otherwise generate a SgBasicBlock in between.
SgLocatedNode* SageInterface::ensureBasicBlockAsParent(SgStatement* s)
This is called preprocessing info within ROSE's AST. Preprocessing info objects are attached
before, after, or within a nearby AST node (only one with source location information).
An example translator is provided to traverse the input code's AST and dump information
about the preprocessing information it finds:
exampleTranslators/defaultTranslator/preprocessingInfoDumper -c main.cxx
-----------------------------------------------
Found an IR node with preprocessing Info attached:
(memory address: 0x2b7e1852c7d0 Sage type: SgFunctionDeclaration) in file
/export/tmp.liao6/workspace/userSupport/main.cxx (line 3 column 1)
-------------PreprocessingInfo #0 ----------- :
classification = CpreprocessorIncludeDeclaration:
String format = #include "all_headers.h"
If you look at the whole AST graph carefully, you can find defining and non-defining
declarations for the same class.
A symbol is usually associated with a non-defining declaration. A class definition is associated
with a defining declaration.
You may want to get the defining declaration from the non-defining declaration before you
try to grab the definition.
There is a section named "1.7 Adding New SAGE III IR Nodes (Developers Only)" in the ROSE
Developer’s Guide (http://www.rosecompiler.org/ROSE_DeveloperInstructions.pdf).
But before you decide to add new nodes, consider whether AstAttribute (user-defined
objects attached to the AST) would be sufficient for your problem.
For example, the first version of the OpenMP implementation in ROSE
(rose/projects/OpenMP_Translator) started by using AstAttribute to represent information
parsed from pragmas. Only in the second version did we introduce dedicated AST
nodes.
There are two separate steps when new kinds of IR nodes are added into ROSE:
• First step (declaration): Adding class declaration/implementation into ROSE for the new
IR nodes. This step is mostly related to ROSETTA.
• Second step (creation): Creating those new IR nodes at some point: such as somewhere
within frontend, midend, or even backend if desired. So this step is decided case by case.
If the new types of IR come from their counterparts in EDG, then modifications to the
EDG/SAGE connection code are needed. If not, the EDG/SAGE connection code may be
irrelevant.
If you are trying to add new nodes to represent pragma information, you can create your
new nodes without involving EDG or its connection to ROSE. You just parse the pragma
string in the original AST and create your own nodes to get a new version of AST. Then it
should be done.
tests/CompileTests/mergeAST_tests
An AST node can have a parent node which is different from its scope.
For example: the struct declaration's parent is the typedef declaration. But the struct's
scope is the scope of the typedef declaration.
typedef struct frame {int x;} s_frame;
19.4 Translation
https://mailman.nersc.gov/pipermail/rose-public/2011-January/000604.html
Question: The ROSE identityTranslator performs some modifications "automatically".
These modifications are:
• Expanding the assert macro.
• Adding extra brackets around constants of typedef types (e.g. c=Typedef_Example(12);
is translated in the output to c = Typedef_Example((12));)
• Converting NULL to 0.
How can I avoid these modifications?
Answer: No.
There is no easy way to avoid these changes currently. Some of them are introduced by
the cpp preprocessor. Others are introduced by the EDG front end ROSE uses. 100%
faithful source-to-source translation may require significant changes to preprocessing directive
handling and the EDG internals.
We have had some internal discussions about saving raw token strings in the AST and using them
to get faithful unparsed code. But this effort is still at its initial stage as far as I know.
https://mailman.nersc.gov/pipermail/rose-public/2010-July/000319.html
Question: I am trying to build a tool which inserts one or more function calls wherever
the source code contains a function belonging to a certain group (e.g. all functions beginning
with foo_*). During the AST traversal, how can I find the right place? Is there a function
in ROSE that searches for a string pattern or something similar?
Answers:
• In Chapter 28 AST Construction of the ROSE tutorial, there are examples to instrument
function calls into the AST using traversals or a queryTree. I would approach this by
checking the node for the specific SgFunctionDefinition (or whatever you need) and then
check the name of the node to find its location.
• You can
• use the AST query mechanism to find all functions and store them in a container,
e.g. Rose_STL_Container<SgNode*> nodeList = NodeQuery::querySubTree(root_node, V_Sg????);
• then iterate over the container and check whether each function's name matches
what you want.
• use the SageBuilder namespace's buildFunctionCallStmt() to create a function call statement.
• use the SageInterface namespace's insertStatement() to do the insertion.
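The name check in the second step does not depend on ROSE and can be sketched in plain C++. This is only a minimal sketch: hasPrefix and selectByPrefix are hypothetical helpers, not ROSE functions, and the names are assumed to have been collected already as std::string values from the function declarations returned by the query.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true if `name` starts with `prefix` (e.g. "foo_").
bool hasPrefix(const std::string& name, const std::string& prefix)
{
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: keep only the names that start with `prefix`,
// preserving their original order.
std::vector<std::string> selectByPrefix(const std::vector<std::string>& names,
                                        const std::string& prefix)
{
    std::vector<std::string> matches;
    for (const std::string& name : names)
    {
        if (hasPrefix(name, prefix))
            matches.push_back(name);
    }
    return matches;
}
```

For instance, calling selectByPrefix on the names foo_init, bar and foo_run with the prefix foo_ keeps foo_init and foo_run. Each match would then mark a place where the new call is built and inserted.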
https://mailman.nersc.gov/pipermail/rose-public/2011-April/000919.html
We need to be more specific about the function you want to copy. Is it just a prototype
function declaration (a non-defining declaration in ROSE's terms) or a function with a definition
(a defining declaration in ROSE's terms)?
• Copying a non-defining function declaration can be achieved by using the following
function instead:
#include <rose.h>
#include <stdio.h>
using namespace SageInterface;
SgFunctionDeclaration* func =
    findDeclarationStatement<SgFunctionDeclaration>(project, "bar", NULL, true);
ROSE_ASSERT(func != NULL);
// Insert it into a scope
https://mailman.nersc.gov/pipermail/rose-public/2011-May/000971.html
No. ROSE does not unparse the AST from headers right now. A summer project tried to do
this, but it was not finished.
https://mailman.nersc.gov/pipermail/rose-public/2010-August/000344.html
I guess ROSE does not support writing out changed headers for safety and practical reasons. A
changed header has to be saved to another file, since writing to the original header is very
dangerous (imagine debugging a header translator which corrupts its input headers). Then all
other files and headers using the changed header have to be updated to use the new header file.
Also, all files involved have to be writable by the user's translators.
As a result, the current unparser skips subtrees of the AST that come from headers by checking
file flags (compiler_generated and/or output_in_code_generation etc.) stored in Sg_File_Info
objects.
https://mailman.nersc.gov/pipermail/rose-public/2011-June/001008.html
if (calleeDef != NULL)
formalArgList = calleeDef->get_declaration()->get_args();
19.5 Unparsing
https://mailman.nersc.gov/pipermail/rose-public/2012-August/001742.html
Question: I wonder whether it is possible for ROSE to generate two files (.c and .cl) when it
translates C to OpenCL?
Answer: The ROSE outliner has an option to output the generated function into a new file.
https://github.com/rose-compiler/rose/blob/master/src/midend/programTransformation/astOutlining/Outliner.hh
...
// Generate the outlined function into a separated new source file
// -rose:outline:new_file
extern bool useNewFile;
...
You may want to check how this option is used in the outliner source files to get what you
want.
Symptom:
The reason may be that you are behind a firewall which alters the original SSL certificate.
Solutions: Tell cURL not to check SSL certificates:
https://mailman.nersc.gov/pipermail/rose-public/2010-April/000115.html
There may not be a widely recognized best integrated development environment. But
developers have reported that they are using
• vim
• emacs
• KDevelop
• Source Navigator
• Eclipse
• Netbeans
The thing is that ROSE is huge and has some ridiculously large generated source files
(for example, CxxGrammar.h and CxxGrammar.C are generated in the build tree). So many
code browsers may have trouble handling ROSE.
19.7 Portability
mkdir ROSE-build-cmake
cd ROSE-build-cmake
# Example Boost installation path: /opt/boost_1_40_0-inst
cmake .. -DBOOST_ROOT=${ROSE_TEST_BOOST_PATH}
https://mailman.nersc.gov/pipermail/rose-public/2011-December/001349.html
We have not finished the Windows work yet. It is on our list of things to do. It was started,
and ROSE compiles internally using MS Visual Studio (using project files generated from
the CMake build that we maintain and test within our release process for ROSE), but it does
not pass our tests. So it is not ready. The distribution of the EDG binaries for Windows is
another step that would come after that. We don't know at present when this will be done.
It is not a high priority for our DOE-specific work, but it is important for other
work. The effort required is something that we could discuss. If you want to call me, that
would be the best way to proceed. Send me email off the main list and we can set that up.
https://mailman.nersc.gov/pipermail/rose-public/2011-March/000798.html
Under Windows, ROSE uses CMake (http://www.cmake.org/). This is a project that is currently
under development. As of November 2010 we are able to compile and link the src directory. We
are also able to run example programs that link against librose and execute the frontend and
backend. However, this is an internal capability and not available externally yet, since we
don't distribute the Windows-generated EDG binaries that would be required. Also, the current
support for Windows is still incomplete: ROSE does not yet pass its internal tests under
Windows.
20 How-tos
Quick, short, and focused tutorials about how to do common tasks as a ROSE developer.
Please create a new wikibook page for each how-to topic. Each how-to wiki page should
NOT contain any level one (=) or level two(==) heading so it can be included at the correct
levels in the print version of this wikibook.
How to write a How-to
• rename three places of the pasted text with the desired page name, for example
• To use your own image in a wiki page, you have to upload the image to http://commons.wikimedia.org/.
• Once you upload the image, it becomes public to all Wikibooks users. Be sure to
declare your copyright if you created the image yourself.
• Follow these instructions to insert the image and adjust the layout of your page:
http://en.wikibooks.org/wiki/Using_Wikibooks/Inserting_Images
• Only level three headings (===) and higher are allowed in a how-to page. This is
necessary for the how-to page to be included correctly in the final one-page print version
of this wikibook. Sorry about this restriction.
• Again, please don't use level one (=) or level two (==) headings in a how-to page!
• Keep each how-to short and focused. Readers are expected to spend 30 minutes or
much less to quickly learn how to do something using ROSE.
• After you have created a new how-to page and saved your contributions, please go to the print
version to make sure it shows up correctly.
• Here is the link: http://en.wikibooks.org/wiki/ROSE_Compiler_Framework/Print_version
• Having new content show up in the print version ensures that it is really visible and
consistent with the rest of the book.
• Please specify whether the how-to topic describes the current practice or a proposed new way
of doing things, so that we have a clear guideline for code review about what is mandatory and
what is optional.
How to incrementally work on a project
Developing a big, sophisticated project entails many challenges. To mitigate some of these
challenges, we have adopted several best practices: incremental development, code review,
and continuous integration.
Here are some tips on how to divide up a big project into smaller, bite-sized pieces so each
piece can be incrementally developed, code reviewed, and integrated.
• Input: define different sets of test inputs based on complexity and difficulty. Tackle
simpler sets first.
• Output: define intermediate results leading to the final output. Often, results A and B are
needed to generate C. So the project can have multiple stages, based on the intermediate
results.
• Algorithm: complex compiler algorithms are often just enhanced versions of more
fundamental algorithms. Implement the fundamental algorithms first to gain insight and
experience. Then, afterward, you can implement the full-blown versions.
• Language: for projects dealing with multiple languages, focus on one language at a
time.
• Platform: limit the scope of supported platforms: Linux, Ubuntu, OS X (TODO: add
reference to ROSE supported platforms)
• Performance: start with a basic, working implementation first. Then try to optimize
its performance and efficiency.
• Scope: your translator could first focus on working at a function scope, then grow to
handle an entire source file, or even multiple files, at the same time.
• Skeleton then meat: a project should be created with the major components defined
first. Each component can be enriched separately later on.
• Annotations (manual vs. automated): Performing one compiler task often requires
results from many other tasks being developed. Defining source code annotations as
the interface between two tasks can decouple these dependencies in a clean manner. The
annotations can be first manually inserted. Later the annotations can be automatically
generated by the finished analysis.
• Optional vs. Default: introduce a flag to turn your feature on or off. Make it the
default option once it matures.
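The last tip can be sketched as a small command-line check. This is only an illustration under assumptions: the helper isFeatureEnabled and the flag names used with it are hypothetical and not part of ROSE.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: decide whether an optional feature is active.
// The feature starts from `enabledByDefault`; each occurrence of `onFlag`
// or `offFlag` on the command line overrides it, and the last one wins.
bool isFeatureEnabled(const std::vector<std::string>& args,
                      const std::string& onFlag,
                      const std::string& offFlag,
                      bool enabledByDefault)
{
    bool enabled = enabledByDefault;
    for (const std::string& arg : args)
    {
        if (arg == onFlag)
            enabled = true;
        else if (arg == offFlag)
            enabled = false;
    }
    return enabled;
}
```

While the feature is still maturing, callers would pass false for enabledByDefault so that users must opt in. Once it is stable, switching that single argument to true makes it the default while the off flag remains available.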
How to create a translator
A translator basically converts one AST into another version of the AST. The translation
process may add, delete, or modify the information stored in the AST.
20.3.1 Overview
Get familiar with the ASTs before and after your translation, so you know for sure what
your code will deal with and what AST your code will generate.
The best way is to prepare the simplest possible sample codes and carefully examine their
whole dot graphs.
There are multiple ways to find the things you want to translate in the AST.
AST Query
• Via AST query: a node query returns a list of AST nodes of the same type. This is often
enough for simple translations.
Rose_STL_Container<SgNode*> ProgramHeaderStatementList =
    NodeQuery::querySubTree(project, V_SgProgramHeaderStatement);
for (Rose_STL_Container<SgNode*>::iterator i = ProgramHeaderStatementList.begin();
     i != ProgramHeaderStatementList.end(); i++)
{
    SgProgramHeaderStatement* ProgramHeaderStatement = isSgProgramHeaderStatement(*i);
    ...
}
More information about AST Query can be found at "6 Query Library" of the ROSE User
Manual pdf.
AST Traversal
• Through AST traversal: walks through the whole AST in a chosen order (pre-order
or post-order). Post-order traversal is recommended to avoid modifying things the
traversal will hit later on (a problem similar to iterator invalidation in C++).
• The AST traversal provides visit() functions in which to hook up your translation functions.
A switch statement can be used to handle the different types of AST node.
void f2cTraversal::visit(SgNode* n)
{
    switch (n->variantT())
    {
        case V_SgSourceFile:
        {
            SgFile* fileNode = isSgFile(n);
            translateFileName(fileNode);
        }
        break;
        case V_SgProgramHeaderStatement:
        {
            ...
        }
        break;
        default:
            break;
    }
}
More information about AST Traversal can be found at "7 AST Traversal" of the ROSE
User manual pdf online.
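To illustrate why post-order is the safer choice when a visit modifies the tree, here is a minimal sketch on a toy tree rather than on the ROSE AST. Node, postOrder and demoVisitOrder are hypothetical stand-ins, not ROSE classes or functions. Because a node is handed to the visitor only after all of its children, the visitor may rewrite or drop that node's subtree without touching anything the traversal still has to reach.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy tree node; it stands in for an AST node.
struct Node
{
    std::string label;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string l) : label(std::move(l)) {}
};

// Post-order walk: visit every child subtree first, then the node itself.
void postOrder(Node& node, const std::function<void(Node&)>& visit)
{
    for (auto& child : node.children)
        postOrder(*child, visit);
    visit(node);
}

// Builds file -> { loop -> { body }, return } and records the visit order.
std::vector<std::string> demoVisitOrder()
{
    Node file("file");
    file.children.push_back(std::make_unique<Node>("loop"));
    file.children[0]->children.push_back(std::make_unique<Node>("body"));
    file.children.push_back(std::make_unique<Node>("return"));

    std::vector<std::string> order;
    postOrder(file, [&order](Node& n) {
        order.push_back(n.label);
        // Safe in post-order: the subtree below n has already been visited,
        // so dropping it cannot disturb the rest of the walk.
        if (n.label == "loop")
            n.children.clear();
    });
    return order;
}
```

In a pre-order walk the same visitor would remove the body node before the traversal reached it, which is the kind of surprise the recommendation above is meant to avoid.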
The translations you want to do often depend on the types of the AST nodes you visit. For
example, you can define a set of translation functions in your namespace:
• void translateForLoop(SgForStatement* n)
• void translateFileName(SgFile* n)
• void translateReturnStatement(SgReturnStmt* n), and so on
Other tips
• Refer to the ROSE doxygen website for information about each AST node:
http://rosecompiler.org/ROSE_HTML_Reference/index.html
• Use the SageBuilder namespace (http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageBuilder.html)
if you want to create a new AST node. Extend SageBuilder if you cannot find the builder
function you need.
• Look in the SageInterface namespace (http://rosecompiler.org/ROSE_HTML_Reference/namespaceSageInterface.html)
for the translation functions you need. If there is none, write your own function.
• Besides building things from scratch, you can use SageInterface::deepCopy() to copy an AST
subtree.
• Update the information, or create the new AST node you need.
• Replace the existing AST node with your updated or new AST node.
Updating Tree
• You might need to handle some details, like removing symbols and updating parent pointers
and the symbol table.
• Be careful when using deepDelete() and deepCopy(). Some information might not be updated
properly. For example, deepDelete() might not update your symbol table.
Sample translators
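Here we list a few sample translators, which can serve as starting points for the more
sophisticated ones you want.
/*
toy code
by Liao, 12/14/2007
*/
#include "rose.h"
#include <iostream>
using namespace std;
int main(int argc, char* argv[])
{
    // Build the AST from the command-line input files.
    SgProject* project = frontend(argc, argv);
    // ... analysis or transformation of the AST goes here ...
    // Unparse the AST and invoke the backend compiler.
    return backend(project);
}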
This how-to presents the steps for creating a cross-language translator. We will use a
Fortran-to-C translator as an example here.
• Change the output file name. The file suffix has to be changed with the following
function.
void SgFile::set_outputLanguage(SgFile::outputLanguageOption_enum outputLanguage)
• Example: the ROSE AST uses different AST nodes to represent a loop in C and in Fortran. The
following two figures represent the same loop in the two languages.
C uses SgForStatement for its for loops.
Figure 6 C SgForStatement
• If a compiler is available to test the output code, run the backend so that the backend
compiler generates object files.
• If no compiler is available for the target language, make sure the output code can be
generated from the test cases. It is suggested to run the compilation tests for all the
test output.
How to set up the makefile for a translator
In this how-to, you will create a makefile to compile and test your own custom ROSE
translator.
You may want to first look at "How-to install ROSE": ROSE Compiler Framework/Installation
(Chapter 4).
You must have the proper environment variable set so your translator can find librose.so
during execution.
export LD_LIBRARY_PATH=${ROSE_INSTALL}/lib:${BOOST_INSTALL}/lib:$LD_LIBRARY_PATH
#include <rose.h>
20.7.3 Makefile
Here is a sample makefile. If you copy and paste this sample, make sure to replace the leading
spaces of the make rules with leading tabs.
## Your translator
TRANSLATOR=my_translator
TRANSLATOR_SOURCE=$(TRANSLATOR).cpp
#-------------------------------------------------------------
# Makefile Targets
#-------------------------------------------------------------
all: $(TRANSLATOR)
clean:
	rm -rf $(TRANSLATOR) *.o rose_* *.dot
## Your translator
TRANSLATOR=myTranslator
TRANSLATOR_SOURCE=$(TRANSLATOR).cpp
#-------------------------------------------------------------
4 Chapter 5 on page 21
# Makefile Targets
#-------------------------------------------------------------
all: $(TRANSLATOR)
clean:
	rm -rf $(TRANSLATOR) *.o rose_* *.dot
How to debug a translator
It is rare that your translator will just work right after you finish coding. Using gdb to
debug your code is indispensable for making sure it works as expected. This page shows
examples of how to debug your translator.
If the translator is built using a makefile without libtool, the debugging steps are just the
classic steps for using gdb.
• Make sure your translator is compiled with the GNU debugging option -g (see footnote 5) so
there is debugging information in your object code
These are the steps of a typical debugging session:
1. Set a break point
2. Examine the execution path to make sure the program goes through the path that you
expected
3. Examine the local data to validate their values
5 http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
(gdb) print n
$1 = (SgNode *) 0xb7f12008
# Convert a node to its real node type, then call its member functions
#---------------------------
(gdb) print isSgFile(n)->getFileName()
#-------------------------------------
# When displaying a pointer to an object, identify the actual (derived) type of the object
# rather than the declared type, using the virtual function table.
#-------------------------------------
(gdb) set print object on
(gdb) print astNode
$6 = (SgPragmaDeclaration *) 0xb7c68008
ROSE turns on debugging support by default so the translators shipped with ROSE should
already have debugging information available. (Note: the compiler linking will take longer
when debugging support is enabled.)
However, ROSE uses libtool so the executables in the build tree are not real -- they're simply
wrappers around the actual executable files. You have two choices:
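• Find the real executable in the .libs directory, then debug the real executable there
• Use the libtool command line, for example (yourTranslator and input.c are placeholders):
libtool --mode=execute gdb --args ./yourTranslator input.c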
The remaining steps are the same as a regular gdb session with the typical operations, such
as breakpoints, printing data, etc.
Example 1: Fixing a real bug in ROSE
1. Reproduce the reported bug:
$ make check
...
./testVirtualCFG \
--edg:no_warnings -w -rose:verbose 0 --edg:restrict \
-I$ROSE/tests/CompileTests/virtualCFG_tests/../Cxx_tests \
-I$ROSE/sourcetree/tests/CompileTests/A++Code \
-c $ROSE/sourcetree/tests/CompileTests/virtualCFG_tests/../Cxx_tests/test2001_01.C
...
lt-testVirtualCFG:
$ROSE/src/frontend/SageIII/virtualCFG/virtualCFG.h:111:
VirtualCFG::CFGEdge::CFGEdge(VirtualCFG::CFGNode,
VirtualCFG::CFGNode):
Assertion ‘src.getNode() != __null && tgt.getNode() != __null'
failed.
Ah, so we've failed an assertion within the virtualCFG.h header file on line 111:
And the error was produced by running the lt-testVirtualCFG libtool executable translator,
i.e. the actual translator name is testVirtualCFG (without the lt- prefix).
2. Run the same translator command line with Libtool to start a GDB debugging session:
The GDB session has started, and we're provided with a command line prompt to begin our
debugging.
3. Let's run the program, which will hit the failed assertion:
(gdb) r
Starting program: \
${ROSE_BUILD_TREE}/tests/CompileTests/virtualCFG_tests/.libs/lt-testVirtualCFG \
--edg:no_warnings -w -rose:verbose 0 --edg:restrict \
-I${ROSE}/tests/CompileTests/virtualCFG_tests/../Cxx_tests \
-I../../../../sourcetree/tests/CompileTests/A++Code \
-c ${ROSE}/tests/CompileTests/virtualCFG_tests/../Cxx_tests/test2001_01.C
warning: no loadable sections found in added symbol-file
system-supplied DSO at 0x2aaaaaaab000
[Thread debugging using libthread_db enabled]
lt-testVirtualCFG:
${ROSE}/src/frontend/SageIII/virtualCFG/virtualCFG.h:111:
VirtualCFG::CFGEdge::CFGEdge(VirtualCFG::CFGNode,
VirtualCFG::CFGNode): Assertion ‘src.getNode() != __null &&
tgt.getNode() != __null' failed.
4. Print a backtrace to see the call stack at the point of failure:
(gdb) bt
#0 0x0000003752230285 in raise () from /lib64/libc.so.6
#1 0x0000003752231d30 in abort () from /lib64/libc.so.6
#2 0x0000003752229706 in __assert_fail () from /lib64/libc.so.6
#3 0x00002aaaad6437b2 in VirtualCFG::CFGEdge::CFGEdge
(this=0x7fffffffb300, src=..., tgt=...)
at ${ROSE}/../src/frontend/SageIII/virtualCFG/virtualCFG.h:111
#4 0x00002aaaad643b60 in makeEdge<VirtualCFG::CFGNode,
VirtualCFG::CFGEdge> (from=..., to=..., result=...)
at
${ROSE}/../src/frontend/SageIII/virtualCFG/memberFunctions.C:82
#5 0x00002aaaad62ef7d in SgReturnStmt::cfgOutEdges (this=0xbfaf10,
idx=1)
at
${ROSE}/../src/frontend/SageIII/virtualCFG/memberFunctions.C:1471
#6 0x00002aaaad647e69 in VirtualCFG::CFGNode::outEdges
(this=0x7fffffffb530)
at ${ROSE}/../src/frontend/SageIII/virtualCFG/virtualCFG.C:636
#7 0x000000000040bf7f in getReachableNodes (n=..., s=...) at
${ROSE}/tests/CompileTests/virtualCFG_tests/testVirtualCFG.C:13
...
5. Next, we'll move backwards (or upwards) in the program to get to the point of assertion:
(gdb) up
#1 0x0000003752231d30 in abort () from /lib64/libc.so.6
(gdb) up
#2 0x0000003752229706 in __assert_fail () from /lib64/libc.so.6
(gdb) up
#3 0x00002aaaad6437b2 in VirtualCFG::CFGEdge::CFGEdge
(this=0x7fffffffb300, src=..., tgt=...)
at ${ROSE}/src/frontend/SageIII/virtualCFG/virtualCFG.h:111
111 CFGEdge(CFGNode src, CFGNode tgt): src(src), tgt(tgt) \
{ assert(src.getNode() != NULL && tgt.getNode() !=
NULL); }
Unfortunately, we can't tell at a glance which of the two conditions in the assertion is failing.
6. Figure out why the assertion is failing:
Let's examine the two conditions in the assertion:
(gdb) p src.getNode()
$1 = (SgNode *) 0xbfaf10
(gdb) p tgt.getNode()
$2 = (SgNode *) 0x0
Ah, there's the culprit. So for some reason, tgt.getNode() is returning a null SgNode
pointer (0x0).
From here, we used the GDB up command to walk up the call stack and figure out where
the node returned by tgt.getNode() was assigned a NULL value.
We eventually reached SgReturnStmt::cfgOutEdges, which uses a variable called
enclosingFunc. In the source code, there was no assertion to check the value of
enclosingFunc, and that's why the failure only surfaced at an assertion later on in the program. As a side
note, it is good practice to add assertions as early as possible in your source code so that, in times
like this, we don't have to spend time back-tracing unnecessarily.
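The "assert early" advice can be sketched in plain C++. The Node type and both functions below are hypothetical stand-ins invented for illustration; they are not ROSE classes or ROSE API. The only point is that the check sits right next to the call that may return a null pointer, so a failure is reported at its cause rather than several calls later.

```cpp
#include <cassert>

// Hypothetical stand-in for an AST node; this is NOT a ROSE class.
struct Node {
    Node* parent = nullptr;
    bool isFunction = false;
};

// Walk up the parent chain; returns nullptr when no enclosing function exists.
Node* findEnclosingFunction(Node* n) {
    for (Node* p = n; p != nullptr; p = p->parent) {
        if (p->isFunction) return p;
    }
    return nullptr;
}

// Assert right where the value is produced, so a failure points at the cause.
Node* enclosingFunctionChecked(Node* n) {
    Node* f = findEnclosingFunction(n);
    assert(f != nullptr && "statement has no enclosing function");
    return f;
}
```

With a check like this in place, a statement that is not nested in any function would abort inside enclosingFunctionChecked instead of inside a constructor several frames away.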
After adding the assertion for enclosingFunc, we run the program again to reach this new
assertion point:
lt-testVirtualCFG: ${ROSE}sourcetree/src/frontend/SageIII/virtualCFG/memberFunctions.C:1473: \
virtual std::vector<VirtualCFG::CFGEdge,
std::allocator<VirtualCFG::CFGEdge> > \
SgReturnStmt::cfgOutEdges(unsigned int): \
Okay, we're inside of an SgReturnStmt object. Let's set a break point where enclosingFunc
is being assigned to:
SgFunctionDefinition* enclosingFunc =
SageInterface::getEnclosingProcedure(this);
(gdb) p parent->class_name()
$12 = {static npos = 18446744073709551615,
_M_dataplus = {<std::allocator<char>> =
{<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data
fields>}, _M_p = 0x7cd0e8 "SgBasicBlock"}}
How to add a new project directory
Most code development that is layered above the ROSE library starts out its life as a project
in the projects directory. Some projects are eventually refactored into the ROSE library
once they mature. This chapter describes how one adds a new project to ROSE.
A ROSE project encapsulates a complete program or set of related programs that use the
ROSE library. Each project exists as a subdirectory of the ROSE "projects" directory and
should include files "README", "rose.config", "Makefile.am", and any necessary source files,
scripts, tests, etc.
• The "README" should provide an explanation about the project purpose, algorithm,
design, implementation, etc.
• The "rose.config" integrates the project into the ROSE build system in a manner that
allows the project to be an optional component (it can be disabled, renamed, deleted,
or withheld from distribution without changing any ROSE configuration files). Most older
projects lack this file and are thus more tightly coupled with the build system.
• The "Makefile.am" serves as the input to the GNU automake system that ROSE employs
to generate Makefiles.
• Each project should also include all necessary source files, documentation, and test cases.
The "rose.config" file integrates the project into the ROSE configure and build system. At a
minimum, it should contain a call to the autoconf AC_CONFIG_FILES macro with a list
of the project's Makefiles (without the ".am" extension) and its doxygen configuration file
(without the ".in" extension). It may also contain any other necessary autoconf checks that
are not already performed by ROSE's main configure scripts, including code to enable/disable
the project based on the availability of the project's prerequisites.
Here's an example:
Since all configuration for the project is encapsulated in the "rose.config" file, renaming,
disabling, or removing the project is trivial: a project can be renamed simply by renaming
its directory, it can be disabled by renaming/removing "rose.config", or it can be removed
by removing its directory. The "build" and "configure" scripts should be rerun after any of
these changes.
Since projects are self-encapsulated and optional parts of ROSE, they need not be distributed
with ROSE. This enables end users to drop in their own private projects to an existing
ROSE source tree without modifying any ROSE files, and it allows ROSE developers to
work on projects that are not distributed publicly. Any project directory that is not part
of ROSE's main Git repository will not be distributed (this includes not distributing Git
submodules, although the submodule's placeholder empty directory will be distributed).
Each project should have at least one Makefile.am, each of which is processed by GNU
automake and autoconf to generate a Makefile. See documentation for automake for details
about what these files should contain. Some important variables and targets are:
• include $(top_srcdir)/config/Makefile.for.ROSE.includes.and.libs: This
brings in the definitions from the higher level Makefiles and is required by all projects. It
should be near the top of the Makefile.am.
• SUBDIRS: This variable should contain the names of all the project's subdirectories that have
Makefiles. It may be omitted if the project's only Makefile is in that project's top-level
directory.
• INCLUDES: This holds the flags that need to be added during compilation (flags
like -I$(top_srcdir)/projects/RTC/include). Your flags should be placed before
$(ROSE_INCLUDES) to ensure the correct files are found. $(ROSE_INCLUDES) brings in all the necessary
headers from the src directory to your project.
• lib_*: These variables/targets are necessary if you are creating a library from your
project, which can be linked in with other projects or the src directory later. This is the
recommended way of handling projects.
• EXTRA_DIST: These are the files that are not listed as being needed to build the final
object (like source and header files), but must still be in the ROSE tarball distribution.
This could include README or configuration files, for example.
• check-local: This is the target that will be called from the higher level Makefiles when
make check is called.
• clean-local: Provides you with a step to perform manual cleanup of your project, for
instance, if you manually created some files (so Automake won't automatically clean
them up).
Many projects start as a translator, analyzer or optimizer, which takes input code and
generates output.
A basic sample commit which adds a new project directory into ROSE:
https://github.com/rose-compiler/rose/commit/edf68927596960d96bb773efa25af5e090168f4a
Please look through the diffs so you know which files need to be added and changed for a new
project.
Essentially, a basic project should contain
• a README file explaining what this project is about: its purpose, algorithm, design, implementation,
etc.
• a translator that acts as the driver of your project
• additional source files and headers as needed to contain the meat of your project
• test input files
• a Makefile.am to
• compile and build your translator
• contain a make check rule so that your translator is invoked to process your input files
and generate the expected results
To connect your project into ROSE's build system, you also need to
• Add one more subdir entry into projects/Makefile.am for your project directory
• Add one line into config/support-rose.m4 for EACH new Makefile (generated from each
Makefile.am) used by your projects.
Install your project's content to a separate directory within the user's specified --prefix
location. The reason behind this is that we don't want to pollute the core ROSE installation
space. By doing so, we can reduce the complexity and confusion of the ROSE installation
tree, while eliminating cross-project file collisions. It also keeps the installation tree modular.
Example
This example uses a prefix for installation. It also maintains Semantic Versioning6 .
From projects/RosePoly7 :
6 http://semver.org/
7 http://github.llnl.gov/rose-compiler/rose/commit/30323b66bfaf53968f140ac331b37a6732ddf8ab
## |--include # <project>/include
## |--<project>
## |--lib # <project>/lib
librosepoly_la_includedir = ${exec_prefix}/include/rosepoly
$ export PATH=/nfs/apps/doxygen/latest/bin:$PATH
$ doxygen -g
doxygen Doxyfile
...
EXTRACT_ALL = YES
...
# If the value of the INPUT tag contains directories, you can use the
# FILE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp
# and *.h) to filter out the source-files in the directories. If left
# blank the following patterns are tested:
# *.c *.cc *.cxx *.cpp *.c++ *.d *.java *.ii *.ixx *.ipp *.i++ *.inl *.h *.hh
# *.hxx *.hpp *.h++ *.idl *.odl *.cs *.php *.php3 *.inc *.m *.mm *.dox *.py
# *.f90 *.f *.for *.vhd *.vhdl
# [...] subdirectories
# should be searched for input files as well. Possible values are YES and NO.
# If left blank NO is used.
RECURSIVE = YES
...
.PHONY: docs
docs:
doxygen Doxyfile # TODO: should be $(DOXYGEN)
If you are trying to fix a bug (your own or one assigned to you), here are the high
level steps to do the work.
You can only fix a bug when you can reproduce it. This step may be more difficult than it
sounds. In order to reproduce a bug, you have to
• find a proper input file
• find a proper translator: a translator shipped with ROSE is easy to find, but be patient
and sincere when you ask users for a translator they wrote.
• find a similar or identical software and hardware environment: a bug may only appear on a
specific platform when a specific software configuration is used
Possible results for this step:
• You can reproduce the bug reliably. Bingo! Go to the next step.
• You cannot reproduce the bug. Either the bug report is invalid or you have to keep
trying.
• You can reproduce the bug only once in a while (random errors). Oops. This is a rather difficult
situation.
How to add a ROSE commandline option
Once you can reproduce the bug, you have to identify its root cause using a
debugger like gdb.
Common steps involved:
• Simplify the input code as much as possible: it can be very hard to debug a problem with
a huge input. Always try to prepare the simplest possible code that still triggers the
bug.
• Often, you have to use a binary search approach to narrow down the input code: try only
half of the input at a time. Recursively cut the input file into two parts
until no further cut is possible while the bug can still be triggered.
• Forward tracking: a translator usually takes input and generates intermediate
results before the final output is generated. Use a debugger to set break points at each
critical stage of the code to check whether the intermediate results are what you expect.
• Backward tracking: similar to the previous technique, but you start at the point of failure
and trace the problem backwards.
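The binary search over the input can be sketched as follows. This is a minimal illustration under stated assumptions, not a ROSE tool: the input is modelled as a list of lines, and triggersBug is a hypothetical callback that would run the translator on a candidate input and report whether the bug still appears. Real source code usually cannot be cut at arbitrary lines, so in practice the halves have to be chosen so that each one is still valid input.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Lines = std::vector<std::string>;

// Keep whichever half of the input still triggers the bug, until neither
// half triggers it on its own or only one line is left.
Lines shrinkInput(Lines input,
                  const std::function<bool(const Lines&)>& triggersBug) {
    assert(triggersBug(input) && "start from an input that reproduces the bug");
    while (input.size() > 1) {
        const std::size_t half = input.size() / 2;
        Lines first(input.begin(), input.begin() + half);
        Lines second(input.begin() + half, input.end());
        if (triggersBug(first)) {
            input = std::move(first);
        } else if (triggersBug(second)) {
            input = std::move(second);
        } else {
            break;  // the bug needs lines from both halves; stop cutting here
        }
    }
    return input;
}
```

When the loop stops early because both halves are needed, a finer-grained strategy (removing smaller chunks one at a time) would be the natural next step.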
21 Lessons Learned
Lesson:
• A developer tried to understand a staff member's source code. But he found that
the code's indentation was not right for him. So he re-formatted the source files and
committed the changes. Later, the staff member found that his code was changed too
much and he could not read it anymore.
Solution:
• Please don't reformat code you do not own or will not maintain.
Lesson:
• We had a student who was assigned a desk in a deep corner of a big room. The
desk was also far away from other interns. As a result, that student had fewer interactions
with others. He had to solve problems with less help.
Solution:
• Locations MATTER! Sit close to the people you should interact with often. Make your desk/office
accessible to others. A physically isolated office/desk may have a very negative impact on
your productivity.
Lesson:
• Somehow new interns were assigned Mac OS X machines by default. But some of them
may not be familiar with Apple machines or may even dislike Mac OS X's user interface,
including the keyboard, window system, etc. (a love-hate thing for Apple products). So they
felt stuck with an uncomfortable development platform. We had interns who could not
type smoothly on a Mac keyboard even after one month. This is unnecessary.
Solution:
• Provide a choice up front: Linux or Mac OS X. Remind people that they have the freedom
to choose the platform they personally enjoy.
Lesson:
• A developer used different branches of the same git repository to do different tasks: fixing
bugs, adding a new feature, and documenting something. Later on he found that he
could not commit and push the work for one task because the changes for the other tasks were
not ready.
Solution:
• Use separate git repositories for different tasks, so the status of one task won't interfere
with the progress of other tasks.
Lesson:
• ROSE did not depend on the Boost C++ library in the beginning. But later on, some
developers saw the benefits of Boost and advocated for it. Eventually, Boost became
required software for using ROSE.
• But the Boost library has its disadvantages: it is hard to install (just see how many Boost issues appear
on our public mailing list), it lacks backward compatibility (code using an older version of
Boost breaks on new versions), and its huge header files with complex C++ templates slow
down compilation or even break some compilers.
• We still have internal debates about what to do with Boost. It is often a painful and
emotional process.
Solution:
• Introduce big software dependencies very carefully, or you will easily get stuck.
• At least ask people who advocate for a new software dependency to be responsible for
maintaining it for 5 years and, at the same time, to provide an option to turn it off.
Lesson:
• A developer created tests that were too broad, mostly because they were added late in
development. This led to passes that should not have been passes: all tests passed
even though the code had been broken.
Solution:
Keep Code Readable While Coding
• Make sure that tests check results carefully. This is made much easier by making sure
your functions have precisely ONE intention. E.g. if you need to transform data and
operate on the transformed data, split the transformation and the operation into two
functions (at least).
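As a small illustration of the "one intention per function" advice, the sketch below splits a transformation (squaring each element) from an operation (summing). Both functions and their names are made up for this example; the point is that each can be given a precise test of its own.

```cpp
#include <numeric>
#include <vector>

// Step 1: the transformation only.
std::vector<int> squareAll(const std::vector<int>& xs) {
    std::vector<int> out;
    out.reserve(xs.size());
    for (int x : xs) out.push_back(x * x);
    return out;
}

// Step 2: the operation only, applied to already transformed data.
int sum(const std::vector<int>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0);
}
```

A test of squareAll can compare the whole output vector, and a test of sum can use hand-written inputs, so a bug in one step cannot hide behind the other.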
Lesson:
• A developer wrote code without commenting initially, then came back to the code and
had to go through the arduous task of understanding his own unreadable code.
Solution:
• Keep variable and function names meaningful. Do full documentation as you go; do not
leave it for later.
Lesson:
• A developer wrote code without minding the structure. This led to bloated and unreadable
code that had to be refactored several times.
Solution:
• A programmer must code AND design, not just code. Well-structured code is much easier
to read than badly structured code.
Lesson: A developer wrote the code without knowing what the users actually needed. This
led to serious refactoring that could have been avoided, or at least made simpler, if he had
concentrated on the user at all times.
Solution: Whenever possible ask users for their input. It will save you a lot of trouble in the
long run.
Lesson: A developer wrote a rather obtuse component without understanding exactly what
the user might want it for.
Solution: At the very least, check that the input and output are what the user wanted; this
will save much time and aggravation.
21.11 References
http://www.projectsmart.co.uk/lessons-learned.html
22 Testing
22.2 Benchmarks
1 http://jenkins-ci.org/
2 http://en.wikibooks.org/wiki/ROSE%20Compiler%20Framework%2FSPEC%20CPU%202006%20benchmark
3 http://en.wikipedia.org/wiki/NAS_Parallel_Benchmarks
2. Autotools setup
$ cd modena
$ ./build.sh
+ libtoolize --force --copy --ltdl --automake
+ aclocal -I ./acmacros -I ./acmacros/ac-archive -I
/usr/share/aclocal
+ autoconf
+ automake -a -c
configure.ac:4: installing ‘./install-sh'
configure.ac:4: installing ‘./missing'
3. Environment bootstrap
$ source /nfs/apps/python/latest/setup.sh
$ mkdir buildTree
$ cd buildTree
$ ../configure \
--with-sqlalchemy=${HOME}/opt/python/sqlalchemy/0.7.5/lib64/python2.4/site-packages \
--with-target-java-interpreter=java \
--with-target-java-compiler=testTranslator \
--with-target-java-compiler-flags="-ecj:1.6" \
--with-host-java-compiler-flags="-source 1.6"
22.4 Jenkins
23 Git
23.1 Introduction
The ROSE project has been through multiple stages of source content management, starting
with CVS, then Subversion, and now Git.
Git became the official source code version control software for ROSE due to its unique features,
including
• Distributed source code management. Developers can have a self-contained local repository
to do their work anywhere they want, without the need for an active connection to a central
repository.
• Easy merge. Merging using Git is as simple as it can get.
• Backup. Since each clone of our central repository can serve as a standalone repository,
we no longer worry too much about losing the central repository.
• Integrity. The hashing algorithm used by Git ensures that you will get out what you have
put into the repository.
Many other prominent software projects have also switched to Git from other version
control systems, including
• the Linux kernel,
• Perl,
• Eclipse,
• Gnome,
• KDE,
• Android,
• Debian,
• MediaWiki
• http://gcc.gnu.org/git/
• http://darcs.haskell.org/ghc.git/
A more comprehensive list of Git users is given by https://git.wiki.kernel.org/index.php/GitProjects
In summary, Git IS the state-of-the-art for source code management.
GitHub requires git 1.7.10 or later to avoid HTTPS cloning errors, as mentioned at
https://help.github.com/articles/https-cloning-errors
Ubuntu 10.04's package repository has git 1.7.0.4, so building a later version of git is needed.
But you still need an older version of git to get the latest version of git.
Install all prerequisite packages needed to build git from source files (assuming you have already
installed the GNU tool chain with the GCC compiler, make, etc.)
If you're coming from a centralized system, you may have to unlearn a few of the things
you've become accustomed to.
• For example, you generally don't check out a branch from a central repo, but rather
clone a copy of the entire repository for your own local use.
• Also, rather than using small, sequential integers to identify revisions, Git uses a cryptographic
hash (SHA1), although in general you only ever need to write the first few
characters of the hash--just enough to uniquely identify a revision.
• Finally, the biggest thing to get used to: ALL(!) work is done on local branches--there's
no such thing in the DSCM world as working directly on a central branch, or checking
your work directly into a central branch.
Having said that, distributed revision control is a superset of centralized revision control, and
some projects, including ROSE, set up a centralized repository as a policy choice for sharing
code between developers. When a developer works on ROSE, they generally clone from this
central location, and when they've made changes, they generally push those changes back to
the same central location.
Git Convention
Before you commit your local changes, you MUST ensure that you have correctly configured
your author and email information (on all of your machines). Having a recognizable and
consistent name and email will make it easier for us to evaluate the contributions that you've
made to our project.
Guidelines:
• Name: You MUST use your official name you commonly use for work/business, not
nickname or alias which cannot be easily recognized by co-workers, managers, or sponsors.
• Email: You MUST use your email commonly used for work. It can be either your
company email or your personal email (gmail) if you DO commonly use that personal
email for business purpose.
To check if your author and email are configured correctly:
Alternatively, you can just type the following to list all your current git configuration
variables and values, including name and email information.
$ git config -l
All developer branches in the central repository should be named using the following pattern:
• LOGIN-PURPOSE-OPTION
• LOGIN is typically a login name or surname.
• PURPOSE is a single-word description of the type of work performed on that branch,
such as "bugfixes".
• OPTION is information for ROSE robots with regards to your branch.
• -test Changes to the branch are automatically tested
• -rc Changes are tested and if they pass then they're merged into the "master"
branch (like "trunk" in Subversion).
• EXAMPLE:
• The "matzke-bugfixes-rc" branch is "owned" by Robb Matzke (i.e., he's the one that
generally makes changes to that branch), it probably contains only bug fixes or minor
edits, and it's being automatically tested and merged into the master branch for
eventual release to the public.
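The naming pattern can be checked mechanically. The function below is a hypothetical helper, not part of ROSE or Git. It assumes that LOGIN and PURPOSE are single lowercase words and that the OPTION suffix (test or rc) is always present; the text above does not say whether the suffix may be omitted, so treat that as an assumption.

```cpp
#include <regex>
#include <string>

// Assumed reading of the convention: LOGIN-PURPOSE-OPTION, where LOGIN and
// PURPOSE are single lowercase words and OPTION is either "test" or "rc".
bool isConventionalBranchName(const std::string& name) {
    static const std::regex pattern("[a-z0-9]+-[a-z0-9]+-(test|rc)");
    return std::regex_match(name, pattern);
}
```

Under these assumptions "matzke-bugfixes-rc" is accepted, while "master" and "matzke-bugfixes" are rejected.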
It is important to have concise and accurate commit messages to help code reviewers do
their work.
Example commit message, excerpt from link1
* More documentation for the new memory cell, memory state, and X86
register state classes.
• (Required) Summary: the first line of the commit message is a one line summary (<50
words) of the commit. Start the summary with a topic, enclosed in parentheses, to
indicate the project, feature, bugfix, etc. that this commit represents.
• (Optional) Use a bullet-list (using an asterisk, *) for each item to elaborate on the commit
Also see http://spheredev.org/wiki/Git_for_the_lazy#Writing_good_commit_messages.
23.5 Push
Creating and deleting branches on the remote repository is accomplished with git-push.
This is its general form:
Example:
$ git remote -v
1 https://github.com/rose-compiler/rose/commit/801c53d81526e2eae7a68e0eab1a9f21b9892ab2
$ git branch
* master
# Method 1
$ git push origin master:refs/heads/master
23.6 Rebase
It is recommended to rebase your branch before pushing your work, so that your local commits
are moved to the head of the latest master branch instead of being interleaved with
commits from master.
From http://gitready.com/intermediate/2009/01/31/intro-to-rebase.html
Rebase helps to cut up commits and slice them into any way that you want them served
up, and placed exactly where you want them. You can actually rewrite history with this
command, be it reordering commits, squashing them into bigger ones, or completely ignoring
them if you so desire.
Why is this helpful?
• One of the most common use cases is that you’ve been working on your own features/fixes/etc
in separate branches. Instead of creating ugly merge commits for every change
that is brought back into the master branch, you could create one big commit and let
rebase handle attaching it.
• Another frequent use of rebase is to pull in changes from a project and keep your own
modifications in line. Usually by doing merges, you’ll end up with a history in which
commits are interleaved between upstream and your own. Doing a rebase prevents this
and keeps the order in a more sane state.
23.7 References
• http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html
• http://book.git-scm.com/
• http://www.sourcemage.org/Git_Guide (more like a FAQ)
• http://stackoverflow.com/questions/315911/git-for-beginners-the-definitive-practical-g
24 Lattices
24.1 Introduction
Lattices are mathematical structures. They can be used as a general way to express an
order among objects. This order can be exploited in data flow analysis.
Lattices can describe the transformations effected by basic blocks on data flow values, also known
as flow functions.
Lattices can describe data flow frameworks when instantiated as algebraic structures consisting
of a set of data flow values, a set of flow functions, and a merge operator.
24.2 Poset
Partial ordering: ≤
A partial ordering is a binary relation ≤ over a set P which is reflexive, antisymmetric
and transitive, i.e.
• Reflexive: x ≤ x
• Antisymmetric: if x ≤ y and y ≤ x then x = y
• Transitive: if x ≤ y and y ≤ z then x ≤ z
Partial orders should not be confused with total orders. A total order is a partial order but
not vice versa. In a total order any two elements in the set P can be compared. This is not
required in a partial order. Two elements that can be compared are said to be comparable.
A partially ordered set, also known as a poset, is a set with a partial order.
Given a poset there may exist an infimum or a supremum. However, not all posets contain
these.
Given a poset P with set X and order ≤:
An infimum of a subset S of X is an element a of X such that
• a ≤ x for all x in S and
• for all y in X, if for all x in S, y ≤ x then y ≤ a
The dual of this notion is the supremum which has the definition of infimum if you switch
≤ with ≥
If we simply pick an element of X that satisfies the first condition we have a lower bound.
The second condition ensures that we have (if it exists) the unique greatest lower bound.
Similarly for suprema.
A lattice is a particular kind of poset. In particular, a lattice L is a poset P = (X, ≤) in which,
for any two elements a and b of the lattice, the set {a, b} has a join and a meet.
The join and meet operations MUST satisfy the following conditions
• 1) The join and meet must commute
• 2) The join and meet are associative
• 3) The join and meet are idempotent, that is, x join itself or x meet itself are both x.
A poset in which every pair of elements has a meet is a meet-semilattice; one in which every pair
has a join is a join-semilattice. A lattice is both a meet-semilattice and a join-semilattice.
(Definitions obtained from wikipedia with minimal modification)
• Unbounded (infinite): an unbounded lattice does not contain a 0 (bottom) or 1 (top) element, even
though every pair of elements has a greatest lower bound and a least upper bound
in the underlying set. By the definition of an unbounded set X, we know
that given any x in X we can find an x' that is greater than x
(under some ordering, in this case that of the lattice). Similarly for lower bounds.
• Finite/bounded: the underlying set itself has a greatest lower bound and a least
upper bound. For now we will call the greatest lower bound 0 and the least upper bound
1.
• if a ≤ x for all x in L, then a is the 0 element of L, ⊥; recall that this is a unique
element
• if a ≥ x for all x in L, then a is the 1 element of L, ⊤
Meet ∧ is a binary operation such that a ∧ b takes the greatest lower bound of the set {a, b} (this
is guaranteed to exist by the definition of a lattice).
Similarly, Join ∨ returns the least upper bound of the set, guaranteed to exist by the definition
of a lattice.
To recap, a lattice L is a triple (X, ∧, ∨) composed of a set, a Meet function, and a Join
function.
Properties of Meet and Join.
• We write the Meet as ∧ and the Join as ∨
Example: Bit vector Lattices
• Closure: If x and y belong to L, then there exist a unique z and a unique w in L such
that x ∧ y = z and x ∨ y = w
• Commutativity: for all x, y in L, x ∧ y = y ∧ x and x ∨ y = y ∨ x
• Associativity: (x ∧ y) ∧ z = x ∧ (y ∧ z), and similarly for the ∨ operation
• In a bounded lattice there are two unique elements of L called bottom (⊥) and top (⊤), such that for all
x, x ∧ ⊥ = ⊥ and x ∨ ⊤ = ⊤
• Many lattices, with some exceptions, notably the lattice corresponding to constant
propagation, are also distributive: (x ∨ y) ∧ z = (x ∧ z) ∨ (y ∧ z)
Lattices and partial order:
x ≤ y if and only if x ∧ y = x
A strictly ascending chain is a sequence of elements x1, x2, ..., xn of a set X
with the property ⊥ = x1 < x2 < ... < xn = ⊤. The greatest chain is the one whose
final index n is the largest among all strictly ascending chains.
The height of a lattice is defined as the length of the longest strictly ascending chain it
contains.
If a data-flow analysis lattice has a finite height and a monotonic flow function then we
know that the associated data flow analysis algorithm will terminate.
• Example: If the greatest strictly ascending chain of a lattice L is finite and it takes
finitely many steps to reach the top, we can infer that the associated data flow algorithm
terminates.
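The termination argument can be illustrated on a three-bit lattice, where join is bitwise OR and the ordering is bitwise inclusion. The flow function below is an arbitrary monotonic example chosen for this sketch, not one taken from ROSE, and the iteration scheme (joining the old value with the new one) is one common formulation assumed here.

```cpp
#include <bitset>
#include <cstddef>

using BV3 = std::bitset<3>;

// Example monotonic flow function (an assumption for illustration):
// shift the bits left by one and set the lowest bit.
BV3 flow(BV3 x) { return (x << 1) | BV3("001"); }

// Iterate x := x join flow(x) from bottom (000) until nothing changes.
// Every changing step moves strictly up the lattice, so the number of
// changes is bounded by the lattice height minus one.
BV3 fixpoint(std::size_t& steps) {
    BV3 x;  // bottom: 000
    steps = 0;
    for (;;) {
        const BV3 next = x | flow(x);
        if (next == x) return x;
        x = next;
        ++steps;
    }
}
```

Here the iteration visits 000, 001, 011 and 111, which is a strictly ascending chain; because no such chain can be longer than the height of the lattice, the loop cannot run forever.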
(wikipedia used for definitions)
        111
      /  |  \
   110  101  011
    |  X   X  |
   100  010  001
      \  |  /
        000
Here the meet and join operators induce a partial order on the lattice elements:
x is less than or equal to (<=) y if and only if x meet y = x
For BV^3: 000 <= 001 <= 101 <= 111
The partial order on the lattice is:
• Transitive x <= y and y <= z, then x <=z
• Antisymmetric: if x<=y and y<=x, then x = y
• Reflexive: for all x: x<=x:
The height of the lattice is the length of its longest strictly ascending chain:
• The maximal n such that there exists a strictly ascending chain x1, x2, ..., xn such that
• Bottom = x1 < x2 < xn = Top
For BVˆ3 lattice, height = 4
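The height can also be computed mechanically. The sketch below assumes bit vector lattices with Meet as bitwise AND, so that x <= y holds exactly when (x & y) == x. The function names `leq` and `lattice_height` are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Partial order induced by meet: x <= y iff meet(x, y) == x.
bool leq(unsigned x, unsigned y) { return (x & y) == x; }

// Number of elements in the longest strictly ascending chain of the
// bit vector lattice with the given number of bits.
unsigned lattice_height(unsigned bits) {
    assert(bits >= 1 && bits <= 10);   // keep the quadratic search small
    const unsigned n = 1u << bits;
    // longest[y] = length of the longest strictly ascending chain ending at y.
    std::vector<unsigned> longest(n, 1);
    for (unsigned y = 0; y < n; ++y) {
        // Any x strictly below y in the partial order is also numerically
        // smaller, so longest[x] is already final when y is processed.
        for (unsigned x = 0; x < y; ++x) {
            if (leq(x, y)) {
                longest[y] = std::max(longest[y], longest[x] + 1);
            }
        }
    }
    return longest[n - 1];             // chain ending at top (all bits set)
}
```

For three bits the function returns 4, matching the height stated for BV^3.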
24.7 Examples
A function f from L to itself, f: L -> L, is monotonic if for all x, y in L, x <= y ==>
f(x) <= f(y)
Example: f: BV^3 -> BV^3 with f(<x1 x2 x3>) = <x1 1 x3>
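Monotonicity of the example function can be verified by brute force. The sketch below assumes BV^3 encoded as the integers 0..7 with the partial order x <= y iff (x & y) == x. The function `g` is a hypothetical counterexample added for contrast and does not appear in the text.

```cpp
#include <cassert>
#include <functional>

// Partial order on BV^3: x <= y iff meet(x, y) == x (meet is bitwise AND).
bool leq(unsigned x, unsigned y) { return (x & y) == x; }

// The example function: f(<x1 x2 x3>) = <x1 1 x3> (sets the middle bit).
unsigned f(unsigned x) { return x | 0b010; }

// Hypothetical non-monotonic function: bitwise complement within 3 bits.
unsigned g(unsigned x) { return ~x & 0b111; }

// Checks x <= y ==> fn(x) <= fn(y) for every pair of elements of BV^3.
bool is_monotonic(const std::function<unsigned(unsigned)>& fn) {
    for (unsigned x = 0; x < 8; ++x) {
        for (unsigned y = 0; y < 8; ++y) {
            assert(fn(x) < 8 && fn(y) < 8);   // fn must map BV^3 to BV^3
            if (leq(x, y) && !leq(fn(x), fn(y))) return false;
        }
    }
    return true;
}
```

Here `is_monotonic(f)` holds, while `is_monotonic(g)` fails because 000 <= 111 but g(000) = 111 is not below g(111) = 000.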
integer value: ICP
Design the meet as set union (the Or operation): it brings the value down toward the bottom (context
insensitive)
• Design the partial order: <= corresponds to ⊇ (superset)
In between, a partial order: inferior/conservative solutions are lower on the lattice
               Top
            /   |   \
        {v1}  {v2}  {v3}
         |   X     X   |
  {v1, v2}  {v1, v3}  {v2, v3}
            \   |   /
      {v1, v2, v3} = Bottom
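The lattice in the diagram can be sketched with bitmasks. This is an illustration only: it assumes three variables v1, v2, v3, and it assumes that Top is modeled as the empty set, which the diagram does not state explicitly. The names `VarSet`, `meet`, and `leq` are hypothetical.

```cpp
#include <cassert>

// A subset of {v1, v2, v3} encoded as a bitmask, one bit per variable.
using VarSet = unsigned;
constexpr VarSet kV1 = 1u << 0;
constexpr VarSet kV2 = 1u << 1;
constexpr VarSet kV3 = 1u << 2;
constexpr VarSet kTop = 0;                     // assumption: Top is the empty set
constexpr VarSet kBottom = kV1 | kV2 | kV3;    // {v1, v2, v3}

// Meet is set union, so combining information moves toward Bottom.
VarSet meet(VarSet a, VarSet b) {
    assert(a <= kBottom && b <= kBottom);
    return a | b;
}

// a <= b iff meet(a, b) == a, which here means a is a superset of b.
bool leq(VarSet a, VarSet b) { return meet(a, b) == a; }
```

With this encoding, `meet(kV1, kV2)` yields {v1, v2}, and larger sets sit lower in the lattice, in line with the superset ordering.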
196
25 C++ Programming
ROSE is written in C++. Some users have suggested that we list the major C++ programming
techniques used in ROSE, so that C++ beginners can focus their learning on what matters
most.
Design Patterns: ROSE uses some common design patterns
• visitor pattern1 : used to create the AST traversal.
1 http://en.wikipedia.org/wiki/Visitor%20pattern
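For readers new to the visitor pattern, here is a minimal sketch on a tiny, hypothetical expression tree. None of these class names (`Node`, `Number`, `Add`, `Visitor`, `NodeCounter`) come from ROSE, whose own AST node and traversal classes are different. The sketch only shows the double-dispatch idea behind a tree traversal.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Number;
struct Add;

// Visitor interface: one visit overload per concrete node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Number& n) = 0;
    virtual void visit(const Add& n) = 0;
};

// Base class of the hypothetical AST.
struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& v) const = 0;
};

struct Number : Node {
    int value;
    explicit Number(int v) : value(v) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

struct Add : Node {
    std::unique_ptr<Node> lhs, rhs;
    Add(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);   // both operands are required
    }
    void accept(Visitor& v) const override { v.visit(*this); }
};

// A traversal written as a visitor: counts every node in the tree.
struct NodeCounter : Visitor {
    int count = 0;
    void visit(const Number&) override { ++count; }
    void visit(const Add& n) override {
        ++count;
        n.lhs->accept(*this);
        n.rhs->accept(*this);
    }
};

// Builds the tree for 1 + (2 + 3) and returns its node count.
int count_example_nodes() {
    auto tree = std::make_unique<Add>(
        std::make_unique<Number>(1),
        std::make_unique<Add>(std::make_unique<Number>(2),
                              std::make_unique<Number>(3)));
    NodeCounter counter;
    tree->accept(counter);
    return counter.count;
}
```

New analyses can be added as further `Visitor` subclasses without touching the node classes, which is the main reason the pattern suits AST traversals.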
26 Good API Design
Google: "How to Design a Good API and Why it Matters" by Joshua Bloch1
TODO: convert from Markdown
• Easy to learn
• Easy to use, even without documentation
• Hard to misuse
• Easy to read and maintain code that uses it
• Sufficiently powerful to satisfy requirements
• Easy to extend
• Appropriate to audience
1 http://lcsd05.cs.tamu.edu/slides/keynote.pdf
• When in doubt, leave it out. You can always add, but you can never remove.
• Just because you can doesn't mean you should
• Power-to-weight ratio (http://en.wikipedia.org/wiki/Power-to-weight_ratio):
"[A] measurement of actual performance [power / weight]"
Implementation details should not impact the API. Don't let implementation details "leak"
into the API.
Performance
• Design for usability, refactor for performance
• Do not warp the API to gain performance
• Effects of API design decisions on performance are real and permanent:
• Component.getSize() returns Dimension
• Dimension is mutable
• Each getSize call must allocate Dimension
• Causes millions of needless object allocations
26.3.3 "Harmonize"
General Principles
• Provide programmatic access to all data available in string form => no client string
parsing necessary
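• Overload with care: avoid ambiguous overloadings
• Use consistent parameter ordering across methods:
#include <string.h>
#include <strings.h>   /* bcopy (legacy POSIX) */
char *strcpy(char *dest, const char *src);          /* destination first */
void bcopy(const void *src, void *dst, size_t n);   /* source first: bad! */
• Short parameter lists: 3 or fewer. With more, users will have to refer to the docs. Long lists of
identically typed parameters are harmful.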
• Two techniques for shortening parameter lists: 1) break up the method, 2) create a helper class
to hold parameters
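The second technique can be sketched in C++ as follows. The `Rect` type and the functions `area` and `make_banner` are hypothetical names chosen for illustration. They show how a helper class replaces a list of four identically typed parameters.

```cpp
#include <cassert>

// Hypothetical helper class holding what would otherwise be four
// identically typed int parameters: area(int, int, int, int).
struct Rect {
    int x = 0;
    int y = 0;
    int width = 0;
    int height = 0;
};

// Takes one parameter instead of four, so callers cannot silently
// swap, say, width and x.
int area(const Rect& r) {
    assert(r.width >= 0 && r.height >= 0);
    return r.width * r.height;
}

// Call sites set fields by name, which documents intent without
// requiring a trip to the reference manual.
Rect make_banner() {
    Rect r;
    r.width = 640;
    r.height = 80;
    return r;
}
```

A call such as `area(make_banner())` reads clearly at the call site, whereas `area(0, 0, 640, 80)` relies on the reader remembering the parameter order.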
26.3.7 Exceptions
27 Who is using ROSE
We are aware of the following ROSE users (people who write their own ROSE-based tools).
They are the reason for ROSE's existence. Feel free to add your name if you are using
ROSE.
27.1 Universities
27.3 Companies
• Samsung: its research center in San Jose uses ROSE for multicore research and
development.
1 http://ege.ucsd.edu/dokuwiki-page/doku.php?id=didem:projects:mint
2 http://www.cs.uoregon.edu/Research/tau/home.php
28 TODO List
In case this website goes down, how can we download a backup of this wikibook?
How can we set up a mirror wiki website containing the ROSE wikibook?
New chapters may be added without being reflected in the one-page print
version. So periodic synchronization is needed: include the new chapters or rearrange
their order in the one-page print version.
Observations:
• A print version is similar to a source file with included contents, each included chapter
will have a first level of heading
• Because the first level heading (=) is used by the print version page to include all chapters,
all included pages/chapters should NOT contain any first level heading.
With a basic understanding of how this works, you can now edit the print version's wiki
page:
• Print version1
More at: http://en.wikibooks.org/wiki/Help:Print_versions
The PDF version automatically generated from the print version page is rudimentary. It has
no table of contents, pagination, etc.
So we use a manual process to generate a better PDF file. We need to repeat this
process occasionally to keep the PDF file up to date.
Here are the manual steps:
1 Chapter on page 1
• Use your web browser to open the print version and save it to your own computer as "web
page complete"
• Use the HTML-compatible word processor of your choice to open the HTML file, convert
it to the word processor's native format, and paginate the book.
• In Microsoft Word, this can be done by
• opening the saved HTML file
• saving it as a Word file
• adding a table of contents by selecting Insert > Field > Index and Tables > TOC or
Preferences -> Table of contents for Word 2012 or later
• adding page numbers to the footer
• Save it as a PDF file with a name like ROSE_Compiler_Framework.pdf
• Upload it to wikibooks
To add a link to your wikibook page, insert
For example
29 Sandbox
Some common tricks for writing on wikibooks/wikipedia (both use the MediaWiki
software).
The best way is to go to en.wikipedia.org and find a page with the output you want. Then
pretend to edit the page (by clicking edit) to see the source used to generate the output.
For example, suppose you want to know how C++ syntax highlighting is obtained in a wikibook. Go
to en.wikipedia.org and find the page for C++. It should contain a sample code snippet.
Then pretend to edit it to see the source:
http://en.wikipedia.org/w/index.php?title=C%2B%2B&action=edit&section=6
You will see the source code generating the syntax highlighting:
<source lang="cpp">
#include <iostream>
int main()
{
    std::cout << "Hello, world!\n";
}
</source>
Use HTML comments: for example, the following comment will not show up in the
rendered page, but it is visible to editors as a reminder of why things are done in a certain way.
<!-- Please keep the pixel size to 400 so they are clean in the pdf
version, Thanks! -->
[[File:Rose-compiler-code-review-1.png|thumb|400px|Code review using
github.llnl.gov]]
<source lang="cpp">
#include <iostream>
int main()
{
    std::cout << "Hello, world!\n";
}
</source>
#include <iostream>
int main()
{
    std::cout << "Hello, world!\n";
}
You can pretend to edit this section to see how math formulas are written.
More resources are at
• http://en.wikipedia.org/wiki/Help:Formula
• http://www.mediawiki.org/wiki/Manual:Math
Math formula
\sum_{j=1}^{N} (S_{i,j}) = 1

\begin{align}
\log_2(n!) &= \log_2(n) + \log_2(n-1) + \log_2(n-2) + \dots + \log_2(1) \\
           &< \log_2(n) + \log_2(n) + \log_2(n) + \dots + \log_2(n) \\
           &= n \log_2(n)
\end{align}

\begin{align}
z &= a \\
f(x, y, z) &= x + y + z
\end{align}

\operatorname{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-t^2}\,dt = \frac{e^{-x^2}}{x\sqrt{\pi}} \sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{n!\,(2x)^{2n}}
30 Contributors
Edits User
7 Chunhualiao1
91 Doubleotoo2
74 GoblinInventor3
6 Invapid4
1196 Liao5
3 Matzke6
16 Peihunglin7
39 QUBot8
1 QuiteUnusual9
1 http://en.wikibooks.org/w/index.php?title=User:Chunhualiao
2 http://en.wikibooks.org/w/index.php?title=User:Doubleotoo
3 http://en.wikibooks.org/w/index.php?title=User:GoblinInventor
4 http://en.wikibooks.org/w/index.php?title=User:Invapid
5 http://en.wikibooks.org/w/index.php?title=User:Liao
6 http://en.wikibooks.org/w/index.php?title=User:Matzke
7 http://en.wikibooks.org/w/index.php?title=User:Peihunglin
8 http://en.wikibooks.org/w/index.php?title=User:QUBot
9 http://en.wikibooks.org/w/index.php?title=User:QuiteUnusual
List of Figures
1 Peihunglin11 cc-by-sa-3.0
2 Chunhualiao12 cc-by-sa-3.0
3 Chunhualiao13 cc-by-sa-3.0
4 Liao14 cc-by-sa-3.0
5 Chunhualiao15 cc-by-sa-3.0
6 Peihunglin16 cc-by-sa-3.0
7 Peihunglin17 cc-by-sa-3.0
11 http://en.wikibooks.org/wiki/User%3APeihunglin
12 http://en.wikibooks.org/wiki/User%3AChunhualiao
13 http://en.wikibooks.org/wiki/User%3AChunhualiao
14 http://en.wikibooks.org/wiki/User%3ALiao
15 http://en.wikibooks.org/wiki/User%3AChunhualiao
16 http://en.wikibooks.org/wiki/User%3APeihunglin
17 http://en.wikibooks.org/wiki/User%3APeihunglin
31 Licenses