JP Exploiting UML
[Figure 1 diagram. Component labels: pyDot, Selector, Profiler, C++ code, pointcuts, gcc, AspectC++, instrumented executable.]
Figure 1: System architecture. This figure presents the main components of the spidor toolset. External programs and APIs,
such as gcc, are colored blue; generated code, such as the pointcuts, is colored green; and our own code is colored gold. The
C++ code for the program being profiled, shown here in white, is the input to the system.
AST and in Section 3.2 we provide more detail about our use of Dot [Gansner and North 2000]. In Section 3.3 we describe our use of aspects and in Section 3.4 we describe our approach to connecting C++ and Python.

3.1 Interfacing with gcc

The spidor toolset uses the C++ compiler from the GNU Compiler Collection, gcc, as its front end. We had previously used gcc as a basis for C++ program comprehension by instrumenting its parser [Power and Malloy 2002], but this proved difficult to maintain over various gcc releases. Since version 3.0, gcc has begun to develop an internal abstract syntax tree (AST) format, known as GENERIC, which provides a high-level representation of a C++ program [Merrill 2003]. This representation is also reasonably accessible, since it can be generated as a text file using a compiler switch (-fdump-translation-unit-all).

However, most of the GENERIC documentation is in the form of comments in the gcc source code, and some effort is required to disentangle the constructs used. We have written a Python API, pyGast, to facilitate working with the gcc GENERIC output. pyGast provides methods to parse the gcc output to produce a Python representation of the AST, as well as facilities for visitor-based AST navigation. Our pyGast API also builds a representation of the class hierarchy and the call graph, and provides output in text, XML and Dot formats.

3.2 Interfacing with Dot

Visualizing C++ programs statically and dynamically using class diagrams, call graphs and communication diagrams requires the construction of non-trivial graphs. To visualize these graphs we use the Dot tool, part of the Graphviz graph visualization software suite [Gansner and North 2000]. We found the default Dot layout strategy, based on a hierarchical layout, to be the most suitable for each of the graphs created. Since the main edges in all three diagrams were based in some way on method calls, the use of the hierarchical layout helped to clearly present the flow of control through the system.

We utilize an existing program, pyDot, to represent Dot graphs, and both our Selector and Profiler tools build on this to display Dot graphs in a Python canvas. In particular, our Selector tool provides interpolation between the co-ordinate systems of Python canvas objects and Dot graphs to allow interaction with the Dot graph for the purposes of selecting nodes and edges.

Visualization of sequence diagrams is unencumbered by the layout problem inherent in visualizing call graphs and class diagrams, since the messages in a sequence diagram are ordered by time. Thus, we use Dot to visualize our call graph, class and communication diagrams, and we map Tkinter widgets directly onto a canvas to visualize our sequence diagrams.

3.3 Interfacing with AspectC++

In order to create run-time profile data we need to track object creation and destruction, as well as method calls and returns at run-time. To achieve this, we insert probes in the C++ code at appropriate positions, as selected by the user. In order to facilitate using spidor with multiple C++ programs it was important to automate the process of instrumenting the code.

This kind of cross-cutting concern is one of the standard
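The idea in Section 3.3 of attaching probes only to user-selected methods, without editing the method bodies by hand, can be illustrated with a small Python analogue. This is a sketch for intuition only: it uses a Python decorator rather than AspectC++, and the names `probe`, `instrument`, `events` and the toy `Sound` class are hypothetical and not part of spidor.

```python
import functools

events = []  # hypothetical in-memory event log (not spidor's interface)


def probe(func):
    """Wrap func so that each call and each return is recorded as an event."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        events.append(("call", func.__qualname__))
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if the method raises, so calls and returns stay paired.
            events.append(("return", func.__qualname__))
    return wrapper


def instrument(cls, selected):
    """Attach probes only to the methods named in `selected`."""
    for name in selected:
        setattr(cls, name, probe(getattr(cls, name)))
    return cls


class Sound:
    """Toy stand-in for a class in the program being profiled."""

    def load(self):
        return "loaded"

    def play_sound(self):
        self.load()
        return "played"


# Only play_sound is selected, so the nested call to load() is not recorded.
instrument(Sound, ["play_sound"])
Sound().play_sound()
print(events)
```

The selection is kept separate from the class definition, which is the property that makes the instrumentation step easy to automate across different programs.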
4 Static Visualization: The Selector

In this section we describe spidor's Selector tool, whose purpose is to visualize the classes and methods in a C++ program, and to allow the user to navigate through these

4.1 The Class Diagram

The UML class diagram is a popular way of presenting a high-level overview of an object-oriented system, and is increasingly used in programming environments as an aid to code organization and navigation. The edges in a class diagram represent relationships between classes, most notably inheritance. However, associations based on field references, parameters and local variables can also be represented.

Since we are interested in the interactions between methods in the system, we restrict the class diagram edges to just two kinds: those representing inheritance, and those representing dependencies between classes. A class C1 has a dependency on class C2 whenever a method from C1 can call a method from C2. Eliding associations and using Dot to control the layout gives a rather unusual look to the class diagram. Typically a class diagram, especially when used as a design artifact, will use the inheritance relationship as the main layout ordering edge, and use associations to impose secondary orderings. However, we find that basing the layout on dependencies is a significant aid to understanding the flow of method calls, the main source of flow in our project, whose focus is dynamic object modeling. Figure 3 illustrates a partial class diagram for the arcade game that we use as our case study. At the root of the diagram is class AnimationManager, which choreographs the actions of the game. AnimationManager calls methods in classes ExplodingSprite, Sound and Background, with stereotypes attached to each of the edges emanating from AnimationManager to capture this calling relationship.

4.2 The System Call Graph

The second overall view we present is the system call graph. This graph is not part of the UML standard, nor does it typically appear during system design, but is a tool commonly used in program analysis and software testing. In a system call graph the nodes are the methods in the system, and there is a directed edge between two methods m1 and m2 whenever m1 can call m2. Figure 4 illustrates a partial call graph for the arcade game that we use in our case study. In the figure, function main is shown on the left side of the graph with edges leading to the constructor and destructor of AnimationManager, also shown in Figure 3. Figure 4 highlights the fact that all of the actions of the arcade game are directed from AnimationManager.

A theme of our work is that we seek to exploit the widely used UML notations as an aid to program visualization. However, despite the importance of method interactions, there is no UML diagram that completely meets the task of representing these interactions. Elements of the call graph can be derived from UML sequence diagrams, activity diagrams and communication diagrams, yet none of these presents the interactions between methods as clearly as the call graph.

The system call graph presents all methods in the system, and can be quite complicated, even for relatively small C++ programs. An added advantage of using Dot to lay out the graph is that it exposes the ranking between methods in terms of the ordering of method calls, and this provides a good overall view of the hierarchy of interactions. Fixing a coloring scheme for classes and using this coloring for the nodes in the call graph was a significant aid to comprehension of the interactions among the methods in our model.

The system call graph acts as a selector for method calls. We adopt the convention that selecting a node implies profiling all calls to that method, whereas selecting an edge implies only profiling calls from the source method to the destination method. The class diagram acts as a higher-level selector, where selecting a class implies the selection of each of its methods in the system call graph.

4.3 Method and Class Call Graphs

One drawback of the high-level view presented by the system call graph is that it can be difficult to distinguish edges for methods that call, or that are called by, many others. To deal with this we have implemented lower-level selectors in both the class diagram and the call graph. Right-clicking on a node in the system call graph presents a detail of the graph, showing just the immediate predecessors and successors of
that node. The reduced information in this method call graph can be laid out to clearly distinguish between the method's individual callers and callees. The window at the bottom of Figure 5 illustrates a method call graph where the user has right-clicked on method Sound::playSound(); the figure shows the methods that call Sound::playSound() as well as the methods called by Sound::playSound().

Similarly, clicking the middle mouse button on a class in the class diagram or a method in the call graph produces a class call graph. Here, we display the methods in the class, along with the immediate predecessors and successors of these methods from the call graph. These methods are then grouped based on their class, giving a fine-grained view of the dependencies between classes. The same coloring and selection conventions are used for all call graphs, and all are kept synchronized, so that a selection in any one is propagated to all the others. The rightmost window in Figure 5 also shows a class call graph where the user has clicked the middle mouse button on a method in class Sound; all of the methods that call methods in Sound are shown, as well as the methods called from within class Sound.

Clicking with the middle or right mouse buttons on a class or method call graph replaces that call graph with the new one. Thus the user can display the method call graph for a method, and then follow the chain of calls one by one by repeatedly right-clicking on one of the successor methods. At any stage the user can click the middle mouse button on a method and zoom out to the class call graph for that method's class.

At present the Selector provides a single class diagram and system call graph, and allows for any number of pop-up class or method call graphs. Note, however, that there is some overhead in maintaining consistency between selections in all these diagrams, and performance can degrade if many views have to be maintained. In practice, we do not envisage a need for more than one or two class or method call graphs at any one time.

5 Dynamic Visualization: The Profiler

The output of the Selector tool is an instrumented version of the original C++ program, which is designed to interact with a simple Python interface at run-time. The Profiler tool described in this section implements this interface, and displays run-time information about a program in terms of UML sequence and communication diagrams.

The design of the Profiler is based loosely on the traditional command-line debugger. The user launches the program from the Profiler, and can then step through its execution at a chosen level of granularity. Unlike a traditional debugger, however, the user does not step through the actual source code, but rather the sequence diagram corresponding to the program's execution. It is not intended that the Profiler would be an alternative to a traditional debugger, but rather a complementary tool, since it concentrates on visualizing method interactions, rather than the details of method execution.

The main window of the Profiler is shown in Figure 6. The large red 'step' button is used to invoke and return control to the C++ program being profiled, and the slider bar
controls the number of events processed between each step. In this context, an event is an object creation or destruction, or the start or end of a method call. The Profiler interacts synchronously with the program being profiled. The program is run for the designated number of steps, and then pauses while control is returned to the Profiler so that the user can examine the sequence diagram.

5.1 The Sequence Diagram

The sequence diagram uses the standard UML notation, where each individual object is represented by a vertical bar, with the object name and class at the top. Each 'thickening' of the bar represents a method belonging to that object being executed. Method calls from one object to the next are represented by horizontal arrows between the relevant object bars. Thus, the sequence diagram increases in width with each object creation, and increases in length with each method call. A large 'X' at the end of an object's line denotes the destruction of that object.

Navigation within the sequence diagram can be achieved using either the slider bars or the page navigation controls on the panel. As with the Selector tool, color is used to identify objects from the same class. The color used in the Profiler is the same as that used in the Selector tool, and is relayed to the Profiler as part of the instrumentation.

Since the sequence diagram can quickly grow quite large, two features were implemented to reduce its width and height. Since a destroyed object has no further use for its horizontal area, newly created objects are always positioned in the leftmost free lane in the diagram. In order to reduce the height of the diagram, the user can choose to ignore any of the methods or objects at run-time. The 'Method filter' button displays a pop-up list of the methods being profiled, and the user can choose to enable or disable tracking of individual methods during the program's run. The 'Object filter' button works analogously for objects.

5.2 The Communication Diagram

The UML communication diagram is a more traditional graph, where the nodes correspond to individual objects, and an edge denotes one or more method calls between the objects. While the communication diagram could be maintained in parallel with the sequence diagram, in practice this imposes a noticeable overhead on profiling. Also, the communication diagram quickly becomes incomprehensible for large volumes of data, since it does not follow the more restrictive layout of the sequence diagram, where the most recent events appear at the bottom, and the newest objects appear on the right side of the diagram.

The approach taken in our tool is to provide the communication diagram on request. Thus, whenever the program is stopped and the user is investigating the sequence diagram, they can also press a button to produce the communication diagram for that point in the program. Since communication diagrams can quickly become unmanageable, our dynamic profiler will only present the last 50 events. As is the case for the Selector, an interface to the Dot tool is used to provide graph layout information. Figure 7 shows a profiling session with a communication diagram in the foreground.

We use the standard UML numbering system for com-
Figure 7: The Communication Diagram. The communication diagram, shown here in the foreground, can be launched at any
point during profiling. It can show the last 50 events; the slider bar allows the user to step through these events one by one.
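The on-request construction of a communication diagram from the last 50 events can be sketched as follows. This is a minimal illustration in Python, not the Profiler's code: the event tuple layout `(kind, sender, receiver, method)`, the function names `communication_edges` and `to_dot`, and the flat `1, 2, 3, ...` message numbering are all assumptions made for the example. Only the window size of 50 and the use of Dot for layout come from the text.

```python
from collections import deque

MAX_EVENTS = 50  # window size stated in the text


def communication_edges(events, window=MAX_EVENTS):
    """Group the method calls among the last `window` events by object pair.

    Each event is an assumed tuple (kind, sender, receiver, method), where
    kind is "call" for a method call. Returns a dict that maps
    (sender, receiver) to a list of numbered message labels.
    """
    recent = deque(events, maxlen=window)  # keeps only the newest events
    edges = {}
    for number, (kind, sender, receiver, method) in enumerate(recent, start=1):
        if kind != "call":
            continue  # creations, destructions and returns add no edge here
        edges.setdefault((sender, receiver), []).append(f"{number}: {method}")
    return edges


def to_dot(edges):
    """Render the edges as Dot source, one edge per communicating object pair."""
    lines = ["digraph communication {"]
    for (sender, receiver), labels in edges.items():
        label = "\\n".join(labels)  # a literal \n is a line break in a Dot label
        lines.append(f'  "{sender}" -> "{receiver}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    trace = [
        ("call", "main", "am:AnimationManager", "run"),
        ("call", "am:AnimationManager", "s:Sound", "playSound"),
        ("return", "s:Sound", "am:AnimationManager", "playSound"),
        ("call", "am:AnimationManager", "s:Sound", "playSound"),
    ]
    print(to_dot(communication_edges(trace)))
```

Because one edge carries every call between a pair of objects, repeated calls add labels rather than edges, which keeps the graph small at the cost of longer labels. Building the graph only when requested avoids updating it on every event.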
1. The design of a system for dynamic selection and visualization of objects.

2. An examination of the pragmatics of our system through the construction of a toolset, spidor, which reduces the cognitive burden on the user by providing static selection of classes and methods and dynamic selection of objects and messages.

3. The visualization of sequence and communication diagrams illustrating objects and messages of interest to the user.

4. Dynamic interaction with both the application and the profiler. Interaction with the application enables the user to supply input to the application to provide direction and enhance comprehension or debugging. Interaction with the profiler enables filtering of methods and objects for increased cognitive economy.

Our visualization project is ongoing and our future work will take the following directions. We plan to compare the efficacy of sequence diagrams and communication diagrams in reasoning about large C++ applications. To apply our approach to large applications we will investigate techniques to reduce the size of sequence diagrams by collapsing repeating sequences of messages into a single sequence [Jerding et al. 1997]. We will also investigate reducing the size of the communication diagrams using techniques such as those described in [Jacobs and Musial 2003]. We also plan to incorporate more debugging facilities into spidor, permitting the user to set breakpoints during execution of the application. Finally, we plan to investigate the use of spidor-generated design artifacts to validate design artifacts constructed earlier in the life cycle.

References

Ambler, S. W. 2004. The Object Primer, third ed. Cambridge University Press.

Cook, S., and Brodsky, S. 1999. OMG analysis and design PTF, UML 2.0. In Request for Information, Response from IBM Corporation.

Gansner, E. R., and North, S. C. 2000. An open graph visualization system and its applications to software engineering. Software: Practice and Experience 30, 11 (September), 1203–1233.

Gibbs, T. H., and Malloy, B. A. 2003. Weaving aspects into C++ applications for validation of temporal invariants. In 7th European Conference on Software Maintenance and Reengineering, 249–258.

Jacobs, T., and Musial, B. 2003. Interactive visual debugging with UML. In ACM Symposium on Software Visualization, 115–122.

Jerding, D. F., Stasko, J. T., and Ball, T. 1997. Visualizing interactions in program executions. In International Conference on Software Engineering, 360–370.

Jones, J. A., Orso, A., and Harrold, M. J. 2004. Gammatella: Visualizing program-execution data for deployed software. Information Visualization 3, 3, 173–188.

Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Lopes, C., Loingtier, J., and Irwin, J. 1997. Aspect-oriented programming. In European Conference on Object-Oriented Programming, 220–242.

Knapen, G., Lague, B., Dagenais, M., and Merlo, E. 1999. Parsing C++ despite missing declarations. In 7th International Workshop on Program Comprehension.

Lohmann, D., Blaschke, G., and Spinczyk, O. 2004. Generic advice: On the combination of AOP with generative programming in AspectC++. In 3rd International Conference on Generative Programming and Component Engineering.

Mahrenholz, D., Spinczyk, O., and Schröder-Preikschat, W. 2002. Program instrumentation for debugging and monitoring with AspectC++. In International Symposium on Object-oriented Real-time distributed Computing, 249–256.

Malloy, B. A., Gibbs, T. H., and Power, J. F. 2003. Decorating tokens to facilitate recognition of ambiguous language constructs. Software: Practice and Experience 33, 1, 19–39.

Merrill, J. 2003. GENERIC and GIMPLE: A new tree representation for entire functions. In First Annual GCC Developers Summit, 171–180.

Murphy, G. C., Notkin, D., Griswold, W. G., and Lan, E. S. 1998. An empirical study of static call graph extractors. ACM Transactions on Software Engineering and Methodology 7, 2, 158–191.

Orso, A., Jones, J., and Harrold, M. J. 2003. Visualization of program-execution data for deployed software. In ACM Symposium on Software Visualization, 67–76.

Power, J. F., and Malloy, B. A. 2002. Program annotation in XML: a parser-based approach. In Working Conference on Reverse Engineering, 190–198.

Printezis, T., and Jones, R. 2002. GCspy: an adaptable heap visualisation framework. In Conference on Object-Oriented Programming Systems, Languages and Applications, 343–358.

Reiss, S. P. 2003. Visualizing Java in action. In ACM Symposium on Software Visualization, 57–65.

Rossum, G. V. 2003. An Introduction to Python, first ed. Network Theory Ltd.

Rumbaugh, J., Jacobson, I., and Booch, G. 1999. The Unified Modeling Language Reference Manual. Object Technology Series. Addison-Wesley.

Selic, B. 2004. UML 2.0: Exploiting abstraction and automation. Software Development Times Issue 98 (March 15).