xDoc: An Extensible Documentation Generator. R.B. Vermaas, rbvermaa@cs.uu.nl. MSc Thesis INF/SCR-03-41
Center for Software Technology, Institute of Information and Computing Sciences, Utrecht University, P.O. Box 80089, 3508 TB, Utrecht, The Netherlands. 17th February 2004
Abstract

Documentation is an important part of any software system. It is often generated automatically from in-source documentation comments. There are already many documentation generators, but they lack the ability to adapt easily to language extensions or to add new visualizations to the documentation. Often when a new language is designed, a completely new documentation generator is developed for it, even though the documentation generated is often very similar. This situation calls for a documentation generator that allows easy adaptation and extension. The result of my thesis work is xDoc, an extensible documentation generator. xDoc can be extended with new languages, new visualizations and different documentation comment syntax. By separating the language-specific tasks from the generic tasks, we have created a package in which the documentation generator developer only has to focus on the language-specific tasks. The documentation generated by xDoc contains API documentation and a code browser.
Preface
In October 2002, I had to start thinking about my thesis project. I had no idea what to do my master's thesis on. I had become infected with the Stratego virus when I followed the courses Software Generation and High Performance Compilers. Stratego revitalized my enthusiasm for programming, transforming programming from a necessary evil into something I really enjoy doing. So I simply stepped into Eelco Visser's office to ask if he had a nice project related to Stratego for me. There were several projects available, but only one of them was what I was looking for: something that would have practical value. This thesis is the result of that meeting in October.
Acknowledgements
Obviously I have to thank Eelco Visser for handing me the opportunity to do this project and for letting me do whatever I wanted. Someone I have to thank as much as Eelco is Martin Bravenboer. Without Martin's work, xDoc would have looked completely different: not only because of the tools he provided, but also because of the discussions we had in the ST-lab; his advice and comments on xDoc were greatly appreciated. Then there are the other people from the ST-lab. I have been working quite a long while on this thesis, most of that time in the ST-lab, so thanks to everyone for the talks, music and coffee. The ST-lab has proven to be the loudest, messiest and coolest lab of the faculty. Special thanks goes out to Arthur, as he has probably used xDoc the most, resulting in many bug reports and suggestions that helped make xDoc better.

Mushy stuff

Now on to the mushy part (is that a word?); skip this if you are not a fan of personal acknowledgement sections in a thesis. In the last few years, my best friends and I have formed aNGooSe. I know I would not normally say these kinds of things, but thanks for your support; you are the best group of best friends I could wish for, seriously. Special thanks goes to my parents for their seemingly unlimited support during my studies. I have stretched my academic studies to a period longer than seven years, also stretching
their patience. Especially in the last half year, in which not everything went according to plan, you have helped me so much. Edi and Renate, thank you for all your support, I love you. The most important person I would like to thank is my fiancée. We have been together for almost four years. Living so far apart from each other was very difficult, especially in the last year. But soon we can finally be together normally. Lizi, thank you for the best years of my life and for making me a better person. I love you.
Contents
1 Introduction
    1.1 Software documentation
        1.1.1 Documentation generation
        1.1.2 Existing tools
        1.1.3 Quality of documentation
    1.2 Motivation
2 xDoc in a nutshell
    2.1 xDoc, the documentation
        2.1.1 Layout
    2.2 xDoc, the documentation generator
    2.3 xDoc, the extensible documentation generator
        2.3.1 xDoc overview
        2.3.2 Configuration
        2.3.3 Tools
3 Fact extraction
    3.1 Techniques for fact extraction
    3.2 Fact extraction in xDoc
    3.3 Documentation comments
        3.3.1 Instrumenting a language with documentation comments
        3.3.2 Normalizing comments
    3.4 xDocInfo
        3.4.1 Describing xDocInfo
        3.4.2 Using xDocInfo
    3.5 Stratego to xDocInfo
    3.6 Matching Uses with Definitions
    3.7 Gathering code examples
4 HTML generation in Stratego/XT
    4.1 General
    4.2 xml-tools
5 API documentation
    5.1 Overview of generated API documentation
6 Code browser
    6.1 General
    6.2 Transforming AsFix
        6.2.1 Syntax highlighting
        6.2.2 Anchors and hyperlinks
        6.2.3 Concrete syntax visualization
7 Visualizations in xDoc
    7.1 Visualization components for xDoc
    7.2 Graphs
        7.2.1 Import graphs
        7.2.2 Class diagrams
        7.2.3 XTC composition graph
    7.3 Analyzers
        7.3.1 Unused definition detection
8 xDoc in retrospect
    8.1 Conclusion
        8.1.1 Metrics
    8.2 Future work
A Appendix
    A.1 Stratego/XT: preliminaries
        A.1.1 ATerm
        A.1.2 Stratego
        A.1.3 SDF
        A.1.4 Generic pretty printing
        A.1.5 XTC
    A.2 Fact extraction
        A.2.1 Syntax definition for xDoc comments
    A.3 HTML generation in xDoc
        A.3.1 Standard layout
        A.3.2 Indices
        A.3.3 Left indices
        A.3.4 Experiences
    A.4 API
        A.4.1 Implementation of module level API documentation
    A.5 Visualization
        A.5.1 xdoc-import
        A.5.2 java-class
        A.5.3 xtc-graph
Bibliography
Chapter 1
Introduction
The research of this thesis is part of the Software Engineering research area. Specifically, we want to address the issues around the generation of source code documentation.
1.1 Software documentation
In this thesis we focus on the documentation that is needed in the software development process. An important part of a software system is its documentation. Good technical documentation is necessary to understand the workings of a system, especially for programmers who are new to it. Probably the most used type of documentation is API documentation. API documentation gives an abstract view of the structures defined in the source code of a system, and their appearance to the outside.

Documentation can be either textual or graphical. Most documentation is textual, such as API documentation. Graphical documentation is mainly used to visualize a software system and all sorts of relations between parts of the system, for example using graphs.

Documentation can also be either static or dynamic. By dynamic, we mean that it is interactive. API documentation is a good example of static documentation. An example of interactive documentation is a visualization in which you can leave out the parts you are not interested in.

1.1.1 Documentation generation
Documentation can be created manually or automatically. Manual documentation creation has some disadvantages: it takes a lot of time and it is not always consistent with the source code. Automatically generated documentation, by contrast, is consistent with the source code and does not take much time to produce; it can also be regenerated repeatedly. Another advantage of automatic generation is that the documentation is more structured and consistent, which makes it a lot clearer and easier to browse through.
Manual input by the programmers remains necessary though, because there are no systems available yet that produce natural language explaining what a piece of code does and how it should be used. Therefore we want to combine automatic and manual documentation in a documentation generation system. The manual input of documentation is often done using documentation comments. Documentation comments are structured, parseable comments containing information for documentation purposes.

Documentation generators are also useful because they take the dirty work of presenting the documentation out of the hands of the developer. Developers are generally quite lazy when it comes to documentation; they do not like the task of documenting the software they write. Documentation generators therefore allow developers to focus solely on writing the documentation, without having to think about its presentation.

1.1.2 Existing tools
When we look at the existing systems and tools for documentation generation, we can divide them into two groups.

Documentation in-the-small

The first group consists of tools that focus on the task of automated documentation generation itself, resulting in code documentation. Examples of these tools are Javadoc [5], Doc-O-Matic [1] and Doxygen [3]. These tools typically take the system's source code, parse out in-source documentation comments, and transform them to a selected output format.

In 1984 Knuth proposed the concept of literate programming [13]. The original idea of literate programming was to make significantly better documentation of programs by concentrating on explaining to human beings what we want a computer to do. The programmer strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding. Javadoc and systems like it also use a form of literate programming, although these systems have weakened the original idea by using documentation comments, choosing a form that is partly based on human understanding and partly on computer understanding.

There are a lot of small documentation generators like Javadoc, supporting various languages and various output formats. But the features that are almost always lacking are the ability to add extra visualizations or features to the documentation, and facilities for browsing the code in a user-friendly way.

Documentation in-the-large

The second group focuses more on the concept of program understanding. Examples of these tools are DocGen [2], the Rigi System [6], the Software Bookshelf [7] and Imagix 4D [4]. This group is the most interesting, as the goal of documentation is the understanding of a system, and for this purpose standard documentation such as API documentation is insufficient.
We would like to add more useful information to the documentation, for example about architectural aspects.
DocGen has been developed by the Software Improvement Group. It is based on research done at CWI [10]. One of DocGen's main focuses is documentation in-the-large: the mastery of the structural complexity of large software systems. Kuipers and van Deursen present a method for building documentation generators for systems in the Cobol domain. Besides automated documentation and visualization generation, the system also allows manual documentation writing. However, it is not possible for a programmer to add new visualizations or views of the system. To extract facts from the systems, they use a technique called island grammars. An island grammar precisely defines only small portions of the syntax of a language; the rest of the syntax is defined imprecisely, for instance as a list of characters or a list of tokens. The goal is faster and more robust parsing.

In DocGen the documentation generation process is divided into a front-end and a back-end. The front-end extracts the facts from the system, which are stored in a fact repository. From this fact repository all the documentation and visualization is generated in the back-end.

Just as with DocGen, Wong et al. focus on documentation in-the-large [24]. Their goal was to aid in understanding architectural aspects of software. They developed the Rigi System, which extracts facts from a system, queries these facts, and presents them using a graph editor. Several views of a legacy system can be inspected with this graph editor, and the graph editor makes it possible for a programmer to develop new views on the system.

Finnigan et al. [7] introduced the concept of a software bookshelf to recapture, re-document, and access relevant information about a legacy software system for re-engineering or migration purposes. It is an IBM initiative building upon the Rigi experience. Three roles are distinguished in the software bookshelf concept. The builder constructs extraction tools.
The librarian populates a repository with meaningful information, using the builder's tools or other means. The patron is the end user of the bookshelf. They have also integrated the Rigi System into the software bookshelf, but keep the presentation in a form that is viewable with a web browser, just like the DocGen project. The systems described mainly focus on the visualization part of documentation generation.
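The documentation comments that tools like Javadoc extract take the form of structured comments with tagged fields. The following is a hedged illustration in Java; the class and method are invented for this example and are not taken from any of the tools above:

```java
/** Small utility class used only to illustrate documentation comments. */
public class StringUtils {

    /**
     * Repeats the given text a number of times.
     * The free-form sentence above is aimed at human readers, while the
     * tagged lines below are parsed by the documentation generator.
     *
     * @param text  the text to repeat
     * @param count how many times to repeat it; must be non-negative
     * @return the concatenated result, empty when count is 0
     */
    public static String repeat(String text, int count) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < count; i++) {
            result.append(text);
        }
        return result.toString();
    }
}
```

A generator turns the tagged part (`@param`, `@return`) into structured API documentation fields, while the free text is left for human readers — exactly the mix of human- and computer-oriented content that weakens Knuth's original literate programming idea.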
1.1.3 Quality of documentation
When we want to generate documentation, we have to wonder what good documentation is. There is no easy answer to this question; it all depends on the purpose of the documentation and the needs of the users of the generated documentation. There are some guidelines we can follow. According to van Deursen and Kuipers [10], documentation should adhere to the following four criteria, which they have tried to apply in DocGen.

- Documentation should be available on different levels of abstraction.
- Documentation users must be able to move smoothly from one level of abstraction to another, without losing their position in the documentation.
- The different levels of abstraction must be meaningful for the intended documentation users.
- The documentation needs to be consistent with the source code at all times.

The criteria advised by van Deursen and Kuipers are consistent with a publication by Forward and Lethbridge [11] on the results of a survey held in 2002 among software professionals. The goal of this survey was to uncover the perceived relevance of software documentation. Even though this was quite a small survey, some interesting facts came to light. According to the results, the developers questioned identified some properties which they liked or which they thought should be in good documentation: documentation should have a proper navigation mechanism, and the tools used to generate documentation should be easy to use. Other important aspects identified were quality of the content, being up to date, good availability, the use of code examples, and a clear organization and structure. An interesting fact is that the developers did not really mind the exact format of the documentation. Based on this survey, Forward and Lethbridge conclude that technologies for documentation should strive for ease of use and for lightweight, disposable solutions. A documentation generator should generate documentation that adheres to all the mentioned properties of good documentation.
1.2 Motivation
As we have discussed in this chapter, there are already quite a few documentation generators, sometimes supporting a variety of different languages. Still, every time a new language is introduced, a new documentation generator is often written instead of reusing existing ones, even though the documentation generated by the various documentation generators for other languages has a lot in common. This happens mainly because the existing documentation generators lack a well-defined process of documentation generation. Every time a language is extended, the time between the extension and a documentation generator that supports this extension is far too long, as was for example the case with Java generics.

Often programmers want to have visualizations of the programs in the documentation, giving better insight into the structure of the documented system. Strangely enough, many documentation generators do not offer the possibility to add new visualizations to the documentation.

The situation sketched above leads to the necessity of a documentation generator that allows easy adaptation and extension, which brings us to the main goal of this project:

Identify the phases in the process of documentation generation and implement these in a system that is extensible in several aspects, most importantly the ability to support multiple languages.
In our specific situation we were in need of a documentation generator for the languages used in Stratego/XT: SDF and Stratego. Every student at our institute who has been in touch with the Stratego language can probably remember the old procedure of finding strategies or rules in the Stratego Standard Library (SSL): going to the directory containing the source of the SSL and, using a mixture of name guessing and some lexical pattern, running grep to hopefully find what we were looking for. Those were the days. But these days are over. The result of this thesis is xDoc, an extensible documentation generator, which supports the following extensions:

- Instantiate new languages
- Use a new or custom comment style
- Generate new visualizations and analyses

xDoc has shown its extensibility, as there are already instantiations for the following languages:

- Stratego
- SDF
- Java¹
1 Not
Chapter 2
xDoc in a nutshell
In this chapter we give an overview of xDoc: its overall structure, features, architecture and tools, without going too deep into the implementation details, which are discussed in Chapters 3, 5 and 6. xDoc owes its name to what it is designed to be: an extensible documentation generator. To understand what xDoc is all about, we have to look at its stakeholders. In the next few sections we identify these stakeholders and show what interesting aspects xDoc has to offer to each of these groups:

- Users of the documentation
- Users of the documentation generator
- Developers of documentation generators
2.1 xDoc, the documentation
The first group of importance to xDoc is the group for which we generate documentation: the users of the documentation. To describe what xDoc has to offer this group is to describe the output of xDoc, the documentation that is generated. Documentation generated by xDoc mainly consists of HTML files and images from visualizations. Using hypertext to present software documentation has been proposed in [17, 8]. It offers a natural way of presenting, for example, def-use relationships between documented modules. The success of several other documentation generators outputting HTML as their main format, and the fact that HTML has become one of the most popular languages for presenting information in general, is the reason why we have chosen HTML as our output format.

In xDoc we use a simple model for documentation, defining four different levels for which documentation is generated. The highest level, global, is a set of packages; a package is a set of modules; and a module is a set of definitions. The notion of module is not necessarily related to a module system in a language. In the case of Stratego it is, but, for example, in Java a class is a module, or in other
languages it could be a file. In a language there are typically several kinds of definitions: in Stratego, for example, overlays, constructors, strategies, rules and dynamic rules; Java has methods and fields, among other things.

xDoc generates API documentation. The API documentation offers a user an abstract view of the structures in the documented system at various levels. The API documentation generated by xDoc has several nice features which make it easy to browse through and search the documented system. Alphabetical indices are generated for modules and for each kind of definition. Another feature of interest is the statistics shown in the API documentation. In Figure 2.1 we can see that the following statistics are shown: lines of code, number of modules, and number of definitions divided by kind of definition, each of which shows what percentage of these modules or definitions is documented using an xDoc comment.

xDoc also generates a hyperlinked view of the source code, which we call a code browser. A small piece of hyperlinked source code is shown in Figure 2.2. Syntax highlighting is also supported, which should help to clarify the code and its structure. The user has access to fully browsable source code of the system, including hyperlinked cross-references throughout the sources of the packages. Access to the source code in a user-friendly way is, in our opinion, very important for software documentation. It gives the user the opportunity to easily go to the source code in case the API documentation is lacking or insufficient. It can also be a good tool for code review, as it has a user-friendly interface. There are two source code views in the code browser: one that shows the code with its original layout, and one which shows the same code pretty-printed. Both have their advantages, so we have chosen to allow viewing both versions.
The xDoc code browser also supports visualization of the inner workings of a module which is defined using concrete object syntax, as described in [22]. The pretty-printed code view can be very useful, as it shows the code as it is transformed into terms of the meta-language. But in some cases we can even show the inner workings of the concrete object syntax in the code view with original layout, as can be seen in Figure 2.3.
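The four-level documentation model described in this section (global, package, module, definition) can be expressed as a simple nested data structure. The Java sketch below is only an illustration of that model; the type and method names are invented for the example, and this is not xDoc's actual internal representation:

```java
import java.util.List;

// Illustration of the documentation model: a global documentation set is
// a set of packages, a package a set of modules, and a module a set of
// definitions, where each definition has a language-specific kind
// (e.g. "strategy" or "rule" in Stratego, "method" or "field" in Java).
record Definition(String kind, String name) {}
record Module(String name, List<Definition> definitions) {}
record Package(String name, List<Module> modules) {}
record Global(List<Package> packages) {}

class ModelDemo {
    // Count all definitions of one kind across the whole documented
    // system — the sort of figure a per-kind statistics page reports.
    static long countKind(Global global, String kind) {
        return global.packages().stream()
                .flatMap(p -> p.modules().stream())
                .flatMap(m -> m.definitions().stream())
                .filter(d -> d.kind().equals(kind))
                .count();
    }
}
```

Under this reading, the alphabetical indices and per-kind statistics of the API documentation are simple traversals of the nested structure.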
2.1.1 Layout
In Figures 2.4 and 2.5 we can see a typical xDoc session. The layout shown is the result of several experiments in which we changed the layout of the documentation. As there is no such thing as a perfect layout for all documentation, it is still possible to change the layout using CSS2¹. We have chosen to integrate the API documentation with the code browsing facilities. This gives the user the opportunity to easily go to the source code in case documentation at the API level is lacking. We have done this by using a standard layout for both views. Each HTML page generated by xDoc contains the following items.

Menu: On top of the HTML pages a menu is present, which contains links to the main entry points of the documentation and several alphabetical indices. This menu is the same for each generated HTML page.

Context-sensitive indices: Another means of navigation in xDoc is the presence of package, module and definition indices on the left of the screen. These indices are adapted to include only the relevant information given a certain point in the documentation. From these indices it is possible to jump to the corresponding place in both the API documentation and the code browser.

Content frame: The content frame is the main part of the documentation. Visualizations, the code browser and the API documentation are displayed in this frame.

An excellent example of a visualization that has shown itself to be practical is the local import graph that is used in xDoc for navigating through the documentation. At the module level, a clickable graph is shown which visualizes the local imports of a module: both the modules that are imported by the current module and the modules that import the current module are shown. It makes it easy to browse through a system. This visualization is, just like the standard layout mentioned before, available for both the API documentation and the code browser.
2.2 xDoc, the documentation generator
Now that it is clear what is generated by xDoc, the next group we have to focus on is the group that uses xDoc to generate documentation. Another important required feature of xDoc, or of any documentation generator, is usability: if a tool is not user-friendly, developers will not use it. This requirement has resulted in several command-line options that make documenting an easy task. Using these command-line options it is possible to generate documentation from several kinds of sources. We can generate documentation from:
1 http://www.w3.org/TR/REC-CSS2/
- Local directories
- Subversion² repositories
- Local or remote archives (tar.gz / tar.bz2)

These command-line options also make xDoc suitable for generating documentation on a daily basis, for example in the daily build system³. Below we show some invocations that illustrate the possibilities of xDoc.

xdoc --title xDoc -I /src/xdoc --html /www/xdoc --language Stratego

Generates documentation for the Stratego language, taking the sources that are in the /src/xdoc directory and its subdirectories. Documentation is output to the /www/xdoc directory.

xdoc --svn https://svn.cs.uu.nl:12443/repos/StrategoXT/trunk/xdoc --html /www/xdoc

Generates documentation for the Stratego language, taking the sources from a location on a Subversion server. Documentation is output to the /www/xdoc directory.

xdoc --tar xdoc.tar.gz --output-tar-gz xdoc-doc.tar.gz

Generates documentation for the Stratego language, taking the sources from the xdoc.tar.gz archive, and outputting the documentation to the xdoc-doc.tar.gz archive.

xdoc -I /src/StrategoXT --subdir-as-package --html /www/xdoc

Generates documentation for the Stratego language from the directory /src/StrategoXT, in which we treat each direct subdirectory of /src/StrategoXT as a package. This is especially useful for packages like StrategoXT which consist of several subpackages.

xdoc -I /src/tiger -I /src/StrategoXT --html /www/tiger-sdf --language SDF

Generates documentation for the SDF language, taking the sources from /src/tiger and /src/StrategoXT and outputting it to the /www/tiger-sdf directory.

These were of course only a few uses of xDoc; it has far more options than we can show an example of each. Below is an overview of all the options.
---- xdoc -- (c) machina 2002-2003 ----
Usage: xdoc [options]

Options:
  -S|--silent                   Silent execution (same as --verbose 0)
  --verbose i                   Verbosity level i (default 1)
  -s                            Turn on statistics
  --html d                      Output directory for HTML files
  -I dir                        Include modules from directory dir
  --tmp d                       Directory for temp. files
  --title n                     Title
  --package-dependencies file   Dependencies between packages
  --description file            Optional file including a description of the documented stuff
  --tar file                    Use tar.bz2 / tar.gz as input
  --svn url                     Subversion repo
2 http://subversion.tigris.org
3 http://www.stratego-language.org/Stratego/DailyBuild
  --output-tar-gz file          Create tar.gz of documentation
  --xdoc-info file              Output xDocInfo to file
  --transitive-reduction        Use transitive reduction on graphs
  --remove-tmp                  Remove tmpdir
  --subdir-as-package           Treat subdirs as packages (in combination with tar or svn)
  --include-tests               Do not ignore tests
  --language lang               Language to be documented (Stratego default)
  -h|-?|--help                  Display usage information
  --about                       Display information about this program
  --version                     Same as --about
2.3 xDoc, the extensible documentation generator
The last group which xDoc focuses on is the developers of documentation generators. xDoc is an extensible documentation generator. Extensible can mean a lot of things, though; basically every system is extensible. The question is of course how easy it is to extend a system. In xDoc we have tried to make extensions as easy as possible by identifying the separate tasks in the documentation process and letting them be handled by separate tools. Another question is what aspects of the tool are extensible. xDoc, in its current form, supports the following extensions:

- Instantiate new languages
- Use a new comment style
- Generate new visualizations and analyses

At the moment there are configurations for generating documentation for Stratego and SDF, and an instance for Java is currently being experimented with. Another candidate would be C, as a syntax definition is also available for this language; a language specification in SDF is a prerequisite. xDoc offers the possibility to add new visualizations to the documentation generator. Visualization can be interpreted in a broad sense: it can be a graph representing an import relation, or an analysis of the import structure for which a suggestion is shown.

2.3.1 xDoc overview
We have used a global structure which resembles the structure used in DocGen, in which the documentation generator has a front-end and a back-end. This is shown in Figure 2.8. Basically there are two steps in the documentation process. The front-end, which implements the so-called fact extraction phase of the documentation process, gathers all the information from the system sources that is necessary to generate the documentation. The information gathered is collected in a fact repository, which is a data directory containing an xDocInfo file and a parse tree for each of the modules in the system sources. The second step is the process where the actual documentation is generated, also called the back-end of the documentation generator. This phase does the HTML generation and generates the visualizations. This results in two HTML directories, containing the API documentation and the code browser.

Figure 2.8: Overview of xDoc

2.3.2 Configuration
For each language we want to document, we have to specify to xDoc in which way the documentation has to be made. We do this using a global configuration file, which is registered in the xDoc XTC repository. Figure 2.9 shows a typical xDoc configuration file. For each language that can be documented, the xDoc configuration file contains a tuple with the following language-specific information.

Name of the programming language This is the unique identifier of the language. It is mainly used to retrieve information from the configuration file during the documentation process.

File extensions A list of strings representing file extensions, which are used to identify relevant files in the fact extraction phase of the documentation process.

<..>2xdoc The <..>2xdoc tool which analyzes the abstract syntax tree of the specified language and gathers all the necessary information needed by xDoc, resulting in a partial xDocInfo term.

Use-def match tool Tool that handles the use-definition matching for a language. The tool gets an
[ ( "Stratego"
  , ["cr","r","str"]
  , "stratego2xdoc"
  , "stratego-match"
  , "Stratego.tbl"
  , "Stratego-pretty.pp.af"
  , ["Constructor", "Overlay", "Strategy", "Rule", "Dynamic Rule"]
  , (["xdoc-import"],["xdoc-import"],["xdoc-import"],[])
  , ([],[],["xdoc-import"],[])
  , ["SDef","SDefNoArgs","SDefT","RDef","RDefNoArgs","RDefT","OpDecl"]
  , ["Str","Char","xDocComment","xDocCommentNoFields","xDocCommentNoSummary"]
  , ["Call","CallT","CallNoArgs","Op"]
  , ["ToTerm","ToStrategy"]
  )
, ( "Java"
  , ["java"]
  , "java2xdoc"
  , "java-match"
  , "xDoc-Java.tbl"
  , "Java-pretty.pp.af"
  , ["Constant","Field","Method"]
  , ([],[],["java-class"],[])
  , ([],[],["java-class"],[])
  , ["MethodDec","VarDec","ConstantDec","FieldDec"]
  , ["Float","Bool","Char","String","xDocComment","xDocCommentNoFields","xDocCommentNoSummary"]
  , []
  , []
  )
, ( "SDF"
  , ["sdf"]
  , "sdf2xdoc"
  , "sdf-match"
  , "Sdf2-Syntax.tbl"
  , "Sdf2.pp.af"
  , ["Production"]
  , (["xdoc-import"],["xdoc-import"],["xdoc-import"],[])
  , ([],[],["xdoc-import"],[])
  , ["prod"]
  , []
  , ["sort"]
  , []
  )
]

Figure 2.9: A typical xDoc configuration file
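Read as data, each tuple in Figure 2.9 bundles the language-specific hooks xDoc needs. The following Python sketch renders the first columns of the Stratego entry as a record; the field names are my own labels, taken from the description of the configuration tuple, and this is illustrative only, not xDoc's actual code.

```python
from typing import List, NamedTuple

# Hypothetical record for the leading fields of a Figure 2.9 tuple.
class LanguageConfig(NamedTuple):
    name: str                 # unique identifier of the language
    extensions: List[str]     # file extensions to look for
    to_xdoc_tool: str         # the <..>2xdoc fact extraction tool
    usedef_match_tool: str    # use-definition matching tool
    parse_table: str          # standard sglr parse-table
    pp_table: str             # GPP pretty-print table
    sorts: List[str]          # kinds of definitions, in display order

stratego = LanguageConfig(
    "Stratego", ["cr", "r", "str"], "stratego2xdoc", "stratego-match",
    "Stratego.tbl", "Stratego-pretty.pp.af",
    ["Constructor", "Overlay", "Strategy", "Rule", "Dynamic Rule"],
)

# Dispatch on file extension, as the fact extraction phase does.
by_extension = {ext: stratego for ext in stratego.extensions}
print(by_extension["str"].to_xdoc_tool)
```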
xDocInfo term as input and, given a unique identifier of the definition in which a use originates and a unique identifier representing the use, it determines the definitions that the use relation points to.

Standard parse-table An sglr parse-table that is used in case a .meta file, specifying the syntax, can not be found.

Pretty-print table The GPP pretty-print table for the specified language, which is used to generate the pretty-printed version of the code browser.

Sorts A list of strings representing the different possible kinds of definitions that the specified language has. This list determines the order in which definitions are shown in the menu and several indices.

Code browser visualizations A tuple of four lists corresponding to the different levels in the documentation: the global, package, module and definition level. For each of these levels we specify a list of xDoc visualization tools, which generate the specified visualizations for us in the code browser. In the example shown in Figure 2.9 we can see that the xdoc-import tool is used at the global, package and module level.

API visualizations Again a tuple, such as the one in the previous option, only here we specify the visualization tools that are used for the API documentation.

Definition constructors A list of strings representing the constructor names of definitions.

Use constructors A list of strings representing the constructor names of uses, e.g. calls.

Constructors to be highlighted A list of strings representing the constructor names of language constructs which we want to be highlighted.

Meta transition markers A list of strings representing the constructor names of possible meta transition markers.

2.3.3 Tools
Fact extraction The following tools are used in the phase that extracts all relevant information needed to generate documentation.

xdoc The xdoc tool is the main application of xDoc. It browses through directories, searching for source files using the given file extensions, and calls all the necessary tools used in a documentation session. It combines the information gathered by the <..>2xdoc tools and adds higher-level information, like package structure, dependencies and possible unit tests. The documentation session can be manipulated using a variety of command-line parameters.

<..>2xdoc This set of tools is used to gather the local information of a source file, such as definitions, calls and imports, and puts this in the xDocInfo format. See Section 3.4 for a more detailed description of xDocInfo. At the moment there are tools for Stratego and SDF.

get-definitions This tool extracts definitions from a source file or AsFix tree, given a name and a constructor. Its result is a textual representation of the matched definitions, which can be included in papers, or in any other application that needs to visualize parts of code.

sunit2text This tool gathers sunit 4 tests, given a name or strategy, and makes a textual representation of these unit tests for documentation purposes.

gather-unused-defs This tool calculates, from a given xDocInfo file and a set of main strategies, the definitions that are not used in the system. It can be used to warn users about unused definitions, as well as in a program that automatically removes the unused definitions from the source code, with layout preservation.
Visualization The following tools are used for generating visualizations in xDoc.

xdoc-import This tool generates the import graphs of the xDoc documentation. The graphs are used to give insight into the import relations of a system, and to navigate through both the code browser and the API documentation.

java-class This tool generates Java class diagrams for the xDoc documentation of Java. The diagrams are used to give insight into the inheritance relations of a system, and to navigate through both the code browser and the API documentation.

xtc-graph This tool takes a Stratego application that uses the XTC library to compose a transformation, and generates an XTC composition graph. The resulting graph shows the composition as derived from the Stratego source code.
4 SUnit is a unit testing framework for Stratego. By specifying tests that apply a strategy to a specific term and compare the result to the expected output, it can be determined whether new developments influence existing code.
HTML generation The following tools generate the HTML documentation from the gathered information.

xdoc-api This tool generates the actual API documentation in xDoc. Given an xDocInfo term, it generates the HTML files and calls the necessary visualization tools.

xdoc-code This tool generates the code-browsing functionality in xDoc. Just like the previous tool, it generates the corresponding HTML files and calls the necessary visualization tools.

asx-extra This tool is responsible for syntax highlighting, and for adding anchors and links to documented code.
Chapter 3
Fact extraction
The extraction of facts from a system is probably the most important phase in the generation of documentation. Information has to be gathered from various places in the system. The lowest level of information we can find is the source code of the system itself. This source code is the most precise information available, as the system is what the source code specifies. For the same reason it is also the most up-to-date source of information. As we can not yet derive an explanation in natural language from the source code, we have to rely on information that is handed to us by the developers, for example in documentation comments in the source code. But the source code is not the only information we can extract from a system. In this chapter we will discuss the possible techniques that can be used for fact extraction, and the implementation of this phase in xDoc.
3.1
Basically there are two ways of extracting the information needed for documentation from the source code: lexical analysis and context-free analysis. Both approaches have advantages and disadvantages, which we will discuss shortly.

Lexical analysis usually comes down to using regular expressions. On the UNIX platform there are several tools that offer good support for regular expressions, e.g. awk, grep and perl. If the required information has an easily recognizable lexical form, then lexical analysis is a very efficient method to gather the information. Another advantage is the fact that we do not need to know all the constructs of the language to gather information. Also, the source code does not have to be completely syntactically correct. But with these advantages come some important disadvantages. The main disadvantage is that lexical analysis is not always very precise, and for documentation we do need this precision. Also, it is not always possible to recognize some language constructs easily, which leads to code that is difficult to maintain.

Context-free analysis can be used to do a more thorough and precise analysis of the source code, since the analysis is done on the abstract syntax tree, abstracting from irrelevant details
such as layout. Working on an abstract syntax tree gives us exact information about the context of a construct, which makes the analysis much clearer and more precise. Especially when analyses are more complicated, context-free analysis is the technique to use. A major disadvantage of context-free analysis is the need for a full grammar, which is not always available and is not easy to produce.

In the DocGen project an intermediate approach is used. They have developed the concept of island grammars, in which relevant constructs, called islands, are recognized, and the irrelevant parts, the water, are thrown away. This combines the advantage of lexical analysis, efficiency, with the advantages of context-free analysis, clarity and precision. Even though this approach looks promising, there are still some problems with the modularity of island grammars and the combination of multiple island grammars.

Despite the disadvantages of lexical analysis, most documentation generators are based on this technique. The main reason for this is the availability of tools that support this kind of analysis, resulting in a low technological threshold for the use of this technique.
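The trade-off can be made concrete in a small experiment. Python stands in for the object language here, since its standard library ships both a regex engine and a full parser; this is an illustration of the two techniques, not part of xDoc.

```python
import ast
import re

SOURCE = '''
def area(r):
    """Compute the area of a circle."""
    return 3.14159 * r * r

def perimeter(r):
    return 2 * 3.14159 * r
'''

# Lexical analysis: fast and tolerant of incomplete code,
# but it only sees characters, not structure.
lexical = re.findall(r"^def\s+(\w+)\s*\(", SOURCE, re.MULTILINE)

# Context-free analysis: the AST knows which string is a docstring
# and which definition it belongs to.
tree = ast.parse(SOURCE)
contextfree = [(node.name, ast.get_docstring(node))
               for node in ast.walk(tree)
               if isinstance(node, ast.FunctionDef)]

print(lexical)      # names only
print(contextfree)  # names with their attached documentation
```

The lexical pass finds the names; only the context-free pass can attach each docstring to its definition.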
3.2
In xDoc we have chosen to use context-free analysis. The implementation of the fact extraction phase is based on SDF and sglr, a scannerless generalized LR parser. We have chosen SDF for several reasons. As xDoc started off as a documentation generator for Stratego, and the Stratego language was defined using SDF, it was a practical decision. Besides the Stratego language, there are syntax definitions in SDF available for several other languages, among which SDF itself, C and Java. But the availability of syntax definitions for several languages is not reason enough to justify the use of SDF. SDF has two interesting features that are useful in a documentation generator. First, from an SDF syntax definition, Stratego signatures, parse-tables for sglr and pretty-print tables for GPP can be generated. Second, as SDF is a modular syntax definition formalism, it is easy to add new language features to a language. Good examples of such language features are documentation comments and concrete object syntax. Obviously this is a nice feature for documentation generators: it does not only give us the opportunity to let the documentation comments be parsed within the language, but also to update the documentation generator to new language constructs, limiting the time between a language extension and the extension of the corresponding documentation generator.

The fact extraction process in xDoc is schematically shown in Figure 3.1. From a set of source files, xDoc processes each of the files as follows. From the extension of the source file, looking up the corresponding parse-table in the xDoc configuration file, and a possible .meta file, xDoc determines which parse-table to use. These files are passed to the sglr parser. The sglr parser outputs the result of the parsing in AsFix2 format. AsFix2, the ASF+SDF fixed format, is a format for representing parse trees in the ATerm format, thus containing both the abstract syntax structure as well as the layout information.
We will use this output to generate code views, so the output will be copied to the xDoc data directory.
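The parse-table choice described above can be sketched as a small lookup. This is a simplified illustration: the table names and the dictionary standing in for the file system are hypothetical, and the real .meta mechanism names a syntax rather than a table directly.

```python
import os

# Hypothetical extension-to-parse-table map, as registered in the configuration.
PARSE_TABLES = {"str": "xDoc-Stratego.tbl", "sdf": "xDoc-Sdf2.tbl"}

def select_parse_table(source_file, meta_files):
    """Prefer the table named by a .meta file next to the source;
    otherwise fall back to the table registered for the extension."""
    meta = source_file + ".meta"
    if meta in meta_files:                 # a .meta file overrides the default
        return meta_files[meta]
    extension = os.path.splitext(source_file)[1].lstrip(".")
    return PARSE_TABLES[extension]         # standard table from the config

metas = {"lib/foo.str.meta": "Stratego-Java.tbl"}
print(select_parse_table("lib/foo.str", metas))   # table named by the .meta file
print(select_parse_table("lib/bar.str", metas))   # default table for .str
```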
Figure 3.1: The fact extraction process in xDoc (sources, xdoc.conf, AsFix, normalize-comments, meta-explosion, fact extraction tool, use-def matching tool, xDocInfo, data directory)
In StrategoXT, program transformations usually operate on abstract syntax trees. In xDoc we use both representations: the abstract syntax tree for analyzing programs, and the AsFix2 format for generating code views. The AsFix2 output of sglr is therefore transformed into an abstract syntax tree by implode-asx. The resulting abstract syntax tree then has to be transformed into a normal abstract syntax tree of the language. We have to eliminate the abstract syntax fragments of the xDoc comments without losing the information in them. This is done using the normalize-comments tool, which puts the fragments into annotations. But even this resulting abstract syntax tree might still not be a normal abstract syntax tree of the language: it might contain abstract syntax fragments of an object language. These fragments have to be exploded to fragments of the meta language, a process which is called meta-explosion.

Now that we have a proper abstract syntax tree, we can do the actual analysis of the module. Using the extension of the original source file, and the language which was determined from it, an analysis tool is chosen from the xDoc configuration file. This <..>2xdoc tool returns a partial xDocInfo term with all the information gathered from the abstract syntax tree. We call it a partial xDocInfo term because we can not always gather all the information for a module from the abstract syntax tree, for example the original filename of the module or the number of lines of code. The information we can get from the abstract syntax tree is incorporated, together with the higher-level information, in a full xDocInfo term. The resulting xDocInfo term is then annotated with information which links the uses to the corresponding definitions, using the use-definition matching tool which is specified in the configuration file, resulting in an xDocInfo term containing all information necessary to generate the documentation. This xDocInfo term is passed on through the other phases of the documentation process.
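The per-module steps above can be sketched as a composition of term-to-term stages. The stand-in functions below are placeholders for the real XTC tools (implode-asx, normalize-comments, meta-explosion, the <..>2xdoc analyzer); only the shape of the composition is meant to be illustrative.

```python
# Compose a sequence of term-to-term stages into one function.
def pipeline(*stages):
    def run(term):
        for stage in stages:
            term = stage(term)   # each stage maps a term to a term
        return term
    return run

# Hypothetical stand-ins for the real tools.
implode            = lambda t: {"ast": t["asfix"]}            # implode-asx
normalize_comments = lambda t: {**t, "comments": "annotated"} # comments into annotations
meta_explosion     = lambda t: {**t, "exploded": True}        # object to meta syntax
extract_facts      = lambda t: {"partial_xdocinfo": t["ast"]} # <..>2xdoc analysis

to_xdocinfo = pipeline(implode, normalize_comments, meta_explosion, extract_facts)
print(to_xdocinfo({"asfix": "parse tree of module"}))
```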
3.3 Documentation comments

3.3.1 Instrumenting a language with documentation comments
Often extra information, in the form of documentation comments, is added by the developers to the source code. This information, which is ignored during the compilation process as it does not have a semantic value in the programming language, is processed by the documentation generation tool. This system of using documentation comments to add information to the source code is also used in xDoc. There are some design decisions that have to be made when choosing to use documentation comments: we have to decide on the placement and on the syntax of the documentation comments. Let us first look at the syntax of the comments. Here we see a typical xDoc comment, describing a definition.
/**
 * The first line should contain a short summary.
 *
 * Of course it is possible to write a more extensive
 * explanation of the entity.
 *
 * @param description of first parameter
 * @param description of second parameter
 *        which can consist of multiple lines
 * @author Rob Vermaas
 */
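To make the summary/fields structure of such a comment concrete, here is a naive parser for this comment shape. It is a hypothetical helper, not xDoc's implementation (xDoc parses comments with an SDF grammar), and it assumes fields come after the description.

```python
def parse_doc_comment(text):
    """Split a Javadoc-style comment into summary paragraphs
    and (tag, value-lines) fields."""
    lines = [l.strip().lstrip("*").strip() for l in text.splitlines()
             if l.strip() not in ("/**", "*/")]
    paragraphs, fields, current = [], [], []
    for line in lines:
        if line.startswith("@"):               # a field: @tag value...
            tag, _, value = line[1:].partition(" ")
            fields.append((tag, [value]))
        elif fields and line:                  # continuation of the last field
            fields[-1][1].append(line)
        elif line:                             # part of the current paragraph
            current.append(line)
        elif current:                          # blank line ends a paragraph
            paragraphs.append(" ".join(current)); current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs, fields

comment = """/**
 * Short summary line.
 *
 * Longer explanation.
 * @param x the input
 * @author Rob
 */"""
ps, fs = parse_doc_comment(comment)
print(ps)
print(fs)
```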
When looking at this comment, we can see some resemblance to a typical documentation comment in Javadoc. We have taken a Javadoc-like style of comments because of its popularity and clarity. An xDoc comment consists of a description and a list of fields. The description can consist of several paragraphs and is parsed as plain text. The fields, also called tags, are in essence key-value pairs, whose value can consist of one or more lines. xDoc supports some special tags, which are shown in a different way than other tags. These tags and their meaning are explained in detail in Section 5.1.4. For the complete syntax definition of xDoc comments, please refer to Appendix A.2.1.

One could wonder why we are not using a more structured form of documentation comments, such as XML. XML is a well-accepted language, and this approach of structured comments is used for example in the programming language C#. Surely this approach makes it a lot easier for the documentation generator to gather information from the documentation comments, but it also makes the documentation comments less readable in the source itself. As the source code is in our opinion an important part of the documentation, we want to avoid such a situation. In fact, there is no such thing as a best choice for the syntax of documentation comments: it all depends on the developers that are using it, and is mainly a matter of taste. Therefore, we should make it possible to support more than one syntax.

Now that we have a syntax for documentation comments, we have to decide on their placement. Basically there are three ways of placing documentation comments. The first is loose placement throughout the source file, which allows the developer to add the documentation comments wherever he wants, probably resulting in neater code. This approach has a serious disadvantage though, i.e. the need for a reference to the corresponding definition.
It is not always easy to refer to a certain definition in a source file. Another option would be to put the comments in a separate file, resulting in the same disadvantage as with loose placement. The last option, and the one that is used by most documentation generators, is choosing fixed positions in the source file for documentation comments. In xDoc we have also chosen fixed positioning of documentation comments: a documentation comment is followed by the definition that is being documented.

So we have a grammar for xDoc comments and a grammar for the language we want to document, and we know where we want xDoc comments to be allowed. Now we have to combine these things into a new grammar. Figure 3.2 shows a typical xDoc-enabled grammar, in this case for the Stratego language. The module imports both grammars, Stratego and xDoc. The production rules in the coupling module, xDoc-Stratego, correspond with the placement of xDoc comments in
module xDoc-Stratego
imports StrategoRenamed xDoc
exports
  context-free syntax
    XDocComment StrategoModule
    XDocComment StrategoStrategyDef
    XDocComment StrategoRuleDef
    XDocComment StrategoDecl

Figure 3.2: An xDoc-enabled grammar for Stratego
front of module, strategy, rule, overlay and constructor definitions. In Stratego there are also constructs, such as rules and strategies, with which you can group definitions. We also allow xDoc comments in front of these constructs, to give users the opportunity to comment groups of definitions. This is specified in the last two production rules. The possibility of documenting more than one definition with one xDoc comment can be very useful in Stratego. A good example of the use of these blocked comments is the following.
/**
 * Translate a day constructor to a
 * E.g. <day-of-week2text>Tuesday()
 */
rules
  day-of-week2text : Sunday()    ->
  day-of-week2text : Monday()    ->
  day-of-week2text : Tuesday()   ->
  day-of-week2text : Wednesday() ->
  day-of-week2text : Thursday()  ->
  day-of-week2text : Friday()    ->
  day-of-week2text : Saturday()  ->
Obviously we would not want to document each definition separately. The xDoc comment will therefore be inherited from the grouping construct.

From the extended syntax definition we have created, a new parse-table has to be compiled, as it is not possible to combine two parse-tables at runtime. This xDoc-enabled parse-table then has to be registered in the xDoc XTC repository. This will hide a parse-table of the corresponding language that might already be present in one of the imported XTC repositories. An xDoc-enabled parse-table has to be generated for each language we want to generate documentation for. This also means that, if we want to generate documentation for projects that are using concrete object syntax, we need to generate a parse-table for all the combinations of languages that are used.
module comments-to-annotations
imports lib xDocInfo
strategies
  comments-to-annotations =
    io-wrap(
      topdown(
        try(\ Commented(comment,term) -> term{<normalize-comments>comment} \)
      )
    )
rules
  normalize-comments :
    xDocCommentNoFields(Summary(ps,_)) -> xDocComment(<filter(?Paragraph(_))>ps,[])
  normalize-comments :
    xDocCommentNoSummary(fields) -> xDocComment([],<filter(?xDocField(_,_))>fields)
  normalize-comments :
    xDocComment(Summary(ps),fields) -> xDocComment(<filter(Fst)>ps,<filter(?xDocField(_,_))>fields)

Figure 3.3: The comments-to-annotations tool
3.3.2 Normalizing comments
Adding parse-able comments to a syntax definition results in a change in the abstract syntax trees of the parsed sources. This change in abstract syntax could break existing tools which we would like to use for analyzing the programs. Therefore we want to remove these comment constructors from the abstract syntax, without losing the information contained in them. We do this by putting the comments into annotations. We also perform a normalization step. This normalization enables us to use different forms of xDoc comments, transforming them into a standard form; the transformation can be seen as a sort of desugaring of comments. It consists of a topdown traversal which tries to replace each Commented constructor by the term it wraps, with a normalized comment annotation. The implementation is shown in Figure 3.3.
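The desugaring idea can be restated outside Stratego: every comment variant is rewritten to the one normal form carrying both a paragraph list and a field list. The Python analogue below uses a made-up tuple encoding of the constructors, purely to illustrate the normalization rules.

```python
# Desugar xDoc comment variants to the normal form
# ("xDocComment", paragraphs, fields). Tuple encoding is hypothetical.
def normalize(comment):
    kind = comment[0]
    if kind == "xDocCommentNoFields":    # summary only: add empty field list
        return ("xDocComment", comment[1], [])
    if kind == "xDocCommentNoSummary":   # fields only: add empty summary
        return ("xDocComment", [], comment[1])
    return comment                       # already in normal form

print(normalize(("xDocCommentNoFields", ["A short summary."])))
print(normalize(("xDocCommentNoSummary", [("author", ["Rob"])])))
```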
3.4 xDocInfo
For the exchange of data between xDoc components we have created a data exchange format, called xDocInfo. An xDocInfo file is an ATerm that corresponds to the abstract syntax specified in Figure 3.4. xDocInfo was designed to hold all information that is needed in the documentation generation process. The use of xDocInfo as an intermediate format has several advantages. Aside from its data exchange function, it allows xDoc to be separated into a front-end, for fact extraction, and
module xDocInfo
signature
  constructors
    xDocInfo    : Title * Description * List(Package) -> xDocInfo
    Package     : Id * Directory * Description * PackageDeps * Modules * Tests -> Package
    Module      : Id * File * Language * LOC * Footprint * LanguageSpecifics *
                  Parents * Imports * xDocComment * Definitions -> Module
    Definition  : Id * Parameters * Returns * Id * Footprint * Id *
                  xDocComment * Uses * String -> Definition
    Use         : Id * ParameterTypes * Id * String -> Use
                : String -> Id
                : String -> Title
                : String -> Returns
                : String -> Description
                : String -> File
                : String -> Directory
                : List(Id) -> PackageDeps
                : List(Module) -> Modules
                : List(Package) -> Packages
                : List(Definition) -> Definitions
                : List(Use) -> Uses
                : List(String) -> Properties
                : List(Id) -> Inherits
                : List(Id) -> Implements
                : List(Id) -> Imports
                : Int -> LOC
                : String * String -> Test
                : List(Test) -> Tests
    xDocComment : List(Paragraph) * List(xDocField) -> xDocComment
    Paragraph   : List(String) -> Paragraph
    xDocField   : String * List(String) -> xDocField

Figure 3.4: Signature of xDocInfo
a back-end, for presentation. This should encourage reuse of the components in xDoc and also allow easy replacement of these components.

3.4.1 Describing xDocInfo
The structure of xDocInfo follows the structure of the simple model for documentation described in Section 2.1. For each documentation level there is a corresponding constructor. We will now describe the information that is gathered, and where the information comes from. At the highest level, the global level, the fact extraction typically limits itself to the processing of the various command-line options given to xDoc, as this information can not be gathered from the source files.
Title Title for the whole project which is being documented. This information is collected from the --title parameter of xDoc.

Description Description of the whole project, which is collected by reading in the file specified in the --description parameter of xDoc.

The global level also contains a list of packages. At the package level we again have to rely on information given to us on the command-line. A package is defined by the following information.

Name Typically the name of the package is derived from the directory a package is in. Note that the name of a package can not always be derived from the directory it is in; sometimes, for example in Java, the package which a module belongs to is defined in the source code.

Directory Directory containing the source files of the package. This is collected directly from the -I parameter of xDoc.

Description Description of the package, which is typically collected by reading in a standard file, for example a README file, from the package directory.

Package dependencies A list of packages on which the package depends. This information is collected via the --package-dependencies parameter of xDoc, which should point to a file describing the dependencies between packages.

Tests A list of key-value pairs, mapping a test identifier to a corresponding piece of code or example.

At the module level we mostly use information gathered with the <..>2xdoc tool, so from the source code. The only information not from the source is the filename and the lines of code. A module is defined by the following information.

Name The name of the module.

Filename The filename which contains the module, relative to the package directory.

Lines of code A number representing the lines of code of a module.

Language The language which a module is specified in.

Language specifics List of terms, which allows for some language-specific information on the module.
Parents List of strings representing the parent relations of a module.

Imports List of strings representing the import relations of a module.

xDocComment An xDoc comment describing the module.

A module contains a list of definitions. At the definition level all the information is gathered using the <..>2xdoc tool. A definition is defined by the following information.

Name The name of the definition.

Parameters List of tuples representing the names and types of the parameters, in the order in which they occur.

Unique identifier A unique identifier for the definition, representing the checksum of the abstract syntax term of the definition.

Return type A string representing the return type of the definition.

Footprint A footprint of the definition. This is a string representing the header of a definition.

Parent A string representing the unique identifier of the parent definition, in case of nested definitions.

xDocComment An xDoc comment describing the definition.

Classification of definitions A string representing what kind of definition it is.

For a more concrete description of what information is gathered at the module and definition level, see Section 3.5.

3.4.2 Using xDocInfo
The xDocInfo files produced by xDoc can be used for several purposes. Besides being the exchange format within xDoc, they can be used as an information source for analyzers, such as an unused-code detection tool, as described in Section 7.3.1.
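The nesting of the xDocInfo levels described above can be sketched as records. The Python dataclasses below are a hypothetical rendering of a subset of the Figure 3.4 signature, with simplified field types; they are not part of xDoc.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Definition:
    name: str
    parameters: List[Tuple[str, str]]  # (name, type) pairs, in order
    unique_id: str                     # checksum of the definition's AST
    kind: str                          # e.g. "Strategy", "Rule"

@dataclass
class Module:
    name: str
    filename: str
    language: str
    loc: int
    imports: List[str] = field(default_factory=list)
    definitions: List[Definition] = field(default_factory=list)

@dataclass
class Package:
    name: str
    directory: str
    modules: List[Module] = field(default_factory=list)

@dataclass
class XDocInfo:                        # the global level
    title: str
    description: str
    packages: List[Package] = field(default_factory=list)

info = XDocInfo("demo", "",
                [Package("lib", "lib/",
                         [Module("m", "m.str", "Stratego", 10)])])
print(info.packages[0].modules[0].language)
```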
3.5 Stratego to xDocInfo
Now that we know what information to gather, we have to do the actual extraction of information at the source level.
strategies
  gather-info =
    ?Module(n,_)
    ; where( get-comment => cmt )
    ; gather(blocks + imports + defs)
    ; !Module( n
             , ""
             , 0
             , "Stratego"
             , []
             , []
             , <get-config <+ ![]>"imports"
             , cmt
             , <get-config <+ ![]>"defs"
             )

  gather(s) =
    rec x({| current-def, inherited-comment
           : try(where(s)) ; all(x) |})
Figure 3.5: Information gathering for Stratego

In this section we will discuss the implementation of the fact extraction tool for Stratego, stratego2xdoc. This tool works on a Stratego abstract syntax tree and returns a partial xDocInfo term, representing the module specified in the abstract syntax tree. The implementation of stratego2xdoc is written in less than 100 lines of code, showing that we can build really compact analysis modules using Stratego.

In Figure 3.5 we can see the main strategy of the stratego2xdoc tool. The main strategy uses the gather strategy, parameterized with the strategies that do the actual information gathering. The gathered information is read from the config tables used during the analysis and put into a Module term, which is the partial xDocInfo term. The strategy gather implements a topdown traversal, which tries to apply a given strategy, using contextual information in the form of scoped dynamic rules such as current-def, is-dynamic and inherited-comment. It does not modify the abstract syntax tree in any way. We will first discuss the situations in which the contextual information is put into scoped dynamic rules.
strategies
  blocks =
    ( Strategies(id) + Rules(id) + Overlays(id) + Signature(id) )
    ; get-comment => comment
    ; rules( inherited-comment : _ -> comment )
In Figure 3.2 we specified where in Stratego the xDoc comments are allowed, and we discussed the possibility of so-called blocked comments. When encountering one of the constructs supporting the blocked comment, signature, overlays, rules, or strategies, with an xDoc comment in its annotation, we put this xDoc comment into the scoped dynamic
strategies
  defs =
    where( def => (n,xs,tp) )
    ; where( get-comment => cmt )
    ; where( collect-all(call) => calls )
    ; rm-annotations
    ; checksum => nn
    ; ( <?"Constructor" ; !n>tp <+ make-footprint(|n,xs) ) => fp
    ; !Definition(n,xs,nn,"ATerm",fp,<current-def <+ !"">,cmt,calls,tp)
    ; rules( current-def : _ -> nn )
    ; <extend-config>("defs",[<id>])

  params(|tp) =
    map( \ VarDec(n,_) -> (n,tp) \
      <+ \ DefaultVarDec(n) -> (n,tp) \
      <+ \ n -> (n,tp) \ )

rules
  def : RDef(n,xs,e) -> (n,<params(|"Strategy")>xs,s)
        where (is-dynamic ; !"Dynamic Rule" <+ !"Rule") => s
  def : RDefNoArgs(n,e) -> (n,[],s)
        where (is-dynamic ; !"Dynamic Rule" <+ !"Rule") => s
  def : RDefT(n,xs,xs2,e) ->
        (n,<conc>(<params(|"Strategy")>xs,<params(|"ATerm")>xs2),s)
        where (is-dynamic ; !"Dynamic Rule" <+ !"Rule") => s
  def : SDef(n,xs,e) -> (n,<params(|"Strategy")>xs,"Strategy")
  def : SDefT(n,xs,xs2,e) ->
        (n,<conc>(<params(|"Strategy")>xs,<params(|"ATerm")>xs2),"Strategy")
  def : SDefNoArgs(n,e) -> (n,[],"Strategy")
  def : Overlay(n,xs,e) -> (n,<params(|"ATerm")>xs,"Overlay")
  def : OverlayNoArgs(n,e) -> (n,[],"Overlay")
  def : OpDecl(n,FunType(xs,_)) -> (n,<map(!("","ATerm"))>xs,"Constructor")
  def : OpDecl(n,ConstType(_)) -> (n,["ATerm"],"Constructor")

rules
  call : Call(SVar(n),xs) -> Use(n,<map(!"Strategy")>xs,<checksum>,"Call")
  call : CallT(SVar(n),xs1,xs2) ->
         Use(n,<concat>[<map(!"Strategy")>xs1,<map(!"ATerm")>xs2],<checksum>,"Call")
  call : CallNoArgs(SVar(n)) -> Use(n,[],<checksum>,"Call")
  call : Op(n,xs) -> Use(n,<map(!"ATerm")>xs,<checksum>,"Constructor application")

Figure 3.6: Translating Stratego definitions and uses to xDocInfo
38
Fact extraction
Stratego to xDocInfo
3.5
rule
inherited-comment.
So when trying to resolve which xDoc comment belongs to a definition, we have to take these inherited comments into account. The following piece of code shows the resolution of the xDoc comment belonging to a definition. First, get-comment checks whether an xDoc comment is in the annotation of the definition; if so, it returns this xDoc comment. If not, it tries to get an inherited comment by calling the scoped rule inherited-comment. If this also fails, the result is an empty xDocComment.
strategies

  get-comment =
    get-annotations
    ; fetch-elem(xDocComment(id,id))
    <+ inherited-comment
    <+ !xDocComment([],[])
Another piece of contextual information that is stored is whether a rule definition defines a dynamic rule. When encountering constructs implying the generation of dynamic rules, rules( ... ) or override rules( ... ), we generate a scoped dynamic rule, is-dynamic, which represents a test that succeeds when used within the scope.
strategies

  dynamic =
    (DynamicRules(id) + OverrideDynamicRules(id))
    ; rules( is-dynamic : t -> t )
Now that we have discussed the generation of the contextual information that is needed for the translation of Stratego to xDocInfo, we will discuss the strategies which perform this translation. When encountering an Imports term, we add the child of this term to the imports config table. We are using these config tables because they make it easy to gather information over the whole traversal, without having to pass an environment or use dynamic rules as global variables. It is not the most beautiful and pure implementation imaginable, but it does make the program a lot easier to understand.
strategies

  imports =
    Imports(<extend-config>("imports",<id>))
Besides the imports, we have to gather the definitions that are specified in the module. Stratego has the following classification of definitions:

- Constructor
- Overlay
- Rule
- Dynamic rule
- Strategy

In Figure 3.6 we can see what happens for each definition we encounter. We translate each Stratego definition into an xDoc Definition. As Stratego does not really have a type system, we only state in the parameter list whether a parameter is a Strategy or an ATerm.
Another important issue is the generation of a unique identifier for a definition, which is done by calculating a checksum of the term representing the definition. This checksum is mainly used in the generation of the code browser.
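Sketched as a strategy, the identifier computation amounts to the following (definition-id is a name introduced here only for illustration; rm-annotations and checksum are the library strategies already used in the defs strategy above):

```stratego
strategies

  // Illustrative sketch: strip annotations so that they do not
  // influence the identifier, then compute the checksum of the
  // remaining term. Equal definitions thus receive equal, stable
  // identifiers across runs.
  definition-id =
    rm-annotations
    ; checksum
```

Since the checksum depends only on the abstract syntax tree of the definition, an anchor derived from it stays valid as long as the definition itself does not change.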
3.6 Use-definition matching
In the previous section we described the procedure of gathering all necessary information regarding the modules, definitions, and uses from the source code. However, as that analysis and translation is done at module level, without knowledge of the rest of the system, we cannot yet determine which definitions are referred to by the uses. This analysis needs all the information on the other modules and definitions to be available, and is therefore implemented on the full xDocInfo term. In Figure 3.7 we can see the implementation of the tool that performs the use-definition matching for the Stratego language, stratego-match. The tool tries to annotate each Use construct in the xDocInfo term with a list of tuples representing the definitions the use refers to. When it encounters a use, it determines from the type of the use, for Stratego either a Call or a Constructor application, which types of definitions it can refer to. Using the list of allowed types, the name of the use, and the list of argument types, we collect all the definitions in the transitive closure of the imports which correspond to these properties; this is implemented in the get-definitions strategy. The tuples in the annotated list contain the name, the unique identifier of the definition, and a relative URL, which we can use in the API and code browser generation.
3.7 Code examples
Documentation without code examples is crippled documentation. Developers need examples to understand how to use code, instead of only knowing what the code is supposed to do. The problem with putting code examples in the documentation is that they should be kept up-to-date with the language and library. So putting the code into, for example, documentation comments is not an option, as that code is neither syntactically nor semantically checked. A good way to solve this problem is to use real programs, unit tests, which are checked at compile and run time. In xDoc we have implemented this only for Stratego. SUnit is a unit testing framework for Stratego. By specifying tests that apply a strategy to a specific term and compare the result to the expected output, it is determined whether new developments influence existing code. In SUnit, there are tests that should succeed, apply-test, apply-and-check and testing, and tests that should fail, apply-and-fail. In Figure 3.8 we can see the implementation of the tool test2text, which takes an abstract syntax tree of a Stratego module and gathers all tests in the module. This is implemented by transforming each test to the code that is in fact executed when running the tests. Depending on whether the test should or should not fail, we surround it with a corresponding HTML element. In the standard stylesheet of xDoc, code that should succeed is coloured green and code that should fail is coloured red. Consider for example the following sunit test.
module stratego-match
imports lib xdoc-utils stratego2xdoc pack-graph

strategies

  io-stratego-match =
    io-wrap(stratego-match)

  stratego-match =
    where(generate-internal-rules)
    ; rec x({| current-module : try(mod + use) ; all(x) |})

  use : Use(n,np,cnn,tp) -> Use(n,np,cnn,tp){ls}
        where ( <use-to-def-sort> tp
              ; filter(\ sort -> <get-definitions(|<current-module>,sort)>(n,np) \)
              ; concat
              ; filter(\ (n,nn) -> (n, nn, <concat-strings>
                          [ <definition-to-module> nn , "/"
                          , <definition-to-module> nn , ".html#", nn ]) \)
              <+ ![] ) => ls

  get-definitions(|mod,n) =
    ?(cn,np)
    ; <pack-imports ; map(module-to-definitions <+ ![]) ; concat> mod
    ; filter({ nn: Definition(?cn,<eq>(<map(Snd)>,np),?nn,id,id,id,id,id,?n)
             ; !(cn,nn) })

  mod =
    ?Module(n,_,_,_,_,_,_,_,_,_)
    ; rules( current-module : _ -> n )

rules

  use-to-def-sort : "Constructor application" -> ["Overlay","Constructor"]
  use-to-def-sort : "Call" -> ["Rule","Strategy","Dynamic Rule"]
// Illustrative test values: apply conc to a pair of lists and
// compare the result with the expected concatenation.
apply-test(!"conc test", conc, !([1,2],[3,4]), ![1,2,3,4])
module test2text
imports lib Stratego stratego-xt-xtc-tools xdoc-utils string-misc

strategies

  io-test2text =
    io-wrap(collect-tests)

  collect-tests =
    collect-all(succeeds + fails)

  succeeds =
    succeeds-match
    ; (try(un-double-quote), own-pp-str(|"succeed"))

  succeeds-match =
       \ |[ apply-test(!str:n,s,!t1,!t2) ]|      -> (n, Strategy |[ <s> t1 => t2 ]|) \
    +  \ |[ apply-test(!str:n,s,!t1) ]|          -> (n, Strategy |[ <s> t1 ]|) \
    +  \ |[ apply-and-check(!str:n,s1,!t1,s2) ]| -> (n, Strategy |[ <s1> t1 ; s2 ]|) \
    +  \ |[ testing(!str:n,s) ]|                 -> (n, Strategy |[ s ]|) \

  fails =
    fails-match
    ; (try(un-double-quote), own-pp-str(|"fail"))

  fails-match =
       \ |[ apply-and-fail(!str:n,s,!t1) ]|
    +  \ |[ apply-and-fail(!str:n,s,!t1,_) ]|
  own-pp-str(|s) =
    xtc-temp-files(
      write-to
      ; xtc-pp-astratego
      ; ?FILE(<read-text-file>)
    )
    ; <concat-strings>[ "<span class=\"", s, "\">"
                      , <string-as-chars(at-last(![])) ; html-string>
                      , "</span>" ]
Chapter 4
HTML was first introduced in 1990. The main goal was to provide a simple markup language for the exchange of scientific and other technical documents, suitable for use by people with little or no experience in document presentation. In the past decade HTML has outgrown its original purpose, and has become one of the most widely used and accepted formats for presenting all sorts of information, not only technical documents.

One could wonder why one would want to generate HTML with Stratego. One reason was the need for documentation for the Stratego language, which was preferred to be in HTML format. A second reason is to allow user-friendly demonstrations of software developed with Stratego. A good example is the online transformation service XWeb1, which gives a web interface to a transformation system, for example the Tiger compiler used in the Program transformation course at our institute. The advantage of such a transformation service is that the user only needs a web browser to use the system. The reasons mentioned so far are based on a need for HTML. But many websites and server-side applications actually transform and combine data. As this is exactly what Stratego is designed for, it is interesting to see how Stratego performs on these kinds of applications.

Until the beginning of 2003, most HTML in Stratego/XT was generated either by building an abstract syntax tree of the HTML in Stratego and then mapping this to text, or by gluing the output of tools together with a shell script. Both approaches were horrible, as we will argue. The use of an abstract syntax for the generation of HTML leads to unreadable and thus unmaintainable code, especially in situations where big pieces of HTML are generated. HTML has also grown a lot since its introduction: many new tags were introduced, and for each of these tags a new constructor has to be made. The use of shell scripts is also a big hassle.
It is string-based, which does not allow checking of the well-formedness and validity of the resulting HTML document. Another issue is the configuration of shell scripts. A preprocessing step has to be done to add the paths to the tools which the script wraps.
1 Developed by Niels Janssen. http://www.stratego-language.org/Stratego/XWeb
rules

  link-one : (name,url) -> %><a href="<% !url %>"><% !name :: cdata %></a><%

  link-two : (name,url) ->
    Element( QName(None,"a")
           , [ Attribute( QName(None,"href")
                        , DoubleQuoted([Literal(url)]) ) ]
           , [ Text([Literal(name)]) ]
           , QName(None,"a") )
Figure 4.1: Simple link example - concrete vs. abstract syntax (Stratego)
4.2 xml-tools
The mentioned reasons have led to the necessity of a simple way to generate HTML from Stratego. This is exactly what the xml-tools2 package offers. The xml-tools package was designed in the first place to connect Stratego/XT to the real world, in which the XML format is used much more than the ATerm format used in the Stratego/XT project. In the xml-tools package, a syntax definition in SDF is available for the XML language. As any language for which an SDF syntax definition is available can be embedded in Stratego, it is also possible to embed XML into the Stratego language, and as XHTML 1.0 is a reformulation of HTML 4.01 in XML, we can use Stratego to generate HTML. The embedding of XML in Stratego defines a system to switch from Stratego to XML, called quotation, and to switch from XML back to Stratego, called anti-quotation. The symbols %> and <% open and close a quotation; conversely, <% and %> open and close an anti-quotation. In Figure 4.1 we can see a small example of the use of XML concrete syntax in Stratego. Both rules in the example transform a pair of a name and a URL, which are strings, to an A element, representing a hyperlink in HTML. As can be seen, the rule defined using concrete syntax is much more readable, even though this is a really small piece of XML.

HTML generation in xDoc

For the details of the generation of HTML in xDoc, including helper strategies, please refer to Appendix A.3.
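To further illustrate the quotation and anti-quotation mechanism, the hyperlink of Figure 4.1 can be embedded in arbitrary surrounding markup (a sketch; link-par is a hypothetical rule, not part of the xDoc sources):

```stratego
rules

  // Hypothetical rule: wrap the hyperlink in a paragraph.
  // %> ... <% switches from Stratego to XML (quotation);
  // <% ... %> switches back to Stratego (anti-quotation).
  link-par : (name,url) ->
    %><p>See <a href="<% !url %>"><% !name :: cdata %></a>.</p><%
```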
2 Developed by Martin Bravenboer. http://www.stratego-language.org/Stratego/xml-tools
Chapter 5
API documentation
One of the two major parts of the documentation generated by xDoc is the API documentation. In this chapter we will discuss its implementation. API documentation should give an abstract view of the structures defined in the source code of a system, and their appearance to the outside. The API documentation of different languages often has a lot of similarities in structure and appearance. As we have chosen the structure of xDocInfo to fit multiple possible languages, the structure of xDocInfo is very close to the structure of typical API documentation. Therefore the translation of xDocInfo to the HTML documentation is relatively straightforward. In this chapter we will first give an overview of the generated API documentation and its implementation. In the first section we will discuss the contents of the API documentation and show the mapping between an xDocInfo term and the resulting documentation.
5.1
The API documentation generated by xDoc is divided into four parts:

- Alphabetical indices
- Global level documentation
- Package level documentation
- Module level documentation

In this section we will show the mapping of xDocInfo to the actual documentation, using screenshots of documentation generated for the Stratego language.

5.1.1 Alphabetical indices
Alphabetical indices are an important part of good API documentation, as they make it easier to search through the documentation. In xDoc we generate three kinds of alphabetical indices.
- Index of modules
- Index of all definitions
- Indices of definitions for each kind of definition

All the alphabetical indices look the same; only the contents are different. The module index lists the modules in the documented system, showing the name and, if available, a short description of the module. The definition indices, which are generated per kind of definition, show the name of the definition together with a short description if it is available. A special case of the definition index is the index containing all definitions of the system, which can be useful when the users of the documentation do not know what kind of definition they are looking for. In Figure 5.1 we can see an example xDocInfo term which we will use to illustrate the mapping of xDocInfo to the API documentation. The figure also shows the alphabetical index of strategies which corresponds to the given xDocInfo term. The alphabetical indices are generated from the complete xDocInfo term.

5.1.2 Global level
The global level API documentation consists of one HTML page. This page contains the following information about the documented system:

- Description of the system
- Possible visualizations
- Index of packages
- Statistics of the system

Figure 5.2 shows the mapping of xDocInfo to the global level API documentation. The global level API documentation starts with the description of the system, which was gathered in the fact extraction phase. It is normally followed by possible visualizations, but in this case there are none. The possible visualizations are followed by a list of the packages in the system, which contains hyperlinks to the corresponding package level API documentation. Finally, the list of packages is followed by some statistics of the whole system, describing the number of packages, modules and definitions. It also shows the percentages of the modules and definitions that are documented using an xDocComment.

5.1.3 Package level
The package level API documentation consists of one HTML page for each package in the documented system. This page contains the following information about the corresponding package:

- Description of the package
- Possible visualizations
- Statistics of the package
xDocInfo(
  "Test packages"
, "No description available..."
, [ Package(
      "package1"
    , "/home/rbvermaa/tet/package1"
    , "No description available from package directory."
    , ["package1"]
    , [ Module(
          "string"
        , "string.str"
        , "Stratego"
        , 33
        , "module string"
        , []
        , []
        , ["lib"]
        , xDocComment(
            [Paragraph(["Some string strategies. "])]
          , [xDocField("@author", ["John Doe"])]
          )
        , [ Definition(
              "unescape-chars"
            , [("s", "Strategy")]
            , "16t-102t-117t24t32t35t-104t-63t-100t-81t-118t-75t85t87t119t124"
            , "ATerm"
            , "unescape-chars(Strategy s)"
            , ""
            , xDocComment(
                [Paragraph(["Unescapes characters using a specified unescape strategy."])]
              , [ xDocField("@author", ["Armijn Hemel"])
                , xDocField("@param", ["List(Char) -> List(Char)"])
                ]
              )
            , [ Use("try", ["Strategy"], "-26t93t-32t57t-13t-39t77t-45t90t108t-67t-102t47t-103t27t65", "Call"){[]}
              , Use("s", [], "", "Call"){[]}
              , Use("x", [], "-72t5t-122t-77t24t-95t-29t-60t-56t27t63t-114t73t-13t95t-77", "Call"){[]}
              ]
            , "Strategy"
            )
          , Definition(
              "escape-chars"
            , []
            , "-99t-33t82t126t-64t124t-59t-31t69t49t125t-121t21t24t36t43"
            , "ATerm"
            , "escape-chars"
            , ""
            , xDocComment(
                [Paragraph(["Escapes double quotes, backslash and linefeed to C like escape sequences."])]
              , [xDocField("@since", ["0.9.4"])]
              )
            , [ Use("Escape", [], "50t47t69t46t-68t77t75t5t-39t-40t-111t-70t-55t-17t-126t-28", "Call"){[]}
              , Use("x", [], "-72t5t-122t-77t24t-95t-29t-60t-56t27t63t-114t73t-13t95t-77", "Call"){[]}
              ]
            , "Strategy"
            )
          ]
        )
      ]
    )
  ]
)
Figure 5.1: xDocInfo term with corresponding alphabetical index of strategies (Stratego).
xDocInfo(
  "Test packages"
, "No description available..."
, [ Package(
      "package1"
    , "/home/rbvermaa/tet/package1"
    , "No description available from package directory."
    , ["package1"]
    , [ ... ]
    )
  ]
)
Figure 5.2: xDocInfo term with corresponding global level API documentation.
- Summary index of modules

Figure 5.3 shows the mapping of a partial xDocInfo term, representing a package, to the package level API documentation. This part starts with the description of the package, which was gathered in the fact extraction phase. The description is followed by possible visualizations for this level; in the case of the screenshot there are no visualizations at this level. The visualizations are followed by some statistics, which are almost the same as in the previous level, except that they are calculated on the contents of the current package. The statistics are followed by a summary list of the modules that are in the package. The summary of a module is taken from a possible xDocComment gathered in the fact extraction phase, taking the first sentence of its description.
5.1.4 Module level
The module level API documentation consists of one HTML page for each module in the documented system. This page contains the following information about the corresponding module:

- Description of the module
- Possible visualizations
- Statistics
- Summary list of definitions, per kind of definition
- Detailed list of definitions, per kind of definition

In Figure 5.4 we can see the module level API documentation which corresponds to the only module in the given example. It starts with a header for the module, followed by two fields shown separately, Author and Since, which are taken from a possible xDocComment documenting the module. These fields are followed by a full description of the module, which also originates from a possible xDocComment. This is followed by the possible visualizations, in this case a local import graph, and the statistics of the module. The next part of the module level API documentation is the summary list of the definitions contained in the module. This is an alphabetically sorted list which shows the definitions divided by kind of definition, and, if available, also a short summary of each definition. It also contains hyperlinks to the detailed description of a definition and to the source code, which is in the same HTML file as this list. The detailed definition list shows all the definitions in the module for which relevant information is available, in other words, definitions which have been documented using an xDocComment. The detailed definition index is also categorized using the classification of the definitions, and visualizes all the information taken from the xDocComment.
Special tags

Besides the description, xDoc comments have so-called tags. These tags are in essence just key-value pairs, but in some cases we want to show the information in these tags in a different way. For this purpose xDoc has some tags that have a special meaning.
Package(
  "package1"
, "/home/rbvermaa/tet/package1"
, "No description available from package directory."
, ["package1"]
, [ Module(
      "string"
    , "string.str"
    , "Stratego"
    , 33
    , "module string"
    , []
    , []
    , ["lib"]
    , xDocComment(
        [Paragraph(["Some string strategies. "])]
      , [xDocField("@author", ["John Doe"])]
      )
    , [ ... ]
    )
  ]
)
Figure 5.3: xDocInfo term with corresponding package level API documentation.
Here we will give an overview of these tags and how they are processed in xDoc. The following tags can be used in any xDoc comment.

@since This tag can be used to specify from which version on the corresponding module or definition is available. The value itself is shown without any interpretation, but it is not shown in the standard list of tags that is generated.

@author This tag can be used to specify the author of the corresponding module or definition. As with the @since tag, the value is shown without interpretation, on a fixed place.

The following tags can be used in xDoc comments that document definitions.

@param This tag can be used to give a description of a parameter of a definition. There can be more than one @param tag, in which case the n-th @param gives the description for the n-th parameter of the definition. This information is combined with the names and possible types of the parameters.

@obsolete This tag can be used to filter definitions that have been marked obsolete out of the documentation. Obviously we do not want developers to use these definitions, so they are not shown in the documentation.

@inc This tag can be used to include visualizations of tests, which might have been gathered in the fact extraction phase. Given a name, it looks up the tests and shows them in a separate box.
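Putting these tags together, an xDoc comment for the unescape-chars strategy of Figure 5.1 might look as follows (a sketch; the field values are taken from the xDocComment term in Figure 5.1, and a Javadoc-style comment syntax is assumed):

```stratego
/**
 * Unescapes characters using a specified unescape strategy.
 *
 * @author Armijn Hemel
 * @param  List(Char) -> List(Char)
 */
unescape-chars(s) = ...
```

The fact extraction phase turns such a comment into the xDocComment term shown in Figure 5.1, from which the description, the Author field and the parameter description are rendered at their fixed places in the module level documentation.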
Chapter 6
Code browser
An important part of xDoc's view on documentation is the importance of the presence of the original source code in the documentation. Generated documentation is usually a snapshot of a system. Adding the possibility to browse through the code and to switch from API documentation to code view and back should give the developer better insight into the system. It is also important for systems that are not very well documented, as was for example the case with Stratego, where the code of the library was its own documentation. Browsing through the system's source code helps in searching and understanding the system. The code browser generator in xDoc is in fact a cross-reference tool, such as for example the Linux Cross Reference1 tool. Cross-referencing tools describe the relations between the definitions and uses in a program, typically resulting in a hyperlinked version of the source code. This chapter will discuss the implementation of the code browsing facility that xDoc offers, for which an example was shown in Figure 2.2.
6.1 General
In Stratego/XT there was already a tool for creating an HTML view of source code, called tohtml. However, it does not support cross-referencing and has only limited support for syntax highlighting. It is part of the Generic Pretty-printing (GPP) package. To allow cross-referencing we need to use the information we have gathered in earlier stages of the documentation process. This information was gathered by analyzing the abstract syntax trees of the modules of the language. Parsing a piece of code and converting it to an abstract syntax tree introduces a problem: we lose the connection between the source code and the abstract syntax tree, and therefore also the connection between the source code and our analysis. We could generate the code views by pretty-printing the abstract syntax trees to text. This would make it quite easy to generate hyperlinked code views, as GPP's BOX language has constructs for a LABEL (anchors) and a REF (hyperlinks). Pretty-printing has the advantage that all code should look pretty consistent with respect to the layout. However,
1 http://lxr.linux.no
Figure 6.1: Overview of the generation of the code views: the AsFix terms are processed by implode-asfix and asfix-extra, combined with the xDocInfo term, and rendered into the code view by asfix-2-html4.
whitespace is often used by programmers to clarify pieces of code and their structure. In practice, pretty-printing is actually not as pretty as one would like. We would actually want to support both situations: code views for both original and pretty-printed code. Pretty-printed code views can be useful in the case of the use of concrete object syntax. Wanting to generate both kinds of source code views, we obviously do not want to specify the transformation of the code, either for the textual or the abstract syntax tree format, twice. We have solved this by using the AsFix2 output that is generated by sglr. The AsFix2 format is a complete representation in ATerm format of the parse tree constructed by the parser. It includes all information required to reproduce the original input. In typical Stratego/XT applications AsFix2 is only an intermediate format that is immediately imploded to an abstract syntax tree. In xDoc we will use the AsFix2 terms as the main source for generating code views. In Figure 6.1 we can see a schematic overview of how both kinds of code views are generated. From the fact extraction phase, all the sources have already been parsed and the results, AsFix2 terms and abstract syntax trees, stored in a data directory. To allow the generation of a pretty-printed version of the code view, we have to convert the abstract syntax tree to AsFix2. This is done by pretty-printing the abstract syntax tree using ast2abox and abox2text from GPP. The result is then parsed again by sglr, resulting in an AsFix2 term. The procedure for the pretty-printed code view, converting the abstract syntax tree to AsFix2 using pretty-printing and parsing again, might seem inefficient, but it also gives us the opportunity to test our pretty-printer, as all the output of the pretty-printer is checked by the parser. This way we have found many small mistakes in, for example, the pretty-printer for Stratego.
rules

  all-other-asfix :
    t@appl(prod(ss,s,attrs(atts)),l) -> appl(prod(ss,s,attrs(atts)),l1)
    where <fetch-elem(?term(cons(<id>)))> atts
        ; (  match-config(|"--constructors") => c
             ; <asfix-syntaxhighlight(|c)> t
          +  match-config(|"--defs") ; <asfix-defs> t
          +  match-config(|"--uses") ; <asfix-uses> t
          +  match-config(|"--meta") ; <asfix-concrete> t
          ) => l1
Figure 6.2: Matching constructor names (Stratego)

So now we have representations of both the original source code and the pretty-printed source code in AsFix2 format. We partially implode these AsFix2 terms using implode-asfix, to make life a bit easier. The necessary HTML elements are then added to the resulting AsFix2 term using asfix-extra, which also gets an xDocInfo term containing all our analysis information. The resulting HTML-annotated AsFix2 term is then yielded into text by asfix-2-html4, which also escapes certain characters so that the text can be included in a PRE element in HTML.
6.2 Transforming AsFix
The asfix-extra tool performs four tasks that are necessary to create the code views in xDoc:

- Add syntax highlighting
- Add anchors
- Add hyperlinks
- Add concrete syntax visualization

The fact that an AsFix2 term has all information contained in it also means that AsFix2 terms are huge. The size of these terms makes it almost impossible for developers to perform clearly written transformations on them, which obviously leads to unmaintainable code. So AsFix should be hidden as much as possible. In xDoc we have chosen to hide any form of AsFix transformation from the users. The asfix-extra tool does the ugly work for us, leaving us only to specify lists of constructor names of definitions, uses, syntax-highlighted constructs and meta-transition markers in the xDoc configuration file. The asfix-extra tool performs a top-down traversal in which it tries to find AsFix terms representing constructor applications. The rule that implements this transformation is shown in Figure 6.2. When such a term is found, it checks whether the constructor name is in one of the lists provided by the user in the xDoc configuration file and performs the corresponding addition of HTML elements to the AsFix term. In the next sections we will elaborate on what exactly happens in each case.
strategies

  asfix-syntaxhighlight(|c) =
    ?appl(_,l)
    ; <span(|c)> l

  span(|class) =
    <concat>[ [ <concat-strings>["<span class=\"", class, "\">"] ]
            , <id>
            , [ "</span>" ] ]
6.2.1 Syntax highlighting

Syntax highlighting is a feature which is often used in text editors, cross-referencers and other software that presents source code. By using colors or text decorations we can enhance the understanding of the code, as developers can more easily see what a certain part of the code contains. Syntax highlighting in xDoc consists of a language-independent part and a language-specific part. The language-independent syntax highlighting highlights all the keywords and literals of a language. It is relatively straightforward to implement this, as all the literals can be recognized by lit nodes in the AsFix terms. In Figure 6.3 we can see that when we encounter a lit term, we surround the string representing the literal with a SPAN (HTML) element with class keyword. Using CSS2 we can alter the appearance of these SPAN elements, for example rendering them bold face. The language-specific part consists of adding syntax highlighting to constructs that are specified per language in the configuration file of xDoc, using the constructor names. In Figure 6.3 we can see the strategy asfix-syntaxhighlight which implements this. When we have found such a constructor in the AsFix term, we again add a SPAN element to the AsFix term, but this time using the constructor name as the value for the class attribute of the HTML element. This allows the user to adapt the appearance of the specific constructors in CSS, like the example below which changes the appearance of an xDocComment SPAN element.
span[class="xDocComment"] {
  color : #008800 ; /* xDocComment is rendered dark green */
}
6.2.2 Anchors and hyperlinks
The syntax highlighting did not require any of the information from the analysis of the source code that we have stored in the xDocInfo term. Therefore the transformations are straightforward. The implementation of the generation of anchors and hyperlinks is shown in Figure 6.4. When we want to add hyperlinks to the source code, we need anchors to let the browser know where to jump to. We want to place an anchor in front of each definition we want to link to. At this point we need to make the connection between the AsFix term and
strategies

  asfix-defs =
    ?appl(_,l)
    ; implode
    ; checksum => checksum
    ; <anchor(|checksum)> l
    ; rules( current-def : _ -> checksum )

  asfix-uses =
    ?appl(_,l)
    ; implode
    ; <get-config>(<current-def>,<checksum>) => [ (n,_,link) | _ ]
    ; ![ <link(|link)> n | <Tl> l ]

  anchor(|name) =
    ![ <concat-strings>["<a name=\"", name, "\"></a>"] | <id> ]

  link(|link) =
    <concat-strings>["<a href=\"", link, "\">", <id>, "</a>"]
the xDocInfo term, which represents the gathered information about the documented system. When we have found a constructor application in an AsFix term of a constructor that is defined in the configuration file as a definition constructor, we call the asfix-defs strategy, which is listed in Figure 6.4. As explained in Section 3.5, for each definition found we have calculated a checksum of the term representing its abstract syntax tree. We use this checksum as the name for the anchor. As the AsFix term includes the abstract syntax tree in encoded form, we can obtain this abstract syntax tree and its checksum by imploding the AsFix term representing a definition and performing the checksum calculation, which we use in the anchor being inserted. In the traversal which implements the AsFix transformation we also store the current definition we are in, using the scoped dynamic rule current-def. Now that we have anchors to link to, we have to add the hyperlinks. When we have found a constructor application in an AsFix term of a constructor that is defined in the configuration file as a use constructor, we call the asfix-uses strategy. In Section 3.6 we showed how we match the uses to the corresponding definitions. When we find a use in the AsFix term, we need to know where the use should link to. Again we need information from the xDocInfo term. This is collected similarly to the procedure for anchors: we implode the AsFix term and calculate the checksum of the resulting abstract syntax tree. However, this checksum is not enough to identify a use, as it also depends on the place, that is, the definition, in which it is used. We use the current-def scoped dynamic rule for this purpose. As a use can point to more than one definition, we run into a problem: how can we link to multiple definitions, when a hyperlink can point to only one place? In xDoc we have chosen to point only to the first definition in our list of used definitions. In a future version of xDoc
this will be replaced by a construction that uses CSS to show a drop-down menu with the possible hyperlinks for used definitions.

Most cross-referencers are for online use, which means the cross-references are calculated when clicking on a hyperlink in the source code, after which a coupling page appears, showing where the definition is defined and where it is used. As xDoc aims to generate documentation that can also be used offline, xDoc does all the cross-reference calculations at generation time.

6.2.3
Concrete syntax visualization
The xDoc code browser also supports visualization of the inner workings of a module that is defined using concrete object syntax as described in [22]. The pretty-printed code view can be very useful, as it shows the code as it is transformed into terms of the meta-language. In the pretty-printed code view we get this automatically, as the abstract syntax tree has already been normalized to the meta-language. But in some cases we can even show the inner workings of the concrete object syntax in the code view with original layout, as is shown in Figure 2.3.

To accomplish this we make use of the fact that these modules with concrete object syntax contain constructors that indicate that a piece of the abstract syntax tree is in the object-language format. We call these constructors meta-transition markers. When we encounter a constructor application of such a meta-transition marker, which are given to us in the configuration file, we follow a procedure that is also performed in the Fact Extraction phase described in Section 3.2. The strategy make-ast implements this visualization. The AsFix term is imploded to a mixed abstract syntax tree, after which we apply meta-explode, which translates the object syntax fragments of the abstract syntax tree into terms of the meta-language. The result of this translation is then pretty-printed using ast2abox and abox2text, which yields a string representing the concrete object syntax fragment, showing how it would look had it been written in the meta-language.

To represent this visualization in HTML we put a SPAN element around the concrete object syntax fragment, which shows the visualization only when the user of the documentation hovers over the fragment with the mouse. This span also allows us to change the colors of the object syntax fragments, creating an even bigger visual distinction between meta- and object-language. These appearance issues are all implemented in the CSS stylesheet.
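The nested SPAN structure that this produces can be sketched in a few lines. The following Python is an illustrative sketch, not xDoc code; the class names follow the asfix-concrete strategy shown in Figure 6.5, but the helper name is hypothetical:

```python
def concrete_syntax_span(fragment_html, ast_view):
    """Wrap a concrete object syntax fragment in nested SPANs: the outer
    span marks the fragment for CSS colouring, the inner span holds the
    pretty-printed meta-language view shown on mouse-over."""
    return ('<span class="concrete"><span class="ast">%s</span>%s</span>'
            % (ast_view, fragment_html))

# A hypothetical fragment and its meta-language view:
fragment = concrete_syntax_span('|[ e + 0 ]|', 'Plus(Var("e"), Int("0"))')
```

The CSS stylesheet can then hide the inner `ast` span by default and reveal it on hover of the enclosing `concrete` span.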
strategies

  asfix-concrete =
    ?appl(_, l)
    ; make-ast => view
    ; <concat> [ [<concat-strings> [ "<span class=\"concrete\"><span class=\"ast\">"
                                   , view
                                   , "</span>" ]]
               , l
               , ["</span>"] ]

  make-ast =
    xtc-temp-files(
      xtc-transform(!"implode-asfix", !["--layout", "--appl", "--inj", "--lit"
                                       , "--alt", "--seq", "--list", "--lex", "--pt"])
      ; fix-implode
      ; concat-content2
      ; write-to
      ; xtc-transform(!"meta-explode")
      ; xtc-transform(!"stratego-ensugar")
      ; xtc-transform(!"ast2abox", ![ "-p", <get-config ; xtc-find> "--pp-table"
                                    | <pass-verbose> ])
      ; xtc-transform(!"abox2text")
      ; ?FILE(<read-text-file>)
      ; html-string
    )
Figure 6.5: Adding AST view to concrete object syntax parts (Stratego)
Chapter 7
Visualizations in xDoc
Creators of software visualization tools often claim to have wonderful tools that have a significant impact on the deeper understanding of software systems. Even though they often generate beautiful visualizations, often no indication is given of the effectiveness of the visualization on the development or understanding process. Effectiveness of visualization is a vague expression; it is not easy to determine what is a proper and insightful visualization for a specific situation. Therefore visualizations need to be developed together with the users of the documentation, the programmers. Often even simple visualizations can give insight into a software system. Good visualizations are often found by just playing around, trying to find a proper visualization. Even though experimenting is good, we still need to ask what the purpose of visualizing something in a particular way is, and always try to focus on the practical side of the visualization. This is also the approach we have followed when developing the currently implemented visualizations in xDoc. In this chapter we will look at what a visualization is in xDoc and what kinds of visualizations we have developed.
7.1
Most of the documentation generators that exist do not really allow the addition of new visualizations. In our opinion this is not a good situation, as a good visualization depends on the users of the visualizations. Therefore, xDoc supports the addition of new visualizations to the documentation using xDoc visualization components. An xDoc visualization component is a tool that creates a visualization based on information contained in the input xDocInfo term and generates an XHTML term representing the visualization. The command-line options needed for an xDoc visualization component are shown below.
Options:
  -i f | --input f    Read input from f
  -o f | --output f   Write output to f
  --overall           Generate visualization for overall level
  --package pack      Generate visualization for package pack
  --module mod        Generate visualization for module mod
  --definition def    Generate visualization for definition def
  --html d            Output directory for HTML documentation
  --tmp d             Directory for temp. files
Obviously it is not always possible to capture the whole visualization in a piece of HTML, e.g. when we are generating images, which typically reside in a different file. Therefore we have to tell the xDoc visualization component in which directory we are generating the HTML documentation. An xDoc visualization component also needs to know the level for which it needs to generate the visualization, including an identifier specifying the corresponding package, module or definition.
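As an illustration of this interface, the options above could be mirrored as follows. This is a hypothetical Python sketch of the command line only; actual xDoc visualization components are Stratego tools:

```python
import argparse

def visualization_options():
    """Command-line interface of an xDoc visualization component,
    mirroring the option listing above."""
    p = argparse.ArgumentParser()
    p.add_argument("-i", "--input", metavar="f", help="Read input from f")
    p.add_argument("-o", "--output", metavar="f", help="Write output to f")
    p.add_argument("--overall", action="store_true",
                   help="Generate visualization for overall level")
    p.add_argument("--package", metavar="pack",
                   help="Generate visualization for package pack")
    p.add_argument("--module", metavar="mod",
                   help="Generate visualization for module mod")
    p.add_argument("--definition", metavar="def",
                   help="Generate visualization for definition def")
    p.add_argument("--html", metavar="d",
                   help="Output directory for HTML documentation")
    p.add_argument("--tmp", metavar="d", help="Directory for temp. files")
    return p

# Example invocation for a module-level visualization (file names invented):
args = visualization_options().parse_args(
    ["-i", "in.trm", "--module", "list-misc", "--html", "doc/"])
```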
7.2
Graphs
Visualizations are often in a graphical form, and a graph is a good model for many problems in computer science. Visualizations using graphs come in two forms: static and interactive. Static visualization of graphs is the most common sort, typically producing an image in a format like GIF or JPEG. It is relatively easy to generate and to integrate into documentation of any form. Interactive visualization of graphs differs in that the user is able to interact with the graph: changing the appearance, hiding information, or any other form of manipulation. This interaction could give better insight; however, interactive graph visualizations tend to require special programs, and are therefore not really suitable for offline documentation. In xDoc we therefore only generate static graphs, even though it would also be possible to use interactive visualizations.

A big problem when generating graphs is the arrangement of the nodes and edges. This problem is a research area of its own, so we do not want to touch this subject ourselves. In fact, we would just want to specify the graph in some format, and let everything concerning graph drawing be done by an external program. This is exactly what GraphViz1 offers us. GraphViz is one of the best freely available static graph visualizers. It has its own graph definition language, called Dot. In StrategoXT we have a syntax definition available for the Dot language in SDF. This offers us the possibility to use concrete syntax for Dot in our Stratego programs. In Figure 7.1 we can see an example Stratego program generating a small graph with two nodes and an edge between them. This is implemented in the graph strategy. We build a Dot statement, an edge from node1 to node2. We then build the surrounding Dot constructs, specifying some properties of how the graph should look.
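Stripped of the concrete-syntax machinery, the Dot text that such a program produces can be sketched as follows. This is an illustrative Python sketch; the helper name is an assumption, and xDoc itself builds the same structure with concrete Dot syntax in Stratego:

```python
def dot_graph(name, edges):
    """Render a minimal directed graph in Dot syntax: a digraph header,
    one edge statement per pair, and a closing brace."""
    lines = ["digraph %s {" % name]
    lines += ["  %s -> %s ;" % (a, b) for (a, b) in edges]
    lines.append("}")
    return "\n".join(lines)

# The two-node example from Figure 7.1:
dot = dot_graph("example", [("node1", "node2")])
```

The resulting string is what gets handed to the external dot program for layout and rendering.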
This resulting Dot abstract syntax tree is passed to the gen-dot-output strategy, which pretty-prints the abstract syntax tree to concrete Dot syntax, using the XTC tools ast2abox and abox2text. As graphs sometimes tend to be big, the result can get quite messy. GraphViz offers a tool that performs a transitive reduction of the input graph, and xDoc offers a command-line option to enable this transitive reduction. Whether it is useful should be determined by the users that generate the documentation. The number of edges will decrease quite a bit, but users
1 http://www.research.att.com/sw/tools/graphviz
strategies

  graph =
    !Stmt |[ node1 -> node2 ; ]|
    ; make-graph
    ; gen-dot-output(|"example")

  make-graph =
    ?stmts
    ; !Graph |[
        digraph whatever {
          ratio=compress;
          size="5";
          graph [ color = black, bgcolor = transparent ] ;
          edge [ color = black ] ;
          node [ color = black, fillcolor = aliceblue, style=filled
               , fontcolor = black, fontsize=8, height="0.25" ] ;
          stmt*:stmts
        }
      ]|

  gen-dot-output(|bn) =
    ?adot
    ; let create-tmp-file = <concat-strings> [<get-tmp-dir>, bn, <id>]
          create-gif-file = <concat-strings> [<get-html-dir>, "images/", bn, ".gif"]
      in  <create-tmp-file> ".adot" => fadot
        ; <create-tmp-file> ".dot"  => fdot
        ; <create-tmp-file> ".dot2" => fdot2
        ; <create-tmp-file> ".map"  => fmap
        ; <create-gif-file> ".gif"  => fgif
      end
    ; <WriteToTextFile> (fadot, adot)
    ; xtc-temp-files(
        !FILE(fadot)
        ; xtc-transform(!"ast2abox", ![ "-p", <xtc-find> "Dot-pretty.pp" | <pass-verbose> ])
        ; xtc-transform(!"abox2text", pass-verbose)
        ; rename-to(!fdot)
      )
    ; ( tred-wanted < ( <xtc-command(!"tred1")> [fdot, fdot2] ; !fdot2 ) + !fdot )
    ; ?fdot3
    ; <xtc-command(!"dot")> ["-Tgif", "-o", fgif, fdot3]
    ; <xtc-command(!"dot")> ["-Tcmap", "-o", fmap, fdot3]

Figure 7.1: Generating an example graph with Dot (Stratego)
of the documentation might interpret the graph differently. Therefore, the default behavior is that no transitive reduction is performed on the graphs. So the result after pretty-printing is either given to the transitive reduction tool tred or left untouched, after which it is passed twice to the dot tool, which does the actual graph drawing: once to generate a GIF file, and once to generate an image map, which can be used in an HTML file.

Figure 7.2: Example local import graph

7.2.1
Import graphs
An important visualization in the documentation generated by xDoc for the Stratego and SDF languages is the import graph. Import graphs are generated at the global, package and module level. The purpose of the import graphs is twofold: they give insight into the global structure of programs, and they can reveal interesting facts, such as unintended transitive import relations. Furthermore, the import graphs at the module level have proven to be an effective way to navigate through the documentation. An import graph shows the import relations between the modules in the documented system. Each package is visualized using a subgraph containing nodes, which represent the modules in the package. An edge from node A to node B represents the "A imports B" relation. The differences between the import graphs at the different levels lie mainly in the filter that is applied to the total set of import relations in the documented system that we want to visualize. At the global level we show all packages, including all the modules and the import relations between these nodes. At the package level we show the corresponding package, including its modules and the import relations between these modules.

A more interesting import graph is the one at module level. In Figure 7.2 we can see an example of such a graph. The import graph at module level shows all the modules that are imported by the current module and all the modules that import the current module. The modules are again shown including the package they are part of. In case of an import of an unknown module, the import graph will show the unknown module outside any of the package subgraphs, indicating that the module is outside the current documentation. Omitting the import relation to that module would also have been an option, but it is
better in our opinion to show all the import relations of the current module at the module level.

In Figure 7.3 we can see the implementation of the generation of the import graph at module level. The strategy implementing this gets the name of the module for which we want to generate the import graph as a parameter, n. First, the set of nodes that are allowed to be in the graph is determined: all the modules that are imported by module n and the modules that import n, plus n itself. After this, we filter out all the Module nodes in the xDocInfo term that are not in the allowed set; so we are actually filtering the xDocInfo term. The resulting xDocInfo term is then mapped to the graph by generating the subgraphs, representing the packages including their modules. This is done by applying the package-to-subgraph and module-to-node rules, which translate the xDocInfo constructs to corresponding Dot constructs: a package to a subgraph and a module to a node. The edges of the graph are generated by the make-edges strategy, which maps each Module construct to edges from that module to its imports, translating these edges using edge-to-stmt. The subgraphs and the edges form the body of the graph, which we wrap with a standard header and footer using the make-graph strategy. The resulting Dot abstract syntax tree is then handled as explained earlier in this chapter. The implementation of the generation of import graphs at the other levels differs only in the filtering. For the complete program, please refer to Appendix A.5.1.
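The first step, computing the allowed node set, can be sketched as follows. This is an illustrative Python sketch using an assumed dictionary encoding of the import relation, not the actual xDocInfo structure:

```python
def allowed_modules(n, imports):
    """The node set of the module-level import graph: the modules
    imported by n, the modules that import n, and n itself.
    'imports' maps each module name to the list of modules it imports."""
    allowed = {n}
    allowed.update(imports.get(n, []))
    allowed.update(m for m, li in imports.items() if n in li)
    return allowed

# A hypothetical three-module system:
imports = {"list-misc": ["list-basic"], "collect": ["list-misc"], "io": []}
nodes = allowed_modules("list-misc", imports)
```

Everything outside this set is filtered out of the xDocInfo term before the Dot subgraphs and edges are generated.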
7.2.2
Class diagrams
For documenting Java, xDoc contains a visualization component that generates Java class diagrams, showing the inheritance relations of Java classes. Just like the import graphs that are generated for Stratego, the class diagrams are generated at the global, package and module level. The buildup of these graphs is like that of the import graphs, only the edges have different meanings. There are two kinds of edges: a solid edge representing an extends relation and a dotted edge representing an implements relation. The class diagrams at global and package level are similar to the import graphs described earlier at the same levels, just showing the relations between the filtered set of modules. However, the class diagram at module level is a bit different. In Figure 7.4 we can see an example class diagram at module level; it shows the complete inheritance path upwards from the corresponding class and its direct implementing classes below. For the implementation, please refer to Appendix A.5.2.
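The two edge kinds can be sketched as Dot statements. This is an illustrative Python sketch; the exact attributes xDoc emits may differ, and the class names in the example are invented:

```python
def class_edge(sub, sup, relation):
    """One inheritance edge of a class diagram: solid for 'extends',
    dotted for 'implements', as described above."""
    style = "solid" if relation == "extends" else "dotted"
    return '"%s" -> "%s" [ style = %s ] ;' % (sub, sup, style)

extends_edge = class_edge("ArrayList", "AbstractList", "extends")
implements_edge = class_edge("ArrayList", "List", "implements")
```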
7.2.3
Another example of a visualization that is provided by xDoc is the XTC composition graph. An XTC composition graph shows which XTC tools are called in an application and in which order.
strategies

  module-import-graph(|n) =
    ?xdocinfo
    ; where(
        collect( { na, li1, li2:
          ?Module(n,_,_,_,_,_,_,li1,_,_) ; ![n | li1]
          + ?Module(na,_,_,_,_,_,_,li2,_,_) ; <fetch-elem(?n)> li2 ; ![na]
        })
        ; concat => allowed
      )
    ; xDocInfo(id, id,
        filter(Package(id, id, id, id,
          filter(Module({n: ?n ; <fetch-elem(?n)> allowed}
                        , id, id, id, id, id, id, id, id, id))
          , id)
        )
      ) => filtered
    ; ?xDocInfo(_, _, <filter(package-to-subgraph(|"../"))>) => subgraphs
    ; <make-edges(|allowed)> filtered => edges
    ; <concat> [subgraphs, edges] => stmts
    ; make-graph
    ; gen-dot-output(|<concat-strings> [n, ".import"])
    ; <gen-graph-html> ("Import graph", "../../", <concat-strings> [n, ".import"])

  make-edges(|allowed) =
    collect(
      \ Module(n, _, _, _, _, _, _, li, _, _) ->
          <filter({s: ?s ; <fetch-elem(?s)> allowed ; !(n, s)})> li \
    )
    ; concat
    ; map((double-quote, double-quote) ; edge-to-stmt)

rules

  edge-to-stmt :
    (s1, s2) -> Stmt |[ id:s1 -> id:s2 ; ]|

  package-to-subgraph(|or) :
    Package(n, _, s, lp, lm, _) ->
    Node |[
      subgraph id:<conc-strings ; double-quote> ("cluster_", n) {
        label = id:<double-quote> n ;
        style = filled ;
        color = blue4 ;
        fillcolor = floralwhite ;
        node [ fontsize = 9 ] ;
        stmt*:nds
      }
    ]|
    where <map(module-to-node(|or))> lm => nds ; not([])

  module-to-node(|or) :
    Module(n, s, _, _, _, _, _, _, _, _) ->
    Stmt |[ id:<double-quote> n [ URL = id:url, fillcolor = id:color ] ; ]|
    where <concat-strings ; double-quote> [or, n, "/", n, ".html"] => url
        ; ( <get-config> "--module" ; ?n ; !"khaki1" <+ !"aliceblue" ) => color

Figure 7.3: Generating the import graph at module level (Stratego)
Figure 7.4: Example class diagram

XTC does not provide a concrete syntax for the composition language, but instead it provides a library for Stratego that allows composition using Stratego control-flow constructs. Therefore it is not straightforward to make a visualization of the tools that are used in a composition and the order in which they are applied. To extract the necessary information we have to go through a few steps:

1. Normalize the specification
2. Inline the main strategy
3. Remove irrelevant structure
4. Translate the remaining structure to a graph

Consider the following Stratego program, which is in fact an XTC composition of the tools parse-stratego, stratego-desugar and pp-stratego.
module xtc-composition
imports xtc-lib lib
strategies

  main = xtc-io-wrap(composition)

  composition =
    xtc-transform(!"parse-stratego")
    ; if-verbose1(desugar)
    ; xtc-transform(!"pp-stratego", !["--abstract" | <pass-verbose>])

  desugar = xtc-transform(!"stratego-desugar")
The first step to be performed is a normalization of the specification. The procedure is similar to what is done in the Stratego compiler (see also Figure 7.6), therefore we reuse the compiler components pack-stratego, pre-desugar, normalize-spec and spec-to-sdefs. After these tools are applied, we transform all rule definitions to strategy definitions, which results in a specification containing only strategy definitions. The normalization is necessary to allow the next step in the process: inlining. Inlining is a transformation that replaces function calls by the corresponding function body. We try to inline every call to an inlineable strategy, except for the calls to xtc-transform, as these calls represent the calls to the XTC tools in the composition. From the resulting body of the main strategy it is already a lot easier to derive which XTC components are used. But as this body still contains a lot of irrelevant control flow, calls, builds and matches, we simplify the body by removing this irrelevant information. We perform a crude simplification, after which the body contains only sequences (;), choices (+), wheres, ids and calls to xtc-transform. For the given example, the resulting simplified body looks as follows when it is pretty-printed.
( ( xtc-transform(!"parse-stratego" |)
    ; ( xtc-transform(!"stratego-desugar" |) + id )
    ; xtc-transform(!"pp-stratego", id |)
    + id )
  + id )
This piece of code is already almost the graph we want. The last step is to convert the remaining operators to the graph: for each XTC tool called, a corresponding node, and for each other operator, the corresponding edges. The final result is shown in Figure 7.5. At the moment, the XTC visualization tool is not yet fully integrated into xDoc as an xDoc visualization component; it only generates the XTC composition graph from given source code. This is because we have to determine manually which strategy of which module represents the actual program. The generated graphs could be linked to the tool invocation information, but to allow this it first needs to be determined how the tool invocation information can be incorporated into xDoc. The biggest XTC composition available at this moment is the Stratego compiler; Figure 7.6 shows the XTC composition graph of strc. For the complete implementation, please refer to Appendix A.5.3.
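This last step can be sketched with a first/last/nullable analysis, much like the one used when compiling regular expressions. The following Python is an illustrative sketch; the tuple encoding of composition terms is an assumption for illustration, not xDoc's representation:

```python
def analyse(t):
    """Return (first, last, nullable, edges) for a simplified composition
    term: 'first'/'last' are the tools that can start/end it, 'nullable'
    says whether the whole term can be skipped (an 'id' branch), and
    'edges' are the tool-to-tool flow edges found so far. Terms are
    ("call", name), ("seq", [...]), ("choice", [...]) or ("id",)."""
    kind = t[0]
    if kind == "id":
        return set(), set(), True, set()
    if kind == "call":
        return {t[1]}, {t[1]}, False, set()
    if kind == "choice":
        f, l, n, e = set(), set(), False, set()
        for fs, ls, ns, es in map(analyse, t[1]):
            f |= fs; l |= ls; n = n or ns; e |= es
        return f, l, n, e
    # sequence: connect everything that can end step i
    # to everything that can start step i+1
    f, l, n, e = set(), set(), True, set()
    for fs, ls, ns, es in map(analyse, t[1]):
        e |= es | {(a, b) for a in l for b in fs}
        if n:
            f |= fs
        l = (l | ls) if ns else ls
        n = n and ns
    return f, l, n, e

# The simplified body of the example composition above:
composition = ("seq", [("call", "parse-stratego"),
                       ("choice", [("call", "stratego-desugar"), ("id",)]),
                       ("call", "pp-stratego")])
_, _, _, edges = analyse(composition)
```

Because the desugar step is nullable, parse-stratego gets edges both to stratego-desugar and directly to pp-stratego, which is exactly the shape of the graph in Figure 7.5.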
7.3
Analyzers
7.3.1
Unused definition detection
As said before, visualizations in xDoc do not need to be graphical. We developed a visualization component for the documentation of the Stratego language which analyzes whether there are definitions that do not seem to be in use. Obviously this would not work for documenting software libraries, as most definitions of libraries are typically only used outside the library. But as documentation generators are also often used to document programs, this can be a useful tool for cleaning up the code. The tool is based on the analysis done by the use-definition matching tool described in Section 3.6, which adds annotations describing the definitions used to the Use constructs of the xDocInfo term. In Figure 7.7 we can see the implementation of the tool, which gets an xDocInfo term as input and outputs HTML describing which definitions seem to be unused. The tool only works at the global level, but can easily be extended to the other levels. As the analysis is in fact already performed earlier in the documentation phase, the implementation is very straightforward. It gathers all the annotations from the Use constructs of the xDocInfo term and filters out all the definitions that are not in that list. From this list it generates an HTML view, mapping each definition to a link.
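The core of the analysis is a simple set difference between the definitions and the use annotations. As an illustrative sketch (the dictionary encoding of checksums and names is an assumption, not the xDocInfo structure):

```python
def unused_definitions(definitions, uses):
    """Definitions whose checksum never occurs in the use annotations
    gathered by the use-definition matcher. 'definitions' maps a
    definition checksum to its name; 'uses' lists referenced checksums."""
    used = set(uses)
    return sorted(name for key, name in definitions.items()
                  if key not in used)

# Hypothetical checksums and definition names:
defs = {"c1": "topdown", "c2": "debug-print", "c3": "main"}
unused = unused_definitions(defs, ["c1", "c1", "c3"])
```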
module gather-unused-defs
imports lib xDocInfo xdoc-utils pack-graph xml-doc
strategies

  io-gather-unused-defs =
    io-wrap(x-options, gather-unused-defs)

  gather-unused-defs =
    ?xdocinfo
    ; generate-internal-rules
    ; where(
        collect(\ Use(_,_,_,_){[x*]} -> <map(Snd)> x* \)
        ; concat => used
      )
    ; collect({nn:
        where( ?Definition(_,_,nn,_,_,_,_,_,_)
             ; not(<fetch-elem(?nn)> used) )
      })
    ; sort-definitions
    ; !%>
        <b>The following definitions do not seem to be used</b><br/><br/>
        <% map(def-to-link) ; separate-by(!%>, <%) :: content* %>
      <%

  def-to-link =
    ?Definition(n0, l, n1, _, _, _, _, _, _)
    ; where(<definition-to-module> n1 => mod)
    ; <concat-strings> [mod, "/", mod, ".html#", n1] => link
    ; !%><a href="<%= link %>"><%= n0 %></a><%

Figure 7.7: Gathering unused definitions (Stratego)
Chapter 8
xDoc in retrospect
8.1 Conclusion
In this thesis we have described the design and implementation of xDoc, an extensible documentation generator. xDoc supports the following extensions:

- Instantiate new languages
- Use new comment styles
- Generate new visualizations and analyses

To add support for a new language using xDoc we have to take the following steps, given the availability of an SDF specification of the new language.

1. Specify a name for the language and the extensions of the corresponding filenames (2.3.2)
2. Take the SDF grammar and extend the grammar with xDoc comments (3.3.1)
3. Specify the kinds of definitions in the language (2.3.2)
4. Write a program that translates the language into xDocInfo terms (3.5)
5. Write a program that does the matching of uses and definitions (3.6)
6. Specify the constructor names for definitions, uses, syntax highlighting and meta-transition markers (2.3.2)
7. Possibly write visualization components for the language and specify which visualizations to include in the documentation (7, 2.3.2)

As one can see, we do not get a documentation generator for a new language for free; it is not a trivial process. However, specifying the steps in the process of documentation generation and the specific tasks to perform makes it easier to build a documentation generator, and it allows the documentation generator developer to focus only on the language-specific parts. By using SDF and sglr we get some other nice features for free: the possibility to document languages that use concrete object syntax, and the ability to quickly adapt to changes in a language.
The resulting system is very useful and is already used in the StrategoXT project for generating documentation for the SSL and other packages. As a use case of extensibility, we have made an instantiation of xDoc for documenting Java. A Java grammar in SDF was already available, which obviously saves a lot of time. In a bit more than one day we already had a simple documentation generator that supported most of the Java language and generated class diagrams. The implementation of this Java instantiation should however be improved, as there are still features that are not implemented, like use-definition matching (which requires extensive semantic analysis) and support for complete JavaDoc comments.

8.1.1
Metrics
The xDoc package contains the following amounts of code for the two implementation languages, SDF and Stratego.

  Stratego: 2820 lines of code
  SDF:       208 lines of code
  Total:    3028 lines of code

The table below shows the lines of code (for both SDF and Stratego modules) that we used for each language that is currently supported by xDoc, specified for each step. Note that this does not include the modules of the syntax definitions of the languages, as we assume a syntax definition in SDF is available.

  Task        2     4     5     7   Total
  Stratego   16    97    39   164     316
  SDF        10    60    36     -     106
  Java       11   111     -   146     268
To give an idea of how long the documentation process takes, we have run some tests on Stratego packages of different sizes, with the configuration shown in Figure 2.9, without transitive reduction on graphs, using sources taken from a Subversion server.

  Name         LOC     Packages   Modules   Definitions   Time (user + system)
  xDoc         2820    1          31        480           334s
  SSL          8336    1          80        1518          864s
  Tiger        10380   8          151       2273          1746s
  StrategoXT   21141   14         430       4234          5345s
8.2
Future work
As said before, xDoc is not finished. This section is meant as a pool of ideas for what we think still remains to be done in the future to make xDoc complete.

As with a lot of new tools, xDoc is in a state that suggests refactoring and optimization of the code. The first step towards xDoc as a stable product will be to fix the remaining bugs and revise the code, along with adding extra command-line options to have more control over the documentation process. Another step to be taken is the design of a concrete syntax for configuration, which should make it easier to configure xDoc.

As xDoc is an extensible documentation generator, it should be extended in every possible way. For example, we should make instances for more languages. The main prerequisite for such an extension is the availability of a syntax definition in SDF. Therefore the most likely candidates are C, C++, Prolog and ASF. It would also be useful to experiment with island grammars in xDoc, especially when a syntax definition is not available for the language for which we want to generate documentation. Besides adding support for extra languages, it is good to look at the possibilities of combining documentation for different languages. Stratego/XT is built up using Stratego, SDF and C. In an ideal situation we would like to capture all these languages in one documentation.

Another good extension is the development of new visualizations. The visualizations currently implemented in xDoc have proven to be useful, especially the local import/inheritance graph and the XTC data-flow diagram. New visualizations to be implemented are difficult to think of in advance: a good visualization is typically thought of in the spur of the moment and takes a lot of experimentation to get right. The concept of visualizations can be interpreted in a broad way; analyzers such as the unused-code analyzer are also a form of visualization when representing the gathered information in the documentation.

We would like to add a notion of tools in the xDocInfo structure. This would allow generation of tool API documentation, describing the tool and its invocation methods. Tools and programs can be seen as a higher-level definition, with its own invocation procedures. This is especially useful in conjunction with the XTC visualization described in this thesis.
At the moment, xDoc generates the documentation in HTML. It would be useful to write extra back-ends to allow other output formats, for example:

- LaTeX (http://www.latex-project.org/)
- DocBook (http://www.docbook.org/)

This would make it easier to include documentation generated by xDoc in publications such as theses and reference manuals. Writing a back-end for these output formats is only useful for the API documentation. The structure of xDoc allows easy addition of new back-ends, which can be implemented in any language that supports ATerms.

Even though the standard documentation comment syntax in xDoc works fine, it would be useful to add some features to it. Especially when we want to add support for different output formats, we want a simple output-independent markup language in the documentation comment syntax. Another enhancement would be full support for the JavaDoc documentation comment syntax, which is used in many other documentation generators.

Besides the mentioned extensions and improvements, it would be useful to develop some helper tools. For example, a tool could be developed that automatically combines the syntax definitions of xDoc documentation comments and the language we want to document, using the constructor names of definitions and modules.
Appendix A
Appendix
For complete code of xDoc, please refer to https://svn.cs.uu.nl:12443/repos/StrategoXT/trunk/xdoc.
A.1
Stratego/XT: preliminaries
xDoc has been implemented using StrategoXT. To understand the implementation and design issues of xDoc it is necessary to have some knowledge of the concepts and ideas used in StrategoXT. StrategoXT is the combination of the Stratego strategic programming language with the XT bundle of transformation tools [23].

A.1.1
ATerm
The foundation of StrategoXT is the ATerm format. ATerms are used for the exchange of structured data such as abstract syntax trees. The ATerm format is defined as follows [19]:

- Int: an integer constant is an ATerm
- Real: a real constant is an ATerm
- Application: a function symbol and a list of ATerms is an ATerm
- List: a list of ATerms is an ATerm
- Blob: a list of bytes is an ATerm

There are libraries for C, Java, and Haskell which support reading and writing of ATerms. As xDoc is implemented using StrategoXT, ATerms are also the exchange format for the xDoc tools.

A.1.2
Stratego
Stratego is a language designed for program transformation. It is based on the paradigm of term rewriting with programmable rewriting strategies. Stratego offers conditional rewrite rules, which allow basic transformations by recognizing
a subterm to be transformed by pattern matching and replacing it with a pattern instance. Instead of applying rules in some fixed way, Stratego allows the programmer to specify how the application of the rules should be done. This is done by specifying a strategy, in which we can use control combinators, such as choice (+), left-choice (<+) and sequence (;). Instead of explicitly specifying the traversals over terms, it is possible to specify generic traversals, which are implemented using the traversal primitives all, some and one. Examples of such generic traversals are topdown, bottomup and oncetd (once top-down).
strategies

  topdown(s)  = s ; all(topdown(s))
  bottomup(s) = all(bottomup(s)) ; s
  oncetd(s)   = s <+ one(oncetd(s))
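The same traversal schemes can be expressed in any language with higher-order functions. The following Python sketch over tuple-encoded terms (an assumed encoding, not Stratego's ATerms) mirrors topdown and bottomup:

```python
def all_sub(s, t):
    """Apply strategy s to every direct subterm; leaves are unchanged.
    Terms are tuples (constructor-name, child, ...)."""
    if isinstance(t, tuple):
        return (t[0],) + tuple(s(c) for c in t[1:])
    return t

def topdown(s):
    # apply s to the term itself, then descend into the result's subterms
    return lambda t: all_sub(topdown(s), s(t))

def bottomup(s):
    # transform all subterms first, then apply s to the rebuilt term
    return lambda t: s(all_sub(bottomup(s), t))

# A hypothetical rewrite: increment every integer leaf.
inc_ints = lambda t: t + 1 if isinstance(t, int) else t
term = ("Plus", ("Int", 1), ("Int", 2))
result = topdown(inc_ints)(term)
```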
Besides normal rewrite rules, which are context-insensitive, Stratego offers dynamic rules and scoped dynamic rules. Dynamic rules can be generated at run time, which makes it possible to add context information. Rules can also be scoped: dynamic rules generated within a rule scope are automatically retracted at the end of that scope [21]. Stratego offers a library, called the Stratego Standard Library, with a large collection of rules and strategies for generic traversal, manipulation of numbers, strings, lists, tuples and optionals, generic language processing, and system interfacing such as I/O, process control and association tables. The xDoc project was initiated mainly because of the lack of documentation for this library. All tools of xDoc are written in Stratego.

A.1.3
SDF
StrategoXT uses the syntax definition formalism SDF [20] and its associated generators. SDF integrates lexical and context-free syntax, resulting in a formalism that allows all aspects of the syntax of a language to be described in one formalism. As SDF is a declarative language, we can use the specifications for the generation of parsers, pretty-printers and Stratego signatures. As SDF is also a modular language, it is easy to make extensions to a language, such as domain-specific aspects for a language or the addition of concrete object syntax to a meta-language, without having to rewrite the original grammar.

Concrete object syntax

An example of concrete object syntax is shown below, where the same rule is shown with and without concrete object syntax. Using concrete object syntax is especially useful when the concrete object syntax fragments become bigger, as abstract syntax trees tend to get unreadable in these cases.
rules
  PlusZero : |[ e + 0 ]| -> |[ e ]|
  PlusZero : Plus(e, Int("0")) -> e
The addition of concrete object syntax to an arbitrary meta-language for manipulating object programs using SDF is described in [22]. By combining the syntax definitions of the meta-language and the object-language, specifying where concrete object syntax fragments are allowed in the meta-language, a new parse table is generated. After parsing a program that uses concrete object syntax and imploding the parse tree, we have an abstract syntax tree containing constructs of the object-language. These constructs need to be translated into terms of the meta-language, a procedure called meta-explosion. This process results in an abstract syntax tree containing only constructs of the meta-language, which can then be compiled by the compiler of the meta-language. This feature of StrategoXT is used extensively in xDoc, which uses concrete object syntax for XML, Dot and Stratego.

A.1.4 Generic pretty-printing
The Generic Pretty-printing Package (GPP) in StrategoXT allows pretty-printing to different targets [12]. At the moment text, HTML and LaTeX are supported. This is implemented using an output-independent language, called BOX, which provides operators designed for pretty-printing purposes. GPP is used throughout StrategoXT to convert abstract syntax trees to concrete syntax fragments. Pretty-printing is driven by pretty-print tables, which can be generated automatically from syntax definitions. Pretty-print tables specify the mapping between constructor names in the abstract syntax trees and the corresponding BOX terms. GPP is used throughout xDoc for the conversion of abstract syntax trees of all forms to text.

A.1.5 XTC
Transformation systems often consist of tools that perform specific tasks that could be reused, such as parsers, pretty-printers and optimizers. In StrategoXT, a framework for tool composition has been introduced, called Transformation Tool Composition (XTC), which makes it easy to compose such tools. Tools are registered in a repository, specifying a name, version and location for each tool. A library of abstractions implemented in Stratego allows these tools to be called transparently. Using the library, a tool can be used at any place in a transformation instead of the application of a rule or strategy. Consider for example the following composition, which one might use in a source-to-source optimizer. The component shown calls the parser component, then the optimizer component, which typically works on an abstract syntax tree, and finally the pretty-printer component, resulting in a text file representing the optimized program.
module source2source-optimizer
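A composition along these lines can be sketched as follows; the tool names parse-lang, optimize-lang and pp-lang are hypothetical placeholders for tools registered in the XTC repository:

```stratego
module source2source-optimizer
imports xtc-lib
strategies
  // Hypothetical sketch of an XTC composition: parse the input file,
  // optimize the resulting abstract syntax tree, and pretty-print it
  // back to a text file.
  io-source2source-optimizer =
    xtc-io-wrap(
      xtc-transform(!"parse-lang")      // text -> abstract syntax tree
      ; xtc-transform(!"optimize-lang") // tree-to-tree optimization
      ; xtc-transform(!"pp-lang")       // abstract syntax tree -> text
    )
```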
As xDoc has several components itself and reuses many components from StrategoXT, XTC is used extensively throughout xDoc.
A.2 Fact extraction

A.2.1 Syntax definition for xDoc comments
module xDoc
exports
  sorts XDocComment

  %% Comments
  context-free syntax
    "/**" "*/"
    "/**" Fields "*/"
    "/**" SummaryOnly "*/"
    "/**" Summary Fields "*/"
  %% Summary
  context-free syntax
    (Paragraph EmptyLines)+             -> Summary     {cons("Summary")}
    {Paragraph EmptyLines}+ EmptyLines? -> SummaryOnly {cons("Summary")}
    "*" {SummaryLine "*"}+              -> Paragraph   {cons("Paragraph")}
  lexical syntax
    [\n] SummaryLineChar+ [\ \t] [\*] EmptyChar* [\n] EmptyLine+ EmptyLine [\*] SummaryLineChar* [\@] SummaryLineChar*
  %% Fields
  context-free syntax
    Field+                 -> Fields     {prefer}
    "*" FieldId FieldValue -> Field      {cons("xDocField")}
    {FieldValueLine "*"}+  -> FieldValue
  lexical restrictions
    SummaryLine -/- [\ \/]
  lexical syntax
    [\@] Id        -> FieldId
    ~[\n]          -> FieldLineChar
    FieldLineChar+ -> FieldValueLine
  lexical restrictions
    FieldValueLine -/- [\ \/]

  %% Identifiers
  lexical syntax
    [a-zA-Z\\-\_] [a-zA-Z0-9\\-\_]* -> Id
  lexical restrictions
    Id -/- [a-zA-Z0-9\\.\-\_\*]
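For illustration, a documentation comment accepted by this syntax consists of a summary of one or more paragraphs followed by fields; the summary text and the field values below are made up:

```
/** Collects all definitions from the given module.
  *
  * @author J. Doe
  * @since  0.1
  */
```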
A.3
In xDoc we use some strategies that are helpful in the generation of HTML. In this section we discuss these strategies, as we will use them in the next two sections.

A.3.1 Standard layout
As explained in Section 2.1, we have a standard layout for all HTML pages generated with xDoc. The implementation of the strategies that help in the process of generating HTML is shown in Figure A.1. In the standard layout of xDoc there is a menu that does not change at any point in the documentation. The generation of this menu is done by the make-menu strategy. It builds a list of tuples, representing the options in the menu, from which gen-menu generates the HTML element for this menu. The menu contains links to the main page of the documentation, and a link to each of the alphabetical indices generated by xDoc. When generating HTML with Stratego using concrete XML syntax, we are actually building an abstract syntax tree of XML. This abstract syntax tree has to be mapped to text, also called pretty-printing, which is done using gen-html-output. This strategy takes an XML term as current term, and maps this term, using the GPP tools ast2abox and abox2text, to the filename given as term parameter. Further on in this thesis, the strategies that generate HTML pages will typically only use the gen-html strategy. This is the strategy that combines the helper strategies mentioned previously. It generates the menu, combines the menu and the contents, wraps these with the standard HTML tags and outputs the HTML as text to the specified filename. In Figure A.2 we can see how a standard layout is generated. For each HTML page we want to create, we have to specify four DIV elements, representing the three left indices and the content frame. In this case, all the DIV elements are given an empty HTML part, just some whitespace, which typically would be a table of packages in case of the left package index (div-packages), or the contents of the API documentation in
strategies
  gen-html(|title,fn,or) =
    ?content
    ; <concat-strings>[or,"style.css"] => csslink
    ; make-menu(|or) => menu
    ; !%><html>
          <head>
            <title><% !title :: cdata %></title>
            <link type="text/css" href="<% !csslink %>" rel="stylesheet" />
          </head>
          <body>
            <% ![ menu | content ] :: content* %>
          </body>
        </html><%
    ; gen-html-output(|fn)

  make-menu(|or) =
    ![ ("Overview",        <concat-strings>[or,"api/index.html"])
     , ("All definitions", <concat-strings>[or,"api/index_all.html"])
     , ("Module index",    <concat-strings>[or,"api/index_module.html"])
     | <for-all-sorts(
         !( <concat-strings>[<id>," index"]
          , <concat-strings>[or,"api/index_",<id>,".html"] ) )>()
     ]
    ; gen-menu(|"menu")

  gen-menu(|class) =
    map(!%><a href="<% Snd %>"><% Fst :: cdata %></a><%)
    ; separate-by(!%> | <%)
    ; !%><div class="<% !class %>"> [ <% id :: content* %> ] </div><%

  gen-html-output(|fn) =
    concat-content
    ; xtc-temp-files(
        write-to-text
        ; xtc-transform(!"ast2abox", ![ "-p", <xtc-find>"xml-doc.pp.af" | <pass-verbose> ])
        ; xtc-abox2text
        ; rename-to(!fn) )
    <+ say(<concat-strings>["error generating ",fn])
strategies
  empty-layout =
    ![ div-packages(%> <%)
     , div-modules(%> <%)
     , div-definitions(%> <%)
     , div-contents("Empty page title!", %> <%)
     ]
    ; gen-html(|"Empty page title!", "empty.html", "../")

overlays
  div-packages(packages) =
    %><div class="packages">
        <h1>Packages</h1>
        <!-- package list here -->
        <% !packages :: content %> <br />
      </div><%

  div-contents(title,cs) =
    %><div class="content">
        <h1><% !title :: cdata %></h1>
        <% !cs :: content* %>
      </div><%
case of the content frame (div-contents). The filling of these DIV elements is handled analogously for each generated page.
A.3.2 Indices
The API documentation generated by xDoc consists mainly of indices; they are everywhere. Therefore we have made a strategy that abstracts over all these different indices and thus can be reused. In Figure A.3 we can see the implementation of this strategy. It is based on the observation that an index consists of categories and items that need to be categorized. The strategy make-index takes four strategy parameters, which are used for the visualization of the categories and items, and for the comparison between a category and an item. Basically, it does the following. It takes a tuple of a list of categories and a list of items as current term. For each category it collects the items for which the comparison parameter strategy compare succeeds, and it applies the visualization parameter strategy itemview to each of these items. To each of the categories it then applies the visualization parameter strategy catview. The generation of the index is completed by applying the visualization parameter strategy bigger, which combines the visualized categories. There is also a second make-index that uses the make-table strategy, which wraps an index into a TABLE element, as default for bigger. This is just a convenience alias, as we mostly generate indices that are HTML tables.
Appendix 85
A.3
strategies
  make-index(compare,catview,itemview) =
    make-index(compare, make-table, catview, itemview)

  make-index(compare,bigger,catview,itemview) =
    ?(cats,items)
    ; <filter( \ cat -> <catview>( cat
                                 , <filter( where( <compare>(cat,<id>) ) )
                                   ; map(itemview)
                                   ; not(?[])>items ) \ )>cats
    ; flatten-list
    ; bigger
Figure A.3: Helper strategy used for generating indices (Stratego)

A.3.3 Left indices
In the standard layout of xDoc, we always have three indices on the left of the HTML page, which show the package, module and definition lists corresponding to the current place in the documentation. The package index is the same at each location in the documentation. The definition list is only filled at the module level documentation and shows the definitions of the current module. The module list at global level shows all modules, categorized by the package they belong to; at package level and module level, the module list shows the modules contained in the current package. As an example we show the implementation of the module index at global level, which should give an idea of how the other indices are made and how the strategy described in the previous section can easily be used to generate indices for xDoc. The implementation of the index is shown in Figure A.4. After building a tuple containing the list of all packages and a sorted list of all modules in the system, we call the strategy make-index, described in the previous section. Using the strategy compare-package-to-module, which checks whether a module is in a certain package, make-index determines which modules are shown for a package. The package is then visualized using the to-th strategy and the modules are visualized using the module-to-tr-left rule, which adds links to the API documentation and to the original and pretty-printed code browser.

A.3.4 Experiences
We have used concrete XML syntax in Stratego extensively in xDoc. Concrete syntax usually makes the code a lot clearer, but if one does not take care it can result in unreadable code, such as in the following example, which was an actual piece of code from an earlier version of xDoc.
strategies
  left-all-module-index(|or) =
    !( <packages>
     , <modules ; string-qsort>
     )
    ; make-index(compare-package-to-module, to-th, module-to-tr-left(|or))

  module-to-tr-left(|or) :
    s -> %><tr>
            <td valign="top"><a href="<% !api %>"><% !s :: cdata %></a></td>
            <td valign="top" align="right"> <% !menu :: content %> </td>
          </tr><%
    where ![ ("code", <concat-strings>[or,"code/",s,"/",s,".html"])
           , ("pp",   <concat-strings>[or,"code/",s,"/",s,".pp.html"])
           ]
          ; gen-menu(|"left-module") => menu
          ; <concat-strings>[or,"api/",s,"/",s,".html"] => api

  compare-package-to-module =
    ?(<package-to-modules>,mod)
    ; fetch-elem(?mod)

  to-th =
    !%> <tr><th colspan="1" align="left"><% Fst :: cdata %></th></tr>
        <% Snd ; flatten-list :: content* %> <%
strategies gen-menu = ![ %>[ <%, <[ !%><a href=<% Snd %>><% Fst %></a><% | map(!%>| <a href=<% Snd %>><% Fst %></a><%) ]> ,%> ]<% ]
In a perfect world, concrete syntax gives us some nice advantages, like generating well-formed XML and the possibility to transform the generated abstract syntax tree of the XML. However, in xDoc we use concrete syntax for XML mainly because it is a nice and natural way to express XML generation in Stratego, especially compared to string concatenation or using abstract syntax. Well-formedness is an issue that is not yet resolved: at the moment it is only possible to check well-formedness at runtime, whereas we would obviously like to see this done at compile-time.
A.4 API

A.4.1 Implementation of module level API documentation
The implementation of module level API documentation is the most complex part of the API documentation generation, mainly because this part contains the most information, as it is very close to the source code.

Implementation

In Figure A.5 we can see the strategy that implements the generation of the module level API documentation, gen-api-module. The description of the module is collected from the xDocComment describing the module. For the summary part of the xDocComment, we call a strategy, get-summary, which transforms the summary, a list of paragraphs, into HTML by surrounding each paragraph with a P tag. The visualizations are generated by a call to the generate-visualizations strategy, which looks up the needed visualizations in the xDoc configuration file, and returns HTML elements representing the visualizations. The first index of the module level API documentation is the short definition index. It gives a short overview of the definitions in the module, categorized using the classification of the definitions. In Figure A.6 we can see the implementation of the generation of this index for a specific kind of definition. The index is generated using the make-index strategy, described in Appendix A.3.2, which is passed def-to-tr-short as visualization strategy. This strategy translates a Definition xDocInfo term to a TR element, showing the footprint and, if available, a short description of the definition. We call the make-index strategy on the list of definitions that are in the current module and do not have an @obsolete field in their xDocComment. This filtering is implemented in relevant-defs-from-module. The second index of the module level API documentation is the detailed definition index. It gives a detailed overview of the definitions in the module, categorized using the classification
strategies
  gen-api-module =
    ?x@Module(mod,_,_,_,footprint,_,_,_,xDocComment(sm,fs),_)
    ; <module-to-package>mod => pack
    ; <concat-strings>["../",mod,"/",mod,".html"] => browselink
    ; !%>
       <table>
         <tr><td>Author</td><td><% <get-author>fs :: cdata %></td></tr>
         <tr><td>Since</td><td><% <get-since>fs :: cdata %></td></tr>
       </table>
       <% <get-summary>sm :: content* %> <br /> <br />
       <% generate-visualizations(|["--module",mod]) :: content* %> <br /> <br />
       <% <gen-module-stats>x :: content %> <br /> <br />
       <h1>Summary</h1>
       <% for-all-sorts(definition-short-index(|mod,"../../")) :: content* %> <br /> <br />
       <h1>Details</h1>
       <% for-all-sorts(definition-detailed-index(|mod))
          ; not([])
          <+ !%> <b>no detailed information available</b><% :: content* %> <br /> <br />
       <%
    ; concat-content => contents
    ; ![ div-packages( <left-package-index(|"../../")> )
       , div-modules( <left-module-index(|"../../")>pack )
       , div-definitions( <left-definitions-module-index(|"../../")>mod )
       , div-contents(footprint, contents)
       ]
    ; gen-html(| <concat-strings>["Overview of module ",mod," (API)"]
               , <concat-strings>[<get-html-dir>,mod,"/",mod,".html"]
               , "../../")
strategies
  definition-short-index(|mod) =
    ?sort
    ; !( [ <concat-strings>[sort," summary"] ]
       , <relevant-defs-from-module(id|sort)>mod )
    ; make-index(id, title-to-th, def-to-tr-short)

  def-to-tr-short =
    ?Definition(n,np,nn,footprint,_,xDocComment(ps,_),_,_)
    ; <definition-to-module>nn => mod
    ; <concat-strings>["../code/",mod,"/",mod,".html#",nn] => link
    ; !%><tr>
          <td class="api" width="200">
            <a href="<% <concat-strings>["#",nn] %>"><% !footprint %></a>
          </td>
          <td class="api"> <% <get-short-summary>ps :: cdata %> </td>
          <td class="api" width="50" align="right">
            [ <a href="<% !link %>">code</a> ]
          </td>
        </tr><%
of the definitions. The difference with the summary index is that we only show the definitions that have relevant documentation information. In Figure A.7 we can see the implementation of the generation of this detailed index. The implementation is similar to the short definition index, except that the set of definitions is filtered more strictly and a different visualization strategy is passed to make-index. The visualization strategy def-to-tr-long generates a view of the construct containing the following information:

  - Return type
  - Footprint
  - Author
  - Since
  - Full description
  - List of parameters and corresponding types
  - Other, unrecognized fields
  - Possible code examples

See Section 5.1.4 for a description of how the specific tags of an xDocComment are handled. See Figure 5.4 for an example of generated module level API documentation.
strategies
  definition-detailed-index(|mod) =
    ?sort
    ; !( [ <concat-strings>[sort," details"] ]
       , <relevant-defs-from-module(def-has-relevant-info|sort)>mod )
    ; make-index(id, title-to-th, strategy-to-tr-long)

  def-to-tr-long =
    ?Definition(n,params,nn,returns,footprint,_,xDocComment(ps,fs),_,_)
    ; <definition-to-module>nn => mod
    ; <concat-strings>["../code/",mod,"/",mod,".html#",nn] => link
    ; !%><tr>
          <td class="api" align="right" width="60"><% !returns :: cdata %></td>
          <td class="str">
            <a name="<% !nn %>"></a><% !footprint :: cdata %>
            [ <a href="<% !link %>">code</a> ]
          </td>
        </tr>
        <tr>
          <td class="api" colspan="2">
            <table>
              <tr><td>Author</td><td><% <get-author>fs :: cdata %></td></tr>
              <tr><td>Since</td><td><% <get-since>fs :: cdata %></td></tr>
            </table>
            <br />
            <% <get-summary>ps :: cdata %>
            <table>
              <% <gen-param-view>(params,fs) <+ !%> <% :: content %>
              <% <gen-api-fields-tr <+ ![]>fs :: content* %>
            </table>
            <br />
            <% <get-examples>fs :: content* %>
          </td>
        </tr><%

  gen-param-view =
    (not([]), filter(?xDocField("@param",_)))
    ; zip(\ ((n,tp),xDocField(_,v)) ->
            %><%=tp %> <b><%=n %></b> - <% <concat-strings>v %><% \)
    ; !%><tr>
          <td class="api" width="70"><b>Parameters</b></td>
          <td class="api"> <% concat :: content* %> </td>
        </tr><%
The implementation of the statistics for the module level API documentation has not yet been discussed. We will instead discuss the implementation of the statistics for the global level API documentation, as the statistics for the module level are much simpler.

Statistics

As mentioned in Chapter 5, at the global, package and module level of the API documentation, some statistics are generated. The following statistics are generated for the global level API documentation:

  - Lines of code
  - Number of packages
  - Number of modules
  - Percentage of modules documented
  - Number of definitions, specified for each kind of definition
  - Percentage of definitions documented, specified for each kind of definition

The statistics at the package and module level are very similar, except that the number of packages is only relevant at the global level, and the number of modules is only relevant at the global and package level. In Figure A.8 we can see the implementation of the statistics part of the documentation. The statistics for the global level API documentation are generated by the gen-overall-stats strategy, which takes an xDocInfo term as input. The number of lines of code is calculated by get-loc, which looks for modules in the xDocInfo term, returns a list with the lines of code of each module and takes the sum of this list. The number of packages is calculated by taking the length of the package list. For the module and definition statistics we want a bit more information: the number and the percentage that is documented. This is implemented in the stats strategy. It calculates the number of terms for which the second strategy parameter succeeds, and the number for which the first strategy parameter also succeeds, computing the corresponding percentage. The statistics gathered are then transformed into an HTML table using the gen-stats strategy.
The statistics for the other levels are generated similarly, without the irrelevant information. They also work on an xDocInfo term, but a partial one, representing the package or module for which the statistics are generated.
A.5 Visualization

A.5.1 xdoc-import
strategies
  gen-overall-stats =
    ?xdocinfo
    ; packages ; length ; int-to-string => ps
    ; <stats(module-has-relevant-info, is-module|"Module")>xdocinfo => ms
    ; <for-all-sorts(
        { sort: ?sort
        ; <stats(def-has-relevant-info, def-of-sort(?sort)|sort)>xdocinfo })>xdocinfo => ds
    ; <gen-stats>[ ("Lines of code", <get-loc>xdocinfo)
                 , ("Number of packages", ps)
                 , ms
                 | ds ]

  stats(s1,s2|n) =
    ?x
    ; collect(s2) => set
    ; not([]) ; length ; int-to-string => nr
    ; !( <concat-strings>[n," number"]
       , <concat-strings>[nr," (", <get-percentage(s1,s2)>x, "% documented)"] )
    <+ !( <concat-strings>[n," number"], "0" )

  get-percentage(s1,s2) =
    collect(s2) => l1
    ; collect(s1) => l2
    ; <div>(<mul>(<length>l2,100), <length>l1)
    ; int-to-string

  get-loc =
    collect(?Module(_,_,_,<id>,_,_,_,_,_,_))
    ; sum ; int-to-string
strategies
  complete-import-graph =
    ?xdocinfo
    ; ?xDocInfo(_,_,<map(package-to-subgraph(|""))>) => subgraphs
    ; <get-modules>xdocinfo => allowed
    ; <make-edges(|allowed)>xdocinfo => edges
    ; <concat>[subgraphs,edges] => stmts
    ; make-graph
    ; gen-dot-output(|"complete-import.import")
    ; <gen-graph-html>("Complete import graph","../","complete-import.import")

  package-import-graph(|n) =
    ?xdocinfo
    ; xDocInfo(id,id,filter(?Package(n,_,_,_,_,_)))
    ; where( get-modules => allowed )
    ; where( make-edges(|allowed) => edges )
    ; ?xDocInfo(_,_,<map(package-to-subgraph(|""))>) => subgraphs
    ; <concat>[subgraphs,edges] => stmts
    ; make-graph
    ; gen-dot-output(|<concat-strings>["package-",n,".import"])
    ; <gen-graph-html>("Import graph","../",<concat-strings>["package-",n,".import"])
  module-import-graph(|n) =
    ?xdocinfo
    ; where( collect(
        { na,li1,li2:
          ?Module(n,_,_,_,_,_,_,li1,_,_) ; ![n|li1]
          + ?Module(na,_,_,_,_,_,_,li2,_,_) ; <fetch-elem(?n)>li2 ; ![na]
        })
        ; concat => allowed )
    ; xDocInfo(id,id,
        filter(Package(id,id,id,id,
          filter(Module( {n: ?n ; <fetch-elem(?n)>allowed}
                       , id,id,id,id,id,id,id,id,id)), id))) => filtered
    ; ?xDocInfo(_,_,<filter(package-to-subgraph(|"../"))>) => subgraphs
    ; <make-edges(|allowed)>filtered => edges
    ; <concat>[subgraphs,edges] => stmts
    ; make-graph
    ; gen-dot-output(|<concat-strings>[n,".import"])
    ; <gen-graph-html>("Import graph","../../",<concat-strings>[n,".import"])
  make-edges(|allowed) =
    collect(\ Module(n, _, _, _, _, _, _, li, _, _) ->
              <filter({s: ?s ; where(<fetch-elem(?s)>allowed) ; !(n,s)})>li \)
    ; concat
    ; map(edge-to-stmt)

  make-graph =
    ?stmts
    ; !Graph |[
        digraph whatever {
          ratio=compress;
          size="5";
          graph [ color = black , bgcolor = transparent ] ;
          edge  [ color = black ] ;
          node  [ color = black , fillcolor = aliceblue
                , style = filled , fontcolor = black ] ;
          stmt*:stmts
        } ]|

  get-modules =
    collect(\ Module(n, _, _, _, _, _, _, li, _, _) -> n \)

rules
  edge-to-stmt2 :
    (s1{e1},s2{e2}) -> Stmt|[ id:<double-quote>e1 -> id:<double-quote>e2 ; ]|

  node :
    s1{s2} -> Stmt |[ id:<double-quote>s2 [ label = id:<double-quote>s1 ] ; ]|

  edge-to-stmt :
    (s1,s2) -> Stmt|[ id:<double-quote>s1 -> id:<double-quote>s2 ; ]|

  package-to-subgraph(|or) :
    Package(n, _, s, lp, lm, _) ->
    Node |[ subgraph id:<conc-strings ; double-quote>("cluster_",n) {
              label = id:<double-quote>n ;
              style = filled ;
              color = blue4 ;
              fillcolor = floralwhite ;
              node [ fontsize = 9 ] ;
              stmt*:nds
            } ]|
    where <map(module-to-node(|or))>lm => nds ; not([])

  module-to-node(|or) :
    Module(n, s, _, _, _, _, _, _, _, _) ->
    Stmt |[ id:<double-quote>n [ URL = id:url , fillcolor = id:color ] ; ]|
    where <concat-strings ; double-quote>[or,n,"/",n,".html"] => url
        ; ( <get-config>"--module" ; ?n ; !"khaki1" <+ !"aliceblue" ) => color
A.5.2 java-class
module java-class
imports lib Java xDocInfo pack-graph xtc-lib Dot
        xdoc-import-html xdoc-utils xdoc-dot-output
strategies
  io-java-class =
    io-wrap(java-class-options, java-class)

  java-class =
    ?xdocinfo
    ; <concat-strings>[<get-html-dir>,"../xdoc_import/"]
    ; mkdir
    ; generate-class-diagram(|xdocinfo)

  generate-class-diagram(|xdocinfo) =
    <get-config>"--overall" ; <overall-class-diagram>xdocinfo
    + <get-config>"--module" => n ; <module-class-diagram(|n)>xdocinfo

  module-class-diagram(|mod) =
    ?xdocinfo
    ; <concat>[ <pack-inheritance(|xdocinfo)>mod
              , <get-implementing(|mod)>xdocinfo ] => allowed
    ; <xDocInfo(id,id,
        filter(Package(id,id,id,id,
          filter(Module( {n: ?n ; <fetch-elem(?n)>allowed}
                       , id,id,id,id,id,id,id,id,id)), id)))>xdocinfo => filtered
    ; ?xDocInfo(_,_,<filter(package-to-subgraph(|"../"))>) => subgraphs
    ; <make-edges(|allowed)>filtered => edges
    ; <concat>[subgraphs,edges] => stmts
    ; make-graph
    ; gen-dot-output(|<concat-strings>[mod,".inheritance"])
    ; <gen-graph-html>("Inheritance","../../",<concat-strings>[mod,".inheritance"])

  overall-class-diagram =
    ?xdocinfo
    ; ?xDocInfo(_,_,<filter(package-to-subgraph(|""))>) => subgraphs
    ; <get-modules>xdocinfo => allowed
    ; <make-edges(|allowed)>xdocinfo => edges
    ; <concat>[subgraphs,edges] => stmts
    ; make-graph
    ; gen-dot-output(|"complete.inheritance")
    ; <gen-graph-html>("Complete class diagram","../","complete.inheritance")

  pack-inheritance(|xdocinfo) =
    \ root -> ([root], (), []) \
    ; graph-nodes-roots( Fst
                       , class-to-parents(|xdocinfo)
                       , \ (n,x,xs) -> [n|xs] \ )

  get-implementing(|name) =
    collect({n:
      where( ?Module(n,_,"Java",_,_,_,<id>,_,_,_)
           ; fetch-elem( ?Implements(<map(pp-java)>) ; fetch-elem(?name)
                       + ?Extends(<pp-java>) ; ?name
                       + ?ExtendsInterfaces(<map(pp-java)>) ; fetch-elem(?name) ) )
      ; !n })

  class-to-parents(|xdocinfo) =
    ?class
    ; <collect( ?Module(class,_,"Java",_,_,_,<id>,_,_,_) ; get-parents )>xdocinfo
    ; flatten-list

  get-parents =
    filter( ?Extends(<pp-java>)
          + ?Implements(<map(pp-java)>)
          + ?ExtendsInterfaces(<map(pp-java)>) )
    ; flatten-list

  remove-nl =
    string-as-chars(reverse ; Tl ; reverse)

  pp-java =
    xtc-text(|"jtree2text") ; remove-nl

  xtc-text(|tool) =
    xtc-temp-files( write-to
                  ; xtc-transform(!tool)
                  ; ?FILE(<read-text-file>) )

  java-class-options =
    ArgOption("--html", where(<set-config>("--html", <conc-strings>(<id>,"/")))
             , !"--html d        Output directory for HTML files")
    + ArgOption("--tmp", where(<set-config>("--tmp", <conc-strings>(<id>,"/")))
             , !"--tmp d         Directory for temp. files")
    + Option("--overall", where(<set-config>("--overall",["overall"]))
             , !"--overall       Generate complete import graph from xDocInfo")
    + ArgOption("--package", where(<set-config>("--package",<id>))
             , !"--package pac ")
    + ArgOption("--module", where(<set-config>("--module",<id>))
             , !"--module mod ")
    + ArgOption("--definition", where(<set-config>("--definition",<id>))
             , !"--definition def ")

strategies
  get-modules =
    collect(\ Module(n, _, _, _, _, _, _, li, _, _) -> n \)

  make-edges(|allowed) =
    collect({n: ?Module(n,_,_,_,_,_,<id>,_,_,_) ; filter(java-to-edge(|n))})
    ; flatten-list

  make-graph =
    ?stmts
    ; !Graph |[
        digraph whatever {
          center=true ;
          nodesep="0.3" ;
          ratio=compress ;
          size="5";
          graph [ color = black , bgcolor = transparent ] ;
          edge  [ color = black ] ;
          node [ shape = record , color = black , fillcolor = aliceblue
               , style = filled , fontcolor = black , fontsize = 8
               , height = "0.25" ] ;
          stmt*:stmts
        } ]|
  package-to-subgraph(|or) :
    Package(n, _, s, lp, lm, _) ->
    Node |[ subgraph id:<conc-strings ; double-quote>("cluster_",n) {
              label = id:<double-quote>n ;
              fontsize = "10" ;
              style = filled ;
              color = blue4 ;
              fillcolor = floralwhite ;
              node [ fontsize = 9 ] ;
              stmt*:nds
            } ]|
    where <map(module-to-node(|or))>lm => nds ; not([])

  module-to-node(|or) :
    Module(n, s, _, _, _, ls, _, _, _, _) ->
    Stmt |[ id:<double-quote>n [ label = id:<double-quote>name
                               , URL = id:url , fillcolor = id:color ] ; ]|
    where <concat-strings ; double-quote>[or,n,"/",n,".html"] => url
        ; ( <get-config>"--module" ; ?n ; !"khaki1" <+ !"aliceblue" ) => color
        ; ( <fetch-elem(?"interface")>ls
            ; <concat-strings>["\\<\\<interface\\>\\>\\n",n]
            <+ !n ) => name
  java-to-edge(|n) :
    Extends(m) ->
    Stmt|[ id:<pp-java;double-quote>m -> id:<double-quote>n
           [dir=back,arrowtail=empty] ; ]|

  java-to-edge(|n) :
    ExtendsInterfaces(m) ->
    <map(!Stmt|[ id:<pp-java;double-quote> -> id:<double-quote>n
                 [dir=back,arrowtail=empty] ; ]|)>m

  java-to-edge(|n) :
    Implements(m) ->
    <map(!Stmt|[ id:<pp-java;double-quote> -> id:<double-quote>n
                 [dir=back,arrowtail=empty] ; ]|)>m
A.5.3 xtc-graph
module xtc-graph
imports lib list string config dir term-io Stratego options xtc-lib Stratego-Amb
imports xdoc-imports-dot
strategies
  io-xtc-graph =
    xtc-io-wrap(x-options, xtc-graph)

  x-options =
    ArgOption("--main", where(<set-config>("--main", <id>))
             , !"--main f | -m f   Main strategy (default: main)")
    + ArgOption("-I", where(<extend-config>("-I", [<id>]))
             , !"-I dir            Include modules from directory dir")

  xtc-graph =
    comp(|"pack-stratego", ["--slack","-I","."
                           | <get-config ; map(!["-I",<id>]) ; concat <+ ![]>"-I"])
    ; add-main
    ; comp(|"pre-desugar")
    ; comp(|"normalize-spec")
    ; comp(|"use-def")
    ; comp(|"spec-to-sdefs")
    ; read-from
    ; where( <get-config>"--main" < ?main + !"main" => main )
    ; topdown(try(normalize + eliminate-rule + app))
    ; Specification(filter(Strategies(id)))
    ; Specification([Strategies( map(try(is-inlineable ; make-inline-rule)) )])
    ; ?Specification([Strategies(<fetch-elem(?SDefT(main,_,_,_))>)])
    ; try(inline-sdef)
    ; do-main
    ; make-it
    ; where( collect-all(is-string ; node) => stmt )
    ; collect-all(translate)
    ; concat
    ; filter(edge-to-stmt2)
    ; make-set
    ; <make-graph>[<id>|stmt]
    ; write-to

  remove-seq :
    Seq(s1,s2) -> <flatten-list>[s1,s2]
  get-tail : s -> [s] where <is-string>s
  get-tail : Seq(s1,s2) -> b
    where <get-tail>s2
        ; try( where(fetch-elem(?"id"))
             ; <concat>[<get-tail>s1, <filter(not(?"id"))>] ) => b
  get-tail : Choice(s1,s2) -> <flatten-list>[<get-tail>s1, <get-tail>s2]
  get-tail : Where(s) -> ["id"]
Appendix
99
A.5
Visualization
  get-tail : Id -> ["id"]

  get-head : s -> [s] where <is-string>s
  get-head2 : Seq(s1,s2) -> b
    where <get-head>s1
        ; try( where(fetch-elem(?"id"))
             ; <concat>[<get-head>s2, <filter(not(?"id"))>] ) => b
  get-head1 : Seq(Where(s1),s2) -> <concat>[<get-head>s1, <get-head>s2]
  get-head = get-head1 <+ get-head2
  get-head : Choice(s1,s2) -> <flatten-list>[<get-head>s1, <get-head>s2]
  get-head : Where(s) -> <get-head>s
  get-head : Id -> ["id"]

  translate : Seq(s1,s2) -> <ft>(s1,s2)

  ft = (get-tail, get-head) ; cart(id)

  is-inlineable =
    not-recursive
    ; not(?SDefT("xtc-transform",a,a1,s) + ?SDefT("xtc-command",a,a1,s))

  not-recursive =
    SDefT( not("main") ; ?f
         , where( length => l1 )
         , where( length => l2 )
         , where( collect-all(
             {a,b: ?CallT(SVar(f),a,b)
                 ; <eq>(<length>a,l1)
                 ; <eq>(<length>b,l2) })
           ; ?[] ) )

  do-main =
    try( ?SDefT(_,_,_,s) ; <simplify-strategy-body(call-weg|)>s )

  inline-sdef =
    rec x({| inline-call : repeat(inline-call1) ; all(x) |})

  inline-call1 =
    ?CallT(SVar(n),ss,ts)
    ; <inline-call>(n, <length>ss, <length>ts, ss, ts)

  simplify-strategy-body(extra|) =
    innermost( remove-irrelevant-match
             + remove-irrelevant-build
             + remove-scope
             + remove-primitive
             + remove-dynamic-rules
             + remove-irrelevant-structure
             + app
             + simple-simplify
             + eliminate-rule
             + extra
             + \ Choice(Id,s) -> Choice(s,Id) where <not(Id)>s \ )

  make-it =
    bottomup(
      try( \ CallT(_,[Id|_],_) -> "unknown"{<new>} \
        <+ \ CallT(_,[Build(Str(f))|_],_) -> f{<new>} \ ) )

rules
  call-weg : CallT(SVar(f),ss,ts) -> Id
    where not(<eq>("xtc-transform",f) + <eq>("xtc-command",f))
  call-weg : |[ rec x (s) ]| -> Id
    where <contains-no-xtc-call>s
  call-weg : |[ rec x (s) ]| -> s
    where <is-subterm>("xtc-transform", s) + <is-subterm>("xtc-command", s)
  call-weg : Cong(_,s) -> Id
    where <contains-no-xtc-call>s

strategies
  contains-no-xtc-call =
    where(not(oncetd( ?CallT("xtc-transform",_,_)
                    + ?CallT("xtc-command",_,_) )))

rules
  remove-irrelevant-match : |[ ?t ]| -> |[ id ]|

  remove-irrelevant-build : Build(t) -> Id where <not(Str(id))>t
  remove-irrelevant-build : Seq(Build(t),s) -> s
  remove-irrelevant-build : Seq(s,Build(t)) -> s

  remove-scope : |[ { x* : s } ]|   -> |[ s ]|
  remove-scope : |[ { s } ]|        -> |[ s ]|
  remove-scope : |[ {| x* : s |} ]| -> |[ s ]|
  remove-scope : |[ { : s } ]|      -> |[ s ]|
  remove-scope : Scope(_,s)         -> |[ s ]|

  remove-primitive : |[ prim(str:f,t*) ]|    -> |[ id ]|
  remove-primitive : |[ prim(str:f,s*|t*) ]| -> |[ id ]|
  remove-primitive : |[ all(s) ]|  -> |[ s ]|
  remove-primitive : |[ one(s) ]|  -> |[ s ]|
  remove-primitive : |[ some(s) ]| -> |[ s ]|
  remove-primitive : |[ test(s) ]| -> |[ s ]|
  remove-primitive : |[ not(s) ]|  -> |[ s ]|

  remove-dynamic-rules : DynamicRules(_) -> Strategy |[ id ]|
  remove-dynamic-rules : OverrideDynamicRules(_) -> Strategy |[ id ]|

  remove-irrelevant-structure : |[ where(id) ]| -> |[ id ]|
  remove-irrelevant-structure : |[ id ; s ]| -> |[ s ]|
  remove-irrelevant-structure : |[ s ; id ]| -> |[ s ]|
  remove-irrelevant-structure : Let(t,s) -> |[ s ]|
    where <contains-no-xtc-call>t

  app : Strat |[ <s>t ]| -> |[ s ]|
  app : App(s,t) -> s
  normalize : |[ f(a*) : r ]| -> |[ f(a* | ) = s ]|
    where <eliminate-rule>r => s

  normalize : |[ f(a1*|a2*) : r ]| -> |[ f(a1*|a2*) = s ]|
    where <eliminate-rule>r => s

  eliminate-rule : |[ t1 -> t2 where s ]| -> |[ ?t1 ; where(s) ; !t2 ]|
  eliminate-rule : |[ t1 -> t2 ]| -> |[ ?t1 ; !t2 ]|

  simple-simplify : |[ s + s ]| -> |[ s ]|
  simple-simplify : |[ fail + s ]| -> |[ s ]|
  simple-simplify : |[ s + fail ]| -> |[ s ]|
  simple-simplify : |[ s < s1 + s2 ]| -> |[ s;s1 <+ s2 ]|
  simple-simplify : |[ s1 <+ s2 ]| -> |[ s1 + s2 ]|

strategies

  make-inline-rule =
    ?SDefT(f,ss,ts,s)
    ; where(<length => ls>ss ; <length => lt>ts)
    ; rules(
        inline-call : (f,ls,lt,ss1,ts1) -> s
          where <eq>(<length>ss, <length>ss1)
              ; <eq>(<length>ts, <length>ts1)
              ; <substitute-args>(ss,ss1,s)
              ; <substitute-term-args>(ts,ts1,<id>) => s
              ; rules(inline-call : (f,ls,lt,ss1,ts1) -> Undefined)
      )

  substitute-term-args =
    {| SubsTerm :
       ?(xs, ss, s)
       ; <zip(substitute-term-arg)> (xs, ss)
       ; <topdown(try(SubsTerm))> s
    |}

  substitute-term-arg =
    ?(VarDec(x,_), t)
    ; rules( SubsTerm : Var(x) -> t )

  substitute-args =
    {| SubsArgCall1, SubsArgCall2 :
       ?(xs, ss, s)
       ; <zip(substitute-arg)> (xs, ss)
       ; <topdown(try(SubsArgCall1 + SubsArgCall2))> s
    |}

  substitute-arg =
    ?(VarDec(x, FunType([_],_)), s)
    ; rules(SubsArgCall1 : CallT(SVar(x), [], []) -> s)

  substitute-arg =
    ?(VarDec(x, FunType([_,_|_],_)), CallT(SVar(y), [], []))
    ; rules(SubsArgCall2 : CallT(SVar(x), ss, ts) -> CallT(SVar(y), ss, ts))

strategies // from the compiler

  comp(|f) = xtc-transform(!f, !["-b" | <pass-verbose> ])

  comp(|f,args) = xtc-transform(!f, <concat>[["-b"], <pass-verbose>, args])

  add-main =
    try( where( <get-config> "--main" => m
              ; if-verbose2(debug(!"main strategy is: ")) )
       ; xtc-io-transform(AddMain(!m)) )

  AddMain(m) :
    Specification(sects) ->
      Specification([ Strategies([SDef("main", [], Call(SVar(<m>()), []))])
                    | sects ])