parts of a program in any order and extract documentation and code from the same source file. The author argues that language dependence and feature complexity have hampered acceptance of these tools, then offers a simpler alternative.

LITERATE PROGRAMMING SIMPLIFIED

NORMAN RAMSEY, Bellcore

In 1983, Donald Knuth introduced literate programming in the form of Web, his tool for writing literate Pascal programs. Web lets authors interleave source code and descriptive text in a single document. It also frees authors to arrange the parts of a program in an order that helps explain how the program functions, not necessarily the order required by the compiler.

In the mid-80s, word spread about this new programming method as several literate programs were published. In 1987, Communications of the ACM created a special forum to discuss literate programming.[2] Web was adapted to programming languages other than Pascal, including C, Modula-2, Fortran, Ada, and others.[3-6] With experience, however, many Web users became dissatisfied. Continued interest in literate programming led to a frenzy of tool building. In the resulting confusion, the literate-programming forum was dropped, on the grounds that literate programming had become the province of those who could build their own tools.[8]

The proliferation of literate-programming tools made it hard for literate programming to enter the mainstream, but it led to a better understanding of what such tools should do. Today the field is more mature, and there is an emerging demand for tools that are simple, easy to learn, and not tied to a particular programming language. My own literate-programming tool, noweb, fills this niche.

IEEE SOFTWARE 0740-7459/94 © 1994 IEEE

AN EXAMPLE OF NOWEB: COUNTING WORDS

This example, based on a program by Klaus Guntermann and Joachim Schrod and a program by Silvio Levy and D. E. Knuth, presents the word count program from Unix, rewritten in noweb to demonstrate literate programming using noweb. The level of detail in this document is intentionally high, for didactic purposes; many of the things spelled out here don't need to be explained in other programs.

The purpose of wc is to count lines, characters, and/or words in a list of files. The number of lines in a file is the number of newline characters it contains. The number of characters is the file length in bytes. A word is a maximal sequence of consecutive characters other than newline, space, or tab, containing at least one visible ASCII code. (We assume that the standard ASCII code is in use.)

Most literate C programs share a common structure. It's probably a good idea to state the overall structure explicitly at the outset, even though the various parts could all be introduced in chunks named <<*>> if we wanted to add them piecemeal.

Here, then, is an overview of the file wc.c that is defined by the noweb program wc.nw:

98a  Root chunk (not used in this document).

We must include the standard I/O definitions because we want to send formatted output to stdout and stderr.

98b  <Header files to include 98b>=
         #include <stdio.h>
     This code is used in chunk 98a.

The status variable will tell the operating system if the run was successful or not, and prog_name is used in case there's an error message to be printed.

98c  <Definitions 98c>=
         #define OK 0                 /* status code for successful run */
         #define usage_error 1        /* status code for improper syntax */
         #define cannot_open_file 2   /* status code for file access error */
     Defines:
       cannot_open_file, used in chunk 100c.
       OK, used in chunk 98d.
       usage_error, used in chunk 102d.
     Uses status 98d.
     This definition is continued in chunks 100b, 100e, and 102c.
     This code is used in chunk 98a.

98d  <Global variables 98d>=
         int status = OK;     /* exit status of command, initially OK */
         char *prog_name;     /* who we are */

Freely available on the Internet since 1989, noweb strips literate programming to its essentials.
Programs are composed of named chunks of code, written in any order, with documentation interleaved.

To facilitate comparison of Web and noweb, a sample noweb program appears in the shaded box that runs throughout this article. I took the text, code, and presentation for this sample from Knuth's Literate Programming.

Noweb was developed on Unix and can be ported to non-Unix platforms provided they can simulate pipelines and support both ANSI C and either awk or Icon. For example, Kean College's Lee Wittenberg ported noweb to MS-DOS. Noweb is unique among literate-programming tools in its pipelined, extensible implementation, which makes it easy for experimenters to create new features without writing their own tools.

WEB'S COMPLEXITIES

Web's complexities make it difficult to explore the idea of literate programming because too much effort is required to master the tool. To compound the difficulty, different programming languages are served by different versions of Web, each with its own idiosyncrasies.

The classic Web expands three kinds of macros, prettyprints code for typeset output, evaluates some constant expressions, hacks string support into Pascal, and implements a simple form of version control. The manual documents 27 control sequences.[1] Versions for languages other than Pascal offer slightly different functions and different sets of control sequences.

Web uses its Tangle tool to produce source code and its Weave tool to produce documentation. Web's original Tangle removed white space and folded lines to fill each line with tokens, making its output unreadable. Later adaptations preserved line breaks but removed other white space. Web's Weave divides a program into numbered sections, and its index and cross-reference information refer to section numbers, not page numbers. Web works poorly with LaTeX: LaTeX constructs cannot be used in Web source, and getting Weave output to work in LaTeX documents requires tedious adjustments by hand. Weave's source (written in Web) is several thousand lines long, and the formatting code is not isolated.

SEPTEMBER 1994

AN EXAMPLE OF NOWEB: COUNTING WORDS (continued)

     This code is used in chunk 98a.

Now we come to the general layout of the main function.

99a  <The main program 99a>=
         main(argc, argv)
             int argc;       /* # arguments on Unix command line */
             char **argv;

NOWEB'S FEATURES

Noweb's simplicity derives from a simple model of files, which are marked up using a simple syntax. Figure 1 shows a fragment of the noweb source used to generate the boxed sample program. It shows examples of chunk definitions and uses, quoted code, and lists of defined identifiers: all of noweb's syntax except escaped angle brackets.

[Figure 1. A noweb source fragment from the example program.]

File structure. A noweb file is a sequence of chunks. A chunk may contain code, in which case it is named, or documentation, in which case it is unnamed. Chunks may appear in any order. Each code chunk begins with <<chunk name>>= on a line by itself. The double-left angle bracket must be in the first column. Each documentation chunk begins with a line that starts with an @ symbol followed by a space or newline. Chunks are terminated implicitly by the beginning of another chunk or by the end of the file. If the first line in the file does not mark the beginning of a chunk, noweb assumes it is the first line of a documentation chunk.

As Figure 2 shows, noweb uses its notangle and noweave tools to extract code and documentation, respectively. When notangle is given a noweb file, it writes the program on standard output.
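The file structure just described is simple enough that the heart of code extraction fits in a few lines. The Python sketch below is an illustration only, not notangle's implementation. The names parse_chunks and expand are hypothetical, a chunk reference is expanded only when it stands alone on its line, definitions that share a name are concatenated, and escaped angle brackets are not handled.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")   # code-chunk header, starting in column 1
CHUNK_USE = re.compile(r"<<(.+?)>>")        # reference to another chunk


def parse_chunks(source):
    """Collect code chunks by name; same-named definitions are concatenated."""
    chunks = {}
    current = None                      # None while inside a documentation chunk
    for line in source.splitlines():
        header = CHUNK_DEF.match(line)
        if header:
            current = chunks.setdefault(header.group(1), [])
        elif line == "@" or line.startswith("@ "):
            current = None              # an @ line starts a documentation chunk
        elif current is not None:
            current.append(line)
    return chunks


def expand(chunks, name="*", indent=""):
    """Expand one chunk recursively, indenting nested chunks like their reference."""
    out = []
    for line in chunks[name]:
        use = CHUNK_USE.search(line)
        if use and line.strip() == use.group(0):
            # The reference is alone on its line: reuse its leading white space.
            out.extend(expand(chunks, use.group(1), indent + line[:use.start()]))
        else:
            out.append(indent + line)
    return out


SOURCE = """\
This documentation is ignored when extracting code.
<<*>>=
int main(void) {
    <<body>>
    return 0;
}
@ More documentation.
<<body>>=
puts("hello");
<<body>>=
puts("world");
"""

program = expand(parse_chunks(SOURCE))
print("\n".join(program))
```

Running the sketch prints a small C program in which both definitions of the body chunk appear, in order, at the indentation of the reference.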
When noweave is given a noweb file, it reads it and produces, on standard output, TeX source for typeset documentation.

[Figure 2. Using noweb to build code and documentation.]

AN EXAMPLE OF NOWEB: COUNTING WORDS (continued)

                             /* the arguments, an array of strings */
         {
             <Variables local to main 99b>
             prog_name = argv[0];
             <Set up option selection 99c>
             <Process all the files 99d>
             <Print the grand totals if there were multiple files 102b>
             exit(status);
         }
     Defines:
       argc, used in chunks 99c and 99d.
       argv, used in chunks 99c, 100c, and 101e.
       main, never used.
     Uses prog_name 98d and status 98d.
     This code is used in chunk 98a.

If the first argument begins with a -, the user is choosing the desired counts and specifying the order in which they should be displayed. Each selection is given by the initial character (lines, words, or characters). For example, -cl would cause just the number of characters and the number of lines to be printed, in that order.

We do not process this string now; we simply remember where it is. It will be used to control the formatting at output time.

99b  <Variables local to main 99b>=
         int file_count;   /* how many files there are */
         char *which;      /* which counts to print */
     Defines:
       file_count, used in chunks 99c, 100c, 101e, and 102b.
       which, used in chunks 99c, 101e, 102b, and 102d.
     This definition is continued in chunks 100a and 100f.
     This code is used in chunk 99a.

99c  <Set up option selection 99c>=
         which = "lwc";    /* if no option is given print 3 values */
         if (argc > 1 && *argv[1] == '-') {
             which = argv[1] + 1;
             argc--;
             argv++;
         }
         file_count = argc - 1;
     Uses argc 99a, argv 99a, file_count 99b, and which 99b.
     This code is used in chunk 99a.

Now we scan the remaining arguments and try to open a file, if possible. The file is processed and its statistics are given. We use a do ... while loop because we should read from the standard input if no file name is given.

99d  <Process all the files 99d>=
         argc--;
         do {
             <If a file is given, try to open *(++argv); continue if unsuccessful 100c>
             <Initialize pointers and counters 101a>
             <Scan file 101c>
             <Write statistics for file 101e>
             <Close file 100d>
             <Update grand totals 102a>   /* even if there is only one file */
         } while (--argc > 0);
     Uses argc 99a.
     This code is used in chunk 99a.

Here's the code to open the file. A special trick allows us to handle input from stdin when no name is given. Recall that the file descriptor to stdin is 0; that's what we use as the default initial value.

100a <Variables local to main 99b>+=
         int fd = 0;   /* file descriptor, initialized to stdin */
     Defines:
       fd, used in chunks 100c, 100d, and 101d.

100b <Definitions 98c>+=
         #define READ_ONLY 0   /* read access code for system open */
     Defines:
       READ_ONLY, used in chunk 100c.

100c <If a file is given, try to open *(++argv); continue if unsuccessful 100c>=
         if (file_count > 0 && (fd = open(*(++argv), READ_ONLY)) < 0) {
             fprintf(stderr, "%s: cannot open file %s\n", prog_name, *argv);
             status |= cannot_open_file;
             file_count--;
             continue;
         }
     Uses argv 99a, cannot_open_file 98c, fd 100a, file_count 99b, prog_name 98d, READ_ONLY 100b, and status 98d.
     This code is used in chunk 99d.

100d <Close file 100d>=
         close(fd);
     Uses fd 100a.
     This code is used in chunk 99d.

We will do some homemade buffering in order to speed things up: Characters will be read into the buffer array before we process them. To do this we set up appropriate pointers and counters.

100e <Definitions 98c>+=
         #define buf_size BUFSIZ   /* stdio.h BUFSIZ chosen for efficiency */
     Defines:
       buf_size, used in chunks 100f and 101d.

Code chunks. Code chunks contain program source code and references to other code chunks. Several code chunks may have the same name; notangle concatenates their definitions to produce a single chunk. Code-chunk definitions are like macro definitions: notangle extracts a program by expanding one chunk (by default the chunk named <<*>>). The definition of that chunk contains references to other chunks, which are themselves expanded, and so on. Figure 3 shows part of the boxed sample program as extracted by notangle. Notangle's output is readable; it preserves white space and maintains the indentation of expanded chunks with respect to the chunks in which they appear. This behavior allows noweb to be used with languages like Miranda and Haskell, in which indentation is significant.

When double-left and -right angle brackets are not paired, they are treated as literals. Users can force any such brackets, even paired brackets, to be treated as literal by using a preceding @ sign.

Any line beginning with @ and a space terminates a code chunk. If such a line has the form

    @ %def identifiers

it also means that the preceding chunk defines the identifiers listed in identifiers. This notation provides a way of marking definitions manually when no automatic marking is available.

Documentation chunks. Documentation chunks contain text that is ignored by notangle and copied verbatim to standard output by noweave (except for quoted code). Code may be quoted within documentation chunks by placing double square brackets around it. These brackets are ignored by notangle but are used by noweave to give the quoted code special typographic treatment. For example, in the sample program, quoted code is set in the Courier font.

Noweave can work with LaTeX, or it can use a plain TeX macro package, supplied with noweb, that defines commands like \chapter and \section. Noweave can also work with HTML, the hypertext markup language for Mosaic and the World-Wide Web. The example simulates the results after processing by noweave and LaTeX.

Noweave adds no newline characters to its output, making it easy to find the sources of TeX or LaTeX errors. For example, an error on line 634 of a generated TeX file is caused by a problem on line 634 of the corresponding noweb file.

Index and cross-reference features.
Cross-referencing of chunks and identifiers makes large programs easier to understand. The sample program accompanying this article shows full cross-reference information.

Unlike Web, noweb does not introduce numbered sections for cross-referencing. Noweb uses page numbers. If two or more chunks appear on a page, say page 24, they are distinguished by appending a letter to the page number: 24a or 24b, for example. Readers of large literate programs will appreciate the use of a single numbering system.

Like Web, noweb writes chunk cross-reference information in a footnote font below each code chunk. Noweb also includes cross-reference information for identifiers, for example, "Defines file_count, used in chunks 7, 11, 19, and 21." Noweb generates this by using the @ %def markings in its source code, or by recognizing definitions automatically.

Although noweb can automatically recognize definitions in C programs, I used @ %def to mark the definitions in the sample program. This choice not only illustrates the use of @ %def but it also ensures results compatible with the CWeb version of this program. Automatically generated indices would differ because CWeb and noweb use different recognition heuristics. Because noweb uses a language-independent heuristic to find identifier uses, it can be fooled into finding false uses in comments or string literals, like the use of status in chunk 3.

Compiler and debugger support. On a large project, it is essential that compilers and other tools refer to locations in the noweb source, even though they work with notangle's output. Giving notangle the -L option makes it emit pragmas that inform compilers of the placement of lines in the noweb source. It also preserves the columns in which tokens appear, so that line-and-column error messages are accurate. If you do not give notangle the -L option, it respects the indentation of its input, making its output easy to read.
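To make the effect of -L concrete, here is a small Python sketch. It assumes that the pragmas are C-style #line directives and that every extracted line is tagged with the noweb source line it came from. The function name with_line_directives and the tagging scheme are hypothetical, and the column preservation mentioned above is not modeled.

```python
def with_line_directives(tagged_lines, filename):
    """Insert a #line directive wherever the source line numbers jump.

    tagged_lines is a sequence of (source_line_number, text) pairs in the
    order in which the extracted program is written.
    """
    out = []
    expected = None                     # source line that would follow without a jump
    for number, text in tagged_lines:
        if number != expected:          # chunk boundary: tell the compiler where we are
            out.append(f'#line {number} "{filename}"')
        out.append(text)
        expected = number + 1
    return out


# A root chunk on source lines 12-14 that pulls in one line from line 40.
tagged = [
    (12, "int main(void) {"),
    (40, '    puts("hi");'),
    (13, "    return 0;"),
    (14, "}"),
]
result = with_line_directives(tagged, "wc.nw")
print("\n".join(result))
```

A directive appears before lines 12, 40, and 13, but not before line 14, which simply follows line 13. A compiler reading this output would report errors against wc.nw rather than against the extracted file.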
Formatting features. Noweave depends on text formatters in two ways: in the source of noweave itself and in the supporting macros. Noweave's dependence on its formatter is small and isolated, instead of being distributed throughout a large implementation. Noweb uses 250 lines of source for TeX and LaTeX combined, and another 250 for HTML. It uses about 200 lines of supporting macros for plain TeX and another 300 lines to support LaTeX, primarily because the page-based cross-reference mechanism is complex. LaTeX support without cross-referencing requires only 34 lines of source and no supporting macros. HTML requires no supporting macros.

    main(argc, argv)
        int argc;      /* the number of arguments on the UNIX command line */
        char **argv;   /* the arguments themselves, an array of strings */
    {
        int file_count;            /* how many files there are */
        char *which;               /* which counts to print */
        int fd = 0;                /* file descriptor, initialized to stdin */
        char buffer[buf_size];     /* we read the input into this array */
        register char *ptr;        /* the first unprocessed character in buffer */
        register char *buf_end;    /* the first unused position in buffer */
        register int c;            /* current character, or number of characters just read */
        int in_word;               /* are we within a word? */
        long word_count, line_count, char_count;
                                   /* number of words, lines, and characters found in file so far */
        prog_name = argv[0];
        which = "lwc";

Figure 3. Part of the example program after extraction by notangle.

AN EXAMPLE OF NOWEB: COUNTING WORDS (continued)

101a <Initialize pointers and counters 101a>=
         ptr = buf_end = buffer;
         line_count = word_count = char_count = 0;
         in_word = 0;
     Uses buf_end 100f, buffer 100f, char_count 100f, in_word 100f, line_count 100f, ptr 100f, and word_count 100f.
     This code is used in chunk 99d.

The grand totals must be initialized to zero at the beginning of the program. If we made these variables local to main, we would have to do this initialization explicitly; however, C's globals are automatically zeroed. (Or rather, statically zeroed.) (Get it?)

101b <Global variables 98d>+=
         long tot_word_count, tot_line_count, tot_char_count;
             /* total number of words, lines, chars */

The present chunk, which does the counting that is wc's raison d'être, was actually one of the simplest to write. We look at each character and change state if it begins or ends a word.

101c <Scan file 101c>=
         while (1) {
             <Fill buffer if it is empty; break at end of file 101d>
             c = *ptr++;
             if (c > ' ' && c < 0177) {   /* visible ASCII codes */
                 if (!in_word) {
                     word_count++;
                     in_word = 1;
                 }
                 continue;
             }
             if (c == '\n') line_count++;
             else if (c != ' ' && c != '\t') continue;
             in_word = 0;   /* c is newline, space, or tab */
         }
     Uses in_word 100f, line_count 100f, ptr 100f, and word_count 100f.
     This code is used in chunk 99d.

Buffered I/O allows us to count the number of characters almost for free.

101e <Write statistics for file 101e>=
         wc_print(which, char_count, word_count, line_count);
         if (file_count) printf(" %s\n", *argv);   /* not stdin */
         else printf("\n");                        /* stdin */
     Uses argv 99a, char_count 100f, file_count 99b, line_count 100f, wc_print 102d, which 99b, and word_count 100f.
     This code is used in chunk 99d.

102a <Update grand totals 102a>=
         tot_line_count += line_count;
         tot_word_count += word_count;
         tot_char_count += char_count;
     Uses char_count 100f, line_count 100f, and word_count 100f.
     This code is used in chunk 99d.

We might as well improve a bit on Unix's wc by displaying the number of files too.

102b <Print the grand totals if there were multiple files 102b>=
         if (file_count > 1) {
             wc_print(which, tot_char_count, tot_word_count, tot_line_count);
             printf(" total in %d files\n", file_count);
         }
     Uses file_count 99b, wc_print 102d, and which 99b.
     This code is used in chunk 99a.

The function below prints the values according to the specified options. The calling routine should supply a newline. If an invalid option character is found we inform the user about proper use of the command. Counts are printed in eight-digit fields so they will line up in columns.
102c <Definitions 98c>+=
         #define print_count(n) printf("%8ld", n)
     Defines:
       print_count, used in chunk 102d.

102d <Functions 102d>=
         wc_print(which, char_count, word_count, line_count)
             char *which;                               /* which counts to print */
             long char_count, word_count, line_count;   /* given totals */
         {
             while (*which)
                 switch (*which++) {
                 case 'l': print_count(line_count); break;
                 case 'w': print_count(word_count); break;
                 case 'c': print_count(char_count); break;
                 default:
                     if ((status & usage_error) == 0) {
                         fprintf(stderr, "\nUsage: %s [-lwc] [filename ...]\n", prog_name);
                         status |= usage_error;
                     }
                 }
         }
     Defines:
       wc_print, used in chunks 101e and 102b.
     Uses char_count 100f, line_count 100f, print_count 102c, prog_name 98d, status 98d, usage_error 98c, which 99b, and word_count 100f.
     This code is used in chunk 98a.

A test of this program against the system wc command on a SPARCstation showed the official wc was slightly slower. Although that wc gave an appropriate error message for the options -abc, it made no complaints about the options -labc! Dare we suggest the system routine might have been better had its programmer used a more literate approach?

Uncoupling files and programs. The mapping between noweb files and programs is many-to-many; the mapping between files and documents is many-to-one. You combine source files by listing their names on notangle's or noweave's command line. Notangle can extract more than one program from a single source file by using the -R command-line option to identify the root chunks of the different programs.

The simplest example of one-to-many program mapping is that of putting a C header and program in a single noweb file. The header comes from the root chunk <<header>>, and the program from the default root chunk, <<*>>. The following Unix commands extract files wc.h and wc.c from noweb file wc.nw.

    notangle -L wc.nw > wc.c
    notangle -Rheader wc.nw | cpif -ne wc.h

The > in the first command directs notangle's output to the file wc.c.
The | in the second command directs notangle's output to the cpif program, which is distributed with noweb. cpif -ne wc.h compares its input to the contents of file wc.h; if they differ, the input replaces wc.h. This trick avoids touching the file wc.h when its contents have not changed, which avoids triggering unnecessary recompilations.

Because it is language-independent, noweb can combine different programming languages in a single literate program. This ability makes it possible to explain all of a project's source in a single document, including not just ordinary code but also things like makefiles, test scripts, and test inputs. Using literate programming to describe tests as well as source code provides a lasting, written explanation of the thinking needed to create the tests, and it does so with little overhead. If not documented at the time, the rationale behind complex tests can easily be lost.

IMPLEMENTING NOWEB

Until now we have discussed noweb from a user's point of view, showing that it is simple and easy to use. Noweb's implementation is also worth discussing, because noweb's extensible implementation makes it unique among literate-programming tools.

Noweb tools are implemented as pipelines. Each pipeline begins with the noweb source file. Successive stages of the pipeline implement simple transformations of the source, until the desired result emerges from the end of the pipeline. Users change or extend noweb not by recompiling but by inserting or removing pipeline stages; for example, noweave switches from LaTeX to HTML by changing just the last pipeline stage. Noweb's extensibility enables its users to create new literate-programming features without having to write their own tools.

Noweb's syntax is easy to read, write, and edit, but it is not easily manipulated by programs.
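The pipeline organization can be pictured as a list of stages, each consuming the output of the one before it. The Python sketch below is a toy model under stated assumptions: noweb's real stages are separate Unix programs connected by pipes, and the record format used here (a keyword such as @defn or @text followed by an argument) is a simplified stand-in rather than noweb's actual intermediate representation. All function names are hypothetical.

```python
from functools import reduce


def markup(lines):
    """Toy first stage: turn source lines into '@keyword argument' records."""
    for line in lines:
        if line.startswith("<<") and line.endswith(">>="):
            yield "@defn " + line[2:-3]
        else:
            yield "@text " + line


def squeeze_names(records):
    """Middle stage (a filter): collapse runs of white space in chunk names."""
    for record in records:
        if record.startswith("@defn "):
            record = "@defn " + " ".join(record[6:].split())
        yield record


def to_plain(records):
    """Final stage A: a plain-text rendering."""
    for record in records:
        keyword, _, rest = record.partition(" ")
        yield f"[{rest}]" if keyword == "@defn" else rest


def to_html(records):
    """Final stage B: a minimal HTML rendering (no escaping, for brevity)."""
    for record in records:
        keyword, _, rest = record.partition(" ")
        yield f"<b>{rest}</b>" if keyword == "@defn" else f"<p>{rest}</p>"


def run(pipeline, lines):
    """Feed each stage's output into the next stage, like a Unix pipe."""
    return list(reduce(lambda data, stage: stage(data), pipeline, lines))


source = ["<<Scan   file>>=", "c = *ptr++;"]
plain = run([markup, squeeze_names, to_plain], source)
html = run([markup, squeeze_names, to_html], source)   # only the last stage differs
print(plain)
print(html)
```

The two runs share every stage but the last, which mirrors the way a different back end is selected by swapping a single stage. Dropping squeeze_names from either list removes that behavior without touching any other stage.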
Markup, which is the first stage in every pipeline, converts noweb source to a representation easily manipulated by common Unix tools like sed and awk, greatly simplifying the construction of later pipeline stages. Middle stages add information to the representation. Notangle's final stage converts to code; noweave's final stages convert to TeX, LaTeX, or HTML.

In the pipeline representation, every line begins with @ and a keyword. The most important possibilities appear in Table 1. Markup brackets chunks by @begin ... @end, and it uses the noweb source to identify text and newlines, definitions and uses of chunks, and quoted code, which can all appear inside chunks. It also preserves information about file names and defined identifiers. Other index and cross-reference information is inserted automatically by later pipeline stages. The details of noweb's pipeline representation are described in the Noweb Hacker's Guide, which is distributed with noweb.

Table 1. Keywords in the pipeline representation.

    Keyword              Meaning
    @begin               Start a chunk
    @end                 End a chunk
    @text string         string appeared in a chunk
    @nl                  A newline appeared in a chunk
    @defn name           The code chunk named name is being defined
    @use name            A reference to code chunk named name
    @quote               Start of quoted code in a documentation chunk
    @endquote            End of quoted code in a documentation chunk
    @file filename       Name of the file from which the chunks came
    @index defn ident    The current chunk contains a definition of ident
    @index ...           Automatically generated index information
    @xref ...            Automatically generated cross-reference information

EXTENDING NOWEB

Noweb lets users insert stages into the notangle and noweave pipelines, so that they can change a tool's existing behavior or add new features without recompiling. Even language-dependent features like formatted output and automatic index generation have been added to noweb without recompiling.

Stages inserted in the middle of a pipeline both read and write noweb's pipeline representation; they are called filters, by analogy with Unix filters, which are used in the Unix implementation. Filters can be used to change the way noweb works; for example, a one-line sed script makes noweb treat two chunk names as identical if they differ only in their representation of white space, as in Web. A 55-line Icon program makes it possible to abbreviate chunk names using a trailing ellipsis.

To share programs with colleagues who don't enjoy literate programming, I use a filter that places each line of documentation in a comment and moves it to the succeeding code chunk. With this filter, notangle transforms a literate program into a traditional commented program, without loss of information and with only a modest penalty in readability.

Filters can be used to add significant features. Noweave's cross-reference and indexing features use two filters, one that finds uses of defined identifiers and one that inserts cross-reference information. In most cases, programmers must mark identifier definitions by hand, using @ %def, but in some cases a third, language-dependent filter can be used to mark identifier definitions, making index generation completely automatic.

Kostas Oikonomou of AT&T Bell Labs, Kaelin Colclasure of Bridge Information Systems, and Conrado Martinez-Parra of the Universidad Politecnica de Catalunya in Barcelona have written noweb filters that add prettyprinting for Icon, C++, and Dijkstra's language of guarded commands, respectively.

Noweb turns a World-Wide-Web browser like Mosaic into a hypertext browser for literate programs. For example, you can click on an identifier or chunk name to jump to the definition of that identifier or chunk. You can find a hypertext version of the boxed sample program at ftp://bellcore.com/pub/norman/noweb/wc.html.

EVALUATING NOWEB

Reviewers have had many expectations of literate-programming tools.[10] We expect to be able to write code chunks in any order. We expect to develop code and documentation in one place. Finally, we expect automatically generated cross-reference and index information. Like the original Web, noweb provides all these features, but in simpler form.

Web does provide features that noweb lacks, but existing Unix tools can substitute for most of these. Although noweb contains no internal support for macros, Unix supplies two macro processors that can work with noweb: the C preprocessor and the m4 macro processor. The xstr program extracts string literals, and the patch program provides a form of version control similar to Web's change files.

Indexing and cross-referencing make noweb less simple than it could be. I need complex LaTeX code to compute page numbers for use in cross-reference lists and in the index. The ability to use page numbers justifies this complexity, especially since it can be hidden from most users. You do need to understand the LaTeX code if you want to customize the appearance of your noweb documents while retaining noweb's use of page numbers for cross-reference. Most literate-programming tools forbid customization, but not all users will accept such a restriction. I have compromised between simplicity and customizability by adding LaTeX options for a dozen of the most commonly requested customizations. Users can choose from among these options without understanding noweb's LaTeX code.

Experimenting with noweb is easy because the tools are simple. If the experiment is unsatisfying, it is easy to abandon, because notangle's output is readable, and documentation can be preserved as embedded comments.

Noweb is simpler than Web and easier to use and understand, but it does less. I argue, however, that the benefit of Web's extra features is outweighed by the cost of the extra complexity, making noweb better for writing literate programs. Few of Web's remaining features will be missed; for example, many compilers evaluate constant expressions at compile time. Noweb users are most likely to miss pretty-printing, but it may be more trouble than it is worth.

In my own work, I have used noweb for code written in various languages, including assembly language, awk, Bourne shell, C, Icon, Modula-3, Promela, Standard ML, and TeX. These projects have ranged in size from a few hundred to twenty thousand lines of code.

Information about other programs written using noweb is hard to find. Noweb is provided free of charge, generating no sales records. Programs created with noweb may be delivered in the form of ordinary source code, leaving no clue that noweb was used. The only way for me to find out about uses of noweb is to appeal for information on the Internet. In this way I have learned about significant noweb projects in C++, Modula-2, Occam, parallel C, Perl, Prolog, and Scheme.

David Hanson and Chris Fraser are using noweb to write a book describing the design and implementation of a retargetable C compiler. Tipton Cole & Company use noweb in their consulting business, which focuses on writing database applications on DOS platforms. They find that noweb helps compensate for some of the deficiencies in DOS database tools, and that literate programming helps when a customer requests a change in a program that hasn't been touched in a year. A customer-support group at Sun Microsystems is using noweb to help teach their customers how to work with aspects of the Solaris operating system like threads and device drivers.
The literate-programming paradigm makes it possible to extract working code from the same source used to create technical reports and newsletters.

OTHER TOOLS

A survey of literate-programming tools is beyond the scope of this article, but we can still sketch noweb's place in the context of other tools.

Most literate-programming tools are language-dependent and complex. You must change tools when changing programming languages, repeating effort spent mastering a tool. Newer tools, like noweb, are language-independent. The three most prominent are noweb, nuweb, and Funnelweb.

To users, noweb and nuweb look very similar. There are minor syntactic differences, and nuweb uses markup within the source file instead of command-line options to show things like the names of output files, but both are simple and easy to master. Funnelweb is a complex tool that includes its own rudimentary typesetting language and command shell.

Many of the similarities between noweb and nuweb arise by design. Nuweb's initial design borrowed from noweb, and later versions of each tool have incorporated ideas from the other.

Noweb and nuweb differ substantively in implementation. Nuweb is not pipelined; it is a single, monolithic C program. This structure makes nuweb easy to port, since only a C compiler is needed, and it makes it faster, since no parts are interpreted and the overhead of creating a pipeline is eliminated, but it also makes nuweb hard to extend. Noweb's pipeline makes it easy to extend, and different stages of the pipeline can be implemented in different programming languages, depending on which language is best for which job. Extensibility is particularly valuable to those interested in pushing the frontiers of literate programming, who would otherwise have to write their own tools from scratch.

I advocate language-independent tools for two reasons.
First, after mastering one such tool, you can write almost anything as a literate program, including things like shell and Perl scripts, which often benefit disproportionately from a literate treatment. Second, two of these tools, noweb and nuweb, are much simpler, and therefore much easier to master, than any of the language-dependent tools. Those who use one language exclusively may, however, prefer a language-dependent tool, since it provides pretty-printing, which when done well can make the printed literate program easier to read.
Noweb probably culminates one kind of evolution in literate programming: the trend toward greatest simplicity. No significantly simpler tool could do much. Noweb also begins another kind of evolution, toward greater extensibility and flexibility. Further evolution might involve replacing Unix shell scripts and pipelines with an embedded language having special data types to represent pipelines, chunks, and literate programs. This step would make it easier to port noweb to non-Unix platforms, and it could make noweb run much faster. Other developments might include constructing new pipeline stages to support language-dependent operations like macro processing, pretty-printing, and automatic identifier cross-reference.
These changes would extend noweb's capabilities, but noweb is already quite capable of supporting complex programs and documents. It and related tools are less capable of supporting a modern word-processing style. The word processors noweb currently supports, TeX, LaTeX, and HTML, all use the old batch model of word processing. Today, many authors prefer WYSIWYG word processors like FrameMaker, WordPerfect, or Microsoft Word. Kean College's Wittenberg has developed a noweb-like system called WinWordWeb based on Word. Because of Word's limitations, including its secret proprietary data format, he could not reuse any of noweb's implementation, but the design is the same.
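The extensibility argument above rests on a pipeline structure: a front end, optional middle stages, and a back end, each of which could in principle be written in a different language. The Python sketch below illustrates that idea only; it is not noweb's implementation. The token format, the function names (markup, expand_tabs, tangle, run_pipeline), and the restriction to chunk uses that sit alone on a line are all assumptions made for this example.

```python
import re
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical token format for this sketch only (not noweb's real
# intermediate representation): (kind, text) pairs, where kind is
# "doc", "defn", or "code".
Token = Tuple[str, str]
Stage = Callable[[Iterable[Token]], Iterator[Token]]

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")      # a line that starts a code chunk
CHUNK_USE = re.compile(r"^(\s*)<<(.+)>>\s*$")  # a chunk use alone on a line


def markup(lines: Iterable[str]) -> Iterator[Token]:
    """Front end: turn source lines into a token stream."""
    in_code = False
    for line in lines:
        line = line.rstrip("\n")
        m = CHUNK_DEF.match(line)
        if m:
            in_code = True
            yield ("defn", m.group(1))
        elif line == "@" or line.startswith("@ "):
            in_code = False              # "@" switches back to documentation
            yield ("doc", line[2:])
        elif in_code:
            yield ("code", line)
        else:
            yield ("doc", line)


def expand_tabs(tokens: Iterable[Token]) -> Iterator[Token]:
    """Middle stage: rewrite code tokens, pass everything else through."""
    for kind, text in tokens:
        yield (kind, text.expandtabs(4)) if kind == "code" else (kind, text)


def tangle(tokens: Iterable[Token], root: str) -> str:
    """Back end: collect chunks, then expand the root chunk recursively."""
    chunks: Dict[str, List[str]] = {}
    current = None
    for kind, text in tokens:
        if kind == "defn":
            current = text
            chunks.setdefault(current, [])   # repeated definitions append
        elif kind == "code" and current is not None:
            chunks[current].append(text)

    def expand(name: str, indent: str, active: frozenset) -> List[str]:
        if name in active:
            raise ValueError(f"chunk <<{name}>> uses itself")
        if name not in chunks:
            raise KeyError(f"undefined chunk <<{name}>>")
        out: List[str] = []
        for line in chunks[name]:
            m = CHUNK_USE.match(line)
            if m:   # splice in the used chunk, keeping the indentation
                out.extend(expand(m.group(2), indent + m.group(1), active | {name}))
            else:
                out.append(indent + line)
        return out

    return "\n".join(expand(root, "", frozenset())) + "\n"


def run_pipeline(lines: Iterable[str], stages: List[Stage], root: str) -> str:
    """Connect front end, middle stages, and back end in sequence."""
    tokens: Iterable[Token] = markup(lines)
    for stage in stages:
        tokens = stage(tokens)
    return tangle(tokens, root)


# A tiny literate source in which a chunk is used before it is defined.
SOURCE = """\
The program prints a greeting.
<<main>>=
def main():
    <<body>>
@ The body is defined after its use.
<<body>>=
print("hello")
""".splitlines()

if __name__ == "__main__":
    print(run_pipeline(SOURCE, [expand_tabs], "main"), end="")
```

In this sketch, adding a capability means adding one more function to the list of stages; neither the front end nor the back end changes. A monolithic design would need the same change made inside the single program.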
The challenge for literate programming today is getting it into use. Noweb helps by eliminating clutter and complexity. Supporting modern word processors would eliminate another barrier, making it possible to write literate programs without first learning a new word-processing language like LaTeX or HTML.
More must be learned about suitable ways of structuring literate programs, about whether hypertext is a useful alternative, and about what other kinds of documents literate programs should resemble. What place does literate programming have for the majority of programmers, who are not writing for publication? In the near term, I suspect the best use for literate programming will be to support rapid prototyping, providing a simple and reliable way of documenting the design decisions made in, and the lessons learned from, the prototype. In the long term, I hope that simple, extensible tools like noweb will lead everyone to appreciate the benefits of literate programming.
ACKNOWLEDGEMENTS
Mark Weiser's invaluable encouragement provided the impetus for me to write this paper, which I did while visiting the Computer Science Laboratory of the Xerox Palo Alto Research Center. David Hanson suggested and provided the cpif program. Preston Briggs developed many of the ideas used in noweb's indexing, and he contributed code used in one of the pipeline stages. Bill Trost wrote the first HTML pipeline stage. Dave Love provided much-needed LaTeX expertise. Comments from Hanson and from the anonymous referees stimulated me to improve the paper. The development of noweb was supported by a Fannie and John Hertz Foundation Fellowship.
REFERENCES
1. D.E. Knuth, Literate Programming, Stanford University, Stanford, Calif., 1992.
2. P.J. Denning, Announcing Literate Programming, Comm. ACM, July 1987, p. 593.
3. K. Guntermann and J. Schrod, Web Adapted to C, TUGBoat, Oct. 1986, pp. 134-137.
4. S. Levy, Web Adapted to C, Another Approach, TUGBoat, April 1987, pp. 12-13.
5. N. Ramsey, Literate Programming: Weaving a Language-Independent Web, Comm. ACM, Sept. 1989, pp. 1051-1055.
6. H. Thimbleby, Experiences of Literate Programming Using CWeb (a Variant of Knuth's Web), Computer Journal, 1986, pp. 201-211.
7. N. Ramsey and C. Marceau, Literate Programming on a Team Project, Software-Practice & Experience, July 1991, pp. 677-683.
8. C.J. Van Wyk, Literate Programming: An Assessment, Comm. ACM, Mar. 1990, pp. 361-365.
9. D.E. Knuth, The Web System of Structured Documentation, Tech. Report 980, Computer Science Dept., Stanford Univ., Stanford, Calif., 1983.
Norman Ramsey is a research scientist at Bellcore. His research interests are the construction of software that is easy to understand and to retarget to different machines. His recent work includes a retargetable debugger and a toolkit that helps build debuggers and other programs that manipulate machine code. Ramsey received a PhD in computer science from Princeton University. He is a member of ACM.
Address questions about this article to Ramsey at Bellcore, 445 South Street, Morristown, NJ 07960; [email protected].
Noweb can be obtained by anonymous ftp from CTAN, the Comprehensive TeX Archive Network, in directory web/noweb. CTAN replicas appear on hosts ftp.shsu.edu, ftp.tex.ac.uk, and ftp.uni-stuttgart.de. Noweb's World-Wide-Web page is located at ftp://bellcore.com/pub/norman/noweb.