TeX Program Book
TeX82 PART 1: INTRODUCTION
1. Introduction. This is TeX, a document compiler intended to produce typesetting of high quality.
The Pascal program that follows is the definition of TeX82, a standard version of TeX that is designed to
be highly portable so that identical output will be obtainable on a great variety of computers.
The main purpose of the following program is to explain the algorithms of TeX as clearly as possible. As
a result, the program will not necessarily be very efficient when a particular Pascal compiler has translated
it into a particular machine language. However, the program has been written so that it can be tuned to run
efficiently in a wide variety of operating environments by making comparatively few changes. Such flexibility
is possible because the documentation that follows is written in the WEB language, which is at a higher level
than Pascal; the preprocessing step that converts WEB to Pascal is able to introduce most of the necessary
refinements. Semi-automatic translation to other languages is also feasible, because the program below does
not make extensive use of features that are peculiar to Pascal.
A large piece of software like TeX has inherent complexity that cannot be reduced below a certain level of
difficulty, although each individual part is fairly simple by itself. The WEB language is intended to make the
algorithms as readable as possible, by reflecting the way the individual program pieces fit together and by
providing the cross-references that connect different parts. Detailed comments about what is going on, and
about why things were done in certain ways, have been liberally sprinkled throughout the program. These
comments explain features of the implementation, but they rarely attempt to explain the TeX language
itself, since the reader is supposed to be familiar with The TeXbook.
2. The present implementation has a long ancestry, beginning in the summer of 1977, when Michael F.
Plass and Frank M. Liang designed and coded a prototype based on some specifications that the author had
made in May of that year. This original proto-TeX included macro definitions and elementary manipulations
on boxes and glue, but it did not have line-breaking, page-breaking, mathematical formulas, alignment
routines, error recovery, or the present semantic nest; furthermore, it used character lists instead of token
lists, so that a control sequence like \halign was represented by a list of seven characters. A complete version
of TeX was designed and coded by the author in late 1977 and early 1978; that program, like its prototype,
was written in the SAIL language, for which an excellent debugging system was available. Preliminary plans
to convert the SAIL code into a form somewhat like the present "web" were developed by Luis Trabb Pardo
and the author at the beginning of 1979, and a complete implementation was created by Ignacio A. Zabala
in 1979 and 1980. The TeX82 program, which was written by the author during the latter part of 1981
and the early part of 1982, also incorporates ideas from the 1979 implementation of TeX in MESA that
was written by Leonidas Guibas, Robert Sedgewick, and Douglas Wyatt at the Xerox Palo Alto Research
Center. Several hundred refinements were introduced into TeX82 based on the experiences gained with the
original implementations, so that essentially every part of the system has been substantially improved. After
the appearance of "Version 0" in September 1982, this program benefited greatly from the comments of
many other people, notably David R. Fuchs and Howard W. Trickey. A final revision in September 1989
extended the input character set to eight-bit codes and introduced the ability to hyphenate words from
different languages, based on some ideas of Michael J. Ferguson.
No doubt there still is plenty of room for improvement, but the author is firmly committed to keeping
TeX82 "frozen" from now on; stability and reliability are to be its main virtues.
On the other hand, the WEB description can be extended without changing the core of TeX82 itself, and
the program has been designed so that such extensions are not extremely difficult to make. The banner
string defined here should be changed whenever TeX undergoes any modifications, so that it will be clear
which version of TeX might be the guilty party when a problem arises.
If this program is changed, the resulting system should not be called "TeX"; the official name "TeX" by
itself is reserved for software systems that are fully compatible with each other. A special test suite called
the "TRIP test" is available for helping to determine whether a particular implementation deserves to be
known as "TeX" [cf. Stanford Computer Science report CS1027, November 1984].
define banner ≡ 'This is TeX, Version 3.1415926' {printed when TeX starts}
3. Different Pascals have slightly different conventions, and the present program expresses TeX in terms
of the Pascal that was available to the author in 1982. Constructions that apply to this particular compiler,
which we shall call Pascal-H, should help the reader see how to make an appropriate interface for other
systems if necessary. (Pascal-H is Charles Hedrick's modification of a compiler for the DECsystem-10 that
was originally developed at the University of Hamburg; cf. Software: Practice & Experience 6 (1976),
29–42. The TeX program below is intended to be adaptable, without extensive changes, to most other
versions of Pascal, so it does not fully use the admirable features of Pascal-H. Indeed, a conscious effort has
been made here to avoid using several idiosyncratic features of standard Pascal itself, so that most of the code
can be translated mechanically into other high-level languages. For example, the 'with' and 'new' features
are not used, nor are pointer types, set types, or enumerated scalar types; there are no 'var' parameters,
except in the case of files; there are no tag fields on variant records; there are no assignments real ← integer;
no procedures are declared local to other procedures.)
The portions of this program that involve system-dependent code, where changes might be necessary
because of differences between Pascal compilers and/or differences between operating systems, can be
identified by looking at the sections whose numbers are listed under 'system dependencies' in the index.
Furthermore, the index entries for 'dirty Pascal' list all places where the restrictions of Pascal have not been
followed perfectly, for one reason or another.
Incidentally, Pascal's standard round function can be problematical, because it disagrees with the IEEE
floating-point standard. Many implementors have therefore chosen to substitute their own home-grown
rounding procedure.
4. The program begins with a normal Pascal program heading, whose components will be filled in later,
using the conventions of WEB. For example, the portion of the program called '⟨Global variables 13⟩' below
will be replaced by a sequence of variable declarations that starts in §13 of this documentation. In this way,
we are able to define each individual global variable when we are prepared to understand what it means; we
do not have to define all of the globals at once. Cross references in §13, where it says "See also sections 20,
26, . . . ," also make it possible to look at the set of all global variables, if desired. Similar remarks apply to
the other portions of the program heading.
Actually the heading shown here is not quite normal: The program line does not mention any output
file, because Pascal-H would ask the TeX user to specify a file name if output were specified here.
define mtype ≡ t@&y@&p@&e {this is a WEB coding trick:}
format mtype ≡ type {'mtype' will be equivalent to 'type'}
format type ≡ true {but 'type' will not be treated as a reserved word}
⟨Compiler directives 9⟩
program TEX; {all file names are defined dynamically}
label ⟨Labels in the outer block 6⟩
const ⟨Constants in the outer block 11⟩
mtype ⟨Types in the outer block 18⟩
var ⟨Global variables 13⟩
procedure initialize; {this procedure gets things started properly}
  var ⟨Local variables for initialization 19⟩
  begin ⟨Initialize whatever TeX might access 8⟩
  end;
⟨Basic printing procedures 57⟩
⟨Error handling procedures 78⟩
5. The overall TeX program begins with the heading just shown, after which comes a bunch of procedure
declarations and function declarations. Finally we will get to the main program, which begins with the
comment 'start_here'. If you want to skip down to the main program now, you can look up 'start_here'
in the index. But the author suggests that the best way to understand this program is to follow pretty
much the order of TeX's components as they appear in the WEB description you are now reading, since the
present ordering is intended to combine the advantages of the "bottom up" and "top down" approaches to
the problem of understanding a somewhat complicated system.
6. Three labels must be declared in the main program, so we give them symbolic names.
define start_of_TEX = 1 {go here when TeX's variables are initialized}
define end_of_TEX = 9998 {go here to close files and terminate gracefully}
define final_end = 9999 {this label marks the ending of the program}
⟨Labels in the outer block 6⟩ ≡
  start_of_TEX, end_of_TEX, final_end; {key control points}
This code is used in section 4.
7. Some of the code below is intended to be used only when diagnosing the strange behavior that sometimes
occurs when TeX is being installed or when system wizards are fooling around with TeX without quite
knowing what they are doing. Such code will not normally be compiled; it is delimited by the codewords
'debug . . . gubed', with apologies to people who wish to preserve the purity of English.
Similarly, there is some conditional code delimited by 'stat . . . tats' that is intended for use when statistics
are to be kept about TeX's memory usage. The stat . . . tats code also implements diagnostic information
for \tracingparagraphs and \tracingpages.
define debug ≡ @{ {change this to 'debug ≡' when debugging}
define gubed ≡ @} {change this to 'gubed ≡' when debugging}
format debug ≡ begin
format gubed ≡ end
define stat ≡ @{ {change this to 'stat ≡' when gathering usage statistics}
define tats ≡ @} {change this to 'tats ≡' when gathering usage statistics}
format stat ≡ begin
format tats ≡ end
8. This program has two important variations: (1) There is a long and slow version called INITEX, which
does the extra calculations needed to initialize TeX's internal tables; and (2) there is a shorter and faster
production version, which cuts the initialization to a bare minimum. Parts of the program that are needed
in (1) but not in (2) are delimited by the codewords 'init . . . tini'.
define init ≡ {change this to 'init ≡ @{' in the production version}
define tini ≡ {change this to 'tini ≡ @}' in the production version}
format init ≡ begin
format tini ≡ end
⟨Initialize whatever TeX might access 8⟩ ≡
  ⟨Set initial values of key variables 21⟩
  init ⟨Initialize table entries (done by INITEX only) 164⟩ tini
This code is used in section 4.
9. If the first character of a Pascal comment is a dollar sign, Pascal-H treats the comment as a list of
"compiler directives" that will affect the translation of this program into machine language. The directives
shown below specify full checking and inclusion of the Pascal debugger when TeX is being debugged, but
they cause range checking and other redundant code to be eliminated when the production system is being
generated. Arithmetic overflow will be detected in all cases.
⟨Compiler directives 9⟩ ≡
  @{@&$C-, A+, D-@} {no range check, catch arithmetic overflow, no debug overhead}
  debug @{@&$C+, D+@} gubed {but turn everything on when debugging}
This code is used in section 4.
10. This TeX implementation conforms to the rules of the Pascal User Manual published by Jensen and
Wirth in 1975, except where system-dependent code is necessary to make a useful system program, and
except in another respect where such conformity would unnecessarily obscure the meaning and clutter up
the code: We assume that case statements may include a default case that applies if no matching label is
found. Thus, we shall use constructions like
  case x of
  1: ⟨code for x = 1⟩;
  3: ⟨code for x = 3⟩;
  othercases ⟨code for x ≠ 1 and x ≠ 3⟩
  endcases
since most Pascal compilers have plugged this hole in the language by incorporating some sort of default
mechanism. For example, the Pascal-H compiler allows 'others:' as a default label, and other Pascals
allow syntaxes like 'else' or 'otherwise' or 'otherwise:', etc. The definitions of othercases and endcases
should be changed to agree with local conventions. Note that no semicolon appears before endcases in this
program, so the definition of endcases should include a semicolon if the compiler wants one. (Of course,
if no default mechanism is available, the case statements of TeX will have to be laboriously extended by
listing all remaining cases. People who are stuck with such Pascals have, in fact, done this, successfully but
not happily!)
define othercases ≡ others: {default for cases not listed explicitly}
define endcases ≡ end {follows the default case in an extended case statement}
format othercases ≡ else
format endcases ≡ end
11. The following parameters can be changed at compile time to extend or reduce TeX's capacity. They
may have different values in INITEX and in production versions of TeX.
⟨Constants in the outer block 11⟩ ≡
mem_max = 30000; {greatest index in TeX's internal mem array; must be strictly less than
    max_halfword; must be equal to mem_top in INITEX, otherwise ≥ mem_top}
mem_min = 0; {smallest index in TeX's internal mem array; must be min_halfword or more; must be
    equal to mem_bot in INITEX, otherwise ≤ mem_bot}
buf_size = 500; {maximum number of characters simultaneously present in current lines of open files
    and in control sequences between \csname and \endcsname; must not exceed max_halfword}
error_line = 72; {width of context lines on terminal error messages}
half_error_line = 42; {width of first lines of contexts in terminal error messages; should be between 30
    and error_line - 15}
max_print_line = 79; {width of longest text lines output; should be at least 60}
stack_size = 200; {maximum number of simultaneous input sources}
max_in_open = 6; {maximum number of input files and error insertions that can be going on
    simultaneously}
font_max = 75; {maximum internal font number; must not exceed max_quarterword and must be at
    most font_base + 256}
font_mem_size = 20000; {number of words of font_info for all fonts}
param_size = 60; {maximum number of simultaneous macro parameters}
nest_size = 40; {maximum number of semantic levels simultaneously active}
max_strings = 3000; {maximum number of strings; must not exceed max_halfword}
string_vacancies = 8000; {the minimum number of characters that should be available for the user's
    control sequences and font names, after TeX's own error messages are stored}
pool_size = 32000; {maximum number of characters in strings, including all error messages and help
    texts, and the names of all fonts and control sequences; must exceed string_vacancies by the total
    length of TeX's own strings, which is currently about 23000}
save_size = 600; {space for saving values outside of current group; must be at most max_halfword}
trie_size = 8000; {space for hyphenation patterns; should be larger for INITEX than it is in production
    versions of TeX}
trie_op_size = 500; {space for "opcodes" in the hyphenation patterns}
dvi_buf_size = 800; {size of the output buffer; must be a multiple of 8}
file_name_size = 40; {file names shouldn't be longer than this}
pool_name = 'TeXformats:TEX.POOL                     ';
    {string of length file_name_size; tells where the string pool appears}
This code is used in section 4.
12. Like the preceding parameters, the following quantities can be changed at compile time to extend or
reduce TeX's capacity. But if they are changed, it is necessary to rerun the initialization program INITEX
to generate new tables for the production TeX program. One can't simply make helter-skelter changes to
the following constants, since certain rather complex initialization numbers are computed from them. They
are defined here using WEB macros, instead of being put into Pascal's const list, in order to emphasize this
distinction.
define mem_bot = 0 {smallest index in the mem array dumped by INITEX; must not be less than mem_min}
define mem_top ≡ 30000 {largest index in the mem array dumped by INITEX; must be substantially
    larger than mem_bot and not greater than mem_max}
define font_base = 0 {smallest internal font number; must not be less than min_quarterword}
define hash_size = 2100 {maximum number of control sequences; it should be at most about
    (mem_max - mem_min)/10}
define hash_prime = 1777 {a prime number equal to about 85% of hash_size}
define hyph_size = 307 {another prime; the number of \hyphenation exceptions}
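The arithmetic behind these comments is easy to verify by machine. The following Python fragment is only an illustrative sketch, not part of TeX: the helper is_prime and the names hash_ratio and hash_budget are our own inventions, and the numeric values are simply copied from the definitions above.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for table sizes of a few thousand."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# Values copied from the parameter definitions of sections 11 and 12.
mem_max, mem_min = 30000, 0
hash_size, hash_prime, hyph_size = 2100, 1777, 307

# hash_prime should be "about 85%" of hash_size.
hash_ratio = hash_prime / hash_size

# hash_size should be at most about (mem_max - mem_min)/10.
hash_budget = (mem_max - mem_min) // 10
```

Anyone who changes hash_size can rerun such a check to pick a new prime near 85% of the new size before rerunning INITEX.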
13. In case somebody has inadvertently made bad settings of the constants, TeX checks them using a
global variable called bad.
This is the first of many sections of TeX where global variables are defined.
⟨Global variables 13⟩ ≡
bad: integer; {is some "constant" wrong?}
See also sections 20, 26, 30, 32, 39, 50, 54, 73, 76, 79, 96, 104, 115, 116, 117, 118, 124, 165, 173, 181, 213, 246, 253, 256, 271,
286, 297, 301, 304, 305, 308, 309, 310, 333, 361, 382, 387, 388, 410, 438, 447, 480, 489, 493, 512, 513, 520, 527, 532, 539,
549, 550, 555, 592, 595, 605, 616, 646, 647, 661, 684, 719, 724, 764, 770, 814, 821, 823, 825, 828, 833, 839, 847, 872, 892,
900, 905, 907, 921, 926, 943, 947, 950, 971, 980, 982, 989, 1032, 1074, 1266, 1281, 1299, 1305, 1331, 1342, and 1345.
This code is used in section 4.
14. Later on we will say 'if mem_max ≥ max_halfword then bad ← 14', or something similar. (We can't
do that until max_halfword has been defined.)
⟨Check the "constant" values for consistency 14⟩ ≡
bad ← 0;
if (half_error_line < 30) ∨ (half_error_line > error_line - 15) then bad ← 1;
if max_print_line < 60 then bad ← 2;
if dvi_buf_size mod 8 ≠ 0 then bad ← 3;
if mem_bot + 1100 > mem_top then bad ← 4;
if hash_prime > hash_size then bad ← 5;
if max_in_open ≥ 128 then bad ← 6;
if mem_top < 256 + 11 then bad ← 7; {we will want null_list > 255}
See also sections 111, 290, 522, and 1249.
This code is used in section 1332.
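The seven tests of section 14 can be mimicked outside of Pascal. Here is a minimal Python sketch, offered only as an illustration: the function check_constants and the dictionary defaults are hypothetical names of ours, and the values are those given in sections 11 and 12. As in the WEB code, a later failing test overwrites an earlier one, so the result is the number of the last test that failed.

```python
def check_constants(c: dict) -> int:
    """Return 0 if every test passes, else the number of the last failed test."""
    bad = 0
    if c["half_error_line"] < 30 or c["half_error_line"] > c["error_line"] - 15:
        bad = 1
    if c["max_print_line"] < 60:
        bad = 2
    if c["dvi_buf_size"] % 8 != 0:
        bad = 3
    if c["mem_bot"] + 1100 > c["mem_top"]:
        bad = 4
    if c["hash_prime"] > c["hash_size"]:
        bad = 5
    if c["max_in_open"] >= 128:
        bad = 6
    if c["mem_top"] < 256 + 11:
        bad = 7
    return bad


# Settings copied from sections 11 and 12.
defaults = {
    "half_error_line": 42, "error_line": 72, "max_print_line": 79,
    "dvi_buf_size": 800, "mem_bot": 0, "mem_top": 30000,
    "hash_prime": 1777, "hash_size": 2100, "max_in_open": 6,
}
```

With the default settings the function returns 0; changing dvi_buf_size to 801, say, yields 3.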
15. Labels are given symbolic names by the following definitions, so that occasional goto statements
will be meaningful. We insert the label 'exit' just before the 'end' of a procedure in which we have used
the 'return' statement defined below; the label 'restart' is occasionally used at the very beginning of a
procedure; and the label 'reswitch' is occasionally used just prior to a case statement in which some cases
change the conditions and we wish to branch to the newly applicable case. Loops that are set up with the
loop construction defined below are commonly exited by going to 'done' or to 'found' or to 'not_found', and
they are sometimes repeated by going to 'continue'. If two or more parts of a subroutine start differently
but end up the same, the shared code may be gathered together at 'common_ending'.
Incidentally, this program never declares a label that isn't actually used, because some fussy Pascal
compilers will complain about redundant labels.
define exit = 10 {go here to leave a procedure}
define restart = 20 {go here to start a procedure again}
define reswitch = 21 {go here to start a case statement again}
define continue = 22 {go here to resume a loop}
define done = 30 {go here to exit a loop}
define done1 = 31 {like done, when there is more than one loop}
define done2 = 32 {for exiting the second loop in a long block}
define done3 = 33 {for exiting the third loop in a very long block}
define done4 = 34 {for exiting the fourth loop in an extremely long block}
define done5 = 35 {for exiting the fifth loop in an immense block}
define done6 = 36 {for exiting the sixth loop in a block}
define found = 40 {go here when you've found it}
define found1 = 41 {like found, when there's more than one per routine}
define found2 = 42 {like found, when there's more than two per routine}
define not_found = 45 {go here when you've found nothing}
define common_ending = 50 {go here when you want to merge with another branch}
16. Here are some macros for common programming idioms.
define incr(#) ≡ # ← # + 1 {increase a variable by unity}
define decr(#) ≡ # ← # - 1 {decrease a variable by unity}
define negate(#) ≡ # ← -# {change the sign of a variable}
define loop ≡ while true do {repeat over and over until a goto happens}
format loop ≡ xclause {WEB's xclause acts like 'while true do'}
define do_nothing ≡ {empty statement}
define return ≡ goto exit {terminate a procedure call}
format return ≡ nil
define empty = 0 {symbolic name for a null constant}
TeX82 PART 2: THE CHARACTER SET
17. The character set. In order to make TeX readily portable to a wide variety of computers, all of its
input text is converted to an internal eight-bit code that includes standard ASCII, the "American Standard
Code for Information Interchange." This conversion is done immediately when each character is read in.
Conversely, characters are converted from ASCII to the user's external representation just before they are
output to a text file.
Such an internal code is relevant to users of TeX primarily because it governs the positions of characters
in the fonts. For example, the character 'A' has ASCII code 65 = '101 (octal), and when TeX typesets this
letter it specifies character number 65 in the current font. If that font actually has 'A' in a different position,
TeX doesn't know what the real position is; the program that does the actual printing from TeX's
device-independent files is responsible for converting from ASCII to a particular font encoding.
TeX's internal code also defines the value of constants that begin with a reverse apostrophe; and it provides
an index to the \catcode, \mathcode, \uccode, \lccode, and \delcode tables.
18. Characters of text that have been converted to TeX's internal form are said to be of type ASCII_code,
which is a subrange of the integers.
⟨Types in the outer block 18⟩ ≡
ASCII_code = 0 .. 255; {eight-bit numbers}
See also sections 25, 38, 101, 109, 113, 150, 212, 269, 300, 548, 594, 920, and 925.
This code is used in section 4.
19. The original Pascal compiler was designed in the late 60s, when six-bit character sets were common, so
it did not make provision for lowercase letters. Nowadays, of course, we need to deal with both capital and
small letters in a convenient way, especially in a program for typesetting; so the present specification of TeX
has been written under the assumption that the Pascal compiler and run-time system permit the use of text
files with more than 64 distinguishable characters. More precisely, we assume that the character set contains
at least the letters and symbols associated with ASCII codes '40 through '176; all of these characters are
now available on most computer terminals.
Since we are dealing with more characters than were present in the first Pascal compilers, we have to
decide what to call the associated data type. Some Pascals use the original name char for the characters in
text files, even though there now are more than 64 such characters, while other Pascals consider char to be
a 64-element subrange of a larger data type that has some other name.
In order to accommodate this difference, we shall use the name text_char to stand for the data type of
the characters that are converted to and from ASCII_code when they are input and output. We shall also
assume that text_char consists of the elements chr(first_text_char) through chr(last_text_char), inclusive.
The following definitions should be adjusted if necessary.
define text_char ≡ char {the data type of characters in text files}
define first_text_char = 0 {ordinal number of the smallest element of text_char}
define last_text_char = 255 {ordinal number of the largest element of text_char}
⟨Local variables for initialization 19⟩ ≡
i: integer;
See also sections 163 and 927.
This code is used in section 4.
20. The TeX processor converts between ASCII code and the user's external character set by means of
arrays xord and xchr that are analogous to Pascal's ord and chr functions.
⟨Global variables 13⟩ +≡
xord: array [text_char] of ASCII_code; {specifies conversion of input characters}
xchr: array [ASCII_code] of text_char; {specifies conversion of output characters}
21. Since we are assuming that our Pascal system is able to read and write the visible characters of
standard ASCII (although not necessarily using the ASCII codes to represent them), the following assignment
statements initialize the standard part of the xchr array properly, without needing any system-dependent
changes. On the other hand, it is possible to implement TeX with less complete character sets, and in such
cases it will be necessary to change something here.
⟨Set initial values of key variables 21⟩ ≡
xchr['40] ← ' '; xchr['41] ← '!'; xchr['42] ← '"'; xchr['43] ← '#'; xchr['44] ← '$';
xchr['45] ← '%'; xchr['46] ← '&'; xchr['47] ← '''';
xchr['50] ← '('; xchr['51] ← ')'; xchr['52] ← '*'; xchr['53] ← '+'; xchr['54] ← ',';
xchr['55] ← '-'; xchr['56] ← '.'; xchr['57] ← '/';
xchr['60] ← '0'; xchr['61] ← '1'; xchr['62] ← '2'; xchr['63] ← '3'; xchr['64] ← '4';
xchr['65] ← '5'; xchr['66] ← '6'; xchr['67] ← '7';
xchr['70] ← '8'; xchr['71] ← '9'; xchr['72] ← ':'; xchr['73] ← ';'; xchr['74] ← '<';
xchr['75] ← '='; xchr['76] ← '>'; xchr['77] ← '?';
xchr['100] ← '@'; xchr['101] ← 'A'; xchr['102] ← 'B'; xchr['103] ← 'C'; xchr['104] ← 'D';
xchr['105] ← 'E'; xchr['106] ← 'F'; xchr['107] ← 'G';
xchr['110] ← 'H'; xchr['111] ← 'I'; xchr['112] ← 'J'; xchr['113] ← 'K'; xchr['114] ← 'L';
xchr['115] ← 'M'; xchr['116] ← 'N'; xchr['117] ← 'O';
xchr['120] ← 'P'; xchr['121] ← 'Q'; xchr['122] ← 'R'; xchr['123] ← 'S'; xchr['124] ← 'T';
xchr['125] ← 'U'; xchr['126] ← 'V'; xchr['127] ← 'W';
xchr['130] ← 'X'; xchr['131] ← 'Y'; xchr['132] ← 'Z'; xchr['133] ← '['; xchr['134] ← '\';
xchr['135] ← ']'; xchr['136] ← '^'; xchr['137] ← '_';
xchr['140] ← '`'; xchr['141] ← 'a'; xchr['142] ← 'b'; xchr['143] ← 'c'; xchr['144] ← 'd';
xchr['145] ← 'e'; xchr['146] ← 'f'; xchr['147] ← 'g';
xchr['150] ← 'h'; xchr['151] ← 'i'; xchr['152] ← 'j'; xchr['153] ← 'k'; xchr['154] ← 'l';
xchr['155] ← 'm'; xchr['156] ← 'n'; xchr['157] ← 'o';
xchr['160] ← 'p'; xchr['161] ← 'q'; xchr['162] ← 'r'; xchr['163] ← 's'; xchr['164] ← 't';
xchr['165] ← 'u'; xchr['166] ← 'v'; xchr['167] ← 'w';
xchr['170] ← 'x'; xchr['171] ← 'y'; xchr['172] ← 'z'; xchr['173] ← '{'; xchr['174] ← '|';
xchr['175] ← '}'; xchr['176] ← '~';
See also sections 23, 24, 74, 77, 80, 97, 166, 215, 254, 257, 272, 287, 383, 439, 481, 490, 521, 551, 556, 593, 596, 606, 648, 662,
685, 771, 928, 990, 1033, 1267, 1282, 1300, and 1343.
This code is used in section 8.
22. Some of the ASCII codes without visible characters have been given symbolic names in this program
because they are used with a special meaning.
define null_code = '0 {ASCII code that might disappear}
define carriage_return = '15 {ASCII code used at end of line}
define invalid_code = '177 {ASCII code that many systems prohibit in text files}
23. The ASCII code is "standard" only to a certain extent, since many computer installations have found it
advantageous to have ready access to more than 94 printing characters. Appendix C of The TeXbook gives a
complete specification of the intended correspondence between characters and TeX's internal representation.
If TeX is being used on a garden-variety Pascal for which only standard ASCII codes will appear in the
input and output files, it doesn't really matter what codes are specified in xchr[0 .. '37], but the safest
policy is to blank everything out by using the code shown below.
However, other settings of xchr will make TeX more friendly on computers that have an extended character
set, so that users can type things like '≠' instead of '\ne'. People with extended character sets can assign
codes arbitrarily, giving an xchr equivalent to whatever characters the users of TeX are allowed to have
in their input files. It is best to make the codes correspond to the intended interpretations as shown in
Appendix C whenever possible; but this is not necessary. For example, in countries with an alphabet of
more than 26 letters, it is usually best to map the additional letters into codes less than '40. To get the
most "permissive" character set, change ' ' on the right of these assignment statements to chr(i).
⟨Set initial values of key variables 21⟩ +≡
for i ← 0 to '37 do xchr[i] ← ' ';
for i ← '177 to '377 do xchr[i] ← ' ';
24. The following system-independent code makes the xord array contain a suitable inverse to the
information in xchr. Note that if xchr[i] = xchr[j] where i < j < '177, the value of xord[xchr[i]] will turn out
to be j or more; hence, standard ASCII code numbers will be used instead of codes below '40 in case there
is a coincidence.
⟨Set initial values of key variables 21⟩ +≡
for i ← first_text_char to last_text_char do xord[chr(i)] ← invalid_code;
for i ← '200 to '377 do xord[xchr[i]] ← i;
for i ← 0 to '176 do xord[xchr[i]] ← i;
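The interplay of xchr and xord in sections 21 to 24 can be tried out in a few lines of Python. This is only a sketch under stated assumptions: we assume that Python's chr agrees with ASCII for the visible characters, we use a list for xchr and a dictionary for xord, and the name INVALID_CODE is ours, standing for invalid_code. The order of the two final loops matters: codes 0 to '176 are stored last, so a blank ends up mapped to '40 rather than to one of the blanked-out codes.

```python
INVALID_CODE = 0o177  # stands for invalid_code of section 22

# xchr: internal code -> external character. Visible ASCII maps to itself;
# every other position is blanked out, the conservative policy of section 23.
xchr = [" "] * 256
for code in range(0o40, 0o177):
    xchr[code] = chr(code)

# xord: external character -> internal code, built in the order of section 24.
xord = {chr(i): INVALID_CODE for i in range(256)}
for i in range(0o200, 0o400):   # '200 through '377
    xord[xchr[i]] = i
for i in range(0, 0o177):       # 0 through '176; later stores win
    xord[xchr[i]] = i
```

After these loops xord['A'] is 65 = '101, xord[' '] is '40, and any character that never appears in xchr still maps to INVALID_CODE.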
TeX82 PART 3: INPUT AND OUTPUT
25. Input and output. The bane of portability is the fact that different operating systems treat input
and output quite differently, perhaps because computer scientists have not given sufficient attention to this
problem. People have felt somehow that input and output are not part of "real" programming. Well, it is
true that some kinds of programming are more fun than others. With existing input/output conventions
being so diverse and so messy, the only sources of joy in such parts of the code are the rare occasions when
one can find a way to make the program a little less bad than it might have been. We have two choices,
either to attack I/O now and get it over with, or to postpone I/O until near the end. Neither prospect is
very attractive, so let's get it over with.
The basic operations we need to do are (1) inputting and outputting of text, to or from a file or the user's
terminal; (2) inputting and outputting of eight-bit bytes, to or from a file; (3) instructing the operating system
to initiate ("open") or to terminate ("close") input or output from a specified file; (4) testing whether the
end of an input file has been reached.
TeX needs to deal with two kinds of files. We shall use the term alpha_file for a file that contains textual
data, and the term byte_file for a file that contains eight-bit binary information. These two types turn out
to be the same on many computers, but sometimes there is a significant distinction, so we shall be careful
to distinguish between them. Standard protocols for transferring such files from computer to computer, via
high-speed networks, are now becoming available to more and more communities of users.
The program actually makes use also of a third kind of file, called a word_file, when dumping and reloading
base information for its own initialization. We shall define a word file later; but it will be possible for us to
specify simple operations on word files before they are defined.
⟨Types in the outer block 18⟩ +≡
eight_bits = 0 .. 255; {unsigned one-byte quantity}
alpha_file = packed file of text_char; {files that contain textual data}
byte_file = packed file of eight_bits; {files that contain binary data}
26. Most of what we need to do with respect to input and output can be handled by the I/O facilities
that are standard in Pascal, i.e., the routines called get, put, eof, and so on. But standard Pascal does not
allow file variables to be associated with file names that are determined at run time, so it cannot be used
to implement TeX; some sort of extension to Pascal's ordinary reset and rewrite is crucial for our purposes.
We shall assume that name_of_file is a variable of an appropriate type such that the Pascal run-time system
being used to implement TeX can open a file whose external name is specified by name_of_file.
⟨Global variables 13⟩ +≡
name_of_file: packed array [1 .. file_name_size] of char;
    {on some systems this may be a record variable}
name_length: 0 .. file_name_size;
    {this many characters are actually relevant in name_of_file (the rest are blank)}
27. The Pascal-H compiler with which the present version of TeX was prepared has extended the rules of
Pascal in a very convenient way. To open file f, we can write
  reset(f, name, '/O') for input;
  rewrite(f, name, '/O') for output.
The 'name' parameter, which is of type 'packed array [⟨any⟩] of char', stands for the name of the external
file that is being opened for input or output. Blank spaces that might appear in name are ignored.
The '/O' parameter tells the operating system not to issue its own error messages if something goes wrong.
If a file of the specified name cannot be found, or if such a file cannot be opened for some other reason (e.g.,
someone may already be trying to write the same file), we will have erstat(f) ≠ 0 after an unsuccessful reset
or rewrite. This allows TeX to undertake appropriate corrective action.
TeX's file-opening procedures return false if no file identified by name_of_file could be opened.
define reset_OK(#) ≡ erstat(#) = 0
define rewrite_OK(#) ≡ erstat(#) = 0
function a_open_in(var f : alpha_file): boolean; {open a text file for input}
  begin reset(f, name_of_file, '/O'); a_open_in ← reset_OK(f);
  end;
function a_open_out(var f : alpha_file): boolean; {open a text file for output}
  begin rewrite(f, name_of_file, '/O'); a_open_out ← rewrite_OK(f);
  end;
function b_open_in(var f : byte_file): boolean; {open a binary file for input}
  begin reset(f, name_of_file, '/O'); b_open_in ← reset_OK(f);
  end;
function b_open_out(var f : byte_file): boolean; {open a binary file for output}
  begin rewrite(f, name_of_file, '/O'); b_open_out ← rewrite_OK(f);
  end;
function w_open_in(var f : word_file): boolean; {open a word file for input}
  begin reset(f, name_of_file, '/O'); w_open_in ← reset_OK(f);
  end;
function w_open_out(var f : word_file): boolean; {open a word file for output}
  begin rewrite(f, name_of_file, '/O'); w_open_out ← rewrite_OK(f);
  end;
28. Files can be closed with the Pascal-H routine close(f), which should be used when all input or output
with respect to f has been completed. This makes f available to be opened again, if desired; and if f was
used for output, the close operation makes the corresponding external file appear on the user's area, ready
to be read.
These procedures should not generate error messages if a file is being closed before it has been successfully
opened.
procedure a_close(var f: alpha_file); {close a text file}
begin close(f);
end;
procedure b_close(var f: byte_file); {close a binary file}
begin close(f);
end;
procedure w_close(var f: word_file); {close a word file}
begin close(f);
end;
29. Binary input and output are done with Pascal's ordinary get and put procedures, so we don't have to
make any other special arrangements for binary I/O. Text output is also easy to do with standard Pascal
routines. The treatment of text input is more difficult, however, because of the necessary translation to
ASCII_code values. TeX's conventions should be efficient, and they should blend nicely with the user's
operating environment.
30. Input from text files is read one line at a time, using a routine called input_ln. This function is defined
in terms of global variables called buffer, first, and last that will be described in detail later; for now, it
suffices for us to know that buffer is an array of ASCII_code values, and that first and last are indices into
this array representing the beginning and ending of a line of text.
⟨Global variables 13⟩ +≡
buffer: array [0 .. buf_size] of ASCII_code; {lines of characters being read}
first: 0 .. buf_size; {the first unused position in buffer}
last: 0 .. buf_size; {end of the line just input to buffer}
max_buf_stack: 0 .. buf_size; {largest index used in buffer}
31. The input_ln function brings the next line of input from the specified file into available positions of
the buffer array and returns the value true, unless the file has already been entirely read, in which case it
returns false and sets last ← first. In general, the ASCII_code numbers that represent the next line of the
file are input into buffer[first], buffer[first + 1], ..., buffer[last − 1]; and the global variable last is set equal
to first plus the length of the line. Trailing blanks are removed from the line; thus, either last = first (in
which case the line was entirely blank) or buffer[last − 1] ≠ " ".
An overflow error is given, however, if the normal actions of input_ln would make last ≥ buf_size; this is
done so that other parts of TeX can safely look at the contents of buffer[last + 1] without overstepping the
bounds of the buffer array. Upon entry to input_ln, the condition first < buf_size will always hold, so that
there is always room for an "empty" line.
The variable max_buf_stack, which is used to keep track of how large the buf_size parameter must be to
accommodate the present job, is also kept up to date by input_ln.
If the bypass_eoln parameter is true, input_ln will do a get before looking at the first character of the line;
this skips over an eoln that was in f↑. The procedure does not do a get when it reaches the end of the line;
therefore it can be used to acquire input from the user's terminal as well as from ordinary text files.
Standard Pascal says that a file should have eoln immediately before eof, but TeX needs only a weaker
restriction: If eof occurs in the middle of a line, the system function eoln should return a true result (even
though f↑ will be undefined).
Since the inner loop of input_ln is part of TeX's "inner loop" (each character of input comes in at this
place), it is wise to reduce system overhead by making use of special routines that read in an entire array of
characters at once, if such routines are available. The following code uses standard Pascal to illustrate what
needs to be done, but finer tuning is often possible at well-developed Pascal sites.
function input_ln(var f: alpha_file; bypass_eoln: boolean): boolean;
  {inputs the next line or returns false}
var last_nonblank: 0 .. buf_size; {last with trailing blanks removed}
begin if bypass_eoln then
  if ¬eof(f) then get(f); {input the first character of the line into f↑}
last ← first; {cf. Matthew 19:30}
if eof(f) then input_ln ← false
else begin last_nonblank ← first;
  while ¬eoln(f) do
    begin if last ≥ max_buf_stack then
      begin max_buf_stack ← last + 1;
      if max_buf_stack = buf_size then ⟨Report overflow of the input buffer, and abort 35⟩;
      end;
    buffer[last] ← xord[f↑]; get(f); incr(last);
    if buffer[last − 1] ≠ " " then last_nonblank ← last;
    end;
  last ← last_nonblank; input_ln ← true;
  end;
end;
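The bookkeeping done by input_ln can also be sketched outside Pascal. The following Python fragment is only an illustration under stated assumptions, not the program's own code: the names input_ln_sketch, BUF_SIZE, BufferOverflow, and the dictionary that holds the global variables are hypothetical, a Python text stream stands in for the alpha_file, ord stands in for xord, and the bypass_eoln logic is left out because readline already consumes the end-of-line.

```python
import io

BUF_SIZE = 16  # hypothetical small value of buf_size, chosen for illustration


class BufferOverflow(Exception):
    """Stands in for <Report overflow of the input buffer, and abort>."""


def input_ln_sketch(f, state):
    """Read one line from text stream f into state['buffer'].

    state holds 'buffer' (a list of BUF_SIZE + 1 integers), 'first',
    'last' and 'max_buf_stack'.  Returns False at end of file.
    """
    line = f.readline()
    state["last"] = state["first"]
    if line == "":                       # end of file: nothing was read
        return False
    last = last_nonblank = state["first"]
    for ch in line.rstrip("\n"):
        if last >= state["max_buf_stack"]:
            state["max_buf_stack"] = last + 1
            if state["max_buf_stack"] == BUF_SIZE:
                raise BufferOverflow("buffer size exceeded")
        state["buffer"][last] = ord(ch)  # ord plays the role of xord here
        last += 1
        if ch != " ":
            last_nonblank = last         # end of the non-blank text so far
    state["last"] = last_nonblank        # trailing blanks are dropped
    return True
```

Note that max_buf_stack counts the trailing blanks too, since they pass through the buffer before being discarded; only last is pulled back to the final non-blank position.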
32. The user's terminal acts essentially like other files of text, except that it is used both for input and
for output. When the terminal is considered an input file, the file variable is called term_in, and when it is
considered an output file the file variable is term_out.
⟨Global variables 13⟩ +≡
term_in: alpha_file; {the terminal as an input file}
term_out: alpha_file; {the terminal as an output file}
33. Here is how to open the terminal files in Pascal-H. The '/I' switch suppresses the first get.
define t_open_in ≡ reset(term_in, 'TTY:', '/O/I') {open the terminal for text input}
define t_open_out ≡ rewrite(term_out, 'TTY:', '/O') {open the terminal for text output}
34. Sometimes it is necessary to synchronize the input/output mixture that happens on the user's terminal,
and three system-dependent procedures are used for this purpose. The first of these, update_terminal, is
called when we want to make sure that everything we have output to the terminal so far has actually left the
computer's internal buffers and been sent. The second, clear_terminal, is called when we wish to cancel any
input that the user may have typed ahead (since we are about to issue an unexpected error message). The
third, wake_up_terminal, is supposed to revive the terminal if the user has disabled it by some instruction
to the operating system. The following macros show how these operations can be specified in Pascal-H:
define update_terminal ≡ break(term_out) {empty the terminal output buffer}
define clear_terminal ≡ break_in(term_in, true) {clear the terminal input buffer}
define wake_up_terminal ≡ do_nothing {cancel the user's cancellation of output}
35. We need a special routine to read the first line of TeX input from the user's terminal. This line is
different because it is read before we have opened the transcript file; there is sort of a "chicken and egg"
problem here. If the user types ‘\input paper’ on the first line, or if some macro invoked by that line does
such an \input, the transcript file will be named ‘paper.log’; but if no \input commands are performed
during the first line of terminal input, the transcript file will acquire its default name ‘texput.log’. (The
transcript file will not contain error messages generated by the first line before the first \input command.)
The first line is even more special if we are lucky enough to have an operating system that treats TeX
differently from a run-of-the-mill Pascal object program. It's nice to let the user start running a TeX job by
typing a command line like ‘tex paper’; in such a case, TeX will operate as if the first line of input were
‘paper’, i.e., the first line will consist of the remainder of the command line, after the part that invoked TeX.
The first line is special also because it may be read before TeX has input a format file. In such cases,
normal error messages cannot yet be given. The following code uses concepts that will be explained later.
(If the Pascal compiler does not support non-local goto, the statement ‘goto final_end’ should be replaced
by something that quietly terminates the program.)
⟨Report overflow of the input buffer, and abort 35⟩ ≡
if format_ident = 0 then
  begin write_ln(term_out, 'Buffer size exceeded!'); goto final_end;
  end
else begin cur_input.loc_field ← first; cur_input.limit_field ← last − 1;
  overflow("buffer size", buf_size);
  end
This code is used in section 31.
36. Different systems have different ways to get started. But regardless of what conventions are adopted,
the routine that initializes the terminal should satisfy the following specifications:
1) It should open file term_in for input from the terminal. (The file term_out will already be open for
   output to the terminal.)
2) If the user has given a command line, this line should be considered the first line of terminal input.
   Otherwise the user should be prompted with ‘**’, and the first line of input should be whatever is
   typed in response.
3) The first line of input, which might or might not be a command line, should appear in locations first
   to last − 1 of the buffer array.
4) The global variable loc should be set so that the character to be read next by TeX is in buffer[loc].
   This character should not be blank, and we should have loc < last.
(It may be necessary to prompt the user several times before a non-blank line comes in. The prompt is ‘**’
instead of the later ‘*’ because the meaning is slightly different: ‘\input’ need not be typed immediately
after ‘**’.)
define loc ≡ cur_input.loc_field {location of first unread character in buffer}
37. The following program does the required initialization without retrieving a possible command line. It
should be clear how to modify this routine to deal with command lines, if the system permits them.
function init_terminal: boolean; {gets the terminal input started}
label exit;
begin t_open_in;
loop begin wake_up_terminal; write(term_out, '**'); update_terminal;
  if ¬input_ln(term_in, true) then {this shouldn't happen}
    begin write_ln(term_out); write(term_out, '! End of file on the terminal... why?');
    init_terminal ← false; return;
    end;
  loc ← first;
  while (loc < last) ∧ (buffer[loc] = " ") do incr(loc);
  if loc < last then
    begin init_terminal ← true; return; {return unless the line was all blank}
    end;
  write_ln(term_out, 'Please type the name of your input file.');
  end;
exit: end;
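The prompting loop of init_terminal can be sketched as follows. This Python fragment is a minimal illustration, assuming that a list of strings stands in for the terminal input and that a second list collects what would be written to term_out; the name init_terminal_sketch and both parameters are hypothetical.

```python
def init_terminal_sketch(lines, out):
    """Prompt until a non-blank line arrives.

    Returns (text, loc), where text is the line without trailing blanks
    and loc indexes its first non-blank character, or None if the input
    ends before any non-blank line is seen.
    """
    it = iter(lines)
    while True:
        out.append("**")                     # prompt before each attempt
        line = next(it, None)
        if line is None:                     # end of file on the terminal
            out.append("! End of file on the terminal... why?")
            return None
        text = line.rstrip(" ")              # input_ln drops trailing blanks
        loc = 0
        while loc < len(text) and text[loc] == " ":
            loc += 1
        if loc < len(text):
            return text, loc                 # buffer[loc] is not blank
        out.append("Please type the name of your input file.")
```

A successful return satisfies the fourth specification above: loc is less than the end of the line and the character at loc is not blank.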
38. String handling. Control sequence names and diagnostic messages are variable-length strings of
eight-bit characters. Since Pascal does not have a well-developed string mechanism, TeX does all of its string
processing by homegrown methods.
Elaborate facilities for dynamic strings are not needed, so all of the necessary operations can be handled
with a simple data structure. The array str_pool contains all of the (eight-bit) ASCII codes in all of the strings,
and the array str_start contains indices of the starting points of each string. Strings are referred to by integer
numbers, so that string number s comprises the characters str_pool[j] for str_start[s] ≤ j < str_start[s + 1].
Additional integer variables pool_ptr and str_ptr indicate the number of entries used so far in str_pool and
str_start, respectively; locations str_pool[pool_ptr] and str_start[str_ptr] are ready for the next string to be
allocated.
String numbers 0 to 255 are reserved for strings that correspond to single ASCII characters. This is in
accordance with the conventions of WEB, which converts single-character strings into the ASCII code number
of the single character involved, while it converts other strings into integers and builds a string pool file.
Thus, when the string constant "." appears in the program below, WEB converts it into the integer 46,
which is the ASCII code for a period, while WEB will convert a string like "hello" into some integer greater
than 255. String number 46 will presumably be the single character ‘.’; but some ASCII codes have no
standard visible representation, and TeX sometimes needs to be able to print an arbitrary ASCII character,
so the first 256 strings are used to specify exactly what should be printed for each of the 256 possibilities.
Elements of the str_pool array must be ASCII codes that can actually be printed; i.e., they must have an
xchr equivalent in the local character set. (This restriction applies only to preloaded strings, not to those
generated dynamically by the user.)
Some Pascal compilers won't pack integers into a single byte unless the integers lie in the range −128 .. 127.
To accommodate such systems we access the string pool only via macros that can easily be redefined.
define si(#) ≡ # {convert from ASCII_code to packed_ASCII_code}
define so(#) ≡ # {convert from packed_ASCII_code to ASCII_code}
⟨Types in the outer block 18⟩ +≡
pool_pointer = 0 .. pool_size; {for variables that point into str_pool}
str_number = 0 .. max_strings; {for variables that point into str_start}
packed_ASCII_code = 0 .. 255; {elements of str_pool array}
39. ⟨Global variables 13⟩ +≡
str_pool: packed array [pool_pointer] of packed_ASCII_code; {the characters}
str_start: array [str_number] of pool_pointer; {the starting pointers}
pool_ptr: pool_pointer; {first unused position in str_pool}
str_ptr: str_number; {number of the current string being created}
init_pool_ptr: pool_pointer; {the starting value of pool_ptr}
init_str_ptr: str_number; {the starting value of str_ptr}
40. Several of the elementary string operations are performed using WEB macros instead of Pascal procedures,
because many of the operations are done quite frequently and we want to avoid the overhead of
procedure calls. For example, here is a simple macro that computes the length of a string.
define length(#) ≡ (str_start[# + 1] − str_start[#]) {the number of characters in string number #}
41. The length of the current string is called cur_length:
define cur_length ≡ (pool_ptr − str_start[str_ptr])
42. Strings are created by appending character codes to str_pool. The append_char macro, defined here,
does not check to see if the value of pool_ptr has gotten too high; this test is supposed to be made before
append_char is used. There is also a flush_char macro, which erases the last character appended.
To test if there is room to append l more characters to str_pool, we shall write str_room(l), which aborts
TeX and gives an apologetic error message if there isn't enough room.
define append_char(#) ≡ {put ASCII_code # at the end of str_pool}
  begin str_pool[pool_ptr] ← si(#); incr(pool_ptr);
  end
define flush_char ≡ decr(pool_ptr) {forget the last character in the pool}
define str_room(#) ≡ {make sure that the pool hasn't overflowed}
  begin if pool_ptr + # > pool_size then overflow("pool size", pool_size − init_pool_ptr);
  end
43. Once a sequence of characters has been appended to str_pool, it officially becomes a string when the
function make_string is called. This function returns the identification number of the new string as its value.
function make_string: str_number; {current string enters the pool}
begin if str_ptr = max_strings then overflow("number of strings", max_strings − init_str_ptr);
incr(str_ptr); str_start[str_ptr] ← pool_ptr; make_string ← str_ptr − 1;
end;
44. To destroy the most recently made string, we say flush_string.
define flush_string ≡
  begin decr(str_ptr); pool_ptr ← str_start[str_ptr];
  end
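The interplay of str_pool, str_start, pool_ptr, and str_ptr in sections 38 to 44 can be tried out in a small model. The Python class below is only a sketch under stated assumptions: the class name StringPool, the default sizes, and the helper method text are hypothetical, and the pool starts empty instead of holding the 256 preloaded single-character strings.

```python
class StringPool:
    """Toy model of str_pool and str_start; sizes are hypothetical."""

    def __init__(self, pool_size=1000, max_strings=100):
        self.pool_size = pool_size
        self.max_strings = max_strings
        self.str_pool = [0] * (pool_size + 1)
        self.str_start = [0] * (max_strings + 1)
        self.pool_ptr = 0   # first unused position in str_pool
        self.str_ptr = 0    # number of the string being created

    def length(self, s):
        return self.str_start[s + 1] - self.str_start[s]

    def cur_length(self):
        return self.pool_ptr - self.str_start[self.str_ptr]

    def str_room(self, n):
        """Check for room before appending n characters."""
        if self.pool_ptr + n > self.pool_size:
            raise OverflowError("pool size")

    def append_char(self, c):
        """Append one code; the caller is expected to call str_room first."""
        self.str_pool[self.pool_ptr] = c
        self.pool_ptr += 1

    def flush_char(self):
        self.pool_ptr -= 1

    def make_string(self):
        """Close off the current string and return its number."""
        if self.str_ptr == self.max_strings:
            raise OverflowError("number of strings")
        self.str_ptr += 1
        self.str_start[self.str_ptr] = self.pool_ptr
        return self.str_ptr - 1

    def flush_string(self):
        """Destroy the most recently made string."""
        self.str_ptr -= 1
        self.pool_ptr = self.str_start[self.str_ptr]

    def text(self, s):
        """Hypothetical helper: string number s as a Python str."""
        a, b = self.str_start[s], self.str_start[s + 1]
        return "".join(map(chr, self.str_pool[a:b]))
```

Because a string is just the half-open interval between two consecutive entries of str_start, both make_string and flush_string take constant time, which is the point of the design.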
45. The following subroutine compares string s with another string of the same length that appears in
buffer starting at position k; the result is true if and only if the strings are equal. Empirical tests indicate
that str_eq_buf is used in such a way that it tends to return true about 80 percent of the time.
function str_eq_buf(s: str_number; k: integer): boolean; {test equality of strings}
label not_found; {loop exit}
var j: pool_pointer; {running index}
  result: boolean; {result of comparison}
begin j ← str_start[s];
while j < str_start[s + 1] do
  begin if so(str_pool[j]) ≠ buffer[k] then
    begin result ← false; goto not_found;
    end;
  incr(j); incr(k);
  end;
result ← true;
not_found: str_eq_buf ← result;
end;
46. Here is a similar routine, but it compares two strings in the string pool, and it does not assume that
they have the same length.
function str_eq_str(s, t: str_number): boolean; {test equality of strings}
label not_found; {loop exit}
var j, k: pool_pointer; {running indices}
  result: boolean; {result of comparison}
begin result ← false;
if length(s) ≠ length(t) then goto not_found;
j ← str_start[s]; k ← str_start[t];
while j < str_start[s + 1] do
  begin if str_pool[j] ≠ str_pool[k] then goto not_found;
  incr(j); incr(k);
  end;
result ← true;
not_found: str_eq_str ← result;
end;
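The two comparison routines translate almost line for line into other languages. The following Python functions are a sketch, assuming that str_pool, str_start, and buffer are passed in as plain lists of integer codes (in the program they are global arrays); early returns replace the goto not_found exits.

```python
def str_eq_buf(str_pool, str_start, s, buffer, k):
    """Compare string s with the codes in buffer starting at position k."""
    j = str_start[s]
    while j < str_start[s + 1]:
        if str_pool[j] != buffer[k]:
            return False
        j += 1
        k += 1
    return True


def str_eq_str(str_pool, str_start, s, t):
    """Compare two pool strings; their lengths need not agree."""
    if str_start[s + 1] - str_start[s] != str_start[t + 1] - str_start[t]:
        return False
    j, k = str_start[s], str_start[t]
    while j < str_start[s + 1]:
        if str_pool[j] != str_pool[k]:
            return False
        j += 1
        k += 1
    return True
```

As in the Pascal version, str_eq_buf trusts its caller to supply at least length(s) valid positions in buffer starting at k.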
47. The initial values of str_pool, str_start, pool_ptr, and str_ptr are computed by the INITEX program,
based in part on the information that WEB has output while processing TeX.
init function get_strings_started: boolean;
  {initializes the string pool, but returns false if something goes wrong}
label done, exit;
var k, l: 0 .. 255; {small indices or counters}
  m, n: text_char; {characters input from pool_file}
  g: str_number; {garbage}
  a: integer; {accumulator for check sum}
  c: boolean; {check sum has been checked}
begin pool_ptr ← 0; str_ptr ← 0; str_start[0] ← 0; ⟨Make the first 256 strings 48⟩;
⟨Read the other strings from the TEX.POOL file and return true, or give an error message and return
  false 51⟩;
exit: end;
tini
48. define app_lc_hex(#) ≡ l ← #;
  if l < 10 then append_char(l + "0") else append_char(l − 10 + "a")
⟨Make the first 256 strings 48⟩ ≡
for k ← 0 to 255 do
  begin if (⟨Character k cannot be printed 49⟩) then
    begin append_char("^"); append_char("^");
    if k < ´100 then append_char(k + ´100)
    else if k < ´200 then append_char(k − ´100)
      else begin app_lc_hex(k div 16); app_lc_hex(k mod 16);
        end;
    end
  else append_char(k);
  g ← make_string;
  end
This code is used in section 47.
49. The first 128 strings will contain 95 standard ASCII characters, and the other 33 characters will be
printed in three-symbol form like ‘^^A’ unless a system-dependent change is made here. Installations that
have an extended character set, where for example xchr[´32] = ‘≠’, would like string ´32 to be the single
character ´32 instead of the three characters ´136, ´136, ´132 (^^Z). On the other hand, even people with
an extended character set will want to represent string ´15 by ^^M, since ´15 is carriage return; the idea is
to produce visible strings instead of tabs or line-feeds or carriage-returns or bell-rings or characters that are
treated anomalously in text files.
Unprintable characters of codes 128–255 are, similarly, rendered ^^80–^^ff.
The boolean expression defined here should be true unless TeX internal code number k corresponds to a
non-troublesome visible symbol in the local character set. An appropriate formula for the extended character
set recommended in The TeXbook would, for example, be ‘k ∈ [0, ´10 .. ´12, ´14, ´15, ´33, ´177 .. ´377]’.
If character k cannot be printed, and k < ´200, then character k + ´100 or k − ´100 must be printable;
moreover, ASCII codes [´41 .. ´46, ´60 .. ´71, ´136, ´141 .. ´146, ´160 .. ´171] must be printable. Thus, at
least 81 printable characters are needed.
⟨Character k cannot be printed 49⟩ ≡
(k < " ") ∨ (k > "~")
This code is used in section 48.
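The rule of sections 48 and 49 for the first 256 strings can be checked with a few lines of code. The Python function below is a sketch, assuming the default test (k < " ") ∨ (k > "~") for unprintable codes; the name printable_form is hypothetical, and Python's 0o prefix marks the octal constants.

```python
def printable_form(k):
    """Return the text preloaded as string number k, for 0 <= k <= 255."""
    if not 0 <= k <= 255:
        raise ValueError("k must be an eight-bit code")
    if ord(" ") <= k <= ord("~"):
        return chr(k)                    # visible ASCII: one character
    if k < 0o100:
        return "^^" + chr(k + 0o100)     # control codes: ^^@ through ^^_
    if k < 0o200:
        return "^^" + chr(k - 0o100)     # only code 127 gets here: ^^?
    return "^^" + format(k, "02x")       # codes 128 to 255: ^^80 through ^^ff
```

For example, printable_form(0o15) yields ‘^^M’ for carriage return, and exactly 95 of the first 128 codes come back as single characters.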
50. When the WEB system program called TANGLE processes the TEX.WEB description that you are now
reading, it outputs the Pascal program TEX.PAS and also a string pool file called TEX.POOL. The INITEX
program reads the latter file, where each string appears as a two-digit decimal length followed by the string
itself, and the information is recorded in TeX's string memory.
⟨Global variables 13⟩ +≡
init pool_file: alpha_file; {the string-pool file output by TANGLE}
tini
51. define bad_pool(#) ≡
  begin wake_up_terminal; write_ln(term_out, #); a_close(pool_file); get_strings_started ← false;
  return;
  end
⟨Read the other strings from the TEX.POOL file and return true, or give an error message and return
  false 51⟩ ≡
name_of_file ← pool_name; {we needn't set name_length}
if a_open_in(pool_file) then
  begin c ← false;
  repeat ⟨Read one string, but return false if the string memory space is getting too tight for
    comfort 52⟩;
  until c;
  a_close(pool_file); get_strings_started ← true;
  end
else bad_pool('! I can''t read TEX.POOL.')
This code is used in section 47.
52. ⟨Read one string, but return false if the string memory space is getting too tight for comfort 52⟩ ≡
begin if eof(pool_file) then bad_pool('! TEX.POOL has no check sum.');
read(pool_file, m, n); {read two digits of string length}
if m = '*' then ⟨Check the pool check sum 53⟩
else begin if (xord[m] < "0") ∨ (xord[m] > "9") ∨ (xord[n] < "0") ∨ (xord[n] > "9") then
    bad_pool('! TEX.POOL line doesn''t begin with two digits.');
  l ← xord[m] ∗ 10 + xord[n] − "0" ∗ 11; {compute the length}
  if pool_ptr + l + string_vacancies > pool_size then bad_pool('! You have to increase POOLSIZE.');
  for k ← 1 to l do
    begin if eoln(pool_file) then m ← ' ' else read(pool_file, m);
    append_char(xord[m]);
    end;
  read_ln(pool_file); g ← make_string;
  end;
end
This code is used in section 51.
53. The WEB operation @$ denotes the value that should be at the end of this TEX.POOL file; any other
value means that the wrong pool file has been loaded.
⟨Check the pool check sum 53⟩ ≡
begin a ← 0; k ← 1;
loop begin if (xord[n] < "0") ∨ (xord[n] > "9") then
    bad_pool('! TEX.POOL check sum doesn''t have nine digits.');
  a ← 10 ∗ a + xord[n] − "0";
  if k = 9 then goto done;
  incr(k); read(pool_file, n);
  end;
done: if a ≠ @$ then bad_pool('! TEX.POOL doesn''t match; TANGLE me again.');
c ← true;
end
This code is used in section 52.
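The pool file format handled in sections 51 to 53 (a two-digit length followed by the string on each line, then a final line consisting of ‘*’ and a nine-digit check sum) can be parsed along the following lines. This Python sketch makes several assumptions: the name read_pool_sketch is hypothetical, the file is given as a list of lines without line terminators, and expected_check_sum is a hypothetical parameter standing in for the value that WEB denotes by @$. The test for string_vacancies is left out.

```python
DIGITS = "0123456789"


def read_pool_sketch(lines, expected_check_sum):
    """Return the list of strings in a pool file, or raise ValueError."""
    strings = []
    for line in lines:
        if line.startswith("*"):
            digits = line[1:10]
            if len(digits) != 9 or any(ch not in DIGITS for ch in digits):
                raise ValueError("check sum doesn't have nine digits")
            if int(digits) != expected_check_sum:
                raise ValueError("pool file doesn't match")
            return strings
        if len(line) < 2 or line[0] not in DIGITS or line[1] not in DIGITS:
            raise ValueError("line doesn't begin with two digits")
        n = int(line[:2])
        # A line shorter than its declared length is padded with blanks,
        # as the Pascal code does when it meets eoln early.
        strings.append(line[2:2 + n].ljust(n))
    raise ValueError("pool file has no check sum")
```

Reaching the end of the input without a ‘*’ line corresponds to the ‘TEX.POOL has no check sum’ case of section 52.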
54. On-line and off-line printing. Messages that are sent to a user's terminal and to the transcript-log
file are produced by several ‘print’ procedures. These procedures will direct their output to a variety of
places, based on the setting of the global variable selector, which has the following possible values:
term_and_log, the normal setting, prints on the terminal and on the transcript file.
log_only, prints only on the transcript file.
term_only, prints only on the terminal.
no_print, doesn't print at all. This is used only in rare cases before the transcript file is open.
pseudo, puts output into a cyclic buffer that is used by the show_context routine; when we get to that routine
  we shall discuss the reasoning behind this curious mode.
new_string, appends the output to the current string in the string pool.
0 to 15, prints on one of the sixteen files for \write output.
The symbolic names ‘term_and_log’, etc., have been assigned numeric codes that satisfy the convenient
relations no_print + 1 = term_only, no_print + 2 = log_only, term_only + 2 = log_only + 1 = term_and_log.
Three additional global variables, tally and term_offset and file_offset, record the number of characters
that have been printed since they were most recently cleared to zero. We use tally to record the length of
(possibly very long) stretches of printing; term_offset and file_offset, on the other hand, keep track of how
many characters have appeared so far on the current line that has been output to the terminal or to the
transcript file, respectively.
define no_print = 16 {selector setting that makes data disappear}
define term_only = 17 {printing is destined for the terminal only}
define log_only = 18 {printing is destined for the transcript file only}
define term_and_log = 19 {normal selector setting}
define pseudo = 20 {special selector setting for show_context}
define new_string = 21 {printing is deflected to the string pool}
define max_selector = 21 {highest selector setting}
⟨Global variables 13⟩ +≡
log_file: alpha_file; {transcript of TeX session}
selector: 0 .. max_selector; {where to print a message}
dig: array [0 .. 22] of 0 .. 15; {digits in a number being output}
tally: integer; {the number of characters recently printed}
term_offset: 0 .. max_print_line; {the number of characters on the current terminal line}
file_offset: 0 .. max_print_line; {the number of characters on the current file line}
trick_buf: array [0 .. error_line] of ASCII_code; {circular buffer for pseudoprinting}
trick_count: integer; {threshold for pseudoprinting, explained later}
first_count: integer; {another variable for pseudoprinting}
55. ⟨Initialize the output routines 55⟩ ≡
selector ← term_only; tally ← 0; term_offset ← 0; file_offset ← 0;
See also sections 61, 528, and 533.
This code is used in section 1332.
56. Macro abbreviations for output to the terminal and to the log file are defined here for convenience.
Some systems need special conventions for terminal output, and it is possible to adhere to those conventions
by changing wterm, wterm_ln, and wterm_cr in this section.
define wterm(#) ≡ write(term_out, #)
define wterm_ln(#) ≡ write_ln(term_out, #)
define wterm_cr ≡ write_ln(term_out)
define wlog(#) ≡ write(log_file, #)
define wlog_ln(#) ≡ write_ln(log_file, #)
define wlog_cr ≡ write_ln(log_file)
57. To end a line of text output, we call print_ln.
⟨Basic printing procedures 57⟩ ≡
procedure print_ln; {prints an end-of-line}
begin case selector of
term_and_log: begin wterm_cr; wlog_cr; term_offset ← 0; file_offset ← 0;
  end;
log_only: begin wlog_cr; file_offset ← 0;
  end;
term_only: begin wterm_cr; term_offset ← 0;
  end;
no_print, pseudo, new_string: do_nothing;
othercases write_ln(write_file[selector])
endcases;
end; {tally is not affected}
See also sections 58, 59, 60, 62, 63, 64, 65, 262, 263, 518, 699, and 1355.
This code is used in section 4.
58. The print_char procedure sends one character to the desired destination, using the xchr array to map
it into an external character compatible with input_ln. All printing comes through print_ln or print_char.
⟨Basic printing procedures 57⟩ +≡
procedure print_char(s: ASCII_code); {prints a single character}
label exit;
begin if ⟨Character s is the current new-line character 244⟩ then
  if selector < pseudo then
    begin print_ln; return;
    end;
case selector of
term_and_log: begin wterm(xchr[s]); wlog(xchr[s]); incr(term_offset); incr(file_offset);
  if term_offset = max_print_line then
    begin wterm_cr; term_offset ← 0;
    end;
  if file_offset = max_print_line then
    begin wlog_cr; file_offset ← 0;
    end;
  end;
log_only: begin wlog(xchr[s]); incr(file_offset);
  if file_offset = max_print_line then print_ln;
  end;
term_only: begin wterm(xchr[s]); incr(term_offset);
  if term_offset = max_print_line then print_ln;
  end;
no_print: do_nothing;
pseudo: if tally < trick_count then trick_buf[tally mod error_line] ← s;
new_string: begin if pool_ptr < pool_size then append_char(s);
  end; {we drop characters if the string space is full}
othercases write(write_file[selector], xchr[s])
endcases;
incr(tally);
exit: end;
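The way print_char routes a character and breaks lines at max_print_line can be modeled in a few lines. The Python class below is a simplified sketch, not the procedure itself: the name PrinterSketch and its list-based terminal and log are hypothetical, only the four selector values no_print through term_and_log are modeled, and the new-line character, pseudo, new_string, and the \write files are left out.

```python
NO_PRINT, TERM_ONLY, LOG_ONLY, TERM_AND_LOG = 16, 17, 18, 19


class PrinterSketch:
    """Models only the four terminal/log selector settings."""

    def __init__(self, max_print_line=8):
        self.max_print_line = max_print_line
        self.selector = TERM_AND_LOG
        self.term = [""]          # lines that reached the terminal
        self.log = [""]           # lines that reached the transcript file
        self.term_offset = 0
        self.file_offset = 0
        self.tally = 0

    def print_char(self, ch):
        if self.selector in (TERM_ONLY, TERM_AND_LOG):
            self.term[-1] += ch
            self.term_offset += 1
            if self.term_offset == self.max_print_line:
                self.term.append("")      # plays the role of wterm_cr
                self.term_offset = 0
        if self.selector in (LOG_ONLY, TERM_AND_LOG):
            self.log[-1] += ch
            self.file_offset += 1
            if self.file_offset == self.max_print_line:
                self.log.append("")       # plays the role of wlog_cr
                self.file_offset = 0
        self.tally += 1                   # counted even when nothing prints
```

The two offsets are kept separately because the terminal and the transcript file may be at different positions on their current lines, for instance after a stretch of log_only printing.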
59. An entire string is output by calling print. Note that if we are outputting the single standard ASCII
character c, we could call print("c"), since "c" = 99 is the number of a single-character string, as explained
above. But print_char("c") is quicker, so TeX goes directly to the print_char routine when it knows that
this is safe. (The present implementation assumes that it is always safe to print a visible ASCII character.)
⟨Basic printing procedures 57⟩ +≡
procedure print(s: integer); {prints string s}
label exit;
var j: pool_pointer; {current character code position}
  nl: integer; {new-line character to restore}
begin if s ≥ str_ptr then s ← "???" {this can't happen}
else if s < 256 then
    if s < 0 then s ← "???" {can't happen}
    else begin if selector > pseudo then
        begin print_char(s); return; {internal strings are not expanded}
        end;
      if (⟨Character s is the current new-line character 244⟩) then
        if selector < pseudo then
          begin print_ln; return;
          end;
      nl ← new_line_char; new_line_char ← −1; {temporarily disable new-line character}
      j ← str_start[s];
      while j < str_start[s + 1] do
        begin print_char(so(str_pool[j])); incr(j);
        end;
      new_line_char ← nl; return;
      end;
j ← str_start[s];
while j < str_start[s + 1] do
  begin print_char(so(str_pool[j])); incr(j);
  end;
exit: end;
60. Control sequence names, file names, and strings constructed with \string might contain ASCII_code
values that can't be printed using print_char. Therefore we use slow_print for them:
⟨Basic printing procedures 57⟩ +≡
procedure slow_print(s: integer); {prints string s}
var j: pool_pointer; {current character code position}
begin if (s ≥ str_ptr) ∨ (s < 256) then print(s)
else begin j ← str_start[s];
  while j < str_start[s + 1] do
    begin print(so(str_pool[j])); incr(j);
    end;
  end;
end;
61. Here is the very first thing that TeX prints: a headline that identifies the version number and format
package. The term_offset variable is temporarily incorrect, but the discrepancy is not serious since we assume
that the banner and format identifier together will occupy at most max_print_line character positions.
⟨Initialize the output routines 55⟩ +≡
wterm(banner);
if format_ident = 0 then wterm_ln(' (no format preloaded)')
else begin slow_print(format_ident); print_ln;
  end;
update_terminal;
62. The procedure print_nl is like print, but it makes sure that the string appears at the beginning of a
new line.
⟨Basic printing procedures 57⟩ +≡
procedure print_nl(s: str_number); {prints string s at beginning of line}
begin if ((term_offset > 0) ∧ (odd(selector))) ∨ ((file_offset > 0) ∧ (selector ≥ log_only)) then print_ln;
print(s);
end;
63. The procedure print_esc prints a string that is preceded by the user's escape character (which is usually
a backslash).
⟨Basic printing procedures 57⟩ +≡
procedure print_esc(s: str_number); {prints escape character, then s}
var c: integer; {the escape character code}
begin ⟨Set variable c to the current escape character 243⟩;
if c ≥ 0 then
  if c < 256 then print(c);
slow_print(s);
end;
64. An array of digits in the range 0 .. 15 is printed by print_the_digs.
⟨Basic printing procedures 57⟩ +≡
procedure print_the_digs(k: eight_bits); {prints dig[k − 1] ... dig[0]}
begin while k > 0 do
  begin decr(k);
  if dig[k] < 10 then print_char("0" + dig[k])
  else print_char("A" − 10 + dig[k]);
  end;
end;
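The digit order used by print_the_digs (least significant digit in dig[0], printing from dig[k − 1] downward) is easy to get backwards, so here is a small sketch. Both Python functions are illustrative assumptions: the_digs returns the characters instead of printing them, and to_dig is a hypothetical helper showing one way a caller might fill the dig array.

```python
def the_digs(dig, k):
    """Return the characters for dig[k-1] ... dig[0], most significant first."""
    out = []
    while k > 0:
        k -= 1
        d = dig[k]
        if d < 10:
            out.append(chr(ord("0") + d))
        else:
            out.append(chr(ord("A") - 10 + d))   # 10..15 become A..F
    return "".join(out)


def to_dig(n, base):
    """Hypothetical helper: digits of n >= 0, least significant first."""
    dig = []
    while True:
        dig.append(n % base)
        n //= base
        if n == 0:
            return dig
```

With these definitions, the_digs([15, 10, 1], 3) gives ‘1AF’, since dig[2] = 1 is printed first.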
65. The following procedure, which prints out the decimal representation of a given integer n, has been
written carefully so that it works properly if n = 0 or if (−n) would cause overflow. It does not apply mod or
div to negative arguments, since such operations are not implemented consistently by all Pascal compilers.
⟨Basic printing procedures 57⟩ +≡
procedure print_int(n: integer); {prints an integer in decimal form}
var k: 0 .. 23; {index to current digit; we assume that |n| < 10^23}
$$\begin{cases} b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}, & \text{if } a=0;\\ -16+b\cdot2^{-4}+c\cdot2^{-12}+d\cdot2^{-20}, & \text{if } a=255. \end{cases}$$
(No other choices of a are allowed, since the magnitude of a number in design-size units must be less than
16.) We want to multiply this quantity by the integer z, which is known to be less than $2^{27}$. If $z<2^{23}$, the
individual multiplications $b\cdot z$, $c\cdot z$, $d\cdot z$ cannot overflow; otherwise we will divide z by 2, 4, 8, or 16, to
obtain a multiplier less than $2^{23}$, and we can compensate for this later. If z has thereby been replaced by
$z'=z/2^e$, let $\beta=2^{4-e}$; we shall compute
$$\lfloor (b+c\cdot2^{-8}+d\cdot2^{-16})\,z'/\beta \rfloor$$
if a = 0, or the same quantity minus $\alpha=2^{4+e}z'$ if a = 255.
z
, and this was the first time you encountered a y or z subscript, respectively. Then assign y or z to the
new $\delta$; you have scored a hit. (b) You see $\delta_d$, and no y subscripts have been encountered so far during this
search. Then change the previous $\delta_d$ to $\delta_y$ (this corresponds to changing a command in the output buffer),
and assign y to the new $\delta$; it's another hit. (c) You see $\delta_d$, and a y subscript has been seen but not a z.
Change the previous $\delta_d$ to $\delta_z$ and assign z to the new $\delta$. (d) You encounter both y and z subscripts before
encountering a suitable $\delta$, or you scan all the way to the front of the sequence. Assign d to the new $\delta$; this
assignment may be changed later.
The subscripts $3_z\,1_y\,4_d\ldots$ in the example above were, in fact, produced by this procedure, as the reader
can verify. (Go ahead and try it.)
605. In order to implement such an idea, TeX maintains a stack of pointers to the down, y, and z commands
that have been generated for the current page. And there is a similar stack for right, w, and x commands.
These stacks are called the down stack and right stack, and their top elements are maintained in the variables
down_ptr and right_ptr.
Each entry in these stacks contains four fields: The width field is the amount of motion down or to the
right; the location field is the byte number of the DVI command in question (including the appropriate
dvi_offset); the link field points to the next item below this one on the stack; and the info field encodes the
options for possible change in the DVI command.
define movement_node_size = 3 {number of words per entry in the down and right stacks}
define location(#) ≡ mem[# + 2].int {DVI byte number for a movement command}
⟨Global variables 13⟩ +≡
down_ptr, right_ptr: pointer; {heads of the down and right stacks}
606. ⟨Set initial values of key variables 21⟩ +≡
down_ptr ← null; right_ptr ← null;
607. Here is a subroutine that produces a DVI command for some specified downward or rightward
motion. It has two parameters: w is the amount of motion, and o is either down1 or right1. We use
the fact that the command codes have convenient arithmetic properties: y1 - down1 = w1 - right1 and
z1 - down1 = x1 - right1.
procedure movement(w : scaled; o : eight_bits);
label exit, found, not_found, 2, 1;
var mstate: small_number; {have we seen a y or z?}
p, q: pointer; {current and top nodes on the stack}
k: integer; {index into dvi_buf, modulo dvi_buf_size}
begin q ← get_node(movement_node_size); {new node for the top of the stack}
width(q) ← w; location(q) ← dvi_offset + dvi_ptr;
if o = down1 then
begin link(q) ← down_ptr; down_ptr ← q;
end
else begin link(q) ← right_ptr; right_ptr ← q;
end;
⟨Look at the other stack entries until deciding what sort of DVI command to generate; goto found if
node p is a "hit" 611⟩;
⟨Generate a down or right command for w and return 610⟩;
found: ⟨Generate a y0 or z0 command in order to reuse a previous appearance of w 609⟩;
exit: end;
608. The ‘info’ fields in the entries of the down stack or the right stack have six possible settings: y_here
or z_here mean that the DVI command refers to y or z, respectively (or to w or x, in the case of horizontal
motion); yz_OK means that the DVI command is down (or right) but can be changed to either y or z (or to
either w or x); y_OK means that it is down and can be changed to y but not z; z_OK is similar; and d_fixed
means it must stay down.
The four settings yz_OK, y_OK, z_OK, d_fixed would not need to be distinguished from each other
if we were simply solving the digit-subscripting problem mentioned above. But in TeX's case there is a
complication because of the nested structure of push and pop commands. Suppose we add parentheses to
the digit-subscripting problem, redefining hits so that $\delta_y\ldots\delta_y$ is a hit if all y's between the $\delta$'s are enclosed
in properly nested parentheses, and if the parenthesis level of the right-hand $\delta_y$ is deeper than or equal to
that of the left-hand one. Thus, ‘(’ and ‘)’ correspond to ‘push’ and ‘pop’. Now if we want to assign a
subscript to the final 1 in the sequence
$2_y\,7_d\,1_d\,(\,8_z\,2_y\,8_z\,)\,1$
we cannot change the previous $1_d$ to $1_y$, since that would invalidate the $2_y\ldots2_y$ hit. But we can change it
to $1_z$, scoring a hit since the intervening $8_z$'s are enclosed in parentheses.
The program below removes movement nodes that are introduced after a push, before it outputs the
corresponding pop.
define y_here = 1 {info when the movement entry points to a y command}
define z_here = 2 {info when the movement entry points to a z command}
define yz_OK = 3 {info corresponding to an unconstrained down command}
define y_OK = 4 {info corresponding to a down that can't become a z}
define z_OK = 5 {info corresponding to a down that can't become a y}
define d_fixed = 6 {info corresponding to a down that can't change}
609. When the movement procedure gets to the label found, the value of info(p) will be either y_here or
z_here. If it is, say, y_here, the procedure generates a y0 command (or a w0 command), and marks all info
fields between q and p so that y is not OK in that range.
⟨Generate a y0 or z0 command in order to reuse a previous appearance of w 609⟩ ≡
info(q) ← info(p);
if info(q) = y_here then
begin dvi_out(o + y0 - down1); {y0 or w0}
while link(q) ≠ p do
begin q ← link(q);
case info(q) of
yz_OK: info(q) ← z_OK;
y_OK: info(q) ← d_fixed;
othercases do_nothing
endcases;
end;
end
else begin dvi_out(o + z0 - down1); {z0 or x0}
while link(q) ≠ p do
begin q ← link(q);
case info(q) of
yz_OK: info(q) ← y_OK;
z_OK: info(q) ← d_fixed;
othercases do_nothing
endcases;
end;
end
This code is used in section 607.
610. ⟨Generate a down or right command for w and return 610⟩ ≡
info(q) ← yz_OK;
if abs(w) ≥ '40000000 then
begin dvi_out(o + 3); {down4 or right4}
dvi_four(w); return;
end;
if abs(w) ≥ '100000 then
begin dvi_out(o + 2); {down3 or right3}
if w < 0 then w ← w + '100000000;
dvi_out(w div '200000); w ← w mod '200000; goto 2;
end;
if abs(w) ≥ '200 then
begin dvi_out(o + 1); {down2 or right2}
if w < 0 then w ← w + '200000;
goto 2;
end;
dvi_out(o); {down1 or right1}
if w < 0 then w ← w + '400;
goto 1;
2: dvi_out(w div '400);
1: dvi_out(w mod '400); return
This code is used in section 607.
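The byte selection of section 610 can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the program: the function name encode_movement is hypothetical, the bytes are returned instead of being passed to dvi_out, and the opcode values 157 (down1) and 143 (right1) used in the usage note are those of the DVI format.

```python
def encode_movement(w, o):
    """Return the DVI bytes for a motion of w, where o is down1 or right1.

    The opcodes o+1, o+2, o+3 are the two-, three-, and four-byte
    variants, chosen by the same octal thresholds as in section 610.
    """
    if abs(w) >= 0o40000000:    # down4 or right4
        n = 4
    elif abs(w) >= 0o100000:    # down3 or right3
        n = 3
    elif abs(w) >= 0o200:       # down2 or right2
        n = 2
    else:                       # down1 or right1
        n = 1
    # Reducing w modulo 256**n yields its n-byte two's-complement form,
    # which is what adding '400, '200000, or '100000000 achieves above.
    return bytes([o + n - 1]) + (w % (1 << (8 * n))).to_bytes(n, "big")
```

For example, encode_movement(-1, 157) gives the two bytes 157 and 255. Note that a motion of -128 takes the two-byte form, because the test is on abs(w).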
611. As we search through the stack, we are in one of three states, y_seen, z_seen, or none_seen, depending
on whether we have encountered y_here or z_here nodes. These states are encoded as multiples of 6, so that
they can be added to the info fields for quick decision-making.
define none_seen = 0 {no y_here or z_here nodes have been encountered yet}
define y_seen = 6 {we have seen y_here but not z_here}
define z_seen = 12 {we have seen z_here but not y_here}
⟨Look at the other stack entries until deciding what sort of DVI command to generate; goto found if node
p is a "hit" 611⟩ ≡
p ← link(q); mstate ← none_seen;
while p ≠ null do
begin if width(p) = w then ⟨Consider a node with matching width; goto found if it's a hit 612⟩
else case mstate + info(p) of
none_seen + y_here: mstate ← y_seen;
none_seen + z_here: mstate ← z_seen;
y_seen + z_here, z_seen + y_here: goto not_found;
othercases do_nothing
endcases;
p ← link(p);
end;
not_found:
This code is used in section 607.
612. We might find a valid hit in a y or z byte that is already gone from the buffer. But we can't change
bytes that are gone forever; "the moving finger writes, . . . ."
⟨Consider a node with matching width; goto found if it's a hit 612⟩ ≡
case mstate + info(p) of
none_seen + yz_OK, none_seen + y_OK, z_seen + yz_OK, z_seen + y_OK:
if location(p) < dvi_gone then goto not_found
else ⟨Change buffered instruction to y or w and goto found 613⟩;
none_seen + z_OK, y_seen + yz_OK, y_seen + z_OK:
if location(p) < dvi_gone then goto not_found
else ⟨Change buffered instruction to z or x and goto found 614⟩;
none_seen + y_here, none_seen + z_here, y_seen + z_here, z_seen + y_here: goto found;
othercases do_nothing
endcases
This code is used in section 611.
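The search of sections 611 and 612 can be sketched in Python as below. It is a simplification under stated assumptions: the stack is a list of [width, info] pairs with the newest entry first (the new node q itself excluded), the function name find_hit is hypothetical, the dvi_gone test is omitted (every byte is assumed to be still in the buffer), and the info updates of section 609 are not performed.

```python
Y_HERE, Z_HERE, YZ_OK, Y_OK, Z_OK, D_FIXED = 1, 2, 3, 4, 5, 6
NONE_SEEN, Y_SEEN, Z_SEEN = 0, 6, 12   # multiples of 6, so state + info is unique

def find_hit(stack, w):
    """Return the index of the entry that is a hit for width w, or None.

    A hit on a changeable entry is converted to Y_HERE or Z_HERE in place.
    """
    mstate = NONE_SEEN
    for i, node in enumerate(stack):
        key = mstate + node[1]
        if node[0] == w:
            if key in (NONE_SEEN + YZ_OK, NONE_SEEN + Y_OK,
                       Z_SEEN + YZ_OK, Z_SEEN + Y_OK):
                node[1] = Y_HERE          # change buffered command to y or w
                return i
            if key in (NONE_SEEN + Z_OK, Y_SEEN + YZ_OK, Y_SEEN + Z_OK):
                node[1] = Z_HERE          # change buffered command to z or x
                return i
            if key in (NONE_SEEN + Y_HERE, NONE_SEEN + Z_HERE,
                       Y_SEEN + Z_HERE, Z_SEEN + Y_HERE):
                return i                  # reuse an existing y or z
            # othercases: do nothing and move on to the next entry
        elif key == NONE_SEEN + Y_HERE:
            mstate = Y_SEEN
        elif key == NONE_SEEN + Z_HERE:
            mstate = Z_SEEN
        elif key in (Y_SEEN + Z_HERE, Z_SEEN + Y_HERE):
            return None                   # both y and z seen: not found
    return None
```

Because info is always in 1 . . 6, the sum mstate + info identifies the pair uniquely, which is why a single case statement suffices in the Pascal code.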
613. ⟨Change buffered instruction to y or w and goto found 613⟩ ≡
begin k ← location(p) - dvi_offset;
if k < 0 then k ← k + dvi_buf_size;
dvi_buf[k] ← dvi_buf[k] + y1 - down1; info(p) ← y_here; goto found;
end
This code is used in section 612.
614. ⟨Change buffered instruction to z or x and goto found 614⟩ ≡
begin k ← location(p) - dvi_offset;
if k < 0 then k ← k + dvi_buf_size;
dvi_buf[k] ← dvi_buf[k] + z1 - down1; info(p) ← z_here; goto found;
end
This code is used in section 612.
615. In case you are wondering when all the movement nodes are removed from TeX's memory, the answer
is that they are recycled just before hlist_out and vlist_out finish outputting a box. This restores the down
and right stacks to the state they were in before the box was output, except that some info's may have
become more restrictive.
procedure prune_movements(l : integer); {delete movement nodes with location ≥ l}
label done, exit;
var p: pointer; {node being deleted}
begin while down_ptr ≠ null do
begin if location(down_ptr) < l then goto done;
p ← down_ptr; down_ptr ← link(p); free_node(p, movement_node_size);
end;
done: while right_ptr ≠ null do
begin if location(right_ptr) < l then return;
p ← right_ptr; right_ptr ← link(p); free_node(p, movement_node_size);
end;
exit: end;
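A rough Python analogue of prune_movements, for illustration only: each stack is modeled as a list whose last element is the top, and each entry as a hypothetical (width, location, info) tuple. Memory recycling via free_node has no counterpart here.

```python
def prune_movements(down_stack, right_stack, l):
    """Pop every entry whose location is >= l from the top of both stacks."""
    for stack in (down_stack, right_stack):
        # Locations decrease toward the bottom of a stack, so we can stop
        # at the first entry that was generated before byte l.
        while stack and stack[-1][1] >= l:
            stack.pop()
```

Calling it with l equal to the save_loc of a box discards exactly the entries created while that box was being output.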
616. The actual distances by which we want to move might be computed as the sum of several separate
movements. For example, there might be several glue nodes in succession, or we might want to move right by
the width of some box plus some amount of glue. More importantly, the baselineskip distances are computed
in terms of glue together with the depth and height of adjacent boxes, and we want the DVI file to lump
these three quantities together into a single motion.
Therefore, TeX maintains two pairs of global variables: dvi_h and dvi_v are the h and v coordinates
corresponding to the commands actually output to the DVI file, while cur_h and cur_v are the coordinates
corresponding to the current state of the output routines. Coordinate changes will accumulate in cur_h and
cur_v without being reflected in the output, until such a change becomes necessary or desirable; we can call
the movement procedure whenever we want to make dvi_h = cur_h or dvi_v = cur_v.
The current font reflected in the DVI output is called dvi_f; there is no need for a ‘cur_f’ variable.
The depth of nesting of hlist_out and vlist_out is called cur_s; this is essentially the depth of push commands
in the DVI output.
define synch_h ≡
if cur_h ≠ dvi_h then
begin movement(cur_h - dvi_h, right1); dvi_h ← cur_h;
end
define synch_v ≡
if cur_v ≠ dvi_v then
begin movement(cur_v - dvi_v, down1); dvi_v ← cur_v;
end
⟨Global variables 13⟩ +≡
dvi_h, dvi_v: scaled; {a DVI reader program thinks we are here}
cur_h, cur_v: scaled; {TeX thinks we are here}
dvi_f: internal_font_number; {the current font}
cur_s: integer; {current depth of output box nesting, initially -1}
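The effect of deferring motion until synch_h is invoked can be seen in a small Python model. The class name Position and its moves list are hypothetical; the list records the argument that would be given to movement.

```python
class Position:
    """Track the intended position (cur_h) and the emitted position (dvi_h)."""

    def __init__(self):
        self.cur_h = 0     # where the output routines think we are
        self.dvi_h = 0     # where a DVI reader thinks we are
        self.moves = []    # motions that would be passed to movement()

    def synch_h(self):
        # Emit one combined motion only if the two positions differ.
        if self.cur_h != self.dvi_h:
            self.moves.append(self.cur_h - self.dvi_h)
            self.dvi_h = self.cur_h
```

Three successive changes of +3, +4, and -2 to cur_h followed by synch_h produce a single motion of 5, and a second synch_h produces nothing.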
617. ⟨Initialize variables as ship_out begins 617⟩ ≡
dvi_h ← 0; dvi_v ← 0; cur_h ← h_offset; dvi_f ← null_font; ensure_dvi_open;
if total_pages = 0 then
begin dvi_out(pre); dvi_out(id_byte); {output the preamble}
dvi_four(25400000); dvi_four(473628672); {conversion ratio for sp}
prepare_mag; dvi_four(mag); {magnification factor is frozen}
old_setting ← selector; selector ← new_string; print(" TeX output "); print_int(year);
print_char("."); print_two(month); print_char("."); print_two(day); print_char(":");
print_two(time div 60); print_two(time mod 60); selector ← old_setting; dvi_out(cur_length);
for s ← str_start[str_ptr] to pool_ptr - 1 do dvi_out(so(str_pool[s]));
pool_ptr ← str_start[str_ptr]; {flush the current string}
end
This code is used in section 640.
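The preamble bytes written above can be assembled in Python as a sketch. The function names four and preamble are hypothetical; the constants 247 and 2 are the DVI values of pre and id_byte, and the time is given in minutes since midnight as in TeX's time parameter. The conversion ratio says that one sp is 25400000/473628672 units of 10^-7 meter, where 473628672 = 7227 · 2^16.

```python
PRE, ID_BYTE = 247, 2   # DVI opcode for the preamble and DVI format identifier

def four(n):
    """Four-byte big-endian two's-complement form, like dvi_four."""
    return (n % (1 << 32)).to_bytes(4, "big")

def preamble(mag, year, month, day, minutes):
    """Return the preamble bytes that section 617 would emit."""
    comment = " TeX output %d.%02d.%02d:%02d%02d" % (
        year, month, day, minutes // 60, minutes % 60)
    data = comment.encode("ascii")
    return (bytes([PRE, ID_BYTE]) + four(25400000) + four(473628672)
            + four(mag) + bytes([len(data)]) + data)
```

The length byte corresponds to dvi_out(cur_length); the comment string is built in the string pool only temporarily and then flushed.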
618. When hlist_out is called, its duty is to output the box represented by the hlist_node pointed to by
temp_ptr. The reference point of that box has coordinates (cur_h, cur_v).
Similarly, when vlist_out is called, its duty is to output the box represented by the vlist_node pointed to
by temp_ptr. The reference point of that box has coordinates (cur_h, cur_v).
procedure vlist_out; forward; {hlist_out and vlist_out are mutually recursive}
619. The recursive procedures hlist_out and vlist_out each have local variables save_h and save_v to hold
the values of dvi_h and dvi_v just before entering a new level of recursion. In effect, the values of save_h and
save_v on TeX's run-time stack correspond to the values of h and v that a DVI-reading program will push
onto its coordinate stack.
define move_past = 13 {go to this label when advancing past glue or a rule}
define fin_rule = 14 {go to this label to finish processing a rule}
define next_p = 15 {go to this label when finished with node p}
⟨Declare procedures needed in hlist_out, vlist_out 1368⟩
procedure hlist_out; {output an hlist_node box}
label reswitch, move_past, fin_rule, next_p;
var base_line: scaled; {the baseline coordinate for this box}
left_edge: scaled; {the left coordinate for this box}
save_h, save_v: scaled; {what dvi_h and dvi_v should pop to}
this_box: pointer; {pointer to containing box}
g_order: glue_ord; {applicable order of infinity for glue}
g_sign: normal . . shrinking; {selects type of glue}
p: pointer; {current position in the hlist}
save_loc: integer; {DVI byte location upon entry}
leader_box: pointer; {the leader box being replicated}
leader_wd: scaled; {width of leader box being replicated}
lx: scaled; {extra space between leader boxes}
outer_doing_leaders: boolean; {were we doing leaders?}
edge: scaled; {left edge of sub-box, or right edge of leader space}
glue_temp: real; {glue value before rounding}
cur_glue: real; {glue seen so far}
cur_g: scaled; {rounded equivalent of cur_glue times the glue ratio}
begin cur_g ← 0; cur_glue ← float_constant(0); this_box ← temp_ptr; g_order ← glue_order(this_box);
g_sign ← glue_sign(this_box); p ← list_ptr(this_box); incr(cur_s);
if cur_s > 0 then dvi_out(push);
if cur_s > max_push then max_push ← cur_s;
save_loc ← dvi_offset + dvi_ptr; base_line ← cur_v; left_edge ← cur_h;
while p ≠ null do ⟨Output node p for hlist_out and move to the next node, maintaining the condition
cur_v = base_line 620⟩;
prune_movements(save_loc);
if cur_s > 0 then dvi_pop(save_loc);
decr(cur_s);
end;
620. We ought to give special care to the efficiency of one part of hlist_out, since it belongs to TeX's inner
loop. When a char_node is encountered, we save a little time by processing several nodes in succession until
reaching a non-char_node. The program uses the fact that set_char_0 = 0.
⟨Output node p for hlist_out and move to the next node, maintaining the condition cur_v = base_line 620⟩ ≡
reswitch: if is_char_node(p) then
begin synch_h; synch_v;
repeat f ← font(p); c ← character(p);
if f ≠ dvi_f then ⟨Change font dvi_f to f 621⟩;
if c ≥ qi(128) then dvi_out(set1);
dvi_out(qo(c));
cur_h ← cur_h + char_width(f)(char_info(f)(c)); p ← link(p);
until ¬is_char_node(p);
dvi_h ← cur_h;
end
else ⟨Output the non-char_node p for hlist_out and move to the next node 622⟩
This code is used in section 619.
621. ⟨Change font dvi_f to f 621⟩ ≡
begin if ¬font_used[f] then
begin dvi_font_def(f); font_used[f] ← true;
end;
if f ≤ 64 + font_base then dvi_out(f - font_base - 1 + fnt_num_0)
else begin dvi_out(fnt1); dvi_out(f - font_base - 1);
end;
dvi_f ← f;
end
This code is used in section 620.
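The choice between the one-byte and two-byte font commands can be sketched in Python. The function name select_font is hypothetical, font_base is assumed to be 0 by default, the font definition step is left out, and 171 and 235 are the DVI values of fnt_num_0 and fnt1.

```python
FNT_NUM_0, FNT1 = 171, 235   # DVI opcodes

def select_font(f, font_base=0):
    """Return the DVI bytes that select internal font f."""
    external = f - font_base - 1          # external font numbers start at 0
    if f <= 64 + font_base:
        return bytes([FNT_NUM_0 + external])   # fnt_num_0 .. fnt_num_63
    return bytes([FNT1, external])             # fnt1 with a one-byte parameter
```

Thus the first 64 fonts used in a job cost only one byte per font change.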
622. ⟨Output the non-char_node p for hlist_out and move to the next node 622⟩ ≡
begin case type(p) of
hlist_node, vlist_node: ⟨Output a box in an hlist 623⟩;
rule_node: begin rule_ht ← height(p); rule_dp ← depth(p); rule_wd ← width(p); goto fin_rule;
end;
whatsit_node: ⟨Output the whatsit node p in an hlist 1367⟩;
glue_node: ⟨Move right or output leaders 625⟩;
kern_node, math_node: cur_h ← cur_h + width(p);
ligature_node: ⟨Make node p look like a char_node and goto reswitch 652⟩;
othercases do_nothing
endcases;
goto next_p;
fin_rule: ⟨Output a rule in an hlist 624⟩;
move_past: cur_h ← cur_h + rule_wd;
next_p: p ← link(p);
end
This code is used in section 620.
623. ⟨Output a box in an hlist 623⟩ ≡
if list_ptr(p) = null then cur_h ← cur_h + width(p)
else begin save_h ← dvi_h; save_v ← dvi_v; cur_v ← base_line + shift_amount(p);
{shift the box down}
temp_ptr ← p; edge ← cur_h;
if type(p) = vlist_node then vlist_out else hlist_out;
dvi_h ← save_h; dvi_v ← save_v; cur_h ← edge + width(p); cur_v ← base_line;
end
This code is used in section 622.
624. ⟨Output a rule in an hlist 624⟩ ≡
if is_running(rule_ht) then rule_ht ← height(this_box);
if is_running(rule_dp) then rule_dp ← depth(this_box);
rule_ht ← rule_ht + rule_dp; {this is the rule thickness}
if (rule_ht > 0) ∧ (rule_wd > 0) then {we don't output empty rules}
begin synch_h; cur_v ← base_line + rule_dp; synch_v; dvi_out(set_rule); dvi_four(rule_ht);
dvi_four(rule_wd); cur_v ← base_line; dvi_h ← dvi_h + rule_wd;
end
This code is used in section 622.
625. define billion ≡ float_constant(1000000000)
define vet_glue(#) ≡ glue_temp ← #;
if glue_temp > billion then glue_temp ← billion
else if glue_temp < -billion then glue_temp ← -billion
⟨Move right or output leaders 625⟩ ≡
begin g ← glue_ptr(p); rule_wd ← width(g) - cur_g;
if g_sign ≠ normal then
begin if g_sign = stretching then
begin if stretch_order(g) = g_order then
begin cur_glue ← cur_glue + stretch(g); vet_glue(float(glue_set(this_box)) * cur_glue);
cur_g ← round(glue_temp);
end;
end
else if shrink_order(g) = g_order then
begin cur_glue ← cur_glue - shrink(g); vet_glue(float(glue_set(this_box)) * cur_glue);
cur_g ← round(glue_temp);
end;
end;
rule_wd ← rule_wd + cur_g;
if subtype(p) ≥ a_leaders then
⟨Output leaders in an hlist, goto fin_rule if a rule or to next_p if done 626⟩;
goto move_past;
end
This code is used in section 622.
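The point of keeping the running totals cur_glue and cur_g is that rounding is applied to the accumulated stretch, so rounding errors do not pile up across glue nodes. A Python sketch of the stretching case follows; the names pascal_round and glue_widths are hypothetical, every glue item is assumed to stretch at the applicable order, and pascal_round imitates Pascal's rounding of halves away from zero.

```python
import math

BILLION = 1e9

def pascal_round(x):
    """Round to nearest integer, halves away from zero (as Pascal's round)."""
    return int(math.floor(x + 0.5)) if x >= 0 else -int(math.floor(-x + 0.5))

def glue_widths(glue_set, glues):
    """Given (natural_width, stretch) pairs, return the width of each glue."""
    cur_glue, cur_g, widths = 0.0, 0, []
    for width, stretch in glues:
        rule_wd = width - cur_g                    # remove the old rounded total
        cur_glue += stretch
        glue_temp = max(-BILLION, min(BILLION, glue_set * cur_glue))  # vet_glue
        cur_g = pascal_round(glue_temp)
        widths.append(rule_wd + cur_g)             # add the new rounded total
    return widths
```

With glue_set = 0.3 and three glues of stretch 1, the widths come out as 0, 1, 0, whose sum 1 equals the rounded total stretch 0.9; rounding each glue separately would have given 0.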
626. ⟨Output leaders in an hlist, goto fin_rule if a rule or to next_p if done 626⟩ ≡
begin leader_box ← leader_ptr(p);
if type(leader_box) = rule_node then
begin rule_ht ← height(leader_box); rule_dp ← depth(leader_box); goto fin_rule;
end;
leader_wd ← width(leader_box);
if (leader_wd > 0) ∧ (rule_wd > 0) then
begin rule_wd ← rule_wd + 10; {compensate for floating-point rounding}
edge ← cur_h + rule_wd; lx ← 0; ⟨Let cur_h be the position of the first box, and set leader_wd + lx to
the spacing between corresponding parts of boxes 627⟩;
while cur_h + leader_wd ≤ edge do
⟨Output a leader box at cur_h, then advance cur_h by leader_wd + lx 628⟩;
cur_h ← edge - 10; goto next_p;
end;
end
This code is used in section 625.
627. The calculations related to leaders require a bit of care. First, in the case of a_leaders (aligned leaders),
we want to move cur_h to left_edge plus the smallest multiple of leader_wd for which the result is not less than
the current value of cur_h; i.e., cur_h should become $\mathit{left\_edge}+\mathit{leader\_wd}\times\lceil(\mathit{cur\_h}-\mathit{left\_edge})/\mathit{leader\_wd}\rceil$.
The program here should work in all cases even though some implementations of Pascal give nonstandard
results for the div operation when cur_h is less than left_edge.
In the case of c_leaders (centered leaders), we want to increase cur_h by half of the excess space not
occupied by the leaders; and in the case of x_leaders (expanded leaders) we increase cur_h by 1/(q + 1) of
this excess space, where q is the number of times the leader box will be replicated. Slight inaccuracies in the
division might accumulate; half of this rounding error is placed at each end of the leaders.
⟨Let cur_h be the position of the first box, and set leader_wd + lx to the spacing between corresponding
parts of boxes 627⟩ ≡
if subtype(p) = a_leaders then
begin save_h ← cur_h; cur_h ← left_edge + leader_wd * ((cur_h - left_edge) div leader_wd);
if cur_h < save_h then cur_h ← cur_h + leader_wd;
end
else begin lq ← rule_wd div leader_wd; {the number of box copies}
lr ← rule_wd mod leader_wd; {the remaining space}
if subtype(p) = c_leaders then cur_h ← cur_h + (lr div 2)
else begin lx ← lr div (lq + 1); cur_h ← cur_h + ((lr - (lq - 1) * lx) div 2);
end;
end
This code is used in section 626.
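The three placement rules can be tried out with a small Python sketch. The function name first_leader_position and the string tags for the leader kinds are hypothetical. Python's // always rounds toward minus infinity, so the aligned case yields a multiple that is not greater than cur_h before the correction step, whichever sign cur_h - left_edge has; the correction step then behaves as in the Pascal code.

```python
def first_leader_position(kind, cur_h, left_edge, rule_wd, leader_wd):
    """Return (position of the first box, extra spacing lx between boxes)."""
    lx = 0
    if kind == "aligned":                      # a_leaders
        start = left_edge + leader_wd * ((cur_h - left_edge) // leader_wd)
        if start < cur_h:                      # move up to the next multiple
            start += leader_wd
    else:
        lq, lr = divmod(rule_wd, leader_wd)    # box copies, remaining space
        if kind == "centered":                 # c_leaders
            start = cur_h + lr // 2
        else:                                  # x_leaders
            lx = lr // (lq + 1)
            start = cur_h + (lr - (lq - 1) * lx) // 2
    return start, lx
```

For a space of 35 units and boxes of width 10, centered leaders start 2 units in, while expanded leaders start 1 unit in and leave 1 extra unit between boxes.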
628. The synch operations here are intended to decrease the number of bytes needed to specify horizontal
and vertical motion in the DVI output.
⟨Output a leader box at cur_h, then advance cur_h by leader_wd + lx 628⟩ ≡
begin cur_v ← base_line + shift_amount(leader_box); synch_v; save_v ← dvi_v;
synch_h; save_h ← dvi_h; temp_ptr ← leader_box; outer_doing_leaders ← doing_leaders;
doing_leaders ← true;
if type(leader_box) = vlist_node then vlist_out else hlist_out;
doing_leaders ← outer_doing_leaders; dvi_v ← save_v; dvi_h ← save_h; cur_v ← base_line;
cur_h ← save_h + leader_wd + lx;
end
This code is used in section 626.
629. The vlist_out routine is similar to hlist_out, but a bit simpler.
procedure vlist_out; {output a vlist_node box}
label move_past, fin_rule, next_p;
var left_edge: scaled; {the left coordinate for this box}
top_edge: scaled; {the top coordinate for this box}
save_h, save_v: scaled; {what dvi_h and dvi_v should pop to}
this_box: pointer; {pointer to containing box}
g_order: glue_ord; {applicable order of infinity for glue}
g_sign: normal . . shrinking; {selects type of glue}
p: pointer; {current position in the vlist}
save_loc: integer; {DVI byte location upon entry}
leader_box: pointer; {the leader box being replicated}
leader_ht: scaled; {height of leader box being replicated}
lx: scaled; {extra space between leader boxes}
outer_doing_leaders: boolean; {were we doing leaders?}
edge: scaled; {bottom boundary of leader space}
glue_temp: real; {glue value before rounding}
cur_glue: real; {glue seen so far}
cur_g: scaled; {rounded equivalent of cur_glue times the glue ratio}
begin cur_g ← 0; cur_glue ← float_constant(0); this_box ← temp_ptr; g_order ← glue_order(this_box);
g_sign ← glue_sign(this_box); p ← list_ptr(this_box); incr(cur_s);
if cur_s > 0 then dvi_out(push);
if cur_s > max_push then max_push ← cur_s;
save_loc ← dvi_offset + dvi_ptr; left_edge ← cur_h; cur_v ← cur_v - height(this_box); top_edge ← cur_v;
while p ≠ null do ⟨Output node p for vlist_out and move to the next node, maintaining the condition
cur_h = left_edge 630⟩;
prune_movements(save_loc);
if cur_s > 0 then dvi_pop(save_loc);
decr(cur_s);
end;
630. ⟨Output node p for vlist_out and move to the next node, maintaining the condition
cur_h = left_edge 630⟩ ≡
begin if is_char_node(p) then confusion("vlistout")
else ⟨Output the non-char_node p for vlist_out 631⟩;
next_p: p ← link(p);
end
This code is used in section 629.
631. ⟨Output the non-char_node p for vlist_out 631⟩ ≡
begin case type(p) of
hlist_node, vlist_node: ⟨Output a box in a vlist 632⟩;
rule_node: begin rule_ht ← height(p); rule_dp ← depth(p); rule_wd ← width(p); goto fin_rule;
end;
whatsit_node: ⟨Output the whatsit node p in a vlist 1366⟩;
glue_node: ⟨Move down or output leaders 634⟩;
kern_node: cur_v ← cur_v + width(p);
othercases do_nothing
endcases;
goto next_p;
fin_rule: ⟨Output a rule in a vlist, goto next_p 633⟩;
move_past: cur_v ← cur_v + rule_ht;
end
This code is used in section 630.
632. The synch_v here allows the DVI output to use one-byte commands for adjusting v in most cases,
since the baselineskip distance will usually be constant.
⟨Output a box in a vlist 632⟩ ≡
if list_ptr(p) = null then cur_v ← cur_v + height(p) + depth(p)
else begin cur_v ← cur_v + height(p); synch_v; save_h ← dvi_h; save_v ← dvi_v;
cur_h ← left_edge + shift_amount(p); {shift the box right}
temp_ptr ← p;
if type(p) = vlist_node then vlist_out else hlist_out;
dvi_h ← save_h; dvi_v ← save_v; cur_v ← save_v + depth(p); cur_h ← left_edge;
end
This code is used in section 631.
633. ⟨Output a rule in a vlist, goto next_p 633⟩ ≡
if is_running(rule_wd) then rule_wd ← width(this_box);
rule_ht ← rule_ht + rule_dp; {this is the rule thickness}
cur_v ← cur_v + rule_ht;
if (rule_ht > 0) ∧ (rule_wd > 0) then {we don't output empty rules}
begin synch_h; synch_v; dvi_out(put_rule); dvi_four(rule_ht); dvi_four(rule_wd);
end;
goto next_p
This code is used in section 631.
634. ⟨Move down or output leaders 634⟩ ≡
begin g ← glue_ptr(p); rule_ht ← width(g) - cur_g;
if g_sign ≠ normal then
begin if g_sign = stretching then
begin if stretch_order(g) = g_order then
begin cur_glue ← cur_glue + stretch(g); vet_glue(float(glue_set(this_box)) * cur_glue);
cur_g ← round(glue_temp);
end;
end
else if shrink_order(g) = g_order then
begin cur_glue ← cur_glue - shrink(g); vet_glue(float(glue_set(this_box)) * cur_glue);
cur_g ← round(glue_temp);
end;
end;
rule_ht ← rule_ht + cur_g;
if subtype(p) ≥ a_leaders then
⟨Output leaders in a vlist, goto fin_rule if a rule or to next_p if done 635⟩;
goto move_past;
end
This code is used in section 631.
635. ⟨Output leaders in a vlist, goto fin_rule if a rule or to next_p if done 635⟩ ≡
begin leader_box ← leader_ptr(p);
if type(leader_box) = rule_node then
begin rule_wd ← width(leader_box); rule_dp ← 0; goto fin_rule;
end;
leader_ht ← height(leader_box) + depth(leader_box);
if (leader_ht > 0) ∧ (rule_ht > 0) then
begin rule_ht ← rule_ht + 10; {compensate for floating-point rounding}
edge ← cur_v + rule_ht; lx ← 0; ⟨Let cur_v be the position of the first box, and set leader_ht + lx to
the spacing between corresponding parts of boxes 636⟩;
while cur_v + leader_ht ≤ edge do
⟨Output a leader box at cur_v, then advance cur_v by leader_ht + lx 637⟩;
cur_v ← edge - 10; goto next_p;
end;
end
This code is used in section 634.
636. ⟨Let cur_v be the position of the first box, and set leader_ht + lx to the spacing between
corresponding parts of boxes 636⟩ ≡
if subtype(p) = a_leaders then
begin save_v ← cur_v; cur_v ← top_edge + leader_ht * ((cur_v - top_edge) div leader_ht);
if cur_v < save_v then cur_v ← cur_v + leader_ht;
end
else begin lq ← rule_ht div leader_ht; {the number of box copies}
lr ← rule_ht mod leader_ht; {the remaining space}
if subtype(p) = c_leaders then cur_v ← cur_v + (lr div 2)
else begin lx ← lr div (lq + 1); cur_v ← cur_v + ((lr - (lq - 1) * lx) div 2);
end;
end
This code is used in section 635.
637. When we reach this part of the program, cur_v indicates the top of a leader box, not its baseline.
⟨Output a leader box at cur_v, then advance cur_v by leader_ht + lx 637⟩ ≡
begin cur_h ← left_edge + shift_amount(leader_box); synch_h; save_h ← dvi_h;
cur_v ← cur_v + height(leader_box); synch_v; save_v ← dvi_v; temp_ptr ← leader_box;
outer_doing_leaders ← doing_leaders; doing_leaders ← true;
if type(leader_box) = vlist_node then vlist_out else hlist_out;
doing_leaders ← outer_doing_leaders; dvi_v ← save_v; dvi_h ← save_h; cur_h ← left_edge;
cur_v ← save_v - height(leader_box) + leader_ht + lx;
end
This code is used in section 635.
638. The hlist_out and vlist_out procedures are now complete, so we are ready for the ship_out routine
that gets them started in the first place.
procedure ship_out(p : pointer); {output the box p}
label done;
var page_loc: integer; {location of the current bop}
j, k: 0 . . 9; {indices to first ten count registers}
s: pool_pointer; {index into str_pool}
old_setting: 0 . . max_selector; {saved selector setting}
begin if tracing_output > 0 then
begin print_nl(""); print_ln; print("Completed box being shipped out");
end;
if term_offset > max_print_line - 9 then print_ln
else if (term_offset > 0) ∨ (file_offset > 0) then print_char(" ");
print_char("["); j ← 9;
while (count(j) = 0) ∧ (j > 0) do decr(j);
for k ← 0 to j do
begin print_int(count(k));
if k < j then print_char(".");
end;
update_terminal;
if tracing_output > 0 then
begin print_char("]"); begin_diagnostic; show_box(p); end_diagnostic(true);
end;
⟨Ship box p out 640⟩;
if tracing_output ≤ 0 then print_char("]");
dead_cycles ← 0; update_terminal; {progress report}
⟨Flush the box from memory, showing statistics if requested 639⟩;
end;
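The progress report printed by ship_out, such as "[1.0.3", drops trailing zero counters but always shows count(0). A Python sketch, with the hypothetical name page_label and a ten-element list standing in for the count registers:

```python
def page_label(count):
    """Return the text printed for a page: '[' and the counts up to the last nonzero one."""
    j = 9
    while count[j] == 0 and j > 0:   # find the last nonzero counter, but keep count[0]
        j -= 1
    return "[" + ".".join(str(count[k]) for k in range(j + 1))
```

The closing "]" is printed separately, either before the box is displayed or after the page has been shipped, depending on tracing_output.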
639. ⟨Flush the box from memory, showing statistics if requested 639⟩ ≡
stat if tracing_stats > 1 then
begin print_nl("Memory usage before: "); print_int(var_used); print_char("&");
print_int(dyn_used); print_char(";");
end;
tats
flush_node_list(p);
stat if tracing_stats > 1 then
begin print(" after: "); print_int(var_used); print_char("&"); print_int(dyn_used);
print("; still untouched: "); print_int(hi_mem_min - lo_mem_max - 1); print_ln;
end;
tats
This code is used in section 638.
640. ⟨Ship box p out 640⟩ ≡
⟨Update the values of max_h and max_v; but if the page is too large, goto done 641⟩;
⟨Initialize variables as ship_out begins 617⟩;
page_loc ← dvi_offset + dvi_ptr; dvi_out(bop);
for k ← 0 to 9 do dvi_four(count(k));
dvi_four(last_bop); last_bop ← page_loc; cur_v ← height(p) + v_offset; temp_ptr ← p;
if type(p) = vlist_node then vlist_out else hlist_out;
dvi_out(eop); incr(total_pages); cur_s ← -1;
done:
This code is used in section 638.
641. Sometimes the user will generate a huge page because other error messages are being ignored. Such
pages are not output to the dvi file, since they may confuse the printing software.
⟨Update the values of max_h and max_v; but if the page is too large, goto done 641⟩ ≡
if (height(p) > max_dimen) ∨ (depth(p) > max_dimen) ∨
(height(p) + depth(p) + v_offset > max_dimen) ∨ (width(p) + h_offset > max_dimen) then
begin print_err("Huge page cannot be shipped out");
help2("The page just created is more than 18 feet tall or")
("more than 18 feet wide, so I suspect something went wrong."); error;
if tracing_output ≤ 0 then
begin begin_diagnostic; print_nl("The following box has been deleted:"); show_box(p);
end_diagnostic(true);
end;
goto done;
end;
if height(p) + depth(p) + v_offset > max_v then max_v ← height(p) + depth(p) + v_offset;
if width(p) + h_offset > max_h then max_h ← width(p) + h_offset
This code is used in section 640.
642. At the end of the program, we must finish things off by writing the postamble. If total_pages = 0,
the DVI file was never opened. If total_pages ≥ 65536, the DVI file will lie. And if max_push ≥ 65536, the
user deserves whatever chaos might ensue.
An integer variable k will be declared for use by this routine.
⟨Finish the DVI file 642⟩ ≡
while cur_s > -1 do
begin if cur_s > 0 then dvi_out(pop)
else begin dvi_out(eop); incr(total_pages);
end;
decr(cur_s);
end;
if total_pages = 0 then print_nl("No pages of output.")
else begin dvi_out(post); {beginning of the postamble}
dvi_four(last_bop); last_bop ← dvi_offset + dvi_ptr - 5; {post location}
dvi_four(25400000); dvi_four(473628672); {conversion ratio for sp}
prepare_mag; dvi_four(mag); {magnification factor}
dvi_four(max_v); dvi_four(max_h);
dvi_out(max_push div 256); dvi_out(max_push mod 256);
dvi_out((total_pages div 256) mod 256); dvi_out(total_pages mod 256);
⟨Output the font definitions for all fonts that were used 643⟩;
dvi_out(post_post); dvi_four(last_bop); dvi_out(id_byte);
k ← 4 + ((dvi_buf_size - dvi_ptr) mod 4); {the number of 223's}
while k > 0 do
begin dvi_out(223); decr(k);
end;
⟨Empty the last bytes out of dvi_buf 599⟩;
print_nl("Output written on "); slow_print(output_file_name); print(" ("); print_int(total_pages);
print(" page");
if total_pages ≠ 1 then print_char("s");
print(", "); print_int(dvi_offset + dvi_ptr); print(" bytes)."); b_close(dvi_file);
end
This code is used in section 1333.
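The count of trailing 223's is chosen so that at least four are written and the file length becomes a multiple of 4. The Python sketch below expresses the same rule in terms of the file length; the function name trailer_padding is hypothetical, and the equivalence with the formula above rests on the assumption that dvi_buf_size is a multiple of 4, so that the buffer position and the file position agree modulo 4.

```python
def trailer_padding(length):
    """Number of 223 bytes to append after id_byte, given the length so far."""
    return 4 + (-length) % 4
```

For every length the result lies between 4 and 7, and length plus padding is divisible by 4.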
643. ⟨Output the font definitions for all fonts that were used 643⟩ ≡
while font_ptr > font_base do
begin if font_used[font_ptr] then dvi_font_def(font_ptr);
decr(font_ptr);
end
This code is used in section 642.
644. Packaging. We're essentially done with the parts of TeX that are concerned with the input
(get_next) and the output (ship_out). So it's time to get heavily into the remaining part, which does
the real work of typesetting.
After lists are constructed, TeX wraps them up and puts them into boxes. Two major subroutines are
given the responsibility for this task: hpack applies to horizontal lists (hlists) and vpack applies to vertical
lists (vlists). The main duty of hpack and vpack is to compute the dimensions of the resulting boxes, and
to adjust the glue if one of those dimensions is pre-specified. The computed sizes normally enclose all of the
material inside the new box; but some items may stick out if negative glue is used, if the box is overfull, or
if a \vbox includes other boxes that have been shifted left.
The subroutine call hpack(p, w, m) returns a pointer to an hlist_node for a box containing the hlist that
starts at p. Parameter w specifies a width; and parameter m is either 'exactly' or 'additional'. Thus,
hpack(p, w, exactly) produces a box whose width is exactly w, while hpack(p, w, additional) yields a box
whose width is the natural width plus w. It is convenient to define a macro called 'natural' to cover the
most common case, so that we can say hpack(p, natural) to get a box that has the natural width of list p.
Similarly, vpack(p, w, m) returns a pointer to a vlist_node for a box containing the vlist that starts at p.
In this case w represents a height instead of a width; the parameter m is interpreted as in hpack.
define exactly = 0 {a box dimension is pre-specified}
define additional = 1 {a box dimension is increased from the natural one}
define natural ≡ 0, additional {shorthand for parameters to hpack and vpack}
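The meaning of the (w, m) parameter pair can be summarized in a few lines. This Python sketch is an illustration, not part of the program; target_size is a hypothetical name, and NATURAL is modeled as a tuple that is unpacked into two arguments, mimicking the way the macro expands to '0, additional'.

```python
EXACTLY = 0      # a box dimension is pre-specified
ADDITIONAL = 1   # a box dimension is increased from the natural one
NATURAL = (0, ADDITIONAL)  # stands in for the two-argument macro expansion


def target_size(natural_size, w, m):
    """Final width (hpack) or height (vpack) requested by the pair (w, m)."""
    if m == EXACTLY:
        return w
    if m == ADDITIONAL:
        return natural_size + w
    raise ValueError("m must be EXACTLY or ADDITIONAL")


# A list whose natural size is 100 units:
print(target_size(100, 300, EXACTLY))     # like \hbox to 300pt
print(target_size(100, 10, ADDITIONAL))   # like \hbox spread 10pt
print(target_size(100, *NATURAL))         # like plain \hbox
```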
645. The parameters to hpack and vpack correspond to TeX's primitives like '\hbox to 300pt', '\hbox
spread 10pt'; note that '\hbox' with no dimension following it is equivalent to '\hbox spread 0pt'. The
scan_spec subroutine scans such constructions in the user's input, including the mandatory left brace that
follows them, and it puts the specification onto save_stack so that the desired box can later be obtained by
executing the following code:
save_ptr ← save_ptr - 2;
hpack(p, saved(1), saved(0)).
Special care is necessary to ensure that the special save_stack codes are placed just below the new group
code, because scanning can change save_stack when \csname appears.
procedure scan_spec(c: group_code; three_codes: boolean); {scans a box specification and left brace}
label found;
var s: integer; {temporarily saved value}
spec_code: exactly .. additional;
begin if three_codes then s ← saved(0);
if scan_keyword("to") then spec_code ← exactly
else if scan_keyword("spread") then spec_code ← additional
else begin spec_code ← additional; cur_val ← 0; goto found;
end;
scan_normal_dimen;
found: if three_codes then
begin saved(0) ← s; incr(save_ptr);
end;
saved(0) ← spec_code; saved(1) ← cur_val; save_ptr ← save_ptr + 2; new_save_level(c); scan_left_brace;
end;
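The three-way decision made by scan_spec can be imitated on a plain string. The Python sketch below is a loose illustration under strong simplifying assumptions: it works on characters rather than tokens, accepts only dimensions written in pt, ignores the save stack entirely, and scan_spec_sketch is a hypothetical name.

```python
import re

EXACTLY, ADDITIONAL = 0, 1


def scan_spec_sketch(text):
    """Return (spec_code, value_in_pt, text_after_left_brace)."""
    m = re.match(r"\s*(to|spread)\s*(-?\d+(?:\.\d+)?)pt\s*", text)
    if m:
        spec_code = EXACTLY if m.group(1) == "to" else ADDITIONAL
        value = float(m.group(2))
        rest = text[m.end():]
    else:
        # No keyword: behave like 'spread 0pt'.
        spec_code, value, rest = ADDITIONAL, 0.0, text.lstrip()
    if not rest.startswith("{"):
        raise ValueError("missing left brace")
    return spec_code, value, rest[1:]


print(scan_spec_sketch("to 300pt{abc}"))
print(scan_spec_sketch("spread 10pt {x}"))
print(scan_spec_sketch("{x}"))
```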
646. To figure out the glue setting, hpack and vpack determine how much stretchability and shrinkability
are present, considering all four orders of infinity. The highest order of infinity that has a nonzero coefficient
is then used as if no other orders were present.
For example, suppose that the given list contains six glue nodes with the respective stretchabilities 3pt,
8fill, 5fil, 6pt, -3fil, -8fill. Then the total is essentially 2fil; and if a total additional space of 6pt is to be
achieved by stretching, the actual amounts of stretch will be 0pt, 0pt, 15pt, 0pt, -9pt, and 0pt, since only
'fil' glue will be considered. (The 'fill' glue is therefore not really stretching infinitely with respect to 'fil';
nobody would actually want that to happen.)
The arrays total_stretch and total_shrink are used to determine how much glue of each kind is present. A
global variable last_badness is used to implement \badness.
⟨Global variables 13⟩ +≡
total_stretch, total_shrink: array [glue_ord] of scaled; {glue found by hpack or vpack}
last_badness: integer; {badness of the most recently packaged box}
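The six-glue example can be reproduced directly. The following Python sketch is illustrative only: stretch_amounts is a hypothetical helper, dimensions are plain numbers rather than scaled values, and exact fractions are used in place of the program's glue_ratio type.

```python
from fractions import Fraction

ORDERS = ["normal", "fil", "fill", "filll"]  # increasing orders of infinity


def stretch_amounts(glue, extra):
    """glue: list of (stretchability, order); extra: space to gain by stretching."""
    totals = {o: Fraction(0) for o in ORDERS}
    for amount, order in glue:
        totals[order] += amount
    # Use the highest order whose total is nonzero; fall back to normal.
    order = next((o for o in reversed(ORDERS) if totals[o] != 0), "normal")
    if totals[order] == 0:
        return order, [Fraction(0)] * len(glue)
    ratio = Fraction(extra) / totals[order]
    return order, [amount * ratio if o == order else Fraction(0)
                   for amount, o in glue]


glue = [(3, "normal"), (8, "fill"), (5, "fil"),
        (6, "normal"), (-3, "fil"), (-8, "fill")]
order, amounts = stretch_amounts(glue, 6)
print(order, [int(a) for a in amounts])  # fil [0, 0, 15, 0, -9, 0]
```

The fill totals cancel to zero, so fil becomes the governing order; the ratio is 6/2 = 3, and only the two fil glues move.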
647. If the global variable adjust_tail is non-null, the hpack routine also removes all occurrences of ins_node,
mark_node, and adjust_node items and appends the resulting material onto the list that ends at location
adjust_tail.
⟨Global variables 13⟩ +≡
adjust_tail: pointer; {tail of adjustment list}
648. ⟨Set initial values of key variables 21⟩ +≡
adjust_tail ← null; last_badness ← 0;
649. Here now is hpack, which contains few if any surprises.
function hpack(p: pointer; w: scaled; m: small_number): pointer;
label reswitch, common_ending, exit;
var r: pointer; {the box node that will be returned}
q: pointer; {trails behind p}
h, d, x: scaled; {height, depth, and natural width}
s: scaled; {shift amount}
g: pointer; {points to a glue specification}
o: glue_ord; {order of infinity}
f: internal_font_number; {the font in a char_node}
i: four_quarters; {font information about a char_node}
hd: eight_bits; {height and depth indices for a character}
begin last_badness ← 0; r ← get_node(box_node_size); type(r) ← hlist_node;
subtype(r) ← min_quarterword; shift_amount(r) ← 0; q ← r + list_offset; link(q) ← p;
h ← 0; ⟨Clear dimensions to zero 650⟩;
while p ≠ null do ⟨Examine node p in the hlist, taking account of its effect on the dimensions of the
new box, or moving it to the adjustment list; then advance p to the next node 651⟩;
if adjust_tail ≠ null then link(adjust_tail) ← null;
height(r) ← h; depth(r) ← d;
⟨Determine the value of width(r) and the appropriate glue setting; then return or goto
common_ending 657⟩;
common_ending: ⟨Finish issuing a diagnostic message for an overfull or underfull hbox 663⟩;
exit: hpack ← r;
end;
650. ⟨Clear dimensions to zero 650⟩ ≡
d ← 0; x ← 0; total_stretch[normal] ← 0; total_shrink[normal] ← 0; total_stretch[fil] ← 0;
total_shrink[fil] ← 0; total_stretch[fill] ← 0; total_shrink[fill] ← 0; total_stretch[filll] ← 0;
total_shrink[filll] ← 0
This code is used in sections 649 and 668.
651. ⟨Examine node p in the hlist, taking account of its effect on the dimensions of the new box, or
moving it to the adjustment list; then advance p to the next node 651⟩ ≡
begin reswitch: while is_char_node(p) do ⟨Incorporate character dimensions into the dimensions of the
hbox that will contain it, then move to the next node 654⟩;
if p ≠ null then
begin case type(p) of
hlist_node, vlist_node, rule_node, unset_node: ⟨Incorporate box dimensions into the dimensions of the
hbox that will contain it 653⟩;
ins_node, mark_node, adjust_node: if adjust_tail ≠ null then
⟨Transfer node p to the adjustment list 655⟩;
whatsit_node: ⟨Incorporate a whatsit node into an hbox 1360⟩;
glue_node: ⟨Incorporate glue into the horizontal totals 656⟩;
kern_node, math_node: x ← x + width(p);
ligature_node: ⟨Make node p look like a char_node and goto reswitch 652⟩;
othercases do_nothing
endcases;
p ← link(p);
end;
end
This code is used in section 649.
652. ⟨Make node p look like a char_node and goto reswitch 652⟩ ≡
begin mem[lig_trick] ← mem[lig_char(p)]; link(lig_trick) ← link(p); p ← lig_trick; goto reswitch;
end
This code is used in sections 622, 651, and 1147.
653. The code here implicitly uses the fact that running dimensions are indicated by null_flag, which will
be ignored in the calculations because it is a highly negative number.
⟨Incorporate box dimensions into the dimensions of the hbox that will contain it 653⟩ ≡
begin x ← x + width(p);
if type(p) ≥ rule_node then s ← 0 else s ← shift_amount(p);
if height(p) - s > h then h ← height(p) - s;
if depth(p) + s > d then d ← depth(p) + s;
end
This code is used in section 651.
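The effect of a shifted box, and of a rule with running dimensions, on the enclosing hbox can be traced with a small sketch. This Python fragment is illustrative; incorporate_box and its is_rule flag are hypothetical simplifications of the test type(p) ≥ rule_node, and the value of NULL_FLAG is assumed to be the one given where null_flag is defined earlier in the program.

```python
NULL_FLAG = -(2 ** 30)  # assumed value of null_flag (a highly negative number)


def incorporate_box(h, d, x, width, height, depth, shift=0, is_rule=False):
    """Update (h, d, x) of the enclosing hbox for one box or rule node."""
    s = 0 if is_rule else shift      # rules are never shifted
    x += width
    h = max(h, height - s)           # a positive shift lowers the box
    d = max(d, depth + s)
    return h, d, x


# A box of height 10 and depth 2, lowered by 3:
h, d, x = incorporate_box(0, 0, 0, width=20, height=10, depth=2, shift=3)
print(h, d, x)  # 7 5 20

# A rule with running height and depth leaves h and d alone:
h, d, x = incorporate_box(h, d, x, width=1, height=NULL_FLAG,
                          depth=NULL_FLAG, is_rule=True)
print(h, d, x)  # 7 5 21
```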
654. The following code is part of TeX's inner loop; i.e., adding another character of text to the user's
input will cause each of these instructions to be exercised one more time.
⟨Incorporate character dimensions into the dimensions of the hbox that will contain it, then move to the
next node 654⟩ ≡
begin f ← font(p); i ← char_info(f)(character(p)); hd ← height_depth(i); x ← x + char_width(f)(i);
s ← char_height(f)(hd); if s > h then h ← s;
s ← char_depth(f)(hd); if s > d then d ← s;
p ← link(p);
end
This code is used in section 651.
655. Although node q is not necessarily the immediate predecessor of node p, it always points to some
node in the list preceding p. Thus, we can delete nodes by moving q when necessary. The algorithm takes
linear time, and the extra computation does not intrude on the inner loop unless it is necessary to make a
deletion.
⟨Transfer node p to the adjustment list 655⟩ ≡
begin while link(q) ≠ p do q ← link(q);
if type(p) = adjust_node then
begin link(adjust_tail) ← adjust_ptr(p);
while link(adjust_tail) ≠ null do adjust_tail ← link(adjust_tail);
p ← link(p); free_node(link(q), small_node_size);
end
else begin link(adjust_tail) ← p; adjust_tail ← p; p ← link(p);
end;
link(q) ← p; p ← q;
end
This code is used in section 651.
656. ⟨Incorporate glue into the horizontal totals 656⟩ ≡
begin g ← glue_ptr(p); x ← x + width(g);
o ← stretch_order(g); total_stretch[o] ← total_stretch[o] + stretch(g); o ← shrink_order(g);
total_shrink[o] ← total_shrink[o] + shrink(g);
if subtype(p) ≥ a_leaders then
begin g ← leader_ptr(p);
if height(g) > h then h ← height(g);
if depth(g) > d then d ← depth(g);
end;
end
This code is used in section 651.
657. When we get to the present part of the program, x is the natural width of the box being packaged.
⟨Determine the value of width(r) and the appropriate glue setting; then return or goto
common_ending 657⟩ ≡
if m = additional then w ← x + w;
width(r) ← w; x ← w - x; {now x is the excess to be made up}
if x = 0 then
begin glue_sign(r) ← normal; glue_order(r) ← normal; set_glue_ratio_zero(glue_set(r)); return;
end
else if x > 0 then ⟨Determine horizontal glue stretch setting, then return or goto common_ending 658⟩
else ⟨Determine horizontal glue shrink setting, then return or goto common_ending 664⟩
This code is used in section 649.
658. ⟨Determine horizontal glue stretch setting, then return or goto common_ending 658⟩ ≡
begin ⟨Determine the stretch order 659⟩;
glue_order(r) ← o; glue_sign(r) ← stretching;
if total_stretch[o] ≠ 0 then glue_set(r) ← unfloat(x/total_stretch[o])
else begin glue_sign(r) ← normal; set_glue_ratio_zero(glue_set(r)); {there's nothing to stretch}
end;
if o = normal then
if list_ptr(r) ≠ null then
⟨Report an underfull hbox and goto common_ending, if this box is sufficiently bad 660⟩;
return;
end
This code is used in section 657.
659. ⟨Determine the stretch order 659⟩ ≡
if total_stretch[filll] ≠ 0 then o ← filll
else if total_stretch[fill] ≠ 0 then o ← fill
else if total_stretch[fil] ≠ 0 then o ← fil
else o ← normal
This code is used in sections 658, 673, and 796.
660. ⟨Report an underfull hbox and goto common_ending, if this box is sufficiently bad 660⟩ ≡
begin last_badness ← badness(x, total_stretch[normal]);
if last_badness > hbadness then
begin print_ln;
if last_badness > 100 then print_nl("Underfull") else print_nl("Loose");
print(" \hbox (badness "); print_int(last_badness); goto common_ending;
end;
end
This code is used in section 658.
661. In order to provide a decent indication of where an overfull or underfull box originated, we use a
global variable pack_begin_line that is set nonzero only when hpack is being called by the paragraph builder
or the alignment finishing routine.
⟨Global variables 13⟩ +≡
pack_begin_line: integer; {source file line where the current paragraph or alignment began; a negative
value denotes alignment}
662. ⟨Set initial values of key variables 21⟩ +≡
pack_begin_line ← 0;
663. ⟨Finish issuing a diagnostic message for an overfull or underfull hbox 663⟩ ≡
if output_active then print(") has occurred while \output is active")
else begin if pack_begin_line ≠ 0 then
begin if pack_begin_line > 0 then print(") in paragraph at lines ")
else print(") in alignment at lines ");
print_int(abs(pack_begin_line)); print("--");
end
else print(") detected at line ");
print_int(line);
end;
print_ln;
font_in_short_display ← null_font; short_display(list_ptr(r)); print_ln;
begin_diagnostic; show_box(r); end_diagnostic(true)
This code is used in section 649.
664. ⟨Determine horizontal glue shrink setting, then return or goto common_ending 664⟩ ≡
begin ⟨Determine the shrink order 665⟩;
glue_order(r) ← o; glue_sign(r) ← shrinking;
if total_shrink[o] ≠ 0 then glue_set(r) ← unfloat((-x)/total_shrink[o])
else begin glue_sign(r) ← normal; set_glue_ratio_zero(glue_set(r)); {there's nothing to shrink}
end;
if (total_shrink[o] < -x) ∧ (o = normal) ∧ (list_ptr(r) ≠ null) then
begin last_badness ← 1000000; set_glue_ratio_one(glue_set(r)); {use the maximum shrinkage}
⟨Report an overfull hbox and goto common_ending, if this box is sufficiently bad 666⟩;
end
else if o = normal then
if list_ptr(r) ≠ null then
⟨Report a tight hbox and goto common_ending, if this box is sufficiently bad 667⟩;
return;
end
This code is used in section 657.
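The stretch and shrink branches together decide three things: the glue sign, the glue order, and the glue ratio. The Python sketch below condenses that decision; it is an illustration only. The names glue_setting and highest_order are hypothetical, floating-point numbers replace the glue_ratio type, the nonempty parameter stands in for the test list_ptr(r) ≠ null, and all diagnostics are omitted.

```python
ORDERS = ("normal", "fil", "fill", "filll")


def highest_order(totals):
    """Highest order of infinity with a nonzero total, else 'normal'."""
    for o in reversed(ORDERS):
        if totals[o] != 0:
            return o
    return "normal"


def glue_setting(natural, w, additional, total_stretch, total_shrink,
                 nonempty=True):
    """Return (final_size, glue_sign, glue_order, glue_ratio)."""
    if additional:
        w = natural + w
    x = w - natural                       # the excess to be made up
    if x == 0:
        return w, "normal", "normal", 0.0
    if x > 0:
        o = highest_order(total_stretch)
        if total_stretch[o] != 0:
            return w, "stretching", o, x / total_stretch[o]
        return w, "normal", o, 0.0        # there's nothing to stretch
    o = highest_order(total_shrink)
    sign, ratio = "shrinking", 0.0
    if total_shrink[o] != 0:
        ratio = -x / total_shrink[o]
    else:
        sign = "normal"                   # there's nothing to shrink
    if total_shrink[o] < -x and o == "normal" and nonempty:
        ratio = 1.0                       # overfull: use the maximum shrinkage
    return w, sign, o, ratio


zero = dict.fromkeys(ORDERS, 0)
print(glue_setting(100, 106, False, {**zero, "normal": 9, "fil": 2}, zero))
print(glue_setting(100, -10, True, zero, {**zero, "normal": 20}))
print(glue_setting(100, 70, False, zero, {**zero, "normal": 20}))
```

The last call models an overfull box: 30 units must be removed but only 20 units of finite shrinkability exist, so the ratio is clamped to 1.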
665. ⟨Determine the shrink order 665⟩ ≡
if total_shrink[filll] ≠ 0 then o ← filll
else if total_shrink[fill] ≠ 0 then o ← fill
else if total_shrink[fil] ≠ 0 then o ← fil
else o ← normal
This code is used in sections 664, 676, and 796.
666. ⟨Report an overfull hbox and goto common_ending, if this box is sufficiently bad 666⟩ ≡
if (-x - total_shrink[normal] > hfuzz) ∨ (hbadness < 100) then
begin if (overfull_rule > 0) ∧ (-x - total_shrink[normal] > hfuzz) then
begin while link(q) ≠ null do q ← link(q);
link(q) ← new_rule; width(link(q)) ← overfull_rule;
end;
print_ln; print_nl("Overfull \hbox ("); print_scaled(-x - total_shrink[normal]);
print("pt too wide"); goto common_ending;
end
This code is used in section 664.
667. ⟨Report a tight hbox and goto common_ending, if this box is sufficiently bad 667⟩ ≡
begin last_badness ← badness(-x, total_shrink[normal]);
if last_badness > hbadness then
begin print_ln; print_nl("Tight \hbox (badness "); print_int(last_badness); goto common_ending;
end;
end
This code is used in section 664.
668. The vpack subroutine is actually a special case of a slightly more general routine called vpackage,
which has four parameters. The fourth parameter, which is max_dimen in the case of vpack, specifies the
maximum depth of the page box that is constructed. The depth is first computed by the normal rules; if it
exceeds this limit, the reference point is simply moved down until the limiting depth is attained.
define vpack(#) ≡ vpackage(#, max_dimen) {special case of unconstrained depth}
function vpackage(p: pointer; h: scaled; m: small_number; l: scaled): pointer;
label common_ending, exit;
var r: pointer; {the box node that will be returned}
w, d, x: scaled; {width, depth, and natural height}
s: scaled; {shift amount}
g: pointer; {points to a glue specification}
o: glue_ord; {order of infinity}
begin last_badness ← 0; r ← get_node(box_node_size); type(r) ← vlist_node;
subtype(r) ← min_quarterword; shift_amount(r) ← 0; list_ptr(r) ← p;
w ← 0; ⟨Clear dimensions to zero 650⟩;
while p ≠ null do ⟨Examine node p in the vlist, taking account of its effect on the dimensions of the
new box; then advance p to the next node 669⟩;
width(r) ← w;
if d > l then
begin x ← x + d - l; depth(r) ← l;
end
else depth(r) ← d;
⟨Determine the value of height(r) and the appropriate glue setting; then return or goto
common_ending 672⟩;
common_ending: ⟨Finish issuing a diagnostic message for an overfull or underfull vbox 675⟩;
exit: vpackage ← r;
end;
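Moving the reference point down when the depth exceeds the limit l simply transfers the excess depth into the natural height. A minimal Python sketch of that step follows; limit_depth is a hypothetical name, and the value shown for MAX_DIMEN is an assumption taken from the definition of max_dimen elsewhere in the program.

```python
MAX_DIMEN = 2 ** 30 - 1  # assumed value of max_dimen; vpack passes it as l


def limit_depth(x, d, l):
    """Return (natural_height, box_depth) after applying the depth limit l."""
    if d > l:
        return x + d - l, l   # excess depth is counted as height instead
    return x, d


print(limit_depth(100, 5, 2))           # (103, 2)
print(limit_depth(100, 5, MAX_DIMEN))   # (100, 5): vpack never constrains depth
```

The sum of height and depth is the same in both cases; only the position of the baseline changes.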
669. ⟨Examine node p in the vlist, taking account of its effect on the dimensions of the new box; then
advance p to the next node 669⟩ ≡
begin if is_char_node(p) then confusion("vpack")
else case type(p) of
hlist_node, vlist_node, rule_node, unset_node: ⟨Incorporate box dimensions into the dimensions of the
vbox that will contain it 670⟩;
whatsit_node: ⟨Incorporate a whatsit node into a vbox 1359⟩;
glue_node: ⟨Incorporate glue into the vertical totals 671⟩;
kern_node: begin x ← x + d + width(p); d ← 0;
end;
othercases do_nothing
endcases;
p ← link(p);
end
This code is used in section 668.
670. ⟨Incorporate box dimensions into the dimensions of the vbox that will contain it 670⟩ ≡
begin x ← x + d + height(p); d ← depth(p);
if type(p) ≥ rule_node then s ← 0 else s ← shift_amount(p);
if width(p) + s > w then w ← width(p) + s;
end
This code is used in section 669.
671. ⟨Incorporate glue into the vertical totals 671⟩ ≡
begin x ← x + d; d ← 0;
g ← glue_ptr(p); x ← x + width(g);
o ← stretch_order(g); total_stretch[o] ← total_stretch[o] + stretch(g); o ← shrink_order(g);
total_shrink[o] ← total_shrink[o] + shrink(g);
if subtype(p) ≥ a_leaders then
begin g ← leader_ptr(p);
if width(g) > w then w ← width(g);
end;
end
This code is used in section 669.
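In the vertical case the depth of each box is held back in d and added to the height only when something follows it. The Python sketch below traces that bookkeeping for boxes, glue, and kerns; it is an illustration in which natural_height_and_depth and the tuple encoding of nodes are hypothetical, and stretch, shrink, widths, and shifts are ignored.

```python
def natural_height_and_depth(items):
    """items: ("box", height, depth), ("glue", width) or ("kern", width)."""
    x = d = 0
    for item in items:
        if item[0] == "box":
            _, height, depth = item
            x += d + height   # the previous depth now lies inside the box
            d = depth         # the new depth is pending
        else:
            x += d + item[1]  # glue and kerns absorb the pending depth
            d = 0
    return x, d


vlist = [("box", 10, 2), ("glue", 5), ("box", 8, 3)]
print(natural_height_and_depth(vlist))  # (25, 3)
```

Only the depth of the last box survives as the depth of the vbox; a trailing glue or kern would reset it to zero.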
672. When we get to the present part of the program, x is the natural height of the box being packaged.
⟨Determine the value of height(r) and the appropriate glue setting; then return or goto
common_ending 672⟩ ≡
if m = additional then h ← x + h;
height(r) ← h; x ← h - x; {now x is the excess to be made up}
if x = 0 then
begin glue_sign(r) ← normal; glue_order(r) ← normal; set_glue_ratio_zero(glue_set(r)); return;
end
else if x > 0 then ⟨Determine vertical glue stretch setting, then return or goto common_ending 673⟩
else ⟨Determine vertical glue shrink setting, then return or goto common_ending 676⟩
This code is used in section 668.
673. ⟨Determine vertical glue stretch setting, then return or goto common_ending 673⟩ ≡
begin ⟨Determine the stretch order 659⟩;
glue_order(r) ← o; glue_sign(r) ← stretching;
if total_stretch[o] ≠ 0 then glue_set(r) ← unfloat(x/total_stretch[o])
else begin glue_sign(r) ← normal; set_glue_ratio_zero(glue_set(r)); {there's nothing to stretch}
end;
if o = normal then
if list_ptr(r) ≠ null then
⟨Report an underfull vbox and goto common_ending, if this box is sufficiently bad 674⟩;
return;
end
This code is used in section 672.
674. ⟨Report an underfull vbox and goto common_ending, if this box is sufficiently bad 674⟩ ≡
begin last_badness ← badness(x, total_stretch[normal]);
if last_badness > vbadness then
begin print_ln;
if last_badness > 100 then print_nl("Underfull") else print_nl("Loose");
print(" \vbox (badness "); print_int(last_badness); goto common_ending;
end;
end
This code is used in section 673.
675. ⟨Finish issuing a diagnostic message for an overfull or underfull vbox 675⟩ ≡
if output_active then print(") has occurred while \output is active")
else begin if pack_begin_line ≠ 0 then {it's actually negative}
begin print(") in alignment at lines "); print_int(abs(pack_begin_line)); print("--");
end
else print(") detected at line ");
print_int(line); print_ln;
end;
begin_diagnostic; show_box(r); end_diagnostic(true)
This code is used in section 668.
676. ⟨Determine vertical glue shrink setting, then return or goto common_ending 676⟩ ≡
begin ⟨Determine the shrink order 665⟩;
glue_order(r) ← o; glue_sign(r) ← shrinking;
if total_shrink[o] ≠ 0 then glue_set(r) ← unfloat((-x)/total_shrink[o])
else begin glue_sign(r) ← normal; set_glue_ratio_zero(glue_set(r)); {there's nothing to shrink}
end;
if (total_shrink[o] < -x) ∧ (o = normal) ∧ (list_ptr(r) ≠ null) then
begin last_badness ← 1000000; set_glue_ratio_one(glue_set(r)); {use the maximum shrinkage}
⟨Report an overfull vbox and goto common_ending, if this box is sufficiently bad 677⟩;
end
else if o = normal then
if list_ptr(r) ≠ null then
⟨Report a tight vbox and goto common_ending, if this box is sufficiently bad 678⟩;
return;
end
This code is used in section 672.
677. ⟨Report an overfull vbox and goto common_ending, if this box is sufficiently bad 677⟩ ≡
if (-x - total_shrink[normal] > vfuzz) ∨ (vbadness < 100) then
begin print_ln; print_nl("Overfull \vbox ("); print_scaled(-x - total_shrink[normal]);
print("pt too high"); goto common_ending;
end
This code is used in section 676.
678. ⟨Report a tight vbox and goto common_ending, if this box is sufficiently bad 678⟩ ≡
begin last_badness ← badness(-x, total_shrink[normal]);
if last_badness > vbadness then
begin print_ln; print_nl("Tight \vbox (badness "); print_int(last_badness); goto common_ending;
end;
end
This code is used in section 676.
679. When a box is being appended to the current vertical list, the baselineskip calculation is handled by
the append_to_vlist routine.
procedure append_to_vlist(b: pointer);
var d: scaled; {deficiency of space between baselines}
p: pointer; {a new glue node}
begin if prev_depth > ignore_depth then
begin d ← width(baseline_skip) - prev_depth - height(b);
if d < line_skip_limit then p ← new_param_glue(line_skip_code)
else begin p ← new_skip_param(baseline_skip_code); width(temp_ptr) ← d; {temp_ptr = glue_ptr(p)}
end;
link(tail) ← p; tail ← p;
end;
link(tail) ← b; tail ← b; prev_depth ← depth(b);
end;
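The choice between baselineskip and lineskip glue made by append_to_vlist can be sketched as follows. This Python fragment is illustrative: interline_glue is a hypothetical name, dimensions are given in points as plain numbers, only the natural width of the glue is modeled, and the sentinel of -1000pt is assumed from the definition of ignore_depth elsewhere in the program.

```python
IGNORE_DEPTH = -1000.0  # assumed sentinel (in pt) meaning "insert no interline glue"


def interline_glue(prev_depth, height_b, baseline_skip, line_skip_limit,
                   line_skip):
    """Return None, ("lineskip", width) or ("baselineskip", width)."""
    if prev_depth <= IGNORE_DEPTH:
        return None
    d = baseline_skip - prev_depth - height_b   # space needed between the boxes
    if d < line_skip_limit:
        return ("lineskip", line_skip)          # boxes would come too close
    return ("baselineskip", d)                  # baselines end up baseline_skip apart


print(interline_glue(2, 7, 12, 0, 1))    # ('baselineskip', 3)
print(interline_glue(2, 11, 12, 0, 1))   # ('lineskip', 1)
print(interline_glue(IGNORE_DEPTH, 7, 12, 0, 1))  # None
```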
680. Data structures for math mode. When TeX reads a formula that is enclosed between $'s, it
constructs an mlist, which is essentially a tree structure representing that formula. An mlist is a linear
sequence of items, but we can regard it as a tree structure because mlists can appear within mlists. For
example, many of the entries can be subscripted or superscripted, and such "scripts" are mlists in their own
right.
An entire formula is parsed into such a tree before any of the actual typesetting is done, because the
current style of type is usually not known until the formula has been fully scanned. For example, when the
formula '$a+b \over c+d$' is being read, there is no way to tell that 'a+b' will be in script size until '\over'
has appeared.
During the scanning process, each element of the mlist being built is classified as a relation, a binary
operator, an open parenthesis, etc., or as a construct like '\sqrt' that must be built up. This classification
appears in the mlist data structure.
After a formula has been fully scanned, the mlist is converted to an hlist so that it can be incorporated
into the surrounding text. This conversion is controlled by a recursive procedure that decides all of the
appropriate styles by a "top-down" process starting at the outermost level and working in towards the
subformulas. The formula is ultimately pasted together using combinations of horizontal and vertical boxes,
with glue and penalty nodes inserted as necessary.
An mlist is represented internally as a linked list consisting chiefly of "noads" (pronounced "no-adds"), to
distinguish them from the somewhat similar "nodes" in hlists and vlists. Certain kinds of ordinary nodes
are allowed to appear in mlists together with the noads; TeX tells the difference by means of the type field,
since a noad's type is always greater than that of a node. An mlist does not contain character nodes, hlist
nodes, vlist nodes, math nodes, ligature nodes, or unset nodes; in particular, each mlist item appears in the
variable-size part of mem, so the type field is always present.
681. Each noad is four or more words long. The first word contains the type and subtype and link fields
that are already so familiar to us; the second, third, and fourth words are called the noad's nucleus, subscr,
and supscr fields.
Consider, for example, the simple formula '$x^2$', which would be parsed into an mlist containing a single
element called an ord_noad. The nucleus of this noad is a representation of 'x', the subscr is empty, and the
supscr is a representation of '2'.
The nucleus, subscr, and supscr fields are further broken into subfields. If p points to a noad, and if q is
one of its principal fields (e.g., q = subscr(p)), there are several possibilities for the subfields, depending on
the math_type of q.
math_type(q) = math_char means that fam(q) refers to one of the sixteen font families, and character(q) is
the number of a character within a font of that family, as in a character node.
math_type(q) = math_text_char is similar, but the character is unsubscripted and unsuperscripted and it is
followed immediately by another character from the same font. (This math_type setting appears only
briefly during the processing; it is used to suppress unwanted italic corrections.)
math_type(q) = empty indicates a field with no value (the corresponding attribute of noad p is not present).
math_type(q) = sub_box means that info(q) points to a box node (either an hlist_node or a vlist_node) that
should be used as the value of the field. The shift_amount in the subsidiary box node is the amount
by which that box will be shifted downward.
math_type(q) = sub_mlist means that info(q) points to an mlist; the mlist must be converted to an hlist in
order to obtain the value of this field.
In the latter case, we might have info(q) = null. This is not the same as math_type(q) = empty; for example,
'$P_{}$' and '$P$' produce different results (the former will not have the "italic correction" added to the
width of P, but the "script skip" will be added).
The definitions of subfields given here are evidently wasteful of space, since a halfword is being used for
the math_type although only three bits would be needed. However, there are hardly ever many noads present
at once, since they are soon converted to nodes that take up even more space, so we can afford to represent
them in whatever way simplifies the programming.
define noad_size = 4 {number of words in a normal noad}
define nucleus(#) ≡ # + 1 {the nucleus field of a noad}
define supscr(#) ≡ # + 2 {the supscr field of a noad}
define subscr(#) ≡ # + 3 {the subscr field of a noad}
define math_type ≡ link {a halfword in mem}
define fam ≡ font {a quarterword in mem}
define math_char = 1 {math_type when the attribute is simple}
define sub_box = 2 {math_type when the attribute is a box}
define sub_mlist = 3 {math_type when the attribute is a formula}
define math_text_char = 4 {math_type when italic correction is dubious}
682. Each portion of a formula is classified as Ord, Op, Bin, Rel, Ope, Clo, Pun, or Inn, for purposes of
spacing and line breaking. An ord_noad, op_noad, bin_noad, rel_noad, open_noad, close_noad, punct_noad,
or inner_noad is used to represent portions of the various types. For example, an '=' sign in a formula leads
to the creation of a rel_noad whose nucleus field is a representation of an equals sign (usually fam = 0,
character = '75, an octal constant). A formula preceded by \mathrel also results in a rel_noad. When a
rel_noad is followed by an op_noad, say, and possibly separated by one or more ordinary nodes (not noads),
TeX will insert a penalty node (with the current rel_penalty) just after the formula that corresponds to the
rel_noad, unless there already was a penalty immediately following; and a "thick space" will be inserted just
before the formula that corresponds to the op_noad.
A noad of type ord_noad, op_noad, ..., inner_noad usually has a subtype = normal. The only exception
is that an op_noad might have subtype = limits or no_limits, if the normal positioning of limits has been
overridden for this operator.
define ord_noad = unset_node + 3 {type of a noad classified Ord}
define op_noad = ord_noad + 1 {type of a noad classified Op}
define bin_noad = ord_noad + 2 {type of a noad classified Bin}
define rel_noad = ord_noad + 3 {type of a noad classified Rel}
define open_noad = ord_noad + 4 {type of a noad classified Ope}
define close_noad = ord_noad + 5 {type of a noad classified Clo}
define punct_noad = ord_noad + 6 {type of a noad classified Pun}
define inner_noad = ord_noad + 7 {type of a noad classified Inn}
define limits = 1 {subtype of op_noad whose scripts are to be above, below}
define no_limits = 2 {subtype of op_noad whose scripts are to be normal}
683. A radical_noad is five words long; the fifth word is the left_delimiter field, which usually represents a
square root sign.
A fraction_noad is six words long; it has a right_delimiter field as well as a left_delimiter.
Delimiter fields are of type four_quarters, and they have four subfields called small_fam, small_char,
large_fam, large_char. These subfields represent variable-size delimiters by giving the "small" and "large"
starting characters, as explained in Chapter 17 of The TeXbook.
A fraction_noad is actually quite different from all other noads. Not only does it have six words, it has
thickness, denominator, and numerator fields instead of nucleus, subscr, and supscr. The thickness is a
scaled value that tells how thick to make a fraction rule; however, the special value default_code is used to
stand for the default_rule_thickness of the current size. The numerator and denominator point to mlists
that define a fraction; we always have
math_type(numerator) = math_type(denominator) = sub_mlist.
The left_delimiter and right_delimiter fields specify delimiters that will be placed at the left and right of
the fraction. In this way, a fraction_noad is able to represent all of TeX's operators \over, \atop, \above,
\overwithdelims, \atopwithdelims, and \abovewithdelims.
define left_delimiter(#) ≡ # + 4 {first delimiter field of a noad}
define right_delimiter(#) ≡ # + 5 {second delimiter field of a fraction noad}
define radical_noad = inner_noad + 1 {type of a noad for square roots}
define radical_noad_size = 5 {number of mem words in a radical noad}
define fraction_noad = radical_noad + 1 {type of a noad for generalized fractions}
define fraction_noad_size = 6 {number of mem words in a fraction noad}
define small_fam(#) ≡ mem[#].qqqq.b0 {fam for "small" delimiter}
define small_char(#) ≡ mem[#].qqqq.b1 {character for "small" delimiter}
define large_fam(#) ≡ mem[#].qqqq.b2 {fam for "large" delimiter}
define large_char(#) ≡ mem[#].qqqq.b3 {character for "large" delimiter}
define thickness ≡ width {thickness field in a fraction noad}
define default_code ≡ '10000000000 {denotes default_rule_thickness}
define numerator ≡ supscr {numerator field in a fraction noad}
define denominator ≡ subscr {denominator field in a fraction noad}
684. The global variable empty_field is set up for initialization of empty fields in new noads. Similarly,
null_delimiter is for the initialization of delimiter fields.
⟨Global variables 13⟩ +≡
empty_field: two_halves;
null_delimiter: four_quarters;
685. ⟨Set initial values of key variables 21⟩ +≡
empty_field.rh ← empty; empty_field.lh ← null;
null_delimiter.b0 ← 0; null_delimiter.b1 ← min_quarterword;
null_delimiter.b2 ← 0; null_delimiter.b3 ← min_quarterword;
686. The new_noad function creates an ord_noad that is completely null.
function new_noad: pointer;
var p: pointer;
begin p ← get_node(noad_size); type(p) ← ord_noad; subtype(p) ← normal;
mem[nucleus(p)].hh ← empty_field; mem[subscr(p)].hh ← empty_field;
mem[supscr(p)].hh ← empty_field; new_noad ← p;
end;
687. A few more kinds of noads will complete the set: An under_noad has its nucleus underlined; an
over_noad has it overlined. An accent_noad places an accent over its nucleus; the accent character appears
as fam(accent_chr(p)) and character(accent_chr(p)). A vcenter_noad centers its nucleus vertically with
respect to the axis of the formula; in such noads we always have math_type(nucleus(p)) = sub_box.
And finally, we have left_noad and right_noad types, to implement TeX's \left and \right. The nucleus
of such noads is replaced by a delimiter field; thus, for example, '\left(' produces a left_noad such that
delimiter(p) holds the family and character codes for all left parentheses. A left_noad never appears in an
mlist except as the first element, and a right_noad never appears in an mlist except as the last element;
furthermore, we either have both a left_noad and a right_noad, or neither one is present. The subscr and
supscr fields are always empty in a left_noad and a right_noad.
define under_noad = fraction_noad + 1 {type of a noad for underlining}
define over_noad = under_noad + 1 {type of a noad for overlining}
define accent_noad = over_noad + 1 {type of a noad for accented subformulas}
define accent_noad_size = 5 {number of mem words in an accent noad}
define accent_chr(#) ≡ # + 4 {the accent_chr field of an accent noad}
define vcenter_noad = accent_noad + 1 {type of a noad for \vcenter}
define left_noad = vcenter_noad + 1 {type of a noad for \left}
define right_noad = left_noad + 1 {type of a noad for \right}
define delimiter ≡ nucleus {delimiter field in left and right noads}
define scripts_allowed(#) ≡ (type(#) ≥ ord_noad) ∧ (type(#) < left_noad)
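Because the noad types are defined as consecutive offsets, the whole numbering follows from a single base value. The Python sketch below tabulates it; this is an illustration in which the value 13 for unset_node is an assumption taken from the node definitions earlier in the program, and NOAD_TYPE is a hypothetical lookup table.

```python
UNSET_NODE = 13  # assumed: defined in an earlier part of the program

ORD_NOAD = UNSET_NODE + 3
NOAD_NAMES = ["ord", "op", "bin", "rel", "open", "close", "punct", "inner",
              "radical", "fraction", "under", "over", "accent", "vcenter",
              "left", "right"]
NOAD_TYPE = {name: ORD_NOAD + i for i, name in enumerate(NOAD_NAMES)}


def scripts_allowed(node_type):
    """Mirror of the macro: true for ord_noad up to, but excluding, left_noad."""
    return NOAD_TYPE["ord"] <= node_type < NOAD_TYPE["left"]


for name in NOAD_NAMES:
    print(name, NOAD_TYPE[name], scripts_allowed(NOAD_TYPE[name]))
```

Every noad type exceeds every ordinary node type, which is what lets a single comparison tell noads from nodes, and only left and right noads are excluded from carrying scripts.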
688. Math formulas can also contain instructions like \textstyle that override TeX's normal style rules.
A style_node is inserted into the data structure to record such instructions; it is three words long, so it
is considered a node instead of a noad. The subtype is either display_style or text_style or script_style or
script_script_style. The second and third words of a style_node are not used, but they are present because a
choice_node is converted to a style_node.
TeX uses even numbers 0, 2, 4, 6 to encode the basic styles display_style, ..., script_script_style, and
adds 1 to get the “cramped” versions of these styles. This gives a numerical order that is backwards from
the convention of Appendix G in The TeXbook; i.e., a smaller style has a larger numerical value.
define style_node = unset_node + 1 {type of a style node}
define style_node_size = 3 {number of words in a style node}
define display_style = 0 {subtype for \displaystyle}
define text_style = 2 {subtype for \textstyle}
define script_style = 4 {subtype for \scriptstyle}
define script_script_style = 6 {subtype for \scriptscriptstyle}
define cramped = 1 {add this to an uncramped style if you want to cramp it}
function new_style(s: small_number): pointer; {create a style node}
var p: pointer; {the new node}
begin p ← get_node(style_node_size); type(p) ← style_node; subtype(p) ← s; width(p) ← 0;
depth(p) ← 0; {the width and depth are not used}
new_style ← p;
end;
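The style encoding just described can be illustrated outside the program. The following Python sketch is not part of TeX; the function name describe_style is hypothetical, and only the numeric constants come from the definitions above. It decodes a style code into its basic style and its cramped flag.

```python
# Constants mirror the definitions above; everything else is illustrative.
DISPLAY_STYLE, TEXT_STYLE, SCRIPT_STYLE, SCRIPT_SCRIPT_STYLE = 0, 2, 4, 6
CRAMPED = 1

STYLE_NAMES = ["displaystyle", "textstyle", "scriptstyle", "scriptscriptstyle"]


def describe_style(code):
    """Return (basic style name, is_cramped) for a style code in 0..7."""
    if not 0 <= code <= 7:
        raise ValueError("style code out of range")
    # The basic style is code div 2; the low bit is the cramped flag.
    return STYLE_NAMES[code // 2], code % 2 == CRAMPED
```

Because the cramped flag is the low-order bit, a larger code always denotes a style that is no larger in size, which is the "backwards" ordering mentioned above.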
689. Finally, the \mathchoice primitive creates a choice_node, which has special subfields display_mlist,
text_mlist, script_mlist, and script_script_mlist pointing to the mlists for each style.
define choice_node = unset_node + 2 {type of a choice node}
define display_mlist(#) ≡ info(# + 1) {mlist to be used in display style}
define text_mlist(#) ≡ link(# + 1) {mlist to be used in text style}
define script_mlist(#) ≡ info(# + 2) {mlist to be used in script style}
define script_script_mlist(#) ≡ link(# + 2) {mlist to be used in scriptscript style}
function new_choice: pointer; {create a choice node}
var p: pointer; {the new node}
begin p ← get_node(style_node_size); type(p) ← choice_node; subtype(p) ← 0;
{the subtype is not used}
display_mlist(p) ← null; text_mlist(p) ← null; script_mlist(p) ← null; script_script_mlist(p) ← null;
new_choice ← p;
end;
690. Let's consider now the previously unwritten part of show_node_list that displays the things that can
only be present in mlists; this program illustrates how to access the data structures just defined.
In the context of the following program, p points to a node or noad that should be displayed, and the
current string contains the “recursion history” that leads to this point. The recursion history consists of a
dot for each outer level in which p is subsidiary to some node, or in which p is subsidiary to the nucleus
field of some noad; the dot is replaced by ‘_’ or ‘^’ or ‘/’ or ‘\’ if p is descended from the subscr or supscr
or denominator or numerator fields of noads. For example, the current string would be ‘.^._/’ if p points
to the ord_noad for x in the (ridiculous) formula ‘$\sqrt{a^{\mathinner{b_{c\over x+y}}}}$’.
⟨ Cases of show_node_list that arise in mlists only 690 ⟩ ≡
style_node: print_style(subtype(p));
choice_node: ⟨ Display choice node p 695 ⟩;
ord_noad, op_noad, bin_noad, rel_noad, open_noad, close_noad, punct_noad,
inner_noad, radical_noad, over_noad, under_noad, vcenter_noad, accent_noad, left_noad, right_noad:
⟨ Display normal noad p 696 ⟩;
fraction_noad: ⟨ Display fraction noad p 697 ⟩;
This code is used in section 183.
691. Here are some simple routines used in the display of noads.
⟨ Declare procedures needed for displaying the elements of mlists 691 ⟩ ≡
procedure print_fam_and_char(p: pointer); {prints family and character}
begin print_esc("fam"); print_int(fam(p)); print_char(" "); print_ASCII(qo(character(p)));
end;
procedure print_delimiter(p: pointer); {prints a delimiter as 24-bit hex value}
var a: integer; {accumulator}
begin a ← small_fam(p) * 256 + qo(small_char(p));
a ← a * "1000 + large_fam(p) * 256 + qo(large_char(p));
if a < 0 then print_int(a) {this should never happen}
else print_hex(a);
end;
See also sections 692 and 694.
This code is used in section 179.
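The arithmetic in print_delimiter packs four fields into one 24-bit value: the small family and character occupy the upper 12 bits, and the large family and character the lower 12 bits (the multiplier is hexadecimal 1000). A minimal Python sketch of that packing follows; the name pack_delimiter and the sample field values are illustrative assumptions, not part of the program.

```python
def pack_delimiter(small_fam, small_char, large_fam, large_char):
    """Combine the four delimiter fields the way print_delimiter does."""
    a = small_fam * 256 + small_char                 # upper 12 bits
    a = a * 0x1000 + large_fam * 256 + large_char    # shift left 12 bits, add lower 12
    return a


# Sample fields, chosen only for illustration: family 0, character hex 28
# as the small variant; family 3, character hex 00 as the large variant.
packed = pack_delimiter(0, 0x28, 3, 0x00)
hex_digits = format(packed, "X")
```

With families below 16 and character codes below 256 the result is never negative, which is why the a < 0 branch above "should never happen".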
692. The next subroutine will descend to another level of recursion when a subsidiary mlist needs to be
displayed. The parameter c indicates what character is to become part of the recursion history. An empty
mlist is distinguished from a field with math_type(p) = empty, because these are not equivalent (as explained
above).
⟨ Declare procedures needed for displaying the elements of mlists 691 ⟩ +≡
procedure show_info; forward; {show_node_list(info(temp_ptr))}
procedure print_subsidiary_data(p: pointer; c: ASCII_code); {display a noad field}
begin if cur_length ≥ depth_threshold then
begin if math_type(p) ≠ empty then print(" []");
end
else begin append_char(c); {include c in the recursion history}
temp_ptr ← p; {prepare for show_info if recursion is needed}
case math_type(p) of
math_char: begin print_ln; print_current_string; print_fam_and_char(p);
end;
sub_box: show_info; {recursive call}
sub_mlist: if info(p) = null then
begin print_ln; print_current_string; print("{}");
end
else show_info; {recursive call}
othercases do_nothing {empty}
endcases;
flush_char; {remove c from the recursion history}
end;
end;
693. The inelegant introduction of show_info in the code above seems better than the alternative of using
Pascal's strange forward declaration for a procedure with parameters. The Pascal convention about dropping
parameters from a post-forward procedure is, frankly, so intolerable to the author of TeX that he would
rather stoop to communication via a global temporary variable. (A similar stoopidity occurred with respect
to hlist_out and vlist_out above, and it will occur with respect to mlist_to_hlist below.)
procedure show_info; {the reader will kindly forgive this}
begin show_node_list(info(temp_ptr));
end;
694. ⟨ Declare procedures needed for displaying the elements of mlists 691 ⟩ +≡
procedure print_style(c: integer);
begin case c div 2 of
0: print_esc("displaystyle"); {display_style = 0}
1: print_esc("textstyle"); {text_style = 2}
2: print_esc("scriptstyle"); {script_style = 4}
3: print_esc("scriptscriptstyle"); {script_script_style = 6}
othercases print("Unknown style!")
endcases;
end;
695. ⟨ Display choice node p 695 ⟩ ≡
begin print_esc("mathchoice"); append_char("D"); show_node_list(display_mlist(p)); flush_char;
append_char("T"); show_node_list(text_mlist(p)); flush_char; append_char("S");
show_node_list(script_mlist(p)); flush_char; append_char("s"); show_node_list(script_script_mlist(p));
flush_char;
end
This code is used in section 690.
696. ⟨ Display normal noad p 696 ⟩ ≡
begin case type(p) of
ord_noad: print_esc("mathord");
op_noad: print_esc("mathop");
bin_noad: print_esc("mathbin");
rel_noad: print_esc("mathrel");
open_noad: print_esc("mathopen");
close_noad: print_esc("mathclose");
punct_noad: print_esc("mathpunct");
inner_noad: print_esc("mathinner");
over_noad: print_esc("overline");
under_noad: print_esc("underline");
vcenter_noad: print_esc("vcenter");
radical_noad: begin print_esc("radical"); print_delimiter(left_delimiter(p));
end;
accent_noad: begin print_esc("accent"); print_fam_and_char(accent_chr(p));
end;
left_noad: begin print_esc("left"); print_delimiter(delimiter(p));
end;
right_noad: begin print_esc("right"); print_delimiter(delimiter(p));
end;
end;
if subtype(p) ≠ normal then
if subtype(p) = limits then print_esc("limits")
else print_esc("nolimits");
if type(p) < left_noad then print_subsidiary_data(nucleus(p), ".");
print_subsidiary_data(supscr(p), "^"); print_subsidiary_data(subscr(p), "_");
end
This code is used in section 690.
697. ⟨ Display fraction noad p 697 ⟩ ≡
begin print_esc("fraction, thickness ");
if thickness(p) = default_code then print("= default")
else print_scaled(thickness(p));
if (small_fam(left_delimiter(p)) ≠ 0) ∨ (small_char(left_delimiter(p)) ≠ min_quarterword) ∨
(large_fam(left_delimiter(p)) ≠ 0) ∨ (large_char(left_delimiter(p)) ≠ min_quarterword) then
begin print(", left-delimiter "); print_delimiter(left_delimiter(p));
end;
if (small_fam(right_delimiter(p)) ≠ 0) ∨ (small_char(right_delimiter(p)) ≠ min_quarterword) ∨
(large_fam(right_delimiter(p)) ≠ 0) ∨ (large_char(right_delimiter(p)) ≠ min_quarterword) then
begin print(", right-delimiter "); print_delimiter(right_delimiter(p));
end;
print_subsidiary_data(numerator(p), "\"); print_subsidiary_data(denominator(p), "/");
end
This code is used in section 690.
698. That which can be displayed can also be destroyed.
⟨ Cases of flush_node_list that arise in mlists only 698 ⟩ ≡
style_node: begin free_node(p, style_node_size); goto done;
end;
choice_node: begin flush_node_list(display_mlist(p)); flush_node_list(text_mlist(p));
flush_node_list(script_mlist(p)); flush_node_list(script_script_mlist(p)); free_node(p, style_node_size);
goto done;
end;
ord_noad, op_noad, bin_noad, rel_noad, open_noad, close_noad, punct_noad, inner_noad, radical_noad,
over_noad, under_noad, vcenter_noad, accent_noad:
begin if math_type(nucleus(p)) ≥ sub_box then flush_node_list(info(nucleus(p)));
if math_type(supscr(p)) ≥ sub_box then flush_node_list(info(supscr(p)));
if math_type(subscr(p)) ≥ sub_box then flush_node_list(info(subscr(p)));
if type(p) = radical_noad then free_node(p, radical_noad_size)
else if type(p) = accent_noad then free_node(p, accent_noad_size)
else free_node(p, noad_size);
goto done;
end;
left_noad, right_noad: begin free_node(p, noad_size); goto done;
end;
fraction_noad: begin flush_node_list(info(numerator(p))); flush_node_list(info(denominator(p)));
free_node(p, fraction_noad_size); goto done;
end;
This code is used in section 202.
699. Subroutines for math mode. In order to convert mlists to hlists, i.e., noads to nodes, we need
several subroutines that are conveniently dealt with now.
Let us first introduce the macros that make it easy to get at the parameters and other font information. A
size code, which is a multiple of 16, is added to a family number to get an index into the table of internal font
numbers for each combination of family and size. (Be alert: Size codes get larger as the type gets smaller.)
define text_size = 0 {size code for the largest size in a family}
define script_size = 16 {size code for the medium size in a family}
define script_script_size = 32 {size code for the smallest size in a family}
⟨ Basic printing procedures 57 ⟩ +≡
procedure print_size(s: integer);
begin if s = text_size then print_esc("textfont")
else if s = script_size then print_esc("scriptfont")
else print_esc("scriptscriptfont");
end;
700. Before an mlist is converted to an hlist, TeX makes sure that the fonts in family 2 have enough
parameters to be math-symbol fonts, and that the fonts in family 3 have enough parameters to be
math-extension fonts. The math-symbol parameters are referred to by using the following macros, which take a
size code as their parameter; for example, num1(cur_size) gives the value of the num1 parameter for the
current size.
define mathsy_end(#) ≡ fam_fnt(2 + #) ] ] .sc
define mathsy(#) ≡ font_info [ # + param_base [ mathsy_end
define math_x_height ≡ mathsy(5) {height of ‘x’}
define math_quad ≡ mathsy(6) {18mu}
define num1 ≡ mathsy(8) {numerator shift-up in display styles}
define num2 ≡ mathsy(9) {numerator shift-up in non-display, non-\atop}
define num3 ≡ mathsy(10) {numerator shift-up in non-display \atop}
define denom1 ≡ mathsy(11) {denominator shift-down in display styles}
define denom2 ≡ mathsy(12) {denominator shift-down in non-display styles}
define sup1 ≡ mathsy(13) {superscript shift-up in uncramped display style}
define sup2 ≡ mathsy(14) {superscript shift-up in uncramped non-display}
define sup3 ≡ mathsy(15) {superscript shift-up in cramped styles}
define sub1 ≡ mathsy(16) {subscript shift-down if superscript is absent}
define sub2 ≡ mathsy(17) {subscript shift-down if superscript is present}
define sup_drop ≡ mathsy(18) {superscript baseline below top of large box}
define sub_drop ≡ mathsy(19) {subscript baseline below bottom of large box}
define delim1 ≡ mathsy(20) {size of \atopwithdelims delimiters in display styles}
define delim2 ≡ mathsy(21) {size of \atopwithdelims delimiters in non-displays}
define axis_height ≡ mathsy(22) {height of fraction lines above the baseline}
define total_mathsy_params = 22
701. The math-extension parameters have similar macros, but the size code is omitted (since it is always
cur_size when we refer to such parameters).
define mathex(#) ≡ font_info[# + param_base[fam_fnt(3 + cur_size)]].sc
define default_rule_thickness ≡ mathex(8) {thickness of \over bars}
define big_op_spacing1 ≡ mathex(9) {minimum clearance above a displayed op}
define big_op_spacing2 ≡ mathex(10) {minimum clearance below a displayed op}
define big_op_spacing3 ≡ mathex(11) {minimum baselineskip above displayed op}
define big_op_spacing4 ≡ mathex(12) {minimum baselineskip below displayed op}
define big_op_spacing5 ≡ mathex(13) {padding above and below displayed limits}
define total_mathex_params = 13
702. We also need to compute the change in style between mlists and their subsidiaries. The following
macros define the subsidiary style for an overlined nucleus (cramped_style), for a subscript or a superscript
(sub_style or sup_style), or for a numerator or denominator (num_style or denom_style).
define cramped_style(#) ≡ 2 * (# div 2) + cramped {cramp the style}
define sub_style(#) ≡ 2 * (# div 4) + script_style + cramped {smaller and cramped}
define sup_style(#) ≡ 2 * (# div 4) + script_style + (# mod 2) {smaller}
define num_style(#) ≡ # + 2 − 2 * (# div 6) {smaller unless already script-script}
define denom_style(#) ≡ 2 * (# div 2) + cramped + 2 − 2 * (# div 6) {smaller, cramped}
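The five style macros are pure integer arithmetic on the codes 0 to 7, so they can be checked by direct transcription. The Python sketch below mirrors the formulas; Python's // agrees with Pascal's div here because all operands are nonnegative. The lowercase function names are chosen for this illustration only.

```python
SCRIPT_STYLE = 4
CRAMPED = 1


def cramped_style(s):
    return 2 * (s // 2) + CRAMPED                     # same size, cramped


def sub_style(s):
    return 2 * (s // 4) + SCRIPT_STYLE + CRAMPED      # smaller and cramped


def sup_style(s):
    return 2 * (s // 4) + SCRIPT_STYLE + (s % 2)      # smaller, keeps cramping


def num_style(s):
    return s + 2 - 2 * (s // 6)                       # smaller unless script-script


def denom_style(s):
    return 2 * (s // 2) + CRAMPED + 2 - 2 * (s // 6)  # smaller, cramped


# Map every style code to its five subsidiary styles.
table = {s: (cramped_style(s), sub_style(s), sup_style(s),
             num_style(s), denom_style(s)) for s in range(8)}
```

For instance, a superscript in display style (0) is set in script style (4), while a numerator in display style is set in text style (2).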
703. When the style changes, the following piece of program computes associated information:
⟨ Set up the values of cur_size and cur_mu, based on cur_style 703 ⟩ ≡
begin if cur_style < script_style then cur_size ← text_size
else cur_size ← 16 * ((cur_style − text_style) div 2);
cur_mu ← x_over_n(math_quad(cur_size), 18);
end
This code is used in sections 720, 726, 730, 754, 760, and 763.
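The mapping from style code to size code, and the derivation of the math unit as one eighteenth of the quad, can be sketched in Python as follows. This is an illustration under assumptions: the function names are hypothetical, dimensions are integers in scaled points, and mu_for_quad handles only a nonnegative quad (x_over_n in the program also deals with negative values and keeps a remainder).

```python
TEXT_STYLE, SCRIPT_STYLE = 2, 4
TEXT_SIZE, SCRIPT_SIZE, SCRIPT_SCRIPT_SIZE = 0, 16, 32


def size_for_style(style):
    """Size code (0, 16, or 32) for a style code in 0..7."""
    if style < SCRIPT_STYLE:
        return TEXT_SIZE
    return 16 * ((style - TEXT_STYLE) // 2)


def mu_for_quad(math_quad):
    """One math unit: the quad divided by 18, truncated (nonnegative input assumed)."""
    return math_quad // 18


sizes = [size_for_style(s) for s in range(8)]
```

Display and text styles share the text size, so only the two script levels select the smaller fonts of a family.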
704. Here is a function that returns a pointer to a rule node having a given thickness t. The rule will
extend horizontally to the boundary of the vlist that eventually contains it.
function fraction_rule(t: scaled): pointer; {construct the bar for a fraction}
var p: pointer; {the new node}
begin p ← new_rule; height(p) ← t; depth(p) ← 0; fraction_rule ← p;
end;
705. The overbar function returns a pointer to a vlist box that consists of a given box b, above which has
been placed a kern of height k under a fraction rule of thickness t under additional space of height t.
function overbar(b: pointer; k, t: scaled): pointer;
var p, q: pointer; {nodes being constructed}
begin p ← new_kern(k); link(p) ← b; q ← fraction_rule(t); link(q) ← p; p ← new_kern(t); link(p) ← q;
overbar ← vpack(p, natural);
end;
706. The var_delimiter function, which finds or constructs a sufficiently large delimiter, is the most
interesting of the auxiliary functions that currently concern us. Given a pointer d to a delimiter field in
some noad, together with a size code s and a vertical distance v, this function returns a pointer to a box that
contains the smallest variant of d whose height plus depth is v or more. (And if no variant is large enough,
it returns the largest available variant.) In particular, this routine will construct arbitrarily large delimiters
from extensible components, if d leads to such characters.
The value returned is a box whose shift_amount has been set so that the box is vertically centered with
respect to the axis in the given size. If a built-up symbol is returned, the height of the box before shifting
will be the height of its topmost component.
⟨ Declare subprocedures for var_delimiter 709 ⟩
function var_delimiter(d: pointer; s: small_number; v: scaled): pointer;
label found, continue;
var b: pointer; {the box that will be constructed}
f, g: internal_font_number; {best-so-far and tentative font codes}
c, x, y: quarterword; {best-so-far and tentative character codes}
m, n: integer; {the number of extensible pieces}
u: scaled; {height-plus-depth of a tentative character}
w: scaled; {largest height-plus-depth so far}
q: four_quarters; {character info}
hd: eight_bits; {height-depth byte}
r: four_quarters; {extensible pieces}
z: small_number; {runs through font family members}
large_attempt: boolean; {are we trying the “large” variant?}
begin f ← null_font; w ← 0; large_attempt ← false; z ← small_fam(d); x ← small_char(d);
loop begin ⟨ Look at the variants of (z, x); set f and c whenever a better character is found; goto
found as soon as a large enough variant is encountered 707 ⟩;
if large_attempt then goto found; {there were none large enough}
large_attempt ← true; z ← large_fam(d); x ← large_char(d);
end;
found: if f ≠ null_font then ⟨ Make variable b point to a box for (f, c) 710 ⟩
else begin b ← new_null_box; width(b) ← null_delimiter_space;
{use this width if no delimiter was found}
end;
shift_amount(b) ← half(height(b) − depth(b)) − axis_height(s); var_delimiter ← b;
end;
707. The search process is complicated slightly by the facts that some of the characters might not be
present in some of the fonts, and they might not be probed in increasing order of height.
⟨ Look at the variants of (z, x); set f and c whenever a better character is found; goto found as soon as a
large enough variant is encountered 707 ⟩ ≡
if (z ≠ 0) ∨ (x ≠ min_quarterword) then
begin z ← z + s + 16;
repeat z ← z − 16; g ← fam_fnt(z);
if g ≠ null_font then ⟨ Look at the list of characters starting with x in font g; set f and c whenever
a better character is found; goto found as soon as a large enough variant is encountered 708 ⟩;
until z < 16;
end
This code is used in section 706.
708. ⟨ Look at the list of characters starting with x in font g; set f and c whenever a better character is
found; goto found as soon as a large enough variant is encountered 708 ⟩ ≡
begin y ← x;
if (qo(y) ≥ font_bc[g]) ∧ (qo(y) ≤ font_ec[g]) then
begin continue: q ← char_info(g)(y);
if char_exists(q) then
begin if char_tag(q) = ext_tag then
begin f ← g; c ← y; goto found;
end;
hd ← height_depth(q); u ← char_height(g)(hd) + char_depth(g)(hd);
if u > w then
begin f ← g; c ← y; w ← u;
if u ≥ v then goto found;
end;
if char_tag(q) = list_tag then
begin y ← rem_byte(q); goto continue;
end;
end;
end;
end
This code is used in section 707.
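The selection rule of the two preceding sections can be restated compactly: walk each successor chain in probe order, remember the tallest character seen so far, and stop at the first character that is extensible or at least v tall. The Python sketch below models only that rule. The data layout (lists of tuples) and the name pick_variant are assumptions made for illustration; font lookup, missing characters, and the small-then-large retry of var_delimiter are left out.

```python
def pick_variant(chains, v):
    """Choose a delimiter variant.

    chains: successor chains in probe order; each chain is a list of
            (name, height_plus_depth, is_extensible) tuples.
    v:      the requested minimum height plus depth.
    Returns the chosen name, or None if no usable character was seen.
    """
    best, w = None, 0              # best-so-far character and its size
    for chain in chains:
        for name, u, extensible in chain:
            if extensible:
                return name        # an extensible recipe can reach any size
            if u > w:
                best, w = name, u
                if u >= v:
                    return name    # first variant that is large enough
    return best                    # none large enough: the largest seen
```

Since a chain need not be ordered by height, the u > w test matters: a shorter character met later never replaces a taller one met earlier.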
709. Here is a subroutine that creates a new box, whose list contains a single character, and whose width
includes the italic correction for that character. The height or depth of the box will be negative, if the height
or depth of the character is negative; thus, this routine may deliver a slightly different result than hpack
would produce.
⟨ Declare subprocedures for var_delimiter 709 ⟩ ≡
function char_box(f: internal_font_number; c: quarterword): pointer;
var q: four_quarters; hd: eight_bits; {height_depth byte}
b, p: pointer; {the new box and its character node}
begin q ← char_info(f)(c); hd ← height_depth(q); b ← new_null_box;
width(b) ← char_width(f)(q) + char_italic(f)(q); height(b) ← char_height(f)(hd);
depth(b) ← char_depth(f)(hd); p ← get_avail; character(p) ← c; font(p) ← f; list_ptr(b) ← p;
char_box ← b;
end;
See also sections 711 and 712.
This code is used in section 706.
710. When the following code is executed, char_tag(q) will be equal to ext_tag if and only if a built-up
symbol is supposed to be returned.
⟨ Make variable b point to a box for (f, c) 710 ⟩ ≡
if char_tag(q) = ext_tag then
⟨ Construct an extensible character in a new box b, using recipe rem_byte(q) and font f 713 ⟩
else b ← char_box(f, c)
This code is used in section 706.
711. When we build an extensible character, it's handy to have the following subroutine, which puts a
given character on top of the characters already in box b:
⟨ Declare subprocedures for var_delimiter 709 ⟩ +≡
procedure stack_into_box(b: pointer; f: internal_font_number; c: quarterword);
var p: pointer; {new node placed into b}
begin p ← char_box(f, c); link(p) ← list_ptr(b); list_ptr(b) ← p; height(b) ← height(p);
end;
712. Another handy subroutine computes the height plus depth of a given character:
⟨ Declare subprocedures for var_delimiter 709 ⟩ +≡
function height_plus_depth(f: internal_font_number; c: quarterword): scaled;
var q: four_quarters; hd: eight_bits; {height_depth byte}
begin q ← char_info(f)(c); hd ← height_depth(q);
height_plus_depth ← char_height(f)(hd) + char_depth(f)(hd);
end;
713. ⟨ Construct an extensible character in a new box b, using recipe rem_byte(q) and font f 713 ⟩ ≡
begin b ← new_null_box; type(b) ← vlist_node; r ← font_info[exten_base[f] + rem_byte(q)].qqqq;
⟨ Compute the minimum suitable height, w, and the corresponding number of extension steps, n; also set
width(b) 714 ⟩;
c ← ext_bot(r);
if c ≠ min_quarterword then stack_into_box(b, f, c);
c ← ext_rep(r);
for m ← 1 to n do stack_into_box(b, f, c);
c ← ext_mid(r);
if c ≠ min_quarterword then
begin stack_into_box(b, f, c); c ← ext_rep(r);
for m ← 1 to n do stack_into_box(b, f, c);
end;
c ← ext_top(r);
if c ≠ min_quarterword then stack_into_box(b, f, c);
depth(b) ← w − height(b);
end
This code is used in section 710.
714. The width of an extensible character is the width of the repeatable module. If this module does not
have positive height plus depth, we don't use any copies of it, otherwise we use as few as possible (in groups
of two if there is a middle part).
⟨ Compute the minimum suitable height, w, and the corresponding number of extension steps, n; also set
width(b) 714 ⟩ ≡
c ← ext_rep(r); u ← height_plus_depth(f, c); w ← 0; q ← char_info(f)(c);
width(b) ← char_width(f)(q) + char_italic(f)(q);
c ← ext_bot(r); if c ≠ min_quarterword then w ← w + height_plus_depth(f, c);
c ← ext_mid(r); if c ≠ min_quarterword then w ← w + height_plus_depth(f, c);
c ← ext_top(r); if c ≠ min_quarterword then w ← w + height_plus_depth(f, c);
n ← 0;
if u > 0 then
while w < v do
begin w ← w + u; incr(n);
if ext_mid(r) ≠ min_quarterword then w ← w + u;
end
This code is used in section 713.
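The counting loop can be tried on small numbers. In the Python sketch below the pieces are given by their height plus depth, with None standing for an absent piece (the program tests the character code against min_quarterword instead). The name extension_steps and the sample sizes are illustrative assumptions.

```python
def extension_steps(v, rep, top=None, mid=None, bot=None):
    """Return (w, n): the total height plus depth and the number of steps.

    rep is the size of the repeatable module; top, mid, bot are the sizes
    of the optional pieces, or None when a piece is absent.
    """
    w = sum(piece for piece in (bot, mid, top) if piece is not None)
    n = 0
    if rep > 0:
        while w < v:
            w += rep
            n += 1
            if mid is not None:
                w += rep       # with a middle piece, repeats come in pairs
    return w, n
```

With a middle piece present, each step adds two copies of the repeatable module, one for each of the two stacking loops in the preceding section.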
715. The next subroutine is much simpler; it is used for numerators and denominators of fractions as well
as for displayed operators and their limits above and below. It takes a given box b and changes it so that
the new box is centered in a box of width w. The centering is done by putting \hss glue at the left and
right of the list inside b, then packaging the new box; thus, the actual box might not really be centered, if
it already contains infinite glue.
The given box might contain a single character whose italic correction has been added to the width of the
box; in this case a compensating kern is inserted.
function rebox(b: pointer; w: scaled): pointer;
var p: pointer; {temporary register for list manipulation}
f: internal_font_number; {font in a one-character box}
v: scaled; {width of a character without italic correction}
begin if (width(b) ≠ w) ∧ (list_ptr(b) ≠ null) then
begin if type(b) = vlist_node then b ← hpack(b, natural);
p ← list_ptr(b);
if (is_char_node(p)) ∧ (link(p) = null) then
begin f ← font(p); v ← char_width(f)(char_info(f)(character(p)));
if v ≠ width(b) then link(p) ← new_kern(width(b) − v);
end;
free_node(b, box_node_size); b ← new_glue(ss_glue); link(b) ← p;
while link(p) ≠ null do p ← link(p);
link(p) ← new_glue(ss_glue); rebox ← hpack(b, w, exactly);
end
else begin width(b) ← w; rebox ← b;
end;
end;
716. Here is a subroutine that creates a new glue specification from another one that is expressed in ‘mu’,
given the value of the math unit.
define mu_mult(#) ≡ nx_plus_y(n, #, xn_over_d(#, f, '200000))
function math_glue(g: pointer; m: scaled): pointer;
var p: pointer; {the new glue specification}
n: integer; {integer part of m}
f: scaled; {fraction part of m}
begin n ← x_over_n(m, '200000); f ← remainder;
if f < 0 then
begin decr(n); f ← f + '200000;
end;
p ← get_node(glue_spec_size); width(p) ← mu_mult(width(g)); {convert mu to pt}
stretch_order(p) ← stretch_order(g);
if stretch_order(p) = normal then stretch(p) ← mu_mult(stretch(g))
else stretch(p) ← stretch(g);
shrink_order(p) ← shrink_order(g);
if shrink_order(p) = normal then shrink(p) ← mu_mult(shrink(g))
else shrink(p) ← shrink(g);
math_glue ← p;
end;
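The conversion splits the math unit m into an integer part n and a fraction f (in units of 1/65536, since octal 200000 is 65536), then forms n·x + x·f/65536. The Python sketch below reproduces that arithmetic on plain integers. It is a simplified model under assumptions: the overflow checks of the program's nx_plus_y and xn_over_d are omitted, xn_over_d here is a stand-in that truncates toward zero, and Python's divmod supplies the floor split that the f < 0 adjustment produces.

```python
UNITY = 0o200000  # 65536, the scaled representation of 1.0


def xn_over_d(x, n, d):
    """x*n/d truncated toward zero (n >= 0, d > 0); no overflow checking."""
    magnitude = (abs(x) * n) // d
    return magnitude if x >= 0 else -magnitude


def mu_mult(x, m):
    """Convert the mu-based amount x to points, given the math unit m."""
    n, f = divmod(m, UNITY)   # floor split: 0 <= f < UNITY even if m < 0
    return n * x + xn_over_d(x, f, UNITY)
```

For example, with a math unit of 36408 scaled points, an amount of 18 mu converts to 18 × 36408 = 655344 scaled points.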
717. The math_kern subroutine removes mu_glue from a kern node, given the value of the math unit.
procedure math_kern(p: pointer; m: scaled);
var n: integer; {integer part of m}
f: scaled; {fraction part of m}
begin if subtype(p) = mu_glue then
begin n ← x_over_n(m, '200000); f ← remainder;
if f < 0 then
begin decr(n); f ← f + '200000;
end;
width(p) ← mu_mult(width(p)); subtype(p) ← explicit;
end;
end;
718. Sometimes it is necessary to destroy an mlist. The following subroutine empties the current list,
assuming that abs(mode) = mmode.
procedure flush_math;
begin flush_node_list(link(head)); flush_node_list(incompleat_noad); link(head) ← null; tail ← head;
incompleat_noad ← null;
end;
719. Typesetting math formulas. TeX's most important routine for dealing with formulas is called
mlist_to_hlist. After a formula has been scanned and represented as an mlist, this routine converts it to an
hlist that can be placed into a box or incorporated into the text of a paragraph. There are three implicit
parameters, passed in global variables: cur_mlist points to the first node or noad in the given mlist (and
it might be null); cur_style is a style code; and mlist_penalties is true if penalty nodes for potential line
breaks are to be inserted into the resulting hlist. After mlist_to_hlist has acted, link(temp_head) points to
the translated hlist.
Since mlists can be inside mlists, the procedure is recursive. And since this is not part of TeX's inner
loop, the program has been written in a manner that stresses compactness over efficiency.
⟨ Global variables 13 ⟩ +≡
cur_mlist: pointer; {beginning of mlist to be translated}
cur_style: small_number; {style code at current place in the list}
cur_size: small_number; {size code corresponding to cur_style}
cur_mu: scaled; {the math unit width corresponding to cur_size}
mlist_penalties: boolean; {should mlist_to_hlist insert penalties?}
720. The recursion in mlist_to_hlist is due primarily to a subroutine called clean_box that puts a given
noad field into a box using a given math style; mlist_to_hlist can call clean_box, which can call mlist_to_hlist.
The box returned by clean_box is “clean” in the sense that its shift_amount is zero.
procedure mlist_to_hlist; forward;
function clean_box(p: pointer; s: small_number): pointer;
label found;
var q: pointer; {beginning of a list to be boxed}
save_style: small_number; {cur_style to be restored}
x: pointer; {box to be returned}
r: pointer; {temporary pointer}
begin case math_type(p) of
math_char: begin cur_mlist ← new_noad; mem[nucleus(cur_mlist)] ← mem[p];
end;
sub_box: begin q ← info(p); goto found;
end;
sub_mlist: cur_mlist ← info(p);
othercases begin q ← new_null_box; goto found;
end
endcases;
save_style ← cur_style; cur_style ← s; mlist_penalties ← false;
mlist_to_hlist; q ← link(temp_head); {recursive call}
cur_style ← save_style; {restore the style}
⟨ Set up the values of cur_size and cur_mu, based on cur_style 703 ⟩;
found: if is_char_node(q) ∨ (q = null) then x ← hpack(q, natural)
else if (link(q) = null) ∧ (type(q) ≤ vlist_node) ∧ (shift_amount(q) = 0) then x ← q
{it's already clean}
else x ← hpack(q, natural);
⟨ Simplify a trivial box 721 ⟩;
clean_box ← x;
end;
721. Here we save memory space in a common case.
⟨ Simplify a trivial box 721 ⟩ ≡
q ← list_ptr(x);
if is_char_node(q) then
begin r ← link(q);
if r ≠ null then
if link(r) = null then
if ¬is_char_node(r) then
if type(r) = kern_node then {unneeded italic correction}
begin free_node(r, small_node_size); link(q) ← null;
end;
end
This code is used in section 720.
722. It is convenient to have a procedure that converts a math_char field to an “unpacked” form. The
fetch routine sets cur_f, cur_c, and cur_i to the font code, character code, and character information bytes
of a given noad field. It also takes care of issuing error messages for nonexistent characters; in such cases,
char_exists(cur_i) will be false after fetch has acted, and the field will also have been reset to empty.
procedure fetch(a: pointer); {unpack the math_char field a}
begin cur_c ← character(a); cur_f ← fam_fnt(fam(a) + cur_size);
if cur_f = null_font then ⟨ Complain about an undefined family and set cur_i null 723 ⟩
else begin if (qo(cur_c) ≥ font_bc[cur_f]) ∧ (qo(cur_c) ≤ font_ec[cur_f]) then
cur_i ← char_info(cur_f)(cur_c)
else cur_i ← null_character;
if ¬(char_exists(cur_i)) then
begin char_warning(cur_f, qo(cur_c)); math_type(a) ← empty;
end;
end;
end;
723. ⟨ Complain about an undefined family and set cur_i null 723 ⟩ ≡
begin print_err(""); print_size(cur_size); print_char(" "); print_int(fam(a));
print(" is undefined (character "); print_ASCII(qo(cur_c)); print_char(")");
help4("Somewhere in the math formula just ended, you used the")
("stated character from an undefined font family. For example,")
("plain TeX doesn't allow \it or \sl in subscripts. Proceed,")
("and I'll try to forget that I needed that character."); error; cur_i ← null_character;
math_type(a) ← empty;
end
This code is used in section 722.
724. The outputs of fetch are placed in global variables.
⟨ Global variables 13 ⟩ +≡
cur_f: internal_font_number; {the font field of a math_char}
cur_c: quarterword; {the character field of a math_char}
cur_i: four_quarters; {the char_info of a math_char, or a lig/kern instruction}
725. We need to do a lot of different things, so mlist_to_hlist makes two passes over the given mlist.
The first pass does most of the processing: It removes “mu” spacing from glue, it recursively evaluates all
subsidiary mlists so that only the top-level mlist remains to be handled, it puts fractions and square roots
and such things into boxes, it attaches subscripts and superscripts, and it computes the overall height and
depth of the top-level mlist so that the size of delimiters for a left_noad and a right_noad will be known.
The hlist resulting from each noad is recorded in that noad's new_hlist field, an integer field that replaces
the nucleus or thickness.
The second pass eliminates all noads and inserts the correct glue and penalties between nodes.
define new_hlist(#) ≡ mem[nucleus(#)].int {the translation of an mlist}
726. Here is the overall plan of mlist_to_hlist, and the list of its local variables.
define done_with_noad = 80 {go here when a noad has been fully translated}
define done_with_node = 81 {go here when a node has been fully converted}
define check_dimensions = 82 {go here to update max_h and max_d}
define delete_q = 83 {go here to delete q and move to the next node}
⟨ Declare math construction procedures 734 ⟩
procedure mlist_to_hlist;
label reswitch, check_dimensions, done_with_noad, done_with_node, delete_q, done;
var mlist: pointer; {beginning of the given list}
penalties: boolean; {should penalty nodes be inserted?}
style: small_number; {the given style}
save_style: small_number; {holds cur_style during recursion}
q: pointer; {runs through the mlist}
r: pointer; {the most recent noad preceding q}
r_type: small_number; {the type of noad r, or op_noad if r = null}
t: small_number; {the effective type of noad q during the second pass}
p, x, y, z: pointer; {temporary registers for list construction}
pen: integer; {a penalty to be inserted}
s: small_number; {the size of a noad to be deleted}
max_h, max_d: scaled; {maximum height and depth of the list translated so far}
delta: scaled; {offset between subscript and superscript}
begin mlist ← cur_mlist; penalties ← mlist_penalties; style ← cur_style;
{tuck global parameters away as local variables}
q ← mlist; r ← null; r_type ← op_noad; max_h ← 0; max_d ← 0;
⟨ Set up the values of cur_size and cur_mu, based on cur_style 703 ⟩;
while q ≠ null do ⟨ Process node-or-noad q as much as possible in preparation for the second pass of
mlist_to_hlist, then move to the next item in the mlist 727 ⟩;
⟨ Convert a final bin_noad to an ord_noad 729 ⟩;
⟨ Make a second pass over the mlist, removing all noads and inserting the proper spacing and
penalties 760 ⟩;
end;
727. We use the fact that no character nodes appear in an mlist, hence the field type(q) is always present.
⟨ Process node-or-noad q as much as possible in preparation for the second pass of mlist_to_hlist, then move
to the next item in the mlist 727 ⟩ ≡
begin ⟨ Do first-pass processing based on type(q); goto done_with_noad if a noad has been fully
processed, goto check_dimensions if it has been translated into new_hlist(q), or goto done_with_node
if a node has been fully processed 728 ⟩;
check_dimensions: z ← hpack(new_hlist(q), natural);
if height(z) > max_h then max_h ← height(z);
if depth(z) > max_d then max_d ← depth(z);
free_node(z, box_node_size);
done_with_noad: r ← q; r_type ← type(r);
done_with_node: q ← link(q);
end
This code is used in section 726.
728. One of the things we must do on the first pass is change a bin_noad to an ord_noad if the bin_noad
is not in the context of a binary operator. The values of r and r_type make this fairly easy.
⟨Do first-pass processing based on type(q); goto done_with_noad if a noad has been fully processed, goto
  check_dimensions if it has been translated into new_hlist(q), or goto done_with_node if a node has
  been fully processed 728⟩ ≡
reswitch: delta ← 0;
case type(q) of
bin_noad: case r_type of
  bin_noad, op_noad, rel_noad, open_noad, punct_noad, left_noad: begin type(q) ← ord_noad;
    goto reswitch;
    end;
  othercases do_nothing
  endcases;
rel_noad, close_noad, punct_noad, right_noad: begin
  ⟨Convert a final bin_noad to an ord_noad 729⟩;
  if type(q) = right_noad then goto done_with_noad;
  end;
⟨Cases for noads that can follow a bin_noad 733⟩
⟨Cases for nodes that can appear in an mlist, after which we goto done_with_node 730⟩
othercases confusion("mlist1")
endcases;
⟨Convert nucleus(q) to an hlist and attach the sub/superscripts 754⟩
This code is used in section 727.
729. ⟨Convert a final bin_noad to an ord_noad 729⟩ ≡
if r_type = bin_noad then type(r) ← ord_noad
This code is used in sections 726 and 728.
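The bin_noad rule of sections 728 and 729 can be tried out in isolation. The following is a minimal Python sketch, not part of TeX82: it models an mlist as a plain list of noad kind names (the strings and the function name demote_bin_noads are illustrative assumptions) and ignores non-noad nodes, which in the real program leave r and r_type unchanged.

```python
# Kinds after which a "bin" cannot act as a binary operator (case list in section 728).
NO_BIN_AFTER = {"bin", "op", "rel", "open", "punct", "left"}
# Kinds that demote an immediately preceding "bin" (second case in section 728).
NO_BIN_BEFORE = {"rel", "close", "punct", "right"}


def demote_bin_noads(kinds):
    """Return a copy of kinds with out-of-context "bin" entries changed to "ord"."""
    out = []
    r_type = "op"  # mlist_to_hlist starts with r_type = op_noad
    for kind in kinds:
        if kind == "bin" and r_type in NO_BIN_AFTER:
            kind = "ord"  # nothing suitable on the left
        elif kind in NO_BIN_BEFORE and r_type == "bin":
            out[-1] = "ord"  # nothing suitable on the right of the previous bin
        out.append(kind)
        r_type = kind
    if r_type == "bin":  # section 729 applied once more at the end of the list
        out[-1] = "ord"
    return out
```

For instance, a leading or trailing "bin" and a "bin" directly before a "rel" all become "ord", while a "bin" between two "ord" entries is kept.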
730. ⟨Cases for nodes that can appear in an mlist, after which we goto done_with_node 730⟩ ≡
style_node: begin cur_style ← subtype(q);
  ⟨Set up the values of cur_size and cur_mu, based on cur_style 703⟩;
  goto done_with_node;
  end;
choice_node: ⟨Change this node to a style node followed by the correct choice, then goto
    done_with_node 731⟩;
ins_node, mark_node, adjust_node, whatsit_node, penalty_node, disc_node: goto done_with_node;
rule_node: begin if height(q) > max_h then max_h ← height(q);
  if depth(q) > max_d then max_d ← depth(q);
  goto done_with_node;
  end;
glue_node: begin ⟨Convert math glue to ordinary glue 732⟩;
  goto done_with_node;
  end;
kern_node: begin math_kern(q, cur_mu); goto done_with_node;
  end;
This code is used in section 728.
731. define choose_mlist(#) ≡
  begin p ← #(q); #(q) ← null; end
⟨Change this node to a style node followed by the correct choice, then goto done_with_node 731⟩ ≡
begin case cur_style div 2 of
0: choose_mlist(display_mlist); { display_style = 0 }
1: choose_mlist(text_mlist); { text_style = 2 }
2: choose_mlist(script_mlist); { script_style = 4 }
3: choose_mlist(script_script_mlist); { script_script_style = 6 }
end; { there are no other cases }
flush_node_list(display_mlist(q)); flush_node_list(text_mlist(q)); flush_node_list(script_mlist(q));
flush_node_list(script_script_mlist(q));
type(q) ← style_node; subtype(q) ← cur_style; width(q) ← 0; depth(q) ← 0;
if p ≠ null then
  begin z ← link(q); link(q) ← p;
  while link(p) ≠ null do p ← link(p);
  link(p) ← z;
  end;
goto done_with_node;
end
This code is used in section 730.
732. Conditional math glue (\nonscript) results in a glue_node pointing to zero_glue, with subtype(q) =
cond_math_glue; in such a case the node following will be eliminated if it is a glue or kern node and if the
current size is different from text_size. Unconditional math glue (\muskip) is converted to normal glue by
multiplying the dimensions by cur_mu.
⟨Convert math glue to ordinary glue 732⟩ ≡
if subtype(q) = mu_glue then
  begin x ← glue_ptr(q); y ← math_glue(x, cur_mu); delete_glue_ref(x); glue_ptr(q) ← y;
  subtype(q) ← normal;
  end
else if (cur_size ≠ text_size) ∧ (subtype(q) = cond_math_glue) then
  begin p ← link(q);
  if p ≠ null then
    if (type(p) = glue_node) ∨ (type(p) = kern_node) then
      begin link(q) ← link(p); link(p) ← null; flush_node_list(p);
      end;
  end
This code is used in section 730.
733. ⟨Cases for noads that can follow a bin_noad 733⟩ ≡
left_noad: goto done_with_noad;
fraction_noad: begin make_fraction(q); goto check_dimensions;
  end;
op_noad: begin delta ← make_op(q);
  if subtype(q) = limits then goto check_dimensions;
  end;
ord_noad: make_ord(q);
open_noad, inner_noad: do_nothing;
radical_noad: make_radical(q);
over_noad: make_over(q);
under_noad: make_under(q);
accent_noad: make_math_accent(q);
vcenter_noad: make_vcenter(q);
This code is used in section 728.
734. Most of the actual construction work of mlist_to_hlist is done by procedures with names like make_fraction,
make_radical, etc. To illustrate the general setup of such procedures, let's begin with a couple of simple
ones.
⟨Declare math construction procedures 734⟩ ≡
procedure make_over(q: pointer);
begin info(nucleus(q)) ← overbar(clean_box(nucleus(q), cramped_style(cur_style)),
  3 × default_rule_thickness, default_rule_thickness); math_type(nucleus(q)) ← sub_box;
end;
See also sections 735, 736, 737, 738, 743, 749, 752, 756, and 762.
This code is used in section 726.
735. ⟨Declare math construction procedures 734⟩ +≡
procedure make_under(q: pointer);
var p, x, y: pointer; { temporary registers for box construction }
  delta: scaled; { overall height plus depth }
begin x ← clean_box(nucleus(q), cur_style); p ← new_kern(3 × default_rule_thickness); link(x) ← p;
link(p) ← fraction_rule(default_rule_thickness); y ← vpack(x, natural);
delta ← height(y) + depth(y) + default_rule_thickness; height(y) ← height(x);
depth(y) ← delta − height(y); info(nucleus(q)) ← y; math_type(nucleus(q)) ← sub_box;
end;
736. ⟨Declare math construction procedures 734⟩ +≡
procedure make_vcenter(q: pointer);
var v: pointer; { the box that should be centered vertically }
  delta: scaled; { its height plus depth }
begin v ← info(nucleus(q));
if type(v) ≠ vlist_node then confusion("vcenter");
delta ← height(v) + depth(v); height(v) ← axis_height(cur_size) + half(delta);
depth(v) ← delta − height(v);
end;
737. According to the rules in the DVI file specifications, we ensure alignment between a square root sign
and the rule above its nucleus by assuming that the baseline of the square-root symbol is the same as the
bottom of the rule. The height of the square-root symbol will be the thickness of the rule, and the depth of
the square-root symbol should exceed or equal the height-plus-depth of the nucleus plus a certain minimum
clearance clr. The symbol will be placed so that the actual clearance is clr plus half the excess.
⟨Declare math construction procedures 734⟩ +≡
procedure make_radical(q: pointer);
var x, y: pointer; { temporary registers for box construction }
  delta, clr: scaled; { dimensions involved in the calculation }
begin x ← clean_box(nucleus(q), cramped_style(cur_style));
if cur_style < text_style then { display style }
  clr ← default_rule_thickness + (abs(math_x_height(cur_size)) div 4)
else begin clr ← default_rule_thickness; clr ← clr + (abs(clr) div 4);
  end;
y ← var_delimiter(left_delimiter(q), cur_size, height(x) + depth(x) + clr + default_rule_thickness);
delta ← depth(y) − (height(x) + depth(x) + clr);
if delta > 0 then clr ← clr + half(delta); { increase the actual clearance }
shift_amount(y) ← −(height(x) + clr); link(y) ← overbar(x, clr, height(y));
info(nucleus(q)) ← hpack(y, natural); math_type(nucleus(q)) ← sub_box;
end;
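The clearance arithmetic of make_radical can be followed with plain integers. This is a minimal Python sketch, not TeX82 code: the function name radical_clearance and its parameters are illustrative assumptions, all dimensions are integers in scaled points, and the depth of the chosen radical sign (the result of var_delimiter in the program) is simply passed in by the caller.

```python
TEXT_STYLE = 2  # display styles are 0 and 1, as in the program


def half(x):
    """Mirror of TeX's half: x/2, with odd values rounded via (x + 1) div 2."""
    return (x + 1) // 2 if x % 2 else x // 2


def radical_clearance(style, rule_thickness, x_height, box_height, box_depth, sign_depth):
    """Return (clr, shift_amount) for a radical over a nucleus box.

    sign_depth stands in for depth(y), the depth of the delimiter that
    var_delimiter would return; it is an input here, not computed.
    """
    if style < TEXT_STYLE:  # display style
        clr = rule_thickness + abs(x_height) // 4
    else:
        clr = rule_thickness + abs(rule_thickness) // 4
    delta = sign_depth - (box_height + box_depth + clr)
    if delta > 0:
        clr += half(delta)  # split the excess above and below the nucleus
    return clr, -(box_height + clr)
```

With a rule thickness of 40, an x-height of 400, a nucleus of height 600 and depth 200, and a sign of depth 1000 in display style, the minimum clearance 140 grows by half of the excess 60, giving a clearance of 170.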
738. Slants are not considered when placing accents in math mode. The accenter is centered over the
accentee, and the accent width is treated as zero with respect to the size of the final box.
⟨Declare math construction procedures 734⟩ +≡
procedure make_math_accent(q: pointer);
label done, done1;
var p, x, y: pointer; { temporary registers for box construction }
  a: integer; { address of lig/kern instruction }
  c: quarterword; { accent character }
  f: internal_font_number; { its font }
  i: four_quarters; { its char_info }
  s: scaled; { amount to skew the accent to the right }
  h: scaled; { height of character being accented }
  delta: scaled; { space to remove between accent and accentee }
  w: scaled; { width of the accentee, not including sub/superscripts }
begin fetch(accent_chr(q));
if char_exists(cur_i) then
  begin i ← cur_i; c ← cur_c; f ← cur_f;
  ⟨Compute the amount of skew 741⟩;
  x ← clean_box(nucleus(q), cramped_style(cur_style)); w ← width(x); h ← height(x);
  ⟨Switch to a larger accent if available and appropriate 740⟩;
  if h < x_height(f) then delta ← h else delta ← x_height(f);
  if (math_type(supscr(q)) ≠ empty) ∨ (math_type(subscr(q)) ≠ empty) then
    if math_type(nucleus(q)) = math_char then ⟨Swap the subscript and superscript into box x 742⟩;
  y ← char_box(f, c); shift_amount(y) ← s + half(w − width(y)); width(y) ← 0; p ← new_kern(−delta);
  link(p) ← x; link(y) ← p; y ← vpack(y, natural); width(y) ← width(x);
  if height(y) < h then ⟨Make the height of box y equal to h 739⟩;
  info(nucleus(q)) ← y; math_type(nucleus(q)) ← sub_box;
  end;
end;
739. ⟨Make the height of box y equal to h 739⟩ ≡
begin p ← new_kern(h − height(y)); link(p) ← list_ptr(y); list_ptr(y) ← p; height(y) ← h;
end
This code is used in section 738.
740. ⟨Switch to a larger accent if available and appropriate 740⟩ ≡
loop begin if char_tag(i) ≠ list_tag then goto done;
  y ← rem_byte(i); i ← char_info(f)(y);
  if ¬char_exists(i) then goto done;
  if char_width(f)(i) > w then goto done;
  c ← y;
  end;
done:
This code is used in section 738.
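The loop in section 740 walks the font's chain of successively larger characters and keeps the last one that is no wider than the accentee. Here is a minimal Python sketch of that search, not TeX82 code: the dictionaries next_larger and widths are hypothetical stand-ins for the list_tag/rem_byte chain and the char_width data of a font, and the chain is assumed to contain no cycles.

```python
def widest_fitting_accent(c, w, next_larger, widths):
    """Follow the successor chain from character c while the next one still fits.

    next_larger maps a character code to its larger successor (if any);
    widths maps existing character codes to their widths; w is the width
    of the accentee.
    """
    while c in next_larger:          # char_tag(i) = list_tag
        y = next_larger[c]           # y <- rem_byte(i)
        if y not in widths:          # successor does not exist in the font
            break
        if widths[y] > w:            # successor would be wider than the accentee
            break
        c = y                        # accept the larger accent and keep looking
    return c
```

As in the program, the search stops at the first successor that is missing or too wide, so a later, narrower variant further down the chain is never considered.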
741. ⟨Compute the amount of skew 741⟩ ≡
s ← 0;
if math_type(nucleus(q)) = math_char then
  begin fetch(nucleus(q));
  if char_tag(cur_i) = lig_tag then
    begin a ← lig_kern_start(cur_f)(cur_i); cur_i ← font_info[a].qqqq;
    if skip_byte(cur_i) > stop_flag then
      begin a ← lig_kern_restart(cur_f)(cur_i); cur_i ← font_info[a].qqqq;
      end;
    loop begin if qo(next_char(cur_i)) = skew_char[cur_f] then
        begin if op_byte(cur_i) ≥ kern_flag then
          if skip_byte(cur_i) ≤ stop_flag then s ← char_kern(cur_f)(cur_i);
        goto done1;
        end;
      if skip_byte(cur_i) ≥ stop_flag then goto done1;
      a ← a + qo(skip_byte(cur_i)) + 1; cur_i ← font_info[a].qqqq;
      end;
    end;
  end;
done1:
This code is used in section 738.
742. ⟨Swap the subscript and superscript into box x 742⟩ ≡
begin flush_node_list(x); x ← new_noad; mem[nucleus(x)] ← mem[nucleus(q)];
mem[supscr(x)] ← mem[supscr(q)]; mem[subscr(x)] ← mem[subscr(q)];
mem[supscr(q)].hh ← empty_field; mem[subscr(q)].hh ← empty_field;
math_type(nucleus(q)) ← sub_mlist; info(nucleus(q)) ← x; x ← clean_box(nucleus(q), cur_style);
delta ← delta + height(x) − h; h ← height(x);
end
This code is used in section 738.
743. The make_fraction procedure is a bit different because it sets new_hlist(q) directly rather than making
a sub-box.
⟨Declare math construction procedures 734⟩ +≡
procedure make_fraction(q: pointer);
var p, v, x, y, z: pointer; { temporary registers for box construction }
  delta, delta1, delta2, shift_up, shift_down, clr: scaled; { dimensions for box calculations }
begin if thickness(q) = default_code then thickness(q) ← default_rule_thickness;
⟨Create equal-width boxes x and z for the numerator and denominator, and compute the default amounts
  shift_up and shift_down by which they are displaced from the baseline 744⟩;
if thickness(q) = 0 then ⟨Adjust shift_up and shift_down for the case of no fraction line 745⟩
else ⟨Adjust shift_up and shift_down for the case of a fraction line 746⟩;
⟨Construct a vlist box for the fraction, according to shift_up and shift_down 747⟩;
⟨Put the fraction into a box with its delimiters, and make new_hlist(q) point to it 748⟩;
end;
744. ⟨Create equal-width boxes x and z for the numerator and denominator, and compute the default
  amounts shift_up and shift_down by which they are displaced from the baseline 744⟩ ≡
x ← clean_box(numerator(q), num_style(cur_style));
z ← clean_box(denominator(q), denom_style(cur_style));
if width(x) < width(z) then x ← rebox(x, width(z))
else z ← rebox(z, width(x));
if cur_style < text_style then { display style }
  begin shift_up ← num1(cur_size); shift_down ← denom1(cur_size);
  end
else begin shift_down ← denom2(cur_size);
  if thickness(q) ≠ 0 then shift_up ← num2(cur_size)
  else shift_up ← num3(cur_size);
  end
This code is used in section 743.
745. The numerator and denominator must be separated by a certain minimum clearance, called clr in
the following program. The difference between clr and the actual clearance is 2 × delta.
⟨Adjust shift_up and shift_down for the case of no fraction line 745⟩ ≡
begin if cur_style < text_style then clr ← 7 × default_rule_thickness
else clr ← 3 × default_rule_thickness;
delta ← half(clr − ((shift_up − depth(x)) − (height(z) − shift_down)));
if delta > 0 then
  begin shift_up ← shift_up + delta; shift_down ← shift_down + delta;
  end;
end
This code is used in section 743.
746. In the case of a fraction line, the minimum clearance depends on the actual thickness of the line.
⟨Adjust shift_up and shift_down for the case of a fraction line 746⟩ ≡
begin if cur_style < text_style then clr ← 3 × thickness(q)
else clr ← thickness(q);
delta ← half(thickness(q)); delta1 ← clr − ((shift_up − depth(x)) − (axis_height(cur_size) + delta));
delta2 ← clr − ((axis_height(cur_size) − delta) − (height(z) − shift_down));
if delta1 > 0 then shift_up ← shift_up + delta1;
if delta2 > 0 then shift_down ← shift_down + delta2;
end
This code is used in section 743.
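The two clearance adjustments of sections 745 and 746 can be checked numerically. The following minimal Python sketch is not TeX82 code: the function name adjust_fraction_shifts and its flat integer parameters (scaled points) are illustrative assumptions that replace the box and font-parameter lookups of the program.

```python
TEXT_STYLE = 2  # styles below this value are display styles


def half(x):
    """Mirror of TeX's half: x/2, with odd values rounded via (x + 1) div 2."""
    return (x + 1) // 2 if x % 2 else x // 2


def adjust_fraction_shifts(style, thickness, shift_up, shift_down,
                           depth_x, height_z, axis_height, rule_thickness):
    """Return (shift_up, shift_down) after enforcing the minimum clearances.

    depth_x is the depth of the numerator box, height_z the height of the
    denominator box; thickness == 0 means there is no fraction line.
    """
    if thickness == 0:  # section 745: one gap between numerator and denominator
        clr = (7 if style < TEXT_STYLE else 3) * rule_thickness
        delta = half(clr - ((shift_up - depth_x) - (height_z - shift_down)))
        if delta > 0:
            shift_up += delta
            shift_down += delta
    else:  # section 746: separate gaps above and below the line
        clr = 3 * thickness if style < TEXT_STYLE else thickness
        delta = half(thickness)
        delta1 = clr - ((shift_up - depth_x) - (axis_height + delta))
        delta2 = clr - ((axis_height - delta) - (height_z - shift_down))
        if delta1 > 0:
            shift_up += delta1
        if delta2 > 0:
            shift_down += delta2
    return shift_up, shift_down
```

In the no-line case the missing clearance is shared equally between the two shifts, whereas with a fraction line each side is corrected independently against the line centered on the axis.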
747. ⟨Construct a vlist box for the fraction, according to shift_up and shift_down 747⟩ ≡
v ← new_null_box; type(v) ← vlist_node; height(v) ← shift_up + height(x);
depth(v) ← depth(z) + shift_down; width(v) ← width(x); { this also equals width(z) }
if thickness(q) = 0 then
  begin p ← new_kern((shift_up − depth(x)) − (height(z) − shift_down)); link(p) ← z;
  end
else begin y ← fraction_rule(thickness(q));
  p ← new_kern((axis_height(cur_size) − delta) − (height(z) − shift_down));
  link(y) ← p; link(p) ← z;
  p ← new_kern((shift_up − depth(x)) − (axis_height(cur_size) + delta)); link(p) ← y;
  end;
link(x) ← p; list_ptr(v) ← x
This code is used in section 743.
748. ⟨Put the fraction into a box with its delimiters, and make new_hlist(q) point to it 748⟩ ≡
if cur_style < text_style then delta ← delim1(cur_size)
else delta ← delim2(cur_size);
x ← var_delimiter(left_delimiter(q), cur_size, delta); link(x) ← v;
z ← var_delimiter(right_delimiter(q), cur_size, delta); link(v) ← z;
new_hlist(q) ← hpack(x, natural)
This code is used in section 743.
749. If the nucleus of an op_noad is a single character, it is to be centered vertically with respect to
the axis, after first being enlarged (via a character list in the font) if we are in display style. The normal
convention for placing displayed limits is to put them above and below the operator in display style.
The italic correction is removed from the character if there is a subscript and the limits are not being
displayed. The make_op routine returns the value that should be used as an offset between subscript and
superscript.
After make_op has acted, subtype(q) will be limits if and only if the limits have been set above and below
the operator. In that case, new_hlist(q) will already contain the desired final box.
⟨Declare math construction procedures 734⟩ +≡
function make_op(q: pointer): scaled;
var delta: scaled; { offset between subscript and superscript }
  p, v, x, y, z: pointer; { temporary registers for box construction }
  c: quarterword; i: four_quarters; { registers for character examination }
  shift_up, shift_down: scaled; { dimensions for box calculation }
begin if (subtype(q) = normal) ∧ (cur_style < text_style) then subtype(q) ← limits;
if math_type(nucleus(q)) = math_char then
  begin fetch(nucleus(q));
  if (cur_style < text_style) ∧ (char_tag(cur_i) = list_tag) then { make it larger }
    begin c ← rem_byte(cur_i); i ← char_info(cur_f)(c);
    if char_exists(i) then
      begin cur_c ← c; cur_i ← i; character(nucleus(q)) ← c;
      end;
    end;
  delta ← char_italic(cur_f)(cur_i); x ← clean_box(nucleus(q), cur_style);
  if (math_type(subscr(q)) ≠ empty) ∧ (subtype(q) ≠ limits) then width(x) ← width(x) − delta;
    { remove italic correction }
  shift_amount(x) ← half(height(x) − depth(x)) − axis_height(cur_size); { center vertically }
  math_type(nucleus(q)) ← sub_box; info(nucleus(q)) ← x;
  end
else delta ← 0;
if subtype(q) = limits then ⟨Construct a box with limits above and below it, skewed by delta 750⟩;
make_op ← delta;
end;
750. The following program builds a vlist box v for displayed limits. The width of the box is not affected
by the fact that the limits may be skewed.
⟨Construct a box with limits above and below it, skewed by delta 750⟩ ≡
begin x ← clean_box(supscr(q), sup_style(cur_style)); y ← clean_box(nucleus(q), cur_style);
z ← clean_box(subscr(q), sub_style(cur_style)); v ← new_null_box; type(v) ← vlist_node;
width(v) ← width(y);
if width(x) > width(v) then width(v) ← width(x);
if width(z) > width(v) then width(v) ← width(z);
x ← rebox(x, width(v)); y ← rebox(y, width(v)); z ← rebox(z, width(v));
shift_amount(x) ← half(delta); shift_amount(z) ← −shift_amount(x); height(v) ← height(y);
depth(v) ← depth(y);
⟨Attach the limits to y and adjust height(v), depth(v) to account for their presence 751⟩;
new_hlist(q) ← v;
end
This code is used in section 749.
751. We use shift_up and shift_down in the following program for the amount of glue between the displayed
operator y and its limits x and z. The vlist inside box v will consist of x followed by y followed by z, with
kern nodes for the spaces between and around them.
⟨Attach the limits to y and adjust height(v), depth(v) to account for their presence 751⟩ ≡
if math_type(supscr(q)) = empty then
  begin free_node(x, box_node_size); list_ptr(v) ← y;
  end
else begin shift_up ← big_op_spacing3 − depth(x);
  if shift_up < big_op_spacing1 then shift_up ← big_op_spacing1;
  p ← new_kern(shift_up); link(p) ← y; link(x) ← p;
  p ← new_kern(big_op_spacing5); link(p) ← x; list_ptr(v) ← p;
  height(v) ← height(v) + big_op_spacing5 + height(x) + depth(x) + shift_up;
  end;
if math_type(subscr(q)) = empty then free_node(z, box_node_size)
else begin shift_down ← big_op_spacing4 − height(z);
  if shift_down < big_op_spacing2 then shift_down ← big_op_spacing2;
  p ← new_kern(shift_down); link(y) ← p; link(p) ← z;
  p ← new_kern(big_op_spacing5); link(z) ← p;
  depth(v) ← depth(v) + big_op_spacing5 + height(z) + depth(z) + shift_down;
  end
This code is used in section 750.
752. A ligature found in a math formula does not create a ligature_node, because there is no question of
hyphenation afterwards; the ligature will simply be stored in an ordinary char_node, after residing in an
ord_noad.
The math_type is converted to math_text_char here if we would not want to apply an italic correction to
the current character unless it belongs to a math font (i.e., a font with space = 0).
No boundary characters enter into these ligatures.
⟨Declare math construction procedures 734⟩ +≡
procedure make_ord(q: pointer);
label restart, exit;
var a: integer; { address of lig/kern instruction }
  p, r: pointer; { temporary registers for list manipulation }
begin restart:
if math_type(subscr(q)) = empty then
  if math_type(supscr(q)) = empty then
    if math_type(nucleus(q)) = math_char then
      begin p ← link(q);
      if p ≠ null then
        if (type(p) ≥ ord_noad) ∧ (type(p) ≤ punct_noad) then
          if math_type(nucleus(p)) = math_char then
            if fam(nucleus(p)) = fam(nucleus(q)) then
              begin math_type(nucleus(q)) ← math_text_char; fetch(nucleus(q));
              if char_tag(cur_i) = lig_tag then
                begin a ← lig_kern_start(cur_f)(cur_i); cur_c ← character(nucleus(p));
                cur_i ← font_info[a].qqqq;
                if skip_byte(cur_i) > stop_flag then
                  begin a ← lig_kern_restart(cur_f)(cur_i); cur_i ← font_info[a].qqqq;
                  end;
                loop begin ⟨If instruction cur_i is a kern with cur_c, attach the kern after q; or if it is
                    a ligature with cur_c, combine noads q and p appropriately; then return if the
                    cursor has moved past a noad, or goto restart 753⟩;
                  if skip_byte(cur_i) ≥ stop_flag then return;
                  a ← a + qo(skip_byte(cur_i)) + 1; cur_i ← font_info[a].qqqq;
                  end;
                end;
              end;
      end;
exit: end;
753. Note that a ligature between an ord_noad and another kind of noad is replaced by an ord_noad, when
the two noads collapse into one. But we could make a parenthesis (say) change shape when it follows certain
letters. Presumably a font designer will define such ligatures only when this convention makes sense.
⟨If instruction cur_i is a kern with cur_c, attach the kern after q; or if it is a ligature with cur_c,
  combine noads q and p appropriately; then return if the cursor has moved past a noad, or goto
  restart 753⟩ ≡
if next_char(cur_i) = cur_c then
  if skip_byte(cur_i) ≤ stop_flag then
    if op_byte(cur_i) ≥ kern_flag then
      begin p ← new_kern(char_kern(cur_f)(cur_i)); link(p) ← link(q); link(q) ← p; return;
      end
    else begin check_interrupt; { allow a way out of infinite ligature loop }
      case op_byte(cur_i) of
      qi(1), qi(5): character(nucleus(q)) ← rem_byte(cur_i); { =:|, =:|> }
      qi(2), qi(6): character(nucleus(p)) ← rem_byte(cur_i); { |=:, |=:> }
      qi(3), qi(7), qi(11): begin r ← new_noad; { |=:|, |=:|>, |=:|>> }
        character(nucleus(r)) ← rem_byte(cur_i); fam(nucleus(r)) ← fam(nucleus(q));
        link(q) ← r; link(r) ← p;
        if op_byte(cur_i) < qi(11) then math_type(nucleus(r)) ← math_char
        else math_type(nucleus(r)) ← math_text_char; { prevent combination }
        end;
      othercases begin link(q) ← link(p); character(nucleus(q)) ← rem_byte(cur_i); { =: }
        mem[subscr(q)] ← mem[subscr(p)]; mem[supscr(q)] ← mem[supscr(p)];
        free_node(p, noad_size);
        end
      endcases;
      if op_byte(cur_i) > qi(3) then return;
      math_type(nucleus(q)) ← math_char; goto restart;
      end
This code is used in section 752.
754. When we get to the following part of the program, we have "fallen through" from cases that did not
lead to check_dimensions or done_with_noad or done_with_node. Thus, q points to a noad whose nucleus
may need to be converted to an hlist, and whose subscripts and superscripts need to be appended if they
are present.
If nucleus(q) is not a math_char, the variable delta is the amount by which a superscript should be moved
right with respect to a subscript when both are present.
⟨Convert nucleus(q) to an hlist and attach the sub/superscripts 754⟩ ≡
case math_type(nucleus(q)) of
math_char, math_text_char: ⟨Create a character node p for nucleus(q), possibly followed by a kern node
    for the italic correction, and set delta to the italic correction if a subscript is present 755⟩;
empty: p ← null;
sub_box: p ← info(nucleus(q));
sub_mlist: begin cur_mlist ← info(nucleus(q)); save_style ← cur_style; mlist_penalties ← false;
  mlist_to_hlist; { recursive call }
  cur_style ← save_style; ⟨Set up the values of cur_size and cur_mu, based on cur_style 703⟩;
  p ← hpack(link(temp_head), natural);
  end;
othercases confusion("mlist2")
endcases;
new_hlist(q) ← p;
if (math_type(subscr(q)) = empty) ∧ (math_type(supscr(q)) = empty) then goto check_dimensions;
make_scripts(q, delta)
This code is used in section 728.
755. ⟨Create a character node p for nucleus(q), possibly followed by a kern node for the italic correction,
  and set delta to the italic correction if a subscript is present 755⟩ ≡
begin fetch(nucleus(q));
if char_exists(cur_i) then
  begin delta ← char_italic(cur_f)(cur_i); p ← new_character(cur_f, qo(cur_c));
  if (math_type(nucleus(q)) = math_text_char) ∧ (space(cur_f) ≠ 0) then delta ← 0;
    { no italic correction in mid-word of text font }
  if (math_type(subscr(q)) = empty) ∧ (delta ≠ 0) then
    begin link(p) ← new_kern(delta); delta ← 0;
    end;
  end
else p ← null;
end
This code is used in section 754.
756. The purpose of make_scripts(q, delta) is to attach the subscript and/or superscript of noad q to the
list that starts at new_hlist(q), given that subscript and superscript aren't both empty. The superscript will
appear to the right of the subscript by a given distance delta.
We set shift_down and shift_up to the minimum amounts to shift the baseline of subscripts and superscripts
based on the given nucleus.
⟨Declare math construction procedures 734⟩ +≡
procedure make_scripts(q: pointer; delta: scaled);
var p, x, y, z: pointer; { temporary registers for box construction }
  shift_up, shift_down, clr: scaled; { dimensions in the calculation }
  t: small_number; { subsidiary size code }
begin p ← new_hlist(q);
if is_char_node(p) then
  begin shift_up ← 0; shift_down ← 0;
  end
else begin z ← hpack(p, natural);
  if cur_style < script_style then t ← script_size else t ← script_script_size;
  shift_up ← height(z) − sup_drop(t); shift_down ← depth(z) + sub_drop(t); free_node(z, box_node_size);
  end;
if math_type(supscr(q)) = empty then ⟨Construct a subscript box x when there is no superscript 757⟩
else begin ⟨Construct a superscript box x 758⟩;
  if math_type(subscr(q)) = empty then shift_amount(x) ← −shift_up
  else ⟨Construct a sub/superscript combination box x, with the superscript offset by delta 759⟩;
  end;
if new_hlist(q) = null then new_hlist(q) ← x
else begin p ← new_hlist(q);
  while link(p) ≠ null do p ← link(p);
  link(p) ← x;
  end;
end;
757. When there is a subscript without a superscript, the top of the subscript should not exceed the
baseline plus four-fifths of the x-height.
⟨Construct a subscript box x when there is no superscript 757⟩ ≡
begin x ← clean_box(subscr(q), sub_style(cur_style)); width(x) ← width(x) + script_space;
if shift_down < sub1(cur_size) then shift_down ← sub1(cur_size);
clr ← height(x) − (abs(math_x_height(cur_size) × 4) div 5);
if shift_down < clr then shift_down ← clr;
shift_amount(x) ← shift_down;
end
This code is used in section 756.
758. The bottom of a superscript should never descend below the baseline plus one-fourth of the x-height.
⟨Construct a superscript box x 758⟩ ≡
begin x ← clean_box(supscr(q), sup_style(cur_style)); width(x) ← width(x) + script_space;
if odd(cur_style) then clr ← sup3(cur_size)
else if cur_style < text_style then clr ← sup1(cur_size)
  else clr ← sup2(cur_size);
if shift_up < clr then shift_up ← clr;
clr ← depth(x) + (abs(math_x_height(cur_size)) div 4);
if shift_up < clr then shift_up ← clr;
end
This code is used in section 756.
759. When both subscript and superscript are present, the subscript must be separated from the
superscript by at least four times default_rule_thickness. If this condition would be violated, the subscript moves
down, after which both subscript and superscript move up so that the bottom of the superscript is at least
as high as the baseline plus four-fifths of the x-height.
⟨Construct a sub/superscript combination box x, with the superscript offset by delta 759⟩ ≡
begin y ← clean_box(subscr(q), sub_style(cur_style)); width(y) ← width(y) + script_space;
if shift_down < sub2(cur_size) then shift_down ← sub2(cur_size);
clr ← 4 × default_rule_thickness − ((shift_up − depth(x)) − (height(y) − shift_down));
if clr > 0 then
  begin shift_down ← shift_down + clr;
  clr ← (abs(math_x_height(cur_size) × 4) div 5) − (shift_up − depth(x));
  if clr > 0 then
    begin shift_up ← shift_up + clr; shift_down ← shift_down − clr;
    end;
  end;
shift_amount(x) ← delta; { superscript is delta to the right of the subscript }
p ← new_kern((shift_up − depth(x)) − (height(y) − shift_down)); link(x) ← p; link(p) ← y;
x ← vpack(x, natural); shift_amount(x) ← shift_down;
end
This code is used in section 756.
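The interplay of the two corrections in section 759 is easiest to see with numbers. Below is a minimal Python sketch, not TeX82 code: the function name sub_sup_shifts and its integer parameters (scaled points) are illustrative assumptions that replace the box dimensions and font parameters used by the program.

```python
def sub_sup_shifts(shift_up, shift_down, sup_depth, sub_height,
                   x_height, rule_thickness, sub2):
    """Return (shift_up, shift_down) for a combined sub/superscript.

    sup_depth is the depth of the superscript box, sub_height the height of
    the subscript box, sub2 the font's minimum subscript shift for this case.
    """
    if shift_down < sub2:
        shift_down = sub2
    # Shortfall of the gap between the scripts relative to 4 rule thicknesses.
    clr = 4 * rule_thickness - ((shift_up - sup_depth) - (sub_height - shift_down))
    if clr > 0:
        shift_down += clr  # first push the subscript down
        # Then raise both so the superscript bottom reaches 4/5 of the x-height.
        clr = abs(x_height * 4) // 5 - (shift_up - sup_depth)
        if clr > 0:
            shift_up += clr
            shift_down -= clr
    return shift_up, shift_down
```

Note that the second correction moves both scripts by the same amount, so the gap of four rule thicknesses established by the first correction is preserved.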
760. We have now tied up all the loose ends of the first pass of mlist_to_hlist. The second pass simply goes
through and hooks everything together with the proper glue and penalties. It also handles the left_noad and
right_noad that might be present, since max_h and max_d are now known. Variable p points to a node at
the current end of the final hlist.
⟨Make a second pass over the mlist, removing all noads and inserting the proper spacing and penalties 760⟩ ≡
p ← temp_head; link(p) ← null; q ← mlist; r_type ← 0; cur_style ← style;
⟨Set up the values of cur_size and cur_mu, based on cur_style 703⟩;
while q ≠ null do
  begin ⟨If node q is a style node, change the style and goto delete_q; otherwise if it is not a noad, put
    it into the hlist, advance q, and goto done; otherwise set s to the size of noad q, set t to the
    associated type (ord_noad .. inner_noad), and set pen to the associated penalty 761⟩;
  ⟨Append inter-element spacing based on r_type and t 766⟩;
  ⟨Append any new_hlist entries for q, and any appropriate penalties 767⟩;
  r_type ← t;
delete_q: r ← q; q ← link(q); free_node(r, s);
done: end
This code is used in section 726.
761. Just before doing the big case switch in the second pass, the program sets up default values so that
most of the branches are short.
⟨If node q is a style node, change the style and goto delete_q; otherwise if it is not a noad, put it into the
  hlist, advance q, and goto done; otherwise set s to the size of noad q, set t to the associated type
  (ord_noad .. inner_noad), and set pen to the associated penalty 761⟩ ≡
t ← ord_noad; s ← noad_size; pen ← inf_penalty;
case type(q) of
op_noad, open_noad, close_noad, punct_noad, inner_noad: t ← type(q);
bin_noad: begin t ← bin_noad; pen ← bin_op_penalty;
  end;
rel_noad: begin t ← rel_noad; pen ← rel_penalty;
  end;
ord_noad, vcenter_noad, over_noad, under_noad: do_nothing;
radical_noad: s ← radical_noad_size;
accent_noad: s ← accent_noad_size;
fraction_noad: begin t ← inner_noad; s ← fraction_noad_size;
  end;
left_noad, right_noad: t ← make_left_right(q, style, max_d, max_h);
style_node: ⟨Change the current style and goto delete_q 763⟩;
whatsit_node, penalty_node, rule_node, disc_node, adjust_node, ins_node, mark_node, glue_node, kern_node:
  begin link(p) ← q; p ← q; q ← link(q); link(p) ← null; goto done;
  end;
othercases confusion("mlist3")
endcases
This code is used in section 760.
762. The make_left_right function constructs a left or right delimiter of the required size and returns the
value open_noad or close_noad. The right_noad and left_noad will both be based on the original style, so
they will have consistent sizes.
We use the fact that right_noad − left_noad = close_noad − open_noad.
⟨Declare math construction procedures 734⟩ +≡
function make_left_right(q: pointer; style: small_number; max_d, max_h: scaled): small_number;
var delta, delta1, delta2: scaled; { dimensions used in the calculation }
begin if style < script_style then cur_size ← text_size
else cur_size ← 16 × ((style − text_style) div 2);
delta2 ← max_d + axis_height(cur_size); delta1 ← max_h + max_d − delta2;
if delta2 > delta1 then delta1 ← delta2; { delta1 is max distance from axis }
delta ← (delta1 div 500) × delimiter_factor; delta2 ← delta1 + delta1 − delimiter_shortfall;
if delta < delta2 then delta ← delta2;
new_hlist(q) ← var_delimiter(delimiter(q), cur_size, delta);
make_left_right ← type(q) − (left_noad − open_noad); { open_noad or close_noad }
end;
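The size requested from var_delimiter in make_left_right depends only on a handful of integers. This minimal Python sketch, not TeX82 code, reproduces that arithmetic; the function name required_delimiter_size is an illustrative assumption, and all dimensions are integers in scaled points (65536 per point). The maximum distance from the axis is assumed to be nonnegative, which holds in the program because max_h and max_d start at zero and never decrease.

```python
def required_delimiter_size(max_h, max_d, axis_height,
                            delimiter_factor, delimiter_shortfall):
    """Return the total height plus depth requested for a \\left or \\right delimiter."""
    delta2 = max_d + axis_height            # extent of the formula below the axis
    delta1 = max_h + max_d - delta2         # extent of the formula above the axis
    if delta2 > delta1:
        delta1 = delta2                     # delta1 is the max distance from the axis
    delta = (delta1 // 500) * delimiter_factor
    delta2 = delta1 + delta1 - delimiter_shortfall
    if delta < delta2:
        delta = delta2
    return delta
```

For example, with a delimiter factor of 901 and a shortfall of 5pt (327680 scaled points), a formula of height 10pt and depth 4pt around an axis at 2.5pt yields the factor-based size, while a very tall formula is governed by the shortfall instead.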
763. ⟨Change the current style and goto delete_q 763⟩ ≡
begin cur_style ← subtype(q); s ← style_node_size;
⟨Set up the values of cur_size and cur_mu, based on cur_style 703⟩;
goto delete_q;
end
This code is used in section 761.
764. The inter-element spacing in math formulas depends on an 8 × 8 table that TeX preloads as a 64-digit
string. The elements of this string have the following significance:
0 means no space;
1 means a conditional thin space (\nonscript\mskip\thinmuskip);
2 means a thin space (\mskip\thinmuskip);
3 means a conditional medium space (\nonscript\mskip\medmuskip);
4 means a conditional thick space (\nonscript\mskip\thickmuskip);
* means an impossible case.
This is all pretty cryptic, but The TeXbook explains what is supposed to happen, and the string makes it
happen.
A global variable magic_offset is computed so that if a and b are in the range ord_noad .. inner_noad,
then str_pool[a × 8 + b + magic_offset] is the digit for spacing between noad types a and b.
If Pascal had provided a good way to preload constant arrays, this part of the program would not have
been so strange.
define math_spacing =
  "0234000122*4000133**3**344*0400400*000000234000111*1111112341011"
⟨Global variables 13⟩ +≡
magic_offset: integer; { used to find inter-element spacing }
765. ⟨Compute the magic offset 765⟩ ≡
magic_offset ← str_start[math_spacing] - 9 * ord_noad
This code is used in section 1337.
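The indexing trick can be checked with a small Python sketch. It assumes that the eight noad types are numbered consecutively from ord_noad in the order ord, op, bin, rel, open, close, punct, inner, and that ord_noad = 16; the starting position STR_START of the string in the pool is a hypothetical value chosen only for illustration.

```python
MATH_SPACING = "0234000122*4000133**3**344*0400400*000000234000111*1111112341011"

ORD_NOAD = 16      # assumed numeric value of ord_noad
STR_START = 1000   # hypothetical position of the string in str_pool

# A toy string pool that holds only the spacing table.
str_pool = {STR_START + i: ch for i, ch in enumerate(MATH_SPACING)}

# Subtracting 9 * ord_noad cancels the bias in a * 8 + b.
magic_offset = STR_START - 9 * ORD_NOAD


def spacing_digit(a, b):
    """Digit for the spacing between a noad of type a and a following noad of type b."""
    return str_pool[a * 8 + b + magic_offset]


ORD, OP, BIN, REL = ORD_NOAD, ORD_NOAD + 1, ORD_NOAD + 2, ORD_NOAD + 3
ord_then_op = spacing_digit(ORD, OP)    # a thin space
bin_then_bin = spacing_digit(BIN, BIN)  # an impossible case
```

Since a * 8 + b = (a - ord_noad) * 8 + (b - ord_noad) + 9 * ord_noad, the offset reduces the lookup to row a - ord_noad and column b - ord_noad of the table.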
766. ⟨Append inter-element spacing based on r_type and t 766⟩ ≡
if r_type > 0 then {not the first noad}
begin case so(str_pool[r_type * 8 + t + magic_offset]) of
"0": x ← 0;
"1": if cur_style < script_style then x ← thin_mu_skip_code else x ← 0;
"2": x ← thin_mu_skip_code;
"3": if cur_style < script_style then x ← med_mu_skip_code else x ← 0;
"4": if cur_style < script_style then x ← thick_mu_skip_code else x ← 0;
othercases confusion("mlist4")
endcases;
if x ≠ 0 then
begin y ← math_glue(glue_par(x), cur_mu); z ← new_glue(y); glue_ref_count(y) ← null;
link(p) ← z; p ← z;
subtype(z) ← x + 1; {store a symbolic subtype}
end;
end
This code is used in section 760.
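The case statement above amounts to a small decision table. Here is a Python sketch of it, assuming the style codes text_style = 2 and script_style = 4; the function name glue_for and the use of strings in place of the glue parameter codes are illustrative choices, not part of the program.

```python
TEXT_STYLE = 2     # assumed numeric style codes
SCRIPT_STYLE = 4


def glue_for(digit, cur_style):
    """Name of the muskip parameter to insert, or None when no glue is needed."""
    non_script = cur_style < SCRIPT_STYLE   # the conditional cases apply only here
    if digit == "0":
        return None
    if digit == "1":
        return "thin_mu_skip" if non_script else None
    if digit == "2":
        return "thin_mu_skip"                # unconditional thin space
    if digit == "3":
        return "med_mu_skip" if non_script else None
    if digit == "4":
        return "thick_mu_skip" if non_script else None
    # "*" marks an impossible case; TeX would call confusion("mlist4") here.
    raise ValueError("impossible spacing case")
```

Only digit 2 survives in script and scriptscript styles, which is what the \nonscript prefix in the table of section 764 expresses.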
767. We insert a penalty node after the hlist entries of noad q if pen is not an "infinite" penalty, and if
the node immediately following q is not a penalty node or a rel_noad or absent entirely.
⟨Append any new hlist entries for q, and any appropriate penalties 767⟩ ≡
if new_hlist(q) ≠ null then
begin link(p) ← new_hlist(q);
repeat p ← link(p);
until link(p) = null;
end;
if penalties then
if link(q) ≠ null then
if pen < inf_penalty then
begin r_type ← type(link(q));
if r_type ≠ penalty_node then
if r_type ≠ rel_noad then
begin z ← new_penalty(pen); link(p) ← z; p ← z;
end;
end
This code is used in section 760.
PART 37: ALIGNMENT
768. Alignment. It's sort of a miracle whenever \halign and \valign work, because they cut across
so many of the control structures of TeX.
Therefore the present page is probably not the best place for a beginner to start reading this program; it
is better to master everything else first.
Let us focus our thoughts on an example of what the input might be, in order to get some idea about
how the alignment miracle happens. The example doesn't do anything useful, but it is sufficiently general
to indicate all of the special cases that must be dealt with; please do not be disturbed by its apparent
complexity and meaninglessness.
\tabskip 2pt plus 3pt
\halign to 300pt{u1#v1&
\tabskip 1pt plus 1fil u2#v2&
u3#v3\cr
a1&\omit a2&\vrule\cr
\noalign{\vskip 3pt}
b1\span b2\cr
\omit&c2\span\omit\cr}
Here's what happens:
(0) When \halign to 300pt{ is scanned, the scan_spec routine places the 300pt dimension onto the
save_stack, and an align_group code is placed above it. This will make it possible to complete the alignment
when the matching } is found.
(1) The preamble is scanned next. Macros in the preamble are not expanded, except as part of a tabskip
specification. For example, if u2 had been a macro in the preamble above, it would have been expanded,
since TeX must look for "minus..." as part of the tabskip glue. A "preamble list" is constructed based on
the user's preamble; in our case it contains the following seven items:
\glue 2pt plus 3pt (the tabskip preceding column 1)
\alignrecord, width -∞ (preamble info for column 1)
\glue 2pt plus 3pt (the tabskip between columns 1 and 2)
\alignrecord, width -∞ (preamble info for column 2)
\glue 1pt plus 1fil (the tabskip between columns 2 and 3)
\alignrecord, width -∞ (preamble info for column 3)
\glue 1pt plus 1fil (the tabskip following column 3)
These "alignrecord" entries have the same size as an unset_node, since they will later be converted into such
nodes. However, at the moment they have no type or subtype fields; they have info fields instead, and these
info fields are initially set to the value end_span, for reasons explained below. Furthermore, the alignrecord
nodes have no height or depth fields; these are renamed u_part and v_part, and they point to token lists for
the templates of the alignment. For example, the u_part field in the first alignrecord points to the token list
"u1", i.e., the template preceding the "#" for column 1.
(2) TeX now looks at what follows the \cr that ended the preamble. It is not \noalign or \omit, so
this input is put back to be read again, and the template u1 is fed to the scanner. Just before reading u1,
TeX goes into restricted horizontal mode. Just after reading u1, TeX will see a1, and then (when the & is
sensed) TeX will see v1. Then TeX scans an endv token, indicating the end of a column. At this point an
unset node is created, containing the contents of the current hlist (i.e., u1a1v1). The natural width of this
unset node replaces the width field of the alignrecord for column 1; in general, the alignrecords will record
the maximum natural width that has occurred so far in a given column.
(3) Since \omit follows the &, the templates for column 2 are now bypassed. Again TeX goes into
restricted horizontal mode and makes an unset node from the resulting hlist; but this time the hlist contains
simply a2. The natural width of the new unset box is remembered in the width field of the alignrecord for
column 2.
(4) A third unset node is created for column 3, using essentially the mechanism that worked for column 1;
this unset box contains u3\vrule v3. The vertical rule in this case has running dimensions that will later
extend to the height and depth of the whole first row, since each unset node in a row will eventually inherit
the height and depth of its enclosing box.
(5) The first row has now ended; it is made into a single unset box comprising the following seven items:
\glue 2pt plus 3pt
\unsetbox for 1 column: u1a1v1
\glue 2pt plus 3pt
\unsetbox for 1 column: a2
\glue 1pt plus 1fil
\unsetbox for 1 column: u3\vrule v3
\glue 1pt plus 1fil
The width of this unset row is unimportant, but it has the correct height and depth, so the correct baselineskip
glue will be computed as the row is inserted into a vertical list.
(6) Since \noalign follows the current \cr, TeX appends additional material (in this case \vskip 3pt)
to the vertical list. While processing this material, TeX will be in internal vertical mode, and no_align_group
will be on save_stack.
(7) The next row produces an unset box that looks like this:
\glue 2pt plus 3pt
\unsetbox for 2 columns: u1b1v1u2b2v2
\glue 1pt plus 1fil
\unsetbox for 1 column: (empty)
\glue 1pt plus 1fil
The natural width of the unset box that spans columns 1 and 2 is stored in a "span node," which we will
explain later; the info field of the alignrecord for column 1 now points to the new span node, and the info
of the span node points to end_span.
(8) The final row produces the unset box
\glue 2pt plus 3pt
\unsetbox for 1 column: (empty)
\glue 2pt plus 3pt
\unsetbox for 2 columns: u2c2v2
\glue 1pt plus 1fil
A new span node is attached to the alignrecord for column 2.
(9) The last step is to compute the true column widths and to change all the unset boxes to hboxes,
appending the whole works to the vertical list that encloses the \halign. The rules for deciding on the final
widths of each unset column box will be explained below.
Note that as \halign is being processed, we fearlessly give up control to the rest of TeX. At critical junctures,
an alignment routine is called upon to step in and do some little action, but most of the time these routines
just lurk in the background. It's something like post-hypnotic suggestion.
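The shape of the preamble list in step (1), and the way alignrecords accumulate maximum natural widths in steps (2) and (3), can be pictured with a short Python sketch. The dictionaries, the function names build_preamble and record_width, and the sample widths are all hypothetical stand-ins for TeX's nodes and measurements.

```python
import math


def build_preamble(tabskips):
    """Alternate tabskip glue with alignrecords; there is one more glue than columns."""
    items = []
    for i, glue in enumerate(tabskips):
        items.append({"kind": "glue", "spec": glue})
        if i < len(tabskips) - 1:
            # width starts at minus infinity, like null_flag in the program
            items.append({"kind": "alignrecord", "width": -math.inf})
    return items


def record_width(preamble, column, natural_width):
    """Keep the maximum natural width seen so far in a 1-based column."""
    record = preamble[2 * column - 1]
    record["width"] = max(record["width"], natural_width)


preamble = build_preamble(
    ["2pt plus 3pt", "2pt plus 3pt", "1pt plus 1fil", "1pt plus 1fil"]
)
record_width(preamble, 1, 30.0)   # hypothetical width of u1a1v1
record_width(preamble, 1, 25.0)   # a narrower later entry changes nothing
record_width(preamble, 2, 12.0)   # hypothetical width of a2
```

Spanned entries are deliberately left out of this picture; they are recorded in span nodes, which are described later.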
769. We have mentioned that alignrecords contain no height or depth fields. Their glue_sign and glue_order
are pre-empted as well, since it is necessary to store information about what to do when a template ends.
This information is called the extra_info field.
define u_part(#) ≡ mem[# + height_offset].int {pointer to ⟨u_j⟩ token list}
define v_part(#) ≡ mem[# + depth_offset].int {pointer to ⟨v_j⟩ token list}
define extra_info(#) ≡ info(# + list_offset) {info to remember during template}
770. Alignments can occur within alignments, so a small stack is used to access the alignrecord information.
At each level we have a preamble pointer, indicating the beginning of the preamble list; a cur_align pointer,
indicating the current position in the preamble list; a cur_span pointer, indicating the value of cur_align at
the beginning of a sequence of spanned columns; a cur_loop pointer, indicating the tabskip glue before an
alignrecord that should be copied next if the current list is extended; and the align_state variable, which
indicates the nesting of braces so that \cr and \span and tab marks are properly intercepted. There also are
pointers cur_head and cur_tail to the head and tail of a list of adjustments being moved out from horizontal
mode to vertical mode.
The current values of these seven quantities appear in global variables; when they have to be pushed down,
they are stored in 5-word nodes, and align_ptr points to the topmost such node.
define preamble ≡ link(align_head) {the current preamble list}
define align_stack_node_size = 5 {number of mem words to save alignment states}
⟨Global variables 13⟩ +≡
cur_align: pointer; {current position in preamble list}
cur_span: pointer; {start of currently spanned columns in preamble list}
cur_loop: pointer; {place to copy when extending a periodic preamble}
align_ptr: pointer; {most recently pushed-down alignment stack node}
cur_head, cur_tail: pointer; {adjustment list pointers}
771. The align_state and preamble variables are initialized elsewhere.
⟨Set initial values of key variables 21⟩ +≡
align_ptr ← null; cur_align ← null; cur_span ← null; cur_loop ← null; cur_head ← null;
cur_tail ← null;
772. Alignment stack maintenance is handled by a pair of trivial routines called push_alignment and
pop_alignment.
procedure push_alignment;
var p: pointer; {the new alignment stack node}
begin p ← get_node(align_stack_node_size); link(p) ← align_ptr; info(p) ← cur_align;
llink(p) ← preamble; rlink(p) ← cur_span; mem[p + 2].int ← cur_loop; mem[p + 3].int ← align_state;
info(p + 4) ← cur_head; link(p + 4) ← cur_tail; align_ptr ← p; cur_head ← get_avail;
end;
procedure pop_alignment;
var p: pointer; {the top alignment stack node}
begin free_avail(cur_head); p ← align_ptr; cur_tail ← link(p + 4); cur_head ← info(p + 4);
align_state ← mem[p + 3].int; cur_loop ← mem[p + 2].int; cur_span ← rlink(p); preamble ← llink(p);
cur_align ← info(p); align_ptr ← link(p); free_node(p, align_stack_node_size);
end;
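The save-and-restore discipline of push_alignment and pop_alignment can be sketched in Python as follows. The class name AlignmentContext and the use of a Python list in place of the linked 5-word nodes are assumptions made for illustration; only the set of seven saved quantities follows the text above.

```python
class AlignmentContext:
    """Toy model of the seven alignment globals and their push-down stack."""

    FIELDS = ("cur_align", "preamble", "cur_span", "cur_loop",
              "align_state", "cur_head", "cur_tail")

    def __init__(self):
        for name in self.FIELDS:
            setattr(self, name, None)
        self.stack = []          # stands in for the nodes reached from align_ptr

    def push_alignment(self):
        # Save all seven quantities, then start a fresh adjustment list head.
        self.stack.append(tuple(getattr(self, name) for name in self.FIELDS))
        self.cur_head = []       # plays the role of get_avail

    def pop_alignment(self):
        # Restore the quantities saved by the matching push.
        saved = self.stack.pop()
        for name, value in zip(self.FIELDS, saved):
            setattr(self, name, value)


ctx = AlignmentContext()
ctx.cur_align, ctx.align_state = "outer column", 0
ctx.push_alignment()                       # an inner \halign begins
ctx.cur_align, ctx.align_state = "inner column", -1000000
ctx.pop_alignment()                        # the inner alignment is finished
```

As in the Pascal code, pushing does not reset the other variables; the routine that starts the inner alignment is expected to give them new values.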
773. TeX has eight procedures that govern alignments: init_align and fin_align are used at the very
beginning and the very end; init_row and fin_row are used at the beginning and end of individual rows;
init_span is used at the beginning of a sequence of spanned columns (possibly involving only one column);
init_col and fin_col are used at the beginning and end of individual columns; and align_peek is used after
\cr to see whether the next item is \noalign.
We shall consider these routines in the order they are first used during the course of a complete \halign,
namely init_align, align_peek, init_row, init_span, init_col, fin_col, fin_row, fin_align.
774. When \halign or \valign has been scanned in an appropriate mode, TeX calls init_align, whose
task is to get everything off to a good start. This mostly involves scanning the preamble and putting its
information into the preamble list.
⟨Declare the procedure called get_preamble_token 782⟩
procedure align_peek; forward;
procedure normal_paragraph; forward;
procedure init_align;
label done, done1, done2, continue;
var save_cs_ptr: pointer; {warning_index value for error messages}
p: pointer; {for short-term temporary use}
begin save_cs_ptr ← cur_cs; {\halign or \valign, usually}
push_alignment; align_state ← -1000000; {enter a new alignment level}
⟨Check for improper alignment in displayed math 776⟩;
push_nest; {enter a new semantic level}
⟨Change current mode to -vmode for \halign, -hmode for \valign 775⟩;
scan_spec(align_group, false);
⟨Scan the preamble and record it in the preamble list 777⟩;
new_save_level(align_group);
if every_cr ≠ null then begin_token_list(every_cr, every_cr_text);
align_peek; {look for \noalign or \omit}
end;
775. In vertical modes, prev_depth already has the correct value. But if we are in mmode (displayed
formula mode), we reach out to the enclosing vertical mode for the prev_depth value that produces the
correct baseline calculations.
⟨Change current mode to -vmode for \halign, -hmode for \valign 775⟩ ≡
if mode = mmode then
begin mode ← -vmode; prev_depth ← nest[nest_ptr - 2].aux_field.sc;
end
else if mode > 0 then negate(mode)
This code is used in section 774.
776. When \halign is used as a displayed formula, there should be no other pieces of mlists present.
⟨Check for improper alignment in displayed math 776⟩ ≡
if (mode = mmode) ∧ ((tail ≠ head) ∨ (incompleat_noad ≠ null)) then
begin print_err("Improper "); print_esc("halign"); print(" inside $$'s");
help3("Displays can use special alignments (like \eqalignno)")
("only if nothing but the alignment itself is between $$'s.")
("So I've deleted the formulas that preceded this alignment."); error; flush_math;
end
This code is used in section 774.
777. ⟨Scan the preamble and record it in the preamble list 777⟩ ≡
preamble ← null; cur_align ← align_head; cur_loop ← null; scanner_status ← aligning;
warning_index ← save_cs_ptr; align_state ← -1000000; {at this point, cur_cmd = left_brace}
loop begin ⟨Append the current tabskip glue to the preamble list 778⟩;
if cur_cmd = car_ret then goto done; {\cr ends the preamble}
⟨Scan preamble text until cur_cmd is tab_mark or car_ret, looking for changes in the tabskip glue;
append an alignrecord to the preamble list 779⟩;
end;
done: scanner_status ← normal
This code is used in section 774.
778. ⟨Append the current tabskip glue to the preamble list 778⟩ ≡
link(cur_align) ← new_param_glue(tab_skip_code); cur_align ← link(cur_align)
This code is used in section 777.
779. ⟨Scan preamble text until cur_cmd is tab_mark or car_ret, looking for changes in the tabskip glue;
append an alignrecord to the preamble list 779⟩ ≡
⟨Scan the template ⟨u_j⟩, putting the resulting token list in hold_head 783⟩;
link(cur_align) ← new_null_box; cur_align ← link(cur_align); {a new alignrecord}
info(cur_align) ← end_span; width(cur_align) ← null_flag; u_part(cur_align) ← link(hold_head);
⟨Scan the template ⟨v_j⟩, putting the resulting token list in hold_head 784⟩;
v_part(cur_align) ← link(hold_head)
This code is used in section 777.
780. We enter \span into eqtb with tab_mark as its command code, and with span_code as the command
modifier. This makes TeX interpret it essentially the same as an alignment delimiter like &, yet it is
recognizably different when we need to distinguish it from a normal delimiter. It also turns out to be useful
to give a special cr_code to \cr, and an even larger cr_cr_code to \crcr.
The end of a template is represented by two "frozen" control sequences called \endtemplate. The first
has the command code end_template, which is > outer_call, so it will not easily disappear in the presence of
errors. The get_x_token routine converts the first into the second, which has endv as its command code.
define span_code = 256 {distinct from any character}
define cr_code = 257 {distinct from span_code and from any character}
define cr_cr_code = cr_code + 1 {this distinguishes \crcr from \cr}
define end_template_token ≡ cs_token_flag + frozen_end_template
⟨Put each of TeX's primitives into the hash table 226⟩ +≡
primitive("span", tab_mark, span_code);
primitive("cr", car_ret, cr_code); text(frozen_cr) ← "cr"; eqtb[frozen_cr] ← eqtb[cur_val];
primitive("crcr", car_ret, cr_cr_code); text(frozen_end_template) ← "endtemplate";
text(frozen_endv) ← "endtemplate"; eq_type(frozen_endv) ← endv; equiv(frozen_endv) ← null_list;
eq_level(frozen_endv) ← level_one;
eqtb[frozen_end_template] ← eqtb[frozen_endv]; eq_type(frozen_end_template) ← end_template;
781. ⟨Cases of print_cmd_chr for symbolic printing of primitives 227⟩ +≡
tab_mark: if chr_code = span_code then print_esc("span")
else chr_cmd("alignment tab character ");
car_ret: if chr_code = cr_code then print_esc("cr")
else print_esc("crcr");
782. The preamble is copied directly, except that \tabskip causes a change to the tabskip glue, thereby
possibly expanding macros that immediately follow it. An appearance of \span also causes such an expansion.
Note that if the preamble contains \global\tabskip, the \global token survives in the preamble and
the \tabskip defines new tabskip glue (locally).
⟨Declare the procedure called get_preamble_token 782⟩ ≡
procedure get_preamble_token;
label restart;
begin restart: get_token;
while (cur_chr = span_code) ∧ (cur_cmd = tab_mark) do
begin get_token; {this token will be expanded once}
if cur_cmd > max_command then
begin expand; get_token;
end;
end;
if cur_cmd = endv then fatal_error("(interwoven alignment preambles are not allowed)");
if (cur_cmd = assign_glue) ∧ (cur_chr = glue_base + tab_skip_code) then
begin scan_optional_equals; scan_glue(glue_val);
if global_defs > 0 then geq_define(glue_base + tab_skip_code, glue_ref, cur_val)
else eq_define(glue_base + tab_skip_code, glue_ref, cur_val);
goto restart;
end;
end;
This code is used in section 774.
783. Spaces are eliminated from the beginning of a template.
⟨Scan the template ⟨u_j⟩, putting the resulting token list in hold_head 783⟩ ≡
p ← hold_head; link(p) ← null;
loop begin get_preamble_token;
if cur_cmd = mac_param then goto done1;
if (cur_cmd ≤ car_ret) ∧ (cur_cmd ≥ tab_mark) ∧ (align_state = -1000000) then
if (p = hold_head) ∧ (cur_loop = null) ∧ (cur_cmd = tab_mark) then cur_loop ← cur_align
else begin print_err("Missing # inserted in alignment preamble");
help3("There should be exactly one # between &'s, when an")
("\halign or \valign is being set up. In this case you had")
("none, so I've put one in; maybe that will work."); back_error; goto done1;
end
else if (cur_cmd ≠ spacer) ∨ (p ≠ hold_head) then
begin link(p) ← get_avail; p ← link(p); info(p) ← cur_tok;
end;
end;
done1:
This code is used in section 779.
784. ⟨Scan the template ⟨v_j⟩, putting the resulting token list in hold_head 784⟩ ≡
p ← hold_head; link(p) ← null;
loop begin continue: get_preamble_token;
if (cur_cmd ≤ car_ret) ∧ (cur_cmd ≥ tab_mark) ∧ (align_state = -1000000) then goto done2;
if cur_cmd = mac_param then
begin print_err("Only one # is allowed per tab");
help3("There should be exactly one # between &'s, when an")
("\halign or \valign is being set up. In this case you had")
("more than one, so I'm ignoring all but the first."); error; goto continue;
end;
link(p) ← get_avail; p ← link(p); info(p) ← cur_tok;
end;
done2: link(p) ← get_avail; p ← link(p); info(p) ← end_template_token {put \endtemplate at the end}
This code is used in section 779.
785. The tricky part about alignments is getting the templates into the scanner at the right time, and
recovering control when a row or column is finished.
We usually begin a row after each \cr has been sensed, unless that \cr is followed by \noalign or by the
right brace that terminates the alignment. The align_peek routine is used to look ahead and do the right
thing; it either gets a new row started, or gets a \noalign started, or finishes off the alignment.
⟨Declare the procedure called align_peek 785⟩ ≡
procedure align_peek;
label restart;
begin restart: align_state ← 1000000; ⟨Get the next non-blank non-call token 406⟩;
if cur_cmd = no_align then
begin scan_left_brace; new_save_level(no_align_group);
if mode = -vmode then normal_paragraph;
end
else if cur_cmd = right_brace then fin_align
else if (cur_cmd = car_ret) ∧ (cur_chr = cr_cr_code) then goto restart {ignore \crcr}
else begin init_row; {start a new row}
init_col; {start a new column and replace what we peeked at}
end;
end;
This code is used in section 800.
786. To start a row (i.e., a "row" that rhymes with "dough" but not with "bough"), we enter a new semantic
level, copy the first tabskip glue, and change from internal vertical mode to restricted horizontal mode or
vice versa. The space_factor and prev_depth are not used on this semantic level, but we clear them to zero
just to be tidy.
⟨Declare the procedure called init_span 787⟩
procedure init_row;
begin push_nest; mode ← (-hmode - vmode) - mode;
if mode = -hmode then space_factor ← 0 else prev_depth ← 0;
tail_append(new_glue(glue_ptr(preamble))); subtype(tail) ← tab_skip_code + 1;
cur_align ← link(preamble); cur_tail ← cur_head; init_span(cur_align);
end;
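The assignment to mode in init_row swaps the two internal modes with a single subtraction. A Python sketch of the trick, assuming the numeric codes vmode = 1 and hmode = 102 (any two distinct positive codes would behave the same way); the function name row_mode is hypothetical:

```python
VMODE = 1      # assumed mode code for vertical mode
HMODE = 102    # assumed mode code for horizontal mode


def row_mode(enclosing_mode):
    """Mode for a new row: -hmode inside \\halign, -vmode inside \\valign."""
    # The two candidates sum to (-HMODE - VMODE), so subtracting one yields the other.
    return (-HMODE - VMODE) - enclosing_mode


halign_row = row_mode(-VMODE)   # \halign rows are built in restricted horizontal mode
valign_row = row_mode(-HMODE)   # \valign "rows" are built in internal vertical mode
```

Negative mode values denote the internal or restricted variants, which is why both arguments and both results are negative.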
787. The parameter to init_span is a pointer to the alignrecord where the next column or group of columns
will begin. A new semantic level is entered, so that the columns will generate a list for subsequent packaging.
⟨Declare the procedure called init_span 787⟩ ≡
procedure init_span(p : pointer);
begin push_nest;
if mode = -hmode then space_factor ← 1000
else begin prev_depth ← ignore_depth; normal_paragraph;
end;
cur_span ← p;
end;
This code is used in section 786.
788. When a column begins, we assume that cur_cmd is either omit or else the current token should be
put back into the input until the ⟨u_j⟩ template has been scanned. (Note that cur_cmd might be tab_mark
or car_ret.) We also assume that align_state is approximately 1000000 at this time. We remain in the same
mode, and start the template if it is called for.
procedure init_col;
begin extra_info(cur_align) ← cur_cmd;
if cur_cmd = omit then align_state ← 0
else begin back_input; begin_token_list(u_part(cur_align), u_template);
end; {now align_state = 1000000}
end;
789. The scanner sets align_state to zero when the ⟨u_j⟩ template ends. When a subsequent \cr or \span
or tab mark occurs with align_state = 0, the scanner activates the following code, which fires up the ⟨v_j⟩
template. We need to remember the cur_chr, which is either cr_cr_code, cr_code, span_code, or a character
code, depending on how the column text has ended.
This part of the program had better not be activated when the preamble to another alignment is being
scanned, or when no alignment preamble is active.
⟨Insert the ⟨v_j⟩ template and goto restart 789⟩ ≡
begin if (scanner_status = aligning) ∨ (cur_align = null) then
fatal_error("(interwoven alignment preambles are not allowed)");
cur_cmd ← extra_info(cur_align); extra_info(cur_align) ← cur_chr;
if cur_cmd = omit then begin_token_list(omit_template, v_template)
else begin_token_list(v_part(cur_align), v_template);
align_state ← 1000000; goto restart;
end
This code is used in section 342.
790. The token list omit_template just referred to is a constant token list that contains the special control
sequence \endtemplate only.
⟨Initialize the special list heads and constant nodes 790⟩ ≡
info(omit_template) ← end_template_token; {link(omit_template) = null}
See also sections 797, 820, 981, and 988.
This code is used in section 164.
791. When the endv command at the end of a ⟨v_j⟩ template comes through the scanner, things really
start to happen; and it is the fin_col routine that makes them happen. This routine returns true if a row as
well as a column has been finished.
function fin_col: boolean;
label exit;
var p: pointer; {the alignrecord after the current one}
q, r: pointer; {temporary pointers for list manipulation}
s: pointer; {a new span node}
u: pointer; {a new unset box}
w: scaled; {natural width}
o: glue_ord; {order of infinity}
n: halfword; {span counter}
begin if cur_align = null then confusion("endv");
q ← link(cur_align); if q = null then confusion("endv");
if align_state < 500000 then fatal_error("(interwoven alignment preambles are not allowed)");
p ← link(q); ⟨If the preamble list has been traversed, check that the row has ended 792⟩;
if extra_info(cur_align) ≠ span_code then
begin unsave; new_save_level(align_group);
⟨Package an unset box for the current column and record its width 796⟩;
⟨Copy the tabskip glue between columns 795⟩;
if extra_info(cur_align) ≥ cr_code then
begin fin_col ← true; return;
end;
init_span(p);
end;
align_state ← 1000000; ⟨Get the next non-blank non-call token 406⟩;
cur_align ← p; init_col; fin_col ← false;
exit: end;
792. ⟨If the preamble list has been traversed, check that the row has ended 792⟩ ≡
if (p = null) ∧ (extra_info(cur_align) < cr_code) then
if cur_loop ≠ null then ⟨Lengthen the preamble periodically 793⟩
else begin print_err("Extra alignment tab has been changed to "); print_esc("cr");
help3("You have given more \span or & marks than there were")
("in the preamble to the \halign or \valign now in progress.")
("So I'll assume that you meant to type \cr instead."); extra_info(cur_align) ← cr_code;
error;
end
This code is used in section 791.
793. ⟨Lengthen the preamble periodically 793⟩ ≡
begin link(q) ← new_null_box; p ← link(q); {a new alignrecord}
info(p) ← end_span; width(p) ← null_flag; cur_loop ← link(cur_loop);
⟨Copy the templates from node cur_loop into node p 794⟩;
cur_loop ← link(cur_loop); link(p) ← new_glue(glue_ptr(cur_loop));
end
This code is used in section 792.
794. ⟨Copy the templates from node cur_loop into node p 794⟩ ≡
q ← hold_head; r ← u_part(cur_loop);
while r ≠ null do
begin link(q) ← get_avail; q ← link(q); info(q) ← info(r); r ← link(r);
end;
link(q) ← null; u_part(p) ← link(hold_head); q ← hold_head; r ← v_part(cur_loop);
while r ≠ null do
begin link(q) ← get_avail; q ← link(q); info(q) ← info(r); r ← link(r);
end;
link(q) ← null; v_part(p) ← link(hold_head)
This code is used in section 793.
795. ⟨Copy the tabskip glue between columns 795⟩ ≡
tail_append(new_glue(glue_ptr(link(cur_align)))); subtype(tail) ← tab_skip_code + 1
This code is used in section 791.
796. ⟨Package an unset box for the current column and record its width 796⟩ ≡
begin if mode = -hmode then
begin adjust_tail ← cur_tail; u ← hpack(link(head), natural); w ← width(u); cur_tail ← adjust_tail;
adjust_tail ← null;
end
else begin u ← vpackage(link(head), natural, 0); w ← height(u);
end;
n ← min_quarterword; {this represents a span count of 1}
if cur_span ≠ cur_align then ⟨Update width entry for spanned columns 798⟩
else if w > width(cur_align) then width(cur_align) ← w;
type(u) ← unset_node; span_count(u) ← n;
⟨Determine the stretch order 659⟩;
glue_order(u) ← o; glue_stretch(u) ← total_stretch[o];
⟨Determine the shrink order 665⟩;
glue_sign(u) ← o; glue_shrink(u) ← total_shrink[o];
pop_nest; link(tail) ← u; tail ← u;
end
This code is used in section 791.
797. A span node is a 2-word record containing width, info, and link fields. The link field is not really a
link, it indicates the number of spanned columns; the info field points to a span node for the same starting
column, having a greater extent of spanning, or to end_span, which has the largest possible link field; the
width field holds the largest natural width corresponding to a particular set of spanned columns.
A list of the maximum widths so far, for spanned columns starting at a given column, begins with the
info field of the alignrecord for that column.
define span_node_size = 2 {number of mem words for a span node}
⟨Initialize the special list heads and constant nodes 790⟩ +≡
link(end_span) ← max_quarterword + 1; info(end_span) ← null;
798. ⟨Update width entry for spanned columns 798⟩ ≡
begin q ← cur_span;
repeat incr(n); q ← link(link(q));
until q = cur_align;
if n > max_quarterword then confusion("256 spans"); {this can happen, but won't}
q ← cur_span;
while link(info(q)) < n do q ← info(q);
if link(info(q)) > n then
begin s ← get_node(span_node_size); info(s) ← info(q); link(s) ← n; info(q) ← s; width(s) ← w;
end
else if width(info(q)) < w then width(info(q)) ← w;
end
This code is used in section 796.
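The sorted-list update in the second half of this code can be modeled in Python. The classes SpanNode and AlignRecord and the helper spans are hypothetical; the sketch assumes min_quarterword = 0 and max_quarterword = 255, so that the link field of a span node simply counts the extra columns spanned.

```python
MAX_QUARTERWORD = 255   # assumed; min_quarterword is taken to be 0


class SpanNode:
    def __init__(self, link, info=None, width=0):
        self.link = link     # span count, not a real link
        self.info = info     # next node for the same column with a larger span
        self.width = width   # largest natural width seen for this span


# Sentinel with the largest possible link field, so every search stops here.
END_SPAN = SpanNode(link=MAX_QUARTERWORD + 1)


class AlignRecord:
    def __init__(self):
        self.info = END_SPAN


def update_span_width(record, n, w):
    """Record natural width w for an entry that starts at record and has span count n."""
    q = record
    while q.info.link < n:
        q = q.info
    if q.info.link > n:
        q.info = SpanNode(link=n, info=q.info, width=w)   # insert in sorted order
    elif q.info.width < w:
        q.info.width = w                                  # keep the maximum


def spans(record):
    """List the (span count, width) pairs reachable from an alignrecord."""
    out, node = [], record.info
    while node is not END_SPAN:
        out.append((node.link, node.width))
        node = node.info
    return out


rec = AlignRecord()
for n, w in [(2, 50), (1, 40), (2, 45), (1, 60)]:   # hypothetical entries
    update_span_width(rec, n, w)
```

Because end_span carries a link value larger than any real span count, the search loop needs no separate end-of-list test.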
799. At the end of a row, we append an unset box to the current vlist (for \halign) or the current hlist
(for \valign). This unset box contains the unset boxes for the columns, separated by the tabskip glue.
Everything will be set later.
procedure fin_row;
var p: pointer; {the new unset box}
begin if mode = -hmode then
begin p ← hpack(link(head), natural); pop_nest; append_to_vlist(p);
if cur_head ≠ cur_tail then
begin link(tail) ← link(cur_head); tail ← cur_tail;
end;
end
else begin p ← vpack(link(head), natural); pop_nest; link(tail) ← p; tail ← p; space_factor ← 1000;
end;
type(p) ← unset_node; glue_stretch(p) ← 0;
if every_cr ≠ null then begin_token_list(every_cr, every_cr_text);
align_peek;
end; {note that glue_shrink(p) = 0 since glue_shrink ≡ shift_amount}
800. Finally, we will reach the end of the alignment, and we can breathe a sigh of relief that memory
hasn't overflowed. All the unset boxes will now be set so that the columns line up, taking due account of
spanned columns.
procedure do_assignments; forward;
procedure resume_after_display; forward;
procedure build_page; forward;
procedure fin_align;
var p, q, r, s, u, v: pointer; {registers for the list operations}
t, w: scaled; {width of column}
o: scaled; {shift offset for unset boxes}
n: halfword; {matching span amount}
rule_save: scaled; {temporary storage for overfull_rule}
aux_save: memory_word; {temporary storage for aux}
begin if cur_group ≠ align_group then confusion("align1");
unsave; {that align_group was for individual entries}
if cur_group ≠ align_group then confusion("align0");
unsave; {that align_group was for the whole alignment}
if nest[nest_ptr - 1].mode_field = mmode then o ← display_indent
else o ← 0;
⟨Go through the preamble list, determining the column widths and changing the alignrecords to dummy
unset boxes 801⟩;
⟨Package the preamble list, to determine the actual tabskip glue amounts, and let p point to this
prototype box 804⟩;
⟨Set the glue in all the unset boxes of the current list 805⟩;
flush_node_list(p); pop_alignment; ⟨Insert the current list into its environment 812⟩;
end;
⟨Declare the procedure called align_peek 785⟩
801. It's time now to dismantle the preamble list and to compute the column widths. Let $w_{ij}$ be the
maximum of the natural widths of all entries that span columns $i$ through $j$, inclusive. The alignrecord for
column $i$ contains $w_{ii}$ in its width field, and there is also a linked list of the nonzero $w_{ij}$ for increasing $j$,
accessible via the info field; these span nodes contain the value $j - i + \mathit{min\_quarterword}$ in their link fields.
The values of $w_{ii}$ were initialized to null_flag, which we regard as $-\infty$.
The final column widths are defined by the formula
$$w_j = \max_{1\le i\le j}\Bigl(w_{ij} - \sum_{i\le k<j}(t_k + w_k)\Bigr),$$
where $t_k$ is the natural width of the tabskip glue between columns $k$ and $k+1$. However, if $w_{ij} = -\infty$ for
all $i$ in the range $1 \le i \le j$ (i.e., if every entry that involved column $j$ also involved column $j+1$), we let
$w_j = 0$, and we zero out the tabskip glue after column $j$.
TeX computes these values by using the following scheme: First $w_1 = w_{11}$. Then replace $w_{2j}$ by
$\max(w_{2j}, w_{1j} - t_1 - w_1)$, for all $j > 1$. Then $w_2 = w_{22}$. Then replace $w_{3j}$ by $\max(w_{3j}, w_{2j} - t_2 - w_2)$
for all $j > 2$; and so on. If any $w_j$ turns out to be $-\infty$, its value is changed to zero and so is the next
tabskip.
⟨Go through the preamble list, determining the column widths and changing the alignrecords to dummy
unset boxes 801⟩ ≡
q ← link(preamble);
repeat flush_list(u_part(q)); flush_list(v_part(q)); p ← link(link(q));
if width(q) = null_flag then ⟨Nullify width(q) and the tabskip glue following this column 802⟩;
if info(q) ≠ end_span then
⟨Merge the widths in the span nodes of q with those of p, destroying the span nodes of q 803⟩;
type(q) ← unset_node; span_count(q) ← min_quarterword; height(q) ← 0; depth(q) ← 0;
glue_order(q) ← normal; glue_sign(q) ← normal; glue_stretch(q) ← 0; glue_shrink(q) ← 0; q ← p;
until q = null
This code is used in section 800.
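The left-to-right scheme described in section 801 can be sketched in Python as follows. The function name final_widths and the dictionary representation are hypothetical; w maps pairs (i, j) of 1-based column numbers to the widths $w_{ij}$, missing pairs count as minus infinity, and t maps k to $t_k$. This is only a model of the arithmetic, not of the node manipulation.

```python
import math

NEG_INF = -math.inf


def final_widths(w, t, ncols):
    """Return {j: w_j} computed by the incremental replacement scheme."""
    w = dict(w)          # work on copies; the inputs stay untouched
    t = dict(t)
    result = {}
    for i in range(1, ncols + 1):
        wi = w.get((i, i), NEG_INF)
        if wi == NEG_INF:
            wi = 0       # no entry ends in column i: zero width ...
            t[i] = 0     # ... and zero tabskip after it
        result[i] = wi
        # Push every span (i, j) down to column i + 1, less what column i uses up.
        for j in range(i + 1, ncols + 1):
            if (i, j) in w:
                candidate = w[(i, j)] - t[i] - wi
                w[(i + 1, j)] = max(w.get((i + 1, j), NEG_INF), candidate)
    return result


# Hypothetical widths: three columns, one entry spanning columns 1 through 3.
widths = final_widths({(1, 1): 10, (2, 2): 5, (3, 3): 7, (1, 3): 40},
                      {1: 2, 2: 3, 3: 0}, 3)
```

In this example the formula gives $w_3 = \max(7,\ 40 - (2 + 10) - (3 + 5)) = 20$, which is the value the scheme reaches after two replacement steps.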
802. ⟨Nullify width(q) and the tabskip glue following this column 802⟩ ≡
begin width(q) ← 0; r ← link(q); s ← glue_ptr(r);
if s ≠ zero_glue then
begin add_glue_ref(zero_glue); delete_glue_ref(s); glue_ptr(r) ← zero_glue;
end;
end
This code is used in section 801.
803. Merging of two span-node lists is a typical exercise in the manipulation of linearly linked data
structures. The essential invariant in the following repeat loop is that we want to dispense with node
r, in q's list, and u is its successor; all nodes of p's list up to and including s have been processed, and the
successor of s matches r or precedes r or follows r, according as link(r) = n or link(r) > n or link(r) < n.
⟨Merge the widths in the span nodes of q with those of p, destroying the span nodes of q 803⟩ ≡
begin t ← width(q) + width(glue_ptr(link(q))); r ← info(q); s ← end_span; info(s) ← p;
n ← min_quarterword + 1;
repeat width(r) ← width(r) - t; u ← info(r);
while link(r) > n do
begin s ← info(s); n ← link(info(s)) + 1;
end;
if link(r) < n then
begin info(r) ← info(s); info(s) ← r; decr(link(r)); s ← r;
end
else begin if width(r) > width(info(s)) then width(info(s)) ← width(r);
free_node(r, span_node_size);
end;
r ← u;
until r = end_span;
end
This code is used in section 801.
804. Now the preamble list has been converted to a list of alternating unset boxes and tabskip glue, where
the box widths are equal to the final column sizes. In case of \valign, we change the widths to heights, so
that a correct error message will be produced if the alignment is overfull or underfull.
⟨ Package the preamble list, to determine the actual tabskip glue amounts, and let p point to this prototype
    box 804 ⟩ ≡
  save_ptr ← save_ptr - 2; pack_begin_line ← -mode_line;
  if mode = -vmode then
    begin rule_save ← overfull_rule; overfull_rule ← 0; { prevent rule from being packaged }
    p ← hpack(preamble, saved(1), saved(0)); overfull_rule ← rule_save;
    end
  else begin q ← link(preamble);
    repeat height(q) ← width(q); width(q) ← 0; q ← link(link(q));
    until q = null;
    p ← vpack(preamble, saved(1), saved(0)); q ← link(preamble);
    repeat width(q) ← height(q); height(q) ← 0; q ← link(link(q));
    until q = null;
    end;
  pack_begin_line ← 0
This code is used in section 800.
805. ⟨ Set the glue in all the unset boxes of the current list 805 ⟩ ≡
  q ← link(head); s ← head;
  while q ≠ null do
    begin if ¬is_char_node(q) then
      if type(q) = unset_node then ⟨ Set the unset box q and the unset boxes in it 807 ⟩
      else if type(q) = rule_node then
        ⟨ Make the running dimensions in rule q extend to the boundaries of the alignment 806 ⟩;
    s ← q; q ← link(q);
    end
This code is used in section 800.
806. ⟨ Make the running dimensions in rule q extend to the boundaries of the alignment 806 ⟩ ≡
  begin if is_running(width(q)) then width(q) ← width(p);
  if is_running(height(q)) then height(q) ← height(p);
  if is_running(depth(q)) then depth(q) ← depth(p);
  if o ≠ 0 then
    begin r ← link(q); link(q) ← null; q ← hpack(q, natural); shift_amount(q) ← o; link(q) ← r;
    link(s) ← q;
    end;
  end
This code is used in section 805.
807. The unset box q represents a row that contains one or more unset boxes, depending on how soon \cr
occurred in that row.
⟨ Set the unset box q and the unset boxes in it 807 ⟩ ≡
  begin if mode = -vmode then
    begin type(q) ← hlist_node; width(q) ← width(p);
    end
  else begin type(q) ← vlist_node; height(q) ← height(p);
    end;
  glue_order(q) ← glue_order(p); glue_sign(q) ← glue_sign(p); glue_set(q) ← glue_set(p);
  shift_amount(q) ← o; r ← link(list_ptr(q)); s ← link(list_ptr(p));
  repeat ⟨ Set the glue in node r and change it from an unset node 808 ⟩;
    r ← link(link(r)); s ← link(link(s));
  until r = null;
  end
This code is used in section 805.
808. A box made from spanned columns will be followed by tabskip glue nodes and by empty boxes as if
there were no spanning. This permits perfect alignment of subsequent entries, and it prevents values that
depend on floating point arithmetic from entering into the dimensions of any boxes.
⟨ Set the glue in node r and change it from an unset node 808 ⟩ ≡
  n ← span_count(r); t ← width(s); w ← t; u ← hold_head;
  while n > min_quarterword do
    begin decr(n); ⟨ Append tabskip glue and an empty box to list u, and update s and t as the prototype
        nodes are passed 809 ⟩;
    end;
  if mode = -vmode then
    ⟨ Make the unset node r into an hlist_node of width w, setting the glue as if the width were t 810 ⟩
  else ⟨ Make the unset node r into a vlist_node of height w, setting the glue as if the height were t 811 ⟩;
  shift_amount(r) ← 0;
  if u ≠ hold_head then { append blank boxes to account for spanned nodes }
    begin link(u) ← link(r); link(r) ← link(hold_head); r ← u;
    end
This code is used in section 807.
809. ⟨ Append tabskip glue and an empty box to list u, and update s and t as the prototype nodes are
    passed 809 ⟩ ≡
  s ← link(s); v ← glue_ptr(s); link(u) ← new_glue(v); u ← link(u); subtype(u) ← tab_skip_code + 1;
  t ← t + width(v);
  if glue_sign(p) = stretching then
    begin if stretch_order(v) = glue_order(p) then t ← t + round(float(glue_set(p)) * stretch(v));
    end
  else if glue_sign(p) = shrinking then
    begin if shrink_order(v) = glue_order(p) then t ← t - round(float(glue_set(p)) * shrink(v));
    end;
  s ← link(s); link(u) ← new_null_box; u ← link(u); t ← t + width(s);
  if mode = -vmode then width(u) ← width(s) else begin type(u) ← vlist_node; height(u) ← width(s);
    end
This code is used in section 808.
810. ⟨ Make the unset node r into an hlist_node of width w, setting the glue as if the width were t 810 ⟩ ≡
  begin height(r) ← height(q); depth(r) ← depth(q);
  if t = width(r) then
    begin glue_sign(r) ← normal; glue_order(r) ← normal; set_glue_ratio_zero(glue_set(r));
    end
  else if t > width(r) then
    begin glue_sign(r) ← stretching;
    if glue_stretch(r) = 0 then set_glue_ratio_zero(glue_set(r))
    else glue_set(r) ← unfloat((t - width(r))/glue_stretch(r));
    end
  else begin glue_order(r) ← glue_sign(r); glue_sign(r) ← shrinking;
    if glue_shrink(r) = 0 then set_glue_ratio_zero(glue_set(r))
    else if (glue_order(r) = normal) ∧ (width(r) - t > glue_shrink(r)) then
      set_glue_ratio_one(glue_set(r))
    else glue_set(r) ← unfloat((width(r) - t)/glue_shrink(r));
    end;
  width(r) ← w; type(r) ← hlist_node;
  end
This code is used in section 808.
811. ⟨ Make the unset node r into a vlist_node of height w, setting the glue as if the height were t 811 ⟩ ≡
  begin width(r) ← width(q);
  if t = height(r) then
    begin glue_sign(r) ← normal; glue_order(r) ← normal; set_glue_ratio_zero(glue_set(r));
    end
  else if t > height(r) then
    begin glue_sign(r) ← stretching;
    if glue_stretch(r) = 0 then set_glue_ratio_zero(glue_set(r))
    else glue_set(r) ← unfloat((t - height(r))/glue_stretch(r));
    end
  else begin glue_order(r) ← glue_sign(r); glue_sign(r) ← shrinking;
    if glue_shrink(r) = 0 then set_glue_ratio_zero(glue_set(r))
    else if (glue_order(r) = normal) ∧ (height(r) - t > glue_shrink(r)) then
      set_glue_ratio_one(glue_set(r))
    else glue_set(r) ← unfloat((height(r) - t)/glue_shrink(r));
    end;
  height(r) ← w; type(r) ← vlist_node;
  end
This code is used in section 808.
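The glue-setting logic shared by sections 810 and 811 can be sketched in isolation. The Python below is only an illustration under assumptions: the function name `set_glue`, its parameters, and the numeric sign codes are invented for the example, and it returns a plain float where the program stores a glue_ratio. It assumes that in an unset node the glue_sign field holds the order of infinity of the shrink, which is why the shrink order is passed separately here.

```python
NORMAL, STRETCHING, SHRINKING = 0, 1, 2  # illustrative codes for glue_sign


def set_glue(target, natural, stretch, shrink, shrink_order=0):
    """Return (glue_sign, glue_set) for a box of size `natural` set to `target`.

    `stretch` and `shrink` are the totals recorded in the unset node;
    `shrink_order` is 0 for finite shrink (an assumption of this sketch).
    """
    if target == natural:
        return NORMAL, 0.0
    if target > natural:
        ratio = 0.0 if stretch == 0 else (target - natural) / stretch
        return STRETCHING, ratio
    if shrink == 0:
        return SHRINKING, 0.0
    if shrink_order == 0 and natural - target > shrink:
        # finite glue is never asked to shrink by more than its full amount
        return SHRINKING, 1.0
    return SHRINKING, (natural - target) / shrink
```

Capping the ratio at one in the finite-shrink case mirrors the call to set_glue_ratio_one above; the box then comes out overfull rather than having its glue shrink past its limit.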
812. We now have a completed alignment, in the list that starts at head and ends at tail. This list will be
merged with the one that encloses it. (In case the enclosing mode is mmode, for displayed formulas, we will
need to insert glue before and after the display; that part of the program will be deferred until we're more
familiar with such operations.)
In restricted horizontal mode, the clang part of aux is undefined; an over-cautious Pascal runtime system
may complain about this.
⟨ Insert the current list into its environment 812 ⟩ ≡
  aux_save ← aux; p ← link(head); q ← tail; pop_nest;
  if mode = mmode then ⟨ Finish an alignment in a display 1206 ⟩
  else begin aux ← aux_save; link(tail) ← p;
    if p ≠ null then tail ← q;
    if mode = vmode then build_page;
    end
This code is used in section 800.
813. Breaking paragraphs into lines. We come now to what is probably the most interesting algorithm
of TeX: the mechanism for choosing the "best possible" breakpoints that yield the individual lines of
a paragraph. TeX's line-breaking algorithm takes a given horizontal list and converts it to a sequence of
boxes that are appended to the current vertical list. In the course of doing this, it creates a special data
structure containing three kinds of records that are not used elsewhere in TeX. Such nodes are created while
a paragraph is being processed, and they are destroyed afterwards; thus, the other parts of TeX do not need
to know anything about how line-breaking is done.
The method used here is based on an approach devised by Michael F. Plass and the author in 1977,
subsequently generalized and improved by the same two people in 1980. A detailed discussion appears in
Software: Practice & Experience 11 (1981), 1119–1184, where it is shown that the line-breaking problem
can be regarded as a special case of the problem of computing the shortest path in an acyclic network. The
cited paper includes numerous examples and describes the history of line breaking as it has been practiced
by printers through the ages. The present implementation adds two new ideas to the algorithm of 1980:
Memory space requirements are considerably reduced by using smaller records for inactive nodes than for
active ones, and arithmetic overflow is avoided by using "delta distances" instead of keeping track of the
total distance from the beginning of the paragraph to the current point.
814. The line_break procedure should be invoked only in horizontal mode; it leaves that mode and places
its output into the current vlist of the enclosing vertical mode (or internal vertical mode). There is one
explicit parameter: final_widow_penalty is the amount of additional penalty to be inserted before the final
line of the paragraph.
There are also a number of implicit parameters: The hlist to be broken starts at link(head), and it is
nonempty. The value of prev_graf in the enclosing semantic level tells where the paragraph should begin
in the sequence of line numbers, in case hanging indentation or \parshape is in use; prev_graf is zero
unless this paragraph is being continued after a displayed formula. Other implicit parameters, such as the
par_shape_ptr and various penalties to use for hyphenation, etc., appear in eqtb.
After line_break has acted, it will have updated the current vlist and the value of prev_graf. Furthermore,
the global variable just_box will point to the final box created by line_break, so that the width of this line can
be ascertained when it is necessary to decide whether to use above_display_skip or above_display_short_skip
before a displayed formula.
⟨ Global variables 13 ⟩ +≡
just_box: pointer; { the hlist_node for the last line of the new paragraph }
815. Since line_break is a rather lengthy procedure (sort of a small world unto itself), we must build it
up little by little, somewhat more cautiously than we have done with the simpler procedures of TeX. Here
is the general outline.
⟨ Declare subprocedures for line_break 826 ⟩
procedure line_break(final_widow_penalty : integer);
  label done, done1, done2, done3, done4, done5, continue;
  var ⟨ Local variables for line breaking 862 ⟩
  begin pack_begin_line ← mode_line; { this is for over/underfull box messages }
  ⟨ Get ready to start line breaking 816 ⟩;
  ⟨ Find optimal breakpoints 863 ⟩;
  ⟨ Break the paragraph at the chosen breakpoints, justify the resulting lines to the correct widths, and
      append them to the current vertical list 876 ⟩;
  ⟨ Clean up the memory by removing the break nodes 865 ⟩;
  pack_begin_line ← 0;
  end;
816. The first task is to move the list from head to temp_head and go into the enclosing semantic level.
We also append the \parfillskip glue to the end of the paragraph, removing a space (or other glue node)
if it was there, since spaces usually precede blank lines and instances of `$$'. The par_fill_skip is preceded
by an infinite penalty, so it will never be considered as a potential breakpoint.
This code assumes that a glue_node and a penalty_node occupy the same number of mem words.
⟨ Get ready to start line breaking 816 ⟩ ≡
  link(temp_head) ← link(head);
  if is_char_node(tail) then tail_append(new_penalty(inf_penalty))
  else if type(tail) ≠ glue_node then tail_append(new_penalty(inf_penalty))
  else begin type(tail) ← penalty_node; delete_glue_ref(glue_ptr(tail)); flush_node_list(leader_ptr(tail));
    penalty(tail) ← inf_penalty;
    end;
  link(tail) ← new_param_glue(par_fill_skip_code); init_cur_lang ← prev_graf mod '200000;
  init_l_hyf ← prev_graf div '20000000; init_r_hyf ← (prev_graf div '200000) mod '100; pop_nest;
See also sections 827, 834, and 848.
This code is used in section 815.
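The three assignments that read prev_graf treat it as a packed integer with octal field boundaries. The Python sketch below shows the arithmetic only; the function names are invented for the example, and the `pack` helper is a hypothetical inverse that simply assumes the fields were combined in the matching way.

```python
def unpack_prev_graf(packed):
    """Split a packed value into (init_cur_lang, init_l_hyf, init_r_hyf)."""
    init_cur_lang = packed % 0o200000            # low 16 bits
    init_l_hyf = packed // 0o20000000            # everything above bit 22
    init_r_hyf = (packed // 0o200000) % 0o100    # the 6 bits in between
    return init_cur_lang, init_l_hyf, init_r_hyf


def pack(l_hyf, r_hyf, lang):
    """Hypothetical inverse, assuming the layout implied by the unpacking."""
    return (l_hyf * 0o100 + r_hyf) * 0o200000 + lang
```

Since '100 times '200000 equals '20000000, the three fields do not overlap as long as r_hyf stays below 64 and the language code below 65536.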
817. When looking for optimal line breaks, TeX creates a "break node" for each break that is feasible,
in the sense that there is a way to end a line at the given place without requiring any line to stretch more
than a given tolerance. A break node is characterized by three things: the position of the break (which is
a pointer to a glue_node, math_node, penalty_node, or disc_node); the ordinal number of the line that will
follow this breakpoint; and the fitness classification of the line that has just ended, i.e., tight_fit, decent_fit,
loose_fit, or very_loose_fit.
  define tight_fit = 3 { fitness classification for lines shrinking 0.5 to 1.0 of their shrinkability }
  define loose_fit = 1 { fitness classification for lines stretching 0.5 to 1.0 of their stretchability }
  define very_loose_fit = 0 { fitness classification for lines stretching more than their stretchability }
  define decent_fit = 2 { fitness classification for all other lines }
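The four classes can be illustrated by a small classifier driven by the glue ratio of a line. This Python sketch follows only the wording of the comments above; the function name, the signed-ratio convention, and the treatment of the exact boundary values are assumptions, and the program itself may decide the class from other quantities.

```python
VERY_LOOSE_FIT, LOOSE_FIT, DECENT_FIT, TIGHT_FIT = 0, 1, 2, 3


def fitness_class(ratio):
    """Classify a line by its glue ratio.

    ratio > 0 is the fraction of available stretch that is used;
    ratio < 0 is (minus) the fraction of available shrink that is used.
    """
    if ratio < -0.5:
        return TIGHT_FIT        # shrinking 0.5 to 1.0 of the shrinkability
    if ratio <= 0.5:
        return DECENT_FIT       # all other lines
    if ratio <= 1.0:
        return LOOSE_FIT        # stretching 0.5 to 1.0 of the stretchability
    return VERY_LOOSE_FIT       # stretching more than the stretchability
```

The numeric codes are ordered so that neighbouring classes differ by one, which makes it cheap to test whether two consecutive lines are visually incompatible.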
818. The algorithm essentially determines the best possible way to achieve each feasible combination of
position, line, and fitness. Thus, it answers questions like, "What is the best way to break the opening
part of the paragraph so that the fourth line is a tight line ending at such-and-such a place?" However, the
fact that all lines are to be the same length after a certain point makes it possible to regard all sufficiently
large line numbers as equivalent, when the looseness parameter is zero, and this makes it possible for the
algorithm to save space and time.
An "active node" and a "passive node" are created in mem for each feasible breakpoint that needs to be
considered. Active nodes are three words long and passive nodes are two words long. We need active nodes
only for breakpoints near the place in the paragraph that is currently being examined, so they are recycled
within a comparatively short time after they are created.
819. An active node for a given breakpoint contains six fields:
link points to the next node in the list of active nodes; the last active node has link = last_active.
break_node points to the passive node associated with this breakpoint.
line_number is the number of the line that follows this breakpoint.
fitness is the fitness classification of the line ending at this breakpoint.
type is either hyphenated or unhyphenated, depending on whether this breakpoint is a disc_node.
total_demerits is the minimum possible sum of demerits over all lines leading from the beginning of the
paragraph to this breakpoint.
The value of link(active) points to the first active node on a linked list of all currently active nodes. This
list is in order by line_number, except that nodes with line_number > easy_line may be in any order relative
to each other.
  define active_node_size = 3 { number of words in active nodes }
  define fitness ≡ subtype { very_loose_fit .. tight_fit on final line for this break }
  define break_node ≡ rlink { pointer to the corresponding passive node }
  define line_number ≡ llink { line that begins at this breakpoint }
  define total_demerits(#) ≡ mem[# + 2].int { the quantity that TeX minimizes }
  define unhyphenated = 0 { the type of a normal active break node }
  define hyphenated = 1 { the type of an active node that breaks at a disc_node }
  define last_active ≡ active { the active list ends where it begins }
820. ⟨ Initialize the special list heads and constant nodes 790 ⟩ +≡
  type(last_active) ← hyphenated; line_number(last_active) ← max_halfword; subtype(last_active) ← 0;
    { the subtype is never examined by the algorithm }
821. The passive node for a given breakpoint contains only four fields:
link points to the passive node created just before this one, if any, otherwise it is null.
cur_break points to the position of this breakpoint in the horizontal list for the paragraph being broken.
prev_break points to the passive node that should precede this one in an optimal path to this breakpoint.
serial is equal to n if this passive node is the nth one created during the current pass. (This field is used
only when printing out detailed statistics about the line-breaking calculations.)
There is a global variable called passive that points to the most recently created passive node. Another
global variable, printed_node, is used to help print out the paragraph when detailed information about the
line-breaking computation is being displayed.
  define passive_node_size = 2 { number of words in passive nodes }
  define cur_break ≡ rlink { in passive node, points to position of this breakpoint }
  define prev_break ≡ llink { points to passive node that should precede this one }
  define serial ≡ info { serial number for symbolic identification }
⟨ Global variables 13 ⟩ +≡
passive: pointer; { most recent node on passive list }
printed_node: pointer; { most recent node that has been printed }
pass_number: halfword; { the number of passive nodes allocated on this pass }
822. The active list also contains "delta" nodes that help the algorithm compute the badness of individual
lines. Such nodes appear only between two active nodes, and they have type = delta_node. If p and r are
active nodes and if q is a delta node between them, so that link(p) = q and link(q) = r, then q tells the
space difference between lines in the horizontal list that start after breakpoint p and lines that start after
breakpoint r. In other words, if we know the length of the line that starts after p and ends at our current
position, then the corresponding length of the line that starts after r is obtained by adding the amounts in
node q. A delta node contains six scaled numbers, since it must record the net change in glue stretchability
with respect to all orders of infinity. The natural width difference appears in mem[q + 1].sc; the stretch
differences in units of pt, fil, fill, and filll appear in mem[q + 2 .. q + 5].sc; and the shrink difference appears
in mem[q + 6].sc. The subtype field of a delta node is not used.
  define delta_node_size = 7 { number of words in a delta node }
  define delta_node = 2 { type field in a delta node }
823. As the algorithm runs, it maintains a set of six delta-like registers for the length of the line following
the first active breakpoint to the current position in the given hlist. When it makes a pass through the active
list, it also maintains a similar set of six registers for the length following the active breakpoint of current
interest. A third set holds the length of an empty line (namely, the sum of \leftskip and \rightskip);
and a fourth set is used to create new delta nodes.
When we pass a delta node we want to do operations like
  for k ← 1 to 6 do cur_active_width[k] ← cur_active_width[k] + mem[q + k].sc;
and we want to do this without the overhead of for loops. The do_all_six macro makes such six-tuples
convenient.
  define do_all_six(#) ≡ #(1); #(2); #(3); #(4); #(5); #(6)
⟨ Global variables 13 ⟩ +≡
active_width: array [1 .. 6] of scaled; { distance from first active node to cur_p }
cur_active_width: array [1 .. 6] of scaled; { distance from current active node }
background: array [1 .. 6] of scaled; { length of an empty line }
break_width: array [1 .. 6] of scaled; { length being computed after current break }
824. Let's state the principles of the delta nodes more precisely and concisely, so that the following
programs will be less obscure. For each legal breakpoint $p$ in the paragraph, we define two quantities
$\alpha(p)$ and $\beta(p)$ such that the length of material in a line from breakpoint $p$ to breakpoint $q$
is $\gamma + \beta(q) - \alpha(p)$, for some fixed $\gamma$. Intuitively, $\alpha(p)$ and $\beta(q)$ are the
total length of material from the beginning of the paragraph to a point "after" a break at $p$ and to a point
"before" a break at $q$; and $\gamma$ is the width of an empty line, namely the length contributed by
\leftskip and \rightskip.
Suppose, for example, that the paragraph consists entirely of alternating boxes and glue skips; let
the boxes have widths $x_1 \ldots x_n$ and let the skips have widths $y_1 \ldots y_n$, so that the paragraph
can be represented by $x_1 y_1 \ldots x_n y_n$. Let $p_i$ be the legal breakpoint at $y_i$; then
$\alpha(p_i) = x_1 + y_1 + \cdots + x_i + y_i$, and $\beta(p_i) = x_1 + y_1 + \cdots + x_i$. To check this,
note that the length of material from $p_2$ to $p_5$, say, is
$\gamma + x_3 + y_3 + x_4 + y_4 + x_5 = \gamma + \beta(p_5) - \alpha(p_2)$.
The quantities $\alpha$, $\beta$, $\gamma$ involve glue stretchability and shrinkability as well as a natural
width. If we were to compute $\alpha(p)$ and $\beta(p)$ for each $p$, we would need multiple precision
arithmetic, and the multiprecise numbers would have to be kept in the active nodes. TeX avoids this problem
by working entirely with relative differences or "deltas." Suppose, for example, that the active list contains
$a_1\,\delta_1\,a_2\,\delta_2\,a_3$, where the $a$'s are active breakpoints and the $\delta$'s are delta nodes.
Then $\delta_1 = \alpha(a_1) - \alpha(a_2)$ and $\delta_2 = \alpha(a_2) - \alpha(a_3)$.
If the line breaking algorithm is currently positioned at some other breakpoint p, the active_width array
contains the value $\gamma + \beta(p) - \alpha(a_1)$. If we are scanning through the list of active nodes and
considering a tentative line that runs from $a_2$ to p, say, the cur_active_width array will contain the value
$\gamma + \beta(p) - \alpha(a_2)$. Thus, when we move from $a_2$ to $a_3$, we want to add
$\alpha(a_2) - \alpha(a_3)$ to cur_active_width; and this is just $\delta_2$, which appears in the active list
between $a_2$ and $a_3$. The background array contains $\gamma$. The break_width array
will be used to calculate values of new delta nodes when the active list is being updated.
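The bookkeeping with alpha, beta, and the deltas can be checked numerically on the boxes-and-glue example. The Python sketch below works with natural widths only (a single number instead of the six-component arrays), and the helper names `alpha_beta` and `line_length` are invented for the illustration.

```python
from itertools import accumulate


def alpha_beta(xs, ys):
    """Return lists (alpha, beta), where index i-1 belongs to breakpoint p_i.

    xs are the box widths x_1..x_n and ys the glue widths y_1..y_n.
    """
    alpha = list(accumulate(x + y for x, y in zip(xs, ys)))  # x_1+y_1+...+x_i+y_i
    beta = [a - y for a, y in zip(alpha, ys)]                # same, without y_i
    return alpha, beta


def line_length(gamma, alpha, beta, p, q):
    """Length of the line from breakpoint p_p to breakpoint p_q (1-based)."""
    return gamma + beta[q - 1] - alpha[p - 1]


xs = [1, 2, 3, 4, 5]
ys = [10, 20, 30, 40, 50]
alpha, beta = alpha_beta(xs, ys)

# Active list a_1 delta_1 a_2 with a_1 = p_1 and a_2 = p_2; current position p_5.
delta_1 = alpha[0] - alpha[1]
width_from_a1 = line_length(0, alpha, beta, 1, 5)
width_from_a2 = width_from_a1 + delta_1   # stepping over the delta node
```

With these numbers the line from p_2 to p_5 has length x_3 + y_3 + x_4 + y_4 + x_5 = 82, and adding delta_1 to the width measured from a_1 yields the same value without ever storing alpha or beta in a node.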
825. Glue nodes in a horizontal list that is being paragraphed are not supposed to include "infinite"
shrinkability; that is why the algorithm maintains four registers for stretching but only one for shrinking. If
the user tries to introduce infinite shrinkability, the shrinkability will be reset to finite and an error message
will be issued. A boolean variable no_shrink_error_yet prevents this error message from appearing more than
once per paragraph.
  define check_shrinkage(#) ≡
      if (shrink_order(#) ≠ normal) ∧ (shrink(#) ≠ 0) then
        begin # ← finite_shrink(#);
        end
⟨ Global variables 13 ⟩ +≡
no_shrink_error_yet: boolean; { have we complained about infinite shrinkage? }
826. ⟨ Declare subprocedures for line_break 826 ⟩ ≡
function finite_shrink(p : pointer): pointer; { recovers from infinite shrinkage }
  var q: pointer; { new glue specification }
  begin if no_shrink_error_yet then
    begin no_shrink_error_yet ← false; print_err("Infinite glue shrinkage found in a paragraph");
    help5("The paragraph just ended includes some glue that has")
    ("infinite shrinkability, e.g., `\hskip 0pt minus 1fil'.")
    ("Such glue doesn't belong there---it allows a paragraph")
    ("of any length to fit on one line. But it's safe to proceed,")
    ("since the offensive shrinkability has been made finite."); error;
    end;
  q ← new_spec(p); shrink_order(q) ← normal; delete_glue_ref(p); finite_shrink ← q;
  end;
See also sections 829, 877, 895, and 942.
This code is used in section 815.
827. ⟨ Get ready to start line breaking 816 ⟩ +≡
  no_shrink_error_yet ← true;
  check_shrinkage(left_skip); check_shrinkage(right_skip);
  q ← left_skip; r ← right_skip; background[1] ← width(q) + width(r);
  background[2] ← 0; background[3] ← 0; background[4] ← 0; background[5] ← 0;
  background[2 + stretch_order(q)] ← stretch(q);
  background[2 + stretch_order(r)] ← background[2 + stretch_order(r)] + stretch(r);
  background[6] ← shrink(q) + shrink(r);
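The way the background array is assembled from the two skips can be sketched as follows. This Python fragment is an illustration only: the `GlueSpec` class and the function name are invented, shrinkage is assumed to be finite already (as check_shrinkage guarantees), and index 0 of the list is left unused so that positions 1 to 6 match the array in the text.

```python
from dataclasses import dataclass


@dataclass
class GlueSpec:
    width: int = 0
    stretch: int = 0
    stretch_order: int = 0   # 0 = finite (pt), 1 = fil, 2 = fill, 3 = filll
    shrink: int = 0          # assumed finite in this sketch


def background_widths(left_skip, right_skip):
    """Return a 7-element list; entries 1..6 mirror background[1..6]."""
    bg = [0] * 7
    bg[1] = left_skip.width + right_skip.width
    # each skip contributes its stretch to the slot for its own order of infinity
    bg[2 + left_skip.stretch_order] = left_skip.stretch
    bg[2 + right_skip.stretch_order] += right_skip.stretch
    bg[6] = left_skip.shrink + right_skip.shrink
    return bg
```

The second stretch assignment adds rather than overwrites, so two skips of the same order accumulate in one slot while skips of different orders land in separate slots.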
828. A pointer variable cur_p runs through the given horizontal list as we look for breakpoints. This
variable is global, since it is used both by line_break and by its subprocedure try_break.
Another global variable called threshold is used to determine the feasibility of individual lines: Breakpoints
are feasible if there is a way to reach them without creating lines whose badness exceeds threshold. (The
badness is compared to threshold before penalties are added, so that penalty values do not affect the feasibility
of breakpoints, except that no break is allowed when the penalty is 10000 or more.) If threshold is 10000
or more, all legal breaks are considered feasible, since the badness function specified above never returns a
value greater than 10000.
Up to three passes might be made through the paragraph in an attempt to find at least one set of feasible
breakpoints. On the first pass, we have threshold = pretolerance and second_pass = final_pass = false.
If this pass fails to find a feasible solution, threshold is set to tolerance, second_pass is set true, and an
attempt is made to hyphenate as many words as possible. If that fails too, we add emergency_stretch to the
background stretchability and set final_pass = true.
⟨ Global variables 13 ⟩ +≡
cur_p: pointer; { the current breakpoint under consideration }
second_pass: boolean; { is this our second attempt to break this paragraph? }
final_pass: boolean; { is this our final attempt to break this paragraph? }
threshold: integer; { maximum badness on feasible lines }
829. The heart of the line-breaking procedure is `try_break', a subroutine that tests if the current breakpoint
cur_p is feasible, by running through the active list to see what lines of text can be made from active nodes
to cur_p. If feasible breaks are possible, new break nodes are created. If cur_p is too far from an active
node, that node is deactivated.
The parameter pi to try_break is the penalty associated with a break at cur_p; we have pi = eject_penalty
if the break is forced, and pi = inf_penalty if the break is illegal.
The other parameter, break_type, is set to hyphenated or unhyphenated, depending on whether or not
the current break is at a disc_node. The end of a paragraph is also regarded as `hyphenated'; this case is
distinguishable by the condition cur_p = null.
  define copy_to_cur_active(#) ≡ cur_active_width[#] ← active_width[#]
  define deactivate = 60 { go here when node r should be deactivated }
⟨ Declare subprocedures for line_break 826 ⟩ +≡
procedure try_break(pi : integer; break_type : small_number);
  label exit, done, done1, continue, deactivate;
  var r: pointer; { runs through the active list }
    prev_r: pointer; { stays a step behind r }
    old_l: halfword; { maximum line number in current equivalence class of lines }
    no_break_yet: boolean; { have we found a feasible break at cur_p? }
    ⟨ Other local variables for try_break 830 ⟩
  begin ⟨ Make sure that pi is in the proper range 831 ⟩;
  no_break_yet ← true; prev_r ← active; old_l ← 0; do_all_six(copy_to_cur_active);
  loop begin continue: r ← link(prev_r); ⟨ If node r is of type delta_node, update cur_active_width, set
        prev_r and prev_prev_r, then goto continue 832 ⟩;
    ⟨ If a line number class has ended, create new active nodes for the best feasible breaks in that class;
        then return if r = last_active, otherwise compute the new line_width 835 ⟩;
    ⟨ Consider the demerits for a line from r to cur_p; deactivate node r if it should no longer be active;
        then goto continue if a line from r to cur_p is infeasible, otherwise record a new feasible
        break 851 ⟩;
    end;
exit: stat ⟨ Update the value of printed_node for symbolic displays 858 ⟩ tats
  end;
830. ⟨ Other local variables for try_break 830 ⟩ ≡
prev_prev_r: pointer; { a step behind prev_r, if type(prev_r) = delta_node }
s: pointer; { runs through nodes ahead of cur_p }
q: pointer; { points to a new node being created }
v: pointer; { points to a glue specification or a node ahead of cur_p }
t: integer; { node count, if cur_p is a discretionary node }
f: internal_font_number; { used in character width calculation }
l: halfword; { line number of current active node }
node_r_stays_active: boolean; { should node r remain in the active list? }
line_width: scaled; { the current line will be justified to this width }
fit_class: very_loose_fit .. tight_fit; { possible fitness class of test line }
b: halfword; { badness of test line }
d: integer; { demerits of test line }
artificial_demerits: boolean; { has d been forced to zero? }
save_link: pointer; { temporarily holds value of link(cur_p) }
shortfall: scaled; { used in badness calculations }
This code is used in section 829.
831. ⟨ Make sure that pi is in the proper range 831 ⟩ ≡
  if abs(pi) ≥ inf_penalty then
    if pi > 0 then return { this breakpoint is inhibited by infinite penalty }
    else pi ← eject_penalty { this breakpoint will be forced }
This code is used in section 829.
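The range check on pi amounts to a three-way decision, which can be sketched as below. This Python fragment is an illustration only: the function name is invented, returning `None` stands in for the early return from try_break, and the constants 10000 and -10000 are assumed to be the values given to inf_penalty and eject_penalty elsewhere in the program.

```python
INF_PENALTY = 10000      # assumed value of inf_penalty
EJECT_PENALTY = -10000   # assumed value of eject_penalty


def normalize_penalty(pi):
    """Return None if the break is inhibited, else the penalty to use."""
    if abs(pi) >= INF_PENALTY:
        if pi > 0:
            return None            # infinite penalty: no break allowed here
        return EJECT_PENALTY       # very negative penalty: break is forced
    return pi
```

After this step every penalty that reaches the main loop lies strictly below inf_penalty and at or above eject_penalty, so later arithmetic on pi cannot overflow.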
832. The following code uses the fact that type(last_active) ≠ delta_node.
  define update_width(#) ≡ cur_active_width[#] ← cur_active_width[#] + mem[r + #].sc
⟨ If node r is of type delta_node, update cur_active_width, set prev_r and prev_prev_r, then goto
    continue 832 ⟩ ≡
  if type(r) = delta_node then
    begin do_all_six(update_width); prev_prev_r ← prev_r; prev_r ← r; goto continue;
    end
This code is used in section 829.
833. As we consider various ways to end a line at cur_p, in a given line number class, we keep track of the
best total demerits known, in an array with one entry for each of the fitness classifications. For example,
minimal_demerits[tight_fit] contains the fewest total demerits of feasible line breaks ending at cur_p with
a tight_fit line; best_place[tight_fit] points to the passive node for the break before cur_p that achieves
such an optimum; and best_pl_line[tight_fit] is the line_number field in the active node corresponding to
best_place[tight_fit]. When no feasible break sequence is known, the minimal_demerits entries will be equal
to awful_bad, which is $2^{30} - 1$. Another variable, minimum_demerits, keeps track of the smallest value in
the minimal_demerits array.
  define awful_bad ≡ '7777777777 { more than a billion demerits }
⟨ Global variables 13 ⟩ +≡
minimal_demerits: array [very_loose_fit .. tight_fit] of integer;
    { best total demerits known for current line class and position, given the fitness }
minimum_demerits: integer; { best total demerits known for current line class and position }
best_place: array [very_loose_fit .. tight_fit] of pointer; { how to achieve minimal_demerits }
best_pl_line: array [very_loose_fit .. tight_fit] of halfword; { corresponding line number }
834. ⟨ Get ready to start line breaking 816 ⟩ +≡
  minimum_demerits ← awful_bad; minimal_demerits[tight_fit] ← awful_bad;
  minimal_demerits[decent_fit] ← awful_bad; minimal_demerits[loose_fit] ← awful_bad;
  minimal_demerits[very_loose_fit] ← awful_bad;
835. The first part of the following code is part of TeX's inner loop, so we don't want to waste any time.
The current active node, namely node r, contains the line number that will be considered next. At the end
of the list we have arranged the data structure so that r = last_active and line_number(last_active) > old_l.
⟨ If a line number class has ended, create new active nodes for the best feasible breaks in that class; then
    return if r = last_active, otherwise compute the new line_width 835 ⟩ ≡
  begin l ← line_number(r);
  if l > old_l then
    begin { now we are no longer in the inner loop }
    if (minimum_demerits < awful_bad) ∧ ((old_l ≠ easy_line) ∨ (r = last_active)) then
      ⟨ Create new active nodes for the best feasible breaks just found 836 ⟩;
    if r = last_active then return;
    ⟨ Compute the new line width 850 ⟩;
    end;
  end
This code is used in section 829.
836. It is not necessary to create new active nodes having minimal_demerits greater than minimum_demerits +
abs(adj_demerits), since such active nodes will never be chosen in the final paragraph breaks. This observation
allows us to omit a substantial number of feasible breakpoints from further consideration.
⟨ Create new active nodes for the best feasible breaks just found 836 ⟩ ≡
  begin if no_break_yet then ⟨ Compute the values of break_width 837 ⟩;
  ⟨ Insert a delta node to prepare for breaks at cur_p 843 ⟩;
  if abs(adj_demerits) ≥ awful_bad - minimum_demerits then minimum_demerits ← awful_bad - 1
  else minimum_demerits ← minimum_demerits + abs(adj_demerits);
  for fit_class ← very_loose_fit to tight_fit do
    begin if minimal_demerits[fit_class] ≤ minimum_demerits then
      ⟨ Insert a new active node from best_place[fit_class] to cur_p 845 ⟩;
    minimal_demerits[fit_class] ← awful_bad;
    end;
  minimum_demerits ← awful_bad; ⟨ Insert a delta node to prepare for the next active node 844 ⟩;
  end
This code is used in section 835.
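The pruning rule and its overflow guard can be sketched as follows. The Python below is an illustration only: the function names are invented, the list index plays the role of the fitness class, and the comparison against `AWFUL_BAD - minimum_demerits` mirrors the test above, which is arranged so that the sum is never formed when it would exceed awful_bad.

```python
AWFUL_BAD = 0o7777777777  # the octal constant from the text, i.e. 2**30 - 1


def demerit_cutoff(minimum_demerits, adj_demerits):
    """Largest total demerits for which a new active node is still created."""
    if abs(adj_demerits) >= AWFUL_BAD - minimum_demerits:
        return AWFUL_BAD - 1       # saturate instead of overflowing
    return minimum_demerits + abs(adj_demerits)


def classes_to_keep(minimal_demerits, adj_demerits):
    """Fitness classes (list indices) that receive a new active node."""
    cutoff = demerit_cutoff(min(minimal_demerits), adj_demerits)
    return [c for c, d in enumerate(minimal_demerits) if d <= cutoff]
```

For example, with minimal demerits of 500, 300 and 20000 in three classes and adj_demerits of 10000, only the first two classes survive; a class still holding awful_bad is never kept because the cutoff is at most awful_bad minus one.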
837. When we insert a new active node for a break at cur_p, suppose this new node is to be placed just
before active node a; then we essentially want to insert "$\delta$ cur_p $\delta'$" before a, where
$\delta = \alpha(a) - \alpha(cur\_p)$ and $\delta' = \alpha(cur\_p) - \alpha(a)$ in the notation explained above.
The cur_active_width array now holds $\gamma + \beta(cur\_p) - \alpha(a)$; so $\delta$ can be obtained by
subtracting cur_active_width from the quantity $\gamma + \beta(cur\_p) - \alpha(cur\_p)$. The latter quantity
can be regarded as the length of a line "from cur_p to cur_p"; we call it the break_width at cur_p.
The break_width is usually negative, since it consists of the background (which is normally zero) minus the
width of nodes following cur_p that are eliminated after a break. If, for example, node cur_p is a glue node,
the width of this glue is subtracted from the background; and we also look ahead to eliminate all subsequent
glue and penalty and kern and math nodes, subtracting their widths as well.
Kern nodes do not disappear at a line break unless they are explicit.
define set_break_width_to_background(#) ≡ break_width[#] ← background[#]
⟨ Compute the values of break_width 837 ⟩ ≡
begin no_break_yet ← false; do_all_six(set_break_width_to_background); s ← cur_p;
if break_type > unhyphenated then
  if cur_p ≠ null then ⟨ Compute the discretionary break_width values 840 ⟩;
while s ≠ null do
  begin if is_char_node(s) then goto done;
  case type(s) of
  glue_node: ⟨ Subtract glue from break_width 838 ⟩;
  penalty_node: do_nothing;
  math_node: break_width[1] ← break_width[1] − width(s);
  kern_node: if subtype(s) ≠ explicit then goto done
    else break_width[1] ← break_width[1] − width(s);
  othercases goto done
  endcases;
  s ← link(s);
  end;
done: end
This code is used in section 836.
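The lookahead of section 837 for a non-discretionary break can be sketched as follows. This is a simplified illustration, assuming nodes are plain Python dictionaries (a hypothetical representation); indices 1 to 6 of the list mirror the natural width, the four orders of stretch, and the shrink.

```python
def compute_break_width(background, nodes):
    """Subtract every node that would be discarded after a break.

    nodes is the list starting at the breakpoint itself. Each node is a
    dict with a "type"; glue carries width, stretch, stretch_order and
    shrink; math and kern carry width; kern also carries "explicit".
    """
    bw = list(background)  # start from the background widths
    for n in nodes:
        t = n["type"]
        if t == "glue":
            bw[1] -= n["width"]
            bw[2 + n["stretch_order"]] -= n["stretch"]
            bw[6] -= n["shrink"]
        elif t == "penalty":
            pass  # penalties have no width but are discarded too
        elif t == "math":
            bw[1] -= n["width"]
        elif t == "kern" and n.get("explicit", False):
            bw[1] -= n["width"]
        else:
            break  # characters, boxes, implicit kerns, ... survive the break
    return bw
```

The loop stops at the first node that survives a line break, so glue that appears later in the paragraph is untouched.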
838. ⟨ Subtract glue from break_width 838 ⟩ ≡
begin v ← glue_ptr(s); break_width[1] ← break_width[1] − width(v);
break_width[2 + stretch_order(v)] ← break_width[2 + stretch_order(v)] − stretch(v);
break_width[6] ← break_width[6] − shrink(v);
end
This code is used in section 837.
839. When cur_p is a discretionary break, the length of a line "from cur_p to cur_p" has to be defined
properly so that the other calculations work out. Suppose that the pre-break text at cur_p has length $l_0$,
the post-break text has length $l_1$, and the replacement text has length $l$. Suppose also that q is the node
following the replacement text. Then the length of a line from cur_p to q will be computed as
$\gamma + \beta(q) - \alpha(cur\_p)$, where $\beta(q) = \beta(cur\_p) - l_0 + l$. The actual length will be the
background plus $l_1$, so the length from cur_p to cur_p should be $\gamma + l_0 + l_1 - l$. If the post-break
text of the discretionary is empty, a break may also discard q; in that unusual case we subtract the length
of q and any other nodes that will be discarded after the discretionary break.
The value of $l_0$ need not be computed, since line_break will put it into the global variable disc_width before
calling try_break.
⟨ Global variables 13 ⟩ +≡
disc_width: scaled; { the length of discretionary material preceding a break }
840. ⟨ Compute the discretionary break_width values 840 ⟩ ≡
begin t ← replace_count(cur_p); v ← cur_p; s ← post_break(cur_p);
while t > 0 do
  begin decr(t); v ← link(v); ⟨ Subtract the width of node v from break_width 841 ⟩;
  end;
while s ≠ null do
  begin ⟨ Add the width of node s to break_width 842 ⟩;
  s ← link(s);
  end;
break_width[1] ← break_width[1] + disc_width;
if post_break(cur_p) = null then s ← link(v); { nodes may be discardable after the break }
end
This code is used in section 837.
841. Replacement texts and discretionary texts are supposed to contain only character nodes, kern nodes,
ligature nodes, and box or rule nodes.
⟨ Subtract the width of node v from break_width 841 ⟩ ≡
if is_char_node(v) then
  begin f ← font(v); break_width[1] ← break_width[1] − char_width(f)(char_info(f)(character(v)));
  end
else case type(v) of
  ligature_node: begin f ← font(lig_char(v));
    break_width[1] ← break_width[1] − char_width(f)(char_info(f)(character(lig_char(v))));
    end;
  hlist_node, vlist_node, rule_node, kern_node: break_width[1] ← break_width[1] − width(v);
  othercases confusion("disc1")
  endcases
This code is used in section 840.
842. ⟨ Add the width of node s to break_width 842 ⟩ ≡
if is_char_node(s) then
  begin f ← font(s); break_width[1] ← break_width[1] + char_width(f)(char_info(f)(character(s)));
  end
else case type(s) of
  ligature_node: begin f ← font(lig_char(s));
    break_width[1] ← break_width[1] + char_width(f)(char_info(f)(character(lig_char(s))));
    end;
  hlist_node, vlist_node, rule_node, kern_node: break_width[1] ← break_width[1] + width(s);
  othercases confusion("disc2")
  endcases
This code is used in section 840.
843. We use the fact that type(active) ≠ delta_node.
define convert_to_break_width(#) ≡ mem[prev_r + #].sc ←
    mem[prev_r + #].sc − cur_active_width[#] + break_width[#]
define store_break_width(#) ≡ active_width[#] ← break_width[#]
define new_delta_to_break_width(#) ≡ mem[q + #].sc ← break_width[#] − cur_active_width[#]
⟨ Insert a delta node to prepare for breaks at cur_p 843 ⟩ ≡
if type(prev_r) = delta_node then { modify an existing delta node }
  begin do_all_six(convert_to_break_width);
  end
else if prev_r = active then { no delta node needed at the beginning }
  begin do_all_six(store_break_width);
  end
else begin q ← get_node(delta_node_size); link(q) ← r; type(q) ← delta_node;
  subtype(q) ← 0; { the subtype is not used }
  do_all_six(new_delta_to_break_width); link(prev_r) ← q; prev_prev_r ← prev_r; prev_r ← q;
  end
This code is used in section 836.
844. When the following code is performed, we will have just inserted at least one active node before r,
so type(prev_r) ≠ delta_node.
define new_delta_from_break_width(#) ≡ mem[q + #].sc ← cur_active_width[#] − break_width[#]
⟨ Insert a delta node to prepare for the next active node 844 ⟩ ≡
if r ≠ last_active then
  begin q ← get_node(delta_node_size); link(q) ← r; type(q) ← delta_node;
  subtype(q) ← 0; { the subtype is not used }
  do_all_six(new_delta_from_break_width); link(prev_r) ← q; prev_prev_r ← prev_r; prev_r ← q;
  end
This code is used in section 836.
845. When we create an active node, we also create the corresponding passive node.
⟨ Insert a new active node from best_place[fit_class] to cur_p 845 ⟩ ≡
begin q ← get_node(passive_node_size); link(q) ← passive; passive ← q; cur_break(q) ← cur_p;
stat incr(pass_number); serial(q) ← pass_number; tats
prev_break(q) ← best_place[fit_class];
q ← get_node(active_node_size); break_node(q) ← passive; line_number(q) ← best_pl_line[fit_class] + 1;
fitness(q) ← fit_class; type(q) ← break_type; total_demerits(q) ← minimal_demerits[fit_class];
link(q) ← r; link(prev_r) ← q; prev_r ← q;
stat if tracing_paragraphs > 0 then ⟨ Print a symbolic description of the new break node 846 ⟩;
tats
end
This code is used in section 836.
846. ⟨ Print a symbolic description of the new break node 846 ⟩ ≡
begin print_nl("@@"); print_int(serial(passive)); print(": line "); print_int(line_number(q) − 1);
print_char("."); print_int(fit_class);
if break_type = hyphenated then print_char("-");
print(" t="); print_int(total_demerits(q)); print(" -> @@");
if prev_break(passive) = null then print_char("0")
else print_int(serial(prev_break(passive)));
end
This code is used in section 845.
847. The length of lines depends on whether the user has specified \parshape or \hangindent. If
par_shape_ptr is not null, it points to a (2n + 1)-word record in mem, where the info in the first word
contains the value of n, and the other 2n words contain the left margins and line lengths for the first n lines
of the paragraph; the specifications for line n apply to all subsequent lines. If par_shape_ptr = null, the
shape of the paragraph depends on the value of n = hang_after; if n ≥ 0, hanging indentation takes place
on lines n + 1, n + 2, . . . , otherwise it takes place on lines 1, . . . , |n|. When hanging indentation is active,
the left margin is hang_indent, if hang_indent ≥ 0, else it is 0; the line length is hsize − |hang_indent|. The
normal setting is par_shape_ptr = null, hang_after = 1, and hang_indent = 0. Note that if hang_indent = 0,
the value of hang_after is irrelevant.
⟨ Global variables 13 ⟩ +≡
easy_line: halfword; { line numbers > easy_line are equivalent in break nodes }
last_special_line: halfword; { line numbers > last_special_line all have the same width }
first_width: scaled; { the width of all lines ≤ last_special_line, if no \parshape has been specified }
second_width: scaled; { the width of all lines > last_special_line }
first_indent: scaled; { left margin to go with first_width }
second_indent: scaled; { left margin to go with second_width }
848. We compute the values of easy_line and the other local variables relating to line length when the
line_break procedure is initializing itself.
⟨ Get ready to start line breaking 816 ⟩ +≡
if par_shape_ptr = null then
  if hang_indent = 0 then
    begin last_special_line ← 0; second_width ← hsize; second_indent ← 0;
    end
  else ⟨ Set line length parameters in preparation for hanging indentation 849 ⟩
else begin last_special_line ← info(par_shape_ptr) − 1;
  second_width ← mem[par_shape_ptr + 2 * (last_special_line + 1)].sc;
  second_indent ← mem[par_shape_ptr + 2 * last_special_line + 1].sc;
  end;
if looseness = 0 then easy_line ← last_special_line
else easy_line ← max_halfword
849. ⟨ Set line length parameters in preparation for hanging indentation 849 ⟩ ≡
begin last_special_line ← abs(hang_after);
if hang_after < 0 then
  begin first_width ← hsize − abs(hang_indent);
  if hang_indent ≥ 0 then first_indent ← hang_indent
  else first_indent ← 0;
  second_width ← hsize; second_indent ← 0;
  end
else begin first_width ← hsize; first_indent ← 0; second_width ← hsize − abs(hang_indent);
  if hang_indent ≥ 0 then second_indent ← hang_indent
  else second_indent ← 0;
  end;
end
This code is used in section 848.
850. When we come to the following code, we have just encountered the first active node r whose
line_number field contains l. Thus we want to compute the length of the lth line of the current paragraph.
Furthermore, we want to set old_l to the last number in the class of line numbers equivalent to l.
⟨ Compute the new line width 850 ⟩ ≡
if l > easy_line then
  begin line_width ← second_width; old_l ← max_halfword − 1;
  end
else begin old_l ← l;
  if l > last_special_line then line_width ← second_width
  else if par_shape_ptr = null then line_width ← first_width
  else line_width ← mem[par_shape_ptr + 2 * l].sc;
  end
This code is used in section 835.
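Sections 847 to 850 together determine the width of line l. The following Python sketch condenses that logic into one function; the function name, the keyword arguments, and the representation of \parshape as a list of (indent, width) pairs are assumptions made for illustration only.

```python
def line_width(l, hsize, hang_indent=0, hang_after=1, parshape=None):
    """Width of line l (numbered from 1) of a paragraph.

    parshape, if given, is a list of (indent, width) pairs; the last pair
    applies to all later lines, as described in section 847.
    """
    if parshape:
        n = len(parshape)
        return parshape[min(l, n) - 1][1]
    if hang_indent == 0:
        return hsize  # hang_after is irrelevant in this case
    last_special_line = abs(hang_after)
    if hang_after < 0:
        # hanging indentation on lines 1 .. |hang_after|
        first_width, second_width = hsize - abs(hang_indent), hsize
    else:
        # hanging indentation on lines hang_after+1, hang_after+2, ...
        first_width, second_width = hsize, hsize - abs(hang_indent)
    return second_width if l > last_special_line else first_width
```

The sketch ignores easy_line, which only affects how line numbers are grouped into equivalence classes, not the width assigned to a given line.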
851. The remaining part of try_break deals with the calculation of demerits for a break from r to cur_p.
The first thing to do is calculate the badness, b. This value will always be between zero and inf_bad + 1;
the latter value occurs only in the case of lines from r to cur_p that cannot shrink enough to fit the necessary
width. In such cases, node r will be deactivated. We also deactivate node r when a break at cur_p is forced,
since future breaks must go through a forced break.
⟨ Consider the demerits for a line from r to cur_p; deactivate node r if it should no longer be active; then
goto continue if a line from r to cur_p is infeasible, otherwise record a new feasible break 851 ⟩ ≡
begin artificial_demerits ← false;
shortfall ← line_width − cur_active_width[1]; { we're this much too short }
if shortfall > 0 then
  ⟨ Set the value of b to the badness for stretching the line, and compute the corresponding fit_class 852 ⟩
else ⟨ Set the value of b to the badness for shrinking the line, and compute the corresponding fit_class 853 ⟩;
if (b > inf_bad) ∨ (pi = eject_penalty) then ⟨ Prepare to deactivate node r, and goto deactivate unless
    there is a reason to consider lines of text from r to cur_p 854 ⟩
else begin prev_r ← r;
  if b > threshold then goto continue;
  node_r_stays_active ← true;
  end;
⟨ Record a new feasible break 855 ⟩;
if node_r_stays_active then goto continue; { prev_r has been set to r }
deactivate: ⟨ Deactivate node r 860 ⟩;
end
This code is used in section 829.
852. When a line must stretch, the available stretchability can be found in the subarray
cur_active_width[2 .. 5], in units of points, fil, fill, and filll.
The present section is part of TeX's inner loop, and it is most often performed when the badness is infinite;
therefore it is worth while to make a quick test for large width excess and small stretchability, before calling
the badness subroutine.
⟨ Set the value of b to the badness for stretching the line, and compute the corresponding fit_class 852 ⟩ ≡
if (cur_active_width[3] ≠ 0) ∨ (cur_active_width[4] ≠ 0) ∨ (cur_active_width[5] ≠ 0) then
  begin b ← 0; fit_class ← decent_fit; { infinite stretch }
  end
else begin if shortfall > 7230584 then
    if cur_active_width[2] < 1663497 then
      begin b ← inf_bad; fit_class ← very_loose_fit; goto done1;
      end;
  b ← badness(shortfall, cur_active_width[2]);
  if b > 12 then
    if b > 99 then fit_class ← very_loose_fit
    else fit_class ← loose_fit
  else fit_class ← decent_fit;
  done1: end
This code is used in section 851.
853. Shrinkability is never infinite in a paragraph; we can shrink the line from r to cur_p by at most
cur_active_width[6].
⟨ Set the value of b to the badness for shrinking the line, and compute the corresponding fit_class 853 ⟩ ≡
begin if −shortfall > cur_active_width[6] then b ← inf_bad + 1
else b ← badness(−shortfall, cur_active_width[6]);
if b > 12 then fit_class ← tight_fit else fit_class ← decent_fit;
end
This code is used in section 851.
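The classification performed in sections 852 and 853 reduces to a few threshold tests once the badness b is known. The Python sketch below takes b as given rather than reimplementing the badness subroutine; the function name is hypothetical and the numeric class codes 0 to 3 are assumed to follow the order very loose, loose, decent, tight.

```python
# Assumed numeric codes, in the order used by the loop over fit classes.
VERY_LOOSE_FIT, LOOSE_FIT, DECENT_FIT, TIGHT_FIT = range(4)


def fit_class(shortfall, b):
    """Classify a line with badness b; shortfall > 0 means it must stretch."""
    if shortfall > 0:
        # stretching: the looser the line, the lower the class
        if b > 99:
            return VERY_LOOSE_FIT
        if b > 12:
            return LOOSE_FIT
        return DECENT_FIT
    # shrinking (or exact fit): only two outcomes are possible
    return TIGHT_FIT if b > 12 else DECENT_FIT
```

The sketch omits the shortcut for infinite stretchability, where the program sets b to 0 and the class to decent_fit without calling badness at all.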
854. During the final pass, we dare not lose all active nodes, lest we lose touch with the line breaks already
found. The code shown here makes sure that such a catastrophe does not happen, by permitting overfull
boxes as a last resort. This particular part of TeX was a source of several subtle bugs before the correct
program logic was finally discovered; readers who seek to "improve" TeX should therefore think thrice before
daring to make any changes here.
⟨ Prepare to deactivate node r, and goto deactivate unless there is a reason to consider lines of text from r
to cur_p 854 ⟩ ≡
begin if final_pass ∧ (minimum_demerits = awful_bad) ∧ (link(r) = last_active) ∧ (prev_r = active) then
  artificial_demerits ← true { set demerits zero, this break is forced }
else if b > threshold then goto deactivate;
node_r_stays_active ← false;
end
This code is used in section 851.
855. When we get to this part of the code, the line from r to cur_p is feasible, its badness is b, and
its fitness classification is fit_class. We don't want to make an active node for this break yet, but we will
compute the total demerits and record them in the minimal_demerits array, if such a break is the current
champion among all ways to get to cur_p in a given line-number class and fitness class.
⟨ Record a new feasible break 855 ⟩ ≡
if artificial_demerits then d ← 0
else ⟨ Compute the demerits, d, from r to cur_p 859 ⟩;
stat if tracing_paragraphs > 0 then ⟨ Print a symbolic description of this feasible break 856 ⟩;
tats
d ← d + total_demerits(r); { this is the minimum total demerits from the beginning to cur_p via r }
if d ≤ minimal_demerits[fit_class] then
  begin minimal_demerits[fit_class] ← d; best_place[fit_class] ← break_node(r); best_pl_line[fit_class] ← l;
  if d < minimum_demerits then minimum_demerits ← d;
  end
This code is used in section 851.
856. ⟨ Print a symbolic description of this feasible break 856 ⟩ ≡
begin if printed_node ≠ cur_p then
  ⟨ Print the list between printed_node and cur_p, then set printed_node ← cur_p 857 ⟩;
print_nl("@");
if cur_p = null then print_esc("par")
else if type(cur_p) ≠ glue_node then
  begin if type(cur_p) = penalty_node then print_esc("penalty")
  else if type(cur_p) = disc_node then print_esc("discretionary")
  else if type(cur_p) = kern_node then print_esc("kern")
  else print_esc("math");
  end;
print(" via @@");
if break_node(r) = null then print_char("0")
else print_int(serial(break_node(r)));
print(" b=");
if b > inf_bad then print_char("*") else print_int(b);
print(" p="); print_int(pi); print(" d=");
if artificial_demerits then print_char("*") else print_int(d);
end
This code is used in section 855.
857. ⟨ Print the list between printed_node and cur_p, then set printed_node ← cur_p 857 ⟩ ≡
begin print_nl("");
if cur_p = null then short_display(link(printed_node))
else begin save_link ← link(cur_p); link(cur_p) ← null; print_nl("");
  short_display(link(printed_node)); link(cur_p) ← save_link;
  end;
printed_node ← cur_p;
end
This code is used in section 856.
858. When the data for a discretionary break is being displayed, we will have printed the pre_break and
post_break lists; we want to skip over the third list, so that the discretionary data will not appear twice. The
following code is performed at the very end of try_break.
⟨ Update the value of printed_node for symbolic displays 858 ⟩ ≡
if cur_p = printed_node then
  if cur_p ≠ null then
    if type(cur_p) = disc_node then
      begin t ← replace_count(cur_p);
      while t > 0 do
        begin decr(t); printed_node ← link(printed_node);
        end;
      end
This code is used in section 829.
859. ⟨ Compute the demerits, d, from r to cur_p 859 ⟩ ≡
begin d ← line_penalty + b;
if abs(d) ≥ 10000 then d ← 100000000 else d ← d * d;
if pi ≠ 0 then
  if pi > 0 then d ← d + pi * pi
  else if pi > eject_penalty then d ← d − pi * pi;
if (break_type = hyphenated) ∧ (type(r) = hyphenated) then
  if cur_p ≠ null then d ← d + double_hyphen_demerits
  else d ← d + final_hyphen_demerits;
if abs(fit_class − fitness(r)) > 1 then d ← d + adj_demerits;
end
This code is used in section 855.
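The demerits formula of section 859 is easy to experiment with in isolation. Here is a Python sketch of it; the function signature is hypothetical, the default parameter values are the usual plain TeX settings (an assumption, since this section does not state them), and EJECT_PENALTY = −10000 is likewise assumed from definitions elsewhere.

```python
EJECT_PENALTY = -10000  # assumed value of eject_penalty


def demerits(b, pi, line_penalty=10,
             break_hyphenated=False, prev_hyphenated=False,
             at_paragraph_end=False,
             fit_class=2, prev_fit_class=2,
             double_hyphen_demerits=10000,
             final_hyphen_demerits=5000,
             adj_demerits=10000):
    """Demerits for one line with badness b ending at a break with penalty pi."""
    d = line_penalty + b
    d = 100000000 if abs(d) >= 10000 else d * d
    if pi > 0:
        d += pi * pi
    elif EJECT_PENALTY < pi < 0:
        d -= pi * pi  # a forced break (pi = EJECT_PENALTY) earns no bonus
    if break_hyphenated and prev_hyphenated:
        # cur_p = null corresponds to the end of the paragraph
        d += final_hyphen_demerits if at_paragraph_end else double_hyphen_demerits
    if abs(fit_class - prev_fit_class) > 1:
        d += adj_demerits  # visually incompatible adjacent lines
    return d
```

For example, a perfect line (b = 0) ending at a break with penalty 50 costs 10² + 50² = 2600 demerits under these defaults.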
860. When an active node disappears, we must delete an adjacent delta node if the active node was at the
beginning or the end of the active list, or if it was surrounded by delta nodes. We also must preserve the
property that cur_active_width represents the length of material from link(prev_r) to cur_p.
define combine_two_deltas(#) ≡ mem[prev_r + #].sc ← mem[prev_r + #].sc + mem[r + #].sc
define downdate_width(#) ≡ cur_active_width[#] ← cur_active_width[#] − mem[prev_r + #].sc
⟨ Deactivate node r 860 ⟩ ≡
link(prev_r) ← link(r); free_node(r, active_node_size);
if prev_r = active then ⟨ Update the active widths, since the first active node has been deleted 861 ⟩
else if type(prev_r) = delta_node then
  begin r ← link(prev_r);
  if r = last_active then
    begin do_all_six(downdate_width); link(prev_prev_r) ← last_active;
    free_node(prev_r, delta_node_size); prev_r ← prev_prev_r;
    end
  else if type(r) = delta_node then
    begin do_all_six(update_width); do_all_six(combine_two_deltas); link(prev_r) ← link(r);
    free_node(r, delta_node_size);
    end;
  end
This code is used in section 851.
861. The following code uses the fact that type(last_active) ≠ delta_node. If the active list has just become
empty, we do not need to update the active_width array, since it will be initialized when an active node is
next inserted.
define update_active(#) ≡ active_width[#] ← active_width[#] + mem[r + #].sc
⟨ Update the active widths, since the first active node has been deleted 861 ⟩ ≡
begin r ← link(active);
if type(r) = delta_node then
  begin do_all_six(update_active); do_all_six(copy_to_cur_active); link(active) ← link(r);
  free_node(r, delta_node_size);
  end;
end
This code is used in section 860.
862. Breaking paragraphs into lines, continued. So far we have gotten a little way into the
line_break routine, having covered its important try_break subroutine. Now let's consider the rest of the
process.
The main loop of line_break traverses the given hlist, starting at link(temp_head), and calls try_break at
each legal breakpoint. A variable called auto_breaking is set to true except within math formulas, since glue
nodes are not legal breakpoints when they appear in formulas.
The current node of interest in the hlist is pointed to by cur_p. Another variable, prev_p, is usually one
step behind cur_p, but the real meaning of prev_p is this: If type(cur_p) = glue_node then cur_p is a legal
breakpoint if and only if auto_breaking is true and prev_p does not point to a glue node, penalty node,
explicit kern node, or math node.
The following declarations provide for a few other local variables that are used in special calculations.
⟨ Local variables for line breaking 862 ⟩ ≡
auto_breaking: boolean; { is node cur_p outside a formula? }
prev_p: pointer; { helps to determine when glue nodes are breakpoints }
q, r, s, prev_s: pointer; { miscellaneous nodes of temporary interest }
f: internal_font_number; { used when calculating character widths }
See also section 893.
This code is used in section 815.
863. The "loop" in the following code is performed at most thrice per call of line_break, since it is actually
a pass over the entire paragraph.
⟨ Find optimal breakpoints 863 ⟩ ≡
threshold ← pretolerance;
if threshold ≥ 0 then
  begin stat if tracing_paragraphs > 0 then
    begin begin_diagnostic; print_nl("@firstpass"); end; tats
  second_pass ← false; final_pass ← false;
  end
else begin threshold ← tolerance; second_pass ← true; final_pass ← (emergency_stretch ≤ 0);
  stat if tracing_paragraphs > 0 then begin_diagnostic;
  tats
  end;
loop begin if threshold > inf_bad then threshold ← inf_bad;
  if second_pass then ⟨ Initialize for hyphenating a paragraph 891 ⟩;
  ⟨ Create an active breakpoint representing the beginning of the paragraph 864 ⟩;
  cur_p ← link(temp_head); auto_breaking ← true;
  prev_p ← cur_p; { glue at beginning is not a legal breakpoint }
  while (cur_p ≠ null) ∧ (link(active) ≠ last_active) do ⟨ Call try_break if cur_p is a legal breakpoint;
      on the second pass, also try to hyphenate the next word, if cur_p is a glue node; then advance
      cur_p to the next node of the paragraph that could possibly be a legal breakpoint 866 ⟩;
  if cur_p = null then ⟨ Try the final line break at the end of the paragraph, and goto done if the
      desired breakpoints have been found 873 ⟩;
  ⟨ Clean up the memory by removing the break nodes 865 ⟩;
  if ¬second_pass then
    begin stat if tracing_paragraphs > 0 then print_nl("@secondpass"); tats
    threshold ← tolerance; second_pass ← true; final_pass ← (emergency_stretch ≤ 0);
    end { if at first you don't succeed, . . . }
  else begin stat if tracing_paragraphs > 0 then print_nl("@emergencypass"); tats
    background[2] ← background[2] + emergency_stretch; final_pass ← true;
    end;
  end;
done: stat if tracing_paragraphs > 0 then
  begin end_diagnostic(true); normalize_selector;
  end;
tats
This code is used in section 815.
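The control flow of section 863 (a first pass without hyphenation, a second pass with it, and an emergency pass) can be sketched in Python as below. The callback try_pass is a hypothetical stand-in for one complete traversal of the paragraph; it is assumed to return a result when acceptable breakpoints were found and None otherwise. INF_BAD = 10000 is assumed from the definition of inf_bad elsewhere.

```python
INF_BAD = 10000  # assumed value of inf_bad


def find_breaks(try_pass, pretolerance, tolerance, emergency_stretch):
    """Run up to three passes, relaxing the rules after each failure.

    try_pass(threshold, hyphenate, final, extra_stretch) is a hypothetical
    callback standing in for one pass over the paragraph.
    """
    if pretolerance >= 0:
        threshold, second_pass, final_pass = pretolerance, False, False
    else:
        # a negative pretolerance skips the first pass entirely
        threshold, second_pass = tolerance, True
        final_pass = emergency_stretch <= 0
    extra_stretch = 0
    while True:
        threshold = min(threshold, INF_BAD)
        result = try_pass(threshold, second_pass, final_pass, extra_stretch)
        if result is not None:
            return result
        # In the program the final pass always succeeds; guard the sketch.
        assert not final_pass, "final pass must produce breakpoints"
        if not second_pass:
            threshold, second_pass = tolerance, True
            final_pass = emergency_stretch <= 0
        else:
            extra_stretch += emergency_stretch  # added to background stretch
            final_pass = True
```

The sketch makes the "at most thrice" remark concrete: each failed pass either enables hyphenation or adds the emergency stretch, and the pass marked final is never allowed to fail.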
864. The active node that represents the starting point does not need a corresponding passive node.
define store_background(#) ≡ active_width[#] ← background[#]
⟨ Create an active breakpoint representing the beginning of the paragraph 864 ⟩ ≡
q ← get_node(active_node_size); type(q) ← unhyphenated; fitness(q) ← decent_fit; link(q) ← last_active;
break_node(q) ← null; line_number(q) ← prev_graf + 1; total_demerits(q) ← 0; link(active) ← q;
do_all_six(store_background);
passive ← null; printed_node ← temp_head; pass_number ← 0; font_in_short_display ← null_font
This code is used in section 863.
865. ⟨ Clean up the memory by removing the break nodes 865 ⟩ ≡
q ← link(active);
while q ≠ last_active do
  begin cur_p ← link(q);
  if type(q) = delta_node then free_node(q, delta_node_size)
  else free_node(q, active_node_size);
  q ← cur_p;
  end;
q ← passive;
while q ≠ null do
  begin cur_p ← link(q); free_node(q, passive_node_size); q ← cur_p;
  end
This code is used in sections 815 and 863.
866. Here is the main switch in the line_break routine, where legal breaks are determined. As we move
through the hlist, we need to keep the active_width array up to date, so that the badness of individual lines
is readily calculated by try_break. It is convenient to use the short name act_width for the component of
active_width that represents real width as opposed to glue.
define act_width ≡ active_width[1] { length from first active node to current node }
define kern_break ≡
  begin if ¬is_char_node(link(cur_p)) ∧ auto_breaking then
    if type(link(cur_p)) = glue_node then try_break(0, unhyphenated);
  act_width ← act_width + width(cur_p);
  end
⟨ Call try_break if cur_p is a legal breakpoint; on the second pass, also try to hyphenate the next word, if
cur_p is a glue node; then advance cur_p to the next node of the paragraph that could possibly be a
legal breakpoint 866 ⟩ ≡
begin if is_char_node(cur_p) then
  ⟨ Advance cur_p to the node following the present string of characters 867 ⟩;
case type(cur_p) of
hlist_node, vlist_node, rule_node: act_width ← act_width + width(cur_p);
whatsit_node: ⟨ Advance past a whatsit node in the line_break loop 1362 ⟩;
glue_node: begin ⟨ If node cur_p is a legal breakpoint, call try_break; then update the active widths by
      including the glue in glue_ptr(cur_p) 868 ⟩;
  if second_pass ∧ auto_breaking then ⟨ Try to hyphenate the following word 894 ⟩;
  end;
kern_node: if subtype(cur_p) = explicit then kern_break
  else act_width ← act_width + width(cur_p);
ligature_node: begin f ← font(lig_char(cur_p));
  act_width ← act_width + char_width(f)(char_info(f)(character(lig_char(cur_p))));
  end;
disc_node: ⟨ Try to break after a discretionary fragment, then goto done5 869 ⟩;
math_node: begin auto_breaking ← (subtype(cur_p) = after); kern_break;
  end;
penalty_node: try_break(penalty(cur_p), unhyphenated);
mark_node, ins_node, adjust_node: do_nothing;
othercases confusion("paragraph")
endcases;
prev_p ← cur_p; cur_p ← link(cur_p);
done5: end
This code is used in section 863.
867. The code that passes over the characters of words in a paragraph is part of TeX's inner loop, so it has
been streamlined for speed. We use the fact that \parfillskip glue appears at the end of each paragraph;
it is therefore unnecessary to check if link(cur_p) = null when cur_p is a character node.
⟨ Advance cur_p to the node following the present string of characters 867 ⟩ ≡
begin prev_p ← cur_p;
repeat f ← font(cur_p); act_width ← act_width + char_width(f)(char_info(f)(character(cur_p)));
  cur_p ← link(cur_p);
until ¬is_char_node(cur_p);
end
This code is used in section 866.
868. When node cur_p is a glue node, we look at prev_p to see whether or not a breakpoint is legal at
cur_p, as explained above.
⟨ If node cur_p is a legal breakpoint, call try_break; then update the active widths by including the glue in
glue_ptr(cur_p) 868 ⟩ ≡
if auto_breaking then
  begin if is_char_node(prev_p) then try_break(0, unhyphenated)
  else if precedes_break(prev_p) then try_break(0, unhyphenated)
  else if (type(prev_p) = kern_node) ∧ (subtype(prev_p) ≠ explicit) then try_break(0, unhyphenated);
  end;
check_shrinkage(glue_ptr(cur_p)); q ← glue_ptr(cur_p); act_width ← act_width + width(q);
active_width[2 + stretch_order(q)] ← active_width[2 + stretch_order(q)] + stretch(q);
active_width[6] ← active_width[6] + shrink(q)
This code is used in section 866.
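The rule for glue breakpoints stated in section 862 and applied in section 868 amounts to a small predicate on the preceding node. The Python sketch below expresses it directly; the dictionary node representation and the function name are hypothetical.

```python
def glue_is_legal_breakpoint(prev, auto_breaking):
    """Decide whether a glue node may be a breakpoint.

    prev describes the node before the glue: a dict with a "type" and,
    for kern nodes, an "explicit" flag.
    """
    if not auto_breaking:
        return False  # inside a math formula, glue is never a breakpoint
    t = prev["type"]
    if t in ("glue", "penalty", "math"):
        return False  # discardable predecessors forbid the break
    if t == "kern":
        return not prev.get("explicit", False)  # only implicit kerns allow it
    return True  # characters, boxes, rules, ligatures, and so on
```

This explains the initialization prev_p ← cur_p at the start of each pass: glue at the very beginning of the paragraph sees itself as predecessor and is therefore rejected.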
869. The following code knows that discretionary texts contain only character nodes, kern nodes, box
nodes, rule nodes, and ligature nodes.
⟨ Try to break after a discretionary fragment, then goto done5 869 ⟩ ≡
begin s ← pre_break(cur_p); disc_width ← 0;
if s = null then try_break(ex_hyphen_penalty, hyphenated)
else begin repeat ⟨ Add the width of node s to disc_width 870 ⟩;
    s ← link(s);
  until s = null;
  act_width ← act_width + disc_width; try_break(hyphen_penalty, hyphenated);
  act_width ← act_width − disc_width;
  end;
r ← replace_count(cur_p); s ← link(cur_p);
while r > 0 do
  begin ⟨ Add the width of node s to act_width 871 ⟩;
  decr(r); s ← link(s);
  end;
prev_p ← cur_p; cur_p ← s; goto done5;
end
This code is used in section 866.
870. ⟨ Add the width of node s to disc_width 870 ⟩ ≡
if is_char_node(s) then
  begin f ← font(s); disc_width ← disc_width + char_width(f)(char_info(f)(character(s)));
  end
else case type(s) of
  ligature_node: begin f ← font(lig_char(s));
    disc_width ← disc_width + char_width(f)(char_info(f)(character(lig_char(s))));
    end;
  hlist_node, vlist_node, rule_node, kern_node: disc_width ← disc_width + width(s);
  othercases confusion("disc3")
  endcases
This code is used in section 869.
871. ⟨ Add the width of node s to act_width 871 ⟩ ≡
if is_char_node(s) then
  begin f ← font(s); act_width ← act_width + char_width(f)(char_info(f)(character(s)));
  end
else case type(s) of
  ligature_node: begin f ← font(lig_char(s));
    act_width ← act_width + char_width(f)(char_info(f)(character(lig_char(s))));
    end;
  hlist_node, vlist_node, rule_node, kern_node: act_width ← act_width + width(s);
  othercases confusion("disc4")
  endcases
This code is used in section 869.
872. The forced line break at the paragraph's end will reduce the list of breakpoints so that all active
nodes represent breaks at cur_p = null. On the first pass, we insist on finding an active node that has the
correct "looseness." On the final pass, there will be at least one active node, and we will match the desired
looseness as well as we can.
The global variable best_bet will be set to the active node for the best way to break the paragraph, and a
few other variables are used to help determine what is best.
⟨ Global variables 13 ⟩ +≡
best_bet: pointer; { use this passive node and its predecessors }
fewest_demerits: integer; { the demerits associated with best_bet }
best_line: halfword; { line number following the last line of the new paragraph }
actual_looseness: integer; { the difference between line_number(best_bet) and the optimum best_line }
line_diff: integer; { the difference between the current line number and the optimum best_line }
873. ⟨ Try the final line break at the end of the paragraph, and goto done if the desired breakpoints have
been found 873 ⟩ ≡
begin try_break(eject_penalty, hyphenated);
if link(active) ≠ last_active then
  begin ⟨ Find an active node with fewest demerits 874 ⟩;
  if looseness = 0 then goto done;
  ⟨ Find the best active node for the desired looseness 875 ⟩;
  if (actual_looseness = looseness) ∨ final_pass then goto done;
  end;
end
This code is used in section 863.
874. ⟨ Find an active node with fewest demerits 874 ⟩ ≡
r ← link(active); fewest_demerits ← awful_bad;
repeat if type(r) ≠ delta_node then
    if total_demerits(r) < fewest_demerits then
      begin fewest_demerits ← total_demerits(r); best_bet ← r;
      end;
  r ← link(r);
until r = last_active;
best_line ← line_number(best_bet)
This code is used in section 873.
875. The adjustment for a desired looseness is a slightly more complicated version of the loop just
considered. Note that if a paragraph is broken into segments by displayed equations, each segment will
be subject to the looseness calculation, independently of the other segments.
⟨ Find the best active node for the desired looseness 875 ⟩ ≡
begin r ← link(active); actual_looseness ← 0;
repeat if type(r) ≠ delta_node then
    begin line_diff ← line_number(r) − best_line;
    if ((line_diff < actual_looseness) ∧ (looseness ≤ line_diff)) ∨
        ((line_diff > actual_looseness) ∧ (looseness ≥ line_diff)) then
      begin best_bet ← r; actual_looseness ← line_diff; fewest_demerits ← total_demerits(r);
      end
    else if (line_diff = actual_looseness) ∧ (total_demerits(r) < fewest_demerits) then
      begin best_bet ← r; fewest_demerits ← total_demerits(r);
      end;
    end;
  r ← link(r);
until r = last_active;
best_line ← line_number(best_bet);
end
This code is used in section 873.
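Sections 874 and 875 can be read as a two-stage selection over the surviving active nodes. The Python sketch below reproduces that selection on a plain list of (line_number, total_demerits) pairs; the function name and the pair representation are assumptions made for illustration.

```python
def choose_final_node(candidates, looseness):
    """Pick the final active node from (line_number, total_demerits) pairs."""
    # Stage 1: fewest demerits; min() keeps the first of equal candidates,
    # matching the strict "<" comparison in the program.
    best = min(candidates, key=lambda c: c[1])
    if looseness == 0:
        return best
    # Stage 2: move toward the requested looseness, never past it.
    best_line = best[0]
    actual_looseness = 0
    fewest = best[1]
    for cand in candidates:
        line_diff = cand[0] - best_line
        if ((line_diff < actual_looseness and looseness <= line_diff) or
                (line_diff > actual_looseness and looseness >= line_diff)):
            best, actual_looseness, fewest = cand, line_diff, cand[1]
        elif line_diff == actual_looseness and cand[1] < fewest:
            best, fewest = cand, cand[1]
    return best
```

If no candidate lies in the requested direction (for instance looseness = −1 when no shorter paragraph is feasible), the optimum from the first stage is returned unchanged, which is why the caller compares actual_looseness with looseness afterwards.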
876. Once the best sequence of breakpoints has been found (hurray), we call on the procedure post_line_break
to finish the remainder of the work. (By introducing this subprocedure, we are able to keep line_break from
getting extremely long.)
⟨ Break the paragraph at the chosen breakpoints, justify the resulting lines to the correct widths, and
append them to the current vertical list 876 ⟩ ≡
post_line_break(final_widow_penalty)
This code is used in section 815.
877. The total number of lines that will be set by post_line_break is best_line − prev_graf − 1. The last
breakpoint is specified by break_node(best_bet), and this passive node points to the other breakpoints via
the prev_break links. The finishing-up phase starts by linking the relevant passive nodes in forward order,
changing prev_break to next_break. (The next_break fields actually reside in the same memory space as the
prev_break fields did, but we give them a new name because of their new significance.) Then the lines are
justified, one by one.
define next_break ≡ prev_break { new name for prev_break after links are reversed }
⟨ Declare subprocedures for line_break 826 ⟩ +≡
procedure post_line_break(final_widow_penalty : integer);
label done, done1;
var q, r, s: pointer; { temporary registers for list manipulation }
  disc_break: boolean; { was the current break at a discretionary node? }
  post_disc_break: boolean; { and did it have a nonempty post-break part? }
  cur_width: scaled; { width of line number cur_line }
  cur_indent: scaled; { left margin of line number cur_line }
  t: quarterword; { used for replacement counts in discretionary nodes }
  pen: integer; { use when calculating penalties between lines }
  cur_line: halfword; { the current line number being justified }
begin ⟨ Reverse the links of the relevant passive nodes, setting cur_p to the first breakpoint 878 ⟩;
cur_line ← prev_graf + 1;
repeat ⟨ Justify the line ending at breakpoint cur_p, and append it to the current vertical list, together
      with associated penalties and other insertions 880 ⟩;
  incr(cur_line); cur_p ← next_break(cur_p);
  if cur_p ≠ null then
    if ¬post_disc_break then ⟨ Prune unwanted nodes at the beginning of the next line 879 ⟩;
until cur_p = null;
if (cur_line ≠ best_line) ∨ (link(temp_head) ≠ null) then confusion("line breaking");
prev_graf ← best_line − 1;
end;
878. The job of reversing links in a list is conveniently regarded as the job of taking items off one stack
and putting them on another. In this case we take them off a stack pointed to by q and having prev_break
fields; we put them on a stack pointed to by cur_p and having next_break fields. Node r is the passive node
being moved from stack to stack.
⟨Reverse the links of the relevant passive nodes, setting cur_p to the first breakpoint 878⟩ ≡
  q ← break_node(best_bet); cur_p ← null;
  repeat r ← q; q ← prev_break(q); next_break(r) ← cur_p; cur_p ← r;
  until q = null
This code is used in section 877.
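The two-stack view of link reversal can be tried out with a minimal Python sketch. The Passive class and the function name reverse_breaks are hypothetical stand-ins for passive nodes in mem; a single link attribute plays the role of prev_break before the reversal and next_break afterwards, mirroring the shared field.

```python
class Passive:
    """Hypothetical stand-in for a passive node with one shared link field."""
    def __init__(self, name, prev_break=None):
        self.name = name
        self.link = prev_break  # prev_break before reversal, next_break after


def reverse_breaks(q):
    """Pop nodes off the stack headed by q and push them onto cur_p."""
    cur_p = None
    while True:              # repeat ... until q = null
        r = q                # r is the node being moved
        q = q.link           # pop it off the prev_break stack
        r.link = cur_p       # push it onto the next_break stack
        cur_p = r
        if q is None:
            break
    return cur_p             # now the first breakpoint


# Usage: the last breakpoint c points back to b, which points back to a.
a = Passive("a")
b = Passive("b", a)
c = Passive("c", b)
first = reverse_breaks(c)
```

Like the repeat loop it imitates, the sketch assumes that the starting node is not null.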
879. Glue and penalty and kern and math nodes are deleted at the beginning of a line, except in the
anomalous case that the node to be deleted is actually one of the chosen breakpoints. Otherwise the pruning
done here is designed to match the lookahead computation in try_break, where the break_width values are
computed for non-discretionary breakpoints.
⟨Prune unwanted nodes at the beginning of the next line 879⟩ ≡
  begin r ← temp_head;
  loop begin q ← link(r);
    if q = cur_break(cur_p) then goto done1; {cur_break(cur_p) is the next breakpoint}
      {now q cannot be null}
    if is_char_node(q) then goto done1;
    if non_discardable(q) then goto done1;
    if type(q) = kern_node then
      if subtype(q) ≠ explicit then goto done1;
    r ← q; {now type(q) = glue_node, kern_node, math_node, or penalty_node}
    end;
done1: if r ≠ temp_head then
    begin link(r) ← null; flush_node_list(link(temp_head)); link(temp_head) ← q;
    end;
  end
This code is used in section 877.
880. The current line to be justified appears in a horizontal list starting at link(temp_head) and ending at
cur_break(cur_p). If cur_break(cur_p) is a glue node, we reset the glue to equal the right_skip glue; otherwise
we append the right_skip glue at the right. If cur_break(cur_p) is a discretionary node, we modify the list so
that the discretionary break is compulsory, and we set disc_break to true. We also append the left_skip glue
at the left of the line, unless it is zero.
⟨Justify the line ending at breakpoint cur_p, and append it to the current vertical list, together with
      associated penalties and other insertions 880⟩ ≡
  ⟨Modify the end of the line to reflect the nature of the break and to include \rightskip; also set the
      proper value of disc_break 881⟩;
  ⟨Put the \leftskip glue at the left and detach this line 887⟩;
  ⟨Call the packaging subroutine, setting just_box to the justified box 889⟩;
  ⟨Append the new box to the current vertical list, followed by the list of special nodes taken out of the
      box by the packager 888⟩;
  ⟨Append a penalty node, if a nonzero penalty is appropriate 890⟩
This code is used in section 877.
881. At the end of the following code, q will point to the final node on the list about to be justified.
⟨Modify the end of the line to reflect the nature of the break and to include \rightskip; also set the
      proper value of disc_break 881⟩ ≡
  q ← cur_break(cur_p); disc_break ← false; post_disc_break ← false;
  if q ≠ null then {q cannot be a char_node}
    if type(q) = glue_node then
      begin delete_glue_ref(glue_ptr(q)); glue_ptr(q) ← right_skip; subtype(q) ← right_skip_code + 1;
      add_glue_ref(right_skip); goto done;
      end
    else begin if type(q) = disc_node then
        ⟨Change discretionary to compulsory and set disc_break ← true 882⟩
      else if (type(q) = math_node) ∨ (type(q) = kern_node) then width(q) ← 0;
      end
  else begin q ← temp_head;
    while link(q) ≠ null do q ← link(q);
    end;
  ⟨Put the \rightskip glue after node q 886⟩;
done:
This code is used in section 880.
882. ⟨Change discretionary to compulsory and set disc_break ← true 882⟩ ≡
  begin t ← replace_count(q);
  ⟨Destroy the t nodes following q, and make r point to the following node 883⟩;
  if post_break(q) ≠ null then ⟨Transplant the post-break list 884⟩;
  if pre_break(q) ≠ null then ⟨Transplant the pre-break list 885⟩;
  link(q) ← r; disc_break ← true;
  end
This code is used in section 881.
883. ⟨Destroy the t nodes following q, and make r point to the following node 883⟩ ≡
  if t = 0 then r ← link(q)
  else begin r ← q;
    while t > 1 do
      begin r ← link(r); decr(t);
      end;
    s ← link(r); r ← link(s); link(s) ← null; flush_node_list(link(q)); replace_count(q) ← 0;
    end
This code is used in section 882.
884. We move the post-break list from inside node q to the main list by reattaching it just before the
present node r, then resetting r.
⟨Transplant the post-break list 884⟩ ≡
  begin s ← post_break(q);
  while link(s) ≠ null do s ← link(s);
  link(s) ← r; r ← post_break(q); post_break(q) ← null; post_disc_break ← true;
  end
This code is used in section 882.
885. We move the pre-break list from inside node q to the main list by reattaching it just after the present
node q, then resetting q.
⟨Transplant the pre-break list 885⟩ ≡
  begin s ← pre_break(q); link(q) ← s;
  while link(s) ≠ null do s ← link(s);
  pre_break(q) ← null; q ← s;
  end
This code is used in section 882.
886. ⟨Put the \rightskip glue after node q 886⟩ ≡
  r ← new_param_glue(right_skip_code); link(r) ← link(q); link(q) ← r; q ← r
This code is used in section 881.
887. The following code begins with q at the end of the list to be justified. It ends with q at the beginning
of that list, and with link(temp_head) pointing to the remainder of the paragraph, if any.
⟨Put the \leftskip glue at the left and detach this line 887⟩ ≡
  r ← link(q); link(q) ← null; q ← link(temp_head); link(temp_head) ← r;
  if left_skip ≠ zero_glue then
    begin r ← new_param_glue(left_skip_code); link(r) ← q; q ← r;
    end
This code is used in section 880.
888. ⟨Append the new box to the current vertical list, followed by the list of special nodes taken out of
      the box by the packager 888⟩ ≡
  append_to_vlist(just_box);
  if adjust_head ≠ adjust_tail then
    begin link(tail) ← link(adjust_head); tail ← adjust_tail;
    end;
  adjust_tail ← null
This code is used in section 880.
889. Now q points to the hlist that represents the current line of the paragraph. We need to compute the
appropriate line width, pack the line into a box of this size, and shift the box by the appropriate amount of
indentation.
⟨Call the packaging subroutine, setting just_box to the justified box 889⟩ ≡
  if cur_line > last_special_line then
    begin cur_width ← second_width; cur_indent ← second_indent;
    end
  else if par_shape_ptr = null then
      begin cur_width ← first_width; cur_indent ← first_indent;
      end
    else begin cur_width ← mem[par_shape_ptr + 2 ∗ cur_line].sc;
      cur_indent ← mem[par_shape_ptr + 2 ∗ cur_line − 1].sc;
      end;
  adjust_tail ← adjust_head; just_box ← hpack(q, cur_width, exactly); shift_amount(just_box) ← cur_indent
This code is used in section 880.
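The three-way choice of width and indentation can be sketched as follows. This is a hedged illustration, not TeX's data layout: the function name line_dimensions is invented, and the paragraph shape is assumed to be a Python list of (indent, width) pairs indexed from line 1, standing in for the words at mem[par_shape_ptr + 2 ∗ cur_line − 1] and mem[par_shape_ptr + 2 ∗ cur_line].

```python
def line_dimensions(cur_line, last_special_line, par_shape,
                    first_width, first_indent, second_width, second_indent):
    """Return (cur_width, cur_indent) for line number cur_line (1-based).

    par_shape is None or a list of (indent, width) pairs; this list is an
    assumed stand-in for the parshape words stored in mem.
    """
    if cur_line > last_special_line:
        # Past the special lines: every remaining line uses the second values.
        return second_width, second_indent
    if par_shape is None:
        # Hanging indentation or plain paragraph: use the first values.
        return first_width, first_indent
    indent, width = par_shape[cur_line - 1]
    return width, indent


# Usage: a three-line shape; line 2 takes its own entry.
shape = [(10, 100), (20, 200), (30, 300)]
width, indent = line_dimensions(2, 3, shape, 0, 0, 300, 30)
```

How last_special_line and the first and second values are set is decided elsewhere in the program and is taken as given here.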
890. Penalties between the lines of a paragraph come from club and widow lines, from the inter_line_penalty
parameter, and from lines that end at discretionary breaks. Breaking between lines of a two-line paragraph
gets both club-line and widow-line penalties. The local variable pen will be set to the sum of all relevant
penalties for the current line, except that the final line is never penalized.
⟨Append a penalty node, if a nonzero penalty is appropriate 890⟩ ≡
  if cur_line + 1 ≠ best_line then
    begin pen ← inter_line_penalty;
    if cur_line = prev_graf + 1 then pen ← pen + club_penalty;
    if cur_line + 2 = best_line then pen ← pen + final_widow_penalty;
    if disc_break then pen ← pen + broken_penalty;
    if pen ≠ 0 then
      begin r ← new_penalty(pen); link(tail) ← r; tail ← r;
      end;
    end
This code is used in section 880.
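The penalty arithmetic can be checked with a small Python sketch. The function name penalty_after_line and its return convention (None when no penalty node would be appended) are assumptions of this sketch, and the numeric parameter values in the usage lines are illustrative only.

```python
def penalty_after_line(cur_line, best_line, prev_graf, disc_break,
                       inter_line_penalty, club_penalty,
                       final_widow_penalty, broken_penalty):
    """Return the penalty to append after line cur_line, or None."""
    if cur_line + 1 == best_line:
        return None                      # the final line is never penalized
    pen = inter_line_penalty
    if cur_line == prev_graf + 1:
        pen += club_penalty              # first line of the paragraph
    if cur_line + 2 == best_line:
        pen += final_widow_penalty       # next-to-last line
    if disc_break:
        pen += broken_penalty            # line ended at a discretionary
    return pen if pen != 0 else None


# Usage: a two-line paragraph (prev_graf = 0, best_line = 3) gets both
# the club and the widow penalty after its first line.
both = penalty_after_line(1, 3, 0, False, 0, 150, 150, 100)
```

The sketch makes the two-line case from the prose concrete: line 1 is simultaneously the first line and the next-to-last line, so both terms are added.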
891. Pre-hyphenation. When the line-breaking routine is unable to find a feasible sequence of breakpoints,
it makes a second pass over the paragraph, attempting to hyphenate the hyphenatable words. The
goal of hyphenation is to insert discretionary material into the paragraph so that there are more potential
places to break.
The general rules for hyphenation are somewhat complex and technical, because we want to be able to
hyphenate words that are preceded or followed by punctuation marks, and because we want the rules to
work for languages other than English. We also must contend with the fact that hyphens might radically
alter the ligature and kerning structure of a word.
A sequence of characters will be considered for hyphenation only if it belongs to a "potentially hyphenatable
part" of the current paragraph. This is a sequence of nodes $p_0 p_1 \ldots p_m$ where $p_0$ is a glue node,
$p_1 \ldots p_{m-1}$ are either character or ligature or whatsit or implicit kern nodes, and $p_m$ is a glue or
penalty or insertion or adjust or mark or whatsit or explicit kern node. (Therefore hyphenation is disabled by
boxes, math formulas, and discretionary nodes already inserted by the user.) The ligature nodes among
$p_1 \ldots p_{m-1}$ are effectively expanded into the original non-ligature characters; the kern nodes and
whatsits are ignored. Each character c is now classified as either a nonletter (if lc_code(c) = 0), a lowercase
letter (if lc_code(c) = c), or an uppercase letter (otherwise); an uppercase letter is treated as if it were
lc_code(c) for purposes of hyphenation. The characters generated by $p_1 \ldots p_{m-1}$ may begin with
nonletters; let $c_1$ be the first letter that is not in the middle of a ligature. Whatsit nodes preceding $c_1$
are ignored; a whatsit found after $c_1$ will be the terminating node $p_m$. All characters that do not have
the same font as $c_1$ will be treated as nonletters. The hyphen_char for that font must be between 0 and 255,
otherwise hyphenation will not be attempted. TeX looks ahead for as many consecutive letters $c_1 \ldots c_n$
as possible; however, n must be less than 64, so a character that would otherwise be $c_{64}$ is effectively
not a letter. Furthermore $c_n$ must not be in the middle of a ligature. In this way we obtain a string of
letters $c_1 \ldots c_n$ that are generated by nodes $p_a \ldots p_b$, where $1 \le a \le b + 1 \le m$. If
n ≥ l_hyf + r_hyf, this string qualifies for hyphenation; however, uc_hyph must be positive, if $c_1$ is
uppercase.
The hyphenation process takes place in three stages. First, the candidate sequence $c_1 \ldots c_n$ is found;
then potential positions for hyphens are determined by referring to hyphenation tables; and finally, the nodes
$p_a \ldots p_b$ are replaced by a new sequence of nodes that includes the discretionary breaks found.
Fortunately, we do not have to do all this calculation very often, because of the way it has been taken out
of TeX's inner loop. For example, when the second edition of the author's 700-page book Seminumerical
Algorithms was typeset by TeX, only about 1.2 hyphenations needed to be tried per paragraph, since the
line breaking algorithm needed to use two passes on only about 5 per cent of the paragraphs.
⟨Initialize for hyphenating a paragraph 891⟩ ≡
  begin init if trie_not_ready then init_trie;
  tini
  cur_lang ← init_cur_lang; l_hyf ← init_l_hyf; r_hyf ← init_r_hyf;
  end
This code is used in section 863.
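The classification of characters and the final qualification test can be sketched in Python. The names classify and qualifies, and the representation of lc_code as a dictionary from character code to lowercase code, are assumptions made for this illustration; the sketch also assumes that the caller has already collected at most 63 letters from a single font.

```python
import string


def classify(c, lc_code):
    """Classify character code c using a table of lowercase codes."""
    lc = lc_code.get(c, 0)
    if lc == 0:
        return "nonletter"
    return "lowercase" if lc == c else "uppercase"


def qualifies(letters, lc_code, l_hyf, r_hyf, uc_hyph):
    """Decide whether the letter string c_1..c_n qualifies for hyphenation."""
    n = len(letters)
    if n == 0 or n < l_hyf + r_hyf:
        return False                     # too short to hold any hyphen
    if classify(letters[0], lc_code) == "uppercase" and uc_hyph <= 0:
        return False                     # capitalized words are excluded
    return True


# Usage with an illustrative ASCII-only table.
LC = {ord(ch): ord(ch.lower()) for ch in string.ascii_letters}
word = [ord(ch) for ch in "Knuth"]
ok = qualifies(word, LC, l_hyf=2, r_hyf=3, uc_hyph=1)
```

The table LC is only an example; in TeX the lc_code values are parameters that a format or document may change.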
892. The letters $c_1 \ldots c_n$ that are candidates for hyphenation are placed into an array called hc; the
number n is placed into hn; pointers to nodes $p_{a-1}$ and $p_b$ in the description above are placed into
variables ha and hb; and the font number is placed into hf.
⟨Global variables 13⟩ +≡
hc: array [0 .. 65] of 0 .. 256; {word to be hyphenated}
hn: small_number; {the number of positions occupied in hc}
ha, hb: pointer; {nodes ha .. hb should be replaced by the hyphenated result}
hf: internal_font_number; {font number of the letters in hc}
hu: array [0 .. 63] of 0 .. 256; {like hc, before conversion to lowercase}
hyf_char: integer; {hyphen character of the relevant font}
cur_lang, init_cur_lang: ASCII_code; {current hyphenation table of interest}
l_hyf, r_hyf, init_l_hyf, init_r_hyf: integer; {limits on fragment sizes}
hyf_bchar: halfword; {boundary character after $c_n$}
893. Hyphenation routines need a few more local variables.
⟨Local variables for line breaking 862⟩ +≡
j: small_number; {an index into hc or hu}
c: 0 .. 255; {character being considered for hyphenation}
894. When the following code is activated, the line_break procedure is in its second pass, and cur_p points
to a glue node.
⟨Try to hyphenate the following word 894⟩ ≡
  begin prev_s ← cur_p; s ← link(prev_s);
  if s ≠ null then
    begin ⟨Skip to node ha, or goto done1 if no hyphenation should be attempted 896⟩;
    if l_hyf + r_hyf > 63 then goto done1;
    ⟨Skip to node hb, putting letters into hu and hc 897⟩;
    ⟨Check that the nodes following hb permit hyphenation and that at least l_hyf + r_hyf letters have
        been found, otherwise goto done1 899⟩;
    hyphenate;
    end;
done1: end
This code is used in section 866.
895. ⟨Declare subprocedures for line_break 826⟩ +≡
⟨Declare the function called reconstitute 906⟩
procedure hyphenate;
  label common_ending, done, found, found1, found2, not_found, exit;
  var ⟨Local variables for hyphenation 901⟩
  begin ⟨Find hyphen locations for the word in hc, or return 923⟩;
  ⟨If no hyphens were found, return 902⟩;
  ⟨Replace nodes ha .. hb by a sequence of nodes that includes the discretionary hyphens 903⟩;
exit: end;
896. The first thing we need to do is find the node ha just before the first letter.
⟨Skip to node ha, or goto done1 if no hyphenation should be attempted 896⟩ ≡
  loop begin if is_char_node(s) then
      begin c ← qo(character(s)); hf ← font(s);
      end
    else if type(s) = ligature_node then
        if lig_ptr(s) = null then goto continue
        else begin q ← lig_ptr(s); c ← qo(character(q)); hf ← font(q);
          end
      else if (type(s) = kern_node) ∧ (subtype(s) = normal) then goto continue
        else if type(s) = whatsit_node then
            begin ⟨Advance past a whatsit node in the pre-hyphenation loop 1363⟩;
            goto continue;
            end
          else goto done1;
    if lc_code(c) ≠ 0 then
      if (lc_code(c) = c) ∨ (uc_hyph > 0) then goto done2
      else goto done1;
  continue: prev_s ← s; s ← link(prev_s);
    end;
done2: hyf_char ← hyphen_char[hf];
  if hyf_char < 0 then goto done1;
  if hyf_char > 255 then goto done1;
  ha ← prev_s
This code is used in section 894.
897. The word to be hyphenated is now moved to the hu and hc arrays.
⟨Skip to node hb, putting letters into hu and hc 897⟩ ≡
  hn ← 0;
  loop begin if is_char_node(s) then
      begin if font(s) ≠ hf then goto done3;
      hyf_bchar ← character(s); c ← qo(hyf_bchar);
      if lc_code(c) = 0 then goto done3;
      if hn = 63 then goto done3;
      hb ← s; incr(hn); hu[hn] ← c; hc[hn] ← lc_code(c); hyf_bchar ← non_char;
      end
    else if type(s) = ligature_node then ⟨Move the characters of a ligature node to hu and hc; but goto
            done3 if they are not all letters 898⟩
      else if (type(s) = kern_node) ∧ (subtype(s) = normal) then
          begin hb ← s; hyf_bchar ← font_bchar[hf];
          end
        else goto done3;
    s ← link(s);
    end;
done3:
This code is used in section 894.
898. We let j be the index of the character being stored when a ligature node is being expanded, since
we do not want to advance hn until we are sure that the entire ligature consists of letters. Note that it is
possible to get to done3 with hn = 0 and hb not set to any value.
⟨Move the characters of a ligature node to hu and hc; but goto done3 if they are not all letters 898⟩ ≡
  begin if font(lig_char(s)) ≠ hf then goto done3;
  j ← hn; q ← lig_ptr(s); if q > null then hyf_bchar ← character(q);
  while q > null do
    begin c ← qo(character(q));
    if lc_code(c) = 0 then goto done3;
    if j = 63 then goto done3;
    incr(j); hu[j] ← c; hc[j] ← lc_code(c);
    q ← link(q);
    end;
  hb ← s; hn ← j;
  if odd(subtype(s)) then hyf_bchar ← font_bchar[hf] else hyf_bchar ← non_char;
  end
This code is used in section 897.
899. ⟨Check that the nodes following hb permit hyphenation and that at least l_hyf + r_hyf letters have
      been found, otherwise goto done1 899⟩ ≡
  if hn < l_hyf + r_hyf then goto done1; {l_hyf and r_hyf are ≥ 1}
  loop begin if ¬(is_char_node(s)) then
      case type(s) of
      ligature_node: do_nothing;
      kern_node: if subtype(s) ≠ normal then goto done4;
      whatsit_node, glue_node, penalty_node, ins_node, adjust_node, mark_node: goto done4;
      othercases goto done1
      endcases;
    s ← link(s);
    end;
done4:
This code is used in section 894.
900. Post-hyphenation. If a hyphen may be inserted between hc[j] and hc[j + 1], the hyphenation
procedure will set hyf[j] to some small odd number. But before we look at TeX's hyphenation procedure,
which is independent of the rest of the line-breaking algorithm, let us consider what we will do with the
hyphens it finds, since it is better to work on this part of the program before forgetting what ha and hb,
etc., are all about.
⟨Global variables 13⟩ +≡
hyf: array [0 .. 64] of 0 .. 9; {odd values indicate discretionary hyphens}
init_list: pointer; {list of punctuation characters preceding the word}
init_lig: boolean; {does init_list represent a ligature?}
init_lft: boolean; {if so, did the ligature involve a left boundary?}
901. ⟨Local variables for hyphenation 901⟩ ≡
i, j, l: 0 .. 65; {indices into hc or hu}
q, r, s: pointer; {temporary registers for list manipulation}
bchar: halfword; {right boundary character of hyphenated word, or non_char}
See also sections 912, 922, and 929.
This code is used in section 895.
902. TeX will never insert a hyphen that has fewer than \lefthyphenmin letters before it or fewer than
\righthyphenmin after it; hence, a short word has comparatively little chance of being hyphenated. If no
hyphens have been found, we can save time by not having to make any changes to the paragraph.
⟨If no hyphens were found, return 902⟩ ≡
  for j ← l_hyf to hn − r_hyf do
    if odd(hyf[j]) then goto found1;
  return;
found1:
This code is used in section 895.
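The early-exit test amounts to asking whether any position in the permitted window carries an odd value. A minimal Python sketch, with the hypothetical function name has_usable_hyphen and hyf given as a plain list indexed like the Pascal array:

```python
def has_usable_hyphen(hyf, hn, l_hyf, r_hyf):
    """True if some j with l_hyf <= j <= hn - r_hyf has odd hyf[j]."""
    return any(hyf[j] % 2 == 1 for j in range(l_hyf, hn - r_hyf + 1))


# Usage: a 7-letter word with limits 2 and 3 may break only after
# letters 2, 3, or 4; an odd value at position 5 is ignored.
hyf = [0, 0, 0, 0, 0, 1, 0, 0]
usable = has_usable_hyphen(hyf, hn=7, l_hyf=2, r_hyf=3)
```

When the window is empty (hn − r_hyf < l_hyf) the range is empty and the function reports that nothing needs to be done.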
903. If hyphens are in fact going to be inserted, TeX first deletes the subsequence of nodes between ha
and hb. An attempt is made to preserve the effect that implicit boundary characters and punctuation marks
had on ligatures inside the hyphenated word, by storing a left boundary or preceding character in hu[0] and
by storing a possible right boundary in bchar. We set j ← 0 if hu[0] is to be part of the reconstruction;
otherwise j ← 1. The variable s will point to the tail of the current hlist, and q will point to the node
following hb, so that things can be hooked up after we reconstitute the hyphenated word.
⟨Replace nodes ha .. hb by a sequence of nodes that includes the discretionary hyphens 903⟩ ≡
  q ← link(hb); link(hb) ← null; r ← link(ha); link(ha) ← null; bchar ← hyf_bchar;
  if is_char_node(ha) then
    if font(ha) ≠ hf then goto found2
    else begin init_list ← ha; init_lig ← false; hu[0] ← qo(character(ha));
      end
  else if type(ha) = ligature_node then
      if font(lig_char(ha)) ≠ hf then goto found2
      else begin init_list ← lig_ptr(ha); init_lig ← true; init_lft ← (subtype(ha) > 1);
        hu[0] ← qo(character(lig_char(ha)));
        if init_list = null then
          if init_lft then
            begin hu[0] ← 256; init_lig ← false;
            end; {in this case a ligature will be reconstructed from scratch}
        free_node(ha, small_node_size);
        end
    else begin {no punctuation found; look for left boundary}
      if ¬is_char_node(r) then
        if type(r) = ligature_node then
          if subtype(r) > 1 then goto found2;
      j ← 1; s ← ha; init_list ← null; goto common_ending;
      end;
  s ← cur_p; {we have cur_p ≠ ha because type(cur_p) = glue_node}
  while link(s) ≠ ha do s ← link(s);
  j ← 0; goto common_ending;
found2: s ← ha; j ← 0; hu[0] ← 256; init_lig ← false; init_list ← null;
common_ending: flush_node_list(r);
  ⟨Reconstitute nodes for the hyphenated word, inserting discretionary hyphens 913⟩;
  flush_list(init_list)
This code is used in section 895.
904. We must now face the fact that the battle is not over, even though the hyphens have been found: The
process of reconstituting a word can be nontrivial because ligatures might change when a hyphen is present.
The TeXbook discusses the difficulties of the word "difficult", and the discretionary material surrounding a
hyphen can be considerably more complex than that. Suppose abcdef is a word in a font for which the only
ligatures are bc, cd, de, and ef. If this word permits hyphenation between b and c, the two patterns with
and without hyphenation are a b cd ef and a bc de f. Thus the insertion of a hyphen might cause effects to
ripple arbitrarily far into the rest of the word. A further complication arises if additional hyphens appear
together with such rippling, e.g., if the word in the example just given could also be hyphenated between c
and d; TeX avoids this by simply ignoring the additional hyphens in such weird cases.
Still further complications arise in the presence of ligatures that do not delete the original characters.
When punctuation precedes the word being hyphenated, TeX's method is not perfect under all possible
scenarios, because punctuation marks and letters can propagate information back and forth. For example,
suppose the original pre-hyphenation pair *a changes to *y via a |=: ligature, which changes to xy via a
=:| ligature; if $p_{a-1}$ = x and $p_a$ = y, the reconstitution procedure isn't smart enough to obtain xy again.
In such cases the font designer should include a ligature that goes from xa to xy.
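The rippling effect in the abcdef example can be reproduced with a deliberately simplified Python sketch. It assumes that ligatures are formed greedily from left to right, that each ligature consumes exactly two characters, and that a ligature never combines further; real ligature/kern programs are more general, and the function name ligate is invented here.

```python
def ligate(chars, ligatures):
    """Greedy left-to-right pairing of characters into two-letter ligatures."""
    out = []
    i = 0
    while i < len(chars):
        pair = chars[i:i + 2]
        if len(pair) == 2 and pair in ligatures:
            out.append(pair)      # the pair becomes one ligature
            i += 2
        else:
            out.append(chars[i])  # the character stands alone
            i += 1
    return out


LIGS = {"bc", "cd", "de", "ef"}
unbroken = ligate("abcdef", LIGS)               # no hyphen taken
broken = ligate("ab", LIGS) + ligate("cdef", LIGS)  # break between b and c
```

Under these assumptions the unbroken word groups as a, bc, de, f, while breaking between b and c regroups everything after the break as cd, ef, which is the ripple described above.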
905. The processing is facilitated by a subroutine called reconstitute. Given a string of characters
$x_j \ldots x_n$, there is a smallest index $m \ge j$ such that the "translation" of $x_j \ldots x_n$ by ligatures
and kerning has the form $y_1 \ldots y_t$ followed by the translation of $x_{m+1} \ldots x_n$, where
$y_1 \ldots y_t$ is some nonempty sequence of character, ligature, and kern nodes. We call $x_j \ldots x_m$ a
"cut prefix" of $x_j \ldots x_n$. For example, if $x_1 x_2 x_3$ = fly, and if the font contains "fl" as a ligature
and a kern between "fl" and "y", then $m = 2$, $t = 2$, and $y_1$ will be a ligature node for "fl" followed by an
appropriate kern node $y_2$. In the most common case, $x_j$ forms no ligature with $x_{j+1}$ and we simply have
$m = j$, $y_1 = x_j$. If $m < n$ we can repeat the procedure on $x_{m+1} \ldots x_n$ until the entire translation
has been found.
The reconstitute function returns the integer m and puts the nodes $y_1 \ldots y_t$ into a linked list starting
at link(hold_head), getting the input $x_j \ldots x_n$ from the hu array. If $x_j = 256$, we consider $x_j$ to be
an implicit left boundary character; in this case j must be strictly less than n. There is a parameter bchar,
which is either 256 or an implicit right boundary character assumed to be present just following $x_n$. (The
value hu[n + 1] is never explicitly examined, but the algorithm imagines that bchar is there.)
If there exists an index k in the range $j \le k \le m$ such that hyf[k] is odd and such that the result of
reconstitute would have been different if $x_{k+1}$ had been hchar, then reconstitute sets hyphen_passed to the
smallest such k. Otherwise it sets hyphen_passed to zero.
A special convention is used in the case j = 0: Then we assume that the translation of hu[0] appears
in a special list of charnodes starting at init_list; moreover, if init_lig is true, then hu[0] will be a ligature
character, involving a left boundary if init_lft is true. This facility is provided for cases when a hyphenated
word is preceded by punctuation (like single or double quotes) that might affect the translation of the
beginning of the word.
⟨Global variables 13⟩ +≡
hyphen_passed: small_number; {first hyphen in a ligature, if any}
906. ⟨Declare the function called reconstitute 906⟩ ≡
function reconstitute(j, n : small_number; bchar, hchar : halfword): small_number;
  label continue, done;
  var p: pointer; {temporary register for list manipulation}
    t: pointer; {a node being appended to}
    q: four_quarters; {character information or a lig/kern instruction}
    cur_rh: halfword; {hyphen character for ligature testing}
    test_char: halfword; {hyphen or other character for ligature testing}
    w: scaled; {amount of kerning}
    k: font_index; {position of current lig/kern instruction}
  begin hyphen_passed ← 0; t ← hold_head; w ← 0; link(hold_head) ← null;
    {at this point ligature_present = lft_hit = rt_hit = false}
  ⟨Set up data structures with the cursor following position j 908⟩;
continue: ⟨If there's a ligature or kern at the cursor position, update the data structures, possibly
      advancing j; continue until the cursor moves 909⟩;
  ⟨Append a ligature and/or kern to the translation; goto continue if the stack of inserted ligatures is
      nonempty 910⟩;
  reconstitute ← j;
  end;
This code is used in section 895.
907. The reconstitution procedure shares many of the global data structures by which TeX has processed
the words before they were hyphenated. There is an implied cursor between characters cur_l and cur_r;
these characters will be tested for possible ligature activity. If ligature_present then cur_l is a ligature
character formed from the original characters following cur_q in the current translation list. There is a
ligature stack between the cursor and character j + 1, consisting of pseudo-ligature nodes linked together
by their link fields. This stack is normally empty unless a ligature command has created a new character that
will need to be processed later. A pseudo-ligature is a special node having a character field that represents
a potential ligature and a lig_ptr field that points to a char_node or is null. We have
cur r =
] = 0,
hyf num[v
] = min quarterword .
TeX computes an appropriate value v with the new_trie_op subroutine below, by setting
v
).
This subroutine looks up its three parameters in a special hash table, assigning a new value only if these
three have not appeared before for the current language.
The hash table is called trie_op_hash, and the number of entries it contains is trie_op_ptr.
⟨Global variables 13⟩ +≡
init trie_op_hash: array [−trie_op_size .. trie_op_size] of 0 .. trie_op_size;
    {trie op codes for quadruples}
trie_used: array [ASCII_code] of quarterword; {largest opcode used so far for this language}
trie_op_lang: array [1 .. trie_op_size] of ASCII_code; {language part of a hashed quadruple}
trie_op_val: array [1 .. trie_op_size] of quarterword; {opcode corresponding to a hashed quadruple}
trie_op_ptr: 0 .. trie_op_size; {number of stored ops so far}
tini
944 T
E
X82 PART 43: INITIALIZING THE HYPHENATION TABLES 351
944. It's tempting to remove the overflow stops in the following procedure; new_trie_op could return
min_quarterword (thereby simply ignoring part of a hyphenation pattern) instead of aborting the job.
However, that would lead to different hyphenation results on different installations of TeX using the same
patterns. The overflow stops are necessary for portability of patterns.
⟨Declare procedures for preprocessing hyphenation patterns 944⟩ ≡
function new_trie_op(d, n : small_number; v : quarterword): quarterword;
  label exit;
  var h: −trie_op_size .. trie_op_size; {trial hash location}
    u: quarterword; {trial op code}
    l: 0 .. trie_op_size; {pointer to stored data}
  begin h ← abs(n + 313 ∗ d + 361 ∗ v + 1009 ∗ cur_lang) mod (trie_op_size + trie_op_size) − trie_op_size;
  loop begin l ← trie_op_hash[h];
    if l = 0 then {empty position found for a new op}
      begin if trie_op_ptr = trie_op_size then overflow("pattern memory ops", trie_op_size);
      u ← trie_used[cur_lang];
      if u = max_quarterword then
        overflow("pattern memory ops per language", max_quarterword − min_quarterword);
      incr(trie_op_ptr); incr(u); trie_used[cur_lang] ← u; hyf_distance[trie_op_ptr] ← d;
      hyf_num[trie_op_ptr] ← n; hyf_next[trie_op_ptr] ← v; trie_op_lang[trie_op_ptr] ← cur_lang;
      trie_op_hash[h] ← trie_op_ptr; trie_op_val[trie_op_ptr] ← u; new_trie_op ← u; return;
      end;
    if (hyf_distance[l] = d) ∧ (hyf_num[l] = n) ∧ (hyf_next[l] = v) ∧ (trie_op_lang[l] = cur_lang) then
      begin new_trie_op ← trie_op_val[l]; return;
      end;
    if h > −trie_op_size then decr(h) else h ← trie_op_size;
    end;
exit: end;
See also sections 948, 949, 953, 957, 959, 960, and 966.
This code is used in section 942.
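The hashing scheme of new_trie_op can be sketched in Python as follows. This is an illustration under assumptions, not a transcription: the class name TrieOps is invented, the hash table is held in a dictionary keyed by the signed location h (an absent key standing for 0), min_quarterword is taken to be 0, max_quarterword is a constructor parameter, and OverflowError stands in for TeX's overflow stop.

```python
class TrieOps:
    """Hypothetical container for the opcode tables of new_trie_op."""

    def __init__(self, trie_op_size=500, max_quarterword=255):
        self.size = trie_op_size
        self.max_q = max_quarterword
        self.hash = {}        # signed hash location -> index into ops
        self.ops = [None]     # 1-based records (d, n, v, lang, opcode)
        self.used = {}        # language -> largest opcode used so far

    def new_trie_op(self, d, n, v, lang):
        size = self.size
        h = abs(n + 313 * d + 361 * v + 1009 * lang) % (2 * size) - size
        while True:
            l = self.hash.get(h, 0)
            if l == 0:        # empty position found for a new op
                if len(self.ops) - 1 == size:
                    raise OverflowError("pattern memory ops")
                u = self.used.get(lang, 0)
                if u == self.max_q:
                    raise OverflowError("pattern memory ops per language")
                u += 1
                self.used[lang] = u
                self.ops.append((d, n, v, lang, u))
                self.hash[h] = len(self.ops) - 1
                return u
            od, on, ov, olang, oval = self.ops[l]
            if (od, on, ov, olang) == (d, n, v, lang):
                return oval   # this quadruple was seen before
            # Probe downward, wrapping from the bottom to the top.
            h = h - 1 if h > -size else size


ops = TrieOps()
first = ops.new_trie_op(0, 1, 0, 0)
second = ops.new_trie_op(3, 2, first, 0)
```

Opcodes are numbered separately for each language, which is why the sketch keeps a per-language counter alongside the global count of stored records.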
945. After new_trie_op has compressed the necessary opcode information, plenty of information is available
to unscramble the data into the final form needed by our hyphenation algorithm.
⟨Sort the hyphenation op tables into proper order 945⟩ ≡
  op_start[0] ← −min_quarterword;
  for j ← 1 to 255 do op_start[j] ← op_start[j − 1] + qo(trie_used[j − 1]);
  for j ← 1 to trie_op_ptr do trie_op_hash[j] ← op_start[trie_op_lang[j]] + trie_op_val[j]; {destination}
  for j ← 1 to trie_op_ptr do
    while trie_op_hash[j] > j do
      begin k ← trie_op_hash[j];
      t ← hyf_distance[k]; hyf_distance[k] ← hyf_distance[j]; hyf_distance[j] ← t;
      t ← hyf_num[k]; hyf_num[k] ← hyf_num[j]; hyf_num[j] ← t;
      t ← hyf_next[k]; hyf_next[k] ← hyf_next[j]; hyf_next[j] ← t;
      trie_op_hash[j] ← trie_op_hash[k]; trie_op_hash[k] ← k;
      end
This code is used in section 952.
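The second pair of loops is an in-place permutation: each record is swapped toward its destination until every position holds the record that belongs there. A minimal Python sketch, with the hypothetical function name permute_in_place, 1-based lists whose entry 0 is unused, and dest playing the role that trie_op_hash plays during this phase:

```python
def permute_in_place(dest, *arrays):
    """Rearrange parallel arrays so that item j ends up at index dest[j].

    dest and every array are 1-based lists (index 0 is unused); dest must
    be a permutation of 1..n and is overwritten with the identity.
    """
    n = len(dest) - 1
    for j in range(1, n + 1):
        while dest[j] > j:
            k = dest[j]
            for a in arrays:
                a[k], a[j] = a[j], a[k]   # send item j to its destination k
            dest[j] = dest[k]             # the item now at j wants dest[k]
            dest[k] = k                   # position k is settled
    return arrays


# Usage: a goes to slot 3, b to slot 1, c to slot 4, d to slot 2.
items = [None, "a", "b", "c", "d"]
permute_in_place([0, 3, 1, 4, 2], items)
```

Each swap settles at least one position for good, so the total number of swaps is less than the number of records.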
946. Before we forget how to initialize the data structures that have been mentioned so far, let's write
down the code that gets them started.
⟨Initialize table entries (done by INITEX only) 164⟩ +≡
  for k ← −trie_op_size to trie_op_size do trie_op_hash[k] ← 0;
  for k ← 0 to 255 do trie_used[k] ← min_quarterword;
  trie_op_ptr ← 0;
947. The linked trie that is used to preprocess hyphenation patterns appears in several global arrays. Each
node represents an instruction of the form "if you see character c, then perform operation o, move to the
next character, and go to node l; otherwise go to node r." The four quantities c, o, l, and r are stored in four
arrays trie_c, trie_o, trie_l, and trie_r. The root of the trie is trie_l[0], and the number of nodes is trie_ptr.
Null trie pointers are represented by zero. To initialize the trie, we simply set trie_l[0] and trie_ptr to zero.
We also set trie_c[0] to some arbitrary value, since the algorithm may access it.
The algorithms maintain the condition
    trie_c[trie_r[z]] > trie_c[z] whenever z ≠ 0 and trie_r[z] ≠ 0;
in other words, sibling nodes are ordered by their c fields.
define trie_root ≡ trie_l[0] {root of the linked trie}
⟨Global variables 13⟩ +≡
init trie_c: packed array [trie_pointer] of packed_ASCII_code; {characters to match}
trie_o: packed array [trie_pointer] of quarterword; {operations to perform}
trie_l: packed array [trie_pointer] of trie_pointer; {left subtrie links}
trie_r: packed array [trie_pointer] of trie_pointer; {right subtrie links}
trie_ptr: trie_pointer; {the number of nodes in the trie}
trie_hash: packed array [trie_pointer] of trie_pointer; {used to identify equivalent subtries}
tini
948. Let us suppose that a linked trie has already been constructed. Experience shows that we can often
reduce its size by recognizing common subtries; therefore another hash table is introduced for this purpose,
somewhat similar to trie_op_hash. The new hash table will be initialized to zero.
The function trie_node(p) returns p if p is distinct from other nodes that it has seen, otherwise it returns
the number of the first equivalent node that it has seen.
Notice that we might make subtries equivalent even if they correspond to patterns for different languages,
in which the trie ops might mean quite different things. That's perfectly all right.
⟨Declare procedures for preprocessing hyphenation patterns 944⟩ +≡
function trie_node(p : trie_pointer): trie_pointer; {converts to a canonical form}
  label exit;
  var h: trie_pointer; {trial hash location}
    q: trie_pointer; {trial trie node}
  begin h ← abs(trie_c[p] + 1009 * trie_o[p] + 2718 * trie_l[p] + 3142 * trie_r[p]) mod trie_size;
  loop begin q ← trie_hash[h];
    if q = 0 then
      begin trie_hash[h] ← p; trie_node ← p; return;
      end;
    if (trie_c[q] = trie_c[p]) ∧ (trie_o[q] = trie_o[p]) ∧ (trie_l[q] = trie_l[p]) ∧ (trie_r[q] = trie_r[p]) then
      begin trie_node ← q; return;
      end;
    if h > 0 then decr(h) else h ← trie_size;
    end;
exit: end;
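The probing scheme of trie_node can be sketched in Python as follows. The table size and the factory function make_trie_node are assumptions of this sketch; the hash formula and the downward probe with wraparound follow the Pascal code above.

```python
TRIE_SIZE = 16  # hypothetical table size for the example


def make_trie_node(trie_c, trie_o, trie_l, trie_r):
    """Return a trie_node function bound to a fresh, zeroed hash table."""
    trie_hash = [0] * (TRIE_SIZE + 1)

    def trie_node(p):
        h = abs(trie_c[p] + 1009 * trie_o[p]
                + 2718 * trie_l[p] + 3142 * trie_r[p]) % TRIE_SIZE
        while True:
            q = trie_hash[h]
            if q == 0:  # empty slot: p has not been seen before
                trie_hash[h] = p
                return p
            same = (trie_c[q], trie_o[q], trie_l[q], trie_r[q]) == \
                   (trie_c[p], trie_o[p], trie_l[p], trie_r[p])
            if same:  # an equivalent node was seen earlier
                return q
            h = h - 1 if h > 0 else TRIE_SIZE  # probe the next slot, wrapping

    return trie_node
```

As in the original, the loop relies on the table never becoming completely full.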
949. A neat recursive procedure is now able to compress a trie by traversing it and applying trie_node to
its nodes in "bottom up" fashion. We will compress the entire trie by clearing trie_hash to zero and then
saying 'trie_root ← compress_trie(trie_root)'.
⟨Declare procedures for preprocessing hyphenation patterns 944⟩ +≡
function compress_trie(p : trie_pointer): trie_pointer;
  begin if p = 0 then compress_trie ← 0
  else begin trie_l[p] ← compress_trie(trie_l[p]); trie_r[p] ← compress_trie(trie_r[p]);
    compress_trie ← trie_node(p);
    end;
  end;
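A compact way to see the effect of the bottom-up pass is the Python sketch below. It swaps in a dictionary for trie_hash (an assumption of this sketch, not what the program does) and uses a hypothetical wrapper named compress; the recursion itself mirrors compress_trie.

```python
def compress(trie_c, trie_o, trie_l, trie_r):
    """Merge equivalent subtries in place; return the new root."""
    seen = {}  # a dict stands in for trie_hash in this sketch

    def canon(p):
        key = (trie_c[p], trie_o[p], trie_l[p], trie_r[p])
        return seen.setdefault(key, p)  # first equivalent node wins

    def compress_trie(p):
        if p == 0:
            return 0
        # Children are canonicalized first, so equal subtries get equal links.
        trie_l[p] = compress_trie(trie_l[p])
        trie_r[p] = compress_trie(trie_r[p])
        return canon(p)

    trie_l[0] = compress_trie(trie_l[0])
    return trie_l[0]


# Two hypothetical patterns "ax" and "bx" whose final nodes carry the same op.
c = [0, ord("a"), ord("b"), ord("x"), ord("x")]
o = [0, 0, 0, 5, 5]
l = [1, 3, 4, 0, 0]
r = [0, 2, 0, 0, 0]
root = compress(c, o, l, r)
```

After the call, both "a" and "b" point to the same "x" node, so node 4 is no longer reachable.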
950. The compressed trie will be packed into the trie array using a "top-down first-fit" procedure. This
is a little tricky, so the reader should pay close attention: The trie_hash array is cleared to zero again and
renamed trie_ref for this phase of the operation; later on, trie_ref[p] will be nonzero only if the linked trie
node p is the smallest character in a family and if the characters c of that family have been allocated to
locations trie_ref[p] + c in the trie array. Locations of trie that are in use will have trie_link = 0, while
the unused holes in trie will be doubly linked with trie_link pointing to the next larger vacant location and
trie_back pointing to the next smaller one. This double linking will have been carried out only as far as
trie_max, where trie_max is the largest index of trie that will be needed. To save time at the low end of
the trie, we maintain array entries trie_min[c] pointing to the smallest hole that is greater than c. Another
array trie_taken tells whether or not a given location is equal to trie_ref[p] for some p; this array is used to
ensure that distinct nodes in the compressed trie will have distinct trie_ref entries.
define trie_ref ≡ trie_hash {where linked trie families go into trie}
define trie_back(#) ≡ trie[#].lh {backward links in trie holes}
⟨Global variables 13⟩ +≡
init trie_taken: packed array [1 .. trie_size] of boolean; {does a family start here?}
trie_min: array [ASCII_code] of trie_pointer; {the first possible slot for each character}
trie_max: trie_pointer; {largest location used in trie}
trie_not_ready: boolean; {is the trie still in linked form?}
tini
951. Each time \patterns appears, it contributes further patterns to the future trie, which will be built
only when hyphenation is attempted or when a format file is dumped. The boolean variable trie_not_ready
will change to false when the trie is compressed; this will disable further patterns.
⟨Initialize table entries (done by INITEX only) 164⟩ +≡
trie_not_ready ← true; trie_root ← 0; trie_c[0] ← si(0); trie_ptr ← 0;
952. Here is how the trie-compression data structures are initialized. If storage is tight, it would be possible
to overlap trie_op_hash, trie_op_lang, and trie_op_val with trie, trie_hash, and trie_taken, because we finish
with the former just before we need the latter.
⟨Get ready to compress the trie 952⟩ ≡
⟨Sort the hyphenation op tables into proper order 945⟩;
for p ← 0 to trie_size do trie_hash[p] ← 0;
trie_root ← compress_trie(trie_root); {identify equivalent subtries}
for p ← 0 to trie_ptr do trie_ref[p] ← 0;
for p ← 0 to 255 do trie_min[p] ← p + 1;
trie_link(0) ← 1; trie_max ← 0
This code is used in section 966.
953. The first_fit procedure finds the smallest hole z in trie such that a trie family starting at a given
node p will fit into vacant positions starting at z. If c = trie_c[p], this means that location z - c must not
already be taken by some other family, and that z - c + c' must currently be vacant for all characters c'
in the family.
The procedure sets trie_ref[p] to z - c when the first fit has been found.
⟨Declare procedures for preprocessing hyphenation patterns 944⟩ +≡
procedure first_fit(p : trie_pointer); {packs a family into trie}
  label not_found, found;
  var h: trie_pointer; {candidate for trie_ref[p]}
    z: trie_pointer; {runs through holes}
    q: trie_pointer; {runs through the family starting at p}
    c: ASCII_code; {smallest character in the family}
    l, r: trie_pointer; {left and right neighbors}
    ll: 1 .. 256; {upper limit of trie_min updating}
  begin c ← so(trie_c[p]); z ← trie_min[c]; {get the first conceivably good hole}
  loop begin h ← z - c;
    ⟨Ensure that trie_max ≥ h + 256 954⟩;
    if trie_taken[h] then goto not_found;
    ⟨If all characters of the family fit relative to h, then goto found, otherwise goto not_found 955⟩;
  not_found: z ← trie_link(z); {move to the next hole}
    end;
found: ⟨Pack the family into trie relative to h 956⟩;
  end;
954. By making sure that trie_max is at least h + 256, we can be sure that trie_max > z, since h = z - c.
It follows that location trie_max will never be occupied in trie, and we will have trie_max ≥ trie_link(z).
⟨Ensure that trie_max ≥ h + 256 954⟩ ≡
if trie_max < h + 256 then
  begin if trie_size ≤ h + 256 then overflow("pattern memory", trie_size);
  repeat incr(trie_max); trie_taken[trie_max] ← false; trie_link(trie_max) ← trie_max + 1;
    trie_back(trie_max) ← trie_max - 1;
  until trie_max = h + 256;
  end
This code is used in section 953.
955. ⟨If all characters of the family fit relative to h, then goto found, otherwise goto not_found 955⟩ ≡
q ← trie_r[p];
while q > 0 do
  begin if trie_link(h + so(trie_c[q])) = 0 then goto not_found;
  q ← trie_r[q];
  end;
goto found
This code is used in section 953.
956. ⟨Pack the family into trie relative to h 956⟩ ≡
trie_taken[h] ← true; trie_ref[p] ← h; q ← p;
repeat z ← h + so(trie_c[q]); l ← trie_back(z); r ← trie_link(z); trie_back(r) ← l; trie_link(l) ← r;
  trie_link(z) ← 0;
  if l < 256 then
    begin if z < 256 then ll ← z else ll ← 256;
    repeat trie_min[l] ← r; incr(l);
    until l = ll;
    end;
  q ← trie_r[q];
until q = 0
This code is used in section 953.
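The first-fit idea of sections 953 to 956 can be illustrated with a simplified Python sketch. It is not the program's method in detail: plain sets replace the doubly linked hole list, trie_min, and trie_taken, and the function signature is hypothetical. Only the search rule (try each hole z, set h = z - c, and require every h + c' to be free) is taken from the text.

```python
def first_fit(family, occupied, taken):
    """Place one family of character codes and return its base h.

    family:   sorted character codes of the siblings (smallest first)
    occupied: set of trie locations already in use (updated in place)
    taken:    set of bases already assigned to some family (updated in place)
    """
    c = family[0]
    z = c + 1  # mirrors the initial trie_min[c] = c + 1
    while True:
        if z not in occupied:  # z is a hole
            h = z - c
            fits = all(h + ch not in occupied for ch in family)
            if h not in taken and fits:
                taken.add(h)
                occupied.update(h + ch for ch in family)
                return h
        z += 1  # move to the next candidate location


occupied, taken = set(), set()
h1 = first_fit([97, 98], occupied, taken)
h2 = first_fit([97], occupied, taken)
h3 = first_fit([98, 100], occupied, taken)
```

The very first family lands at base 1, and later families slide upward until all of their characters fall into holes and the base is unused.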
957. To pack the entire linked trie, we use the following recursive procedure.
⟨Declare procedures for preprocessing hyphenation patterns 944⟩ +≡
procedure trie_pack(p : trie_pointer); {pack subtries of a family}
  var q: trie_pointer; {a local variable that need not be saved on recursive calls}
  begin repeat q ← trie_l[p];
    if (q > 0) ∧ (trie_ref[q] = 0) then
      begin first_fit(q); trie_pack(q);
      end;
    p ← trie_r[p];
  until p = 0;
  end;
958. When the whole trie has been allocated into the sequential table, we must go through it once again so
that trie contains the correct information. Null pointers in the linked trie will be represented by the value 0,
which properly implements an "empty" family.
⟨Move the data into trie 958⟩ ≡
h.rh ← 0; h.b0 ← min_quarterword; h.b1 ← min_quarterword;
  {trie_link ← 0, trie_op ← min_quarterword, trie_char ← qi(0)}
if trie_root = 0 then {no patterns were given}
  begin for r ← 0 to 256 do trie[r] ← h;
  trie_max ← 256;
  end
else begin trie_fix(trie_root); {this fixes the non-holes in trie}
  r ← 0; {now we will zero out all the holes}
  repeat s ← trie_link(r); trie[r] ← h; r ← s;
  until r > trie_max;
  end;
trie_char(0) ← qi("?"); {make trie_char(c) ≠ c for all c}
This code is used in section 966.
959. The fixing-up procedure is, of course, recursive. Since the linked trie usually has overlapping subtries,
the same data may be moved several times; but that causes no harm, and at most as much work is done as
it took to build the uncompressed trie.
⟨Declare procedures for preprocessing hyphenation patterns 944⟩ +≡
procedure trie_fix(p : trie_pointer); {moves p and its siblings into trie}
  var q: trie_pointer; {a local variable that need not be saved on recursive calls}
    c: ASCII_code; {another one that need not be saved}
    z: trie_pointer; {trie reference; this local variable must be saved}
  begin z ← trie_ref[p];
  repeat q ← trie_l[p]; c ← so(trie_c[p]); trie_link(z + c) ← trie_ref[q]; trie_char(z + c) ← qi(c);
    trie_op(z + c) ← trie_o[p];
    if q > 0 then trie_fix(q);
    p ← trie_r[p];
  until p = 0;
  end;
960. Now let's go back to the easier problem, of building the linked trie. When INITEX has scanned the
'\patterns' control sequence, it calls on new_patterns to do the right thing.
⟨Declare procedures for preprocessing hyphenation patterns 944⟩ +≡
procedure new_patterns; {initializes the hyphenation pattern data}
  label done, done1;
  var k, l: 0 .. 64; {indices into hc and hyf; not always in small_number range}
    digit_sensed: boolean; {should the next digit be treated as a letter?}
    v: quarterword; {trie op code}
    p, q: trie_pointer; {nodes of trie traversed during insertion}
    first_child: boolean; {is p = trie_l[q]?}
    c: ASCII_code; {character being inserted}
  begin if trie_not_ready then
    begin set_cur_lang; scan_left_brace; {a left brace must follow \patterns}
    ⟨Enter all of the patterns into a linked trie, until coming to a right brace 961⟩;
    end
  else begin print_err("Too late for "); print_esc("patterns");
    help1("All patterns must be given before typesetting begins."); error;
    link(garbage) ← scan_toks(false, false); flush_list(def_ref);
    end;
  end;
961. Novices are not supposed to be using \patterns, so the error messages are terse. (Note that all error
messages appear in TeX's string pool, even if they are used only by INITEX.)
⟨Enter all of the patterns into a linked trie, until coming to a right brace 961⟩ ≡
k ← 0; hyf[0] ← 0; digit_sensed ← false;
loop begin get_x_token;
  case cur_cmd of
  letter, other_char: ⟨Append a new letter or a hyphen level 962⟩;
  spacer, right_brace: begin if k > 0 then ⟨Insert a new pattern into the linked trie 963⟩;
    if cur_cmd = right_brace then goto done;
    k ← 0; hyf[0] ← 0; digit_sensed ← false;
    end;
  othercases begin print_err("Bad "); print_esc("patterns"); help1("(See Appendix H.)"); error;
    end
  endcases;
  end;
done:
This code is used in section 960.
962. ⟨Append a new letter or a hyphen level 962⟩ ≡
if digit_sensed ∨ (cur_chr < "0") ∨ (cur_chr > "9") then
  begin if cur_chr = "." then cur_chr ← 0 {edge-of-word delimiter}
  else begin cur_chr ← lc_code(cur_chr);
    if cur_chr = 0 then
      begin print_err("Nonletter"); help1("(See Appendix H.)"); error;
      end;
    end;
  if k < 63 then
    begin incr(k); hc[k] ← cur_chr; hyf[k] ← 0; digit_sensed ← false;
    end;
  end
else if k < 63 then
  begin hyf[k] ← cur_chr - "0"; digit_sensed ← true;
  end
This code is used in section 961.
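To make the roles of hc, hyf, and digit_sensed concrete, here is a small Python sketch of the same scanning rule applied to one pattern string. The function name parse_pattern is hypothetical, str.lower stands in for TeX's lc_code table, and the length limit of 63 and the "Nonletter" error are omitted.

```python
def parse_pattern(text):
    """Split a pattern such as "hy3ph" into letter codes and hyphen levels.

    Returns (hc, hyf) where hc[1..k] are the letter codes (hc[0] is a
    placeholder) and hyf[0..k] are the inter-letter digits.
    """
    hc = [0]
    hyf = [0]
    digit_sensed = False
    for ch in text:
        if digit_sensed or not ("0" <= ch <= "9"):
            # A letter, the edge-of-word dot, or a digit right after a digit.
            code = 0 if ch == "." else ord(ch.lower())
            hc.append(code)
            hyf.append(0)
            digit_sensed = False
        else:
            hyf[-1] = int(ch)  # the digit belongs to the current position k
            digit_sensed = True
    return hc, hyf


hc, hyf = parse_pattern("hy3ph")
```

For "hy3ph" the digit 3 is recorded at position 2, that is, between the letters "y" and "p".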
963. When the following code comes into play, the pattern p_1 ... p_k appears in hc[1 .. k], and the
corresponding sequence of numbers n_0 ... n_k appears in hyf[0 .. k].
⟨Insert a new pattern into the linked trie 963⟩ ≡
begin ⟨Compute the trie op code, v, and set l ← 0 965⟩;
q ← 0; hc[0] ← cur_lang;
while l ≤ k do
  begin c ← hc[l]; incr(l); p ← trie_l[q]; first_child ← true;
  while (p > 0) ∧ (c > so(trie_c[p])) do
    begin q ← p; p ← trie_r[q]; first_child ← false;
    end;
  if (p = 0) ∨ (c < so(trie_c[p])) then
    ⟨Insert a new trie node between q and p, and make p point to it 964⟩;
  q ← p; {now node q represents p_1 ... p_{l-1}}
  end;
if trie_o[q] ≠ min_quarterword then
  begin print_err("Duplicate pattern"); help1("(See Appendix H.)"); error;
  end;
trie_o[q] ← v;
end
This code is used in section 961.
964. ⟨Insert a new trie node between q and p, and make p point to it 964⟩ ≡
begin if trie_ptr = trie_size then overflow("pattern memory", trie_size);
incr(trie_ptr); trie_r[trie_ptr] ← p; p ← trie_ptr; trie_l[p] ← 0;
if first_child then trie_l[q] ← p else trie_r[q] ← p;
trie_c[p] ← si(c); trie_o[p] ← min_quarterword;
end
This code is used in section 963.
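The insertion loop of sections 963 and 964 can be sketched in Python as below. The class name LinkedTrie and its growable lists are assumptions of this sketch, 0 plays the part of min_quarterword, and the language byte hc[0] that TeX inserts first is left out for brevity.

```python
class LinkedTrie:
    """Growable version of the four parallel arrays; node 0 is the header."""

    def __init__(self):
        self.c, self.o, self.l, self.r = [0], [0], [0], [0]

    def insert(self, codes, op):
        q = 0
        for ch in codes:
            p = self.l[q]
            first_child = True
            # Skip siblings with smaller characters (they are kept sorted).
            while p > 0 and ch > self.c[p]:
                q = p
                p = self.r[q]
                first_child = False
            if p == 0 or ch < self.c[p]:
                # New node goes between q and p.
                self.c.append(ch)
                self.o.append(0)
                self.l.append(0)
                self.r.append(p)
                p = len(self.c) - 1
                if first_child:
                    self.l[q] = p
                else:
                    self.r[q] = p
            q = p
        if self.o[q] != 0:
            raise ValueError("Duplicate pattern")
        self.o[q] = op


t = LinkedTrie()
t.insert([ord("b")], 1)
t.insert([ord("a")], 2)
t.insert([ord("a"), ord("c")], 3)
```

Inserting "a" after "b" places the new node in front of its sibling, which keeps the c fields of siblings in increasing order.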
965. ⟨Compute the trie op code, v, and set l ← 0 965⟩ ≡
if hc[1] = 0 then hyf[0] ← 0;
if hc[k] = 0 then hyf[k] ← 0;
l ← k; v ← min_quarterword;
loop begin if hyf[l] ≠ 0 then v ← new_trie_op(k - l, hyf[l], v);
  if l > 0 then decr(l) else goto done1;
  end;
done1:
This code is used in section 963.
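The right-to-left scan that builds the op chain can be pictured with the following Python sketch. Nested tuples stand in for the op codes returned by new_trie_op, which is an assumption of this sketch (the real routine returns a quarterword obtained through a hash table), and the function name op_chain is hypothetical.

```python
def op_chain(hc, hyf):
    """Build the chain of (distance from end, digit, next op) triples.

    hc[1..k] holds letter codes (0 marks a word edge); hyf[0..k] holds digits.
    """
    k = len(hc) - 1
    hyf = list(hyf)
    if hc[1] == 0:
        hyf[0] = 0  # a digit before the leading edge marker is ignored
    if hc[k] == 0:
        hyf[k] = 0  # likewise for a digit after the trailing edge marker
    v = None  # plays the part of min_quarterword, the empty operation
    for l in range(k, -1, -1):
        if hyf[l] != 0:
            v = (k - l, hyf[l], v)
    return v
```

For the pattern "hy3ph" (k = 4, digit 3 at position 2) the chain has a single entry with distance 2.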
966. Finally we put everything together: Here is how the trie gets to its final, efficient form. The following
packing routine is rigged so that the root of the linked tree gets mapped into location 1 of trie, as required
by the hyphenation algorithm. This happens because the first call of first_fit will "take" location 1.
⟨Declare procedures for preprocessing hyphenation patterns 944⟩ +≡
procedure init_trie;
  var p: trie_pointer; {pointer for initialization}
    j, k, t: integer; {all-purpose registers for initialization}
    r, s: trie_pointer; {used to clean up the packed trie}
    h: two_halves; {template used to zero out trie's holes}
  begin ⟨Get ready to compress the trie 952⟩;
  if trie_root ≠ 0 then
    begin first_fit(trie_root); trie_pack(trie_root);
    end;
  ⟨Move the data into trie 958⟩;
  trie_not_ready ← false;
  end;
PART 44: BREAKING VERTICAL LISTS INTO PAGES
967. Breaking vertical lists into pages. The vsplit procedure, which implements TeX's \vsplit
operation, is considerably simpler than line_break because it doesn't have to worry about hyphenation, and
because its mission is to discover a single break instead of an optimum sequence of breakpoints. But before
we get into the details of vsplit, we need to consider a few more basic things.
968. A subroutine called prune_page_top takes a pointer to a vlist and returns a pointer to a modified vlist
in which all glue, kern, and penalty nodes have been deleted before the first box or rule node. However, the
first box or rule is actually preceded by a newly created glue node designed so that the topmost baseline will
be at distance split_top_skip from the top, whenever this is possible without backspacing.
In this routine and those that follow, we make use of the fact that a vertical list contains no character
nodes, hence the type field exists for each node in the list.
function prune_page_top(p : pointer): pointer; {adjust top after page break}
  var prev_p: pointer; {lags one step behind p}
    q: pointer; {temporary variable for list manipulation}
  begin prev_p ← temp_head; link(temp_head) ← p;
  while p ≠ null do
    case type(p) of
    hlist_node, vlist_node, rule_node: ⟨Insert glue for split_top_skip and set p ← null 969⟩;
    whatsit_node, mark_node, ins_node: begin prev_p ← p; p ← link(prev_p);
      end;
    glue_node, kern_node, penalty_node: begin q ← p; p ← link(q); link(q) ← null; link(prev_p) ← p;
      flush_node_list(q);
      end;
    othercases confusion("pruning")
    endcases;
  prune_page_top ← link(temp_head);
  end;
969. ⟨Insert glue for split_top_skip and set p ← null 969⟩ ≡
begin q ← new_skip_param(split_top_skip_code); link(prev_p) ← q; link(q) ← p;
  {now temp_ptr = glue_ptr(q)}
if width(temp_ptr) > height(p) then width(temp_ptr) ← width(temp_ptr) - height(p)
else width(temp_ptr) ← 0;
p ← null;
end
This code is used in section 968.
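The pruning rule of sections 968 and 969 can be summarized by the Python sketch below. It models a vlist as a list of (kind, size) pairs with hypothetical kind names, which is an assumption of this sketch; the real routine edits a linked list in place and frees the discarded nodes.

```python
def prune_page_top(nodes, split_top_skip):
    """Drop discardable nodes before the first box or rule, then add top glue.

    nodes: list of (kind, size) pairs; for boxes and rules, size is the height.
    """
    kept = []
    for i, (kind, size) in enumerate(nodes):
        if kind in ("box", "rule"):
            # The new glue never becomes negative ("without backspacing").
            glue = max(split_top_skip - size, 0)
            return kept + [("glue", glue)] + nodes[i:]
        if kind in ("whatsit", "mark", "ins"):
            kept.append((kind, size))
        elif kind in ("glue", "kern", "penalty"):
            continue  # discarded at the top of the page
        else:
            raise ValueError("pruning")
    return kept


page = [("glue", 5), ("mark", 0), ("penalty", 0), ("box", 3), ("glue", 2)]
result = prune_page_top(page, 10)
```

Nodes after the first box or rule are left untouched, just as the original loop stops by setting p to null.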
970. The next subroutine finds the best place to break a given vertical list so as to obtain a box of
height h, with maximum depth d. A pointer to the beginning of the vertical list is given, and a pointer to
the optimum breakpoint is returned. The list is effectively followed by a forced break, i.e., a penalty node
with the eject_penalty; if the best break occurs at this artificial node, the value null is returned.
An array of six scaled distances is used to keep track of the height from the beginning of the list to the
current place, just as in line_break. In fact, we use one of the same arrays, only changing its name to reflect
its new significance.
define active_height ≡ active_width {new name for the six distance variables}
define cur_height ≡ active_height[1] {the natural height}
define set_height_zero(#) ≡ active_height[#] ← 0 {initialize the height to zero}
define update_heights = 90 {go here to record glue in the active_height table}
function vert_break(p : pointer; h, d : scaled): pointer; {finds optimum page break}
  label done, not_found, update_heights;
  var prev_p: pointer; {if p is a glue node, type(prev_p) determines whether p is a legal breakpoint}
    q, r: pointer; {glue specifications}
    pi: integer; {penalty value}
    b: integer; {badness at a trial breakpoint}
    least_cost: integer; {the smallest badness plus penalties found so far}
    best_place: pointer; {the most recent break that leads to least_cost}
    prev_dp: scaled; {depth of previous box in the list}
    t: small_number; {type of the node following a kern}
  begin prev_p ← p; {an initial glue node is not a legal breakpoint}
  least_cost ← awful_bad; do_all_six(set_height_zero); prev_dp ← 0;
  loop begin ⟨If node p is a legal breakpoint, check if this break is the best known, and goto done if p is
        null or if the page-so-far is already too full to accept more stuff 972⟩;
    prev_p ← p; p ← link(prev_p);
    end;
done: vert_break ← best_place;
  end;
971. A global variable best_height_plus_depth will be set to the natural size of the box that corresponds to
the optimum breakpoint found by vert_break. (This value is used by the insertion-splitting algorithm of the
page builder.)
⟨Global variables 13⟩ +≡
best_height_plus_depth: scaled; {height of the best box, without stretching or shrinking}
972. A subtle point to be noted here is that the maximum depth d might be negative, so cur_height and
prev_dp might need to be corrected even after a glue or kern node.
⟨If node p is a legal breakpoint, check if this break is the best known, and goto done if p is null or if the
      page-so-far is already too full to accept more stuff 972⟩ ≡
if p = null then pi ← eject_penalty
else ⟨Use node p to update the current height and depth measurements; if this node is not a legal
      breakpoint, goto not_found or update_heights, otherwise set pi to the associated penalty at the
      break 973⟩;
⟨Check if node p is a new champion breakpoint; then goto done if p is a forced break or if the page-so-far
    is already too full 974⟩;
if (type(p) < glue_node) ∨ (type(p) > kern_node) then goto not_found;
update_heights: ⟨Update the current height and depth measurements with respect to a glue or kern
    node p 976⟩;
not_found: if prev_dp > d then
  begin cur_height ← cur_height + prev_dp - d; prev_dp ← d;
  end;
This code is used in section 970.
973. ⟨Use node p to update the current height and depth measurements; if this node is not a legal
      breakpoint, goto not_found or update_heights, otherwise set pi to the associated penalty at the
      break 973⟩ ≡
case type(p) of
hlist_node, vlist_node, rule_node: begin
  cur_height ← cur_height + prev_dp + height(p); prev_dp ← depth(p); goto not_found;
  end;
whatsit_node: ⟨Process whatsit p in vert_break loop, goto not_found 1365⟩;
glue_node: if precedes_break(prev_p) then pi ← 0
  else goto update_heights;
kern_node: begin if link(p) = null then t ← penalty_node
  else t ← type(link(p));
  if t = glue_node then pi ← 0 else goto update_heights;
  end;
penalty_node: pi ← penalty(p);
mark_node, ins_node: goto not_found;
othercases confusion("vertbreak")
endcases
This code is used in section 972.
974. define deplorable ≡ 100000 {more than inf_bad, but less than awful_bad}
⟨Check if node p is a new champion breakpoint; then goto done if p is a forced break or if the page-so-far
      is already too full 974⟩ ≡
if pi < inf_penalty then
  begin ⟨Compute the badness, b, using awful_bad if the box is too full 975⟩;
  if b < awful_bad then
    if pi ≤ eject_penalty then b ← pi
    else if b < inf_bad then b ← b + pi
      else b ← deplorable;
  if b ≤ least_cost then
    begin best_place ← p; least_cost ← b; best_height_plus_depth ← cur_height + prev_dp;
    end;
  if (b = awful_bad) ∨ (pi ≤ eject_penalty) then goto done;
  end
This code is used in section 972.
975. ⟨Compute the badness, b, using awful_bad if the box is too full 975⟩ ≡
if cur_height < h then
  if (active_height[3] ≠ 0) ∨ (active_height[4] ≠ 0) ∨ (active_height[5] ≠ 0) then b ← 0
  else b ← badness(h - cur_height, active_height[2])
else if cur_height - h > active_height[6] then b ← awful_bad
else b ← badness(cur_height - h, active_height[6])
This code is used in section 974.
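The cost rule of sections 974 and 975 is easier to follow when written as one function. The Python sketch below is a minimal illustration, not the program's code: approx_badness is a floating-point stand-in for TeX's integer badness routine (roughly 100 times the cube of the ratio, capped at inf_bad), the function names and the infinite_stretch flag are hypothetical, and the caller is assumed to have checked pi < inf_penalty already. The constants use the values TeX82 defines for them.

```python
AWFUL_BAD = 2**30 - 1      # awful_bad
INF_BAD = 10000            # inf_bad
EJECT_PENALTY = -10000     # eject_penalty
DEPLORABLE = 100000        # more than inf_bad, less than awful_bad


def approx_badness(t, s):
    """Stand-in for badness(t, s): about 100 * (t / s) ** 3, capped."""
    if t == 0:
        return 0
    if s <= 0:
        return INF_BAD
    return min(INF_BAD, round(100 * (t / s) ** 3))


def break_cost(cur_height, h, stretch, shrink, pi, infinite_stretch=False):
    """Cost of breaking here, for a penalty pi below inf_penalty."""
    if cur_height < h:
        # Underfull: any infinite stretchability makes the badness zero.
        b = 0 if infinite_stretch else approx_badness(h - cur_height, stretch)
    elif cur_height - h > shrink:
        b = AWFUL_BAD  # too full even after shrinking
    else:
        b = approx_badness(cur_height - h, shrink)
    if b < AWFUL_BAD:
        if pi <= EJECT_PENALTY:
            b = pi          # a forced break overrides the badness
        elif b < INF_BAD:
            b = b + pi
        else:
            b = DEPLORABLE
    return b
```

A break whose cost is at most least_cost becomes the new best_place, and the scan stops once the cost reaches awful_bad or the break is forced.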
976. Vertical lists that are subject to the vert_break procedure should not contain infinite shrinkability,
since that would permit any amount of information to "fit" on one page.
⟨Update the current height and depth measurements with respect to a glue or kern node p 976⟩ ≡
if type(p) = kern_node then q ← p
else begin q ← glue_ptr(p);
  active_height[2 + stretch_order(q)] ← active_height[2 + stretch_order(q)] + stretch(q);
  active_height[6] ← active_height[6] + shrink(q);
  if (shrink_order(q) ≠ normal) ∧ (shrink(q) ≠ 0) then
    begin
    print_err("Infinite glue shrinkage found in box being split");
    help4("The box you are \vsplitting contains some infinitely")
    ("shrinkable glue, e.g., `\vss' or `\vskip 0pt minus 1fil'.")
    ("Such glue doesn't belong there; but you can safely proceed,")
    ("since the offensive shrinkability has been made finite."); error; r ← new_spec(q);
    shrink_order(r) ← normal; delete_glue_ref(q); glue_ptr(p) ← r; q ← r;
    end;
  end;
cur_height ← cur_height + prev_dp + width(q); prev_dp ← 0
This code is used in section 972.
977. Now we are ready to consider vsplit itself. Most of its work is accomplished by the two subroutines
that we have just considered.
Given the number of a vlist box n, and given a desired page height h, the vsplit function finds the best
initial segment of the vlist and returns a box for a page of height h. The remainder of the vlist, if any,
replaces the original box, after removing glue and penalties and adjusting for split_top_skip. Mark nodes
in the split-off box are used to set the values of split_first_mark and split_bot_mark; we use the fact that
split_first_mark = null if and only if split_bot_mark = null.
The original box becomes "void" if and only if it has been entirely extracted. The extracted box is "void"
if and only if the original box was void (or if it was, erroneously, an hlist box).
function vsplit(n : eight_bits; h : scaled): pointer; {extracts a page of height h from box n}
  label exit, done;
  var v: pointer; {the box to be split}
    p: pointer; {runs through the vlist}
    q: pointer; {points to where the break occurs}
  begin v ← box(n);
  if split_first_mark ≠ null then
    begin delete_token_ref(split_first_mark); split_first_mark ← null; delete_token_ref(split_bot_mark);
    split_bot_mark ← null;
    end;
  ⟨Dispense with trivial cases of void or bad boxes 978⟩;
  q ← vert_break(list_ptr(v), h, split_max_depth);
  ⟨Look at all the marks in nodes before the break, and set the final link to null at the break 979⟩;
  q ← prune_page_top(q); p ← list_ptr(v); free_node(v, box_node_size);
  if q = null then box(n) ← null {the eq_level of the box stays the same}
  else box(n) ← vpack(q, natural);
  vsplit ← vpackage(p, h, exactly, split_max_depth);
exit: end;
978. ⟨Dispense with trivial cases of void or bad boxes 978⟩ ≡
if v = null then
  begin vsplit ← null; return;
  end;
if type(v) ≠ vlist_node then
  begin print_err(""); print_esc("vsplit"); print(" needs a "); print_esc("vbox");
  help2("The box you are trying to split is an \hbox.")
  ("I can't split such a box, so I'll leave it alone."); error; vsplit ← null; return;
  end
This code is used in section 977.
979. It's possible that the box begins with a penalty node that is the "best" break, so we must be careful
to handle this special case correctly.
⟨Look at all the marks in nodes before the break, and set the final link to null at the break 979⟩ ≡
p ← list_ptr(v);
if p = q then list_ptr(v) ← null
else loop begin if type(p) = mark_node then
      if split_first_mark = null then
        begin split_first_mark ← mark_ptr(p); split_bot_mark ← split_first_mark;
        token_ref_count(split_first_mark) ← token_ref_count(split_first_mark) + 2;
        end
      else begin delete_token_ref(split_bot_mark); split_bot_mark ← mark_ptr(p);
        add_token_ref(split_bot_mark);
        end;
    if link(p) = q then
      begin link(p) ← null; goto done;
      end;
    p ← link(p);
    end;
done:
This code is used in section 977.
PART 45: THE PAGE BUILDER
980. The page builder. When TeX appends new material to its main vlist in vertical mode, it uses a
method something like vsplit to decide where a page ends, except that the calculations are done "on line"
as new items come in. The main complication in this process is that insertions must be put into their boxes
and removed from the vlist, in a more-or-less optimum manner.
We shall use the term "current page" for that part of the main vlist that is being considered as a candidate
for being broken off and sent to the user's output routine. The current page starts at link(page_head), and
it ends at page_tail. We have page_head = page_tail if this list is empty.
Utter chaos would reign if the user kept changing page specifications while a page is being constructed,
so the page builder keeps the pertinent specifications frozen as soon as the page receives its first box or
insertion. The global variable page_contents is empty when the current page contains only mark nodes and
content-less whatsit nodes; it is inserts_only if the page contains only insertion nodes in addition to marks
and whatsits. Glue nodes, kern nodes, and penalty nodes are discarded until a box or rule node appears, at
which time page_contents changes to box_there. As soon as page_contents becomes non-empty, the current
vsize and max_depth are squirreled away into page_goal and page_max_depth; the latter values will be used
until the page has been forwarded to the user's output routine. The \topskip adjustment is made when
page_contents changes to box_there.
Although page_goal starts out equal to vsize, it is decreased by the scaled natural height-plus-depth of the
insertions considered so far, and by the \skip corrections for those insertions. Therefore it represents the
size into which the non-inserted material should fit, assuming that all insertions in the current page have
been made.
The global variables best_page_break and least_page_cost correspond respectively to the local variables
best_place and least_cost in the vert_break routine that we have already studied; i.e., they record the location
and value of the best place currently known for breaking the current page. The value of page_goal at the
time of the best break is stored in best_size.
define inserts_only = 1 {page_contents when an insert node has been contributed, but no boxes}
define box_there = 2 {page_contents when a box or rule has been contributed}
⟨Global variables 13⟩ +≡
page_tail: pointer; {the final node on the current page}
page_contents: empty .. box_there; {what is on the current page so far?}
page_max_depth: scaled; {maximum box depth on page being built}
best_page_break: pointer; {break here to get the best page known so far}
least_page_cost: integer; {the score for this currently best page}
best_size: scaled; {its page_goal}
981. The page builder has another data structure to keep track of insertions. This is a list of four-word
nodes, starting and ending at page_ins_head. That is, the first element of the list is node
r_1 = link(page_ins_head); node r_j is followed by r_{j+1} = link(r_j); and if there are n items we have
r_{n+1} = page_ins_head. The subtype field of each node in this list refers to an insertion number; for example,
'\insert 250' would correspond to a node whose subtype is qi(250) (the same as the subtype field of the
relevant ins_node). These subtype fields are in increasing order, and subtype(page_ins_head) = qi(255), so
page_ins_head serves as a convenient sentinel at the end of the list. A record is present for each insertion
number that appears in the current page.
The type field in these nodes distinguishes two possibilities that might occur as we look ahead before
deciding on the optimum page break. If type(r) = inserting, then height(r) contains the total of the
height-plus-depth dimensions of the box and all its inserts seen so far. If type(r) = split_up, then no more insertions
will be made into this box, because at least one previous insertion was too big to fit on the current page;
broken_ptr(r) points to the node where that insertion will be split, if TeX decides to split it, broken_ins(r)
points to the insertion node that was tentatively split, and height(r) includes also the natural height plus
depth of the part that would be split off.
In both cases, last_ins_ptr(r) points to the last ins_node encountered for box qo(subtype(r)) that would be
at least partially inserted on the next page; and best_ins_ptr(r) points to the last such ins_node that should
actually be inserted, to get the page with minimum badness among all page breaks considered so far. We
have best_ins_ptr(r) = null if and only if no insertion for this box should be made to produce this optimum
page.
The data structure definitions here use the fact that the height field appears in the fourth word of a box
node.
define page_ins_node_size = 4 {number of words for a page insertion node}
define inserting = 0 {an insertion class that has not yet overflowed}
define split_up = 1 {an overflowed insertion class}
define broken_ptr(#) ≡ link(# + 1) {an insertion for this class will break here if anywhere}
define broken_ins(#) ≡ info(# + 1) {this insertion might break at broken_ptr}
define last_ins_ptr(#) ≡ link(# + 2) {the most recent insertion for this subtype}
define best_ins_ptr(#) ≡ info(# + 2) {the optimum most recent insertion}
⟨Initialize the special list heads and constant nodes 790⟩ +≡
subtype(page_ins_head) ← qi(255); type(page_ins_head) ← split_up; link(page_ins_head) ← page_ins_head;
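The role of page_ins_head as a sentinel can be illustrated with the Python sketch below. The class InsRecord and the function find_or_create are hypothetical names, and the search loop is only one plausible way to use such a list; it assumes insertion numbers below 255, so that the sentinel's subtype of 255 always stops the scan.

```python
class InsRecord:
    """One record in the circular, sorted list of insertion classes."""

    def __init__(self, subtype):
        self.subtype = subtype
        self.height = 0
        self.link = self  # a lone node points to itself


def find_or_create(head, n):
    """Return the record for insertion class n, adding it in sorted order."""
    r = head
    while n >= r.link.subtype:  # the sentinel (255) ends the scan
        r = r.link
    if r.subtype != n:
        q = InsRecord(n)
        q.link = r.link
        r.link = q
        r = q
    return r


page_ins_head = InsRecord(255)
a = find_or_create(page_ins_head, 200)
b = find_or_create(page_ins_head, 100)
c = find_or_create(page_ins_head, 200)
```

Because the head carries the largest possible subtype, the loop needs no separate end-of-list test.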
982. An array page_so_far records the heights and depths of everything on the current page. This array
contains six scaled numbers, like the similar arrays already considered in line_break and vert_break; and it
also contains page_goal and page_depth, since these values are all accessible to the user via set_page_dimen
commands. The value of page_so_far[1] is also called page_total. The stretch and shrink components of the
\skip corrections for each insertion are included in page_so_far, but the natural space components of these
corrections are not, since they have been subtracted from page_goal.
The variable page_depth records the depth of the current page; it has been adjusted so that it is at most
page_max_depth. The variable last_glue points to the glue specification of the most recent node contributed
from the contribution list, if this was a glue node; otherwise last_glue = max_halfword. (If the contribution
list is nonempty, however, the value of last_glue is not necessarily accurate.) The variables last_penalty and
last_kern are similar. And finally, insert_penalties holds the sum of the penalties associated with all split
and floating insertions.
define page_goal ≡ page_so_far[0] {desired height of information on page being built}
define page_total ≡ page_so_far[1] {height of the current page}
define page_shrink ≡ page_so_far[6] {shrinkability of the current page}
define page_depth ≡ page_so_far[7] {depth of the current page}
⟨Global variables 13⟩ +≡
page_so_far: array [0 .. 7] of scaled; {height and glue of the current page}
last_glue: pointer; {used to implement \lastskip}
last_penalty: integer; {used to implement \lastpenalty}
last_kern: scaled; {used to implement \lastkern}
insert_penalties: integer; {sum of the penalties for held-over insertions}
983. ⟨Put each of TeX's primitives into the hash table 226⟩ +≡
primitive("pagegoal", set_page_dimen, 0); primitive("pagetotal", set_page_dimen, 1);
primitive("pagestretch", set_page_dimen, 2); primitive("pagefilstretch", set_page_dimen, 3);
primitive("pagefillstretch", set_page_dimen, 4); primitive("pagefilllstretch", set_page_dimen, 5);
primitive("pageshrink", set_page_dimen, 6); primitive("pagedepth", set_page_dimen, 7);
984. ⟨Cases of print_cmd_chr for symbolic printing of primitives 227⟩ +≡
set_page_dimen: case chr_code of
  0: print_esc("pagegoal");
  1: print_esc("pagetotal");
  2: print_esc("pagestretch");
  3: print_esc("pagefilstretch");
  4: print_esc("pagefillstretch");
  5: print_esc("pagefilllstretch");
  6: print_esc("pageshrink");
  othercases print_esc("pagedepth")
  endcases;
985. define print_plus_end(#) ≡ print(#); end
define print_plus(#) ≡
  if page_so_far[#] ≠ 0 then
    begin print(" plus "); print_scaled(page_so_far[#]); print_plus_end
procedure print_totals;
begin print_scaled(page_total); print_plus(2)(""); print_plus(3)("fil"); print_plus(4)("fill");
print_plus(5)("filll");
if page_shrink ≠ 0 then
  begin print(" minus "); print_scaled(page_shrink);
  end;
end;
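The output format of print_totals can be illustrated with a minimal Python sketch. The function name format_totals and the fmt parameter are hypothetical; in particular, fmt only stands in for print_scaled, whose exact decimal formatting is defined elsewhere in the program and is not reproduced here.

```python
def format_totals(page_so_far, fmt=lambda v: f"{v:.1f}"):
    """Mimic print_totals: total height, then nonzero stretch and shrink.

    page_so_far is an 8-element sequence laid out as in section 982:
    [1] is the total, [2..5] are the stretch components by glue order,
    [6] is the shrink. fmt is an assumed stand-in for print_scaled.
    """
    parts = [fmt(page_so_far[1])]
    # One "plus" clause per nonzero stretch order, as print_plus does.
    for index, unit in ((2, ""), (3, "fil"), (4, "fill"), (5, "filll")):
        if page_so_far[index] != 0:
            parts.append(" plus " + fmt(page_so_far[index]) + unit)
    if page_so_far[6] != 0:
        parts.append(" minus " + fmt(page_so_far[6]))
    return "".join(parts)


example = format_totals([0, 100.0, 5.0, 0, 1.0, 0, 2.0, 0])
```

Zero components are simply omitted, so a page with no glue at all prints only its total height.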
986. ⟨Show the status of the current page 986⟩ ≡
if page_head ≠ page_tail then
  begin print_nl("### current page:");
  if output_active then print(" (held over for next output)");
  show_box(link(page_head));
  if page_contents > empty then
    begin print_nl("total height "); print_totals; print_nl(" goal height ");
    print_scaled(page_goal); r ← link(page_ins_head);
    while r ≠ page_ins_head do
      begin print_ln; print_esc("insert"); t ← qo(subtype(r)); print_int(t); print(" adds ");
      if count(t) = 1000 then t ← height(r)
      else t ← x_over_n(height(r), 1000) ∗ count(t);
      print_scaled(t);
      if type(r) = split_up then
        begin q ← page_head; t ← 0;
        repeat q ← link(q);
          if (type(q) = ins_node) ∧ (subtype(q) = subtype(r)) then incr(t);
        until q = broken_ins(r);
        print(", #"); print_int(t); print(" might split");
        end;
      r ← link(r);
      end;
    end;
  end
This code is used in section 218.
987. Here is a procedure that is called when the page_contents is changing from empty to inserts_only or
box_there.
define set_page_so_far_zero(#) ≡ page_so_far[#] ← 0
procedure freeze_page_specs(s: small_number);
begin page_contents ← s; page_goal ← vsize; page_max_depth ← max_depth; page_depth ← 0;
do_all_six(set_page_so_far_zero); least_page_cost ← awful_bad;
stat if tracing_pages > 0 then
  begin begin_diagnostic; print_nl("%% goal height="); print_scaled(page_goal);
  print(", max depth="); print_scaled(page_max_depth); end_diagnostic(false);
  end; tats
end;
988. Pages are built by appending nodes to the current list in TeX's vertical mode, which is at the
outermost level of the semantic nest. This vlist is split into two parts: the current page that we have been
talking so much about already, and the contribution list that receives new nodes as they are created. The
current page contains everything that the page builder has accounted for in its data structures, as described
above, while the contribution list contains other things that have been generated by other parts of TeX but
have not yet been seen by the page builder. The contribution list starts at link(contrib_head), and it ends
at the current node in TeX's vertical mode.
When TeX has appended new material in vertical mode, it calls the procedure build_page, which tries to
catch up by moving nodes from the contribution list to the current page. This procedure will succeed in its
goal of emptying the contribution list, unless a page break is discovered, i.e., unless the current page has
grown to the point where the optimum next page break has been determined. In the latter case, the nodes
after the optimum break will go back onto the contribution list, and control will effectively pass to the user's
output routine.
We make type(page_head) = glue_node, so that an initial glue node on the current page will not be
considered a valid breakpoint.
⟨Initialize the special list heads and constant nodes 790⟩ +≡
type(page_head) ← glue_node; subtype(page_head) ← normal;
989. The global variable output_active is true during the time the user's output routine is driving TeX.
⟨Global variables 13⟩ +≡
output_active: boolean; {are we in the midst of an output routine?}
990. ⟨Set initial values of key variables 21⟩ +≡
output_active ← false; insert_penalties ← 0;
991. The page builder is ready to start a fresh page if we initialize the following state variables. (However,
the page insertion list is initialized elsewhere.)
⟨Start a new current page 991⟩ ≡
page_contents ← empty; page_tail ← page_head; link(page_head) ← null;
last_glue ← max_halfword; last_penalty ← 0; last_kern ← 0; page_depth ← 0; page_max_depth ← 0
This code is used in sections 215 and 1017.
992. At certain times box 255 is supposed to be void (i.e., null), or an insertion box is supposed to be
ready to accept a vertical list. If not, an error message is printed, and the following subroutine flushes the
unwanted contents, reporting them to the user.
procedure box_error(n: eight_bits);
begin error; begin_diagnostic; print_nl("The following box has been deleted:");
show_box(box(n)); end_diagnostic(true); flush_node_list(box(n)); box(n) ← null;
end;
993. The following procedure guarantees that a given box register does not contain an \hbox.
procedure ensure_vbox(n: eight_bits);
var p: pointer; {the box register contents}
begin p ← box(n);
if p ≠ null then
  if type(p) = hlist_node then
    begin print_err("Insertions can only be added to a vbox");
    help3("Tut tut: You're trying to \insert into a")
    ("\box register that now contains an \hbox.")
    ("Proceed, and I'll discard its present contents."); box_error(n);
    end;
end;
994. TeX is not always in vertical mode at the time build_page is called; the current mode reflects what TeX
should return to, after the contribution list has been emptied. A call on build_page should be immediately
followed by goto big_switch, which is TeX's central control point.
define contribute = 80 {go here to link a node into the current page}
⟨Declare the procedure called fire_up 1012⟩
procedure build_page; {append contributions to the current page}
label exit, done, done1, continue, contribute, update_heights;
var p: pointer; {the node being appended}
  q, r: pointer; {nodes being examined}
  b, c: integer; {badness and cost of current page}
  pi: integer; {penalty to be added to the badness}
  n: min_quarterword .. 255; {insertion box number}
  delta, h, w: scaled; {sizes used for insertion calculations}
begin if (link(contrib_head) = null) ∨ output_active then return;
repeat continue: p ← link(contrib_head);
  ⟨Update the values of last_glue, last_penalty, and last_kern 996⟩;
  ⟨Move node p to the current page; if it is time for a page break, put the nodes following the break
    back onto the contribution list, and return to the user's output routine if there is one 997⟩;
until link(contrib_head) = null;
⟨Make the contribution list empty by setting its tail to contrib_head 995⟩;
exit: end;
995. define contrib_tail ≡ nest[0].tail_field {tail of the contribution list}
⟨Make the contribution list empty by setting its tail to contrib_head 995⟩ ≡
if nest_ptr = 0 then tail ← contrib_head {vertical mode}
else contrib_tail ← contrib_head {other modes}
This code is used in section 994.
996. ⟨Update the values of last_glue, last_penalty, and last_kern 996⟩ ≡
if last_glue ≠ max_halfword then delete_glue_ref(last_glue);
last_penalty ← 0; last_kern ← 0;
if type(p) = glue_node then
  begin last_glue ← glue_ptr(p); add_glue_ref(last_glue);
  end
else begin last_glue ← max_halfword;
  if type(p) = penalty_node then last_penalty ← penalty(p)
  else if type(p) = kern_node then last_kern ← width(p);
  end
This code is used in section 994.
997. The code here is an example of a many-way switch into routines that merge together in different
places. Some people call this "unstructured programming," but the author doesn't see much wrong with it, as
long as the various labels have a well-understood meaning.
⟨Move node p to the current page; if it is time for a page break, put the nodes following the break back
    onto the contribution list, and return to the user's output routine if there is one 997⟩ ≡
⟨If the current page is empty and node p is to be deleted, goto done1; otherwise use node p to update
    the state of the current page; if this node is an insertion, goto contribute; otherwise if this node is
    not a legal breakpoint, goto contribute or update_heights; otherwise set pi to the penalty associated
    with this breakpoint 1000⟩;
⟨Check if node p is a new champion breakpoint; then if it is time for a page break, prepare for output,
    and either fire up the user's output routine and return or ship out the page and goto done 1005⟩;
if (type(p) < glue_node) ∨ (type(p) > kern_node) then goto contribute;
update_heights: ⟨Update the current page measurements with respect to the glue or kern specified by
    node p 1004⟩;
contribute: ⟨Make sure that page_max_depth is not exceeded 1003⟩;
⟨Link node p into the current page and goto done 998⟩;
done1: ⟨Recycle node p 999⟩;
done:
This code is used in section 994.
998. ⟨Link node p into the current page and goto done 998⟩ ≡
link(page_tail) ← p; page_tail ← p; link(contrib_head) ← link(p); link(p) ← null; goto done
This code is used in section 997.
999. ⟨Recycle node p 999⟩ ≡
link(contrib_head) ← link(p); link(p) ← null; flush_node_list(p)
This code is used in section 997.
1000. The title of this section is already so long, it seems best to avoid making it more accurate but still
longer, by mentioning the fact that a kern node at the end of the contribution list will not be contributed
until we know its successor.
⟨If the current page is empty and node p is to be deleted, goto done1; otherwise use node p to update the
    state of the current page; if this node is an insertion, goto contribute; otherwise if this node is not a
    legal breakpoint, goto contribute or update_heights; otherwise set pi to the penalty associated with
    this breakpoint 1000⟩ ≡
case type(p) of
hlist_node, vlist_node, rule_node: if page_contents < box_there then
    ⟨Initialize the current page, insert the \topskip glue ahead of p, and goto continue 1001⟩
  else ⟨Prepare to move a box or rule node to the current page, then goto contribute 1002⟩;
whatsit_node: ⟨Prepare to move whatsit p to the current page, then goto contribute 1364⟩;
glue_node: if page_contents < box_there then goto done1
  else if precedes_break(page_tail) then pi ← 0
  else goto update_heights;
kern_node: if page_contents < box_there then goto done1
  else if link(p) = null then return
  else if type(link(p)) = glue_node then pi ← 0
  else goto update_heights;
penalty_node: if page_contents < box_there then goto done1 else pi ← penalty(p);
mark_node: goto contribute;
ins_node: ⟨Append an insertion to the current page and goto contribute 1008⟩;
othercases confusion("page")
endcases
This code is used in section 997.
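The case analysis of section 1000 can be summarized as a function from the node type and a little page state to the label that control reaches next. The following Python sketch is only an illustration: the string node types, the function name classify, and the pseudo-label "breakpoint" (meaning that pi has been set and control falls through to section 1005) are assumptions made for this example, and the side effects of each case are omitted.

```python
def classify(node_type, page_has_box, next_type=None, tail_precedes_break=False):
    """Return the label section 1000 would reach for node p.

    page_has_box models page_contents = box_there; next_type is the type of
    link(p), or None if p is last on the contribution list;
    tail_precedes_break models precedes_break(page_tail).
    """
    if node_type in ("hlist", "vlist", "rule"):
        # Without a box on the page, topskip glue is inserted first (section 1001).
        return "contribute" if page_has_box else "continue"
    if node_type in ("whatsit", "mark", "ins"):
        return "contribute"
    if node_type == "glue":
        if not page_has_box:
            return "done1"  # discardable item at the top of the page
        return "breakpoint" if tail_precedes_break else "update_heights"
    if node_type == "kern":
        if not page_has_box:
            return "done1"
        if next_type is None:
            return "return"  # wait until the kern's successor is known
        return "breakpoint" if next_type == "glue" else "update_heights"
    if node_type == "penalty":
        return "breakpoint" if page_has_box else "done1"
    raise ValueError("confusion: page")
```

Glue, kerns, and penalties that arrive before any box are recycled through done1, which is how discardable items vanish at the top of a page.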
1001. ⟨Initialize the current page, insert the \topskip glue ahead of p, and goto continue 1001⟩ ≡
begin if page_contents = empty then freeze_page_specs(box_there)
else page_contents ← box_there;
q ← new_skip_param(top_skip_code); {now temp_ptr = glue_ptr(q)}
if width(temp_ptr) > height(p) then width(temp_ptr) ← width(temp_ptr) − height(p)
else width(temp_ptr) ← 0;
link(q) ← p; link(contrib_head) ← q; goto continue;
end
This code is used in section 1000.
1002. ⟨Prepare to move a box or rule node to the current page, then goto contribute 1002⟩ ≡
begin page_total ← page_total + page_depth + height(p); page_depth ← depth(p); goto contribute;
end
This code is used in section 1000.
1003. ⟨Make sure that page_max_depth is not exceeded 1003⟩ ≡
if page_depth > page_max_depth then
  begin page_total ← page_total + page_depth − page_max_depth;
  page_depth ← page_max_depth;
  end;
This code is used in section 997.
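The arithmetic of sections 1001 to 1003 is small enough to sketch directly. The Python below is a minimal illustration using plain integers for scaled values; the function names top_skip_width and add_box are hypothetical, and node handling is left out entirely.

```python
def top_skip_width(top_skip, box_height):
    """Width of the \\topskip glue placed above the first box (section 1001)."""
    return top_skip - box_height if top_skip > box_height else 0


def add_box(page_total, page_depth, height, depth, page_max_depth):
    """Account for a box or rule (section 1002), then clamp the depth (section 1003)."""
    # The previous depth becomes part of the height once something follows it.
    page_total = page_total + page_depth + height
    page_depth = depth
    if page_depth > page_max_depth:
        # Excess depth is moved into the total so that height plus depth is preserved.
        page_total = page_total + page_depth - page_max_depth
        page_depth = page_max_depth
    return page_total, page_depth
```

Note that clamping never changes page_total + page_depth; it only shifts the excess from the depth into the height.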
1004. ⟨Update the current page measurements with respect to the glue or kern specified by node p 1004⟩ ≡
if type(p) = kern_node then q ← p
else begin q ← glue_ptr(p);
  page_so_far[2 + stretch_order(q)] ← page_so_far[2 + stretch_order(q)] + stretch(q);
  page_shrink ← page_shrink + shrink(q);
  if (shrink_order(q) ≠ normal) ∧ (shrink(q) ≠ 0) then
    begin
    print_err("Infinite glue shrinkage found on current page");
    help4("The page about to be output contains some infinitely")
    ("shrinkable glue, e.g., `\vss' or `\vskip 0pt minus 1fil'.")
    ("Such glue doesn't belong there; but you can safely proceed,")
    ("since the offensive shrinkability has been made finite."); error; r ← new_spec(q);
    shrink_order(r) ← normal; delete_glue_ref(q); glue_ptr(p) ← r; q ← r;
    end;
  end;
page_total ← page_total + page_depth + width(q); page_depth ← 0
This code is used in section 997.
1005. ⟨Check if node p is a new champion breakpoint; then if it is time for a page break, prepare for
    output, and either fire up the user's output routine and return or ship out the page and goto
    done 1005⟩ ≡
if pi < inf_penalty then
  begin ⟨Compute the badness, b, of the current page, using awful_bad if the box is too full 1007⟩;
  if b < awful_bad then
    if pi ≤ eject_penalty then c ← pi
    else if b < inf_bad then c ← b + pi + insert_penalties
      else c ← deplorable
  else c ← b;
  if insert_penalties ≥ 10000 then c ← awful_bad;
  stat if tracing_pages > 0 then ⟨Display the page break cost 1006⟩;
  tats
  if c ≤ least_page_cost then
    begin best_page_break ← p; best_size ← page_goal; least_page_cost ← c; r ← link(page_ins_head);
    while r ≠ page_ins_head do
      begin best_ins_ptr(r) ← last_ins_ptr(r); r ← link(r);
      end;
    end;
  if (c = awful_bad) ∨ (pi ≤ eject_penalty) then
    begin fire_up(p); {output the current page at the best place}
    if output_active then return; {user's output routine will act}
    goto done; {the page has been shipped out by default output routine}
    end;
  end
This code is used in section 997.
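The cost rule of section 1005 can be restated as a small pure function. The Python sketch below assumes the usual numeric values of the constants, which are defined elsewhere in the program (inf_bad = 10000, deplorable = 100000, awful_bad = 2^30 − 1, eject_penalty = −10000); the function names are hypothetical, and the caller is expected to have checked pi < inf_penalty already.

```python
INF_BAD = 10000          # assumed value of inf_bad
DEPLORABLE = 100000      # assumed value of deplorable
AWFUL_BAD = 2**30 - 1    # assumed value of awful_bad
EJECT_PENALTY = -10000   # assumed value of eject_penalty


def page_cost(b, pi, insert_penalties):
    """Cost c of breaking at the current node, following section 1005."""
    if b < AWFUL_BAD:
        if pi <= EJECT_PENALTY:
            c = pi                      # a forced break costs just its penalty
        elif b < INF_BAD:
            c = b + pi + insert_penalties
        else:
            c = DEPLORABLE              # underfull beyond repair, but not overfull
    else:
        c = b                           # overfull page
    if insert_penalties >= 10000:
        c = AWFUL_BAD
    return c


def must_fire(c, pi):
    """True when the page builder stops looking and calls fire_up."""
    return c == AWFUL_BAD or pi <= EJECT_PENALTY
```

Because the comparison against least_page_cost uses ≤, a later breakpoint with the same cost replaces an earlier one.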
1006. ⟨Display the page break cost 1006⟩ ≡
begin begin_diagnostic; print_nl("%"); print(" t="); print_totals;
print(" g="); print_scaled(page_goal);
print(" b=");
if b = awful_bad then print_char("*") else print_int(b);
print(" p="); print_int(pi); print(" c=");
if c = awful_bad then print_char("*") else print_int(c);
if c ≤ least_page_cost then print_char("#");
end_diagnostic(false);
end
This code is used in section 1005.
1007. ⟨Compute the badness, b, of the current page, using awful_bad if the box is too full 1007⟩ ≡
if page_total < page_goal then
  if (page_so_far[3] ≠ 0) ∨ (page_so_far[4] ≠ 0) ∨ (page_so_far[5] ≠ 0) then b ← 0
  else b ← badness(page_goal − page_total, page_so_far[2])
else if page_total − page_goal > page_shrink then b ← awful_bad
else b ← badness(page_total − page_goal, page_shrink)
This code is used in section 1005.
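The three-way decision of section 1007 is sketched below in Python. The badness function here is an assumption: it is a simplified integer approximation of 100(t/s)^3 capped at 10000, standing in for the program's own badness routine, which is defined in another part and uses a different fixed-point computation. The names approx_badness and page_badness are hypothetical.

```python
INF_BAD = 10000        # assumed value of inf_bad
AWFUL_BAD = 2**30 - 1  # assumed value of awful_bad


def approx_badness(t, s):
    """Simplified stand-in for badness(t, s): about 100*(t/s)**3, capped."""
    if t == 0:
        return 0
    if s <= 0:
        return INF_BAD
    return min(INF_BAD, (100 * t**3 + s**3 // 2) // s**3)


def page_badness(page_total, page_goal, stretch, page_shrink):
    """Follow section 1007; stretch[0..3] models page_so_far[2..5]."""
    if page_total < page_goal:
        if stretch[1] != 0 or stretch[2] != 0 or stretch[3] != 0:
            return 0  # infinite stretchability fills any gap
        return approx_badness(page_goal - page_total, stretch[0])
    if page_total - page_goal > page_shrink:
        return AWFUL_BAD  # the page is too full even after shrinking
    return approx_badness(page_total - page_goal, page_shrink)
```

Only the overfull case yields awful_bad; an underfull page with no stretch at all gets inf_bad, which section 1005 then turns into the cost deplorable.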
1008. ⟨Append an insertion to the current page and goto contribute 1008⟩ ≡
begin if page_contents = empty then freeze_page_specs(inserts_only);
n ← subtype(p); r ← page_ins_head;
while n ≥ subtype(link(r)) do r ← link(r);
n ← qo(n);
if subtype(r) ≠ qi(n) then ⟨Create a page insertion node with subtype(r) = qi(n), and include the glue
    correction for box n in the current page state 1009⟩;
if type(r) = split_up then insert_penalties ← insert_penalties + float_cost(p)
else begin last_ins_ptr(r) ← p; delta ← page_goal − page_total − page_depth + page_shrink;
    {this much room is left if we shrink the maximum}
  if count(n) = 1000 then h ← height(p)
  else h ← x_over_n(height(p), 1000) ∗ count(n); {this much room is needed}
  if ((h ≤ 0) ∨ (h ≤ delta)) ∧ (height(p) + height(r) ≤ dimen(n)) then
    begin page_goal ← page_goal − h; height(r) ← height(r) + height(p);
    end
  else ⟨Find the best way to split the insertion, and change type(r) to split_up 1010⟩;
  end;
goto contribute;
end
This code is used in section 1000.
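The fit test of section 1008 combines a scaled space requirement with the \dimen n limit. The Python sketch below is an illustration under stated assumptions: x_over_n is modeled as integer division truncating toward zero (its overflow and division-by-zero handling in the program are not reproduced), and the function name insertion_fits and its keyword parameters are hypothetical.

```python
def x_over_n(x, n):
    """Assumed model of x_over_n: integer quotient truncated toward zero."""
    if n < 0:
        x, n = -x, -n
    return x // n if x >= 0 else -((-x) // n)


def insertion_fits(height_p, height_r, count_n, dimen_n,
                   page_goal, page_total, page_depth, page_shrink):
    """Return (h, fits) as decided in section 1008 for an unsplit insertion class."""
    # Room left on the page if all finite shrinkability is used.
    delta = page_goal - page_total - page_depth + page_shrink
    # Room needed, weighted by the scaling factor count_n / 1000.
    h = height_p if count_n == 1000 else x_over_n(height_p, 1000) * count_n
    fits = (h <= 0 or h <= delta) and (height_p + height_r <= dimen_n)
    return h, fits
```

When the insertion fits, the program reduces page_goal by h and adds height(p) to height(r); otherwise section 1010 looks for a split.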
1009. We take note of the value of \skip n and the height plus depth of \box n only when the first
\insert n node is encountered for a new page. A user who changes the contents of \box n after that first
\insert n had better be either extremely careful or extremely lucky, or both.
⟨Create a page insertion node with subtype(r) = qi(n), and include the glue correction for box n in the
    current page state 1009⟩ ≡
begin q ← get_node(page_ins_node_size); link(q) ← link(r); link(r) ← q; r ← q; subtype(r) ← qi(n);
type(r) ← inserting; ensure_vbox(n);
if box(n) = null then height(r) ← 0
else height(r) ← height(box(n)) + depth(box(n));
best_ins_ptr(r) ← null;
q ← skip(n);
if count(n) = 1000 then h ← height(r)
else h ← x_over_n(height(r), 1000) ∗ count(n);
page_goal ← page_goal − h − width(q);
page_so_far[2 + stretch_order(q)] ← page_so_far[2 + stretch_order(q)] + stretch(q);
page_shrink ← page_shrink + shrink(q);
if (shrink_order(q) ≠ normal) ∧ (shrink(q) ≠ 0) then
  begin print_err("Infinite glue shrinkage inserted from "); print_esc("skip"); print_int(n);
  help3("The correction glue for page breaking with insertions")
  ("must have finite shrinkability. But you may proceed,")
  ("since the offensive shrinkability has been made finite."); error;
  end;
end
This code is used in section 1008.
1010. Here is the code that will split a long footnote between pages, in an emergency. The current situation
deserves to be recapitulated: Node p is an insertion into box n; the insertion will not fit, in its entirety, either
because it would make the total contents of box n greater than \dimen n, or because it would make the
incremental amount of growth h greater than the available space delta, or both. (This amount h has been
weighted by the insertion scaling factor, i.e., by \count n over 1000.) Now we will choose the best way to
break the vlist of the insertion, using the same criteria as in the \vsplit operation.
⟨Find the best way to split the insertion, and change type(r) to split_up 1010⟩ ≡
begin if count(n) ≤ 0 then w ← max_dimen
else begin w ← page_goal − page_total − page_depth;
  if count(n) ≠ 1000 then w ← x_over_n(w, count(n)) ∗ 1000;
  end;
if w > dimen(n) − height(r) then w ← dimen(n) − height(r);
q ← vert_break(ins_ptr(p), w, depth(p)); height(r) ← height(r) + best_height_plus_depth;
stat if tracing_pages > 0 then ⟨Display the insertion split cost 1011⟩;
tats
if count(n) ≠ 1000 then best_height_plus_depth ← x_over_n(best_height_plus_depth, 1000) ∗ count(n);
page_goal ← page_goal − best_height_plus_depth; type(r) ← split_up; broken_ptr(r) ← q;
broken_ins(r) ← p;
if q = null then insert_penalties ← insert_penalties + eject_penalty
else if type(q) = penalty_node then insert_penalties ← insert_penalties + penalty(q);
end
This code is used in section 1008.
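The target height w handed to vert_break in section 1010 undoes the \count n weighting and then respects the \dimen n limit. The following Python sketch shows only that computation; x_over_n is again modeled as truncating integer division, max_dimen is assumed to be 2^30 − 1, and the function name split_target is hypothetical.

```python
MAX_DIMEN = 2**30 - 1  # assumed value of max_dimen


def x_over_n(x, n):
    """Assumed model of x_over_n: integer quotient truncated toward zero."""
    if n < 0:
        x, n = -x, -n
    return x // n if x >= 0 else -((-x) // n)


def split_target(page_goal, page_total, page_depth, count_n, dimen_n, height_r):
    """Largest insertion height worth trying, as computed in section 1010."""
    if count_n <= 0:
        w = MAX_DIMEN  # the insertion takes no room on the page
    else:
        w = page_goal - page_total - page_depth
        if count_n != 1000:
            # Convert page space back into insertion space.
            w = x_over_n(w, count_n) * 1000
    # Never exceed what \dimen n still allows for this class.
    if w > dimen_n - height_r:
        w = dimen_n - height_r
    return w
```

For example, with \count n = 500 each unit of insertion height costs half a unit of page height, so 600000sp of free page space admits 1200000sp of insertion material.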
1011. ⟨Display the insertion split cost 1011⟩ ≡
begin begin_diagnostic; print_nl("% split"); print_int(n); print(" to "); print_scaled(w);
print_char(","); print_scaled(best_height_plus_depth);
print(" p=");
if q = null then print_int(eject_penalty)
else if type(q) = penalty_node then print_int(penalty(q))
  else print_char("0");
end_diagnostic(false);
end
This code is used in section 1010.
1012. When the page builder has looked at as much material as could appear before the next page break,
it makes its decision. The break that gave minimum badness will be used to put a completed page into
box 255, with insertions appended to their other boxes.
We also set the values of top_mark, first_mark, and bot_mark. The program uses the fact that bot_mark ≠
null implies first_mark ≠ null; it also knows that bot_mark = null implies top_mark = first_mark = null.
The fire_up subroutine prepares to output the current page at the best place; then it fires up the user's
output routine, if there is one, or it simply ships out the page. There is one parameter, c, which represents
the node that was being contributed to the page when the decision to force an output was made.
⟨Declare the procedure called fire_up 1012⟩ ≡
procedure fire_up(c: pointer);
label exit;
var p, q, r, s: pointer; {nodes being examined and/or changed}
  prev_p: pointer; {predecessor of p}
  n: min_quarterword .. 255; {insertion box number}
  wait: boolean; {should the present insertion be held over?}
  save_vbadness: integer; {saved value of vbadness}
  save_vfuzz: scaled; {saved value of vfuzz}
  save_split_top_skip: pointer; {saved value of split_top_skip}
begin ⟨Set the value of output_penalty 1013⟩;
if bot_mark ≠ null then
  begin if top_mark ≠ null then delete_token_ref(top_mark);
  top_mark ← bot_mark; add_token_ref(top_mark); delete_token_ref(first_mark); first_mark ← null;
  end;
⟨Put the optimal current page into box 255, update first_mark and bot_mark, append insertions to their
    boxes, and put the remaining nodes back on the contribution list 1014⟩;
if (top_mark ≠ null) ∧ (first_mark = null) then
  begin first_mark ← top_mark; add_token_ref(top_mark);
  end;
if output_routine ≠ null then
  if dead_cycles ≥ max_dead_cycles then
    ⟨Explain that too many dead cycles have occurred in a row 1024⟩
  else ⟨Fire up the user's output routine and return 1025⟩;
⟨Perform the default output routine 1023⟩;
exit: end;
This code is used in section 994.
1013. ⟨Set the value of output_penalty 1013⟩ ≡
if type(best_page_break) = penalty_node then
  begin geq_word_define(int_base + output_penalty_code, penalty(best_page_break));
  penalty(best_page_break) ← inf_penalty;
  end
else geq_word_define(int_base + output_penalty_code, inf_penalty)
This code is used in section 1012.
1014. As the page is finally being prepared for output, pointer p runs through the vlist, with prev_p trailing
behind; pointer q is the tail of a list of insertions that are being held over for a subsequent page.
⟨Put the optimal current page into box 255, update first_mark and bot_mark, append insertions to their
    boxes, and put the remaining nodes back on the contribution list 1014⟩ ≡
if c = best_page_break then best_page_break ← null; {c not yet linked in}
⟨Ensure that box 255 is empty before output 1015⟩;
insert_penalties ← 0; {this will count the number of insertions held over}
save_split_top_skip ← split_top_skip;
if holding_inserts ≤ 0 then ⟨Prepare all the boxes involved in insertions to act as queues 1018⟩;
q ← hold_head; link(q) ← null; prev_p ← page_head; p ← link(prev_p);
while p ≠ best_page_break do
  begin if type(p) = ins_node then
    begin if holding_inserts ≤ 0 then ⟨Either insert the material specified by node p into the
        appropriate box, or hold it for the next page; also delete node p from the current page 1020⟩;
    end
  else if type(p) = mark_node then ⟨Update the values of first_mark and bot_mark 1016⟩;
  prev_p ← p; p ← link(prev_p);
  end;
split_top_skip ← save_split_top_skip; ⟨Break the current page at node p, put it in box 255, and put the
    remaining nodes on the contribution list 1017⟩;
⟨Delete the page-insertion nodes 1019⟩
This code is used in section 1012.
1015. ⟨Ensure that box 255 is empty before output 1015⟩ ≡
if box(255) ≠ null then
  begin print_err(""); print_esc("box"); print("255 is not void");
  help2("You shouldn't use \box255 except in \output routines.")
  ("Proceed, and I'll discard its present contents."); box_error(255);
  end
This code is used in section 1014.
1016. ⟨Update the values of first_mark and bot_mark 1016⟩ ≡
begin if first_mark = null then
  begin first_mark ← mark_ptr(p); add_token_ref(first_mark);
  end;
if bot_mark ≠ null then delete_token_ref(bot_mark);
bot_mark ← mark_ptr(p); add_token_ref(bot_mark);
end
This code is used in section 1014.
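The way sections 1012 and 1016 together maintain the three marks across pages can be traced with a short Python sketch. It is a simplified model, not the program's data structure: marks are plain Python values, None plays the role of null, reference counting is omitted, and the function name marks_after_page is hypothetical.

```python
def marks_after_page(state, marks_on_page):
    """Return (top_mark, first_mark, bot_mark) after one page is output.

    state is the triple left over from the previous page; marks_on_page lists
    the mark nodes met before the chosen break, in order.
    """
    top_mark, first_mark, bot_mark = state
    # Section 1012, before scanning: the old bottom mark becomes the new top mark.
    if bot_mark is not None:
        top_mark = bot_mark
        first_mark = None
    # Section 1016, once per mark node on the page.
    for mark in marks_on_page:
        if first_mark is None:
            first_mark = mark
        bot_mark = mark
    # Section 1012, after scanning: a page without marks inherits top_mark.
    if top_mark is not None and first_mark is None:
        first_mark = top_mark
    return top_mark, first_mark, bot_mark


page1 = marks_after_page((None, None, None), ["a", "b"])
page2 = marks_after_page(page1, [])
```

The second call shows the inheritance rule: a page that contains no marks reports the previous page's bottom mark in all three positions.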
1017. When the following code is executed, the current page runs from node link(page_head) to node
prev_p, and the nodes from p to page_tail are to be placed back at the front of the contribution list.
Furthermore the heldover insertions appear in a list from link(hold_head) to q; we will put them into the
current page list for safekeeping while the user's output routine is active. We might have q = hold_head; and
p = null if and only if prev_p = page_tail. Error messages are suppressed within vpackage, since the box
might appear to be overfull or underfull simply because the stretch and shrink from the \skip registers for
inserts are not actually present in the box.
⟨Break the current page at node p, put it in box 255, and put the remaining nodes on the contribution
    list 1017⟩ ≡
if p ≠ null then
  begin if link(contrib_head) = null then
    if nest_ptr = 0 then tail ← page_tail
    else contrib_tail ← page_tail;
  link(page_tail) ← link(contrib_head); link(contrib_head) ← p; link(prev_p) ← null;
  end;
save_vbadness ← vbadness; vbadness ← inf_bad; save_vfuzz ← vfuzz; vfuzz ← max_dimen;
  {inhibit error messages}
box(255) ← vpackage(link(page_head), best_size, exactly, page_max_depth); vbadness ← save_vbadness;
vfuzz ← save_vfuzz;
if last_glue ≠ max_halfword then delete_glue_ref(last_glue);
⟨Start a new current page 991⟩; {this sets last_glue ← max_halfword}
if q ≠ hold_head then
  begin link(page_head) ← link(hold_head); page_tail ← q;
  end
This code is used in section 1014.
1018. If many insertions are supposed to go into the same box, we want to know the position of the
last node in that box, so that we don't need to waste time when linking further information into it. The
last_ins_ptr fields of the page insertion nodes are therefore used for this purpose during the packaging phase.
⟨Prepare all the boxes involved in insertions to act as queues 1018⟩ ≡
begin r ← link(page_ins_head);
while r ≠ page_ins_head do
  begin if best_ins_ptr(r) ≠ null then
    begin n ← qo(subtype(r)); ensure_vbox(n);
    if box(n) = null then box(n) ← new_null_box;
    p ← box(n) + list_offset;
    while link(p) ≠ null do p ← link(p);
    last_ins_ptr(r) ← p;
    end;
  r ← link(r);
  end;
end
This code is used in section 1014.
1019. ⟨Delete the page-insertion nodes 1019⟩ ≡
r ← link(page_ins_head);
while r ≠ page_ins_head do
  begin q ← link(r); free_node(r, page_ins_node_size); r ← q;
  end;
link(page_ins_head) ← page_ins_head
This code is used in section 1014.
1020. We will set best_ins_ptr ← null and package the box corresponding to insertion node r, just after
making the final insertion into that box. If this final insertion is split_up, the remainder after splitting and
pruning (if any) will be carried over to the next page.
⟨Either insert the material specified by node p into the appropriate box, or hold it for the next page; also
    delete node p from the current page 1020⟩ ≡
begin r ← link(page_ins_head);
while subtype(r) ≠ subtype(p) do r ← link(r);
if best_ins_ptr(r) = null then wait ← true
else begin wait ← false; s ← last_ins_ptr(r); link(s) ← ins_ptr(p);
  if best_ins_ptr(r) = p then ⟨Wrap up the box specified by node r, splitting node p if called for; set
      wait ← true if node p holds a remainder after splitting 1021⟩
  else begin while link(s) ≠ null do s ← link(s);
    last_ins_ptr(r) ← s;
    end;
  end;
⟨Either append the insertion node p after node q, and remove it from the current page, or delete
    node(p) 1022⟩;
end
This code is used in section 1014.
1021. ⟨Wrap up the box specified by node r, splitting node p if called for; set wait ← true if node p
    holds a remainder after splitting 1021⟩ ≡
begin if type(r) = split_up then
  if (broken_ins(r) = p) ∧ (broken_ptr(r) ≠ null) then
    begin while link(s) ≠ broken_ptr(r) do s ← link(s);
    link(s) ← null; split_top_skip ← split_top_ptr(p); ins_ptr(p) ← prune_page_top(broken_ptr(r));
    if ins_ptr(p) ≠ null then
      begin temp_ptr ← vpack(ins_ptr(p), natural); height(p) ← height(temp_ptr) + depth(temp_ptr);
      free_node(temp_ptr, box_node_size); wait ← true;
      end;
    end;
best_ins_ptr(r) ← null; n ← qo(subtype(r)); temp_ptr ← list_ptr(box(n));
free_node(box(n), box_node_size); box(n) ← vpack(temp_ptr, natural);
end
This code is used in section 1020.
1022. ⟨Either append the insertion node p after node q, and remove it from the current page, or delete
    node(p) 1022⟩ ≡
link(prev_p) ← link(p); link(p) ← null;
if wait then
  begin link(q) ← p; q ← p; incr(insert_penalties);
  end
else begin delete_glue_ref(split_top_ptr(p)); free_node(p, ins_node_size);
  end;
p ← prev_p
This code is used in section 1020.
1023. The list of heldover insertions, running from link(page_head) to page_tail, must be moved to the
contribution list when the user has specified no output routine.
⟨Perform the default output routine 1023⟩ ≡
begin if link(page_head) ≠ null then
  begin if link(contrib_head) = null then
    if nest_ptr = 0 then tail ← page_tail else contrib_tail ← page_tail
  else link(page_tail) ← link(contrib_head);
  link(contrib_head) ← link(page_head); link(page_head) ← null; page_tail ← page_head;
  end;
ship_out(box(255)); box(255) ← null;
end
This code is used in section 1012.
1024. ⟨Explain that too many dead cycles have occurred in a row 1024⟩ ≡
begin print_err("Output loop---"); print_int(dead_cycles); print(" consecutive dead cycles");
help3("I've concluded that your \output is awry; it never does a")
("\shipout, so I'm shipping \box255 out myself. Next time")
("increase \maxdeadcycles if you want me to be more patient!"); error;
end
This code is used in section 1012.
1025. ⟨Fire up the user's output routine and return 1025⟩ ≡
begin output_active ← true; incr(dead_cycles); push_nest; mode ← −vmode;
prev_depth ← ignore_depth; mode_line ← −line; begin_token_list(output_routine, output_text);
new_save_level(output_group); normal_paragraph; scan_left_brace; return;
end
This code is used in section 1012.
1026. When the user's output routine finishes, it has constructed a vlist in internal vertical mode, and
TeX will do the following:
⟨Resume the page builder after an output routine has come to an end 1026⟩ ≡
begin if (loc ≠ null) ∨ ((token_type ≠ output_text) ∧ (token_type ≠ backed_up)) then
  ⟨Recover from an unbalanced output routine 1027⟩;
end_token_list; {conserve stack space in case more outputs are triggered}
end_graf; unsave; output_active ← false; insert_penalties ← 0;
⟨Ensure that box 255 is empty after output 1028⟩;
if tail ≠ head then {current list goes after heldover insertions}
  begin link(page_tail) ← link(head); page_tail ← tail;
  end;
if link(page_head) ≠ null then {and both go before heldover contributions}
  begin if link(contrib_head) = null then contrib_tail ← page_tail;
  link(page_tail) ← link(contrib_head); link(contrib_head) ← link(page_head); link(page_head) ← null;
  page_tail ← page_head;
  end;
pop_nest; build_page;
end
This code is used in section 1100.
1027. ⟨Recover from an unbalanced output routine 1027⟩ ≡
begin print_err("Unbalanced output routine");
help2("Your sneaky output routine has problematic {'s and/or }'s.")
("I can't handle that very well; good luck."); error;
repeat get_token;
until loc = null;
end {loops forever if reading from a file, since null = min_halfword ≤ 0}
This code is used in section 1026.
1028. ⟨Ensure that box 255 is empty after output 1028⟩ ≡
if box(255) ≠ null then
  begin print_err("Output routine didn't use all of "); print_esc("box"); print_int(255);
  help3("Your \output commands should empty \box255,")
  ("e.g., by saying `\shipout\box255'.")
  ("Proceed; I'll discard its present contents."); box_error(255);
  end
This code is used in section 1026.
1029. The chief executive. We come now to the main_control routine, which contains the master
switch that causes all the various pieces of TeX to do their things, in the right order.
In a sense, this is the grand climax of the program: It applies all the tools that we have worked so hard
to construct. In another sense, this is the messiest part of the program: It necessarily refers to other pieces
of code all over the place, so that a person can't fully understand what is going on without paging back
and forth to be reminded of conventions that are defined elsewhere. We are now at the hub of the web, the
central nervous system that touches most of the other parts and ties them together.
The structure of main_control itself is quite simple. There's a label called big_switch, at which point the
next token of input is fetched using get_x_token. Then the program branches at high speed into one of about
100 possible directions, based on the value of the current mode and the newly fetched command code; the
sum abs(mode) + cur_cmd indicates what to do next. For example, the case vmode + letter arises when a
letter occurs in vertical mode (or internal vertical mode); this case leads to instructions that initialize a new
paragraph and enter horizontal mode.
The big case statement that contains this multiway switch has been labeled reswitch, so that the program
can goto reswitch when the next token has already been fetched. Most of the cases are quite short; they
call an action procedure that does the work for that case, and then they either goto reswitch or they fall
through to the end of the case statement, which returns control back to big_switch. Thus, main_control is
not an extremely large procedure, in spite of the multiplicity of things it must do; it is small enough to be
handled by Pascal compilers that put severe restrictions on procedure size.
One case is singled out for special treatment, because it accounts for most of TeX's activities in typical
applications. The process of reading simple text and converting it into char_node records, while looking for
ligatures and kerns, is part of TeX's inner loop; the whole program runs efficiently when its inner loop is
fast, so this part has been written with particular care.
1030. We shall concentrate first on the inner loop of main_control, deferring consideration of the other
cases until later.
define big_switch = 60 { go here to branch on the next token of input }
define main_loop = 70 { go here to typeset a string of consecutive characters }
define main_loop_wrapup = 80 { go here to finish a character or ligature }
define main_loop_move = 90 { go here to advance the ligature cursor }
define main_loop_move_lig = 95 { same, when advancing past a generated ligature }
define main_loop_lookahead = 100 { go here to bring in another character, if any }
define main_lig_loop = 110 { go here to check for ligatures or kerning }
define append_normal_space = 120 { go here to append a normal space between words }
⟨Declare action procedures for use by main_control 1043⟩
⟨Declare the procedure called handle_right_brace 1068⟩
procedure main_control; { governs TeX's activities }
  label big_switch, reswitch, main_loop, main_loop_wrapup, main_loop_move, main_loop_move + 1,
    main_loop_move + 2, main_loop_move_lig, main_loop_lookahead, main_loop_lookahead + 1,
    main_lig_loop, main_lig_loop + 1, main_lig_loop + 2, append_normal_space, exit;
  var t: integer; { general-purpose temporary variable }
  begin if every_job ≠ null then begin_token_list(every_job, every_job_text);
big_switch: get_x_token;
reswitch: ⟨Give diagnostic information, if requested 1031⟩;
  case abs(mode) + cur_cmd of
  hmode + letter, hmode + other_char, hmode + char_given: goto main_loop;
  hmode + char_num: begin scan_char_num; cur_chr ← cur_val; goto main_loop; end;
  hmode + no_boundary: begin get_x_token;
    if (cur_cmd = letter) ∨ (cur_cmd = other_char) ∨ (cur_cmd = char_given) ∨ (cur_cmd = char_num)
      then cancel_boundary ← true;
    goto reswitch;
    end;
  hmode + spacer: if space_factor = 1000 then goto append_normal_space
    else app_space;
  hmode + ex_space, mmode + ex_space: goto append_normal_space;
  ⟨Cases of main_control that are not part of the inner loop 1045⟩
  end; { of the big case statement }
  goto big_switch;
main_loop: ⟨Append character cur_chr and the following characters (if any) to the current hlist in the
    current font; goto reswitch when a non-character has been fetched 1034⟩;
append_normal_space: ⟨Append a normal inter-word space to the current list, then goto big_switch 1041⟩;
exit: end;
1031. When a new token has just been fetched at big_switch, we have an ideal place to monitor TeX's
activity.
⟨Give diagnostic information, if requested 1031⟩ ≡
  if interrupt ≠ 0 then
    if OK_to_interrupt then
      begin back_input; check_interrupt; goto big_switch;
      end;
  debug if panicking then check_mem(false); gubed
  if tracing_commands > 0 then show_cur_cmd_chr
This code is used in section 1030.
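The interrupt test in this section follows a put-back-and-refetch pattern, which can be sketched in Python as below. This is only an illustration under stated assumptions: the Scanner class and the function fetch_with_monitoring are hypothetical, and the sketch assumes that servicing an interrupt clears the interrupt flag, which this section does not itself state. The point shown is that the freshly fetched token is returned to the input before the interrupt is serviced, and is then fetched again.

```python
class Scanner:
    """Hypothetical stand-in for the input machinery."""

    def __init__(self, tokens):
        self.pending = list(reversed(tokens))  # last element is the next token
        self.interrupt = 0
        self.ok_to_interrupt = True
        self.serviced = []  # record of interrupts that were handled

    def get_token(self):
        return self.pending.pop() if self.pending else None

    def back_input(self, tok):
        # Put the token back so that it will be read again.
        self.pending.append(tok)

    def check_interrupt(self):
        # Assumption: servicing an interrupt clears the flag.
        self.serviced.append(self.interrupt)
        self.interrupt = 0


def fetch_with_monitoring(s):
    """Fetch a token, servicing a pending interrupt first if that is allowed."""
    while True:
        tok = s.get_token()
        if tok is None:
            return None
        if s.interrupt != 0 and s.ok_to_interrupt:
            s.back_input(tok)
            s.check_interrupt()
            continue  # analogue of goto big_switch: fetch again
        return tok
```

If ok_to_interrupt is false, the token is processed normally and the interrupt stays pending until a later fetch.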
1032. The following part of the program was first written in a structured manner, according to the
philosophy that "premature optimization is the root of all evil." Then it was rearranged into pieces of
spaghetti so that the most common actions could proceed with little or no redundancy.
The original unoptimized form of this algorithm resembles the reconstitute procedure, which was described
earlier in connection with hyphenation. Again we have an implied "cursor" between characters cur_l and
cur_r. The main difference is that the lig_stack can now contain a charnode as well as pseudo-ligatures; that
stack is now usually nonempty, because the next character of input (if any) has been appended to it. In
main_control we have
cur r =