Programming A Problem-Oriented Language
A PROBLEM-ORIENTED
LANGUAGE
Charles H. Moore
Preface
This is an unpublished book I wrote long ago. Just after I'd written the
first versions of Forth. Perhaps it explains the motivation behind
Forth. There is some interest in it, and publishing is no longer
relevant since I can post it on my website.
I have a typescript that I recovered from Forth, Inc long ago. I had
typed it on my Smith-Corona portable, complete with overstrikes and
annotations. It is illegible enough to discourage a casual reader, so
I'm re-keying it in HTML.
This is useful, since I used to be a good typist and that skill has
deteriorated. My fingers can use the exercise and I'm curious if I can
reduce my error rate.
I'm making minimal changes to the text; just enough to fit HTML. The
language remains quaint, ungrammatical or unclear. Any remaining
typos are modern.
PROGRAMMING A
PROBLEM-ORIENTED-
LANGUAGE
Charles H. Moore
written ~ June 1970
Contents
1. Introduction
1.1 Basic Principle
1.2 Preview
Figure 1, 2, 3
Figure 6.2
1. Introduction
I'm not sure why you're reading this book. It's taken me a while to
discover why I'm writing it. Let's examine the title: Programming a
Problem-Oriented-Language. The key word is programming. I've
written many programs over the years. I've tried to write good
programs, and I've observed the manner in which I write them rather
critically. My goal has been to decrease the effort required and
increase the quality produced.
I've also been distressed at the lack of concern from others about
problems I consider significant. It amounts to a general indifference
to quality; a casual attitude of confidence that one's programs are
pretty good, in any case as good as necessary. I'm convinced this
confidence is misplaced. Moreover this attitude is reinforced by the
massive trend to high-level languages and a placid acceptance of
their inefficiencies: What's the use of designing a really good
algorithm if the compiler's going to botch it up anyway?
Keep it Simple
In order to help you apply the Basic Principle, I'm going to tell you
how many instructions you should use in some routines. And how
large a program with certain capabilities should be. These numbers
are largely machine independent; basically they measure the
complexity of the task. They are based upon routines I have used in
my programs, so I can substantiate them. Let me warn you now that
I'll be talking about programs that will fit comfortably in 4K words of
core.
Do Not Speculate!
Do not put code in your program that might be used. Do not leave
hooks on which you can hang extensions. The things you might want
to do are infinite; that means that each one has 0 probability of
realization. If you need an extension later, you can code it later - and
probably do a better job than if you did it now. And if someone else
adds the extension, will they notice the hooks you left? Will you
document that aspect of your program?
Do It Yourself!
Now we get down to the nitty-gritty. This is our first clash with the
establishment. The conventional approach, enforced to a greater or
lesser extent, is that you shall use a standard subroutine. I say that
you should write your own subroutines.
Before you can write your own subroutine, you have to know how.
This means, to be practical, that you have written it before; which
makes it difficult to get started. But give it a try. After writing the same
subroutine a dozen times on as many computers and languages,
you'll be pretty good at it. If you don't plan to be programming that
long, you won't be interested in this book.
As I will detail later, the input routine is the most important code in
your program. After all, no one sees your program; but everyone sees
your input. To abdicate to a system subroutine that hasn't the
slightest interest in your particular problem is foolish. The same can
be said for the output subroutine and the disk-access subroutine.
Moreover, the task is not so great as to deter you. Although it takes
hundreds of instructions to write a general purpose subroutine, you
can do what you need with tens of instructions. In fact, I would advise
against writing a subroutine longer than a hundred instructions.
But suppose everyone wrote their own subroutines? Isn't that a step
backward; away from the millennium when our programs are machine
independent, when we all write in the same language, maybe even on
the same computer? Let me take a stand: I can't solve the problems of
the world. With luck, I can write a good program.
1.2 Preview
I'm going to tell you how to write a program. It is a specific program;
that is, a program with a specific structure and capabilities. In
particular, it is a program that can be expanded from simple to
complex along a well-defined path, to handle a wide range of
problems, likewise varying from simple to complex. One of the
problems it considers is exactly the problem of complexity. How can
you control your program so that it doesn't grow more complicated
than your application warrants?
I've gone to some lengths to simplify. I hope that you don't find too
many violations of the Basic Principle, for it's much easier to
elaborate upon a program than it is to strip it to basics. You should
feel free to build upon my basic routines, provided that you recognise
that you are adding a convenience. If you confuse what is expedient
with what is necessary, I guarantee your program will never stop
growing.
You will notice a lack of flow-charts. I've never liked them, for they
seem to include a useless amount of information - either too little or
too much. Besides they imply a greater rigidity in program structure
than usually exists. I will be quite specific about what I think you
should do and how you should do it. But I will use words, and not
diagrams. I doubt that you would give a diagram the attention it
deserved, anyway. Or that I would in preparing it.
However, data very often has input mixed with it - information that
identifies or disposes of the data. For example, a code in col. 80 might
identify a card. It is input, the rest of the card probably data.
But I'll be viewing programs from the input side. I'll be ranking
programs according to the complexity of their input and I plan to
demonstrate that a modest increase in the complexity of input can
provide a substantial decrease in the complexity of the program.
From this point of view, a program with no input is simple.
As you put more thought into the problem, you begin to relate it to
your particular machine: this data comes off tape, that loop is stopped
by . . ., this is really a 3-way branch. You modify the problem as
required by your particular hardware configuration.
I'll have a bit more to say about languages, but mostly we'll stay at the
most abstract level - talking computerese. We won't be talking in
meta-language exclusively. I may tell you to load an index-register or
to jump on negative and you'll have to translate that into the
equivalent for your computer and language.
Each language can be very efficient at its sort of job. But if you want
conditional loops involving complicated decimal expressions you
have a problem.
You will have to code in assembler! Not the whole program, if you
insist, but the important parts that we'll be concentrating on. You
might be able to do some of these in FORTRAN, but it simply isn't
worth the effort. I'll show you where higher-level subroutines can go,
and I think you'll agree there is good reason to restrict them to that
function.
Declare all variables. Even in FORTRAN when you don't have to.
Everyone likes to know what parameters you are using, presumably
need to use; likes to count them, to see if they could use fewer; is
annoyed if you slip one in without mentioning it.
Define everything you can before you reference it. Even in FORTRAN
when you don't have to. Why not? You don't like to read a program
backwards either. 'Everything you can' means everything except
forward jumps. You better not have many forward jumps.
Make the variables as GLOBAL as possible. Why not? You can save
some space and clarify your requirements. For instance, how many Is,
Js and Ks do you need? In most cases a single copy in COMMON
would suffice (you have to declare them, remember, and may as well
put them in COMMON); you can redefine it locally if you must; and it
is of interest that you must.
2.4 Mnemonics
You will find as you read, that I have strong opinions on some
subjects and no opinion of others. Actually I have strong opinions on
all, but sometimes I can't make up my mind which to express.
Fortunately it leaves you some decisions to make for yourself.
Use short words. You don't want to type long words, and I don't want
to read them. In COBOL this means avoid dashes and avoid
qualification, though both can be useful upon occasion.
LA B . Load A with B
In fact it does positive harm: if I see comments like that I'll quit reading
them - and miss the helpful ones.
If you jump somewhere, not intending to come back, you can save
trouble, time and space. But only if you really never come back. To
simulate a subroutine call is worse than ever.
This chapter is full of details, more than I anticipated when I started it.
Although I'm surprised there's so much to say, I think it's all of value. I
only caution you not to get lost in the details; the structure, the
concept of the program are what is important.
To set the stage, let me briefly outline how our program must operate.
You are sitting at a keyboard typing input. You type a string of
characters that the computer breaks into words. It finds each word in
a dictionary, and executes the code indicated by the dictionary entry,
perhaps using parameters also supplied by the entry. The process of
reading words, identifying them and executing code for them is
certainly not unusual. I am simply trying to systematize the process,
to extract the inevitable functions and see that they are efficiently
performed.
We're going to read words from your input, find them in the dictionary,
and execute their code. A particular kind of word is a literal, a word
that identifies itself:
1 17 -3 .5
We won't find such words in the dictionary, but we can identify them
by their appearance. Such words act as if they were in the dictionary,
and the code executed for them places them on a push-down stack.
Other words act upon arguments found on this stack, for example:
+ add the last 2 numbers placed on the stack, leave the sum
there.
, type the number on top of the stack, and remove it from the
stack.
1 17 + ,
We are saying: put 1 onto the stack, 17 onto the stack, add them, and
type their sum. Each word performs its specific, limited function;
independently of any other word. Yet the combination of words
achieves something useful. In fact if we type:
Let's look more closely at the words we used above. They fall into 2
distinct classes; English even provides names for them:
Unary and binary verbs, as well as the type verb ",", are destructive.
The verb DUP, which I define to duplicate the top of the stack, is non-
destructive. In general verbs are destructive. In fact, I deliberately
Literals are nouns. We can define other words as nouns; words that
use their parameter field to place numbers onto the stack:
1. PI 2. * / ,
reads: place 1. onto the stack, place 3.14 onto the stack, place 2. onto
the stack, multiply (2. and PI), divide (1. by 2PI), and type. Constants
are particularly useful when you're using code numbers. It lets you
give names to numbers that might otherwise be hard to remember.
However the most important nouns by far are literals and variables. A
variable gives a name to a location and not to a value, as elementary
programming texts laboriously explain. However, what higher-level
languages conceal is that variables may be used in 2 distinct ways:
X @ ,
I mean: place the address of X onto the stack, fetch its value, and type.
And if I type,
X @ Y @ + ,
I mean: fetch the value of X, the value of Y, add, and type. On the
other hand,
X @ Y =
will: fetch the address of X, then its value, fetch the address of Y, and
store the value of X into Y. But if I type
X Y =
I'm saying: fetch the address of X, the address of Y, and store the
address of X into Y. Maybe this is what I mean to do; it's not
unreasonable.
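The two uses can be made concrete with a sketch (mine, not the book's; Python standing in for your computer), in which memory is an array and a variable is just a name for an address. X and Y here are illustrative addresses, not anything from the text:

```python
# A sketch of the 2 uses of a variable: memory is an array, and a
# variable names an address (an index) in it.
memory = [0] * 16
X, Y = 0, 1
stack = []

def fetch():                # the verb @ : replace the address on top
    stack.append(memory[stack.pop()])   # of the stack by its contents

def store():                # the verb = : pop an address, then pop a
    addr = stack.pop()                  # value, and store the value
    memory[addr] = stack.pop()          # at that address

memory[X] = 7
stack.append(X); fetch()    # X @  leaves the value of X on the stack
stack.append(Y); store()    # Y =  stores that value into Y
```

With this model, X Y = (no @) stores the address of X into Y, exactly as described.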
I don't want to belabor the point, for we're getting ahead of ourselves.
But variables require special verbs, one of which (@) is not ordinarily
explicit. Incidentally, I originally used the word VALUE for @. But the
verb is used so often it deserves a single character name, and I
thought @ (at) had some mnemonic value, besides being otherwise
useless.
X Y = Y @ @ ,
I hope I've given you some idea of how you can put arguments onto
the stack and act on them with verbs. Although I define constants and
variables, unary and binary verbs, I hope it's clear that these are only
examples. You must define the nouns and verbs and perhaps other
kinds of words that are useful for your application. In fact, I think that
is what programming is all about. If you have available a program
such as I will now describe, once you decide what entries an
application requires, you'll find it absolutely trivial to code those
entries, and thus complete your problem.
We are going to read a word from the input string, look up that word
in the dictionary, and jump to the routine it specifies. Each routine will
return to the top of the loop to read another word. We will be
discussing many routines and it will be helpful to have a term to
identify "return to the top of the loop to read another word". I will use
the word RETURN; you should provide a standard macro or label in
your program for the same purpose.
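As an illustration only (the book gives no code here), the whole loop can be sketched in a few lines of Python, with RETURN implicit in falling back to the top of the loop:

```python
# A minimal control loop (illustrative sketch): read a word, find it
# in the dictionary, execute its code, return for the next word.
stack = []
output = []

dictionary = {
    '+': lambda: stack.append(stack.pop() + stack.pop()),
    ',': lambda: output.append(stack.pop()),
}

def control_loop(text):
    for word in text.split():       # WORD: isolate the next word
        if word in dictionary:
            dictionary[word]()      # execute the entry's code
        else:
            stack.append(int(word)) # a literal identifies itself
        # reaching the top of the loop again is RETURN

control_loop('1 17 + ,')            # leaves 18 in output
```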
By the way: since you don't check the stack until after you've executed a
routine, it will exceed the stack limits before you know it. Thus stack
overflow and underflow should be non-fatal. A good solution is to let
the parameter stack overflow into the return stack, and underflow into
the message buffer. The return stack should never underflow.
Very well, let's write the WORD subroutine. It uses the input pointer to
point at the current position in the source text, the output pointer to
point at the current position in memory where we will move the word.
We must move it; partly to align it on a computer-word boundary and
partly because we may want to modify it.
You may want to set an upper limit on word length. Such a limit
should include the largest number you will be using. Then the
question arises as to what to do with a longer word. You might simply
discard the excess characters, providing you don't plan to dissect the
word (Chapter 8). Better, perhaps, that you force a space into the
word at the limit. That is, break the word into 2 words. Presumably
something's wrong and you will eventually discover it in attempting to
process the fragments. However this limit should be large enough - 10
to 20 characters - so that it does not constitute a real restriction on
your input. It should also be 1 character less than a multiple of your
computer-word length, so that you can always include the terminal
space in the aligned word.
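A sketch of such a WORD subroutine in Python (the 4-character computer word and the limit of 7 are assumptions chosen for illustration):

```python
# Sketch of WORD: skip leading spaces, copy characters until a space
# or the length limit, append the terminal space, and pad the result
# to a (here 4-character) computer-word boundary. LIMIT is 1 less
# than a multiple of 4 so the terminal space always fits.
LIMIT = 7

def word(text, input_ptr):
    while input_ptr < len(text) and text[input_ptr] == ' ':
        input_ptr += 1                  # skip leading spaces
    out = []
    while input_ptr < len(text) and text[input_ptr] != ' ':
        if len(out) == LIMIT:           # force a break: the excess
            break                       # characters become a new word
        out.append(text[input_ptr])
        input_ptr += 1
    out.append(' ')                     # include the terminal space
    while len(out) % 4:
        out.append(' ')                 # align on a word boundary
    return ''.join(out), input_ptr
```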
Although it's possible to read cards, I'm going to assume that you
have a keyboard to type input. Now there are 2 kinds of keyboards,
buffered and unbuffered. A buffered keyboard stores the message
until you type an end-of-message character. An unbuffered keyboard
sends each character as you type it. Your hardware, in turn, may
buffer input for you or not.
In any case we may want to examine each character more than once,
so we want buffered input. Even if you can process characters as they
arrive, don't. Store them into a message buffer.
Set aside a 1-line message buffer. Its size is the maximum size of a
message, either input or output, so if you plan to use a 132 position
printer make it large enough.
We will use the same message buffer for both input and output. My
motivation is to save space, or rather to increase the utilization of
space. My reasoning is that input and output are mutually exclusive.
There are exceptions, but we don't usually read input and prepare
output simultaneously. At least we never have to.
Let us define 2 entities: an input pointer and an output pointer. For the
moment you can think of them as index registers, although we will
have to generalize later. Let's also write 2 subroutines, although your
hardware may permit them to be instructions: FETCH will load the
character identified by the input pointer into a register, and advance
the input pointer; DEPOSIT will store that register at the position
identified by the output pointer, and advance the output pointer.
The input and output pointers use index registers. However, those
registers should only be used during a move. They should be loaded
prior to a move and saved after it, for they will be used for a number
of purposes, and it becomes impractical to store anything there
permanently.
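In Python the pair might look like this (a sketch; on your machine they may be single instructions, with the pointers held in index registers only for the duration of one move):

```python
# FETCH and DEPOSIT as a matched pair of subroutines. The pointers
# live in ordinary variables between moves.
source = list('ABC DEF ')      # text the input pointer runs through
dest = [' '] * 8               # memory the output pointer runs through
input_ptr = 0
output_ptr = 0

def fetch():                   # load the character at the input
    global input_ptr           # pointer and advance the pointer
    ch = source[input_ptr]
    input_ptr += 1
    return ch

def deposit(ch):               # store the character at the output
    global output_ptr          # pointer and advance the pointer
    dest[output_ptr] = ch
    output_ptr += 1

ch = fetch()                   # a move is a fetch/deposit loop
while ch != ' ':
    deposit(ch)
    ch = fetch()
deposit(' ')                   # keep the terminal space
```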
We will discuss the stack in the next section. First let's define a
number more precisely.
3.4.1 Numbers
It is very hard to state exactly what is a number and what is not. You
will have to write a NUMBER subroutine to convert numbers to binary,
and this subroutine is the definition of a number. If it can convert a
word to binary, that word is a number; otherwise not.
are some decimal, octal and hex numbers. The number does not
specify its base, and a word that may be a hexadecimal number, may
not be a decimal number.
One of your major tasks will be to decide what kinds of numbers you
need for your application, how you will format them, and how you will
convert them. Each kind of number must be uniquely identifiable by
the NUMBER subroutine, and for each you must provide an output
conversion routine.
Basic Principle!
You can add and subtract such numbers without concern; their
decimal points are aligned. After multiplying 2 numbers, you must
divide by 1000 to re-align the decimal points. Hardware usually
facilitates this; the result of a multiply is a double-precision product in
the proper position for a dividend. Before dividing 2 numbers, you
must multiply the dividend by 1000 to maintain precision and align the
decimal points. Again this is easy.
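For example, with 3 decimal places (a scale of 1000) the rules look like this; the sketch below uses Python integers in place of hardware double-precision registers:

```python
# Fixed-point arithmetic, 3 decimal places: every number is stored
# as (value * 1000). Addition needs no adjustment; a product must be
# divided by 1000, and a dividend pre-multiplied by 1000.
SCALE = 1000

def add(a, b):
    return a + b                  # points already aligned

def multiply(a, b):
    return (a * b) // SCALE       # re-align the decimal point

def divide(a, b):
    return (a * SCALE) // b       # maintain precision first

pi  = 3142                        # 3.142
two = 2 * SCALE                   # 2.000
area = multiply(pi, multiply(two, two))   # pi * 2 * 2 -> 12.568
```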
With this routine, NUMBER can work as follows: set the input pointer
to the start of the aligned word, call SIGNED. If the stopping character
is a decimal point, clear counter, call NATURAL to get the fraction,
and use counter to choose a power-of-ten to convert to a floating or
fixed-point fraction. In any case, apply SIGNED's switch to make
number-so-far negative. Exit.
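That flow can be sketched in Python (NATURAL and the counter are modeled directly; the choice of a floating-point result is mine, for brevity):

```python
# NUMBER along the lines described: NATURAL accumulates digits and
# counts them; a leading minus sets SIGNED's switch; a decimal point
# sends us back for the fraction, and the digit count chooses the
# power of ten. Returns None if the word is not a number.
def number(word):
    def natural(i):                 # digits up to a stopping character
        value, count = 0, 0
        while i < len(word) and word[i].isdigit():
            value = value * 10 + int(word[i])
            count += 1
            i += 1
        return value, count, i

    negative = word.startswith('-') # SIGNED's switch
    whole, _, i = natural(1 if negative else 0)
    result = float(whole)
    if i < len(word) and word[i] == '.':
        fraction, count, i = natural(i + 1)
        result += fraction / 10 ** count
    if i != len(word):              # unexpected stopping character
        return None
    return -result if negative else result
```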
The routine that calls NUMBER can test the stopping character:
If you want to verify that "in" are less than 12, you'll want to modify
this slightly.
If you take the care, and spend a couple of instructions, you can
improve the appearance of your numbers by:
3.5 Stacks
We will be using several push-down stacks and I want to make sure
you can implement them. A push-down stack operates in a last-in
first-out fashion. It is composed of an array and a pointer. The pointer
identifies the last word placed in the array. To place a word onto the
stack you must advance the pointer, and store the word (in that order).
To take a word off the stack you must fetch the word and drop the
pointer (in that order). There is no actual pushing-down involved,
though the effect is the same.
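The order of operations matters: anything else using the same stack (an interrupt, say) must never see a pointer that disagrees with the array. A sketch:

```python
# A push-down stack as an array plus a pointer that identifies the
# last word placed. Push: advance, then store. Pop: fetch, then drop.
stack = [0] * 64
pointer = -1                  # empty: no last word yet

def push(word):
    global pointer
    pointer += 1              # advance the pointer first...
    stack[pointer] = word     # ...then store the word

def pop():
    global pointer
    word = stack[pointer]     # fetch the word first...
    pointer -= 1              # ...then drop the pointer
    return word

push(1); push(17)
total = pop() + pop()         # 18, and the stack is empty again
```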
You must never use this register for any other purpose.
You must keep this register full; no flag to indicate that it's
empty.
If you cannot fulfill these conditions, you're better off with the stack
entirely in core.
You place a word onto the stack, thereby increasing its size.
You drop a word from the stack, thereby decreasing its size.
The word on top of the stack is called the top word.
The word immediately below the top of the stack is called the
lower word.
You may need to control the parameter stack from the input. These
words (dictionary entries) are extremely useful, and illustrate the
terminology above:
3.6 Dictionary
Every program with input must have a dictionary. Many programs
without input have dictionaries. However these are often not
recognised as such. A common 'casual' dictionary is a series of IF . . .
ELSE IF . . . ELSE IF . . . statements, or their equivalent. Indeed this is
a reasonable implementation if the dictionary is small (8 entries) and
non-expandable.
One possibility is to split an entry into two portions, one of fixed size,
one of variable size. This permits scanning fixed size entries to
identify a word and often there are hardware instructions to speed
this search. A part of the fixed entry can be a link to a variable area; of
course you choose the fixed size so as to make the link in the nature
of an overflow - an exception.
An entry has 4 fields: the word being defined, the code to be executed,
a link to the next entry and parameters. Each of these warrants
discussion.
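As a sketch (the book specifies the fields, not this representation), an entry and the backward search of one chain might look like:

```python
# A dictionary entry with its 4 fields, and a search that follows
# links from the latest entry back to the oldest.
class Entry:
    def __init__(self, word, code, link, parameter):
        self.word = word            # the word being defined
        self.code = code            # address of the code to execute
        self.link = link            # the next (older) entry
        self.parameter = parameter  # parameter field

latest = None                       # head of the chain

def define(word, code, parameter=None):
    global latest
    latest = Entry(word, code, latest, parameter)

def search(word):
    entry = latest
    while entry is not None:
        if entry.word == word:
            return entry            # most recent definition wins
        entry = entry.link
    return None

define('ZERO', 'push-parameter', 0)
define('ONE', 'push-parameter', 1)
```

Because the search starts at the latest entry, redefining a word hides the older definition without deleting it.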
The code field should contain the address of a routine rather than an
index to a table or other abbreviation. Program efficiency depends
strongly on how long it takes to get to the code once an entry is
3.6.3 Initialization
The dictionary is built into your program and is presumably initialized
by your compiler. This is certainly true if you have fixed-size entries.
Variable-sized entries must be linked together, however, and this can
be beyond the ability of your compiler, especially if you have multiple
chains.
Other things may need initializing, particularly any registers that are
assigned specific tasks. All such duties should be concentrated in
this one place.
First we choose records with dept = 6 and copy them into a temporary
file. Then we sort that file by name. Then we list it.
List, by age, all employees whose salary is greater than $10,000; and
identify those whose seniority is less than 3:
REWIND END
There are 3 problems with this situation. First, to add an entry to your
dictionary you must re-compile the program. Clearly, you won't be
adding many entries - but maybe you won't have to. Second, all your
entries must be present at the same time. This creates, not so much a
volume problem, as a complexity problem. If your application is
complex, it becomes increasingly difficult to make all aspects
compatible. For instance, to find distinct names for all fields. Third, if
you find an error in an entry you must recompile the program. You
have no ability to correct an entry - though of course you could define
entries to provide that ability.
If you can create dictionary entries you can accomplish 2 things: You
can apply your program to different aspects of your application -
without conflicts and reducing complexity. You can create a
dictionary entry differently, and thus correct an error. In fact, the
purpose of your program undergoes a gradual but important change.
You started with a program that controlled an application. You now
have a program that provides the capability to control an application.
In effect, you have moved up a level from language to meta-language.
This is an extremely important step. It may not be productive. It leads
you from talking to your application to talking about your application.
I hesitate to say whether this is good or bad. By now you surely know
- it depends on the application. I suspect any application of sufficient
complexity, and surely any application of any generality, must
develop a specialized language. Not a control language, but a
descriptive language.
Let me now assume that you have a problem that qualifies for a
descriptive language. What dictionary entries do you need?
Recall the control loop: it reads a word and searches the dictionary. If
you want to define a word, you must not let the control loop see it.
Instead you must define an entry that will read the next word and use
it before RETURNing to the control loop. In effect, it renders the
following word invisible. It must call the word subroutine, which is
why it is a subroutine rather than a routine. Let us call such an entry a
defining entry; its purpose is to define the next word.
I'm afraid this is confusing. We have one entry that supplies the
address field of a new entry from its own parameter field. Let's take an
example; suppose we want to define a constant:
0 CONSTANT ZERO
0 is placed on the stack; the code for the word CONSTANT reads the
next word, ZERO, and constructs a dictionary entry for it: it
establishes the link to a previous entry, stores 0 from the stack into
the parameter field, and from its own parameter field stores the
address of the code ZERO will execute. This is, presumably, the
address of code that will place the contents of the parameter field
onto the stack.
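The steps above can be sketched in Python (the dictionary here is a plain dict, and reading the next word is done by sharing one iterator with the control loop; both are illustrative choices, not the book's mechanism):

```python
# CONSTANT as a defining entry: it reads the following word itself,
# so the control loop never sees ZERO until it has been defined.
stack, output = [], []
words = iter('0 CONSTANT ZERO   ZERO ,'.split())
dictionary = {}

def constant():
    name = next(words)          # read the next word: ZERO
    value = stack.pop()         # 0, taken from the stack
    # the new entry's code places its parameter onto the stack
    dictionary[name] = lambda: stack.append(value)

dictionary['CONSTANT'] = constant
dictionary[','] = lambda: output.append(stack.pop())

for w in words:                 # the control loop
    if w in dictionary:
        dictionary[w]()
    else:
        stack.append(int(w))    # literals
```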
If you then define appropriate verbs to advance, test and reset J, you
can have a powerful indexing facility. Or define:
Anything you need for your application you can define. But you can
never define everything. Basic Principle!
There is only one feasible way to delete entries. That is to delete all
entries after a certain point. If you were to delete specific entries, you
would leave holes in the dictionary, since it occupies contiguous core.
If you attempt to pack the dictionary to recover the holes, you are
faced with a wicked re-location problem, since we use absolute
addresses. To avoid absolute addresses is inefficient and
unnecessary.
One exception is when you use some entries to construct others. The
constructing entries are then no longer needed, and there is no way
to get rid of them. It happens; I may even give some examples later.
But all you lose is dictionary space, and I can't see a practical
solution.
OK, how do you delete trailing entries? You want to mark a point in
your dictionary and reset everything to that position. One thing is the
dictionary pointer that identifies the next available word in the
dictionary. That's easy. However you must reset the chain heads that
identify the previous entry for each of your search chains. It only
takes a small loop: follow each chain back, as you do when searching,
until you find a link that precedes your indicated point.
If you have fixed-size entries, you must reset the pointer to the
parameter area, but you don't have to follow links.
REMEMBER HERE
4.3 Operations
Recall that the stack is where arguments are found. There are some
words you may want to define to provide arithmetic capabilities. They
are of little value to a control language, but essential to add power to
it. I'll use logical constructs TRUE (1) and FALSE (0). And remember
the definition of top and lower from 3.6.
Binary operators: Remove top from the stack and replace lower by a
function of both.
These are only samples. Clearly you are free to define whatever words
you feel useful. Keep in mind that you must place the arguments on
the stack before you operate on them. Numbers are automatically
placed on the stack. Constants are too. Thus the following make
sense:
1 2 +
PI 2. *
1 2 + 3 * 7 MOD 4 MAX
1 2 3 + *
Do not bother with mixed-mode arithmetic. You never need it, and it's
not convenient often enough to be worth the great bother. With
multiple word numbers (complex, double-precision) you may put the
address of the number on the stack. However, this leads to 3-address
operations with the result generally replacing one of the arguments.
And this, in turn, leads to complications about constants.
returning to a position where the next word can be found. You might
consider a definition to be just that: a series of subroutine calls with
the addresses of the subroutines constituting the definition.
Of course you have to pay for this convenience, though probably less
than you would with FORTRAN subroutine calls. The price is the
control loop. It's pure overhead. Executing the code for each entry of
course proceeds at computer speed; however obtaining the address
of the next code to execute takes some instructions, about 8. This is
why I urge you to optimize your control loop.
Notice that if the code executed for words is long compared to the
control loop, the cost is negligible. This is the principle of control
languages. As the code shrinks to control loop size, and smaller,
overhead rises to 50% and higher. This is the price of an application
language. Note, however, that 50% overhead is easily reached with
operating systems and compilers that support an application program.
It then sets a switch STATE. The control loop must be changed to test
STATE: if it is 0, words are executed as I've already described; if it is 1,
words are compiled. Let me repeat: if you add definitions to your
program, you must modify the control loop so that it will either
execute or compile words. If you plan to include definitions from the
start, you should plan the control loop accordingly. Implement the
switch so that executing words is as fast as possible; you'll execute
many more words than you'll compile.
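A sketch of the modified loop (the handling of ":" and ";" is simplified; in particular, ":" here does not yet hide the entry being defined from the search):

```python
# The control loop with a STATE switch: 0 executes words, 1 compiles
# them into the newest definition. ";" must act even while compiling,
# so it is tested for explicitly.
stack, output = [], []
state = 0
dictionary = {
    '+': lambda: stack.append(stack.pop() + stack.pop()),
    ',': lambda: output.append(stack.pop()),
}

def execute(word):
    if word in dictionary:
        dictionary[word]()
    else:
        stack.append(int(word))     # literals

def interpret(text):
    global state
    words = iter(text.split())
    current = []
    for w in words:
        if w == ':':                # begin a definition
            name, current = next(words), []
            dictionary[name] = lambda body=current: [execute(x) for x in body]
            state = 1
        elif w == ';':              # executed, never compiled here
            state = 0
        elif state:
            current.append(w)       # compile the word
        else:
            execute(w)              # execute the word

interpret(': SUM + , ;   1 17 SUM')
```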
1 CONSTANT 1
The code in the control loop that compiles words must watch for ";".
It is compiled as usual, but it also resets STATE to prevent further
compiling. It also performs another task, which requires a digression.
: = SWAP = ;
In any case, the capability is easy to provide. Let ":" bugger the
search so the latest entry cannot be found. And let ";" unbugger the
search and thereby activate the new definition. If you want recursive
definitions, you could provide a defining entry ":R" that did not
bugger, providing you make ";" work for both. I'll mention another
technique later.
Recall the structure of the control loop: the routine NEXTW provides
the address of a dictionary entry; the routine associated with this
entry is entered; it ultimately returns to NEXT W. The same procedure
is required in order to execute a definition, with the exception that
NEXTW is replaced by NEXTI. Where NEXTW read a word and found it
in the dictionary, NEXTI simply fetches the next entry from the
parameter field of the definition.
Thus you need a variable that identifies the routine to be entered for
the next entry. One implementation is to define a field NEXT that
contains either the address of NEXTW or NEXTI. If you jump indirect to
NEXT, you will enter the appropriate routine. One task of EXECUTE is
therefore to store the address of NEXTI into NEXT, causing
subsequent entries to be obtained in a different way.
Of course NEXTI must know where to find the next entry. Here the
virtual computer analogy is extended by the addition of an instruction
counter. If you define a field, preferably an index register, named IC it
can act exactly like an instruction counter on a real computer. It
identifies the next entry to be executed, and must be advanced during
execution.
You can now see the complete operation of NEXTI: fetch the entry
identified by IC, advance IC to the next entry, and return to the same
point NEXTW does to execute the entry (or compile it, as the case may
be). If you use definitions at all, you'll use them extensively. So NEXTI
should be optimized at the cost of NEXTW. In particular, the code that
executes (compiles) entries should be fallen into from NEXTI and
jumped to from NEXTW. This saves one instruction (a jump) in the
control loop using NEXTI. This can be 20% of the loop, apart from
actually executing the entry's code, for a substantial saving.
none of these conflict with such use. If one definition is executed from
within another, it is clear the current IC must be saved. Otherwise the
current value of IC is undefined.
One more routine is involved in this process. The code executed for
";" must return from the definition. This means simply that it must
restore IC from the return stack. However it must also restore the
value of NEXT, which was set to NEXTI by EXECUTE. You might store
the old value of NEXT in the return stack and let ";" recover it. Simpler,
perhaps, is to let the undefined value of IC be zero, and act as a flag
to restore NEXT to NEXT W. For while executing definitions, NEXT will
always contain NEXTI. Only when returning from a definition that
originated within the source text must NEXT W be reestablished. Since
while executing source text IC is irrelevant, it might as well by useful
in this limited way.
That's all there is to it. The combination of EXECUTE, NEXTI and ";"
provides a powerful and efficient subroutine facility. Notice that the
code "executed" for a definition might actually be compiled,
depending on the field STATE, as discussed earlier. Notice also that
the entries executed by a definition might compile other entries. That
is, one entry might deposit numbers in the dictionary, using DP. Thus
although the fields IC and DP are similar in use (DP deposits entries
and IC fetches them), they may both be in use at the same time. If
you're short of index registers, don't try to combine them.
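To make the mechanism concrete, here is a minimal Python sketch of
EXECUTE, NEXTI and ";" working together. The sample definitions, the
use of None as the zero-IC flag, and all names are my illustrative
assumptions, not code from the text:

```python
# The virtual machine's dictionary: each definition's parameter field
# holds the entries to execute. These sample definitions are invented.
definitions = {
    "DOUBLE": ["DUP", "+", ";"],          # : DOUBLE DUP + ;
    "QUAD":   ["DOUBLE", "DOUBLE", ";"],  # : QUAD DOUBLE DOUBLE ;
}
primitives = {
    "DUP": lambda s: s.append(s[-1]),
    "+":   lambda s: s.append(s.pop() + s.pop()),
}

def run(entry, stack):
    body, ic = definitions[entry], 0   # EXECUTE: point IC at the definition
    rstack = [(None, 0)]               # None plays the zero-IC flag's role
    while True:
        word = body[ic]; ic += 1       # NEXTI: fetch the entry, advance IC
        if word == ";":                # ";": restore IC from the return stack
            body, ic = rstack.pop()
            if body is None:           # flag found: restore NEXT to NEXTW,
                return                 # i.e. go back to reading source text
        elif word in definitions:      # nested definition: save the current IC
            rstack.append((body, ic))
            body, ic = definitions[word], 0
        else:
            primitives[word](stack)    # ordinary entry: execute its code
```

Running 3 QUAD leaves 12: each inner DOUBLE saves and restores IC on
the return stack, and the outermost ";" finds the flag and returns.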
72
4.4.3 Conditions
Let me review briefly the process of defining a definition: The word ":"
sets a switch that modifies the control loop; it will now compile words
instead of executing them. The word ";" is compiled, but also causes
the switch to be reset, ending the process of compilation. Following
words will now be executed as usual.
There are other words like ";" that must be executed during
compilation. These words control the compilation. They perform code
more complicated than simply depositing an entry address. In
particular, they are required to provide forward and backward
branching.
Rather than talk abstractly about a difficult and subtle point, I'll give
some examples of words that I've found useful. As always, you are
free to choose your own conventions, but they will probably resemble
mine in their basic effects.
Define the words IF, ELSE and THEN to permit the following
conditional statement format:
The words have a certain mnemonic value, though they are permuted
from the familiar ALGOL format. Such a statement can only appear in
a definition, for IF, ELSE and THEN are instruction-generating words.
All right, back to IF. At definition time it compiles the conditional jump
pseudo-entry, followed by a 0. For it doesn't know how far to jump.
And it places the location of the 0, the unknown address, onto the
stack. Remember that the stack is currently not in use, because we're
defining. Later it will be used by those words we're defining, but at the
moment we're free to use it to help in the process.
The effect is the same; if a, b and c are all true the conditional
statement is executed. Otherwise not. Each IF generates a forward
jump that is caught by its matching THEN. Note that you must still
match IFs with THENs. In fact this is one sort of nested IF . . . THEN
statement. It is an extremely efficient construction.
a b OR c OR IF . . . THEN
or in ALGOL
if a or b or c then
If a is true you may as well quit, for the disjunction cannot be false. If
you re-write the statement as
the statement works as follows: if a is true, -IF will jump; if b is true, -IF
will jump; if c is false, IF will jump. The first HERE will catch b's jump
(the SWAP gets c's address out of the way); the second HERE
catches a's jump; THEN catches c's jump. Thus a and b jump into the
condition, while c jumps over it.
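In Python-flavored terms, the compile-time behavior of IF, ELSE and
THEN might be sketched as follows. "0BRANCH" and "BRANCH" stand for
the conditional and unconditional jump pseudo-entries, and the layout
is an assumption for illustration, not the book's actual code:

```python
code = []     # the dictionary region being compiled into (DP = len(code))
stack = []    # the parameter stack, free for use while compiling

def IF():
    code.append("0BRANCH")    # conditional jump pseudo-entry
    stack.append(len(code))   # remember where the unknown address goes
    code.append(0)            # placeholder: we don't yet know how far to jump

def ELSE():
    code.append("BRANCH")     # unconditional jump over the false branch
    orig = stack.pop()
    stack.append(len(code))   # ELSE leaves its own placeholder for THEN
    code.append(0)
    code[orig] = len(code)    # the false case jumps here, past ELSE's jump

def THEN():
    code[stack.pop()] = len(code)   # resolve the pending forward jump
```

Compiling IF true-part ELSE false-part THEN this way deposits the two
jumps and patches each placeholder as its matching word is reached.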
4.4.4 Loops
I'll continue with a couple more examples of words executed at
definition time. This time examples of backward jumps, used to
construct loops.
BEGIN stores DP onto the stack, thus marking the beginning of a loop.
END generates a conditional backward jump to the location left by
BEGIN. That is, it deposits a conditional jump pseudo-entry, subtracts
DP+1 from the stack, and deposits that relative address. If the boolean
value is false during execution, you stay in the loop. When it becomes
true, you exit.
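A sketch of BEGIN and END in the same spirit; for simplicity the
deposited target is an absolute address rather than the relative DP+1
offset described above, and the tiny runner is invented just to show
the compiled loop executing:

```python
code, stack = [], []

def BEGIN():
    stack.append(len(code))     # store DP: the top of the loop

def END():
    code.append("0BRANCH")      # conditional backward jump pseudo-entry
    code.append(stack.pop())    # jump target (absolute, for simplicity)

# Compile:  BEGIN  1+  DUP  LIT 5  =  END   (count up until the flag is true)
BEGIN()
code.extend(["1+", "DUP", "LIT", 5, "="])
END()

def run(s):
    ip = 0
    while ip < len(code):
        op = code[ip]; ip += 1
        if op == "1+":    s.append(s.pop() + 1)
        elif op == "DUP": s.append(s[-1])
        elif op == "LIT": s.append(code[ip]); ip += 1
        elif op == "=":   b, a = s.pop(), s.pop(); s.append(a == b)
        elif op == "0BRANCH":       # false: jump back, stay in the loop
            target = code[ip]; ip += 1
            if not s.pop():
                ip = target
    return s
```

Starting from 0 the loop runs until the counter reaches 5; a false flag
keeps you in the loop, a true one drops you out, as described above.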
a b DO . . . CONTINUE
10 0 DO 1 + . . . CONTINUE
77
The first argument is 10, the stopping value; the second is 0, which is
immediately incremented to 1, the index value. Within the loop this
index is available for use. The DUP operation will obtain a copy. Each
time through the loop the index will be incremented by 1. After the
loop is executed for index value 10, the CONTINUE operation will stop
the loop and drop the 2 arguments - now both 10.
11 1 DO . . . 1 + CONTINUE
Here the index is incremented at the end of the loop, instead of the
beginning. Upon reaching 11 and exceeding the limit of 10, the loop is
stopped.
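The index bookkeeping can be sketched behaviorally: the limit sits
under the index on the stack, and CONTINUE compares the two and
finally drops both. This Python model and its names are assumptions,
not the book's implementation:

```python
def do_loop(stack, body):
    # stack holds: limit underneath, index seed on top, e.g. 10 0 DO ...
    while True:
        body(stack)                    # the words between DO and CONTINUE
        if stack[-1] >= stack[-2]:     # CONTINUE: index reached the limit?
            stack.pop(); stack.pop()   # drop both arguments (now both 10)
            return

visits = []
def body(stack):
    stack.append(stack.pop() + 1)      # the "1 +" that advances the index
    visits.append(stack[-1])           # a DUP would obtain this copy

s = [10, 0]                            # 10 0 DO 1 + ... CONTINUE
do_loop(s, body)
```

The body runs for index values 1 through 10, and afterwards both
arguments have been dropped from the stack.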
4.4.5 Implementation
I hope you now appreciate the need for words that are executed at
define time. I'm sure you're aware of the need for branches and loops.
Perhaps you'll notice that I did not mention labels; the branch
generating words I mentioned, and others you can invent, are
perfectly capable of handling jumps without labels. You saw in the
definition of HERE how the stack can be manipulated to permit
overlapping jumps as well as nested ones. However in a sense we
have many labels, for every dictionary entry effectively assigns a
name to a piece of code.
1: execute
0: compile
For a given entry, 'or' the switch and flag together; if either is 1,
execute the word, else compile it.
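As a one-screen sketch (Python, names invented), the rule is just an
'or' of the STATE switch and the entry's flag:

```python
def handle(word, state, flag, dictionary, code):
    # state: 1 = executing, 0 = compiling; flag: 1 = always execute (like ";")
    if state or flag:
        dictionary[word]()     # execute the entry
    else:
        code.append(word)      # compile: deposit the entry's address

ran, code = [], []
dictionary = {"X": lambda: ran.append("X")}
handle("X", 0, 0, dictionary, code)   # compiling: an ordinary word is deposited
handle("X", 0, 1, dictionary, code)   # a flagged word executes even now
handle("X", 1, 0, dictionary, code)   # executing normally
```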
The above rule is correct, and even fairly efficient. Remember that we
want the control loop efficient! And it's adequate provided all words
that must be executed are built into your system dictionary.
Unfortunately, it's not adequate for the examples I gave above, which
probably means it's inadequate, since those were pretty simple
examples. But complication is part of the fun of programming. So pay
attention and I'll try to explain some problems I don't understand very
well myself.
So, what to do? I bet you think I have a solution. Your faith is touching,
but I don't have a very good one. It suffers a small restriction, but a
nagging one: you may not execute a literal in a definition. To phrase it
positively: literals must be compiled inside definitions. Let's see how
it works.
Define a new entry "!". Let it execute the last entry compiled and
remove it from the compilation. Now we can re-write the definition of
HERE as
The Basic Principle intrudes. If you add code entries to your program,
you add enormous power and flexibility. Anything your computer can
do, any instructions it has, any tricks you can play with its hardware
are at your fingertips. This is fine, but you rarely need such power.
And the cost is appreciable. You'll need many entries (say 10) to
provide a useful compiler; plus all the instruction mnemonics.
Moreover you'll have to design an application language directed at the
problem of compiling code.
On the other hand, if you start with code entries, you can construct all
the other entries I've been talking about: arithmetic operators, noun
entries, definitions. In Chapter 9 I'll show how you can use code
entries in a really essential role; and achieve a significantly more
efficient and powerful program than by any other means. But except
for that I'm afraid they are marginal.
So how can you generate code? First you need a defining entry that
defines a code entry. The characteristic of a code entry is that it
executes code stored in its parameter field. Thus the address passed
to ENTRY by its defining entry (say CODE) must be the location into
which will be placed the first instruction. This is not DP, because the
entry itself takes space; but is simply DP plus a constant.
Now you can appreciate the source of my earlier caution. You'll have
to provide a flock of entries that access code compiled into your
program that we've not needed to reference directly before. For
example RETURN: when your routine is finished, it must jump to the
control loop, just as your built-in entries do. However you don't know
the location of the control loop in core; and it moves as you change
your program. So you must have an entry to generate a RETURN
instruction.
All right, you've done that much. Now you've got to decide how to
construct an instruction. They have several fields - instruction, index,
address - that you'll want to put onto the stack separately and
combine somehow. This is easy to do, but hard to design. You
probably don't want to copy your assembler, and probably couldn't
follow its format conveniently anyway. In fact you can do a good job
of designing a readable compiler language; but it will take some effort.
Definitions provide all the tools you need.
For example, you might write a definition that will "or" together an
instruction and address and deposit it. Or if your hardware's awkward,
you can provide a definition that converts absolute addresses to
relative, or supplies appropriate paging controls. Whatever you need,
or want can be readily defined. Done properly, such a compiler is a
substantial application in itself, and if you're going to do it at all, plan
to spend the necessary time and effort.
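For instance, a word that "or"s an instruction together and deposits it
might be sketched like this. The 4-bit opcode, 1 index bit and 11-bit
address layout is entirely invented, since yours comes from your
hardware:

```python
code = []                 # the dictionary region being compiled into

def deposit(word):
    code.append(word)     # place one word at DP

def instruction(opcode, index, address):
    # "or" the fields together and deposit the completed instruction
    deposit((opcode << 12) | (index << 11) | (address & 0x7FF))

instruction(0x5, 1, 0x123)    # opcode 5, indexed, address 0x123
```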
The first line defines the word UNIT. The next line uses this defining
entry to define the word IN (inches). The last line uses IN in a way that
puts 4 inches onto the stack, as centimeters. The 3 lines are
equivalent to
: IN 2.54 * ;
The first special word is ENTER. It calls the ENTRY subroutine used
by all your defining entries, but passes a 0 address as the location of
the code to be executed. Look at the definition of UNIT. The word
ENTER is imperative. It generates a double-length pseudo-instruction;
a pseudo-entry for the first half and a 0 constant for the second. At
execution time, the pseudo-entry will call ENTRY to construct a new
dictionary entry, passing the following constant as the address of
code to be executed. The word ;CODE is a combination of the words
";" and CODE. It terminates the definition of UNIT and stores DP into
the address field established by ENTER. Thus the code that
follows ;CODE is the code that will be executed for all entries created
by UNIT. ;CODE knows where to store DP because ENTER is
84
restricted to being the first word in any definition that uses it;
and ;CODE knows which definition it is terminating.
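In modern terms, a defining word like UNIT can be modeled with a
closure: ENTER's role is to build the new entry and store the factor in
its parameter field, and the code after ;CODE is what every word
defined by UNIT executes. A Python sketch, with all names assumed:

```python
dictionary = {}
stack = []

def UNIT(factor, name):
    # ENTER's role: make an entry whose parameter field holds `factor`;
    # the body of run() plays the part of the code after ;CODE.
    def run():
        stack.append(stack.pop() * factor)
    dictionary[name] = run

UNIT(2.54, "IN")          # 2.54 UNIT IN
stack.append(4)
dictionary["IN"]()        # 4 IN : four inches, as centimeters
```

Every word UNIT defines shares the same run-time code and differs
only in the stored factor, which is the whole point of ;CODE.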
One more suggestion might prove helpful. You might define a new
kind of constant: an instruction. When executed, an instruction
expects an address on the stack, extracts a constant from its
parameter field and constructs and deposits a completed instruction.
You'll probably have a large number of instructions, and use a large
number. This will save you many deposit entries.
I'm sorry, but I think it's infeasible to attempt an example. If you can't
see how to construct your own code entries from what I've already
said, forget it. The application is extremely machine dependent - and
rightly so. Don't attempt to apply the same code to several computers;
definitions already do that for you. The purpose of code is to exploit
the properties of your particular computer.
85
This means that as the data in blocks becomes useless, space will
become available in block-sized holes. We must somehow re-use
these holes. Which means that we must allocate, and re-allocate, disk
in block-sized pieces.
You will want to copy disk (onto another disk, or tape) for protection.
You need only copy the number of blocks used, which is usually less
than half the disk capacity, or else you're pretty worried about space.
If you destroy block 1 (you will) you will have to re-load the entire disk
from your back-up. Never try to recover just block 1, you'll end up
horribly confused.
You may want to put your object program on this disk. Fine! It won't
even take many blocks. You may need to start it in block 0 in order to
do an initial load (bootstrap). OK, but be able to re-load the program
(only) from back-up because you will destroy block 0. Only if you
destroy the block (we'll call it block 1) containing available space
information must you re-load data (all data). Unless you destroy many
blocks. Choose the path of least confusion, not least effort. Re-
loading disk will confuse you, you'll forget what you've changed and
be days discovering it. Much better you spend hours re-typing text
and re-entering data.
88
So when you need a block, you type a word (GET) which reads block
1, places the block up for re-use on the stack, reads that block, places
the contents of its first word into block 1, and re-writes block 1. The
first word, of course, contains the address of the next block up for re-
use. If no block was available for re-use (initially the case), GET
increments the last block used, puts it on the stack and re-writes
block 1. GET then clears your new block to 0 and re-writes it.
Several comments: Notice that GET places its result on the stack - the
logical place where it is available for further use. Notice that blocks
are re-used in preference to expanding the disk used. This makes
sense except for the problem of arm motion. Forget arm motion. You
just have to live with it. This is, after all, a random memory. Don't
neglect clearing the block to 0.
89
How many blocks you can have is probably limited by the disk,
however it may be limited by the field you choose to store block
addresses in. Be careful! You can circumvent the first limit by
modifying your read subroutine to choose one of several disks. You
must re-format all your block addresses (cross-references on disk,
remember) to expand the second.
90
You'll want a table specifying which blocks are in core: your read
routine can check this table before reading.
But you should not write a block when you change it. Rather mark it
'to be written' in the buffer table. When you come to re-use that buffer,
write the old block first. The principle is that you're likely to change a
block again if you change it once. If you minimize writes you can save
a lot of disk accesses. Of course, there is a trade-off - if your program
crashes, you may have updated blocks in core that aren't on disk. You
should be able to re-start your program and preserve the core buffers.
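The 'to be written' discipline might be sketched so; the two-buffer
table and the oldest-first eviction order are invented for illustration:

```python
NBUF = 2
buffers = {}         # block number -> {"data": ..., "dirty": bool}
disk_writes = []     # blocks actually written out to disk

def change(block, data):
    if block not in buffers and len(buffers) >= NBUF:
        old, entry = next(iter(buffers.items()))   # re-use the oldest buffer
        if entry["dirty"]:
            disk_writes.append(old)                # write the old block first
        del buffers[old]
    buffers[block] = {"data": data, "dirty": True} # mark 'to be written'
```

Changing a block twice costs nothing extra; the write happens only
when the buffer is finally re-used.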
If you are going to scan data sequentially, you can save many
accesses by reading consecutive blocks at the same time. However it
is likely that random reads may be interspersed with these sequential
ones. An effective solution is to store the last block in the sequential
area and the number of blocks somewhere for your read subroutine. If
the block isn't in core, and is within the sequential range, it can read
as many consecutive blocks as there are consecutive buffers
available. Don't attempt more than this - i.e., making more buffers
available. The net effect is that you will do the best you can with
sequential blocks, subject to interfering constraints.
A block that contains text should have a special name, for you will be
using it often in conversation. I have called such blocks SHEETs -
since the text filled a sheet of paper - and SCREENs - since the text
filled the screen of a scope. Define the word READ to save the input
address, the block and character position of the next character to be
scanned, on the return stack; and reset the input pointer to the block
on the stack and the first character position. Define the word ;S to
restore the original input pointer. Very simply you can have your
program read block 123:
123 READ
You will find that with text on disk, the original characterization of
'input' as low volume is strained. You will read many words and do
many dictionary searches. However, on a microsecond computer, you
won't notice it.
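A minimal model of READ and ;S, with the input pointer as a small
record and the return stack as a list (names assumed):

```python
return_stack = []
source = {"block": 0, "char": 0}     # 0 = the terminal's message buffer

def READ(block):
    return_stack.append(dict(source))            # save the input address
    source["block"], source["char"] = block, 0   # redirect to the block

def SEMI_S():                                    # the word ;S
    source.update(return_stack.pop())            # restore the input pointer
```

So 123 READ redirects the word subroutine to block 123 until the ;S at
the end of the block restores the original source.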
92
0 C1 42 # :R RECORD
: LINE 1 - 7 * RECORD + ;
Here I'm defining a verb that will convert a line number (1-15) to a field
address. It modifies the RECORD descriptor by changing the word
specification (low order bits). Thus line 1 starts in word 0; line 2 in
word 7; etc.
: T CR LINE ,C ;
93
: R LINE =C ;
: LIST 15 0 DO 1 +
CR DUP LINE ,C DUP ,I CONTINUE ;
LIST will list the entire block: 15 42-character lines followed by line
numbers. It sets up a DO-CONTINUE loop with the stack varying from
1 - 15. Each time through the loop it: does a CR; copies the stack and
executes LINE; types the field (,C); copies the stack again and types it
as an integer (,I).
: I 1 + DUP 15 DO 1 -
DUP LINE DUP 7 + =C CONTINUE R ;
If I type " NEW TEXT" 6 I - I want the text inserted after line 6. "I" must
first shift lines 7 - 14 down one position (losing line 15) and then
replace line 7. It adds 1 to the line number, sets up a backwards DO-
CONTINUE loop starting at 14, constructs two field descriptors, LINE
and LINE+7, and shifts them (=C). When the loop is finished, it does an
R.
: D 15 SWAP DO 1 +
DUP LINE DUP 7 - =C CONTINUE " " 15 R ;
If I type 12 D - I want to delete line 12. D must move lines 13-15 up one
position and clear line 15: It sets up a DO-CONTINUE loop from
stack+1 to 15. Each iteration it: constructs fields LINE and LINE-7 and
shifts them (=C). Then it replaces line 15 with spaces.
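The whole editor can be modeled in a few lines of Python. A block is a
list of 15 strings of 42 characters here, and the word names follow the
definitions above (LINE, R, I, D), though the details are my
assumptions:

```python
WIDTH = 42
block = [" " * WIDTH for _ in range(15)]    # 15 lines of 42 characters

def LINE(n):                  # line numbers run 1-15
    return n - 1

def R(text, n):               # replace line n
    block[LINE(n)] = text.ljust(WIDTH)[:WIDTH]

def I(text, n):               # insert after line n, losing line 15
    for i in range(14, n, -1):
        block[LINE(i + 1)] = block[LINE(i)]
    R(text, n + 1)

def D(n):                     # delete line n, blanking line 15
    for i in range(n, 15):
        block[LINE(i)] = block[LINE(i + 1)]
    block[LINE(15)] = " " * WIDTH
```

Insertion shifts the tail of the block down before replacing the new
line; deletion shifts it up and clears line 15, just as I and D do.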
That's it. With 10 lines of code I can define a text-editor. It's not the
most efficient possible, but it's fast enough and illustrates many
points: In dealing with small amounts of text, you needn't be clever;
94
let the machine do the work. The verb LINE is an extremely useful one;
such useful verbs are invariably an empirical discovery. The verbs ,C
and =C are the heart of the method; incidentally, they only work on
fields less than 64 characters. Notice how one definition wants to
reference another (R used by I and D; LINE used by all). Notice how I
and D are similar yet different. And notice how a few verbs eliminate a
lot of bookkeeping and let you concentrate on the problem and not
the details.
95
So far I've ignored that error message; not because it's unimportant
or trivial to implement, but because it's part of a difficult subject -
output. Logically I oughtn't have delayed discussing output this long,
for even a control language needs output. But as usual in this
program it is involved with other features that we've only just
discussed. I'll leave it to you to implement those features of the
output capabilities I'll present, that your application requires.
You compose input: you select words and combine them into fairly
complex phrases; your program spends considerable effort
deciphering this input and extracting its meaning. In reply it will not
go through any such elaborate procedure. You'll see that most of its
output consists of the word OK. You are talking to the computer, but it
is hardly talking to you; at best it's grunting.
I maintain that the two processes have nothing in common, that the
computer does not prepare output in a manner analogous to you
preparing input. In Chapter 8 I'll describe a way your program can
compose complex output messages. Although such a technique
might provide a 2-way dialog, it has even less similarity to interpreting
input.
96
After finding an error, you of course quit doing whatever you were
doing. There is no point in trying to continue when you're standing by
ready to correct and start again. However it is convenient to reset
things that you'd probably have to reset anyway. In particular, set the
stacks empty. This is sometimes unfortunate since the parameter
stack might help you locate an error. But it usually is most convenient.
Don't try to reset the dictionary since you're not sure what you may
want to reset it to.
97
6.2 Acknowledgement
I mentioned in Chapter 3 that you must write subroutines to send and
receive messages. Now I must expand on exactly how you should use
these subroutines.
Recall that input and output share the same message buffer. This now
causes trouble. However it considerably simplifies the more powerful
message routines of Chapter 7. On balance the single message buffer
seems optimal.
First let me call the subroutine that sends a message SEND. It sends a
single line and should add a carriage return to the line, as well as any
other control characters needed, and translate characters as required.
The routine that receives a message is QUERY. It is a routine, and not
a subroutine. QUERY calls SEND to send a message, and then awaits
and processes an input message, stripping control characters and
translating characters as required. It initializes the input pointer IP
and jumps to NEXTW. Notice that your program can send output via
SEND wherever it pleases. However it can only receive input in
conjunction with output, via QUERY. You have no provision for
receiving successive messages without intervening output. This is
exactly the behavior you need, and actually simplifies the coding of
message I/O.
Drop the stack. Each output verb must have an argument. Its
last argument can be dropped at this point, and the stack
pointer checked against its lower limit.
Set EMPTY true.
If NEXT contains NEXTW and SCREEN is 0, jump to QUERY.
Under these circumstances there is no further input available
in the message buffer.
Jump to NEXT.
The logic required is summarized in Fig 6.2 and is the price paid for
duplexing the message buffer. One final complication concerns
EMPTY. If true, it states that input has been destroyed; it does not
indicate that output is currently in the message buffer. Output may
have been placed there and already sent. If the message buffer is
empty, type OK before jumping to QUERY.
99
100
What does a character string look like? Of all the ways you might
choose, one is completely natural:
"ABCDEF . . . XYZ"
The extra space is annoying, but in Chapter 8 I will tell you how to
eliminate it without the objections I just raised. So a character string
is started with a quote-space and terminated by a quote.
input buffer (so far), and we had better use the string before we
destroy it with output or additional input. When it is destroyed
depends on many things, so the best rule is to use it immediately.
What can you do with a character string? I've only found 2 uses. They
are very similar, but part of the frustration of implementing them is to
take advantage of the similarity. You can type a string, or you can
move it to a character field.
If you can do the above, you can also move one character field to
another. That is, if you make your character string and field
descriptors compatible - which adds to the fun. You might want to
prevent moving a field to a string, but then who cares.
We've slid into the subject of field descriptors. You might want to type
a character field, and of course the same code should work as for
string descriptors.
102
Provided the message traffic from any one terminal is low enough, as
is inevitably the case - for we have in effect slowed the computer
down to human speed - we can handle a much larger number of
terminals than can fit in core, hundreds, by storing inactive users on
disk.
For example, if you have to poll phone lines to acquire input, you want
to perform these polls asynchronously with whatever other work
you're doing. Since interrupt routines are best kept small, the task of
translating character sets, checking parity, distributing messages, etc.
should be performed at lower priority. This is easy to do with an entry
in the ready table. The interrupt routine sets a message routine
"ready" and the computer will process it when possible.
Each such independent activity should have a ready table entry and a
(perhaps) small dictionary in which to store its parameters; return
address, register contents, etc. in the same format as a user activity.
In fact these activities are completely equivalent to users, except that
they don't process users. This is significant, for it means they never
generate error messages; they must handle their own errors,
somehow.
Such activities cost little, and usually provide the simplest answer to
any asynchronous problem. Mind the Basic Principle, though!
106
If all your users are not core resident, it is better if none of them are.
Then any input message can be written into the message buffer area
on disk. And all output messages read from disk. The fact that some
users might reside in core causes an unreasonable complication, and
the fact that disk access is fast compared to message transmission
means that attempting to save such disk accesses is not efficient.
107
7.2 Queueing
You can save yourself a lot of trouble by putting some code in the
user controller. Two subroutines: QUE and UNQUE. When a user
needs a facility that might be in use by someone else, he calls QUE. If
it's available, he gets it. If it's not available, he joins the queue of
people waiting for it. When it is released, and it's his turn, he will get it.
These are extremely valuable routines, for there are many facilities
that can be handled in this manner: each disk, each line (shared lines),
the printer, block 1 (disk allocation), non-re-entrant routines (SQRT).
An extension will even permit exclusive use of blocks.
In addition to the user's dictionary address and ready flag, each user
must have a link field - not in his dictionary, but in user control. Each
facility that is to be protected must have associated with it 2 fields:
the owner, and the first person waiting. The best arrangement is to
have a table of such queue-words, one for each facility. If a facility is
free, its owner is 0; otherwise its owner is the number of the user
owning it. A user's number is his position in the table of users,
starting at 1. If no one is waiting, a facility's waiter field is 0; otherwise
it is the number of the user waiting.
If someone's waiting:
I follow the chain of links starting at the waiter's link field until
I find a 0 link; I place my number there, 0 my link field, and
relinquish control.
It's complicated, it's troublesome, and it's the price you must pay for
multiple users.
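A sketch of QUE and UNQUE with the owner/waiter queue-words and
per-user link fields as described; the table sizes and return
conventions are my assumptions:

```python
# Each facility's queue-word is [owner, first_waiter]; 0 means free or
# no one waiting. User numbers start at 1; links live in user control.
links = [0] * 8      # link field per user (index = user number)

def QUE(qword, user):
    if qword[0] == 0:          # facility free: take it
        qword[0] = user
        return True
    if qword[1] == 0:          # busy, no waiters: we wait first
        qword[1] = user
    else:                      # chase the chain of links to its 0 end
        w = qword[1]
        while links[w]:
            w = links[w]
        links[w] = user        # place our number there
    links[user] = 0            # 0 our own link field
    return False               # relinquish control; we're waiting

def UNQUE(qword):
    nxt = qword[1]             # first waiter (0 if none) becomes owner
    qword[0] = nxt
    if nxt:
        qword[1] = links[nxt]
```

When the owner releases the facility, the first waiter inherits it and
the chain of links moves up one place.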
110
7.2.1 Usage
To gain exclusive use of a block (except block 1, which is best
handled as a special case), set aside some facility queue-words for this
purpose. Find a free one and store the block number it represents
somewhere, then treat that block like any other facility. When the last
waiter releases the block, release the facility queue-word for re-use.
Notice that this technique has no effect upon the block itself. It may
be resident in core, or not. Anyone may read or write it. However, no
one else may have exclusive use of it. If all users cooperate to request
exclusive use when they should, it works perfectly - with no extra cost
to ordinary reads/writes. Actually, exclusive use of a block is
necessary only under exceptional circumstances. Block 1 is an
example of such: The block may not be used by anyone else until
another block has been read, and the available space up-dated.
111
If you establish several user dictionaries, the first entry in each will
link to the system dictionary (Fig 7.1) at the same point. Thus each
user is unaware of any other user, and his dictionary search is
unaffected.
112
If you have multiple chains in your dictionary, each chain must jump
from the user's to the system dictionary. This is only a problem when
re-initializing the dictionary, and can be easily solved by keeping a
copy of the chain heads for the system dictionary.
113
establishing the words GET and RELEASE with the code identified in
the 17th and 18th table positions. Library subroutines (FORTRAN
arithmetic subroutines) might be treated similarly.
I have had all the entries I describe in a single program. This program
had less than 1500 instructions, so it is practical to include everything
in a single program. But I was experimenting, and never found an
application that needed a fraction of them.
118
Likewise, there are no simple rules that separate these strings into
the words intended:
But don't despair! There is a general solution that can handle all these
cases. It is expensive in time, perhaps very expensive. But it solves
the problem so thoroughly, while demonstrating that no lesser
solution is possible, that I consider it well worth the price. Besides,
the speed of processing text is not a critical factor. We maximize
speed precisely so that we can afford extravagances such as this.
compiler, and if you are you can probably make your word subroutine
cope.
There are several things to be careful of: As you drop characters from
the aligned word, you must keep track of your current position within
this word. However, you must also back-up the input pointer so that
you can start the next word correctly. Incidentally, this requires an
initial back-up over the terminal space that is not repeated.
Backing the input pointer is not possible with unbuffered input. This
is why I suggested that you buffer un-buffered devices back in
Chapter 3. If you aren't going to dissect, apply the Basic Principle.
You must also have a way to detect that you have dropped the last
character: a counter is one solution. Another is to place a space
immediately ahead of your aligned word, and to stop on the space. I
prefer the second, for I find I lack a convenient counter that is
preserved over dictionary search and numeric conversion. But this
means that I must fetch each character before I deposit a space over
it. And this means that my fetch subroutine must operate backwards,
the only place I ever need to fetch backwards. It depends on your
hardware.
However, this means you cannot dissect letter strings and you might
want to. Plurals, for instance, can be easily accommodated by dropping
the terminal 's'. On the other hand, you can easily mis-identify words
by dissecting letter strings: I once dissected the word SWAP: S was
defined, W was defined and my error message was AP ? Perhaps
when dropping a single letter you should replace it with a dash to
120
One further caution: If you are going to dissect, you must not discard
extra characters while initially aligning the word. Your input pointer
must be positioned so that you can backspace it correctly. If you
exceed maximum word size, stop immediately and supply a terminal
space. This means that no single word can exceed maximum size,
which has now become maximum string size.
I would like to be able to say that this ability will impress people. It will
impress you - at least it should. But ordinary people, like your boss,
expect this kind of ability from computers. They are only impressed,
negatively, if they discover its absence.
121
A+B*C
the multiply must be done before the add. Moreover, parentheses are
used to modify the standard hierarchy:
A*(B+C)
2 :L word . . . ;
The 2 is the level number, taken from the stack. :L declares the next
word as a level-definition. ';' marks the end.
0 :L , ;
1 :L + + ;
2 :L * * ;
3+4*5,
What happened? 3 goes onto the parameter stack, + goes onto the
level-stack, 4 onto the parameter stack, * onto the level-stack (since it
has a higher level number than the + already there), 5 onto the
parameter stack. Now ',' forces the * to be executed (since its level
number is smaller) and * finds 5 and 4 on the parameter stack. ',' also
forces + to be executed (with arguments 20 and 3) and then, because
its level number is 0, is itself executed and does nothing.
Clear? I would like to assume you're familiar with this technique, but I
don't quite dare. All I'm really contributing is a way to implement with
dictionary entries a technique usually built into compilers. Perhaps I'll
take the cop-out of suggesting you define the arithmetic operators and
work out some examples for yourself. Remember that equal level
operators force each other out, and that a lower level operator forces
out a higher. It is strangely easy to reason out the relative levels of
operators incorrectly.
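The worked example 3+4*5, can be replayed with an explicit level
stack. This Python sketch assumes the three level-definitions given
above (, at level 0, + at level 1, * at level 2); everything else about it is
illustrative:

```python
stack = []        # parameter stack
levels = []       # level stack: (level number, operation) pairs
ops = {
    "+": (1, lambda: stack.append(stack.pop() + stack.pop())),
    "*": (2, lambda: stack.append(stack.pop() * stack.pop())),
    ",": (0, lambda: None),    # level 0: forces everything, does nothing
}

def word(w):
    if w in ops:
        level, op = ops[w]
        # an operator forces out all pending operators of equal or
        # higher level before going onto the level stack itself
        while levels and levels[-1][0] >= level:
            levels.pop()[1]()
        levels.append((level, op))
    else:
        stack.append(int(w))   # numbers go onto the parameter stack

for w in "3 + 4 * 5 ,".split():
    word(w)
levels.pop()[1]()    # the "," itself finally executes (and does nothing)
```

The "," forces * (20), then + (23), exactly the order narrated above.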
Now back to work. You've seen some level definitions. I hope you've
played with them some. How do we implement them? Well we don't.
Rather we implement a generalization: level-entries. When I found an
application for level-entries, I also found it was cheaper to implement
level-definitions that way than the way I had been doing it.
Before actually executing an entry from the stack, LEVEL must set the
SOURCE address to reference another routine, FORCE. You recall
124
that your main control loop obtains its next entry either by reading a
word and searching, or by fetching from a definition. Well here is a
third source, the level-stack. As for a definition, the old value of
SOURCE and the virtual-IC must be saved - on the return-stack.
When a level-entry is done, it will RETURN and your control loop will
go to FORCE. The only way you can get to FORCE is by completing a
level-entry. Its function is to check the level stack and see if any other
entry can be forced off by the one on top. 3 cases arise:
I pass the ball to you. If you have an application that could profit from
a natural language input format, you have the capability with level
definitions to implement it. For example, it would not be hard to teach
your program to solve the problems at the end of a high-school
physics text.
Clearly such volume must be stored on disk. Also clearly, you don't
want to have to search disk explicitly. There is a gratifyingly effective
solution: If you can't find the word in the core dictionary, and it's not a
number, search a block on disk. Now the question reduces to: Which
block?
Establish a field called CONTEXT. Treat it like you did a block address:
it both identifies a block and suggests where it might be in core.
Search this block. By changing CONTEXT you can search different
disk dictionaries. By linking several blocks together, you can search
larger amounts of disk; or search several dictionaries in sequence.
You can afford to search a fair amount of disk, because if you can't
find the word you're going to generate an error message. A delay in
typing that message to make sure you can't find the word is an
excellent investment. Still, for really large vocabularies - thousands of
entries - such an approach is inadequate.
For very large dictionaries, scramble the word into a block address
and search that block. By that I mean compute a block address from
the letters in a word, just as we did for multiple chains in the core
dictionary, though you'll probably want a different algorithm. You can
search one of a thousand blocks and be assured that if the word is
anywhere, it's in that block. Because you used the same scramble
technique to put it there as you use to find it. Since many words will
scramble into the same block, you of course search for an exact
match. Again, just as in core. With such a large disk dictionary, you
want to be careful of several things. First, once you choose a
scrambling algorithm you can never change it; so make a good
choice before you define lots of entries. Second, try to keep the
number of entries roughly the same in all blocks; and roughly equal to
What do disk dictionary entries look like? I have found that 2 fields
are sufficient: the word field, the same size as the core dictionary
word field; and a parameter field, 1 word long. If you find a match on
disk, you put the parameter on the stack. Remember that you can't
afford to store absolute addresses on disk, so you can't have an
address field as in core. You could provide a coded address field, but
it seems adequate to treat disk entries as constants.
For instance you can name blocks. When you type the name of a
block its address is moved from the parameter field onto the stack.
That is an excellent place for it, because if you type the block number
itself that's where it would be placed. You can use block numbers and
block names interchangeably. Thus when you type an account
number the block associated with that account is placed onto the
stack, whereupon you store it into the base word that its fields
reference. An illegal account will cause an error message, in the
ordinary way. Or you might name the instructions for your computer.
Then typing its name will place a 1-word instruction on the stack,
ready for further processing.
0 NAME ZERO
FORGET ZERO
FORGET must call WORD as defining entries do, since this is a non-
typical use of the word ZERO. When it finds the entry, it simply clears
it without trying to pack. Your entry routine should first search disk to
see if the word is already there. You don't want multiple definitions on
disk, even though they're useful in core. Then it should search for a
hole. If it finds the word already there, or if it can't find a hole? You
guessed it, an error message.
Let's talk about a refinement. With a thousand names on disk it's easy
to run out of mnemonics. Let's re-use the field CONTEXT: after you
scramble the word into a block address, add the contents of
CONTEXT and search that block. If CONTEXT is 0, no difference. But if
CONTEXT is non-zero, you're searching a different block. If CONTEXT
can vary from 0 to 15, you can have 16 different definitions of the
same word. You'll find the one that had the same value of CONTEXT
when you defined it. If there is no entry for a word under a given
CONTEXT, you won't get a match. A block containing a definition for
the same word under a different CONTEXT won't be searched.
For example, stock numbers might look the same for different sales-
lines. By setting CONTEXT you can distinguish them. You can use the
same name for a report screen that you use for its instruction screen;
distinguish them by CONTEXT. If you're scrambling anyway, you may
as well add in CONTEXT (modulo a power of 2); it costs nothing, and
vastly extends the universe of names. In fact, you can use CONTEXT
in both the ways we've discussed, simultaneously. For as an additive
constant it tends to be small; and as a block number, large. So your
search routine can decide whether to scramble or not based on its
size.
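The scramble-plus-CONTEXT scheme can be sketched as follows. This is a hypothetical Python model, not the author's code: the hash function, the class name, and the use of dicts to stand in for disk blocks are all assumptions for illustration:

```python
# Sketch of the disk dictionary: scramble a word into a block number,
# add CONTEXT, and search that block for an exact match.

NBLOCKS = 1000   # "one of a thousand blocks", as in the text

def scramble(word):
    # Any stable function of the letters will do - but once chosen
    # it can never change, or existing entries become unfindable.
    h = 0
    for ch in word:
        h = (h * 31 + ord(ch)) % NBLOCKS
    return h

class DiskDictionary:
    def __init__(self):
        self.blocks = {}    # block number -> {word: parameter}
        self.context = 0    # the CONTEXT field

    def define(self, word, parameter):
        blk = (scramble(word) + self.context) % NBLOCKS
        entries = self.blocks.setdefault(blk, {})
        if word in entries:
            raise KeyError("already defined")   # the error message
        entries[word] = parameter

    def find(self, word):
        blk = (scramble(word) + self.context) % NBLOCKS
        return self.blocks.get(blk, {}).get(word)   # None if no match
```

With a non-zero CONTEXT the same word scrambles into a different block, so two definitions of one name coexist and you find the one defined under the CONTEXT now in effect.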
with only the first couple of characters, so at least the disk searches
are in the same block - which will be in core. Or use only non-zero
values of CONTEXT and let 0 inhibit the disk search. That is, make
dissection and disk searching mutually exclusive. As is often the case,
the problem is serious only if you aren't aware of it.
Before you start objecting, let me rush on. Stored with the block
address is the location of the core buffer that block last occupied.
So the program needn't actually read disk, or even search core
buffers for the block, unless the block has been overlaid. Hence
repeated accesses to the same block cost little.
another one, all you need do is store the link in the base location for
other fields, and forget that a link is involved. If you access fields in
the link it will automatically be read. If not, it won't be. The more
complex your data, the greater the advantage.
You can make these field entries identical with those accessing core,
by making the pointer to the base address 0. If you don't point to a
disk address, you must mean core.
Consider how the field reference actually works. In the field entry you
have a word parameter that tells which word the field is in (or starts
in). If this field references another, you add the word parameters
together. When you find the core address of the disk block, you add
the word offset and voilà: you have the word you want. Going
through intermediate fields has little advantage unless the
intermediate fields change. Why not? By incrementing a base field
address, you can access different rows of a matrix or different
records in a block. Or you can access different sub-records of a
record. Very useful! It's enough to make me think COBOL is a pretty
good language. Of course you can do the same thing with core fields,
you just never point to a disk address at the very end.
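The address arithmetic just described can be sketched in a few lines. The names here (`resolve`, `core_of_block`) are inventions for the example; the text leaves the actual routine to you:

```python
# Sketch of resolving a chain of field references: each field entry
# contributes a word offset; the chain ends at a base holding either
# a core address or a disk block number.

def resolve(fields, base, core_of_block):
    """fields: word offsets, outermost field first.
    base: ('core', address) or ('disk', block).
    core_of_block: maps a block number to its core buffer address."""
    offset = sum(fields)             # add the word parameters together
    kind, value = base
    if kind == 'disk':
        value = core_of_block(value) # find the core address of the block
    return value + offset            # the word you want

# Record in block 7, whose buffer sits at core address 4096;
# a sub-record at word 10, and a field at word 3 within it:
addr = resolve([10, 3], ('disk', 7), lambda blk: 4096)
print(addr)   # 4109
```

Incrementing the base field's offset between calls is what steps you through rows of a matrix or records in a block.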
Given such elaborate addressing capabilities, you can use some help
debugging your screens. Memory protection is easy to provide, and
very helpful. Include with each field entry a maximum size (in words)
for that field. When you calculate an address that purports to be in
that field, make sure it is. The upper limit for the final block reference
is of course the block size. The upper limit for a core reference is also
known. A simple error message stating OVERFLOW will catch trouble
before it has a chance to propagate.
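The protection check itself is tiny. A sketch, with an assumed block size and helper name:

```python
# Sketch of the OVERFLOW check: each field entry carries a maximum
# size in words, and every computed offset is verified against it.

BLOCK_SIZE = 256   # words per disk block - an assumed figure

def checked_offset(offset, max_words):
    if not 0 <= offset < max_words:
        raise ValueError("OVERFLOW")   # catch trouble before it propagates
    return offset
```

The upper limit for the final block reference is the block size; for a core reference, whatever region you've allotted.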
So, we can have automatic access to fields scattered all over disk and
in variable size records at that. Basic Principle!
One thing! If field entries can address other field entries, you need
some way to distinguish a field from a disk address. I have no
suggestion.
So now you're face to face with the computer. What do you do? First
an exercise. Initialize the interrupt locations in such a way that the
computer will run, will execute an endless loop, when you start it. OK?
Then modify your loop so that it will clear memory. OK? You've
probably learned a lot.
Now we're going to start for real. We're going to start building your
dictionary, even though you can't use it yet. You must choose your
entry format now; variable-sized entries are required, but you can
decide about word-size and layout. The first entry is SAVE; it will save
your program on disk. Lacking a control loop you'll have to jump to it
manually, but at least you can minimize re-doing a lot of work. The
second entry is LOAD; it will re-load your program from disk. You
may have a hardware load button; if you can store your program
compatibly with it, fine. You might want to punch a load card, to
provide initial load otherwise. But it's always convenient to be able to
re-start from core.
The third entry is DUMP; it will dump core onto the printer. It needn't
be very fast to be a lot faster than looking with the switches. This
probably isn't a trivial routine, but it oughtn't take more than a dozen
instructions. You might want to postpone it just a bit.
So, with a couple hours work - providing you read the manual first -
you have an operating system (SAVE, LOAD) and debugging package
(DUMP). And you know a lot about your computer.
I presume you can LOAD your program and DUMP core. It's time to
get away from the switches and use the typewriter. So set up a
message buffer from which you can send and receive text.
Presumably when awaiting text your program sits in an endless loop
somewhere. Learn to recognize that loop. You'll spend most of your
running time there and it's reassuring to know that everything's
all right.
You're doing great. Now establish the stacks, the dictionary search
subroutine and entries for WORD and NUMBER. Be very careful to do
it right the first time; that is, don't simplify NUMBER and plan to re-do
it later. The total amount of work is greater, even using the switches.
Now write a control loop. You might test the stack, but jump to an
unspecified error routine. And run. DUMP is still our only output
routine, but you should be able to read and execute words like DUMP,
SAVE and LOAD.
Now define the code-entry, the word that names code; and the deposit
word, the word that places the stack in core. Now you can type octal
numbers and store them in the dictionary. No more switches. You can
also construct new dictionary entries, for code.
We now need the READ and ;S verbs for screens. Specify a block
number and we can read the text in that block.
I'm sure you've noticed the difficulty with modifying code in the root.
A powerful tool is to be able to shift the dictionary in core. If the root
doesn't use absolute addresses, define a SHIFT entry and use it.
Otherwise minimize the number of absolute addresses and define a
more elaborate SHIFT verb that adjusts them.
Figure 6.2
Charles H Moore
Education: Born in McKeesport, Pennsylvania,
near Pittsburgh, in 1938. He grew up in Flint,
Michigan and was Valedictorian of Central High
School (1956). Granted a National Merit
scholarship to MIT where he joined Kappa Sigma
fraternity. Awarded a BS in Physics (1960) with a
thesis on data reduction for the Explorer XI
Gamma Ray Satellite. Then went to Stanford
where he studied mathematics for 2 years (1961).
Programmer: He learned Lisp from John McCarthy. And Fortran II for the IBM 704
to predict Moonwatch satellite observations at Smithsonian Astrophysical
Observatory (1958). Compressed this program into assembler to determine satellite
orbits (1959). On the other coast, he learned Algol for the Burroughs B5500 to
optimize electron-beam steering at Stanford Linear Accelerator Center (1962). As
Charles H Moore and Associates, he wrote a Fortran-Algol translator to support a
timesharing service (1964). And programmed a real-time gas chromatograph on his
first minicomputer (1965). Learned Cobol to program order-entry network at
Mohasco (1968).
Forth: Chuck invented Forth (1968) and collected his personal software library onto
an IBM 1130 which was connected to the first graphics terminal he'd seen (IBM
2250). Soon he used Forth to control the 30ft telescope at Kitt Peak for the National
Radio Astronomy Observatory (1970).
And then helped found Forth, Inc (1973) with $5,000 from an angel investor. For
the next 10 years, he ported Forth to numerous mini, micro and main-frame
computers. And programmed numerous applications from data-base to robotics.
In 1980, Byte magazine published a special issue on The Forth Language.
Gregg Williams' editorial (2.5MB) provides a rare view of Forth from the outside.