Power Programming For The Commodore 64
Library of Congress Cataloging in Publication Data

Sutton, James, (date)
Power programming the Commodore 64.
Includes index.
1. Commodore 64 (Computer)-Programming.
2. Assembler language (Computer program language)
3. Computer graphics. 4. Computer sound processing.
I. Title.
QA76.8.C64S89 1985 001.64'2 85-3517
ISBN 0-13-687849-0
The author and publisher of this book have used their best efforts in preparing this book.
These efforts include the development, research, and testing of the theories and programs to
determine their effectiveness. The author and publisher make no warranty of any kind,
expressed or implied, with regard to these programs or the documentation contained in this
book. The author and publisher shall not be liable in any event for incidental or consequential
damages in connection with, or arising out of, the furnishing, performance, or use of these
programs.
10 9 8 7 6 5 4 2
ISBN 0-13-687849-0 01
ACKNOWLEDGMENTS xiii
iv Contents
SYMBOLIC CODES 25
CONCEPTUAL CODES 28
DATA STRUCTURES 28
TYPES OF INFORMATION TASKS 30
ORDER AND ENTROPY INFORMATION 32
The Computer 34
THE CPU 36
BUSES 36
REGISTERS 36
THE FETCH-EXECUTE CYCLE 45
THE MEMORY MAP 48
DATA MOVEMENT 86
Between Register and Memory 87
Between Register and Top-Of-Stack 89
Between Register and Register 89
DATA TRANSFORMATION 91
Arithmetic 92
Addition 92
Subtraction 94
Shifts 96
Logical Operations 98
PROGRAM STRUCTURING 100
Constructs 103
Sequences 105
Selections 111
Repetitions 122
Modules 126
Hierarchies 131
Optimization 132
ALGORITHM 132
IMPLEMENTATION 136
SUMMARY 140
Testing 146
RS-232 229
Data Cassette 230
Keyboard and Screen 230
LINKS 230
Logical File Number 231
Device Number 231
Commands 232
CHANNEL PROGRAMMING 232
CHANNEL MANAGEMENT 234
KERNAL SUPPORT 234
WHOLE-FILE CHANNELS 237
Data Cassette 238
Disk Drive 238
PARTIAL-FILE CHANNELS 240
Data Cassette 242
Disk Drive 242
Sequential Files 242
Relative Files 243
IEEE-488/Serial Bus 244
RS-232 244
INTERACTIVE I/O 244
SUMMARY 244
CHANNEL COMMUNICATIONS 245
KERNAL SUPPORT 245
WHOLE-FILE CHANNELS 247
Data Cassette 249
Disk 249
The Auto-Start Loader 249
Overlays 251
PARTIAL-FILE CHANNELS 251
Data Cassette 255
Disk 256
Sequential Files 256
Relative Files 259
IEEE-488/Serial Bus 262
RS-232 262
Keyboard and Screen 263
INTERACTIVE CHANNELS 263
CHANNEL I/O SUMMARY 265
Using the I/O Block: Non-Channel I/O 266
GAME PORT I/O 267
LIGHT PENS 267
GAME PADDLES 268
Preparing for Paddle I/O 269
Selecting a Paddle Port 270
Reading Paddle Data 270
JOYSTICKS 271
7 VOCAL CHORDS FOR A CHIMERA: SOUND SYNTHESIS 304
Wave Attributes 304
INTENSITY 305
FREQUENCY 306
WAVEFORM 312
PHASE 314
Wave Modulation 315
LOUDNESS 315
PITCH 318
TIMBRE 320
Sound Programming 324
APPENDICES 327
The one, central problem we have all faced with computers is how to make our
electronic serfs do significant work for us. This question causes difficulties for
everyone from the casual user to the serious apprentice or professional programmer.
Those that intend to solve the problem by writing their own programs set off
on a quest for "how-to" knowledge. The first leg of this search for the holy grail
is spent groping just to discover what subjects to study! Typically, computer in-
formation is artificially separated into such topics as assembly language, hardware,
graphics, structured programming, and so on, which are then discussed in their own
volumes. The circular reasoning involved requires that the student know the subject
to be able to study it! The sales of computer books prove that many people are
spending hundreds of dollars attempting to gain an integrated understanding of
computers; the number of computers being set aside shortly after purchase proves
that many people are failing in these attempts. Together, these statistics suggest
where the problem lies.
A single book has been needed to guide the computer user from first principles
to the ultimate goal of programming, the conversion of vague ideas for computer
tasks into specific and complete computer tasks. This view of programming is in-
terdisciplinary at heart and requires that such varied fields as software development,
computer resources (chips, speed, etc.), problem solving, and the arts be considered
in unison. Filling these complementary needs is the purpose of this book.
Each chapter explains concepts relevant to any computer language, including
BASIC, and applies these concepts to the Commodore 64 wherever possible. Much
of the computer-specific material is unavailable anywhere else, being gathered dur-
ing long discussions with practicing Commodore 64 programmers. Almost all the
ix
Preface xi
puting magazine. * The task of filling up the TV screen with one character, then
filling it with another, and so on, until every character supported by the computer
has filled the screen once, was implemented in each language.
Here is the portion of the results dealing with the major languages we have
mentioned, adding results for a different high-level language, Pascal, for compar-
ison:
The BASIC version of the program took less of author Gregory Yob's time
to write and correct than did the other versions, but Mr. Yob attributed much of
that difference to his lack of familiarity with assembly language and Pascal. Indeed,
small programs like this often run correctly on the first try if written in Pascal. This
implies 0 correcting time. Pascal is the fastest and easiest language of the listed three
for programming and debugging large programs, and is the language of choice
wherever BASIC "will do." (Pascal was originally written to teach good program-
ming, and it is worth learning for that reason alone; habits encouraged by BASIC
have been shown to make poor programmers.)
Although writing characters to a television screen is not a general test of lan-
guages, it does relate to a distinguishing feature of the Commodore 64, the graphics
character mode. This mode allows the creation of your own specialized sets of
graphics or other symbols for use in programs.
Note that while Pascal is much faster than BASIC, assembly is outrageously
faster than either. Your Commodore 64 has the ability in assembly language to
display graphics 700 times faster than BASIC allows. This opens up built-in graphics
capabilities that are completely unavailable to BASIC programs! For example, in
assembly language there is enough time to split the TV screen into horizontal pieces
and treat each piece as a separate screen. Each screen can have different features,
an option precluded by BASIC's slowness. Assembly language even leaves enough
time for a program to do other work "between the cracks" as it controls the visual
display. There is more on such graphics techniques in Chapter 6.
A second comparison comes from Byte magazine.† This time the task is "The
Sieve of Eratosthenes," which finds all the prime numbers up to a given number
(in this case, 8191). All the multiples of 2, 3, and of the primes (as they are dis-
covered) are crossed off a number list. When the task is completed, only the primes
are left. This test exercises addition (the multiplications are implemented as repeated
additions), subtraction (number comparisons), and several other mathematically
oriented functions that are common in computer operation.
In this case the languages span several different computers, but each computer
mentioned contains the same "brain," or microprocessor, as that used in the Com-
modore 64, running at the same speed. The results are as follows:
You could say that the BASIC version leaves time to finish a full-course dinner,
but assembly just barely lets you tie your shoelaces.
There is a place for BASIC programming, but that place does not include high-
speed computation, sophisticated graphics, or applications requiring flexible use of
the computer's resources. Having been invented in 1965, a long time ago for any
computer language, BASIC should be spared harsh criticism. It is, and always was,
intended as an introduction to computers.
For all of the preceding reasons, assembly is the language we have chosen to
complement a powerful and proven method for accomplishing your computer goals.
This welds the lever of program-planning technique with the hammer of assembly
language into a force-multiplying tool for Power Programming.
I have many people to thank for their many different kinds of help. For technical
help I owe the most to Larry Holland of HES Corp. He is a true gentleman and a
Commodore 64 professional. Many of the I/O coding techniques were suggested
by him and have been used with telling effect in his own programs. HES Corp.
deserves similar thanks for making Larry available to me. Another HES expert who
has provided help is Jay Stevens. Any errors are the fault of this author, not of
Larry or Jay.
Further, my employer, Lockheed Missiles and Space Company of Sunnyvale,
California, has directly supported this book with shifted work hours, and indirectly
supported it with an outstanding technical environment to work and grow in. I could
not have written as much or as well without them.
For authorial support I owe the most to Bernard Goodwin, my publisher at
Prentice-Hall. His patience with my perfectionism and the resulting broken dead-
lines allowed several extra passes at the book. Another major support came from
my brother, whose command of the English language is as unshakable as anyone's
I know. His suggestions led to some of the most important changes in the early
portions of the text.
For personal support I thank my family: my wife, Debbie, and daughters
Noelle, Jenyfer, and Natalie. They sacrificed most of their time with me for nearly
two years. Additionally, Debbie acted as a reviewer, and often saw the forest while
I was wandering among the trees. Her suggestions greatly improved the organization
of a number of sections.
Finally, I thank my Lord Jesus for the encouragement his words have given me
to see through this twenty-two-month pregnancy with Power Programming. With-
out that encouragement this "baby" would have miscarried on several occasions.
xiii
The London street light edges the window like a mute sunburst. Lying on oak, raggedly
stacked pages are swirled by a mantle's glow, but their inner current of symbols and
mechanical drawings is diverted only by the occasional footnote "1842 REVISION."
The Prime Minister reaches to blacken a quill, then marks the final page "Rejected."
Mankind is denied the computer for another hundred years.
The rejected manuscript was the blueprint for a computer built of mechanical
parts. Its inventor, Englishman Charles Babbage (1792-1871), alienated fellow
scientists by criticizing the bureaucratic leaders of their scientific societies. In return,
those leaders lobbied the British government, which then withdrew its financial sup-
port for the development of Babbage's computer. The completed blueprint lay
forgotten after the inventor's death in 1871. Its rediscovery in 1937 and updated
development during World War II resulted in the first electronic digital computer,
ENIAC (Electronic Numerical Integrator and Calculator), in 1946.
The mechanical computer, which Babbage called the Analytical Engine, had
all but two of the basic attributes of modern computers. These attributes are easily
understood in the mechanical computer since the physical actions supporting them
are familiar to us from everyday life. We will present these basic attributes first in
mechanical terms, and then apply them to the electronic actions and circuitry of
your modern computer. The analogy is simple and direct. Even the roots of the two
basic modern-computer functions absent in Babbage's invention can be traced to the
mechanical computer.
The heart of the computer was originally called the "mill." Babbage named it after
the grain mill, a machine that converts raw grain, or grist, into flour. In a grain mill,
grist is placed between two stone disks and one disk is turned, usually by a water-
wheel. The trapped grist is pulverized into flour. In both grain mills and computers
the raw material and end product are made of the same substance; only the form
changes. The substance milled by a computer is not, of course, grain. It is numbers.
Numbers entering the mill will be "mashed," that is, added, subtracted, and other-
wise transformed, and number results of a different form exit.
Number transformation is the central activity of the computer. Outside the
mill portion of the computer is a miniature world centered around and supporting
the mill. This world has storage bins for the grist, a transportation system to move
grist between mill, storage, and consumers, and a communications system to coor-
dinate all these activities. "Consumers" are often devices that show human beings
the milling results.
All these milling support systems are present in modern computers. Two other
attributes common to both the Analytical Engine and today's computers are grist
made of binary instead of decimal numbers, and a program, or sequence of instruc-
tions like a recipe, to direct the mill's operations.
All mathematical machines prior to the Analytical Engine, including an earlier
calculator invented by Babbage, used the decimal or base 10 number system for
calculations. The decimal system, based on our two five-fingered hands for a total of
10 "digits," complicates machinery by requiring machine representations for 10 dif-
ferent numbers from 0 to 9. Its rules for arithmetic are convoluted, as seen in the
rules for carrying or borrowing a 1. On the other hand, the binary system of two
digits, 0 and 1, has the very fewest number of digits that can convey a useful mean-
ing: With two digits one can both express numbers and modify them with simple
arithmetic operations.
Just two different mechanical or electronic digit representations are needed in
a binary-based machine. This not only simplifies computers, it also structures the
computer machinery more regularly. This structuring makes construction of com-
puters more practical.
The original concept of a program was at once radical and elementary. To
Babbage, a program was more than just a sequence of number transformations for
the mill to perform on grist; it implied the ability of the mill to look at the result of a
transformation and then choose a next transformation! This was the beginning of
the modern computer program.
For example, a program may direct the following mill actions in response to
these different results of a subtraction: If the result is negative, move a storage loca-
tion's contents; if positive, do another subtraction; and if zero, "wave a (figurative)
flag" to alert the human being operating the computer. The ability to select among
alternative actions allows the mill to skip unnecessary commands, repeat parts of a
program, or pick one of many paths through the program.
The "Number Mill" 3
The same four categories of information for the mill are also found in
microcomputer programs. Constants and variables are grouped under the name
operands, because operations can obtain grist from either of them. Again, note this
difference: A constant is a number to be operated on; a variable is the address of a
storage bin holding a number to be operated on.
Except for constants, information on the Babbage cards was nonnumerical
and could not be stored in the mill's internal and numerical storage. Thus the cards
were read by the mill as needed, and the program was said to be externally stored.
This limitation was largely responsible for the development of the two most recent
basic principles: codes and internal program storage.
A code is a meaning assigned to a number. The number 127 could be chosen to
represent the subtraction operation, for instance. Then if 127 enters the mill in a
special operation code context, the mill will interpret it to mean "perform a subtrac-
tion." If entering the mill during another context, 127 might be interpreted as a
variable (address of a storage bin) containing number grist. Or again, 127 might sim-
ply be taken for a constant to be used as grist. Numbers can now be used not only as
the internal grist to be operated on, but also as the representatives of operations and
variables.
This opens the door for the second advance: internal program storage. With
the entire program represented numerically, its parts could now reside in internal
storage bins. This allows the modern mill to use the normal transportation system to
retrieve "operators" (representing operations) and operands, and to accomplish
new goals that are impractical with an externally stored program.
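To make the internally stored, numerically coded program concrete, here is a toy "mill" in Python. This is entirely our own sketch: the value 127 standing for subtraction is the text's illustrative choice, while the second code (126 for addition) and all the names are hypothetical additions of our own.

```python
# A toy "mill" with an internally stored program made of plain numbers.
# 127 meaning "subtract" comes from the text; 126 and all names are our own.
SUB = 127          # operation code: "perform a subtraction"
ADD = 126          # a second, made-up operation code for contrast

storage = [0] * 16          # internal storage bins
storage[3] = 50             # bin 3 holds the number 50

# Each instruction is (operation code, constant operand, bin address).
program = [(SUB, 8, 3),     # storage[3] = storage[3] - 8
           (ADD, 5, 3)]     # storage[3] = storage[3] + 5

for opcode, constant, address in program:
    if opcode == SUB:       # 127 read in an operation-code context
        storage[address] -= constant
    elif opcode == ADD:
        storage[address] += constant

print(storage[3])           # 50 - 8 + 5 = 47
```

Note that the same number, 127, would mean something entirely different if it appeared in the constant or address position: meaning comes from context, exactly as the text describes.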
Information 5
grammer's imagination. The pioneer who exploits codes creatively will be placing an
uncharted region into his domain.
But as quickly as we have entered that mystic realm, we must retreat to the
known colonies of existing codes. Their mastery is essential preparation for any fur-
ther sorties.
Three categories of codes warrant our attention: the numeric, the symbolic,
and the conceptual. The numeric codes are foundational and will require the most
effort to master. Your time with these codes will be repaid many times over in
Chapter 3 when we study their transformation with assembly language.
Numeric Codes
The numeric codes assign number meanings to the numbers used within the com-
puter. Numerically coded numbers can be used in counting and all other
mathematical activities. They can also indirectly represent anything that
mathematics can manipulate; that is, they can indirectly represent any information.
So numerically coded numbers can represent the number of shopping days left until
Christmas, the number of times the computer must perform a particular set of ac-
tions, and so on. This is analogous to the role of numbers in equations. Numbers are
treated by equations as pure values, but may and often do represent physical or con-
ceptual quantities to the human being utilizing the equations.
The most general codes are numeric. Since these codes can indirectly repre-
sent any information, there may seem to be little need for other types of codes.
However, the remaining code types restrict and predefine the assignment of infor-
mation to numbers in certain special cases to simplify and enlighten the program-
mer's job.
We will examine three numeric codes: the binary code, the hexadecimal code,
and the binary-coded decimal (BCD) code. Be sure that you understand each code as
it is discussed before moving along; the explanation is graduated and depends on
your grasping all earlier material.
Binary code. The binary code, also called the binary number system,
represents arbitrarily large or small numbers with combinations of the digits 0 and 1.
Binary number representation is similar to that of the decimal system, but simpler.
Imagine a row of light switches on a wall. Each switch has the standard Off
and On positions. Let's say that these represent the binary digits 0 and 1, respec-
tively (Fig. 1.1). Using the rightmost switch alone allows for only two number
values, 0 and 1 (Off and On). Using the two rightmost switches allows for four
number values; two for the Off and On positions of the rightmost switch while the
next-left switch is Off, producing 00 and 01, and two for the same positions of the
rightmost switch while the next-left switch is On, producing 10 and 11.
Each added switch doubles the total number of combinations, since all
previous combinations are allowable in combination with each of the two positions
of the new switch. Saying this with numbers, we have 2 (or 2^1) possible position
combinations with one switch, 4 (two times 2, or 2^2) combinations with two
switches, 8 (two times 4, or 2^3) combinations with three, and in general, 2^n
combinations with n switches. The i^j notation means "multiply i by itself j times,"
as in 2^3 = 2 × 2 × 2 = 8. In the base 10 or decimal system, the rule would be
10^n combinations with n 10-position switches.

6 Past Its Plastic Envelope: The Computer's Inner Machinery Chap. 1

Figure 1.1 A row of eight two-position switches, S7 through S0; On represents '1' and Off represents '0'.
Each two-position switch in the wall plate corresponds to one column or bit,
short for "BInary digiT," in a binary number. Bits are grouped together, like the
switches above, to express more numbers than would be possible with a single bit.
Early computers actually used On/Off switches like those above for entering data.
All data expressed as a sequence of On-Off bits are called digital data, due to
the grouping of digits. Eight-bit numbers are the norm in microcomputers like the
Commodore 64, and are called bytes. We now know that an 8-bit byte allows 2^8 =
256 different bit combinations.
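The doubling rule can be tried out directly. The fragment below is a modern Python convenience, not anything from the Commodore 64 itself; it simply prints 2^n for a few switch counts.

```python
# Each added switch (bit) doubles the total number of combinations: 2**n.
for n in (1, 2, 3, 8):
    print(n, 2 ** n)
# 1 2
# 2 4
# 3 8
# 8 256   <- an 8-bit byte has 256 combinations
```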
In our discussions of binary numbers we shall use and assume byte groupings
of bits. Four-bit nibbles and 16-bit numbers are also common. Any bit grouping,
regardless of the number of bits involved, is generically called a word. The entire
switch/bit analogy can be demonstrated visually as shown in Fig. 1.2. This analogy
is especially close since bits are represented in the microcomputer by tiny switches.
These on/off devices are familiar to you as transistors. They provide a physical way
to represent numbers. The remainder of our discussion of binary numbers will con-
centrate on binary math and how special types of numbers can be represented in the
binary code.
Binary counting begins with the number zero, which is represented by a binary
word containing only 0-bits. We say that all the bits have been reset, since "setting"
a bit means making it a 1. Thus the byte value 0000 0000 equals the number zero.
The digits are grouped in fours to make reading them easier.

Beginning as in the decimal code, the rightmost column of a 0-byte increments
to 1. Then, since there are only two possible digits per column, a succeeding increment
returns the rightmost bit to 0, and a carry is made to the next bit to the left. So it
goes, incrementing and carrying, as shown in Table 1.1.

Figure 1.2 The switch/bit analogy: switches S7 through S0 in their On ('1') and Off ('0') positions, representing the bits of a byte.
Thus a 1 value in the rightmost binary column and 0's in all others represents a
decimal value of 1. A 1 in the column left of rightmost and 0's elsewhere represents
the decimal value 2. Moving over one more column, a lone 1 digit represents decimal
4. Following this pattern, the weight or multiplier for a 1-bit in each column is a
power of 2: 2^0, 2^1, and 2^2 are the weights for the three rightmost columns just
discussed. Put a little more formally, the columns in a byte are numbered from 0 to 7
and are labeled d7 d6 d5 d4 d3 d2 d1 d0. The d0 or rightmost bit has a weight of
2^0, and the d7 or leftmost bit has a weight of 2^7.
The first 16 powers of 2 are used quite frequently. They are listed in Table 1.2.
Note that the d7 bit of an 8-bit number has a weight of 2^7 or 128, while a byte as a
whole can represent 2^8 or 256 possible numbers. This is because the number resulting
from setting the d7 bit is the binary value 1000 0000, or decimal 128, while a byte as
a whole can represent 256 different numbers from 0000 0000 to 1111 1111.
One last point about the table of powers: 2^10 has a special place in binary ter-
minology. Its decimal equivalent, 1024, is close to a convenient power of 10, the
number 1000. In electronics and other fields, the number 1000 is often abbreviated
"K". Because of its kinship to the number system we know and love, we tend to
speak of large binary numbers in terms of K. So 2^16 = 2^6 × 2^10 = (2^6)K = 64K, and
so on.
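The weights and the K shorthand can be verified mechanically with a few lines of modern Python (our own check, with variable names of our choosing):

```python
# Column weights for a byte are the powers of 2 from d0 up to d7.
weights = [2 ** d for d in range(8)]
print(weights)               # [1, 2, 4, 8, 16, 32, 64, 128]

# "K" is the binary thousand, 2**10 = 1024, so 2**16 = (2**6)K = 64K.
print(2 ** 10)               # 1024
print(2 ** 16 == 64 * 1024)  # True
```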
TABLE 1.1 BINARY COUNTING

  Binary       Decimal
  0000 0000       0
  0000 0001       1
  0000 0010       2
  0000 0011       3
  0000 0100       4
  0000 0101       5
  0000 0110       6
  0000 0111       7
  0000 1000       8
  ...
  1111 1111     255

TABLE 1.2 POWERS OF 2

  n      2^n
  0         1
  1         2
  2         4
  3         8
  4        16
  5        32
  6        64
  7       128
  8       256
  9       512
  10    1,024
  11    2,048
  12    4,096
  13    8,192
  14   16,384
  15   32,768
  16   65,536
Addition. The following four operations are all that are needed for binary
addition.
    0       0       1       1           1   (e.g., a carry)
  + 0     + 1     + 0     + 1           1
  ---     ---     ---     ----        + 1
    0       1       1     (1)0        (1)1
                    (0 carry 1)  (1 carry 1)
Each column in a binary addition of two numbers will contain one of these
four operations.
Example:
Adding the byte values 1011 1101 and 1001 1100 demonstrates all the addition opera-
tions in one problem. Working from the rightmost column to the left, with each carry
written above the column it enters:

  carries   1 0011 1000
               1011 1101
            +  1001 1100
            ------------
            1 0101 1001
10 Past Its Plastic Envelope: The Computer's Inner Machinery Chap. 1
Both addends in the preceding example were 8-bit bytes, but the sum was 9 bits
long, or 1 bit larger. In a byte-wide computer such as the Commodore 64, this extra
bit is called the carry bit or carry flag. It is outside the storage location for the result
byte, but its value is stored elsewhere so it can be checked by the program.
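In a modern language whose numbers are wider than a byte, the carry flag can be imitated by masking off the low 8 bits and inspecting the ninth. The Python sketch below (our own illustration, reusing the example's addends) shows the idea:

```python
# Adding two byte values; the ninth bit of the sum plays the carry flag.
a = 0b1011_1101              # the example's first addend
b = 0b1001_1100              # the example's second addend
total = a + b
result = total & 0xFF        # the byte-sized result (low 8 bits)
carry = total >> 8           # the bit that "fell off" the left end
print(f"{result:08b}", carry)   # 01011001 1
```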
Exercise:
Practice the binary operations with the following addition problems.
(h)        01
    +   01111

(i)   1010101
    +  111111

(j)    100101
    + 10100110
from which we are subtracting. The computer does not care, and it gives us a bit to
borrow from. Thus the subtraction will be of the form
1 0000 0000 - Y
As in addition, there are four possible column operations. Each column sub-
traction looks like one of the following:
    1       0       1       10    (this is a 0 - 1 after
  - 1     - 0     - 0     -  1    borrowing from the
  ---     ---     ---     ----    ninth 1-bit)
    0       0       1        1
We will illustrate the initial 0 - Y subtraction of the long method in an exam-
ple problem containing all four subtraction operations.
Example:
Negate Y, for Y = 0110 1110.
As we have just said, the negative of a number can be found by subtracting it
from 0. With the ninth bit set to provide a source to borrow from, this problem appears
as follows:
  1 0000 0000     Stage 1 (0 - Y; restate this problem with the
  - 0110 1110     carry bit borrowed down the byte)

  1111 11(10)0    Stage 2 (the carry bit has been borrowed down
  - 0110 1110     to the last negative-result column; the "(10)"
                  is the binary two left in the d1 column)

    1001 0010     Stage 3 (-Y)

so the arithmetic negative of 0110 1110 is 1001 0010.
The term in brackets, 1111 1111 - Y, holds the key to the shortcut.
Example:
Observe the results of the operation 1111 1111 - Y = Z on a few values for Y:
(i)     1111 1111
      - 0110 1011   Y
        ---------
        1001 0100   Z

(ii)    1111 1111
      - 1100 1101   Y
        ---------
        0011 0010   Z

(iii)   1111 1111
      - 0011 0101   Y
        ---------
        1100 1010   Z
Each bit in each result Z is reversed from the corresponding bit in Y. Thus, instead
of subtracting Y from 1111 1111, each bit in Y can simply be reversed. The reverse
of Y is called the logical NOT of Y, since when it is complete every bit in Y is NOT
the same as in the original.
Thus the shortcut reduces to the equation
X - Y = X + ([NOT Y] + 1)

In words, this equation says "To subtract Y from X, reverse every bit in Y and add
1. Then add the result to X." The subtraction has been converted to an addition,
and the promised simplicity has at last arrived.
Example:
Calculate X - Y for X = 1100 1011 and Y = 1001 0110.
First, reverse each bit in Y:

(NOT Y) = 0110 1001

Next, add 1:

  0110 1001 = (NOT Y)
+         1
  ---------
  0110 1010 = (NOT Y) + 1

Then add this to X, discarding the carry (ninth) bit:

  1100 1011 = X
+ 0110 1010 = (NOT Y) + 1
  ---------
1 0011 0101        so X - Y = 0011 0101

Checking in decimal:

X = 128 + 64 + 8 + 2 + 1 = 203
Y = 128 + 16 + 4 + 2 = 150
X - Y = 32 + 16 + 4 + 1 = 53
It checks.
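The shortcut is easy to mechanize. In the Python sketch below (our own notation; XOR with 1111 1111 plays the role of the NOT that reverses every bit), the example is recomputed:

```python
# X - Y via the shortcut X + (NOT Y + 1), working entirely within 8 bits.
X = 0b1100_1011              # decimal 203
Y = 0b1001_0110              # decimal 150
not_y = Y ^ 0xFF             # reverse every bit: the one's complement
two_c = (not_y + 1) & 0xFF   # the two's complement of Y
diff = (X + two_c) & 0xFF    # add, discarding the ninth (carry) bit
print(f"{diff:08b}", diff)   # 00110101 53
```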
Of course, the foregoing method also works with longer data words, as in the
following example.
Information 13
Example:
Solve X - Y for 16-bit words where X = 10010111 11101101 and Y = 00010101
01101001.
Following the shortcut, we find that
(NOT Y) = 11101010 10010110
and
(NOT Y) + 1 = 11101010 10010111
so
X + [(NOT Y) + 1] =   10010111 11101101
                    + 11101010 10010111
                    -------------------
                    1 10000010 10000100

The carry, or seventeenth (d16) bit, is discarded, so the result is

X - Y = 10000010 10000100
Exercise:
Hone your subtraction skills by solving the following problems. Do (k) as a long sub-
traction, and (l) and (m) as additions using the conversion process described above.
(k)   10001011
    - 01111101

(l)   01101001
    -  0010101

(m)   11000110
    - 10110011
Another term for NOT Y is the one's complement of Y, due to its origin by
subtraction from a number consisting of all 1's. This is usually abbreviated 1C(Y).
NOT Y + 1 is an even more important value called the two's complement of Y. It is
abbreviated 2C(Y).
As in signed binary, the top bit in two's-complement numbers equals 0 for
positive numbers and 1 for negatives. Practically speaking, this means that 8 bits will
not count to + 256 in two's-complement notation. Instead, an 8-bit 2C number
represents the integers from 0 to 127 and from -1 to -128. With 16-bit words the
representation is of all integers from 0 to 32,767 and from -1 to -32,768, with d15
being the sign bit.
If two large positive 2C numbers are added together, the sign (top) bit of the
result can be set to 1. The sum will then erroneously be interpreted as being negative.
Similarly, the sum of two large negative numbers can misleadingly appear positive
when the top bit is reset to O. Both conditions are known as overflow. Overflows can
be avoided by switching to larger words (e.g., from 8 bits to 16 bits).
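The overflow condition just described can also be tested for directly: overflow occurs exactly when both addends share a sign bit but the sum's sign bit differs. The function below is our own sketch of that rule in Python, not a routine from the Commodore 64:

```python
# 8-bit two's-complement addition with overflow detection.
def add_8bit(a, b):
    total = (a + b) & 0xFF
    # Overflow: the addends agree in bit d7 but the sum disagrees with them.
    overflow = (~(a ^ b) & (a ^ total) & 0x80) != 0
    return total, overflow

print(add_8bit(0b0110_0100, 0b0101_0000))  # 100 + 80: (180, True), sign flipped
print(add_8bit(0b0000_0101, 0b0000_0011))  # 5 + 3: (8, False), no overflow
```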
Minute Numbers. This is a good time to show how minute numbers, or frac-
tions, can be represented. Recall that each column has a weight or power of 2
Exercise:
Solve the following problems using any method you like.
(n)   1001 1001.011
    - 0111 0011.101
Example:
Multiply the decimal numbers 243 and 768.
      243
    × 768
    -----
     1944
    1458     partial products
   1701
   ------
   186624    product
Similarly, to multiply two binary numbers, we obtain the partial products due
to each digit in the multiplier and sum them. Binary column multiplication, like
binary column addition and subtraction, is based on four operations. They are
  0     0     1     1
x 0   x 1   x 0   x 1
  0     0     0     1
The rules for binary-point placement in binary multiplication are the same as
those for decimal-point placement in decimal multiplication. An example of binary
multiplication that incorporates all four types of column multiplications with frac-
tional binary numbers is given below. It illustrates the foregoing principles and com-
pletes our discussion of multiplication for now. This subject will be taken up again
in greater detail when we study assembly language in Chapter 3.
16 Past Its Plastic Envelope: The Computer's Inner Machinery Chap. 1
Example:
Multiply the binary numbers 101101.01 and 1.01.

     101101.01
   x      1.01
   -----------
      10110101
     00000000
    10110101
   -----------
   111000.1001
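A program can carry out the same partial-product scheme with shifts and adds. This Python sketch works on whole-number bit patterns and leaves binary-point placement to the caller; the function name is illustrative.

```python
def binary_multiply(x, y):
    """Sum a shifted copy of x for each 1 digit in the multiplier y."""
    product, shift = 0, 0
    while y:
        if y & 1:                  # this multiplier digit is 1
            product += x << shift  # add the shifted partial product
        y >>= 1
        shift += 1
    return product

# 101101.01 x 1.01: multiply the digit patterns, then place the binary
# point 2 + 2 = 4 places from the right of the result.
print(bin(binary_multiply(0b10110101, 0b101)))   # 0b1110001001 -> 111000.1001
```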
weights double going left, and halve going right, moving a digit one column either
doubles or halves its multiplier. The whole number is the sum of these digits times
their weights, so shifting the number left multiplies it by 2, and shifting it right
divides it by 2.
This allows us to perform a nice trick for decimal-to-binary conversion. If we
divide a decimal whole number by 2, we are in effect shifting it as a binary number
to the right. A remainder of 0 means that the value 0/2 passed the decimal point and
the equivalent binary point into the first fractional column. A remainder of 1 means
that 1/2 ended up in the first fractional column. The value remaining to the left of
the decimal point is the same as if the original number's binary equivalent had been
shifted right one place.
If we collect those remainders until nothing remains left of the decimal point,
we will have obtained the entire binary equivalent. The best way to convey this proc-
ess is with an example.
Example:
Convert the decimal number 55 to its binary equivalent.
Divide the number 55 by the number 2 until the results are completely fractional:

55/2 = 27 remainder 1
27/2 = 13 remainder 1
13/2 =  6 remainder 1
 6/2 =  3 remainder 0
 3/2 =  1 remainder 1
 1/2 =  0 remainder 1   done, since the next division 0/2 shows
                        nothing remaining left of the binary
                        point in the binary equivalent

Thus the binary equivalent of 55 is 110111.
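The same collect-the-remainders process is easy to express in code; here is a minimal Python sketch (the function name is illustrative):

```python
def decimal_to_binary(n):
    """Convert a nonnegative whole number by collecting remainders of /2."""
    if n == 0:
        return "0"
    digits = []
    while n:                       # stop when nothing remains left of the point
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))
    return "".join(reversed(digits))   # the last remainder is the top digit

print(decimal_to_binary(55))   # 110111
```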
This trick works equally well for converting fractional decimal numbers to
their binary equivalents. However, fractional numbers are multiplied rather than
divided by 2 to shift the binary equivalent's digits left of the binary point. As each
digit crosses the decimal point it is collected and removed from the fractional part to
be multiplied in the next step.
Example:
Convert the decimal number .671875 to its binary equivalent.
Multiply the number .671875 by 2 until the results are completely nonfractional.
2 x .671875 = 1.34375
2 x .34375  = 0.6875
2 x .6875   = 1.375
2 x .375    = 0.75
2 x .75     = 1.5
2 x .5      = 1.0   done, since 0 remains right of the
                    decimal (and binary) point

Thus the binary equivalent of .671875 is .101011.
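The multiplication method for fractions can be sketched the same way. Because many decimal fractions never terminate in binary, the sketch below caps the number of digits produced; the cap and the function name are illustrative.

```python
def fraction_to_binary(f, max_digits=16):
    """Convert a decimal fraction 0 <= f < 1 by repeated doubling."""
    digits = []
    while f and len(digits) < max_digits:
        f *= 2
        whole = int(f)             # the digit that crossed the binary point
        digits.append(str(whole))
        f -= whole                 # remove it before the next doubling
    return "." + "".join(digits)

print(fraction_to_binary(0.671875))   # .101011
```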
OR (Logical)            OR (Binary)
A  B  Result            A  B  Result
F  F  F                 0  0  0
F  T  T                 0  1  1
T  F  T                 1  0  1
T  T  T                 1  1  1
AND (Logical)           AND (Binary)
A  B  Result            A  B  Result
F  F  F                 0  0  0
F  T  F                 0  1  0
T  F  F                 1  0  0
T  T  T                 1  1  1
On the other hand, a 0-bit in an AND operation "masks" out the other bit,
regardless of its value, and gives a 0 result. That is,
0 AND 0 = 0
0 AND 1 = 0
One use of the AND function is in testing the contents of bits in a byte or
word. To discover if bit d5 of a target byte holds a 1, AND the byte with a condition-
ing byte of 0010 0000. The operation zeroes all bits but d5, gates d5 through, and produces a
directly testable nonzero result if d5 originally contained a 1.
The last logical operation, XOR or Exclusive OR, produces a True result from
two inputs when EXCLUSIVELY one OR the other, but not both, is True (see Table
1.5).
The XOR operation can be used to set a byte equal to O. Simply XOR the byte
with itself.
Example:
Logically XOR the value 1001 1011 with itself.
10011011
XOR 10011011
00000000
Another use comes from the property that a 0 XORed with any bit produces
the original value of the bit, but a 1 XORed with any bit produces the reverse of the
original bit value. Thus selected bits in a byte can be complemented.
XOR (Logical)           XOR (Binary)
A  B  Result            A  B  Result
F  F  F                 0  0  0
F  T  T                 0  1  1
T  F  T                 1  0  1
T  T  F                 1  1  0
The symbols for NOT, OR, AND, and XOR are the overbar ( ¯ ), plus (+), dot
(·), and circled plus (⊕), respectively.
Exercise:
Perform the following logical operations.
(u)     10110011
    OR  00101110
(v)     01001010
    AND 10011110
(w)     10110110
    XOR 01101100
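The masking and complementing tricks above map directly onto Python's bitwise operators; the sample values here are illustrative.

```python
value = 0b10011011

# AND with a conditioning byte tests bit d5: a nonzero result means d5 was 1.
print(value & 0b00100000)        # 0: d5 of 1001 1011 holds a 0

# XOR with itself zeroes the byte.
print(value ^ value)             # 0

# XOR with a mask complements only the selected bits (here, the low four).
print(bin(value ^ 0b00001111))   # 0b10010100
```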
Example:
Convert several 4-bit binary numbers to their hexadecimal equivalents.
(i) 0011 binary = 3 hex
(ii) 1001 binary = 9 hex
(iii) 1101 binary = D hex
Column    Decimal weight    Hex weight
0         1                 1
1         16                10
2         256               100
3         4,096             1000
4         65,536            10000
Arithmetic. Because of the similarities between the hex and binary codes, we
will discuss only hex addition and subtraction here. The other arithmetic and logical
operations are similar extensions of their binary counterparts.
To perform totally hexadecimal addition you would have to memorize all the
possible column additions. That is, you would have to immediately know that B hex
+ B hex = 16 hex, and so on. For most people it is easier to convert the hex digits
mentally into their decimal equivalents for each column addition, and convert the
column result back into hex. If the result is greater than 15d, you have to subtract
16d from it to obtain the carry for the next column.
Example:
Add the hex numbers A7B0h and 98A0h. Show all carry digits in parentheses.

Working up from the rightmost column, converting to decimal and back:

0 + 0       =  0d =  0h   digit 0, no carry
B + A       = 21d = 15h   digit 5, carry (1)
7 + 8 + (1) = 16d = 10h   digit 0, carry (1)
A + 9 + (1) = 20d = 14h   digit 4, carry (1)

     A7B0h
  +  98A0h
  --------
  (1)4050h
The leftmost carry in an addition result will often be discarded. If so, the
numbers are said to wrap around, so that the first number higher than FFFFh is
0000h.
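The wraparound rule amounts to keeping only the low 16 bits of the sum; a one-line Python sketch (the helper name is illustrative):

```python
def hex_add_16(a, b):
    """Add two 16-bit values, discarding any carry out of the top column."""
    return (a + b) & 0xFFFF    # the first value past FFFFh wraps to 0000h

print(format(hex_add_16(0xA7B0, 0x98A0), '04X'))   # 4050
print(format(hex_add_16(0xFFFF, 0x0001), '04X'))   # 0000
```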
Hex subtraction is similar to the long method of binary subtraction that we
discussed earlier. The two's complement of a hex number is found by subtracting the
number from a zero value having a 1 "borrow digit" in the column left of its highest
column.
Example:
Find the 2C negative of BC7Fh.
   10000h
 - BC7Fh

 F F F (10)   (the borrow is carried down to
 B C 7 F       the last column needing it)
 ----------
 4 3 8 1h

or 4381h.
Exercise:
Solve the following subtraction problem by converting the subtrahend into a 2C value
and adding it to the minuend. Don't forget to discard the carry from the result.
(z) D B 8 A minuend
- 7 E C C subtrahend
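The complement-and-add method is equally mechanical in code; these helper names are illustrative:

```python
def twos_complement_16(n):
    """10000h minus the number: the 16-bit two's complement."""
    return (0x10000 - n) & 0xFFFF

def hex_subtract(minuend, subtrahend):
    """Subtract by adding the subtrahend's 2C and discarding the carry."""
    return (minuend + twos_complement_16(subtrahend)) & 0xFFFF

print(format(twos_complement_16(0xBC7F), '04X'))         # 4381
print(hex_subtract(0xA7B0, 0x98A0) == 0xA7B0 - 0x98A0)   # True
```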
BCD code. Binary-coded decimal or BCD code is a hybrid, as its name im-
plies. Like the hexadecimal code, it divides a byte into two 4-bit groups. However,
each group represents a decimal digit rather than a hexadecimal digit, and the weight
of each group is a power of 10 rather than a power of 16. This implies that the values
0 through 9 are the only legal contents of a group, with the six values A through F a
"no-man's land" that is never to be trespassed. BCD code, in short, is a binary way
of expressing decimal numbers.
Example:
Show the BCD form of several decimal numbers.
(i) 27d = 0010 0111 bcd
(ii) 95d = 1001 0101 bcd
(iii) 34d = 0011 0100 bcd
In BCD code a byte's two groups of 4 bits can represent two decimal digits.
Hence the largest legal value of a BCD byte is 99d. 1001b = 9d, so 99d appears as
1001 1001.
Imagine adding the two legal 4-bit nibbles 7 and 8. F, the sum, is clearly in the
no-man's land, so what do we do? There are six digits in no-man's land, so we add a
6 (0110b) to the result to "leapfrog" the forbidden territory. This addition generates
a carry bit that must be added with the next-highest nibble addition, or that becomes
the final leftmost digit if there are no further additions to perform.
The "no-man's land" correction is mentioned here only so that you will
understand BCD arithmetic. You need not perform the addition or corresponding
subtraction corrections, since your computer has special BCD operations that per-
form them automatically.
Example:
Add the two BCD numbers 0111 and 1000.
  0111 bcd   (7d)
+ 1000 bcd   (8d)
----------
  1111       in no-man's land
+ 0110
----------
(000)1 0101 bcd = 15 decimal
BCD is used wherever the direct conversion of a number from its computer
representation to a human-readable form is desired, or wherever each decimal digit
must correspond to a single nibble, or most important, where precision to the right
of the decimal point is required, as in accounting. Since many decimal fractions re-
quire an infinite number of binary fractional digits to express them, BCD is needed
to produce an exact duplicate of the decimal number in machine form.
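On the 6510 the special BCD operations apply the six-correction automatically; the Python sketch below (helper name illustrative) only shows the arithmetic for one packed-BCD byte.

```python
def bcd_add_byte(a, b):
    """Add two packed-BCD bytes, nibble by nibble, with the +6 correction."""
    result, carry = 0, 0
    for shift in (0, 4):                      # low nibble, then high nibble
        total = ((a >> shift) & 0xF) + ((b >> shift) & 0xF) + carry
        if total > 9:
            total += 6                        # leapfrog the A-F no-man's land
        carry = total >> 4
        result |= (total & 0xF) << shift
    return carry, result

carry, result = bcd_add_byte(0b0111, 0b1000)  # 7d + 8d
print(carry, format(result, '08b'))           # 0 00010101 -> 15 decimal
```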
Exercise:
Add the BCD numbers 1001 1000 and 0101 0110.
(aa)   1 0 0 1 1 0 0 0
     + 0 1 0 1 0 1 1 0
Some types of information must be coded and grouped. For instance, a force
must be described with numbers for each coordinate direction in space. However,
other types of information are of a simpler underlying structure and can be directly
coded in imaginative ways. Although only the most obvious examples of these types
of information have been coded, direct coding opens up some elegant and direct
ways to handle all such information. Our last two categories of codes exploit this
potential.
Symbolic Codes
The 7 bits of the ASCII code allow 128 possible meaning assignments. In addi-
tion to numbers and letters, punctuation marks and control characters are coded.
Control characters have noncharacter meanings such as "ring a bell," "this is the
end of the message," and "back up a space." The most common control characters
are the carriage return (0Dh), the linefeed (0Ah), and the tab (09h). All ASCII con-
trol characters have number values between 0h and 1Fh.
The Commodore 64 extends ASCII for its internal use by defining the d7 bit of
standard characters as the value 0 and by adding new characters with the eighth bit
set to 1. It also substitutes its own graphic and control characters for the ASCII
characters between 60 hex and 7F hex, and substitutes its own interpretations for the
ASCII values 0h through 1Fh. We will call the Commodore ASCII code extended
ASCII or simply XASCII.
These new assignments are not recognized by most devices in the outside
world, so standard ASCII must still be used on occasion. A chart of the standard
and extended ASCII codes makes up Appendix C. It omits those ASCII control
codes that are of no use for normal programming purposes. Please glance through it
and hold your place there for reference during the following remarks.
The control values unique to XASCII include CLR HOME, RVS OFF, and so
on, as shown in Appendix C. These values duplicate the computer's keyboard func-
tions and allow control of the TV screen directly from a program. Graphics
characters can be sent by the program to the TV screen or to the Commodore dot
matrix printer.
The Commodore 64 has several built-in "miniprograms," making up what is
known as an operating system, to do frequently required chores. For instance, there
is a miniprogram for sending single data bytes to locations of your choice, including
the TV screen and the printer. If you provide this miniprogram with an XASCII
value to send to the TV, that character, or its effect if it is a control value, will show
up on the screen. Chapter 5 describes in detail the use of the operating system.
In Appendix C, note the relationship between the standard ASCII characters
"a" and "A". Their values, 0110 0001 and 0100 0001, respectively, differ only in
their d5 bits. This is true of all other standard ASCII upper- and lowercase letters.
To convert from uppercase to lowercase, just add 20h to the uppercase value, or
equivalently, set the d5 bit of the uppercase letter to 1.
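For standard ASCII letters, the conversion is a single bit operation either way; here is a Python sketch with illustrative names:

```python
def to_lowercase(ch):
    """Set bit d5 of an uppercase ASCII letter (equivalent to adding 20h)."""
    return chr(ord(ch) | 0b00100000)

def to_uppercase(ch):
    """Clear bit d5 of a lowercase ASCII letter (subtracting 20h)."""
    return chr(ord(ch) & ~0b00100000)

print(to_lowercase('A'), to_uppercase('a'))   # a A
```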
There is a second Commodore code, called the screen display code, that
represents most of the characters found in XASCII code. However, the XASCII
control codes are omitted and extra graphics symbols are added.
These changes reflect the different usage of the screen display code. A pro-
gram places screen display character codes directly into the storage locations from
which the TV image data are obtained. With the Commodore 64's display of 25 lines
of 40 characters each, a 1000-byte storage area holds the entire screen image. The
screen display code is listed in Appendix B.
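A program finds the byte for a given character cell by simple arithmetic: row times 40 plus column, offset from the start of the storage area. The base address 0400h used below is the Commodore 64's default screen memory; the function name is illustrative.

```python
SCREEN_BASE = 0x0400        # the Commodore 64's default screen memory
COLUMNS, ROWS = 40, 25      # 25 lines of 40 characters = 1000 bytes

def screen_address(row, col):
    """Map a (row, column) cell to its byte in the screen image."""
    if not (0 <= row < ROWS and 0 <= col < COLUMNS):
        raise ValueError("off-screen position")
    return SCREEN_BASE + row * COLUMNS + col

print(hex(screen_address(0, 0)))     # 0x400: top-left corner
print(hex(screen_address(24, 39)))   # 0x7e7: bottom-right corner
```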
Actually, the screen display code is two codes collapsed into one. Depending
on the current configuration of the computer, one or the other subcode will be used.
So at a screen position holding the screen code value 1, either "A" or "a" can be
name tells us much more than do its individual characters; a name identifies a
human being. Similarly, an address composed of only letters and numbers helps us
locate one building among all the buildings in the world.
A group of smaller data units forming a single and meaningful larger unit is
called a data structure. Data structures are valuable because of the information
beyond that found in their elements. The first type of added information is an
overall interpretation assigned to the structure. Again, the interpretation "name" is
assigned to collections of characters to identify people. This means of adding infor-
mation should seem familiar since coding itself is the assignment of interpretation,
albeit at a lower level.
The second type of information added to a data structure is the format given
to the included data. The format implies relationships and the significance of in-
dividual data elements. For example, the data structure called a person's name is
usually in a three-part format, with each part being a single-word data structure. By
mixing the format order of the contained single-word data structures the meaning of
the entire name structure is changed. The name "Mason," as a last name, indicates
both the individual's kinship and his or her family roots in the profession of stone-
or brickwork. The same name in the first position normally implies that the person is
a male, a meaning absent in the last position. As a forename, "Mason" may also
refer to strong qualities observed or hoped for in the baby, or more likely in our
society, it may have been chosen for its sound.
So information can be expressed by assigning interpretations to numbers
through coding, and by assigning interpretations to groups of data and to their inter-
nal organizations.
As the "name" structure suggests, data organization need not stop at one data
structure level above the coded data elements. A data structure can be formed from
other data structures. For instance, the common mailing-address data structure is
made of the name data structure, the street-address data structure, the city, state-
names data structure, and the zip code data structure. The interpretation of this
structure as a mailing address is added information, as is the ordering of its member
data structures. At an even higher level, a company-mailing-list data structure may
consist of multiple mailing-address data structures, and so on.
Three levels of information representation are shown in Fig. 1.3. We view the
Figure 1.3
computer here as a mill that transforms numbers to perform tasks. The numbers
represent information and are organized into data structures to represent additional
levels of information. Arbitrarily complex information could be stored in data struc-
tures built of layers of lesser data structures. Thus we understand at an overview
level what the computer does and what it does it on.
Just knowing what the computer does is insufficient. We must also know how
it does it and why. To learn how to perform tasks with the computer, we must begin
by examining the types of tasks that a computer can perform. We know that they
must involve information. What are their other characteristics?
The two major types of tasks performable by computer are those that require the
simple selection of action, sometimes called switching tasks, and those that require
the transformation of information, called dataflow tasks. Switching tasks are com-
mon in communications systems, where a microprocessor acts as a "traffic cop" and
routes data from one place to another without transforming them in any way. Pure
switching tasks are seldom encountered by the personal-computer programmer, so
are not considered further in this text.
Data flow tasks are tasks that can be performed by transforming one or more
known data elements or data structures into one or more new data elements or data
structures of differing content or organization. Each input or output data element or
data structure can be viewed as a separate flow into or out of the data-
transformation process; hence the term "data flow task."
The computer's role in performing data flow tasks is similar to that of an elec-
trical transformer in manipulating energy. An electrical transformer receives one or
more electrical power flows of given voltages and transforms them into one or more
electrical power flows of different voltages. In an analogous manner, the computer,
under the control of a computer program, transforms input data flows into output
data flows.
Data flow tasks lend themselves to pictorial representation. One type of data
flow picture is called a dataflow diagram or DFD. DFDs, however, are not just pic-
tures; they are tools that help the programmer understand the problem and solve it.
DFDs represent data flows as arrows and data transformers as circles. The generic
DFD for a simple data flow task is shown in Fig. 1.4.
Figure 1.4 (data element or structure in → transformation → data element or structure out)
Figure 1.5 (first addend, second addend → "Add the numbers" → number sum)
A simple example of a data flow task is the addition of two numbers. The in-
put data flows are the two numbers being added, and the output data flow is the
sum.
Example:
Draw the DFD for a two-number addition problem.
The diagram should look as shown in Fig. 1.5.
The "Add the numbers" process has no effect on the structure of the input
data flows, since it produces an output flow of the same structure as the inputs.
However, the contents of the output flow are (in general) changed from the contents
of either input flow.
A transformation process that changes the structure of an input flow without
changing the contents of its individual elements is also easily imaginable. The grain
mill is an excellent physical example of this type of transformation. A grain mill
takes raw grain and changes its natural structure by pulverizing it. All the grain's
components remain, but they are reorganized. The same thing can be done with data
structures by retaining their component parts while transforming the shape of the
structure holding them.
Each transformation task can generally be broken into a group of less am-
bitious transformation subtasks. Each subtask transformation acts on portions of
the input data flows or on data flows produced by other such transformations.
Example:
Show how the transformation process of a hypothetical data flow task might be sub-
divided.
Your diagram should look something like Fig. 1.6.
Figure 1.6 (the transformation subdivided, with internal data flows "B" and "C")
Census information is not the only system whose entropy naturally increases
with time. There is a law of nature called the "second law of thermodynamics," or
simply the "law of entropy," which says that the entropy of any undisturbed system
will always remain the same or increase with time.
The only way to combat the law of entropy is to disturb the system from the
outside and reorder it. The army of clerks from headquarters served this purpose
with the system of census information. The computer can serve this purpose with
any system of information. It can either locate the causes of disorder and negate
them, or it can discover the natural groupings of the system's parts and restore
them. In either case it is able, locally, to reverse the law of nature and increase the
usefulness of the information.
Suppose that you would like to send information over the telephone from your
computer to another computer. The information will probably be ASCII encoded.
However, phone lines tend to have stray electrical signals on them which can mix
with the data being sent and confuse the receiving computer. Your computer sends
an A, and the receiving computer reads it as a C, a difference of only 1 bit in the
ASCII code. The message has become the victim of entropy. But the computers can
negate this cause of disorder. The sending computer adds a data flow of parity infor-
mation to the data flow of ASCII information. This additional information is
placed on the d7 bit of the ASCII bytes, as explained earlier. The receiving computer
checks the ASCII data against the parity data. If any part of the ASCII data has
been changed, it is rejected and the sender is asked to repeat the ASCII data. This
process continues until the entire ASCII flow has been correctly received. Thus the
computer uses additional information to counteract the effects of entropy in a
system and to transform a disordered ASCII data flow into an ordered and more
useful one.
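Assuming even parity on the d7 bit, the sender's and receiver's roles can be sketched as follows; the function names are illustrative, and real links may use other parity conventions:

```python
def add_parity(byte7):
    """Set d7 so the whole byte carries an even number of 1 bits."""
    ones = bin(byte7 & 0x7F).count("1")
    return (byte7 & 0x7F) | ((ones % 2) << 7)

def parity_ok(byte):
    """Receiver's check: a single corrupted bit makes the 1-count odd."""
    return bin(byte).count("1") % 2 == 0

sent = add_parity(ord('A'))            # 'A' = 100 0001, two 1 bits already
corrupted = sent ^ 0b00000010          # one stray bit flipped in transit
print(parity_ok(sent), parity_ok(corrupted))   # True False
```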
Computers can go one step further and increase the order of existing informa-
tion. As we learned earlier, the organization of a data structure implies information
about the nature and interrelationships of its data elements. A computer can often
transform the structure of incoming data flows to produce outgoing higher-order in-
formation. This is a mechanization of the learning process. For instance, a computer
can take the results of the population census and sort it by region. The resulting data
structures contain only the original census reports, but their partitioning increases
the order of the entire system. With this additional order it is easy to count the
number of respondents in each region, to catalog their characteristics by region, and
to obtain other information that was too difficult to find in the original overall sam-
ple. The computer has in a sense "learned" of new relationships among the data and
"remembered" them as new data structures.
There are game programs that watch for a pattern in their human opponent's
moves over several games. They treat this pattern information as a data flow and
direct the computer to apply it to the data flow representing the best theoretical
moves to create a high-probability guess of the human being's next moves. Clearly,
this also is higher-order information.
Learning always decreases redundancy and the unprofitable repetition of
work. Looking through a stack of census forms every time you want a single fact is
redundant, as is wasting an opportunity for attack by defending against a move that
the opponent will not make.
All information tasks are performed by increasing the order of existing infor-
mation, since information cannot be created from nothing. By consciously identify-
ing the sources of entropy in an information system, and by considering methods for
increasing its order, a task affecting that system can sometimes be performed more
quickly or elegantly. At the least, knowing that entropy is at the heart of performing
all useful tasks provides a deeper understanding of the program-planning process to
be discussed in Chapters 3 and 4.
Thus entropy is the economy that justifies computers. At last we are ready to
explore your electronic computer in depth. What does the blueprint for your infor-
mation mill look like?
THE COMPUTER
To answer this question, we begin by drawing the information mill with its natural
surroundings in the computer. Those surroundings are the grist storage area and the
mill support systems for transportation and communications. Figure 1.7 illustrates
the two major areas in the computer and how they are connected by the support
systems. The transportation system carries grist back and forth between the mill and
storage under the coordination of the communications system.
For the rest of this chapter we will explore the elements in Fig. 1.7. Do not be
concerned about the specific means for controlling the computer; that is the realm of
assembly language, discussed in Chapter 3. For now, your goal should be to get a
feeling for what is inside the Commodore 64 and how those things work together.
The information mill portion of the computer subdivides into two parts. These
two components correspond to the two operating functions of the mill: number
transformation (i.e., performing the arithmetic and logical operations of the
numeric codes discussion) and coordination with the mill world through a com-
munications system. Reasonably enough, the number transformation section is
called the arithmetic-logic unit or ALU. The coordinating section is named the con-
trol unit or CU. In this terminology, the information mill as a whole is sometimes
called the central processing unit or CPU.
Figure 1.7 (the information mill and grist storage, joined by the transportation system carrying data and the communications system carrying control and response)
The Computer 35
Figure 1.8
plies the second input in most cases, and the result goes back into the accumulator.
The process is shown in Fig. 1.9.
This is a hallmark of the 6510: Its organization revolves around an ac-
cumulator which provides an input to the ALU and receives the ALU's output. In
other words, the 6510 is what is called an accumulator-based processor. One typical
6510 instruction says: "Logically AND the contents of the accumulator with the
contents of memory map address z, and deposit the results in the accumulator" (z is
specified in actual use, of course). In Fig. 1.9, the ALU internals are represented by
arrowheads to illustrate the two data inputs and one data output.
Figure 1.9
The remaining two 6510 data registers are called index registers and are labeled
X and Y. They too can buffer data, and in a more limited way, serve as ALU inputs.
Their main use, however, is related to addressing, a topic that will be discussed later.
The three data registers all connect with the ALU (Fig. 1.10), so they are tradition-
ally considered as part of the ALU section of the microprocessor.
There is a special-purpose register that is also closely associated with the ALU.
Recall that under program control, the CPU can look at the results of previous
operations and make program decisions based on them. These results represent the
status of the CPU due to its recent history. A special-purpose register maintains
records of these results and is therefore called the processor status register, labeled P.
The processor status register keeps a record of important arithmetic outcomes
such as zero or negative results from subtractions and overflows from 2C additions
by setting flags or register bits. The flags can be tested and reacted to by the CPU
under program control. The processor status register is more commonly called the
flag register because it flags specific types of CPU operation results with set bits.
Other flags in the flag register show if the CPU is currently set up for BCD or
binary arithmetic, if there has been a carry from an arithmetic operation, and if a
Figure 1.10
break or interrupt, two means that we have not yet encountered for changing the
flow of program execution, has occurred.
The first four registers that we have discussed-the accumulator, the X and Y
registers, and the status register-all connect to the data bus and the ALU. The next
three registers connect to the address bus and the ALU, and hence are called address
registers.
Address registers place their contents on the address bus, but never accept data
from it. Two of them can be grouped together to hold the 16-bit addresses used to
select memory map locations. The third register is used by itself in a special way that
will be explained shortly.
If you recall, one of the greatest breakthroughs in the development of com-
puters came from placing the program into memory. This was made possible by the
coding of program operations, so the operations and their operands could then be
loaded into memory. To execute the program, the CPU need only point to the loca-
tion of the next byte of the program and fetch its contents to the mill. If the byte
represents an operand it will be milled, and if it represents an operation it will be in-
terpreted and executed.
This pointing to a memory map location is what we call addressing. The ad-
dress located in the address registers is placed on the address bus and sent to
memory. Fractions of a millionth of a second later, a "write" or "read" command
is sent over the control bus to select and activate the addressed memory location for
data transfer in or out over the data bus.
The two general-purpose address registers are grouped as a single 16-bit
register called the program counter or PC. The upper 8 bits of an address are held in
the PC High or PCH register. The lower 8-bit register is called PC Low, or PCL.
The PC is an incrementing register; that is, it is a register that automatically adds 1
to its own contents under certain circumstances. To simplify our explanation it is
also preferable to think of the X and Y registers as being able to both increment and
decrement their own contents. Increments and decrements are the only
mathematical operations that can be performed directly on the contents of the X and
Y registers.
The PC addresses each byte in a program as the program executes. After each
program byte is fetched, the PC is incremented or increased by 1. In this way the PC
counts its way up through the memory locations holding the program.
Before the contents of the PC exit the CPU, they are buffered by another two
"invisible" registers. As with the previous invisible register, these cannot be directly
affected by the programmer. We mention them because they allow addresses from
sources other than the PC to be placed on the address bus. These two registers are
labeled HIBUF and LOBUF in Fig. 1.11.
Figure 1.11 (PCL and PCH feeding the LOBUF and HIBUF buffers onto the address bus)
on the rack from the front (pushing back the old milk). You come shopping and
want some milk. You pick up the milk carton at the front of the rack, and since it
was placed there most recently, you get nice fresh milk. This grocer always fills the
rack from the front, so the milk at the back of the rack stays so long that it curdles.
Now, no grocer is going to be so careless as to put the fresh milk up front, but
this fantasy demonstrates a principle. The latest milk carton placed on the rack by
the grocer is always the first milk to be removed. Think of the rack of milk as a
horizontal "stack" of milk. We can say that "the last milk in the stack is always the
first milk out."
Why not do the same thing with data? Suppose that in the course of some
calculation in a program, you need to store several numbers for later use. If you set
up a "data stack," like the milk stack, the numbers can simply be "pushed in at the
front" as they are generated, and "pulled off the front" in reverse order as needed.
One of the advantages of this type of data transfer is that only one address is
needed: the location of the top, or latest front position, of the stack. In practice, the
address of the top of the stack is kept in the third address register. Because it services
the stack, it is called the stack pointer register or simply the stack pointer.
To build a stack, a starting or bottom address must first be chosen and loaded
into the stack pointer. Since no data are in the stack, this address is also the top of
the stack. The first data byte is then placed into the accumulator. Finally, execution
of a 6510 instruction called a PUSH causes the accumulator contents to be placed
into the location whose address is in the stack pointer, and the address in the stack
pointer to be decremented by 1. Subsequent pushes force the stack pointer addresses
even lower, causing the stack to grow downward. This seems a little unusual to most
people, since it appears more natural that the stack would grow toward higher ad-
dresses. However, the stacks for most microprocessors grow the same way.
To retrieve the last byte pushed on the stack, the stack pointer must first be in-
cremented by 1 to make up for the decrement after the last push. Then the contents
of the location pointed to by the stack pointer can be loaded into the accumulator.
The 6510 instruction causing this operation is called a POP.
Since the stack pointer can address only locations 100h to 1FFh, the stack can
at most hold only 256 bytes. This, however, is usually more than enough. 1FFh is
frequently chosen as the bottom of the stack to allow for this maximum size, but it is
occasionally useful to set up more than one nonoverlapping stack, each with its own
bottom address in page-one memory. In that case, one must load the current top ad-
dress of each stack into the stack pointer before use, and save it before switching
stacks. The assembly language instructions needed for these two operations are
discussed in Chapter 3.
One use of the stack is to save the contents of the data registers prior to execu-
tion of a part of a program called a module or subroutine. Modules are also dis-
cussed in Chapter 3. The accumulator and flag registers are the only locations whose
contents can be placed on the stack or retrieved from it. The other registers and all
the memory locations must go through the accumulator to use the stack.
The stack register can be incremented or decremented by POP and PUSH in-
structions, and the PC register can be altered by the addition of two's-complement (2C) constants. HIBUF
and LOBUF can be altered by the addition of values from the X and Y registers. The
different ways of altering all these address registers provide the different addressing
methods or addressing modes that make the 6510 an extremely flexible
microprocessor. Addressing modes will be discussed again in Chapter 3. Since the
contents of the address registers can be altered with ALU operations, the address
registers, like the data registers, are associated with the ALU section of the CPU
(Fig. 1.12).
Figure 1.12 (diagram: the accumulator (A) and ALU with the address registers PCL, SP, and HIBUF placing addresses on the address bus)
Yet another type of register is physically located in the 6510 but is accessed by
the 6510 as if it were in the memory map. Two registers fit this category. The bits of
one register are connected with pins or wires exiting the 6510 microprocessor. The
other register determines the data direction of each bit in the first, that is, whether
each bit will be an input to the 6510 or an output. Thus one register is an input/out-
put or I/O register, and one is a data direction register or DDR.
These registers are used to change the types of locations in the memory map. As
we stated earlier, memory is not the only type of location in the memory map. The
Commodore 64 contains 64K of user memory alone, with many more locations of
other types available. Since the 6510 can address only 64K of memory map locations
at a time, using additional locations requires the ability to attach and detach them
from the CPU buses. Three bits of the I/O register perform this task. We could
think of them as a memory-map configuration selector.
Another 3 bits of the I/O register control and monitor the 64's optional data
storage cassette player. The last two bits are unassigned. This register is the con-
figuration register, labeled CR in Fig. 1.13. Since this register pair can move data in
or out of the CPU, it is jointly called the bidirectional I/O port. You can think of
the bidirectional I/O port as being associated with the ALU since it can be loaded
from and stored into the accumulator, but the relationship is more tenuous than it is
for any of the other registers.
The last type of register is connected directly to the CU and the data bus and
indirectly to the control bus. There is only one register of this type and it is called the
instruction register, labeled IR in Fig. 1.13.
A program is composed of instructions built of one operation byte and zero to
two operand (address or constant data) bytes each. Every time an operation byte is
brought into the CPU, the data bus deposits it in the instruction register.
The instruction register is attached directly to a control unit (CU) circuit called
the decoder. The decoder converts the operation byte into explicit directions that
select the CPU registers, timing, and actions needed to execute the instruction. The
control bus then delivers any directions affecting the mill world or memory map
around the CPU.
Each operation byte specifically informs the CPU of the number of following
operand bytes, so the CPU knows to look for the next operation byte after the last
operand in the present instruction. With the additional constraint that a program
must always start with an operation byte, the CPU always knows if it has retrieved
an operation or an operand. The mystery of how the CPU understands an instruc-
tion is solved.
This completes our discussion of the floor plan, or structure, of the CPU in
your computer. Now let's fully expand the CPU side of the mill world first seen in
Fig. 1.7.
Figure 1.13 (diagram: the accumulator (A), ALU, and address registers (including HIBUF) on the address bus, plus the bidirectional I/O port: the DDR and the configuration register CR, whose low bits form the memory configuration select)
As we have said repeatedly, memory is only one part of the memory map. There is a
second major type of memory map location, called I/O.
Memory locations come in two varieties. One type allows the CPU to both
read from and write to it, while the other allows the CPU only to read its contents.
The first variety of memory is abundant in the Commodore 64. The "64" in "Com-
modore 64" refers to the 64K of this type of memory in the computer. The most
logical name for this memory would be "read-write memory," but the historical
name has been random access memory or RAM. This name arose when certain types
of memory could only be read or written in sequential order, so memory with loca-
tions accessible in any order (randomly) was dubbed RAM. RAM must have power
applied to retain its contents. When the computer's power is turned off, any data in
RAM are lost.
The second variety is more descriptively named read-only memory or ROM.
Although the CPU cannot write data to ROM, it can read ROM randomly in the
same way that it reads RAM. Thus both RAM and ROM are random-access
memories, making the name RAM a particularly poor choice. The CPU addresses
RAM and ROM the same way, which leaves it up to the programmer to make sure
that a program's use of the two types of locations makes sense. The contents of
ROM are retained whether or not power is applied. Any programs or subprograms
needed to run the computer have been placed in 20K of ROM in the Commodore 64.
Included in the ROM are the BASIC programming language and the operating
system.
The Fig. 1.8 mill-world drawing shows the outside world connected to the
memory map. This connection is made through the second major type of memory
map location, the I/O port.
There are read-only, write-only, and read-write I/O locations. All I/O ports
connect to CPU buses and to devices or circuits outside the CPU and the memory
map. They buffer data from the CPU and pass it on to an outside device, or in
reverse, they buffer the data from the outside device and hold it for the CPU to
read. These locations are called "ports" because data enter and exit the computer
through them. The internal 6510 I/O register is a port of sorts, since it connects the
CPU with other nonmemory circuitry.
Data are transferred directly through I/O ports to and from large-capacity
storage devices such as disk drives and cassette recorders. Other peripherals, such as
printers, game joysticks and paddles, light pens, and talking data-to-voice con-
verters, communicate through the I/O ports. The modem, a device that connects
computers by telephone, attaches to the computer through an I/O port. Com-
modore markets most of these peripherals, and we discuss them at greater length in
Chapter 5.
Although the just-mentioned peripherals all accept bit-pattern or digital data,
there are other peripherals, such as TV and audio equipment, that require signals in
"wave" or analog form. The circuitry that provides these analog signals is con-
nected to and packaged with several of the I/O ports.
Almost all of the electronic circuits in the Commodore 64 come grouped on in-
tegrated circuits, called chips for the pieces of silicon on which they are placed.
There are eight 1-bit-wide 64K-deep RAM chips in the 64, which together form the
8-bit-wide 64K-deep RAM memory, and three 8-bit-wide ROM chips: One 8K-deep
ROM holds the BASIC programming language, another 8K-deep ROM holds the
operating system, and the third 4K-deep ROM holds codes for the visual appearance
of characters. The computer uses the latter code values to display characters on the
TV.
Many of the chips containing I/O ports also contain circuits to convert be-
tween the port data and the signals needed by outside devices like the TV. The design
of this part of the 64 is simple and elegant. There are only four chips, two identical,
controlling all of the aforementioned peripherals. The chips handling video and
audio have on-board I/O ports and analog outputs; data written into these ports
control the analog output of the chips. Thus digital data produced by a program
control both analog and digital devices through the I/O ports. The graphics and
audio chips and the supporting principles of the visual and aural arts are discussed in
Chapters 6 and 7.
Summarizing our discussion of the physical contents of your computer, we
note that it contains three major types of chips: memory, I/O, and the
microprocessor.
We can now expand the left or memory map side of the mill world diagram in
Fig. 1.8. The "conversion" process in Fig. 1.8 is handled in Fig. 1.14 by the I/O
chips on which the I/O ports reside. All three buses from the CPU affect all devices
in the memory map.
The great founding computer principles of number codes, internal program
storage, and a central information mill with a surrounding world of storage,
transportation, and communications are now in your hands. Domination is next.
Programmers aren't built in a day.
Exercise:
Answer the following questions to review the internal structure of the C64.
(bb) What are the three CPU buses?
(cc) What are the two sections of the CPU?
(dd) What are the three sections of the memory map?
(ee) Name the CPU data registers.
Figure 1.14 (diagram: the memory map, holding main memory (RAM and ROM) and the I/O chips with their I/O circuits and ports, served by the data, address, and control buses; digital data connect the I/O ports to the joystick, game paddle, light pen, printer, modem, network, disk, voice synthesizer, and data cassette in the outside world, while analog data connect an I/O port to the television and audio equipment)
fjord-tide over
shards
now mirrors
Mercury, or "quicksilver," is the densest element that is both liquid and stable.
Disturb a little of it and you move a lot of mass. The designer of a mass pump might
well choose mercury as its mass-transporting fluid.
Now imagine a similar fluid that carries dense "conceptual mass." It dissolves
information and then compactly transports it. In short, it is a "conceptual
quicksilver." The quicksilver is the data flow, and it consists of information
"molecules" called data structures. As was explained in Chapter 1, data structures
are highly organized groupings of coded information.
In this analogy the CPU is a transmutation device that includes a pump to cir-
Modeling 53
has never yet played in a softball game; he always sits on the bench. Can we
categorize him as a softball player? On the other hand, can we afford not to, given
that he may play tomorrow?
The team member's role is actually well defined. Our difficulty comes from
placing him in a category that is too simple to encompass all the relevant informa-
tion about him; primarily, it omits the fact that he's a reserve player.
Problems with identifying the information needed to perform a task often
result from such attempts at oversimplification. Problems can also result from
"overcomplexifying": including so much specialized information that the resulting
category applies to only one ridiculously restricted situation. For instance, we could
place the softball-team member in the category "overweight but enthusiastic used-
auto dealer and sometimes Joe DiMaggio fan who has never played ball on the field
but who has played catch five times and swung a bat twice, whose name is Fred
Bazooka, whose wife's name is Dotty, who has three kids named Joe, Annie, and
Marvin, and a sister named Griselda, which means 'gray battle-maiden' in old Ger-
man, and the name really fits" (Fig. 2.2).
The latest category includes much more information about its subject than the
ballplayer category. However, a lot of it is inapplicable to the question we are really
interested in: Fred's status on the team. Further, no one but Fred will ever fit in this
category, so we cannot collect parallel information about the other players on the
team and come to any conclusions about Fred's role on the team.
The type of category that is most useful is one that is simple and broad enough
to be applied to other similar objects (e.g., team members) but detailed and complex
enough to contain all information needed to answer the questions we are interested
in. If that category is then made into a data structure and stored on a computer, a
program can ask those questions for us about many different subjects. An ap-
propriate category for our softball-team member might be "team-member status."
This category could contain the following information on any team member: name
(Fred), whether the member is reserve or active (Fred is reserve), how many games
the member has played this year (Fred has played 0), how many games the member
has ever played (again, 0 for Fred), playing position, batting average, home runs in
each game, and so on. A program handling the team-member status data could com-
pile team and even league statistics, as well as tell us where Fred stands relative to the
other team members.
The first step in performing an information-handling task, then, is to identify
the information that is required to produce the answers we want. This is the crux
of the first stage of modeling: selecting the smallest amount of information from
which the desired results can be derived. This narrowing down is also
sometimes called abstraction.
How do you select the right subset of data to gain desired results? Sometimes
the choices are obvious, but sometimes you must just make a decision and proceed.
Fortunately, a poor choice can be abandoned and corrections such as adding or
deleting data made later. All programmers backtrack on occasion, and often more
than once on a given task. Overcoming any reluctance to improve a model is one of
the most important steps you can take toward fine craftsmanship in programming.
Data Structuring
Once you have a minimal set of data, you must decide on a structure in which the
data fit most naturally. We have a similar problem with structure that we had with
raw information. From the complex structure organizing all imaginable information
about a subject, we must identify the minimal structure necessary for performing the
desired information-handling task. A good criterion for this choice is efficient proc-
essing. Whichever structure makes the processing of that data by a program most
efficient, whether the standard of efficiency is program size, speed, or whatever,
should be used. This is a question you will be better equipped to answer once we
have discussed assembly language and program planning.
Example:
Walk through an everyday example of the modeling process.
The models most of us are most familiar with are the airplanes, boats, and so on
that we grew up with. The two stages of modeling are used by their designers.
First, designers select a subset of the total information about their real-world sub-
ject. Most model airplanes include a canopy, but few include a rotating shaft inside the
jet engine. It is economical for designers to include the minimum amount of informa-
tion necessary to convey "jet airplane." This is a kind of minimal-data choice.
Data Structure Types 55
Three basic data structures are all that are needed to organize any data for process-
ing. Called the sequence, the repetition, and the selection, they have the simplest
structural forms known.
Data structures and individual data items can be defined in a data dictionary.
Data definitions use a simple notation based on data names and symbols for the
three ways of organizing them. This notation resembles equations, except that the
name of the data structure being defined is placed on the left of the equals sign, and
the names of the data elements (data structures or data items) contained in the struc-
ture, and the symbols for the structure organizing them, are placed on the right. The
symbol for each type of data structure is discussed in that structure's section below.
Sequences
In an actual definition each label would be replaced with the descriptive name of the
data.
An individual sequence structure is called a record. Each data element in the
record is called a field.
The sequential data structure can be represented pictorially as shown in Fig.
2.3.
Figure 2.3 Sequential data structure (diagram: fields at consecutive addresses n, n + i, n + i + j, and so on; field x is 'm' bytes long).
Example:
In the record 'mailing_address' we find the fields 'name', 'local_address', and
'zip_code'. A diagram of the complete mailing_address record (Fig. 2.4) shows the
order of its fields, starting from the left. The fields in 'mailing_address' would prob-
ably be composed of ASCII or XASCII bytes. The data dictionary definition of 'mail-
ing_address' is:

mailing_address = name + local_address + zip_code
If the length of each field and the record as a whole have been set at a prede-
fined length, the structure shown in Fig. 2.3 is all that is needed. However, if the
length of the fields can differ, each field must be followed by an end-of-field
marker. This marker is a byte value or sequence of values that never occurs as a data
value. So if the data are ASCII characters, the value 0 will never occur among them,
and it can be used as a marker. Any other noncharacter ASCII value could also be
used. An end-of-field marker tells a program where each field ends and the next field
begins.
Similarly, if the record length may vary, for instance as fields are added or as
fields are removed and the structure is compacted, the record must be followed by
an end-of-record marker having a different value than the end-of-field marker. The
end-of-record marker tells a program when it has reached the last field in the record.
Without predefined length or end-of-field and end-of-record markers, a pro-
gram would interpret all data starting with the first byte of the first field as a single,
endless field. Sooner or later the program would "crash."
Selections
There are two important types of selection data structures: the simple selection and
the set. We will discuss these separately.
We might assign the values 0, 1, 2, and 3 to each of the make identifiers listed
in the definition. Then by reading the value in the variable we would know the make
of the family car.
58 Conceptual Quicksilver: Data Structures Chap. 2
A common way of storing sets is to assign each Boolean variable to a single bit,
with enough total bytes reserved to contain the entire base set. A 1 value in a bit in-
dicates that the corresponding data element of the base set is present in the particular
subset. This structure is illustrated in Fig. 2.5 for a base set of 16 data elements.
          d7      d6      d5      d4      d3      d2      d1      d0
n      | El 8  | El 7  | El 6  | El 5  | El 4  | El 3  | El 2  | El 1  |
n + 1  | El 16 | El 15 | El 14 | El 13 | El 12 | El 11 | El 10 | El 9  |
Figure 2.5 Set data structure.
Example:
Develop a data structure that represents employee attendance during anyone week.
We will start by giving the data structure the descriptive name 'week's_attend-
ance'. The structure must identify the days that the employee worked during a 7-day
week. This can be done with a base set 'workweek' of seven Boolean variables of type
'workday', where a true value in a given variable indicates that the employee worked
that day.
The workday elements are 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Fri-
day', 'Saturday', and 'Sunday'. Since the base set contains just seven data elements, it
can be represented in 7 of the 8 bits of a single byte.
Assuming that d0 corresponds to 'Sunday' and that the remaining days are
assigned to bits in ascending order, the base set looks as shown in Fig. 2.6.

        d7   d6   d5   d4   d3   d2   d1   d0
        --   Sa   Fr   Th   We   Tu   Mo   Su      Figure 2.6

        d7   d6   d5   d4   d3   d2   d1   d0
         0    0    1    1    1    1    1    0      Figure 2.7
If the employee works a full 5-day shift one week, his or her attendance can be
shown with the week's_attendance subset shown in Fig. 2.7. The data dic-
tionary definition for the week's_attendance set is
Simple selections and sets do not require terminating markers, since they are
always of known and unvarying length.
A program examines a selection or set data structure and then selects what it
will do based on what values it finds. This processing pattern can be described as
"IF condition is TRUE, THEN do action 1; ELSE do action 2," where action 2 may
be another IF ... THEN ... ELSE action decision, and so on until all data
elements have been tested for and handled. We will see how this works when we
discuss assembly language in Chapter 3.
Repetitions
A repetition data structure is composed of repetitions of the same type of data ele-
ment. The name "repetition" also reflects the repetition of actions performed to
process all the data elements in such a structure.
The symbol for the repetition of a data element is enclosing braces preceded by
the lowest possible number of repetitions and followed by the highest possible
number of repetitions. Omitting the higher number means that there is no upper
limit to the number of repetitions (except, of course, the practical limit imposed by
the physical capabilities of the computer used). Thus the data dictionary definition
for a repetition data structure containing between n and m repetitions of a data ele-
ment takes the form

structure_name = n{data element}m
Note that preceding the braces with 0 and following them with 1 means that the
enclosed data element is optional. An additional symbol that can be used in any
structure definition encloses a written description of a lowest-level data item. The
symbol consists of parentheses enclosing asterisks enclosing the description. Thus
the 'name' field, holding the recipient's name in the earlier mailing_address se-
quential data structure, could be written

name = 2{(*ASCII byte value*)}13

which defines the 'name' data structure as containing from 2 to 13 ASCII byte
values.
There are several variations on the repetition structure. The simplest one
places the data elements consecutively in memory. Other types of repetitions are
made by combining data elements with pointers.
A pointer is a value that points to the location of a data element. A pointer can
be the address of the data element, or it can be an address offset into a data area
starting at a particular address, or if the data element is one of many consecutive
elements having the same length, it can be the position number of the element in the
list of elements. We will call these three types of pointers address pointers, address-
offset pointers, and position pointers, and will see how they work as we discuss the
various advanced repetition structures.
The different types of repetition structures are discussed individually below.
n        Element 1
n + 1    Element 2
  .
  .
n + m    Element m + 1
Figure 2.8 File data structure.
marker. As with sequences, the end marker is a byte value or sequence of values that
never occurs in a data element.
Either a number-of-elements value or an EOF marker will define the end of a
file structure. Which one you choose depends on which matches the intended use of
the file best (i.e., which leads to the simplest and most efficient program). However,
to make that choice you need programming knowledge that will be provided in the
next two chapters.
The next most common type of repetitive data structure is the stack, of which
one form was discussed in Chapter 1.
The general stack structure expands and contracts like the 6510 microprocessor's
built-in stack.
Stacks, sometimes called LIFOs or last in, first out data structures, are useful
for organizing any data that should be retrieved in the opposite order to their crea-
tion or storage. The stack structure can be represented pictorially as shown in Fig.
2.9.
n        Bottom-of-stack element
n + 1    BOS + 1 element
Figure 2.9 Stack data structure.
Queues. A relative of the LIFO is the structure called the FIFO or first in,
first out. FIFOs are also called queues, for their similarity to waiting lines such as
those at movie theaters and grocery store checkout counters.
When a customer waiting line is first formed it contains no people. As
customers enter the line it grows, and as customers are serviced it shortens. People
are serviced in the order in which they enter the line. There is little or no connection
between the rate at which people are added to the line and the rate at which they
leave it.
A FIFO is empty when first created. Data elements are placed in a FIFO and
removed from it in an asynchronous manner, that is, without requiring that data be
deposited and removed at the same time or rate. By definition, data are removed
from a queue in the order in which they are placed in it.
FIFOs are useful for data elements that are produced and processed in the
same order but not necessarily at the same rate. The Commodore 64 has a FIFO
called a type-ahead buffer, which accepts up to 10 characters from the keyboard
when the computer is busy running a program in BASIC. When the program is
finished, the computer's operating system reads the characters and acts on them.
A FIFO is constructed in two steps. First, a memory area of constant length is
set aside for the FIFO data structure. The larger the area, the greater the irregularity
that can be tolerated in both the input and output data rates. Of course, the average
rate of removing data from the queue must be greater than or equal to the average
rate of depositing data on the queue if data are not to be lost.
Second, two pointers are set up: one for the input location in the FIFO and one
for the removal location. These are usually either address pointers or address-offset
pointers giving the position offsets of the input and output locations from the first
location in the set-aside memory area for the FIFO (e.g., an offset pointer to address
8002h in a FIFO starting at address 8000h would have the value 02h).
In use, a FIFO is treated much like a circular slide tray, with the last location in
the FIFO connected to the first. Imagine that you are visiting a friend who receives a
package of developed slides in the mail. On the spur of the moment you both decide
to view them immediately. Your friend sets up a slide projector and places an empty
circular slide tray on the top. He inserts the first slide into the slot just in front of the
projector opening, and you press the button for the tray to advance and the slide to
drop in. To make this fiction resemble the operation of a FIFO most closely, we will
assume that as the slides are viewed, the slide projector ejects them into a pile on the
floor (Fig. 2.10).
As you are advancing the slides, your friend is placing them into the slots in
front of the projector opening in the order that they are to be viewed. As long as the
slides are being viewed and ejected at least as fast as they are placed in the tray, there
are no problems. Of course, you may have to wait for your friend to insert a slide if
you catch up with him. However, if your friend fills the tray while you are viewing a
particularly interesting shot, he must wait until you begin ejecting slides again before
Figure 2.10
he can enter them again. Otherwise, he will jam the projector by placing slides into
locations that already hold slides.
In a FIFO data structure, the location from which data will be removed cor-
responds to the slide tray slot over the projector opening. As each data element is
removed, the output location moves to the next memory address, just as the source
of the next output slide moves to the next tray slot. Similarly, input data items are
placed into consecutive memory locations just as new slides are placed into con-
secutive slots.
If the input and output locations are constantly changing, how can those loca-
tions be tracked? In a slide tray, the human user keeps track of the current input slot
by sight while the projector automatically tracks the current output slot. In a data
FIFO, the pointers keep track of the input and output locations for the program us-
ing that structure. In a data dictionary, the pointer data could be defined as follows:
Address
n
n + 1   <-- Output pointer
n + 2
n + 3   <-- Input pointer
  .
  .
n + m
Figure 2.11 FIFO data structure.
When the FIFO is first created, the input and output pointers both point to
location n. The FIFO is not activated until the first data element is placed into nand
the input pointer is incremented to point to n + 1. Removing a data element causes
the output pointer to be incremented, so that the output pointer effectively
"follows" the input pointer.
As long as the input pointer stays ahead of the output pointer without catching
up to or lapping it, data can be input and output without any delay. However, if the
input pointer catches up with the output pointer, it means that all the FIFO locations
have been filled with input data. Further input must be refused until FIFO locations
have been emptied by output operations. Alternatively, if the output pointer catches
up with the input pointer, that is, if the FIFO is empty, there can be no further out-
put until input data have been received by the FIFO.
Linked lists. Another type of data structure results from pairing a pointer
with each data element in a simple repetition, where the data elements are in some
logical order, such as alphabetical or numerical. If the pointer attached to each data
element points to the next element in the logical order, the resulting structure is
called a linked list. We will use position pointers in our linked lists, although other
types will do just as well. Such a pointer is defined in a data dictionary as follows:
Linked lists are used because they can change their length. Therefore, a linked
list needs an end-of-list marker to notify a program when it has reached the last data
element. This marker is an impossible or nil value placed in the last element's
pointer. If position pointers are used and the first data element is defined as being at
position 1, the value 0 will never be used and can be defined as the nil pointer. Fur-
ther, certain 6510 instructions that we will study in Chapter 3 make the value 0 par-
ticularly easy to detect and react to.
It is difficult to give a totally general representation of a linked list: The
physical order of the elements can vary from their logical order in any number of
ways. So we will show a specific linked list that illustrates the linked-list concept ade-
quately. The linked list shown in Fig. 2.12 contains four elements. Each element
contains a one-byte data element and a one-byte pointer. The element numbers
reflect their logical rather than physical order. The pointer numbers reflect the
physical position of the next element in the list (e.g., a pointer number of 3 points to
the third element in the list). In general, of course, the data elements can be longer
than one byte.
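The scheme of Fig. 2.12 — 1-based position pointers with 0 as the nil end-of-list marker — can be modeled in Python (our own illustrative sketch; the element contents are invented):

```python
NIL = 0   # position 1 is the first element, so 0 can serve as the nil pointer

# A linked list kept as an array of (data, pointer) pairs. The physical
# order differs from the logical order: each pointer gives the physical
# position of the logically next element.
elements = [
    ("A", 3),    # physically first, logically first; next is at position 3
    ("D", NIL),  # logically last: its pointer holds the nil marker
    ("B", 4),
    ("C", 2),
]

def traverse(elements, head=1):
    """Follow the pointers from the head, collecting data in logical order."""
    order = []
    pos = head
    while pos != NIL:
        data, nxt = elements[pos - 1]   # convert 1-based position to index
        order.append(data)
        pos = nxt
    return order
```

Traversing this list yields the logical order A, B, C, D even though the elements are stored in a different physical order.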
One advantage in having a pointer with each data element is the ease with
which the logical order can be changed. To insert a new element in the logical order,
for instance, the new element is first placed in an unoccupied physical space in the
list. Since a list usually fills a contiguous memory area, this would be the first ad-
dress following the last list element. Then the data element's proper logical position
66 Conceptual Quicksilver: Data Structures Chap. 2
[Fig. 2.12: a four-element linked list; each element occupies two bytes at
addresses n, n+1, n+2, ... — a one-byte data element followed by a one-byte
position pointer (e.g., pointer 1 = 3, pointing to the third element).]
in the list is located. Finally, the affected element pointers are changed to place the
element in the proper logical position. The stages of this process are illustrated in the
following example.
Example:
There is a linked list of names in alphabetical order. Initially, the list contains only the
names 'Franklin' and 'Smith'. Insert the name 'Jones' by changing the name pointers.
As shown in Fig. 2.13, the JONES element is added to the list by transferring the
value in the FRANKLIN pointer (i.e., 2 if position pointers are used) into the JONES
pointer so that JONES will point to SMITH. Then the FRANKLIN pointer is loaded
with the value of the position of the JONES element in the list (i.e., 3) so that
FRANKLIN will point to JONES. The order of the list is then FRANKLIN, JONES,
SMITH. Reversing this process, the name JONES can be deleted by moving the value in
the JONES pointer into the FRANKLIN pointer. Then FRANKLIN points to the ele-
ment that previously followed JONES, and there are no pointers to JONES, so it has ef-
fectively been deleted.
[Figure 2.13: (a) before insertion, the FRANKLIN element points to SMITH;
(b) after insertion, FRANKLIN points to JONES and JONES points to SMITH.]
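The two pointer transfers in this example can be traced in a Python sketch (illustrative only; the helper names are our own):

```python
NIL = 0   # positions are 1-based, so pointer value 0 is the nil marker

# Before insertion: FRANKLIN (position 1) -> SMITH (position 2).
names    = ["FRANKLIN", "SMITH"]
pointers = [2, NIL]

def insert_after(names, pointers, pred_pos, new_name):
    """Append new_name physically, then splice it in logically after pred_pos."""
    names.append(new_name)
    new_pos = len(names)                     # first free physical slot
    pointers.append(pointers[pred_pos - 1])  # new element points where pred did
    pointers[pred_pos - 1] = new_pos         # predecessor now points to it
    return new_pos

def delete_after(names, pointers, pred_pos):
    """Unlink the element that logically follows pred_pos."""
    gone = pointers[pred_pos - 1]
    pointers[pred_pos - 1] = pointers[gone - 1]

insert_after(names, pointers, 1, "JONES")    # FRANKLIN -> JONES -> SMITH
```

After the insertion, FRANKLIN's pointer holds 3 (the physical position of JONES) and JONES's pointer holds 2 (SMITH); deleting simply copies the JONES pointer back into the FRANKLIN pointer.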
Data Structure Types 67
If the pointer of the last element in the list points to the first element in the list,
the structure is called a ring. By definition, a ring has no end-of-list marker. In a
ring, data elements earlier in the logical order can be reached from elements later in
the order by continued forward movement through the list via the pointers.
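The ring and the forward movement it permits might be sketched as follows (illustrative Python; positions are 1-based as before, and the names are our own):

```python
# In a ring the pointer of the last element leads back to the first, so
# elements earlier in the logical order can be reached by continued
# forward movement through the pointers.
ring = [("A", 2), ("B", 3), ("C", 1)]   # C's pointer leads back to A

def step_forward(ring, pos, steps):
    """Advance through the ring by following each element's pointer."""
    for _ in range(steps):
        pos = ring[pos - 1][1]
    return pos
```

Starting at C (position 3), a single forward step reaches A — an element earlier in the logical order — because there is no end-of-list marker to stop at.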
Another variation on the linked list is made by attaching two pointers to each
data element, one pointing to the following element and one pointing to the
preceding one. The additional effort needed to process two pointers in operations
such as insertions and deletions may be offset by the advantage of being able to back
up in the list while searching through it.
[Figure 2.14: a hierarchical (tree) data structure, with the root data element
at the top and dependent elements linked beneath it.]
Example:
Show three generations of an imaginary family's lineage in a family tree.
In this example each element (i.e., name) is dependent on the names of the im-
mediate ancestors. These relationships are the most direct and important to a family
tree, so elements so related are linked directly (Fig. 2.15). Indirect relationships such as
cousin or uncle can be found by tracing through multiple elements and applying rules
concerning relative levels in the tree.
Figure 2.15
The parts of a hierarchical data structure are named according to the tree
metaphor. As you have seen, the top data element is called the root. The lowest data
elements are called leaves, and data elements on intermediate levels are called
branches.
Only downward pointers are needed in a hierarchy. Each data element has one
pointer for each attached element on the level beneath it. Even though different data
elements can have different numbers of branches downward, to simplify processing
the data structure, all elements should be given the same number of pointers. Un-
used pointers in a given element, or the pointers for an element on the bottom level
of the hierarchy, can be given a nil value. The number of pointers should equal the
maximum number of branches used by any element in the hierarchy. The data dic-
tionary definition for a data element in a tree that allows up to four child data
elements per parent element is as follows:
tree_element_name = data_structure_name + pointer_1_name +
                    pointer_2_name + pointer_3_name +
                    pointer_4_name
with the pointers defined elsewhere in the data dictionary in the usual manner. This
structure is shown pictorially in Fig. 2.16.
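One way to model such fixed-pointer tree elements is sketched below in Python (illustrative only; the element layout mirrors the data dictionary definition above, and the contents are invented):

```python
NIL = 0   # nil pointer: position 0 is never used

# Each tree element pairs its data with exactly four child pointers
# (1-based positions into the same array); unused pointers hold nil.
tree = [
    ("root",   [3, 2, NIL, NIL]),     # position 1
    ("branch", [4, NIL, NIL, NIL]),   # position 2
    ("leaf-A", [NIL, NIL, NIL, NIL]), # position 3
    ("leaf-B", [NIL, NIL, NIL, NIL]), # position 4
]

def leaves(tree, pos=1):
    """Collect the data of every element with no children beneath pos."""
    data, children = tree[pos - 1]
    kids = [c for c in children if c != NIL]
    if not kids:
        return [data]
    found = []
    for c in kids:
        found.extend(leaves(tree, c))
    return found
```

Giving every element the same four pointer fields wastes a little space in leaf elements but lets one routine process any element without special cases.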
A hierarchy breaks data into categories according to which branch they belong
to. Data whose categories and therefore branches can be calculated beforehand are
the most natural candidates for placement in a hierarchical structure.
The Virtue of Simplicity 69
Example:
Identify an everyday collection of information that can be organized hierarchically.
A cookbook contains information of this type. The main branches in a cookbook
hierarchy might be to national groupings such as 'Mexican' and 'French'. The next-
lower branches could be to main ingredients such as 'Fish', 'Poultry', 'Beans', and so
on. Finding a recipe would be a matter of traveling down a limited number of category
branches, instead of searching linearly through an unpredictable number of elements in
a file structure.
Example:
Develop a data structure for storing data records for the members of an automobile
club.
A logical way to organize member information is by the type of car owned. A
hierarchical data structure whose top element is 'club membership' can be broken into
'auto type' data elements underneath. These can then be broken into the individual
member records. With this structure, finding the names of all the members owning
RX-7's, for instance, is simply a matter of retrieving all the names in the RX-7 branch.
For data that can easily be categorized, especially if such data are in large
amounts, hierarchical organization can allow using the fastest and most efficient ac-
cessing methods in your programs. Indeed, a whole class of programs called data
base managers, which are used to catalog and manipulate data flexibly, favor two
structures for data storage. One of those structures is the relational data base; its
study is outside the scope of this book. The other structure is the hierarchy.
More complex data structures can be built by combining the simple
structures we have discussed. Data dictionary definitions for such structures can be
built from combinations of the basic symbols for the simpler structures being com-
bined.
Example:
Show the data dictionary definition for a structure built by combining the simple struc-
tures.
The zip_code field in the mailing_address sequential data structure has a com-
bination structure of this type. It consists of a selection of two possible repetition struc-
tures, as seen in the following definition:
which means that zip_code is composed of either five or nine repetitions of an ASCII
digit. The choice allows for either five- or nine-digit zip codes in an address.
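As a sketch of the same rule, the selection-between-repetitions definition can be checked mechanically in Python (our own illustration; the function name is invented):

```python
# zip_code = a selection of either five or nine repetitions of an ASCII digit.

def valid_zip_code(field):
    """True when the field is exactly five or nine ASCII digits."""
    # An explicit ASCII range is used rather than str.isdigit(), which
    # would also accept non-ASCII Unicode digits.
    return len(field) in (5, 9) and all("0" <= ch <= "9" for ch in field)
```

A five-digit code such as "12345" and a nine-digit code such as "123456789" both satisfy the definition; any other length, or any non-digit character, does not.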
After data have been structured in this way, program processes can be de-
signed to transform each basic structure or level of basic structures. This keeps to a
minimum the portion of an information-handling task being dealt with at any given
time. The simple regularity of the data structures discussed in this chapter leads
to simple and regular programs, which is one of the most important goals of the pro-
gramming craftsman.
Exercise:
Review what you have learned by answering the following questions.
(a) What are the two stages of modeling called, and what is done in each stage?
(b) Describe the characteristics of each of the three basic types of data structures.
(c) What is the difference between a LIFO and a FIFO?
(d) Think of a category of information whose data might best be stored in a hierarchy.
72 Into Its Brain: 6510 Assembly Language Chap. 3
COMPUTER PROGRAMS
A high-level language (HLL) translator cannot tailor its output to every context; at best it can
catalog a few major contexts and their best compromise translations, or worse, use
entirely safe but even less efficient "universal" translations for each instruction.
Most BASIC translators come under the latter category. Inefficient translation is the
main reason for the speed and size disadvantages of HLL programs. Further,
BASIC programs are kept as source code and executed under the direction of an in-
terpreter, which slows them even more.
The advantage of HLLs, of course, is that the programmer writes shorter pro-
grams at a higher logical level. Given the memory and speed limitations of personal
computers, however, the disadvantages of HLLs usually outweigh the advantages
for performing significant tasks.
The one-to-one conversion of written assembly language instructions into ob-
ject code instructions is performed by programs called assemblers. Since the CPU
operations used to perform any task are hand-picked by the programmer, a pro-
gram's efficiency is limited only by the programmer's knowledge and experience.
The necessary knowledge for efficient programming is provided in the next two
chapters. Gaining experience is up to you, but even an inexperienced programmer
can create far more efficient programs in assembly language than is possible with
HLLs.
Our next topic is the programming language aspect of assembly language. We
will learn how assembly language instructions are formatted in programs. Then we
will explore the instructions themselves and the CPU operations they represent. Next
we will see how instructions are combined to do useful work on the various types of
data structures. Finally, we will see how to improve and correct programs to obtain
the best possible program performance.
ASSEMBLY LANGUAGE
As you may recall from Chapter 2, each object code instruction consists of two
parts: an operation byte, which the CPU decodes to determine what it is to do, and
between zero and two operand bytes, which hold or point to any data on which the
CPU will perform the operation. An operand byte consisting of data for the opera-
tion to manipulate is called a data operand. An operand consisting of an address
pointer is called an address operand.
In assembly language, each CPU operation is represented by a three-letter ab-
breviation. Operands can be represented with programmer-assigned names or with
written numbers.
Operation abbreviations are called mnemonics, which literally means
"memory aids." For instance, there is a CPU operation that loads a data byte into
the accumulator. This operation has been given the mnemonic LDA, which means
LoaD the Accumulator. Every CPU operation has its own mnemonic, giving the
programmer complete control over the CPU.
As we said, operands can be represented with written-out numbers such as
"82," or with programmer-assigned descriptive names such as "count." Such
names are symbolic representations of a number. Like mnemonics, they make a pro-
gram easier to read, write, and correct. However, not all assemblers allow the pro-
grammer to define such symbols; a restricted type of assembler called a machine
language monitor does not, for instance. Even when using a machine language
monitor, however, your handwritten or typed-in copy of a program should use sym-
bols for understandability.
The Commodore assembler limits symbols to six alphanumeric characters in
length, where the first character must be a letter and the remaining characters may
be letters or numbers. Other assemblers may allow fewer or more characters.
The simplest type of address operand was described in Chapter 1. It consists of
the actual address of the data to be manipulated. Other types of address operands
allow for CPU changes to the address value before it is placed on the address bus.
Each different type of address operand is matched to a particular type of data struc-
ture. The careful use of these addressing methods is the first key to effective
assembly language programming.
The addressing method used with an address operand is indicated to the
assembler by accompanying the operand with assembler-defined emblems such as #
and ( ). For instance, # temp could be one such operand. We will postpone defining
these accompanying emblems until the addressing methods have been discussed.
Operands must have their number code defined to the assembler with accom-
panying emblems. This avoids any ambiguity with numbers such as 10, which could
otherwise be interpreted in binary, decimal, or hex code. Following are the number
code tokens recognized by the Commodore assembler:
None                   Decimal      37
Prefix $               Hexadecimal  $FA
Prefix %               Binary       %00010101
Enclosing apostrophes  ASCII        'try again'
Each assembly language instruction is written with
fields for the operation byte or opcode, the operand, a comment, and a symbolic
name or label for the address of the operation byte in the object code. The instruc-
tion format for the Commodore disk assembler appears as follows:
where the parentheses mean that the field is optional. The first address in the
machine-executable program is defined in the source code with a special directive.
Most assemblers require this assignment at the very beginning of the program, so
that all address operands and symbolic data can be given known address values dur-
ing assembly.
The opcode field holds a mnemonic. The operand is a data byte or an address
in numeric or symbolic form, with possible accompanying symbols to indicate ad-
dressing method or number base. Comments should be preceded by a semicolon for
clarity and to allow a comment to take a line by itself.
We can now form complete instructions using the mnemonic LDA, as the examples
in the following sections will show.
Addressing
The 6510 has two main methods for addressing data. In the first method, the CPU
obtains a pointer to data and uses it without modification to access data. The
pointer is usually an address operand. This method is called direct addressing.
In the second addressing method, the CPU obtains a pointer and adds to it a
byte called an index to get a final data address. This method's pointer is also usually
an address operand. This method is called indexed addressing. The data registers X
and Y are the two possible index sources.
These two addressing methods have very different strengths. Direct addressing
is best suited to accessing sequential data structures. Indexed addressing is best with
repetitive data structures. Both methods are useful with selective structures. The
reasons for these affinities will become apparent shortly.
Each addressing method allows for several addressing modes or submethods.
Modes allow the accessing method to be fine-tuned for the specific data structure be-
ing processed.
The addressing methods and modes will be discussed in terms of the data struc-
ture types they access best. Later in this chapter you will be using these modes in
combination with the CPU operations to perform small but interesting tasks on your
computer. The programming of larger tasks requires techniques that will be ex-
plained in Chapter 4.
Sequential data. Recall that a record is a collection of dissimilar field data
elements. For fast instruction execution and simplicity, data in small records should
be directly accessed with unaltered operands (i.e., by direct addressing).
The direct addressing modes are optimized for accessing two special cases of
the record structure: the single-byte constant and the small record of single-byte
variable fields. With larger records or larger variable fields the suitability of direct
versus indexed addressing will have to be judged on a case-by-case basis. You will be
ready to make such judgments by the end of this chapter.
Single-Byte Constants. By definition, a constant is a data value that never
changes. To a program, the main difference between constants and variables is that
constants can be placed in both changeable and unchangeable memory locations,
while variables can be placed only in changeable locations. Since a program's in-
struction bytes are unchanging, constants can be stored and accessed as operands,
whereas variables cannot. Variables can be stored in RAM or in registers and are ac-
cessed with unchanging pointers in the program instructions.
Placing a constant in an instruction saves program space and execution time.
Space is saved since there is no extra pointer to store. Time is saved since the fetching
of an address operand and its placement on the address bus is eliminated.
A constant data operand is called an immediate operand. An instruction con-
taining an immediate operand fills two bytes in the memory map. Graphically, this
appears as shown in Fig. 3.1.
Accessing data in an immediate operand is called immediate addressing. The
execution of an instruction using immediate addressing is as follows. The opcode is
fetched and decoded, the PC is incremented, and the immediate operand is retrieved
and processed. The PC is then incremented again to point to the next instruction.
Thus in this addressing mode the PC serves as the data pointer. Most of the other
direct modes use address operands as pointers.
Earlier we mentioned the emblems that indicate the addressing mode of an
[Figure 3.1: Immediate addressing — the opcode occupies address m in the
memory map, and the constant immediately follows it at address m + 1.]
operand. The emblem for the immediate addressing of an operand is a # as the first
character in the operand.
Example:
The 6510 operation called LDA has already been mentioned. One of its addressing
modes is immediate addressing. Show the full instruction for moving the constant value
1Dh into the accumulator, using immediate addressing. Then show the same instruction
with the value 1Dh represented by the symbol 'temp'.
Combining the LDA mnemonic with the immediate operand yields the instruc-
tion
LDA #$1D
(Recall that assembly language uses the $ sign to indicate the hex number base.)
Substituting the symbol 'temp' for the number in the instruction above produces
LDA #temp
The latter form illustrates how symbols can improve the understandability of assembly
language instructions.
In another direct mode, the operand is a full 16-bit address, called an absolute address,
that points to the data to be processed by the opcode. Graphically, this appears as shown
in Fig. 3.2.
Using an absolute address to access a data byte is called absolute addressing.
The execution of an instruction with an absolute address proceeds as follows. The
opcode is fetched and decoded, the PC is incremented, the low-order byte of the ad-
dress operand is retrieved and placed in LOBUF, the PC is incremented, the high-
order byte of the address operand is retrieved and placed in HIBUF, the address buf-
fer is placed on the address bus, and the data byte is retrieved and processed. The PC
is then incremented to point to the next instruction.
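The execution sequence just described can be modeled in Python (a simplified simulation of the concept, not the book's code; the memory array and the variable names standing in for LOBUF and HIBUF are our own, though $AD is the actual 6510 opcode for LDA absolute):

```python
def execute_lda_absolute(memory, pc):
    """Simulate LDA with an absolute (16-bit) address operand.

    Returns (loaded byte, new program counter). The opcode at address pc
    is assumed already fetched and decoded.
    """
    pc += 1
    lobuf = memory[pc]               # low-order operand byte -> LOBUF
    pc += 1
    hibuf = memory[pc]               # high-order operand byte -> HIBUF
    address = hibuf * 256 + lobuf    # address buffer onto the address bus
    data = memory[address]           # retrieve the data byte
    pc += 1                          # PC now points at the next instruction
    return data, pc
```

Note that the low-order address byte comes first in memory — the 6510's customary low/high byte order.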
No addressing mode emblems are used with absolute addresses, since the
assembler can determine the addressing mode from the context of the instruction.
Example:
Show the assembly language instruction for loading the accumulator with a data byte at
memory location 8D7Fh. Show the same instruction with the symbolic label 'text'
representing the data address.
Combining the LDA mnemonic with the absolute address yields the instruction
LDA $8D7F
LDA text
[Figure 3.2: Absolute addressing — the opcode at address m is followed by a
two-byte address operand (pointer low byte at m + 1, high byte at m + 2) that
holds the address of the data byte.]
When a data byte lies in page zero of memory (addresses 0000h through 00FFh),
the high-order byte of its address is always zero, so the address operand can be
shortened to the single low-order byte.
This shortens the instruction length by one byte and instruction execution time by
one byte retrieval. Execution speed is so improved that page-zero memory is often
treated as a bank of fast-access registers such as those in the 6510. Using an 8-bit ad-
dress to access a data byte is called zero-page addressing.
Example:
Show the instruction that loads the accumulator with a data byte stored in address 7Fh.
Combining the LDA mnemonic with the zero-page address produces the instruc-
tion
LDA $7F
By restricting the length of a variable record to one byte and by limiting its
placement to the internal registers of the 6510, the address pointer can be eliminated.
Instead, the opcode byte "implies" the registers to be accessed. When the 6510
decodes the opcode, it performs the specified operation on the data in the implied
registers. Hence this mode is called implied addressing. Only a few 6510 instructions
allow implied addressing. For those that do, instruction size and execution time are
reduced to an absolute minimum.
Pointers in the foregoing three addressing modes point to variable data. In yet
another addressing mode, the pointer contains the address of another pointer in
memory. With some microprocessors this secondary pointer holds the address of a
data byte. However, in the 6510 direct addressing method the secondary pointer
holds the address of a program instruction. This use of the secondary address will be
explained later in the discussion of the JUMP operation. In the 6510 both addresses
are 16 bits long. A pointer to a pointer is called an indirect address. The data ad-
dressing use of an indirect address is shown in Fig. 3.3.
The mode using indirect addresses is called indirect addressing. It can be
thought of as two consecutive executions of simple direct addressing: The address
operand is placed on the address bus to retrieve the secondary address through the
data bus, and the secondary address is placed on the address bus for the retrieval of
the data byte. Although the 6510 does not implement this process in its pure form, it
does support two indexed variations on it. We will consider them shortly.
[Table 3.1: direct addressing modes, arranged by data type (constant or
variable) and record length.]
[Figure 3.3: Indirect addressing — the address operand points to a secondary
pointer elsewhere in memory, and the secondary pointer holds the address of
the data byte.]
Table 3.1 summarizes the direct addressing modes available for accessing
records of constants and variables. An asterisk highlights the fastest executing ad-
dressing mode for each combination of data type and structure length.
Look over the table now as a review of the uses for the direct addressing
modes. Later, it and its counterpart in the repetition section will help you write the
most efficient form of an instruction for a given data structure.
Exercise:
Try answering the following questions.
(a) What is direct addressing?
(b) Describe the absolute addressing process.
(c) How do zero-page and absolute addressing differ?
(d) Describe the immediate addressing process.
since the bit values can be tested so much more quickly in the accumulator that the
time spent loading the bytes from memory is regained several times over.
Single-byte variable sets in memory should be accessed with an address
operand mode of direct addressing. Multiple-byte sets stored in memory should
usually be loaded into the accumulator with the indexed addressing modes, which
will be discussed in the next section. Sets stored as constant operands are of course
loaded with the immediate mode of direct addressing. Once set elements are in the
accumulator they can be accessed with the implied mode of direct addressing, that
is, with operations that assume the data object is in the accumulator.
A summary table for this section would simply list the direct and indexed ad-
dressing modes discussed in the bracketing sections, so it will be omitted.
Exercise:
Review this section by answering the following questions.
(e) In what location should set elements be placed to be manipulated?
(f) Under what circumstances should direct addressing be used on sets?
(g) Under what circumstances should indexed addressing be used on sets?
[Figure: absolute indexed addressing — the index register is added to the
16-bit base address in the operand to form the final data address.]
LDA $8D7F,Y
Assembly Language 83
If an array can be placed entirely within page zero memory, a zero-page ad-
dress operand can be used to save time and space. The zero-page indexed modes are
called zero-page X and zero-page Y, depending on the register source of the index.
Other than the length of the address operand, the zero-page indexed modes work
just like the absolute indexed modes.
Example:
Show the instruction for loading the accumulator with the value at the address produced
by adding the base address 7Fh with the index value in register X.
This instruction is
LDA $7F,X
There are two situations where absolute and zero-page indexing are insuffi-
cient. First, if the array is longer than 256 bytes, the byte-wide index combined with
a constant base address cannot access all array locations. Second, if multiple arrays
of the same type are being processed in the same way, the previous addressing modes
prevent using the same program instructions to access them all.
Both situations can be accommodated with a changeable base address. Both
absolute and zero-page addresses are unchangeable. However, the only un-
changeable address in the indirect mode is the pointer to the secondary pointer. The
secondary pointer resides in general memory, so it can be changed. There is an in-
dexed addressing mode that uses the secondary pointer as its base address, so that
both the index and the base address can be adjusted. Thus arrays larger than 256
bytes, and multiple arrays in different locations, can be accessed in a thoroughly
general manner.
The price for this flexibility is time. Retrieving an address operand, then a
secondary address, indexing, and finally retrieving a data byte is time consuming. In
the two situations listed above, however, the natural fit of this addressing mode to
the data probably saves more execution time for the overall program than it loses.
To save some of the time taken by indirect addressing, the primary pointer is
limited to 8 bits length. Thus the secondary pointer must reside in page zero
memory. A further limitation on this mode is that the index value can come only
from register Y. Appropriately, this mode is called the indirect indexed addressing
mode. It is illustrated in Fig. 3.5.
The emblem for an indirect operand is enclosing parentheses, as is shown in
the following example.
Example:
Show the instruction that loads the accumulator with the data byte pointed to by the
address in zero-page locations addr and addr + 1, indexed by the contents of reg-
ister Y.
This instruction is
LDA (addr),Y
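A Python model of this mode's address computation follows (a simplified sketch; cycle counts and page-crossing effects are ignored, and the function name is our own):

```python
def lda_indirect_indexed(memory, zp_operand, y):
    """Simulate LDA (addr),Y: fetch a 16-bit base address from two
    consecutive page-zero locations (low byte first), add the contents
    of register Y, and load the byte at the resulting address.
    """
    base = memory[zp_operand] + memory[(zp_operand + 1) & 0xFF] * 256
    return memory[(base + y) & 0xFFFF]
```

Because the secondary pointer lives in ordinary RAM, a program can rewrite it at run time — which is exactly what makes this mode suitable for arrays longer than 256 bytes and for multiple arrays sharing one routine.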
[Figure 3.5: Indirect indexed addressing — the one-byte operand at m + 1
points to a 16-bit secondary pointer in page zero; the contents of register Y
are added to that pointer to form the data address.]
An array of 16-bit address registers can be created with the last array-
addressing mode. The three previous indexed modes obtain a base address and add
an index. This fourth mode uses the index to obtain a base address. It is descriptively
called the indexed indirect mode.
Indexed indirect addressing starts with a zero-page address operand, which
points to an array of 16-bit addresses in page-zero memory. An index stored in the X
register is added to the zero-page pointer to form the final pointer into the array. If
the sum of the index and the original pointer is greater than 255, the resulting ad-
dress wraps around to the low end of page-zero memory. So if the sum would be
102h, the new pointer address becomes 2h, and so on. The final pointed-at location,
an address element in the array of addresses, is retrieved and used to access a data
byte. Indexed indirect addressing is illustrated in Fig. 3.6.
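The wraparound arithmetic can be sketched in Python (our own simplified model; the function name is invented):

```python
def lda_indexed_indirect(memory, zp_operand, x):
    """Simulate LDA (addr,X): add register X to the zero-page operand
    (wrapping within page zero), read the 16-bit pointer stored there
    (low byte first), and load the byte it addresses.
    """
    p = (zp_operand + x) & 0xFF   # sums past 255 wrap to the low end of page zero
    base = memory[p] + memory[(p + 1) & 0xFF] * 256
    return memory[base]
```

The `& 0xFF` mask reproduces the wraparound described above: a sum of 102h becomes a page-zero pointer address of 2h.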
Indexed indirect addressing allows many addresses and individual data
elements to be selected from easily. As such it may be more useful for accessing
fields in a record than for accessing repetition data structures. For instance, the
beginning addresses of each field in a record of different-size fields could be handled
with this mode.
Since indexing is used to access the address in page zero, it is unavailable to
step through elements in an eventual target array. However, the address in the page-
zero array can itself be incremented or decremented to step through a data array,
although with a large time penalty.
[Figure 3.6: Indexed indirect addressing — the sum of register X and the
zero-page operand selects a 16-bit pointer from an address array in page zero;
that pointer supplies the address of the data byte.]
Indexed addressing generally adds one tick of the system clock to an instruc-
tion's execution time. With instructions using a direct address mode averaging four
1-microsecond ticks in length, the penalty is a 25 percent slowdown. As we have
pointed out earlier, though, the alternative in using direct addressing is usually
worse. Of course, indexed instructions of a given base address mode can do the same
things as their nonindexed counterparts. After all, if the index equals 0, the base ad-
dress will be unchanged after indexing.
Stacks. Stack data are usually accessed with the 6510's built-in stack ad-
dressing mode, which can be considered a mode of indexed addressing. More
general stack structures can easily be built and accessed with any of the previous in-
dexed modes except for indexed indirect addressing.
In stack addressing, the index value increases or decreases by a constant value,
usually 1, as data are popped off or pushed on the stack, respectively. The main
qualitative difference between stack and array addressing is the ebb-and-flow chang-
ing of the index found in stack addressing.
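The ebb-and-flow index can be sketched in Python (an illustrative model; the class is our own, though the facts it mirrors — a one-page stack with an index that falls on pushes and rises on pops, as on the 6510, whose stack occupies page one — are from the 6510's design):

```python
class Stack:
    """A one-page stack with a 6510-style index (stack pointer)."""

    def __init__(self):
        self.page = [0] * 256
        self.sp = 0xFF            # index starts at the top of the page

    def push(self, byte):
        self.page[self.sp] = byte
        self.sp -= 1              # index decreases as data are pushed on

    def pop(self):
        self.sp += 1              # index increases as data are popped off
        return self.page[self.sp]
```

Unlike an array index that marches steadily forward, this index rises and falls as the stack is used, returning to its starting value whenever the stack empties.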
FIFOs, linked lists, and hierarchies can also be built using the general indexed
addressing modes. Most variations on these structures can easily be implemented by
following the accessing principles explained in Chapter 2.
Table 3.2 completes the addressing information begun in Table 3.1. This table
summarizes the main indexed addressing modes used to access each type of
repetition data structure.
[Table 3.2: indexed addressing modes by structure type and length; e.g., for
structures longer than 256 bytes, indirect indexed (starred as fastest) and
indexed indirect are the applicable modes.]
As before, an asterisk highlights the fastest executing addressing
mode for each combination of structure type and length. The word "indexed" has
been omitted from most mode titles, except that "indirect indexed" has been ab-
breviated "indir index," and "indexed indirect" has been abbreviated to "index in-
dir."
Exercise:
Test your retention of the material in this section by answering these questions.
(h) What is indexed addressing?
(i) Describe the absolute indexed addressing process.
(j) Describe the indirect indexed addressing process.
CPU Operations
The remaining half of an instruction, the operation byte, usually determines what is
done to addressed data. Its other use is to alter the flow of instructions into the
microprocessor for execution.
The mnemonic given this byte describes the action it causes. There are three
types of actions controlled by the operations byte, or opcode. The first two affect
addressed data. The remaining one affects the instruction flow. These actions are
data movement, data transformation, and program structuring.
In this section the opcodes and their mnemonics are grouped and discussed ac-
cording to the three types of CPU actions. Data transformation operations are fur-
ther grouped into three functions: arithmetic, shifts, and logic. The mathematics of
the latter functions can be reviewed in Chapter 1.
Data movement is the most basic action performed on addressed data, so we
will begin with it.
Each path is served by its own movement
operations, usually for both directions of transfer. We can now discuss the data
movement operations in terms of the three source/destination paths.
[Figure 3.7: a byte's bit positions, d7 (most significant) through d0 (least
significant).]
Load operations affect the N and Z
flags. Thus if the value 0 is loaded from memory into the accumulator, the Z flag
will be set to 1. If a value whose d7 bit equals 1 is loaded into the accumulator, the N
flag will be set because the top bit of a 2C number denotes its sign. Even if the data
byte is not interpreted as a 2C number by the program, d7 can be used as a Boolean
variable and tested with the N flag.
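The flag effects just described can be modeled in Python (an illustrative sketch; the function name is our own):

```python
def load_flags(byte):
    """Flag effects of a load: Z is set when the byte is zero, and N
    copies bit d7 (the sign bit of a two's complement number).
    Returns (n, z) for a byte value 0-255.
    """
    z = 1 if byte == 0 else 0
    n = (byte >> 7) & 1
    return n, z
```

Loading 0 sets Z; loading any value of 80h or above sets N, whether or not the program treats the byte as a signed number.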
Example:
Write a typical load instruction, combining an operation with an addressing mode.
The instruction for loading a byte into the accumulator from an address specified
with indirect indexed addressing is
LDA (address),Y
Table 3.3 summarizes the most important facts about load and store opera-
tions. Corresponding load and store operations are placed together for ease of com-
parison. Each operation's effect is symbolized within brackets beneath the opcode
mnemonic, using the abbreviations M for memory location, A for accumulator, and
X and Y for the index registers. The affected flags are also listed.
TABLE 3.3 LOAD AND STORE OPERATIONS
A table in this form will follow every operations group we discuss. Included in
these tables are the hexadecimal opcode values for each combination of operation
and addressing mode, and the instruction's corresponding execution time in
microseconds (i.e., one-millionth of a second). These two values are displayed in the
form "|opcode, time|," following the addressing mode name. The execution times
can be used to compute how long a program will take to execute; however, the times
listed are based on the assumption that the operations are occurring within a single
page of memory (addresses xxOO through xxFF hex). Some instructions straddling a
page boundary will take an additional microsecond to execute. Such instructions will
be highlighted with an asterisk beside their execution time.
Between Register and Top of Stack. Only the accumulator and status regis-
ters can transfer data directly to and from the stack. However, by using accumu-
lator load and store operations, data in memory can be moved into and out of
the stack. Finally, using the register-to-register operations of the next section allows
data in the remaining registers to be pushed on and popped off the stack via the ac-
cumulator.
The ability to save the contents of the processor status register on the stack is
extremely important. Sometimes the flow of instructions into the CPU is inten-
tionally but temporarily diverted with a program structuring operation. When this
diversion is complete, execution will return to the main stream. If the status of the
execution before the diversion is unavailable, the return to the main instruction
stream will be of limited usefulness. Saving the status register on the stack before the
diversion, and recovering it afterward, solves this problem. This entire process will
be described in the program structuring section.
Two operations are used to move data from a register to the top of the stack.
They are PHA, for PusH Accumulator byte onto the stack, and PHP, for PusH
Processor status byte onto the stack. Neither operation affects the flags, and neither
operation has an operand byte since the address is held in the stack pointer.
Two operations are used to move data from the top of the stack to a register.
They are PLA, for PulL Accumulator byte from the top of the stack, and PLP, for
PulL Processor status byte from the top of the stack. As with the load operations,
moving data into the accumulator affects the zero and negative flags. Moving data
from the top of the stack into the status register affects all the flags, of course.
In Table 3.4 the effects of push and pull operations are shown with several new
symbols. The symbol S represents the contents of the stack pointer, TOS represents
the contents of the memory location pointed to by the stack pointer, and P
represents the contents of the processor status register.
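The push/pull pairing described above can be sketched as follows; the instructions between the pushes and pulls stand for any diversion that might disturb the accumulator or the flags (an illustrative sketch, not a listing from the text):

```asm
        PHA             ;save the accumulator on the stack
        PHP             ;save the processor status on top of it
        ;...diverted processing goes here...
        PLP             ;restore all flags from the stack
        PLA             ;restore the accumulator (fixes Z and N)
```

Note that the pulls must come in the reverse order of the pushes, since the stack is last-in, first-out.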
Between Register and Register. There are two types of register-to-register
operations. One serves the general need to move data between registers; these opera-
tions are called transfers. The other type alters specific bits in the status register; this
type consists of the flag modification operations.
Transfer operations service the A, X, Y, and S or stack pointer registers. Only
three of the possible source/destination pairings are used: A with X, A with Y, and
90 Into Its Brain: 6510 Assembly Language Chap. 3
X with S. There are two operations for each of these pairs, one operation for each
transfer direction. The mnemonic for each operation consists of the letter T, for
Transfer, followed by the letters of the source and destination registers. Of course,
all transfer operations use implied addressing.
The operation that moves a data byte from the A register into the X register is
called TAX, for Transfer A into X. The operation moving data from X to A is called
TXA, for Transfer X into A. The zero and negative flags are affected by both these
instructions.
The operation moving data from A into Y is called TAY, for Transfer A into
Y. The operation moving data from Y to A is called TYA, for Transfer Y into A.
These instructions also affect the zero and negative flags.
The operation moving data from X into S is called TXS, for Transfer X into S.
The operation moving data from S into X is called TSX, for Transfer S into X. TXS
has no effect on the flags. TSX affects the zero and negative flags. Table 3.5 lists the
transfer operations.
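Since no operation transfers directly between X and Y, such a move must pass through the accumulator. A sketch (not a listing from the text):

```asm
        TYA             ;copy Y into A (fixes Z and N)
        TAX             ;copy A into X (fixes Z and N)
```

This overwrites the accumulator; if its contents matter, save A on the stack with PHA first and recover it with PLA afterward.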
Flag modification operations work differently from transfer operations. When
the opcode for a flag modification or flag mod operation is fetched from the pro-
gram in memory, it is loaded into the instruction register, or IR, and decoded like
any other opcode. Based on this decoding, the microprocessor fixes the value of the
appropriate bit in the status register. The process can be thought of as a roundabout
data transfer from the IR to the flag register, which is why these operations have
been placed in the register-to-register category.
Flag mod operations can place a selected value of 0 or 1 into any of the four
following flags: C, I, D, and V.
The first of these, the carry flag, is set to 1 with the SEC or SEt the Carry
operation. The C flag is reset by the CLC or CLear Carry operation.
The interrupt flag is set by the SEI or SEt the Interrupt mask operation. The I
flag is reset by the CLI or CLear the Interrupt mask operation. Again, interrupts
will be explained in the program structure section.
The decimal flag is set by the SED or SEt Decimal mode operation, and reset
by the CLD or CLear Decimal mode operation.
Finally, the overflow flag can only be reset, using the CLV or CLear the
oVerflow operation.
All flag mod operations imply the status register as the data location;
therefore, they all use the implied addressing mode. The flag mod operations are
summarized in Table 3.6.
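As a sketch of typical flag mod usage, a program that does binary arithmetic might begin as follows (the particular combination is illustrative, not from the text):

```asm
        CLD             ;select binary, not BCD, arithmetic
        CLC             ;clear the carry before the first addition
```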
The data movement operations provide the means to move data from
anywhere in the computer to anywhere else in the computer. However, the purpose
of moving data is usually to support their transformation. We can now discuss the
operations that transform data.
Data transformation. In Chapter 1 we explained three ways of transform-
ing binary data: arithmetic, shifts, and logic. There are CPU operations for each of
TABLE 3.6 FLAG MOD OPERATIONS
these functions. We will explore these operations in the order that we earlier studied
the three types of transformations.
Arithmetic. Arithmetic is defined as the four computations of addition, sub-
traction, multiplication, and division. The 6510, like most microprocessors, has
CPU operations only for the addition and subtraction computations. The latter two
computations are easily though relatively slowly performed with combinations of
other CPU operations. Multiplication and division will be demonstrated after the
basic CPU operations become familiar.
There are general- and special-case versions of the addition and subtraction
operations. The general case adds or subtracts a byte in the accumulator with a byte
in memory. The special case adds or subtracts the value 1 with the value in an index
register or in memory. The special-case addition and subtraction operations are
known as incrementing and decrementing.
The 6510 supports arithmetic for two numerical codes: binary and BCD. The
microprocessor knows which type of arithmetic to perform by the contents of the
decimal flag in the status register. Recall that the flag mod operations SED and CLD
are used to set this flag for BCD or binary arithmetic, respectively. Since most pro-
grams use just one type of arithmetic, the SED or CLD operation usually appears
just once, at the start of the program.
In BCD mode general arithmetic, the microprocessor will compensate for the
BCD "no-man's land" of codes from 1010 to 1111. Beware, however; BCD mode
increment and decrement operations do not compensate for the illegal code values. It
is best to use only general arithmetic operations for BCD arithmetic.
BCD and binary mode arithmetic also differ in their effects on the flags in the
status register. In BCD arithmetic the zero and negative flags are fixed according to
the result that would occur if the arithmetic were binary. In effect, the CPU does the
arithmetic in binary and fixes the Z and N flags before converting the result to BCD.
Only the carry flag is fixed consistently with the decimal result. In the program
structuring section we will study a group of operations that test individual flags and
select an action accordingly. With BCD arithmetic the flag inconsistencies require
that these flag-testing instructions be carefully selected. Such instructions should
either test only the carry flag, which is accurate in BCD arithmetic, or possibly also
test the zero or negative flags, if their value is always fixed the same way by the
previous flag-affecting operation whether the arithmetic is BCD or binary.
If the carry is a 0, the result is just the sum of the two addends. If the carry is a 1, the
result is one larger.
Multiple-byte additions are simple with the ADC operation. The carry flag ad-
dition in the ADC operation allows a carry from the addition of two less-significant
bytes to be added automatically into the next-higher byte addition. Of course, the
carry must be reset to 0 before the addition of the least-significant bytes in the addi-
tion. This is done with an initial CLC flag mod operation.
Example:
Write assembly language instructions to add the numbers 5427h and 1050h.
The addition will be set up as two single-byte additions. Only the ADC operation
and a few data movement and flag mod operations are required. The location receiving
the least-significant byte of the sum has been symbolically named 'sum'.
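The listing itself did not survive this reproduction; a version consistent with the description, assuming 'sum' is a two-byte location defined elsewhere, might look like this:

```asm
        CLC             ;clear carry before the low-byte addition
        LDA #$27        ;least-significant byte of 5427h
        ADC #$50        ;add least-significant byte of 1050h
        STA sum         ;store low byte of the result (77h)
        LDA #$54        ;most-significant byte of 5427h
        ADC #$10        ;add high byte of 1050h, plus any carry
        STA sum+1       ;store high byte of the result (64h)
```

The final sum, 6477h, ends up in 'sum' and 'sum+1', low byte first.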
Be sure that you fully understand these instructions before going on. Here as elsewhere
in assembly language, a potentially imposing group of instructions often proves quite
simple when taken one at a time.
The ADC operation can be used with all direct addressing modes except im-
plied, and all indexed addressing modes except zero-page,Y. ADC fixes the values of
the zero, negative, carry, and overflow flags (recall that an overflow occurs when a
2C addition overflows into the d7, or sign, bit).
The special-case addition called the increment comes in three forms for the
three types of locations it can alter. All forms fix the zero and negative flags.
Memory locations are incremented with the INC operation. It can be com-
bined only with the absolute and zero-page modes of direct addressing, and the ab-
solute,X and zero-page,X modes of indexed addressing.
Register X is incremented with the INX operation. INX is of course addressed
only implicitly.
Register Y is incremented with the INY operation, which is also addressed im-
plicitly.
With all three operations, if the data being incremented equal FFh, the sum
will wrap around to the value 00.
Example:
Write the assembly language instruction for incrementing an absolute indexed memory
location.
Using the symbolic label 'place' for the memory location, this instruction is
INC place,X
In subtraction the inverse of the carry flag acts as a borrow flag. For single-byte sub-
tractions the carry flag can be set so that subtracting NOT carry will have no effect
on the result.
In multiple-byte subtractions, if a borrow is required by the subtraction of two
less-significant bytes, the carry flag is reset. Then the next-higher byte subtraction
will, in deducting the inversion of the carry flag, reduce the accumulator-data dif-
ference by 1 and complete the borrow. If a subtraction generates no borrow, the
carry flag is set, and the next-higher subtraction will deduct the borrow value 0 and
leave the difference unchanged.
Because of the borrow flag, the subtraction result can be interpreted either as a
positive binary number or as a 2C (possibly negative) number. For instance, the sub-
traction 30h - 60h produces the value D0h, which is the positive binary result from
130h - 60h (subtraction with borrow) and also the negative 2C result from 30h -
60h (subtraction without borrow). Thus SBC can be used with either straight binary
or 2C coded numbers.
SBC fixes the zero, negative, and overflow flags according to the 2C inter-
pretation of the result. Thus d7 is interpreted as the sign bit to set the negative flag.
Multibyte subtractions follow the same pattern as multibyte additions, with
the substitution of SBC for ADC and SEC for CLC to set the carry flag and
therefore reset the borrow.
Example:
Change the earlier addition problem, 5427h + 1050h, to the subtraction problem 5427h
- 1050h.
All that is required is to replace the arithmetic and flag mod operations of the
earlier example. Each of these replacements is highlighted with an asterisk. The location
receiving the least-significant byte of the result has been renamed 'diff'.
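The listing is missing from this reproduction; a version consistent with the description, with the replaced operations marked by asterisks in the comments, might be:

```asm
        SEC             ;* set the carry, i.e., clear the borrow
        LDA #$27        ;least-significant byte of 5427h
        SBC #$50        ;* subtract low byte of 1050h (borrows)
        STA diff        ;store low byte of the difference (D7h)
        LDA #$54        ;most-significant byte of 5427h
        SBC #$10        ;* subtract high byte plus the borrow
        STA diff+1      ;store high byte of the difference (43h)
```

The difference, 43D7h, ends up in 'diff' and 'diff+1', low byte first.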
The SBC operation, like ADC, can be combined with all direct addressing
modes except implied, and all indexed data addressing modes except zero-page,Y.
There is another general subtraction operation, called a compare, that com-
pares two numbers and determines which is larger, without producing a numerical
result. Compare operations differ from the SBC in four ways: the result is discarded
rather than stored; the incoming carry is ignored, so no preparatory SEC is needed;
the overflow flag is unaffected; and the comparison is always binary, regardless of
the decimal flag.
The compare operation is useful for testing the contents of a register or an ad-
dressed memory location without changing its contents. Compare operations fix the
zero, negative, and carry flags.
The compare operation that subtracts addressed data from the accumulator is
called CMP. This operation has the same choice of addressing modes as SBC.
The compare operation that subtracts addressed data from X is called CPX. It
allows only the direct addressing modes accessing memory (i.e., absolute, zero-page,
and immediate).
The compare operation subtracting addressed data from Y is called CPY. Its
addressing modes are the same as for CPX.
Example:
Write assembly language instructions to compare the contents of the memory location
at address 'loc' with the number 'max':
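A two-instruction listing consistent with the flag results described below might be:

```asm
        LDA #max        ;load the number 'max' into A
        CMP loc         ;compute A minus the contents of 'loc',
                        ;  fix Z, N, and C, and discard the result
```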
The zero flag will be set if the numbers are equal; the carry flag will be set (i.e., the bor-
row cleared) if 'max' is greater than or equal to the contents of 'loc'. The negative flag
will be set if 'max' is less than the contents of 'loc', unless a 2C overflow occurs.
The special-case subtraction called the decrement comes in three forms that
correspond directly in their addressing modes to the three forms of increments. All
decrements fix the zero and negative flags.
Memory locations are decremented with the DEC or DECrement operation.
Like INC, it can be combined with the absolute and zero-page modes of direct ad-
dressing, and the absolute,X and zero-page,X modes of indexed addressing.
Register X is decremented with the DEX or DEcrement X operation. Like
INX, it is addressed implicitly.
Register Y is decremented with the DEY or DEcrement Y operation, which is
also addressed implicitly.
The data value 00 decrements to FF hex. As with increments, in BCD mode
decrements should be replaced with data movement and general subtraction
operations.
This completes our study of the data transformation operations performing
arithmetic. The subtraction operations are represented in Table 3.8. Recall that the
overbar symbol means "negate the quantity underneath" (i.e., it is the NOT sym-
bol). The "\" character as a destination means that the operation results are
thrown away.
Exercise:
Try the following assignments as a review of the assembly language instructions we have
studied thus far.
(k) Write a small sequence of assembly instructions for adding two single-byte BCD
numbers.
(l) Write a small sequence of assembly instructions for comparing two BCD numbers.
moves inward and vacates its old end position. The falling-off bit is placed into the
carry flag by all shift operations. However, the vacated bit position is filled by shift
operations with either a 0, as with two shift operations officially called shifts, or
with the contents of the previous carry flag, as with two shift operations called
rotates. We will use the terms "shift" and "rotate" in this more limited sense for the
rest of the section.
All shift and rotate operations can be combined with the absolute, zero-page,
and implied modes of direct addressing, and the absolute,X and zero-page,X modes
of indexed addressing. The implied mode accesses only the accumulator; the symbol
A in the operand field indicates the implied mode to the assembler.
The two shift operations are called Logical Shift Right or LSR, and Arithmetic
Shift Left or ASL. As you might expect, the first operation moves bits to the
"right" or toward lower-bit-position numbers. The second operation moves bits
leftward, toward higher-bit-position numbers.
The two rotate operations are called ROtate Right or ROR, and ROtate Left
or ROL. They similarly describe the direction of bit movement.
The effects of the shift and rotate operations are illustrated in Fig. 3.8. The
name of each assembly language operation accompanies its illustration.
Figure 3.8 (Shifts and Rotates)
Shift group operations are used to double or halve a data value, as explained in
Chapter 1, or to place a particular bit into the carry flag for testing.
Example:
Write assembly instructions that move the d7 bit of the accumulator into the carry flag
for testing and that then restore the accumulator to its initial condition.
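One two-instruction solution, assuming the carry is tested between the two operations (branch instructions do not alter the flags):

```asm
        ASL A           ;old d7 falls off into the carry flag
                        ;(test the carry here, e.g., with BCS)
        ROR A           ;carry rotates back into d7, restoring A
```

After the ASL, the vacated d0 holds a 0; the ROR moves that 0 into the carry and the saved d7 back into place, so A returns to its initial value.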
The shift and rotate operations are summarized in Table 3.9. The symbol d(i)
represents a bit position from dO to d7.
Exercise:
Try the following review question.
(m) Describe in detail the effect of executing an ROR operation.
Logical Operations. The logical or Boolean functions OR, AND, XOR, and
NOT are performed with four CPU operations.
CPU operation ORA, for OR Accumulator, performs a bitwise OR on the
contents of the accumulator and an addressed data byte, and places the result in the
accumulator. ORA fixes the zero and negative flags.
Operation AND, for AND accumulator, performs the bitwise AND of the ac-
cumulator and an addressed data byte, places the result in the accumulator, and also
fixes the zero and negative flags.
A variation on AND, called BIT, ANDs the accumulator with the addressed
data and throws away the result. BIT also fixes the zero flag in the usual way;
however, it fixes the overflow and negative flags in an unintuitive fashion, placing
bits d6 and d7 of the memory data byte in the overflow and negative flags, respec-
tively. Since d6 and d7 are moved intact regardless of accumulator contents, BIT can
be used to check them without any attention or alterations to the accumulator.
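A sketch of this use of BIT (the 'status' location and branch labels are hypothetical; BMI and BVS are branch operations covered in the program structuring section):

```asm
        BIT status      ;N = d7 of 'status', V = d6; A is untouched
        BMI d7set       ;branch taken if d7 of 'status' was 1
        BVS d6set       ;branch taken if d6 of 'status' was 1
```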
Operation EOR, for Exclusive OR accumulator, performs the bitwise XOR of
the accumulator and an addressed data byte, places the result in the accumulator,
and fixes the zero and negative flags.
NOT is absent from the 6510 instruction set, but can be simulated with the in-
struction EOR #%11111111 (recall that Exclusive-ORing a bit with a 1 inverts the
bit).
All logical operations except BIT allow any direct addressing mode except im-
plied, and any indexed mode except zero-page,Y. BIT allows only the absolute and
zero-page direct addressing modes.
Example:
Write assembly language instructions that isolate the d1 and d3 bits of memory location
'bitloc' and transport them to the accumulator for further testing.
This task can be performed with two instructions:
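The two instructions are missing from this reproduction; a listing consistent with the description might be:

```asm
        LDA bitloc      ;load the byte at 'bitloc'
        AND #%00001010  ;mask keeps only d3 and d1; all other bits become 0
```

The zero flag will be set only if both d1 and d3 of 'bitloc' are 0.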
Program structuring. The last type of CPU operation is used to give pro-
grams their structure. These program structuring operations shape a program as it
executes, choosing between alternative instructions or reexecuting instructions as
needed. The program structuring operations organize the data movement and data
transformation operations into a system that does useful work.
Any significant system contains multiple levels of structure. First, there is
almost always a central structure that holds the lower levels of structure and the
system itself together. Just as the central structure called the skeleton is necessary to
support the specialized functions of a higher animal, a central structure is necessary
to support and connect the many specialized functions that go into performing a
significant task. Skeletons and rugged program structures have many other traits in
common. We will use skeletal structure as a familiar angle from which to approach
program structure.
A skeleton can be examined at three levels. At the highest level, the skeleton
can be considered in the context of the central supporting structure called the spine.
Below this level are the skeleton's functional groups, such as the rib cage and the
hand. At the lowest level there are individual bones and their connective tissues.
The spine dictates the relationships and general structure of the rest of the
skeleton. Looking at an X-ray of the spine alone tells much about an organism's
shape, manner of locomotion, and purpose in the ecology.
Looking one level lower at the functional bone groups and their relationships
provides a deeper understanding of the day-to-day functioning of the organism.
Finally, studying the lowest level, the individual bones, reveals the smallest
details about the organism's structure and internal functioning, information which
is usually needed only for special scholarly studies.
There are also three levels of structure in a good program, directly correspond-
ing to the three levels of skeletal structure. From highest to lowest they are the
hierarchy, the module, and the construct. They can be visually represented with
"X-rays" of program structure, pictorial images that we will learn to draft in
Chapter 4.
The hierarchy is the spinal structure of a program. It dictates the relationships
and general structure of the major parts of a program. Simply looking over the
hierarchy chart reveals the most important facts about a program's shape, manner
of execution, and purpose or overall function.
A module is a functional group in the program skeleton. A module contains
assembly language instructions which together perform one or more closely related
functions, such as moving or transforming a single data structure.
The unity of purpose within a module allows it to be given a short, specific
name in the form "imperative verb-optional adjectives-object-optional adverbial
phrase." Indeed, with this wording convention it is impossible to give a strong name
to a module that does not have a unified purpose. So the naming convention has the
double purpose of making the purpose of a module clear to the hierarchy-chart
reader and of helping the programmer to create modules of unified purpose. A
typical example of a module name might be 'Place Mailing-List Names In
Alphabetical Order'.
Hierarchy charts show modules accompanied by the names of the data flows
into and out of them. A module can dole out portions of its overall task to smaller
modules, controlling the order and circumstances of their execution in the process.
The levels of modules this allows combine into a functional hierarchy, hence the
name "hierarchy" for the central program structure.
Constructs are the individual bones and their associated connective tissues
within the functional groups of the program skeleton. A construct contains a closely
related group of assembly language instructions performing a task too small to be
considered a module-level function. Constructs are named with the same word for-
mat as modules. If the fine detail within a module must be known, its constructs
should be examined. Knowledge of this level of detail is necessary during program-
ming and program-correcting or "debugging."
Figure 3.9
Each box in a hierarchy chart represents a module. The lines connecting the
boxes indicate higher modules utilizing lower modules to perform their functions.
The arrows between the boxes represent data; an arrow into a box represents data
used by the module, and an arrow out of a box represents data produced by the
module. The name in the top module of the hierarchy chart is the name of the entire
hierarchy. Constructs are not shown on a hierarchy chart.
All these aspects of the hierarchy chart will be discussed in Chapter 4, as will
methods for creating one as the top level of a program's structure. Creating the
lower levels of program structure, modules and constructs, and the assembly opera-
tions for doing so, make up the major topic of the rest of this chapter. We will start
our expansion with the lowest level of structure, the construct. This will also allow
us to continue learning and using assembly language instructions in a relatively sim-
ple context. Soon we will see how to combine constructs to form the larger structures
called modules.
Structured flowcharts are built with three symbols: the oval, representing the
start or stop of construct processing; the rectangle, representing individual process-
ing actions within the structure; and the diamond, representing a choice of execution
paths.
Example:
Write a structured flowchart describing a construct that calculates the sales tax on an
item purchased in a particular county. The tax rate depends on where it is purchased; if
the purchase is made in a city the rate is 6.5%, and if the purchase is made outside a city
it is 6%.
The two tax rates indicate that a simple two-path selection of processing action is
needed. We show the example construct here in Fig. 3.10 and explain its structure in
detail later.
Figure 3.10
Figure 3.11
The generic template for the sequence construct consists solely of comments
accompanying data processing operations, since program structuring operations are
not included in sequences. Recall that comments are preceded by semicolons in most
assembly languages. The generic sequence template appears as follows:
;(construct name)
INSTRUCTION 1
INSTRUCTION 2
 .
 .
INSTRUCTION n
;END (construct name)
Figure 3.12
serted at the top. We will call the action 'convert a byte into two hex characters', and
follow the assembly language convention of preceding it with a semicolon to indicate
that it's a comment. The pseudocode is
Ideally, the next step would be to fill in a template with assembly language
instructions. However, each line in the pseudocode above describes a task beyond the
capabilities of any assembly language operation. There is, for example, no single in-
struction that can translate a nibble into ASCII and store it. To obtain pseudocode
detailed enough to fill a template from, we must treat each line in our initial pseudocode
as a separate task and divide it as before with a flowchart and lower-level pseudocode.
Our flowchart for the first line, 'separate byte into two nibbles', appears as
shown in Fig. 3.13. These actions place the upper nibble of the binary byte in the lowest
four-bit positions of A, and the lower nibble of the binary byte in the lowest four-bit
positions of an index register.
Figure 3.13
The pseudocode for the flowchart in Fig. 3.13 with the index register and shift
operations specified is
Figure 3.14
The flowchart expanding the task 'translate nibble into ASCII and store it' is
shown in Fig. 3.14. We will make two pseudocode translations of this flowchart: one for
the byte in the accumulator and one for the byte in the index register. The flowchart
fully describes the processing of the accumulator byte, and becomes the following
pseudocode:
To use the pseudocode above with the byte in X, the byte must first be moved into A
and its top nibble must be zeroed. With initial lines for these actions, the pseudocode
can be used again. The resulting pseudocode expansion is
Placing the three sections of detailed pseudocode into the initial pseudocode produces
detailed pseudocode for the entire construct:
Note that the work is done by the lowest-level lines and that the higher-level lines
become comments on the processing work done.
The pseudocode is now close enough to assembly language that the template can
be filled in. Lines in the detailed pseudocode will become comments in the completed
assembly language template. Most lines translate directly into a single assembly
language operation. The template will be preceded by an "assignments" section that
defines variable names and sets up the table of ASCII characters. The assignment
directives =, *=, *=*+, and .BYTE are used. = assigns a numeric value to a symbolic
name. *= establishes the memory address at which the following data or program in-
structions will be placed when the program is first loaded into memory. *=*+, fol-
lowed by a number, reserves that number of byte locations in memory; if the directive is
preceded by a symbolic name, the first address in the reserved area is assigned to that
name. .BYTE causes the bytes starting at the program address where this directive is
placed to be filled with the byte value or ASCII character sequence that accompanies the
directive. The completed template below should make the usage of these directives
clear. Comments are preceded by semicolons, as usual.
;ASSIGNMENTS
binary = $50                     ;assigns value 50h to 'binary'
* = $80                          ;sets first data address to 80 hex
table .byte '0123456789ABCDEF'   ;stores ASCII table at 80 hex
lonibl * = * + 1                 ;'_.nibl' bytes reserved to
hinibl * = * + 1                 ;  store ASCII characters
* = $2000                        ;address of following instruction
;END ASSIGNMENTS
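The filled-in template for the construct body is missing from this reproduction; one plausible completion, consistent with the pseudocode and the assignments above (a sketch, not the book's original listing), is:

```asm
;CONVERT A BYTE INTO TWO HEX CHARACTERS
;separate byte into two nibbles
        LDA #binary     ;get the binary byte
        AND #%00001111  ;keep only the lower nibble
        TAX             ;park the lower nibble in X
        LDA #binary     ;get the byte again
        LSR A           ;four right shifts move the
        LSR A           ;  upper nibble into the
        LSR A           ;  lowest four bit
        LSR A           ;  positions of A
;translate upper nibble into ASCII and store it
        TAY             ;use the upper nibble as a table index
        LDA table,Y     ;look up its ASCII character
        STA hinibl      ;store it
;translate lower nibble into ASCII and store it
        LDA table,X     ;look up the lower nibble's ASCII character
        STA lonibl      ;store it
;END CONVERT A BYTE INTO TWO HEX CHARACTERS
```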
The benefits of using pseudocode and flowcharts together to design code are
even more apparent when writing more complex processing structures such as selec-
tions or repetitions.
Note that the pseudocode representation provides the comments for the
assembly language construct. It is important that you include informative and
readable comments with constructs, and pseudocode can provide them. Whether or
not the comments are taken from the pseudocode, however, it is very easy to over-
comment or to write comments that simply restate the actions of particular instruc-
tions. Such comments occasionally serve an educational purpose in this book (as in
some of the comments above), but they are in general redundant and distracting.
Lines from a pseudocode representation that fit this description should not be in-
cluded in an actual program.
It is equally important to choose the most descriptive label, address, and data
names possible within the character limit. Names and comments are the only direct,
written explanations that you will have within a source program (although you will
have charts of the structure levels, they are kept separately). Do not be reluctant to
spend a few minutes finding a descriptive name for a problematic label or data struc-
ture; the small effort will save you many hours of musings later.
For readability you should also choose one of the letter cases, upper or lower,
to notate instructions and structural dividers such as section names (e.g.,
ASSIGNMENTS), and reserve the other case for variable, label, and address names,
and comments as in the preceding example.
There are no variations on the sequence construct.
:1
Exercise:
!I Use the following questions to review sequential constructs.
(p) How is a sequential structure created?
(q) Under what circumstances should a sequential construct be used?
(r) Design and program a sequential construct that as part of a home-control program,
adds together energy-use variables for five home devices and places the result in a
'total energy use' variable. Assume that the energy-consumption values have
already been set by another part of the program. Choose your own devices and
variable labels.
addressing mode. Both indirect modes use an operand to access a 16-bit address
pointer in memory, but absolute indirect does so with a 16-bit address operand,
allowing the memory address to reside anywhere in the memory map, while zero-
page indirect accesses the memory pointer in page zero with an 8-bit address
operand.
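The pointer fetch that both indirect modes perform can be modeled in Python (an illustrative sketch only; the memory contents and addresses are invented, and a byte list stands in for the 64K memory map):

```python
# Model 64K of memory as a list of byte values.
memory = [0] * 65536

# Place a 16-bit pointer (little-endian: low byte first) at $0300,
# and the same pointer in page zero at $FB. Both point to $C000.
memory[0x0300] = 0x00   # low byte of target address $C000
memory[0x0301] = 0xC0   # high byte
memory[0x00FB] = 0x00
memory[0x00FC] = 0xC0

def absolute_indirect(operand16):
    """Fetch a 16-bit target address through a pointer that may
    reside anywhere in memory; the operand itself is 16 bits."""
    return memory[operand16] | (memory[operand16 + 1] << 8)

def zero_page_indirect(operand8):
    """Fetch the same kind of pointer, but the operand is a 1-byte
    page-zero address, saving an operand byte."""
    return memory[operand8] | (memory[operand8 + 1] << 8)

print(hex(absolute_indirect(0x0300)))   # 0xc000
print(hex(zero_page_indirect(0xFB)))    # 0xc000
```

Both fetches land on the same target; the only difference is the size of the operand used to reach the pointer.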
In Commodore assembly language format, the JUMP operation has the
mnemonic JMP and appears as follows:
if the BRANCH instruction straddles an address page boundary. This time penalty
occurs only when the BRANCH is taken. INDX refers to the 8-bit operand of
the BRANCH instruction. ADDRESS refers to the address value in or pointed to by
the operand of the JMP instruction.
To ease the programmer's work, symbolic assembly language format allows
the index operand of the BRANCH operation to be written as the symbolic label of
the destination address. The assembler converts that symbolic address into the re-
quired 2C relative address at assembly time. So, for instance, the instruction using
the BRANCH operation BCS to transfer program execution to address 'label' ap-
pears in assembly language format as
BCS label
The address 'label' must, of course, be within -126 to + 129 address locations
of the BRANCH operation.
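The assembler's offset calculation can be sketched in Python (hypothetical addresses; the offset is measured from the byte following the 2-byte BRANCH instruction, which is why the reachable range is -126 to +129 relative to the BRANCH itself):

```python
def branch_offset(branch_addr, target_addr):
    """Return the 2C byte an assembler would emit as the relative
    operand of a BRANCH, or raise if the target is out of range.
    The 6510 adds the offset to the PC *after* fetching the 2-byte
    instruction, hence the '+ 2'."""
    offset = target_addr - (branch_addr + 2)
    if not -128 <= offset <= 127:
        raise ValueError("branch target out of range")
    return offset & 0xFF   # the 2C byte pattern the assembler emits

# A BRANCH at $1000 can reach $1000 - 126 = $0F82 ...
print(hex(branch_offset(0x1000, 0x1000 - 126)))   # 0x80 (2C for -128)
# ... through $1000 + 129 = $1081:
print(hex(branch_offset(0x1000, 0x1000 + 129)))   # 0x7f (+127)
```

Targets one byte beyond either limit make the offset overflow the signed byte, and the assembler must reject them.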
114 Into Its Brain: 6510 Assembly Language Chap. 3
JUMP and BRANCH operations are used together in all selection constructs
except the simplest one, the IF-THEN. Since the IF-THEN is a special case of the
two-path IF-THEN-ELSE, we start by showing the three design tool representations
of the latter construct and will modify the representations to account for the IF-
THEN form.
The generic flowchart for the IF-THEN-ELSE construct appears as shown in
Fig. 3.15. 'Test' is the program structure selection of the execution path. The IF-
THEN flowchart omits one of the 'do task' boxes, and appears as shown in Fig.
3.16.
The generic pseudocode for the IF-THEN-ELSE construct is
The generic pseudocode for the IF-THEN construct omits the ELSE clause.
Of course, the pseudocode statement IF condition implies "IF condition is
true." In assembly language code conditions are flag contents, but it is more useful
during design to think of conditions in terms of the overall "barometer" for the up-
coming processing. That barometer will be a data relationship or an outcome of
previous processes. Later we will see how such conditions can be determined by ex-
ecuting operations and testing the status flags.
Look over the pseudocode for a construct whose choice of execution path
depends on the comparative size of two numbers:
In this case the processing barometer is found from the relative size of two data
elements.
All conditions have Boolean values. 'num1 > num2' is either a true or a false
statement. This condition can be translated to a flag value by using a comparison
operation with operands 'num1' and 'num2' to set the value of the negative flag.
It is often very useful to be able to take the logical AND or OR of two or more
different Boolean conditions to form a single true or false condition for exiting a
construct. This is called a compound condition. Compound conditions are often
used to save space or to reflect more accurately the barometer on which the process-
ing depends. Also, a single logical barometer will sometimes translate into com-
pound flag tests at the assembly language level.
From Chapter 3 we know that the logical OR of two Boolean values is true if
either or both values are true. Thus the condition written 'condition1 OR condition2' is true if at least one of the subconditions is true.
Similarly, the logical AND of two Boolean values is true only if both values are
true. Thus the condition written 'condition1 AND condition2' is true only if both
subconditions are true.
An IF-THEN-ELSE construct using an OR compound condition is as follows:
INSTRUCTION n
;ENDIF (construct name)
The IF-THEN generic template omits everything from the JMP operation at the end of
the THEN section onward. The BRANCH in the IF section proceeds to the entry point
of the next construct.
Replacing the simple 'condition' with the compound 'condition1 AND condition2' in the generic IF-THEN-ELSE template yields
;END
Note that the THEN section is reached only if condition1 AND condition2 are
true.
Finally, replacing the simple 'condition' with the compound 'condition1 OR
condition2' in the generic IF-THEN-ELSE template requires an additional label in
the THEN section and yields
;ENDIF
Note that if condition1 is true, the THEN path is BRANCHed to and executed, OR if condition2 is true, the BRANCH to the ELSE path is not taken and
the THEN path is executed.
Example:
Write a simple construct that can be used in a space game to keep track of the number
of laser hits to your spacecraft and to the spacecraft of your computer opponent, and to
keep track of the changing capabilities of the two spacecrafts as a result of the hits they
sustain.
Let's assume that the spacecraft that has absorbed the most hits gives up a speed
advantage. Giving up an advantage requires comparing numbers of hits and penalizing
one spacecraft or the other. This requires the selection of one of two possible actions,
which can be represented with a flowchart (Fig. 3.17). We'll call the number of hits to
your spacecraft "humanhits", and the number of hits to the computer's spacecraft
"computerhits".
Figure 3.17
Note that the ELSE action is only performed if humanhits >= computerhits (i.e., if
[NOT (humanhits < computerhits)] is true). This implies that if both sides start out with
zero hits, the human is immediately placed at a disadvantage and must fight to gain the
advantage. This is typical of many "shoot-em-up" computer games.
Obviously, this pseudocode is not yet ready to be translated into assembly
language instructions in a template. The THEN and ELSE lines must each be expanded
into multiple low-level lines of pseudocode.
First we must decide on a method for assigning disadvantages. We will do this by
defining a 'disadvantage' variable and giving it one of two values, symbolically named
'human' and 'computer' to indicate which spacecraft has taken the most hits. The
numeric values of 'human' and 'computer' can be decided upon later. Using this
method, the THEN line can be directly expanded into
In this case the final pseudocode is no longer than the initial but has required a design
decision to prepare it for coding in assembly language. The final pseudocode represen-
tation is
We almost have enough information to fill in a template. All that remains is to decide
how the comparison 'humanhits < computerhits' will be made. Recall that numbers are
compared by subtracting one from the other. Which number should be subtracted from
which?
The subtraction 'humanhits - computerhits' correctly tests the condition
'humanhits < computerhits', since when 'humanhits < computerhits' is true, the subtraction result will be negative, setting flags we can test. That is, if 'humanhits' is less
than 'computerhits', one of two things will happen: If the size of the result fits in a 2C
number (i.e., if 'humanhits - computerhits' is greater than or equal to -128), the negative flag will be set to show the negative result. However, if 'humanhits - computerhits' does not fit in a 2C number (i.e., if 'humanhits - computerhits' is less than -128),
the result will overflow. The negative flag will be forced to a 0 by the overflow into d7, but the
overflow flag will equal 1. These aspects of 2C arithmetic can be reviewed in Chapter 1.
Thus a compound of two possible flag conditions after
the subtraction 'humanhits - computerhits' is equivalent to the logical condition
'humanhits < computerhits'. The flag condition can be stated as
(N = 1 AND V = 0) OR (N = 0 AND V = 1)
As we said earlier, testing a single processing barometer sometimes requires testing more
than one flag.
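That compound flag condition can be checked exhaustively with a short Python simulation of 8-bit 2C subtraction (an illustrative sketch, not part of the book's game program):

```python
def sub_flags(a, b):
    """Return the N and V flags after the 8-bit 2C subtraction a - b.
    a and b are unsigned byte patterns (0..255) holding 2C values."""
    result = (a - b) & 0xFF
    n = result >> 7                            # negative flag = d7
    v = ((a ^ b) & (a ^ result) & 0x80) >> 7   # signed overflow
    return n, v

def signed(x):
    """Interpret a byte pattern as its 2C value."""
    return x - 256 if x >= 128 else x

# '(N = 1 AND V = 0) OR (N = 0 AND V = 1)' is exactly N XOR V, and it
# agrees with the signed comparison for every possible pair of bytes:
for a in range(256):
    for b in range(256):
        n, v = sub_flags(a, b)
        assert (n ^ v) == (1 if signed(a) < signed(b) else 0)
print("flag condition matches 'a < b' for all 65536 byte pairs")
```

The simulation confirms that the two sub-conditions together cover both the in-range and the overflowed subtraction results.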
We look over the IF-THEN-ELSE templates for both OR and AND compound
conditions, and build a template based on both. Since the rest of the pseudocode is
ready to be converted into assembly language, we will fill in the template. The
ASSIGNMENTS section is included to show how the variables and their allowable
values can be defined. The variables 'humanhits' and 'computerhits' are assumed to ex-
ist elsewhere in the game program and to already contain values.
;ASSIGNMENTS
human  = $FF     ;constant definitions
comptr = $00     ;(labels are limited to six characters)
* = $80          ;'humanhits' will be at address 80h
huhits *=*+1     ;reserves 1 byte for each variable
cmhits *=*+1
disadv *=*+1
;END ASSIGNMENTS
Example:
Show generic pseudocode for a construct that selects between four execution paths
depending on the contents of a four-valued variable.
The pseudocode is
Figure 3.18  Case flowchart (a test of the variable selects among the paths for Val1, Val2, Val3, and Other)
The generic flowchart for the CASE construct is shown in Fig. 3.18. The
default task is performed only if the variable has a value other than that of the in-
dividually defined values 1 through n. The default task might react to all erroneous
values of the variable in the same way, for instance.
The generic pseudocode for the CASE construct is
value n: do task n
other: do default task
ENDCASE
;value n
TEST INSTRUCTIONS
BRANCH on (NOT value n) to 'default' case
PROCESSING
JUMP (to entry point of next construct)
;'default'
PROCESSING
;ENDCASE (construct name)
Compound conditions can also be used. They are built the same way as in IF-
THEN-ELSE templates.
The 'value' sections of the CASE template are so similar to the THEN section
of an IF-THEN-ELSE template that we will omit a CASE example and go on to the
final type of construct, the repetition.
Exercise:
Answer the following questions to review selection constructs.
(s) Describe how the JUMP operation is used in an IF-THEN-ELSE template. Do the
same with the BRANCH.
(t) When would a CASE statement be used instead of an IF-THEN-ELSE?
(u) Use all three design tools to write a construct that controls an air-conditioning
system. Temperature will be the condition for selecting an action. Assume that
'temp' is a one-byte variable given its value elsewhere in a larger program. Use its
value to set another one-byte variable called 'cycle' to one of three values: 'heating',
'waiting', or 'cooling'. Heating should be commanded for 'temp < 68', waiting for
'68 <= temp < 78', and cooling for 'temp >= 78'. These ranges will require the use
of compound conditions. Try writing this construct first as nested IF-THEN-ELSE
and then as a CASE construct.
among the instructions, is satisfied. Execution then diverts to the entry point of
another construct. Because of this circular execution pattern, the repetition con-
struct is also called the loop.
The choice between continuing or exiting the loop is made using a BRANCH
instruction. If the test condition is compound, a combination of BRANCH and
JUMP instructions may follow the test as in IF-THEN-ELSE constructs. A JUMP
instruction is located at the physical end of the loop to return execution to the
physical beginning of the loop.
By allowing the test to reside anywhere within the repetition structure, a single
type of repetition construct serves all programming needs. The full name of this
general repetition construct is the LOOP-EXITIF-ENDLOOP. Its three representations follow.
Figure 3.19  Loop flowchart
In generic flowchart form the LOOP construct appears as shown in Fig. 3.19.
Either of the 'do part of task' blocks can be omitted to allow the test to be placed at
the beginning or end of the construct.
The generic pseudocode for the LOOP construct is
Compound conditions for the EXITIF test are treated as they are in IF-THEN-
ELSE constructs.
LOOPs usually step through a repetitive data structure using an index in X or
Y. If the index equals the number of bytes in the array, it can be decremented to
work backward through the array and to produce an easily detected status condition
(e.g., 0 or minus) when processing is finished. Although processing backward seems
odd at first, it avoids the time-consuming strategy of incrementing an index from 0
and comparing to the number-of-bytes figure on each pass through the LOOP.
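The two indexing strategies can be contrasted in Python (an illustrative sketch; the data values are invented):

```python
array = [3, 1, 4, 1, 5, 9, 2, 6]
count = len(array)

# Forward: increment an index and compare it to the count on every
# pass, costing an extra comparison per iteration at the machine level.
total_fwd = 0
i = 0
while i < count:              # explicit compare each pass
    total_fwd += array[i]
    i += 1

# Backward: decrement toward zero; on the 6510 the DEX/DEY instruction
# sets the status flags itself, so no separate compare is needed.
total_bwd = 0
i = count
while i != 0:                 # 'BNE loop' tests the flags DEX already set
    i -= 1
    total_bwd += array[i]

print(total_fwd, total_bwd)   # 31 31
```

Both loops produce the same sum; the backward version simply gets its exit test for free from the decrement.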
An interesting variation of the LOOP construct omits the EXITIF test to pro-
duce an infinite loop. Once started, an infinite loop cannot be stopped except by
turning off or resetting the computer, or by an outside intervention called an inter-
rupt. We will study interrupts later in this chapter. Infinite loops are sometimes
useful for such tasks as monitoring and reacting to a condition that occurs repeat-
edly but at unpredictable intervals. However, the unstoppable execution of an in-
finite loop limits its usefulness to relatively few situations.
We are now ready for an example of a typical LOOP construct.
Example:
Some tasks require that data in a contiguous memory area be checked to ensure they
have not somehow been changed. For instance, data sent to the computer by a
peripheral should be checked after their reception, and data written into memory as part
of a memory test should also be checked for changes. One way of checking data is to
accompany the original data with a data element containing the lowest n bits of the sum of
all the data bytes, with n most commonly equaling 8 or 16. This accompanying data
element is called a checksum. Creating or verifying a checksum requires that the data in
memory be treated as a repetition of bytes, which therefore calls for the use of a
repetition construct.
Assume that an array of bytes has somehow been placed into memory. The byte
preceding the array is located at address 'numbyt' and has as its value the total number
of bytes in the array plus 2, to account for itself and for a byte following the array. The
byte following the array has the 8-bit checksum from adding the array bytes with 'num-
byt'. Use the design tools to write a construct that verifies the checksum for the data in
memory.
This is a simple enough task that writing a flowchart would not be of much help
to us. The repetition construct will be contained in a sequence that sets up, performs,
and reacts to the checksum calculation. We will begin with pseudocode:
A repetition construct will be needed to perform the 'add numbyt ... ' line. To allow
the numbyt value 1 to represent the first byte in the data area, the base address for in-
dexing through the array will be set to one address below the beginning of the array.
This is done in the pseudocode expansion below. Indexing will proceed from the end of
the array to the beginning, as explained earlier. Since the first line and last two lines are
almost at a low-enough level to code, we will expand all three lines at once. This pro-
duces
The completed template below will omit an ASSIGNMENTS section and any comments
from the detailed pseudocode which translate directly into assembly instructions. The
result is
;LOOP
loop
ADC numbyt-1,X
DEX
;EXITIF X equals 0
BEQ compar
;ENDLOOP
JMP loop
;compare sum with the checksum byte following the array
compar
LDX numbyt
CMP numbyt-1,X ;compare A to the checksum
;set a warning flag in memory if A < > checksum byte
;IF A <> contents of checksum byte
BEQ else
;THEN set contents of location 'data' = 'nvalid'
LDA #nvalid
STA data
JMP done
;ELSE set contents of location 'data' = 'valid'
else
LDA #valid
STA data
;ENDIF
;RETURN
;END (calculate)
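The data layout and the verification the template performs can be modeled in Python (a sketch only; the function names and the 'valid'/'nvalid' marker values are invented):

```python
VALID, NVALID = 0x00, 0xFF   # invented marker values for 'data'

def build_block(data):
    """Build the described layout: a count byte holding len(data) + 2
    (for itself and the trailing checksum), the data bytes, then the
    low 8 bits of the sum of the count byte and the data bytes."""
    numbyt = len(data) + 2
    checksum = (numbyt + sum(data)) & 0xFF
    return [numbyt] + list(data) + [checksum]

def verify(block):
    """Re-add the count byte and the data bytes, compare with the
    stored checksum, and return the warning-flag value stored."""
    numbyt = block[0]
    total = sum(block[:numbyt - 1]) & 0xFF   # count byte + data bytes
    return VALID if total == block[numbyt - 1] else NVALID

block = build_block([10, 20, 30])
print(verify(block))          # 0 (valid)
block[2] ^= 0x01              # corrupt one data byte
print(verify(block))          # 255 (invalid)
```

Note how the count byte's "plus 2" makes it double as the index of the checksum byte relative to the base address one below itself, just as the template exploits.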
Exercise:
Answer the following question to review repetition constructs.
(v) Use all three design tools to write a repetition construct that checks for a time-of-
day variable, sets a 'sprinkler system' variable to the value 'ON' at 9:00 A.M., and
sets the 'sprinkler system' variable to the value 'OFF' at 9:20 A.M.
Modules. Recall that a module is a group of one or more constructs that per-
form a complete function in a program. Thus modules are internally similar to con-
structs and can be implemented by using the same three design tools.
The feature that distinguishes modules from constructs is their scope. While
constructs often organize instructions to perform small actions that are by
themselves meaningless to any overall task, modules always perform major actions that are critical functions of an overall task.
An analogy for this distinction is found in the typical auto repair manual.
Most such manuals describe how to repair the major systems of a car in a step-by-
step fashion. If you close your eyes, open the manual to a page at random, and read
any one sentence in a repair description, you probably will not learn anything about
what it takes to repair that automobile system, and you may not even understand
what the sentence means because it is out of its context. However, if the repair in-
Figure 3.20
pushes the most significant byte, or MSByte, of the current PC address onto the
stack, followed by the PC least significant byte, or LSB, and the contents of the flag
register. It then fetches an indirect address from memory locations FFFEh and
FFFFh and places it in the PC. The contents of these bytes depend on the current
configuration of the Commodore 64, a topic that is discussed in Chapter 5. The PC
then begins executing an interrupt-handling routine, which investigates and responds
to the interrupt condition and which would be written as a module. Toward the end
of the interrupt handler the interrupts must be reenabled with a CLI instruction.
Although interrupts are disabled automatically at the beginning of the interrupt-
handling routine, it is often useful to disable them manually. This is done with the
SEI, or SEt Interrupt disable flag, operation.
The exit point for the interrupt-handling routine is an instruction called RTI,
for ReTurn from Interrupt. This instruction places the flag byte from the stack into
the status register and places the return address from the stack into the PC.
There is an instruction that simulates the effect of an interrupt request on IRQ.
It is called a BRK, for BReaK, and it causes the 6510 to perform the same actions as
an IRQ pulse, except that 'PC + 2' is pushed on the stack instead of 'PC', and the B
flag is set to 1. As with IRQ-called routines, RTI rather than RTS is used to return
from a routine called by a BRK.
Although the return address is PC + 2, BRK is a one-byte instruction. The
return address on the stack will have to be popped off the stack and decremented if
the BRK is used as a permanent part of a program. BRK, however, is usually used
for debugging an existing program, where it is placed over a program instruction.
The debugging program that places the BRK has to determine the length of the over-
written instruction, and, if it is other than two bytes, to adjust the return address on
the stack during execution of the BRK routine.
The CALL and BRK operations are summarized in Table 3.12. TOS represents
"Top Of Stack." (FFFE, FFFF) means "the contents of locations FFFE and
FFFF."
With these operations a black-box module or a black-box peripheral device
can activate another black-box module. This is only part of the job, however, since
data must also usually be communicated between the black boxes. This communica-
tion is called parameter passing. Parameters can be passed between black boxes by
having the activating box place the data in one or more registers, memory locations,
or stack locations before the lower box is activated. Large amounts of data can be
passed from various areas of memory by passing the data address instead of the data
itself. Of course, all the black boxes must agree where the data or pointers will be
placed. Parameter passing will be demonstrated in the examples of Chapter 4.
You are now familiar with every operation the 6510 performs except one. That
instruction belongs with none of the groups we have discussed, for it does nothing
but consume time and space. It is therefore called the NOP or No OPeration. It can
be used to pad a loop so that it takes a specific execution time, to create a loop whose
only purpose is to pause or delay, or to reserve space for the possible future instruc-
tions that might be inserted at a point in the program. The NOP can be summarized
as follows:
Exercise:
Review the module level of program structure with the following questions.
(w) Explain the difference between the module and basic construct levels of program
structure.
(x) Explain the difference between the JSR and IRQ methods of calling a module and
returning from it.
OPTIMIZATION
Algorithm Optimization
With any module or construct the first use of a design tool, be it the flowchart or
pseudocode, represents the algorithm for performing that structure's task. Thus the
first step in writing a module or construct is to select an algorithm for the task to be
performed.
Algorithms have already been devised for most of the major tasks that a com-
puter performs. Beyond Power Programming, the advanced adjunct to this book,
lists many of them in the pseudocode format used throughout this text. Professional
programmers often use algorithm references, although their representation form is
usually more cryptic than the style of pseudocode used herein.
Optimization 133
Normally, more than one algorithm is available to perform a given task. The
appropriate choice depends on the type of efficiency being sought. Either execution
time or memory space can be minimized by choosing the proper algorithm. Accord-
ing to the principle called the time-space trade-off, efficiency in one criterion is ob-
tained at the expense of the other. So a task often will have preferred algorithms for
speed, for brevity, and for the best compromise between the two. Speed is usually
considered to be the more important efficiency criterion, since programs often leave
memory space available that can be traded for shorter execution time. In any case,
choosing the optimal algorithms will produce the fastest-executing program
for the memory, and possibly disk, space available.
An example of the time-space trade-off is found in the binary-to-ASCII con-
version in the sequential construct section. This construct obtained its ASCII values
from a look-up table. The look-up table algorithm was used for two reasons. First, it
is the fastest algorithm for binary-to-ASCII conversion. Second, it is the only major
conversion algorithm that can be placed in a sequential construct, and the other
types of constructs had not yet been discussed.
If we had considered algorithms of other processing structures, a more space-
efficient algorithm would have been available. It is the computational algorithm
mentioned at the beginning of the conversion example. The computational
algorithm computes the ASCII values representing a binary byte by choosing one of
two possible values to add to each nibble in a binary byte. Which value is added
depends on whether the hex digit for that nibble will be represented by a number or a
letter. The choice of the value to add is a selection process, requiring a selection
structure. The computational algorithm is at once slower and shorter than the look-
up-table algorithm.
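The trade-off shows up concretely in Python sketches of the two algorithms, each converting one 4-bit nibble to the ASCII code of its hex digit (illustrative only; the book's versions are written in 6510 assembly):

```python
# Look-up table: fastest, but costs a 16-entry table of storage.
HEX_TABLE = [ord(c) for c in "0123456789ABCDEF"]

def nibble_to_ascii_lookup(nibble):
    """One indexed fetch; no decisions made at run time."""
    return HEX_TABLE[nibble]

# Computational: no table, but a selection must be made per nibble --
# digits 0-9 get one offset added, letters A-F another.
def nibble_to_ascii_computed(nibble):
    if nibble < 10:
        return nibble + ord('0')        # '0'..'9'
    return nibble - 10 + ord('A')       # 'A'..'F'

assert all(nibble_to_ascii_lookup(n) == nibble_to_ascii_computed(n)
           for n in range(16))
print(chr(nibble_to_ascii_computed(0xB)))   # B
```

The lookup spends space to avoid the selection; the computation spends a selection to avoid the table.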
The look-up and computational algorithms clearly demonstrate the time-space
trade-off. Similar algorithms of both these types are also used in other types of con-
version tasks. For instance, obtaining sines and cosines for given angles can be done
with a look-up table or by computation. The look-up-table algorithm for sine
calculation quickly lifts a needed sine from a large table of sine values. The com-
putational algorithm performs a complex and time-consuming but relatively space-
thrifty calculation of a sine value. Contrasting the ASCII and trigonometric tasks
suggests, correctly, that the greater the complexity of the task being performed, the
greater the time-space trade-off between its various candidate algorithms will be.
Under certain circumstances a look-up-table algorithm can replace a program
CASE structure and save both time and space. This is possible when the CASE ex-
ecution path options all contain the same type of data movement operations. Thus if
the options of a particular CASE construct assign different values to a single
variable, the CASE selection variable can often be converted into an index for a
look-up table of the different option values. One of these values will be retrieved
during execution and placed in the target variable. This strategy can also be helpful
when one or two CASE options include additional operations, if the look-up-table
algorithm is followed by simpler IF-THEN or CASE constructs than the original
CASE. This trade-off will have to be examined on a case-by-case basis.
Rank      Order
Best      constant
          log2(n)
          n
          n x log2(n)
          n^(3/2)
          n^2
          n^3
Worst     2^n
Implementation Optimization
To implement an algorithm efficiently one must understand the role of the input and
output data in the most basic operations of their processing. Since this understand-
ing is seldom complete from the start, it usually takes several passes to optimize an
implementation.
As an algorithm is expanded to obtain the detail required for programming,
the role of the data becomes clearer. At the lowest level of pseudocode this role can
be seen most clearly, and without the syntax rules of assembly language to distract
from it. This is why it is important not to omit the level of pseudocode that
translates more or less directly into assembly language instructions.
An understanding of the data can be exploited in at least four ways. First, data
that can be most efficiently processed with register-specific CPU operations can be
placed into the appropriate registers. Counter and index values are two examples of
data that are more efficiently handled in registers. Proper use of the accumulator
can also accelerate arithmetic and logical operations.
Second, data can often be placed in page-zero memory for faster access. The
appropriate register and memory placement choices are more obvious if you make a
list of all data elements and accompany each element with the types of CPU opera-
tions to which it will be subjected.
Third, by observing how the available CPU instructions lend themselves to
data manipulation in a given situation, the input data can sometimes be redefined in
their grouping or content to decrease the number of instructions required to process
them. For instance, refer to the checksum example of the repetition construct sec-
tion to see that the 'numbyt' variable, whose value indicated the number of data
bytes in the memory area, could have contained the number of data structure bytes,
that number plus one to also account for the numbyt byte, or that number plus two
to account for numbyt and for the trailing checksum byte. It turned out that the task
was performed most easily and quickly if numbyt held the number of data structure
bytes plus two. Although that choice was not pointed out at the time, it was never-
theless consciously made.
The fourth way to exploit an understanding of data is to use the special
capabilities of the CPU instruction set knowledgeably with the data. The more exotic
addressing modes, like indirect indexing or creative uses of the stack, are often
overlooked by programmers but can greatly streamline an algorithmic implementation. Multiplication and division by numbers that can be broken into sums of
powers of 2 is often faster using shift and rotate instructions; so a multiplication by
10 can be performed as the sum of a shift-left multiplication by 8 and a shift-left
multiplication by 2. Finally, although some will disagree with this statement, a
technique called self-modifying code can be used under very special and very
restricted circumstances for large time savings.
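The multiply-by-10 trick can be checked in Python, where '<<' plays the role of the shift-left instruction:

```python
def times_ten(x):
    """x * 10 as the sum of two shift-left multiplications:
    (x << 3) is x * 8 and (x << 1) is x * 2."""
    return (x << 3) + (x << 1)

assert all(times_ten(x) == x * 10 for x in range(256))
print(times_ten(23))   # 230
```

On the 6510 each shift is a fast ASL, so the decomposition replaces a general multiplication with two shift sequences and one addition.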
Self-modifying code refers to the modification of instruction operands by in-
structions during program execution. Thus a constant loaded as an immediate
operand becomes a variable if its location in memory can be modified during pro-
gram execution.
The objection to self-modifying code is that it makes a program's functioning
difficult to understand or predict. The argument is that if operands vary dynam-
ically as the program is executed, it is difficult to understand how the program really
works. This argument is valid in almost all circumstances. The one exception is when
the modification is performed by and on instructions within a repetition construct at
a low program level, and the repetition executes a set number of times until its pur-
pose is accomplished as opposed to executing continuously until some outside event
occurs. In this case the context of the modification is so restricted that if the modify-
ing instructions and the modified operand are adequately commented, there will be
no mistaking their purpose or effect. Indeed, the modification in effect adds an ad-
dressing mode to the instruction set.
As an idea of the potential gains from using self-modifying code in the proper
context, look over the following example.
Example:
Design a repetition construct that moves multiple 256-byte pages from one memory ad-
dress to another. Implement it with and without self-modifying code.
One algorithm for this process is
Aside from pre-LOOP assignments, the only expansion needed to prepare this
pseudocode for programming is that of the first line. It is
decrement byte_count
EXITIF byte_count = 0
ENDLOOP
;END move
The following completed template uses no self-modifying code. It assumes that the
source and destination addresses are being passed in the locations 'source' and 'destina-
tion', with the least-significant byte first, and that the number of data pages to move is
being passed in location 'totpgs'. Because the source and destination addresses are
stored as variables, indirect indexed addressing can be used to access them.
In the self-modifying version, the locations 'source' and 'destination' are in the con-
struct itself. They are the operands of the LDA and STA operations in the inner LOOP.
The LDA and STA operations can use absolute indexing for speed because the source
and destination addresses are passed directly into the data movement instructions. Thus
the construct executes as if the construct were always written to move data between only
those two addresses. Examine the construct closely to see how this works:
As you can see, only two instructions were changed from one version to the next.
However, those two instructions were at the heart of the tight inner LOOP moving data
from the source area to the destination area. This loop executes every time a byte is
moved.
In that loop there are an LDA, an STA, a DEY, and a BNE. The only difference
between the two construct versions is the addressing mode of the LDA and the STA.
Looking up the "t-state" or "clock time" consumption for these operations and addressing modes reveals that the non-self-modifying inner LOOP takes 16 t-states for
most executions, while the self-modifying inner LOOP takes 14 t-states. This is a 12.5%
speed improvement on the inner LOOP, where the bulk of construct execution time occurs. Further, the bulk of a program's execution time tends to occur in this sort of tight
loop, so the improvement is likely to affect greatly the overall program speed. Although
this improvement is nothing like that available from choosing an appropriate algorithm,
it is still a worthwhile improvement to achieve for so little effort at the end of the im-
plementation stage.
Optimization Summary
The optimization techniques that we have discussed can be grouped and summarized
by the construct categories they affect. These groupings are:
Example:
Write a speed-efficient construct that multiplies two binary numbers together.
We will develop an algorithm based on a study of the arithmetic of multiplication. To provide symbolic labels, the data elements in a multiplication problem are named as follows:
multiplicand
x multiplier
product
The simplest way to solve a multiplication problem is to add the multiplicand to itself "multiplier" times. Because of the wide range of values a multiplier can take on, however, an algorithm based on this method will execute a number of operations proportional to the size of the multiplier. The order of such an algorithm is O(n), where n represents the multiplier size as the number of integer values between 0 and the value of the multiplier.
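The repeated-addition method can be sketched in a few lines. Python is used here purely as executable pseudocode; the book's actual constructs are written in 6510 assembly language.

```python
def repeated_add_multiply(multiplicand, multiplier):
    """O(n) multiplication: add the multiplicand to itself 'multiplier' times."""
    product = 0
    for _ in range(multiplier):   # one addition per integer value up to the multiplier
        product += multiplicand
    return product
```

For a one-byte multiplier this loop may run as many times as the multiplier's value, which is why a better algorithm is sought below.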
The widely varying execution time of such an algorithm with different multiplier values causes us to look for a better algorithm. Since a pencil-and-paper "long multiplication" can solve the problem "123 x 123" in about the same length of time as the problem "789 x 789," it seems that there should be a method of computer multiplication that is similarly insensitive to the specific values being multiplied. Indeed, such a method can be directly adapted from long multiplication. Observe the following long multiplication:
        456
      x 301
        456    partial product
       000     partial product
   + 1368      partial product
     137256    product
The long-multiplication method uses one single-digit multiplication plus one addition
per multiplier digit, instead of the conceptually simpler multiplier additions of the
previous method. The long-multiplication method can be made more repetitive by
adding the partial products into a running sum as they are generated.
First observe the decimal-number version of the long-multiplication algorithm:
Note that the multiplicand is multiplied by 10 on each pass through the loop to compensate for the increasing place values of the multiplier digits, so that they can always be treated as individual digits. In paper-and-pencil long multiplication we do something similar; each multiplier digit is treated as a single-digit multiplier, but the resulting partial product is shifted left one place before it is added to the others.
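The decimal loop just described can be sketched as follows. This Python rendering is an illustration, not the book's listing: one single-digit multiplication and one addition per multiplier digit, with the multiplicand multiplied by 10 on each pass.

```python
def decimal_long_multiply(multiplicand, multiplier):
    running_sum = 0
    while multiplier > 0:
        digit = multiplier % 10               # rightmost remaining multiplier digit
        running_sum += digit * multiplicand   # add this partial product into the sum
        multiplicand *= 10                    # compensate for the next place value
        multiplier //= 10                     # discard the digit just used
    return running_sum
```

Applied to the worked example, decimal_long_multiply(456, 301) produces 137256.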
The long-multiplication algorithm becomes even simpler when adapted for binary numbers. Since the only allowable binary digits are 0 and 1, the 'partial_product' of each internal LOOP multiplication will equal either 0 or the multiplicand itself. This means that the binary form of this algorithm can be performed without using any multiplication operations whatsoever. The available CPU operations are then sufficient for performing all aspects of this algorithm.
Because of the machine details of the CPU, it turns out that it is slightly faster to
shift the running sum one position right instead of shifting the multiplicand one posi-
tion left. Shifting the running sum right is equivalent to the hand operation mentioned
above, of shifting the partial product one place left, as long as the bit positions of the
final running sum are interpreted with the proper weights or place values.
The final form of the long-multiplication algorithm for binary numbers is
ENDIF
shift 'running_sum' one place right
EXITIF leftmost 'multiplier' bit has been used
obtain the next-left 'multiplier' bit
ENDLOOP
done ('running_sum' holds the result)
;END multiply
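The complete shift-add loop can be modeled as follows. This Python sketch is an illustration of the pseudocode, not the book's code; it widens the running sum to twice the operand width and shifts it right once per multiplier bit, exactly as described above.

```python
def shift_add_multiply(multiplicand, multiplier, bits=8):
    """Shift-add multiplication: at most one addition per multiplier bit."""
    running_sum = 0
    for _ in range(bits):
        if multiplier & 1:                        # current multiplier bit is 1?
            running_sum += multiplicand << bits   # add multiplicand into the upper half
        running_sum >>= 1                         # shift 'running_sum' one place right
        multiplier >>= 1                          # expose the next-left multiplier bit
    return running_sum
```

Note that the right shift of the running sum replaces the left shift of the multiplicand, as the text explains; shift_add_multiply(255, 255) yields 65025 using only 8 additions.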
This algorithm, which we'll call the shift-add algorithm, performs at most one addition for every bit in the multiplier. The logarithm of a value is equal to the number of times that its number base, 2 in this case, must be multiplied by itself to equal the value. In other words, the base-2 logarithm of n, log2(n), is the power of 2 that produces n. With a little reflection you can see that the logarithm of an m-digit number in any base is between m - 1 and m. Thus the shift-add algorithm, which performs at most m additions on an m-digit multiplier equaling n, is an O(log2(n)) algorithm.
For a one-byte multiplier, the shift-add algorithm requires up to 8 additions, while the repeated-addition algorithm requires up to 256 additions. You can see the advantage an O(log2(n)) algorithm such as shift-add multiplication has over an O(n) algorithm such as repeated-add multiplication.
The shift-add algorithm can now be expanded. Most of the lines involve shifts,
adds, and bit tests. There are assembly language instructions for each of these opera-
tions, so the pseudocode is nearly detailed enough to program from. The second stage
of optimization must now be considered.
Recall that the role of data in the algorithm must be identified to optimize the im-
plementation. The input data elements are the multiplier and multiplicand, and the out-
put data element is the running sum.
The next step in optimization is to identify the roles of the multiplier, multipli-
cand, and running sum data elements in the processing. Their placement in registers and
memory can then be chosen for the greatest efficiency. According to the pseudocode,
these data elements have the following roles:
If we restrict the multiplier and multiplicand to 8 bits in length, both will fit in single
registers. The running sum for an 8 x 8 multiply can be up to 9 bits long, so it requires
two bytes of storage.
These facts allow us to choose storage locations. The most restricted role is taken
by the running sum, which must be placed where it can both be shifted and added.
There is only one location that both of those operations can access and leave results in;
that is the accumulator. Therefore, we will keep the lower part of the running sum in the
accumulator. The upper part of the running sum will be kept in page-zero memory so
that bits can be quickly shifted from the lower to the upper byte (recall that shift opera-
tions can address both the accumulator and memory).
The multiplicand is added to the running sum, which means that it must be placed where the addition operation can access both. Since only memory locations fit that requirement, the multiplicand will be placed in page-zero memory for fastest access.
Finally, the multiplier must be shifted to obtain its bits from right to left. This in-
dicates using the LSR operation repeatedly to place the bits from right to left into the
carry flag for testing and use. Only memory locations and the accumulator can be ad-
dressed by the LSR operation, so with the accumulator taken, the multiplier must also
go into page-zero memory.
The EXITIF line indicates that the multiplication has been completed when the leftmost bit in the multiplier has been used. The easiest way to know this has happened is to count bit shifts and tests until eight (for an 8 x 8 multiply) have occurred. This is easily done by loading the X register with the number 8 and decrementing it each time through the loop until it reaches 0.
The location assignments we have made are:
Number Location
use its conditional BRANCH operation for the usual JMP at the ENDLOOP as well. Finally, we will add an instruction at the end to place the accumulator byte of 'running_sum' into page-zero memory. With these revisions the completed construct can now be written. Pseudocode lines that translate directly into assembly language without adding any additional information are omitted.
;ASSIGNMENTS
BITCNT = 8      ;number of bits in multiplier
* = $80         ;memory data start at 80h
RUNSUM *=*+2    ;'running_sum'
MULTCD *=*+1    ;'multiplicand'
MULTPL *=*+1    ;'multiplier'
;END ASSIGNMENTS
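Since the assembled listing itself is not reproduced here, the following Python model (an interpretation of the design decisions above, not the book's code) mirrors the construct byte by byte: the running sum's low byte lives in the accumulator, its high byte in page zero, a counter stands in for the X register counting BITCNT = 8 bit tests, and the 16-bit right shift is done ROR-style through the carry.

```python
def multiply_8x8(multiplicand, multiplier):
    runsum_hi = 0                 # upper running-sum byte (page-zero RUNSUM)
    a = 0                         # accumulator: lower running-sum byte
    for _ in range(8):            # X register counts down from BITCNT = 8
        multiplier, carry = multiplier >> 1, multiplier & 1   # LSR MULTPL: bit into carry
        if carry:                                             # BCC skips the add when the bit is 0
            total = runsum_hi + multiplicand                  # CLC / ADC MULTCD on the upper byte
            carry, runsum_hi = total >> 8, total & 0xFF       # 9th bit lands in the carry
        else:
            carry = 0
        # 16-bit right shift through the carry: ROR the upper byte, then ROR A
        runsum_hi, bit0 = (carry << 7) | (runsum_hi >> 1), runsum_hi & 1
        a = (bit0 << 7) | (a >> 1)
    return (runsum_hi << 8) | a   # 16-bit product
```

At no point does any quantity exceed the 9 bits that one byte plus the carry flag can hold, which is what makes the byte-level construct workable.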
dividend / divisor = quotient + remainder / divisor
We will write the algorithm for a 16-bit dividend, an 8-bit divisor, and a 16-bit quotient.
As the quantity being divided, the dividend roughly corresponds to the running sum
quantity produced by the multiplication. The final running sum was stored in two
memory locations and used the accumulator for intermediate storage and calculation,
so the dividend will reverse this pattern and start out in two memory locations and also
use the accumulator for intermediate storage and calculation. However, the division
algorithm leaves the remainder in the accumulator when it is finished.
Like the multiplier and multiplicand of the multiplication operation, the divisor
and quotient will be stored in memory locations.
Reversing the logic of the multiplication algorithm produces a shift-subtract algorithm for division. It encloses the central LOOP with an IF test for division by 0, which of course was not needed for multiplication. The algorithm is given below for your study or use.
;divide two binary numbers
set quotient to 0
IF divisor <> 0
THEN
LOOP
shift dividend left 1 place into accum
IF divisor <= accum
THEN subtract divisor from accum
put 1 in quotient's LSBit
ENDIF
EXITIF last dividend bit has been placed in A
shift quotient left one place
ENDLOOP
ELSE
set 'cannot divide by zero' indicator (byte, msg, etc.)
ENDIF
;END divide
After the execution of the algorithm either the remainder is in A and the quotient is in
memory, or the "cannot divide by zero" indicator has been set.
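The pseudocode above can be rendered in Python as follows (again an illustration, not the book's code; the quotient shift is folded in before the bit test, which is equivalent to the pseudocode's shift-after-EXITIF ordering).

```python
def shift_subtract_divide(dividend, divisor, bits=16):
    """16-bit dividend / 8-bit divisor -> (quotient, remainder)."""
    if divisor == 0:
        # the ELSE branch: 'cannot divide by zero' indicator
        raise ZeroDivisionError("cannot divide by zero")
    quotient = 0
    accum = 0                                  # remainder accumulator (the A register)
    for i in range(bits):
        quotient <<= 1                         # shift quotient left one place
        # shift the next dividend bit (MSB first) into the accumulator
        accum = (accum << 1) | ((dividend >> (bits - 1 - i)) & 1)
        if divisor <= accum:
            accum -= divisor                   # subtract divisor from accum
            quotient |= 1                      # put 1 in quotient's LSBit
    return quotient, accum                     # the remainder ends up in accum
```

For example, shift_subtract_divide(50000, 7) returns (7142, 6), since 7 x 7142 + 6 = 50000.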
Exercise:
Use the following questions to review optimization.
(y) Which stage of optimization provides the largest efficiency gains?
(z) How do you estimate the order of an algorithm?
(aa) Name three ways of optimizing an implementation.
Once optimized code has been written, it must be tested to verify that it works
properly under a variety of expected and unexpected circumstances. This testing
process is our next topic.
146 Into Its Brain: 6510 Assembly Language Chap. 3
TESTING
All three structural levels of a program must be tested, but testing begins at the con-
struct level. To test a construct fully, the construct would have to be run with all
possible legal and illegal combinations of input and output data values. With even a
small construct such as the one in the multiplication example of the preceding sec-
tion, it could take months to run all the test cases that would have to be executed for
total testing. Running all the test cases for an entire module can literally take cen-
turies.
A more sensible approach is to identify a complete set of categories for all
types of input and output values, and to run a representative test of each category.
Six major categories cover this need sufficiently, although only a subset of them may
apply to any given construct.
The six categories to test for are:
The procedures for running sufficient tests in each category are described
below.
1. Normal output data. Select a typical output value (or values if multiple
data elements are produced by the construct) for each legal value range or type, and
try to obtain it from all significant combinations of typical input values that should
produce it. A typical value is a value that is in a legal range of values and that is not
otherwise distinguished from its surrounding values. So a construct that performs a
simple transformation on either positive or negative values should be tested under
this category twice: once for normal positive output and once for normal negative
output.
In the multiplication example of the preceding section, a reasonable test case
would be a number that is the product of two prime numbers. The two possible com-
binations of typical input values would be with one prime as multiplier and one as
multiplicand, and then to reverse their roles as multiplier and multiplicand.
2. Boundary input data. Select all input values that are both legal and yet
distinguished from their surrounding values, and run the construct to test for the ex-
pected output values.
The value 0 is often considered a boundary value, when it is not illegal as in the
divisor of the division algorithm, for instance. 0 is a boundary input value for the
multiplier and multiplicand of the multiplication example of the preceding section.
The largest and smallest values allowed by a construct are also boundary values. Other boundary values can be determined by examining the specific construct.
3. Boundary output data. Select all output boundary values and, as in
category 1, test all significant combinations of input values expected to produce each
output value.
4. Invalid input data. Select representative input values that are illegal to a
construct's processing, and run the construct to see if the expected results are ob-
tained (the result should be some type of error indication and recovery from the
error). Illegal data are usually values that are meaningless or out of bounds to the
construct's processing. Any construct in which such an error occurs should be
changed in structure to allow an error trap execution path that reacts appropriately
to the invalid data. The division algorithm contained an error trap for division by
zero; it used a selection construct to allow it to react to either valid or invalid input
data.
5. Invalid output data. This category requires a thought test rather than a
code execution. Imagine possible categories of invalid outputs and examine the con-
struct to see if data in any of them can be produced. An invalid output is an output
that is neither a normal valid output nor an error message. The processing that allows any such values should then be corrected at the highest level of pseudocode at which the processing mistake was introduced.
6. Input data that exercise an execution path. The primary purpose of some
constructs is to select between actions rather than to transform input data into out-
put data. The first five test categories may not exercise all execution paths in such a
construct. For instance, if the construct has a CASE structure that selects different
actions for each of several normal input values, none of the foregoing categories will
reliably exercise all the CASE options. Such a construct should be fully exercised by
selecting a set of input data that will activate every execution path in the construct.
The programmers should determine and look for the expected output of each test
case.
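Applied to the multiplication example, the first few categories might look like this in Python (an illustration only; 'multiply' here is a stand-in for the construct under test).

```python
def multiply(a, b):
    # Stand-in for the 8 x 8 construct under test; masks to a 16-bit product.
    return (a * b) & 0xFFFF

# Category I, normal output: the product of two primes, with the operands'
# roles as multiplier and multiplicand reversed for the second case.
assert multiply(7, 11) == 77
assert multiply(11, 7) == 77

# Category II, boundary input: 0 and the largest one-byte value.
assert multiply(0, 123) == 0
assert multiply(255, 255) == 65025
```

Each expected result is written down before the test is run, as recommended below.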
With every construct you test, make your best effort to identify inputs and outputs in each category. Satisfy yourself that a category truly does not apply to a construct before you move on.
The particulars of each test case you run must be written down to ensure that
the test is thorough and that its relationship to the other test results is clear. The in-
formation you need to record includes the test category, its input and output data,
the expected results, and the actual results when the test is run. To keep yourself
honest, be sure and write down the expected results before running the test. The
scorecard sheet shown in Fig. 3.21 is a useful format for recording this information
and for examining it afterward.
A successful test is a test that finds an error. A test that finds no errors does not guarantee error-free code; it simply has failed to unearth errors that may still exist. So, as you design test cases, try to use your intuition within the testing categories listed above to pick the highest-risk cases and increase the likelihood of successful testing.
Test Categories
I) Normal output data
II) Boundary input data
III) Boundary output data
IV) Invalid input data
V) Invalid output data
VI) Input data that exercises a particular execution path
Figure 3.21
Testing for errors and correcting them is called debugging. A program to assist
in object code testing, called a debugger, was probably included in your assembler
purchase. Machine-language monitors do not include a debugger, but they include
many debugger functions. Debuggers allow you to place data of your choice in memory or in registers, to execute a program from any address you choose, to execute it one step at a time and then observe the register and memory contents between each instruction execution, and to execute instructions in the range between a start and stop address, to name a few of the most useful functions.
As a construct or module is executed with test cases, two types of errors will
become apparent. There will be errors due to incorrect assembly language notation.
This type of error includes among other things the use of the wrong number-base or
addressing-mode emblems, mistyped labels, and the use of the incorrect mnemonic
for a given operation. Most of these errors are caught by the assembler before object
code is produced. This type of error can be corrected by changing individual instruc-
tions in the completed assembly language template.
A more serious type of error includes those where the algorithm itself or some
expansion of it works differently than desired. These errors must be corrected at the
pseudocode level in which they were introduced. The pseudocode must then be reex-
panded as in the initial construct development to produce a new version of the con-
struct.
Some of the most common but easiest-to-fix errors of this type occur in the
condition-testing instructions of selections and loops (i.e., in IFs, CASE com-
parisons, and EXITIFs). If the instructions test only for the expected values of a
variable, and the construct has execution options only for the expected values of the
variable, an unexpected value of tested variable can cause undesired and erroneous
results.
To prevent this from happening, a technique called error trapping is used.
Every type of value a condition variable can take on is tested for and reacted to with
a safe, meaningful construct action. So even those values that seemingly should
never occur during construct execution will be met with an error message, a prompt
for a new user input, or some other graceful response. Of course, if an erroneous
condition-variable value ever does occur, the construct or constructs causing that
value should be corrected.
Example:
Illustrate the effects of using and not using error trapping.
Consider a program that calculates the amount of dynamite needed to blast a tun-
nel section without caving in the rest of the tunnel. Assume that a particular CASE con-
struct in the program selects from its calculating options by testing a simple selection
variable named 'rock_type'. Rock_type has three possible values: 'igneous', 'sedimentary', and 'metamorphic', which have been defined as the values 0, 1, and 2, respectively.
The CASE construct could test 'rock_type' for the values 0 and 1, and assume that if 'rock_type' equaled neither of these, it must equal 2. This seems like a reasonable assumption, and saves writing a few assembly language instructions.
However, assume 'rock_type' is assigned its value by another construct as the program runs. If the assigning construct is faulty, it might place a value other than 0, 1, or 2 in the variable. For instance, it might set bit d7 of the value temporarily for some reason and fail to reset the bit. Thus the intended value 1 would be passed as the value 81h, the CASE construct would test for values 0 and 1 and then interpret the "81h" as if it were 2 or 'metamorphic', and the user would get seemingly valid but, in fact, faulty results. If the program reached the field, a geology professor or experienced ordnance person might catch the error, but someone less skilled might follow the program's recommendations and destroy an entire tunnel in one blast!
Simple error trapping would prevent this from happening. The CASE construct would test 'rock_type' for the values 0, 1, and 2, and assume that if 'rock_type' equaled none of these, there was an error. The fourth execution option, for an erroneous rock_type value, would print an error message and allow the programmer to find the error before the program reached the field.
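The difference between the two CASE constructs can be shown directly. This Python sketch is illustrative only; the values and the trap response come from the example above, and the function name is invented.

```python
IGNEOUS, SEDIMENTARY, METAMORPHIC = 0, 1, 2

def select_calculation(rock_type):
    # Error-trapped CASE: every legal value is tested explicitly, and the
    # fourth execution option traps anything else instead of letting a
    # corrupted value (such as 1 with bit d7 set) pass for 'metamorphic'.
    if rock_type == IGNEOUS:
        return "igneous calculation"
    elif rock_type == SEDIMENTARY:
        return "sedimentary calculation"
    elif rock_type == METAMORPHIC:
        return "metamorphic calculation"
    else:
        raise ValueError("invalid rock_type: %r" % rock_type)
```

With the trap in place, select_calculation(0x81) raises an error instead of quietly behaving as if the rock were metamorphic.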
Exercise:
Answer the following questions to review the testing process.
(bb) What are the differences between normal, boundary, and invalid test cases?
(cc) What is the difference between a syntactic error and a logical error? What is the
difference in how they are corrected?
152 Imposing Reason: Program Planning Chap. 4
quality literary, musical, and artistic works before, had to be observed or deduced,
tested, and then recorded. Such laws, and the techniques they spawn, release one's
creativity by expanding the scope of the work that one can create and by minimizing
one's unprofitable efforts.
Since programming was at that time still caught up in small-scale thinking, the first laws sought for were those that concerned the coding of individual instructions in a program. The principle that a program should have some sort of structure evolved into two laws: the law that a program should be divided into modules, and the law that program structures should be limited to the three basic construct patterns. Upon these laws the phase of software development known as structured programming was built. Many at that time embraced structured programming as the final solution to the problems of software development. As with the earlier "bag of tricks" style of software development, structured programming has become an end in itself for many programmers. The laws and techniques of structured programming were discussed in Chapter 3, although they were never labeled as such. We will resummarize them shortly.
Unfortunately, programmers using just these techniques and laws found that
although their constructs and modules were easy to write and understand, those
structures often failed to cooperate with each other in performing an overall task.
The resulting programs frequently failed to run at all. The need to discover higher
laws of software development became obvious.
By 1975 the major laws governing the design of the highest structural level of a
program, the hierarchy, had been identified. These laws became the basis for the
phase of software development known as structured design.
Programs written according to the principles of structured design and struc-
tured programming were easy to develop and to understand, usually worked when
completed, but often performed the task incorrectly! There was still no reliable way
to divide a task into the right subtasks, the subtasks that when performed together
would completely perform the original task. Of course, this meant that there was no
good way to select modules for an intended program. Even so, most programmers
today practice no more than structured design and structured programming.
By 1977, laws sufficient for dividing tasks reliably had finally been identified.
These laws were the basis for the phase of software development called structured
analysis.
The term "structured" introduces the name of each phase because each phase
is conducted along a well-defined pathway of techniques and laws. The programmer
is guided through the entire software development cycle by a structure of activities
that if followed faithfully will lead to a program that accomplishes everything the
programmer originally intended. This assumes, of course, that the programmer's in-
tentions are within the physical limitations of the computer.
In this type of software development the programmer starts with structured
analysis, follows with structured design, and finishes with structured programming.
Structured programming can be viewed as the implementing of a program model
derived from structured analysis and structured design. The latter two phases are
parallel to the two phases of data modeling discussed in Chapter 2. Let's look at the
three phases from this perspective.
As Chapter 2 explained, the general name for the first phase of modeling is
analysis. Analysis identifies the purest form of a system among its sometimes con-
fusing surroundings, divides it into its fundamental parts, and defines what each
part must do in the overall system. This defines the requirements that any copy of a
system must meet to make it equivalent to the original system, so analysis is
sometimes called requirements definition. Analysis can be applied to both real and
imaginary systems, but in programming the system to be analyzed is usually an
imaginary one that the programmer wishes to implement.
The structured analysis phase identifies the extent of a task, identifies its com-
ponent subtasks and their data interfaces, and defines what the subtasks do. In a
similar manner, data selection identifies the scope of information needed to perform
a task, identifies its component data elements, and defines their role in the task.
Chapter 2 also noted that the general name for the second phase of modeling is
synthesis. Synthesis recombines the parts defined by analysis into an efficient model
of the original system. This model is thorough enough to implement an actual
system from. Synthesis is often called design.
The structured design phase synthesizes component subtasks to produce an ef-
ficient model for performing an overall task. This model embodies the top level of
program structure, the hierarchy. Similarly, data structuring synthesizes data
elements to produce a model for an efficiently processed data structure. As indicated
in Chapter 2, data structures can, but need not, be hierarchical.
Finally, as we have already said, structured programming implements the
design model into the working system called a program.
We will be studying these phases in reverse order, in the order that they were
developed instead of the order that they are normally used, so that you will be able
to use the phases as you learn them. This order also puts the role of each phase in
clearer perspective.
Therefore, the body of this chapter begins with a summary of structured pro-
gramming, with which you are already familiar. Next is the discussion of structured
design. Finally, structured analysis is covered. Together, these three development
phases will provide a complete method for utilizing your computer to perform any
task within its memory, speed, and I/O capabilities. The Commodore 64's memory
and speed capabilities were discussed in Chapters 1 and 3. Its I/O capabilities will be discussed in Chapters 5, 6, and 7.
STRUCTURED PROGRAMMING
1. Modularity. Programs should be divided into modules that perform the dif-
ferent functions of the overall program task.
2. Structured code. Program instructions should be organized into one of three
processing structures: the sequence, the selection, or the repetition.
These principles and techniques were illustrated with many examples in Chapter 3,
so we will omit further examples here.
We are now familiar with all the activities of structured programming. Before
these activities can begin, a model of the hierarchical module structure must be made
from which to program. This model defines all the modules needed in the program
and their interrelationships. From this model each module can be written using
structured programming. When all the modules shown in the model have been writ-
ten, and they have been tested both individually and as a group, the program will be
complete.
Developing a model of a hierarchy of modules that performs a given task in
the most efficient manner possible is the goal of structured design.
STRUCTURED DESIGN
According to Webster's, to design is to "plot out the shape and disposition of the
parts"* that will go into a product. The plotting is typically carried out on a model
of the product. By this definition, the shape and disposition of model parts are the
most important aspects of a design. This definition strongly implies that the shaping
and placing of those parts are the two most important design activities.
A familiar illustration of design is found in the automotive industry. An
automobile designer knows from the start that the car he or she will design must in-
clude certain parts, such as an engine, a transmission, brakes, and a body. These
parts and their relationships have been identified by prior analyses. The auto
*Webster's Third New International Dictionary, G. & C. Merriam Company, Springfield, Mass., 1971.
designer designs the car by shaping the imaginary parts and placing them in a model
of the physical car. This model has traditionally been kept as drawings on paper, but
is increasingly being stored as computer files that can be viewed and altered from
video terminals.
In software design, the component parts are the basic stand-alone subtasks defined by analysis. Also defined by analysis are the data flows needed between subtasks for the overall task to be completed. At the beginning of the design process the
subtasks are notated in a hierarchy chart as modules that perform those subtasks.
The duties of these modules will be adjusted and refined during the design process as
the model is perfected.
Parameter-passing connections between modules, called interfaces, are the
offspring of the data flows identified between subtasks during the structured
analysis phase. We shall see examples of all these things shortly.
The programmer designs a program by shaping and placing the prospective
modules into a model of the most efficient way to perform the overall task. Modules
are shaped by shifting their inner functions between them. Modules are placed by
setting them into a hierarchical framework and by moving them around within that
hierarchy to alter their working relationships. The details of shaping and placing will
be discussed shortly.
Program models are still usually kept on paper, although they will be stored
and manipulated on computers as software is written to support those functions.
As we have said, making an optimal program model is the goal of structured
design. Therefore, our first concern in studying structured design is to understand
what the program model is.
A program model consists of three items. One of these items is the central compo-
nent of the model, and the other two elaborate on the first. The first item is the
module hierarchy chart, and the latter two are the data dictionary and the module
specifications. After we describe each of these items we will illustrate it with an ex-
ample.
Module hierarchy chart. The module hierarchy chart is the heart of the
program model. It shows modules in an organizing framework with information
passing between them. These chart elements are discussed separately below.
Modules. Recall from Chapter 3 that a hierarchy chart represents modules
as boxes. Each box contains a short description of the module's actions, in the now
familiar "imperative verb ... object" format.
Organizing Framework. The module boxes are placed in rows or "levels" in
the chart. Any two module boxes on adjacent levels can be connected by a line to
signify that the upper module utilizes or invokes the lower one to perform part of its
overall task. During a structured programming phase using assembly language, all such connections will be implemented with a calling JSR instruction in the upper module, and a returning RTS instruction at the end of the lower module. However, the means by which the upper module uses the lower is irrelevant to the design process, so hierarchy charts are equally useful in programming with languages that do not use JSRs and RTSs.
Control interfaces of the first type are called command interfaces, for obvious
reasons. By issuing such commands, the manager becomes intimately involved with
the details of each worker's job. This has several disadvantages. The manager's
overall function becomes diluted with bits and pieces of functions lifted from its
subordinate modules. The dismembered jobs of the workers are never given the
isolated programmer attention necessary to make sure that they have been totally
understood and accounted for. The programmer has to consider the same job details
in more than one module, and must make sure that all modules over which a job is
spread will work together as expected.
Command variables have a selection organization, with each value option
representing one way to do the job. Such a variable might, for instance, command
one of several possible destinations for the worker module to send its data results to,
or command one of several possible actions for the worker to carry out.
Command interfaces do not belong in polished hierarchy charts. Eliminating
them is a part of the design process.
Control interfaces of the second type are called status interfaces, because
worker modules use them to tell their managers what happened when they did their
jobs (i.e., to give their managers a status report). For instance, if a manager module
invokes a subordinate whose job is to search a disk directory for a user-requested
file, when the subordinate is finished the manager will want to know whether or not
the file was found. This information cannot be considered a data interface because it
is not transformed into any part or stage of the data that the program will produce.
It is a control interface that the manager uses to decide what to do next. Since it does
not command any action of the manager, it is not carried by a command variable. It
is instead carried by a status variable.
Status variables, like command variables, have a selection structure. Their
value options represent the different possible types of outcomes of a module's task.
If there is only one type of job outcome, or if different types of outcomes are ade-
quately described by the module's data interfaces, status variables are unnecessary.
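A status interface of this kind can be sketched as a simple return value. The Python fragment below is only an illustration, not the book's model: the names and the two-outcome status set are hypothetical. The worker searches a directory and reports one of a fixed set of outcomes; the manager branches on the status without knowing how the search was done.

```python
# Hypothetical illustration of a status interface: the worker reports
# one of a fixed set of outcomes; the manager only branches on it and
# never needs to know how the worker produced it.

FOUND = "found"
NOT_FOUND = "not_found"

def find_listing(directory, name):
    """Worker: search the directory, returning a status variable
    alongside a data interface (the listing's position)."""
    for position, listing in enumerate(directory):
        if listing["name"] == name:
            return FOUND, position
    return NOT_FOUND, None

def retrieve_number(directory, name):
    """Manager: use the status only to decide what to do next."""
    status, position = find_listing(directory, name)
    if status == FOUND:
        return directory[position]["number"]
    return None

directory = [{"name": "SMITH", "number": "2135551234"}]
print(retrieve_number(directory, "SMITH"))   # prints 2135551234
```

Note that the status value commands nothing: the manager alone decides what the outcome means, which is why status coupling is looser than command coupling.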
The manager module receiving a status variable need not know how the send-
ing worker module produced it. Thus most of the disadvantages that come with
command interfaces do not apply to status interfaces. Nevertheless, status interfaces
are minimized in a good design model. This is because a good design places the most
closely related functions together in their own modules, and the information needed
for decision making can then be obtained within the module making the decision.
All command interfaces and most status interfaces are implementation depen-
dent; that is, they will vary between different programs written to perform the same
task. Therefore, they are usually identified during the design phase as the model's
need for them is recognized. Again, a control interface passing downward is carried
by a command variable and marks a flawed design, while a control interface passing
upward is carried by a status variable and indicates a possible area of design im-
provement.
An interface is shown in a hierarchy chart with its name beside an arrow point-
ing from its source to its destination. The arrow's tail indicates the type of informa-
tion carried.
[Figure 4.1: the arrow-tail symbols used to distinguish data, status, and command interfaces.]
listing, and must know both a name and a phone number to add a listing.
'Retrieve_number' must know the name in a listing. The 'name' and 'number' data in-
terfaces provide this information.
The module hierarchy chart provides a complete overview of a system that per-
forms a particular task. Detailed definitions of the interfaces and modules are kept
in the supporting data dictionary and module specifications.
[Figure 4.2: the module hierarchy chart for the 'manage_directory' example, with numbered module boxes such as 'store listing' (2.1) and 'directory' (4.2).]
Example:
Give the definitions for the data and control interfaces in the 'manage_directory' hier-
archy.
Assume that the directory is defined as being from 1 to 100 listings in length,
where each listing contains a person's name and telephone number. Each 'name' data
structure is 10 bytes long, to hold up to ten ASCII characters. The name put in the data
structure will be left-justified (i.e., the first letter of the name will be in the first byte of
the data structure, and the data structure will be padded with ASCII space characters as
needed after the name to fill out the 10 bytes). Assume also that phone numbers are
always 10 digits long, so that every phone number includes an area code. Preceding the
first listing is a byte whose value equals the total number of listings in the directory at
the time. This directory will only be accessed by the lowest-level worker modules, which
is why it is not named as an interface in the hierarchy chart. It is defined in the data dic-
tionary as a system interface.
The definitions are listed alphabetically to make them easier to locate.
Pay close attention to the 'directory' definition, since it plays a particularly important
role in the example in the next section.
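The byte layout just described can be made concrete with a short sketch. The Python fragment below is illustrative only, not the book's implementation: a count byte followed by up to 100 fixed-size listings, each a 10-byte left-justified, space-padded ASCII name and a 10-digit phone number.

```python
# Illustrative sketch of the directory layout described in the text:
# byte 0 holds the listing count; each listing is a 10-byte ASCII name
# (left-justified, space-padded) followed by a 10-digit phone number.

MAX_LISTINGS = 100
NAME_LEN = 10
NUMBER_LEN = 10
LISTING_LEN = NAME_LEN + NUMBER_LEN

def new_directory():
    # one count byte plus room for 100 listings
    return bytearray(1 + MAX_LISTINGS * LISTING_LEN)

def add_listing(directory, name, number):
    count = directory[0]
    if count >= MAX_LISTINGS or len(name) > NAME_LEN or len(number) != NUMBER_LEN:
        return False
    offset = 1 + count * LISTING_LEN
    directory[offset:offset + NAME_LEN] = name.ljust(NAME_LEN).encode("ascii")
    directory[offset + NAME_LEN:offset + LISTING_LEN] = number.encode("ascii")
    directory[0] = count + 1
    return True

d = new_directory()
add_listing(d, "SMITH", "2135551234")
print(d[0])               # 1 listing so far
print(bytes(d[1:11]))     # b'SMITH     '
```

The fixed-size, space-padded fields mirror the kind of record layout an assembly language program would index with simple offset arithmetic.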
Example:
Show the module specifications for the modules in the 'manage_directory' hierarchy.
These specifications are given below. You must refer to the hierarchy chart and
data dictionary to understand fully the internal workings of these modules. Make sure
that you grasp these specifications, for they will be seen again later in the design process.
You should note that the user's input options have been intentionally and some-
times significantly limited for the sake of simplicity. For instance, once a 'job_type' has
been selected there are no opportunities for the user to "bail out" without giving all the
inputs for that job and seeing it to its completion. Such options are very desirable in real
programs, but also add considerably to their complexity. Note also that specific
algorithms for significant module functions, such as searching the directory, are not a
part of the module specifications. Algorithm selection belongs in the programming
phase and not in design. Finally, remember that this entire model, including these
module specifications, is unimproved so that we can practice the techniques and laws of
structured design on it later. Most of the complexity in this model is due to its unim-
proved condition, and the eventual improved model will be much simpler.
'manage_directory';manage the use of the phone directory
LOOP
accept_inputs
EXITIF job_type = 'quit'
CASE job_type OF
transfer: transfer_directory
retrieve: retrieve_number
update: update_directory
ENDCASE
show_result
ENDLOOP
END 'manage_directory'
IF listing_search_result found
THEN isolate_number
ENDIF
END 'retrieve_number'
Programming these modules would require selecting algorithms and developing their
pseudocode to the template level of detail. Programming the I/O operations (i.e., the
data movement to and from disk, TV, and keyboard) requires knowledge of the com-
puter's operating system. This information is supplied in Chapter 5.
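The control skeleton of the 'manage_directory' pseudocode above can be rendered in runnable form. This is only a sketch: the worker modules here are stand-in functions returning placeholder results, since the real versions would perform the disk and keyboard I/O described in Chapter 5.

```python
# Runnable sketch of the 'manage_directory' control skeleton. The
# job handlers are stand-ins; real versions would do disk, TV, and
# keyboard I/O as described in Chapter 5.

def manage_directory(jobs):
    """Process a sequence of job_type inputs, as in the pseudocode."""
    results = []
    for job_type in jobs:              # LOOP ... accept_inputs
        if job_type == "quit":         # EXITIF job_type = 'quit'
            break
        if job_type == "transfer":     # CASE job_type OF
            result = "transferred"
        elif job_type == "retrieve":
            result = "retrieved"
        elif job_type == "update":
            result = "updated"
        else:
            result = "unknown job"
        results.append(result)         # show_result
    return results

print(manage_directory(["update", "retrieve", "quit", "transfer"]))
# ['updated', 'retrieved'] -- nothing after 'quit' is reached
```

The EXITIF/CASE structure of the pseudocode maps directly onto the break and if/elif chain here.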
The algorithms for most of these modules will be straightforward. However, a
few modules, such as 'find_listing', which searches the directory for individual listings,
require a more significant algorithm choice during structured programming. For in-
stance, the simplest search algorithm for 'find_listing' is a front-to-back comparison
of each directory listing against the desired directory name. The time taken by this
algorithm depends on how listings are added to the directory (another algorithm
choice). Assuming that the listings are in no particular order, the average time taken by
a sequential search of n listings would be proportional to n/2. This is because the
average length of all searches is halfway through the directory. This algorithm with no
data ordering has order n performance, since multiplied or divided constants such as the
2 in n/2 are ignored in algorithmic order. The order of this algorithm can be decreased
(i.e., its speed improved) by placing the listings in the directory in decreasing order of
usage. Thus the most-often-accessed listing will be first, the second-most will be second,
and so on. The simplest way to achieve this is to prompt the user to input listings in this
order, but listings added later will be out of order or will require additional program
logic to place them in their proper position. In this application unordered data elements
are sufficient. Thus listings can simply be added to the end of the directory until the
directory is full. However, this algorithm with frequency-ordered data, or more ad-
vanced search algorithms of order down to n × log₂(n), would be desirable with a larger
Structured Design 165
directory or a more demanding search problem. The most generally useful of the ad-
vanced search algorithms are discussed in the follow-up volume Beyond Power Pro-
gramming.
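The trade-off described above can be demonstrated with a hypothetical sketch (not the book's code): a sequential search that counts its comparisons, run first on unordered listings and then on the same listings frequency-ordered so the most-requested name comes first.

```python
# Illustrative sequential search: order-n performance, since on
# average it examines about half of n unordered listings before
# finding a match.

def find_listing(directory, name):
    """Return the position of 'name' and the comparisons made."""
    comparisons = 0
    for position, listing in enumerate(directory):
        comparisons += 1
        if listing == name:
            return position, comparisons
    return None, comparisons

unordered = ["ADAMS", "BAKER", "SMITH", "JONES"]
# Frequency ordering: the most-often-requested name goes first.
ordered = ["SMITH", "JONES", "ADAMS", "BAKER"]

print(find_listing(unordered, "SMITH"))   # (2, 3) -- three comparisons
print(find_listing(ordered, "SMITH"))     # (0, 1) -- found immediately
```

With frequently requested names near the front, the average comparison count drops well below n/2, which is the improvement the text describes.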
Program design is a two-stage process. First, the programmer places the parts de-
fined by structured analysis into an initial program model. Then he or she perfects
the model using the techniques and laws of structured design. We will discuss these
two stages separately.
Creating the initial model. The programmer creates the first version of a
program model directly from the results of the structured analysis phase. As we shall
see in the structured analysis section, the results of analysis are pictures and text
defining the basic parts of a task. Recall that these parts are the separate subtasks
and their data interfaces.
The analysis results are transformed into a design model with a technique
called transform analysis. This name misleadingly implies that the technique is part
of analysis; nevertheless, the name is widely accepted and we will use it also. The
model produced by transform analysis is complete except for control interfaces,
which are filled in by the programmer after careful thought. A supplementary
technique called state analysis assists in this process and is discussed in Beyond
Power Programming. However, it is not necessary for effective programming.
Since transform analysis works directly with the results of structured analysis,
we must postpone studying it until we have discussed the structured analysis phase.
Once an initial model has been obtained, we can begin the work of perfecting it
into an efficient and trouble-free basis for structured programming. This is the pur-
pose of the second stage of structured design.
Perfecting the model. The second and concluding stage of the design pro-
cess perfects an initial hierarchy into an optimal model of how to perform an overall
task. This is done by using the remaining techniques of structured design to change
the model, and the laws of structured design to guide the changes and check the
quality of the results. We will discuss the techniques first and then the laws.
The Techniques of Structured Design. Recall that design is the plotting out
of the shape and disposition of parts that have been identified by analysis. After the
parts are placed into an initial model using transform analysis, the model is
perfected with techniques for reshaping the parts and replacing them within the
hierarchy. We will call the first of these techniques re-forming the modules, and the
second reorganizing the hierarchy. These two techniques are discussed below.
Completeness. The first law of structured design states that "the program
model must show the desired task being performed correctly and completely." A
model may violate this law by performing only part of its original task, or by per-
forming the original task incorrectly and thus producing undesired results. Careful
inspection of the module specifications, starting at the top level of the hierarchy
chart and proceeding down, is the most likely way to reveal violations of this law.
There are only two reasons for this error to appear in a model. Either the
analysis used to produce the model was incorrect, or the techniques of structured
design have been used incorrectly and have altered the basic nature of the model.
How this problem is corrected depends on how it arose. If it is the result of a
faulty analysis, the only solution is to start over with another analysis phase. This is
not so bad; programming is a repetitive activity, with the programmer gradually
closing in on the desired result. It cannot and really should not be a one-pass activ-
ity, since many of the best features of programs are serendipitous discoveries from
improving them.
If the violation of this law is caused by the incorrect use of design techniques,
the solution is much easier. If the programmer keeps copies of each generation of
the design, he can simply back up to the last version of the model before the mistake
was made and continue from there.
Example:
Apply the law of completeness to the 'manage_directory' model.
The 'manage_directory' model as supplied performs the desired directory
management task completely.
Conservation of Data. The second law of structured design states that "all
module input data must be used to produce module output data, and all module out-
put data must have corresponding module input data from which they are derived."
In other words, a module cannot create data from nothing, and data cannot simply
disappear into a module without a reason. Of course, the lowest-level modules of a
hierarchy chart can have system interfaces to outside devices, and so on, making the
modules appear to violate this law. However, studying the module specifications will
reveal the outside source of these data items.
Violations of this law show up in the hierarchy chart and sometimes in the
module specifications. It is easy on the chart to show data arrows pouring out of
modules that have no way of producing them, and it is equally easy to show data ar-
rows disappearing into modules without a trace. However, the written specifications
for those modules will either not match their pictures in the hierarchy chart-that is,
they will not produce the impossible data outputs and they will show no mention of
the disappearing data inputs-or the module specifications will be muddled on how
those data items are handled.
In the first case, where the hierarchy chart and specifications do not match,
there is either a problem in the original analysis, with the transform analysis, or with
the incorrect use of the design techniques. These problems are corrected as with
completeness violations; the programmer backs up as far as necessary to get a cor-
rect basis to work from. In the second case, where the problem is with muddled
module specifications, the only cure is to return to the analysis phase. Module
specifications are directly obtained from subtask specifications written during
analysis.
Example:
Apply the law of data conservation to the 'manage_directory' hierarchy.
The supplied initial model obeys the law of conservation of data. 'Accept_in-
puts' and 'show_result', which on the surface appear to violate this law, have system
interfaces that show up in their module specifications.
Coupling. The third law of structured design states that "module interfaces
should be as few and as nonintrusive as possible." Interfaces are sometimes called
couples, hence the name of the law.
The more this law is violated, the less the modules look like black boxes, and
the more complicated the module specifications and general model functioning
become.
Violations of this law are revealed on the hierarchy chart and confirmed in the
module specifications. As you already know, there are three types of interfaces:
data, command, and status. Every command interface in a hierarchy is a violation
of this law. Every status interface should be viewed as a possible violation. Even
multiple data interfaces passing in one direction between two modules should be
viewed suspiciously.
Modules connected by few data interfaces and no control interfaces are said to
be loosely coupled. Modules connected by many interfaces are tightly coupled. The
benefits of loose coupling include obtaining black-box modules and simplifying pro-
gram testing and debugging. The drawbacks of tight coupling include the scattering
of jobs over multiple modules with the extra code and complexity needed to make
the pieces work together. Also, the complicated interactions of the tightly coupled
modules require much more time to test and may still conceal untested bugs after-
ward.
Too much coupling or coupling of the wrong kind indicates that related func-
tions have been scattered between modules, and should be regrouped so that they are
together in the same modules. This is done by re-forming the modules.
There are variations on the three interface types that yield a number of dif-
ferent possible coupling categories. An interface that seems to belong to two or more
of these categories is classed with the worst of them. This is because connected
modules are no more independent than their tightest coupling. Because the different
kinds of coupling vary in their desirability, we will discuss them individually from
the best or loosest coupling to the worst or tightest coupling. You should identify
all the interfaces in your designs by their coupling type to identify problem modules
and the corrective actions needed.
The best type of couple is an indivisible data item. This type of interface is
simply called a data couple.
Example:
Identify the data couples in the 'manage_directory' model.
From the data dictionary definitions we know that these interfaces are 'name' and
'number' (logically, names and phone numbers seem more like single low-level entities
than like collections of characters).
Example:
Identify the stamp couples in the 'manage_directory' model.
The data dictionary shows three stamp couples: 'listing_position', 'directory',
and 'listing'. 'Listing_position' is considered a stamp couple because it passes a
pointer to a data structure rather than an indivisible data item.
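The difference between the two best coupling categories can be sketched in a hypothetical Python fragment (the names are illustrative): a data couple passes one indivisible item, while a stamp couple passes a whole data structure, exposing the receiver to more than it needs.

```python
# Illustration (not the book's code) of data versus stamp coupling.

def format_number(number):
    """Data couple: receives one indivisible item, a phone number."""
    return f"({number[0:3]}) {number[3:6]}-{number[6:10]}"

def format_listing_number(listing):
    """Stamp couple: receives a whole listing structure, although it
    only needs the 'number' field inside it."""
    return format_number(listing["number"])

listing = {"name": "SMITH", "number": "2135551234"}
print(format_number("2135551234"))       # (213) 555-1234
print(format_listing_number(listing))    # (213) 555-1234
```

The second function is tied to the listing's layout: any change to the structure can break it, which is why stamp coupling is considered tighter than data coupling.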
Sometimes an interface will appear repeatedly on a hierarchy chart and its con-
tents be altered by two or more different modules. All modules sending or receiving
that interface are said to be common coupled. Interestingly, this type of coupling
can connect modules that are not adjacent in the hierarchy chart. Such modules are
scarcely related except through the common couple and can affect each other
through that couple in unexpected and undesired ways. For instance, an error in a
module altering that couple can affect all modules reading that couple and throw
suspicion on all other modules writing to it. It can take a long time to locate and fix
such an error.
Common couples are evident on hierarchy charts where two or more modules
independently produce interfaces having the same names. Common couples via
system interfaces, which do not show up on hierarchy charts, can be found by ex-
amining the module specifications. In the final program, the coded information for
all interfaces of the same name will be kept in the same physical location, which is
usually in memory, but can be kept on mass storage such as disk or cassette by using
the operating system modules described in Chapter 5.
There are a few natural and desirable uses for common coupling. In general,
any program whose central purpose is to handle one or more major data structures
can justify using common coupling. A chess program is an excellent example of this.
The chess board is stored in memory as a data structure that is the focal point of the
program. Therefore, the entire program should be able to access it. In this type of
program the complication required to avoid common coupling would be worse than
the common coupling itself. However, make sure that you have one of these special
cases before resorting to common coupling. Then follow all the other laws of design
to minimize any resulting problems.
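The acceptable case can be sketched as a shared central data object written and read by several otherwise unrelated modules. This hypothetical Python fragment mirrors the chess-board example; the tiny board representation is an assumption for illustration only.

```python
# Illustration of acceptable common coupling: one central data
# structure (a tiny chess 'board') is read and written by several
# modules, because handling it is the program's whole purpose.

board = {}    # the common couple, shared by every module below

def place_piece(square, piece):
    board[square] = piece         # writer

def move_piece(src, dst):
    board[dst] = board.pop(src)   # reader and writer

def piece_at(square):
    return board.get(square)      # reader

place_piece("e2", "white pawn")
move_piece("e2", "e4")
print(piece_at("e4"))    # white pawn
print(piece_at("e2"))    # None
```

The hazard the text warns about is visible here too: a bug in any writer corrupts what every reader sees, so this arrangement is justified only when the shared structure is the program's focal point.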
Example:
Identify the common couples in the 'manage_directory' model.
The major common couple is 'directory'. It is not shown on the hierarchy chart
because it is a system interface, but the data dictionary defines it and the module
specifications show its memory copy being written to by the 'add_listing',
'delete_listing', and 'update_directory' modules, and being read by the 'find_listing'
and 'isolate_number' modules. Since handling the directory is the central purpose of
the 'manage_directory' model, this common couple is acceptable.
Another common couple is 'directory_search_result'. It is altered by both the
'find_listing' and 'transfer_directory' modules. Since handling this control interface
is not the central purpose of the model, this common couple is a design flaw. Note that
'directory_search_result' was also listed as a status couple. Common coupling is
worse than status coupling, so by our earlier rule the 'directory_search_result' inter-
face is no better than a common couple. The final common couple is 'number', which
carries the phone number input by the user, as well as the phone number found in the
directory data structure by module 'isolate_number'.
Coupling with command interfaces is the last type of coupling we will mention
and the only totally unacceptable one. We have already explained the problems with
such command coupling, especially that it fragments subtasks and is incompatible
with black-box modules. Command couples are obvious on the hierarchy chart by
their triangular tails.
Example:
Identify all command couples in the 'manage_directory' model.
In this initial design, all the status couples are used elsewhere as command
couples. This is especially seen at the 'show_result' module. As with the 'accept_in-
puts' module earlier, the key to re-forming 'show_result' to correct this problem is
found in the next design law. For now, though, all the status couples must be down-
graded to command couples.
From best to worst, then, the coupling categories are data, stamp, status, com-
mon, and command. If your data interfaces are few and of the top two coupling
categories only, if you minimize the use of status coupling, if you reserve common
coupling for tasks revolving around a central data object, and if you remove all
command coupling, your programs will be very loosely coupled with all the resulting
benefits.
One other symptom of poor coupling is interfaces that pass through one or
more modules without being processed (i.e., transformed, checked for errors, and
so on). These interfaces are called tramp data, for they "tramp" through the
modules between the initial source and the eventual destination. Passing an interface
through nonintervening modules indicates an even greater separation of related
functions than is found in adjacent modules with too many interfaces.
Nevertheless, an interface passing from a subordinate module through its
manager to another subordinate on its own level is not considered tramp data, since
the modules should be black boxes that are unaware of each other, with the manager
their only logical link. We say "logical link" because in an actual assembly language
program the manager does nothing with the interface; the source and destination
modules directly access the same physical location to write or read the information
passing between them. This is also true for source and destination modules passing
tramp data; the in-between modules do nothing with the interface.
Tramp data are eliminated by reorganizing the hierarchy so that the source and
destination modules are adjacent to each other, or by re-forming the source and
destination modules to eliminate the need for any interfaces at all.
Coupling is one of the most important design laws. However, it is sometimes
difficult to see the best way to correct a model that violates coupling. The remaining
design laws are more limited but also more specific in identifying the exact defect in
the design model. Of course, once the defect has been identified, the solution is
straightforward. In this context the law of coupling is used to identify problem areas
in the model, and the remaining laws are used to correct them quickly and precisely.
Example:
Identify tramp data and any remaining coupling defects in the 'manage_directory'
model.
The 'directory_search_result' and 'listing_search_result' interfaces from the
'find_listing' module tramp through 'update_listings', 'retrieve_number', and
'manage_directory' to finally arrive at 'show_result'. Also, 'name' and 'number'
tramp from the 'accept_inputs' module through 'manage_directory' and 'up-
date_listings' to 'add_entry'. Finally, 6 of the 12 connected module pairs in the
hierarchy chart are bridged by three or more interfaces. Even if those interfaces were all
data or stamp couples, the modules involved would need to be examined to see if related
functions had been divided among them. Of course, in this model most of the inter-
faces connecting those module pairs are control couples of one type or the other, and
are therefore even worse offenders.
modules adjoining it. Each design must be tested for coupling and cohesion to en-
sure both module independence and module strength.
Coupling problems around modules in the hierarchy chart suggest possible
cohesion problems in those modules. The module specifications can then be studied
to identify their specific shortcomings. A weakly cohesive module may need com-
mand interfaces to help it decide which of its functions to perform, or it may become
such a special-purpose mixture of functions that no one can understand its overall
purpose. Modules with poor cohesion cannot be black boxes.
The main technique for improving cohesion is to re-form the modules,
dividing and redistributing their internal functions to form more cohesive modules.
The goal is to maximize the relatedness of the functions within each module in the
hierarchy. As usual, re-forming the modules should be supported with reorganizing
the hierarchy as changes to the module responsibilities make it necessary.
Just as only a handful of relationships are possible between members of a
family (e.g., uncle, cousin, brother), only a handful of relationships are found be-
tween the functions in a module. The cohesion of the entire module is determined by
the weakest relationship between any of its functions. We will discuss these types of
relatedness or cohesion from best or most strongly cohesive to worst or most weakly
cohesive. Keep these categories in mind, or at least remember where to look them
up, for it is a good idea to identify the cohesion of every module in a design model.
The strongest type of relatedness is called functional cohesion. It exists in any
module whose functions work together to perform one well-defined module task. A
well-defined module task is evidenced by a strong "imperative verb ... direct ob-
ject" module name. Indeed, such a strong name alone heavily implies that a module
is functionally cohesive, without any reference to the module specification.
If you will insist on finding strong names for your modules, as we have recom-
mended, you will have few problems with weak cohesion. If after your best efforts a
module name remains weak, unclear, or noncommittal (e.g., process data), it prob-
ably means that the module's task is ill-defined and that its cohesion will be poor.
You can either re-form the modules to create better defined modules, or if too many
modules have naming problems you may repeat the analysis phase to obtain more
cohesive modules in the initial design model.
Example:
Identify the functionally cohesive modules in the 'manage_directory' model.
From the hierarchy chart and module specifications we see that 'manage_direc-
tory', 'find_listing', 'add_listing', 'delete_listing', 'retrieve_number', and
'isolate_number' are functionally cohesive, although several of them violate coupling
rules by exporting or simply passing too many interfaces.
completes no overall task, its functions are in the correct order to complete the sec-
tion's portion of the overall task.
A major difference between this and the previous type of cohesion is that
modules of this type cannot be given a concise, single-job name. At best, sequen-
tially cohesive modules have strong names for each major function in the module,
with each name separated by the word "and." For instance, a module named 'ac-
cept_and_decode_message' is probably sequentially cohesive, with the 'ac-
cept_message' function providing its output to the 'decode_message' function.
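A sequentially cohesive module of that shape can be sketched in a hypothetical Python fragment (the message format and trivial cipher are assumptions for illustration): the first function's output feeds the second, like one section of an assembly line, so the combined module needs an "and" in its name.

```python
# Hypothetical sketch of a sequentially cohesive module: two jobs
# chained in one place, where each inner function on its own would
# be functionally cohesive.

def accept_message(raw):
    """Strip the assumed '<...>' framing from incoming text."""
    return raw.strip("<>")

def decode_message(framed):
    """Decode an assumed shift-by-one letter cipher."""
    return "".join(chr(ord(c) - 1) for c in framed)

def accept_and_decode_message(raw):
    """Sequentially cohesive: output of one job is input to the next."""
    return decode_message(accept_message(raw))

print(accept_and_decode_message("<IFMMP>"))   # HELLO
```

Re-forming, as described below, would either absorb the rest of the assembly line into one functionally cohesive module or split the two inner jobs into separate black-box modules under a manager.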
As we said, a sequentially cohesive module is like a section of an assembly line.
If the functions in the other sections or modules contributing to an overall task can
be moved into the first module, so that the entire assembly-line task is performed in
one place, the resulting module will become functionally cohesive. The multiple
function descriptions in the previous module name will be replaced by a single strong
name for the overall module task. Alternatively, the individual functions in a se-
quentially cohesive module may be large enough to be separated into separate
modules performing complete subtasks of the overall assembly-line task. If all the
functions in the overall task can be made into separate modules, the whole group of
them can be placed under a manager module and the hierarchy reorganized to hold
the group. All those modules, including the manager, will then be functionally
cohesive. As a further advantage, the separated-out and more fundamental black-
box functions may be useful to other manager modules in the hierarchy, saving on
duplications of similar functions in the hierarchy.
A module that requires a command interface to function properly cannot be
functionally cohesive, since at least one of the functions necessary to do its job is in
the module above it, but it can still be strongly sequentially cohesive. By re-forming
the module to contain the function it needs to make its own decisions, such a module
can easily be upgraded to functional cohesion.
Example:
Identify any sequentially cohesive modules in the 'manage_directory' model.
Inspection of the module names and study of the module specifications show
that several otherwise strong modules are receiving command interfaces and are
therefore sequentially cohesive. Those modules are 'load_directory' and 'store_direc-
tory'.
A module whose functions are related only by their occurrence at the same
stage of the program's execution is said to have temporal cohesion. This type of
cohesion is usually found in setup or initialization modules, and cleanup or finaliza-
tion modules. Initialization modules do things like setting up initial variable values,
getting initial user inputs, and checking to make sure all the necessary I/O devices
are attached and working. Finalization modules do things like displaying the pro-
gram results for the user and turning off I/O devices used by the program.
The functions in a module of this type are usually closely related to tasks per-
formed by other modules. They are grouped in temporal modules only because the
designer is thinking temporally and not functionally. Because the functions in the
separated tasks have been split among two or more modules, more interfaces are re-
quired between the modules and the entire model is harder to understand. There are
no advantages to temporal cohesion.
The cure for temporal cohesion is simply to distribute the grouped functions to
those modules whose tasks they contribute to.
Example:
Identify the temporally cohesive modules in the 'manage_directory' model.
'Accept_inputs' fits our definition of an 'initialization' module, and
'show_result' qualifies as a 'finalization' module. In both cases their internal functions
have been grouped solely to make them all able to execute at the same time in the pro-
gram loop. Note that the functions in both modules really belong in other modules; for
instance, the only module that needs to use the 'number' interface is 'add_listing', so
that is where 'number' should be obtained.
'Update_directory' is also temporally cohesive, since its first function of
creating a directory in memory is placed there only to make it execute just before
'add_listing'. The function of creating a directory when no directory is available on
disk is much more related to the function of loading the directory, so it belongs with
'load_directory'.
Structured Design 175
We will redistribute all these functions to their proper locations when we have
completed discussing the law of cohesion.
A module whose functions seem to be related by subject matter, but which ac-
tually contribute to completely separate tasks, is said to have logical cohesion. The
name is ironic since such cohesion is deeply illogical.
Logical cohesion is very dangerous because it appears to have cohesive
strength when it actually has none. Because of their apparent strength such modules
may be left alone when they should be re-formed, and the result will be extra inter-
faces and weaker cohesion in all the modules whose related functions have been left
in the logically cohesive module.
Consider, for example, the logically cohesive module named 'handle_finances'
(whatever that name means). Assume that 'handle_finances' contains
the following functions:
1. Balance checkbook.
2. Compute dollar principal in mortgage payment.
3. Estimate risk on currently held stock options.
4. Project effect of national debt on auto interest rates.
5. Generate belligerent letter to bill collectors.
What these functions seem to have in common is "finances". What they actu-
ally have in common is ... nothing!
A logically cohesive module will be coupled so tightly to other modules that
changing it slightly can cause the whole program to break down. Such programs are
also very difficult to debug.
Example:
Identify the logically cohesive modules in the 'manage_directory' model.
'Transfer_directory' is logically cohesive because its functions of loading and
storing the directory are actually related to the task of managing the directory, which is
the responsibility of 'manage_directory', and they are grouped only because of their
apparent relationship of directory movement. The poor cohesion of this module is
evidenced by the command couples in and out of it.
If the functions in a module not only have nothing in common but also do not
even appear to have anything in common, the module is said to have coincidental
cohesion. The only good thing that can be said about coincidental cohesion is that if
you analyze a problem at all, and even attempt to give your modules strong names, it
will never appear in your programs. As usual, the cure for it is to re-form the offend-
ing modules.
Example:
Identify any coincidentally cohesive modules in the 'manage_directory' model.
None of the 'manage_directory' modules have coincidental cohesion.
176 Imposing Reason: Program Planning Chap. 4
In summary, the cohesion types from best to worst are functional, sequential,
communicational, temporal, logical, and coincidental. The first four laws of struc-
tured design reveal most of the defects in a design model. At this point in the design
process it is a good idea to use the techniques of structured design to fix all identified
flaws. Once these flaws have been eliminated, the resulting design model can be fur-
ther corrected and fine-tuned from testing the model against the remaining laws of
structured design.
Example:
Use the techniques of structured design to improve the 'manage_directory' defects
identified by the laws of coupling and cohesion.
We have noted the temporal cohesion of 'accept_inputs' and 'show_results', so
we will break up those modules and distribute their functions to the related modules.
For instance, the function of displaying the phone number is moved into the
'isolate_listing_number' module that obtains the number. The resulting module can be
renamed 'display_listing_number' to reflect its overall function.
Several modules can be moved within the hierarchy to place them closer to their
related tasks, and therefore to save on control coupling and tramp data. By their func-
tions we can see that 'load_directory' and 'store_directory' are really needed only at
the very beginning and the very end of program execution; they should directly support
'manage_directory'. Therefore, 'transfer_directory' is unnecessary and can be
eliminated. 'Load_directory' can be given the function of searching for the directory,
previously in 'find_listing', and the function of creating an empty directory in
memory, from 'update_directory', to form a module that does whatever it takes to get
a copy of the directory into memory. We will rename it 'obtain_directory'.
Similarly, 'add_listing' and 'delete_listing' are direct parts of managing the
directory, and therefore 'update_directory' can be eliminated for the same reasons as
'transfer_directory' was before. The verbs "transfer" and "update" in the names of
those modules are weak compared to verbs such as "store" and "delete" in their
subordinates' names, which hints at the weakness of the two manager modules. In both
cases, moving the subordinate modules to places where they can be directly controlled
eliminates the control couples that were needed before.
We will also separate out the functions 'get_listing_name' and 'get_
listing_number' so that 'get_listing_name' can be used by the three modules that re-
quire that function. The resulting hierarchy appears in Fig. 4.3. Note that
'get_listing_name' is called by two levels of modules. This is perfectly acceptable.
We see that these changes have eliminated all the command couples and all but
one status couple. The tramp data are also gone. If any tramp data had been left, we
would have applied the techniques of structured design again to pull the source and
destination functions into the same or at least adjacent modules. After just one correc-
tion stage we now have a smaller and simpler model with fewer interfaces, all of which
are data or data structures. Further, the model now defines a more logical and
understandable method for performing the 'manage_directory' task. Also note that
most of the data are passed in the lower levels of the chart, a healthy sign.
The updated data dictionary contains the following definitions:
[Figure 4.3: the corrected 'manage_directory' hierarchy chart. Recoverable labels include 'Find listing position' (4.1) and 'Get listing name' (2.2).]
(1) 'obtain_directory'
check for directory on disk
IF directory is present
THEN load directory from disk into memory
ELSE create an empty directory in memory
ENDIF
END 'obtain_directory'
'delete_listing'
find_listing_position
IF listing_search_result = found
THEN delete listing pointed to by listing_position
move all following listings up one listing position
subtract 1 from directory_size
display 'listing deleted' message
ENDIF
END 'delete_listing'
'retrieve_number'
find_listing_position
IF listing_search_result = found
THEN display_listing_number
ENDIF
END 'retrieve_number'
Each time a design model is altered it must be checked again for completeness
of solution and conservation of data. If neither of these laws has been violated by
changes to the design, it is time to apply the remaining laws of structured design to
the model.
Balance. The law of balance states that "the data inputs to any module must
be of similar complexity to its data outputs." This law spotlights modules whose in-
put data interfaces are either much simpler or much more complex in structure than
their output data interfaces. A module whose input is 'company mailing list' and
whose output is 'employee zip code' undoubtedly violates the law of balance.
This problem is caused by modules that do too much of the work personally
(i.e., "shirt-sleeve managers"). By making any offending module into a manager
and pulling some of its major functions below it as subordinates, the data can be
transformed in stages instead of all at once. This has the side benefits of making the
model easier to understand and to program. Thus curing imbalance expands the
hierarchy.
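As a sketch (the module names are illustrative, not from the text), the offending module above would be re-formed as a manager with two subordinates:

'report employee zip code'
'select employee record' (company mailing list in, employee address out)
'extract zip code' (employee address in, employee zip code out)

Each module's input is now of similar complexity to its output, and the data are transformed in two stages rather than one.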
Prevention is always better than cure, and the prevention for imbalance is to
conduct a more thorough analysis phase. If the overall task has been properly di-
vided into levels of subtasks, with no more than nine subtasks below any one subtask
on the next level up, there will almost never be a problem with gross differences in data
complexity between module levels, and almost never a problem with balance.
Example:
Identify the imbalanced modules in the manage_directory model.
The model shows no great difference in complexity between the input and output
data of any module, so balance has not been violated.
Module Size. The law of module size states that "modules should be no
shorter than one complete function and no longer than two assembly language pages
in length."
Until you have written a few programs it is difficult to judge what will be the
programmed size of a module from its module specification. However, the spirit of
this law is that modules should be kept relatively short but long enough to do a com-
plete job. The ideal length is from 1/3 to 1 assembly language page, including
comments, so that it can all be viewed at once. A module longer than two
pages is too long to keep under your eyes at one time, making it much more difficult
to understand, program, or debug.
Example:
Identify any modules in the manage_directory model that violate the law of module
size.
Module size is obeyed by all modules in 'manage_directory'.
Generality. The law of generality states that "design models should be built
of modules of the most general function consistent with their intended use and the
laws of structured design."
Modules obeying the law of generality can often be used in other programs or
by multiple higher-level modules in the program at hand. The goal of generality is to
make the module perform the general case of the task it is assigned, rather than
specific and arbitrary instances of that task. However, be careful not to expand the
generality of a module beyond its intended use, since that will require extra work
and error risk for no good reason.
Example:
Identify any modules that exhibit generality in the manage_directory model.
One of the products of correcting the model was a module called
'get~isting_name'. This is a general function that was duplicated in several modules
before the model was improved. By identifying it as a general function and putting it in
a separate module, we were able to make the design a little more focused and to
eliminate the unproductive repetition of that function within the model.
Structured design uses three techniques and seven laws to produce a program design
model in the form of a perfected hierarchy chart, data dictionary, and module
specifications.
The seven laws of structured design are:
1. Completeness. The program model must show the desired task being per-
formed correctly and completely.
2. Conservation of data. All module input data must be used to produce module
output data, and all module output data must have corresponding module in-
put data from which it is derived.
3. Coupling. Module interfaces should be as few and as nonintrusive as possi-
ble. The types of coupling from best to worst are data, stamp, status, com-
mon, and command.
4. Cohesion. Each module should consist only of directly related functions, and
as much as possible, all directly related functions should be grouped together
in the same modules. The types of cohesion from best to worst are functional,
sequential, communicational, temporal, logical, and coincidental.
5. Balance. The data inputs to any module must be of similar complexity to its
data outputs.
6. Module size. Modules should be no shorter than one complete function and
no longer than two assembly language pages in length.
7. Generality. Design models should be built of modules of the most general
function consistent with their intended use and the laws of structured design.
After a design model has been perfected, the structured programming phase can
begin. One way to start is to program all the modules in the system, including their
JSRs and RETURNs, assemble the entire program, and run it to see if it works. This
approach is called big-bang building, because the program will either work or, more
likely, blow up completely.
A better approach is to program the modules from the top of the hierarchy
down, plugging them into the hierarchy one at a time and testing the resulting partial
program each time to see how it works with the new module. This approach is called
top-down building.
In top-down building the first module to be programmed is the top module of
the hierarchy. Since that module calls the modules on the level beneath it, we must
provide simple replacements that return to the calling module and allow it to con-
tinue executing. Such replacement modules are called stubs. Stubs also set up values
for any necessary upward interfaces, with those values either written into the stub
assembly code before each test run or obtained from the keyboard as the stub
executes (see Chapter 5 for the means of doing this). The latter is as complicated as a
stub should ever get. Complex I/O and data transformation are left out so that the
stubs will be reliable without a lot of programming effort.
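As a sketch, a stub for 'obtain_directory' (the message wording is illustrative; 'directory_size' is assumed here to be its upward interface, as in the data dictionary) could be as simple as:

'obtain_directory' (stub)
set directory_size to 0
display 'obtain_directory stub called' message
END 'obtain_directory'

This is enough to let 'manage_directory' run and be tested before the real directory-loading code exists.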
Once the top module has been tested and debugged, the stubs below it are
replaced one at a time with real modules, and the resulting partial program is tested
again. When a stub is replaced by a full module, of course, another layer of stubs
must be written to substitute for the modules underneath the new module.
New modules should only be inserted after the existing partial program has
been thoroughly tested and debugged. Then any errors that occur after a new
module is inserted must be due to that module. Either the new module is faulty, or
its interactions with the module or modules calling it have exposed their faults. In
either case, finding and correcting the error is much easier than with the big-bang
technique.
The top-down building process is continued until the lowermost modules in the
model have been inserted, tested, and debugged. At that point the entire program
will be working and complete.
A variation on the top-down approach that some like better is called string
building. String building starts with the top module, as in top-down building, but
then follows each string below the top module down to the lowest level. It builds
down and then across; top-down builds across and then down. One advantage of
string building is that the programmer works with closely related modules as a group
instead of hopping from one group to another while working down module rows.
Structured Analysis 183
However, string building is more difficult to plan out. Either method works well; try
both and use whichever you prefer.
Thus the sequence of software development as we now know it is to analyze a
task, to convert the analysis into an initial design model, to perfect that model, to
write individual modules according to the principles of structured programming,
and to build those modules into a tested and working program. We can now study
the first two steps in that sequence: analyzing a task and converting the analysis
results into a design model.
STRUCTURED ANALYSIS
Webster's defines analysis as the "separation or breaking up of a whole into its fun-
damental elements or component parts."* This definition applies to software anal-
ysis if the whole is a task that we want a computer to perform. Analysis lets us under-
stand a task by identifying the many individual processes that must be performed to
complete that task. Those processes are the parts of the whole.
In the design phase we used a set of task-building techniques on a concrete
representation of a program. We called the program representation a program
model. In the analysis phase we will apply a set of task-dividing techniques to a con-
crete representation of the task we are trying to understand. We will call this task
representation a task image. Like a program model, a task image consists of pictures
and text that can be physically manipulated into an optimal form. However, a task
image only describes the parts of a task; it says nothing about how to perform that
task with a computer. That is an advantage during the analysis phase since mistakes
can be made by designing a system to perform a task before the task is fully
understood.
The task image and the structured analysis process allow us to come to a full
understanding of a task before we start designing a way to perform it with a com-
puter. How well an analysis of a task is performed sets the limit on how well the pro-
gram to perform that task will work.
Developing a complete and accurate task image is the goal of structured
analysis. Thus we begin our study of structured analysis by discussing the task im-
age.
A task image consists of three items. One of these items is central to the task image
and corresponds to the role of the hierarchy chart in the design model. This item is a
set of data flow diagrams; you saw them before in Chapter I. The two other items
*Webster's Third New International Dictionary, G. & C. Merriam Company, Springfield, Mass.,
1971.
support the first. One of these is a data dictionary, which is identical to the data dic-
tionary of the design process except that it leaves out the implementation details of
the low-level data elements (e.g., that characters are ASCII encoded). The other sup-
porting item is a set of process specifications, which are in the same form as the
module specifications of structured design.
Data flow diagrams. In Chapter 1 we said that data transformers and the
data flows between them are the basic elements of all information-handling tasks ex-
cept those involving switching (see the section "Types of Information Tasks" in
Chapter 1). Data flow diagrams, or DFDs, reduce a task to its purest state by
representing only its data-handling aspects. Control information (i.e., information
used to coordinate or command the execution of data transformers) is specifically
excluded from data flow diagrams. Control information is omitted because it con-
cerns how the task will be performed, a design issue, rather than what the task is, an
analysis issue. The only exception to this rule is that status information essential to
performing the task can be shown. For instance, status messages to the TV and thus
to the user may be necessary. Considering unnecessary control information during
analysis locks the analyst into specific ways of dividing a task, instead of allowing
him or her the total freedom to discover the most logical way to divide the task.
DFDs show the data-handling processes required to perform a task as if they
were occurring simultaneously. Imagine these processes running from the initiation
of a task until its completion; one is "on" now, then another, and so on like twin-
kling lights until the task is completed. This is the way your computer performs a
task; one process at a time. If you hold open the shutter of an imaginary "data-
sensitive" camera during this entire period, you will have a photograph of the data
usage of all processes at all execution times. This is what a DFD does. A DFD allows
us to examine every stage of performing a task at once, which is important if we are
to understand the task as a whole instead of as isolated pieces. Incidentally, thinking
of a task as a group of simultaneous processes is not all that artificial. Computers
are now being built that assign a separate CPU to each process so that all the pro-
cesses can be executed simultaneously for a manyfold speed improvement. This ap-
proach, called parallel processing, will be used in personal computers before long.
Four distinct types of objects can be shown in DFDs. In Chapter 1 we saw two
of these objects: arrows, which represent data flows, and circles, which represent
data transforming processes. Additionally, whenever a data flow branches and goes
to more than one destination or comes from multiple sources, the branch is shown
as a darkened point. You will see this in the next example.
There are two other data-related things that a DFD must show about a task.
First, it must show where the task obtains its initial data and where it sends its com-
pleted data. These outside data sources and destinations are the context of the task
we are interested in. They are called terminators and are represented with boxes.
Second, it must show data that are created and used at different times, or are used
repeatedly while a task is performed. Most data are used just once, right after their
creation, but data that are used repeatedly must be stored between usages. They are
kept in data stores, which you can think of as data flows at rest. Stores are symbol-
ized by two parallel lines.
All the DFD objects-data flows, processes, terminators, and stores-must be
named to identify their role in the task. Data flows, terminators, and stores are
"things" and are named with nouns. Processes are actions, so their names are based
on verbs in the familiar "imperative verb ... direct object" format. The data flows
into and out of a store need not be named, since they symbolize moving copies of the
data structure named in the store.
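Applied to the directory task, these naming conventions give objects like the following (the terminator names are illustrative):

data flows: 'name', 'number', 'phone number message' (nouns)
store: 'directory' (noun)
terminators: KEYBOARD, TV SET, DISK DRIVE (nouns)
processes: 'retrieve phone number', 'delete listing from directory' (imperative verb + object)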
Example:
Show a DFD from which the final manage_directory design model could have been
directly developed.
We developed the final manage_directory model from an inefficient and flawed
initial model. We could have instead developed a similar model from a DFD such as the
one shown in Fig. 4.4. Note that this DFD shows all the major data-handling operations
seen in the 'manage_directory' design model: adding and deleting listings, retrieving
phone numbers, and loading and storing the directory. The data used and produced by
each of these operations are shown on the data flows that connect with them.
Note also that the DFD omits control information such as the 'listing_
search_result' status flag in the design model. The main purpose for such information
is to help modules decide when and how they will do their jobs; it does nothing to help
us understand what the underlying processes do.
In the later stages of program development, both the data flows and the resting
data flows or stores will be implemented as variables and data structures. Stores
must be handled more carefully than data flows, however, because they may affect
several modules via common coupling, or single modules at different times via a
kind of time coupling.
The DFD processes will eventually become design-model modules, and the ter-
minators will show up in module pseudocode when I/O devices such as the keyboard
and the video screen or TV set are accessed.
There are a few rules that govern how the four types of DFD objects can be
used and combined. You cannot obtain a useful analysis from DFDs that violate
these rules, so please study them carefully.
First, the names of all data flows touching the same process must differ; it is il-
logical to have input and output data flows of the same name, since this indicates
that the process does not transform the data that are flowing through it. The one ex-
ception to this rule is mentioned as part of the following rule.
Second, data flows can be shown connecting processes with other processes,
stores, or terminators, but data flows cannot be shown connecting a store with a
store, a store with a terminator, or a terminator with a terminator. This is because a
data flow directly between stores or between a store and a terminator provides no
way to move the data between the two (whereas a process does), and a data flow be-
tween terminators has nothing to do with the task being analyzed. If the nature of a
task requires that data be moved between a store and a store or a store and a
terminator, it is acceptable to violate the first rule and show a process that moves the
data without transforming it. This is the exception to the first rule that we noted.
Third, every process must be connected to both input and output data flows.
Clearly, processes that do nothing with something (i.e., have only input flows) or
that produce something from nothing (i.e., have only output flows) contribute
nothing to our understanding of a task that produces something from something.
They are also impossible to implement in programs.
Fourth, each data item should be shown entering a process in only one place.
This removes a source of redundancy and therefore of confusion. For instance, in
the manage_directory task you would not show 'name' and 'listing' entering the
same process, because 'listing' contains 'name', which violates this rule. If these two
data elements were used by the same process, you would probably show 'name' and
'number' input data flows under the assumption that the process would put them
together to form a listing.
Fifth, trivial error data can be shown on a DFD as a short data flow from a
process into midair. Thus we could have shown short "no such listing" arrows ex-
iting the "retrieve phone number" and "delete listing from directory" processes.
Sixth and last, as we have already said, information flows used to control or
coordinate the execution of processes should not be shown on a DFD. However,
status flows can be shown if they are essential to performing the task.
We mentioned that a task image consists of a set of data flow diagrams. We
will discuss why more than one DFD is necessary, and how the set is developed,
when we study the analysis process.
We can now discuss the second item in a task image: the data dictionary.
Data dictionary. The data dictionary of the task image holds definitions of
the data flows and data stores. These definitions are in the format used for design-
model definitions. Indeed, the data dictionary developed during the analysis phase is
used as the starting point for the design-phase data dictionary. Low-level data
definitions such as 'number = 10{(ASCII DIGIT)}10' are omitted from the dic-
tionary during the analysis phase because, as we said, analysis defines the inherent
parts of a task without considering how they will be implemented. For an example of
a data dictionary, see the data dictionary section of the earlier discussion of the
design phase.
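To make the distinction concrete, compare two definitions of the same element (the analysis-phase wording is illustrative):

analysis phase: number = 'a telephone number'
design phase: number = 10{(ASCII DIGIT)}10

The analysis definition records only what the element is; the encoding and length details are deferred until the design phase.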
The final item in a task image is the collection of the process specifications for
the processes making up the overall task.
Analyzing a task is a two-stage process. First, the scope of the task must be deter-
mined. Then the task must be divided into its component processes and their data in-
terfaces.
Example:
Show the context diagram for the manage_directory task.
The context diagram in Fig. 4.5 shows the single high-level process
'manage_directory', surrounded by the data sources and sinks with which it must
communicate. Notice that it is consistent with the earlier DFD example, but with the in-
ternal details of the task missing.
Drawing the context diagram is the most important step in structured analysis,
for it defines the precise limits on the scope and nature of a task. Your first step in
drawing a context diagram is to identify the terminators, or data input and output
devices. With a task for a personal computer the terminators are typically things like
TV sets, disk drives, data cassettes, and keyboards. Draw the identified terminators,
with appropriate names, around an empty circle that represents the task. You will
name the task later.
Second, define the data flows outward from the circle to the terminators.
These are the data produced by performing the overall task. They are the desired
products of executing that task. The output data flows must be identified first
because you must know exactly what results you want from performing a task
before you can determine what input data or internal processes are required to
achieve those results. The data definitions of the output data flows should be the
first entries in your data dictionary. In the manage_directory task we might have
the definitions:
directory = 1{listing}100
listing = name + number
phone number message = 'phone number in displayable form'
The first of these definitions differs from its counterpart in the design model in that
it omits the 'directory_size' data element. This happens because we are not yet
considering such design issues as "How will a module know the size of the directory
data structure?" We would add that information later, when we become aware that
such an action is necessary. Like all other analysis and design tools, the data dic-
tionary need not be in a perfect and final form on the first try; it only needs to be
consistent within itself.
Third, examine those output data flows and determine the minimal input data
from which the outputs can be derived. Each data item should be input in only one
place. This is easier to do than it may seem. For instance, in the manage_directory
task we know that a directory and phone number messages are output. It is clear that
phone numbers must first be input to later be output, and it is also clear that names
must also be input to define listings fully and to select particular phone numbers. So
'name' and 'number' are the only input flows necessary. Note in the preceding
paragraph that the data dictionary definitions for the output flows told us that
'name' and 'number' would be needed. If we decide that we want to be able to
reload an existing directory from the disk drive, 'directory' will also be an input data
flow, although strictly speaking, it is not absolutely necessary for the function of the
task. Even with more complicated tasks than 'manage_directory', the input data
necessary to produce the output data are usually easily determined. If they are not,
you should identify all the necessary input data that you can and correct any errors
in your choices later, as they are revealed by the analysis process. You should also record
definitions for the input data flows in the data dictionary.
Fourth and finally, look over the data interfaces and try to understand from
them the exact nature of the task. Then choose the most applicable, specific, and
"strong" (i.e., imperative) name for the overall task. Write that name inside the
task circle. This step has been delayed until now because, for the reasons noted
under the first step in this sequence, it is the data interfaces that determine the nature
of the task. One wants the task name to reflect the task nature as closely as possible.
At this point the context diagram is complete. However, with some tasks the
execution time from receipt of the data inputs until production of the data results is
a critical factor. This consideration can also be shown on a context diagram. One of
the better ways of doing this is to draw a dashed line between each time-critical out-
put data flow and the input flows used to produce it, and to note the time limit for
the transformation beside the dashed line. You would refer to this information dur-
ing the programming phase to select algorithms and programming strategies that
meet the noted time limits.
Example:
Show a time limit of 0.5 seconds from the time a name is entered from the keyboard un-
til the corresponding phone number is displayed on the TV screen.
Figure 4.6 (see pg. 190) shows how this is done.
Dividing the task. After the context diagram has defined the scope of a
task, the processes needed to perform that task must be identified together with any
Figure 4.6
necessary data interfaces between them. We do this with the techniques of structured
analysis.
The Techniques of Structured Analysis. A task is divided into its component
processes using two techniques: top-down leveling, which identifies the component
processes with increasingly detailed levels of DFDs, and bottom-up leveling, which
is usually used to improve the quality of existing analyses.
Top-Down Leveling. The levels of DFDs we will use to divide a task begin
with the most basic DFD, the context diagram, which shows a task being performed
by a single all-encompassing process. Using top-down leveling, we will separate out
the many little subprocesses in the overall process. However, those subprocesses are
often too numerous to show at the same time in a single DFD. When that is the case,
we use a DFD to divide the overall process into between three and nine intermediate
subprocesses, more DFDs to divide each of those subprocesses into between three
and nine more, and so on until we have DFDs that show the most basic subprocesses
and their data interfaces. The DFDs together form a leveled set that describes in
various levels of detail the processes needed to perform a task.
The overall process shown in the context diagram is divided from the outside
in, starting with the external output and input data interfaces. To work inward along
an output data flow you must imagine what data a last-step subprocess would re-
quire to produce the output data. You need not identify or name the last-step sub-
process yet; you only need to name the data that subprocess uses to produce the out-
put.
When dividing a complex task, the data required to produce the output will be
intermediate in content and structural complexity between the external task inputs
and outputs. In simpler tasks the identified flows might be the task inputs themselves.
After you have identified the data needed to produce the output flow, draw a
smaller circle inside the overall task circle and connect it to the new data flows and to
the output data flow. It helps to start with a large copy of the context diagram so
Structured Analysis 191
Figure 4.7
that there will be plenty of room for what you will draw inside the process circle. A
large blackboard or whiteboard is an ideal place to make this copy, since it will allow you
to easily make the many corrections that will be needed as you go along.
Once you have drawn the inner subprocess circle, give it a strong process name
based on its surrounding data interfaces, in the same way that you named the overall
task after its external data interfaces.
Example:
Show how the subprocess producing the 'phone-number message' might be identified.
The 'phone-number message' can be produced directly from a 'name' data flow
and the 'directory' data structure. The 'manage_directory' context diagram with the
added data flows and added subprocess circle is shown in Fig. 4.7.
The data interfaces to the 'obtain matching phone number' subprocess inspired
its name. Note that this process corresponds to the 'retrieve phone number' process in
the 'manage_directory' DFD at the end of the DFD section. Choosing the process
name based on its data interfaces has made it stronger and more descriptive than before.
Note also that we have not yet connected the 'directory' data flow into this process with
the task input flow of the same name. This is because we know that a store will be
needed to hold the directory. Thus we can work inward one more step and show the
store now (Fig. 4.8).
Dividing a task along an input interface is done in a similar way. Imagine what
data flows can be produced from the interface data in a logical first-step process.
Such a process will often require the data from more than one input interface. Once
the necessary input interfaces and produced data have been identified, draw another
Figure 4.8
small circle inside the overall task circle and connect it with the identified data flows.
Then give the new circle a strong process name based on its data interfaces. As with
dividing along output interfaces, depending on the complexity of the task, the newly
identified data flows may or may not be external interfaces.
Example:
Show how the subprocess that first uses the 'directory' input might be identified.
The 'directory' store and 'directory' input interface have already been defined.
The DFD rules say that a process must be used to move data from a terminator to a
store. Thus we connect the 'directory' input flow to a subprocess circle and connect the
circle with a data flow to the store. The result is shown in Fig. 4.9.
The dangling data flows left on the interior of the task circle by previous in-
ward divisions are treated like external interfaces and worked further inward to
identify other data flows and transforming processes. The goal is to work the inputs
and the outputs toward each other until they are connected by transforming pro-
cesses. When a complete data path exists from the external inputs to the external
outputs, the level 0 DFD is complete. The top-level process circle is no longer shown
in the level 0 DFD.
To divide a task in this way you must take the passive viewpoint of observing
what happens to the data, rather than the active viewpoint of deciding what job
must be done next. You are observing objects rather than deciding upon actions.
This makes task division with DFDs a fundamentally different kind of activity than,
say, drawing a flowchart as we did in structured programming. A passive, observing
Figure 4.9
method is better suited to large tasks, while an active, decisive method is better with
very small tasks. Each has its place.
Example:
Show the complete level 0 DFD for the manage_directory task.
This DFD combines the process divisions of the last two examples with other
similar divisions until the task has been completely divided into five processes and one
data store. The DFD appears as shown in Fig. 4.10. Note that this is the same DFD as
was shown in the DFD section example except for the name change of the 'obtain
matching phone number' process.
Each process in the level 0 DFD can be treated like the overall process in the
context diagram and divided in a separate DFD by working inward from its sur-
rounding interfaces. A DFD used to divide a process shown in the level 0 DFD is
given the number of the process being divided. So a DFD showing the division of the
'add listing to directory' process would be called the level 2 DFD, and would have as
its external interfaces the data flows 'name', 'number', and 'directory'. The pro-
cesses in this level 2 DFD would be numbered 2.1, 2.2, and so on. A DFD dividing
the first process in the level 2 DFD would be the level 2.1 DFD. This division con-
tinues until DFDs are obtained whose processes cannot be divided any further
without weakening their cohesiveness, where "cohesion" has the same implications
of unity of purpose and function that it had with modules.
With leveled DFDs, no single DFD divides an entire task, but the context
Figure 4.10
diagram and level 0 DFD together provide a task overview. With simple tasks such
as 'manage_directory' these two DFDs may even identify the basic processes.
Otherwise, still lower-level DFDs will reveal the basic processes required to perform
the task.
All the DFD rules we mentioned earlier apply to leveled DFDs. Additionally,
the data flows into and out of a process in a DFD must match the data flows into
and out of the DFD that expands that process. This ensures conservation of data,
and that the lower-level DFD accurately describes what must be done to perform the
overall process it describes.
When the most basic processes have been identified, their functions can be
defined in pseudocode process specifications. These are exactly like the module
specifications used during design; indeed, the module specifications will be
developed from the process specifications. As we noted earlier, process specifica-
tions omit all control or coordination decisions (e.g., 'when status_flag = TRUE
THEN do action ELSE wait'). Process specifications should state only what the pro-
cess will do with input data that we assume are always available and ready for pro-
cessing.
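For instance, a process specification for the 'obtain matching phone number' process might read as follows (a sketch only; the exact wording is the analyst's choice):

    obtain matching phone number:
        search 'directory' for the listing whose name matches 'name'
        IF a matching listing is found
            THEN produce 'phone-number message' from the listing's number
            ELSE produce 'phone-number message' reporting that no listing matched

Note that the specification says nothing about when the process runs or what activates it; it describes only the transformation of input data into output data.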
Example:
Finish leveling the manage_directory task.
Only two processes in the level 0 DFD can be divided without weakening their
cohesion. They are 'obtain matching phone number', process 2, and 'delete listing from
directory', process 3. Even these processes are so simple that their divisions are trivial.
Thus the level 2 and level 3 DFDs are all that are needed to complete the division of the
manage_directory task. The level 2 DFD is shown in Fig. 4.11.
The level 3 DFD is similar. Try drawing it to practice what we have discussed.
Leveled DFDs can also be used to divide noncomputer tasks; indeed, DFDs in
one form or another have been used for analyzing tasks since long before the elec-
tronic computer was invented. For instance, data flow diagrams were and are used
by contractors to divide the task 'build a house' into the smaller processes such as
'build a foundation' that make up the task. For this type of task the data flows are
replaced by material flows consisting of the wood, concrete, pipes, cabinets, and so
on, that are used in the various construction processes. Each process is checked for
conservation of material, rather than conservation of data. The basic principles are
otherwise the same.
DFDs have also been used to analyze the main task performed in an office. In
this case the material flows are the office paperwork, and dividing the task in the
best possible way leads to fewer steps between desks and therefore saves time and
money.
Figure 4.11
Finally, the processes in such jobs are defined with pseudocode process
specifications. These descriptions are actually procedures for performing each pro-
cess.
With a team of workers, such as are found on construction sites and in offices,
parallel processing is the norm. You might have bricklayers putting up a fireplace
while finishers are putting in the sheetrock, and so on, and similarly in an office.
Thus DFDs are useful for understanding tasks of many sorts, and can be used
profitably to gain control over your noncomputer projects as well as over your pro-
grams.
between groups. You should find that the most loosely coupled groupings are easiest
to name.
Now, draw a new chart with just the groupings and their interfaces. Group and
name them as before, and you will have the second level up from the bottom. Con-
tinue this process until you have worked back up to the context diagram, and the im-
proved set of leveled DFDs is complete.
Note that in bottom-up leveling you do not consider data flow names or pro-
cess names, only the number of arrows between groupings. This keeps the use of the
upward-leveling technique simple and quick.
The work you did in top-down leveling is never wasted, because until you
know what the basic, indivisible processes at the most detailed level of a task are,
you cannot improve the analysis. At the very least you will be keeping the top and
bottom levels of your original analysis; the bottom-up leveling process then allows
you to improve the intermediate levels. These levels will be particularly important
when we convert the analysis results into an initial design, as you shall see.
Example:
Show how bottom-up leveling can improve the coupling of an intermediate-level DFD.
Since bottom-up leveling is done based on arrows and circles, not on data or pro-
cess names, we will label the data flows with numbers and the processes with letters.
This will ensure that we think only in terms of fewest interfaces and do not become en-
tangled in the words accompanying the figures. We use uppercase letters for the
intermediate-level DFD and lowercase letters for the low-level DFD.
The intermediate-level DFD shows tight coupling between its processes and a
need for correction from below by upward leveling, as we see in Fig. 4.12. Processes A
and C have been downward-leveled into bottom-level DFDs, which are shown below.
Process B is indivisible and cannot be leveled further. The low-level DFD expanding
process A appears as shown in Fig. 4.13.
Note that the processes making up A have no data connections between them. As
we said in the discussion of structured design, bad coupling and bad cohesion go hand
in hand. The processes in A have terrible cohesion; they do not have anything in com-
mon!
The low-level DFD expanding process C appears as shown in Fig. 4.14. We now
plug in the low-level processes of A and C into the original intermediate-level DFD, to
form a composite low-level DFD (Fig. 4.15).
Next we will regroup these low-level processes to form new intermediate-level
processes with the fewest connecting arrows possible (Fig. 4.16). We will call these new
groupings X and Y.
Last, we draw the new intermediate-level DFD with just the two new
intermediate-process circles (Fig. 4.17). Where before we had three processes with two,
four, and five interfaces, respectively, now we have two processes with just two and
three interfaces, respectively. We could also form just one process with only three inter-
faces if that seems more appropriate to the underlying system.
Figure 4.12
Figure 4.13
context diagram and a large surface to write on. Then work inward, identifying and
notating the lowest-level processes you can. Often it is easier to identify basic pro-
cesses than it is to select appropriate intermediate-level processes. When you have
finished with this step, you will be ready to level upward until reaching the context
diagram. The result is the same as with top-down leveling; a set of leveled DFDs
describing the processes required to perform a task. Of course, this method cannot
be used if the task is so large that its basic processes cannot be written on a single
chart.
Bottom-up leveling of an existing set of DFDs can be used with a large task if
you pregroup the basic processes that are most tightly coupled. Multiple bottom-
level charts will be required, but at least the basic processes that are most likely to be
functionally related will be on the same charts. These basic charts are leveled upward
until the earliest time that a single chart can be made from the process groupings.
The upward leveling then continues as usual through the context diagram.
Bottom-up leveling corrects problems identified by the law of coupling. As in
the design phase, we use laws to identify problem areas in our work. The remaining
laws of analysis point out other improvements needed in the results of an analysis.
We discuss the laws of analysis next.
The Laws of Structured Analysis. Many of the laws of structured analysis
have design counterparts with which you are already familiar. In those cases we will
highlight the differences in how the law is applied during analysis, and move on. The
other laws are relatively simple and intuitive, and will be easily grasped and applied.
The laws are discussed in their general order of importance.
control and coordination information flows, which control when the processes ex-
ecute what, and control-related decision making in the process specifications.
Common violators of this law include processes that act based on the time ele-
ment, whether it be execution time or real time, and control flows produced by such
processes to activate and deactivate other processes. For instance, given the task of
maintaining an executive's appointment book, its DFDs might show a time-aware
process to 'output appointment message on target date'. This process might activate
a 'display appointment message' process, so that both the first process and its con-
trol flow violate the law of nontemporality.
Violations of this law reveal that the person making the analysis has been
thinking actively, of action sequence, rather than passively observing the path the
data take through the system. This has limited the direction of the analysis to con-
form to his or her preconceived ideas of what the system should do. At best, a pro-
gram based on such an analysis may work but not as well as it otherwise would have.
At worst, a program based on such an analysis will be too complicated to get it
working properly, if at all.
Violations of this law can often be corrected simply by removing the offending
processes and control flows. However, if this leaves dangling data flows, the cure is
to return to the level above the first introduction of temporal elements and level
downward without them from there.
Balance. The law of balance for structured analysis says that "the data in-
puts to any process in a DFD must be of similar complexity to the outputs." If this
law is followed, each process in a DFD can be divided into its basic component pro-
cesses in about the same number of levels as any other process in the same DFD.
Process Size. The law of process size states that "the specifications for the
basic processes identified by top-down leveling should require one half page of
pseudocode or less." Of course, the figure of one half page is somewhat arbitrary,
but it is a useful yardstick nevertheless.
Violations of this law indicate that the most basic processes have probably not
yet been identified. This makes it much more difficult to write and read the program
code that will eventually perform those processes. A long process specification often
implies less than optimal cohesion, as well. Further top-down leveling of the offend-
ing processes corrects this problem.
Structured analysis uses two techniques and six laws to produce a task image in the
form of a set of leveled DFDs, a data dictionary, and process specifications.
The two techniques of structured analysis are top-down leveling, which identifies
the component processes with increasingly detailed levels of DFDs, and bottom-up
leveling, which improves the quality of existing analyses.
The six laws of structured analysis are:
1. Conservation of data. All process input data must be used to produce process
output data, and all process output data must have corresponding process in-
put data from which it is derived.
2. Nontemporality. Temporal considerations must be omitted from the task
image. Temporal considerations include control and coordination information
flows, which control when the processes execute what, and control-related
decision making in the process specifications.
3. Coupling. Process interfaces should be as few and as nonintrusive as possible.
4. Cohesion. Each process should consist only of directly related functions, and
as much as possible, all directly related functions should be grouped together
in the same processes.
5. Balance. The data inputs to any process in a DFD must be of similar complex-
ity to the outputs.
6. Process size. The specifications for the basic processes identified by top-down
leveling should require one half page of pseudocode or less.
The analysis phase is complete with the finishing of a task image that obeys all the
laws of structured analysis. The design phase begins at this point with the conversion
of the task image into an initial design model. We will make the conversion using the
transform analysis technique.
Figure 4.18
Once the level 0 DFD has been prepared, you must find the central process or
processes of the DFD. This process or group of processes will perform the central
function of the overall program and will determine the name of the top or 'boss'
module in the design hierarchy. These central function processes are also called the
central transform, for obvious reasons.
There are two ways to find the central transform. The first is to search the
DFD for the processes that seem to be most involved in the central data transforma-
tions of the task. Such processes tend to be more highly coupled with each other
than with the surrounding data input- or output-related processes, although this is
not always true. Looking for a more tightly coupled group of data-transforming
processes can often lead you to the central transform.
The second way of finding the central transform uses the process of elimina-
tion instead of the direct approach. You trace all external input and output data
flows inward until reaching the most logical (i.e., the most general or highest-level
data structure) form of the data. If you make a mark across each of these data flows
and then connect them to form a circle, all processes inside the circle will be involved
in the transformation of those data flows and therefore will be in the central
transform. This method is usually better than the first, since solid principles rather
than intuition lead us to the central transform.
Example:
Apply the second method of finding the central transform to the prepared
manage_directory DFD.
Working inward along the external data flows leads us to the central transform
shown circled in Fig. 4.19.
Once you have the central transform circled, imagine that the DFD processes
are ping-pong balls and that the connecting data flows are string. If you pick up all
the balls in the central transform in your hand, the other balls will hang underneath
by their threads. Note that the result is beginning to look like a hierarchy chart.
Cut the threads that cross the circle of the central transform, and glue the cut
ends to a new ping-pong ball, which we will call the 'boss' ball. Give the boss ball the
imperative name that best describes the function of the central transform.
If the central transform contains just one process, hang it directly beneath the
'boss' as well. If the central transform contains more than one process, you can
hang each of them under a 'supervisor' ball and then hang the 'supervisor' under the
'boss', or you can hang each of them directly under the 'boss'. Closely related pro-
cesses in the central transform favor the first approach, but since the rest of the
design phase will be spent perfecting the initial hierarchy, either approach is accept-
able.
At this point you may also want to remove the names of the data flows to or
from any data stores, and consider all the connected processes as having equal but
unshown access to them. Data stores tend to act as common couples between pro-
Figure 4.19
cesses, and showing the data interfaces through the 'boss' process clutters up the
diagram with many copies of the same data flow. In programs that revolve around
data stores, the process names will usually refer to the store already, and any access-
ing of the stores will show up in the process pseudocode.
Example:
Show the manage_directory DFD after it has been placed in ping-pong form.
Since the three processes in the central transform are only loosely related, we will
give the 'boss' ball the general but weak name 'manage_directory', and hang each of
the processes directly from the 'boss' (Fig. 4.20). We will not show the 'directory' flows
since they flow to and from a data store.
To complete a ping-pong chart, look at the DFDs expanding each process now
hanging below the 'boss'. Now treat each hanging process as a 'boss' in its own
right, and hang the lower-level DFD beneath it using the same procedure as above.
Do this level by level until the lowest-level DFDs have been hung on the chart.
To complete the format of the initial hierarchy chart, replace the process
circles with module rectangles, add the data interface arrows beside the data flow
names, and hang an I/O module on each dangling-end external interface. The latter
type of module formats data as necessary to meet the differing needs of the external
Figure 4.20
device and the computer, and handles any protocol needed to input or output that
data. For an input flow from a keyboard such a module might provide prompts and
canned selections to guide the user in providing the correct data. For an output flow
to a TV screen such a module might place the data in labeled columns and add
remarks on their significance. Of course, only one module is needed on any one type
of dangling end, no matter how many low-level DFDs the interface appears in.
There are no dangling interfaces in our ping-pong manage_directory chart,
but there would have been if we had not kept 'directory' in a data store. In that case,
the directory would have been loaded directly from the disk terminator each time it
was used by a process, and stored to disk whenever it was updated. The
'load_directory' and 'store_directory' processes would have been omitted in the
original DFD since they do not transform the directory. In creating the ping-pong
chart, the removal of the terminators would leave the directory interfaces dangling.
When converting the ping-pong chart into a module hierarchy we would have added
'load_directory' and 'store_directory' modules at the end of these loose ends to
move the data from and to the disk.
To complete the initial design model, add any necessary control interfaces to
the hierarchy chart and data dictionary and any necessary control logic to the
pseudocode specifications of the modules. Completing the 'manage_directory' ini-
tial design model is left to you for practice. There is no one right answer at the end of
a transform analysis process; it is absolutely necessary only that you include all pro-
cesses from the DFDs, and all data flows except those to and from data stores, if you
choose to omit those. Almost any other flaw can be corrected during the rest of the
design phase, including oversights in adding the I/O and control aspects needed to
make the hierarchy work in the real world. Do your best with identifying the I/O
and control needs, however, since this can save you a lot of extra work later. The
I/O and control aspects must also be correct by the end of the design phase.
The completion of the initial design model starts the second portion of the
structured design phase, perfecting the model. This brings us full circle to the struc-
tured design section earlier in this chapter, and completes our discussion of the soft-
ware development process.
Exercise:
Work through the following problems to practice the software development process.
(a) Analyze, design, and write your own program for a telephone directory. Refer to
Chapters 3 and 4 as necessary for help with the techniques, but do not try to
duplicate the system we described exactly. Add enhancements to vary the project
from the examples in this chapter, but use the examples as a reference to guide you
through the development.
(b) Develop a program from one of the following categories or a simple category of
your own choosing (you will need information from the next three chapters to pro-
gram the I/O modules, but the analysis and design phases can be completed with
what you know now).
ADDRESSABLE LOCATIONS
As we noted in Chapter 1, the memory map of the 6510 CPU is 64K bytes long.
However, the Commodore 64 provides a total of 88K addressable locations from
which to select the 64K, including 20K of ROM, 64K of RAM, and 4K of I/O data
and control locations. These locations can be divided and combined in a number of
different ways so that the memory map can be tailored for specific tasks. We will
start by exploring the ROM, RAM, and I/O locations, and then see how they are
combined to form the different memory maps.
ROM Locations
The Commodore 64 is supplied with one built-in data structure and two built-in pro-
grams. These are kept in permanent form in 20K bytes of ROM. The built-in data
structure is the 4K-byte-long character set. The built-in programs are the BASIC
language interpreter and the operating system. These programs are 8K bytes long
each, and are discussed below.
208 Connecting the Nerves: Using the Memory Map Chap. 5
Character set ROM. The character set data define the graphical ap-
pearance of the characters shown on the Commodore 64 keyboard. This data struc-
ture will be described in greater detail in Chapter 6. It can optionally be in or out of
the CPU memory map. In either case it is still available to the graphics chip, which
accesses it independently of the CPU as we will also see in Chapter 6. When the
character set ROM is in the memory map, it is always located between addresses
D000 and DFFF hex.
Figure 5.1 Interpreting a BASIC instruction (CPU inputs and outputs are numbered by order of use)
Addressable Locations 209
When the BASIC ROM is in the map, it is located at addresses A000 through BFFF hex.
need the BASIC ROM to run machine language programs except at the outset to
load a program and start it executing. Once executing, the machine language pro-
gram can select a memory map that replaces the BASIC ROM with RAM locations
in its address range.
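For instance, a machine language program can switch the BASIC ROM out for the underlying RAM through the 6510's on-chip I/O port at address 1. A sketch (bit 0 of the port, the LORAM line, controls whether BASIC ROM or RAM appears at A000 through BFFF):

    LDA $01      ; read the 6510 on-chip I/O port
    AND #$FE     ; clear bit 0 (LORAM)
    STA $01      ; A000-BFFF now reads as RAM instead of BASIC ROM

Setting bit 0 again restores the BASIC ROM to the map.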
Kernel ROM. The operating system, also called the kernel, is more a collec-
tion of machine language modules than a program, although some kernel modules
call other kernel modules to perform their tasks. However, the modules together do
perform one broad overall task: supporting I/O communications with the outside
world using the Commodore 64's I/O circuits. This task often involves complicated
housekeeping functions requiring specialized programming knowledge not needed
for anything else. The kernel simplifies our programming job by providing easily
used modules that our programs can call to perform those functions. Figure 5.2
shows how a program uses a kernel module to perform I/O functions.
When a machine language program calls a kernel module, it passes it any
necessary data in the 6510 registers. Any data produced by executing the module are
returned in the registers and are then handled by the program.
A module in a typical program hierarchy calls another module with a JSR to
the beginning address of the called module. We could use the same method to call
kernel modules if we had a list of their beginning addresses. Such a list could be
made easily enough. However, Commodore has stated that the beginning addresses
of the kernel modules may be changed as module features and length are altered.
Such changes could be due to changes in the I/O chips, for instance, which might re-
quire a change in the housekeeping duties and therefore length of the kernel
modules. More likely, extra capabilities could be added to or hidden errors corrected
in the modules. A table of module addresses would become obsolete as soon as one
module address changed, and so would the programs built using the old table.
For a program to be sure to run on every Commodore 64, including your own,
it must call the kernel modules without using their beginning addresses. A clever
trick allows this to be done.
Figure 5.2
Inside the kernel ROM of every Commodore 64, starting at address FF81 hex
when kernel ROM is in the memory map, is an ordered group of 39 JMP instruc-
tions. These instructions point to 39 kernel modules also in kernel ROM, and are in
a known order so that the JMP instruction for any particular kernel module is at the
same address in every Commodore 64. The group of JMPs is called the kernel jump
table.
To call a particular kernel module, a program loads the CPU registers with any
necessary data and calls the location holding the JMP instruction for the desired
module. Execution passes to the JMP instruction, which routes execution to the
beginning address of the module in that particular C64. When the module finishes
its job, execution returns to the location following the original call. Any data from
the kernel are then retrieved from the CPU registers. We see how this works in Fig.
5.3.
The complete JMP table is shown in Table 5.1, with the destination module of
each JMP named and its purpose summarized. You will use only about half of the
JMPs and kernel modules during normal programming; most of the remaining
modules are directly called by this basic group. The JMPs you will use in your pro-
grams are highlighted in the chart. The modules they point to form the nucleus of
the Commodore 64's operating system; they perform most input and output with a
simple technique called channel I/O. Understanding channel I/O is the key to using
the operating system, so we will return to this subject later in the chapter.
So, according to Table 5.1, the kernel module called CHKOUT can be called
by executing the program instruction JSR $FFC9. This causes the JMP instruction at
$FFC9 to execute, which routes execution to the module CHKOUT. If you are
curious where CHKOUT or other modules are located in your Commodore 64, you
can find their addresses by using a machine language monitor to look at the last two
locations of each three-byte JMP instruction. Absolute addressed JMPs contain the
module address in the last two bytes of the instruction, with the low byte of the
address first. Indirect addressed JMPs contain the address of the memory locations
holding the module address.
Figure 5.3 Calling a kernel module through the JMP table
Addressable Locations 211
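For instance, a sketch of retrieving CHKOUT's actual address from its jump-table entry at $FFC9 (the page-zero pair $FB/$FC is an arbitrary free choice for this illustration):

```
; Copy the two operand bytes of the absolute JMP at $FFC9
; (CHKOUT's jump-table entry) into page zero, forming a
; pointer to the module itself in this machine's ROM.
        LDA $FFCA      ; low byte of CHKOUT's address (operand byte 1)
        STA $FB
        LDA $FFCB      ; high byte (operand byte 2)
        STA $FC
        ; $FB/$FC now point at the first instruction of CHKOUT
```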
With a monitor program you can also write short test modules that call kernel
modules and watch them execute one step at a time. There is no better way to learn
how an operating system works, but it is a challenging and sometimes tedious
business.
The Commodore VIC 20 has a jump table that is nearly identical to that in the
Commodore 64, and Commodore claims intent to do the same with future compati-
ble computers. Thus Commodore 64 programs using the kernel jump table should
translate easily to other Commodore computers, as long as the program is compati-
ble in other ways (e.g., in program size and use of the I/O chips).
As with the BASIC ROM, the kernel can be in or out of the CPU memory
map. When the kernel is in the map it is always located at addresses E000 through
FFFF hex.
RAM Locations
The 64K bytes of RAM locations are used to hold user programs and their data as
well as any temporary data handled by the two built-in ROM programs. RAM is
located at addresses 0 through FFFF hex.
The first four pages of RAM (100 hex bytes each) are used by the BASIC interpreter and
the kernel to store the temporary data they produce as they execute. The BASIC
locations are available to most machine language programs since such programs
normally remove the BASIC interpreter ROM from the memory map. Additionally,
locations used by any kernel modules or functions that the program never uses will
also be available. This releases some page-zero locations for your use, which allows
using the faster page-zero addressing. However, be sure not to use any locations
assigned to kernel modules that will be used, since dual use leads to errors in both
the kernel and your program.
Data used by the VIC graphics chip to produce the computer's output picture
are also stored in RAM. These data are kept in one of several predefined data struc-
tures, depending on the type of display. Most graphics data can be placed in a
number of different areas in RAM. However, a special purpose graphics data struc-
ture called Color RAM can only be placed in addresses D800 to DBFF hex. We
discuss graphics data in detail in Chapter 6.
All areas of RAM not used by the built-in programs or for graphics data are
available to hold your programs and their data. During programming you will have
to select a beginning address for the program to rest in memory. Under certain cir-
cumstances that we will discuss shortly (e.g., auto-load), you may also have to
specify beginning addresses for modules located apart from the main program. In
selecting memory locations for your program the only rule is to make sure that the
entire program rests in otherwise unused RAM locations. This means that the loca-
tions you have available in which to place your program will depend on the specific
memory map that you select.
The reserved low-address RAM locations are shown in more detail in Tables
5.2 and 5.3. Table 5.2 is a summary of the divisions of low memory into BASIC,
kernel, and graphics sections. Table 5.3 lists the specific kernel and BASIC data
locations that are most important to assembly language programming. Do not ex-
pect the contents of these charts to be very meaningful your first time through. Later
they will be useful as a programming reference, but for now they will have served
their purpose if they introduce you to the major uses of low RAM.
Each section of low RAM is shown with its address range in hexadecimal nota-
tion, what type of program uses it (PROGR means it's free for use by user pro-
grams), and in some cases, what it is used for.
So the locations in address pages 0 and 2 through 3 store temporary data for
kernel modules, for the memory-map and other configurations (through the 6510's
I/O and data direction registers at addresses 0 and 1), for the BASIC interpreter,
and for your programs if you desire. Since most I/O operations performed by your
programs will use the kernel modules, the page 0 kernel locations should be left
alone. All the BASIC locations are free for use by your assembly language pro-
grams. The two BASIC locations that are shown in the following detailed RAM map
are those that you must use in designing programs that load and start executing
automatically under the control of the BASIC interpreter. We will see later how
these locations are used.
Page 1 locations, at addresses 0100 through 01FF hex, make up the stack area.
I/O Locations
The I/O locations are clustered in a 4K-byte block that can be in or out of the
memory map. When present in the memory map this block is always located at ad-
dresses D000 through DFFF hex.
Some of the I/O locations serve as the windows to and from the outside world
via devices such as printers, cassette drives, disk drives, and game joysticks. There
are also built-in devices connecting the computer to the outside world, including
timers and an alarm clock. Another group of I/O locations is used to control the
windows. The remaining I/O locations are unused.
Each of the I/O locations is a register on one of the four I/O chips in the Com-
modore 64. One chip supports graphics, one supports audio, and two support most
other I/O operations. The graphics chip is called the VIC-II, which is short for
Video Interface Controller model II. The audio chip is called SID, for Sound Inter-
face Device. Finally, the general-purpose chips are called CIAs, for Complex Inter-
face Adapters. In this chapter we introduce the first two chips and then concentrate
on the latter two. The following chapters fill in the details on the graphics and audio
chips.
The kernel utilizes the I/O locations to carry out its I/O functions. We will use
kernel modules for I/O wherever possible. However, some I/O functions are not
supported by the kernel, and for these functions we must use the I/O locations
directly. Locations of both types are shown in Table 5.4, with the addresses they
reside at when the I/O block is in the memory map.
From the 88K bytes of available addressable locations, a 64K memory map must be
selected and electrically attached to the microprocessor. 64K bytes of RAM are
always at the foundation of the memory map, brokenly or continuously spanning
The Memory Map 215
addresses 0 to FFFF hex. Variety in the memory maps comes from placing ROM
over specified areas of RAM, and from replacing one specific area of RAM with the
I/O circuits block.
Placing ROMs over the foundation RAM is different from replacing RAM
altogether: a microprocessor reading from a ROM location that is over RAM
retrieves the ROM contents as usual, but one writing to such a location puts data
into the RAM locations underneath the ROM. So, although the microprocessor sees
the currently selected memory map, the video chip has access to all of RAM,
although only 16K at a time (the functioning of the video chip is described in
Chapter 6). One can update video display data that are stored in RAM underneath
ROM in a current memory map, and thereby change the TV picture drawn by the
video chip.
When in the memory map, the BASIC ROM lies over the RAM from A000 to
BFFF hex. Similarly, the kernel ROM lies over the RAM from E000 to FFFF hex,
and the character ROM over the RAM in the shared RAM-I/O area from D000 to
DFFF hex.
Most of the I/O block replaces rather than overlays RAM. This means that the
RAM is completely disconnected from the address bus so that the CPU cannot write
through the I/O locations to it. This is necessary because the I/O chips have
read/write locations like RAM, so that writing to I/O over RAM would change both
locations identically. However, the color RAM between addresses D800 and DBFF
hex holds important data about the color of the TV display that must be program
changeable. Therefore, this address range in the I/O block has been left vacant and
data can be written through the I/O block into the color RAM beneath.
Although most of the RAM in the I/O block address range is detached from
the CPU when the I/O block is present, these RAM locations are always available to
the VIC chip, which has its own memory map that never includes the I/O block.
The various CPU memory maps are selected by writing binary values into the
three least significant bits of the 6510's on-board I/O register, at address 0001 hex
(Fig. 5.4).
d7 d6 d5 d4 d3 d2 d1 d0
Figure 5.4
The upper 5 bits are used for cassette control or have no defined
function. These higher-order bits are usually handled only by the kernel routines,
but their contents must still be preserved when we change the lower 3 bits. The safest
way to do this is to retrieve the byte in location 0001 hex to the accumulator, AND
or OR the accumulator with bit patterns that change only the desired bits, and return
the accumulator's contents to location 0001 hex.
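Following that recipe, here is a sketch of switching to the all-RAM map (memory map bits 000) and back to the default BASIC map (bits 111); disabling interrupts while the kernel ROM is out of the map is an extra precaution, since the interrupt vectors normally live in kernel ROM:

```
; Remove BASIC, kernel, and I/O from the map, preserving the
; cassette-control bits d7-d3 of the 6510 I/O register.
        SEI            ; hold off interrupts while kernel ROM is out
        LDA $01        ; fetch the I/O register at 0001 hex
        AND #%11111000 ; clear d2-d0 only -> the 64K RAM map
        STA $01
        ; ... read or write the RAM under the ROMs here ...
        LDA $01
        ORA #%00000111 ; set d2-d0 -> the default BASIC map
        STA $01
        CLI            ; interrupts are safe again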
With three memory map bits, up to eight (2³) different memory maps can be
selected. However, two of the maps are identical, reducing the total number of
unique maps to seven.
Overview Map
As the name "memory map" implies, the 64K address space can be shown picto-
rially. In the overview memory map shown in Fig. 5.5, all 88K available locations
are pictured at their assigned addresses. The individual memory maps are
constructed by selecting locations of one type in each area where two or more
types of locations are allowable.
Figure 5.5 Overview memory map: the 64K RAM foundation (0000-FFFF) with
kernel ROM at E000-FFFF, the I/O block and character ROM at D000-DFFF
(color RAM at D800), BASIC ROM at A000-BFFF, the default screen at 0400,
and the ROM-code use area below 0800.
In Fig. 5.5, the 64K RAM foundation map appears at left, labeled with ad-
dresses. The ROM and I/O areas that overlay or replace portions of it are shown to
the right beside the locations they occupy.
Selecting the map. The three ROMs and the I/O block are attached to and
detached from the memory map using the memory map bits in the 6510's I/O
register at address 0001 hex. A table of the bit patterns for attaching these sections
(Table 5.5) resembles the binary counting table. In this table, all possible values of
bits d0, d1, and d2 of the I/O register are shown together with the contents of their
resulting memory maps. The table starts with d2 equaling 1 instead of 0 as in the
normal counting sequence, to show the most important memory maps first. Also,
memory maps that are especially, or perhaps only, well suited to a single purpose are
labeled as such (e.g., "assembly" for holding assembly language programs).
Note that changing d2 from a 1 to a 0 merely substitutes the character ROM
for the I/O block, except in the all-RAM maps. The main reason for making this
substitution is to make the character set available for transfer into RAM and subsequent use.
Using the Kernel: Channel I/O 221
TABLE 5.5  Memory map bit patterns (I/O register, 0001 hex)

Use              d2 d1 d0   Map contents
ASSEMBLY          1  0  0   64K RAM
                  1  0  1   60K RAM, 4K I/O
                  1  1  0   52K RAM, 4K I/O, 8K kernel
BASIC             1  1  1   46K RAM, 4K I/O, 8K kernel, 8K BASIC
DATA TRANSFERS    0  0  0   64K RAM (same as 100)
                  0  0  1   60K RAM, 4K character set
                  0  1  0   52K RAM, 4K character set, 8K kernel
                  0  1  1   46K RAM, 4K character set, 8K kernel, 8K BASIC
In channel I/O, a program uses the kernel to send and receive data through a com-
munications channel between the CPU and a nonmemory device. The concept can
be illustrated with a historical example. Prior to the presidency of John F. Kennedy,
there was no means for the President of the United States and the Soviet head of
state to communicate quickly and reliably. Of course, there were communications
paths such as radio and telephone between the two countries, but in their normal
form none of these were sufficient for the special needs during tense situations.
In 1963, existing communications paths were tailored and combined with com-
munications devices called teletypes at each leader's end to form a communications
link called the "hot line." The hot line is inactive except in crisis situations be-
tween the two countries. When activated, however, it becomes a communications
channel for carrying information directly between the national leaders.
The Commodore 64 also has links and channels. A link consists of a com-
munications path connected to the computer on one end and to an I/O device or
peripheral on the other. As with the hot line, a Commodore 64 link is normally in an
inactive state. To communicate with a peripheral a program must both create a link
and activate it using the kernel. Thus a channel is an activated link.
Channels are unidirectional; that is, a given channel can input or output data,
but not do both. Two channels are supported by the kernel: one for input and one
for output. However, there is a rarely used method that circumvents the kernel to
allow multiple output channels. We will discuss it later.
When the computer is first turned on, or when channels created by a program
have been dissolved, the kernel forms the input channel from the keyboard and its
communications path, and the output channel from the TV screen and its com-
munications path. These are the default channels. A program can replace these
channels with channels for communicating with any other peripheral attached to the
Commodore 64. When a program finishes using its selected channels and deactivates
them, the kernel automatically reselects the default channels.
The kernel provides several ways to move data through a channel. Depending
on which communications path and device have been selected, data may be moved
as a complete file data structure, as parts of a file, or as sentence or character units
tailored for human input or human viewing. We will call these choices whole-file
I/O, partial-file I/O, and interactive I/O, respectively. This flexibility allows a pro-
gram to transfer only the amount of data needed for a given purpose. The computer
is unavailable for program execution while data are being transferred, and some of
the communications paths are excruciatingly slow (such as those to the disk or
cassette drives), so the ability to transfer a little data at a time and get back to other
work can be invaluable.
The steps in using a channel are quite simple. First, a program creates a com-
munications link containing a peripheral and the communications path to which it is
attached. Second, the program activates the link as an input or output channel.
Third, the program communicates with the peripheral via the channel. Fourth, when
communications have been completed, the channel is deactivated. Finally, the link is
dissolved. These steps are summarized below:
1. Create a link.
2. Activate the link.
3. Communicate via the channel.
4. Deactivate the channel.
5. Dissolve the link.
How these steps are executed depends on the type of channel being used. Thus,
to understand how to use a channel from a program, we must first be familiar with
the component parts of channels: communications paths and peripheral devices.
These are discussed next.
Channel Parts
tached to a path, it is clear that the IEEE bus can support many links at once. This
will lead to some interesting possibilities.
The IEEE-488 path was originally designed to attach test equipment such as
voltmeters and oscilloscopes together for remote control and monitoring. This is still
its most common use. However, Commodore has also made use of the bus to con-
nect a series of business-quality peripherals to some of its other computers, although
not to the Commodore 64.
Economy has dictated the Commodore 64's departure from the IEEE-488
standard. Most of the IEEE rules have been obeyed, but a less expensive bit-wide or
serial data path has been substituted. Commodore calls the new path the serial bus.
IEEE-488 peripherals will not work with this path, so Commodore built
special C64 peripherals that will. Among them are familiar devices such as the 1541
disk drive and the 1515 printer. Many but not all C64 peripherals attach to the com-
puter through this bus. Communications with the disk drive are relatively slow
because serial data transmission causes a bottleneck. Only about 300 bytes can be
transmitted per second. However, the serial bus is easy to use because the kernel
handles its functions for us.
Full IEEE-488 communications can still be performed by the Commodore 64
if an IEEE-488 interface card is attached to the computer. However, the complica-
tions in coordinating IEEE-488 bus activity to the C64's internal timing re-
quirements make using most of these cards a matter of very advanced and extensive
programming.
A better alternative is to attach a bus converter to the computer. This type of
device converts the standard signals on one type of communications path to stan-
dard signals of another. In this case, the conversion is from the control method and
serial data of the serial bus to the control method and parallel data of the IEEE bus,
and vice versa. Of course, the maximum speed for data transfer over the IEEE bus
must be restricted to the maximum speed of the serial bus for the conversion to
work. The programmer uses the same kernel calls and machine code that he or she
used with the serial bus alone. One bus converter is the Interpod, made by Oxford
Computer Systems of Oxford, England, and available in the United States.
A faster-operating variation on this type of bus converter plugs into the
cassette port to gain access to the C64's internal buses and then substitutes for and
makes itself appear like the serial bus to the kernel. It makes the conversion to and
from the IEEE-bus communications without the speed limitations of physically go-
ing through the serial bus. An excellent example of this type of converter is the
BusCard II by Batteries Included, which is available from retailers or from their of-
fice in Irvine, California. This unit is more expensive than the Interpod, which is the
trade-off made for the unit's extra speed, a built-in machine language monitor, and
a BASIC 4.0 interpreter. It is a well-built and easy-to-use device and a good standard
of comparison for evaluating other bus converters. It also has a serial-to-Centronics
bus converter, which allows using non-Commodore printers as if they were attached
to the serial bus.
Commodore makes a line of peripherals that are faster and more robust than
those of their Commodore 64 line. These peripherals are designed to work with
Commodore business computers via the IEEE bus. They include disk drives and
printers, and can be used with the Commodore 64 through a serial-to-IEEE bus con-
verter like the Buscard. The bus converter allows a 33% speed increase over the
serial bus in transferring data to and from a disk drive. This could be important to
someone who routinely writes and assembles large programs, for instance, because
assembling a large program requires moving a lot of data back and forth between
the computer and the disk drive. Using the Commodore 64's serial bus and a 1541
disk drive a long program can take as long as a half hour to assemble. The 1541 has
an annoying tendency to overheat during such prolonged use, often causing ir-
reparable damage, unless extra steps are taken to cool the drive (e.g., with a fan
directed at the drive).
The Commodore business drives are better than the 1541 but they have their
own disadvantages, namely expense and incompatibility with the copy protection
schemes designed for the 1541 drive. By accessing special characteristics of the 1541
drive these schemes prevent many purchased programs from being copied onto
backup disks and, incidentally, prevent the same programs from running on the
Commodore's business drives.
A preferable 1541 replacement drive, one of lower cost than the Commodore
business peripherals and that works well with most but not all copy protection
schemes, is the Super Disk available from MSD Inc. of Dallas, Texas. The Super
Disk can be used directly on the serial bus or in combination with the Buscard on the
IEEE bus for faster data transfer rates.
Both Commodore and MSD produce a dual drive configuration of their 1541
replacement drives. For serious software development a dual drive 1541 replace-
ment, particularly with a serial-to-IEEE-488 bus converter, makes for a durable and
fast system that simplifies the frequent disk backups necessary to protect your work.
This is because in the dual configuration both drives have the same device number,
8, with the two drives being numbered 0 and 1 in ASCII messages sent to the drives.
Device numbers and drive messages are discussed in detail later in this chapter. Shar-
ing the same device number allows the data on a disk in drive 0 to be copied directly
onto a disk in drive 1 with a single BASIC DUPLICATE or BACKUP command.
This is much quicker and easier than a backup operation with 1541 drives, which re-
quires loading files into memory from the original disk and saving them to the
backup disk one at a time.
Describing the complexities of actual IEEE-488 bus activity is beyond the
scope of this book, so if you are interested in using the IEEE bus the hard way (i.e.,
without a bus converter) or just in learning more about it, I suggest you obtain a
copy of one of the IEEE bus tutorials on the market and enjoy.
The Commodore 64 provides both the physical signals and kernel support for
an RS-232 path. However, the voltage of the C64's signals is too low for RS-232
peripherals, so you must at the least purchase an RS-232 board that plugs into the
C64 to use these devices.
Most microcomputer peripherals on the market communicate over an RS-232
path. Since Commodore's peripherals use a serial or IEEE-488 path, you will prob-
ably never need to use an RS-232 peripheral. If you should desire to use one,
however, you have the same two options as with IEEE-488. First, you can obtain a
bus converter to bridge the serial and RS-232 buses. Interpod contains one of these
converters. As with a serial-to-IEEE bus converter, an RS-232 bus converter should
let your programs communicate with RS-232 devices using the same kernel calls and
assembly language code as they use with devices on the serial bus. Be sure to check
any bus converter's user's manual before purchase to make sure that it works this
way.
The second option is to use a less-expensive board that converts the available
RS-232 signals to the correct voltage. This is a more difficult alternative since pro-
gramming for these boards forces you to directly manipulate the RS-232 locations in
the memory map. Most people should avoid this alternative. However, if you enjoy
a challenge or simply appreciate pain, your reference for direct RS-232 program-
ming will be Commodore's Programmer's Reference Guide. It gives the
minimum and only available information on these locations in its RS-232 and de-
tailed memory map sections.
Data Cassette. The C64's third type of communications path is dedicated to
the data cassette. This path carries data serially, like the serial bus, but is otherwise
simpler since it services only one peripheral. Its speed is limited by the data cassette,
which as any data-cassette user knows is almost interminably slow. Any other
physical characteristics are unimportant to the programmer because, unlike the
IEEE-488 and RS-232 paths, the cassette path is completely supported by the kernel
and C64 circuitry.
Keyboard and Screen. The last two paths are also physically dedicated to
just one peripheral device each. The keyboard path carries data from the keyboard
to the computer. The screen path carries data from the computer to the TV screen.
Recall that these paths are contained in the kernel's default channels, the channels
that are selected whenever a program has not substituted channels of its own choos-
ing.
The screen path normally carries data in a 40-character-per-screen-line or
40-column format. A typical typed page can have up to 80 characters per line, but a
normal TV set cannot show that much detail. So word processing programs for the
C64 wrap around each line in a document to show its two halves on two lines.
However, something like a bus converter is available that converts this format to an
80-column format that shows an entire page across. These devices, called 80-column
cards, are used with computer monitors that can clearly display a full 80 columns.
The B.1.-80 card from Batteries Included of Irvine, California, is a high-quality ex-
ample of this type of converter.
Since these paths are fully supported by the kernel and C64 circuitry, no fur-
ther knowledge of their physical characteristics is needed to use them.
The first three C64 communications paths all connect an automated machine,
the computer, with an automated machine that is a peripheral. Data are moved be-
tween the machines as simple files, leaving the interpretation of the transferred data
to the machines (i.e., their internal programs or circuitry). Thus the serial/
IEEE-488, RS-232, and cassette buses are called the file paths.
The last two paths connect an automated machine, the computer, with a
machine that is directly controlled and used by a human being. The file format is not
well suited for human input or viewing, so the kernel intervenes and places input and
output data into sentence and character formats that are more natural to human be-
ings. The give-and-take interaction between user and computer that these paths sup-
port leads them to be called the interactive paths.
The kernel handles channels containing file paths differently than channels
containing interactive paths. When we discuss how channels are used in programs,
we will cover the types of channels in each of these categories separately.
The tree diagram in Fig. 5.6 summarizes the communications paths available
to the Commodore 64.
Figure 5.6 The Commodore 64's communications paths: the file paths
(serial bus/IEEE-488, RS-232, cassette) and the interactive paths
(keyboard, screen).
the serial bus is normally "8." A secondary address value between 2 and 14 com-
mands the disk drive to prepare to transfer data with the computer. So, assuming
that the disk drive is attached to the serial bus, if a program calls the appropriate
kernel modules to send a primary address of "8" and a secondary address of "8"
out the bus, the drive will activate itself and prepare to transfer data with the com-
puter.
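This sequence can be sketched with the standard kernel jump-table entries SETLFS ($FFBA), SETNAM ($FFBD), and OPEN ($FFC0), which are covered in the channel programming discussion later in this chapter; the logical file number 2 here is an arbitrary choice:

```
; Prepare the disk drive to transfer data with the computer.
        LDA #2         ; logical file number (program's choice)
        LDX #8         ; primary address: disk drive on the serial bus
        LDY #8         ; secondary address: prepare to transfer data
        JSR $FFBA      ; SETLFS - record the link parameters
        LDA #0         ; filename length 0: no name in this sketch
        JSR $FFBD      ; SETNAM - record the (empty) filename
        JSR $FFC0      ; OPEN - send the addresses out the bus and
                       ; enter the link in the open table
```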
A 1541 disk can store up to 170K bytes of data in from 1 to 144 files, or in-
dependent data structures.
RS-232. Only one device can be attached to the RS-232 path at a time, so
devices need no bus address. However, the RS-232 standard allows several options
on how data are physically handled, which can vary from one peripheral to the next.
These options must be accounted for by a program using the RS-232 bus. Generally,
the program adjusts the computer or bus converter to match the characteristics of
the peripheral.
In the RS-232 standard, data are sent serially at a selectable transmission rate,
with the data divided into groups of seven or eight bits followed by a selectable num-
ber of stop bits signifying the end of the group, and with one of several types of par-
ity assigned to the groupings. Each data group may also be preceded and/or fol-
lowed by "no data" 1 values that tell the receiving device when valid data are on the
path.
The first quantity, transmission rate, governs the length of time each bit exists
on the RS-232 bus. Common transmission rates are 110, 300, 1200, 2400, and 9600
bits per second, or baud. The devices on both sides of the bus must be "told" the
same transmission rate so that the receiver can read the value on the data line at the
same rate that the transmitter is writing the data.
The last three quantities relate to the low-level physical structure of RS-232
data. A single data-bit group is isolated in Fig. 5.7 by surrounding "no-data" 1
values.
The RS-232 data line is normally at the binary 1 value. When a data unit is
transmitted, the line changes to a 0 for one bit length of time at the current transmis-
sion rate. The data receiver uses this start bit to synchronize itself with the data
transmitter.
Seven or eight bits of data then follow on the line. One of those bits can be
reserved for parity checking, which we described in Chapter 1. Parity can be
"even," for the parity bit being set to 1 as needed to maintain an even total number
of 1's in the data word; "odd," to maintain an odd number of 1's in the data word;
or "none," for no parity checking.
Finally, one, one and one-half, or two stop bits of binary value 1 are transmit-
ted. "One and one-half" stop bits simply means that a 1 value is held on the line for
...1  |  0  |  d d ... d  |  1  |  1...
no data  start bit  data bit(s)  stop bit(s)  no data
Figure 5.7
kernel keeps track of these user-defined links by placing the link definitions in a
table of up to 10 entries.
The link-definition area is called the open table and is located starting at ad-
dress 0259h. Each link definition consists of three bytes which are provided by the
executing program. The first byte in each three-byte group is the link ID number. A
program refers to a link definition by this number when telling the kernel to activate
the link to form a channel or to dissolve the link by removing it from the open table.
Commodore calls this first byte the logical file number or lfn, for reasons that will
become clear later. The second byte is a device number which identifies the
peripheral device and communications path being used. The third byte is a com-
mand to the device or the kernel for when the link is activated into a channel. If no
command need be sent, this byte can be given a dummy value.
Conceptually, the open table is of the form shown in Fig. 5.8. Links are added
to the table in the order that they are defined, from position 1 through position 10.
The open table is physically stored as 10 consecutive lfn bytes starting at loca-
tion 0259h, followed by 10 consecutive device-number bytes starting at 0263h,
followed by 10 consecutive command bytes starting at location 026Dh. So if the
kernel needs to read the link definition whose lfn is 8, and if that definition is third
in the open table, the kernel will read the third location in the lfn area, at address
025Bh, for the lfn, the third location in the device number area, at address 0265h,
for the device number, and the third location in the command area, at address
026Fh, for the command byte.
Unused definitions in the table are filled with 0 values by the kernel. The
kernel will not accept more than 10 link definitions from a program until definitions
have been removed from the table to make room.
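The address arithmetic above can be sketched by indexing all three areas of the open table with the X register; the labels LFN, DEV, and CMD are hypothetical storage locations for this illustration:

```
; Read the third link definition from the open table.
        LDX #2         ; zero-based table position (2 = third entry)
        LDA $0259,X    ; lfn area:           0259h + position
        STA LFN
        LDA $0263,X    ; device-number area: 0263h + position
        STA DEV
        LDA $026D,X    ; command area:       026Dh + position
        STA CMD
```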
Each of the three bytes in a link definition is discussed in detail next.
Logical File Number. An lfn can be any value between 01 and 7F hex. It
would be ambiguous to give the same lfn value to two different links, so the kernel
will not accept an attempt to do so (we shall see how this works shortly).
Device Number. The device number identifies both a device and the path to
which it attaches. This is done by dividing the possible device-number values from 0
to FF into five ranges, one for each communications path.
Figure 5.8 Open table (link definitions 1 through 10, each holding an lfn, a device number, and a command byte)
232 Connecting the Nerves: Using the Memory Map Chap. 5
TABLE 5.6 Device-number ranges and assigned devices

  Path ranges:              Assigned devices:
  0      Keyboard           0     Keyboard
  1      Cassette           1     Cassette
  2      RS-232             2     RS-232
  3      Screen             3     TV monitor
  4-255  IEEE/serial        4-5   Printers
                            8-11  Disk drives
The RS-232, cassette, keyboard, and screen paths each allow for only one at-
tached device at a time, so only one device number is needed to select both path and
device. Therefore, the number ranges for these paths contain just one value each.
The IEEE/serial-bus path can have many attached devices at once, so it has
been allotted the remainder of the possible byte values. Device numbers in the
IEEE/serial-bus range are used as primary addresses to select particular peripherals
on the bus.
The device-number ranges for the five communications paths, and the specific
device numbers that have already been assigned to particular peripherals, are shown
in Table 5.6.
Commands. The command byte serves as a command to the device on the
link or to the kernel on how to use the link. If the link contains an IEEE/serial-bus
peripheral, the command byte will be sent to it as a secondary address. Where no
command is necessary, a dummy value of FFh should be placed in the command
byte.
Table 5.7 shows useful values of the command byte for the most common
types of links. Note that the final meaning of many of these command values
depends on how the link is used in channel I/O. Thus the "meaning" column of this
table will remain meaningless to you until we have discussed the channel I/O opera-
tions.
Channel Programming
Using channel I/O distances the programmer from the physical considerations in
moving data between the computer and a peripheral. The physical aspect of com-
municating with a peripheral includes details such as monitoring and controlling the
voltage on individual lines in the physical bus, comparing physical changes on the
bus with a clock to control the timing of other changes within predefined limits, and
converting data in memory into a form compatible with the path, and vice versa.
The kernel handles these physical details and provides other services so that we can
treat communicating with a peripheral as a simple matter of setting up a channel,
moving data through it, and closing out the channel.
1. Create a link.
2. Activate the link.
3. Communicate via the channel.
4. Deactivate the channel.
5. Dissolve the link.
It is useful to group these steps into two activities: channel management and
channel communications. Channel management consists of steps 1, 2, 4, and 5,
which involve setting up and closing out channels. Channel communications consists
of step 3.
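The five steps map onto the kernel modules introduced in the next section as shown in this skeleton (register setup is omitted for brevity; the JSR addresses appear in the module summaries):

```asm
        JSR SETLFS      ;1. create a link (SETNAM too, if named)...
        JSR OPEN        ;   ...and place it in the open table
        JSR CHKIN       ;2. activate the link (CHKOUT for output)
        JSR CHRIN       ;3. communicate (or GETIN/CHROUT)
        JSR CLRCHN      ;4. deactivate the channel
        JSR CLOSE       ;5. dissolve the link
```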
Channel Management
The kernel provides one group of modules for channel management, and another
group for channel communications. The channel management modules are in-
troduced below. They will be used for managing the various types of channels we
will discuss afterward.
Kernel support. The kernel modules used to create and terminate channels
are SETLFS, SETNAM, OPEN, CHKIN, CHKOUT, CLOSE, and CLRCHN. As
they execute, these modules can also provide feedback to the person running the
program by displaying error and status messages on the screen. Error messages in-
form the user of any condition causing a kernel module to be unable to perform its
task. This is particularly useful in debugging the program. Status messages inform
the user of the kernel module's status in performing its task. Either, neither, or both
types of messages are selected for all modules by executing a kernel module called
SETMSG.
The channel management modules are summarized below. The "prerequi-
sites" category refers to modules that must be called before calling the current
module. The "altered registers" category lists the microprocessor registers that are
altered during the module's execution. These registers must be saved before the
module call and reloaded after the return if their data are to be preserved. Moving
them through the accumulator into and out of the stack is usually the best method.
Note that register Y is unaffected by most of the routines. This makes Y a good
place to store a loop counter or other frequently used variable. The JSR address is
the address of the module's JMP table entry; the program should execute a JSR to
the JSR address to call the kernel module.
The other summary categories are self-explanatory.
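Preserving altered registers by moving them through the accumulator into and out of the stack, as suggested above, looks like this (SOMEMOD stands for any kernel module):

```asm
        TXA
        PHA             ;save X on the stack
        TYA
        PHA             ;save Y on the stack
        JSR SOMEMOD     ;the kernel call that alters registers
        PLA
        TAY             ;restore Y (popped in reverse order)
        PLA
        TAX             ;restore X
```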
SETLFS:
Purpose: Define a communications link.
JSR address: FFBA hex.
Prerequisites: None.
Data passed: Lfn in A, device # in X, command in Y.
Data returned: None.
Altered registers: None.
SETNAM:
Purpose: Name a file-path link.
JSR address: FFBD hex.
Prerequisites: SETLFS.
Data passed: Name of 1 to 16 XASCII characters, beginning at
'name_address' in memory, with low byte of 'name_
address' in X and high byte of 'name_address' in
Y, and byte length of name in A.
Data returned: None.
Altered registers: None.
OPEN:
Purpose: Add link definition to open table,
prepare the peripheral to send or receive file.
JSR address: FFC0 hex.
Prerequisites: SETLFS (SETNAM if link is to a named file).
Data passed: None.
Data returned: Error #s 1,2,4,5,6,7,F0 in A (see SETMSG).
Altered registers: A, X, and Y.
CHKIN:
Purpose: Select a link to be input channel.
JSR address: FFC6 hex.
Prerequisites: OPEN.
Data passed: Lfn in X.
Data returned: Error #s 0,3,5,6 in A (see SETMSG).
Altered registers: A and X.
CHKOUT:
Purpose: Select a link to be output channel.
JSR address: FFC9 hex.
Prerequisites: OPEN (for any channel except keyboard or screen).
Data passed: Lfn in X.
Data returned: Error #s 0,3,5,7 in A (see SETMSG).
Altered registers: A and X.
CLOSE:
Purpose: Remove link from open table, close file in storage.
JSR address: FFC3 hex.
Prerequisites: Makes sense only after OPEN.
Data passed: Lfn in A.
Data returned: Error #s 0,F0 (see SETMSG).
Altered registers: A, X, and Y.
CLRCHN:
Purpose: Deactivate serial bus devices, terminate channels,
set channels to defaults (keyboard and screen).
JSR address: FFCC hex.
Prerequisites: CLRCHN is only needed after channels have been created.
Data passed: None.
Data returned: None.
Altered registers: A and X.
SETMSG:
Purpose: Select type of message feedback from kernel modules.
JSR address: FF90 hex.
Prerequisites: None.
Data passed: Desired message type, in A:
0 = no messages
4 = error messages (e.g., I/O ERROR #5)
8 = status messages (e.g., SEARCHING FOR filename)
C = error and status messages.
Data returned: None.
Altered register: A.
SETMSG note. Kernel error messages are displayed in the form of the words
I/O ERROR # followed by a number. If error messages have not been selected, the
error number is simply returned by the kernel in the accumulator. The meaning of
each number is shown in Table 5.8, followed by the names of the kernel modules
most likely to draw each type of error (including channel communications modules).
Kernel status messages are short phrases that inform the program's user of
what to do next and what the kernel is doing. An example of the former type of mes-
The assembly code above can be preceded by a call to SETMSG to select the
type of user feedback that the called modules will provide.
Preparing to store or retrieve a file on a storage device first requires defining a
link to that device using SETLFS. The name of the file can then be specified by pass-
ing the file name to the module SETNAM from somewhere in RAM. This is usually
done by including the name in the program as reserved bytes-for instance, using a
BYTE or similar program directive, as discussed in Chapter 3.
No further channel management operations are required for whole-file 110.
Setting up and executing the appropriate channel communications module
automatically terminates the channel when the transfer has been completed.
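As a sketch, the link-definition step for whole-file I/O might look like this for a disk file; the lfn, file name, and directive syntax are placeholders chosen for illustration:

```asm
SETLFS = $FFBA
SETNAM = $FFBD

        LDA #$01        ;lfn (any value from 1 to 7F hex)
        LDX #$08        ;device number: first disk drive
        LDY #$01        ;command 1: load/save at original address
        JSR SETLFS
        LDA #NAMELEN    ;byte length of the file name
        LDX #<NAME      ;low byte of the name's address
        LDY #>NAME      ;high byte of the name's address
        JSR SETNAM
        ;...the transfer itself is performed by SAVE or LOAD,
        ;covered in the channel communications section...
NAME    .BYTE 'TEST'    ;file name reserved in program memory
NAMELEN = 4
```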
Data Cassette. One specific aspect of whole-file channel management with the data
cassette is the choice of parameters passed to SETLFS. The lfn can be any value
from 1 to 7F hex. The device number is always 1. The command number can be
from 0 to 2 to select file-handling options that we can now explain.
The command byte is used to prepare the kernel for the upcoming file transfer.
The file-handling option selected by the command byte is exercised during the trans-
fer itself. Therefore, the meaning of the command byte depends on the type of
transfer that will be performed. An output transfer is called a save and an input
transfer is called a load.
If the file is to be saved on the cassette, a command byte of 1 will cause an EOF
or end-of-file marker to be written at the end of the file when it is transferred. The
kernel will use this marker when loading the file to determine when the entire file has
been transferred. A command number of 2 followed by a file save causes both an
EOF marker and an EOT or end-of-tape marker to be written after the file. The
EOT marker effectively ends the tape. When the kernel is searching for a file during
a whole- or partial-file input operation and it encounters the EOT marker, it will
stop its search and return control to the calling program.
If the file is to be loaded from the cassette, the 0 and 1 command values select
where the kernel will obtain the beginning address of the memory area in which to
place the file. A command value of 0 tells the kernel to look for the beginning ad-
dress in the X and Y registers when the kernel module LOAD, which loads the file, is
called. A command value of 1 tells the kernel to place the file into the same locations
from which it was originally saved. The details of whole-file channel communica-
tions (i.e., loading and saving files) will be covered in the channel communications
section.
With the cassette, identifying the file name by setting up and calling SETNAM
is optional. A file can be saved without a name, and performing a load operation
without specifying a file name results in the kernel obtaining the first file it comes to
on the tape.
A file in memory can be saved with the same name as a file already existing on
the cassette. When the data are saved, a new file of the same name is created starting
at the position at which the tape is currently located, overwriting whatever was on
the tape previously. The kernel does not check the tape for another file of the same
name. This situation is treated differently when the file is being stored on disk, as the
next section explains.
Disk Drive. The parameters passed to SETLFS for disk files are as follows.
As always, the lfn can be from 1 to 7F hex. The device number depends on which
disk drive is being accessed. Recall that Commodore 1541 disk
drives are delivered with a device number of 8. Most C64s use only one drive. Never-
theless, up to four drives can be attached, and the additional drives are given device
numbers 9 through 11. Command numbers of 0 and 1 have the same effect as with
the data cassette and should be used accordingly.
The names of disk files must be specified, so SETNAM must be called. This re-
quires an expanded name format. To access a file on drives 9 through 11, a device
number between 9 and 11 must be passed to SETLFS, and the XASCII file name
must be preceded by a drive number from 1 to 3 in XASCII code, respectively, and a
colon. To compensate for the extra characters, the 'file name length' value passed to
SETNAM must be increased by two, and the 'file name address' values must point
to the drive-number byte instead of the first byte in the file name itself.
Thus, depending on which drive is being accessed, the file name passed to
SETNAM will be in one of the following two formats:
<dr>:<filename> OR
<filename>
For instance,
1:test
test
The drive number dr ranges from 0 to 3 (ASCII code) for drive device numbers
8 through 11, respectively. Passing SETLFS a 0 command tells the kernel to place
the directory at an address passed to the LOAD module in registers X and Y. Pass-
ing SETLFS a 1 value causes the directory to be loaded at address 800 hex. This is
fully explained in the whole-file I/O portion of the channel communications section.
The drive number part of the directory name can be omitted for drive 0 (device
number 8), so that the name becomes simply $. Thus, passing SETLFS the device
number 8, the command value 1, and passing SETNAM the file name $ prepares a
link for reading the directory of the first C64 drive into memory starting at address
0800 hex. A device number of 9, a command byte of 1, and a file name of $1 will
prepare for reading the directory of the second drive into location 800 hex, and so
on.
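A sketch of the link definition just described — reading the first drive's directory into memory at 0800 hex — follows; the lfn is an arbitrary choice:

```asm
SETLFS = $FFBA
SETNAM = $FFBD

        LDA #$01        ;any lfn from 1 to 7F hex
        LDX #$08        ;first disk drive
        LDY #$01        ;command 1: load at 0800 hex
        JSR SETLFS
        LDA #$01        ;the name is one character long
        LDX #<DIRNAME
        LDY #>DIRNAME
        JSR SETNAM
        ;...follow with the LOAD call from the channel
        ;communications section...
DIRNAME .BYTE '$'       ;directory name for drive 0
```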
The directory is in the form of a series of 32-byte-long entries, each containing
a file name followed by a file type. The file name is the 1- to 16-byte XASCII name
passed to SETNAM, without the optional drive number and colon. The file type in-
dicates to the operating system whether the file was created using a whole-file or one
of two varieties of partial-file channel. This notifies the operating system of the
channel-communications operations that can legally be performed on the file.
Files created using a whole-file channel are given type PRG, which reflects
their common use for holding programs. They can be accessed with whole-file I/O or
with a type of partial-file I/O called sequential I/O, which is introduced in the next section.
Next, one of two modules is called to activate the link as a channel. The kernel
module CHKIN is called to create an input channel, while the module CHKOUT is
called to create an output channel. This completes the second step of channel I/O,
activating a link into a channel. The channel is then ready to carry data to or from
the cassette.
The complete assembly code for creating a sequential-file channel is shown
below. As before, it may be preceded by a call to SETMSG.
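The sequence can be sketched as follows for an output channel to a disk sequential file; the lfn, command byte, and file name are placeholder values, and directive syntax varies by assembler:

```asm
SETLFS = $FFBA
SETNAM = $FFBD
OPEN   = $FFC0
CHKOUT = $FFC9

        LDA #$02        ;lfn
        LDX #$08        ;device number: disk drive
        LDY #$02        ;command byte (2 to 14 for sequential files)
        JSR SETLFS
        LDA #NAMELEN
        LDX #<NAME
        LDY #>NAME
        JSR SETNAM
        JSR OPEN        ;add the link to the open table
        LDX #$02        ;lfn of the link to activate
        JSR CHKOUT      ;activate as output channel (CHKIN for input)
NAME    .BYTE '0:TEST,S,W'
NAMELEN = 10
```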
Executing CLOSE removes the link from the open table and if the file is on
disk or cassette, performs any file housekeeping necessary to make the file available
for future use.
Sometimes programs have a reason to alternate between two or more input
channels or between two or more output channels. You could execute all the channel
management steps above each time a particular channel is used, but that is ineffi-
cient.
Instead, a number of links can be created and kept in the open table by ex-
ecuting all the channel-creation instructions up to but not including CHKIN or
CHKOUT for each link. A program selects the input channel and the output channel
at any given time from the links in the open table, by setting up and executing
CHKIN or CHKOUT. When finished with a particular channel, the program can
deactivate it without removing it from the open table by executing CLRCHN but not
CLOSE. After the last time the channel is used, however, it must be completely
dissolved by executing CLOSE. This signals the kernel to perform housekeeping
work that will make the file accessible to the kernel in the future. If you omit this
step with a cassette or disk channel, you can lose the contents of your file.
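Switching between two open links, as described above, reduces to pairs of CHKIN (or CHKOUT) and CLRCHN calls; this sketch assumes links with lfns 2 and 3 are already in the open table:

```asm
CHKIN  = $FFC6
CLRCHN = $FFCC

        LDX #$02
        JSR CHKIN       ;lfn 2 becomes the input channel
        ;...read through the channel...
        JSR CLRCHN      ;deactivate, but leave link in the table
        LDX #$03
        JSR CHKIN       ;now lfn 3 is the input channel
        ;...read through the channel...
        JSR CLRCHN      ;deactivate again; CLOSE only when finished
```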
Data Cassette. The parameters passed to SETLFS for a cassette link are as
follows. The lfn can be between 1 and 7F hex. The device number must be 1. A com-
mand byte of 0 signals the kernel that an input channel will be created, while a 1 or a
2 signals the creation of an output channel. A command byte of 1 will also cause the
file written through the channel to be written with a trailing EOF marker, and a 2
will cause the file to be written with both an EOF and an EOT (recall the command
chart in the link section and the discussion of command numbers in the cassette por-
tion of the whole-file section).
When the kernel module OPEN is called, a command byte of 0 causes the
the kernel to search for the file on the cassette, while a command byte of 1 or 2 causes
the kernel to create a file on cassette. As with whole-file channels, cassette files need
not be named and SETNAM is optional. If SETNAM is not used, a created file will
be nameless. Its position on the tape should be written down so that it can be read
later by rewinding the tape to a position just before the file and creating a nameless
input channel to the cassette.
When the link is activated into a channel using CHKIN or CHKOUT, the
channel direction chosen must be consistent with the direction chosen by the com-
mand byte.
Disk Drive. Both the sequential-file and relative-file methods for partial-file
I/O can be used with the disk drive. We discuss channel management for these two
I/O methods separately below.
Sequential Files. The link parameters passed to SETLFS are as follows. The
lfn is any number between 1 and 7F hex. The device number is between 8 and 11,
depending on which drive is being accessed. The command byte is any number be-
tween 2 and 14.
The file name format passed to SETNAM is in a slightly different format than
with previous channel types. The format used with the disk drive is as follows:
<dr:><file name>,<S or P>,<R or W>
As usual, the dr: field is optional with a disk of device number 8. The file
name portion is composed of from 1 to 16 XASCII characters. The third field, con-
sisting of ASCII ",S" or ",P", indicates the kind of file that will be accessed. ",S"
is used to create a file through the channel or to read from an existing file that was
created through a sequential-file channel earlier. Such files have type SEQ in the disk
directory. ",P" is used to read a PRG (whole-file) file through a sequential-file chan-
nel using sequential-file I/O. There is seldom any reason to do this, however. The
fourth field, consisting of ", R" or ", W", indicates the direction of data transfer.
",R" selects Read, which is for data input, and ",W" selects Write, for data out-
put.
Disks usually contain more than one file. Up to two individual links can be
created for each file-one for input and one for output-but no more than five links
total can be open (i.e., in the open table) to a single 1541 disk drive at once.
A special type of sequential-file channel can be created to send commands to
the disk drive and to read error information from it. This channel is called the com-
mand/error channel. Its uses are discussed in the channel communications section,
but its management is our concern now. The command/error channel is managed
using the standard sequential-file assembly code, except that the channel is unnamed
and the SETNAM preparation and call is omitted. The link is created with any Ifn
from 1 to 7F hex, the device number of the disk drive (usually 8), and a command
number of 15.
The command/error link for each disk drive must be created before any other
links to the disk are created, and terminated after all other disk links have been ter-
minated, or many I/O operations will not work properly.
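Because the command/error channel is unnamed, its creation is shorter than for other sequential-file channels; a sketch (the lfn is an arbitrary choice):

```asm
SETLFS = $FFBA
OPEN   = $FFC0

        LDA #$0F        ;lfn (any value from 1 to 7F hex)
        LDX #$08        ;device number of the disk drive
        LDY #$0F        ;command number 15
        JSR SETLFS      ;no SETNAM call: the channel is unnamed
        JSR OPEN        ;add the link to the open table
```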
Relative Files. Relative-file I/O is unique to the disk drive. Like sequential-
file I/O, its uses are explained in the channel communications section. However, to
explain how relative-file channels are managed, we must note here that a relative file
consists of from 1 to 720 equally sized records each from 1 to 254 bytes in length.
The main difference between managing a relative-file channel and a
sequential-file channel is in the file name format. One of two file name formats is
used. If the channel is to a file that already exists on the disk, a simple file name of
from 1 to 16 XASCII characters is passed to SETNAM. If the channel is being
created to a file that does not yet exist on the disk, the following format is used:

<dr:><file name>,L,<record length>

where 'record length' is the number of bytes in each record, and is stored in binary,
not XASCII, code. The other bytes are in XASCII code as usual. For example, the
name 0:PHONES,L, followed by a single binary byte of value 100 defines a relative
file whose records are each 100 bytes long.
Files created using a relative-file channel have type REL in the disk directory.
A relative-file link can be activated as either an input or as an output channel
using CHKIN or CHKOUT.
Interactive I/O. The remaining devices, the keyboard and the TV screen,
use an I/O method better suited than whole- or partial-file I/O to human interac-
tion. The kernel automatically creates an interactive channel to the keyboard
whenever the program-selected input channel is terminated. Similarly, the kernel
automatically creates an interactive channel to the TV screen whenever the program-
selected output channel is terminated. Alternatively, the program can create
interactive-I/O links and channels using the sequential-file management code of the
preceding section, with device numbers 0 and 3 for the keyboard and screen, respec-
tively, and a command number of FF hex. Interactive channels are never named, so
SETNAM is not used.
Summary. The channel management code for file-path (i.e., whole-file and
partial-file) I/O is summarized in the table on pg. 245.
Channel management for the interactive paths is not shown since interactive
channels are normally created by terminating all other channels, and otherwise are
managed like sequential-file channels.
Channel Communications
The kernel modules used for communicating through channels are introduced
below. They will be used in the following sections.
;define a link
LDA #lfn
LDX #device_number
LDY #command
JSR SETLFS
SAVE:
Purpose: Create disk or cassette channel, save memory to 'PRG'
file, terminate channel.
JSR address: FFD8 hex.
Prerequisites: SETLFS (SETNAM optional with cassette).
Data passed: Start address of memory block in page 0
with low address byte first,
pointer to the page 0 locations in A,
end address of memory block as low byte in X
and high byte in Y.
Data returned: Error #s 5,8,9 in A.
Altered registers: A, X, and Y.
LOAD:
Purpose: Create disk or cassette channel, load PRG file to
memory, terminate channel.
JSR address: FFD5 hex.
Prerequisites: SETLFS (SETNAM optional with cassette).
Data passed: 0 in A;
if secondary address defined in SETLFS = 0, then
low byte of destination address in X and
high byte of destination address in Y; else
if secondary address in SETLFS = 1, then
X and Y not used
(destination address = SAVE address).
Data returned: Error #s 0,4,5,8,9 in A (see SETMSG).
Altered registers: A, X, and Y.
CHRIN:
Purpose: Wait for and return a byte from the input channel.
JSR address: FFCF hex.
Prerequisites: Creation of an input channel.
Data passed: None.
Data returned: Input byte in A.
Altered registers: A and X.
Special comment: If input channel is keyboard, first call of CHRIN initiates
editor. Editor waits for further keys until carriage return is typed, then
CHRIN returns to program with first typed character in the ac-
cumulator. Subsequent calls return succeeding characters from the line,
ending with the carriage return. Next call starts process over. Up to 88
characters can be typed before a carriage return. See further description
in interactive I/O section.
GETIN:
Purpose: Get a byte from input channel or keyboard queue.
JSR address: FFE4 hex.
Prerequisites: Creation of an input channel.
Data passed: None.
Data returned: Input byte or no-data value in A.
Altered registers: A, X, and Y.
Special comment: If input channel is keyboard, GETIN removes a byte from
keyboard queue (10-byte FIFO structure at 277 hex). Keystrokes are
loaded into queue by module SCNKEY during system interrupt, until
buffer is full; then keystrokes ignored until a byte is removed. See full
description in interactive I/O section.
CHROUT:
Purpose: Send a byte through the output channel.
JSR address: FFD2 hex.
Prerequisites: Creation of an output channel.
Data passed: Output byte in A.
Data returned: None.
Altered register: A.
READST:
Purpose: Determine status of I/O transfer.
JSR address: FFB7 hex.
Prerequisites: None.
Data passed: None.
Data returned: Status byte in A. Each bit position from d0 to d7 represents an
I/O condition; bit value 1 means condition is present, 0 means not pres-
ent. Byte equals 0 after a normal I/O transfer. Not all bits carry data
useful for kernel-level programming. The bits and conditions that are
useful to the kernel-level assembly programmer are:
Bit Condition
Altered register: A.
PLOT:
Purpose: Manage the cursor (see below under "data passed").
JSR address: FFF0 hex.
Prerequisites: None.
Data passed: Carry flag set to 1 to move cursor, to 0 to read
its position. If carry set to 1, X coordinate in
register Y and Y coordinate in register X.
Data returned: If called with carry = 0, X coordinate in
register Y and Y coordinate in register X.
Altered registers: A, X, and Y.
Whole-file channels. Files of all types are used to hold pure data. Only
PRG or program files are routinely used to hold programs as well. A program
almost always needs to be loaded or saved as an entire unit. This, plus the simplicity
of whole-file I/O, makes the whole-file channels a natural for program handling.
Program files are also ideal for data structures that are small enough to be loaded
and saved whole in a reasonable length of time. Just what is reasonable depends on
your patience level and on the storage device. A 20K disk data file is tolerable to
most people, but 5K is closer to the limit with cassette.
The kernel creates a PRG file by transferring a continuous block of memory
from the computer onto disk or cassette. The structure of the file is therefore
the structure of the information in that memory block, although to move the file the
kernel treats it as a repetition data structure consisting of multiple binary bytes. The
kernel also adds preceding and trailing information that it uses in handling the file.
To move the data or program code in a block of memory into a PRG file on
cassette or disk, a program calls the kernel module SAVE while passing it the begin-
ning and ending addresses of the memory block. SAVE then stores the file together
with the beginning address of the memory area from which it was saved.
Once a link has been defined with channel management modules SETLFS and
possibly SETNAM, the following assembly code will cause the kernel to create a
cassette or disk PRG file that mirrors the contents of a section of memory starting at
'begin_address' and ending at 'end_address'.
;'SAVE' ASSEMBLY CODE
;save from 'begin_address' through 'end_address' to PRG file
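One plausible body for that call, following the register conventions from the SAVE summary; 'begin_address' and 'end_address' are placeholders, and the zero-page pair at FB/FC hex is an assumed free location:

```asm
SAVE   = $FFD8

        LDA #<BEGIN     ;store start address in page 0,
        STA $FB         ;low byte first...
        LDA #>BEGIN
        STA $FC         ;...then high byte
        LDA #$FB        ;pointer to the page-0 locations in A
        LDX #<END       ;end address, low byte in X
        LDY #>END       ;end address, high byte in Y
        JSR SAVE        ;write the PRG file and close the channel
```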
No follow-up assembly code is needed because SAVE does all necessary file
housekeeping and channel termination after it finishes writing the file.
When a PRG file is loaded it is placed into a continuous block of memory
whose starting address is specified in one of two ways. Defining the whole-file link
with a command byte of 1 passed to SETLFS tells LOAD, the file-loading module,
to use the address saved with the file. This causes the file to be loaded into the same
memory area from which it was saved. Programs in PRG files are almost always
loaded this way, since most programs can execute in only one place in the memory
map.
Alternatively, defining the whole-file link with a command byte of 0 tells
LOAD to obtain the starting address from the X and Y registers, with the low byte in
X. The program places these values in the registers before calling LOAD.
Relocatable programs or data structures that need to be loaded into different
memory areas under different circumstances can be loaded this way. The program
must also place a 0 in the accumulator before calling LOAD.
The code below assumes that a command byte of 0 was used to define the link.
Thus the calling program will provide the beginning address. A command byte of 1
allows the middle two instructions to be omitted, since the beginning address of the
memory area will be obtained from the file.
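A sketch of that load sequence, with a placeholder destination address:

```asm
LOAD   = $FFD5

        LDA #$00        ;0 in A selects a load (not a verify)
        LDX #<DEST      ;beginning address, low byte
        LDY #>DEST      ;beginning address, high byte
        JSR LOAD        ;load the PRG file, close the channel
        STX HIGHEST     ;X/Y return the highest location loaded;
        STY HIGHEST+1   ;store it in two reserved bytes for later
```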
The address of the highest location loaded is returned in the X and Y registers,
with the lower byte of the address in X. The address should be stored for later use if
the file length is not already known.
Data Cassette. When LOAD reaches a cassette-tape EOT marker without
finding the file name it is supposed to load, the module returns with a kernel error
message number of 5 in the accumulator.
Disk. The LOAD and SAVE code above suffices for simple file handling.
Some more-advanced techniques using whole-file I/O are discussed below.
where 'file name' must be the name of a file of type PRG. If the ,1 option is chosen,
the PRG file will load at the address at which it was originally SAVEd. Otherwise, it
will be loaded at 0800 hex, the beginning of BASIC program memory.
Once the file is in memory, BASIC retrieves the warm start vector, an address
in locations 0302 and 0303 hex, and branches to the code to which it points. The
warm start address is placed in these locations when the computer is powered up: It
points to a routine which returns control to BASIC for the next user input.
If the program you have loaded begins in memory locations below 0302 and is
continuous through 0302 and 0303 with its own beginning address in the latter two
locations, the BASIC LOAD instruction will load the program and then route execu-
tion to the program's beginning. This is where the loader will reside.
The usual area for the loader is from 02A1 to 0303. Part of this area is
dedicated to RS-232 communications, which are not used during a LOAD. The rest
of it is free all the time.
The actions taken by a typical loader are:
The loader must originally have been SAVEd at the correct address (i.e., below
$0302) and must be loaded to its SAVE address. This requires a secondary address
of 1 in SETLFS. Assembly code for the basic auto-start loader is as follows:
* = 02A1
;change the memory map to assembly language
LDA $01
AND #$FE
STA $01
;load the primary program
LDA #$01 ;lfn
LDX #$08 ;for disk drive, '#$01' for cassette
LDY #$01 ;i.e., load primary program at its SAVE address
JSR SETLFS
LDA #primary_program_name_length
LDX #name_address low byte
LDY #name_address high byte
JSR SETNAM
;LOAD the primary program with this section's LOAD code
;start the primary program executing
JMP PROGRAM
So you can write a loader for each program on a disk, and start any primary
program executing by typing in the following instruction from BASIC:
LOAD "loader_name",8,1
in any order, but the bytes within a record must still be accessed sequentially. This
makes the data movement part of the assembly code for relative-file I/O similar to
that used with sequential-file I/O. Additional code is needed to select a record
before data are transferred. We will examine relative-file I/O in the upcoming disk
section.
Sequential files are ideal for organizing data elements that can be processed in
a sequential order and that are too numerous to fit in memory together or are
numerous enough that the time delay in reading them all before starting execution is
unacceptable. Storing such data in a sequential file allows them to be transferred
only as necessary and allows the transfer time to be divided and distributed
throughout the program's execution.
Since the direction of transfer is specified when a sequential-file channel is
created, data can be transferred only in the first specified direction. Thus a program
cannot read to a certain point in a file and then start writing data from there on, or
any other such mixed read/write strategy.
Peripherals on the IEEE-488/serial and RS-232 buses send and receive data se-
quentially but generally do not store files. These devices use only sequential-file I/O,
because it alone matches their communications needs. Their sent data are sometimes
terminated with an end-of-transmission byte or bytes. A program can use this
marker to tell when to stop reading the data. If no terminating pattern is used, the
data can be thought of as forming a file of indefinite length, and a program must
have its own criteria for ending the read.
Sequential files on cassette or disk are terminated by an end-of-file marker. A
file-writing program can also divide a file into records and fields using reserved bytes
to aid the file-reading process later. As we discussed in Chapter 2, these are the stan-
dard parts of a sequential data structure. This is particularly easy to do if the data
are ASCII encoded, since so many byte values cannot occur as data (e.g., the value
00 hex). ASCII files created in BASIC use the carriage-return character (0D hex) to
separate records, which it limits to 80 bytes each, and the comma (2C hex)
and colon (3A hex) characters to separate fields, which it also limits to 80 characters
in length. Files can use any record or field separators that cannot occur as data; the
only advantage in using BASIC's conventions is that BASIC programs will be able
to read your sequential files properly. Record and field markers allow a program to
read data record by record and field by field, although the file must still be read in
sequential order.
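The record and field conventions above can be sketched in Python (an illustration, not book code; the function name and separators follow BASIC's conventions as described in the text):

```python
# A sketch of how a file-reading program might use BASIC's conventions:
# carriage return (0x0D) separates records; comma (0x2C) and colon (0x3A)
# separate fields within a record.
RECORD_SEP = b"\r"          # 0x0D
FIELD_SEPS = b",:"          # 0x2C and 0x3A

def split_records(data: bytes) -> list[list[bytes]]:
    """Split a sequential-file image into records, then fields."""
    records = []
    for record in data.split(RECORD_SEP):
        if not record:
            continue                      # skip the empty trailing record
        fields = [record]
        for sep in FIELD_SEPS:
            fields = [part for f in fields for part in f.split(bytes([sep]))]
        records.append(fields)
    return records
```

Any separator bytes would do, as the text notes; these particular values are only needed if BASIC programs must also read the file.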
A sequential file can be read or written straight through from beginning to
end, but it can also be read in part and the read operation terminated. This allows
the program to go back to the beginning of the file without finishing the entire file.
For instance, suppose that you choose to keep the directory data structure of
Chapter 4's phone directory task on disk but not in memory (although this is prob-
ably not a good decision). Suppose also that you order the listings by their frequency
of usage, so that the most-accessed listing is first in the file, the second-most ac-
cessed is second, and so on. To find a given listing, the phone directory program
starts at the beginning of the directory and reads listing records until it finds the
desired listing, which is generally before it reaches the end of the file. Rather than
delay program execution by reading the rest of the file, the read operation can be ter-
minated and the file reset by closing the file using channel management code. The
file will then be ready for another search starting from its beginning byte.
Noncharacter (e.g., binary) data can also be stored in a sequential file. Such a
file can have no records or fields, since any conceivable separator could also occur as
part of the data.
A program reads partial-file data until a stopping condition is detected. That
condition may be that the end of the file, a record, or a field is reached, or that a cer-
tain number of bytes have been read. The pseudocode for a partial-file read con-
struct is as follows:
LOOP
read a byte
EXITIF stopping condition occurs
store or write byte
ENDLOOP
The 'read a byte' step is executed by calling either of two kernel modules:
CHRIN or GETIN. Both modules return a value in the accumulator. The main dif-
ference between the two modules in partial-file I/O is that CHRIN waits until a data
byte is available and then returns it, while GETIN returns with either a data byte or a
"no-data" value of zero immediately (there are other differences between the two in
interactive I/O).
With a disk or cassette file, CHRIN's waiting is no problem. Data are
available relatively quickly. With RS-232 or nondisk serial/IEEE devices, waiting
can make a tremendous difference. CHRIN will hold up program execution, forever
if necessary, until a data byte becomes available on the input channel. A good rule
of thumb is to use CHRIN with disk or cassette drives, and GETIN with all other
RS-232 and serial/IEEE devices.
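The read loop and GETIN's immediate "no-data" return can be modeled in Python (a sketch only; the `Channel` class and its method are hypothetical stand-ins for the kernel calls, not real C64 code):

```python
# Hypothetical stand-in for the kernel routine GETIN, which returns the
# next byte immediately, or 0 when no byte is waiting.
from collections import deque

class Channel:
    def __init__(self, data: bytes):
        self.buf = deque(data)

    def getin(self) -> int:
        """Like GETIN: next byte, or the 'no-data' value 0."""
        return self.buf.popleft() if self.buf else 0

def read_until(chan: Channel, stop: int) -> bytes:
    """The LOOP / EXITIF / ENDLOOP construct from the text."""
    out = bytearray()
    while True:
        b = chan.getin()            # read a byte
        if b == stop or b == 0:     # EXITIF stopping condition occurs
            return bytes(out)
        out.append(b)               # store byte
```

A CHRIN-style call would differ only in blocking inside `getin` until a byte arrived, which is why it suits disk and cassette but not RS-232.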
The 'EXITIF stopping condition occurs' step is executed in different ways for
different types of conditions. On reaching the end of a disk or cassette file, both
CHRIN and GETIN continue to return a value as if there were still more data to
read. However, a kernel module called READST can be called to detect the EOF
condition. READST returns a 0 in the accumulator and zero flag after a normal,
midfile data read. It returns a 1 in bit d6 of the accumulator if the end of the file has
been reached with the previous data read. READST also checks for the EOT marker
with cassette I/O, and sets bit d7 if it detects the EOT. A program can perform this
EXITIF step with an EOF condition by calling READST and then executing a con-
ditional branch. It can check for EOF or EOT by ANDing the returned byte with 40
or 80 hex, respectively, and then executing a conditional branch on zero.
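The READST bit tests can be sketched in Python (an illustration, not book code; the function names are invented, but the bit positions are those given in the text):

```python
# Interpreting a READST-style status byte: EOF is bit d6 (40 hex) and,
# for cassette I/O, EOT is bit d7 (80 hex).
EOF_BIT = 0x40
EOT_BIT = 0x80

def is_eof(status: int) -> bool:
    """True if the previous read reached the end of the file."""
    return (status & EOF_BIT) != 0

def is_eot(status: int) -> bool:
    """True if the cassette end-of-transmission marker was seen."""
    return (status & EOT_BIT) != 0
```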
The 'store or write byte' step requires either storing the value from the ac-
cumulator into an array in memory or writing it out another channel to a device such
as the screen. As we will see, the latter requires executing the CHROUT module.
An end-of-record or end-of-field stopping condition can be tested for with a
comparison of each byte read against the record or field separator value.
254 Connecting the Nerves: Using the Memory Map Chap. 5
If memory is the data source, the 'obtain byte' step is executed with a LDA in-
struction and any supporting instructions to set up data addressing (see the 'store
byte' instructions of the read construct). If an input channel is the data source, the
'obtain byte' step is executed by calling CHRIN or GETIN. If GETIN is used, a
compound EXITIF condition such as 'EXITIF EOF OR no data available' can be
used to minimize wasted time in the write loop. Other EXITIF conditions are han-
dled as in the read loop.
The 'write byte' step is executed by calling the kernel module CHROUT with
the data byte in the accumulator. If the output channel is the screen, data may move
past too quickly to be read. Inserting a delay loop within the write construct, or
simply holding down the CTRL key on the keyboard, will slow the display of
characters to a readable rate.
The following assembly code writes sequential XASCII data obtained from
memory. This code assumes that a 0 value will follow the file data in memory. 0 is
handy because it sets the zero flag when it is loaded, saving a comparison operation
in the EXITIF test.
As before, if this construct is used to write files from different memory loca-
tions, the first four instructions must be replaced with instructions that place dif-
ferent addresses in 'datloc'. Such instructions would probably be placed in the
modules calling a sequential write module that contains the construct above.
In either the read or write templates, to transfer fewer than 256 bytes you
would probably want to change the memory addressing mode from indirect indexed
to simple indexed.
Data Cassette. Cassette tape is a sequential medium, and with current
cassette players data can only be recorded and played back sequentially. This limits
cassette I/O methods to whole-file and partial-file sequential I/O.
Message                                       Action

N<dr:><disk name><,ID>                        Erase all files, organize disk, name
  (e.g., 'N0:disk1,A1')                         disk, and give it a two-character ID

N<dr:><disk name>                             Erase all files and rename a
  (e.g., 'N0:disk1')                            previously organized disk

C<dr:><new filename>=<dr:><old filename>      Duplicate old file under new filename
  (e.g., 'C0:test=0:test1')

C<dr:><new filename>=<dr:><old filename1>,    Create new file from copies of old
  <dr:><old filename2>, ...                     files placed end-to-end
  (e.g., 'C0:test=0:ta,0:tb,0:tc')

R<dr:><new filename>=<old filename>           Change name of old file to new filename
  (e.g., 'R0:test=test1')

S<dr:><filename>                              Erase file named filename
  (e.g., 'S0:test1')

I<dr:>                                        Return drive(s) to normal state
  (e.g., 'I0:')                                 (for drive stuck in error condition)

V<dr:>                                        Return wasted space to usable state
  (e.g., 'V0:')                                 (not to be used with random files)

P<,lfn><,record # low byte>                   Select the given record and byte
  <,record # high byte><,byte w/i record>       (i.e., field) position within the
                                                given file (for relative-file I/O)
ror messages, a program can detect the end of a relative file (discussed in the next
section) or know to inform the user of simple problems such as an aborted write due
to a write-protect tab on the disk. However, most of the messages indicate problems
with command messages, file accesses, or channel management situations that
usually do not occur once the program works properly.
The latter type of message is used as an aid in debugging. If a problem occurs
with a section of partial-file I/O code, instructions can be inserted after the disk ac-
cess instructions to read the error channel. Once the problem has been identified and
solved, the error-channel instructions can be removed. The more complicated errors
mentioned above can usually be eliminated in this way.
Command and error messages are moved to and fro through the command/
error channel much as normal data are moved through a sequential-file data chan-
nel, using CHRIN and CHROUT. One of the differences between I/O through the
command/error and through the data channels is the lack of a READST EOF in-
dication after reading the last byte of an error message. The carriage return
 0     No error
 1     A file or files have been erased (no error)
 2-19  No error
21     No disk in drive, or disk is not formatted for use
25     Data written are different from data sent
26     Attempted write with disk write-protect tab covered
29     Disk not formatted properly
30     Illegal command received: syntax is incorrect
31     Illegal command received: not recognized
32     Illegal command received: longer than 58 characters
33     SAVE or LOAD requested with wildcards in file name
34     Illegal command received: no file name or no ':'
39     Illegal command received: not recognized
50     Read attempted beyond end of file
51     Too many characters sent to a relative file record
52     Attempted write to relative file would overflow disk
60     Attempted open for reading of write file
61     Attempted access of an unopened file
62     Attempted access of file not on disk
63     Attempted creation of file already on disk
64     File type does not match file type on disk
70     Requested channel or all five channels already in use
72     Disk full
73     Power-up condition (no error)
74     Used other than 0 in <dr:> portion of file name
character (0D hex) signals the end of an error message, and its detection must be
used as the stopping condition for the read loop.
The following sequential-file read construct has been adapted to read the
command/error channel. It includes the 'EXITIF carriage-return occurs' test and
uses simple indexed addressing, since fewer than 256 bytes will be read.
This construct assumes that a command/error channel has already been created us-
ing channel management code.
Writing commands to the command/error channel is the same as writing data
to a sequential file, except that an extra step must be taken after a command has
been sent. After each command, there must be a call to CLRCHN to signal the drive
to execute the command and a call to CHKOUT to recreate the channel. The general
sequential-write construct works fine if ENDLOOP is followed by calls to these two
modules.
A shortcut method for writing just one command to the drive is to use the
command message as the file name when creating the command/error channel. Call-
ing OPEN sends the message to the disk, and calling CLRCHN causes it to be acted
on.
Relative Files. Relative files are the most efficient way to store data elements
whose position in a file can be calculated or at least estimated, and which are too
numerous to fit in memory together. For instance, a large repetition data structure
whose elements are ordered alphabetically by one of their fields should normally be
stored in a relative file. A large mailing list structure such as that mentioned in
Chapter 2 fits this description, as would a large phone directory ordered
alphabetically. However, relative files should usually be used only with data that
cannot be organized by their amount or order of usage.
As we said, the alphabetical mailing list data structure of Chapter 2 is a good
example of this type of data structure. The top-level structure, or file, holds the en-
tire list; each record holds a mailing address; each field in a record holds an indepen-
dent element of the address; and each byte in a field holds an alphanumeric
character. A mailing list is often used in two ways: to look up a single address and to
look up all addresses. The simplest and safest assumption is that its individual
records are equally likely to be accessed.
The time required to find a single record in a data structure like the mailing list
can be shortened by up to several hundred times if the structure is stored in a relative
file instead of a sequential file.
In most cases any record in a relative file of n equally accessed records can be
located with no more than a handful of record accesses. The number and time of
these accesses can be minimized in two ways. First, if the initial field of each record
holds the record identifier, only a few bytes of each record need be loaded to select
or eliminate a particular record from the search. Second, the processing overhead in
using relative files can be minimized by initializing them to their full length at the
time of their creation (initialization will be discussed shortly), using only newly for-
matted disks (see the command/error channel N command in the preceding section).
The same data structure in a sequential file requires on the average that n/2 or
half the records be examined to find a single desired record. Further, all bytes from
the beginning of the file to the desired record must be loaded during the search.
A relative file can be divided into up to 720 equal-sized records of 1 to 254
bytes each. This gives a theoretical maximum relative file size of about 180K bytes.
However, the total capacity of a disk is 170K bytes, so a relative file must be limited
in either the total number or size of its records to fit on an empty disk. As long as a
program that manipulates relative files enforces these limitations, and other files are
kept off the storage disk, a relative file will never overflow available storage. If
either of these safeguards is absent, the program should read the disk error channel
to check for the disk overflow condition ("52") when expanding the file.
Two channels are used to read from or write to a relative or REL file. They are
the command/error channel, which is used to select the record and field to be ac-
cessed, and a relative-file data channel. As stated in the channel management sec-
tion, the command/error channel should be created first and terminated last.
To move data into or out of a relative file, a program must identify the record
and field it will access. It does this by activating the command/error link as both the
input and output channels, and sending the P (or Position) command to the disk.
This command is a series of bytes, one ASCII followed by four binary, in the form:
"P, 60h OR command byte, low byte of record #, high byte of record #, byte posi-
tion of field within record," where the command byte is the command value sent
earlier to SETLFS to create the relative-file channel.
So the command to select the field beginning at byte 40 hex of record 118 hex,
in the file with the link definition "8,8,8," would consist of the binary bytes
"50 68 18 01 40" (the ASCII code for P is 50 hex). A similar command to select
the beginning of the record is "50 68 18 01 01."
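The construction of the five P-command bytes can be sketched in Python (an illustration only; the function name is invented, but the byte layout is the one just described):

```python
# Build the five-byte P (Position) command: 'P', 60h OR command byte,
# record number low byte, record number high byte, byte position of the
# field within the record.
def position_command(cmd_byte: int, record: int, byte_pos: int) -> bytes:
    return bytes([
        0x50,                    # ASCII code for 'P'
        0x60 | cmd_byte,         # command byte from SETLFS, ORed with 60h
        record & 0xFF,           # low byte of record number
        (record >> 8) & 0xFF,    # high byte of record number
        byte_pos,                # byte (field) position within the record
    ])
```

For the example in the text, `position_command(8, 0x118, 0x40)` reproduces the byte sequence 50 68 18 01 40.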
The program then reads the error message on the input command/error chan-
nel to detect a disk overflow or the end of the file. If it receives the first message,
disk error number 52 for disk overflow, it means that a write to the selected record
would overflow available disk storage. The write can be canceled and an error
message sent to the user.
If the program detects the second error message, error number 50 for "record
not in file," it means that the record selected is beyond the end of the file. A
peculiarity of the kernel makes this the only way to test for the end of a relative file;
the usual end-of-file value, the READST return value of 40 hex, is returned every
time the last byte of a record is read.
Testing the error channel can be eliminated if the total length of the file is
predefined and the program limits record accesses to existing records. This is the less
flexible but faster and easier alternative.
The following code sends the P command to the disk drive, receives the first
two bytes of the error message, and tests for disk overflow and end of file. Note that
its basics are the same as sequential-file read and write code. This code assumes that
the five bytes of the P command have been placed at a location called 'cmdloc'.
The output command/error channel and the data channel, in that order, must
be created before this code can be executed. The code appears as follows:
        LDY #$00         ;start with the first command byte
CMDLOOP:
        LDA cmdloc,Y     ;'cmdloc' holds first command byte
        JSR CHROUT       ;send byte out command/error channel
CMDEXITIF:               ;exit if all command bytes written
        INY
        CPY #$05         ;done if 5 bytes sent
        BEQ ENDCMDLOOP
        JMP CMDLOOP
ENDCMDLOOP:              ;disk now has record/field position
        JSR CLRCHN       ;tell disk to execute the position command
        LDX #$0F         ;activate command/error link as
        JSR CHKIN        ;  the input channel
;read the disk error message
IFDONE:
        JSR CHRIN
        CMP #'5'         ;ignore message unless it is 50 or 52
        BNE DONE
IFOVERFL:
        JSR CHRIN
        CMP #'2'         ;if message = 52, handle overflow
        BNE IFEOF
        JSR OVERFL       ;'OVERFL' is special module elsewhere in program
        JMP DONE
IFEOF:
        CMP #'0'         ;if message = 50, handle end-of-file
        BNE DONE
        JSR EOF          ;'EOF' is special module elsewhere in program
DONE:                    ;either no overflow or end-of-file, or
                         ;done with overflow or EOF handling
One compare instruction can be removed from the command-write by placing the
command bytes in reverse order in memory and letting the index count down to 0.
Assuming that no error has occurred, sequential-file assembly code can now
read from or write to the currently selected record and field. Only the currently
selected record can be read from or written to. However, all data from the selected
field position to the end of the record can be accessed.
Whether reading or writing, the code should keep track of the byte position be-
ing accessed in the record if there is any possibility that access will be attempted
beyond the record's end (an error condition). This is simple, since both the length of
the record and the initial access position are known to the program. Thus the
EXITIF test in the sequential read or write code will be based on the value in a
counter, probably Y, signifying the last byte in the record. Once the last byte of a
record is reached, the code must send another position command to access any more
data.
Earlier we noted that relative-file I/O is most efficient when the file is initial-
ized to its largest size. Initialization is accomplished by creating the file, selecting
what will be the last record in the file (this selection generates an error code 50 which
should be ignored), and writing one or more bytes of filler data into that record. All
intermediate records are logged in a file data structure used by the disk drive to ac-
cess the relative file, making expansion of the file within its intermediate records
much faster later. If you read one of the intermediate records after initialization,
you will find that it contains a single "data byte," FF hex, which prints as the "pi"
symbol on the screen.
The most common ways to order a relative file are by alphabetizing and by
"hashing" the contents of its key field. A simple alphabetical scheme might be to
assign one record to each two-letter combination in the alphabet, where the two-
letter combination represents the first two letters of the key field data. This is an in-
efficient use of storage space since few data items will start with AA and even fewer
start with ZZ. These records will probably be underused. Other records may be over-
filled. However, it is a simple scheme, and 26 x 26 = 676 record numbers can easily
be generated from alphabetical data. Variations on this or other alphabetical
schemes can be tailored to a task to overcome many of these shortcomings.
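The two-letter scheme can be sketched in Python (an illustration, not book code; the function name is invented):

```python
# Map the first two letters of a record's key field to one of the
# 26 x 26 = 676 record numbers described in the text.
def record_number(key: str) -> int:
    """Return a record number in the range 0-675 for a two-letter prefix."""
    a = ord(key[0].upper()) - ord('A')
    b = ord(key[1].upper()) - ord('A')
    return 26 * a + b
```

As the text notes, keys beginning with AA map to one extreme and ZZ to the other, so the resulting records are unevenly filled.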
A different method of relating specific data to record locations is called
hashing. Although hashing algorithms are beyond the scope of this book, the con-
cept is simple. A simple formula is developed to convert the values of all the bytes in
a record's key field into a single record number within the file.
Any formula whose results are well distributed within the legal number of
records is suitable; one common formula exclusive-ORs the key field's bytes into
each other, shifting the result one bit left between each XOR. Adjustments must be
made for factors like the length of the key field to produce a legal record number.
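The XOR-and-shift formula can be sketched in Python (a sketch under the description above; the 16-bit mask and the modulo adjustment are illustrative choices, not the book's exact formula):

```python
# One common hashing formula: exclusive-OR each key byte into a running
# value, shifting the result one bit left between XORs, then reduce the
# result to a legal record number.
def hash_record(key: bytes, num_records: int) -> int:
    h = 0
    for b in key:
        h = ((h << 1) & 0xFFFF) ^ b   # shift left, then XOR in next byte
    return h % num_records            # adjust to a legal record number
```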
If a new record is being added to the file and the record number generated
points to a record already in use, the hash code looks at succeeding records until an
empty one is found. Of course, the code must ensure that the disk is not overfilled.
Similarly, if a particular record is being searched for and the hashed record
number points to a record containing the wrong data, the hash code examines suc-
ceeding records until the correct record is found. In this case, the code must watch
for the end of the file.
In summary, relative files are an efficient way to store large repetition data
structures whose records are equally likely to be accessed and which can be searched
for and selected by information in one of their fields. Relative files are accessed by
creating a command/error channel, creating a data channel with the appropriate file
name and format, selecting a desired record, and transferring data with that record.
The last two steps are repeated as often as necessary to complete the processing task.
Keyboard and Screen. Sequential-file I/O can be used from the keyboard by
calling GETIN instead of CHRIN, and to the screen by calling CHROUT. This is
discussed in the next section. The EXITIF condition for keyboard input will prob-
ably be the input of some character such as the carriage return. Channels to both the
keyboard and screen must be created using channel-management code.
This completes our discussion of file I/O. As we shall see, interactive I/O is a
variation on sequential-file I/O to the keyboard and screen, with added user conve-
niences.
Byte output to the screen uses the same code as sequential-file output. Bytes
are interpreted according to the Commodore ASCII character codes; some are
displayed as characters and others serve as commands. In the latter category are the
codes for changing between lowercase and uppercase letters, between reverse and
normal letters, for the carriage return, and so on, in the Commodore XASCII chart
in Appendix C.
The cursor is an important part of interactive I/O, both in CHRIN input and
in the output display. CHROUT writes a character to the current cursor position
and then advances cursor position one place. The cursor position can be more freely
controlled using the kernel module PLOT.
PLOT controls the cursor directly, using no input or output channels. We are
discussing it here rather than under nonchannel I/O because it supports the interac-
tive channels.
PLOT has two functions, depending on the state of the carry flag when it is
called: with the carry set, it returns the X, Y screen position of the cursor in registers
Y and X, respectively; with the carry clear, it places the cursor at an X, Y position
passed in registers Y and X.
One of the ways it can support interactive I/O is in keyboard input using a
menu. Suppose that there are multiple menus organized as a tree (Fig. 5.9). The in-
put options in a menu can select one of the menu's branches, which is another menu,
and so on down to the individual processing actions.
[Figure 5.9: A tree of menus. The root menu (menu1) branches to submenus (menu2, menu3), which branch in turn to further menus (menu4, menu5) and finally to individual processing actions.]
In each menu there is probably a favorite-choice option. PLOT can move the
cursor to that choice at each menu level, reducing the user's work to the bare
minimum. The program could even have a customization mode, where the user
selects the favorite choices in each menu!
To use PLOT, you must know the X-Y numbering scheme used with the
screen. It is as follows. There are 40 X, or horizontal, character positions. They are
numbered from 0 for the leftmost character column to 39 for the rightmost column.
There are 25 Y, or vertical, character positions. They are numbered from 0 for the
topmost character row to 24 for the bottom character row. This organization can be
illustrated as shown in Fig. 5.10.
[Figure 5.10: Screen coordinate numbering — X coordinates run 0 to 39 across the screen, Y coordinates 0 to 24 down it.]
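The same coordinates also locate a character in screen memory. As an illustration (a sketch, not book code; it assumes the C64's default screen base of 0400 hex, which is not stated in this section):

```python
# Map an X, Y cursor position to a screen-memory address, assuming the
# default screen base of $0400. Each row is 40 character cells wide.
SCREEN_BASE = 0x0400
COLUMNS, ROWS = 40, 25

def screen_address(x: int, y: int) -> int:
    """Address of the character cell at column x, row y."""
    if not (0 <= x < COLUMNS and 0 <= y < ROWS):
        raise ValueError("coordinates off screen")
    return SCREEN_BASE + y * COLUMNS + x
```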
This completes our discussion of channel I/O. The varieties of channel I/O are sum-
marized in the tree in Fig. 5.11. The kernel modules that perform channel I/O are il-
lustrated in tree form in Fig. 5.12, with their normal calling order proceeding from
top to bottom. Default channel selection is not shown.
[Figure 5.11: The varieties of channel I/O — whole-file I/O ('PRG' files) and partial-file I/O ('SEQ' and 'REL' files).]
Most I/O can be performed through channels. Nevertheless, there are a few
exceptions, the most obvious being the advanced graphics and audio available
through the I/O chips, and the devices that attach to the game ports. Graphics and
audio are discussed in upcoming chapters, but the game ports and a few remaining
nonchannel I/O cases are discussed next.
[Figure 5.12: The kernel modules that perform channel I/O in their normal calling order, from SETMSG and SETLFS at the top downward.]

Using the I/O Block: Nonchannel I/O 267

Four types of I/O work independently of channels: graphics, audio, game port, and
clock I/O. The first two categories are so involved and specialized that each will be
given its own chapter. The latter two categories are more closely related to general
computer I/O, so they will be discussed here.
Game port devices and clocks are not supported by the kernel. Instead, I/O
chips support their functions directly. To control these devices a program must
directly manipulate the chip registers in the I/O block. Thus the detailed memory
map of the I/O block shown earlier in this chapter will be one of your most impor-
tant references for programming these devices.
There are two ports or connectors to the internal C64 buses, on the right side of the
Commodore 64. They provide the connection between the C64 and light pens, game
paddles, and joysticks. The connectors are labeled "control port 1" and "control
port 2" on the computer, but are commonly called game ports because of the roles
most of their attached devices play. Each of these devices has its own programming
requirements, so they will be discussed separately.
Light pens. The light pen provides a user with a way to input data by point-
ing to information on the TV screen. Often the input is a selection of one program
action among many, as in a menu.
Only one light pen can be used at a time. The light pen plugs into game port 1
of the C64, which is the left-hand one of the two game ports. Internal C64 wiring con-
nects the light pen to the dedicated video chip, called the VIC II. As mentioned
earlier, this acronym represents "the second version of the Video Interface Con-
troller" (the first version is in the VIC-20 computer). For brevity, we will call the
chip VIC.
To understand the light pen, you must understand how the TV screen "writes"
a picture and how VIC can control that writing. A TV screen displays a picture by
sweeping a raster, or electron beam, in a regular back-and-forth motion across a
coating of phosphors on the inner surface of the screen. Irradiated phosphors glow
for a short time.
Phosphors are grouped into small units called pixels. More loosely, we speak
of pixels as being the smallest picture units that the computer can generate. Com-
puter pixels are generally larger than screen pixels.
The intensity and positioning of the raster as it crosses the screen generates the
detail in the picture being written. VIC does not control the raster's movement. In-
stead, it sends signal data to the TV screen in step with the raster, controlling the pic-
ture that the raster writes. VIC's awareness of raster position allows it to detect the
position of a light pen pointed at the screen.
When the light pen is pointed at the screen and activated, VIC watches for a
signal from the pen indicating that the raster has passed the pen's tip. Since VIC
knows where the raster is at all times, it can record the pen's position when it
receives the pen's signal. The position is recorded in two bytes, one at D013 hex, for
the horizontal position to within two pixels accuracy, and one at D014 hex, for the
vertical position to the exact pixel.
The lost accuracy on the horizontal position is caused by the organization of
the screen. Each VIC character is composed on an 8 x 8 pixel grid. With a horizon-
tal by vertical screen size of 40 x 25 characters, the screen size translates to 320 x
200 pixels. The vertical number, 200 pixels, fits in a single byte, but the horizontal
figure does not. The simple way used to make both numbers fit into single-byte VIC
registers is to halve the horizontal resolution of the light pen.
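Recovering approximate pixel coordinates from the two registers can be sketched in Python (an illustration only; the function name is invented, and the doubling follows the halved horizontal resolution just described):

```python
# Convert the light pen register values to screen-pixel coordinates.
# D013 holds the horizontal position halved (so 320 pixels fit in one
# byte); D014 holds the exact vertical position.
def pen_position(d013: int, d014: int) -> tuple[int, int]:
    x = d013 * 2        # horizontal, accurate to within two pixels
    y = d014            # vertical, exact to the pixel
    return x, y
```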
On receiving the raster signal from the light pen, VIC will both generate an
IRQ interrupt to the microprocessor and set bit 3 of location D019 if a precondition
has been met: The program must have enabled the light pen interrupt by setting bit 3
of location D01A to a 1. VIC sets bit 3 of D019 on the raster signal because the
microprocessor sees all IRQ interrupts the same way. The program can check bit 3
of D019 and know whether the source of the interrupt was the light pen. Interrupt
routines are discussed in more detail later in the chapter.
To use a light pen from a program, include code that enables the light pen in-
terrupt as explained above, and an interrupt handler that does two things. First, it
must check D019 to verify that the light pen is the interrupt source. Second, it must
read D013 and D014 to obtain the horizontal and vertical position of the pen. Inter-
rupt handlers are discussed in greater detail later in this chapter under clock I/O.
After obtaining the position values, the program can compare them against com-
puted or stored range values and take appropriate action.
[Figure 5.13: Paddle input values range from 00 to FF hex.]
Game Paddles. Unlike in light pen I/O, game paddles are disconnected from most of the cir-
cuits that service them until they are intentionally connected by a program. The fir-
ing buttons have the only permanent connection, by internal C64 wiring to an I/O
support chip called CIA #1, the first of two complex interface adapters. It occupies
locations DC00 through DC0F hex in the memory map. We will be using CIA #2
later.
Both paddles on one game port can be connected to the dedicated audio chip,
called SID, by CIA #1 under program control. Recall that SID is short for Sound In-
terface Device. SID occupies locations D400 through D41C hex.
To use the game paddles, a program must do three things. First, it must
prepare the CIA chip to make the connection between the paddles and SID. Second,
it must select one of the game ports and connect it to SID. Third, it must read the
data that the CIA and SID obtain from the port selected.
Preparing for Paddle I/O. When properly configured, CIA #1 can elec-
trically attach one of the game ports to SID. Once the game port is attached, SID
continuously places the input values from the port's L and R paddles into L and R
game paddle registers (these registers are analog-to-digital converters).
CIA #1 is configured through its internal registers. There are two data ports,
which are input/output registers like the configuration register on the 6510, and two
data direction registers, which function like the register of the same name on the
6510 to control the data transfer direction of each bit in the data ports. We will call
the data ports PORTA and PORTB and the corresponding data direction registers
DDRA and DDRB, respectively. A 1 in a DDR bit forces the corresponding PORT
bit to an output; a 0 forces it to an input.
PORTA, at location DC00, is used to select one of the two game ports for pad-
dle I/O. PORTA's two most significant bits are assigned to this task: bit 6 to port 1
and bit 7 to port 2.
To select a game port, first disable interrupts so that the system cycle interrupt
will not interfere with the paddle I/O. Second, read the value in DDRA, location
DC02, and save it in memory or on the stack for later. Many kernel modules depend
on the setting in this register, so you will have to restore it to its previous setting after
completing paddle I/O. Third, set the top two bits of DDRA to 1's. This makes
PORTA's d6 and d7 bits into outputs, which allows them to electrically attach either
of the game ports to SID. At the same time the lower six DDRA bits should be 0's so
that the lower bits of PORTA will be inputs. Among other things, these bits receive
270 Connecting the Nerves: Using the Memory Map Chap. 5
game port 2's firing button inputs. So a program fixes all these bits by writing C0
hex into DDRA. The program can now select a game port.
Selecting a Paddle Port. Game port 1 or game port 2 is selected by writing a
1 into PORTA's bit 6 or bit 7, respectively. Thus the program writes 40 or 80 hex
into PORTA to read from game port 1 or 2. The program must then wait a short
while before reading data. A delay loop with a counter of 80 hex is considered safe
by Commodore literature.
Reading Paddle Data. Two things must be read from each paddle: the
numeric input value and the firing button state. The left and right paddle data can be
read from locations D419 and D41A hex, respectively, on the SID chip.
The left and right firing button states are read from one of two locations,
depending on which game port is currently selected. If it is game port 1, the location
is PORTB (i.e., DC01 hex). For game port 2, the location is PORTA, DC00 hex. In
either case, the lower seven bits of the PORT are usually all 1's. While the left pad-
dle's firing button is pressed, bit 2 goes to 0. While the right paddle's button is
pressed, bit 3 goes to 0. A program can read the PORT and test for either transi-
tion.
PORTB is also used by the kernel for keyboard input. The kernel maintains
the PORTB bits as inputs, so the paddle routine need not configure them through
DDRB. However, this leads to a problem. If a program reads both the keyboard and
the game paddles, a port 1 firing button press during the keyboard read will insert
unwanted characters into the keyboard input. Your programs must avoid mixing
keyboard input with port 1 paddle input.
Paddle reading normally alternates between selecting a game port and reading
its paddles. When paddle reading is completed, the value of PORTA must be
restored, and interrupts reenabled.
Code for reading both paddles on port 1 is given on the next page as a module
that a program can call.
Reading both game ports requires either nearly doubling the code or
separating out a part of it as another subroutine. Try rewriting this code, with the
sections from the delay through the paddle reading as a subroutine called by the
main subroutine. Convert the LDA PORTB instruction to LDA PORTA, and use an
index in register X or Y with it and the STA LDATA, STA RDATA, and LDA
PORTB instructions so that the code can apply to both ports.
porta = $DC00
portb = $DC01
ddra = $DC02
;disable interrupts
SEI
;save 'ddra' for later
LDA ddra
PHA
;prepare bit directions for paddle port selection
LDA #$C0
STA ddra        ;makes top 2 bits of PORTA outputs
;select paddle port 1
LDA #$40        ;set port 1 bit
STA porta
LDY #$80
delay:
DEY
BNE delay
;read L and R paddle data
LDA $D419       ;read the L paddle
STA ldata       ;somewhere in memory
LDA $D41A       ;read the R paddle
STA rdata
;read paddle port 1 firing buttons
LDA portb
AND #%00001100  ;zero out all but the firing button bits
EOR #%00001100  ;invert the firing button bits
BEQ done        ;if both bits were originally 1's, neither
                ;was pressed
AND #%00000100  ;zero out the R button bit, leaving L's
BEQ right       ;if L has not been pressed, R must have been
JSR lpress      ;L has been pressed: service it
JMP done
right:
JSR rpress
;return I/O to normal state
done:
PLA             ;recover the original contents of DDRA
STA ddra
CLI
RTS             ;program continues from here
The value of the automobile analogy for both paddles and joysticks is that it
shows the different strengths of these two types of inputs. One would not want an
accelerator with only four or five positions, nor would one want to shift across 200
gear ratios to attain overdrive.
Two joysticks can be used at a time, one for each game port. Internal C64 wir-
ing automatically connects both to the CIA. The simplicity of this shows in the
joystick I/O assembly code.
The value for each of the four major axis joystick positions is assigned to its
own bit in a 4-bit number. The basic "no-position" number consists of all 1's. Plac-
ing the joystick in one of the axis positions resets the corresponding bit to 0. Corner
values are formed by resetting the bits for both of the two adjacent axis positions.
All these values are shown in Fig. 5.14 with the joystick positions that produce them.
The drawing's perspective is the view from above the joystick.
The first four bits occupy d0 through d3 of the CIA data ports. The fifth bit,
d4, is dedicated to the firing button. It also resets to 0 when the firing button is
pressed. PORTA, at DC00, holds the joystick data for game port 2. PORTB, at
DC01, holds the data for game port 1. As pointed out in the last section, PORTB
also retrieves keyboard data. Again, your programs must avoid mixing keyboard in-
put with port 1 joystick input.
Assembly code for handling both joysticks first reads the joysticks' status
from DC00 and DC01, then uses logical operations to isolate their lower five bits,
and finally executes conditional branches and calls to select appropriate processing
actions.
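One possible sketch of that sequence follows; the memory labels joy1 and joy2 and the service routine firepress are our own names, not kernel names:

```asm
;sketch: read both joysticks and isolate their five input bits
SEI             ;keep the system interrupt out of the way
LDA $DC00       ;PORTA holds the game port 2 joystick
AND #%00011111  ;keep direction bits d0-d3 and fire bit d4
STA joy2        ;joy1 and joy2 are free bytes of memory
LDA $DC01       ;PORTB holds the game port 1 joystick
AND #%00011111
STA joy1
CLI
;a 0 bit means active: for example, test port 1's firing button
LDA joy1
AND #%00010000  ;isolate d4, the firing button bit
BNE nofire      ;still 1, so the button is not pressed
JSR firepress   ;button pressed: service it
nofire:
```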
The second and last category of nonchannel I/O is clock I/O.
Clock I/O
Three kinds of clocks are accessible to your programs. The first clock is maintained
and accessed by the kernel. The latter two are a clock and a timer on the CIA chips.
Kernel clock. To understand the kernel's clock, we must review the system
cycle of the Commodore 64. Every 1/60 or 1/50 second, depending on the C64
model, an IRQ interrupt is generated. This is called the system interrupt. Its con-
stant repetition is called the system cycle.
If interrupts have been enabled, this interrupt causes the processor to complete
executing the current instruction, to disable interrupts by setting the I flag, and to
save the contents of the program counter and the status register on the stack. The
processor then retrieves the two bytes at FFFE and FFFF hex, and uses them as the
address at which to begin executing.
One of the first instructions in the routine at this address is an indirect jump to
the address in locations 0314 and 0315 hex. Since these are RAM locations, the ad-
dress must have been placed there earlier. The kernel does this at computer power-
up, with the address of a routine in the kernel.
Several functions are performed by this routine, which we earlier called an in-
terrupt handler. The two most important to us are that it calls the kernel module
SCNKEY, which reads the keyboard and places any pressed character into the
keyboard queue for GETIN to remove later, and that it maintains the kernel clock.
Because this is an appropriate place to do so, we will take a few paragraphs to
describe the keyboard-reading function and expand on the system interrupt and
cycle. Then we will focus on the kernel clock function.
Neither the system interrupt nor the kernel interrupt handler is necessary to
running a program on the C64. For instance, the system interrupt can be turned off
by resetting bit 0 of CIA #1's control register A, at DC0E hex, to 0. One reason to do
this is that the kernel interrupt handler consumes up to about 5% of each system
cycle. In a time-critical loop, the programmer may want to ensure that execution
completes without a pause for interrupt handling. The interrupt can be turned back
on by setting the same bit to 1. A program can do either by loading DC0E into the
accumulator, using a logical AND or OR to reset or set bit 0, and writing the ac-
cumulator back into DC0E.
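Sketched in code, the off-and-on sequence looks like this:

```asm
;sketch: turn the system interrupt off, run critical code, turn it on
LDA $DC0E       ;CIA #1 control register A
AND #%11111110  ;reset bit 0: the system interrupt stops
STA $DC0E
;...time-critical code executes here without interruption...
LDA $DC0E
ORA #%00000001  ;set bit 0: the system interrupt resumes
STA $DC0E
```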
With the system interrupt turned off, the executing program will continue to
execute normally, but without the usual periodic interruption. The kernel interrupt
handler will never be activated and therefore will never read the keyboard. If
keyboard input is still needed, the program must read it the same way the interrupt
handler does: by calling SCNKEY periodically. SCNKEY's summary is included
here for programming reference.
SCNKEY:
Purpose: Read the keyboard and if any key pressed, place its code
in the keyboard queue.
JSR address: FF9F hex.
Prerequisites: None.
Data passed: None.
Data returned: None.
Altered registers: A, X, and Y.
The opposite situation, with the system interrupt turned on but the kernel in-
terrupt handler being sidestepped, can also occur. This is done by placing the ad-
dress of a routine in your program into locations 0314 and 0315 hex, which normally
store the address of the kernel interrupt handler. On an interrupt, execution will pass
to your routine instead of to the kernel's. There are two reasons to do this: First, a
shortened interrupt handler may be useful to gain 2 or 3% in execution time in a
critical situation which still requires frequent and regular keyboard input. Your in-
terrupt handler might contain an interrupt counter which triggers a SCNKEY call
every n interrupts, n depending on how seldom you can tolerate a keyboard read.
However, your handler must end with a CLI operation, to reenable interrupts, and
an RTI, or ReTurn from Interrupt, instruction.
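A shortened handler along those lines might be sketched as follows. The counter byte count and the label names are our own; the closing instructions restore the registers that the kernel pushed on the stack before jumping through 0314/0315:

```asm
;sketch: a shortened handler calling SCNKEY every 4th interrupt
inthand:
LDA $DC0D       ;reading CIA #1's interrupt control register
                ;acknowledges the system interrupt
DEC count       ;count down the interrupts
BNE exit
LDA #4          ;every 4th interrupt...
STA count       ;...restart the counter
JSR $FF9F       ;...and call SCNKEY to read the keyboard
exit:
PLA             ;restore Y, X, and A, saved by the kernel
TAY
PLA
TAX
PLA
RTI             ;return from the interrupt
```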
Alternatively, you may want all the normal interrupt handler functions
together with added functions. The functions that belong in an interrupt handler
are, like the keyboard read, those that require periodic and frequent processing
without complicating the main program's structure or timing. Two of the most com-
mon functions in this category are graphics and audio output updating.
Adding functions to the normal interrupt handler requires inserting your own
handler between the interrupt and the kernel's handler. A program does this by
copying the address in 0314 and 0315 elsewhere before overwriting it with the ad-
dress of its own handler. The program's handler ends with an indirect jump into the
kernel's handler, using the address originally in 0314 and 0315.
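Sketched in code (with oldvec as two free bytes of our own and myhand as the program's handler), the wedge might look like this:

```asm
;sketch: wedge a handler in front of the kernel's interrupt handler
install:
SEI             ;no interrupts while the vector is changed
LDA $0314       ;copy the kernel handler's address...
STA oldvec
LDA $0315
STA oldvec+1
LDA #<myhand    ;...then point 0314/0315 at our handler
STA $0314
LDA #>myhand
STA $0315
CLI
RTS
myhand:
;...added periodic functions go here...
JMP (oldvec)    ;continue into the kernel's handler
```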
Two other types of interrupts are serviced by the C64. The first is a software
interrupt, or BRK. As we mentioned in Chapter 3, when the BRK instruction is
fetched, the kernel routes the interrupt call to the address stored in locations 0316
and 0317 hex.
The second type of interrupt is nonmaskable. It is the NMI hardware inter-
rupt, and it cannot be disabled. The programmer's main interest in this interrupt is
that it is triggered when the RESTORE key is pressed on the keyboard. The interrupt
call is routed to the address stored in locations 0318 and 0319 at that time.
Having discussed various keyboard reading and system cycle issues, we can
return to the kernel clock. The kernel clock is a series of three bytes in low memory.
The lowest byte, at 00A2 hex, is incremented by the kernel's interrupt handler on
every system interrupt. The second byte, at 00A1 hex, is incremented each time the
first byte overflows (from FF to 00 hex). In the same pattern, the topmost byte, at
00A0 hex, is incremented when the second byte overflows. The limit on this process
for 1/60-second interrupts is the value 4F 19 FF (00A0 is on the left), which
represents 24 hours. When this time is reached, all three bytes are reset to zeros to
restart the clock.
There are two kernel modules for reading from and writing to this clock.
However, it is so simple to read or write these bytes directly that it is not worth using
the kernel. If a program using the clock must be transported to a different com-
puter, the code can be replaced with trivial effort as long as it is well organized and
well marked.
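Reading the bytes directly might be sketched like this; interrupts are disabled briefly so the interrupt handler cannot increment a byte between the three reads (the destination ticks is three free bytes of our own):

```asm
;sketch: copy the kernel clock's three bytes into memory
SEI             ;freeze the clock between the three reads
LDA $A0         ;topmost byte
STA ticks+2
LDA $A1         ;middle byte
STA ticks+1
LDA $A2         ;lowest byte, counts single system interrupts
STA ticks
CLI
```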
The disadvantage to using the kernel clock is that the conversion from the
binary bytes to hours, minutes, and seconds, and vice versa, must be programmed,
requiring much effort, much code, and much execution time. A better solution is to
use the clock provided by CIA #1. It expresses time in AM/PM, hours, minutes,
seconds, and tenths of seconds, and it also provides an alarm function.
CIA clock. CIA clock time is in BCD format, with each decimal digit being
represented by its binary code. As with the kernel clock, the time appears to a pro-
gram as a sequence of bytes in the memory map. The hours byte is kept in location
DC0B hex, with the top bit indicating AM or PM time. The interpretation of that bit
is up to you, but it must be used consistently within a program. The minutes byte is
in DC0A, the seconds byte is in DC09, and the 1/10-seconds byte is in DC08.
Using the I/O Block: Nonchannel I/O 275
The clock is set by placing the desired time into the four clock bytes, starting
with the hours byte and finishing with the 1/10-seconds byte. As an example, the
hexadecimal byte values for time 11:57:30.5 PM are shown below:
91 57 30 05
The top bit of the hours byte is set for PM, resulting in an hours value of 91.
The time must be written from hours to 1/10 seconds because a write to the
hours byte stops the clock. Only a write to the 1/10-seconds byte will restart it.
A similar situation occurs when the clock is read. When the hours byte is read,
the clock output is frozen, although in this case the clock itself keeps running. Only
a read from the 1/10-seconds byte can release the clock output. Thus the clock must
also be read starting with the hours byte and finishing with the 1/10-seconds byte.
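Both rules are illustrated in the sketch below, which writes the time 11:57:30.5 PM and then reads the clock back into four free bytes of our own (hours through tenths):

```asm
;sketch: set the CIA #1 clock, then read it back
LDA #$91        ;hours, BCD, top bit set for PM;
STA $DC0B       ;this write stops the clock
LDA #$57        ;minutes
STA $DC0A
LDA #$30        ;seconds
STA $DC09
LDA #$05        ;tenths of seconds;
STA $DC08       ;this write restarts the clock
;reading also runs from hours down to tenths
LDA $DC0B       ;this read freezes the clock output
STA hours
LDA $DC0A
STA mins
LDA $DC09
STA secs
LDA $DC08       ;this read releases the clock output
STA tenths
```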
An alarm-clock capability is included in the CIA. To set the alarm, set bit 7 of
the CIA #1 control register B, at location DC0F hex, to 1. Then write the alarm time
into the clock registers. Reset bit 7 of the same register to return to clock operation.
When the clock time equals the alarm time, the CIA will both generate an IRQ inter-
rupt and identify it as an alarm interrupt by setting bit 2 of CIA #1's interrupt con-
trol register, at DC0D hex, to a 1.
Note that this is the same type of interrupt as produced by the system 50 or 60
times a second. The only way a program can tell the difference is by having its own
interrupt handler to determine the interrupt source. The handler must check bit 2 of
DC0D hex and route processing to an alarm-servicing routine or to the kernel inter-
rupt handler accordingly.
An alarm function is most useful with time-of-day-oriented timing. It could be
used to generate periodic interrupts with a cycle of the program's choice if the pro-
gram reacts to each alarm by reading the clock and placing a suitably increased time
in the alarm registers. However, this is an awkward use of the clock. Alternatively, the
clock could be used to count the time between events, by reading the clock at the
beginning and ending event. However, this also is cumbersome. Both the generation
of adjustable period interrupts and the timing of events are better and more easily
done with the CIA timers.
CIA timers. There are two timers on each CIA chip, designated timer A and
timer B. Both timers on CIA chip #1 are in use during normal kernel operations, so
they are unavailable to your programs. However, both timers on CIA chip #2 are
available as long as the kernel's RS-232 capabilities are not in use (i.e., as long as no
channel has device number 2). If RS-232 I/O is performed through a bus converter,
as we have recommended, CIA #2's timers A and B will always be available.
Each CIA timer uses two counting bytes. On CIA #2, timer A's low byte is at
DD04 and its high byte is at DD05. Timer B's bytes are at DD06 and DD07.
To use a timer, a program first places a value in a latch at the same locations as
the timer's two counter bytes. It then commands the timer to load the bytes from the
latch into the counter. Next it starts the timer. Timer A decrements the 16-bit value
upon receiving each 1-megahertz clock pulse (1 million times a second). Depending
on its setting, timer B can decrement its counter the same way, or upon each decre-
ment to 0 of timer A. The latter type of decrement allows using both timers together
for a longer delay between timer B decrements to O.
When a timer's count reaches 0, an IRQ interrupt is generated and a bit is set
in the interrupt control register. This is the same register, although not the same bit,
that the CIA flags on an alarm interrupt. As before, a program's interrupt handler
can check the register and determine the interrupt's source. The CIA also reloads the
starting value from the latch into the counter bytes. Depending on the timer mode
commanded by the program, the timer halts or starts over again. The former
possibility is called the one-shot mode; the latter, the continuous mode.
There are seven important timer registers on a CIA. Timer A and timer B each
have two counter registers, at locations DD04 through DD07 on CIA #2. There is an
interrupt control register at location DD0D to flag the source of the interrupt. Bit 1
is set for a timer B interrupt and bit 0 for timer A. Last, there are two command
registers for selecting the different timer operations, at locations DD0E and DD0F.
These operations are listed in Table 5.11.
Program code using the timers must select the timer repetition and decrement
modes, place two bytes in the timer latch, command the timer to load the latch into
the counter, and start the timer. The program's interrupt handler must check the
CIA's interrupt control register for the interrupt source.
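As a sketch of those steps (the 50-millisecond period, C350 hex = 50000 counts, is our own choice), timer A on CIA #2 could be set running in continuous mode like this:

```asm
;sketch: run CIA #2's timer A continuously with a 50 ms period
LDA #$50        ;low byte of the count (C350 hex = 50000)
STA $DD04       ;timer A latch, low byte
LDA #$C3        ;high byte of the count
STA $DD05       ;timer A latch, high byte
LDA #%10000001  ;a 1 in bit 7 means "set the bits marked 1":
STA $DD0D       ;enable timer A's interrupt (bit 0)
LDA #%00010001  ;bit 4 loads the latch into the counter;
STA $DD0E       ;bit 3 = 0 selects continuous mode; bit 0 starts
```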
This completes our exploration of general channel and nonchannel I/O on the
Commodore 64. In the next two chapters you will learn of the special graphics and
audio effects made possible by the VIC and SID chips, and how to use them.
Exercise:
Answer the following questions to review general Commodore 64 I/O.
(a) What two major types of I/O does the C64 support?
(b) Name the three types of files that the operating system allows accessing, and explain
their various uses.
C64 Disk I/O: VIC-1541 User's Manual, Commodore Business Machines, Inc.
Nothing energizes a program like spectacular graphic effects. However, graphics
alone cannot carry a program. For instance, many game programs include in-
teresting scenes and motion but still bore us because their plots are flawed. To keep
graphics in their proper perspective as well as to provide a solid programming
background, we have delayed this chapter until late in the book. If you have en-
thusiastically practiced the principles of good programming, you now have control
over the computer's processing actions and are ready to utilize the Commodore 64's
powerful graphics to support your program tasks.
Graphics can enhance almost any type of program. The widespread belief that
striking graphics are only for game programs is unfortunate: The mind can take in
many types of information more quickly from a picture than from text. Further,
people work better when having fun, and most people find watching graphics more
enjoyable than reading text.
Nevertheless, abstract ideas are often best conveyed as text. The ideal display
for many purposes combines both graphics and text. This is borne out by military
studies on aircraft displays, which prove that the most piloting information is com-
municated in the least time by mixing graphics and text. Such a mixture is natural to
the C64.
C64 graphics are generated by the VIC II chip. "VIC II" stands for Video In-
terface Controller (version) II. All graphic output, even through the screen channel
as we discussed in Chapter 5, uses VIC. By using the screen channel we avoided most
graphics housekeeping duties, but we also traded off most of VIC's graphics
capabilities.
Now we will make use of those untapped capabilities and exploit shape, color,
Graphics and the Movie 279
and movement with VIC and the C64. These three characteristics are also found
in cinema, the traditional art form most similar to computer graphics. We will use
the cinema analogy to illuminate the use of graphics on the C64.
Both the movie camera and movie film have equivalents in a C64 system. Only the
scene to be "photographed" is left for the program and programmer to supply. VIC
is the camera, obtaining the scene as data from memory and transforming it through
its "lenses" to yield the screen image. We begin with the screen image since it, like
film in movies, is the product of the medium.
The physical characteristics of the video screen and movie film are similar. Both
have a "base" and an "emulsion." For movie film, the base consists of plastic or
acetate, coated with an emulsion of light-sensitive particles. For the video screen, the
base is the inner glass surface of the picture tube, while the emulsion consists of its
regularly spaced phosphor particles. As we said in Chapter 5, the raster electron
beam sweeps across the phosphors one row at a time starting from the top of the
screen, taking the place of light in setting the color and brightness of the individual
emulsion particles. Upon reaching the bottom of the screen, the raster beam is
turned off and returned to the top row to start over. Again, in practical use the
phosphors are grouped into pixels, which are the building blocks of an image. Each
raster line is one pixel high.
The base and emulsion are the lowest-level picture elements. They are orga-
nized into individual pictures called frames. Frames are still pictures of a uniform
size, which are shown in sequence to give the illusion of movement. This concept is
so familiar, due if nothing else to the pictures of film reels preceding afternoon or
late-night movies, that the movie parallels will be left to the reader throughout much
of the following discussion.
Screen format. The first aspect of frames, that they are still pictures of a
uniform size, governs the format of every C64 screen image. A full screen cor-
responds to a single movie frame. It contains a unicolor rectangular border sur-
rounding a smaller viewing screen of two possible sizes in each dimension. The
smaller screen has a height of 200 or 192 pixels, or equivalently, raster lines, and a
width of 320 or 304 pixels. These sizes correspond to a screen 25 or 24 character rows
tall by 40 or 38 characters wide, respectively.
The full screen is defined by 262 raster lines: 200 or 192 for the viewing screen
and 62 or 70 for the border area and for the lines that could otherwise be written
during the time it takes the raster beam to return from the bottom of the video
screen to the top (the "vertical retrace" of the raster beam).
280 Awakening the Pixy: Advanced Graphics Chap. 6
Figure 6.1 Screen format (raster lines 0-49/54, inclusive, hold the top border and the vertical retrace)
This screen format is illustrated in Fig. 6.1. Each figure following a "/" cor-
responds to the smaller viewing screen size. The viewing screen size is selected by set-
ting bits in two registers (Table 6.1).
For some purposes the viewing screen effectively extends beyond these visible
pixels to lie beneath the border. With both screen sizes, this underlap allows certain
picture objects to enter and exit the visible viewing area smoothly.
Additionally, with the smaller size viewing screen the entire image can enter
and exit the visible area smoothly. This smooth movement of the image up and
down or side to side is called "tilting" and "panning," respectively, in movie ter-
minology, and scrolling in computer terminology. We will discuss the general case of
scrolling the entire small-screen image here and leave the restricted case of moving
individual objects for later.
Scrolling. Smooth scrolling implies that an image enters and exits the screen
in small increments, much smaller, say, than the jump that occurs when the kernel
editor shoves the screen up a line to insert text at the bottom.
Given that the pixel is the smallest picture unit, one pixel is the smallest
amount the screen image can conceivably be moved. VIC supports one-pixel move-
ment vertically or horizontally with two three-bit scrolling registers. The vertical
scrolling register is in bits 0 through 2 of VIC's control register A at D011. The hori-
zontal scrolling register is similarly in bits 0 through 2 of control register B at D016,
beside the horizontal size selector bit d3.
The three-bit scrolling registers allow for up to eight pixels' image movement,
which is sufficient for continuous smooth scrolling. Incrementing the horizontal
scrolling register from 0 to 7 moves the image from left to right. This uncovers the
hidden view under the left border while concealing the right edge of the image under
the right border. Upon reaching the maximum value of the scrolling register, the
data from which the image is generated must be manipulated to move the entire im-
age to the next rightward character position, new data must be written for the hid-
den left column, and only then can the scrolling register continue moving the image
incrementally by cycling over from 0 to 7. In pseudocode this is summarized as
follows:
'Left_To_Right_Scrolling_Algorithm'
Set the screen to 38 columns
LOOP
Set the horizontal scrolling register to 0
LOOP
Increment the horizontal scrolling register
EXITIF register holds 7
Delay for desired scrolling rate
ENDLOOP
Move entire image right one character position
Write the hidden column
EXITIF done scrolling
ENDLOOP
END 'Left_To_Right_Scrolling_Algorithm'
For smooth scrolling, the time consumed by the embedded delay loop must
equal the time consumed in moving the image and writing the hidden column. Of
course, both times can be extended for slower scrolling. A better way to control
scrolling is with a raster interrupt handler, a special module that executes when the
raster beam reaches a predefined screen line. This technique is discussed in the
following animation section.
To "move entire image right one character position" you must know how the
image data are organized. This is discussed in the screen memory, color memory,
and bit-map memory subsections of the upcoming "Focusing the Image" section.
You must move the data within these structures in fast LOOP constructs to be able to
update the screen quickly enough. Because of time constraints, scrolling is usually
done in character mode rather than bit-map mode, although careful programming
can make the latter possible. These two modes are also discussed in the section
"Focusing the Image."
To "write the hidden column," place the data directly into screen, color, and
bit-map memory; CHROUT and PLOT are too slow.
Horizontal movement from right to left is the mirror image of movement from
left to right. The horizontal scrolling register is cycled from 7 to 0, image data are
adjusted for leftward movement, and new image data are written for the column
under the right border.
Vertical movement is similar to horizontal movement except that the two
possible screen sizes differ by only one character in the vertical dimension. Unlike in
the horizontal dimension, where hidden columns are available at both ends, just one
hidden vertical row is available. The scrolling register can be used to place the hid-
den row at either the top or the bottom of the screen. A value of 0 places the hidden
row at the top; a value of 7 places it at the bottom. Incrementing the vertical scroll-
ing register from 0 to 7 scrolls the image downward. As before, the image data must
be adjusted to move the entire image one character position up or down, and new
data must be written to generate the oncoming image edge.
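Because the scrolling bits share their registers with other controls, writes to them should preserve the remaining bits. A sketch for the vertical register follows (temp is a free byte of our own, and the new scroll value arrives in register X):

```asm
;sketch: set the vertical scrolling register without disturbing
;the other bits of control register A
LDA $D011       ;VIC control register A
AND #%11111000  ;clear the scrolling bits d0-d2
STA temp
TXA             ;the desired scroll value, 0-7, in X
AND #%00000111  ;keep only a legal value
ORA temp        ;merge with the untouched upper bits
STA $D011
```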
Animation. The second aspect of frames, that they are shown in sequence to
give the illusion of movement, is the general case for both movies and graphics.
Where the image in movies is artificial, as in cartoons, this technique is called
"animation." Since all C64 images are artificial, the term animation also identifies
moving C64 graphics. A simple subcase of computer animation is the unchanging
display, which merely omits the mechanics of regular image change.
Frames in movie film are commonly shown at one of two speeds. Silent films
are shown with realistic effect at 16 frames per second (fps). Sound films are shown at
the higher speed of 24 fps to improve the sound quality as the film's sound track
moves through the projector more rapidly.
The video raster beam fills the screen 60 times each second. This means that
each raster line is present for 1/262 of 1/60 second, or 64 microseconds. Since the
6510 microprocessor operates on roughly a 1-MHz clock cycle, with each assembly
instruction consuming several 1-microsecond clock pulses, it is obvious that only a
handful of instructions can execute during the life of one raster line.
Changing the image on the screen once every four 1/60-second screens yields an
effective change rate of 15 fps, or nearly that of silent film. This slower rate simulates
motion believably while lessening the program's screen processing load by three
fourths.
To implement change at this rate, we need a means of coordinating image and
screen changes. VIC provides a simple solution by indicating the actual raster beam
position and also by generating an interrupt when the beam reaches a preselected
raster line.
Raster beam actual and interrupt positions are both accessible through nine
register bits on VIC. Nine bits are necessary to represent all 262 possible raster posi-
tions. The lower eight bits are in the raster register at location D012 hex. The most-
significant bit is stored in bit d7 of control register A at D011. When read, these nine
bits contain the line number of the actual raster beam position, from 0 to 261 starting
at the top of the screen.
When the nine bits are written to, VIC begins comparing the deposited value
against the actual raster position. When the two values become equal, VIC sets raster
IRQ bit d0 in the IRQ flags register at D019 to 1. This action is most useful if the
raster IRQ enable bit d0 of the IRQ enable register at D01A has previously been set to
1. Then VIC will also generate an IRQ interrupt.
In this way a program can be informed when the raster beam reaches a par-
ticular raster line, which in most cases is also a pixel row. To use this information in
animation, the program must take certain preliminary actions, and several functions
must be included in the program's interrupt handler. General information on inter-
rupt handlers can be reviewed by rereading the system clock section of Chapter 5.
The program's duties for animation are to disable all nonraster interrupts, to
load a pointer in locations 0314 and 0315 hex with the beginning address of the raster
interrupt handler, to place the initial 9-bit raster-line comparison value into the raster
and control registers at locations D012 and D011 hex, and to set up a frame counter
to allow changing the image only every fourth screen.
The program disables all nonraster interrupts by writing values into interrupt
mask registers on the I/O chips that generate interrupts: VIC and the CIAs. To
disable all VIC IRQs except for the raster's, the value 01 must be written into VIC's
IRQ enable register at location D01A hex. To disable all CIA IRQs the value 7F hex
must be written to CIA #1 and #2 locations DC0D and DD0D hex. All nonraster inter-
rupts must remain disabled until raster interrupts are no longer in regular use. VIC's
interrupts are reenabled by writing the value FF hex into location D01A. They can be
selectively reenabled by writing 1 values into bits d1 through d3, for the sprite/
background, sprite/sprite, and light pen IRQs, respectively. CIA interrupts are
reenabled by writing the value FF hex into locations DC0D and DD0D hex. More
detailed information on CIA interrupts is beyond the needs of most C64 assembly-
language programmers, but information on the subject is contained in the CIA chip
specification in Appendix M of the Commodore 64 Programmer's Reference Guide.
The best raster-line comparison value for animation purposes is 250 decimal.
This value represents the first line after the visible viewing screen, so it allows the
longest possible time for image changes before the raster returns to the top of the
viewing screen. A comparison value within the viewable area of lines 50 through 249
usually results in flickering on the screen, which should be avoided when possible.
Screen flickering is caused by changing a screen object just as the raster is draw-
ing it. To avoid flickering, keep two copies of the screen at different legal screen-
memory or bit-map memory areas (see the screen memory and bit-map memory
subsections of the upcoming "Focusing the Image" section) and display one while
changing the other. By flipping back and forth between the two screens, all changes
can be made off screen and this cause of flickering is eliminated.
Example:
Show an assembly language construct that could be used to prepare for using raster inter-
rupts. Use 250d as the raster-line comparison value.
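One possible solution (a sketch, not the book's original listing; HANDLER and FRAME are hypothetical labels for the raster interrupt handler and a free frame-counter location):

```
        SEI                 ; hold off interrupts while changing things
        LDA #$7F
        STA $DC0D           ; disable all CIA #1 interrupts
        STA $DD0D           ; disable all CIA #2 interrupts
        LDA #$01
        STA $D01A           ; enable only VIC's raster IRQ
        LDA #<HANDLER
        STA $0314           ; point the IRQ vector at the raster handler
        LDA #>HANDLER
        STA $0315
        LDA #250            ; comparison value 250d
        STA $D012           ; low eight bits of the comparison value
        LDA $D011
        AND #$7F            ; ninth comparison bit = 0, since 250 < 256
        STA $D011
        LDA #$00
        STA FRAME           ; start the frame counter at zero
        CLI                 ; let interrupts begin
```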
Example:
Show assembly code for returning to normal system operation after a period of handling
only raster IRQs.
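One possible solution (a sketch; EA31 hex is the address the system places in the IRQ vector at power-up):

```
        SEI
        LDA #$00
        STA $D01A           ; disable all VIC IRQs, raster included
        LDA $D019
        STA $D019           ; acknowledge any pending VIC interrupt
        LDA #$FF
        STA $DC0D           ; reenable the CIA #1 interrupts
        STA $DD0D           ; reenable the CIA #2 interrupts
        LDA #$31
        STA $0314           ; restore the power-up IRQ vector,
        LDA #$EA            ; EA31 hex
        STA $0315
        CLI
```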
The raster IRQ interrupt handler's duties are more numerous than the pro-
gram's. First, the A, X, and Y registers must be saved on the stack.
Second, the frame counter must be maintained to allow changing the image
only every fourth screen (assuming a 15 fps rate).
Third, on every fourth screen the source data for the image must be changed to
simulate the desired motion. There will be more on this shortly. This part of the raster
interrupt handler can also change screen colors and provide synchronization for the
screen flipping discussed earlier in this section.
Fourth, the 9-bit raster-compare value must be reloaded into the raster and con-
trol registers at locations D012 and D011 hex.
Fifth, VIC must be informed that the interrupt has been serviced. This is done
by writing a 1 into the raster IRQ bit of the IRQ flags register at D019. Simply reading
the IRQ flags register into a CPU register and then writing the same value back will
suffice.
Sixth, the A, X, and Y registers must be pulled from the stack.
Seventh and last, an RTI (ReTurn from Interrupt) or the normal system inter-
rupt code can be executed. The latter can be done if the system 1/60-second interrupt
is turned off, as described in Chapter 5, and the 1/60-second raster interrupt is used
for system timing instead.
Example:
Show the assembly code for a raster IRQ handler.
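One possible shape for the handler (a sketch, assuming the interrupt is vectored through the kernal ROM entry, which saves A, X, and Y on the stack before jumping through the pointer at 0314; FRAME and MOVE are hypothetical names for the frame counter and the image-changing routine):

```
HANDLER INC FRAME           ; duty 2: maintain the frame counter
        LDA FRAME
        AND #$03
        BNE ACK             ; act only on every fourth screen
        JSR MOVE            ; duty 3: change the image source data
ACK     LDA #250            ; duty 4: reload the comparison value
        STA $D012
        LDA $D011
        AND #$7F            ; ninth comparison bit = 0
        STA $D011
        LDA $D019           ; duty 5: acknowledge the interrupt by
        STA $D019           ; writing the flags register back to VIC
        PLA                 ; duty 6: restore Y, X, and A, which the
        TAY                 ; ROM entry pushed on the way in
        PLA
        TAX
        PLA
        RTI                 ; duty 7: return from the interrupt
```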
The raster interrupt handler is the key to advanced graphics work. It enables a
program to utilize scrolling, animation, and masking (the topic of the next section) at
the fastest possible rates, which allows more graphics activity and more background
processing by the program between interrupts.
The C64 supports a limited form of masking. The screen can be divided into
horizontal sections having different image sources. Each section will be the full width
of the screen. At assembly language processing speeds the screen can be divided into
as many sections as there are raster lines, for an absolute limit of 262 sections.
Most graphics displays will use no more than two sections: one for graphics and
one for text is typical, as we mentioned at the beginning of the chapter.
To mask a graphics frame, the interrupt handler used for animation is ex-
panded with additional tasks. Beyond its previous duties it must determine which
screen section the raster is in at the time of the interrupt, change the scene to be
photographed for the new section, and set up the raster comparison value that will
trigger the interrupt at the beginning line of the next screen section.
Determining the screen section can be done by reading the raster position and
comparing it to the raster comparison values the program uses for the different
screen sections. If the position equals or is one larger than (allowing for the delay
before reading the value) one of the comparison values, the screen section has been
positively identified. Alternatively, a simple counter can be used to keep track of
progress through consecutive screen sections. Changing the scene requires the same
actions as initially selecting the scene, a topic that will be discussed later.
Setting up the next raster comparison value is quite simple once the program-
mer has selected a value. As in animation, the value is chosen by considering the row
numbers of the viewing screen inside the border. For instance, if the screen is to be
divided in half, there might be interrupts on row numbers 250 and 150, for the first
invisible line before the top half, and for the line halfway down the 200 visible line
numbers from 50 to 249. The interrupt handler would, on the line 250 interrupt, write
the hex equivalent of 150 into the raster register. On the line 150 interrupt it would
write 250 into the raster register.
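Inside the raster handler, the alternation just described might be coded like this (a sketch; SECTION is a hypothetical flag that is 0 when the line-250 interrupt fires and 1 when the line-150 interrupt fires):

```
        LDA SECTION
        BNE AT150
        LDA #150            ; line 250 fired: arm the line-150 IRQ
        STA $D012           ; ... and set up the top-half scene here
        INC SECTION         ; next interrupt will be at line 150
        JMP ACKIRQ
AT150   LDA #250            ; line 150 fired: arm the line-250 IRQ
        STA $D012           ; ... and set up the bottom-half scene here
        DEC SECTION         ; next interrupt will be at line 250
ACKIRQ  LDA $D019           ; acknowledge the raster IRQ
        STA $D019
```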
Whether the screen is scrolled, animated, or masked, any image on the screen
will be completely composed of picture blocks of three possible sizes. These blocks
correspond to objects in the scene VIC photographs, a scene provided by the ex-
ecuting program and populated with object data structures. Our next topic is this
photographic subject for the C64's graphics camera.
Early in the chapter we mentioned the three major aspects to any visual scene: shape,
color, and movement. Movement has already been examined, as it will be again.
However, for now we will consider the scene as a frozen view. This is consistent with
our definition of a frame as a still picture.
Shape and color. Shape and color are independent qualities. This is evident
from the coexistence of black-and-white and color forms of movies, drawings, and
other artistic media. The data making up a graphics scene are likewise separated into
shape and color attributes. Shape data consist of object data structures. The constit-
uents of color data will be discussed shortly.
Value   Color
0       Black
1       White
2       Red
3       Cyan
4       Purple
5       Green
6       Blue
7       Yellow
8       Orange
9       Brown
A       Light red
B       Dark gray
C       Medium gray
D       Light green
E       Light blue
F       Light gray
Visual objects. Objects come in two sizes with similar structures. Every ob-
ject has some multiple of eight pixels across each horizontal row, and eight or more
rows from top to bottom. This structure is illustrated in Fig. 6.2. For each pixel in the
object image or picture block there is one bit in the object data structure. The pixels
and the bits are both ordered row by row from the upper-left pixel to the lower-right
pixel.
[Figure 6.2: Object structure — a picture block '8N' pixels wide and eight or more pixels tall, one data bit per pixel.]

So the first byte in an object data structure defines the leftmost eight pixels on
the first row, the second byte defines the next-right eight pixels on the first row, the
byte after the last byte for the first row defines the leftmost eight pixels on the second
row, and so on until the rightmost eight pixels on the last row are defined. The most
significant bit in a data structure byte defines the leftmost pixel in the corresponding
eight-pixel object area.
VIC has three operating modes, which correspond to three different ways that
VIC looks at the scene data. They are descriptively named the character, bit-map,
and sprite modes and will be discussed in a later section. The object and correspond-
ing data structure sizes for all three modes are shown in Table 6.3.
Object structure. Within each object there are two ways of correlating data
bits to visible pixels. By our definition of codes (i.e., meanings assigned to low-level
data elements), we can call these two correlations the pixel codes. VIC has submodes
within each mode that, with just one exception, correspond to the pixel code used
within the object data structures.
The simplest pixel code assigns each data structure bit to a single pixel in each
visual object image. Following the left-to-right and top-to-bottom ordering conven-
tion we have already discussed, the most significant (d7) bit of the first data byte
defines the value of the leftmost pixel on the top row, the next (d6) bit defines the
next-right pixel on the top row, the d7 bit of the second byte defines the value of the
leftmost pixel on the second row, and so on through the least-significant bit of the last
byte which defines the bottom rightmost pixel.
This code is used in the normal submode of each of VIC's operating modes. So
in the normal submode each pixel is assigned either a 1 or a 0 value. An object that
uses this code to define the pixels in a character-mode picture block of the letter C ap-
pears as shown in Fig. 6.3. In order as hexadecimal bytes, this object contains the
values "3C 7E 66 60 60 66 7E 3C."
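In assembler source form, this character object might be written as a table of constants (a sketch; CHARC is a hypothetical label, and .BYTE is the byte-constant directive of many 6502 assemblers):

```
CHARC   .BYTE $3C           ; 0011 1100
        .BYTE $7E           ; 0111 1110
        .BYTE $66           ; 0110 0110
        .BYTE $60           ; 0110 0000
        .BYTE $60           ; 0110 0000
        .BYTE $66           ; 0110 0110
        .BYTE $7E           ; 0111 1110
        .BYTE $3C           ; 0011 1100
```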
The second pixel code assigns two bits to two-pixel horizontal groups. Follow-
ing the left-to-right ordering convention, the most significant two bits in the first ob-
ject byte define the leftmost two pixels on the top row, and so on.
Obviously, grouping the pixels by twos cuts the picture resolution in half.
However, having two bits to describe each unit allows for assigning one of four values
instead of one of two. In other words, this code trades fineness of picture detail for
color variety.
[Table 6.3: Object and data-structure sizes by mode — Character: 8h x 8v pixels, 8 bytes; Bit-map: 8h x 8v pixels, 8 bytes; Sprite: 24h x 21v pixels, 63 bytes.]

[Figure 6.3: The letter C as a normal-submode character object. The object bytes in binary:

0011 1100
0111 1110
0110 0110
0110 0000
0110 0000
0110 0110
0111 1110
0011 1100]

The second code is the basis for the multicolor submodes of all three VIC
modes. Thus in the multicolor submode each two-pixel group is assigned a value
from 00 to 11 binary. A "doctored" letter C in character mode and multicolor sub-
mode might appear as shown in Fig. 6.4 (with symbols representing colors). In order
as hexadecimal bytes, this object contains the values "15 5A 6A 6A C0 C0 F0
BF."
Each mode has its own way of assigning color information to the data-structure
bit values and thus to the pixels. These assignments are discussed in the next section.
A movie camera transforms the scene before it into images on film. We now ex-
amine the graphics device that transforms the scene data before it into screen images:
the graphics camera, VIC.
VIC has four major graphics functions, corresponding roughly to the movie camera
functions of focusing, exposing film, animating, and in-camera editing. In graphics
terms these functions are, respectively, transforming scene data, writing the
transformed image on the screen, simulating motion, and providing for a change of
scene. We have already discussed the animating and exposure functions, as well as
some aspects of the focusing function. Now we will look in greater detail at the re-
maining focusing and editing functions.
Focusing the image. In the focusing function are included all the image ac-
quisition and transformation actions. A lens focuses light from the scene into an im-
age on the film. The corresponding function in VIC shares two lens attributes.
[Figure 6.4: A "doctored" letter C in character mode and multicolor submode, with symbols standing for the colors of the two-pixel groups. The object bytes in binary:

0001 0101
0101 1010
0110 1010
0110 1010
1100 0000
1100 0000
1111 0000
1011 1111]
290 Awakening the Pixy: Advanced Graphics Chap. 6
First, lenses have a field of view. The field of view is the area isolated from the
camera's surroundings to be visible in the image. VIC also has a field of view, which is
smaller than its total surroundings. VIC's surroundings are the memory locations
that scene data can be stored in. This includes the entire 64K memory map. VIC
isolates a 16K field of view, called a bank, from its 64K surroundings to contain the
image source. There are no overlaps between banks, so under program control VIC
can access any one of four different banks.
VIC's field of view is selected, not on VIC, but on the data port A register of
CIA #2, at location DD00 hex. Bits d0 and d1 select the bank as indicated in Table 6.4.
Bank 0 is selected by the power-up reset handler and is active unless a program
changes the selection.
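A bank change might be coded as follows (a sketch; it assumes the common arrangement in which bits d0 and d1 of DD00 hold the inverse of the bank number, so that 11 binary selects bank 0 at power-up):

```
        LDA $DD00           ; CIA #2 data port A
        AND #$FC            ; clear bits d0 and d1, preserving the rest
        ORA #$01            ; 01 binary: select bank 2 (8000-BFFF hex)
        STA $DD00
```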
A second attribute commonly associated with lenses is filtering. Filters are at-
tached to lenses to transform a scene into an image in different ways. Similarly,
VIC's modes and submodes treat data in the bank differently to produce different
types of images. The analogy is imperfect, however, because even the basic organiza-
tion of the 16K bank depends on the mode. Further discussion of the graphics func-
tions must therefore be by individual mode.
Again, the three modes are named character, bit-map, and sprite. Only one of
the first two modes can be active at a time. The third mode can coexist with either of
the others, as it merely overlays movable objects on the basic image.
Character Mode. In the character mode the 16K bank contains three sections:
the character sets, screen memory, and color memory.
The Character Sets. As we have already seen, in the character mode each ob-
ject contains eight bytes, one byte per pixel row in the 8 x 8 pixel image. These ob-
jects are called characters. The character sets are two libraries of 256 consecutive
eight-byte character definitions each. This allows a one-byte value to act as an index
into the character set to select the eight-byte character to be displayed, in a way that
will be explained in the following section. Thus each library requires 2K of memory,
for a total character set area of 4K. The C64 provides two predefined character sets in
the character set ROM, which is located between addresses D000 and DFFF hex in the
memory map (when it has been placed in the memory map). The lower and upper
character libraries are written in screen codes 1 and 2, respectively. These codes are
shown in Appendix B.
Whatever the source, only one 2K character set can be available to VIC at a
time. The character set in the first 2K area of character memory is selected by sending
0E hex to the screen channel with CHROUT. The second 2K library is selected by
sending 8E hex to the screen.
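For instance, the first library can be selected with two instructions (CHROUT is the kernal output routine at FFD2 hex):

```
        LDA #$0E            ; control code 0E hex: first 2K library
        JSR $FFD2           ; CHROUT sends it to the screen channel
```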
There are two aspects of character set addressing that you must be aware of.
The first is that a program can access the character set ROM only by placing it in the
memory map, which locates it starting at address D000, and reading it. The second
aspect of character set addressing is that because of circuitry manipulations in the
C64, VIC treats the character set as if it were actually located in the memory at ad-
dresses 1000 through 1FFF hex, for VIC bank 0, or 9000 through 9FFF hex, for VIC
bank 2 (see Table 6.4). This makes these address ranges unavailable for any other
graphics data. Only programs and their nongraphics data can be placed in these
memory areas. The character set ROM is unavailable when VIC is set to banks 1 or 3.
As we have said, the character set ROM is always present starting at a 1000 hex
offset into bank 0 or bank 2 (see Table 6.4). However, the character set used by VIC
can be obtained from any non-ROM bank offset that is a multiple of 800 hex. That is,
VIC can be commanded to take its character set data from offset 0000, 0800, 1000,
1800, ... , through 3800 hex into the bank.
The offset for the character set is selected by writing a value into bits d1 through
d3 of VIC's bank addressing register, at D018 hex. Table 6.5 lists the bit values and
resulting offsets. As usual, when altering these bits, preserve the others to avoid
disturbing the other functions of the register.
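For example, the offset 3000 hex (bit pattern 110 in d3 through d1, per Table 6.5) might be selected like this (a sketch):

```
        LDA $D018           ; bank addressing register
        AND #$F1            ; clear bits d1 through d3, keep the rest
        ORA #$0C            ; d3 d2 d1 = 110: offset 3000 hex
        STA $D018
```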
Screen Memory. The data in screen memory define the character shapes on
the screen. Recall that the larger screen format is 25 by 40 characters. The smaller
screen is 24 by 38 characters and is created by covering rows and columns of the larger
screen with a border. Screen memory is handled the same way in either case; the size
of the border is selected by setting bits in the VIC registers at locations D011 and
D016 hex, as was explained in the screen format section earlier in this chapter.
There is one byte in screen memory for every character position in the larger
screen, for a total of 1000 bytes. The value placed in each screen-memory location
serves as an index into the character set, causing the character image data for that
position to be obtained from an offset into the character set equaling the byte in that
screen-memory location times 8 (characters are defined by eight bytes of data each).
Thus a value of n in a screen-memory location displays the character defined as 8 x n
bytes into the character set. There are 256 characters in a library, for the 256 possible
index values.

Table 6.5 Character-set offsets selected by bits d3 through d1 of the bank addressing register:

d3 d2 d1    Offset
 0  0  0    0000 hex
 0  0  1    0800 hex
 0  1  0    1000 hex (ROM in banks 0 and 2)
 0  1  1    1800 hex
 1  0  0    2000 hex
 1  0  1    2800 hex
 1  1  0    3000 hex
 1  1  1    3800 hex (2K wrap-around)
Screen memory has a byte-oriented left-to-right, row-by-row structure similar
to the bit-oriented structure of characters. The zeroth screen-memory byte assigns a
library character to the leftmost position on the first row, the following byte defines
the next-right position on the first row, the fortieth byte defines the leftmost position
on the second row, and so on until the 999th byte defines the bottom rightmost
character position. This structure is illustrated in Fig. 6.5, with the screen-memory
bytes ordered by the screen positions they define. Again, the values placed in the
screen-memory locations are indices, or pointers, into the character set selected with
the bit patterns shown in Table 6.5.
Like the character sets, screen memory can be defined to start at a number of
different offsets into the 16K bank. As before, the starting location is chosen through
VIC's bank addressing register at D018 hex. Table 6.6 shows the bit values in bits d4
through d7 that select the different starting offsets for screen memory.
If you change the location of screen memory from the powerup value, the
kernal editor (used during keyboard input with CHRIN) will not work properly until
you write the most-significant byte of the new offset address into location 288 hex.
For example, if the bit value written into the register was 1111 binary, selecting the
screen-memory offset of 3C00 hex, you would write the value 3C hex into location 288 hex.
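That example might be coded as follows (a sketch):

```
        LDA $D018           ; bank addressing register
        AND #$0F            ; clear bits d4 through d7, keep the rest
        ORA #$F0            ; d7-d4 = 1111: screen memory at 3C00 hex
        STA $D018
        LDA #$3C            ; tell the kernal editor where screen
        STA $0288           ; memory now starts
```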
Color Memory. Earlier we said that "each mode has its own way of assigning
color information to the data-structure bit values" that define the pixels in an object.
Object shape, as we have just seen, is defined with pointer values placed in screen
memory. Object color is defined with a data structure parallel to screen memory,
called color memory, and with VIC's background-color registers.
[Figure 6.5: Screen-memory locations ordered by the screen positions they define — bytes 0 through 39 fill the first row, bytes 40 through 79 the second row, and so on.]

Table 6.6 Screen-memory offsets selected by bits d7 through d4 of the bank addressing register:

d7 d6 d5 d4    Offset
 0  0  0  0    0000 hex
 0  0  0  1    0400 hex (power-up value)
 0  0  1  0    0800 hex
 0  0  1  1    0C00 hex
 0  1  0  0    1000 hex
 0  1  0  1    1400 hex
 0  1  1  0    1800 hex
 0  1  1  1    1C00 hex
 1  0  0  0    2000 hex
 1  0  0  1    2400 hex
 1  0  1  0    2800 hex
 1  0  1  1    2C00 hex
 1  1  0  0    3000 hex
 1  1  0  1    3400 hex
 1  1  1  0    3800 hex
 1  1  1  1    3C00 hex

Color memory, unlike screen memory, has a permanent location in the 64K
memory map. It is always found starting at D800 hex. Like screen memory, color
memory consists of 1000 locations that are correlated to screen positions in the same
row-by-row manner. Only the lower four bits of each color memory location are
used, since there are only 16 (2^4) available colors to represent. The 16-value color code
used in color memory is also used in the background registers; it was shown in the
"Shape and Color" subsection.
Recall that the pixels in each object are grouped by ones or twos and are repre-
sented with one- or two-bit groups in the object data structure. The size of the bit
group depends on the submode. The value in each bit group is used to select the color
of the corresponding pixels. Depending on the submode under which VIC is running,
one of the possible bit-group values in any given screen-memory location causes the
color defined by the corresponding color-memory location to be assigned to the bit-
group's pixel or pixels. The color defined in color memory is the foreground color.
Because there are color memory locations for every screen position, foreground color
can be assigned independently to each of them. The other possible bit-group values
select a background pixel color from different sources depending on the submode
VIC is in. The background color for a given screen-memory bit-group value is com-
mon to the entire screen.
In Table 6.7 the color source for each possible bit-group value is shown by sub-
mode. A new submode is introduced in this table: extended background. This sub-
mode assigns the background colors in a different way that will be explained later.
Note that only eight different foreground colors can be represented on a
multicolor screen without masking. This is because bit d3 in each byte of color
memory determines whether the corresponding screen-memory location is multicolor
or not.
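For example, with screen memory at its power-up offset of 0400 hex, a yellow letter C could be placed at character row 10, column 20 like this (a sketch; the position offset is 10 x 40 + 20 = 420, or 1A4 hex):

```
        LDA #$03            ; screen code 3: the letter C
        STA $05A4           ; screen memory 0400 hex + 1A4 hex
        LDA #$07            ; color code 7: yellow foreground
        STA $D9A4           ; color memory D800 hex + 1A4 hex
```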
[Table 6.7: Color source for each character bit pattern, by submode.]
Bit-Map Memory. In the bit-map mode the shape of the screen image is
defined by the bit-map memory. In the character mode the screen image was defined
by screen memory, which contained a pointer into the character set for each character
position on the screen. The bit-map mode has no character sets, because it places the
eight-byte character objects directly into the screen-defining data structure in place of
the pointers. It follows that the resulting structure is eight times larger than screen
memory, or 8K bytes long. All object bits and therefore image pixels are directly ac-
cessible in this data structure, which is why it is called "bit-map memory," or "the bit
map" for short.
Screen-memory bits d7 d6    Color source
              0  0          Background 0 (D021 hex)
              0  1          Background 1 (D022 hex)
              1  0          Background 2 (D023 hex)
              1  1          Background 3 (D024 hex)
[Figure 6.6: Bit-map memory locations ordered by character position — eight consecutive bytes define each character position, left to right and then row by row, so bytes 0-7 fill the first position on the first character row, bytes 8-15 the next position, and bytes 7680-7999 fill the last character row.]

The bit-map byte location for any pixel line is calculated as

bit-map byte offset = (character row x 320) + (character column x 8) + pixel line
where character row ranges from 0 to 24, character column from 0 to 39, and pixel
line from 0 to 7.
The bit location for a particular pixel is the same as the pixel position on the
pixel line; that is, the leftmost pixel on the pixel line is represented by the leftmost bit
(d7) of the byte, and so on. To ensure that you know how to use the equation, try
calculating the bit-map location for a pixel row in any character shown in Fig. 6.6.
This method of calculating data structure position is particularly useful when
the screen is masked into a combination of bit-map and character mode sections. Us-
ing the equation above, the entire screen can be designed and handled in terms of
character position.
A second method of correlating data structure and pixel position is in terms of
pixels. The screen is divided into 320 pixel columns and 200 pixel rows. The byte and
bit location for the pixel at any X, Y column and row position is calculated with an
equation derived from the previous character-based equation. This equation uses two
arithmetic operations called DIV and MOD to translate pixel row and pixel column
values into character row, character column, and pixel line values. DIV produces the
quotient of a division operation while discarding the remainder. MOD produces the
remainder of a division and discards the quotient. So DIV(13/8) produces the value
1, and MOD(13/8) produces 5. Both DIV and MOD can be programmed from
variations on the division algorithm of Chapter 3. The adaptation is simple and is left
to you as an exercise.
The second form of the bit-map equation is

bit-map byte offset = (DIV(pixel row/8) x 320) + (DIV(pixel column/8) x 8) + MOD(pixel row/8)
bit position within byte = 7 - MOD(pixel column/8)
So the bit position for a pixel in the tenth pixel column is 7 - MOD(10/8), or 5,
which represents bit d5 in the bit-map byte location.
The DIV and MOD operations in these equations are applied to divisions by 8.
Both operations can be performed at once by shifting the variable quantity, either
pixel row or pixel column, three bit positions right and catching the 3 bits that fall
off. The quantity remaining from the original variable is the DIV result and the three
dropped bits form the MOD result.
Example:
Illustrate DIV and MOD using bit shifting on the problem 13/8.
The initial contents of the Store and Catch bytes are

Store: 0000 1101 (13 decimal)    Catch: 0000 0000

The contents of Store and Catch after the bit shift are

Store: 0000 0001 (1 decimal)     Catch: 0000 0101 (5 decimal)
In the example the DIV result of 1 is in the Store, and the MOD result of 5 can
be read from the Catch. In practice, it is faster to make two copies of the variable,
shift one copy right 3 bits to obtain the DIV, and isolate the lowest three bits of the
other copy with an AND 0000 0111 operation to obtain the MOD.
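The faster two-copy method looks like this in assembly (a sketch; PIXROW, DIV8, and MOD8 are hypothetical memory locations):

```
        LDA PIXROW          ; first copy of the variable
        LSR A               ; shift right three bit positions...
        LSR A
        LSR A
        STA DIV8            ; ...leaving DIV(PIXROW/8)
        LDA PIXROW          ; second copy of the variable
        AND #%00000111      ; isolate the low three bits,
        STA MOD8            ; leaving MOD(PIXROW/8)
```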
The pixel-position equations treat the screen as a single large object. Individual
pixels can be manipulated according to their X, Y position on the screen, without
regard for character position. This view corresponds to the normal usage of the bit-
map mode and is convenient for that purpose. It also facilitates graphic applications
that plot pixels by their Cartesian (X, Y) or other coordinate system location.
As with the screen memory and character mode, bit-map memory can be placed
in more than one position within VIC's address bank. Bit d3 of the bank addressing
register at D018 hex selects this position. The two possible locations are shown in
Table 6.9.
There are two submodes in the bit-map mode, normal and multicolor, which
correlate single or paired bits in each character object to single or paired pixels in each
character position on the screen. The method of bit/pixel correlation for both sub-
modes is described in the internal structures subsection of the section "Screen as
Film."
With image shape fully described by the bit map, color is the only remaining im-
age quality to be defined. In the bit-map mode, this definition is provided by screen
and color memory.
Screen and Color Memory. Screen and color memory are organized the same
way as in the character mode: each section's 1000 locations correspond to the 1000
character positions on the video screen in a left-to-right, then row-by-row manner. So
one screen-memory byte and one color-memory nibble are dedicated to each eight-
byte object in the bit map.
d3    Bit-map offset
0     0000 hex
1     2000 hex
[Table 6.10: Color source for each object bit pattern, by bit-map submode.]
Like the color-memory nibble, in the bit-map mode each screen-memory nibble
assigns a color value to a particular bit-pattern value in the eight corresponding ob-
ject bytes. VIC's background-color register 0 is also used as a color source in one sub-
mode. The color source for each object bit pattern in each submode is shown in Table
6.10.
Note the extra color flexibility of the bit map's multicolor submode over that
of the character mode. By using screen memory instead of background registers for
two of the color values, three colors are independently selectable for each character
position instead of just one. Because there are no character pointers in the bit-map
mode, there is no extended background submode.
The bit-map mode is selected by placing a 1 in the Mode Select bit (d5) of VIC's
control register A, at D011 hex. As in the character mode, the normal or multicolor
submodes are selected by placing a 0 or a 1, respectively, in the Submode Select bit
(d4) of VIC's control register B, at D016 hex.
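The two selections together might look like this (a sketch):

```
        LDA $D011           ; control register A
        ORA #$20            ; set d5: bit-map mode on
        STA $D011
        LDA $D016           ; control register B
        ORA #$10            ; set d4: multicolor submode on
        STA $D016
```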
Sprite Mode. In the last graphics mode, small and mobile objects traverse the
background provided by either of the other modes. These objects are called sprites,
for the fairy creatures of magical powers and benevolent though impish intentions.
Each sprite is 24 pixels wide by 21 pixels tall. Thus there are three bytes per row
and 21 rows in a sprite, requiring 63 bytes in the object data structure that assigns
numerical values to pixels. Object bytes correspond to image pixels in the same way
as in the character mode. This structure is illustrated in Fig. 6.7. Data structure bytes
are numbered within the pixel borders.
Bits are assigned to pixels in the usual left-to-right manner. Normal and
multicolor submodes exist and follow the same bit and pixel grouping rules as in the
character and bit-map modes (see the discussion in the character mode section). The
color assignments for both submodes will be shown later.
Up to eight sprites can be displayed at once on an unmasked screen. Their data
structures can be located anywhere in the 16K bank and need not even be grouped
together because the location of each structure is independently defined. VIC
assumes that eight one-byte pointers, one for each sprite, are located in the last eight
bytes of the bank's 1K screen memory section. That is, if screen memory is located at
0400 hex, screen memory fills the first 1000 bytes of the 1024 in the section. This takes
up bytes 0400 through 07E7. The sprite pointers fill the last eight bytes of the 1024,
from 07F8 to 07FF.

[Figure 6.7: Sprite structure — 24 pixel columns by 21 pixel rows. The 63 data bytes fill the sprite three per row, left to right and top to bottom: bytes 0, 1, and 2 define pixel row 0, bytes 3, 4, and 5 define row 1, and so on through bytes 60, 61, and 62 on row 20.]
The pointers are one byte long to allow dividing the 16K bank into 256 64-byte
segments. Each segment just holds a 63-byte sprite data structure, with a leftover byte
at the end. The actual data offset is the product of the pointer value and the number
64:
sprite-data offset into bank = pointer x 64
So a pointer value of 0 tells VIC that the data structure is at the beginning of the 16K
bank, and so on.
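For example, with screen memory at its power-up location of 0400 hex, the eight pointers occupy 07F8 through 07FF hex, and sprite 0 might be pointed at data stored 0340 hex into the bank like this (a sketch; 0D hex x 64 = 0340 hex):

```
        LDA #$0D            ; pointer value: data at offset 0340 hex
        STA $07F8           ; sprite 0's pointer byte
```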
Although only eight sprites can be displayed on an undivided screen, by using
masking additional sprites can be displayed. For instance, multiples of eight sprites
can be displayed on the same background image by changing the sprite pointer values
on the interrupts for multiple screen sections. Of course, sprite data must exist for
each different pointer value.
Fewer than eight sprites can be displayed in a screen section or on an unmasked
screen by setting bits in VIC's sprite enable register at D015 hex. Each of the eight
sprites has its own enable bit. To turn on sprite n, set the nth bit in D015 to 1. So, to
turn on sprite 7, set bit d7 to 1. To turn off a sprite, reset its bit to 0.
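For instance, sprites 0 and 7 could be turned on together, without disturbing the other enable bits, like this (a sketch):

```
        LDA $D015           ; sprite enable register
        ORA #$81            ; set bits d7 and d0
        STA $D015           ; sprites 7 and 0 are now displayed
```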
Each sprite image can be doubled in size vertically, horizontally, or both by
setting bits in VIC's sprite expansion registers. To expand sprite n horizontally, set
the nth bit in VIC's horizontal expansion register at D01D hex to 1. That is, set bit d0
for sprite 0, bit d1 for sprite 1, and so on. Reset the bit to 0 to return the sprite to nor-
mal size. To expand a sprite vertically, set the corresponding bit in VIC's vertical ex-
pansion register at D017 hex to 1. Again, the bit positions correspond to the sprite
numbers.
A sprite's screen position is the X, Y location of its upper-left corner. Each
sprite image can be placed at any X, Y coordinate on the visible screen. Additionally,
sprite images can be hidden under any part of the screen border. To accommodate all
these positions, the available number of horizontal and vertical pixel positions has
been expanded from 320 and 200 to 512 and 256.
The new position ranges include 192 hidden positions horizontally and 56 ver-
tically. The visible area is skewed within the overall ranges, so the hidden border posi-
tions on each side of the viewing screen are of different sizes. However, the position
values wrap around from the highest to the lowest numbers, connecting the two
borders on either side of the viewing screen. By taking advantage of this, a program
can move a sprite smoothly onto or off the visible screen in any direction. For in-
stance, a sprite that is carried off the right side of the screen by incrementing the
X-position value will reenter the screen from the left as the X-position value is in-
cremented through 512 to 0 and on up again.
The coordinates for hidden, partially visible, and visible sprites of both sizes on
viewing screens of both sizes are given in Table 6.11. Where a range specifies a larger
position value followed by a smaller (e.g., 250-29), a wrap-around from the highest
to lowest values is included.
TABLE 6.11  SPRITE POSITION RANGES

320 x 200 screen           X position               Y position
  Standard sprite          Hidden: 344-0            Hidden: 250-29
                           Left partial: 1-23       Top partial: 30-49
                           Visible: 24-320          Visible: 50-229
                           Right partial: 321-343   Bottom partial: 230-249
  Enlarged sprite          Hidden: 344-488          Hidden: 250-8
                           Left partial: 489-23     Top partial: 9-49
                           Visible: 24-296          Visible: 50-208
                           Right partial: 297-343   Bottom partial: 209-249

304 x 192 screen
  Standard sprite          Hidden: 335-7            Hidden: 246-33
                           Left partial: 8-30       Top partial: 34-53
                           Visible: 31-311          Visible: 54-225
                           Right partial: 312-334   Bottom partial: 226-245
  Enlarged sprite          Hidden: 335-480          Hidden: 246-12
                           Left partial: 481-30     Top partial: 13-53
                           Visible: 31-287          Visible: 54-204
                           Right partial: 288-334   Bottom partial: 205-245
The limiting coordinates 512 and 256 were chosen for practical reasons. There
are 256 positions maximum that can be represented with eight bits, and 512 maximum
with nine bits. VIC supplies eight bits for specifying the vertical position and nine bits
for specifying the horizontal position of each sprite. These values are stored in the
sprite position registers, from D000 to D010 hex. The first 16 locations are paired into
eight combinations of horizontal and vertical coordinates for the eight sprites, starting
with sprite 0 at location D000 hex. Each horizontal coordinate register omits the
ninth or top bit, which is placed in the seventeenth register at D010 in the nth bit position
for the nth sprite. That is, bit d0 of D010 holds the most-significant bit of sprite
0's horizontal coordinate. The sprite position registers are represented in Table 6.12.
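Because the ninth X bit lives in a shared register, positioning a sprite past X = 255 takes two register writes. The split can be sketched like so (Python for illustration of the bit arithmetic only):

```python
def sprite_x_registers(n, x, msb_value):
    """Split a 9-bit X coordinate for sprite n into the value for
    its low X register and an updated value for the shared
    most-significant-bit register."""
    low = x & 0xFF                      # low eight bits
    if x > 255:
        msb_value |= (1 << n)           # sprite n's ninth bit on
    else:
        msb_value &= ~(1 << n) & 0xFF   # ninth bit off
    return low, msb_value

# Placing sprite 0 at X = 300: the low register gets 44, and
# bit d0 of the shared register is set.
print(sprite_x_registers(0, 300, 0))    # (44, 1)
```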
Colors are assigned to sprite bit values according to the submode. Eight nibble-
wide sprite color registers, from D027 through D02E hex, store a dedicated color
value for each sprite.
The only unusual thing about color and sprites is that in both submodes one bit
pattern allows the underlying image to show through. In effect, the corresponding
sprite pixels become transparent. The color assignments for both submodes are given
in Table 6.13.
A final characteristic of sprites is that they interact with each other and with
the underlying image. This characteristic has two consequences. First, sprites have
priorities for line-of-sight order. This facility simulates three dimensions. A sprite of
a lower number will always cover a sprite of a higher number (e.g., if sprite 2 and
sprite 7 cross, sprite 2 will show and only those parts of sprite 7 under sprite 2's
transparent pixels will be visible). Each sprite can also be made to pass over or behind
the underlying mode image. By resetting the nth bit of VIC's display priority register
at D01B hex to 0, the nontransparent pixels of sprite n will cover the image from the
underlying mode. By setting the same bit to 1, the sprite's nontransparent pixels will
show only above underlying image areas defined by background color register 0, or
defined by multicolor bit pairs assigned the value 01.
Second, sprites can collide with each other and with the underlying image. All
0-bits and 00 and 01 bit pairs in an object of any mode are considered to be
background areas. All other bit or bit-pair combinations are considered to be
foreground areas. A collision is registered only when a foreground bit or bit pair of a
sprite crosses over a foreground bit or bit pair of another sprite or of the underlying
image.
A collision has three effects. First, VIC indicates which sprites have collided by
setting 1-bits in one of two collision registers. The latter registers correspond to the
two types of collisions; there is a sprite/sprite register at D01E for sprite-to-sprite
collisions and a sprite/image register at D01F for sprite-to-image collisions. In a now-
familiar pattern, the nth bit of each register represents sprite n.
The second effect of collisions is that an IRQ interrupt is generated. The third
effect is that VIC indicates the cause of the interrupt by setting one of two bits in the
IRQ flags register at D019. Bit d1 indicates a sprite-to-image collision interrupt, and
bit d2 indicates a sprite-to-sprite interrupt.
As before, the IRQ flags register must be cleared after the interrupt, which we
have seen done by reading the register and writing back its contents. The set bits in the
collision registers are cleared automatically when the registers are read from.
Sprite-to-sprite collisions can always occur off-screen. Sprite-to-image collisions
can occur off-screen in the horizontal direction if the smaller screen size is
selected and the image has been scrolled under the border.
The sprite mode is always selected. Whether or not sprites are displayed
depends on whether sprites have been enabled, placed in the visible section of the
screen, and given nontransparent colors.
Each sprite's submode is independent. Thus any mixture of normal and
multicolor sprites is legal. Sprite submode is selected by altering the appropriate bit in
VIC's sprite submode register at D01C hex. A 1 in the nth bit of D01C places sprite n
in the multicolor submode. A 0 places the sprite in the normal submode.
Editing the take. VIC's last cameralike function is to handle transitions be-
tween scenes. Transitions are uncomplicated but dramatic effects whose potential is
often ignored in computer graphics. Most other camera functions are static, but tran-
sitions are dynamic and create a visual rhythm.
Two types of edits are important to this discussion: cuts and wipes. Cuts are
abrupt transitions between scenes. The same effect can be achieved in a graphics
mode by switching VIC between 16K banks. In character mode a single 16K bank has
room for multiple screen-memory sections, which can be exchanged for the same
effect.
Wipes are transitions where a sharp border between the old and new images
covers the screen in a pattern, usually until the new image has completely replaced the
old. This is not an "in-camera" technique in movies, but it is an "in-VIC" technique
in graphics, which is why it is included here.
A simple wipe would begin with using the mask technique to divide the screen
into two sections. The first raster comparison value is at the top of the visible screen.
On each interrupt the raster comparison value is incremented by one character line,
which is equivalent to eight raster lines. At 60 frames per second, the 25 character-
position changes will produce a 1/2-second wipe.
More ornate wipes are elaborations on the simple wipe. Combining the tech-
niques in this chapter can produce many other novel effects. However, all these tech-
niques can be used more effectively in the larger context of sight and sound. Chapter
7 explores the sound capabilities of the C64.
Exercise:
Review the major points of this chapter with the following questions.
(a) Name the C64 graphics modes and describe their similarities and differences.
(b) Name the C64 submodes and describe how each encodes the pixel bits within visual
objects.
FOR FURTHER STUDY

C64 graphics circuitry: The Commodore 64 Programmer's Reference Guide (Howard W. Sams & Co. and Commodore Business Machines Inc.)
Mathematically generated natural graphics: The Fractal Geometry of Nature (W. H. Freeman and Company)
VOCAL CHORDS FOR A CHIMERA: SOUND SYNTHESIS
Like the chimera, the Commodore 64 is a being of varied parts. Instead of owing its
lineage to the goat, lion, and dragon, however, the C64 counts among its relatives
the simple computer, the graphics generator, the I/O handler, and the sound
synthesizer. Of these, the C64's sound synthesizer element has the most distinguished
pedigree.
For chimeras or computers an expressive voice is an emblem of power. The
C64's dedicated audio chip, SID, for Sound Interface Device, provides the C64 with
three of the most expressive voices in any personal computer. To control these
voices, the programmer must first understand the nature of sound. He or she must
then abandon the unfortunate but pervasive emphasis on specialization and take on
the gestalt of the composer. Sound is a medium for communication, and the pro-
grammer can judge the sound only by its effectiveness at communicating rationally
and emotionally. Any such communication can and should be musical in the
broadest sense; it should be aesthetically controlled even if it is not recognizably
melodic, harmonic, or rhythmic.
In this chapter we begin by studying the nature of sound. This will prepare you
to use SID in a more purposeful way from your programs. We will then examine the
programming requirements for utilizing SID's sound capabilities.
WAVE ATTRIBUTES
The study of the nature of sound is called acoustics. Acoustics is a broad field with
many applications and principles beyond the scope of this discussion. We consider
the most basic concepts, and will be repaid with ample technique for creating new
sounds.
Sound occurs when an object moves in a fluid such as air (we will ignore sound
in solids). This movement presses adjoining molecules in the fluid together, raising
their pressure above their surroundings. The compressed molecules then expand into
the lower-pressure areas around them, compressing the surrounding molecules,
which then expand, and so on, in an increasing radius around the moving object.
This mechanism for transmitting energy is called a compression wave, for obvious
reasons. If there are no impediments to the wave, the farthest compressed molecules
will always form a sphere.
Waves have four major attributes: intensity, frequency, waveform, and phase.
In human terms the first two attributes are called loudness and pitch. The third at-
tribute is often called timbre or tone color, and can be derived from the first two.
The fourth attribute is recognized by the human ear only under special and usually
artificial circumstances; hence it has no common name. The modification of these
attributes is called modulation. We will discuss the four attributes individually, and
then explore the techniques provided by SID for modulating them.
Intensity
Sound waves carry energy outward from a moving object. To define how much
energy is in a sound, we must place certain limits on our measurement. If we
measure the energy produced by a regularly moving object over an indefinite time,
the resulting value will increase without limit. For a more meaningful measurement
we can imagine a sphere centered closely around a cyclically moving object, and
measure all the energy passing through the sphere in 1 second. This measurement
reveals the total energy produced per unit time by the object. Energy per unit time is
called power.
If we measure the power through a single unit-area piece of the sphere, the
resulting value is the power per unit area, or intensity, of the wave in that direction.
The intensity measurement accounts for the inevitable differences in the amount of
sound energy passed in different directions, and for the energy in the small wave
area entering the ear.
The range of audible intensities is enormous. The loudest-bearable sound car-
ries approximately 1 trillion times more energy per unit time per unit area than the
softest detectable sound. To accommodate this range, the ear collapses perceived
sound intensity into a smaller, logarithmic loudness scale. A single unit on this scale
is called the decibel, abbreviated dB.
Only 160 decibels separate the quietest audible sound from an instantly
damaging sound. The formula for calculating the decibel loudness from the actual
sound intensity I and the softest audible sound intensity I0 is

loudness in dB = 10 x log10(I / I0)
This scale implies, for instance, that perceived loudness doubles with a squar-
ing of the energy intensity ratio, rather than with a simple doubling in energy inten-
sity. Loudness levels for some familiar sounds are listed in Table 7.1.
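The relationship can be checked numerically. A short sketch (Python; the trillion-fold and 10^16 intensity ratios come from the figures above):

```python
import math

def loudness_db(intensity_ratio):
    """Decibel loudness of a sound whose intensity is
    intensity_ratio times the softest audible intensity I0."""
    return 10 * math.log10(intensity_ratio)

print(loudness_db(1e12))   # 120.0 : the loudest-bearable sound
print(loudness_db(1e16))   # 160.0 : instant ear damage
print(loudness_db(2))      # ~3.01 dB for a doubling of intensity
print(loudness_db(4))      # ~6.02 : squaring the ratio doubles the dB
```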
The loudness span for SID's output is 48 dB, from no output to full power.
However, the TV or stereo set receiving this output amplifies (i.e., multiplies) it by
some factor before dispersing the sound. The loudness range heard from a
loudspeaker can therefore be either expanded or contracted from SID's 48 dB.
Frequency
For an object vibrating with a regular, repeated pattern, a given point in the fluid
will be cyclically compressed and expanded. The time between compressions in this
cycle is called the period. The number of periods or cycles per second is called the
frequency. Representing the period with the variable T and the frequency with the
variable f, let us express this relationship as

f = 1/T
Cycles per second have units of hertz or simply Hz, after the German physicist
Heinrich Rudolf Hertz who researched waves in the late 1880s. 1000 Hz are
represented as 1 kilohertz or 1 kHz.
Assuming that the speed of sound is the same for all audible frequencies in air
(an assumption that is nearly true), the period of the wave will be proportional to the
distance between compression points on a line outward from the sound source. The
shorter this distance is at the constant speed of sound, the shorter the wave's period
and the higher its frequency will be.
Air behaves like a spring. In compression, the molecules resist with increasing
force as they are pressed closer. In expansion, the surrounding molecules press in-
ward and attempt to return the expanded region to the surrounding or ambient
pressure. Thus ambient air corresponds to a spring in its neutral, undisturbed
position. We can show the motion of this spring as it is compressed and expanded by
drawing a compression line beside it, as shown in Fig. 7.1.

TABLE 7.1  LOUDNESS LEVELS OF FAMILIAR SOUNDS

Sound                   Loudness (dB)
Pin dropping                  0
Leaves rustling              10
Quiet room                   20
Unoccupied kitchen           40
Normal conversation          60
City street                  70
Garbage disposal             90
Jet 1/3 mile away           110
Jackhammer                  120
Loud rock band/pain         140
Immediate ear damage        160

[Figure 7.1: a spring beside a compression line, marking higher pressure, the ambient at-rest position, and lower pressure]
Since the compression line is vertical, we will call the position on the compres-
sion line y (as in an x,y axis). The ambient pressure point can be defined as y = 0,
with the highest pressure as y = 1 and the lowest pressure as y = - 1.
The solution for the spring position y as it oscillates in time t is

y = sin(2πft)

Since the spring represents air, this is also the solution for the pressure at a
physical point due to the compression wave from the simple cyclical motion of an
object in air. Recall that f is the frequency of the disturbance from the moving
object that causes the sound.
The sin(2πft) function is called the sine function and is one of the most basic
relationships in nature. We can graph this variance of pressure with time at a point
in space. The axes are built by moving the compression line to the left, and by
creating a horizontal time line through the ambient-pressure point. When plotted
this function appears as shown in Fig. 7.2.
Because air behaves like a spring, it can carry sound energy only as sine waves.
Therefore, all sounds from the simplest to the most complex must consist of some
[Figure 7.2: graph of three complete sine cycles starting at the ambient point; the vertical axis runs from compressed to expanded, the horizontal axis is time]
combination of basic sine waves of different frequencies and phases (a wave at-
tribute that will be discussed shortly).
The simplest sound consists of a single sine-wave frequency. Assuming that the
frequency is audible to the human ear, a simple sine wave has the clearest pitch
possible. Indeed, listening to a pure sine wave quickly becomes monotonous and
even irritating. We call the pure sine wave the fundamental frequency or pitch.
The most complex sound consists of an infinite number of sine waves covering
the entire range of audible frequencies. This sound, called white noise, resembles
rushing wind.
Between the two extremes are finite combinations of audible sine waves.
Nature favors certain frequency relationships because of the way objects vibrate. A
vibrating string provides a good illustration of how objects vibrate.
Consider a string that is attached at both ends to unmoving objects (Fig. 7.3).
When the string is plucked, its vibration divides it into evenly spaced alternating
points of maximum vibration and stillness, with still points at either end. There will
always be an integer number of maximum-vibration points, which are called nodes.
The vibration patterns for the first several integer number of nodes are shown in Fig.
7.4. In each case, for m nodes there are m + 1 still points (including the endpoints).
The still points between nodes act like endpoints; that is, m nodes divide the string
into m substrings, each 1/m times the length of the full string. Each substring has a
wavelength of 1/m of the string wavelength, which according to our earlier discussion
yields a substring frequency of m times that of the full string.
When a string is plucked, it vibrates with all numbers of nodes from 1 to 20 or
more, as if there were 20 or more separate strings each vibrating with just one node
pattern and associated frequency. The one-node pattern has the lowest frequency, of
course, and is called the fundamental, as in the frequency of the basic sine wave.
This is the frequency we call the pitch of the sound. The two-node frequency is twice
that of the fundamental, and so on. The frequencies for m greater than or equal to 2
are called harmonics. Harmonics are almost always present in natural sounds, and
with training can often be heard as separate pitches.
The human ear is at best sensitive to frequencies from 15 to 20,000 Hz. It
perceives pitch change at the rate of the base 2 exponent of frequency change. So a
frequency change of 2^1 (i.e., a doubling of frequency) results in a perceived pitch
change of 1, called an octave. The octave is the basic interval or separation between
pitches. Two pitches an octave apart seem to have the same pitch quality at different
heights. Since the number 15 can be doubled only about 10 times in the range of
human hearing, the ear has a range of about 10 octaves.
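The ten-octave estimate follows from taking base-2 logarithms, which can be verified in a couple of lines (a Python sketch):

```python
import math

def octaves_between(f_low, f_high):
    """Perceived pitch distance, in octaves, between two frequencies."""
    return math.log2(f_high / f_low)

print(octaves_between(440, 880))    # 1.0 : a doubling is one octave
print(octaves_between(15, 20000))   # ~10.4 octaves of human hearing
```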
[Figure 7.3: a string attached at both ends to unmoving objects]
[Figure 7.4: string vibration patterns for 1, 2, and 3 nodes, with still points at the objects anchoring each end]
ceeding up the harmonic series from the fundamental, and making similar ratios for
other harmonics, yields the just scale.
This method divides the octave unevenly. When chords are built of the octave,
third, and fifth intervals, the most natural and consonant harmony possible results.
Because of this the fundamental has been called the key of the scale. When chords
are built of the other intervals, the impression of various degrees of mistuning
results. The just scale is appropriate only for music of simple chord structure, like
that of the classical or earlier periods of music, or of much popular music.
To enable an instrument of fixed pitch, such as the harpsichord, to play chords
from all keys equally well, the tempered scale was devised. It divides the octave into
12 evenly spaced pitches. Thus the frequency ratio of any two adjoining semitones
equals 2^(1/12), or approximately 1.0594. The tempered scale allows free mixing of
chords and keys with uniform and only minor mistuning.
The tempered scale has been the dominant scale in Western music for nearly
three centuries, and is so familiar that its mistuning is usually unnoticed. Its reduc-
tion of the just scale's 20 pitches to 12 is accomplished by collapsing the eight closest
pitch pairs (usually adjoining sharps and flats) into eight single pitches.
The current frequency standard for music fixes the A pitch in the fourth piano
octave to 440 Hz. The other A pitches differ by multiples or factors of 2 from A440.
The frequencies of all A pitches in the piano keyboard range are listed in Table 7.3.
The frequency range of the human ear is much wider, from about 15 to 20,000
Hz for an unimpaired young person. This range narrows with age. SID produces all
TABLE 7.3  FREQUENCIES OF THE A PITCHES

Pitch   Frequency (Hz)
A0          27.5
A1          55
A2         110
A3         220
A4         440
A5         880
A6        1760
A7        3520
The waveform attribute can now be described in terms of sound intensity and
harmonic composition.
Waveform
For any cyclical sound the graph or function of the pressure variation with time is
called the waveform. Waveforms can take on an infinite variety of patterns other
than the sine curve.
In the preceding section we stated that every possible sound consists of some
combination of sine waves having individual intensities. It can further be shown that
every cyclical sound consists of a combination of a fundamental sine pitch and some
subset of its harmonics, with all frequencies having individual intensities.
The frequency of a waveform is the same as that of its fundamental. Combin-
ing sine waves of different frequencies means adding together their individual
pressure offsets from the ambient pressure. At times these offsets will cancel, leav-
ing ambient pressure at the measuring point. At other times they will reinforce each
other for a larger pressure offset than would result from any individual sine compo-
nent.
A few waveforms are especially prevalent. These include the square wave, the
triangle wave, and the ramp wave.
Graphing the square wave produces the result shown in Fig. 7.5. The square
wave contains the fundamental pitch and the odd harmonics. An odd harmonic is
one whose frequency is an odd multiple of the fundamental. In terms of string
harmonics, the square wave contains just those frequencies for which the number of
nodes m is odd. Further, the intensity of each harmonic equals 1/m. Thus the full
square-wave definition is

P = 1/1 x sin(2πft) + 1/3 x sin(2π 3ft) + 1/5 x sin(2π 5ft) + ...
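The definition can be sampled directly by summing its sine components. This sketch (Python; the 50-harmonic cutoff and the sample point are arbitrary choices) shows the sum settling near the flat top of an ideal square wave:

```python
import math

def square_wave(f, t, harmonics=50):
    """Pressure at time t of a square wave with fundamental f,
    built from odd harmonics m at intensity 1/m."""
    p = 0.0
    for m in range(1, 2 * harmonics, 2):    # m = 1, 3, 5, ...
        p += (1 / m) * math.sin(2 * math.pi * m * f * t)
    return p

# At the middle of the positive half-cycle the sum approaches
# pi/4 (about 0.785), the flat level in this scaling.
print(square_wave(1.0, 0.25))
```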
Another waveform, the triangle wave, also contains only the fundamental
pitch and the odd harmonics. However, the relative intensity of the harmonics and
[Figure 7.5: square wave, alternating abruptly between the compressed and expanded levels over time]
[Figure 7.6: triangle wave, moving linearly between the compressed and expanded levels over time]
even their phase relationships are different than in the square wave (phase will be
discussed in the next section). The triangle wave is shown in Fig. 7.6.
For practical purposes, the most useful thing to remember about the triangle
wave is that the intensities of its harmonics fall off much more rapidly than do those
of the square wave. Instead of a 1/m degradation, intensity decreases proportionately
to 1/m². This makes the fundamental pitch more dominant over the harmonics
in the triangle wave.
The ramp wave, alone of these three waveforms, contains the fundamental
and all harmonics (Fig. 7.7). The ramp wave contains the same harmonics in the
same intensities as the square wave, adding to them the even harmonics, also with
1/m intensities. Thus the ramp-wave definition is

P = 1/1 x sin(2πft) + 1/2 x sin(2π 2ft) + 1/3 x sin(2π 3ft) + ...
Any of these waveforms can be represented by a spectrum plot, a graph of the
harmonics and harmonic intensities present in a sound. The spectrum plot for a
1000-Hz ramp wave, with harmonic intensity scaled from 0 to 1.0, is shown in Fig.
7.8. Spectrum plots for the other two waveforms can be made from the information
on their harmonic frequencies and intensities already given in this section.
[Figure 7.7: ramp wave, rising linearly and dropping abruptly between the expanded and compressed levels over time]

[Figure 7.8: spectrum plot of a 1000-Hz ramp wave; harmonic intensity (0 to 1.0) plotted against frequency (1 to 11 kHz)]

Interestingly, the ear does not hear a waveform as a single sound; instead, it
analyzes the waveform and recreates the harmonic content of the wave. With training,
listeners can identify harmonics in the waveform as individual pitches!
Nevertheless, the combined effect of the harmonics in their relative intensities
is to color the fundamental. Musicians call this the tone quality, tone color, or tim-
bre of the fundamental pitch. Removing a single harmonic from a waveform usually
will not change the timbre of the sound appreciably, because timbre is a function of
the relationships between all the harmonics. The overall shape of these relationships
is hardly affected by changes to any one harmonic.
Waveform is the attribute that distinguishes the sound of one instrument from
another when both are heard sustaining long tones. It can also distinguish the sus-
tained tone of a new and artificial instrument of your creation from that of all
natural instruments.
Phase
Phase is the measure of the time separation between the same point in the cycles of a
subject wave and a simultaneous reference wave. A wave cycle can be divided into
360 degrees or 2π radians, like a circle, and the difference between the same point in
the two waves is usually expressed in these units. For instance, if the maximum point
of the subject wave falls 1/4 cycle after the reference wave maximum, the subject
wave is said to trail the reference wave by 90 degrees. This is shown in Fig. 7.9 for
two sine waves. The subject wave is dashed and the reference wave is solid.
[Figure 7.9: two sine waves 90 degrees apart; the subject wave dashed, the reference wave solid]
As mentioned in the preceding section, the ear breaks sounds into harmonic
frequencies with relative intensities. Its emphasis on individual frequencies and in-
tensities leaves it nearly insensitive to the relative phases of the harmonics. Thus the
phase attribute can usually be ignored in sound synthesis.
WAVE MODULATION
To use sound to communicate requires changing the values of one or more of the
four sound wave attributes with time. The changing of a sound attribute is called
modulation. SID's sound-molding abilities, and the detectable sound qualities of
natural instruments as well, can be explained as modulations of the first three wave
attributes: intensity, frequency, and waveform. In this section we discuss the
registers and capabilities available with SID for modulating sounds. The last major
section of this chapter, on sound programming, explains how these capabilities are
combined and used from programs to generate the sounds you want.
With natural instruments modulation is under the control of a player. Since
human beings are imperfect, a player's intentional modulations of intensity, fre-
quency, and to a lesser extent waveform, are departed from slightly at random inter-
vals. This effect is due to entropy, the natural law of disorder discussed in Chapter
2. The most realistic simulation of natural sounds will also subtly modify the smooth
sweep of modulation at random, relatively frequent intervals.
Sound is usually modulated to obtain a subjective effect. Therefore, this sec-
tion will separate SID's features by the subjective form of the wave attribute they
modify. In the order we discuss them, these subjective attributes are loudness, pitch,
and timbre.
Loudness
SID provides three types of loudness control: overall, voice, and harmonic. In
overall loudness control, a program writes a nibble into bits d0 through d3 of SID
location D418 to change the output volume of all three voices simultaneously. These
four bits allow for 16 settings, from 0 to 15. 0 selects silence, 15 selects full volume,
and the 16 values smoothly divide the 48-dB range into approximately 3-dB
increments (again, the actual loudness range and division depend on the TV or stereo
amplification applied to SID's signal). From the equation relating intensity to
loudness, each 3-dB increase corresponds to a doubling in intensity.
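Since only bits d0 through d3 of D418 hold the volume, a program must preserve the upper nibble (which carries other SID controls, including the voice 3 on/off bit discussed later) when changing it. The masking can be sketched as (Python for illustration of the bit arithmetic):

```python
def with_volume(d418_value, volume):
    """New value for SID register D418 with the volume nibble
    (bits d0-d3) replaced and the upper nibble preserved."""
    assert 0 <= volume <= 15
    return (d418_value & 0xF0) | volume

print(hex(with_volume(0x80, 15)))   # 0x8f : full volume, d7 kept
print(hex(with_volume(0x8f, 0)))    # 0x80 : silence, d7 kept
```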
In voice loudness control, the volume of a voice is changed over time, within
the limits set by the overall volume control, using one of three envelope generators.
Each voice has been permanently assigned its own envelope generator.
The envelope generators mold a sound in four stages, which are based on the
characteristics of natural sounds. The volume of a simplified natural sound changes
through four stages: attack, decay, sustain, release. Together, these stages are
sometimes called the ADSR cycle. In the attack stage, the application of energy to a
physical object causes it to begin vibrating and emitting sound. Typically, this initial
energy pushes the sound intensity to a maximum level in a few milliseconds.
Then as the effect of the initial energy application fades and the object settles
into a vibrating pattern, the sound intensity falls to an intermediate level. This rapid
falling is called the decay stage, and it also typically lasts for a few milliseconds.
The sustain stage is the time in which the object is vibrating in its normal man-
ner, with little falling off in volume. Finally, in the release stage the object's inherent
resistance to vibration overcomes the decreasing stored energy of vibration, and the
object's movement and sound die out. The four stages are graphed as an envelope in
Fig. 7.10.
If a program were given complete control over voice volume, simulating this
envelope would require a lot of code, including timing loops for each volume level in
each stage. By sacrificing an entirely general volume capability, SID has provided
the most natural volume pattern possible with minimal program and CPU work.
SID allows much flexibility in how long each stage of the envelope is played.
Times approaching 1 second for the first two phases are possible, and although
distinctly unnatural, they provide unusual creative possibilities. For instance, a
sound that rises slowly and ends abruptly resembles the backward replay of a tape
recording of a natural instrument.
Seven SID registers have been dedicated to each voice for controlling the
voice's independent sound characteristics. Voice 1 registers extend from D400 to
D406, with voice 2 from D407 to D40D, and voice 3 from D40E to D414. The
envelope generator for each voice is accessed through three of these registers.
The attack-, decay-, and release-stage lengths for each voice are selected
through three nibbles in two ADSR registers. The sustain-stage length is controlled
through a bit in another register, as we shall see shortly. The sustain volume is
selected through the remaining nibble in the ADSR registers, as a fraction of the at-
tack phase maximum (from 0 for silence to F for maximum). The maximum volume
during the attack phase is the same as the maximum overall volume.
The ADSR register locations for each voice are shown in Table 7.5. The effects
of each nibble value on the ADSR envelope are shown in Table 7.6.
Once the ADSR and other parameters for a sound have been loaded into SID,
the sound is triggered by setting a gate bit in another voice register to 1. The attack
[Figure 7.10: ADSR envelope; volume rises through the attack stage, falls through the decay stage, holds through the sustain stage, and dies away through the release stage over time]
phase begins immediately, progressing through the delay and sustain phases.
Volume remains at the sustain level until the gate bit is reset to 0, which triggers the
release phase. However, either triggering action can be taken at any point in the
ADSR cycle, allowing the sound to be restarted or cut short before the normal time.
The gate bits for voices 1, 2, and 3 are in bit d0 of locations D404, D40B, and D412,
respectively.
Voice 3 also has its own on/off bit, bit d7 of location D418 (0 = on), which
allows the voice to generate its programmed sound without sending it out for audio
amplification. This is the crudest level of voice control available. The reasons for its
existence will be discussed shortly.
Table 7.6
Value   Attack time   Decay/Release time
0       2 ms          6 ms
1       8 ms          24 ms
2       16 ms         48 ms
3       24 ms         72 ms
4       38 ms         114 ms
5       56 ms         168 ms
6       68 ms         204 ms
7       80 ms         0.24 s
8       0.10 s        0.30 s
9       0.25 s        0.75 s
A       0.50 s        1.5 s
B       0.80 s        2.4 s
C       1 s           3 s
D       3 s           9 s
E       5 s           15 s
F       8 s           24 s
318 Vocal Chords for a Chimera: Sound Synthesis Chap. 7
Pitch
SID produces pitches from 0 to 3996 Hz. Another two of each voice's seven registers
are used to select the voice's pitch. The 16 pitch bits divide the 3996 Hz into 2^16, or
65,536, steps of 0.06097 Hz each. In practice one usually starts with a frequency and
calculates the appropriate register values. SID requires these values in the form of an
integer quotient and remainder:
high register = DIV[ROUND(frequency/0.06097)/256]
low register = MOD[ROUND(frequency/0.06097)/256]
Thus, the integer quotient goes into the high register and the integer remainder
goes into the low register. The division frequency/0.06097 is rounded so that the
MOD operation will produce an integer result. To calculate the register values for
the pitch A at 440 Hz the equations are
ROUND(440/0.06097) = 7217
high register = DIV[7217/256] = 28
low register = MOD[7217/256] = 49
The locations for the high and low registers for all three voices are listed in
Table 7.7.
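The DIV/MOD arithmetic above can be checked with a short Python sketch (a hypothetical helper, not part of any C64 toolchain):

```python
def sid_pitch_registers(freq_hz, step=0.06097):
    """Split a frequency into SID's (high, low) pitch-register bytes.
    step is the 0.06097-Hz resolution derived in the text."""
    n = round(freq_hz / step)       # rounded 16-bit frequency value
    return n // 256, n % 256        # integer quotient and remainder

print(sid_pitch_registers(440))     # the A-440 example: (28, 49)
```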
The most obvious type of pitch modulation is to change between scale pitches
to form a melody. The programmer can form a table of the frequency-register values
for all the pitches in the scale, and then store the melody as a series of table-index
values.
Another type of pitch modulation varies the frequency of a single sound. The
most common musical examples of this are the vibrato and the portamento. The
vibrato is a wavering of pitch around a center frequency. Instrumentalists com-
monly use cycles lasting about 1/7 second each, with frequency reaching 1/4 scale
step above and below the center frequency. Vibrato adds an aura of warmth and
realism to a computer-generated sound. To use it from a program, the pitch must be
changed by small amounts within the 1/4-step limits at regular time intervals. As we
shall see, the time interval can be generated by letting the sound-handling routine be
interrupt driven (i.e., called by the system clock or raster interrupt).
The portamento is a sliding of frequency between two terminal pitches. Trom-
bones and violins are two types of instruments that can perform true portamenti.
With its closely spaced discrete frequencies, SID can perform a convincing simula-
tion of a true portamento.
These two techniques, and any other variation on this type of pitch modula-
tion, require two actions of a program. First, the program must keep a record of
either the original or the current frequency-register values for a sound. This is
necessary because almost all SID registers are write only; voice frequency cannot be
read from SID. Second, the program must generate or obtain an offset for changing
the original or current frequency into the next frequency.
The program produces a new frequency from the above two data items accord-
ing to one of the following relationships:
new frequency = current frequency + incremental offset
new frequency = original frequency + new total offset
Either form can generate a vibrato or portamento. Depending on the source of
the offset value, one or the other form will be easiest and fastest to use. If the pro-
gram generates the offset according to a mathematical function, like simple in-
crementing and decrementing, the first form will often be best. This method is well
suited, for example, for wide portamenti. Alternatively, the program can obtain an
offset value from either of two SID registers. This type of offset fits naturally into
the second relationship. The first type of offset needs no further explanation, so we
will go on to the second.
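As an illustration of the first relationship, here is a minimal Python sketch (the helper name and parameters are invented for illustration) that generates incremental offsets tracing one triangle-shaped vibrato cycle:

```python
def vibrato_offsets(depth, steps_per_cycle):
    """One cycle of incremental offsets tracing a triangle:
    up to +depth, down to -depth, and back to zero.  Adding each
    offset in turn implements
        new frequency = current frequency + incremental offset."""
    quarter = steps_per_cycle // 4
    step = depth / quarter
    return [step] * quarter + [-step] * (2 * quarter) + [step] * quarter

# Apply one cycle to a running frequency value; it ends where it began.
freq = 7217          # the A-440 register value from the pitch example
for off in vibrato_offsets(depth=4, steps_per_cycle=8):
    freq += off
assert freq == 7217
```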
One source of an offset value is SID's waveform information. SID produces
four waveforms or timbres: pulse, triangle, ramp, and noise. Modulation of these
waveforms is the topic of the next section, but their use in modulating frequency can
be discussed here.
SID represents each waveform with a changing eight-bit value. The pulse wave
consists of alternating 00 and FF values. The triangle wave consists of values that in-
crement to FF and decrement back to 00 repeatedly. The ramp wave consists of
values that increment to FF and wrap around to 00 to increment again (i.e., modulo
FF incrementing). Noise consists of values changing randomly. The values of all
four waveforms change at the rate of their frequency (e.g., a 1-kHz waveform
changes waveform values 1000 times a second).
The waveform value for voice 3 is made available through a register called
oscillator 3, at location D41B. This value can be read, scaled (by shifting right
or left), and converted to a 16-bit number (by appending a MSByte of 00) for adding
to the sound's original frequency. Note that the original frequency must be lower by
half the offset range to center the variations around a desired pitch. For a vibrato,
voice 3 should be set to a triangle waveform of approximately 7 Hz frequency, and
shifted right to scale it for a 1/4-step offset in the frequency range of the desired
pitch. Note that voice 3 can be set to the noise waveform to obtain random numbers
from oscillator 3 for nonmusical applications.
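The read-scale-extend-add sequence can be sketched in Python (the arithmetic only; on the C64 the oscillator 3 value would come from reading D41B, and the function name is invented for illustration):

```python
def vibrato_register(base_high, base_low, osc3, shift):
    """Combine a base pitch with a scaled oscillator-3 reading.
    osc3 is the 8-bit value that would be read from D41B; shifting
    right scales it down, and prefixing a 00 MSByte makes it a
    16-bit offset.  Returns new (high, low) register values."""
    base = base_high * 256 + base_low
    offset = osc3 >> shift            # scale the 8-bit waveform value
    new = base + offset               # 16-bit addition
    return new // 256, new % 256

# The A-440 base registers (28, 49) nudged by a mid-range oscillator value
print(vibrato_register(28, 49, 0x80, 4))
```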
For a waveform to modulate a voice's frequency audibly, it must be of a very
low frequency. This makes the waveform source, voice 3, relatively useless as an
audible voice. Thus, in this application the voice 3 audible output is usually turned
off by setting the d7 bit of location D418 to 1, as described earlier.
A second source for a frequency offset is the voice 3 ADSR envelope. SID
makes its values (rising from 00 to FF on being gated, falling to an intermediate
value, and falling to 00 on the second gating) available through a register called
envelope 3, at location D41C. Like oscillator 3, this register must be read, scaled,
converted to 16 bits, and added to the frequency-register's value for the modulated
voice. Since the audible output of voice 3 is seldom aesthetic in this role, voice 3 out-
put is also usually turned off with bit d7 of D418.
Timbre
The four waveforms generated by SID produce four classes of timbre. One of each
voice's seven registers contains timbre selection and other control bits. These control
registers are located at D404, D40B, and D412 for voices 1, 2, and 3, respectively.
The contents of each control register are outlined in Table 7.8. The unfamiliar
bits will be explained shortly. Each bit's function is selected with a 1 value.
Bits d4 through d7 select the voice waveform. If the pulse waveform is
selected, a 12-bit pulse-width value must be placed in a pulse-width register pair
belonging to the voice. A pulse-width value of 800 hex divides each wave period
Table 7.8 (control-register contents: bit / purpose)
Figure 7.11 (pulse-wave spectrum: intensity versus frequency, 0 to 11 kHz)
evenly between high and low wave output values (the parameter discussed under
pitch modulation). An evenly divided pulse wave is also called a square wave. With a
pulse width of 000 the wave output value remains at 00 all the time, and with a pulse
width of FFF the wave output remains at FF constantly. Other values divide the
pulse cycle unevenly, for a more complex timbre than that of the square wave. The
general expression for the percentage of cycle time that the wave output value is high
is
% of cycle that waveform output value is high = 100 x [(pulse-width value)/4096]
A pulse wave with a 10% pulse width has all harmonics strongly present. The
spectrum for a pulse of 10% width at 1 kHz is shown in Fig. 7.11. The pulse-width
register locations for the three voices are listed in Table 7.9.
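The pulse-width formula is easy to verify with a one-line sketch (Python, purely for illustration):

```python
def pulse_high_percent(pulse_width):
    """Percentage of each cycle that the pulse output stays high,
    for a 12-bit pulse-width value (000 to FFF hex)."""
    return 100 * pulse_width / 4096

print(pulse_high_percent(0x800))   # the square-wave case: 50.0
```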
If more than one waveform is selected in the control register, the different
waveform values are logically ANDed to produce a new waveform. Selecting the
pulse wave and any of the other waveforms has an interesting effect, for instance.
A 1 value in the control register voice-disable bit (d3) locks the waveform
oscillator at a 00 value until the bit is reset. Gate bit dO of the control register is
familiar from our discussion of the ADSR envelope. The remaining two control
register bits, bits d2 and d1, select ring modulation and voice synchronization,
respectively. These functions combine the given voice with another voice for special
effects. Commodore is tight-lipped about the actual effects of these functions, so
you will have to experiment with them to become familiar with the sounds you can
produce. Selecting either function for voice 1 combines voice 1 with voice 3. Select-
ing either function for voice 2 combines voice 2 with voice 1. Finally, selecting either
function for voice 3 combines voice 3 with voice 2. Try different combinations of
frequency and waveform in each of the voices to explore the many interesting effects
that are possible.
However the waveform of a voice has been generated, it can then be further
modulated by filtering its harmonic structure. As stated in the preceding section,
filtering drops the volume of frequencies on one or both sides of a reference or
cutoff frequency.
Four types of filtering are possible with SID. Lowpass filtering suppresses all
frequencies above a cutoff frequency. Highpass filtering suppresses all frequencies
below a cutoff frequency. Bandpass filtering suppresses all frequencies on either side
of the cutoff. Notch filtering is a combination of lowpass and highpass filtering to
suppress the cutoff and immediately surrounding frequencies. The effects of these
filterings are shown in Fig. 7.12, with "gain" signifying the change in loudness level.
Frequencies above a lowpass cutoff or below a highpass cutoff are suppressed
by 12 dB for each octave separating them from the cutoff. Frequencies around a
bandpass cutoff are suppressed by 6 dB per octave of separation. The effect shown
for the notch filter is only approximate.
The filtering type is selected by one or more of three bits sharing the overall
output-volume register, at D418. Setting bit d4 to 1 selects lowpass filtering, bit d5
similarly selects bandpass, bit d6 selects highpass, and setting both bits d4 and d6
selects notch filtering.
The cutoff frequency is selected by placing an 11-bit value in locations D415
and D416. Commodore makes no claims as to the preciseness with which a fre-
quency can be selected; it claims only that a range of cutoff frequencies from ap-
proximately 30 to 10,000 Hz is linearly divided 2^11 ways. Taken literally, this divides
the range into 4.868-Hz intervals. Assuming that this value is accurate, an approx-
imate cutoff register value can be calculated from the desired cutoff frequency as
follows:
cutoff register value = ROUND[(frequency - 30)/4.868]
The lower 3 bits of the cutoff value go into d0 through d2 of location D415. The up-
per byte of the cutoff value goes into D416.
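The cutoff calculation and the 3-bit/8-bit split can be sketched together (Python, illustrative only; the function name is invented):

```python
def cutoff_registers(freq_hz):
    """Approximate 11-bit cutoff value split into its two registers:
    the low 3 bits for d0-d2 of D415, the upper byte for D416."""
    value = round((freq_hz - 30) / 4.868)   # 11-bit cutoff value
    return value & 0x07, value >> 3         # (D415 bits, D416 byte)

print(cutoff_registers(1000))   # a 1-kHz cutoff
```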
Frequencies in the immediate neighborhood of the cutoff frequency can be
boosted above their normal loudness by altering what Commodore calls the filter's
"resonance" (we point to Commodore's usage because the word "resonance" nor-
mally refers to the sharpness of volume suppression of all affected frequencies
around the cutoff frequency, a meaning that does not apply to SID). A nibble value
from 0, for no boost, to F, for maximum boost, can be placed into bits d4 through
d7 of location D417. For the clearest sound this value should normally be set to F.
Just as the contents of oscillator 3 and envelope 3 were retrieved and used to
modulate a voice's frequency, they can be used to modulate the filter's cutoff fre-
quency. The resulting timbre changes are dramatic, but difficult to describe with
words. Anyone interested in unusual tone-color effects should experiment with
Figure 7.12 (filter response curves, gain in dB versus frequency: lowpass, highpass, bandpass, and notch)
modulating the cutoff frequency with many different waveforms and ADSR
envelopes.
SID has only one filter, but each voice can be routed either through or around
it. The filter-routing bits d0 through d2 of location D417 represent voices 1 through
3, respectively. A 1 bit value routes the voice through the filter, and a 0 value routes
it around, unfiltered.
SOUND PROGRAMMING
The key to achieving the sounds you want with SID is to experiment with modula-
tion. Try combining the different types of sound modulations in different ways and
see which combinations produce the most effective results. For instance, you could
combine a triangle waveform with a vibrato pitch change and a slow attack but fast
release.
We have discussed how natural instruments can be simulated by varying
volume, pitch, and timbre. Unnatural sounds can be produced the same way. Voices
with different wave characteristics can be played simultaneously to enrich the
resulting timbre.
By being familiar with the underlying wave nature of the sound that you are
handling, you will be able to vary different wave attributes in an intelligent way. As
you become experienced in SID's use, you will be able to start with sounds that are
closer to your intentions, to attain desired sounds more quickly than would other-
wise be possible, and perhaps most important, be able to explore sounds you cannot
mentally imagine, which may have the greatest effects of all.
The preferred method for handling sound from a program resembles the
preferred method for handling screen graphics; the program prepares for I/O opera-
tions and then utilizes an IRQ interrupt handler. To review how the IRQ handler
works, see the animation section of Chapter 6. The IRQ can be driven by the raster,
which is convenient if both sound and screen changes are being performed, or by the
system-cycle IRQ from CIA chip #1 (to review the system cycle, see the kernel clock
section in Chapter 5). With a raster IRQ and video processing, the video portion of
the work should be performed first because it is the most time critical. Whichever
IRQ is used to drive the audio interrupt handler, for timing preciseness it should
usually be the only IRQ enabled.
The program's duties in preparing for audio I/O are to write the value 0 to all
SID registers (locations D400 through D41C hex), and then to turn the overall
volume (location D418, low nibble) full on (value 0F hex). This places SID in a
startup condition, ready to be configured to produce sound.
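The startup sequence can be sketched as follows (Python standing in for the actual store loop; poke and the sid dictionary are hypothetical stand-ins for writes to the memory-mapped registers):

```python
# "sid" stands in for the memory-mapped SID registers; on the C64
# each poke() below would be an STA to the given address.
sid = {}

def poke(addr, value):
    sid[addr] = value & 0xFF

# Write 0 to every SID register, D400 through D41C...
for addr in range(0xD400, 0xD41D):
    poke(addr, 0)

# ...then turn the overall volume (low nibble of D418) full on.
poke(0xD418, 0x0F)
```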
The interrupt handler's duties include everything associated with producing
the sounds. This consists of setting up the filter type, cutoff, and resonance; routing
the voice or voices through the filter; setting up and altering the voice frequency,
pulse-width and ADSR values for the current state of the sound; checking for cues
to begin a sound (e.g., a keyboard input or a specific timer value); setting up the
voice control register with the selected waveform bit and with d0 = 1 to start the at-
tack stage; timing the ADS stages; resetting the d0 bit of the voice-control register to
start the release stage; and turning the voice off by writing a 0 into the voice-control
register when the sound is done. Obviously, not all these activities will be done in
any one activation of the interrupt handler. Using a 1/60-second raster or CIA IRQ
to drive the interrupt handler and a counter in the handler code will allow determin-
ing when each activity should be performed.
The pseudocode for the duties of an audio interrupt handler is given below.
Not all pseudocode lines will be performed for every sound, but together they
describe the general case. The handler duties cover the lifetime of a sound, which is
usually far longer than 1/60 second, which is the time between two raster or system
IRQs. Hence these duties must be divided and the parts performed over many inter-
rupt handler executions. The timing counter we just mentioned can be incremented
each time the interrupt handler executes, and the various actions in the handler
pseudocode can be associated with different counter values. The assembly code for
an audio handler resembles a CASE construct, with the CASE processing options
being the actions described in the following pseudocode lines or in lines produced by
expanding the pseudocode. Over time and many execution cycles, the interrupt
handler will work its way through the actions described in the pseudocode below.
Produce a sound
    Set up the filter type, cutoff, and resonance (D415-D418)
    Route the voices through the filter (D417)
    Set up the voice frequencies, pulse widths, and ADSR envelopes
    Check for cue to begin sound
    IF time to begin sound
    THEN Select voice waveform and start its attack phase
        LOOP
            Modulate the voice frequency, loudness, or timbre
            Modulate overall loudness
            IF sustain-phase time has just completed
            THEN Start the release phase
            ENDIF
        EXITIF Sound has died out
        ENDLOOP
    ENDIF
END Produce a sound
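The counter-driven division of these duties across interrupts can be sketched in Python (the tick counts, action strings, and handler name are made up for illustration; on the C64 this would be the CASE-like assembly dispatch the text describes):

```python
# Each 1/60-second interrupt tick advances a counter, and actions fire
# at the counter values assigned to them.  Durations are illustrative.
ATTACK_DECAY_TICKS = 30    # ~1/2 s of attack/decay/sustain at 60 ticks/s
RELEASE_TICKS = 45

log = []
counter = 0

def irq_handler():
    """One 1/60-second activation of the audio interrupt handler."""
    global counter
    if counter == 0:
        log.append("gate on: start attack")     # set d0 of control register
    elif counter == ATTACK_DECAY_TICKS:
        log.append("gate off: start release")   # reset d0
    elif counter == RELEASE_TICKS:
        log.append("voice off")                 # write 0 to control register
    counter += 1

for _ in range(60):   # one second of interrupts
    irq_handler()
print(log)
```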
Instrument     A  D  S  R   Waveform   Filter    Cutoff
Piccolo        5  3  A  4   Triangle
Flute          5  5  5  3   Triangle   Lowpass   50
Clarinet       5  4  4  2   Pls(800)   Lowpass   50
Oboe           3  3  A  4   Pls(800)   Lowpass   50
Bassoon        3  5  9  5   Triangle   Lowpass   40
Trumpet        3  6  3  5   Sawtooth   Lowpass   50
Cornet         5  3  6  4   Sawtooth   Lowpass   50
French horn    5  3  6  4   Pls(500)   Lowpass   40
Bass drum      0  B  0  0   Triangle
These values are for notes between middle C, or C5, and C7. Resonance equals F hex
for all filtered cases. The pulse-width value is shown for pulse-waveform cases. All
values are hexadecimal.
These values omit vibrato, timbre change, and other dynamic modulations, yet
still yield recognizable sounds. With SID's flexibility, most familiar sounds can be
recognizably simulated given the right register values and enough dynamic control
of timbre and frequency by the program.
Exercise:
Answer the following questions to review C64 sound synthesis.
(a) What are the four major attributes of sound waves?
Placing values from one of the screen display codes directly into screen-memory
locations will cause the corresponding characters to be displayed as long as the VIC
chip is in character mode (see Chapter 6). Code 1 is selected by sending the value 0E
hex out the screen channel using kernel module CHROUT. Code 2 is selected by
sending value 8E hex out the screen channel.
In all these coded values the d7 bit equals 0. Setting the d7 bit reverses the two
bit colors in the character. The predefined characters and their coded values are
given in the following table. Use the code 1 character where code 2 is blank.
Character    Value
App. B The Screen Display Codes 329
-            2D (45)
.            2E (46)
/            2F (47)
0            30 (48)
1            31 (49)
2            32 (50)
3            33 (51)
4            34 (52)
5            35 (53)
6            36 (54)
7            37 (55)
8            38 (56)
9            39 (57)
:            3A (58)
;            3B (59)
<            3C (60)
=            3D (61)
>            3E (62)
?            3F (63)
@            40 (64)
A            41 (65)
B            42 (66)
C            43 (67)
D            44 (68)
E            45 (69)
F            46 (70)
G            47 (71)
H            48 (72)
I            49 (73)
J            4A (74)
K            4B (75)
L            4C (76)
M            4D (77)
N            4E (78)
(graphics characters, shapes not reproducible in text)    5E-7F (94-127)
In the following table, ASCII codes are listed under the "std." column, and special
XASCII codes are listed under the "64" column. The "standard meaning" column
applies to the ASCII code where ASCII and XASCII differ.
Code       Std.            Standard meaning
ENQ        5 (5)           Enquiry
BEL        7 (7)           Ring bell
BS         8 (8)           Backspace etc.
HT         9 (9)           Horiz. tab
CR         D (13)          Return (both codes)
SO         E (14)
DC1-DC4    11-14 (17-20)   Device controls for coordinating transfers with printers
ESC        1B (27)         Escape
SP         20 (32)
!          21 (33)
"          22 (34)
#          23 (35)
$          24 (36)
App. C ASCII and XASCII Codes 331
D     44 (68)         q     71 (113)
E     45 (69)         r     72 (114)
F     46 (70)         s     73 (115)
G     47 (71)         t     74 (116)
H     48 (72)         u     75 (117)
I     49 (73)         v     76 (118)
J     4A (74)         w     77 (119)
K     4B (75)         x     78 (120)
L     4C (76)         y     79 (121)
M     4D (77)         z     7A (122)
N     4E (78)         {     7B (123)
O     4F (79)
(continued)
(std. characters)        7C-7F (124-127)
Orange                   81 (129)
F1                       85 (133)
F3                       86 (134)
F5                       87 (135)
F7                       88 (136)
F2                       89 (137)
F4                       8A (138)
F6                       8B (139)
F8                       8C (140)
(key symbols)            8D-94 (141-148)
Brown                    95 (149)
Lt. Red                  96 (150)
Grey 1                   97 (151)
Grey 2                   98 (152)
Lt. Green                99 (153)
Lt. Blue                 9A (154)
Grey 3                   9B (155)
(key symbols)            9C-9F (156-159)
(shifted space)          A0 (160)
(graphics characters)    A1-BF (161-191)
App. D The Instruction Set
Preface: Expose... Past Its Plastic Envelope: The Computer's Inner Machinery... Conceptual
Quicksilver: Data Structures... Into Its Brain: 6510 Assembly Language... Imposing Reason:
Program Planning... Connecting the Nerves: Using the Memory Map... Awakening the
PIXY: Advanced Graphics... Vocal Chords for a Chimera: Sound Synthesis
In these stimulating chapters you will find a complete method for translating your most
ambitious Commodore 64 programming inspirations into reality. Sparkling style and paced
presentation work together to make it easy for you to master the most advanced assembly
language, graphics, sound, and I/O techniques. You benefit from the author's extensive
microcomputer programming experience, and the many 'trade secrets' obtained from top
professional Commodore 64 programmers and revealed here for the first time. Whether
for a tutorial or a text, reference or recreation, Power Programming is the first and last
computer book your library will need.
" I have yet to read a technical discussion that has been so thoroughly " milled " into such
easily digestible form . His descriptions ... held me absolutely spellbound . An incredible
treatise of theoretical and experiential knowledge. Quite remarkable in light of the
sophistication of the subject material. I'd recommend the book .. . even to my best
friends."
Jack Clarke, Professor, EI Camino Community College District
" I am particularly impressed with his style and his method of presentation . The author
seems to have many unique qualities in his ability to attack the subject. He's done an
excellent job of explaining those 'sometimes overlooked' details. A well organized ,
detailed account of the Commodore 64, designed for a wide range of audience. An
excellently executed document covering the Commodore 64 in more detail than any other
document I've seen .'.'
Ed Pevovar, President, Computer Communication and Engineering Consultants
PRENTICE-HALL, INC.
Englewood Cliffs, N.J. 07632
ISBN 0-13-687849-0