Programming The 65816
Table of Contents
1) Chapter One
Basic Assembly Language Programming Concepts
Binary Numbers
Grouping Bits into Bytes
Hexadecimal Representation of Binary
The ASCII Character Set
Boolean Logic
Logical And
Logical Or
Logical Exclusive Or
Logical Complement
Signed Numbers
Storing Numbers in Decimal Form
Computer Arithmetic
Microprocessor Programming
Machine Language
Assembly Language
Writing in Assembly Language
Basic Programming Concepts
Selection Between Paths
Looping
Subroutines
4) Chapter Four
Sixteen-Bit Architecture: The 65816 and the 65802
5) Chapter Five
SEP, REP, and Other Details
The Assembler Used in This Book
Address Notation
6) Chapter Six
First Examples: Moving Data
Loading and Storing Registers
Effect of Load and Store Operations on Status Flags
Moving Data Using the Stack
Push
Pushing the Basic 65x Registers
Pull
Pulling the Basic 65x Registers
Pushing and Pulling the 65816's Additional Registers
Pushing Effective Addresses
Other Attributes of Push and Pull
Moving Data Between Registers
Transfers
Exchanges
Storing Zero to Memory
Block Moves
MVP
No Operation
OR Accumulator with Memory
Push Effective Absolute Address
Push Effective Indirect Address
Push Effective PC Relative Indirect Address
PER
Push Accumulator
Push Data Bank Register
Push Direct Page Register
Push Program Bank Register
Push Processor Status Register
Push Index Register
Push Index Register
Pull Accumulator
Pull Data Bank Register
Pull Direct Page Register
Pull Status Flags
Pull Index Register X from Stack
Pull Index Register Y from Stack
Reset Status Bits
Rotate Memory or Accumulator Left
Table of Figures
FIGURE 1-1 BINARY REPRESENTATION
FIGURE 1-2 BIT NUMBERS
FIGURE 1-3 ANDING BITS
FIGURE 1-4 ORING BITS
FIGURE 1-5 EXCLUSIVE ORING BITS
FIGURE 1-6 COMPLEMENTING BITS
FIGURE 1-7 COMPLEMENTING BITS USING EXCLUSIVE OR
FIGURE 1-8 MULTIPLE-PRECISION ARITHMETIC
FIGURE 1-9 TYPICAL ASSEMBLER SOURCE CODE
FIGURE 2-1 6502 PROGRAMMING MODEL
FIGURE 2-2 INITIALIZING THE STACK POINTER TO $FF
FIGURE 2-3 AFTER PUSHING THE ACCUMULATOR
FIGURE 2-4 INDEXING: BASE PLUS INDEX
FIGURE 2-5 INDIRECTION: OPERAND LOCATES INDIRECT ADDRESS
FIGURE 4-1 65816 NATIVE MODE PROGRAMMING MODEL
FIGURE 4-2 RESULTS OF SWITCHING REGISTER SIZE
Table of Tables
TABLE 1-1 DECIMAL AND HEX NUMBERS
TABLE 1-2 TRUTH TABLE FOR AND
TABLE 1-3 TRUTH TABLE FOR OR
TABLE 1-4 TRUTH TABLE FOR EXCLUSIVE OR
TABLE 1-5 TRUTH TABLE FOR COMPLEMENT
TABLE 1-6 THE EIGHT-BIT RANGE OF TWO'S-COMPLEMENT NUMBERS
TABLE 1-7 THE FIRST 16 BCD NUMBERS
TABLE 2-1 STATUS REGISTER CONDITION CODE FLAGS
TABLE 2-2 STATUS REGISTER MODE SELECT FLAGS
TABLE 2-3 6502 ADDRESSING MODES
TABLE 2-4 6502 INSTRUCTIONS
TABLE 3-1 THE 65C02'S NEW ADDRESSING MODES
TABLE 3-2 NEW 65C02 INSTRUCTIONS
TABLE 4-1 THE FOUR POSSIBLE NATIVE MODE REGISTER COMBINATIONS
TABLE 4-2 ADDRESSING MODES: ZERO PAGE VS. DIRECT PAGE
TABLE 4-3 THE 65816/65802'S NEW ADDRESSING MODES
TABLE 4-4 NEW 65816/65802 INSTRUCTIONS
TABLE 4-5 INTERRUPT VECTOR LOCATIONS
TABLE 6-1 DATA MOVEMENT INSTRUCTION
TABLE 7-1 LIST OF SIMPLE ADDRESSING MODES
TABLE 8-1 BRANCH AND JUMP INSTRUCTIONS
TABLE 9-1 ARITHMETIC INSTRUCTIONS
TABLE 9-2 EQUALITIES
TABLE 10-1 LOGIC INSTRUCTIONS
TABLE 11-1 COMPLEX ADDRESSING MODES
TABLE 11-2 COMPLEX PUSH INSTRUCTIONS
TABLE 11-3 ASSEMBLER SYNTAX FOR COMPLETE MEMORY ACCESS
TABLE 12-1 SUBROUTINE INSTRUCTIONS
TABLE 13-1 INTERRUPT AND SYSTEM CONTROL INSTRUCTIONS
TABLE 13-2 INTERRUPT VECTORS
TABLE 13-3 RESET INITIALIZATION
TABLE 17-1 OPERAND SYMBOLS
TABLE 18-1 OPERAND SYMBOLS
TABLE 18-2 65X FLAGS
Part 1
Basics
1) Chapter One
Basic Assembly Language Programming Concepts
This chapter reviews some of the key concepts that must be mastered prior to learning to program a
computer in assembly language. These concepts include the use of the binary and hexadecimal number
systems; boolean logic; how memory is addressed as bytes of data; how characters are represented as ASCII
codes; binary-coded decimal (BCD) number systems, and more. The meaning of these terms is explained in
this chapter. Also discussed is the use of an assembler, which is a program used to write machine-language
programs, and programming techniques like selection, loops, and subroutines.
Since the primary purpose of this book is to introduce you to programming the 65816 and the other members of
the 65x family, this single chapter can only be a survey of this information rather than a complete guide.
Binary Numbers
In its normal, everyday work, most of the world uses the decimal, or base ten, number system, and
everyone takes for granted that this system is the natural (or even the only) way to express the concept of
numbers. Each place in a decimal number stands for a power of ten: ten to the 0 power is 1, ten to the 1st power
is ten, ten to the 2nd power is 100, and so on. Thus, starting from a whole number's right-most digit and
working your way left, the first digit is multiplied by the zero power of ten, the second by the first power of ten,
and so on. The right-most digits are called the low-order or least significant digits in a positional notation
system such as this, because they contribute least to the total magnitude of the number; conversely, the leftmost
digits are called the high-order or most significant digits, because they add the most weight to the value of the
number. Such a system is called a positional notation system because the position of a digit within a string of
numbers determines its value.
Presumably, it was convenient and natural for early humans to count in multiples of ten because they
had ten fingers to count with. But it is rather inconvenient for digital computers to count in decimal; they have
the equivalent of only one finger, since the representation of numbers in a computer is simply the reflection of
electrical charges, which are either on or off in a given circuit. The "all or nothing" nature of digital circuitry
lends itself to the use of the binary, or base two, system of numbers, with one represented by "on" and zero
represented by "off." A one or a zero in binary arithmetic is called a binary digit, or a bit for short.
Like base ten digits, base two digits can be strung together to represent numbers larger than a single
digit can represent, using the same technique of positional notation described for base ten numbers above. In
this case, each binary digit in such a base two number represents a power of two, with a whole number's
right-most bit representing two to the zero power (ones), the next bit representing two to the first power (twos), the
next representing two to the second power (fours), and so on (Figure 1-1 Binary Representation).
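The positional rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `bits_to_value` is an assumption made for this example and does not come from the text.

```python
def bits_to_value(bits: str) -> int:
    """Sum the power of two contributed by every set bit.

    The right-most character is bit 0 (the ones place), the next is
    bit 1 (the twos place), and so on, as in positional notation.
    """
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(bits_to_value("00000101"))  # one four plus one one
print(bits_to_value("10000000"))  # only the 128s place is set
```

Each set bit contributes exactly one power of two, so the value of any bit string is the sum of the place values of its one-bits.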
[Figure 1-1 Binary Representation: the place values of an eight-bit binary number, from the 128s place on the left down to the 1s place on the right]
Grouping Bits into Bytes
As explained, if the value of a binary digit, or bit, is a one, it is stored in a computer's memory by
switching to an "on" or charged state, in which case the bit is described as being set; if the value of a given bit is
a zero, it is marked in memory by switching to an "off" state, and the bit is said to be reset.
While memory may be filled with thousands or even millions of bits, a microprocessor must be able to
deal with them in a workable size.
The smallest memory location that can be individually referenced, or addressed, is usually, and always
in the case of the 65x processors, a group of eight bits. This basic eight-bit unit of memory is known as a byte.
Different types of processors can operate on different numbers of bits at any given time, with most
microprocessors handling one, two, or four bytes of memory in a single operation. The 6502 and 65C02
processors can handle only eight bits at a time. The 65816 and 65802 can process either eight or sixteen bits at
a time.
Memory is organized as adjacent, non-overlapping bytes, each of which has its own specific address.
An address is the unique, sequential identifying number used to reference the byte at a particular location.
Addresses start at zero and continue in ascending numeric order up to the highest addressable location.
As stated, the 65802 and 65816 can optionally manipulate two adjacent bytes at the same time; a
sixteen-bit data item stored in two contiguous bytes is called a double byte in this book. A more common but
misleading usage is to describe a sixteen-bit value as a word; the term word is more properly used to describe
the number of bits a processor fetches in a single operation, which may be eight, sixteen, thirty-two, or some
other number of bits depending on the type of processor.
It turns out that bytes (multiples of eight bits) are conveniently sized storage units for programming
microprocessors. For example, a single byte can readily store enough information to uniquely represent all of
the characters in the normal computer character set. An eight-bit binary value can be easily converted to two
hexadecimal (base sixteen) digits; this fact provides a useful intermediate notation between the binary and
decimal number systems. A double byte can represent the entire range of memory addressable by the 6502,
65C02, and 65802, and one complete bank (64K bytes) on the 65816. Once you've adjusted to it, you'll find
that there is a consistent logic behind the organization of a computer's memory into eight-bit bytes.
Since the byte is one of the standard units of a computer system, a good question to ask at this point
would be just how large a decimal number can you store in eight bits? The answer is 255. The largest binary
number you can store in a given number of bits is the number represented by that many one-bits. In the case of
[Figure 1-2 Bit Numbers: the numbered bit positions of a byte and a double byte, running from the high-order bit down to the low-order bit]
Table 1-1 Decimal and Hex Numbers
Binary    Decimal    Hexadecimal
0000      0          0
0001      1          1
0010      2          2
0011      3          3
0100      4          4
0101      5          5
0110      6          6
0111      7          7
1000      8          8
1001      9          9
1010      10         A
1011      11         B
1100      12         C
1101      13         D
1110      14         E
1111      15         F
Binary numbers larger than 1111 are converted to hexadecimal by first separating the bits into groups of
four, starting from the right-most digit and moving left. Each group of four bits is converted into its
corresponding hex equivalent. It is generally easier to work with a hexadecimal number like F93B than its
binary counterpart 1111100100111011. Hexadecimal numbers are often used by machine language
programming tools such as assemblers, monitors, and debuggers to represent memory addresses and their
contents. The value of hexadecimal numbers is the ease with which they can be converted to and from their
binary equivalents once the table has been memorized.
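As a rough sketch of the grouping procedure just described, the following Python splits a bit string into groups of four from the right and looks each group up in a sixteen-entry table. The names `HEX_DIGITS` and `binary_to_hex` are assumptions made for this illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"  # index = value of a four-bit group


def binary_to_hex(bits: str) -> str:
    """Convert a string of binary digits to hexadecimal, four bits at a time."""
    # Pad with leading zeros so the length is a multiple of four;
    # this is equivalent to grouping from the right-most digit.
    width = (len(bits) + 3) // 4 * 4
    padded = bits.zfill(width)
    groups = [padded[i:i + 4] for i in range(0, width, 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(binary_to_hex("1111100100111011"))  # groups 1111 1001 0011 1011
print(binary_to_hex("100011"))            # padded to 0010 0011
```

Because every hex digit corresponds to exactly four bits, the conversion never needs arithmetic on the whole number, only a table lookup per group.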
While a hexadecimal 3 and a decimal 3 stand for the same number, a hexadecimal 23 represents two
decimal sixteens plus 3, or 35 decimal. To distinguish a multiple-digit hex number from a decimal one, either
the word hexadecimal should precede or follow it, or a $ should prefix it, as in $23 for decimal 35, or $FF to
represent 255. A number without any indication of base is presumed to be decimal. An alternative notation for
hexadecimal numbers is to use the letter H as a suffix to the number (for example, FFH); however, the dollar-sign
prefix is generally used by assemblers for the 65x processors.
The ASCII Character Set
Characters (letters, numbers, and punctuation) are stored in the computer as number values, and
translated to and from readable form on input or output by hardware such as keyboards, printers, and CRTs.
There are 26 English-language lower-case letters, another 26 upper-case ones, and a score or so of special
characters, plus the ten numeric digits, any of which might be typed from a keyboard or displayed on a screen or
printer, as well as stored or manipulated internally. Further, additional codes may be needed to tell a terminal or
printer to perform a given function, such as cursor or print head positioning. These control codes include
carriage return, which returns the cursor or print head to the beginning of a line; line feed, which moves the
cursor or print head down a line; bell, which rings a bell; and back space, which moves the cursor or print head
back one character.
The American Standard Code for Information Interchange, abbreviated ASCII and pronounced "AS-key,"
was designed to provide a common representation of characters for all computers. An ASCII code is
stored in the low-order seven bits of a byte; the most significant bit is conventionally a zero, although a system
can be designed either to expect it to be set or to ignore it. Seven bits allow the ASCII set to provide 128
different character codes, one for each English letter and number, most punctuation marks, the most commonly
used mathematical symbols, and 32 control codes.
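A short Python sketch can make the seven-bit encoding concrete. The helper names `ascii_code` and `strip_high_bit` are assumptions made for this example; `ord` and `chr` are Python built-ins that map between characters and their numeric codes.

```python
def ascii_code(char: str) -> int:
    """Return the ASCII code of a single character."""
    code = ord(char)
    if code > 0x7F:  # ASCII uses only the low-order seven bits
        raise ValueError("not an ASCII character")
    return code


def strip_high_bit(byte: int) -> int:
    """Ignore the most significant bit of a byte, keeping the ASCII code."""
    return byte & 0x7F


print(ascii_code("A"), ascii_code("a"), ascii_code("0"))
print(ascii_code("\r"), ascii_code("\n"))  # carriage return, line feed
print(chr(strip_high_bit(0xC1)))           # $C1 with the high bit ignored
```

The last line illustrates a system that stores characters with the most significant bit set: masking with $7F recovers the ordinary seven-bit code.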
Boolean Logic
Logical operations interpret the binary on/off states of a computer's memory as the values true and
false rather than the numbers one and zero. Since the computer handles data one or two bytes at a time, each
logical operation actually manipulates a set of bits, each with its own position.
Logical operations manipulate binary flags. There are three logical operations that are supported by
65x microprocessor instructions, each combining two operands to yield a logical (true or false) result: and, or,
and exclusive or.
Logical And
The AND operator yields true only if both of the operands are themselves true; otherwise, it yields
false. Remember, true is equivalent to one, and false equivalent to zero. Within the 65x processors, two strings
of eight, or in the case of the 65816, eight or sixteen, individual logical values may be ANDed, generating a
third string of bits; each bit in the third set is the result of ANDing the respective bit in each of the first two
operands. As a result, the operation is called bitwise.
When considering bitwise logical operations, it is normal to use binary representation. When
considered as a numeric operation on two binary numbers, the result given in Figure 1.3 makes little sense. By
examining each bit of the result, however, you will see that each has been determined by ANDing the two
corresponding operand bits.
          11011010    $DA
AND       01000110    $46
equals    01000010    $42
Figure 1-3 ANDing Bits
Table 1-2 Truth Table for AND
                    Second Operand
                    0    1
First Operand  0    0    0
               1    0    1
Logical Or
The OR operator yields a one or true value if either (or both) of the operands is true. Taking the same
values as before, examine the result of the logical OR operation in Figure 1.4. The truth table for the OR
function is shown in Table 1.3.
          11011010    $DA
OR        01000110    $46
equals    11011110    $DE
Figure 1-4 ORing Bits
Table 1-3 Truth Table for OR
                    Second Operand
                    0    1
First Operand  0    0    1
               1    1    1
Logical Exclusive Or
The exclusive OR operator is similar to the previously described OR operation; in this case, the result
is true only if one or the other of the operands is true, but not if both are true or (as with OR) neither is true.
That is, the result is true only if the operands are different, as Figure 1.5 illustrates using the same values as
before. The truth table for exclusive OR is shown in Table 1.4.
          11011010    $DA
EOR       01000110    $46
equals    10011100    $9C
Figure 1-5 Exclusive ORing Bits
Table 1-4 Truth Table for Exclusive OR
                    Second Operand
                    0    1
First Operand  0    0    1
               1    1    0
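The three two-operand logical operations can be tried out with Python's bitwise operators `&`, `|`, and `^`, which work bit by bit just as described. The sketch below uses the operand bit patterns 11011010 ($DA) and 01000110 ($46); the helper `show` is an assumption made for this example.

```python
first = 0b11011010   # $DA
second = 0b01000110  # $46


def show(label: str, value: int) -> str:
    """Format a byte as a label, eight binary digits, and two hex digits."""
    return f"{label:<4}{value:08b}  ${value:02X}"


and_result = first & second  # one only where both operand bits are one
or_result = first | second   # one where either operand bit is one
eor_result = first ^ second  # one only where the operand bits differ

print(show("AND", and_result))
print(show("OR", or_result))
print(show("EOR", eor_result))
```

Reading the printed results column by column against the two operands reproduces each row of the corresponding truth table.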
Logical Complement
As Figure 1.6 shows, the logical complement of a value is its inverse: the complement of true is false,
and the complement of false is true.
                11011010    $DA
COMPLEMENTED
equals          00100101    $25
Figure 1-6 Complementing Bits
While the 65x processors have no complement or not function built in, exclusive ORing a value with a
string of ones ($FF or $FFFF) produces the complement, as Figure 1.7 illustrates.
                     11011010    $DA
EOR                  11111111    $FF
equals Complement    00100101    $25
Figure 1-7 Complementing Bits Using Exclusive OR
Since complement has only one operand, its truth table, drawn in Table 1.5, is simpler than the other
truth tables.
Table 1-5 Truth Table for Complement
operand    result
0          1
1          0
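The exclusive-OR technique for complementing can be sketched in Python as follows. The function name `complement` and its `bits` parameter are assumptions made for this illustration; the mask is the string of ones ($FF or $FFFF) mentioned above.

```python
def complement(value: int, bits: int = 8) -> int:
    """Invert every bit of value by exclusive ORing it with all ones."""
    mask = (1 << bits) - 1  # $FF for 8 bits, $FFFF for 16 bits
    return value ^ mask


print(f"${complement(0xDA):02X}")        # eight-bit complement of $DA
print(f"${complement(0x00DA, 16):04X}")  # sixteen-bit complement of $00DA
```

Complementing twice returns the original value, since each bit is simply flipped and flipped back.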
Signed Numbers
Many programs need nothing more than the whole numbers already discussed. But others need to store
and perform arithmetic on both positive and negative numbers.
Of the possible systems for representing signed numbers, most microprocessors, among them those in
the 65x family, use two's complement. Using two's-complement form, positive numbers are distinguished
from negative ones by the most significant bit of the number: a zero means the number is positive; a one means
it is negative.
To negate a number in the two's-complement system, you first complement each of its bits, then add
one. For example, to negate one (to turn plus-one into minus-one):
  00000001    To negate +1,
  11111110    complement each bit
        +1    and add one.
  11111111    The result is -1.
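The complement-and-add-one rule can be written as a small Python function. This is a sketch for illustration; the name `negate` and its `bits` parameter are assumptions, and the final mask simply discards any carry out of the chosen width.

```python
def negate(value: int, bits: int = 8) -> int:
    """Two's-complement negation: complement each bit, then add one."""
    mask = (1 << bits) - 1
    complemented = value ^ mask        # flip every bit
    return (complemented + 1) & mask   # add one, discard any carry


print(f"{negate(0b00000001):08b}")  # negating +1
print(f"{negate(0b11111111):08b}")  # negating the result again
```

Negating twice gives back the starting value, which is a quick check that the rule is self-consistent.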
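Table 1-6 The Eight-Bit Range of Two's-Complement Numbers
Decimal    Hexadecimal    Binary
+127       $7F            0111 1111
+126       $7E            0111 1110
+125       $7D            0111 1101
.          .              .
.          .              .
.          .              .
+1         $01            0000 0001
0          $00            0000 0000
-1         $FF            1111 1111
-2         $FE            1111 1110
-3         $FD            1111 1101
.          .              .
.          .              .
.          .              .
-126       $82            1000 0010
-127       $81            1000 0001
-128       $80            1000 0000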
Another practical way to think of negative two's-complement numbers is to think of negative numbers
as the (unsigned) value that must be added to the corresponding positive number to produce zero as the result.
For example, in an eight-bit number system, the value that must be added to one to produce zero (disregarding
the carry) is $FF; 1+$FF=$100, or 0 if only the low-order eight bits are considered. $FF must therefore be the
two's-complement value for minus one.
The introduction of two's-complement notation creates yet another possibility in interpreting the data
stored at an arbitrary memory location. Since $FF could represent either the unsigned number 255 or the
negative integer minus-one, it's important to remember that it is only the way in which a program interprets the
data stored in memory that gives it its proper value, signed or unsigned.
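Both ideas above, negation by complement-and-add-one and the dependence of a byte's value on how it is interpreted, can be sketched in a few lines. This is an illustrative Python sketch rather than 65x code; the helper names negate and as_signed are assumptions.

```python
def negate(value, bits=8):
    """Two's-complement negation: complement each bit, then add one."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask

def as_signed(value, bits=8):
    """Interpret an unsigned bit pattern as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

minus_one = negate(0x01)
print(f"${minus_one:02X}")             # the bit pattern itself
print(minus_one, as_signed(minus_one)) # the same byte read unsigned, then signed
print((0x01 + minus_one) & 0xFF)       # adding it to +1, keeping only the low eight bits
```

The same byte prints as 255 or as -1 depending only on which interpretation the program chooses.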
Binary        Hexadecimal    Decimal    BCD
0000 0000          0             0      0000 0000
0000 0001          1             1      0000 0001
0000 0010          2             2      0000 0010
0000 0011          3             3      0000 0011
0000 0100          4             4      0000 0100
0000 0101          5             5      0000 0101
0000 0110          6             6      0000 0110
0000 0111          7             7      0000 0111
0000 1000          8             8      0000 1000
0000 1001          9             9      0000 1001
0000 1010          A            10      0001 0000
0000 1011          B            11      0001 0001
0000 1100          C            12      0001 0010
0000 1101          D            13      0001 0011
0000 1110          E            14      0001 0100
0000 1111          F            15      0001 0101
The 65x processors have a special decimal mode which can be set by the programmer. When decimal
mode is set, numbers are added and subtracted with the assumption that they are BCD numbers: in BCD mode,
for example, 1001+1 (9+1) yields the BCD result of 0001 0000 rather than the binary result of 1010 (1010 has
no meaning in the context of BCD number representation).
Obviously, in different contexts 0001 0000 could represent either 10 decimal or $10 hexadecimal (16
decimal); in this case, the interpretation is dependent on whether the processor is in decimal mode or not.
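The difference between a binary add and a BCD add can be modeled as follows. This is a minimal Python sketch of digit-by-digit decimal addition, not a description of the processor's internal circuitry; the function name bcd_add is an assumption, and the sketch assumes both inputs are valid packed-BCD bytes (each four-bit digit in the range 0 through 9).

```python
def bcd_add(a, b, carry_in=0):
    """Add two packed-BCD bytes; return (result_byte, carry_out)."""
    low = (a & 0x0F) + (b & 0x0F) + carry_in
    half_carry = 0
    if low > 9:                       # low digit overflowed past nine
        low -= 10
        half_carry = 1
    high = (a >> 4) + (b >> 4) + half_carry
    carry_out = 0
    if high > 9:                      # high digit overflowed past nine
        high -= 10
        carry_out = 1
    return (high << 4) | low, carry_out

result, carry = bcd_add(0x09, 0x01)
print(f"{result:08b}", carry)         # BCD sum of 9 + 1
print(f"{(0x09 + 0x01) & 0xFF:08b}")  # plain binary sum of the same bytes
```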
Computer Arithmetic
Binary arithmetic is just like decimal arithmetic, except that the highest digit isn't nine, it's one. Thus
1+0=1, while 1+1=0 with a carry of 1, or binary 10. Binary 10 is the equivalent of decimal 2. And 1-0=1,
while during the subtraction of binary 1 from binary 10, the 1 can't be subtracted from the 0, so a borrow is
done, getting the 1 from the next position (leaving it 0); thus, 10-1=1.
Addition and subtraction are generally performed in one or more main processor registers, called
accumulators. On the 65x processors, they can store either one or, optionally on the 65802 and 65816, two
bytes. When two numbers are added that cause a carry from the highest bit in the accumulator, the result is
larger than the accumulator can hold. To account for this, there is a special one-bit location, called a carry bit,
which holds the carry out of the high bit from an addition. Very large numbers can be added by adding the
low-order eight or sixteen bits (whichever the accumulator holds) of the numbers, and then adding the next set
of bits plus the carry from the previous addition, and so on. Figure 1.8 illustrates this concept of
multiple-precision arithmetic.
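Multiple-precision addition can be sketched as a loop that adds one byte at a time and passes the carry along. The following Python sketch models an eight-bit accumulator; the function names and the choice to store each number as a list with the low-order byte first are assumptions made for illustration. The values are those of Figure 1.8.

```python
def add_with_carry(a, b, carry_in):
    """Add two bytes plus a carry; return (low eight bits, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def add_multibyte(a_bytes, b_bytes):
    """Add two equal-length numbers stored low-order byte first."""
    result, carry = [], 0             # the carry starts out clear
    for byte_a, byte_b in zip(a_bytes, b_bytes):
        byte_sum, carry = add_with_carry(byte_a, byte_b, carry)
        result.append(byte_sum)
    return result, carry

# $3883 plus $A5A5
total, carry = add_multibyte([0x83, 0x38], [0xA5, 0xA5])
print([f"${b:02X}" for b in total], carry)
```

The low-order add ($83 plus $A5) produces $28 with a carry of one, and that carry is what turns the high-order sum into $DE.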
Microprocessor Programming
You have seen how various kinds of data are represented and, in general, how this data can be
manipulated. To make those operations take place, a programmer must instruct the computer on the steps it
must take to get the data, the operations to perform on it, and finally the steps to deliver the results in the
appropriate manner. Just as a record player is useless without a record to play, so a computer is useless without
a program to execute.
Machine Language
The microprocessor itself speaks only one language, its machine language, which inevitably is just
another form of binary data. Each chip design has its own set of machine language instructions, called its
instruction set, which defines the functions that it can understand and execute. Whether you program in
machine language, in its corresponding assembly language, or in a higher-level language like BASIC or Pascal,
the instructions that the microprocessor ultimately executes are always machine language instructions.
Programs in assembly and higher-level languages are translated (by assemblers, compilers and interpreters) to
machine language before the processor can execute them.
                      $38                  $83
          Plus        $A5       Plus       $A5
          Plus Carry    1
          Equals      $DE       Equals     $28     Carry = 1

                $3883 Plus $A5A5 Equals $DE28

Figure 1-8 Multiple-Precision Arithmetic
Assembly Language
Writing long strings of hexadecimal or binary instructions to program a computer is obviously not
something you would want to do if you could at all avoid it. The 65816's 256 different opcodes, for example,
would be difficult to remember in hexadecimal form and even harder in binary form. Assembly language,
and programs which translate assembly language to machine code (called assemblers), were devised to simplify
the task of machine programming.
Assembly language substitutes a short word known as a mnemonic (which means memory aid) for
each binary machine code instruction. So while the machine code instruction 1010 1010, which instructs the
65x processor to transfer the contents of the A accumulator to the X index register, may be hard to remember,
its assembler mnemonic TAX (for Transfer A to X) is much easier.
The entire set of 65x opcodes is covered alphabetically by mnemonic label in Chapter Eighteen, while
Chapters Five through Thirteen discuss them in functional groups, introducing each of them and providing
examples of their use.
To write an assembly language program, you first use a text editing program to create a file containing
the series of instruction mnemonics and operands that comprise it; this is called the source program, source
code or just source. You then use this as the input to the assembler program, which translates the assembler
statements into machine code, storing the generated code in an output file. The machine code is either in the
form of executable object code, which is ready to be executed by the computer, or (using some development
systems), a relocatable object module, which can be linked together with other assembled object modules
before execution.
If this were all that assembly language provided, it would be enough to make machine programming
practical. But just as the assembler lets you substitute instruction mnemonics for binary operation codes, it lets
you use names for the memory locations specified in operands so you don't have to remember or compute their
addresses. By naming routines, instructions which transfer control to them can be coded without having to
know their addresses. By naming constant data, the value of each constant is stated only in one place, the place
where it is named. If a program modification requires you to change the values of the constants, changing the
definition of the constant in that one place changes the value wherever the name has been used in the program.
These symbolic names given to routines and data are known as labels.
As your source program changes during development, the assembler will resolve each label reference
anew each time an assembly is performed, allowing code insertions and deletions to be made. If you
hard-coded the addresses yourself, you would have to recalculate them by hand each time you inserted or
deleted a line of code.
The use of an assembler also lets you comment your program within the source file; that is, to explain
in English what it is you intend the adjacent assembly statements to do and accomplish.
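The way an assembler re-resolves labels on every assembly can be sketched with a simple location counter. This is an illustrative Python model, not the behavior of any particular assembler; the function name resolve_labels, the (label, size) statement format, and the origin address $2000 are all assumptions.

```python
def resolve_labels(statements, origin=0x2000):
    """Assign an address to each label.

    statements is a list of (label_or_None, size_in_bytes) pairs, one per
    source line, in program order.
    """
    addresses = {}
    location = origin                 # address of the next byte to be generated
    for label, size in statements:
        if label is not None:
            addresses[label] = location
        location += size
    return addresses

program = [("START", 2), (None, 3), ("LOOP", 1)]
print(resolve_labels(program))

# Insert a two-byte instruction ahead of LOOP and assemble again:
program.insert(1, (None, 2))
print(resolve_labels(program))
```

After the insertion, LOOP moves up by two bytes without any change to the lines that refer to it by name.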
More sophisticated macro assemblers take symbol manipulation even further, allowing special labels,
called macro instructions (or just macros for short), to be assigned to a whole series of instructions. Macro is
a Greek word meaning long, so a macro instruction is a "long" instruction. Macros usually represent a series of
instructions which will appear in the code frequently with slight variations. When you need the series, you can
type in just the macro name, as though it were an instruction mnemonic; the assembler automatically expands
the macro instruction to the previously defined string of instructions. Slight variations in the expansion are
provided for by a mechanism that allows macro instructions to have operands.
Label     Opcode    Operand      Comment
Field     Field     Field        Field

          REP       #$10
          LONGI     ON
          SEP       #$20
          LONGA     OFF
          LDY       #0
LOOP      LDA       (1,S),Y
          BEQ       PASS
          CMP       (3,S),Y
          BNE       FAIL
          INY
          BRA       LOOP
PASS      PLP
          CLC
          BRA
FAIL      LDA
          BEQ
          PLP
          SEC
Looping
Let's say you write a program to convert a Fahrenheit temperature to Celsius. If you had only one
temperature to convert, you wouldn't spend the time writing a program. What you want the program to do is
prompt for a Fahrenheit temperature, convert it to Celsius, print out the result, then loop back and prompt for
another Fahrenheit temperature, and so on until you run out of temperatures to convert. This program uses a
programming concept called looping or iteration, which is simply the idea that the same code can be re-executed
repeatedly with different values for key variables until a given exit condition is met. In this case the exit
condition might be the entry of a null or empty input string.
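The temperature loop just described can be sketched as follows. This is a Python illustration of the looping idea only, not 65x code; the function names are assumptions, and a list of strings stands in for input typed at a prompt.

```python
def fahrenheit_to_celsius(fahrenheit):
    return (fahrenheit - 32) * 5 / 9

def convert_until_blank(entries):
    """Convert each entry in turn; an empty entry is the exit condition."""
    results = []
    for entry in entries:
        if entry.strip() == "":       # null or empty input: leave the loop
            break
        results.append(fahrenheit_to_celsius(float(entry)))
    return results

print(convert_until_blank(["212", "32", "", "50"]))
```

The same conversion code runs once per entry; only the value of the key variable changes, and the empty string ends the loop before the final entry is ever examined.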
Often, it's not the whole program that loops, but just a portion of it. While a poker program could deal
out 20 cards, one at a time, to four players, it would use much less program memory to deal out one card to each
of the players, then loop back to do the same thing over again four more times, before going on to take bets and
play the poker hands dealt.
Looping saves writing repetitive code over and over again, which is both tedious and uses up memory.
The 65x microprocessors execute loops by means of branch and jump instructions.
Looping almost always uses the principle of selection between paths to handle exiting the loop. In the
poker program, after each set of four cards has been dealt to the four players, the program must decide if that
was the fifth set of four cards or if there are more to deal. Four times it will select to loop back and deal another
set; the fifth time, it will select another path to break out of the loop to begin prompting for bets.
Subroutines
Even with loops, programmers could find themselves writing the same section of code over and over
when it appears in a program not in quick succession but rather recurring at irregular intervals throughout the
program. The solution is to make the section of code a subroutine, which the program can call as many times
and from as many locations as it needs to by means of a jump-to-subroutine instruction. The program, on
encountering the subroutine call, makes note of its current location for purposes of returning to it, then jumps to
the beginning of the subroutine code. At the end of the subroutine code, a return-from-subroutine instruction
tells the program to return from the subroutine to the instruction after the subroutine call. There are several
different types of calls and returns available on the different 65x processors; all of them have a basic call and
return instruction in common.
Programmers often build up large libraries of general subroutines that multiply, divide, output
messages, send bytes to and receive bytes from a communications line, output binary numbers in ASCII,
translate numbers from keyboard ASCII into binary, and so on. Then when one of these subroutines is needed,
the programmer can get a copy from the library or include the entire library as part of his program.
Part 2
Architecture
2) Chapter Two
Architecture of the 6502
This chapter, and the two which follow, provide overviews of the architecture of the four 65x family
processors: the 6502, the 65C02, and the 65802/65816. Each chapter discusses the register set and the function
of the individual registers, the memory model, the addressing modes, and the kinds of operations available for
each respective processor. Because each successive processor is a superset of the previous one, each of the next
two chapters will build on the material already covered. Much of what is discussed in this chapter will not be
repeated in the next two chapters because it is true of all 65x processors. As the original 65x machine, the 6502
architecture is particularly fundamental, since it describes a great number of common architectural features.
Microprocessor Architecture
The number, kinds, and sizes of registers, and the types of operations available using them, define the
architecture of a processor. This architecture determines the way in which programming problems will be
solved. An approach which is simple and straightforward on one processor may become clumsy and inefficient
on another if the architectures are radically different.
A register is a special memory location within the processor itself, where intermediate results,
addresses, and other information which must be accessed quickly are stored. Since the registers are within the
processor itself, they can be accessed and manipulated much faster than external memory. Some instructions
perform operations on only a single bit within a register; others on two registers at once; and others move data
between a register within the processor and external memory. (Although the registers are indeed a special kind
of memory, the term memory will be used only to refer to the addressable memory external to the
microprocessor registers.)
The 6502 is not a register-oriented machine. As you will see, it has a comparatively small set of
registers, each dedicated to a special purpose. The 6502 instead relies on its large number of addressing modes,
particularly its direct-page indirect addressing modes, to give it power.
An addressing mode is a method, which may incorporate several intermediate calculations involving
index registers, offsets, and base addresses, for generating an instruction's effective address: the memory
address at which data is read or written. Many 6502 instructions, such as those for addition, have many
alternate forms, each specifying a different addressing mode. The selection of the addressing mode by you, the
programmer, determines the way in which the effective address will be calculated.
There are three aspects to learning how to program the 6502 or any processor. Learning the different
addressing modes available and how to use them is a big part. Learning the available instructions and
operations, such as addition, subtraction, branching, and comparing, is another. But to make sense of either,
you must begin by understanding what each of the different registers is and does, and how the memory is
organized.
If you compare the different processors in the 65x family (the eight-bit 6502 and 65C02 and the
sixteen-bit 65816 and 65802) you will find they all have a basic set of registers and a basic set of addressing
modes in common: the 6502's.
Finally, the program counter, or PC, is a pointer to the memory location of the instruction to be executed
next.
These six basic 6502 registers are depicted in the programmer model diagrammed in Figure 2.1.
Notice that, with the exception of the program counter (PC), all of them are eight-bit registers. Because they
can contain only eight bits, or one byte, of data at a time, they can only perform operations, such as addition, on
one byte at a time. Hence the 6502 is characterized as an eight-bit processor.
Although the user registers of the 6502 are only eight bits wide, all of the external addresses generated
are sixteen bits. This gives the 6502 an address space of 64K (2^16 = 65,536). In order to access data located
anywhere in that 64K space with an eight-bit processor, one instruction operand in calculating effective
addresses is almost always found in memory (either in the code itself following an instruction, or at a specified
memory location) rather than in a register, because operands in memory have no such limits. All that is needed
to make a memory operand sixteen bits are two adjacent memory locations to put them in.
To allow programs longer than 256 bytes, the program counter, which always points to the location of
the next instruction to be executed, is necessarily sixteen bits, or two bytes, wide. You may therefore locate a
6502 program anywhere within its 64K address space.
Now each of the 6502 registers will be described in more detail.
The Accumulator
The accumulator (A) is the primary register in the 65x processor. Almost all arithmetic and most logical
operations are performed on the data in the accumulator, with the result of the operation being stored in the
accumulator. For example, to add two numbers which are stored in memory, you must first load one of them
into the accumulator. Then you add the other to it and the result is automatically stored in the accumulator,
replacing the value previously loaded there.
Because the accumulator is the primary user register, there are more addressing modes for accumulator operations
than for any other register.
The 6502 accumulator is an eight-bit register. Only one byte is ever fetched from memory when the accumulator
is loaded, or for operations which use two values, one from memory and the other in the accumulator (as in the
addition example above).
Accumulator (A)
X Index Register (X)
Y Index Register (Y)
Program Counter (PC), bits 15 through 0
Status register flags:
    Carry                 1 = Carry
    Zero                  1 = Result Zero
    IRQ Disable           1 = Disable
    Decimal Mode          1 = Decimal Mode
    Break Instruction     1 = Break caused interrupt
    Overflow              1 = Overflow
    Negative              1 = Negative
The use of indexing allows easy access to a continuous series of memory locations, such as a
multiple-byte object. Indexing is performed by adding one of several forms of base addresses, specified in the
operand field of an instruction, to the contents of an index register. While a constant operand is fixed when a
program is created, the index registers are variable and their contents can be changed readily during the
execution of a program. As a result, indexing provides an extremely flexible mechanism for accessing data in memory.
Although the X and Y index registers are basically similar, their capabilities are not identical. Certain
instructions and addressing modes work only with one or the other of these registers. The indirect indexed
addressing modes require the Y register. And while the X is primarily used with direct page indexed and
absolute indexed addressing, it has its own unique (though infrequently used) indexed indirect addressing
mode. These differences will become clear as you learn more about the different addressing modes.
Name       Abbrev   Bit   Explicitly set or clear          Set or cleared to reflect an operation result
negative   n        7                                      Reflects most significant bit of result
                                                           (the sign of a two's-complement binary number):
                                                             0 = high bit clear (positive result)
                                                             1 = high bit set (negative result)
zero       z        1
overflow   v        6     Clear to reverse set-overflow
                          hardware input
carry      c        0                                      Arithmetic overflow:
                                                             addition: carry out of high bit:
                                                               0 = no carry
                                                               1 = carry
                                                             subtraction: borrow required to subtract:
                                                               0 = borrow required
                                                               1 = no borrow required
                                                           Logic:
                                                             receives bit shifted or rotated out;
                                                             source of bit rotated in
break      b        4
In connection with the carry flag, it is important to know that the 6502 add operation has been designed
to always add in the carry, and the subtract operation to always use the carry as a borrow flag, making it
possible to do multiple-precision arithmetic where you add successively higher sets of bytes plus the previous
add's carry, or subtract successively higher sets of bytes, taking into the operation the previous subtract's
borrow. The drawback to this scheme is that the carry must be zeroed before starting an add and set before
starting a subtraction.
In the case of subtraction, the 6502's carry flag is an inverted borrow, unlike that of most other
microprocessors. If a borrow occurred during the last operation, it is cleared; if a borrow did not result, it is set.
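The inverted-borrow convention is easiest to see in a small model. The following Python sketch covers binary-mode subtraction only and treats the subtract as an add of the complemented operand plus the carry; the function names sbc and subtract16 are assumptions for illustration.

```python
def sbc(a, m, carry):
    """Subtract byte m from byte a, with the carry acting as an inverted borrow.

    carry = 1 going in means no borrow is pending; carry = 1 coming out
    means no borrow was required.
    """
    total = a + (m ^ 0xFF) + carry
    return total & 0xFF, total >> 8

def subtract16(a, b):
    """Sixteen-bit subtraction built from two eight-bit subtracts."""
    carry = 1                         # set the carry before starting
    low, carry = sbc(a & 0xFF, b & 0xFF, carry)
    high, carry = sbc(a >> 8, b >> 8, carry)
    return (high << 8) | low, carry

print(sbc(5, 3, 1))                   # no borrow needed
print(sbc(3, 5, 1))                   # borrow needed
print(subtract16(0x0100, 0x0001))     # borrow passes from low byte to high byte
```

In the sixteen-bit case, the low-byte subtract leaves the carry clear, and that cleared carry is what reduces the high byte by one.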
Finally, notice that in the status register itself, the break bit has no function. Only when an interrupt
pushes the status register onto the stack is the break bit either cleared or set to indicate the type of interrupt
responsible.
Table 2.2 describes the other two P register flags, the mode select flags: by explicitly setting or clearing
them, you can change the operational modes of the processor.
Name        Abbrev   Bit   Reason to explicitly set or clear
decimal     d        3     Determines mode for add & subtract (not increment/decrement, though):
                             Set to force decimal operations (BCD)
                             Clear to return to binary operation
interrupt   i        2     Enables or disables processor's IRQ interrupt line:
                             Set to disable interrupts by masking the IRQ line
                             Clear to enable IRQ interrupts
Table 2-2 Status Register Mode Select Flags
The decimal mode flag toggles add and subtract operations (but not increment or decrement
instructions) between binary and decimal (BCD). Most processors require a separate decimal-adjust operation
after numbers represented in decimal format have been added or subtracted. The 65x processors do on-the-fly
decimal adjustment when the decimal flag is set.
The IRQ disable, or interrupt disable, flag toggles between enabling and disabling interrupts.
Typically, the interrupt mask is set during time-critical loops, during certain I/O operations, and while servicing
another interrupt.
Stack
$01FF   (1st available)
$01FE
$01FD
$01FC
$01FB
In addition to being available as a temporary storage area, the stack is also used by the system itself in
processing interrupts, subroutine calls, and returns. When a subroutine is called, the current value of the
program counter is pushed automatically onto the stack; the processor executes a return instruction by reloading
the program counter with the value on the top of the stack.
While data is pushed into successively lower memory locations on the 65x stacks, the location of the
last data pushed is nonetheless referred to as the top of the stack.
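The downward-growing stack can be modeled in a few lines. This Python sketch is an illustration only; the class name Stack6502 is an assumption, as are the details that the stack occupies page one ($0100 through $01FF) and that the stack pointer starts at $FF so that $01FF is the first available location, as the stack diagram shows.

```python
class Stack6502:
    def __init__(self):
        self.memory = bytearray(0x10000)   # 64K of memory
        self.s = 0xFF                      # stack pointer: next available location

    def push(self, value):
        self.memory[0x0100 + self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF       # stack grows toward lower addresses

    def pull(self):
        self.s = (self.s + 1) & 0xFF
        return self.memory[0x0100 + self.s]

stack = Stack6502()
stack.push(0x12)                           # lands at $01FF
stack.push(0x34)                           # lands at $01FE, now the top of the stack
print(f"${stack.pull():02X}", f"${stack.pull():02X}")
```

The last byte pushed is the first one pulled, even though it sits at the lower address.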
Stack
$01FF
$01FE   (next available)
$01FD
$01FC
$01FB
Accumulator
Addressing Modes
The fourteen different addressing modes that may be used with the 6502 are shown in Table 2.3. The
availability of this many different addressing modes on the 6502 gives it much of its power: Each one allows a
given instruction to specify its effective address (the source of the data it will reference) in a different manner.
Not all addressing modes are available for all instructions; but each instruction provides a separate
opcode for each of the addressing modes it supports.
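Addressing Mode                        Syntax Example
                                       Opcode    Operand
Implied                                DEX
Accumulator                            ASL       A
Immediate                              LDA       #55
Absolute                               LDA       $2000
Program Counter Relative               BEQ       LABEL12
Stack                                  PHA
Zero Page                              LDA       $81
Absolute Indexed with X                LDA       $2000,X
Absolute Indexed with Y                LDA       $2000,Y
Zero Page Indexed with X               LDA       $55,X
Zero Page Indexed with Y               LDX       $55,Y
Absolute Indirect                      JMP       ($1020)
Zero Page Indirect Indexed with Y      LDA       ($55),Y
Zero Page Indexed Indirect with X      LDA       ($55,X)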
For some of the 6502 addressing modes, the entire effective address is provided in the operand field of
the instruction; for many of them, however, formation of the effective address involves an address calculation,
that is, the addition of two or more values. The addressing mode indicates where these values are to come from
and how they are to be added together to form the effective address.
Implied addressing instructions, such as DEY and INX, need no operands. The register that is the
source of the data is named in the instruction mnemonic and is specified to the processor by the opcode.
Accumulator addressing, in which data to be referenced is in the accumulator, is specified to the assembler by
the operand A. Immediate addressing, used to access data which is constant throughout the execution of a
program, causes the assembler to store the data right into the instruction stream. Relative addressing provides
the means for conditional branch instructions to require only two bytes, one byte less than jump instructions
take. The one-byte operand following the branch instruction is an offset from the current contents of the
program counter. Stack addressing encompasses all instructions, such as push or pull instructions, which use
the stack pointer register to access memory. And absolute addressing allows data in memory to be accessed by
means of its address.
Like the 6800 processor, the 6502 treats the zero page of memory specially. A page of memory is an
address range $100 bytes (256 decimal) long: the high bytes of the addresses in a given page are all the same,
while the low bytes run from $00 through $FF. The zero page is the first page of memory, from $0000 through
$00FF (the high byte of each address in the zero page is zero). Zero page addressing, a short form of absolute
addressing, allows zero page operands to be referenced by just one byte, the lower-order byte, resulting both in
fewer code bytes and in fewer clock cycles.
While most other processors provide for some form of indexing, the 6502 provides some of the
broadest indexing possibilities. Indexed effective addresses are formed from the addition of a specified base
address and index, as shown in Figure 2.4. Because the 6502's index registers (X and Y) can hold only eight
bits, they are seldom used to hold index bases; rather, they are almost always used to hold the indexes
themselves. The 6502's four simplest indexing modes add the contents of the X or Y register to an absolute or
zero page base.
Indirection (Figure 2.5) is less commonly found in microprocessor repertoires, particularly among
those microprocessors of the same design generation as the 6502. It lets the operand specify an address at
which another address, the indirect address, can be found. It is at this second address that data will be
referenced. The 6502 not only provides indirection for its jump instruction, allowing jumps to be vectored and
revectored, but it also combines indirection with indexing to give it real power in accessing data. It's as though
the storage cells for the indirect addresses are additional 6502 registers, massively extending the 6502's register
set and possibilities. In one addressing mode, indexing is performed before indirection; in another, after. The
first provides indexing into an array of indirect addresses and the second provides indexing into an array which
is located by the indirect address.
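The three kinds of address calculation described above can be compared side by side. This is a minimal Python sketch of effective-address formation only; the function names are assumptions, memory is modeled as a 64K byte array, and the sample values follow the base of $2000 and the indirect address of $3458 used in the accompanying figures.

```python
def absolute_indexed(base, index):
    """$nnnn,X or $nnnn,Y: add the index to the base address."""
    return (base + index) & 0xFFFF

def read_pointer(memory, zp_address):
    """Read a two-byte indirect address (low byte first) from the zero page."""
    low = memory[zp_address & 0xFF]
    high = memory[(zp_address + 1) & 0xFF]
    return (high << 8) | low

def indirect_indexed(memory, zp_address, y):
    """($nn),Y: indirection first, then indexing."""
    return (read_pointer(memory, zp_address) + y) & 0xFFFF

def indexed_indirect(memory, zp_address, x):
    """($nn,X): indexing first, then indirection."""
    return read_pointer(memory, (zp_address + x) & 0xFF)

memory = bytearray(0x10000)
memory[0x20], memory[0x21] = 0x58, 0x34        # indirect address $3458

print(f"${absolute_indexed(0x2000, 0x03):04X}")
print(f"${indirect_indexed(memory, 0x20, 0x02):04X}")
print(f"${indexed_indirect(memory, 0x1E, 0x02):04X}")
```

In the second call, Y indexes into the array located at $3458; in the third, X selects which zero page cell holds the indirect address.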
The full set of 65x addressing modes is explained in detail in Chapters 7 and 11 and is reviewed in
the Reference Section.
Base = $2000
Index Register X = $03
Effective Address = $2003

Memory
$2000   <- Base = $2000
$2001
$2002
$2003   <- Effective Address (Base plus X, X = $03)
$2004
Instructions
The 6502 has 56 operation mnemonics, as listed in Table 2.4, which combine with its many addressing
modes to make 151 instructions available to 6502 programmers.
Arithmetic instructions are available, including comparisons, increment, and decrement. But missing
are addition or subtraction instructions which do not involve the carry; as a result, you must clear the carry
before beginning an add and set it before beginning subtraction.
Memory
$001F
$0020   $58
$0021   $34
$0022
$0023
  .
  .
$3456
$3457
$3458   <- indirect address $3458 (from $0020 and $0021)
$3459
$345A
Mnemonic   Description
ADC        Add memory and carry to accumulator
AND        And accumulator with memory
ASL        Shift memory or accumulator left one bit
BCC        Branch if carry clear
BCS        Branch if carry set
BEQ        Branch if equal
BIT        Test memory bits against accumulator
BMI        Branch if negative
BNE        Branch if not equal
BPL        Branch if plus
BRK        Software break (interrupt)
BVC        Branch if overflow clear
BVS        Branch if overflow set
CLC        Clear carry flag
CLD        Clear decimal mode flag
CLI        Clear interrupt-disable flag
CLV        Clear overflow flag
CMP        Compare accumulator with memory
CPX        Compare index register X with memory
CPY        Compare index register Y with memory
DEC        Decrement
DEX        Decrement index register X
DEY        Decrement index register Y
EOR        Exclusive-OR accumulator with memory
INC        Increment
INX        Increment index register X
INY        Increment index register Y
JMP        Jump
JSR        Jump to subroutine
LDA        Load accumulator from memory
LDX        Load index register X from memory
LDY        Load index register Y from memory
LSR        Logical shift memory or accumulator right
NOP        No operation
ORA        OR accumulator with memory
PHA        Push accumulator onto stack
PHP        Push status flags onto stack
PLA        Pull accumulator from stack
PLP        Pull status flags from stack
ROL        Rotate memory or accumulator left one bit
ROR        Rotate memory or accumulator right one bit
RTI        Return from interrupt
RTS        Return from subroutine
SBC        Subtract memory with borrow from accumulator
SEC        Set carry flag
SED        Set decimal mode flag
SEI        Set interrupt-disable flag
STA        Store accumulator to memory
STX        Store index register X to memory
STY        Store index register Y to memory
TAX        Transfer accumulator to index register X
TAY        Transfer accumulator to index register Y
TSX        Transfer stack pointer to index register X
TXA        Transfer index register X to accumulator
TXS        Transfer index register X to stack pointer
TYA        Transfer index register Y to accumulator
Table 2-4 6502 Instructions
Pipelining
The 65x microprocessors have the capability of doing two things at once: the 6502 can be carrying on
an internal activity (like an arithmetic or logical operation) even as it's getting the next instruction byte from the
instruction stream or accessing data in memory.
A processor is driven by a clock signal which synchronizes events within the processor with memory
accesses. A cycle is a basic unit of time within which a single step of an operation can be performed. The
speed with which an instruction can be executed is expressed in the number of cycles required to complete it.
The actual speed of execution is a function both of the number of cycles required for completion and the
number of timing signals provided by the clock every second. Typical clock values for 65x processors start at
one million cycles per second and go up from there.
As a result of the 6502's capability of performing two different but overlapping phases of a task within
a single cycle, which is called pipelining, the 65x processors are much faster than non-pipelined processors.
Take the addition of a constant to the 6502's eight-bit accumulator as an example. This requires five
distinct steps:
Step 1: Fetch the instruction opcode ADC.
Step 2: Interpret the opcode to be ADC of a constant.
Step 3: Fetch the operand, the constant to be added.
Step 4: Add the constant to the accumulator contents.
Step 5: Store the result back to the accumulator.
Pipelining allows the 6502 to execute steps two and three in a single cycle: after getting an opcode, it
increments the program counter, puts the new program address onto the address bus, and gets the next program
byte, while simultaneously interpreting the opcode. The completion of steps four and five overlaps the next
instruction's step one, eliminating the need for two additional cycles.
So the 6502's pipelining reduces the operation of adding a constant from five cycles to two!
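The relationship between cycle count, clock rate, and elapsed time described above can be worked through numerically. This is a small Python sketch; the function name execution_time_us is an assumption, and the one-megahertz clock is the starting clock value mentioned earlier in this section.

```python
def execution_time_us(cycles, clock_hz):
    """Elapsed time in microseconds for a given number of clock cycles."""
    return cycles * 1_000_000 / clock_hz

CLOCK_HZ = 1_000_000                              # one million cycles per second

without_pipelining = execution_time_us(5, CLOCK_HZ)  # one cycle per step
with_pipelining = execution_time_us(2, CLOCK_HZ)     # overlapped steps
print(without_pipelining, with_pipelining)
```

At the same clock rate, the pipelined add finishes in two-fifths of the time; doubling the clock rate would halve both figures.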
The clock speed of a microprocessor has often been incorrectly presumed to be the sole determinant of
its speed. What is most significant, however, is the memory cycle time. The 68000, for example, which
typically operates at 6 to 12 megahertz (MHz, or millions of cycles per second) requires four clock periods to
read or write data to and from memory. The 65x processors require only one clock period. Because the 6502
requires fewer machine cycles to perform the same functions, a one-megahertz 6502 has a throughput unmatched
by the 8080 and Z80 processors until their clock rates are up to about four MHz.
The true measure of the relative speeds of various microprocessors can only be made by comparing
how long each takes, in its own machine code, to complete the same operation.
carry into the high byte of the address, as is often true, then the address fetched from was correct and there is no
cycle five; the operation is a four-cycle operation in this case. Absolute indexed writes, however, always
require five cycles.)
The low-high memory order means that the first operand byte, which the 6502 fetches before it even
knows that the opcode is LDA and the addressing mode is absolute indexed, is the low byte of the address base,
the byte which must be added to the index register value first; it can do that add while getting the high-byte.
Consider how high-low memory order would weaken the benefits of pipelining and slow the process
down:
Cycle 1: Fetch the instruction opcode, LDA.
Cycle 2: Fetch an operand byte, the high byte of an array base.
         Interpret the opcode to be LDA absolute indexed.
Cycle 3: Fetch the second operand byte, the low array base byte.
         Store the high byte temporarily.
Cycle 4: Add the contents of the index register to the low byte.
Cycle 5: Add the carry from the low address add to the high byte.
Cycle 6: Fetch the byte at the new effective memory address.
Memory-Mapped Input/Output
The 65x family (like Motorola's but unlike Zilog's and Intel's) accomplishes input and output not with
special opcodes, but by assigning each input/output device a memory location, and by reading from or writing
to that location. As a result, there's virtually no limit to the number of I/O devices which may be connected to a
65x system. The disadvantage of this method is that memory in a system is reduced by the number of locations
which are set aside for I/O functions.
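Memory-mapped I/O can be modeled as a memory system in which certain addresses are routed to devices instead of RAM. The following Python sketch is an illustration only; the class names Bus and OutputPort and the device address $C000 are hypothetical and do not describe any particular 65x system.

```python
class OutputPort:
    """A hypothetical output device that records every byte written to it."""
    def __init__(self):
        self.sent = []

    def read(self):
        return 0

    def write(self, value):
        self.sent.append(value)

class Bus:
    def __init__(self):
        self.ram = bytearray(0x10000)
        self.devices = {}                  # address -> device

    def map_device(self, address, device):
        self.devices[address] = device     # this location is no longer usable as RAM

    def read(self, address):
        device = self.devices.get(address)
        return device.read() if device is not None else self.ram[address]

    def write(self, address, value):
        device = self.devices.get(address)
        if device is not None:
            device.write(value & 0xFF)
        else:
            self.ram[address] = value & 0xFF

bus = Bus()
port = OutputPort()
bus.map_device(0xC000, port)
bus.write(0xC000, 0x41)                    # an ordinary store performs output
bus.write(0x0200, 0x07)                    # an ordinary store to RAM
print(port.sent, bus.read(0x0200))
```

The program uses the same store operation in both cases; only the address determines whether memory or a device receives the byte.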
Interrupts
Interrupts tell the processor to stop what it is doing and to take care of some more pressing matter
instead, before returning to where it left off in regular program code. An interrupt is much like a doorbell:
having one means you don't have to keep going to the door every few minutes to see if someone is there; you
can wait for it to ring instead.
An external device like a keyboard, for example, might cause an interrupt to present input. Or a clock
might generate interrupts to toggle the processor back and forth between two or more routines, letting it do
several tasks at once. A special kind of interrupt is reset (the panic button), which is generally used out of
NMOS Process
The 6502 is fabricated using the NMOS (pronounced "EN moss") process (for N-channel Metal-Oxide
Semiconductor). Still one of the most common of the technologies used in large-scale and very-large-scale
integrated circuits, NMOS was, at the time the 6502 was designed and for many years after, the most
cost-efficient of the MOS technologies and the easiest process for implementation of relatively high-speed parts.
This made NMOS popular among designers of microcomputers and other devices in which hardware cost was an
important design factor.
Most of the current generation of 8-, 16-, and 32-bit processors were originally implemented in NMOS.
Some, like the 6502, are still only available in NMOS process versions. Others, like all of the recently designed
members of the 65x family (65C02, 65802, and 65816) were produced exclusively using the CMOS process.
JMP ($20FF)
should cause the program counter to get, as its new low byte, the contents of $20FF, and as its new high byte,
the contents of $2100. However, while the 6502 increments the low byte of the indirect address from $FF to $00,
it fails to add the carry into the high byte, and as a result gets the program counter's new high byte from $2000
rather than $2100.
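One common precaution, sketched here with hypothetical labels, is to keep an indirect jump vector from ever occupying the last byte of a page. The EQU directive and the low-byte and high-byte operators (#< and #>) are assumptions about the assembler; their spelling varies.

```asm
VECTOR   EQU  $2100      ; two-byte vector at $2100-$2101, within one page
                         ; (a vector at $20FF would straddle two pages)

         LDA  #<HANDLER  ; store the target's low byte ...
         STA  VECTOR
         LDA  #>HANDLER  ; ... and its high byte
         STA  VECTOR+1
         JMP  (VECTOR)   ; both bytes are fetched from page $21, as intended

HANDLER  RTS             ; placeholder target
```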
You can also run into trouble trying to execute an unused opcode, of which the 6502 has many. The
results are unpredictable, but can include causing the processor to hang.
Finally, the decimal mode is not as easy to use as it might be. The negative, overflow, and zero flags in
the status register are not valid in decimal mode and the setting of the decimal flag, which toggles the processor
between binary and decimal math, is unknown after the processor has received a hardware reset.
3) Chapter Three
Architecture of the 65C02
The 65C02 microprocessor is an enhanced version of the 6502, implemented using a silicon-gate
CMOS process. The 65C02 was designed primarily as a CMOS replacement for the 6502. As a result, the
significant differences between the two products are few. While the 65C02 adds 27 new opcodes and two new
addressing modes (in addition to implementing the original 151 opcodes of the 6502), its register set, memory
model, and types of operations remain the same.
The 65C02 is used in the Apple IIc and, since early 1985, in the Apple IIe, and it has been provided as
an enhancement kit for earlier IIe's.
Remember that even as the 65C02 is a superset of the 6502, the 65802 and 65816, described in the next
chapter, are supersets of the 65C02. All of the enhancements found in the 65C02 are additionally significant in
that they are intermediate to the full 65816 architecture. The next chapter will continue to borrow from the
material covered in the previous ones, and generally what is covered in the earlier of these three architecture
chapters is not repeated in the subsequent ones, since it is true for all 65x processors.
Addressing Modes
The 65C02 introduces the two new addressing modes shown in Table 3.1, as well as supporting all the
6502 addressing modes. All of them will be explained in detail in Chapters 7 and 11, and will be reviewed in
the Reference Section.
Addressing Mode               Syntax Example
                              Opcode    Operand
Zero Page Indirect            LDA       ($55)
Absolute Indexed Indirect     JMP       ($2000,X)
Zero page indirect provides an indirect addressing mode for accessing data which requires no indexing
(the 6502's absolute indirect mode is available only to the jump instructions). 6502 programmers commonly
simulate indirection by loading an index register with zero (losing its contents and taking extra steps), then
using the preindexed or postindexed addressing modes to indirectly reference the data.
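The difference can be sketched as follows, assuming a two-byte pointer has already been stored at the hypothetical zero page locations $55 and $56:

```asm
; 6502: indirection through $55/$56 costs an index register
         LDY  #0         ; Y must be zeroed first, destroying its contents
         LDA  ($55),Y    ; postindexed (indirect indexed) with a zero index

; 65C02: zero page indirect needs no index register
         LDA  ($55)      ; load from the address stored at $55 (low), $56 (high)
```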
On the other hand, combining indexing and indirection proved so powerful for accessing data on the
6502 that programmers wanted to see this combination made available for tables of jump vectors. Absolute
indexed indirect, available for the jump instruction only, provides this multi-directional branching capability,
which can be very useful for case or switch statements common to many languages.
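A case statement built on this mode might look like the sketch below. All labels are hypothetical, the DW (define word) directive is an assumption about the assembler, and the code does no range checking on the case number.

```asm
; Dispatch on a case number 0..2 held in the accumulator.
DISPATCH ASL  A          ; each table entry is two bytes, so double the index
         TAX
         JMP  (TABLE,X)  ; 65C02 absolute indexed indirect

TABLE    DW   CASE0      ; entry 0: address of the first handler
         DW   CASE1
         DW   CASE2

CASE0    RTS             ; placeholder handlers
CASE1    RTS
CASE2    RTS
```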
Instructions
While the 65C02 provides 27 new opcodes, there are only eight new operations. The 27 opcodes result
from providing four different addressing modes for one of the new mnemonics and two for two others, and also
from expanding the addressing modes for twelve 6502 instructions. The most significant expansion of a 6502
instruction by combining it with a 6502 addressing mode it did not previously use is probably the addition of
accumulator addressing for the increment and decrement instructions.
The new 65C02 operations, shown in Table 3.2, answer many programmers' prayers: an unconditional
branch instruction, instructions to push and pull the index registers, and instructions to zero out memory cells.
These may be small enhancements, but they make programming the 65C02 easier, more straightforward, and
clearer to document. Two more operations allow the 65C02 to set or clear any or all of the bits in a memory
cell with a single instruction.
Instruction
Mnemonic    Description
BRA         Branch always (unconditional)
PHX         Push index register X onto stack
PHY         Push index register Y onto stack
PLX         Pull index register X from stack
PLY         Pull index register Y from stack
STZ         Store zero to memory
TRB         Test and reset memory bits against accumulator
TSB         Test and set memory bits against accumulator
Table 3-2. New 65C02 Instructions
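The short routine below is a sketch, not taken from the original text, that exercises most of these operations together. FLAGS is a hypothetical zero page location, and some assemblers spell the accumulator increment differently from INC A.

```asm
FLAGS    EQU  $10        ; hypothetical zero page cell holding status bits

ROUTINE  PHX             ; save X and Y directly; no transfer through A needed
         PHY
         STZ  FLAGS      ; clear a memory cell without loading zero into A
         LDA  #%00000101 ; mask: bits 0 and 2
         TSB  FLAGS      ; set bits 0 and 2 of FLAGS
         LDA  #%00000001 ; mask: bit 0
         TRB  FLAGS      ; clear bit 0 again; FLAGS now holds %00000100
         INC  A          ; accumulator addressing for increment (A becomes 2)
         PLY             ; restore in reverse order of the pushes
         PLX
         BRA  DONE       ; unconditional relative branch
         NOP             ; (skipped)
DONE     RTS
```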
CMOS Process
Unlike the 6502, which is fabricated in NMOS, the 65C02 is a CMOS (pronounced "SEE moss") part.
CMOS stands for Complementary Metal-Oxide Semiconductor.
The most exciting feature of CMOS is its low power consumption, which has made portable,
battery-operated computers possible. Its low power needs also result in lower heat generation, which
means parts can be placed closer together and heat-dissipating air space minimized in CMOS-based
computer designs.
CMOS technology is not a new process. It's been around for about as long as other MOS
technologies. But higher manufacturing costs during the early days of the technology made CMOS
impractical for the highly competitive microcomputer market until the mid-1980s, so process
development efforts were concentrated on NMOS and not applied to CMOS until 1980 or 1981.
CMOS technology has reached a new threshold in that most of its negative qualities, such as the
difficulty with which smaller geometries are achieved relative to the NMOS process, have been overcome.
Price has become competitive with the more established NMOS as well.
4) Chapter Four
Sixteen-Bit Architecture: The 65816 and the 65802
While the 65C02 was designed more as a CMOS replacement for the 6502 than an enhancement of it,
the 65802 and 65816 were created to move the earlier designs into the world of sixteen-bit processing.
Although the eight-bit 6502 had been a speed demon when first released, its competition changed over the
years as processing sixteen bits at a time became common, and as the memory new processors could address
started at a megabyte.
The 65816 and the 65802 were designed to bring the 65x family into line with the current generation of
advanced processors. First produced in prototypes in the second half of 1984, they were released
simultaneously early in 1985. The 65816 is a full-featured realization of the 65x concept as a sixteen-bit
machine. The 65802 is its little brother, with the 65816's sixteen-bit processing packaged with the 6502's
pinout for compatibility with existing hardware.
The two processors are quite similar. They are, in fact, two different versions of the same basic design.
In the early stages of the chip fabrication process they are identical and only assume their distinct
personalities during the final (metalization) phase of manufacture.
The two processors provide a wealth of enhancements: another nine addressing modes, 78 new
opcodes, a "hidden" second accumulator in eight-bit mode, and a zero page which, renamed the direct page, can
be relocated to any contiguous set of $100 bytes anywhere within the first 64K of memory (which in the case of
the 65802 is anywhere in its address space). The most dramatic of all the enhancements common to both 65802
and 65816, though, is the expansion of the primary user registers (the accumulator, index registers, and stack
pointer) to sixteen-bit word size. The accumulator and index registers can be toggled to sixteen bits from
eight, and back to eight when needed. The stack, pointed to by an expanded-to-sixteen-bit stack register, can be
relocated from page one to anywhere in a 64K range.
The primary distinction between the two processors is the range of addressable memory: the 65816 can
address up to sixteen megabytes; the 65802 is constrained by its 6502 pinout to 64K.
A secondary distinction between the two processors is that the 65816's new pinout also provides several
significant new signals for the hardware designer. While outside the primary scope of this book, these new
signals are mentioned in part in this chapter and described in some detail in Appendix C.
It is important to remember that the 65802 is in fact a 65816 that has been coerced to live in the
environment designed originally for the 6502 and 65C02. Outside of the memory and signal distinctions just
listed, the 65816 and the 65802 are identical. Both have a native mode, in which their registers can be used for
either eight- or sixteen-bit operations. Both have a 6502 emulation mode, in which the register set and
instruction timings emulate the eight-bit 6502 (not the 65C02) exactly (except they correct a few 6502 bugs).
All existing 6502 software can be run by the new processors (as can virtually all 65C02 software), even as most
of the native mode's enhancements (other than sixteen-bit registers) are programmable in emulation mode, too.
To access sixteen megabytes, the signals assigned to the various pins of the 65816's 40-pin package are
different from those of the 6502, the 65C02, and the 65802, so it cannot be installed in existing 65x computers as a
replacement upgrade. The 65802, on the other hand, has a pinout that is identical to that of the 6502 and 65C02
and can indeed be used as a replacement upgrade.
This makes the 65802 a unique, pin-compatible, software-compatible sixteen-bit upgrade chip. You
can pull a 6502 out of its socket in any existing 6502 system, and replace it with a 65802, because it powers on
in the 6502 emulation mode. It will run existing applications exactly the same as the 6502 did. Yet new
software can be written, and 6502 programs rewritten, to take advantage of the 65802's sixteen-bit capabilities,
resulting in programs which take up much less code space and which run faster. Unfortunately, even with a
65802 installed, an older system will remain unable to address memory beyond the original 64K limits of the
6502. This is the price of hardware compatibility.
The information presented in this chapter builds directly on the information in the previous two
chapters; it should be considered as a continuous treatment of a single theme. Even in native mode with
sixteen-bit registers, the 65802 and 65816 processors utilize many of the 6502 and 65C02 instructions, registers,
and addressing modes in a manner which differs little from their use on the earlier processors. If you are
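[Figure 4.1. 65816 native mode programming model. The diagram shows the sixteen-bit accumulator (high byte B,
low byte A, together called A or C) and the status register, whose flags are listed below.]
   Carry                        1 = Carry
   Zero                         1 = Result Zero
   IRQ Disable                  1 = Disabled
   Decimal Mode                 1 = Decimal, 0 = Binary
   Index Register Select        1 = 8-bit, 0 = 16-bit
   Memory/Accumulator Select    1 = 8-bit, 0 = 16-bit
   Overflow                     1 = Overflow
   Negative                     1 = Negative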
Figure 4.1 shows the programming model for the 65816 in native mode. While the accumulator is
shown as a sixteen-bit register, it may be set to be either a single sixteen-bit accumulator (A or C) or two
eight-bit accumulators, one accessible (A) and the other hidden but exchangeable (B). While the index registers are
shown as sixteen-bit registers, they may be set, as a pair, to be either sixteen-bit registers or eight-bit registers;
their high bytes are zeroed when they are set to eight bits. The obvious advantage of switching from a
processor with eight-bit registers to one with sixteen-bit registers is the ability to write programs which are from
25 to 50 percent shorter, and which run 25 to 50 percent faster due to the ease with which sixteen-bit data is
manipulated.
The feature that most clearly distinguishes the current generation of advanced microcomputer systems,
however, is the ability to address lots of memory. It is this increased memory addressability which has ushered
in the new era of microcomputer applications possibilities, such as large spreadsheets, integrated software,
multi-user systems, and more. In this regard, the 65816 stands on or above par with any of the other
high-performance microprocessors, such as the 68000, the 8086, or their successors.
There are two new eight-bit registers called bank registers. One, called the data bank register, is
shown placed above the index registers and the other, called the program bank register, is appended to the
program counter. The 65816 uses the two bank registers to provide 24-bit addressing.
A bank of memory is much like a page; just as a page is a range of memory that can be defined by
eight bits (256 bytes), a bank is a range of memory that can be defined by sixteen bits (64K bytes). For
processors like the 6502, which have only sixteen-bit addressing, a 64K bank is not a relevant concept, since the
only bank is the one being currently addressed. The 65816, on the other hand, partitions its memory range into
64K banks so that sixteen-bit registers and addressing modes can be used to address the entire range of memory.
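A minimal sketch of how a bank is selected, assuming a 65816: the data bank register supplies the top eight bits, so an ordinary sixteen-bit absolute address then reaches into whichever bank was chosen. Bank $02 and address $F000 are arbitrary illustrative values.

```asm
         SEP  #$20       ; m = 1: eight-bit accumulator
         LDA  #$02       ; bank number to select
         PHA             ; the data bank register is loaded
         PLB             ;   by pulling it from the stack
         LDA  $F000      ; sixteen-bit absolute address; reads bank $02 ($02F000)
```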
One effect of having a direct page register is that you can set up and alternate between multiple
direct page areas, giving each subroutine or task its own private direct page of memory, which can
prove both useful and efficient.
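A subroutine might claim a private direct page along these lines. This is a sketch that assumes native mode and a caller running with an eight-bit accumulator; the routine name and the page at $4600 are hypothetical.

```asm
WORKER   PHD             ; save the caller's direct page register
         REP  #$20       ; m = 0: sixteen-bit accumulator
         LDA  #$4600     ; new direct page base (page-aligned)
         TCD             ; transfer the sixteen-bit accumulator to D
         SEP  #$20       ; m = 1: back to an eight-bit accumulator
         LDA  $23        ; direct page offset $23 now means $004623
         PLD             ; restore the caller's direct page
         RTS
```

Because D is saved and restored on the stack, the caller's own direct page variables are untouched by the routine.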
The Stack Pointer
The native mode stack pointer holds a sixteen-bit address value. This means it can be set to point to any
location in bank zero. It also means the stack is no longer limited in length to just $100 bytes, nor limited to
page one ($100 to $1FF). Page one therefore loses its character as a special memory area and may be treated
like any other page while running the 65802 or 65816 in the native mode.
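Relocating the stack takes only a few instructions. The sketch below assumes the processor is already in native mode; the address $7FFF is an arbitrary illustrative choice.

```asm
         REP  #$10       ; x = 0: sixteen-bit index registers
         LDX  #$7FFF     ; hypothetical top-of-stack address in bank zero
         TXS             ; stack now grows downward from $007FFF
```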
eight-bit accumulator      eight-bit index registers      (m bit is set)      (x bit is set)
eight-bit accumulator      sixteen-bit index registers    (m bit is set)      (x bit is clear)
sixteen-bit accumulator    eight-bit index registers      (m bit is clear)    (x bit is set)
sixteen-bit accumulator    sixteen-bit index registers    (m bit is clear)    (x bit is clear)
When the opcode for a given instruction is fetched from memory during program execution, the
processor may respond differently based upon the setting of the register select flags. Their settings may be
thought of as extensions to the opcode. For example, consider the following instruction:
object code     instruction
BD00B0          LDA $B000,X
which loads the accumulator with data from the effective address formed by the sum of $B000 and the contents
of the X register. The X register contents can be either eight bits or sixteen, depending upon the value of the
index select flag. Furthermore, the accumulator will be loaded from the effective address with either eight or
sixteen bits of data, depending upon the value of the memory/accumulator select flag.
The instruction and addressing mode used in the example are found also on the 6502 and 65C02; the opcode byte
($BD) is identical on all four processors. The 65816's new mode flags greatly expand the scope of the 6502's instructions.
For programmers already familiar with the 6502, the understanding of this basic principle (how one opcode can have up to
four different effects based on the flag settings) is the single most important principle to grasp in moving to a quick
mastery of the 65802 or 65816.
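The sketch below runs the same three object bytes under two of the four flag settings. It assumes native mode; the index values are arbitrary. The immediate operands of LDX differ in length between the two halves, so a real assembler has to be told the current register sizes. The directive for that varies and is omitted here.

```asm
         SEP  #$30       ; m = 1, x = 1: eight-bit accumulator and index registers
         LDX  #$10
         LDA  $B000,X    ; BD 00 B0: loads one byte, from $B010

         REP  #$30       ; m = 0, x = 0: sixteen-bit accumulator and index registers
         LDX  #$1234
         LDA  $B000,X    ; BD 00 B0: loads two bytes, from $C234 (low) and $C235 (high)
```

SEP sets the status bits named in its operand and REP resets them; $20 is the m flag and $10 is the x flag.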
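[Figure: four panels showing register contents before and after a size switch, where HHHH HHHH is a high byte
and LLLL LLLL a low byte.]
   Index registers (x flag): in eight-bit mode (x=1) the low byte LLLL LLLL is kept and the high byte reads
   0000 0000; switching back to sixteen bits (x=0) leaves 0000 0000 in the high byte.
   Accumulator: 16 Bits to 8 (m=0 to m=1): the sixteen-bit A (also C), HHHH HHHH LLLL LLLL, becomes
   B = HHHH HHHH and A = LLLL LLLL.
   Accumulator: 8 Bits to 16 (m=1 to m=0): B = HHHH HHHH and A = LLLL LLLL become the sixteen-bit
   A (also C), HHHH HHHH LLLL LLLL.
Figure 4-2 Results of Switching Register Size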
Two status register bits were required for the two-flag eight-or-sixteen-bit scheme. While the 6502's
status register has only one unused status register bit available, its break flag is used only for interrupt
processing, not during regular program execution, to flag whether an interrupt comes from a break instruction or
from a hardware interrupt. By giving the break instruction its own interrupt vector in native mode, the 65816's
designers made a second bit available for the m and x register select flags.
65802/65816
Opcode    Operand
          $55
          $55,X
          $55,Y
LDA       ($55),Y
LDA       ($55,X)
LDA       ($55)
Notice in Table 4.2 that the assembler syntax for each direct page addressing mode (not to mention the
object bytes themselves) is the same as its zero page counterpart. The names and the results of the addressing
modes are what differ. Direct page addressing, like the 6502/65C02 zero page addressing, allows a memory
location to be addressed using only an eight-bit operand. In the case of the 6502, a sixteen-bit zero page effective
address is formed from an eight-bit offset by concatenating a zero high byte to it. In the 65802/65816, the direct
page effective address is formed by adding the eight-bit offset to the sixteen-bit value in the direct page register. This
lets you relocate the direct page anywhere in bank zero, on any byte boundary. Note, however, that it is most
efficient to start the direct page on a page boundary because this saves one cycle for every direct page
addressing operation.
When considering the use of 6502/65C02 zero page instructions as 65802/65816 direct page
instructions, remember that a direct page address of $23 is located in memory at location $0023 only if the
direct page register is set to zero; if the direct page register holds $4600, for example, the direct page address
$23 is located at $4623. The direct page is essentially an array which, when it was the zero page, began at
address zero, but which on the 65816 and 65802 can be set to begin at any location.
In the 6502/65C02, the effective address formed using zero page indexed addressing from a zero page
base address of $F0 and an index of $20 is $10; that is, zero page indexed effective addresses wrap around to
always remain in the zero page. In the emulation mode this is also true. But in native mode, there is no page
wraparound: a direct page starting at $2000 combined with a direct page base of $20 and a sixteen-bit index
holding $300 results in an effective address of $2320.
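The last case can be reproduced with the following sketch, which assumes native mode and uses the values from the paragraph above:

```asm
         REP  #$30       ; sixteen-bit accumulator and index registers
         LDA  #$2000
         TCD             ; direct page now starts at $2000
         LDX  #$0300     ; sixteen-bit index
         LDA  $20,X      ; effective address $2000 + $20 + $300 = $002320
```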
The three main registers of the 65802/65816 can, in native mode, be set to hold sixteen bits. When a
register is set to sixteen bits, then the data to be accessed by that register will also be sixteen bits.
For example, shifting the accumulator left one bit, an instruction which uses the accumulator addressing
mode, shifts sixteen bits left rather than eight if the accumulator is in sixteen-bit mode. Loading a sixteen-bit
index register with a constant using immediate addressing means that a sixteen-bit value follows the instruction
opcode. Loading a sixteen-bit accumulator by using absolute addressing means that the sixteen-bit value stored
starting at the absolute address, and continuing into the location at the next address, is loaded into the accumulator.
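These three cases can be sketched as follows, assuming native mode. VALUE is a hypothetical two-byte variable, and the assembler is assumed to have been told that the registers are sixteen bits wide.

```asm
VALUE    EQU  $3000      ; hypothetical two-byte variable at $3000-$3001

         REP  #$30       ; sixteen-bit accumulator and index registers
         LDX  #$1234     ; immediate operand is two bytes: object code A2 34 12
         LDA  VALUE      ; low byte from $3000, high byte from $3001
         ASL  A          ; shifts all sixteen bits left; bit 15 goes to the carry
         STA  VALUE      ; stores both bytes back
```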
Sixteen-bit index registers give new power to the indexed addressing modes. Sixteen-bit index registers
can hold values ranging up to 64K; no longer must the double-byte base of an array be specified as a constant
with the index register used for the index. A sixteen-bit index can hold the array base with the double-byte
constant specifying the (fixed) index.
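As a sketch of this reversal of roles, suppose each record in a table has a two-byte field at offset 4 (a hypothetical layout). The index register holds the address of the record in use, and the constant selects the field. Because the constant is below $100, most assemblers will generate direct page indexed addressing here, so the example also assumes native mode with the direct page register set to zero.

```asm
COUNT    EQU  4          ; fixed offset of the field within a record

         REP  #$30       ; sixteen-bit accumulator and index registers
         LDX  #$4000     ; X holds the base address of the record in use
         LDA  COUNT,X    ; the constant is the fixed index: reads $4004-$4005
```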
Finally, the 65816 has expanded the scope of 6502 and 65C02 instructions by mixing and matching
many of them with more of the 6502/65C02 addressing modes. For example, the jump-to-subroutine instruction
can now perform absolute indexed indirect addressing, a mode introduced on the 65C02 solely for the jump
instruction.
Addressing Mode                               Syntax Example
                                              Opcode    Operand
Program Counter Relative Long                 BRL       JMPLABEL
Stack Relative                                LDA       3,S
Stack Relative Indirect Indexed with Y        LDA       (5,S),Y
Block Move                                    MVP       0,0
Absolute Long                                 LDA       $02F000
Absolute Long Indexed with X                  LDA       $12D080,X
Absolute Indirect Long                        JMP       [$2000]
Direct Page Indirect Long                     LDA       [$55]
Direct Page Indirect Long Indexed with Y      LDA       [$55],Y
There are six new addressing modes that use the word "long", but with two very different meanings.
Five of the "long" modes provide 24-bit addressing for interbank accesses. Program counter relative long
addressing, on the other hand, provides an intrabank sixteen-bit form of relative addressing for branching. Like
all the other branch instructions, its operand is an offset from the current contents of the program counter, but
branch long's operand is sixteen bits instead of eight, which expands relative branching from plus 127 or minus
128 bytes to plus 32767 or minus 32768. This and other features greatly ease the task of writing
position-independent code. The use of the word "long" in the description of this addressing mode means "longer than an
eight-bit offset", whereas the word "long" used with the other five addressing modes means "longer than
sixteen bits".
Stack relative addressing and Stack relative indirect indexed with Y addressing treat the stack like
an array and index into it. The stack pointer register holds the base of the array, while a one-byte operand
provides the index into it. Since the stack register points to the next available location for data, a zero index is
meaningless: data and addresses which have been pushed onto the stack start at index one. For stack relative,
this locates the data; for stack relative indirect indexed, this locates an indirect address that points to the base of
an array located elsewhere. Both give you the means to pass parameters on the stack in a clean, efficient
manner. Stack relative addressing is a particularly useful capability, for example, in generating code for
recursive high-level languages such as Pascal or C, which store local variables and parameters on a stack
frame.
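Passing one parameter on the stack might look like the sketch below. It assumes native mode with a sixteen-bit accumulator; the label is hypothetical.

```asm
         PEA  $1234      ; caller pushes a two-byte parameter
         JSR  SHOWARG    ; JSR pushes a two-byte return address on top of it
         PLA             ; caller discards the parameter after the return
         RTS

SHOWARG  LDA  3,S        ; 1,S and 2,S hold the return address, so the
                         ;   parameter is at 3,S (low byte) and 4,S (high byte)
         RTS
```

The subroutine reads its parameter in place; nothing has to be pulled off the stack and pushed back to reach it.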
Block move addressing is the power behind two new instructions that move a block of bytes (up to
64K of them) from one memory location to another all at once. The parameters of the move are held in the
accumulator (the count), the index registers (the source and destination addresses), and a unique double operand
(source and destination addresses in the operand specify the source and destination banks for the move
operation).
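A block move can be set up as in this sketch, which assumes native mode and copies $100 bytes from bank $01 to bank $02 (arbitrary illustrative addresses). Note that the value loaded into the accumulator is the byte count minus one. The way the two banks are written in the operand differs between assemblers, so the form shown is an assumption.

```asm
         REP  #$30       ; sixteen-bit accumulator and index registers
         LDX  #$2000     ; source address within the source bank
         LDY  #$4000     ; destination address within the destination bank
         LDA  #$00FF     ; byte count minus one ($100 bytes)
         MVN  $01,$02    ; source bank, destination bank
```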
The five remaining "long" addressing modes provide an alternative to the use of bank registers for
referencing the 65816's sixteen-megabyte address space. They let you temporarily override the data bank
register value to address memory anywhere within the sixteen-megabyte address space. Absolute long
addressing, for example, is just like absolute addressing except that, instead of providing a two-byte absolute
address to be accessed in the data bank, you provide a three-byte absolute address which overrides the data bank.
Absolute long indexed with X, too, is four bytes instead of three. On the other hand, it is the memory locations
specified by absolute indirect long, direct page indirect long, and direct page indirect long indexed with Y
that hold three-byte indirect addresses instead of two-byte ones. Three-byte addresses in memory appear in
conventional 65x order; that is, the low byte is in the lower memory locations, the middle byte (still referred to
in 6502 fashion as the high byte) is in the next higher location, and the highest (bank) byte is in the highest
location.
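Both kinds of long addressing appear in the sketch below, which assumes native mode, an eight-bit accumulator, and a direct page register of zero. PTR is a hypothetical three-byte pointer; the addresses are those used in the syntax examples above.

```asm
PTR      EQU  $55        ; hypothetical three-byte pointer at $55-$57

         SEP  #$20       ; m = 1: eight-bit accumulator
         LDA  $02F000    ; absolute long: object code AF 00 F0 02

         LDA  #$80       ; build the pointer $12D080 in 65x order:
         STA  PTR        ;   low byte at the lowest address
         LDA  #$D0
         STA  PTR+1      ;   middle ("high") byte next
         LDA  #$12
         STA  PTR+2      ;   bank byte at the highest address
         LDA  [PTR]      ; direct page indirect long: reads $12D080
```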
Instructions
There are 78 new opcodes put into use through the 28 new operations listed in Table 4.4, as well as
through giving the previous processors' operations additional addressing modes.
Instruction
Mnemonic    Description
BRL         Branch always long
COP         Co-processor empowerment
JML         Jump long (interbank)
JSL         Jump to subroutine long (interbank)
MVN         Block move negative
MVP         Block move positive
PEA         Push effective absolute address onto stack
PEI         Push effective indirect address onto stack
PER         Push effective program counter relative address onto stack
PHB         Push data bank register onto stack
PHD         Push direct page register onto stack
PHK         Push program bank register onto stack
PLB         Pull data bank register from stack
PLD         Pull direct page register from stack
REP         Reset status bits
RTL         Return from subroutine long
SEP         Set status bits
STP         Stop the processor
TCD         Transfer 16-bit accumulator to direct page register
TCS         Transfer accumulator to stack pointer
TDC         Transfer direct page register to 16-bit accumulator
TSC         Transfer stack pointer to 16-bit accumulator
TXY         Transfer index register X to Y
TYX         Transfer index register Y to X
WAI         Wait for interrupt
WDM         Reserved for future two-byte opcodes
XBA         Exchange the B and A accumulators
XCE         Exchange carry and emulation bits
Table 4-4 New 65816/65802 Instructions
Five of the new push and pull instructions allow the new registers to be stored on the stack; the other
three let you push constants and memory values onto the stack without having to first load them into a register.
PER is unique in that it lets data be accessed relative to the program counter, a function useful when writing
relocatable code.
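The three register-free pushes can be sketched as follows, assuming native mode and a direct page register of zero. MSG is a hypothetical label, and the DB (define byte) directive is an assumption about the assembler.

```asm
         REP  #$10       ; x = 0: sixteen-bit index registers
         PEA  $1234      ; push the constant $1234
         PEI  ($55)      ; push the sixteen-bit value stored at $55-$56
         PER  MSG        ; push MSG's run-time address, computed relative to
                         ;   the program counter
         PLX             ; X = run-time address of MSG
         PLX             ; X = the value that was stored at $55-$56
         PLX             ; X = $1234
         RTS

MSG      DB   'HELLO'
```

The address pushed by PER stays correct wherever the code is loaded, which is what makes it useful in relocatable code.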
There are also instructions to transfer data between new combinations of the registers, including
between the index registers (a long-wished-for operation); to exchange the two bytes of the sixteen-bit
accumulator; and to exchange the carry and emulation bits, the only method for toggling the processor between
emulation and native modes.
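A short sketch of these operations, starting from emulation mode (the values are arbitrary):

```asm
         CLC             ; carry = 0 ...
         XCE             ; ... exchanged into e: the processor enters native mode
                         ;   (the carry now holds the old e bit)
         REP  #$30       ; sixteen-bit accumulator and index registers
         LDA  #$12AB
         XBA             ; swap the accumulator's two bytes: A = $AB12
         TAX
         TXY             ; copy X to Y directly
         SEC             ; carry = 1 ...
         XCE             ; ... exchanged into e: back to emulation mode
```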
There are new jump, branch, return, and move instructions already described in the section on
addressing modes. There's a new software interrupt provided for sharing a system with a co-processor. There
are two instructions for putting the processor to sleep in special low-power states. And finally, there's a
reserved opcode, called WDM (the initials of the 65816's designer, William D. Mensch, Jr.), reserved for some
future compatible processor as the first byte of a possible 256 two-byte opcodes.
Interrupts
Native mode supplies an entire set of interrupt vectors at different locations from the emulation mode
(and earlier 6502/65C02) ones to service native mode and emulation mode interrupts differently. Shown in
Table 4.5, all are in bank zero; in addition, the sixteen-bit contents of each vector point to a handling routine
which must be located in bank zero.
             Emulation Mode    Native Mode
IRQ          FFFE,FFFF         FFEE,FFEF
RESET        FFFC,FFFD         (none)
NMI          FFFA,FFFB         FFEA,FFEB
ABORT        FFF8,FFF9         FFE8,FFE9
BRK          (none)            FFE6,FFE7
COP          FFF4,FFF5         FFE4,FFE5
All locations are in bank zero.
As discussed earlier in this chapter, native mode frees up the b bit in the status register by giving the
break instruction its own vector. When the BRK is executed, the program counter and the status register are
pushed onto the stack and the program counter is loaded with the address at $FFE6, the break instruction vector
location.
The reset vector is only available in emulation mode because reset always returns the processor to that
mode.
The 65816/65802, in both emulation and native modes, also provides a new co-processor interrupt
instruction to support hardware co-processing, such as by a floating point processor. When the COP instruction
is encountered, the 65802's interrupt processing routines transfer control to the co-processor vector location.
Finally, the pinout on the 65816 provides a new abort signal. This lets external hardware prevent the
65816 from updating memory or registers while completing the current instruction, useful in sophisticated
memory-management schemes. An interrupt-like operation then occurs, transferring control through the special
abort vector.
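[Figure: programming model diagram showing the sixteen-bit accumulator (high byte B, low byte A, together
A or C) and the status register, with the emulation bit e (0 = Native Mode) and the flags listed below.]
   Carry                        1 = Carry
   Zero                         1 = Result Zero
   IRQ Disable                  1 = Disabled
   Decimal Mode                 1 = Decimal, 0 = Binary
   Index Register Select        1 = 8-bit, 0 = 16-bit
   Memory/Accumulator Select    1 = 8-bit, 0 = 16-bit
   Overflow                     1 = Overflow
   Negative                     1 = Negative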
Finally, the bank bytes specified to the block move instructions are ignored, too. Block moves are by
necessity entirely intrabank on the 65802.
Because the abort signal was designed into the 65816 by virtue of its redesigned pinout, its vector exists
on the 65802 but has no connection to the outside world. Since there is no way to abort an instruction without
using the external pin, the abort operation can never occur on the 65802.
In all other respects, the 65802 and the 65816 are identical, so the 65802 can almost be thought of as a
65816 in a system with only 64K of physical memory installed. Table 4.6 summarizes the differences between
the 65802 and 65816 native modes and the 6502 and 65C02.
Emulation Mode
That the 65802 provides a pinout the same as the 6502's and the 65C02's is not enough to run all the
software written for the earlier two processors. For one thing, the eight-bit software expects interrupt handlers
to distinguish break instructions by checking the stacked break flag, and the 65802's native mode has no break
flag, having replaced both it and the 6502's unused flag with the m and x flags. For another, 6502 instructions
that use eight-bit registers to set the stack would set only half of the sixteen-bit stack pointer. The native mode
interrupt vectors are different from their 6502/65C02 counterparts, as Table 4.5 showed. There are also little
differences; for example, while the direct page can be set to the zero page, direct page indexed addresses can
cross pages in native mode, but wrap on the 6502 and 65C02.
Reaching beyond hardware compatibility to software compatibility was clearly so important that the
designers of the 65802 and 65816 devised the 6502 emulation mode scheme. Both processors power-on in
emulation mode, with the bank registers and the direct page register initialized to zero. As a result of both this
and having the same pinout, a 65802 can be substituted for a 6502 in any application and will execute the
existing software the same. Furthermore, it is possible to design second-generation 65816 systems compatible
with existing 6502 designs which, provided the computer's designers do as good a job in providing
compatibility as the 65816's designers have, could run all existing software of the first generation system in
emulation mode, yet switch into native mode for sixteen-bit power and 24-bit addressing.
It is important to realize, however, that 6502 emulation mode goes far beyond emulating the 6502. It
embodies all the addressing mode and instruction enhancements of both the 65C02 and the 65802/65816; it has
a fully relocatable direct page register; it provides the stack relative addressing modes; and in the 65816's
emulation mode, it can switch between banks to use 24-bit addressing. The primary differences between native
and emulation modes are limitations placed on certain emulation mode registers and flags so that existing
programs are not surprised (and crashed) by non-6502-like results. These differences are summarized in Table
4.6.
                                6502             65C02            65802            65802            65816            65816
                                                                  Native           Emulation        Native           Emulation
6502 pinout                     yes              yes              yes              yes              no               no
6502 timing                     yes              no               no               yes              no               yes
abort signal                    no               no               no               no               yes              yes
accumulator                     8 bits           8 bits           16 or 8/8 bits   8/8 bits         16 or 8/8 bits   8/8 bits
addressing modes                14               16               25               25               25               25
address space                   64K              64K              64K              64K              16M              16M
bank registers                  none             none             not connected    not connected    yes              yes
block moves                     none             none             yes              of little use    yes              of little use
break flag                      yes              yes              no               yes              no               yes
decimal mode flags              N, V, Z invalid  N, V, Z invalid  N, V, Z valid    N, V, Z valid    N, V, Z valid    N, V, Z valid
direct/zero page indexing       wraps            wraps            crosses page     wraps            crosses page     wraps
decimal flag after interrupt    D not modified   D=0              D=0              D not modified   D=0              D not modified
decimal flag after reset        D unknown        D=0              D=0              D not modified   D=0              D not modified
index registers                 8 bits           8 bits           8 or 16 bits     8 bits           8 or 16 bits     8 bits
instructions                    151              178              256              256              256              256
interrupts                      FFFA-FFFF        FFFA-FFFF        FFE4-FFEF        FFF4-FFFF        FFE4-FFEF        FFF4-FFFF
mnemonics                       56               64               92               92               92               92
special page                    zero page        zero page        direct page      direct page      direct page      direct page
stack                           page 1           page 1           bank 0           page 1           bank 0           page 1
unused opcodes                  could crash      NOP              none             none             none             none
[Figure: register diagram. Recoverable labels: Accumulator (B) and Accumulator (A), bits 15 to 0; Program Bank Register (PBR); Direct; Program; status register flags e (Emulation), n (Negative, 1=Negative), Overflow (1=Overflow), Break Instruction (1=Break caused interrupt), Decimal Mode (1=Decimal, 0=Binary), IRQ Disable (1=Disabled), Zero (1=Result Zero), and c (Carry, 1=Carry).]
Part 3
Tutorial
5) Chapter Five
SEP, REP, and Other Details
Part Three is devoted to a step by step survey of all 92 different 65816 instructions and the 25 different
types of addressing modes which, together, account for the 256 operation codes of the 65802 and 65816. As a
matter of course, this survey naturally embraces the instruction sets of the 6502 and 65C02 as well.
The instructions are grouped into six categories: data movement, flow of control, arithmetic, logical and
bit manipulation, subroutine calls, and system control instructions. A separate chapter is devoted to each group,
and all of the instructions in a group are presented in their respective chapter.
The addressing modes are divided into two classes, simple and complex. The simple addressing modes
are those that form their effective address directly - that is, without requiring any, or only minimal, combination
or addition of partial addresses from several sources. The complex addressing modes are those that combine
two or more of the basic addressing concepts, such as indirection and indexing, as part of the effective address
calculation.
Almost all of the examples found in this book are intended to be executed on a system with either a
65802 or 65816 processor, and most include 65816 instructions, although there are some examples that are
intentionally restricted to either the 6502 or 65C02 instruction set for purposes of comparison.
Because of the easy availability of the pin-compatible 65802, there is a good chance that you may, in
fact, be executing your first sample programs on a system originally designed as a 6502-based system, with
system software such as machine-level monitors and operating systems that naturally support 6502 code only.
All of the software in this book was developed and tested on just such systems (Apple II computers with either
65802s replacing the 6502, or with 65816 processor cards installed).
It is assumed that you will have some kind of support environment allowing you to develop programs
and load them into memory, as well as a monitor program that lets you examine and modify memory, such as
that found in the Apple II firmware. Since such programs were originally designed to support 6502 code, the
case of calling a 65816 program from a 6502-based system program must be given special attention.
A 65802 or 65816 system is in the 6502 emulation mode when first initialized at power-up.
This is quite appropriate if the system software you are using to load and execute the sample programs
is 6502-based, as it would probably not execute correctly in the native 65816 mode.
Even though almost all of the examples are for the 65816 native mode of operation, the early examples
assume that the direct page register, program counter bank register, and data bank register are all in their default
condition - set to zero - in which case they provide an environment that corresponds to the 64K programming
space and zero page addressing of the 6502 and 65C02. Aside from keeping the examples simple, this permits
easy switching between the native mode and the emulation mode. If you have just powered up your 65816 or
65802 system, nothing needs to be done to alter these default values.
The one initialization you must do is switch from the emulation to the native mode. To switch out of
the 6502 emulation mode, which is the default condition upon powering up a system, the code in Fragment 5.1
must be executed once.
0000 18                  CLC
0001 FB                  XCE
Fragment 5.1.
This clears the special e flag, putting the processor into the 65816 native mode.
If you are using a 65802 processor in an old 6502 system, the above code needs to be executed each
time an example is called. Further, before exiting a 65816 program to return to a 6502 calling program, the
opposite sequence in Fragment 5.2 must be executed.
0000 38                  SEC
0001 FB                  XCE
Fragment 5.2.
Even if you are running your test programs from a fully supported 65816 or 65802 environment, you
should include the first mode-switching fragment, since the operating mode may be undefined on entry to a
program. Execution of the second should be acceptable since the system program should reinitialize itself to the
native mode upon return from a called program.
A further requirement to successfully execute the example programs is to provide a means for returning
control to the calling monitor program. In the examples, the RTS (return from subroutine) instruction is used.
The RTS instruction is not explained in detail until Chapter 12; however, by coding it at the end of each
example, control will normally return to the system program that called the example program. So to exit a
program, you will always code the sequence in Fragment 5.3.
0000 38                  SEC
0001 FB                  XCE
0002 60                  RTS
Fragment 5.3.
Some systems may have a mechanism other than RTS to return control to the system; consult your
system documentation.
In addition to these two details, a final pair of housekeeping instructions must be mastered early in
order to understand the examples.
These two instructions are SEP and REP (set P and reset P). Although they are not formally
introduced until Chapter 13, their use is essential to effective use of the 65802 and 65816. The SEP and REP
instructions have many uses, but their primary use is to change the value of the m and x flags in the status
register. As you recall from Chapter 4, the m and x flags determine the size of the accumulator and index
registers, respectively. When a flag is set (has a value of one), the corresponding register is eight bits; when a
flag is clear, the corresponding register is sixteen bits. SEP, which sets bits in the status register, is used to
change either the accumulator, or the index registers, or both, to eight bits; REP, which clears bits, is used to change
either or both to sixteen bits. Whenever a register changes size, all of the operations that move data in and out
of the register are affected as well. In this sense, the flag bits are extensions to the opcodes, changing their
interpretation by the processor.
The operand following the SEP and REP instructions is a mask of the flags to be modified. Since bit
five of the status register is the m memory/accumulator select flag, an instruction of the form:
                         REP    #%00100000
makes the accumulator size sixteen bits; a SEP instruction with the same argument (or its hexadecimal
equivalent, $20) would make it eight bits. The binary value for modifying the x flag is %00010000, or $10; the
value for modifying both flags at once is %00110000, or $30. The sharp (#) preceding the operand signifies the
operand is immediate data, stored in the byte following the opcode in program memory; the percent (%) and
dollar ($) signs are special symbols signifying either binary or hexadecimal number representation, respectively,
as explained in Chapter 1.
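As a brief sketch of the mask values just described (an illustration only, not one of the numbered fragments), the sequence below assumes the processor is already in native mode, where the m and x flags are available. Bits that are zero in the operand leave the corresponding status register bits unchanged.

```asm
         REP   #%00110000    ; clear m and x: accumulator and index registers are sixteen bits
         SEP   #%00100000    ; set m only: accumulator is eight bits, index registers stay sixteen
         SEP   #$10          ; set x as well: index registers are now eight bits
         REP   #$30          ; clear both again: all three registers are back to sixteen bits
```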
Understanding the basic operation of SEP and REP is relatively simple. What takes more skill is to
develop a sense of their appropriate use, since there is always more than one way to do things. Although there
is an immediate impulse to want to use the sixteen-bit modes for everything, it should be fairly obvious that the
eight-bit accumulator mode will, for example, be more appropriate to applications such as character
manipulation. Old 6502 programmers should resist the feeling that if they're not using the sixteen-bit modes
all the time they're not getting full advantage from their 65802 or 65816. The eight-bit accumulator and
index register size modes, which correspond to the 6502 architecture, can be used to do some of the kinds of
                         ORG    $7000
When included in a source program, will cause the next byte of code generated to be located at memory location
$7000, with subsequently generated bytes following it.
Values can be assigned labels with the global equate directive, GEQU. For example, in a card-playing
program, spades might be represented by the value $7F; the program is much easier to code (and read) if you
can use label SPADE instead of remembering which of four values goes with which of the four suits, as seen in
Fragment 5.4.
0000           SPADE     GEQU   $7F
0000           HEART     GEQU   $FF
0000           CLUB      GEQU   $3F
0000           DIAMOND   GEQU   $1F
Fragment 5.4.
                         LDA    #$7F
                         LDA    #SPADE
Once you have defined a label using GEQU, the assembler automatically substitutes the value assigned
whenever the label is encountered.
The # (sharp or pound sign) is used to indicate that the accumulator is to be loaded with an immediate
constant.
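The following sketch shows why the sharp sign matters. It reuses the SPADE equate from Fragment 5.4 and assumes, as is conventional for 65x assemblers, that an operand written without the sharp sign is treated as a memory address rather than as data.

```asm
SPADE    GEQU  $7F
         LDA   #SPADE        ; immediate: the accumulator receives the value $7F itself
         LDA   SPADE         ; no sharp sign: the accumulator is loaded from memory location $7F
```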
               BEGIN     LDA    #5
The label defines an entry point for a branch or jump to go to; when an instruction such as
     4C0400              JMP    BEGIN
is assembled, the assembler automatically calculates the value of BEGIN and uses that value as the operand of the JMP
instruction.
Variable and array space can be set aside and optionally labelled with the define storage directive, DS.
In Fragment 5.5, the first DS directive sets aside one byte at $1000 for the variable FLAG1; the second DS
directive sets aside 20 bytes starting at $1001 for ARRAY1.
0000                     ORG    $1000
0000           MAIN      START
0000 00        FLAG1     DS     1
0001 00000000  ARRAY1    DS     20
0015                     END
Fragment 5.5
The value stored at FLAG1 can be loaded into the accumulator by specifying FLAG1 as the operand of
the LDA instruction:
     AD0010              LDA    FLAG1
Program constants, primarily default values for initializing variables, prompts, and messages, are located in
memory and optionally given a label by the declare constant directive, DC. The first character(s) of its operand
specifies a type (A for two-byte addresses, I1 for one-byte integers, H for hex bytes and C for character strings,
for example) followed by the value or values to be stored, which are delimited by single quotes.
Fragment 5.6 gives an example. The first constant, DFLAG1, is a default value for code in the
program to assign to the variable FLAG1. You may realize that DFLAG1 could be used as a variable; with a
label, later values of the flag could be stored here and then there would be no need for any initialization code.
But good programming practice suggests otherwise: once another value is stored into DFLAG1, its initial value
is lost, which keeps the program from being restarted from memory. On the other hand, using a GEQU to set
up DFLAG1 would prevent you from patching the location with a different value should you change your mind
about its initial value after the code has been assembled.
0000 FE        DFLAG1    DC     I1'$FE'
0001 0010      COUNT     DC     A'$1000'
0003 496E7365  PROMPT    DC     C'Insert disk into drive 1'
001B 00                  DC     I1'0'
Fragment 5.6
Defining COUNT as a declared constant allows it, too, to be patched in object as well as edited in source.
PROMPT is a message to be written to the screen when the program is running. The assembler lists
only the first four object bytes generated (496E7365) to save room, but generates them all. The zero on the
next line acts as a string terminator.
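As a minimal sketch of the initialization code mentioned above, the lines below copy the default value into the working variable. They assume that FLAG1 (Fragment 5.5) and DFLAG1 (Fragment 5.6) are defined in the same program; the eight-bit accumulator setting is chosen here because both locations are one byte long.

```asm
         SEP   #%00100000    ; eight-bit accumulator, since both locations hold one byte
         LDA   DFLAG1        ; fetch the default value ($FE) from the declared constant
         STA   FLAG1         ; copy it into the working variable
```

Because DFLAG1 itself is never written to, the program can be restarted from memory and FLAG1 will be initialized again.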
Sometimes it is useful to define a label at a given point in the code, but not associate it with a particular
source line; the ANOP (assembler no-operation) instruction does this. The value of the label will be the
location of whatever the next code-generating or storage-reserving source line produces.
0000           BLACK     ANOP
0000 0000      WHITE     DS     2
Fragment 5.7
The two bytes of variable storage reserved may now be referred to as either BLACK or WHITE; their value is
the same.
Address Notation
The 16-megabyte address space of the 65816 is divided into 256 64K banks. Although it is possible to
treat the address space in a linear fashion - the range of bytes from $000000 to $FFFFFF - it is often desirable
and almost always easier to read if you distinguish the bank component of a 24-bit address by separating it with
a colon:
$00:FFF0
$XX:1234
$01:XXXX
In these examples, the X characters indicate that that address component can be any legal value; the
thing of interest is the specified component.
Similarly, when specifying direct page addresses, remember that a direct page address is only an offset;
it must be added to the value in the direct page register:
dp:$30
$1000:30
The dp in the first example is used to simply indicate the contents of the direct page register, whatever
it may be; in the second case, the value in the direct page register is given as $1000. Note that this notation is
distinguished from the previous one by the fact that the address to the left of the colon is a sixteen-bit value and the
address on the right is an eight-bit value. Twenty-four-bit addresses are the other way around.
A third notation used in this book describes ranges of addresses. Whenever two addresses appear together
separated by a single dot, the entire range of memory locations between and including the two addresses is being
referred to. For example, $2000.2001 refers to the double byte starting at $2000. If the high bytes of the second
address are omitted, they are assumed to have the same value as in the first address. Thus, $2000.03 refers to the
addresses between $2000 and $2003 inclusive.
6) Chapter Six
First Examples: Moving Data
Most people associate what a computer does with arithmetic calculations and computations. That is
only part of the story. A great deal of compute time in any application is devoted to simply moving data around
the system: from here to there in memory, from memory into the processor to perform some operation, and from
the processor to memory to store a result or to temporarily save an intermediate value. Data movement is one of
the easiest computer operations to grasp and is ideal for learning the various addressing modes (there are more
addressing modes available to the data movement operations than to any other class of instructions). It,
therefore, presents a natural point of entry for learning to program the 65x instruction set.
On the 65x series of processors - the eight-bit 6502 and 65C02 and their sixteen-bit successors, the
65802 and 65816 - you move data almost entirely using the microprocessor registers.
This chapter discusses how to load the registers with data and store data from the registers to memory
(using one of the simple addressing modes as an example), how to transfer and exchange data between registers,
how to move information onto and off of the stack, and how to move blocks (or strings) of data from one
memory location to another (see Table 6-1).
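Table 6-1.

            Available on:
Mnemonic    6502    65C02   65802/816   Description

Load/Store Instructions:
LDA         x       x       x           load the accumulator
LDX         x       x       x           load the X index register
LDY         x       x       x           load the Y index register
STA         x       x       x           store the accumulator
STX         x       x       x           store the X index register
STY         x       x       x           store the Y index register

Push Instructions:
PHA         x       x       x           push the accumulator
PHP         x       x       x           push the status register
PHX                 x       x           push the X index register
PHY                 x       x           push the Y index register
PHB                         x           push the data bank register
PHK                         x           push the program bank register
PHD                         x           push the direct page register

Pull Instructions:
PLA         x       x       x           pull the accumulator
PLP         x       x       x           pull the status register
PLX                 x       x           pull the X index register
PLY                 x       x           pull the Y index register
PLB                         x           pull the data bank register
PLD                         x           pull the direct page register

Transfer Instructions:
TAX         x       x       x           transfer A to X
TAY         x       x       x           transfer A to Y
TSX         x       x       x           transfer S to X
TXS         x       x       x           transfer X to S
TXA         x       x       x           transfer X to A
TYA         x       x       x           transfer Y to A
TCD                         x           transfer C accumulator to D
TDC                         x           transfer D to C accumulator
TCS                         x           transfer C accumulator to S
TSC                         x           transfer S to C accumulator
TXY                         x           transfer X to Y
TYX                         x           transfer Y to X

Exchange Instructions:
XBA                         x           exchange the B and A accumulators
XCE                         x           exchange the carry and emulation bits

Store Zero to Memory:
STZ                 x       x           store zero to memory

Block Moves:
MVN                         x           block move next
MVP                         x           block move previous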
The load and store instructions in Listing 6.1 will as easily move a double byte as they did a
byte, if the register you use is in sixteen-bit mode, as in Listing 6.2.
0000                     KEEP   KL.6.1
0000                     65816  ON
0000           MAIN      START
0000 18                  CLC
0001 FB                  XCE
0002           ;
0002 E220                SEP    #%00100000
0004 AD0D00              LDA    SOURCE
0007 8D0E00              STA    DEST
000A           ;
000A 38                  SEC
000B FB                  XCE
000C           ;
000C 60                  RTS
000D 77        SOURCE    DC     H'77'
000E 00        DEST      DS     1
000F                     END
Listing 6.1.
Note that the source data in the define constant statement is now two bytes long, as is the storage reserved
by the define storage statement that follows. If you look at the interlisted hexadecimal code generated by the
assembler, you will see that the address of the label DEST is now $000F. The assembler has automatically
adjusted for the increase in the size of the data at SOURCE, which is the great advantage of using symbolic
labels rather than fixed addresses in writing assembler programs.
The load and store instructions are paired here to demonstrate that, when using identical addressing
modes, the load and store operations are symmetrical. In many cases, though, a value loaded into a register will be
stored many instructions later, or never at all, or stored using an addressing mode different from that of the load
instruction.
0000                     KEEP   KL.6.2
0000                     65816  ON
0000           MAIN      START
0000 18                  CLC
0001 FB                  XCE
0002           ;
0002 C220                REP    #%00100000
0004 AD0D00              LDA    SOURCE
0007 8D0F00              STA    DEST
000A           ;
000A 38                  SEC
000B FB                  XCE
000C           ;
000C 60                  RTS
000D           ;
000D 7F7F      SOURCE    DC     A'$7F7F'
000F 0000      DEST      DS     2
0011                     END
Listing 6.2.
Push
Push instructions store data, generally located in a register, onto the stack. Regardless of a register's
size, the instruction that pushes it takes only a single byte.
When a byte is pushed onto the stack, it is stored to the location pointed to by the stack pointer, after
which the stack pointer is automatically decremented to point to the next available location.
When double-byte data or a sixteen-bit address is pushed onto the stack, first its high-order byte is
stored to the location pointed to by the stack pointer, the stack pointer is decremented, the low byte is stored to
the new location pointed to by the stack pointer, and finally the stack pointer is decremented once again,
pointing past both bytes of pushed data. The sixteen-bit value ends up on the stack in the usual 65x memory
order: low byte in the lower address, high byte in the higher address.
In both cases, the stack grows downward, and the stack pointer points to the next available (unused)
location at the end of the operation.
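The byte order described above can be sketched as follows. This is an illustration only; it assumes native mode and uses the LONGA directive, as in the book's listings, to tell the assembler the size of immediate operands.

```asm
         REP   #%00100000    ; sixteen-bit accumulator
         LONGA ON            ; the immediate operand below is two bytes long
         LDA   #$1234
         PHA                 ; $12 is stored at S, then $34 at S-1; S ends up two lower
         SEP   #%00100000    ; eight-bit accumulator
         LONGA OFF
         PLA                 ; pulls $34, the low byte, from the lower address
         PLA                 ; pulls $12, the high byte; S is back where it started
```

Pulling the value one byte at a time makes the memory order visible: the low byte comes off the stack first because it was pushed last.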
[Figure: memory map from $0000 to $FFFF. The 65816/65802 native mode stack pointer has a 16-bit range, $0000-$FFFF; the 6502/65C02 and 65816/65802 emulation mode stack pointer has an 8-bit range, $0100-$01FF.]
Pull
Pull instructions reverse the effects of the push instructions, but there are fewer pull instructions, all of them
single-byte instructions that pull a value off the stack into a register. Unlike the Motorola and Intel processors (68xx and
808x), the 65x pull instructions set the n and z flags. So programmers used to using pull instructions between a test and a
branch on the other processors should exercise caution with the 65x pull instructions.
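One way to respect this caution is sketched below: branch on the result of the test first, then pull on both paths. The names FLAG1 and ISDFLT are hypothetical, an eight-bit accumulator is assumed, and the compare and branch instructions used here are covered in later chapters.

```asm
         PHA                 ; save the accumulator
         LDA   FLAG1         ; FLAG1 is a hypothetical one-byte variable
         CMP   #$FE          ; z = 1 if FLAG1 holds $FE
         BEQ   ISDFLT        ; branch while the comparison flags are still intact;
;                              a PLA placed before this BEQ would replace n and z
         PLA                 ; not equal: restore A on this path
         RTS
ISDFLT   PLA                 ; equal: restore A on this path too
         RTS
```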
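[Figure: pushing a register onto the stack. For an eight-bit register, the low byte (8 Bit Data) is stored in stack memory above the Next Stack Location. For a sixteen-bit register, Data High and Data Low are stored in stack memory above the Next Stack Location.]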
               OTHBNK    GEQU   $FFF3
                         .
                         .
                         .
     E220                SEP    #%00100000
     ADF3FF              LDA    OTHBNK
     8B                  PHB
     48                  PHA
     AB                  PLB
     68                  PLA
     8DF3FF              STA    OTHBNK
                         .
                         .
                         .
Fragment 6.1.
Similar to PHB, the PHK instruction pushes the value in the eight-bit program counter bank register
onto the stack. Again, the instruction can be used to let you locate the current bank; this is useful in writing
bank-independent code, which can be executed out of any arbitrarily assigned bank.
You're less likely to use PHK to preserve the current bank prior to changing banks (as in the case of
PHB above) because the jump to subroutine long instruction automatically pushes the program counter bank
as it changes it, and because there is no complementary pull instruction. The only way to change the value in
the program counter bank register is to execute a long jump instruction, an interrupt, or a return from
subroutine or interrupt. However, you can use PHK to synthesize more complex call and return sequences, or
to set the data bank equal to the program bank.
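Setting the data bank equal to the program bank takes just two instructions, sketched here. The sequence is mainly of interest on the 65816, where the bank registers are connected to the address bus.

```asm
         PHK                 ; push the program counter bank register (one byte)
         PLB                 ; pull that byte into the data bank register; n and z reflect it
```

After this pair, absolute addresses in data references fall in the same bank the code is executing from.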
Finally, the PHD instruction pushes the sixteen-bit direct page register onto the stack, and PLD pulls a
sixteen-bit value from the stack into the direct page register. PHD is useful primarily for preserving the direct
page location before changing it, while PLD is an easy way to change or restore it. Note that PLB and PLD
also affect the n and z flags.
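A minimal sketch of preserving and restoring the direct page register follows. The new direct page location, $0300, is a hypothetical value chosen for illustration; native mode is assumed, and the TCD instruction used to load the register is described later in this chapter.

```asm
         PHD                 ; save the caller's direct page register (two bytes)
         REP   #%00100000    ; sixteen-bit accumulator
         LONGA ON
         LDA   #$0300        ; hypothetical new direct page location
         TCD                 ; D = $0300
         LDA   $10           ; direct page offset $10 now refers to $00:0310
         PLD                 ; restore the caller's direct page; n and z reflect the value pulled
```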
Like the load instructions, all of these transfer operations except TXS set both the n and z flags. (TXS
does not affect the flags because setting the stack is considered an operation in which the data transferred is
fully known and will not be further manipulated.)
The availability of these instructions on the 65802/65816, with its dual-word-size architecture, naturally
leads to some questions when you consider transfer of data between registers of different sizes. For example,
you may have set the accumulator word size to sixteen bits, and the index register size to eight. What happens
when you execute a TAY (transfer A to Y) instruction?
The first rule to remember is that the nature of the transfer is determined by the destination register. In
this case, only the low-order eight bits of the accumulator will be transferred to the eight-bit Y register. A
second rule also applies here: when the index registers are eight bits (because the index register select flag is
set), the high byte of each index register is always forced to zero upon return to sixteen-bit size, and the
low-order byte of each sixteen-bit index register contains its previous eight-bit value.
Listing 6.3 illustrates these rules with TAY. In this example, the value stored at the location DATA2 is
$0033; only the low order byte has been transferred from the accumulator, while the high byte has been zeroed.
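The second rule can also be seen without any transfer at all, as in this sketch (native mode assumed; LONGI tells the assembler the size of immediate operands for the index registers):

```asm
         REP   #%00010000    ; sixteen-bit index registers
         LONGI ON
         LDX   #$1234
         SEP   #%00010000    ; eight-bit index registers: X is now $34
         LONGI OFF
         REP   #%00010000    ; back to sixteen bits: X is $0034, the high byte was forced to zero
         LONGI ON
```

The original high byte, $12, cannot be recovered once the index registers have been switched to eight bits.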
The accumulator, on the other hand, operates differently. When the accumulator word size is switched
from sixteen bits to eight, the high-order byte is preserved in a hidden accumulator, B. It can even be
0000                     KEEP   KL.6.3
0000                     65816  ON
0000           MAIN      START
0000           ; switch-to-native-mode code
0000 18                  CLC                  clear carry flag
0001 FB                  XCE                  exchange carry with e bit (clear e bit)
0002 C220                REP    #$20          set accum to 16
0004 E210                SEP    #$10          set index to 8
0006 AD1200              LDA    DATA
0009 A8                  TAY
000A C210                REP    #$10          set index to 16
000C 8C1400              STY    DATA2
000F           ;
000F 38                  SEC
0010 FB                  XCE
0011 60                  RTS
0012 33FF      DATA      DC     A'$FF33'
0014 0000      DATA2     DS     2
0016                     END
Listing 6.3.
There are also rules for transfers from an eight-bit to a sixteen-bit register. Transfers out of the eight-bit
accumulator into a sixteen-bit index register transfer both eight-bit accumulators.
In Listing 6.6, the value saved to RESULT is $7FFF, showing that not only is the eight-bit A
accumulator transferred to become the low byte of the sixteen-bit index register, but the hidden B accumulator
is transferred to become the high byte of the index register. This means you can form a sixteen-bit index in the
eight-bit accumulator one byte at a time, then transfer the whole thing to the index register without having to
switch the accumulator to sixteen bits first. However, take care
not to inadvertently transfer an unknown hidden value when doing transfers from the eight-bit accumulator to a
sixteen-bit index register.
0000                     KEEP   KL.6.4
0000                     65816  ON
0000           MAIN      START
0000           ; switch-to-native-mode code
0000 18                  CLC                  clear carry flag
0001 FB                  XCE                  exchange carry with e bit (clear e bit)
0002 C230                REP    #$30
0004 AD1400              LDA    DATA16
0007 E220                SEP    #$20
0009 AD1600              LDA    DATA8
000C C220                REP    #$20
000E 8D1700              STA    RESULT
0011           ;
0011 38                  SEC
0012 FB                  XCE
0013 60                  RTS
0014 FF7F      DATA16    DC     A'$7FFF'
0016 33        DATA8     DC     H'33'
0017 0000      RESULT    DS     2
0019                     END
Listing 6.4
Transfers from an eight-bit index register to the sixteen-bit accumulator result in the index register
being transferred into the accumulator's low byte while the accumulator's high byte is zeroed. This is
consistent with the zeroing of the high byte when eight-bit index registers are switched to sixteen bits.
In Listing 6.7, the result is $0033, demonstrating that when an eight-bit index register is transferred to
the sixteen-bit accumulator, a zero is concatenated as the high byte of the new accumulator value.
0000                     KEEP   KL.6.5
0000                     65816  ON
0000           MAIN      START
0000 18                  CLC
0001 FB                  XCE
0002 C230                REP    #$30
0004 AC1500              LDY    DATA16
0007 AD1700              LDA    DATA2
000A E220                SEP    #$20
000C 98                  TYA
000D C220                REP    #$20
000F 8D1900              STA    RESULT
0012           ;
0012 38                  SEC
0013 FB                  XCE
0014 60                  RTS
0015 FF7F      DATA16    DC     A'$7FFF'
0017 4433      DATA2     DC     A'$3344'
0019 0000      RESULT    DS     2
001B                     END
Listing 6.5.
In the 65816, transfers between index registers and the stack also depend on the setting of the destination register.
For example, transferring the sixteen-bit stack to an eight-bit register, as in Fragment 6.2, results in the transfer of just the
low byte. Obviously, though, you'll find few reasons to transfer only the low byte of the sixteen-bit stack pointer. As
always, you need to be watchful of the current modes in force in each of your routines.
The 65816 also adds new transfer operations to accommodate direct transfer of data to and from the new 65816
environment-setting registers (the direct page register and the sixteen-bit stack register), and also to complete the set of
possible register transfer instructions for the basic 65x user register set:
0000                     KEEP   KL.6.6
0000                     65816  ON
0000           MAIN      START
0000 18                  CLC
0001 FB                  XCE
0002 C230                REP    #$30
0004 AD1300              LDA    DATA16
0007 AC1500              LDY    DATA2
000A E220                SEP    #$20
000C A8                  TAY
000D 8C1700              STY    RESULT
0010           ;
0010 38                  SEC
0011 FB                  XCE
0012 60                  RTS
0013 FF7F      DATA16    DC     A'$7FFF'
0015 4433      DATA2     DC     A'$3344'
0017 0000      RESULT    DS     2
0019                     END
Listing 6.6
0000                     KEEP   KL.6.7
0000                     65816  ON
0000           MAIN      START
0000           ; switch-to-native-mode code
0000 18                  CLC
0001 FB                  XCE
0002 E210                SEP    #$10
0004 C220                REP    #$20
0006 AD1300              LDA    DATA16
0009 AC1500              LDY    DATA8
000C 98                  TYA
000D 8D1600              STA    RESULT
0010           ;
0010 38                  SEC
0011 FB                  XCE
0012 60                  RTS
0013 FF7F      DATA16    DC     A'$7FFF'
0015 33        DATA8     DC     H'33'
0016 0000      RESULT    DS     2
0018                     END
Listing 6.7
0000 E210                SEP    #%00010000
0002 BA                  TSX
Fragment 6.2.
TCD   transfers the contents of the sixteen-bit accumulator C to the D direct page register. The use of
      the letter C in this instruction's mnemonic to refer to the accumulator indicates that this
      operation is always a sixteen-bit transfer, regardless of the setting of the memory select flag.
      For such a transfer to be meaningful, of course, the high-order byte of the accumulator must
      contain a valid value.
TDC   transfers the contents of the D direct page register to the sixteen-bit accumulator. Again, the use
      of the letter C in the mnemonic to name the accumulator indicates that the sixteen-bit
      accumulator is always used, regardless of the setting of the memory select flag. Thus, sixteen
      bits are always transferred, even if the accumulator size is eight bits, in which case the high
      byte is stored to the hidden B accumulator.
TCS   transfers the contents of the sixteen-bit C accumulator to the S stack pointer register, thereby
      relocating the stack. Since sixteen bits will be transferred regardless of the accumulator word
      size, the high byte of the accumulator must contain valid data.
TSC   transfers the contents of the sixteen-bit S stack pointer register to the sixteen-bit accumulator,
      C, regardless of the accumulator word size.
TXY   transfers the contents of the X index register to the Y index register. Since X and Y will always
      have the same register size, there is no ambiguity.
TYX   transfers the contents of the Y index register to the X index register. Both will always be the
      same size.
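As a hedged sketch of how these transfers combine, the sequence below points the direct page at the current stack pointer so that data on the stack can be reached with direct page offsets. It assumes native mode, and it overwrites the whole sixteen-bit accumulator, including the hidden B byte when the accumulator is eight bits.

```asm
         PHD                 ; save the current direct page register
         TSC                 ; C = S: all sixteen bits, whatever the m flag says
         TCD                 ; D = C: direct page offsets are now relative to the stack pointer
         LDA   $03           ; reads location S+3, the byte that was on top of the stack before PHD
         PLD                 ; restore the previous direct page register
         TXY                 ; separately: copy X to Y directly, without going through the accumulator
```

Offsets $01 and $02 hold the two bytes pushed by PHD, which is why the first item of older stack data appears at offset $03.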
Exchanges
The 65802 and 65816 also implement two exchange instructions, neither available on the 6502 or
65C02. An exchange differs from a transfer in that the two values are swapped, rather than one value being copied
to a new location.
The first of the two exchange instructions, XBA, swaps the high and low bytes of the sixteen-bit
accumulator (the C accumulator).
The terminology used to describe the various components of the eight-or-sixteen bit accumulator is: to
use A to name the accumulator as a register that may be optionally eight or sixteen bits wide (depending on the
m memory/accumulator select flag); to use C when the accumulator is considered to be sixteen bits regardless
of the setting of the m flag; and, when A is used in eight-bit mode to describe the low byte only, to use B to
describe the hidden high byte of the sixteen-bit accumulator. In the latter case, when the accumulator size is set
to eight bits, only the XBA instruction can directly access the high byte of the sixteen-bit double accumulator,
B. This replacement of A for B and B for A can be used to simulate two eight-bit accumulators, each of which,
by swapping, shares the actual A accumulator. It can also be used in the sixteen-bit mode for inverting a
double-byte value. The XBA instruction is exceptional in that the n flag is always set on the basis of bit seven
of the resulting accumulator A, even if the accumulator is sixteen bits.
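The use of XBA to reach the hidden B accumulator can be sketched as follows. This is an illustration under the assumption of native mode; the values $12 and $34 are arbitrary.

```asm
         SEP   #%00100000    ; eight-bit accumulator
         LONGA OFF
         LDA   #$12
         XBA                 ; B = $12; A receives whatever B held before
         LDA   #$34          ; A = $34, so C (B and A together) is $1234
         REP   #%00100000    ; sixteen-bit accumulator: the value $1234 is intact
         LONGA ON
         XBA                 ; swap the bytes: C = $3412; n reflects bit seven of the new low byte, $12
```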
The second exchange instruction, XCE, is the 65816's only method for toggling between 6502
emulation mode and 65816 native mode. Rather than exchange register values, it exchanges two bits - the carry
flag, which is bit zero of the status register, and the e bit, which should be considered a kind of appendage to the
status register and which determines the use of several of the other flags.
Fragment 6.3 sets the processor to 6502 emulation mode. Conversely, native mode can be set by
replacing the SEC with a CLC clear carry instruction.
0010 38                  SEC
0011 FB                  XCE
Fragment 6.3
Because the exchange stores the previous emulation flag setting into the carry, it can be saved and
restored later. It can also be evaluated with the branch-on-condition instructions to be discussed in Chapter 8
(Flow of Control) to determine which mode the processor was just in. A device driver routine that needs to set
the emulation bit, for example, can save its previous value for restoration before returning.
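A minimal sketch of saving and restoring the previous mode is shown below, here for a routine that needs native mode. It assumes the code between the push and the pull leaves the stack balanced.

```asm
         CLC
         XCE                 ; enter native mode; carry now holds the previous e bit
         PHP                 ; save the status register, including that carry
;        (native mode work goes here, leaving the stack as it found it)
         PLP                 ; restore the status register: carry = previous e bit
         XCE                 ; exchange again: the processor returns to the mode it was in on entry
```

If the routine was entered in emulation mode, the final XCE sets e again; if it was entered in native mode, the carry pulled is zero and the processor simply stays in native mode.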
The selection of the carry flag for the e bit exchange instruction is in no way connected to the normal
use of the carry flag in arithmetic operations. It was selected because it is easy to set and reset, it is less
frequently used than the sign and zero flags, and there are branch-on-condition instructions which test it. The
primary use of the SEC and CLC instructions for arithmetic will be covered in upcoming chapters.
Block Moves
The two block move instructions, available only on the 65802 and the 65816, let entire blocks (or
strings) of memory be moved at once.
Before using either instruction, all three user registers (C, X, and Y) must be set up with values which
serve as parameters.
The C accumulator holds the count of the number of bytes to be moved, minus one. It may take some
getting used to, but this count is numbered from zero rather than one. The C accumulator is always sixteen
bits: if the m mode flag is set to eight bits, the count is still the sixteen-bit value in C, the concatenation of B
and A.
X and Y specify either the top or the bottom addresses of the two blocks, depending on which of the
two versions of the instruction you choose. In Listing 6.8, $2000 bytes of data are moved from location $2000
to $4000.
0000                     KEEP   KL.6.8
0000                     65816  ON
0000           MAIN      START
0000 18                  CLC
0001 FB                  XCE
0002 C230                REP    #$30
0004                     LONGA  ON
0004                     LONGI  ON
0004 AD1300              LDA    COUNT
0007 AE1500              LDX    SOURCE
000A AC1700              LDY    DEST
000D 540000              MVN    0,0
0010 38                  SEC
0011 FB                  XCE
0012 60                  RTS
0013 FF1F      COUNT     DC     A'$1FFF'
0015 0020      SOURCE    DC     A'$2000'
0017 0040      DEST      DC     A'$4000'
0019                     END
Listing 6.8.
The MVN instruction uses X and Y to specify the bottom (or beginning) addresses of the two blocks of
memory. The first byte is moved from the address in X to the address in Y; then X and Y are incremented, C is
decremented, and the next byte is moved, and so on, until the number of bytes specified by the value in C is
moved (that is, until C reaches $FFFF). If C is zero, a single first byte is moved, X and Y are each incremented
once, and C is decremented to $FFFF.
The MVP instruction assumes X and Y specify the top (or ending) addresses of the two blocks of
memory. The first byte is moved from the address in X to the address in Y; then X, Y, and C are decremented,
the next byte is moved, and so on, until the number of bytes specified by the value in C is moved (until C
reaches $FFFF).
The need for two distinct block move instructions becomes apparent when the problem of memory
overlap is considered. Typically, when a block of memory starting at location X is to be moved to location Y,
the intention is to replace the memory locations from Y to Y + C with the identical contents of the range X
through X + C. However, if these two ranges overlap, it is possible that as the processor blindly transfers
memory one byte at a time, it may overwrite a value in the source range before that value has been transferred.
The rule of thumb is, when the destination range is a lower memory address than the source range, the
MVN instruction should be used (thus Move Next) to avoid overwriting source bytes before they have been
copied to the destination. When the destination range is a higher memory location than the source range, the
MVP instruction should be used (Move Previous).
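The overlap problem is easy to see in a small model. The sketch below is written in Python rather than 65816 assembly so that it can be run anywhere, and it makes simplifying assumptions: both blocks lie in one 64K bank (the bank operands are ignored), memory is a plain bytearray, and the names mvn, mvp, and demo are invented for this illustration.

```python
def mvn(mem, c, x, y):
    """Model MVN within one bank: copy upward from the bottom addresses."""
    while True:
        mem[y] = mem[x]
        x = (x + 1) & 0xFFFF
        y = (y + 1) & 0xFFFF
        c = (c - 1) & 0xFFFF
        if c == 0xFFFF:            # count exhausted
            return c, x, y


def mvp(mem, c, x, y):
    """Model MVP within one bank: copy downward from the top addresses."""
    while True:
        mem[y] = mem[x]
        x = (x - 1) & 0xFFFF
        y = (y - 1) & 0xFFFF
        c = (c - 1) & 0xFFFF
        if c == 0xFFFF:            # count exhausted
            return c, x, y


def demo():
    """Move four bytes from $10 up to $12; the two ranges overlap."""
    good = bytearray(0x20)
    good[0x10:0x14] = b"ABCD"
    mvp(good, 3, 0x13, 0x15)       # top addresses, count minus one
    bad = bytearray(0x20)
    bad[0x10:0x14] = b"ABCD"
    mvn(bad, 3, 0x10, 0x12)        # bottom addresses: the wrong choice here
    return bytes(good[0x12:0x16]), bytes(bad[0x12:0x16])
```

With the destination higher than the source, the MVP model delivers ABCD intact, while the MVN model overwrites the last two source bytes before they are copied and delivers ABAB.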
The bank byte of the label SOURCE is 02 while the bank byte of the label DEST is 01. As always, the
assembler does the work of converting the more human-friendly assembly code to the correct object code
format for the processor.
If the source and destination banks are not specified, some assemblers will provide a user-specified
default bank value.
The assembler will translate the opcode to object code, then supply its bank value for both of the
operand bytes:
440000    MVP
If either bank is different from the default value, both must be specified.
7) Chapter Seven
Simple Addressing Modes
The term addressing mode refers to the method by which the processor determines where it is to get
the data needed to perform a given operation. The data used by a 65x processor may come either from memory
or from one or another of the processor's registers. Data for certain operations may optionally come from
either location, some from only one or the other. For those operations which take one of their operands from
memory, there may be several ways of specifying a given memory location. The method best suited in a
particular instance is a function of the overall implementation of a chosen problem-solving algorithm. Indeed,
there are so many addressing modes available on the 65x processors that there is not necessarily a single
correct addressing mode in each situation.
This chapter deals with those addressing modes which may be described as the simple addressing
modes. You have already seen some of these used in the examples of the previous chapter; the simple
addressing modes are listed in Table 7.1. Each of these addressing modes is straightforward. Those addressing
modes that require more than a simple combination of values from several memory locations or registers are
described as complex modes in Chapter 11.
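Addressing mode                 Example syntax

Available on all 65x processors:
immediate                       LDA   #$12
absolute                        LDA   $1234
direct page (zero page)         LDA   $12
accumulator                     ASL   A
implied                         TAY
stack                           PHA

Available on the 65C02, 65802, and 65816 only:
direct page indirect            LDA   ($12)

Available on the 65802 and 65816 only:
absolute long                   LDA   $123456
direct page indirect long       LDA   [$12]
block move                      MVN   SOURCE,DEST

Table 7.1.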
In addition to solving a given problem, the processor must spend a great deal of its time simply
calculating effective addresses. The simple addressing modes require little or no effective address computation,
and therefore tend to be the fastest executing. However, the problem-solving and memory efficiencies of the
complex addressing modes, which will be described in subsequent chapters, can make up for their effective
address calculation overhead. In each case, the nature of the problem at hand determines the best addressing
mode to use.
Immediate Addressing
Immediate data is data found embedded in the instruction stream of a program itself, immediately
following the opcode which uses the data. Because it is part of the program itself, it is always a constant value,
known at assembly time and specified when you create the program. Typically, small amounts of constant data
are handled most efficiently by using the immediate addressing mode to load either the accumulator or an index
register with a specific value. Note that the immediate addressing mode is not available with any of the store
instructions (STA, STX, or STY), since it makes no sense to store a value to the operand location within the
code stream.
To specify the immediate addressing mode to a 65x assembler, prefix the operand with a # (pound or
sharp) sign. The constant operand may be either data or an address.
For example,
A912      LDA   #$12
The 6502 and 65C02, their registers limited to only eight bits, permit only an eight-bit operand to
follow the load register immediate opcodes. When the constant in an assembly source line is a sixteen-bit
value, greater-than and less-than signs are used to specify whether the high- or low-order byte of the
double-byte value is to be used. A less-than indicates that the low byte is to be used, and thus:
A234      LDX   #<$1234
causes the assembler to generate the LDX opcode followed by a one-byte operand, the low byte of the source
operand, which is $34. It's equivalent to:
A234      LDX   #$34
The use of a greater-than sign would cause the value $12 to be loaded. If neither the less-than nor
greater-than operator is specified, most assemblers will default to the low byte when confronted with a
double-byte value.
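The byte selection performed by the less-than and greater-than operators can be sketched as follows. This is a model in Python, not assembler source; the function names are invented for the illustration, and $A2 is the LDX immediate opcode seen in the object code above.

```python
def low_byte(value):
    """The '<' operator: low-order byte of a sixteen-bit value."""
    return value & 0xFF


def high_byte(value):
    """The '>' operator: high-order byte of a sixteen-bit value."""
    return (value >> 8) & 0xFF


def ldx_immediate_8(operand_byte):
    """Object code for an eight-bit LDX immediate: opcode $A2, then one byte."""
    return bytes([0xA2, operand_byte])
```

For example, ldx_immediate_8(low_byte(0x1234)) yields the two bytes A2 34, and high_byte(0x1234) is $12.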
When assembling 65816 source code, the problem becomes trickier. The 6502 and 65C02 neither have
nor need an instruction to set up the eight-bit mode because they are always in it. But the 65816's accumulator
may be toggled to deal with eight- or sixteen-bit quantities, as can its index registers, by setting or resetting the
m (memory/accumulator select) or x (index select) flag bits of the status register. Setting the m bit puts the
accumulator in eight-bit mode; resetting it puts it in sixteen-bit mode. Setting the x bit puts the index registers
in eight-bit mode; resetting it puts them in sixteen-bit mode.
The m and x flags may be set and reset many times throughout a 65816 program. But while assembly
code is assembled from beginning to end, it rarely executes in that fashion. More commonly, it follows a
circuitous route of execution filled with branches, jumps, and subroutine calls. Except for right after the m or x
flag has been explicitly set or reset, the assembler has no way of knowing the correct value of either: your
program may branch somewhere, and re-enter with either flag having either value, quite possibly an incorrect
one.
While the programmer must always be aware of the proper values of these two flags, for most
instructions the assembler doesn't need to know their status in order to generate code. Most instructions
generated are the same in both eight- or sixteen-bit mode. Assembling a load accumulator absolute instruction,
for example, puts the same opcode value and the same absolute address into the code stream regardless of
accumulator size; it is at execution time that the m bit setting makes a difference between whether the
accumulator is loaded with one or two bytes from the absolute address.
But a load register immediate instruction is followed by the constant to be loaded. As Figure 7.1 shows,
if the register is set to eight-bit mode at the point the instruction is encountered, the 65816 expects a one-byte
constant to follow before it fetches the next opcode. On the other hand, if the register is set to sixteen-bit mode
at the point the instruction is encountered, the 65816 expects a double-byte constant to follow before it fetches
the next opcode. The assembler must put either a one-byte or two-byte constant operand into the code
following the load register immediate opcode based on the status of a flag which it doesn't know.
[Figure 7.1. Data = Operand]
Two assembler directives have been designed to tell the assembler which way to go: LONGA and
LONGI, each followed with the value ON or OFF. LONGA ON indicates the accumulator is in sixteen-bit
mode, LONGA OFF in eight-bit mode. LONGI ON tells the assembler that the index registers are in
sixteen-bit mode, LONGI OFF that they are in eight-bit mode. Load register immediate instructions are assembled on
the basis of the last LONGA or LONGI directive the assembler has seen, that is, the one most immediately
preceding it in the source file. For example,
LONGA ON
LONGI ON
tells the assembler that both accumulator and index registers are set to sixteen bits. Now, if it next encounters
the following two instructions
A93412    LDA   #$1234
A05600    LDY   #$56
then the first puts a LDA immediate opcode followed by the constant $1234 into the code, and the second a
LDY immediate opcode followed by the constant $0056, again two bytes of operand, the high byte padded with
zero.
On the other hand,
LONGA OFF
LONGI OFF
tells the assembler that both accumulator and index registers are set to eight bits. Now, in
A934      LDA   #$1234
A056      LDY   #$56
the first instruction puts an LDA immediate opcode followed by the constant $34 into the code, and the second
a LDY immediate opcode followed by the constant $56, each with one byte of operand.
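The choice the assembler makes for a load register immediate instruction can be sketched in a few lines. This is a Python model under stated assumptions, not any real assembler's code: the function name assemble_immediate is invented, and its long_register argument stands in for whichever of LONGA or LONGI applies to the register being loaded.

```python
LDA_IMMEDIATE = 0xA9
LDY_IMMEDIATE = 0xA0


def assemble_immediate(opcode, value, long_register):
    """Emit the opcode plus a one- or two-byte operand, low byte first.

    long_register models the LONGA or LONGI setting currently in force.
    """
    if long_register:
        return bytes([opcode, value & 0xFF, (value >> 8) & 0xFF])
    return bytes([opcode, value & 0xFF])
```

With long_register true, the constants $1234 and $56 assemble to A9 34 12 and A0 56 00; with it false, they assemble to A9 34 and A0 56, matching the object code shown above.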
Like the flags themselves, of course, one directive may be ON and the other OFF at any time. They
also do not need to both be specified at the same time.
The setting of the LONGA and LONGI directives to either ON or OFF simply represents a promise by
you, the programmer, that the flags will, in fact, have these values at execution time. The directives do nothing
by themselves to change the settings of the actual m and x flags; this is typically done by using the SEP and
REP instructions, explained earlier. (Note, incidentally, that these two instructions use a special form of the
immediate addressing mode, where the operand is always eight bits.) Nor does setting the flags change the
Absolute Addressing
There are two categories of simple addressing modes available for accessing data in a known memory location:
absolute and direct page. The first of these, absolute addressing, is used to load or store a byte to or from a fixed memory
location (within the current 64K data bank on the 65816, which defaults to bank zero on power up). You specify the
sixteen-bit memory location in the operand field (following the opcode) in your assembly language source line, as the
accompanying figure shows. Fragment 7.1 loads the eight-bit constant $34 into the accumulator, then stores it to memory
location $B100 in the current data bank.
0000 E220              SEP   #%00100000
0002                   LONGA OFF
0002 A934              LDA   #$34
0004 8D00B1            STA   $B100
Fragment 7.1.
The same memory move could be done with either of the index registers, as shown in Fragment 7.2
using the X register. Symbolic labels in the operand fields provide better self-documentation and easier
program modification.
0000           NUM1    GEQU  $34
0000           DATA    GEQU  $B100

0000 E210              SEP   #%00010000
0002                   LONGI OFF
0002 A234              LDX   #NUM1
0004 8E00B1            STX   DATA
Fragment 7.2
As you have seen, the 65816's accumulator may be toggled to deal with either eight- or sixteen-bit
quantities, as can its index registers, by setting or resetting the m or x flag bits of the status register. Naturally,
you don't need to execute a SEP or REP instruction nor a LONGA or LONGI assembler directive before
every routine, provided you know the register you intend to use is already set correctly, and the assembler
correctly knows the setting. But you must always exercise extreme care when developing 65816 programs
to avoid making invalid assumptions about the modes currently in force or taking unintentional branches from
code in one mode to code in another.
[Figure: absolute addressing. Effective Address: Bank, High, Low. Instruction: Opcode, Operand Low, Operand High.]
As Fragment 7.3 shows, the load and store instructions above will as easily move sixteen bits of data as they did
eight bits; all that's needed is to be sure the register used is in sixteen-bit mode, and that the assembler has
been alerted to the setting.
0000           DATA    GEQU  $B100

0000 C210              REP   #%00010000
0002                   LONGI ON
0002 A23412            LDX   #$1234
0005 8E00B1            STX   DATA
Fragment 7.3.
As indicated, absolute addresses are sixteen-bit addresses. On the 6502, 65C02, and 65802, with
memory space limited to 64K, sixteen bits can specify any fixed location within the entire address space of the
processor. Therefore, the term absolute addressing was appropriate.
The 65816, on the other hand, with its segmentation into 256 possible 64K banks, requires a 24-bit
address to specify any fixed location within its address space. However, the same opcodes that generate
sixteen-bit absolute addresses on the 6502 and 65C02 generate 24-bit
addresses on the 65816 by concatenating the value of the data bank register with the sixteen-bit value in the
operand field of the instruction. (Instructions that transfer control, to be discussed in Chapter 8, substitute the
program bank register value for the data bank register value.)
Absolute addressing on the 65816 is therefore actually an offset from the base of the current bank;
nevertheless, the use of the term absolute addressing has survived on the 65816 to refer to sixteen-bit fixed
addresses within the current 64K data bank.
So long as the programmer needs to access only the contents of the current data bank, (sixteen-bit)
absolute addressing is the best way to access data at any known location in that bank.
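The concatenation of data bank register and operand can be written out as a short model. It is a sketch in Python with invented function names; the only facts it relies on are the ones stated above, that the operand is stored low byte first and that the data bank register supplies bits 16 through 23 for data accesses.

```python
def operand_word(object_code):
    """Sixteen-bit operand of a three-byte instruction, stored low byte first."""
    return object_code[1] | (object_code[2] << 8)


def absolute_address(data_bank, operand):
    """24-bit effective address for absolute addressing of data."""
    return ((data_bank & 0xFF) << 16) | (operand & 0xFFFF)
```

The object code 8D 00 B1 from Fragment 7.1 carries the operand $B100. With the data bank register at zero it refers to $00B100; with the data bank register at $02 the same three bytes refer to $02B100.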
[Figure: zero page addressing. Effective Address: high byte 0 0 0 0 0 0 0 0, Low. Instruction: Opcode, Operand.]
Since all of the addresses in the zero page are less than $0100 (such as $003F, for example) it follows
that, if the computer knew enough to assume two leading hexadecimal zeroes, a zero page address could be
represented in only one byte, saving both space and time. But if absolute addressing is used, the processor has
to assume that two bytes follow an instruction to represent the operand, regardless of whether the high-order
byte is zero or not.
This concept of expressing a zero page address with a single-byte operand was implemented on the
6502 and 65C02 by reserving separate opcodes for the various instructions using zero page addressing. Since
an instruction's opcode for using zero page addressing is unique (as opcodes are for all of the different modes of
a given instruction), the processor will fetch only one operand byte from the code stream, using it in effect as a
displacement from a known base ($0000, in the case of the 6502 and 65C02). Since only one byte need be
fetched from the instruction stream to determine the effective address, the execution time is faster by one cycle.
The result is a form of addressing that is shorter, both in memory use and execution time, than regular
sixteen-bit absolute addressing.
Clearly, locating your most often accessed variables in zero page memory results in considerably
shorter code and faster execution time.
The limitation of having this special area of memory available to the zero page addressing mode
instructions is that there are only 256 bytes of memory available for use in connection with it. That is, there are
only 256 zero page addresses. Resident system programs, such as operating systems and language interpreters,
typically grab large chunks of page zero for their own variable space; applications programmers must carefully
step around the operating system's variables, limiting assignment of their own program's zero page variables to
some fraction of the zero page.
This problem is overcome on the 65816 by letting its direct page be set up anywhere within the first
64K of system memory (bank zero), under program control. No longer limited to page zero, it is referred to as
direct page addressing. The result is, potentially, multiple areas of 256 ($100) bytes each, which can be
accessed one byte and one cycle cheaper than absolute memory. Setting the direct page anywhere is made
possible by the 65816's direct page register, which serves as the base pointer for the direct page area of
A9F0      LDA   #$F0
8512      STA   $12
This stores the one-byte value $F0 at address $0012. Note that the object code generated for the store requires
only one byte for the opcode and one for operand.
A9F0      LDA   #$F0
8D00B1    STA   $B100
This stores the same one-byte value at the address $B100. In this case, the store requires one byte for the
opcode and two bytes for the operand.
Notice how the assembler automatically assumes that if the value of the operand can be expressed in
eight bits - if it is a value less than $100, whether coded as $34 or $000034 - the address is a direct page
address. It therefore generates the opcode for the direct page addressing form of the instruction, and puts only a
one-byte operand into the object code; in the first example above, the direct page address to store to is $12.
One result of the assembler's assumption that
values less than $100 are direct page offsets is that physical addresses in the range $xx:0000 to $xx:00FF
cannot be referenced normally when either the bank (the xx) register is other than zero or the direct page
register is set to other than $0000. For example, assembler syntax like:
A4F0      LDY   $F0
or
A4F0      LDY   $00F0
is direct page syntax. It will not access absolute address $00F0 if the direct page register holds a value other
than zero; nor will it access $00F0 in another bank, even if the data bank register is set to the other bank. Both
are evaluated to the same $F0 offset in the direct page. Instead, to access physical address $xx00F0, you must
force absolute addressing by using the vertical bar or exclamation point in your assembler source line:
ACF000    LDY   !$F0
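The assembler's decision between the two forms of LDY can be sketched as a small model. It is written in Python and is not taken from any real assembler; the function names are invented, operands are assumed to be hexadecimal values no larger than $FFFF, and $A4 and $AC are the direct page and absolute LDY opcodes seen in the object code above.

```python
LDY_DIRECT_PAGE = 0xA4
LDY_ABSOLUTE = 0xAC


def assemble_ldy(operand_text):
    """Assemble LDY with a hexadecimal operand such as '$F0' or '!$F0'."""
    force_absolute = operand_text[0] in "!|"
    value = int(operand_text.lstrip("!|$"), 16)
    if value < 0x100 and not force_absolute:
        return bytes([LDY_DIRECT_PAGE, value])           # one-byte operand
    return bytes([LDY_ABSOLUTE, value & 0xFF, (value >> 8) & 0xFF])


def direct_page_address(direct_page_register, offset):
    """Bank-zero address reached by a direct page operand."""
    return (direct_page_register + offset) & 0xFFFF
```

Both $F0 and $00F0 assemble to A4 F0, and with the direct page register at $0300 (a value chosen only for illustration) that operand reaches $03F0 rather than $00F0. Only the forced form !$F0 produces AC F0 00.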
Indexing
An array is a table or list in memory of sequentially stored data items of the same type and size.
Accessing any particular item of data in an array requires that you specify both the location of the base of the array
and the item number within the array. Either your program or the processor must translate the item number into
the byte number within the array (they are the same if the items are bytes) and add it to the base location to find
the address of the item to be accessed (see Figure 7.4).
[Figure 7.4. Base = $2000; Index Register X = $03; Effective Address = $2003. Memory locations shown: $2000 through $2004.]
resulting in the values 0, 2, 4, . . . from the array indices 0, 1, 2, . . . and so on, to create an index into this
array of two-byte data items.
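The translation from item number to address can be written out directly. The following is a sketch in Python with invented function names; it shows only the arithmetic, not how a program would carry it out in registers.

```python
def byte_offset(item_number, item_size):
    """Byte number within the array for a given item number."""
    return item_number * item_size


def item_address(base, item_number, item_size=1):
    """Address of an item: array base plus the item's byte offset."""
    return base + byte_offset(item_number, item_size)
```

For an array of bytes based at $2000, item 3 is at $2003. For two-byte items the item numbers 0, 1, 2 become the byte offsets 0, 2, 4, so item 3 is at $2006.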
The 65x processors provide a wide range of indexing addressing modes that provide automatic
indexing capability. In all of them, a value in one of the two index registers specifies the unsigned (positive
integer) index into the array, while the instruction's operand specifies either the base of the array or a pointer to
an indirect address at which the base may be found. Each addressing mode has a special operand field syntax
for specifying the addressing mode to the assembler. It selects the opcode that will correctly instruct the
processor where to find both the base and index.
Some early processors (the 6800, for example) had only one index register; moving data from one array
to another required saving off the first index and loading the second before accessing the second array, then
incrementing the second index and saving it before reloading the first index to again access the first array. The
65x processors were designed with two index registers so data can be quickly moved from an array indexed by
one to a second array indexed by the other.
[Figure 7.5. Base = $C000 in bank one; Index = $8000. Correct result on 65816: $4000 in bank two. Wrapped result on 65802: $4000.]
Often, the index registers are used simultaneously as indexes and as counters within loops in which consecutive
memory locations are accessed.
The 65802 and 65816 index registers can optionally specify sixteen-bit offsets into an array, rather than
eight-bit offsets, if the x index register select flag is clear when an indexed addressing mode is encountered.
This lets simple arrays and other structured data elements be as large as 64K.
On the 6502, 65C02, and 65802, if an index plus its base would exceed $FFFF, it wraps to continue
from the beginning of the 64K bank zero; that is, when index is added to base, any carry out of the low-order
sixteen bits is lost. (See Figure 7.5.)
On the 65816, the same is true of direct page indexing: because the direct page is always located in
bank zero, any time the direct page, plus an offset into the direct page, plus an index exceeds $FFFF, the
address wraps to remain in bank zero.
But as Figure 7.5 shows, whenever a 65816 base is specified by a 24-bit (long) address, or the base is
specified by sixteen bits and assumes the data bank as its bank, then, if an index plus the low-order sixteen bits
of its base exceeds $FFFF, it will temporarily (just for the current instruction) increment the bank. The 65816
assumes that the array being accessed extends into the next bank.
A20500    LDX   #5
BD0022    LDA   $2200,X
If the 65816 is in native mode and the index registers are set to sixteen-bit mode, indexes greater than
$FF can be used, as Fragment 7.5 illustrates.
0000 A00501            LDY   #$105
0003 B90022            LDA   $2200,Y
If the index register plus the constant base exceeds $FFFF, the result will continue beyond the end of
the current 64K data bank into the next bank (the bank byte of the 24-bit address is temporarily incremented by
one). So an array of any length (up to 64K bytes) can be started at any location and absolute indexed addressing
will correctly index into the array, even across a bank boundary. 65802 arrays, however, wrap at the 64K
boundary, since effectively there is only the single 64K bank zero.
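The difference between the two processors comes down to whether the carry out of the low-order sixteen bits is kept. The sketch below models that in Python; the function names are invented for the illustration, and the bank-one example uses the values that appear in Figure 7.5.

```python
def absolute_indexed_65816(data_bank, base, index):
    """24-bit address: a carry out of the low sixteen bits reaches the bank byte."""
    return (((data_bank & 0xFF) << 16) + base + index) & 0xFFFFFF


def absolute_indexed_65802(base, index):
    """Sixteen-bit address: the carry is lost and the result wraps in bank zero."""
    return (base + index) & 0xFFFF
```

A base of $2200 with an index of $105 gives $002305 when the data bank is zero. A base of $C000 in bank one with an index of $8000 gives $024000 on the 65816, while the sixteen-bit sum on the 65802 wraps to $4000.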
Loading the index register with an immediate constant, as in the previous two examples, is of limited
use: if, when writing a program, you know that you want to load the accumulator from $2305, you will generate far
fewer bytes by using absolute addressing:
AD0523    LDA   $2305
The usefulness of indexed addressing becomes clear when you don't know, as you write a program, what the
index into the array will be. Perhaps the program will select among indexes, or calculate one, or retrieve it from
a variable, as in Fragment 7.6.
[Figure: absolute indexed addressing. Effective Address: Bank, High, Low. Instruction: Opcode, Operand Low, Operand High. 65816 Registers: Bank; Index Register (x=1, x=0).]
AE0600            LDX   INDEX
BD0022            LDA   $2200,X
                  .
                  .
                  .
0000      INDEX   DS    2
Fragment 7.6.
It can be useful to be able to put the base of an array into the index register and let it vary, while
keeping the index into the array constant. This is seldom possible with the eight bits of the 6502's and 65C02's
index registers, since they limit the base addresses they can hold to the zero page, but it is a useful capability of
the 65802 and 65816.
For example, suppose, as in Fragment 7.7, you're dealing with dozens (or hundreds) of records in
memory. You need to be able to update the fifth byte (which is a status field) of an arbitrary record. By loading
the base address of the desired record into an index register, you can use a constant to access the status field.
The index into the array, five, is fixed; the array base varies.
Because the index is less than $100, the assembler would normally generate direct page indexing. To
force the assembler to generate absolute indexing, not direct page indexing, you must use the vertical bar (or
exclamation point) in front of the five, as Fragment 7.7 shows. That way, the five is generated as the double-byte operand $0005, an absolute address to which the address in the index register is added to form the absolute
effective address.
          STATUS  GEQU  5
          OK      GEQU  1
          BAD     GEQU  0

18                CLC
FB                XCE

C210              REP   #$10
                  LONGI ON

E220              SEP   #$20
                  LONGA OFF

AE0E00            LDX   REC
A901              LDA   #OK
9D0500            STA   !STATUS,X
                  .
                  .
                  .
0030      REC     DC    A'$3000'
Fragment 7.7
Had the Y index register been used instead of the X in Fragment 7.7, the vertical bar would have been
acceptable but not necessary; direct page,Y addressing, as you will learn in the next section, can only be used
with the LDX and STX instructions, so the assembler would have been forced to use absolute,Y addressing
regardless.
Both absolute,X and absolute,Y can be used by what are called the eight Group I instructions, the
memory-to-accumulator instructions which can use more addressing modes than any others: LDA, STA, ADC,
SBC, CMP, AND, ORA, and EOR. In addition, absolute,X can be used for shifting data in memory,
incrementing and decrementing data in memory, loading the Y register, and for other instructions; but
absolute,Y has only one other use: to load the X register.
Direct Page Indexed with X and Direct Page Indexed with Y Addressing
Arrays based in the direct page (the zero page on the 6502 and 65C02) can be indexed with either the X
register (called Direct Page,X addressing) or the Y register (called Direct Page,Y addressing). However, direct
page,Y addressing is available only for the purpose of loading and storing the X register, while direct page,X is
full-featured.
As is standard with indexed addressing modes, the index, which is specified by the index register, is
added to the array base specified by the operand. Unlike the absolute indexed modes, the array always starts in the
direct page. So the array base, a direct page offset, can be specified with a single byte. The sum of the base and
the index, a direct page offset, must be added to the value in the direct page register to find its absolute address,
as shown in Figure 7.7.
In Fragment 7.8, the accumulator is loaded from a direct page offset base of $32 plus index of $10, or
an offset of $42 from the direct page register's setting.
0000 A21000            LDX   #$10
0003 B532              LDA   $32,X
Remember that the effective address is an offset of $42 from the direct page register and is always in
bank zero. It will correspond to an absolute address of $0042 only when the direct page register is equal to zero
          LDX   #$F0
          LDA   $32,X
In 65802 and 65816 native mode, however, indexes can be sixteen bits, so direct page indexing was
freed of the restriction that the effective address be within the direct page. Arrays always start in the direct
page, but indexing past the end of the direct page extends on through bank zero, except that it wraps when the
result is greater than $FFFF to remain in bank zero (unlike absolute indexing, which temporarily allows access
into the next higher bank).
[Figure 7.7. Effective Address: Bank, High, Low. Instruction: Opcode, Operand. 65816 Registers: Bank = 0000 0000; Direct Page Register (D) + Index Register (x=1, x=0).]
In Fragment 7.10, the accumulator is loaded from the value in the direct page register plus the direct
page base of $12 plus index of $FFF0, or dp:$0002. Note this is in bank zero, not bank one.
C230      REP   #$30
          LONGA ON
          LONGI ON
A2F0FF    LDX   #$FFF0
B512      LDA   $12,X
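The address arithmetic for native-mode direct page indexing can be modelled in one line. This is a sketch in Python with an invented function name; it assumes sixteen-bit index registers and covers only the native-mode rule described above, in which the sum wraps at $FFFF so that the result stays in bank zero.

```python
def direct_page_indexed(direct_page_register, operand, index):
    """Bank-zero effective address for direct page,X in native mode."""
    return (direct_page_register + operand + index) & 0xFFFF
```

With the direct page register at zero, a base of $12 and an index of $FFF0 give $0002 in bank zero. With the direct page register at $2000 (a value chosen only for illustration) the same instruction reaches $2002.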
If the index registers are set to sixteen bits and the array indexes you need to use are all known
constants less than $100, then you can use direct page indexing to access arrays beginning, not just in the direct
page, but anywhere in bank zero memory: load the index register with the sixteen-bit base of the array and
specify the index into the array as the operand constant. This technique would generally only be useful if the
direct page register has its default value of zero.
Accumulator Addressing
Accumulator addressing is only available for the read-modify-write instructions such as shifts and
rotates. The instructions themselves will be explained in subsequent chapters, and use of accumulator
addressing with them will be reviewed in detail.
As a simple addressing mode, accumulator addressing is included in this chapter for the sake of
completeness even though the instructions which use it have not yet been introduced.
Generally, most operations take place upon two operands, one of which is stored in the accumulator, the
other in memory, with the result being stored in the accumulator. Read-modify-write instructions, such as the
shifts and rotates, are unary operations; that is, they have only a single operand, which in the case of
accumulator addressing, is located in the accumulator. There is no reference to external memory in the
accumulator addressing modes. As usual, the result is stored in the accumulator.
The syntax for accumulator addressing, using the ASL (arithmetic shift left) instruction as an example,
is:
0A        ASL   A
Implied Addressing
In implied addressing, the operand of the instruction is implicit in the operation code itself; when the
operand is a register, it is specified in the opcode's mnemonic. Implied operand instructions are therefore
single-byte instructions consisting of opcode only, unlike instructions that reference external memory and as a
result must have operands in subsequent bytes of the instruction.
You have already encountered implied addressing in the previous chapter in the form of the register
transfer instructions and exchanges. Since there are a small number of registers, it is possible to dedicate an
opcode to each specific register's transfer operation. Other instructions that use implied addressing are the
register increments and decrements.
As one-byte instructions, there is no assembler operand field to be coded: You simply code the
assembler mnemonic for the given instruction, as below:
7B        TDC
AA        TAX
9B        TXY
Stack
Stack addressing references the memory location pointed to by the stack register. Typical use of the
stack addressing mode is via the push and pull instructions, which add or remove data to or from the stack area
of memory and which automatically decrement or increment the stack pointer. Examples of the use of push and
pull instructions were given in the previous chapter.
Additionally, the stack is used by the jump to subroutine, return from subroutine, interrupt, and return
from interrupt instructions to automatically store and retrieve addresses and in some cases also the status
register. This form of stack addressing will be covered in Chapter 12, Subroutines, and Chapter 13, System
Control.
The assembler syntax of the push and pull instructions is similar to that of implied instructions; no
operand field is coded, since the operation will always access memory at the stack pointer location.
          LDA   ($80)
This means, as Figure 7.8 illustrates, go to the direct page address $80 and fetch the absolute (sixteen-bit)
address stored there, and then load the accumulator with the data at that address. The low-order byte of the
indirect address is stored at dp:$80, the high-order byte at dp:$81, in typical 65x low/high fashion. Remember, in
the default case where DP equals $0000, the direct page address equals the zero page address, namely
$00:0080.
As explained above, the indirect address stored at the direct page location (pointed to by the instruction
operand) is a sixteen-bit address.
[Figure 7.8. Effective Address: Bank, High, Low. Instruction: Opcode, Operand. 65816 Registers: Bank; direct page in bank 0000 0000.]
The general rule for the 65816 is that when an addressing mode only specifies sixteen bits of the
address, then the bank byte (bits 16-23) of the address is provided by the data bank register. This rule applies
here; but you must first note that the direct page offset which points to the indirect address is itself always
located in bank zero because the direct page itself is always located in bank zero. The examples, however, were
simplified to assume both the data bank and the direct page register to be zero.
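The effective-address rule just described can be modelled in a few lines of Python. This is only a sketch of the address arithmetic, not 65816 code: the function name, the dictionary standing in for memory, and the sample pointer value $2000 are assumptions made for illustration, and emulation-mode page wrapping is ignored.

```python
def direct_page_indirect(memory, operand, dp=0x0000, dbr=0x00):
    """Model the effective address of LDA (operand) in native mode."""
    # The pointer is always fetched from bank zero, at DP + operand.
    pointer = (dp + operand) & 0xFFFF
    low = memory.get(pointer, 0)
    high = memory.get((pointer + 1) & 0xFFFF, 0)
    # The pointer supplies sixteen bits; the data bank register
    # supplies bits 16-23 of the effective address.
    return (dbr << 16) | (high << 8) | low


# Hypothetical contents: the pointer $2000, stored low byte first at dp:$80.
memory = {0x0080: 0x00, 0x0081: 0x20}
print(hex(direct_page_indirect(memory, 0x80)))            # data bank 0
print(hex(direct_page_indirect(memory, 0x80, dbr=0x02)))  # data bank 2
```

Changing only the data bank argument moves the effective address to another bank, while the pointer itself is still read from bank zero.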
The use of indirect addressing allows an address that is referenced numerous times throughout a routine
and is subject to modification (for example, a pointer to a data region) to be modified in only one location and
yet alter the effective address of many instructions.
In Listing 7.1, the data $1234 is moved from location VAR1 to VAR2. Note that the load and store
instructions had the same operand: the symbol DPA, which had been given a value of $80. The indirect address
stored at that location was different in each case, however, resulting in the data being copied from one location
to another. While this example in itself is an inefficient way to move a double-byte word to another location, it
does illustrate the basic method of indirect addressing, which will become quite useful as looping and counting
instructions are added to your working set of 65x instructions.
0000                     KEEP   KL.7.1
0000                     65816  ON
0000            MAIN     START
0000            DPA      EQU    $80
0000  18                 CLC
0001  FB                 XCE
0002  C230               REP    #$30
0004                     LONGA  ON
0004                     LONGI  ON
0004  A01500             LDY    #VAR1
0007  8480               STY    DPA
0009  B280               LDA    (DPA)
000B  A01700             LDY    #VAR2
000E  8480               STY    DPA
0010  9280               STA    (DPA)
0012  38                 SEC
0013  FB                 XCE
0014  60                 RTS
0015  3412      VAR1     DC     A'$1234'
0017  0000      VAR2     DC     A'000'
0019                     END
Listing 7.1
When absolute long addressing is used, the bank address in the operand of the instruction temporarily
overrides the value in the data bank register for the duration of a single instruction. Thus, it is possible to
directly address any memory location within the entire sixteen-megabyte address space.
You will likely find, however, that this form of addressing is one of the less frequently used. There are
two reasons for this: first, it is more efficient to use the shorter sixteen-bit addressing modes, provided that the
data bank register has been appropriately set; second, it is generally undesirable to hard code fixed 24-bit
addresses into an application, as this tends to make the application dependent on being run in a fixed location
within a fixed bank. (An exception to this is the case where the address referenced is an I/O location, which is
fixed by the given system hardware configuration.)
The 65x processors, in general, do not lend themselves to writing entirely position-independent code,
although the 65816 certainly eases this task compared to the 6502 and 65C02. There is, however, no reason
why code should not be written on the 65816 and 65802 to be bank-independent; that is, capable of being
executed from an arbitrary memory bank. But using absolute long addressing will tend to make this difficult if
not impossible.
If you are using a 65802 in an existing system, it is important to note that although the address space of
the 65802 is limited to 64K at the hardware level, internally the processor still works with 24-bit addresses.
One thing this means is that it is legal to use the long addressing modes such as absolute long. But using them
is futile, even wasteful: an extra address byte is required for the bank, but the bank address generated is ignored.
There are cases where forms of long addressing other than absolute long should be used if you are
targeting your code for both the 65802 and the 65816. But generally there is little reason to use the absolute
[Figure: absolute long addressing. Diagram labels: Instruction (Opcode, Operand Low, Operand High, Operand Bank); effective address bytes (Bank, High, Low).]
Note that the first STA instruction in Fragment 7.11 generates a four-byte instruction to store the
accumulator to a bank zero address, while the second STA instruction generates a three-byte instruction to store
the accumulator to the same sixteen-bit displacement but within bank two, the current data bank. Also note that
for both the load and the first store instructions, absolute long addressing causes the current data bank register,
which is set to two, to be overridden.
0000  E220               SEP    #$20
0002                     LONGA  OFF
0002  A902               LDA    #$02
0004  48                 PHA
0005  AB                 PLB
0006  AF9DA303           LDA    $03A39D
000A  8F7F2E00           STA    >$2E7F
000E  8D7F2E             STA    $2E7F
Fragment 7.11
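The difference between the two store instructions in Fragment 7.11 can be checked with a small Python model of how the operand is encoded and how the bank byte is chosen. This is a sketch, not an assembler: the function names are invented for illustration, and only the STA opcodes that appear in the fragment ($8D absolute, $8F absolute long) are handled.

```python
def encode_sta(address, force_long=False):
    """Encode STA using absolute ($8D) or absolute long ($8F) addressing."""
    low = address & 0xFF
    high = (address >> 8) & 0xFF
    bank = (address >> 16) & 0xFF
    if force_long or address > 0xFFFF:
        return bytes([0x8F, low, high, bank])   # four-byte instruction
    return bytes([0x8D, low, high])             # three-byte instruction


def effective_address(encoded, dbr):
    """Return the 24-bit address that the encoded store will write to."""
    low, high = encoded[1], encoded[2]
    # Absolute long carries its own bank byte; absolute borrows the data bank.
    bank = encoded[3] if encoded[0] == 0x8F else dbr
    return (bank << 16) | (high << 8) | low


long_store = encode_sta(0x2E7F, force_long=True)    # STA >$2E7F
short_store = encode_sta(0x2E7F)                    # STA $2E7F
print(long_store.hex(), hex(effective_address(long_store, dbr=0x02)))
print(short_store.hex(), hex(effective_address(short_store, dbr=0x02)))
```

With the data bank register set to two, as in the fragment, the long form stores to bank zero and the short form stores to bank two.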
E220                     SEP    #$20
                         LONGA  OFF
C210                     REP    #$10
                         LONGI  ON
A902                     LDA    #2
48                       PHA
AB                       PLB
AE0080                   LDX    BUFIDX
BF003000                 LDA    >$3000,X
9D0010                   STA    $1000,X
9F00E003                 STA    $03E000,X
Fragment 7.12
[Figure 7-10: the instruction's operand, $80, selects a three-byte indirect address in the direct page in bank 0: low byte at dp:$80, high byte at dp:$81 (+1), bank byte at dp:$82 (+2).]
Figure 7-10 Direct Page Indirect Long Addressing
0000  C220               REP    #$20
0002                     LONGA  ON
0002  E210               SEP    #$10
0004                     LONGI  OFF
0004  A004               LDY    #$04
0006  5A                 PHY
0007  AB                 PLB
0008  A002               LDY    #$02
000A  8482               STY    $82
000C  A90020             LDA    #$2000
000F  8580               STA    $80
0011  B280               LDA    ($80)
0013  8780               STA    [$80]
Fragment 7.13
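The values set up in Fragment 7.13 (data bank 4, pointer $2000 at dp:$80, bank byte 2 at dp:$82) make a convenient test case for comparing the two indirect forms. The Python sketch below models only the address calculation; the function names and the dictionary used as memory are assumptions for illustration.

```python
def _pointer_bytes(memory, dp_offset, dp=0x0000):
    """Fetch the low, high and bank bytes stored at the direct page offset."""
    base = (dp + dp_offset) & 0xFFFF          # always in bank zero
    return tuple(memory.get((base + i) & 0xFFFF, 0) for i in range(3))


def indirect(memory, dp_offset, dbr, dp=0x0000):
    """(dp): sixteen-bit pointer, bank taken from the data bank register."""
    low, high, _ = _pointer_bytes(memory, dp_offset, dp)
    return (dbr << 16) | (high << 8) | low


def indirect_long(memory, dp_offset, dp=0x0000):
    """[dp]: 24-bit pointer, bank taken from the third pointer byte."""
    low, high, bank = _pointer_bytes(memory, dp_offset, dp)
    return (bank << 16) | (high << 8) | low


# State left in the direct page by the first part of Fragment 7.13.
memory = {0x0080: 0x00, 0x0081: 0x20, 0x0082: 0x02}
print(hex(indirect(memory, 0x80, dbr=0x04)))   # LDA ($80)
print(hex(indirect_long(memory, 0x80)))        # STA [$80]
```

The same direct page operand therefore reaches two different banks: the load uses the data bank register, the store uses the bank byte held in the pointer.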
Block Move
Block move addressing is a dedicated addressing mode, available only for two instructions, MVN and
MVP, which have no other addressing modes available to them. These operations were explained in the
previous chapter.
8) Chapter Eight
The Flow of Control
Flow of control refers to the way in which a processor, as it executes a program, makes its way through
the various sections of code. Chapter 1 discussed four basic types of execution: straight-line, selection between
paths, looping, and subroutines. This chapter deals with those instructions that cause the processor to jump or
branch to other areas of code, rather than continuing the default straight-line flow of execution. Such
instructions are essential to selection and looping.
The jump and branch instructions alter the default flow of control by causing the program counter to
be loaded with an entirely new value. In sequential execution, on the other hand, the program counter is
incremented as each byte from the code stream (opcode or operand) is fetched.
The 65x processors have a variety of branch and jump instructions, as shown in Table 8.1. Of these,
when coding in the larger-than-64K environment of the 65816, only the three jumping-long instructions (jump
indirect long, jump absolute long, and jump subroutine long) and the return from subroutine long instruction are
capable of changing the program bank register; that is, of jumping to a segment of code in another bank. All
of the other branch or jump instructions simply transfer within the current bank. In fact, the interrupt
instructions (break, return from interrupt, and coprocessor instructions) are the only others which can change the
program bank; there is no direct way to modify the program counter bank without at the same time modifying
the program counter register because the program counter would still point to the next instruction in the old
bank.
                    Available on:
Mnemonic    6502    65C02    65802/816    Description
BEQ          x        x          x        branch on condition instruction (eight)
JMP          x        x          x        jump absolute
JMP          x        x          x        jump indirect
JSR          x        x          x        jump subroutine absolute
RTS          x        x          x        return from subroutine
BRA                   x          x        branch always (unconditional)
JMP                   x          x        jump absolute indexed indirect
BRL                              x        branch long always (unconditional, 64K range)
JSR                              x        jump to subroutine absolute indexed indirect
JMP                              x        jump indirect long (interbank)
JMP                              x        jump absolute long (interbank)
JSL                              x        jump subroutine long (interbank)
RTL                              x        return from subroutine long (interbank)
Table 8.1
As you may have noticed, all of the flow-of-control instructions (except the return instructions) can be
divided into two categories: jump-type instructions and branch-type instructions. This division is based on
addressing modes: branch instructions use program counter relative addressing modes; jump instructions don't.
Jump instructions can be further split into two groups: those which transfer control to another section of
code, irreversibly, and those which transfer control to a subroutine, a section of code which is meant to
eventually return control to the original (calling) section of code, at the instruction following the jump-to-subroutine instruction.
Jump Instructions
The jump instruction (JMP) can be used with any one of five different 65816 addressing modes (only two of these are
available on the 6502, a third is available on the 65C02) to form an effective address; control then passes to that
address when the processor loads the program counter with it. For example,
4C0020                   JMP    $2000
uses absolute addressing, a mode available to all 65x processors, to pass control to the code located at $2000 in
the current program bank. (Notice that using absolute addressing to access data in the last chapter used the data
bank in place of the program bank.)
In addition to absolute addressing, all of the 65x processors provide a jump instruction with absolute
indirect addressing. While this form of indirect addressing is unique to the jump instruction, it is quite similar
to the direct page indirect addressing mode described in Chapter 7. In this case, the sixteen-bit operand is the
address of a double-byte variable located in bank zero containing the effective address; the effective address is
loaded into the program counter. As with absolute addressing, the program bank remains unchanged (Figure
8.1).
For example, the jump instruction in Fragment 8.1 causes the processor to load the program counter
with the value in the double-byte variable located at $00:2000. Unlike direct page indirect addressing, the
operand is an absolute address rather than a direct page offset. Furthermore, this form of absolute addressing is
unusual in that it always references a location in bank zero, not the current data bank.
0000                     LONGA  ON
0000  C220               REP    #$20
0002  A93412             LDA    #$1234
0005  8F002000           STA    >$2000
0009  6C0020             JMP    ($2000)
Fragment 8.1
The 65C02 added the absolute indexed indirect addressing mode to those available to the jump
instruction. This mode is discussed further in Chapter 12, The Complex Addressing Modes. Although its
effective address calculation is not as simple as the jump absolute or jump absolute indirect, its result is the
same: a transfer of control to a new location.
The 65802 and 65816 added long (24-bit) versions of the absolute and indirect addressing modes. The
absolute long addressing mode has a three-byte operand; the first two bytes are loaded into the program counter
as before, while the third byte is loaded into the program bank register, giving the jump instruction a full 24-bit
absolute addressing mode. For example,
5C4423FF                 JMP    $FF2344
causes the program counter to be loaded with $2344 and the program bank register with $FF. Note
that on the 65802 the bank address is effectively ignored; the jump is to the same
location as the equivalent (sixteen-bit) absolute jump.
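[Figure 8.1: absolute indirect addressing for the jump instruction. Diagram labels: Instruction (Opcode, Operand Low, Operand High); effective address bytes (Bank, High, Low), with the high byte fetched from the operand address +1.]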
When the target of a long jump is in bank zero, say to $00A030, then the assembler has a problem. It
assumes a jump to any address between zero and $FFFF (regardless of whether it's written as $A030 or
$00A030) is a jump within the current program bank, not to another bank, so it will generate an absolute jump,
not a long jump. There are two solutions. One is to use the greater-than sign (>) in front of the operand, which
forces the assembler to override its assumptions and use long addressing:
5C30A000                 JMP    >$A030
The alternative is to use the JML alias, or alternate mnemonic, which also forces a jump to be long, even if the
value of the operand is less than $10000:
5C30A000                 JML    $A030
The final form of the jump instruction is a 24-bit (long) jump using absolute indirect addressing. In the
instruction,
DC0020                   JMP    [$2000]
the operand is the bank zero double-byte address $2000, which locates a triple-byte value; the program counter
low is loaded with the byte at $2000 and the program counter high with the byte at $2001; the program bank
register is loaded with the byte at $2002. A standard assembler will allow the JML (jump long) alias here as
well.
Notice that absolute indirect long jumps are differentiated from absolute indirect jumps within the same
bank by using parentheses for absolute indirect and square brackets for absolute indirect long. In both cases
the operand, an absolute address, points to a location in bank zero.
The jump instructions change no flags and affect no registers other than the program counter.
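The four jump forms discussed in this section (absolute, absolute indirect, absolute long, and absolute indirect long) can be summarized in a short Python model that returns the program bank and program counter after the jump. It is a sketch under simplifying assumptions: the mode names, the function, and the bank byte $05 in the sample memory are hypothetical, and the indexed indirect form is left out.

```python
def jump(mode, operand, pbr, memory=None):
    """Return (program bank, program counter) after a JMP in the given mode."""
    memory = memory or {}

    def bank0(address):
        # Indirect jump operands always locate their pointer in bank zero.
        return memory.get(address & 0xFFFF, 0)

    if mode == "absolute":            # JMP $2000
        return pbr, operand & 0xFFFF
    if mode == "absolute_long":       # JMP $FF2344
        return (operand >> 16) & 0xFF, operand & 0xFFFF
    if mode == "indirect":            # JMP ($2000)
        return pbr, bank0(operand) | (bank0(operand + 1) << 8)
    if mode == "indirect_long":       # JMP [$2000]
        pc = bank0(operand) | (bank0(operand + 1) << 8)
        return bank0(operand + 2), pc
    raise ValueError("unknown mode: " + mode)


# $1234 stored at $00:2000 as in Fragment 8.1; the bank byte is hypothetical.
memory = {0x2000: 0x34, 0x2001: 0x12, 0x2002: 0x05}
for mode, operand in [("absolute", 0x2000), ("absolute_long", 0xFF2344),
                      ("indirect", 0x2000), ("indirect_long", 0x2000)]:
    bank, pc = jump(mode, operand, pbr=0x01, memory=memory)
    print(f"{mode:14s} -> ${bank:02X}:{pc:04X}")
```

Only the two long forms return a bank different from the one passed in, which is the point the text makes about interbank transfers.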
Conditional Branching
While the jump instructions provide the tools for executing a program made up of disjoined code
segments or for looping, they provide no way to conditionally break out of a loop or to select between paths.
These are the jobs of the conditional branch instructions.
The jump instruction requires a minimum of three bytes to transfer control anywhere in a 64K range. But
selection between paths is needed so frequently, and for the most part for short hops, that using three bytes would
tend to be unnecessarily costly in memory usage. To save memory, branches use an addressing mode called
program counter relative, which requires just two bytes; the branch opcode is followed by a one-byte operand:
a signed, two's-complement offset from the current program location.
When a conditional branch instruction is encountered, the processor first tests the value of a status
register flag for the condition specified by the branch opcode. If the branch condition is false, the processor
ignores the branch instruction and goes on to fetch and execute the next instruction from the next sequential
program location. If, on the other hand, the branch condition is true, then the processor transfers control to the
effective address formed by adding the one-byte signed operand to the value currently in the program counter
(Figure 8.2).
As Chapter 1 notes, positive numbers are indicated by a zero in the high bit (bit seven), negative
numbers by a one in the high bit. Branching is limited by the signed one-byte operands to 127 bytes forward or
128 bytes backward, counting from the end of the instruction. Because a new value for the program counter
must be calculated if the branch is taken, an extra execution cycle is required. Further, the 6502 and 65C02
(and 65802 and 65816 in emulation mode) require an additional cycle if the branch crosses a page boundary.
The native mode 65802 and 65816 do not require the second additional cycle, because they use a sixteen-bit
(rather than eight-bit) adder to make the calculation.
The program counter value to which the operand is added is not the address of the branch instruction
but rather the address of the opcode following the branch instruction. Thus, measured from the branch opcode
itself, branching is limited to 129 bytes forward and 126 bytes backward. A conditional branch instruction with
an operand of zero will continue with the next instruction regardless of whether the condition tested is true or
false. A branch with an operand of zero is thus a two-byte no-operation instruction, with a variable (by one
cycle) execution time, depending on whether the branch is or isn't taken.
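The offset arithmetic described in the last few paragraphs can be worked through in Python. The two helper functions are illustrative names, not part of any assembler; they assume a two-byte branch instruction and wrap the result within the current 64K bank.

```python
def branch_target(branch_address, operand):
    """Address reached when a two-byte branch at branch_address is taken."""
    # Interpret the operand byte as a signed two's-complement offset.
    offset = operand - 0x100 if operand & 0x80 else operand
    # The offset is added to the address of the following opcode.
    return (branch_address + 2 + offset) & 0xFFFF


def branch_operand(branch_address, target):
    """Operand byte an assembler would emit, or ValueError if out of range."""
    offset = target - (branch_address + 2)
    if not -128 <= offset <= 127:
        raise ValueError("target is out of range for a short branch")
    return offset & 0xFF


print(hex(branch_target(0x0009, 0xFB)))     # a backward branch
print(hex(branch_operand(0x0003, 0x0008)))  # a forward branch
print(hex(branch_target(0x2000, 0x00)))     # zero operand: next instruction
```

Measured from the branch opcode itself, the farthest reachable targets are 129 bytes forward and 126 bytes backward, which the range check above enforces.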
The 65x processors have eight instructions which let your programs branch based on the settings of four
of the condition code flag bits in the status register: the zero flag, the carry flag, the negative flag, and the
overflow flag.
None of the conditional branch instructions change any of the flags, nor do they affect any registers
other than the program counter, which they affect only if the condition being tested for is true. The most recent
flag value always remains valid until the next flag-modifying instruction is executed.
[Figure 8.2: program counter relative addressing. Diagram labels: Instruction (Opcode, Operand); 65816 Registers: Program Counter (PC); effective address bytes (Bank, High, Low).]
0000  AC0080             LDY    NEXTNODE
0003  A90080             LDA    #ROOT
0006  AA        LOOP     TAX
0007  B500               LDA    0,X
0009  D0FB               BNE    LOOP
000B  98                 TYA
000C  9500               STA    0,X
000E  AA                 TAX
000F  7400               STZ    0,X
0011
Fragment 8.2
The routine hinges on the BNE instruction found half-way through the code; until the zero element is
reached, the processor continues looping through as many linked records as exist. Notice that the routine has
no need to know how many elements there are or to count them as it adds a new element. Figure 8.3 pictures
such a linked list.
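[Figure 8.3: a linked list before and after insertion. Each record has a data area and a two-byte link field. Before: the record at $1200 links to $1250 (link bytes $50, $12), and the record at $1250 has a zero link (end of list). After: the link at $1250 points to the inserted record at $1300 (link bytes $00, $13), whose zero link is the new end of list.]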
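The list-walking logic of Fragment 8.2 can be restated in Python to make the role of the zero link explicit. This is a model of the algorithm rather than of the processor: memory is a dictionary of sixteen-bit words keyed by address, and the record addresses $1200, $1250, and $1300 are the ones shown in Figure 8.3.

```python
def append_node(words, root, new_node):
    """Walk the links from root and attach new_node after the last record."""
    x = root
    # Equivalent of TAX / LDA 0,X / BNE LOOP: follow links until one is zero.
    while words[x] != 0:
        x = words[x]
    words[x] = new_node     # TYA / STA 0,X: old last record now links onward
    words[new_node] = 0     # TAX / STZ 0,X: new record becomes the end
    return x                # address of the record that used to be last


# Link fields only: $1200 -> $1250 -> end of list.
links = {0x1200: 0x1250, 0x1250: 0x0000}
previous_end = append_node(links, root=0x1200, new_node=0x1300)
print(hex(previous_end), {hex(k): hex(v) for k, v in links.items()})
```

As in the assembly version, nothing counts the records; the loop ends only because a zero link is eventually found.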
The two conditional branch instructions that check the zero flag are also frequently used following a
subtraction or comparison to evaluate the equality or inequality of two values. Their use in arithmetic, logical,
and relational expressions will be covered in more detail, with examples, in the next few chapters.
                         SEC
                         BCS    NEWCODE
Since the code which follows this use of the BCS instruction will never be executed due to failure of the
condition test, it should be documented as acting like a branch-always instruction.
The 6502 emulation mode of the 65802 and 65816 can be toggled on or off only by exchanging the
carry bit with the emulation bit; so the only means of testing whether the processor is in emulation mode or
native mode is to exchange the emulation flag with the carry flag and test the carry flag, as in Fragment 8.3.
Note that CLC, XCE, and BCS instructions themselves always behave the same regardless of mode.
                         .
                         .
18                       CLC
FB                       XCE
B0FC                     BCS    EMHAND
                         .
                         .
                         .
Fragment 8.3
Arithmetic and logical uses of branching based on the carry flag will be discussed in the next two
chapters.
Branching Based on the Negative Flag
The negative flag bit in the status register indicates whether the result of an arithmetic, logical, load, pull,
or transfer operation is negative or positive when considered as a two's-complement number. A negative result
causes the flag to be set; a zero or positive result causes the flag to be cleared. The processor determines the
sign of a result by checking to see if the high-order bit is set or not. A two's-complement negative number
always has its high-order bit set; a positive number always has it clear.
The BMI (branch-minus) instruction is used to branch when a result is negative, or whenever a specific
action needs to be taken if the high-order (sign) bit of a value is set. Execution of the BPL (branch-plus)
instruction will cause a branch whenever a result is positive or zero; that is, when the high-order bit is clear.
The ease with which these instructions can check the status of the high-order bit has not been lost on
hardware designers. For example, the Apple II keyboard is read by checking a specific memory location
(remember, the 65x processors use memory-mapped I/O). Like most computer I/O devices, the keyboard
generates ASCII codes in response to key presses. The code returned by the keyboard only uses the low-order
seven bits; this leaves the eighth bit free to be used as a special flag to determine if a key has been pressed since
the last time a key was retrieved. To wait for a keypress, a routine (see Fragment 8.4) loops until the high-order
bit of the keyboard I/O location is set.
0000            KEYBD    GEQU   $C000
0000            KSTRB    GEQU   $C010
0000  E230               SEP    #$30
0002  AD00C0    LOOP     LDA    KEYBD
0005  10FB               BPL    LOOP
0007  8D10C0             STA    KSTRB
000A                     .
000A                     .
000A                     .
Fragment 8.4
The STA KSTRB instruction that follows a successful fetch is necessary to tell the hardware that a key
has been read; it clears the high-order bit at the KEYBD location so that the next time the routine is called, it
will again loop until the next key is pressed.
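The polling protocol can be imitated in Python with a stand-in for the keyboard hardware. Everything about the stand-in is an assumption made for the sketch: the class, its method names, and the idea that a key arrives after a fixed number of polls. Only the protocol itself (wait for bit seven, then clear the strobe) comes from the text.

```python
class Keyboard:
    """Toy model of the KEYBD ($C000) and KSTRB ($C010) locations."""

    def __init__(self, key, ready_after):
        self.key = ord(key) & 0x7F
        self.ready_after = ready_after
        self.polls = 0
        self.latch = 0x00

    def read_keybd(self):                 # LDA KEYBD
        self.polls += 1
        if self.polls == self.ready_after:
            self.latch = 0x80 | self.key  # key code with the flag bit set
        return self.latch

    def write_kstrb(self):                # STA KSTRB
        self.latch &= 0x7F                # clear the "key pressed" flag


def wait_key(keyboard, max_polls=1000):
    """Loop until bit seven is set, acknowledge the key, return the raw byte."""
    for _ in range(max_polls):            # the real loop has no such limit
        value = keyboard.read_keybd()
        if value & 0x80:                  # BPL LOOP falls through
            keyboard.write_kstrb()
            return value
    raise RuntimeError("no key arrived")


kbd = Keyboard("A", ready_after=3)
print(hex(wait_key(kbd)), kbd.polls, hex(kbd.latch))
```

The returned byte still has its high bit set, just as the accumulator does after the loop in Fragment 8.4; a caller wanting plain ASCII would mask it with $7F.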
Remember that the high-order or sign bit is always bit seven on a 6502 or 65C02 or, on the 65802 and
65816, if the register loaded is set to an eight-bit mode. If a register being used on the 65802 or 65816 is set to
sixteen-bit mode, however, then the high bit (the bit that affects the negative flag) is bit fifteen.
0000  AD0080             LDA    CONTROL
0003  F003               BEQ    DONE
0005  4C0080             JMP    TOP
0008            DONE     ANOP
0008                     .
0008                     .
Fragment 8.5
The price of having efficient two-byte short branches is that you must use five bytes to simulate a long
conditional branch.
Many times it is possible and sensible to branch to another nearby flow of control statement and
use it to puddle-jump to your final target. Sometimes you will find the branch or jump statement you
need for puddle-jumping already within your code because it's not unusual for two or more segments
of code to conditionally branch to the same place. This method costs you no additional code, but you
should document the intermediate branch, noting that it's being used as a puddle-jump. Should you
change it later, you won't inadvertently alter its use by the other branch.
Each of the 65x branch instructions is based on a single status bit. Some arithmetic conditions,
however, are based on more than one flag being changed. There are no branch instructions available for the
relations of unsigned greater than and unsigned less than or equal to; these relations can only be determined by
examining more than one flag bit. There are also no branch instructions available for signed comparisons, other
than equal and not equal. How to synthesize these options is described in the following chapter.
Unconditional Branching
The 65C02 introduced the BRA (branch always, or unconditional branch) instruction, to the relief of
6502 programmers; they had found that a good percentage of the jump instructions coded were for short
distances, within the range of a branch instruction.
Having an unconditional branch available makes creating relocatable code easier. Every program must
have a starting address, or origin, specified, which tells the assembler where in memory the program will be
                         ORG    $2000
                MAIN     START
4C0500                   JMP    BEGCODE
77              DATA1    DC     H'77'
88              DATA2    DC     H'88'
                BEGCODE  ANOP
                         .
                         .
                         .
Fragment 8.6
program is now position-independent. If executed at $2000, the branch is located at $2000; the program counter
value before the branch's operand is added is $2002; the result of the addition is $2004, the location of
BEGCODE. Load and execute the program instead at $2200, and the branch is located at $2200; the program
counter value before the branch operand is added is $2202; the result of the addition is $2204, which is the new
location of BEGCODE.
0000                     ORG    $2000
0000            MAIN     START
0000  8002               BRA    BEGCODE
0002  77        DATA1    DC     H'77'
0003  88        DATA2    DC     H'88'
0004  AD0200    BEGCODE  LDA    DATA1
0007                     .
0007                     .
0007                     .
Fragment 8.7
Because the operand of a branch instruction is always relative to the program counter, its effective
address can only be formed by using the program counter. Programs that use branches rather than jumps may be
located anywhere in memory.
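A short Python comparison shows why the branch version relocates and the jump version does not. The BRA bytes 80 02 are taken from Fragment 8.7; the JMP bytes 4C 05 20 are a hypothetical encoding of a jump to $2005 (just past the jump and two data bytes, for code assembled to run at $2000), and the function name is invented for the sketch.

```python
def transfer_target(code, load_address):
    """Where control goes when the instruction at load_address executes."""
    opcode = code[0]
    if opcode == 0x80:                       # BRA: relative to the program counter
        offset = code[1] - 0x100 if code[1] & 0x80 else code[1]
        return (load_address + 2 + offset) & 0xFFFF
    if opcode == 0x4C:                       # JMP absolute: fixed at assembly time
        return code[1] | (code[2] << 8)
    raise ValueError("only BRA and JMP absolute are modelled")


bra = bytes([0x80, 0x02])
jmp = bytes([0x4C, 0x05, 0x20])
for load_address in (0x2000, 0x2200):
    print(hex(load_address),
          hex(transfer_target(bra, load_address)),
          hex(transfer_target(jmp, load_address)))
```

Loaded at $2200, the branch still lands two bytes past its own end, while the jump still goes to the address that was fixed when the code was assembled for $2000.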
6502 programmers in need of relocatability get around the lack of an unconditional branch instruction
by using the technique described earlier of setting a flag to a known value prior to executing a branch-on-that-condition instruction.
Even with the unconditional branch instruction, however, relocatability can still be a problem if the
need for branching extends beyond the limits imposed by its eight-bit operand. There is some help available on
the 6502 and 65C02 in the form of the absolute indirect jump, which can be loaded with a target that is
calculated at run time.
9) Chapter Nine
Built-In Arithmetic Functions
With this chapter you make your first approach to the heart of the beast: the computer as an automated
calculator. Although their applications cover a broad range of functions, computers are generally associated
first and foremost with their prodigious calculating abilities. Not without reason, for even in character-oriented
applications such as word processing, the computer is constantly calculating. At the level of the processor
itself, everything from instruction decoding to effective address generation is permeated by arithmetic or
arithmetic-like operations. At the software implementation level, the program is constantly calculating
horizontal and vertical cursor location, buffer pointer locations, indents, page numbers, and more.
But unlike dedicated machines, such as desk-top or pocket calculators, which are merely calculators, a
computer is a flexible and generalized system which can be programmed and reprogrammed to perform an
unlimited variety of functions. One of the keys to this ability lies in the computer's ability to implement control
structures, such as loops, and to perform comparisons and select an action based on the result. Because this
chapter introduces comparison, the elements necessary to demonstrate these features are complete. The other
key element, the ability to branch on condition, was presented in the previous chapter. This chapter therefore
contains the first examples of these control structures, as they are implemented on the 65x processor.
Armed with the material presented in Chapter 1 about positional notation as it applies to the binary and
hexadecimal number systems, as well as the facts concerning two's-complement binary numbers and binary
arithmetic, you should possess the background required to study the arithmetic instructions available on the 65x
series of processors.
Consistent with the simple design approach of the 65x family, only elementary arithmetic functions are
provided, as listed in Table 9.1, leaving the rest to be synthesized in software. There are, for example, no built-in
integer multiply or divide. More advanced examples presented in later chapters will show how to synthesize
these more complex operations.
                    Available on:
Mnemonic    6502    65C02    65802/816    Description
Increment Instructions:
DEC          x        x          x        decrement
DEX          x        x          x        decrement index register X
DEY          x        x          x        decrement index register Y
INC          x        x          x        increment
INX          x        x          x        increment index register X
INY          x        x          x        increment index register Y
Arithmetic Instructions:
ADC          x        x          x        add with carry
SBC          x        x          x        subtract with borrow
Compare Instructions:
CMP          x        x          x        compare accumulator
CPX          x        x          x        compare index register X
CPY          x        x          x        compare index register Y
Table 9.1
C230                     REP    #$30          16-bit registers
                         LONGA  ON
                         LONGI  ON
A0FF7F                   LDY    #$7FFF
C8                       INY
Fragment 9.1
In a similar example, Fragment 9.2, the Y register is loaded with the highest possible value which can
be represented in sixteen bits (all bits turned on).
0000  C230               REP    #$30
0002                     LONGA  ON
0002                     LONGI  ON
0002  A0FFFF             LDY    #$FFFF
0005  C8                 INY                  z = 1 in status register
Fragment 9.2
                        1      one to be added
  + 1111 1111 1111 1111      binary equivalent of $FFFF
  1 0000 0000 0000 0000      result is $10000
Since there are no longer any extra bits available in the sixteen-bit register, however, the low-order
sixteen bits of the number in Y (that is, zero) do not represent the actual result. As you will see later, addition
and subtraction instructions use the carry flag to reflect a carry out of the register, indicating that a number
larger than can be represented using the current word size (sixteen bits in the above example) has been
generated. While increment and decrement instructions do not affect the carry, a zero result in the Y register
after an increment (indicated by the z status flag being set) shows that a carry has been generated, even though
the carry flag itself does not indicate this.
A classic example of this usage is found in Fragment 9.3, which shows the technique commonly used
on the eight-bit 6502 and 65C02 to increment a sixteen-bit value in memory. Note the branch-on-condition
instruction, BNE, which was introduced in the previous chapter, is being used to indicate if any overflow from
the low byte requires the high byte to be incremented, too. As long as the value stored at the direct page
location ABC is non-zero following the increment operation, processing continues at the location SKIP. If
ABC is zero as a result of the increment operation, a page boundary has been crossed, and the high-order byte of
the value must be incremented; otherwise, the sixteen-bit value would wrap around within the low byte.
0000  EE0080    TOP      INC    ABC
0003  D003               BNE    SKIP
0005  EE0180             INC    ABC+1
0008            SKIP     .
0008                     .
0008                     .
0008                     .
Fragment 9.3
Such use of the z flag to detect carry (or borrow) is peculiar to the increment and decrement operations:
if you could increment or decrement by values other than one, this technique would not work consistently, since
it would be possible to cross the threshold (zero) without actually landing on it (you might, for example, go
from $FFFF to $0001 if the step value was 2).
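The same test-for-zero technique can be written out in Python to show why it works for steps of one and fails for larger steps. The function is a model of the INC / BNE / INC sequence only; its name and the use of a dictionary for memory are assumptions for the sketch.

```python
def increment16(memory, address):
    """Increment a sixteen-bit value stored low byte first, one byte at a time."""
    memory[address] = (memory[address] + 1) & 0xFF              # INC ABC
    if memory[address] == 0:                                    # BNE SKIP not taken
        memory[address + 1] = (memory[address + 1] + 1) & 0xFF  # INC ABC+1


memory = {0x80: 0xFF, 0x81: 0x12}     # the value $12FF
increment16(memory, 0x80)
print(hex(memory[0x81]), hex(memory[0x80]))

# With a step of two the counter can pass zero without landing on it,
# so a zero test alone would miss the carry.
print(hex((0xFFFF + 2) & 0xFFFF))
```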
A zero result following a decrement operation, on the other hand, indicates that the next decrement
operation will cause a borrow to be generated. In Fragment 9.4, the Y register is loaded with one, and then one
is subtracted from it by the DEY instruction. The result is clearly zero; however, if Y is decremented again,
$FFFF will result. If you are treating the number as a signed, two's-complement number, this is just fine, as
$FFFF is equivalent to a sixteen-bit, negative one. But if it is an unsigned number, a borrow exists.
0000  C230               REP    #$30          16-bit registers
0002                     LONGA  ON
0002                     LONGI  ON
0002  A00100             LDY    #$0001
0005  88                 DEY
Fragment 9.4
Together with the branch-on-condition instructions introduced in the previous chapter, you can now
efficiently implement one of the most commonly used control structures in computer programming, the
program loop.
A rudimentary loop would be a zero-fill loop; that is, a piece of code to fill a range of memory with
zeroes. Suppose, as in Listing 9.1, the memory area from $4000 to $5FFF was to be zeroed (for example, to
clear hi-res page two graphics memory in the Apple II). By loading an index register with the size of the area to
be cleared, the memory can be easily accessed by indexing from an absolute base of $4000.
0000                     KEEP   KL.9.1
0000                     65816  ON
0000            L91      START
0000  18                 CLC
0001  FB                 XCE
0002            BASE     GEQU   $4000
0002            COUNT    GEQU   $2000
0002  C230               REP    #$30
0004                     LONGA  ON
0004                     LONGI  ON
0004  A2FE1F             LDX    #COUNT-2
0007  9E0040    LOOP     STZ    BASE,X
000A  CA                 DEX
000B  CA                 DEX
000C  10F9               BPL    LOOP
000E  38        DONE     SEC
000F  FB                 XCE
0010  60                 RTS
0011                     END
Listing 9.1
The loop itself is then entered for the first time, and the STZ instruction is used to clear the memory
location formed by adding the index register to the constant BASE. Next come two decrement instructions; two
are needed because the STZ instruction stored a double-byte zero. By starting at the end of the memory range
and indexing down, it is possible to use a single register for both address generation and loop control. A simple
comparison, checking to see that the index register is still positive, is all that is needed to control the loop.
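The control flow of the zero-fill loop can be traced in Python. The sketch models the X register as a sixteen-bit value and the BPL test as a check of bit fifteen; BASE and COUNT come from the listing, while the function name and the bytearray standing in for memory are assumptions.

```python
BASE = 0x4000
COUNT = 0x2000


def zero_fill(memory):
    """Clear COUNT bytes starting at BASE, two bytes per pass, top down."""
    x = COUNT - 2                       # LDX #COUNT-2
    while True:
        memory[BASE + x] = 0            # STZ BASE,X stores a double-byte zero
        memory[BASE + x + 1] = 0
        x = (x - 2) & 0xFFFF            # DEX, DEX (sixteen-bit wraparound)
        if x & 0x8000:                  # BPL LOOP falls through once X is negative
            break
    return memory


memory = zero_fill(bytearray(b"\xFF" * 0x6002))
print(memory[0x3FFF], memory[0x4000], memory[0x5FFF], memory[0x6000])
```

The loop ends when X steps from $0000 to $FFFE, the first value with bit fifteen set. The same test would end the loop immediately if the starting index were $8000 or larger, so the technique as shown suits areas smaller than 32K.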
Another concrete example of a program loop is provided in Listing 9.2, which toggles the built-in
speaker in an Apple II computer with increasing frequency, resulting in a tone of increasing pitch. It features an
outer driving loop (TOP), an inner loop that produces a tone of a given pitch, and an inner-most delay loop.
The pitch of the tone can be varied by using different initial values for the loop indices.
0000                     KEEP   KL.9.2
0000                     65816  ON
0000            L92      START
0000  18                 CLC
0001  FB                 XCE
0002  E230               SEP    #$30
0004                     LONGA  OFF
0004                     LONGI  OFF
0004            BELL     GEQU   $C030
0004  A200               LDX    #0
0006  8A                 TXA                  X, now in A, initializes the delay loop
0007  9B        TOP      TXY                  initialize X & Y to 0
0008  8D30C0    LOOP     STA    BELL
000B  8A                 TXA
000C  3A        DELAY    DEC    A
000D  D0FD               BNE    DELAY
000F  88                 DEY
0010  D0F6               BNE    LOOP
0012  CA                 DEX
0013  D0F2               BNE    TOP
0015  38                 SEC
0016  FB                 XCE
0017  60                 RTS
0018                     END
Listing 9.2
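[Worked examples garbled in extraction: the decimal addition of 56 and 72, followed by an eight-bit binary addition whose result is 1 0010 1101, the leading 1 being the carry out of bit seven.]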
If you begin by adding the binary digits from the right and marking the sum in the proper column, and
then placing any carry that results at the top of the next column to the left, you will find that a carry results
when the ones in column seven are added together. However, since the accumulator is only eight bits wide,
there is no place to store this value; the result has overflowed the space allocated to it. In this case, the final
carry is stored in the carry flag after the operation. If there had been no carry, the carry flag would be reset to
zero.
The automatic generation of a carry flag at the end of an addition is complemented by a second feature
of this instruction that is executed at the beginning of the instruction: the ADC instruction itself always adds the
previously generated one-bit carry flag value with the right-most column of binary digits. Therefore, it is
always necessary to explicitly clear the carry flag before adding two numbers together, unless the numbers
being added are succeeding words of a multi-word arithmetic operation. By adding in a previous value held in
the carry flag, and storing a resulting carry there, it is possible to chain together several limited-precision (each
only eight or sixteen bits) arithmetic operations.
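The carry-in and carry-out behavior described above can be modeled in a few lines. The following Python sketch is an illustration only, not 65x code: the name `adc8` is hypothetical, and the operands $82 and $AB are arbitrary values chosen so that the sum overflows eight bits.

```python
def adc8(a, operand, carry_in):
    """Model an eight-bit binary ADC: return (result, carry_out)."""
    total = a + operand + carry_in   # the carry flag is always added in
    return total & 0xFF, total >> 8  # low eight bits stay; bit 8 becomes the carry

# CLC, then add $82 and $AB
result, carry = adc8(0x82, 0xAB, 0)
print(f"${result:02X} carry={carry}")
```

If the carry had been left set from an earlier operation, the same call with `carry_in=1` would return a result one too large, which is why the carry is cleared before a single-precision addition.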
First, consider how you would represent an unsigned binary number greater than $FFFF (decimal
65,535), that is, one that cannot be stored in a single double-byte cell. Suppose the number is $023A8EF1.
This would simply be stored in memory in four successive bytes, from low to high order, as follows, beginning
at $1000:
    1000    F1
    1001    8E
    1002    3A
    1003    02
Since the number is greater than the largest available word size of the processor (double byte), any arithmetic
operations performed on this number will have to be treated as multiple-precision operations, where only one
part of a number is added to the corresponding part of another number at a time. As each part is added, the
intermediate result is stored; then the next part is added, and so on, until all of the parts of the number have been added.
Multiple-precision operations always proceed from low-order part to high-order part because the carry
is generated from low to high, as seen in our original addition of decimal 56 to 72.
Listing 9.3 is an assembly language example of the addition of multi-precision numbers $023A8EF1 to
$0000A2C1. This example begins by setting the accumulator word size to sixteen bits, which lets you process
half of the four-byte addition in a single operation. The carry flag is then cleared because there must be no
initial carry when an add operation begins. The two bytes stored at BIGNUM and BIGNUM+1 are loaded into
the double-byte accumulator. Note that the DC I4 assembler directive automatically stores the four-byte
integer constant value in memory in low-to-high order. The ADC instruction is then executed, adding $8EF1 to
$A2C1.
0000                    KEEP  KL.9.3
0000                    65816 ON

0000           L93      START
0000 18                 CLC
0001 FB                 XCE
0002 C220               REP   #$20
0004                    LONGA ON
0004 18                 CLC
0005 AD1A00             LDA   BIGNUM
0008 6D1E00             ADC   NEXTNUM
000B 8D2200             STA   RESULT
000E AD1C00             LDA   BIGNUM+2
0011 6D2000             ADC   NEXTNUM+2
0014 8D2400             STA   RESULT+2
0017 38                 SEC
0018 FB                 XCE
0019 60                 RTS
001A F18E3A02  BIGNUM   DC    I4'$023A8EF1'
001E C1A20000  NEXTNUM  DC    I4'$0000A2C1'
0022 00000000  RESULT   DS    4
0026                    END

Listing 9.3
         1 11 1 1      1
      1000 1110 1111 0001
    + 1010 0010 1100 0001
      -------------------
      0011 0001 1011 0010
The sixteen-bit result found in the accumulator after the ADC is executed is $31B2; however, this is clearly
incorrect. The correct answer, $131B2, requires seventeen bits to represent it, so an additional result of the ADC
operation in this case is that the carry flag in the status register is set. Meanwhile, since the value in the
accumulator consists of the correct low-order sixteen bits, the accumulator is stored at RESULT and
RESULT+1.
With the partial sum of the last operation saved, the high-order sixteen bits of BIGNUM are loaded
(from BIGNUM+2) into the accumulator, followed immediately by the ADC NEXTNUM + 2 instruction,
which is not preceded by CLC this time. For all but the first addition of a multiple-precision operation, the
carry flag is not cleared; rather, the setting of the carry flag from the previous addition is allowed to be
automatically added into the next addition. You will note in the present example that the high-order sixteen bits
of NEXTNUM are zero; it almost seems unnecessary to add them. At the same time, remember that there was
a carry left over from the first addition; when the ADC NEXTNUM + 2 instruction is executed, this carry is
automatically added in; that is, the resulting value in the accumulator is equal to the carry flag (1) plus the
original value in the accumulator ($023A) plus the value at the address NEXTNUM + 2 ($0000), or $023B.
This is then stored in the high-order bytes of RESULT, which leaves the complete, correct value stored in
locations RESULT through RESULT + 3 in low-high order:
    RESULT       - B2
    RESULT + 1   - 31
    RESULT + 2   - 3B
    RESULT + 3   - 02
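The chained addition of Listing 9.3 can be checked with a short model. This Python sketch is only an illustration under the assumption that each ADC behaves as a sixteen-bit add with carry in and carry out; the names `adc16` and `add32` are hypothetical.

```python
def adc16(a, operand, carry_in):
    """Model a sixteen-bit binary ADC: return (result, carry_out)."""
    total = a + operand + carry_in
    return total & 0xFFFF, total >> 16

def add32(x, y):
    """Add two 32-bit values as two chained 16-bit additions, low word first."""
    low, carry = adc16(x & 0xFFFF, y & 0xFFFF, 0)   # CLC, then ADC on the low words
    high, carry = adc16(x >> 16, y >> 16, carry)    # no CLC: the carry is chained in
    return (high << 16) | low, carry

total, carry = add32(0x023A8EF1, 0x0000A2C1)
print(f"${total:08X}")
# Bytes in the order they are stored from RESULT through RESULT + 3
print(" ".join(f"{b:02X}" for b in total.to_bytes(4, "little")))
```

The first call to `adc16` returns $31B2 with the carry set, and the second adds that carry to $023A, giving $023B in the high word.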
Comparison
The comparison operation (is VALUE1 equal to VALUE2, for example) is implemented on the 65x,
as on most processors, as an implied subtraction. In order to compare VALUE1 to VALUE2, one of the values
is subtracted from the other. Clearly, if the result is zero, then the numbers are equal.
This kind of comparison can be made using the instructions you already know, as Fragment 9.5
illustrates. In this fragment, you can see that the branch to TRUE will be taken, and the INC VAL instruction
never executed, because $1234 minus $1234 equals zero. Since the results of subtractions condition the z flag,
the BEQ instruction (which literally means branch if result equal to zero), in this case, means branch if the
compared values are equal.
C230                REP   #$30               16-bit registers
                    LONGA ON
                    LONGI ON

9C1200              STZ   VAL
A93412              LDA   #$1234
38                  SEC
E93412              SBC   #$1234             subtract another
F003                BEQ   TRUE               if they are the same, leave VAL zero
EE1200              INC   VAL                if they are different, set VAL
60         TRUE     RTS
0000       VAL      DS    2

Fragment 9.5
There are two undesirable aspects of this technique, however, if comparison is all that is desired rather
than actual subtraction. First, because the 65x subtraction instruction expects the carry flag to be set for single
precision subtractions, the SEC instruction must be executed before each comparison using SBC. Second, it is
not always desirable to have the original value in the accumulator lost when the result of the subtraction is
stored there.
Because comparison is such a common programming operation, there is a separate compare instruction,
CMP. Compare subtracts the value specified in the operand field of the instruction from the value in the
accumulator without storing the result; the original accumulator value remains intact. Status flags normally
affected by a subtraction (z, n, and c) are set to reflect the result of the subtraction just performed.
Additionally, the subtraction is performed as though the carry flag had been set beforehand, as it should be for a single-precision subtraction, so no SEC is needed. (Unlike the ADC and SBC instructions, CMP does not set the overflow flag,
complicating signed comparisons somewhat, a problem which will be covered later in this chapter.)
Given the flags that are set by the CMP instruction, and the set of branch-on-condition instructions, the
relations shown in Table 9.2 can be easily tested for. A represents the value in the accumulator, DATA is the
value specified in the operand field of the instruction, and Bxx is the branch-on-condition instruction that causes
a branch to be taken (to the code labelled TRUE) if the indicated relationship is true after a comparison.
    BEQ    TRUE      branch if A = DATA
    BNE    TRUE      branch if A <> DATA
    BCC    TRUE      branch if A < DATA
    BCS    TRUE      branch if A >= DATA

Table 9.2

Because the action taken after a comparison by the BCC and BCS instructions is not immediately obvious from
their mnemonic names, the recommended assembler syntax standard allows the alternate mnemonics BLT, for
branch on less than, and BGE, for branch if greater or equal, respectively, which generate the identical object code.
Other comparisons can be synthesized using combinations of branch-on-condition instructions.
Fragment 9.6 shows how the operation branch on greater than can be synthesized.
0000 F002               BEQ   SKIP
0002 B0FC               BGE   TRUE               branch to TRUE if A > DATA
0004           SKIP     ANOP

Fragment 9.6
F0FE                BEQ   TRUE               branch if
90FC                BCC   TRUE               A <= DATA

Fragment 9.7
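The way the z and c flags left by CMP map onto the unsigned relations, including the two synthesized ones, can be summarized in a small model. This Python sketch is an illustration only; `cmp16` and `relations` are hypothetical names, and the values are treated as unsigned sixteen-bit numbers.

```python
def cmp16(a, data):
    """Model a sixteen-bit CMP: return the z, n, and c flags; A is left untouched."""
    diff = (a - data) & 0xFFFF
    z = int(diff == 0)
    n = diff >> 15
    c = int(a >= data)  # carry set means the subtraction needed no borrow
    return z, n, c

def relations(a, data):
    """Report which branch conditions would be taken after CMP."""
    z, n, c = cmp16(a, data)
    return {
        "BEQ: A = DATA": z == 1,
        "BNE: A <> DATA": z == 0,
        "BCC/BLT: A < DATA": c == 0,
        "BCS/BGE: A >= DATA": c == 1,
        "BEQ skip, then BGE: A > DATA": z == 0 and c == 1,
        "BEQ or BCC: A <= DATA": z == 1 or c == 0,
    }

for name, taken in relations(0x1234, 0x1000).items():
    print(f"{name:30} {taken}")
```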
Listing 9.4 features the use of the compare instruction to count the number of elements in a list which
are less than, equal to, and greater than a given value. While of little utility by itself, this type of comparison
operation is just a few steps away from a simple sort routine. The value the list will be compared against is
assumed to be stored in memory locations $88.89, which are given the symbolic name VALUE in the example.
The list, called TABLE, uses the DC I directive, which stores each number as a sixteen-bit integer.
0000                    KEEP  KL.9.4
0000                    65816 ON

0000           L94      START

0000           LESS     GEQU  $82                counter
0000           SAME     GEQU  $84                counter
0000           MORE     GEQU  $86                counter

0000           VALUE    GEQU  $88

0000 18                 CLC
0001 FB                 XCE
0002 C230               REP   #$30
0004                    LONGA ON
0004                    LONGI ON

0004 6482               STZ   LESS
0006 6484               STZ   SAME
0008 6486               STZ   MORE

000A A588               LDA   VALUE
000C A01A00             LDY   #LAST-TABLE

000F D92700    TOP      CMP   TABLE,Y
0012 F006               BEQ   ISEQ
0014 9008               BLT   ISMORE
0016 E682               INC   LESS
0018 8006               BRA   LOOP
001A E684      ISEQ     INC   SAME
001C 8002               BRA   LOOP
001E E686      ISMORE   INC   MORE

0020 88        LOOP     DEY
0021 88                 DEY
0022 10EB               BPL   TOP

0024 38                 SEC
0025 FB                 XCE
0026 60                 RTS

0027 0C000900  TABLE    DC    I'12,9,302,956,123,1234,98'
0035 04116300           DC    I'4356,99,11,40000,23145,562'
0041 0F27      LAST     DC    I'9999'

0043                    END

Listing 9.4
After setting the mode to sixteen-bit word/index size, the locations that will hold the number of
occurrences of each of the three possible relationships are zeroed. The length of the list is loaded into the Y
register. The accumulator is loaded with the comparison value.
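The counting performed by Listing 9.4 can be modeled as follows. This Python sketch is an illustration only: `count_relations` is a hypothetical name, the table values are those given in the listing's DC directives, and the comparison value 100 is an arbitrary choice.

```python
TABLE = [12, 9, 302, 956, 123, 1234, 98, 4356, 99, 11, 40000, 23145, 562, 9999]

def count_relations(value, table):
    """Count table entries that are less than, equal to, and greater than value."""
    less = same = more = 0
    for element in reversed(table):  # the listing indexes down from the last entry
        if value == element:         # BEQ ISEQ
            same += 1
        elif value < element:        # BLT ISMORE: A is below the entry, so the entry is more
            more += 1
        else:                        # fall through: the entry is less than A
            less += 1
    return less, same, more

print(count_relations(100, TABLE))
```

Because CMP is an unsigned comparison, the entry 40000 (which would be negative if read as a signed sixteen-bit value) is counted as greater than 100.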
The loop itself is entered, with a comparison to the first item in the list; in this and each succeeding
case, control is transferred to counter-incrementing code depending on the relationship that exists. Note that
equality and less-than are tested first, and greater-than is assumed if control falls through. This is necessary
since there is no branch on greater-than (only branch on greater-than-or-equal). Following the incrementing of
the selected relation-counter, control passes either via an unconditional branch, or by falling through, to the
loop-control code, which decrements Y twice (since double-byte integers are being compared). Control
           LOOP     ANOP
                      .
                      .
                      .
E8                  INX
E0A000              CPX   #$A0
D0FA                BNE   LOOP
                    ANOP

Fragment 9.8
Signed Arithmetic
The examples so far have dealt with unsigned arithmetic; that is, addition and subtraction of binary
numbers of the same sign. What about signed numbers?
As you saw in Chapter 1, signed numbers can be represented using twos-complement notation. The
twos complement of a number is formed by inverting it (one bits become zeroes, zeroes become ones) and then
adding one. For example, a negative one is represented by forming the twos complement of one:
      0000 0000 0000 0001
      1111 1111 1111 1110
    + 0000 0000 0000 0001
      -------------------
      1111 1111 1111 1111
Minus one is therefore equivalent to a hexadecimal $FFFF. But as far as the processor is concerned, the
unsigned value $FFFF (65,535 decimal) and the signed value minus-one are equivalent. They both amount to
the same stream of bits stored in a register. It is the interpretation of them given by the programmer which is
significant, an interpretation that must be consistently applied across each of the steps that perform a multi-step
function.
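The invert-and-add-one rule can be tried out directly. This Python sketch is an illustration only; `twos_complement16` is a hypothetical name, and the mask keeps the result within sixteen bits as a register would.

```python
def twos_complement16(value):
    """Invert all sixteen bits, add one, and discard any carry out of bit fifteen."""
    return ((value ^ 0xFFFF) + 1) & 0xFFFF

for value in (0x0001, 0x0002, 0x7FFF):
    print(f"${value:04X} -> ${twos_complement16(value):04X}")
```

Applying the function twice returns the original value, which is why the same steps serve both to negate a positive number and to recover the magnitude of a negative one.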
Consider all of the possible signed and unsigned numbers that can be represented using a sixteen-bit
register. The twos complement of $0002 is $FFFE; as the positive numbers increase, the twos-complement
(negative) numbers decrease (in the unsigned sense). Increasing the positive value to $7FFF (%0111 1111 1111
1111), the twos complement is $8001 (%1000 0000 0000 0001); except for $8000, all of the possible values
have been used to represent the respective positive and negative numbers between $0001 and $7FFF.
Since their point of intersection, $8000, determines the maximum range of a signed number, the high-order bit (bit fifteen) will always be one if the number is negative, and zero if the number is positive. Thus the
range of possible binary values (%0000 0000 0000 0000 through %1111 1111 1111 1111, or $0000 . . $FFFF),
using twos-complement form, is divided evenly between representations of positive numbers, and
representations of the corresponding range of negative numbers. Since $8000 is also negative, there seems to
be one more possible negative number than positive; for the purpose here, however, zero is considered positive.
The high-order bit is therefore referred to as the sign bit. On the 6502, with its eight-bit word size (or
the 65816 in an eight-bit register mode), bit seven is the sign bit. With sixteen-bit registers, bit fifteen is the
sign bit. The n or negative flag in the status register reflects whether or not the high-order bit of a given register
is set or clear after execution of operations which affect that register, allowing easy determination of the sign of
a signed number by using either the BPL (branch on plus) or BMI (branch if minus) instructions introduced in
the last chapter.
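How one bit pattern yields two different values depending on the interpretation can be shown with a short model. This Python sketch is an illustration only; `n_flag16` and `to_signed16` are hypothetical names.

```python
def n_flag16(word):
    """The n flag mirrors bit fifteen of a sixteen-bit value."""
    return (word >> 15) & 1

def to_signed16(word):
    """Interpret a sixteen-bit pattern as a twos-complement signed number."""
    return word - 0x10000 if n_flag16(word) else word

for word in (0x0000, 0x7FFF, 0x8000, 0xFFFF):
    print(f"${word:04X}  unsigned {word:5d}  signed {to_signed16(word):6d}  n={n_flag16(word)}")
```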
C230                REP   #$30               16-bit registers
                    LONGA ON
                    LONGI ON

A9F6FF              LDA   #-10
18                  CLC
691400              ADC   #20

Fragment 9.9
Two things should become clear: that the magnitude of the result (10 decimal) is such that it will easily
fit within the number of bits available for its representation, and that there is a carry out of bit fourteen:
    1 1111 1111 111  1
      1111 1111 1111 0110
    + 0000 0000 0001 0100
      -------------------
      0000 0000 0000 1010
In this case, the overflow flag is not set, because the carry out of the penultimate bit indicates wraparound
rather than overflow (or underflow). Whenever the two operands are of different signs, a carry out of the next-to-highest bit indicates wraparound; the addition of a positive and a negative number (or vice versa) can never result in a
number too large (try it), but it may result in wraparound.
Conversely, overflow exists in the addition of two negative numbers if no carry results from the
addition of the next-to-highest (penultimate) bits. If two negative numbers are added without overflow, they
will always wrap around, resulting in a carry out of the next-to-highest bit. When wraparound has occurred, the
sign bit is set due to the carry out of the penultimate bit. In the case of the two negative numbers being added
(which always produces a negative result), this setting of the sign bit results in the correct sign. In the case of
the addition of two positive numbers, wraparound never occurs, so a carry out of the penultimate bit always
means that the overflow flag will be set.
These rules likewise apply for subtraction; however, you must consider that subtraction is really an
addition with the sign of the addend inverted, and apply them in this sense.
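The wraparound and overflow cases described above can be explored with a small model that computes both carries and combines them with an exclusive-or to form the overflow flag. This Python sketch is an illustration only; `adc16_flags` is a hypothetical name.

```python
def adc16_flags(a, operand, carry_in=0):
    """Model a sixteen-bit ADC: return (result, c, v)."""
    total = a + operand + carry_in
    result = total & 0xFFFF
    c = total >> 16  # carry out of bit fifteen
    # Carry out of bit fourteen (the penultimate bit) into the sign bit
    carry_into_sign = ((a & 0x7FFF) + (operand & 0x7FFF) + carry_in) >> 15
    v = c ^ carry_into_sign
    return result, c, v

print(adc16_flags(0xFFF6, 0x0014))  # -10 + 20: wraparound, no overflow
print(adc16_flags(0x7FFF, 0x0001))  # two positives: overflow
print(adc16_flags(0x8000, 0x8000))  # two negatives, no carry into the sign bit: overflow
```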
In order for the processor to determine the correct overflow flag value, it exclusive-ors the carry out of
the penultimate bit with the carry out of the high-order bit (the value that winds up in the carry flag), and sets or
Signed Comparisons
The principle of signed comparisons is similar to that of unsigned comparisons: the relation of one
operand to another is determined by subtracting one from the other. However, the 65x CMP instruction, unlike
SBC, does not affect the v flag, so does not reflect signed overflow/underflow. Therefore, signed comparisons
must be performed using the SBC instruction. This means that the carry flag must be set prior to the
comparison (subtraction), and that the original value in the accumulator will be replaced by the difference.
Although the value of the difference is not relevant to the comparison operation, the sign is. If the sign of the
result (now in the accumulator) is positive (as determined according to rules outlined above for proper
determination of the sign of the result of a signed operation), then the value in memory is less than the original
value in the accumulator; if the sign is negative, it is greater. If, though, the result of the subtraction is zero,
then the values were equal, so this should be checked for first.
The code for signed comparisons is similar to that for signed subtraction. Since a correct result need
not be completely formed, however, overflow can be tolerated since the goal of the subtraction is not to
generate a result that can be represented in a given precision, but only to determine the relationship of one value
to another. Overflow must still be taken into account in correctly determining the sign. The value of the sign
bit (the high-order bit) will be the correct sign of the result unless overflow has occurred. In that case, it is the
inverted sign.
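The rule that the sign of the difference is correct unless overflow occurred, in which case it is inverted, amounts to testing n exclusive-or v. This Python sketch is an illustration only; `signed_compare16` is a hypothetical name, and SBC with the carry set is modeled as adding the inverted operand plus one.

```python
def signed_compare16(val1, val2):
    """Return -1, 0, or 1 as VAL1 is less than, equal to, or greater than VAL2 (signed)."""
    inverted = val2 ^ 0xFFFF
    total = val1 + inverted + 1  # SEC, then SBC
    result = total & 0xFFFF
    if result == 0:              # BEQ: check equality first
        return 0
    c = total >> 16
    carry_into_sign = ((val1 & 0x7FFF) + (inverted & 0x7FFF) + 1) >> 15
    v = c ^ carry_into_sign      # overflow flag
    n = result >> 15             # sign of the difference
    return -1 if n ^ v else 1    # the sign is inverted when overflow has occurred

print(signed_compare16(0x7FFF, 0xFFFF))  # 32767 compared with -1
print(signed_compare16(0x8000, 0x0001))  # -32768 compared with 1
```

Both calls overflow during the subtraction, yet the relation is still reported correctly because the sign is read together with v.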
Listing 9.5 does a signed comparison of the number stored in VAL1 with the number stored in VAL2,
and sets RELATION to minus one, zero, or one, depending on whether VAL1 < VAL2, VAL1 = VAL2 or
VAL1 > VAL2, respectively:
0000                    KEEP  KL.9.5
0000                    65816 ON

0000           COMPARE  START

0000 18                 CLC
0001 FB                 XCE
0002 C230               REP   #$30
0004                    LONGA ON
0004                    LONGI ON

0004 9C2500             STZ   RELATION
0007 AD2100             LDA   VAL1
000A 38                 SEC
000B ED2300             SBC   VAL2
000E F00E               BEQ   SAME
0010 7007               BVS   INVERT
0012 3007               BMI   LESS
0014 EE2500    GREATER  INC   RELATION
0017 8005               BRA   SAME
0019 30F9      INVERT   BMI   GREATER
001B CE2500    LESS     DEC   RELATION
001E 38        SAME     SEC
001F FB                 XCE
0020 60                 RTS

0021 0000      VAL1     DS    2
0023 0000      VAL2     DS    2
0025 0000      RELATION DS    2

0027                    END

Listing 9.5
Decimal Mode
All of the examples in this chapter have dealt with binary numbers. In certain applications, however,
such as numeric I/O programming, where conversion between ASCII and binary representation of decimal
strings is inconvenient, and business applications, in which conversion of binary fractions to decimal fractions
results in approximation errors, it is convenient to represent numbers in decimal form and, if possible, perform
arithmetic operations on them directly in this form.
Like most processors, the 65x series provides a way to handle decimal representations of numbers.
Unlike most processors, it does this by providing a special decimal mode that causes the processor to use decimal
arithmetic for ADC, SBC, and CMP operations, with automatic, on-the-fly decimal adjustment. Most other
microprocessors, on the other hand, do all arithmetic the same, requiring a second decimal adjust operation to
convert back to decimal form the binary result of arithmetic performed on decimal numbers. As you remember
from Chapter 1, binary-coded-decimal (BCD) digits are represented in four bits as binary values from zero to
nine. Although values from $A to $F (ten to fifteen) may also be represented in four bits, these bit patterns are
illegal in decimal mode. So when $03 is added to $09, the result is $12, not $0C as in binary mode.
Each four-bit field in a BCD number is a binary representation of a single decimal digit, the rightmost being the
ones place, the second the tens, and so on. Thus, the eight-bit accumulator can represent numbers in the range
0 through 99 decimal, and the sixteen-bit accumulator can represent numbers in the range 0 through 9999.
Larger decimal numbers can be represented in multiple-precision, using memory variables to store the partial
results and the carry flag to link the component fields of the number together, just as multiple-precision binary
numbers are.
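The digit-by-digit adjustment that decimal mode performs can be modeled for valid BCD operands. This Python sketch is an illustration only: `adc8_decimal` is a hypothetical name, and the behavior for operands containing the illegal digits $A through $F is deliberately not modeled.

```python
def adc8_decimal(a, operand, carry_in):
    """Model an eight-bit decimal-mode ADC on valid BCD operands: (result, carry_out)."""
    low = (a & 0x0F) + (operand & 0x0F) + carry_in
    half_carry = 0
    if low > 9:            # ones digit overflowed: adjust and carry into the tens
        low -= 10
        half_carry = 1
    high = (a >> 4) + (operand >> 4) + half_carry
    carry_out = 0
    if high > 9:           # tens digit overflowed: adjust and set the carry flag
        high -= 10
        carry_out = 1
    return (high << 4) | low, carry_out

result, carry = adc8_decimal(0x09, 0x03, 0)
print(f"${result:02X} carry={carry}")
```

The carry returned by one call can be passed as `carry_in` to the next, linking the fields of a multiple-precision decimal number in the same way as in binary mode.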
You'll find two types of instructions discussed in this chapter: the basic logic functions, and the
shifts and rotates. They're listed in Table 10.1.
                     Available on:
Mnemonic     6502    65C02    65802/816    Description

Logic Instructions:
AND           x        x          x        logical and
EOR           x        x          x        logical exclusive-or
ORA           x        x          x        logical or (inclusive or)

BIT           x        x          x        test bits
TRB                    x          x        test and reset bits
TSB                    x          x        test and set bits

Table 10.1
Logic Functions
The fundamental logical operations implemented on the 65x processor are and, inclusive or, and
exclusive or. These are implemented as the AND, ORA, and EOR machine instructions. These three logical
operators have two operands, one in the accumulator and the second in memory. All of the addressing modes
Logical AND
Consider, for example, the eight-bit AND operation illustrated in Figure 10.1.
    bit number     7 6 5 4 3 2 1 0

                   0 1 1 1 0 1 1 0     $76
           and     1 1 0 0 1 0 1 1     $CB
                   ---------------
                   0 1 0 0 0 0 1 0     $42 result

Figure 10.1
The result, $42 or %0100 0010, is formed by ANDing bit zero of the first operand with bit zero of the second to
form bit zero of the result; bit one with bit one; and so on. In each bit, a one results only if there is a one in the
corresponding bit-fields of both the first operand and the second operand; otherwise zero results.
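The same bit-by-bit operation is available as the `&` operator in most high-level languages, which makes the example easy to check. This Python sketch is an illustration only.

```python
a, m = 0x76, 0xCB
result = a & m  # each result bit is one only where both operands have a one
print(f"%{a:08b} and %{m:08b} = %{result:08b} (${result:02X})")
```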
An example of the use of the AND instruction would be to mask bits out of a double-byte word to
isolate a character (single-byte) value. A mask is a string of bits, typically a constant, used as an operand to a
logic instruction to single out of the second operand a given bit or bit-field by forcing the other bits to zeroes or
ones. Masking characters out of double bytes is common in 65802 and 65816 applications where a default
mode of sixteen-bit accumulator and sixteen-bit index registers has been selected by the programmer, but
character data needs to be accessed as well. For some types of character manipulation, it is quicker to simply
mask out the extraneous data in the high-order byte than to switch into eight-bit mode. The code in Listing 10.1
is fragmentary in the sense that it is assumed that the core routine is inserted in the middle of other code, with
the sixteen-bit accumulator size already selected.
It may seem to be splitting hairs, but this routine, which compares the value in a string of characters
pointed to by the value in the memory variable CHARDEX to the letter 'e', is two machine cycles faster than
the alternative approach, which would be to switch the processor into the eight-bit accumulator mode, compare
the character, and then switch back into the sixteen-bit mode.
0000                    KEEP  KL.10.1
0000                    65816 ON

0000           MAIN     START
0000           PTR      GEQU  $80

0000 18                 CLC
0001 FB                 XCE

0002 C230               REP   #$30
0004                    LONGA ON
0004                    LONGI ON

0004 AC4C00    LOOP     LDY   CHARDEX
0007 B91A00             LDA   STRING,Y
000A 29FF00             AND   #%0000000011111111
000D C96500             CMP   #'e'
0010 D004               BNE   NOMATCH

0012 38                 SEC
0013 FB                 XCE

0014 38                 SEC
0015 60                 RTS

0016 38        NOMATCH  SEC
0017 FB                 XCE

0018 18                 CLC
0019 60                 RTS

001A 54686573  STRING   DC    C'These characters'
002A 61726520           DC    C'are all packed next to'
0040 65616368           DC    C'each other'
004A 0000               DC    H'0000'
004C 0000      CHARDEX  DS    2                  index to a particular char in STRING
004E                    END

Listing 10.1
Each time the program is executed with a different value for CHARDEX, a different adjacent character
will also be loaded into the high byte of the accumulator. Suppose the value in CHARDEX were four; when
the LDA STRING,Y instruction is executed, the value in the low byte of the accumulator is $65, the ASCII
value for a lower-case 'e'. The value in the high byte is $20, the ASCII value for the space character (the space
between 'These' and 'characters'). Even though the low bytes match, a comparison to 'e' would fail, because
the high byte of the CMP instruction's immediate operand is zero, not $20 (the assembler having automatically
generated a zero as the high byte for the single-character operand 'e').
However, by ANDing the value in the accumulator with %0000000011111111 ($00FF), no matter what
the original value in the accumulator, the high byte of the accumulator is zeroed (since none of the
corresponding bits in the immediate operand are set). Therefore the comparison in this case will succeed, as it
will for CHARDEX values of 2, 13, 18, 28, 32, 38, and 46, even though their adjacent characters, automatically
loaded into the high byte of the accumulator, are different.
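The effect of the mask can be checked against the string data of Listing 10.1. This Python sketch is an illustration only: `load16` and `matches_e` are hypothetical names, and it assumes the three character constants are stored back to back with no separators, followed by two zero bytes, as the addresses in the listing indicate.

```python
STRING = b"These characters" + b"are all packed next to" + b"each other" + b"\x00\x00"

def load16(index):
    """A sixteen-bit load: the indexed byte lands in the low half, its neighbor in the high half."""
    return STRING[index] | (STRING[index + 1] << 8)

def matches_e(index):
    """Mask off the high byte, then compare with the letter 'e'."""
    return (load16(index) & 0x00FF) == ord("e")

print(f"${load16(4):04X}")  # unmasked value loaded when CHARDEX is four
print([i for i in range(len(STRING) - 2) if matches_e(i)])
```

Without the mask, the comparison at index four would be between the unmasked word and $0065, and would fail.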
The AND instruction is also useful in performing certain multiplication and division functions. For
example, it may be used to calculate the modulus of a power of two. (The modulus operation returns the
remainder of an integer division; for example, 13 mod 5 equals 3, which is the remainder of 13 divided by 5.)
This is done simply by ANDing with ones all of the bits to the right of the power of two you wish the modulus
Logical OR
The ORA instruction is used to selectively turn bits on by ORing them with ones, and to determine if
either (or both) of two logical values is true. A character-manipulation example (Listing 10.2) is used this
time (writing a string of characters, the high bit of each of which must be set, to the Apple II screen memory) to
demonstrate a typical use of the ORA instruction.
Since the video screen is memory-mapped, outputting a string is basically a string move. Since normal
Apple video characters must be stored in memory with their high-order bit turned on, however, the ORA
#%10000000 instruction is required to do this if the character string, as in the example, was originally stored in
normal ASCII, with the high-order bit turned off. Note that it clearly does no harm to OR a character with $80
(%10000000) even if its high bit is already set, so the output routine does not check characters to see if they
need to have their high bit set, but rather routinely ORs them all with $80 before writing them to the screen. When
each character is first loaded into the eight-bit accumulator from STRING, its high bit is off (zero); the ORA
instruction converts each of the values ($48, $65, $6C, $6C, $6F) into the corresponding high-bit-set ASCII
values ($C8, $E5, $EC, $EC, and $EF) before storing them to screen memory, where they will be displayed as
normal, non-inverse characters on the video screen. In this case, the same effect (the setting of the high-order
bit) could have been achieved if $80 had been added to each of the characters instead; however, the OR
operation differs from addition in that even if the high bit of the character already had a value of one, the result
would still be one, rather than zero plus a carry as would be the case if addition were used. (Further, a CLC
operation would also have been required prior to the addition, making ORA a more efficient choice as well.)
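The difference between ORing with $80 and adding $80 shows up as soon as a character already has its high bit set. This Python sketch is an illustration only; the variable names are hypothetical.

```python
text = b"Hello"
screen = bytes(ch | 0x80 for ch in text)  # ORA #%10000000 on each character
print(" ".join(f"{b:02X}" for b in screen))

already_set = 0xC8
print(f"${already_set | 0x80:02X}")            # OR leaves the high bit set
print(f"${(already_set + 0x80) & 0xFF:02X}")   # addition clears it and produces a carry
```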
0000                    KEEP  KL.10.2
0000                    65816 ON

0000           L102     START
0000                    MSB   OFF
0000           SCREEN   GEQU  $400

0000 18                 CLC
0001 FB                 XCE

0002 C210               REP   #$10
0004                    LONGI ON

0004 E220               SEP   #$20
0006                    LONGA OFF                8-bit accum

0006 A00000             LDY   #0

0009 B91900    TOP      LDA   STRING,Y
000C F008               BEQ   DONE
000E 0980               ORA   #%10000000
0010 990004             STA   SCREEN,Y

0013 C8                 INY
0014 80F3               BRA   TOP

0016 38        DONE     SEC
0017 FB                 XCE
0018 60                 RTS

0019 48656C6C  STRING   DC    C'Hello'
001E 00                 DC    H'00'

001F                    END

Listing 10.2
Logical Exclusive-Or
The third logical operation, Exclusive-OR, is used to invert bits. Just as inclusive-OR (ORA) will
yield a true result if either or both of the operands are true, exclusive-or yields true only if one operand is true
and the other is false; if both are true or both are false, the result is false. This means that by setting a bit in the
memory operand of an EOR instruction, you can invert the corresponding bit of the accumulator operand
(where the result is stored). In the preceding example, where the character constants were stored with their high
bits off, an EOR #$80 instruction would have had the same effect as ORA #$80; but like addition, if some of
the characters to be converted already had their high-order bits set, the EOR operation would clear them.
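The toggling behavior of exclusive-or, as opposed to the setting behavior of inclusive-or, can be seen by applying each twice. This Python sketch is an illustration only; `toggle_high_bit` and `set_high_bit` are hypothetical names.

```python
def toggle_high_bit(ch):
    """EOR #$80: invert bit seven."""
    return ch ^ 0x80

def set_high_bit(ch):
    """ORA #$80: force bit seven on."""
    return ch | 0x80

for ch in (0x48, 0xC8):
    print(f"${ch:02X}: EOR gives ${toggle_high_bit(ch):02X}, ORA gives ${set_high_bit(ch):02X}")
```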
Two good examples of the application of the EOR operation apply to signed arithmetic. Consider the
multiplication of two signed numbers. As you know, the sign of the product is determined by the signs of the
multiplier and multiplicand according to the following rule: if both operands have the same sign, either positive
or negative, the result is always positive; if the two operands have different signs, the result is always negative.
You perform signed multiplication by determining the sign of the result, and then multiplying the absolute
values of both operands using the same technique as for unsigned arithmetic. Finally, you consider the sign of
the result: if it is positive, your unsigned result is the final result; if it is negative, you form the final result by
taking the twos-complement of the unsigned result. Because the actual multiplication code is not included, this
example is given as two fragments, 10.1 and 10.2.
Fragment 10.1 begins by clearing the memory location SIGN, which will be used to store the sign of
the result. Then the two values to be multiplied are exclusive-ORed, and the sign of the result is tested with the
0000       NUM1     DS    2
0000       NUM2     DS    2

C230                REP   #$30               16-bit modes
                    LONGA ON
                    LONGI ON

9C0080              STZ   SIGN
AD0000              LDA   NUM1
4D0200              EOR   NUM2
1003                BPL   OK
CE0080              DEC   SIGN
AD0200     OK       LDA   NUM2
1007                BPL   OK1
49FFFF              EOR   #$FFFF
1A                  INC   A
8D0200              STA   NUM2
AD0000     OK1      LDA   NUM1
1004                BPL   OK2
49FFFF              EOR   #$FFFF
1A                  INC   A
           OK2      ANOP

Fragment 10.1
At this point, the unsigned multiplication of the accumulator and NUM2 can be performed. The code
for the multiplication itself is omitted from these fragments; however, an example of unsigned multiplication is
found in Chapter 14. The important fact for the moment is that the multiplication code is assumed to return the
unsigned product in the accumulator.
0000 AE0080             LDX   SIGN
0003 1004               BPL   DONE
0005 49FFFF             EOR   #$FFFF             if should be neg,
0008 1A                 INC   A                  twos complement the result
0009 60        DONE     RTS
000A

Fragment 10.2
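The overall flow of Fragments 10.1 and 10.2 can be modeled end to end. This Python sketch is an illustration only: the names are hypothetical, and Python's `*` operator stands in for the unsigned multiplication routine that the fragments omit.

```python
def negate16(value):
    """EOR #$FFFF followed by INC A: the sixteen-bit twos complement."""
    return ((value ^ 0xFFFF) + 1) & 0xFFFF

def signed_multiply16(num1, num2):
    """Multiply two sixteen-bit twos-complement values, keeping the low sixteen bits."""
    sign = 0
    if (num1 ^ num2) & 0x8000:  # EOR of the operands is negative: the signs differ
        sign = 0xFFFF           # DEC SIGN turns zero into a negative marker
    if num2 & 0x8000:
        num2 = negate16(num2)   # absolute value of NUM2
    if num1 & 0x8000:
        num1 = negate16(num1)   # absolute value of NUM1
    product = (num1 * num2) & 0xFFFF  # stand-in for the omitted unsigned multiply
    if sign & 0x8000:
        product = negate16(product)   # negative result: twos complement the product
    return product

print(f"${signed_multiply16(0xFFFD, 0x0004):04X}")  # -3 times 4
```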
Bit Manipulation
You have now been introduced to the three principal logical operators, AND, ORA, and EOR. In
addition there are three more specialized bit-manipulating instructions that use the same logical operations.
The first of these is the BIT instruction. The BIT instruction really performs two distinct operations.
First, it directly transfers the highest and next to highest bits of the memory operand (that is, seven and six if m
= 1, or fifteen and fourteen if m = 0) to the n and v flags. It does this without modifying the value in the
accumulator, making it useful for testing the sign of a value in memory without loading it into one of the
registers. An exception to this is the case where the immediate addressing mode is used with the BIT
instruction: since it serves no purpose to test the bits of a constant value, the n and v flags are left unchanged in
this one case.
BIT's second operation is to logically AND the value of the memory operand with the value in the
accumulator, conditioning the z flag in the status register to reflect whether or not the result of the ANDing was
zero, but without storing the result in the accumulator (as is the case with the AND instruction) or saving
the result in any other way. This provides the ability to test if a given bit (or one or more bits in a bit-field) is
set by first loading the accumulator with a mask of the desired bit patterns, and then performing the BIT
operation. The result will be non-zero only if at least one of the bits set in the accumulator is likewise set in the
memory operand. Actually, you can write your programs to use either operand as the mask to test the other,
except when immediate addressing is used, in which case the immediate operand is the mask, and the value in
the accumulator is tested.
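Both operations of BIT can be captured in one small model. This Python sketch is an illustration only: `bit16` is a hypothetical name, it models the sixteen-bit accumulator case (m = 0), and `None` is used to mean that a flag is left unchanged.

```python
def bit16(a, memory, immediate=False):
    """Return the flags a sixteen-bit BIT would produce; None means left unchanged."""
    z = int((a & memory) == 0)  # z reflects the AND; neither operand is modified
    if immediate:
        return {"z": z, "n": None, "v": None}
    return {"z": z, "n": (memory >> 15) & 1, "v": (memory >> 14) & 1}

print(bit16(0x0001, 0xC000))                  # n and v copied from bits 15 and 14
print(bit16(0x0080, 0x0080, immediate=True))  # only z is affected
```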
A problem that remained from the previous chapter was sign extension, which is necessary when
mixed-precision arithmetic is performed; that is, when the operands are of different sizes. It might also be
used when converting to a higher precision due to overflow. The most typical example of this is the addition (or
subtraction) of a signed eight-bit and a signed sixteen-bit value. In order for the lesser-precision number to be
converted to a signed number of the same precision as the larger number, it must be sign-extended first, by
setting or clearing all of the high-order bits of the expanded-precision number to the same value as the sign bit
of the original, lesser-precision number.
In other words, $7F would become $007F when sign-extended to sixteen bits, while $8F would become
$FF8F. A sign-extended number evaluates to the same number as its lesser precision form. For example, $FF
and $FFFF both evaluate to -1.
You can use the BIT instruction to determine if the high-order bit of the low-order byte of the
accumulator is set, even while in the sixteen-bit accumulator mode. This is used to sign extend an eight-bit
value in the accumulator to a sixteen-bit one in Listing 10.3.
0000                   KEEP   KL.10.3
0000                   65816  ON
0000
0000          L103     START
0000 18                CLC
0001 FB                XCE
0002 C230              REP    #$30
0004                   LONGA  ON
0004                   LONGI  ON
0004
0004 A500              LDA    $00
0006 29FF00            AND    #$FF
0009 898000            BIT    #$80
000C F003              BEQ    OK
000E 0900FF            ORA    #$FF00
0011
0011 8500     OK       STA    $00
0013
0013 38                SEC
0014 FB                XCE
0015 60                RTS
0016                   END
Listing 10.3
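The logic of Listing 10.3 can be checked outside the processor. The following Python model is only a sketch (it is not 65816 code, and the function names are invented for illustration); it mirrors the AND, BIT, BEQ, ORA sequence step by step:

```python
def sign_extend_8_to_16(value):
    """Mirror Listing 10.3: AND #$FF, BIT #$80, BEQ OK, ORA #$FF00."""
    result = value & 0x00FF        # AND #$FF: keep only the low-order byte
    if result & 0x0080:            # BIT #$80: result is non-zero when bit 7 is set
        result |= 0xFF00           # ORA #$FF00: fill the high byte with ones
    return result


def as_signed_16(value):
    """Interpret a sixteen-bit pattern as a two's-complement number."""
    return value - 0x10000 if value & 0x8000 else value


print(hex(sign_extend_8_to_16(0x7F)))   # 0x7f
print(hex(sign_extend_8_to_16(0x8F)))   # 0xff8f
```

Passing $FF through both functions gives -1, the same value the eight-bit form represents.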
The pair of test-and-set instructions, TSB and TRB, are similar to the BIT instruction in that they set
the zero flag to represent the result of ANDing the two operands. They are dissimilar in that they do not affect
the n and v flags. Importantly, they also set (in the case of TSB) or reset (in the case of TRB) the bits of the
memory operand according to the bits that are set in the accumulator (the accumulator value is a mask). You
should recognize that the mechanics of this involve the logical functions described above: the TSB instruction
ORs the accumulator with the memory operand, and stores the result to memory; the TRB inverts the value in
the accumulator, and then ANDs it with the memory operand. Unlike the BIT instruction, both of the test-and-set
operations are read-modify-write instructions; that is, in addition to performing an operation on the memory
value specified in the operand field of the instruction, they also store a result to the same location.
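As a rough illustration of the mechanics just described, the two operations can be modeled in Python. This is a sketch under stated assumptions: the function names and the `bits` parameter are invented here, and each function returns the new memory value together with the z flag rather than modifying processor state.

```python
def tsb(acc, mem, bits=16):
    """Test and set bits: return (new memory value, z flag)."""
    mask = (1 << bits) - 1
    z = (acc & mem & mask) == 0        # z reflects A AND memory, as with BIT
    return (mem | acc) & mask, z       # bits set in the accumulator are set in memory


def trb(acc, mem, bits=16):
    """Test and reset bits: return (new memory value, z flag)."""
    mask = (1 << bits) - 1
    z = (acc & mem & mask) == 0
    return mem & ~acc & mask, z        # bits set in the accumulator are cleared in memory
```

In both cases the z flag is computed from the original memory value, before the write-back.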
The test-and-set instructions are highly specialized instructions intended primarily for control of
memory-mapped I/O devices. This is evidenced by the availability of only two addressing modes, direct and
absolute, for these instructions; this is sufficient when dealing with memory-mapped I/O, since I/O devices are
always found at fixed memory locations.
Shift and rotate instructions differ in the value chosen for the origin bit of the shift or rotate. The shift
instructions write a zero into the origin bit of the shift (the low-order bit for a shift left or the high-order bit for a
shift right). The rotates, on the other hand, copy the original value of the carry flag into the origin bit of the
shift. Figure 10.2 and Figure 10.3 illustrate the operation of the shift and rotate instructions.
The carry flag, as Fragment 10.3 illustrates, is used by the combination of a shift followed by one or
more rotate instructions to allow multiple-precision shifts, much as it is used by ADC and SBC instructions to
enable multiple-precision arithmetic operations.
[Figure 10.2. Shift and rotate left: ASL and ROL, before and after, showing the carry flag]
0000 A9AAAA            LDA    #%1010101010101010
0003 8D0080            STA    LOC1
0006 A9AAAA            LDA    #%1010101010101010
0009 8D0080            STA    LOC2
000C 0E0080            ASL    LOC1
000F 2E0080            ROL    LOC2
Fragment 10.3
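The way the carry links the two instructions can be sketched in Python. This is a model for illustration only (the helper names are invented, not taken from the text); each helper returns the shifted sixteen-bit value and the bit that lands in the carry flag:

```python
def asl16(value):
    """Shift left: a zero enters bit 0, and bit 15 goes to the carry."""
    return (value << 1) & 0xFFFF, (value >> 15) & 1


def rol16(value, carry_in):
    """Rotate left: the old carry enters bit 0, and bit 15 goes to the carry."""
    return ((value << 1) | carry_in) & 0xFFFF, (value >> 15) & 1


loc1 = loc2 = 0b1010101010101010
loc1, carry = asl16(loc1)          # ASL LOC1
loc2, carry = rol16(loc2, carry)   # ROL LOC2 picks up the bit shifted out of LOC1
```

Taken together, LOC2:LOC1 now holds the original 32-bit value shifted left by one position.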
[Figure 10.3. Shift and rotate right: LSR and ROR, before and after, showing the carry flag]
After the ASL and ROL of Fragment 10.3, the value at LOC2, 1010101010101010, becomes 0101010101010101 with carry = 1.
Left shifts multiply the original value by two. Right shifts divide the original value by two.
This principle is inherent in the concept of positional notation: when you multiply a value by ten by
adding a zero to the end of it, you are in effect shifting it left one position; likewise, when you divide
by ten by taking away the right-most digit, you are shifting it right one position. The same holds in any
number base, which in this case is base two.
Shifting is also useful, for the same reason, in a generalized multiply routine, where a combination of
shift and add operations are performed iteratively to accomplish the multiplication. Sometimes, however, it is
useful to have a dedicated multiplication routine, as when a quick multiplication by a constant value is needed.
If the constant value is a power of two (such as four, the constant multiplier in Fragment 10.4), the solution is
simple: shift left a number of times equal to the constant's power of two (four is two to the second power, so two left
shifts are equivalent to multiplying by four).
0000 A93423            LDA    #$2334
0003 0A                ASL    A
0004 0A                ASL    A
Fragment 10.4
The result in the accumulator is $2334 times four, or $8CD0. Other quickie multiply routines can be easily
devised for multiplication by constants that are not a power of two. Fragment 10.5 illustrates multiplication by
ten: the problem is reduced to a multiplication by eight plus a multiplication by two.
0000 A9D204            LDA    #1234
0003 0A                ASL    A          multiply by 2
0004 8D0080            STA    TEMP       save intermediate result
0007 0A                ASL    A          times 2 again = times 4
0008 0A                ASL    A          times 2 again = times 8
0009 18                CLC
000A 6D0080            ADC    TEMP       = times 10
Fragment 10.5
After the first shift left, which multiplies the original value by two, the intermediate result (1234 * 2 =
2468) is stored at location TEMP. Two more shifts are applied to the value in the accumulator, which equals
9872 at the end of the third shift. This is added to the intermediate result of 1234 times 2, which was earlier
stored at location TEMP, to give the result 12,340, or 1234 * 10.
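The same sequence can be traced in a few lines of Python. This is a sketch that models a sixteen-bit accumulator with masking; the function name is invented for illustration:

```python
def times_ten(value):
    """Follow Fragment 10.5 with a sixteen-bit accumulator."""
    acc = (value << 1) & 0xFFFF     # ASL A: times 2
    temp = acc                      # STA TEMP: save intermediate result
    acc = (acc << 1) & 0xFFFF       # ASL A: times 4
    acc = (acc << 1) & 0xFFFF       # ASL A: times 8
    return (acc + temp) & 0xFFFF    # CLC / ADC TEMP: times 8 plus times 2


print(times_ten(1234))   # 12340
```

As with the fragment itself, a product too large for sixteen bits is silently truncated, so the caller must know the operand is small enough.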
Division using the shift right instructions is similar. Since bits are lost during a shift right operation,
just as there is often a remainder when an integer division is performed, it would be useful if there were an easy
way to calculate the remainder (or modulus) of a division by a power of two. This is where the use of the AND
instruction alluded to earlier comes into play.
0000 A91FE2            LDA    #$E21F
0003 48                PHA               save accumulator
0004 4A                LSR    A          divide by 2
0005 4A                LSR    A          divide by 2 again = divide by 4
0006 8D0080            STA    QUO        save quotient
0009 68                PLA               recover original value
000A 290300            AND    #$3
000D 8D0080            STA    MOD        save modulus
Fragment 10.6
Consider Fragment 10.6. In this case, $E21F is to be divided by four. As with multiplication, so with
division: two shifts are applied, one for each power of two, this time to the right. By the end of the second shift,
the value in the accumulator is $3887, which is the correct answer. However, two bits have been shifted off to
the right. The original value in the accumulator is recovered from the stack and then ANDed with the divisor
minus one, or three. This masks out all but the bits that are shifted out during division by four, the bits which
correspond to the remainder or modulus. The original value can be recovered by multiplying the quotient by four and then adding the remainder.
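The relationship between the shifts and the mask can be sketched in Python (an illustrative model only; the function name and its `power` parameter are invented here):

```python
def divide_by_power_of_two(value, power):
    """Return (quotient, remainder) for division by 2**power."""
    quotient = value >> power                  # one LSR per power of two
    remainder = value & ((1 << power) - 1)     # AND with the divisor minus one
    return quotient, remainder


quo, mod = divide_by_power_of_two(0xE21F, 2)
print(hex(quo), mod)   # 0x3887 3
```

Multiplying the quotient by four and adding the remainder gives back $E21F.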
The second use for the shift instructions is general bit manipulation. Since the bit shifted out of the
word always ends up in the carry flag, this is an easy way to quickly test the value of the high- or low-order bit
of a word. Listing 10.4 gives a particularly useful example: a short routine to display the value of each of the
flags in the status register. This routine will, one by one, print the letter-name of each of the status register flags
if the flag is set (as tested by the BCS instruction), or else print a dash if it is clear.
0000                   KEEP   KL.10.4
0000                   65816  ON
0000
0000          PRINTP   START
0000          PREG     GEQU   $80
0000          PTR      GEQU   $82
0000
0000 08                PHP
0001          ;
0001 18                CLC
0002 FB                XCE
0003
0003 C2FF              REP    #$FF
0005 E220              SEP    #$20
0007                   LONGI  ON
0007                   LONGA  OFF
0007
0007 68                PLA
0008 8580              STA    PREG
000A A23000            LDX    #FLAGS
000D 8682              STX    PTR
000F A20800            LDX    #8
0012
0012 0680     LOOP     ASL    PREG
0014 B004              BCS    DOFLAG
0016 A92D              LDA    #'-'
0018 8002              BRA    SKIP
001A B282     DOFLAG   LDA    (PTR)
001C 200080   SKIP     JSR    COUT
001F E682              INC    PTR
0021 D002              BNE    OK
0023 E683              INC    PTR+1
0025 CA       OK       DEX
0026 D0EA              BNE    LOOP
0028 A90D              LDA    #$0D
002A 200080            JSR    COUT
002D
002D 38                SEC
002E FB                XCE
002F 60                RTS
0030
0030 6E766D78 FLAGS    DC     C'nvmxdizc'
0038
0038                   END

0000          COUT     START
0000          ECOUT    GEQU   $FDED
0000 48                PHA
0001 DA                PHX
0002 5A                PHY
0003 08                PHP
0004 38                SEC
0005 FB                XCE
0006 20EDFD            JSR    ECOUT
0009 18                CLC
000A FB                XCE
000B 28                PLP
000C 7A                PLY
000D FA                PLX
000E 68                PLA
000F 60                RTS
0010                   END
Listing 10.4
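The core loop of Listing 10.4 can be modeled in Python. This is a sketch rather than a translation: the function name is invented, and the result is returned as a string instead of being printed through COUT.

```python
FLAGS = "nvmxdizc"   # flag names from high-order bit to low-order bit


def flag_string(preg):
    """Shift the status byte left eight times, testing the carry each time."""
    out = []
    for name in FLAGS:
        carry = (preg >> 7) & 1              # ASL PREG moves bit 7 into the carry
        preg = (preg << 1) & 0xFF
        out.append(name if carry else "-")   # BCS DOFLAG, otherwise a dash
    return "".join(out)


print(flag_string(0xB1))   # n-mx---c
```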
Example Syntax
LDA    $2234,X
LDA    $2234,Y
LDA    $17,X
LDX    $17,Y
LDA    ($17),Y
LDA    ($17,X)
JMP    ($7821,X)
LDA    $17
LDA    $654321,X
LDA    [$17],Y
LDA    $29,S
LDA    ($29,S),Y
            Available on:
Mnemonic    6502    65C02    65802/816    Description
PEA                          x            push effective absolute address
PEI                          x            push effective indirect address
PER                          x            push effective relative address
         ;
A20034            LDX    #$3400
DA                PHX
2B                PLD
Fragment 11.1
Fragment 11.2 illustrates the second method. The direct page register can be set to the value in the
sixteen-bit C accumulator by use of the TCD instruction, which transfers sixteen bits from accumulator to direct
page register.
0000          ;
0000 A900FE            LDA    #$FE00
0003 5B                TCD
Fragment 11.2
Both methods of setting the direct page register give it a sixteen-bit value. Since sixteen bits are only
capable of specifying an address within a 64K range, its bank component must be provided in another manner;
this has been done by limiting the direct page to bank zero. The direct page can be located anywhere in 64K but
the bank address of the direct page is always bank zero.
Chapter 7, which limited the use of the direct page to page zero, used the example shown in Fragment
11.3 to store the one-byte value $F0 at address $0012, which is the direct page offset of $12 added to a direct
page register value of zero. If instead the direct page register is set to $FE00, then $F0 is stored to $FE12; the
direct page offset of $12 is added to the direct page register value of $FE00.
0000 A9F000            LDA    #$F0
0003 8512              STA    $12
Fragment 11.3
While it is common to speak of a direct page address of $12, $12 is really an offset from the base
value in the direct page register ($FE00 in the last example). The two values are added to form the effective
direct page address of $FE12.
But while Chapter 7 defined a page of memory as $100 locations starting from a page boundary (any
multiple of $100), the direct page does not have to start on a page boundary; the direct page register can hold
any sixteen-bit value. If the code in Fragment 11.4 is executed, running the code in Fragment 11.3 stores the
one-byte value $f0 at address $1025: $1013 plus $12.
         ;
A91310            LDA    #$1013
5B                TCD
Fragment 11.4
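The address arithmetic in these examples can be checked with a small Python model. It is a sketch, not 65816 code: the function name is invented here, and native mode is assumed, so the sum wraps within bank zero.

```python
def direct_page_address(d_register, offset):
    """Effective bank-zero address: direct page register plus the offset."""
    return (d_register + offset) & 0xFFFF


print(hex(direct_page_address(0x0000, 0x12)))   # 0x12
print(hex(direct_page_address(0xFE00, 0x12)))   # 0xfe12
print(hex(direct_page_address(0x1013, 0x12)))   # 0x1025
```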
You will for the most part, however, want to set the direct page to begin on a page boundary: it saves one
cycle for every direct page addressing operation. This is because the processor design includes logic that, when
the direct page register's low byte is zero, concatenates the direct page register's high byte to the direct page
offset instead of adding the offset to the entire direct page register to form the effective direct page address;
concatenation saves a cycle over addition.
One of the benefits of the direct page concept is that programs, and even parts of programs, can have
their own $100-byte direct pages of variable space separate from the operating system's direct page of variable
space. A routine might set up its own direct page with the code in Fragment 11.5.
0000          ;
0000 0B                PHD
0001 A90003            LDA    #$0300
0004 5B                TCD
Fragment 11.5
To end the routine and restore the direct page register to its previous value, simply execute a PLD
instruction.
As discussed in Chapter 7, having a direct page makes accessing zero page addresses in any bank
require special assembler syntax. Since the zero page is no longer special, absolute addressing must be used;
but since the assembler normally selects direct page addressing for operands less than $100, the standard syntax
requires that you prefix a vertical bar or exclamation point to the operand to force the assembler to use absolute
addressing. This is just one of the potential assembler misassumptions covered in the next section.
LDA $123456
The first is zero page memory. Page zero has no special meaning in the 65802 and 65816: its special
attributes have been usurped by the direct page, so accessing it requires use of absolute addressing just like any
other absolute location. But the assembler assumes addresses less than $100 are direct page offsets, not zero
page addresses; it will not generate code to access the zero page (unless the direct page is set to the zero page so
that the two are one and the same) without explicit direction. And even if the direct page is set to the zero page,
65816 systems have a zero page not only in bank zero but also in every other bank, and those other page zeroes
cannot ever be accessed by absolute addressing without special direction.
The syntax to force the assembler to use absolute addressing is to precede an operand with a vertical bar
or exclamation point as shown in Fragment 11.6.
C220              REP    #$20
                  LONGA  ON
A90032            LDA    #$3200
5B                TCD
E210              SEP    #$10
                  LONGI  OFF
A202              LDX    #2
DA                PHX
AB                PLB
A532              LDA    $32
8D3200            STA    !$32
8F320000          STA    >$32
Fragment 11.6
Notice the use of another symbol, the greater-than sign (>), to force long addressing. This solves
another problem: The assembler assumes absolute addresses are in the data bank; if the value in the data bank is
other than zero, then it similarly will not generate code to access bank zero without special direction. The
greater-than sign forces the assembler to use a long addressing mode, concatenating zero high bits onto the
operand until it's 24 bits in length. This usage is shown in Fragment 11.7, where the greater-than sign forces
absolute long addressing, resulting in the assembler generating an opcode using absolute long addressing to
store the accumulator, followed by the three absolute long address bytes for $00:0127, which are, in 65x order,
$27, then $01, then $00.
The ASL instruction in Fragment 11.7 makes use of the third assembler override syntax: prefixing an
operand with the less-than sign (<) forces direct page addressing. It's not likely you'll use this last syntax often,
but it may come in handy when you've assigned a label to a value that you need the assembler to truncate to its
low-order eight bits so it will be used as a direct page offset.
Note that this override syntax is the recommended standard syntax. As Chapter 1 (Basic Concepts)
points out, even mnemonics can vary from one assembler to another, so assembler syntax such as this can differ
as well.
0000 E210              SEP    #$10
0002                   LONGI  OFF
0002
0002 A202              LDX    #2
0004 DA                PHX
0005 AB                PLB
0006 AD2701            LDA    $127
0009 8F270100          STA    >$127
000D 0627              ASL    <$127
Fragment 11.7
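The effect of the three override prefixes on the generated operand bytes can be sketched in Python. This is a model for illustration, not a description of any particular assembler: the function name is invented, only hexadecimal operands are handled, and the default rule for values above $FFFF is an assumption.

```python
def operand_bytes(operand):
    """Return the operand bytes, low byte first, for a hexadecimal operand."""
    prefix, text = "", operand
    if operand[0] in "<!|>":
        prefix, text = operand[0], operand[1:]
    value = int(text.lstrip("$"), 16)
    if prefix == "<":
        size = 1                       # force direct page: keep the low eight bits
    elif prefix in ("!", "|"):
        size = 2                       # force absolute
    elif prefix == ">":
        size = 3                       # force long: pad with zero high bits
    else:                              # assumed default: smallest mode that fits
        size = 1 if value < 0x100 else 2 if value < 0x10000 else 3
    return [(value >> (8 * i)) & 0xFF for i in range(size)]


print(operand_bytes(">$127"))   # [39, 1, 0]
```

The results agree with the operand bytes shown in Fragments 11.6 and 11.7, for example $27, $01, $00 for the long store.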
[Figure: postindexed (direct page indirect indexed with Y) addressing, showing the instruction opcode and operand, the direct page register, the Y index register (Y), and bank 0]
This addressing mode is called postindexing because the Y index register is added after the indirect
address is retrieved from the direct page.
For example, suppose that your program needs to write a dash (hyphen) character to a location on the
Apple II's 40-column screen that will be determined while the program is running. Further suppose your
program picks a screen location at column nine on line seven. The Apple II has a firmware routine (called
BASCALC) which, when presented with the number of a line on the screen, calculates the address of the
leftmost position in the line and returns it in zero page location BASL, located at memory locations $0028 and
$0029.
If you wanted to write your hyphen to the first position on the line, you could, after calling BASCALC
and loading the character to print into the accumulator, use the 65C02's indirect addressing mode:
9228              STA    (BASL)
The 6502 has no simple indirect addressing mode, but Fragment 11.8 illustrates what 6502
programmers long ago learned: you can use postindexing to the same effect as simple indirect by loading the Y
register with zero.
         BASL     EQU    $28
                  .
                  .
                  .
A92D              LDA    #'-'           write a dash
A000              LDY    #0             to (BASL)
9128              STA    (BASL),Y
Fragment 11.8
But you want to write the hyphen character to column nine (the leftmost position being column zero),
not column zero. After calling BASCALC, you load the Y register with nine and write your character indirect
through BASL indexed by the nine in Y as seen in Fragment 11.9. If BASCALC calculates line seven on the
screen to start at location $780, and as a result stores that address at BASL, then the routine in Fragment 11.9
will write a dash to location $789 (column nine on line seven).
0000 A92D              LDA    #'-'           write a dash
0002 A009              LDY    #9             to col 9
0004 9128              STA    (BASL),Y       on the line with its base in BASL
Fragment 11.9
You could write a line of dashes from column nine through column sixteen simply by creating the loop
coded in Listing 11.1. This kind of routine has been used for years on the 6502-based Apple II.
0000                   KEEP   KL.11.1
0000                   65816  OFF
0000          ;   6502 example
0000          L111     START
0000          BASL     GEQU   $28
0000          LINE7    GEQU   $780
0000
0000 A980              LDA    #LINE7
0002 8528              STA    BASL
0004 A907              LDA    #>LINE7
0006 8529              STA    BASL+1
0008 A92D              LDA    #'-'           write a dash
000A A009              LDY    #9             to col 9
000C 9128     LOOP     STA    (BASL),Y       on the line with its base in BASL
000E C8                INY                   incr pointer to next column position
000F C011              CPY    #17
0011 90F9              BCC    LOOP           (BLT): write another dash up to col. 17
0013 60                RTS
0014                   END
Listing 11.1
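The loop in Listing 11.1 can be traced with a Python model of memory. This is a sketch only: memory is a sparse dictionary, and the function name is invented for illustration.

```python
BASL = 0x28          # direct page location holding the line's base address
memory = {}          # sparse model of memory: address -> byte


def write_dashes(base, first_col, end_col):
    memory[BASL] = base & 0xFF             # low byte of the base address
    memory[BASL + 1] = base >> 8           # high byte
    y = first_col                          # LDY #9
    while y < end_col:                     # CPY #17 / BCC LOOP
        pointer = memory[BASL] | (memory[BASL + 1] << 8)
        memory[pointer + y] = ord("-")     # STA (BASL),Y
        y += 1                             # INY


write_dashes(0x780, 9, 17)
```

After the call, locations $789 through $790 (columns nine through sixteen) each hold a dash.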
Finally, note that, like absolute indexed addressing, the array of memory accessible to the indirect
indexed addressing mode can extend beyond the current 64K data bank into the next 64K bank, if the index plus
the array base exceeds $FFFF.
AD0080   WRITEX   LDA    BOXNUMBER
0A                ASL    A
AA                TAX
A958              LDA    #'X'
8150              STA    ($50,X)
Fragment 11.10
[Figure: preindexed (direct page indexed indirect with X) addressing, showing the instruction opcode and operand, the direct page register, the X index register (X), and bank 0]
                  STA    (BASL),Y
In postindexed, the operand locates the indirect address, so it's in parentheses to indicate indirection. The ,Y is
not in parentheses, since the index register is not part of finding the indirect address; it's added to the indirect
address once it is found.
On the other hand, with preindexing:
8150              STA    ($50,X)
both the operand and the index register are involved in locating the indirect address, so both are in parentheses.
A very different application for preindexing enables the 65x to read from (or write to) several I/O
peripherals at once. Obviously, a microprocessor can only read from one device at a time, so it polls each
device: provided each device uses the same I/O controller chip (so that a single routine can check the status of
all devices and read a character from each of them identically), your program can poll the various status
locations using pre-indexing. Begin by storing an array of all the status locations in the direct page. Specify the
base of the array as the operand to the preindexed instruction. Load the X index with 0 and increment it by two
until you've checked the last device. Finally, restore it to zero and cycle through again and again.
If a status check reveals a character waiting to be read, your program can branch to code that actually
reads the character from the device. This time, you'll use preindexing to access a second direct page array of
the character-reading addresses for each device; the index in the X register from the status-checking routine
provides the index into the character-reading routine.
On the 6502, the 65C02, and the 6502 emulation modes, the entire array set up for preindexing must be
in the direct page. (On the 6502 and 65C02, this means the array must be entirely in the zero page which,
unfortunately, severely limits the use of preindexing due to the competition for zero page locations.) If the
specified direct page offset plus the index in the X exceeds $FF, the array wraps around within the direct page
rather than extending beyond it. That is,
A21A              LDX    #$1A
followed by
A1F0              LDA    ($F0,X)
would load the accumulator from the indirect address in location $0A, not $10A.
On the 65802 and 65816 (in native mode), the array must still start in the direct page but wraps,
not at the end of the direct page but at the end of bank zero, when the array base plus the D direct page
setting plus the X index exceeds $00:FFFF.
On the 65816, the data that is ultimately accessed (after the indirection) is always in the data
bank.
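The two wrapping rules can be compared in a short Python sketch. The function name and parameters are invented for illustration; the emulation case assumes the direct page register is page-aligned, as it is on the 6502 and 65C02, where the direct page is the zero page.

```python
def preindexed_pointer_location(offset, x, d_register=0x0000, native=False):
    """Where the indirect address is fetched from for an (offset,X) operand."""
    if native:
        # 65802/65816 native mode: wraps only at the end of bank zero
        return (d_register + offset + x) & 0xFFFF
    # 6502, 65C02, and emulation mode: wraps within the direct page
    return d_register + ((offset + x) & 0xFF)


print(hex(preindexed_pointer_location(0xF0, 0x1A)))               # 0xa
print(hex(preindexed_pointer_location(0xF0, 0x1A, native=True)))  # 0x10a
```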
Absolute Indexed Indirect Addressing
The 65C02 introduced a new addressing mode, absolute indexed indirect addressing, which is quite
similar to direct page indexed indirect. (It is also preindexed using the X index register, but indexes into
absolute addressed memory rather than the direct page to find the indirect address.) This new addressing mode
is used only by the jump instruction and, on the 65802 and 65816, the jump-to-subroutine instruction.
Absolute indexed indirect provides a method for your program, not to access data in scattered locations
by putting the locations of the data into a table and indexing into it, but to jump to routines at various locations
by putting those locations into a table, indexing into it, and jumping to the location stored in the
table at the index. Figure 11.3 shows what happens.
         ;
38                SEC
E93000            SBC    #'0'
0A                ASL    A
AA                TAX
7C0900            JMP    (TABLE,X)
0080     TABLE    DC     A'ROUTIN0'
0080              DC     A'ROUTIN1'
0080              DC     A'ROUTIN2'
0080              DC     A'ROUTIN3'
0080              DC     A'ROUTIN4'
0080              DC     A'ROUTIN5'
0080              DC     A'ROUTIN6'
0080              DC     A'ROUTIN7'
Fragment 11.11
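The table lookup performed by Fragment 11.11 can be modeled in Python. This is a sketch under stated assumptions: the routine addresses are hypothetical placeholders, memory is a sparse dictionary, and the function returns the address that would be loaded into the program counter rather than jumping.

```python
TABLE = 0x0009                                      # table base, from the operand of 7C0900
ROUTINES = [0x8000 + 0x10 * n for n in range(8)]    # hypothetical routine addresses

memory = {}
for n, address in enumerate(ROUTINES):
    memory[TABLE + 2 * n] = address & 0xFF          # low byte first (65x order)
    memory[TABLE + 2 * n + 1] = address >> 8


def jump_target(digit):
    """Convert an ASCII digit to a table index and fetch the routine address."""
    x = (ord(digit) - ord("0")) << 1                # SEC / SBC #'0' / ASL A / TAX
    return memory[TABLE + x] | (memory[TABLE + x + 1] << 8)   # JMP (TABLE,X)
```

The doubling step is needed because each table entry is two bytes long.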
[Figure 11.3. Absolute indexed indirect addressing, showing the instruction opcode and operand, the X index register (X), and the program bank register (PBR)]
Because both the operand (the absolute address of the base of the table) and the index register are involved in
determining the indirect address, both are within the parentheses.
On the 65816, a jump-indirect operand is in bank zero, but a jump-indexed-indirect operand is in the
program bank. There is a different assumption for each mode. Jump indirect assumes that the indirect address
to be jumped to was stored by the program in a variable memory cell; such variables are generally in bank zero.
Jump indexed indirect, on the other hand, assumes that a table of locations of routines would be part of the
program itself and would be loaded, right along with the routines, into the bank holding the program. So,
6C3412            JMP    ($1234)
assumes $1234 is in bank zero, while
7C3412            JMP    ($1234,X)
assumes $1234 is in the program bank, the bank in which the code currently being executed resides.
The indirect addresses stored in the table are absolute addresses also assumed to be in the current
program bank.
                  LDA    [$15],Y
The square brackets are used to indicate the indirect address is long.
So, like its sixteen-bit counterpart, indirect long indexed addressing allows you to index into an array of
which neither the base nor the index need be determined until the program is executing. Unlike its sixteen-bit
counterpart, it allows you to access an array in any bank, not just the current data bank.
LDA
3,S
LDA
3,S
or
A301
Notice that accessing the last data put on the stack requires an index of 1, not of 0. This is because the
stack pointer always points to the next available location, which is one byte below the last byte pushed onto the
stack. An index of zero would generally be meaningless, except perhaps to re-read the last byte pulled off the
stack! (The latter would also be extremely dangerous since, should an interrupt occur, the left-behind byte
would be overwritten by interrupt-stacked bytes.)
[Figure: direct page indirect long indexed with Y addressing, showing the effective address, the instruction opcode and operand, the direct page register (D), and the Y index register (Y)]
[Figure: stack relative addressing, showing the effective address in bank zero, the instruction opcode and operand, and the stack pointer (S)]
A00000            LDY    #0
B301              LDA    (1,S),Y
AA                TAX
A00200            LDY    #2
B301              LDA    (1,S),Y
Fragment 11.12
The 1,S is the stack location where the indirect address was pushed. (Actually, 1,S points to the stack
location of the low byte of the indirect address; the high byte is in 2,S, the next higher stack location.) To this
indirect address, the value in the Y is added: the indirect address plus 0 locates the first value to be multiplied;
the indirect address plus 2 locates the second. Finally the accumulator is loaded from this indirect indexed
address. Figure 11.6 illustrates the sequence.
This mode, very similar to direct page indirect indexing (also called postindexing), might be called
stack postindexing. The operand which indexes into the stack is very similar to a direct page address; both are
limited to eight bits and both are added to a sixteen-bit base register (D or S). In both cases, the indirect address
points to a cell or an array in the data bank. In both cases, Y must be the index register. And in both cases in
the 65816, the postindexed indirect address about to be accessed may extend out of the data bank and into the
next bank if index plus address exceeds $FFFF; that is, if the indirect address is the base of an array, the array
can extend into the next bank.
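The address calculation for this mode can be sketched in Python. The function name and its parameters are invented for illustration; memory is modeled as a dictionary covering bank zero, and the data bank register value is passed in explicitly.

```python
def stack_relative_indirect_indexed(memory, s, operand, y, dbr=0):
    """Effective 24-bit address for an (operand,S),Y access."""
    lo = memory[(s + operand) & 0xFFFF]        # 1,S holds the low byte of the pointer
    hi = memory[(s + operand + 1) & 0xFFFF]    # 2,S holds the high byte
    base = (dbr << 16) | (hi << 8) | lo        # the pointer refers to the data bank
    return (base + y) & 0xFFFFFF               # indexing may carry into the next bank


stack = {0x01F1: 0x00, 0x01F2: 0x30}           # the address $3000 pushed on the stack
print(hex(stack_relative_indirect_indexed(stack, 0x01F0, 1, 2, dbr=0x02)))   # 0x23002
```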
[Figure 11.6. Stack relative indirect indexed addressing, showing the stack pointer (S), the indirect address on the stack in bank 0, the data bank register (DBR), and the Y index register (Y)]
                  PEA    $2134
pushes what may be either sixteen-bit immediate data or a sixteen-bit address onto the stack. The operand
pushed by the PEA instruction is always 16 bits regardless of the settings of the m memory/accumulator and x
index mode select flags.
The PEI (push effective indirect address) instruction has, as an operand, a direct page location: it's the
sixteen-bit value stored at the location that is pushed onto the stack. Figure 11.8 shows that this has the effect of
pushing either an indirect address or sixteen bits of direct page data onto the stack. For example, if you had
stored the value or indirect address $5678 at direct page location $21, then
D421              PEI    ($21)
[Figure: PEA addressing, showing the operand (data high, data low) pushed onto the stack in bank 0, with the stack pointer (S) before and after]
Figure 11-8. PEI Addressing
Figure 11-9. PER Addressing
location in the program; the operand the assembler generates is also a sixteen-bit displacement; and
when the instruction is executed, the displacement is added to the next instruction's run-time address
to form the address to which the program will branch.
To understand the use of the PER instruction, together with the relative branches, in writing a
program that will run at any address, suppose that your relocatable program is assembled starting at
location $2000. There's a data area starting at location $2500 called DATA0. A section of program
code at $2200 needs to access a byte three bytes past, called DATA1. A simple LDA $2503 would
work, but only if the program were intended to always begin at location $2000. If it's meant to be
relocatable, you might load the program at $3000, in which case the data is at $3503 and a LDA $2503
loads the accumulator with random information from what is now a non-program address. Using the
instruction
62E17F            PER    DATA1
in your source program causes the assembler to calculate the offset from $2203 (from the instruction
following the PER instruction at $2200) to DATA1 at $2503, an offset of $300. So the assembler
generates object code of a PER opcode followed by $300. Now if the code is loaded at $3000,
execution of the PER instruction causes the processor to calculate and stack the current absolute
address of DATA1 by adding the operand, $300, to the current program counter location; the result is
$3503, so it's $3503 that's stacked. Once on the stack, provided the program and data banks are the
same, the data can be accessed using stack relative indirect indexed addressing. Fragment 11.13
contains the example code.
Once the address of DATA1 is on the stack, the values at DATA2 and DATA3 can be
accessed as well simply by using values of one and two, respectively, in the Y index register.
                  ORG    $2200
         ACCESS   START
62FD7F            PER    DATA1
E220              SEP    #$20
A00000            LDY    #0
B301              LDA    (1,S),Y
                  .
                  .
                  .
                  END

                  ORG    $2500
         DATA0    START
2A2A2A            DC     C'***'
FF       DATA1    DC     H'FF'
F7       DATA2    DC     H'F7'
E3       DATA3    DC     H'E3'
                  END
Fragment 11.13
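The arithmetic behind PER can be sketched in Python, split into the assembler's part and the processor's part. The function names are invented for illustration; the three-byte instruction length follows from the opcode plus a sixteen-bit displacement.

```python
PER_LENGTH = 3   # opcode plus a sixteen-bit displacement


def per_operand(per_address, target):
    """Displacement the assembler generates, relative to the next instruction."""
    return (target - (per_address + PER_LENGTH)) & 0xFFFF


def per_pushes(per_address, operand):
    """Address the processor pushes when PER executes at per_address."""
    return (per_address + PER_LENGTH + operand) & 0xFFFF


offset = per_operand(0x2200, 0x2503)      # assembled with the program at $2000
print(hex(offset))                        # 0x300
print(hex(per_pushes(0x3200, offset)))    # 0x3503 when loaded at $3000 instead
```

Because only the displacement is stored in the instruction, the pushed address follows the program wherever it is loaded.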
            Available on:
Mnemonic    6502    65C02    65802/816    Description
65x Subroutine Instructions:
JSR         x       x        x            jump to subroutine
RTS         x       x        x            return from subroutine
JSL                          x            long jump to subroutine
RTL                          x            long return from subroutine
                  JSR    $2000
or
200080            JSR    SUBR1
In the second case, the assembler determines the address of subroutine SUBR1.
The processor, upon encountering a jump-to-subroutine instruction, first saves a return address. The
address saved is the address of the last byte of the JSR instruction (the address of the last byte of the operand),
not the address of the next instruction as is the case with some other processors. The address is pushed onto the
stack in standard 65x order (the low byte in the lower address, the high byte in the higher address) and done in
standard 65x fashion (the first byte is stored at the location pointed to by the stack pointer, the stack pointer is
decremented, the second byte is stored, and the stack pointer is decremented again). Once the return address has
been saved onto the stack, the processor loads the program counter with the operand value, thus jumping to the
operand location, as shown in Figure 12.1. Jumping to a subroutine has no effect on the status register flags.
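The stacking order can be traced with a small Python model. This is a sketch for illustration: the function name is invented, memory is a dictionary, and the function returns the new program counter and stack pointer instead of changing processor state.

```python
def jsr(memory, pc, s, target):
    """Model an absolute JSR located at address pc; return (new pc, new s)."""
    ret = (pc + 2) & 0xFFFF        # address of the last byte of the three-byte JSR
    memory[s] = ret >> 8           # high byte is stored first, at the higher address
    s = (s - 1) & 0xFFFF
    memory[s] = ret & 0xFF         # low byte ends up at the lower address
    s = (s - 1) & 0xFFFF
    return target, s


stack = {}
pc, s = jsr(stack, 0x2000, 0x01FF, 0x3000)
```

For a JSR at $2000 the stacked address is $2002, not $2003.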
[Figure 12.1. JSR, showing the instruction operand, the program bank register (PBR), and the return address stacked in bank 0, with the stack pointer before and after]
pointer by one before retrieving each of the two bytes to which it points. But the return address that was stored
on the stack was the address of the third byte of the JSR instruction. When the processor pulls the return
address off the stack, it automatically increments the address by one so that it points to the instruction following
the JSR instruction which should be executed when the subroutine is done. The processor loads this
incremented return address into the program counter and continues execution from the instruction following the
original JSR instruction, as Figure 12.2 shows.
The processor assumes that the two bytes at the top of the stack are a return address stored by a JSR
instruction and that these bytes got there as the result of a previous JSR. But as a result, if the subroutine used
the stack and left it pointing to data other than the return address, the RTS instruction will pull two irrelevant
data bytes as the address to return to. Cleaning up the stack after using it within a subroutine is therefore
imperative.
The useful side of the processor's inability to discern whether the address at the top of the stack was
pushed there by a JSR instruction is that you can write a reentrant indirect jump using the RTS instruction.
First formulate the address to be jumped to, then decrement it by one (or better, start with an already-decremented
address), push it onto the stack (pushing first high byte, then low byte, so that it is in correct 65x
order on the stack) and, finally, code an RTS instruction. The return-from-subroutine pulls the address back off
the stack, increments it, and loads the result into the program counter to cause a jump to the location, as
Fragment 12.1 illustrates.
0000          ;
0000 3A                DEC    A
0001 48                PHA
0002 60                RTS
Fragment 12.1
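The jump-by-RTS technique can be checked with a Python model. This is a sketch only: the function names are invented, memory is a dictionary, and a sixteen-bit accumulator is assumed so that the push stores the high byte first.

```python
def rts(memory, s):
    """Pull a two-byte address, add one, and return (new pc, new s)."""
    s = (s + 1) & 0xFFFF
    lo = memory[s]
    s = (s + 1) & 0xFFFF
    hi = memory[s]
    return ((hi << 8 | lo) + 1) & 0xFFFF, s


def jump_via_rts(memory, s, target):
    """Decrement the target, push it, then let RTS perform the jump."""
    address = (target - 1) & 0xFFFF     # DEC A
    memory[s] = address >> 8            # PHA pushes the high byte first
    s = (s - 1) & 0xFFFF
    memory[s] = address & 0xFF          # then the low byte
    s = (s - 1) & 0xFFFF
    return rts(memory, s)               # RTS


print(jump_via_rts({}, 0x01FF, 0x4000) == (0x4000, 0x01FF))   # True
```

The stack pointer ends where it started, and no fixed memory location is used, which is what makes the technique reentrant.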
Reentrancy is the ability of a section of code to be interrupted, then executed by the interrupting
routine, and still execute properly both for the interrupting routine and for the original routine when control is
returned to it. The interruption may be a result of a hardware interrupt (as described in the next chapter), or the
[Figure 12.2. RTS, showing PC high and PC low pulled from the stack in bank 0 and incremented into the program counter (PC), with the stack pointer (S) before and after]
The indirect jump using RTS qualifies for reentrancy: While normally you would code an indirect jump
by forming the address to jump to and storing it to an absolute address, then jumping indirect through the
address, this jump by use of RTS uses only registers and stack.
A subroutine can have more than one RTS instruction. It's common for subroutines to return from internal loops
upon certain error conditions, in addition to returning normally from one or more locations. Some structured
programming purists would object to this practice, but the efficiency of having multiple exit points is
unquestionable.
Returning from a subroutine does not affect the status flags.
                          JSR   (TABLE,X)
The array TABLE must be located in the program bank. The addressing mode assumes that a table of locations of
routines would be part of the program itself and would be loaded, right along with the routines, into the bank
holding the program. The indirect address (the address with which the program counter will be loaded), a
sixteen-bit value, is concatenated with the program bank register, resulting in a transfer within the current
program bank. If the addition of X causes a result greater than $FFFF, the effective address will wrap,
remaining in the current program bank, unlike the indexing across banks that occurs for data accesses.
This addressing mode also lets you do an indirect jump-to-subroutine through a single double-byte cell
by first loading the X register with zero. You must remember in coding this use for the 65816, however, that
the cell holding the indirect address is in the program bank, not bank zero as with absolute indirect jumps.
The indexed indirect jump-to-subroutine is executed in virtually the same manner as the absolute
jump-to-subroutine: the processor pushes the address of the final byte of the instruction onto the stack as
a return address; then the address in the double-byte cell pointed to by the sum of the operand and the X
index register is loaded into the program counter.
There is no difference between returning from a subroutine called by this instruction and returning from
a subroutine called by an absolute JSR. You code an RTS instruction which, when executed, causes the
address on the top of the stack to be pulled and incremented to point to the instruction following the JSR, then
to be loaded into the program counter to give control to that instruction.
                          JSR   $123456
This time a three-byte (long) return address is pushed onto the stack. Again it is not the address of the
next instruction but rather the address of the last byte of the JSR instruction which is pushed onto the
stack (the address of the fourth byte of the JSR instruction in this case). As Figure 12.3 shows, the
address is pushed onto the stack in standard 65x order: low byte in the lower address, high byte in the
higher address, bank byte in the highest address (which also means the bank byte is the first of the three
pushed, the low byte last).
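The push order can be illustrated with a short Python sketch. It is a model under stated assumptions, not a description of any real tool: the function names are hypothetical, the stack pointer is assumed to start at $01FF, and the program counter is assumed to wrap within its bank.

```python
# Model of the three bytes a long jump-to-subroutine pushes (illustrative only).

def jsl_pushes(instr_addr, s=0x01FF):
    """Return {stack address: byte} for a four-byte JSR long at the 24-bit instr_addr."""
    bank = (instr_addr >> 16) & 0xFF
    # The stacked PC is the address of the instruction's fourth (last) byte.
    pc = ((instr_addr & 0xFFFF) + 3) & 0xFFFF
    return {
        s: bank,            # pushed first, so it sits at the highest address
        s - 1: pc >> 8,     # high byte
        s - 2: pc & 0xFF,   # low byte, pushed last, at the lowest address
    }


def rtl_target(pushed, s_after):
    """Address that a return-from-subroutine long resumes at, given S after the pushes."""
    low = pushed[s_after + 1]
    high = pushed[s_after + 2]
    bank = pushed[s_after + 3]
    return (bank << 16) | ((((high << 8) | low) + 1) & 0xFFFF)
```

For a long call located at $02:1000, the model stacks $02, $10, $03, and the matching return resumes at $02:1004, the instruction after the call.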
Jumping long to a bank zero subroutine requires the greater-than (>) sign, as explained in the last
chapter:
      22563400            JSR   >$3456
The greater-than sign forces long addressing to bank zero, overriding the assembler's normal assumption to
use absolute addressing to jump to a subroutine at $3456 in the current program bank.
To avoid this confusion altogether, there is an equivalent standard mnemonic for jump-to-subroutine
long, JSL:
      22563400            JSL   $3456
or
      22563402            JSL   $023456
Using an alternate mnemonic is particularly appropriate for jump-to-subroutine long, since this
instruction requires you to use an entirely different return-from-subroutine instruction: RTL, or
return-from-subroutine long.
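[Figure: stack in bank 0 after a long jump-to-subroutine. From higher to lower address: Return Address Bank,
Return Address High, Return Address Low, where the return address is that of the last JSR instruction byte.
The stack pointer (S) is shown before and after.]
Figure 12-3 JSL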
Branch to Subroutine
One of the glaring deficiencies of the 6502 was its lack of support for writing relocatable code;
the 65802 and 65816 address this deficiency, but still lack the branch-to-subroutine instruction some
other processors provide. There is no instruction that lets you call a subroutine with an operand that is
program counter relative, not an absolute address. Yet, to write relocatable code easily, a BSR
instruction is required: suppose a relocatable program assembled at $0 has an often-called multiply
subroutine at $07FE; if the program is later loaded at $7000, that subroutine is at $77FE; obviously, a
JSR to $07FE will fail.
[Figure: stack in bank 0, showing the stack pointer (S) before and after, and the program counter (PC).]
The 65802 and 65816 can synthesize the BSR function using their PER instruction. You use PER to
compute and push the current run-time return address; since its operand is the relative offset of the return
address (from the current address of the PER instruction), PER provides relocatability. As Fragment 12.2
shows, once the correct return address is on the stack, a BRA or BRL completes the synthesized BSR operation.
0000                      .
0000                      .
0000  62FC7F              PER   RETURN-1
0003  82FA7F              BRL   SUBR1
0006            RETURN    .
0006                      .
0006                      .
0006            SUBR1     .
0006                      .
0006                      .
0006  60                  RTS
Fragment 12.2
In this case, you specify as the assembler operand the symbolic location of the routine you want to
return to minus one. Remember that the return address on the stack is pulled, then incremented, before
control is passed to it. The assembler transforms the source code operand, RETURN-1, into the instruction's
object code operand, a relative displacement from the next instruction to RETURN-1. In this case, the
displacement is $0002, the difference between the first byte of the BRL instruction and its last byte.
(Remember, PER works the same as the BRL instruction; in both cases, the assembler turns the location you
specify into a relative displacement from the program counter.) When the instruction is executed, the
processor adds the displacement ($0002, in this case) to the current program counter address (the address
of the BRL instruction); the resulting sum is the current absolute address of RETURN-1, which is what is
pushed onto the stack.
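The arithmetic can be sketched in Python. This is a model under assumptions, with hypothetical function names: PER is taken to be three bytes long, and the layout is the one in the fragment (PER at $0000, BRL at $0003, RETURN at $0006).

```python
# Model of the PER displacement at assembly time and the push at run time (illustrative only).

PER_LENGTH = 3  # opcode plus a two-byte displacement


def per_operand(per_addr, target):
    """Displacement the assembler stores: target relative to the next instruction."""
    return (target - (per_addr + PER_LENGTH)) & 0xFFFF


def per_pushes(per_addr, displacement):
    """Address PER pushes at run time: next-instruction address plus displacement."""
    return (per_addr + PER_LENGTH + displacement) & 0xFFFF


def rts_resumes_at(stacked):
    """RTS adds one to the stacked address."""
    return (stacked + 1) & 0xFFFF
```

With the code assembled at $0000, per_operand(0x0000, 0x0005) gives $0002. If the same bytes are later loaded at $7000, per_pushes(0x7000, 0x0002) gives $7005, so the eventual RTS resumes at $7006, which is where RETURN now lives. The stored displacement never changes, which is the point of the technique.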
0000                      KEEP  KL.12.1
0000            ; NEGACC
0000            ; Negate the 8-bit value in the accumulator
0000            ; On entry: Value to be negated is in accumulator
0000            ; On exit:  Value now negated is in accumulator
0000            NEGACC    START
0000  49FF                EOR   #%11111111
0002  18                  CLC
0003  6901                ADC   #1
0005  60                  RTS
0006                      END
Listing 12.1
It is extremely important to clearly document library routines. Perhaps the best approach is to begin
with a block comment at the head of the routine, describing its name, what the routine does, what it expects
as input, what direct page locations it uses during execution, whether the contents of any registers or any
special memory locations are modified during execution, and how and where results are returned.
By documenting the entry and exit conditions as part of the header, as in the example, when the routine
is used from a library you won't have to read the code to get this information. Although this example is
quite simple, when applied to larger, more complex subroutines, the principle is the same: document the entry
and exit conditions, the function performed, and any side effects.
As a subroutine, this code to negate the accumulator takes six bytes. Each JSR instruction takes three.
So calling it twice from a single program requires 12 bytes of code; if called three times, 15 bytes; if four, 18
bytes.
      49FF                EOR   #%11111111
      1A                  INC   A
Since the in-line code takes the same number of bytes as the JSR instruction, you would lose four bytes (the
number in the subroutine itself) by calling it as a subroutine.
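The byte counts above follow from simple arithmetic, sketched here in Python. The function names are hypothetical, and the five-byte figure for the in-line form of Listing 12.1 is an assumption (the subroutine's six bytes minus its one-byte RTS).

```python
# Code-size comparison between calling a subroutine and repeating code in-line.

JSR_BYTES = 3


def subroutine_cost(body_bytes, calls):
    """Bytes used when the code lives in one subroutine called `calls` times."""
    return body_bytes + JSR_BYTES * calls


def inline_cost(inline_bytes, calls):
    """Bytes used when the code is repeated in-line `calls` times."""
    return inline_bytes * calls


# Listing 12.1: six bytes as a subroutine, assumed five bytes in-line (no RTS).
six_byte_routine = [(n, subroutine_cost(6, n), inline_cost(5, n)) for n in (2, 3, 4)]

# The EOR / INC A form: three bytes in-line, four as a subroutine (with its RTS).
inc_a_penalty = [subroutine_cost(4, n) - inline_cost(3, n) for n in (1, 2, 10)]
```

For the six-byte routine the two approaches break even at three calls (15 bytes each); for the three-byte form the subroutine always costs four bytes more, however many times it is called.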
0000                      KEEP  KL.12.2
0000            NEGXA     START
0000            ; first call the 8-bit negation routine defined a few pages back
0000  200080              JSR   NEGACC        negate the low 8 bits in the accum
0003            ; then get and negate the high 8 bits
0003  A8                  TAY
0004  8A                  TXA                 get high 8 bits into accum
0005  49FF                EOR   #%11111111    form one's complement
0007  6900                ADC   #0            add carry from adding 1 to low byte
0009  60                  RTS                 return
000A                      END
Listing 12.2
Here, one subroutine (NEGXA) calls another (the subroutine described previously that negates eight
bits).
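How the carry links the two halves can be seen in a small Python model. It is a sketch only; the function names mirror the listing labels but are otherwise hypothetical, and the registers are modeled as plain integers.

```python
# Model of 8-bit negation and its use to negate a 16-bit value in two steps.

def negacc(a):
    """EOR #$FF, CLC, ADC #1 on an 8-bit value; returns (result, carry out)."""
    total = (a ^ 0xFF) + 1
    return total & 0xFF, total >> 8


def negxa(x, a):
    """Negate the 16-bit value with high byte x and low byte a.

    Returns (high, low): the low byte is negated first, and the carry from
    adding one to it is then added to the one's complement of the high byte.
    """
    low, carry = negacc(a)
    high = ((x ^ 0xFF) + carry) & 0xFF
    return high, low
```

The carry is set only when the low byte was zero, which is exactly the case where the "plus one" must ripple into the high byte.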
      49FFFF              EOR   #$FFFF
      1A                  INC   A
Parameter Passing
When dealing with subroutines, which by definition are generalized pieces of code used over and over
again, the question of how to give the subroutine the information needed to perform its function must be
considered. Values passed to or from subroutines are referred to as the parameters of the subroutine.
Parameters can include values to be acted upon, such as two numbers to be multiplied, or may be information
that defines the context or range of activity of the subroutine. For example, a subroutine parameter could be the
address of a region of memory to work on or in, rather than the actual data itself.
The preceding examples demonstrated one of the simplest methods of parameter-passing, by using the
registers. Since many of the operations that are coded as subroutines in assembly language are primitives
that operate on a single element, like "print a character on the output device" or "convert this character
from binary to hexadecimal," passing parameters in registers is probably the approach most commonly found.
A natural extension of this approach, which is particularly appropriate for the 65802 and 65816, but
also possible on the 6502 and 65C02, is to pass the address of a parameter list in a register (or, on the
6502 and 65C02, in two registers). Listing 12.3 gives an example.
0000                      KEEP  KL.12.3
0000                      65816 ON
0000            L123      START
0000  18                  CLC
0001  FB                  XCE
0002  E220                SEP   #$20          8-bit accumulator
0004                      LONGA OFF
0004  C210                REP   #$10
0006                      LONGI ON
0006  A21500              LDX   #STRING1
0009  200080              JSR   PRSTRNG
000C  A22800              LDX   #STRING2
000F  200080              JSR   PRSTRNG
0012  38                  SEC
0013  FB                  XCE
0014  60                  RTS
0015  54686973  STRING1   DC    ...           zero-terminated string beginning "This"
0028  54686973  STRING2   DC    ...           zero-terminated string beginning "This"
003B                      END

0000            ; print a string of characters terminated by a 0 byte
0000            ; on entry: X register holds location of string
0000            PRSTRNG   START
0000  BD0000    TOP       LDA   !0,X
0003  F006                BEQ   DONE
0005  200080              JSR   COUT
0008  E8                  INX
0009  80F5                BRA   TOP
000B  60        DONE      RTS
000C                      END

0000            ; COUT
0000            ; machine-dependent routine to output a character
0000            ;
0000            COUT      START
0000            ECOUT     GEQU  $FDED         Apple // COUT
0000  48                  PHA                 Save registers
0001  DA                  PHX
0002  5A                  PHY
0003  08                  PHP                 and status,
0004  38                  SEC                 switch to emulation
0005  FB                  XCE
0006  20EDFD              JSR   ECOUT
0009  18                  CLC
000A  FB                  XCE
000B  28                  PLP
000C  7A                  PLY
000D  FA                  PLX
000E  68                  PLA
000F  60                  RTS
0010                      END
Listing 12.3
0000                      KEEP  KL.12.4
0000            ; 6502/65C02 example
0000            PEX       START
0000            PARAM     GEQU  $80
0000  A200                LDX   #>STRING1
0002  8681                STX   PARAM+1
0004  A20C                LDX   #<STRING1
0006  8680                STX   PARAM
0008  200080              JSR   PRSTRNG
000B  60                  RTS
000C  54686973  STRING1   DC    ...           zero-terminated string beginning "This"
001F                      END

0000            ; print a string of characters terminated by a 0 byte
0000            ; on entry: direct page location PARAM holds address of string
0000            PRSTRNG   START
0000            COUT      GEQU  $FDED
0000  A000                LDY   #0
0002  B180      LOOP      LDA   (PARAM),Y
0004  F006                BEQ   DONE
0006  20EDFD              JSR   COUT
0009  C8                  INY
000A  D0F6                BNE   LOOP
000C  60        DONE      RTS
000D                      END
Listing 12.4
Unfortunately, it takes eight bytes to set up PARAM each time PRSTRNG is called. As a result, a
frequently used method of passing parameters to a subroutine is to code the data in-line, immediately following
the subroutine call. This technique (see Fragment 12.5) uses no registers and no data memory, only program
memory.
                          .
                          .
      200080              JSR   PRSTRNG
      54686520            DC    ...           zero-terminated string beginning "The "
                RETURN    .
                          .
                          .
Fragment 12.5
This method looks, at first glance, bizarre. Normally, when a subroutine returns to the calling section of
code, the instruction immediately following the JSR is executed. Obviously, in this example, the data stored at
that location is not executable code, but string data. Execution should resume instead at the label RETURN,
which is exactly what happens using the PRSTRNG coded in Listing 12.5. The return address pushed onto the
stack by the JSR is not a return address at all; it is, rather, the parameter to PRSTRNG.
0000                      KEEP  KL.12.5
0000                      65816 ON
0000            PRSTRNG   START
0000  18                  CLC
0001  FB                  XCE
0002  E220                SEP   #$20          8-bit accum
0004                      LONGA OFF
0004  C210                REP   #$10
0006                      LONGI ON
0006  FA                  PLX
0007  E8                  INX
0008  BD0000    LOOP      LDA   !0,X
000B  F006                BEQ   DONE
000D  200080              JSR   COUT
0010  E8                  INX
0011  80F5                BRA   LOOP
0013  DA        DONE      PHX
0014  60                  RTS
0015                      END

0000            ; COUT
0000            ; machine-dependent routine to output a character
0000            ;
0000            COUT      START
0000            ECOUT     GEQU  $FDED         Apple // COUT
0000  48                  PHA                 Save registers
0001  DA                  PHX
0002  5A                  PHY
0003  08                  PHP                 and status,
0004  38                  SEC                 switch to emulation
0005  FB                  XCE
0006  20EDFD              JSR   ECOUT
0009  18                  CLC
000A  FB                  XCE
000B  28                  PLP
000C  7A                  PLY
000D  FA                  PLX
000E  68                  PLA
000F  60                  RTS
0010                      END
Listing 12.5
The parameter address on the stack need only be pulled and incremented once, and the data can then be
accessed in the same manner as in the foregoing example. Since the loop terminates when the zero
end-of-string marker is reached, pushing its address in the X register onto the stack gives RTS a correct
return value: RETURN-1, the byte before the location where execution should resume. Note that the data bank
is assumed to equal the program bank.
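The bookkeeping is easy to follow in a small Python model. This is a sketch under assumptions: the function name is hypothetical, memory is modeled as a bytes object, and the stacked value is taken to be the address of the last byte of the JSR, as described earlier in the chapter.

```python
# Model of a print routine that takes its string from the bytes following the call.

def prstrng_inline(memory, stacked_return):
    """Return (text printed, address pushed back for RTS).

    memory         -- bytes-like program image
    stacked_return -- address of the last byte of the calling JSR
    """
    x = stacked_return + 1              # pull, then increment: first string byte
    printed = []
    while memory[x] != 0:               # load a byte, stop at the zero marker
        printed.append(chr(memory[x]))  # output the character
        x += 1                          # advance to the next byte
    return "".join(printed), x          # x is the zero byte, i.e. RETURN-1


# A three-byte JSR at address 0, the string "Hi", its zero marker, then a NOP.
image = bytes([0x20, 0x00, 0x80]) + b"Hi\x00" + bytes([0xEA])
text, pushed_back = prstrng_inline(image, 2)
resume_at = pushed_back + 1             # what RTS does with the pushed address
```

Here the routine prints "Hi", pushes back address 5 (the zero byte), and RTS resumes at address 6, the first real instruction after the string.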
The advantage of this method is in bytes used: there is no need for any explicit parameter-passing by
the calling code, and the JSR mechanism makes the required information available to the subroutine
automatically. In fact, for most applications on all four 65x microprocessors, this method uses fewer bytes for
passing a single parameter than any other.
One slight disadvantage of this method is that if the string is to be output more than once, it and its
preceding JSR must be made into a subroutine that is called to output the string.
      F40080              PEA   STRING1
      F40080              PEA   STRING2
      200080              JSR   COMPARE
                          .
                          .
                          .
Fragment 12.6
0000                      KEEP  KL.12.6
0000                      65816 ON
0000            COMPARE   START
0000  08                  PHP
0001  C210                REP   #$10
0003                      LONGI ON
0003  E220                SEP   #$20
0005                      LONGA OFF
0005  A00000              LDY   #0
0008  B304      LOOP      LDA   (4,S),Y       string 2: offset 3 on entry, plus 1 for the PHP
000A  F007                BEQ   PASS
000C  D306                CMP   (6,S),Y       string 1: offset 5 on entry, plus 1 for the PHP
000E  D006                BNE   FAIL
0010  C8                  INY
0011  80F5                BRA   LOOP
0013  28        PASS      PLP
0014  18                  CLC
0015  60                  RTS
0016  B306      FAIL      LDA   (6,S),Y
0018  F0F9                BEQ   PASS
001A  28                  PLP
001B  38                  SEC
001C  60                  RTS
001D                      END
Listing 12.6
This example, which compares two strings to see if they are equal up to the length of the shorter of the
two strings, uses parameters that have been explicitly passed on the stack. This approach, since it explicitly
passes the address of the strings, lets them be located anywhere and referred to any number of times. Its
                          .
                          .
      F4FF7F              PEA   RETURN-1
      F40080              PEA   STRING1
      F40080              PEA   STRING2
      4C0080              JMP   COMPARE
                RETURN    .
                          .
                          .
Fragment 12.7
0000                      KEEP  KL.12.7
0000                      65816 ON
0000            COMPARE   START
0000  08                  PHP
0001  C210                REP   #$10
0003                      LONGI ON
0003  E220                SEP   #$20
0005                      LONGA OFF
0005  A00000              LDY   #0
0008  B302      LOOP      LDA   (2,S),Y       string 2: offset 1 on entry, plus 1 for the PHP
000A  F007                BEQ   PASS
000C  D304                CMP   (4,S),Y       string 1: offset 3 on entry, plus 1 for the PHP
000E  D007                BNE   FAIL
0010  C8                  INY
0011  80F5                BRA   LOOP
0013  28        PASS      PLP
0014  18                  CLC
0015  8006                BRA   EXIT
0017  B304      FAIL      LDA   (4,S),Y
0019  F0F8                BEQ   PASS
001B  28                  PLP
001C  38                  SEC
001D  FA        EXIT      PLX
001E  FA                  PLX
001F  60                  RTS
0020                      END
Listing 12.7
Since the return address was pushed first, the two parameter addresses are at the top of the stack on
entry, at offsets of one and three; each offset is one greater once PHP has pushed the status byte on top of
them. Before returning, two pull instructions pop the parameters off the stack, then the RTS is executed,
which returns control to the main program with the stack in order.
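Stack-relative offsets are easy to miscount, so a small Python helper can be used to check them. It is a sketch only: the function name and the item labels are hypothetical, and it relies on the rule that offset 1 refers to the last byte pushed.

```python
# Compute stack-relative offsets from the order in which items were pushed.

def stack_offsets(pushes):
    """pushes: list of (name, size in bytes) in push order.

    Returns {name: offset}, where offset 1 is the most recently pushed byte.
    """
    offsets = {}
    offset = 1
    for name, size in reversed(pushes):
        offsets[name] = offset
        offset += size
    return offsets


# Return address pushed first, then the two string addresses, then a jump.
on_entry = stack_offsets([("RETURN-1", 2), ("STRING1", 2), ("STRING2", 2)])

# The same stack after the routine has also pushed the one-byte status register.
after_php = stack_offsets([("RETURN-1", 2), ("STRING1", 2), ("STRING2", 2), ("P", 1)])

# Parameters pushed first, followed by a JSR that stacks a two-byte return address.
after_jsr = stack_offsets([("STRING1", 2), ("STRING2", 2), ("RETURN", 2)])
```

The first case gives offsets 1 and 3 for the two strings, the second gives 2 and 4, and the third gives 3 and 5, which shows how every additional push moves the parameters further from the top.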
Passing parameters on the stack is particularly well suited for both recursive routines (routines
that call themselves) and reentrant routines (routines that can be interrupted and used successfully both
by the interrupting code and the original call) because new memory is automatically allocated for
parameters for each invocation of the subroutine. This is the method used by most high-level languages
that support recursion.
Fragment 12.8 sets up multiple parameters implicitly passed on the stack by coding after the JSR, not
data, but pointers to data. The routine called is in Listing 12.8.
                          .
                          .
      200080              JSR   COMPARE
      0080                DC    A'STRING1'
      0080                DC    A'STRING2'
                RETURN    .
                          .
                          .
Fragment 12.8
While this subroutine, unlike the previous one, uses about a dozen bytes just getting itself ready to
start, each call requires only seven bytes (three for the JSR, and two each for the parameters), while each
call to the previous routine required twelve bytes (three PEAs at three bytes each plus three for the JMP).
Apple Computer's ProDOS operating system takes this method a step further: all operating system
routines are called via a JSR to a single ProDOS entry point. One of the parameters that follow the JSR
specifies the routine to be called; the second parameter specifies the address of the routine's parameter
block. This method allows the entry points of the internal ProDOS routines to "float" from one version of
ProDOS to the next; user programs don't need to know where any given routine is located.
0000                      KEEP  KL.12.8
0000                      65816 ON
0000            COMPARE   START
0000  C230                REP   #$30          16-bit accumulator and index registers
0002                      LONGA ON
0002                      LONGI ON
0002  7A                  PLY                 address of last byte of the JSR
0003  C8                  INY                 point to first parameter
0004  B90000              LDA   !0,Y          address of string 1
0007  C8                  INY
0008  C8                  INY                 point to second parameter
0009  BE0000              LDX   !0,Y          address of string 2
000C  C8                  INY                 Y is now RETURN-1
000D  5A                  PHY                 push it back for the RTS
000E  A8                  TAY                 Y = address of string 1
000F  E220                SEP   #$20          8-bit accumulator for the comparison
0011                      LONGA OFF
0011  B90000    LOOP      LDA   !0,Y
0014  F009                BEQ   PASS
0016  DD0000              CMP   !0,X
0019  D006                BNE   FAIL
001B  C8                  INY
001C  E8                  INX
001D  80F2                BRA   LOOP
001F  18        PASS      CLC                 they match up to shortest string;
0020  60                  RTS
0021  BD0000    FAIL      LDA   !0,X
0024  F0F9                BEQ   PASS
0026  38                  SEC
0027  60        EXIT      RTS
0028                      END
Listing 12.8
                     Available on:
Mnemonic    6502    65C02    65802/816    Description
BRK          x        x          x        Break (software interrupt)
RTI          x        x          x        Return from Interrupt
NOP          x        x          x        No operation
SEC          x        x          x        Set carry flag
CLC          x        x          x        Clear carry flag
SED          x        x          x        Set decimal mode
CLD          x        x          x        Clear decimal mode
SEI          x        x          x        Set interrupt disable flag
CLI          x        x          x        Clear interrupt disable flag
CLV          x        x          x        Clear overflow flag
SEP                              x        Set status register bits
REP                              x        Clear status register bits
COP                              x        Co-processor or software interrupt
STP                              x        Stop the clock
WAI                              x        Wait for interrupt
WDM                              x        Reserved for expansion
Interrupts
An interrupt, as the name implies, is a disruption of the normal sequential flow of control, as modified
by the flow-altering statements such as branches and jump instructions encountered in the stream of code.
Hardware interrupts are generated when an external device causes one of the interrupt pins, usually
the IRQ or interrupt request pin, to be electrically pulled low from its normally high signal level. The typical
application of 65x interrupts is the implementation of an interrupt-driven I/O system, where input-output
devices are allowed to operate asynchronously from the processor. This type of system is generally considered
to be superior to the alternative type of I/O management system, where devices are polled at regular intervals to
determine whether or not they are ready to send or receive data; in an interrupt-driven system, I/O service only
claims processor time when an I/O operation is ready for service. Figure 13.1 illustrates how processor time is
spent under either system.
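[Figure 13.1: processor time plotted against TIME for two systems. POLLING: time is divided among I/O
service, the polling loop, and user task A. INTERRUPT-DRIVEN: time is divided between I/O service and user
task A.]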
Software interrupts are special instructions that trigger the same type of system behavior as occurs
during a hardware interrupt.
When an interrupt signal is received, the processor loads the program counter with the address stored in
one of the sixteen-bit interrupt vectors in page $FF of bank zero memory, jumping to the (bank zero) routine
whose address is stored there. (In the case of the 6502, 65C02, and 65802, "bank zero" refers to the lone
64K bank of memory addressable by these processors.) The routine that it finds there must determine the
nature of the interrupt and handle it accordingly.
When an interrupt is first received, the processor finishes the currently executing instruction and pushes
the double-byte program counter (which now points to the instruction following the one being executed when
the interrupt was received) and the status flag byte onto the stack. Since the 6502 and 65C02 have only a
sixteen-bit program counter, only a sixteen-bit program counter address is pushed onto the stack; naturally, this
is the way the 65802 and 65816 behave when in emulation mode as well. The native-mode 65802 and 65816
must (and do) also push the program counter bank register, since it is changed to zero when control is
transferred through the bank zero interrupt vectors.
As Figure 13.2 shows, in native mode the program bank is pushed onto the stack first, before the
program counter and the status register; in emulation mode it is lost. This means that if a 65816 program is
running in emulation mode in a bank other than zero when an interrupt occurs, there will be no way of knowing
where to return to after the interrupt is processed because the original bank will have been lost.
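The difference between the two modes can be summarized in a few lines of Python. This is a sketch with a hypothetical function name; it lists the bytes in the order they are pushed and models nothing else about the interrupt sequence.

```python
# Bytes stacked when an interrupt is taken, in push order (illustrative only).

def interrupt_pushes(pbr, pc, p, native):
    """Return the list of bytes pushed, first pushed first."""
    pushed = []
    if native:
        pushed.append(pbr & 0xFF)   # program bank: saved in native mode only
    pushed.append((pc >> 8) & 0xFF)  # program counter high
    pushed.append(pc & 0xFF)         # program counter low
    pushed.append(p & 0xFF)          # status register
    return pushed


native_case = interrupt_pushes(0x02, 0x1234, 0x30, native=True)
emulation_case = interrupt_pushes(0x02, 0x1234, 0x30, native=False)
```

In the emulation case the bank value $02 appears nowhere in the result, which is why code interrupted while running in emulation mode outside bank zero cannot be resumed from the stack alone.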
[Figure: on an interrupt, emulation mode pushes PC High, PC Low, and Status (P) onto the stack in bank 0;
native mode pushes Program Bank (PBR), PC High, PC Low, and Status (P). The stack pointer (S) is shown after
the pushes.]
Figure 13-2 Interrupt Processing
Just as the two types of interrupt have their own signals and pins, they also have their own vectors:
locations where the address of the interrupt-handling routine is stored. As Table 13.2 shows, on the 65802
and 65816 there are two sets of interrupt vectors: one set for when the processor is in emulation mode, and
one set for when the processor is in native mode. Needless to say, the locations of the emulation mode
vectors are identical to the locations of the 6502 and 65C02 vectors.
Emulation mode, e = 1           Native mode, e = 0
00FFFE,FF - IRQ/BRK             00FFEE,EF - IRQ
00FFFC,FD - RESET
00FFFA,FB - NMI                 00FFEA,EB - NMI
00FFF8,F9 - ABORT               00FFE8,E9 - ABORT
                                00FFE6,E7 - BRK
00FFF4,F5 - COP                 00FFE4,E5 - COP
As you can see in Table 13.2, there are several other vector locations named in addition to IRQ and
NMI. Note that there is no native mode RESET vector: RESET always forces the processor to emulation
mode. Also note that the IRQ vector among the 6502 vectors is listed as IRQ/BRK, while in the
65802/65816 native mode list, each has a separate vector.
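As an illustration, the vector locations can be written as a lookup table. The Python sketch below is a model only: the names VECTORS and handler_address are hypothetical, and memory is represented as a dictionary of bank-zero addresses.

```python
# Bank-zero vector locations by mode, and a lookup of the handler address.

VECTORS = {
    "emulation": {
        "COP": 0xFFF4, "ABORT": 0xFFF8, "NMI": 0xFFFA,
        "RESET": 0xFFFC, "IRQ": 0xFFFE, "BRK": 0xFFFE,
    },
    "native": {
        "COP": 0xFFE4, "BRK": 0xFFE6, "ABORT": 0xFFE8,
        "NMI": 0xFFEA, "IRQ": 0xFFEE,
    },
}


def handler_address(memory, source, emulation):
    """Read the 16-bit handler address (stored low byte first) for an interrupt source."""
    # RESET always forces emulation mode, so it has only an emulation vector.
    mode = "emulation" if emulation or source == "RESET" else "native"
    location = VECTORS[mode][source]
    return memory[location] | (memory[location + 1] << 8)
```

Note that IRQ and BRK share one entry in the emulation table but have separate entries in the native table.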
The BRK and COP vectors are for handling software interrupts. A software interrupt is an
instruction that imitates the behavior of a hardware interrupt by stacking the program counter and the status
register, and then branching through a vector location. On the 6502 and 65C02, the location jumped to in
response to the execution of a BRK (a software interrupt) and the location to which control is transferred
after an IRQ (a hardware interrupt) is the same; the interrupt routine itself must determine the source of
the interrupt (that is, either software or hardware) by checking the value of bit four of the processor
status register pushed onto the stack. On the 6502 and 65C02 (and the 6502 emulation mode of the 65802 and
65816), bit four is the b break flag. Note first that this is not true of the 65816 native mode, since bit
four of its status register is the x index register select flag. Secondly, notice that it is the stacked
status byte which must be checked, not the current status byte.
Suppose, for example, that the IRQ/BRK vector at $00:FFFE.FF contains the address $B100
(naturally, in the low-high order all 65x addresses are stored in), and the code in Fragment 13.1 is stored starting
at $B100. When a BRK instruction is executed, this routine distinguishes it from a hardware interrupt and
handles each uniquely.
0000                      ORG   $B100
0000            IRQBRKIN  START
0000  8D1000              STA   SAVEA
0003  68                  PLA
0004  48                  PHA
0005  2910                AND   #%00010000
0007  D0F7                BNE   ISBRK
0009            ;
0009                      .
0009                      .
0009                      .
0009                      .
0009  4C0C00              JMP   RETURN
000C                      .
000C                      .
000C                      .
000C  AD1000    RETURN    LDA   SAVEA
000F  40                  RTI
0010  00        SAVEA     DS    1
0011                      END
Fragment 13.1
There is another key difference between the RTI and the RTS or RTL: RTS and RTL increment the
return address after pulling it off the stack and before loading it into the program counter; RTI on the other
hand loads the program counter with the stack return address unchanged.
RTI will probably not function correctly in the special case where an interrupt occurred while code was
executing in the emulation mode in a non-zero bank: RTI will try to return control to an address within the
bank the RTI is executed in, which will probably not be the correct bank because (as on the 6502 and 65C02)
the bank address is not stacked. As mentioned earlier, the only way to deal with this is to save the bank
address prior to entering emulation mode. When the interrupt handler returns, it should use this saved bank
address to execute a long jump to an RTI instruction stored somewhere within the return bank; the long jump
will preset the program bank address to the correct value before the RTI is executed.
The interrupt handler itself should enter the native mode if interrupts are to be reenabled before exiting
in order to avoid the same problem, then return to emulation mode before exiting via the long jump to the
RTI instruction.
Concerning the BRK instruction, you should also note that although its second byte is basically a "don't
care" byte (that is, it can have any value), the BRK (and the COP instruction as well) is a two-byte
instruction; the second byte is sometimes used as a signature byte to determine the nature of the BRK being
executed. When an RTI instruction is executed, control always returns to the second byte past the BRK opcode.
Figure 13.3 illustrates a stream of instructions in hexadecimal form: the BRK instruction, its signature
byte, and the location an RTI returns to. The BRK instruction has been inserted in the middle; after the BRK
is processed by a routine (such as the skeleton of a routine described above), control will return to the BCC
instruction, which is the second byte past the BRK opcode.
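The return location can be checked against the byte stream used in the figure. The following Python lines are a sketch only; the function name is hypothetical and addresses are simply indexes into the byte string.

```python
# Where RTI resumes after a BRK: two bytes past the BRK opcode (illustrative only).

def rti_return_after_brk(brk_addr):
    """BRK stacks the address two bytes past its opcode; RTI loads it unchanged."""
    return (brk_addr + 2) & 0xFFFF


# LDA $44, BRK, signature byte, BCC with displacement $32
stream = bytes([0xA5, 0x44, 0x00, 0x00, 0x90, 0x32])
brk_at = 2                                   # index of the BRK opcode
resumes_at = rti_return_after_brk(brk_at)    # index of the BCC opcode
```

The byte at index 3, the signature byte, is skipped over and never executed.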
The fact that the opcode for the BRK instruction is 00 is directly related to one of its uses: patching
existing programs. Patching is the process of inserting instruction data in the middle of an existing program
in memory to modify (usually to correct) the program without reassembling it. This is a favored method of
some programmers in debugging and testing assembly language programs, and is quite simple if you have a good
machine-level monitor program that allows easy examination and modification of memory locations. However,
if the program to be patched is stored in PROM (programmable read-only memory), the only way to modify a
program that has already been "burned-in" is to change any remaining one bits to zeros. Once a PROM bit has
been "blown" to zero, it cannot be restored to a one. The only way to modify the flow of control is to insert
BRK instructions (all zeroes) at the patch location and to have the BRK handling routine take control from
there.
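The one-way nature of PROM programming can be expressed as a simple bit test. The Python sketch below uses a hypothetical function name and assumes the old and new contents are byte strings of equal length.

```python
# Can `new` be burned over `old` when bits can only change from one to zero?

def can_patch(old, new):
    """True if every byte of new is reachable from old by clearing bits only."""
    return len(old) == len(new) and all((o & n) == n for o, n in zip(old, new))


lda_dp = bytes([0xA5, 0x44])     # an existing two-byte instruction
brk_pair = bytes([0x00, 0x00])   # BRK plus a zero signature byte
```

Any existing bytes can be overwritten with zeroes, so can_patch(lda_dp, brk_pair) is true, while going the other way, or burning an arbitrary new opcode over an old one, generally is not possible.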
A5 44       LDA $44
00          BRK instruction
00          optional signature byte
90 32       BCC instruction
Figure 13.3
Processing Interrupts
Before an interrupt handling routine can perform a useful task, it must first know what is expected of it.
The example of distinguishing a BRK from an IRQ is just a special case of the general problem of identifying
the source of an interrupt. The fact that different vectors exist for different types of interrupts (for
example, NMI would usually be reserved for some catastrophic type of interrupt, like "power failure
imminent," which demanded immediate response) solves the problem somewhat. Typically, however, in an
interrupt-driven system there will be multiple sources of interrupts through a single vector. The 65802 and
65816, when in native mode, eliminate the need for a routine to distinguish between IRQ and BRK, such as the
one above, by providing a separate BRK vector, as indicated in Table 13.2. Although this does simplify
interrupt processing somewhat, it was done primarily to free up bit four in the status register to serve as
the native-mode index register select flag, which determines the size of the index registers.
The interrupt source is generally determined by a software technique called polling: when an interrupt
occurs, all of the devices that are known to be possible sources of interrupts are checked for an indication
that they were the source of the interrupt. (I/O devices typically have a status bit for this purpose.) A
hardware solution also exists, which is to externally modify the value that is apparently contained in the
vector location depending on the source of the interrupt. The 65816 aids the implementation of such systems
by providing a VECTOR PULL signal, which is asserted whenever the interrupt vector memory locations are
being accessed in response to an interrupt.
A simple example of the polling method could be found in a system that includes the 6522 Versatile
Interface Adapter as one of its I/O controllers. The 6522 is a peripheral control IC designed for hardware
compatibility with the 65x processor family. The 6522 includes two parallel I/O ports and two timer/counters.
It can be programmed to generate interrupts in response to events such as hardware handshaking signals,
indicating that data has been read or written to its I/O ports, or to respond to one of its countdown timers
reaching zero. The 6522 contains sixteen different control and I/O registers, each of which is typically mapped
to an adjacent address in the 65x memory space. When an interrupt occurs, the processor must poll the
interrupt flag register, shown in Figure 13.4, to determine the cause of the interrupt.
Bit   Flag        Set by                   Cleared by
0     CA2         CA2 active edge          Read or write Reg. 1 (ORA)
1     CA1         CA1 active edge          Read or write Reg. 1 (ORA)
2     SHIFT REG   Complete 8 shifts        Read or write Shift Reg.
3     CB2         CB2 active edge          Read or write ORB
4     CB1         CB1 active edge          Read or write ORB
5     TIMER2      Time-out of T2           Read T2 low or write T2 high
6     TIMER1      Time-out of T1           Read T1 low or write T1 high
7     IRQ         Any enabled interrupt    Clear all interrupts
Figure 13.4 (6522 interrupt flag register)
If register zero of the 6522 is mapped to location $FF:B080 of a 65816 system, for example, the
interrupt flag register would normally be found at $FF:B08D. The polling routine in Fragment 13.2 would be
needed whenever an interrupt occurred. To keep the example simple, assume that only the two timer interrupts
are enabled (for example, timer 1 to indicate, in a multi-tasking system, that a given process time-slice has
expired and the next process must be activated; timer 2, on the other hand, to maintain a time-of-day clock).
                IRQIN     START
      E220                SEP   #$20          8-bit accumulator
                          LONGA OFF
      8D1B00              STA   SAVEA
      AF8DB0FF            LDA   $FFB08D
      10F5                BPL   NEXTDEV
      0A                  ASL   A
      0A                  ASL   A
      30F1                BMI   TIMER2
                ;
                ; timer2 didn't cause interrupt; timer1?
      90EF                BCC   ERROR
                          .
                          .
                          .
      8004                BRA   RETURN
                TIMER2    .
                          .
                          .
      8002                BRA   RETURN
                NEXTDEV   .
                          .
                          .
      8000                BRA   RETURN
                ERROR     .
                          .
                          .
      AD1B00    RETURN    LDA   SAVEA
      40                  RTI
      00        SAVEA     DS    1
                          END
Fragment 13.2
When the interrupt flag register is loaded into the accumulator, the first thing to check is whether or not
bit seven is set; bit seven is set whenever any enabled 6522 interrupt is pending. If it is clear, then the
interrupt handler branches to the location NEXTDEV, which polls all other connected I/O devices looking for
the interrupt.
If the 6522 was the source of the interrupt, then two shifts move the flag register's bit six into the
carry and bit five into bit seven of the accumulator. Since bit five is set by the time-out of timer 2, if
the high-order bit of the accumulator is set (minus), then the source of the interrupt must be timer 2. If
timer 2 did not cause the interrupt, then the carry flag is checked; if it's set, then timer 1 caused the
interrupt; if it's clear, then timer 1 didn't cause it either, so there has been some kind of error.
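The shift-and-branch decoding can be traced with a short Python model. This is a sketch only: the function names are hypothetical, the returned strings are simply the labels branched to in the fragment, and the accumulator is modeled as an 8-bit integer.

```python
# Model of decoding the 6522 interrupt flag register with two left shifts.

def asl(a):
    """Arithmetic shift left of an 8-bit value; returns (result, carry out)."""
    return (a << 1) & 0xFF, (a >> 7) & 1


def irq_source(ifr):
    """Return the label the handler would branch to for a given flag register value."""
    if ifr & 0x80 == 0:      # bit seven clear: this 6522 did not interrupt
        return "NEXTDEV"
    a, carry = asl(ifr)      # old bit six moves to bit seven
    a, carry = asl(a)        # old bit six is now the carry, old bit five is bit seven
    if a & 0x80:             # minus: timer 2 timed out
        return "TIMER2"
    if carry == 0:           # neither timer flag was set
        return "ERROR"
    return "TIMER1"
```

Because timer 2 is tested first, a value with both timer flags set is reported as timer 2; timer 1 would be seen on the next interrupt.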
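Stack High                  01
Direct Page Register        0000
X Register High             00
Y Register High             00
Program Bank Register       00
Data Bank Register          00
Status Register             m = 1, x = 1, d = 0, i = 1
Emulation Flag              1
Table 13-3 Reset Initialization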
In addition to the BRK, IRQ, RESET, and NMI vectors discussed, there are two remaining
interrupt-like vectors. These are the COP (co-processor) and ABORT vectors. The COP vector is essentially
a second software interrupt, similar to BRK, with its own vector. Although it can be used in a manner similar
to BRK, it is intended particularly for use with co-processors, such as floating-point processors. Like BRK,
it is a two-byte instruction with the second byte available as a signature byte.
The ABORT vector contains the address of the routine which gains control when the 65816 ABORT
signal is asserted. Prior to transferring control through the ABORT vector, the current instruction is
completed but no registers are modified. The program bank, program counter, and status register are pushed
onto the stack in the same manner as an interrupt. The ABORT signal itself is only available on the 65816;
although the 65802 has an ABORT vector, it is ineffective since no ABORT signal can be generated because of
the need for the 65802 to be pin-compatible with the 6502. A typical application of the abort feature is the
implementation of hardware memory-management schemes in more sophisticated 65816 systems. When a
memory-bounds violation of some kind is detected by external logic, the ABORT signal is asserted, letting the
operating system attempt to correct the memory-management anomaly before resuming execution.
    08         PHP                  save status
    78         SEI                  disable interrupts
               .
               .                    execute time-critical code
               .
    28         PLP                  done; restore status, enable interrupts
Fragment 13.3
Since the interrupt disable flag was clear when the PHP instruction was executed, the PLP instruction restores
the cleared flag. This same technique is also useful when mixing subroutine calls to routines with different
default modes for accumulator and index register sizes; since saving the status with PHP is a common operation
between subroutine calls anyway, the PLP instruction can be used to conveniently restore operating modes as
well as status flags.
Finally, there is a CLV (clear overflow flag) instruction. There is no corresponding set overflow instruction, and,
as you will recall from the chapter on arithmetic, the overflow flag does not need to be explicitly cleared before
a signed operation. Arithmetic operations always change the overflow status to correctly reflect the result.
The reason for including an explicit CLV instruction in the 65x repertoire is that the 6502, 65C02, and 65802
have a SET OVERFLOW input signal; external hardware logic can set the overflow flag of the status register
by pulling the SET OVERFLOW input low. Since there is no corresponding clear overflow input signal, the
overflow must be cleared in software in order to regain susceptibility to the SET OVERFLOW signal.
The practical application of the SET OVERFLOW signal is generally limited to dedicated control
applications; it is rarely connected on general-purpose, 6502-based computer systems. On the 65816, there is
no SET OVERFLOW input; it was sacrificed to make room for some of the more generally useful new signals
available on the 65816 pin configuration.
No Operation Instructions
The final two instructions to complete the 65816 instruction set are the no operation instructions. These
do exactly what they sound like: nothing. They are used as place holders, or time-wasters; often they are used
to patch out code during debugging. The NOP instruction, with a hexadecimal value of $EA, is the standard
no operation.
As mentioned in the earlier architecture chapters, the 6502 and 65C02 have a number of unimplemented
instructions: the same opcodes which, on the 65802 and 65816, correspond to the new instructions. On the
6502, the operation of the processor when these instructions are executed is undefined; some of them cause
the processor to hang up. On the 65C02, these are all well-behaved no-operations of either one, two, or
more cycles. On the 65802 and 65816, there is only one unimplemented instruction, defined as WDM; this is
reserved for future systems as an escape prefix to expand the instruction set with sixteen-bit opcodes. For this
reason, it should not be used in your current programs, as it will tend to make them incompatible with future
generations of the 65816.
Part 4
Applications
Multiplication
Probably the most common multiply routine written for eight-bit applications is one that multiplies one sixteen-bit
number by another, returning a sixteen-bit result. While multiplying two large sixteen-bit numbers would
yield a 32-bit result, much of systems programming is done with positive integers limited to sixteen bits, which
is why this multiply example is so common. Be aware that a result over sixteen bits cannot be generated by the
examples as coded; you'll have to extend them if you need to handle larger numbers.
There are several methods for the sixteen-by-sixteen multiply, but all are based on the multiplication
principles for multi-digit numbers you were taught in grade school: multiply the top number by the right-most
digit of the bottom number; move left, digit by digit, through the bottom number, multiplying it by the top
number, each time shifting the result product left one more space and adding it to the sum of the previous
products:
           2344
        x 12211
        -------
           2344
          2344
         4688
        4688
       2344
       --------
       28622584
Or to better match the description:
           2344
        x 12211
        -------
           2344
        + 2344
          25784       sum of products so far
       + 4688
         494584       sum of products so far
      + 4688
        5182584       sum of products so far
     + 2344
       28622584       final product (sum of all single-digit multiplies)
          101
       x 1010
       ------
          000
         101
        000
       101
       ------
       110010

            5
         x 10
         ----
            0
           5
           50 decimal
To have the computer do it, you have it shift the bottom operand right; if it shifts out a zero, you need
do nothing, but if it shifts out a one, you add the top number to the partial product (which is initialized at zero).
Then you shift the top number left for the possible add during the next time through this loop. When there are
no more ones in the bottom number, you are done.
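The loop just described can be sketched in C before looking at the assembly versions. This is only an illustration under stated assumptions: the function name mult16 is hypothetical, and the parameter names merely echo the MCAND1 and MCAND2 labels of the listings. As with the listings, any part of the product above sixteen bits is silently discarded.

```c
#include <stdint.h>

/* 16 x 16 -> 16-bit shift-and-add multiply.
   mcand1 plays the role of the "bottom" operand that is shifted right;
   mcand2 is the "top" operand that is shifted left and conditionally added. */
static uint16_t mult16(uint16_t mcand1, uint16_t mcand2)
{
    uint16_t product = 0;                 /* partial product starts at zero */

    while (mcand1 != 0) {                 /* done when no ones remain */
        if (mcand1 & 1)                   /* a one was shifted out: add */
            product = (uint16_t)(product + mcand2);
        mcand1 >>= 1;                     /* shift bottom operand right */
        mcand2 = (uint16_t)(mcand2 << 1); /* shift top operand left */
    }
    return product;
}
```

For the binary example above, mult16(5, 10) walks through the same four steps and yields 50.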
6502 Multiplication
With only three eight-bit registers, you can't pass two sixteen-bit operands to your multiply routine in
registers. One solution, the one used below, is to pass one operand in two direct page (zero page) bytes, while
passing the other in two more; the result is returned in two of the 6502's registers. All this is carefully
documented in the header of the routine in Listing 14.1.
This 6502 multiply routine takes 33 bytes.
65C02 Multiplication
With the same three eight-bit registers as the 6502, and an instruction set only somewhat enhanced, the
65C02 multiply routine is virtually the same as the 6502's. Only one byte can be saved by the substitution of an
unconditional branch instruction for the jump instruction, for a total byte count of 32.
                        KEEP  KL.14.1
               ;
               ;  16 by 16 = 16 multiply for the 6502
               ;  operand 1: sixteen bits in direct page location MCAND1
               ;  operand 2: sixteen bits in direct page location MCAND2
               ;  result:    sixteen bits returned in X (hi) and Y (lo)
               ;
               MULT     START
               MCAND1   GEQU  $80
               MCAND2   GEQU  $82
    A200                LDX   #0
    A000                LDY   #0
    A580       MULT1    LDA   MCAND1         operand 1 (lo)
    0581                ORA   MCAND1+1       operand 1 (hi)
    F016                BEQ   DONE           if 16-bit operand 1 is 0, done
    4681                LSR   MCAND1+1
    6680                ROR   MCAND1
    9009                BCC   MULT2
    18                  CLC
    98                  TYA
    6582                ADC   MCAND2
    A8                  TAY
    8A                  TXA
    6583                ADC   MCAND2+1
    AA                  TAX
    0682       MULT2    ASL   MCAND2
    2683                ROL   MCAND2+1
    4C0400              JMP   MULT1
    60         DONE     RTS
                        END
Listing 14.1
                        KEEP  KL.14.2
                        65816 ON
               ;
               ;  16 by 16 = 16 multiply
               ;  for 65802/65816 microprocessors in native mode with
               ;  index registers and accumulator already set to 16 bits
               ;  operand 1: sixteen bits in direct page location MCAND1
               ;  operand 2: sixteen bits in direct page location MCAND2
               ;  result:    sixteen bits returned in accumulator
               ;
               MULT     START
               MCAND1   GEQU  $80
               MCAND2   GEQU  $82
    18                  CLC
    FB                  XCE
    C230                REP   #$30
                        LONGA ON
                        LONGI ON
    A90000              LDA   #0             initialize result
    A680       MULT1    LDX   MCAND1         get operand 1
    F00B                BEQ   DONE           if operand 1 is zero, done
    4680                LSR   MCAND1         get right bit, operand 1
    9003                BCC   MULT2          if clear, no addition to previous products
    18                  CLC                  else add oprd 2 to partial result
    6582                ADC   MCAND2
    0682       MULT2    ASL   MCAND2         now shift oprd 2 left for poss add next time
    80F1                BRA   MULT1
    38         DONE     SEC
    FB                  XCE
    60                  RTS
                        END
Listing 14.2
Division
Probably the most common division routine written for eight-bit applications is the converse of the
multiply routine just covered: to divide one sixteen-bit number by another sixteen-bit number, returning both a
sixteen-bit quotient and a sixteen-bit remainder.
There are several methods for doing this, but all are based on the division principles for multi-digit
numbers that you learned in grade school. Line up the divisor under the left-most set of digits of the dividend,
appending an imaginary set of zeroes out to the right, and subtract as many times as possible. Record the
number of successful subtractions; then shift the divisor right one place and continue until the divisor is flush
right with the dividend, and no more subtractions are possible. Any non-subtractable value remaining is called
the remainder.
              12211  remainder 1
    2344 ) 28622585
          -2344
            5182585
           -2344
            2838585
           -2344
             494585
            -2344
             260185
            -2344
              25785
             -2344
               2345
              -2344
                  1
Binary division is even easier since, with only ones and zeroes, subtraction is possible at each digit
position either only once or not at all:
            1100  remainder 1
    101 ) 111101
         -101
           10101
          -101
              01

        12  remainder 1
    5 ) 61
       -5
        11
        -5
         6
        -5
         1
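The binary procedure can be sketched in C as follows. This is a minimal illustration, not one of the book's routines: the names divresult and div16 are hypothetical, and the handling of a zero divisor (quotient 0, remainder equal to the dividend) is an assumption made only so the loop terminates.

```c
#include <stdint.h>

/* Quotient and remainder of a 16-bit division (hypothetical helper type). */
typedef struct { uint16_t quotient; uint16_t remainder; } divresult;

/* Shift-and-subtract division: line the divisor up at the left, then
   try one subtraction per bit position while shifting it back right. */
static divresult div16(uint16_t dividend, uint16_t divisor)
{
    divresult r = { 0, dividend };
    int count = 1;                        /* number of subtract attempts */

    if (divisor == 0)                     /* assumption: no divide by zero */
        return r;
    while ((divisor & 0x8000) == 0) {     /* shift divisor flush left */
        divisor <<= 1;
        count++;
    }
    while (count-- > 0) {
        r.quotient <<= 1;                 /* make room for the next bit */
        if (r.remainder >= divisor) {     /* subtraction possible once... */
            r.remainder -= divisor;
            r.quotient |= 1;              /* ...so record a one */
        }                                 /* ...or not at all: record a zero */
        divisor >>= 1;                    /* move divisor one place right */
    }
    return r;
}
```

Dividing 61 by 5 with this sketch reproduces the worked example: a quotient of 12 (binary 1100) and a remainder of 1.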
6502 Division
The 6502, with its three eight-bit registers, handles passing parameters to and from a division routine
even less smoothly than to and from a multiplication routine: not only do you need to pass it two sixteen-bit
values, but it needs to pass back two sixteen-bit results.
The solution used in Listing 14.3 is to pass the dividend and the divisor in two direct page double bytes,
then pass back the remainder in a direct page double byte and the quotient in two registers.
                        KEEP  KL.14.3
               ;
               ;  16 by 16 divide for the 6502
               ;  dividend in direct page location DIVDND; divisor in DIVSOR
               ;  quotient returned in X (hi) and A (lo); remainder in DIVDND
               ;
               DIV      START
               DIVDND   GEQU  $80
               DIVSOR   GEQU  $82
    A900                LDA   #0
    AA                  TAX                  quotient (hi) in X
    48                  PHA                  quotient (lo) on stack
    A001                LDY   #1             shift count
    A583                LDA   DIVSOR+1       high bit of divisor already set?
    300B                BMI   DIV2
    C8         DIV1     INY
    0682                ASL   DIVSOR
    2683                ROL   DIVSOR+1
    3004                BMI   DIV2
    C011                CPY   #17
    D0F5                BNE   DIV1
    38         DIV2     SEC
    A580                LDA   DIVDND
    E582                SBC   DIVSOR
    48                  PHA
    A581                LDA   DIVDND+1
    E583                SBC   DIVSOR+1
    9006                BCC   DIV3
               ;  else carry is set to shift into quotient
    8581                STA   DIVDND+1
    68                  PLA
    8580                STA   DIVDND
    48                  PHA
    68         DIV3     PLA
    68                  PLA
    2A                  ROL   A
    48                  PHA
    8A                  TXA
    2A                  ROL   A
    AA                  TAX
    4683                LSR   DIVSOR+1
    6682                ROR   DIVSOR
    88                  DEY
    D0E0                BNE   DIV2
    68         DONE     PLA
    60                  RTS
                        END
Listing 14.3
65C02 Division
The 65C02 routine is virtually the same; only three early instructions (shown in Fragment 14.1) in the
6502 routine are changed to the code in Fragment 14.2, for a net savings of one byte, because the 65C02 has
instructions to push the index registers. This 65C02 divide routine takes 54 bytes, one byte fewer than the 6502
divide routine takes.
    0000  A900       LDA   #0
    0002  AA         TAX
    0003  48         PHA
Fragment 14.1
    0000  A200       LDX   #0
    0002  DA         PHX
Fragment 14.2
65802/65816 Division
The 65802 and 65816 processors, with their registers extendable to sixteen bits, can handle sixteen-bit
division with ease. In the divide routine in Listing 14.4, the dividend and the divisor are passed in sixteen-bit
registers X and A respectively; the quotient is passed back in a sixteen-bit direct page location and the
remainder in X.
                        KEEP  KL.14.4
                        65816 ON
               ;
               ;  16 by 16 divide for the 65802/65816 with sixteen-bit registers
               ;  dividend in X; divisor in A
               ;  quotient returned in direct page location QUOTNT; remainder in X
               ;
               DIV      START
               QUOTNT   GEQU  $80
                        LONGA ON
                        LONGI ON
    6480                STZ   QUOTNT         initialize quotient to 0
    A00100              LDY   #1             initialize shift count to 1
    0A         DIV1     ASL   A
    B006                BCS   DIV2
    C8                  INY
    C01100              CPY   #17
    D0F7                BNE   DIV1
    6A         DIV2     ROR   A
    48         DIV4     PHA                  push divisor
    8A                  TXA                  get dividend into accumulator
    38                  SEC
    E301                SBC   1,S            subtract divisor from dividend
    9001                BCC   DIV3           bra if can't subtract; dividend still in X
    AA                  TAX                  store new dividend; carry=1 for quotient
    2680       DIV3     ROL   QUOTNT         shift carry into quotient (1 for divide, 0 for not)
    68                  PLA                  pull divisor
    4A                  LSR   A              shift divisor right for next subtract
    88                  DEY                  decrement count
    D0F1                BNE   DIV4           branch to repeat unless count is 0
    60                  RTS
                        END
Listing 14.4.
    0000  38         SEC
    0001  FB         XCE
    0002  200080     JSR   D06502
Fragment 14.3
Although this will work fine in some cases, it is not guaranteed to. In order to be assured of correct
functioning of an existing 6502 routine, the direct page register must be reset to zero and the stack pointer must
be relocated to page one. Although a 6502 program that uses zero page addressing will technically function
correctly if the direct page has been relocated, the possibility that the zero page may be addressed using some
form of absolute addressing, not to mention the probability that an existing 6502 monitor or operating system
routine would expect to use values previously initialized and stored in the zero page, requires that this register
be given its default 6502 value.
If the stack has been relocated from page one, it will be lost when the switch to emulation mode
substitutes the mandatory stack high byte of one. So first, the sixteen-bit stack pointer must be saved. Second,
if the 65802/65816 program was called from a 6502 environment, then there may be 6502 values on the original
6502 page-one stack; such a program must squirrel away the 6502 stack pointer on entry so it can be restored on
exit, as well as used during temporary incursions, such as this routine, into the 6502 environment.
The goal, then, is this: provide a mechanism whereby a programmer may simply pass the address of a
resident 6502 routine and any registers required for the call to a utility which will transfer control to the 6502
routine; the registers should be returned with their original (potentially sixteen-bit) values intact, except as
modified by the 6502 routine; and finally the operating mode must be restored to its state before the call.
When loading the registers with any needed parameters, keep in mind that only the low-order values
will be passed to a 6502 subroutine, even though this routine may be entered from either eight- or sixteen-bit
modes.
The call itself is simple; you push the address of the routine to be called, minus one, onto the stack,
typically using the PEA instruction. Then you call the routine, which executes the subroutine call and manages
all of the necessary housekeeping. Fragment 14.4 gives an example of calling the routine.
    A94100     LDA   #'A'           character to be printed
    F4ECFD     PEA   $FDED-1        routine to be called
    200080     JSR   JSR6502
Fragment 14.4
$FDED is the address of an existing Apple II routine to print characters, and JSR6502 is the routine described
in Listing 14.5.
                        KEEP  KL.14.5
                        65816 ON
               JSR6502  START
    08                  PHP                  save status, including m and x
    C230                REP   #$30           sixteen-bit accumulator and index registers
                        LONGA ON
                        LONGI ON
    DA                  PHX
    5A                  PHY
    0B                  PHD
    48                  PHA
               ; set up page-1 stack ptr, saving 65802 stack ptr in DP & on new stack
    3B                  TSC                  save old stack pointer in
    5B                  TCD                  direct page register
    2900FF              AND   #$FF00         mask stack pointer to examine high byte
    C90001              CMP   #$100
    F004                BEQ   USESTK         branch if stack already in page 1
    AD4F00              LDA   STK6502        else retrieve safe 6502 stack pointer
    1B                  TCS                  and load stack pointer with it
    0B         USESTK   PHD                  push old stack pointer onto new stack
               ; set up a return-to-this-code return address on new stack
               ; (direct page register points to old stack with orig accum at 1)
    F42700              PEA   RETURN-1
    D40C                PEI   (12)
    A50A                LDA   10
    850C                STA   12
    A501                LDA   1
    F40000              PEA   0
    2B                  PLD
    38                  SEC
    FB                  XCE
                        LONGA OFF
                        LONGI OFF
    60                  RTS
               ;
               ; 6502 routine returns here
    08         RETURN   PHP
    EB                  XBA
    68                  PLA
    2B                  PLD
    29CF                AND   #$CF           keep 6502 status except the m and x bit positions
    850B                STA   11
    A509                LDA   9              original status
    2930                AND   #$30           keep only its m and x bits
    050B                ORA   11
    850B                STA   11
    EB                  XBA
    8501                STA   1
    8405                STY   5
    8607                STX   7
    18                  CLC
    FB                  XCE
    C230                REP   #$30
                        LONGA ON
                        LONGI ON
    0B                  PHD
    FA                  PLX
    9A                  TXS
    68                  PLA                  copy accum to free stack bytes @ dp:9.10
    8509                STA   9
               ; stack was moved by PLA, but DP was not
               ; pull registers from stack
    2B                  PLD
    7A                  PLY
    FA                  PLX
    68                  PLA
    28                  PLP
    60                  RTS                  done!
               ;
    8001       STK6502  DC    A'$180'
                        END
Listing 14.5
The routine is entered with the return address on the top of the stack, and the go-to address of the 6502
routine at the next location on the stack. Since you want to be able to restore the m and x mode flags, the first
thing the routine does is push the status register onto the stack. The REP #$30 instruction, which follows, puts
the processor into a known state, since the routine can be called from any of the four possible register-size
modes. The long accumulator, long index mode is the obvious choice because it encompasses all the others.
The user registers, including the direct page register, are saved on the stack, and then the stack pointer itself is
saved to the direct page register via the accumulator. This has two benefits: it preserves the value of the old
stack pointer across a relocation of the stack, and provides a means of accessing all of the data on the old stack
after it has been relocated. This technique is of general usefulness, and should be understood clearly. Figure
14.1, which shows the state of the machine after line 0034 (the PEI instruction), helps make this clear.
The stack must be relocated to page one only if it is not already there. If it is elsewhere, then the last
6502 page-one stack pointer should be restored from where it was cubbyholed when the 65802/65816 program
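[Figure 14.1: stack diagram. Old stack, addressed through the direct page register, from the top down: address of 6502 routine ($FDEC), return address, P register, X register (XH, XL), Y register (YH, YL), direct page register (DPH, DPL), accumulator (AH, AL). New stack, from the top down: old stack pointer ($4020), return address, address of 6502 routine ($FDEC). The direct page register holds the old stack pointer.]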
The accumulator was used during these operations, and must be restored because it may contain one of
the parameters required by the 6502 routine. Like the go-to address, the accumulator is loaded from the old
stack using direct page addressing.
Having restored the accumulator, all that remains is to set the direct page register to zero; since no
registers can be modified at this point, this is accomplished by pushing a zero onto the stack, and then pulling it
into the direct page register.
When you switch the processor into emulation mode, the environment is as it should be; the new stack
is now set up to transfer control to the 6502 subroutine via the execution of an RTS instruction which, rather
than exiting the JSR6502 routine, performs a kind of jump indirect to the value on top of the stack, the go-to
address. The use of the RTS to transfer control to the 6502 routine is the reason the address minus one is put on
the stack to begin with. This requirement could be eliminated if the go-to address was decremented before
being pushed on the page one stack; but this would require the execution of two additional instructions, one to
load it into a register, and one to decrement. PEI moves the value directly onto the stack from the direct page.
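                        KEEP  KL.14.6
                        65816 ON
                        LONGA OFF
                        LONGI OFF
               ;
               ;  CHECK: determine which 65x processor is executing
               ;  returns with minus set: 6502
               ;               minus clear, carry clear: 65C02
               ;               minus clear, carry set: 65802/65816
               ;
               CHECK    START
    F8                  SED                  decimal mode
    A999                LDA   #$99
    18                  CLC
    6901                ADC   #$01           6502 leaves the negative flag set here
    3006                BMI   DONE
    18                  CLC
    FB                  XCE                  no operation on the 65C02
    9002                BCC   DONE           carry still clear: 65C02
    FB                  XCE                  65802/65816: back to emulation mode
    38                  SEC
    D8         DONE     CLD
    60                  RTS
                        END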
Listing 14.6
The first executable statement is the assignment of the string constant "A string to invert" to the
variable y. In this context, the y appears without the asterisk, because the variable is being given a value (an
address) rather than the string it points to. The C compiler always returns the address of a string and
zero-terminates it when it encounters a string constant.
The next statement is a call to the function invert with a parameter of y (which is the variable that just
received a value in the preceding statement). Invert is the function that actually does the work of this program,
which, as you may have guessed by now, prints an inverted (backwards) string.
After the closing brace for main comes the declaration of the function invert. Invert takes one parameter:
a pointer to a character. When invert is called from main with y as the parameter, yy assumes the value of y.
The code of invert tests the value pointed to by yy; the first time invert is called, this will be the letter
"A", the first character in the string constant. The test is whether the value at yy is non-zero; if it
is non-zero, the statements within the braces will be executed. If (or when) the value is equal to zero, the code
within the braces is skipped.
Looking at the first of the pair of lines contained within the braces, you will find that it is a call to
invert, the same function presently being defined. This calling of a routine from within itself is called
recursion, and programming languages such as C or Pascal, which allocate their local variables on the stack,
make it easy to write recursive programs such as this one. The merits of using recursion for any given problem
are the subject for another discussion; however, as seen in the example, it seems quite useful for the task at
hand. What happens when this function calls itself will be explored in a moment, as the generated code itself is
discussed.
The last executable line of the program calls the routine putchar, an I/O routine that outputs the value
passed it as a character on the standard (default) output device.
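The recursion described above can be sketched as follows. This is not the book's own source listing: the helper names invert_into and invert_string are hypothetical, and the characters are appended to a caller-supplied buffer instead of being passed to putchar, purely so that the result can be inspected. The order of the two statements inside the braces (recurse first, then emit the current character) follows the description.

```c
/* Append the characters of yy to out in reverse order and return the
   next free position in out. Mirrors the described function: test *yy,
   recurse on yy + 1, then emit *yy on the way back out. */
static char *invert_into(const char *yy, char *out)
{
    if (*yy) {
        out = invert_into(yy + 1, out);  /* deal with the rest first */
        *out++ = *yy;                    /* stands in for putchar(*yy) */
    }
    return out;
}

/* Write the inverted, zero-terminated string into buf, which must have
   room for the string plus its terminator. */
static void invert_string(const char *y, char *buf)
{
    *invert_into(y, buf) = '\0';
}
```

Each nested call holds its own copy of yy on the stack, which is why the characters emerge last-to-first as the calls return.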
Returning to the top of the program, Listing 14.8 shows the code generated by the compiler to execute
the C program; it is inter-listed with the source code: each line of compiler source appears as an
assembler-source comment.
Before the first statement is compiled, the compiler has already generated some code: a jump to a
routine labeled CCMAIN. CCMAIN is a library routine that performs the housekeeping necessary to
provide the right environment for the generated code to run in. At the very least, CCMAIN must make sure the
processor is in the native mode, and switch into the default (for the compiler) sixteen-bit index and accumulator
word sizes. If the operating system supports it, it should also initialize the variables argc and argv, which allow
the programmer access to command-line parameters, although they are not used in this example. Finally,
CCMAIN will call main to begin execution of the user-written code itself.
                        KEEP  A.OUT
                        65816 ON
               CC0      START
    4C0080              JMP   CCMAIN
                        END
               ; main ( );
               main     START
               ;{
               ;   char *y;
               ;   y = "A string to invert";
    DA                  PHX
    A90080              LDA   #CCC0+0
    8301                STA   1,S
               ;   invert (y);
    A301                LDA   1,S
    48                  PHA
    200080              JSR   invert
    FA                  PLX
               ;}
    FA                  PLX
    60                  RTS
                        END
               invert   START
    A00000              LDY   #0
    B303                LDA   (3,S),Y        fetch the value pointed to by yy
    29FF00              AND   #$FF
    D003                BNE   *+5            non-zero: skip the jump
    4C1F00              JMP   *+21           zero: go to the RTS
    A303                LDA   3,S
    1A                  INC   A
    48                  PHA
    200080              JSR   invert
    FA                  PLX
    A00000              LDY   #0
    B303                LDA   (3,S),Y
    48                  PHA
    200080              JSR   putchar
    FA                  PLX
    60                  RTS
                        END
               CCC0     START
    41207374            DC    I1'$41,$20,$73,$74,$72,$69,$6E,$67'
    20746F20            DC    I1'$20,$74,$6F,$20,$69,$6E,$76,$65'
    727400              DC    I1'$72,$74,$00'
                        END
               ;
               ;
               CCMAIN   START
    18                  CLC
    FB                  XCE
    C230                REP   #$30
    200080              JSR   MAIN
    38                  SEC
    FB                  XCE
    60                  RTS
                        END
               PUTCHAR  START
               COUT     GEQU  $FDED
    A303                LDA   3,S
    08                  PHP
    38                  SEC
    FB                  XCE
    20EDFD              JSR   COUT
    18                  CLC
    FB                  XCE
    28                  PLP
    60                  RTS
                        END
Listing 14.8
The declaration of main causes an assembler START statement to be output; this simply defines the
beginning of the subroutine or function. The declaration char *y will cause the PHX instruction to be
generated after the first line of executable code is generated; this reserves space for one variable (the pointer y)
on the stack. That first executable code line is the assignment y = "A string to invert". This causes the
address of the string constant, which will be temporarily stored at the end of the generated program, to be
loaded into the accumulator. The address just loaded into the accumulator is now stored on the stack in the
memory reserved for it by the PHX instruction; the value of X that was pushed onto the stack was meaningless
in itself.
The next statement to be compiled is a call to the function invert with the variable y as the parameter.
This causes the value stored on the stack to be loaded back into the accumulator, where it is then pushed onto
the stack. All parameters to function calls are passed on the stack.
Note that the accumulator already contained the value stored on the top of the stack; the LDA 1,S
instruction was redundant. However, the hypothetical compiler in this example does not optimize across
statements, so the potential optimization (elimination of the load instruction) cannot be realized. Once the
parameter is on the top of the stack, the function itself is called via a JSR instruction. Since the program space
is limited to 64K, only a sixteen-bit subroutine call is used. After the call returns, the PLX instruction removes
the no-longer-needed parameter from the stack. The right bracket indicating the end of the function main
                        KEEP  KL.14.9
                        65816 ON
               MAIN     START
    18                  CLC
    FB                  XCE
    C210                REP   #$10
                        LONGI ON
    E220                SEP   #$20
                        LONGA OFF
    A21900              LDX   #STRING
    B500       INVERT   LDA   0,X
    F009                BEQ   DONE
    48                  PHA
    E8                  INX
    200900              JSR   INVERT
    68                  PLA
    200080              JSR   COUT
    38         DONE     SEC
    FB                  XCE
    60                  RTS
    41207374   STRING   DC    C'A string to invert',H'00'
                        END
               ;
               ;  COUT
               ;  machine-dependent routine to output a character
               ;
               COUT     START
               ECOUT    GEQU  $FDED          Apple II COUT
    48                  PHA                  Save registers
    DA                  PHX
    5A                  PHY
    08                  PHP                  and status,
    38                  SEC                  switch to emulation
    FB                  XCE
    20EDFD              JSR   ECOUT
    18                  CLC
    FB                  XCE
    28                  PLP
    7A                  PLY
    FA                  PLX
    68                  PLA
    60                  RTS
                        END
Listing 14.9
                        KEEP  KL.14.10
                        65816 ON
               MAIN     START
    C210                REP   #$10
                        LONGI ON
    E220                SEP   #$20
                        LONGA OFF            8-bit accumulator
    A21700              LDX   #L:STRING-1
    BD1100     INVERT   LDA   STRING,X
    200080              JSR   COUT
    CA                  DEX
    10F7                BPL   INVERT
    60         DONE     RTS
    41207374   STRING   DC    C'A string to invert',H'00'
                        END
               ;
               ;  COUT
               ;  machine-dependent routine to output a character
               ;
               COUT     START
               ECOUT    GEQU  $FDED          Apple II COUT
    48                  PHA                  Save registers
    DA                  PHX
    5A                  PHY
    08                  PHP                  and status,
    38                  SEC                  switch to emulation
    FB                  XCE
    20EDFD              JSR   ECOUT
    18                  CLC
    FB                  XCE
    28                  PLP
    7A                  PLY
    FA                  PLX
    68                  PLA
    60                  RTS
                        END
Listing 14.10
The Sieve program calculates the prime numbers between 3 and 16,381; it is based on an algorithm
originally attributed to the Greek mathematician Eratosthenes. The basic procedure is to eliminate every nth
number after a given number n, up to the limit of the range within which primes are desired. The set of
primes itself is infinite.
As well as providing a common yardstick with which to gauge the 65816, the Sieve program in Listing
14.11 provides an opportunity to examine performance-oriented programming; since the name of the game is
performance, any and all techniques are valid in coding an assembly-language version of a benchmark.
Four variable locations are defined for the program. ITER counts down the number of times the
routine is executed; to time it accurately, the test is repeated 100 times. COUNT holds the count of primes
discovered. K is a temporary variable. And PRIME is the value of the current prime number.
The variable I has no storage reserved for it because the Y register is used; it is an index counter. Y is
used for I, leaving X free, because certain indexed operations in the inner loop (the STZ that clears a flag)
are available only in the absolute,X addressing mode.
                        KEEP  KL.14.11
                        65816 ON
               ERATOS   START
               SIZE     GEQU  8192
               ITER     GEQU  $80
               COUNT    GEQU  $82
               K        GEQU  $84
               PRIME    GEQU  $86
               FLAGS    GEQU  $4000
    18                  CLC
    FB                  XCE
    C230                REP   #$30
                        LONGI ON
                        LONGA ON
    A96400              LDA   #100
    8580                STA   ITER
    6482       AGAIN    STZ   COUNT
    A0FF1F              LDY   #SIZE-1        for I = 0 to size
    A9FFFF              LDA   #$FFFF
    8D0040              STA   FLAGS
    990040     LOOP     STA   FLAGS,Y        flags[I] = TRUE
    88                  DEY
    88                  DEY
    10F9                BPL   LOOP
    A00000              LDY   #0             for i = 0 to size (i stored in Y)
    B9FF3F     MAIN     LDA   FLAGS-1,Y      if flags[I] then
    101E                BPL   SKIP           minus-one offset: to see high bit in long a mode
    98                  TYA
    0A                  ASL   A
    1A                  INC   A
    1A                  INC   A
    1A                  INC   A
    8586                STA   PRIME          prime = I + I + 3
    98                  TYA
    18                  CLC
    6586                ADC   PRIME          k = i + prime
    C90120     TOP      CMP   #SIZE+1
    B00C                BGE   SKIP2
    AA                  TAX
               ;                             flags[k] = FALSE
    E220                SEP   #$20           clear only
    9E0040              STZ   FLAGS,X        one byte
    C221                REP   #$21           clears carry as well
    6586                ADC   PRIME          k = k + prime
    80EF                BRA   TOP            (end while k <= size)
    E682       SKIP2    INC   COUNT
    C8         SKIP     INY
    C00120              CPY   #SIZE+1
    D0D7                BNE   MAIN
    C680                DEC   ITER
    D0BE                BNE   AGAIN
    38                  SEC
    FB                  XCE
    60                  RTS
                        END
Listing 14.11
The program begins by entering the native mode and extending the user registers to sixteen bits. ITER
is initialized for 100 iterations. An array (starting at FLAGS) of memory of size SIZE is initialized to $FFs,
two bytes at a time.
The routine proper now begins. Y is initialized with zero, and control falls into the main loop. The
high-order bit of each cell of the array FLAGS is tested. Initially, they are all set, but the algorithm iteratively
clears succeeding non-prime values before they are tested by this code. If the high bit is clear, this number has
already been eliminated by the algorithm; it is non-prime. Notice that the high-order bit of FLAGS[I] (or
FLAGS[Y]) is desired; however, since the processor is in sixteen-bit mode, the high bit will be loaded from
the memory location at the effective address plus one. To overcome this, the base of the array is specified as the
actual base minus one; this calculation is performed by the assembler during generation of the object code.
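The minus-one offset can be checked with a small C model of a sixteen-bit, low-byte-first load. The helper names load16 and flag_is_set and the flat mem array are assumptions made for illustration; they are not part of the listing.

```c
#include <stddef.h>
#include <stdint.h>

/* Sixteen-bit load as the processor performs it: low byte from addr,
   high byte from addr + 1. */
static uint16_t load16(const uint8_t *mem, size_t addr)
{
    return (uint16_t)(mem[addr] | (mem[addr + 1] << 8));
}

/* Test the high bit of the byte flags[i] with a sixteen-bit load.
   Loading from (base - 1) + i places flags[i] in the high byte, so the
   sign of the sixteen-bit value is the high bit of flags[i]. */
static int flag_is_set(const uint8_t *mem, size_t flags_base, size_t i)
{
    return (load16(mem, flags_base - 1 + i) & 0x8000) != 0;
}
```

Without the offset, the sign of the loaded word would reflect flags[i + 1] rather than flags[i].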
If the current value has not been cleared, the algorithm calls for the number which is two times the
current index value plus three (this converts the index to the array values of 3, 5, 7 . . . ) to be the next value
for PRIME. This prime number is generated quickly by transferring the Y index register into the accumulator,
shifting it left once to multiply by two, and incrementing it three times. Remember, this number is generated
from the current index only if the index value has not already been eliminated as being non-prime.
This prime number is then added to the current index, and the array elements at this offset, and at all
succeeding indices every PRIME value apart are eliminated from the array as being non-prime. They have the
current prime number as one of their factors. The most significant thing to note here in the code is that only one
byte can be cleared; the accumulator must temporarily be switched into the eight-bit mode to accomplish this.
However, since the next operation is an addition, an optimization is available: both the sixteen-bit mode can be
restored and the carry cleared in a single REP operation.
The program now loops, checking to see if the next index value has been eliminated; this process
continues until the index reaches the limit of SIZE.
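The index arithmetic in this description can be checked with a short model. What follows is a sketch in Python, not the 65816 listing; the function name sieve_count, and the convention that the flag array holds indices 0 through size inclusive, are assumptions made for illustration.

```python
def sieve_count(size):
    """Count the primes found by the sieve described above."""
    flags = [True] * (size + 1)      # one flag per odd number 3, 5, 7, ...
    count = 0
    for i in range(size + 1):
        if not flags[i]:
            continue                 # already eliminated: non-prime
        prime = i + i + 3            # index times two, plus three
        k = i + prime                # first array element to eliminate
        while k <= size:
            flags[k] = False         # 2k+3 is an odd multiple of prime
            k += prime
        count += 1
    return count


if __name__ == "__main__":
    print(sieve_count(10))           # odd primes from 3 through 23
```

With size set to 10 the flags stand for the odd numbers 3 through 23, so the eight primes in that range are counted.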
You may be wondering what the result is: at 4 MHz, ten iterations are completed in 1.56 seconds, which
is twice as fast as a 4 MHz 6502. The January 1983 BYTE article cites results of 4.0 seconds for a 5 MHz 8088,
1.90 seconds for an 8 MHz 8086, and 0.49 seconds for an 8 MHz 68000; an 8 MHz 65816 would yield 0.78
seconds.
    4CCB22    JMP   $22CB
    08        PHP
    18        CLC
    FB        XCE
    08        PHP
    0B        PHD
    F40003    PEA   $0300
    2B        PLD
    C220      REP   #$20
    E210      SEP   #$10
Figure 15-1 Disassembly Output
    A905    LDA  #$05     X= 00 11  Y= 00 13  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    A8      TAY           X= 00 11  Y= 00 05  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    990060  STA  $6000,Y  X= 00 11  Y= 00 05  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    88      DEY           X= 00 11  Y= 00 04  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    D0FA    BNE  $5003    X= 00 11  Y= 00 04  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    990060  STA  $6000,Y  X= 00 11  Y= 00 04  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    88      DEY           X= 00 11  Y= 00 03  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    D0FA    BNE  $5003    X= 00 11  Y= 00 03  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    990060  STA  $6000,Y  X= 00 11  Y= 00 03  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
    88      DEY           X= 00 11  Y= 00 02  S= 01 AA  D= 00 00  B= 00  P= 7D  E:1
Figure 15-2 Tracer Output
This example was developed and tested using an Apple IIe with a 65816 processor card installed; the
calls to machine-dependent locations have been isolated and are clearly identified as such. DEBUG16 uses the native
BRK vector. On the Apple II, this location ($FFE6,$FFE7) normally contains ROM data, which varies between
monitor ROM versions. Since there is no way to patch ROM, the solution opted for here is for DEBUG16 to
try to patch the location pointed to by the data that is stored there. For current ROMs, these are RAM locations
that happen to be more or less livable. Check the location pointed to by your ROMs, and make sure that neither
your own code nor the debugger is loaded into that area. DEBUG16 will automatically read whatever value is
stored there and store a vector to that address to regain control after a BRK.
Both programs are executed by putting the starting address of the routine to list or trace (which has been
loaded into memory) at DPAGE+$80 through DPAGE+$82 ($380.82) in low, high, bank order, and then calling either the
TRACE entry point at $2000, or the LIST entry at $2003.
Declarations
The listing begins with the declaration of global values by way of GEQU statements. Almost all of
these are addresses of direct page memory locations that will be used; one notable exception is the label
DPAGE, a sixteen-bit value that defines the beginning of the direct page memory to be used by this program.
Because a 65816 debugger is by definition a 6502 debugger, it is wise to relocate the direct page out of the
default zero page, since zero page will be used by 65802 programs, and by your program being debugged. In the listing, a
value of $300 is used; on an Apple II, this relocates the direct page to page three, which is a convenient page to
use.
Many of the direct page locations are used to store the register contents of the user program when the
debugger is executing. All of the registers are represented. As you will see in the code, the adjacent positioning
of some of the registers is important and must be maintained.
In addition to the direct page locations used for register storage, one general-purpose temporary variable
is used, called TEMP. Three other variables ADDRMODE, MNX, and OPLEN (for address mode,
mnemonic index, and operation length, respectively) are used primarily to access the tables used in
disassembling an instruction.
The variable CODE contains the instruction opcode currently being executed in the user program. The
variable NCODE contains the next instruction opcode to be executed, saved there before being replaced with
the BRK instruction inserted in the code. OPRNDL, OPRNDH, and OPRNDB contain the three (possible)
values of the operand of a given instruction.
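Because each GEQU is defined as the previous label plus one or two, the direct page offsets can be worked out mechanically. The following Python sketch (not part of DEBUG16; the helper name layout and the FIELDS list are illustrative assumptions transcribed from the declarations) computes each offset from the sizes implied by the chain of GEQU statements.

```python
def layout(fields, base=0x80):
    """Assign consecutive direct page offsets, starting at base."""
    offsets, addr = {}, base
    for name, size in fields:
        offsets[name] = addr
        addr += size
    return offsets


# (label, bytes reserved before the next label), in declaration order
FIELDS = [
    ("PCREG", 1), ("PCREGH", 1), ("PCREGB", 1), ("NCODE", 1),
    ("OPCREG", 1), ("OPCREGH", 1), ("OPCREGB", 1), ("CODE", 1),
    ("OPRNDL", 1), ("OPRNDH", 1), ("OPRNDB", 1),
    ("XREG", 1), ("XREGH", 1), ("YREG", 1), ("YREGH", 1),
    ("AREG", 1), ("AREGH", 1), ("STACK", 1), ("STACKH", 1),
    ("DIRREG", 1), ("DIRREGH", 1), ("DBREG", 1), ("PREG", 1),
    ("EBIT", 2),                      # TEMP is defined as EBIT+2
    ("TEMP", 1), ("TEMPH", 1), ("TEMPB", 1), ("ADDRMODE", 1),
    ("MNX", 2),                       # OPLEN is defined as MNX+2
    ("OPLEN", 1),
]

DPAGE = 0x300
OFFSETS = layout(FIELDS)

if __name__ == "__main__":
    for name, offset in OFFSETS.items():
        print("%-9s $%02X  (absolute $%04X)" % (name, offset, DPAGE + offset))
```

The computed values agree with the operand bytes in the assembled listings, for example CODE at $87, PREG at $96, and OPLEN at $9F.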
                           KEEP  DEBUG16
                           65816 ON
                           MSB   ON
                           LONGA OFF
                           LONGI OFF
                ***********************************************
                *                                             *
                *   DEBUG16                                   *
                *   A 65816 DEBUGGER                          *
                *                                             *
                ***********************************************
                           ORG   $8000
                MAIN       START
                           USING MN
                           USING ATRIBL
                ;
                DPAGE      GEQU  $300
                ;
                PCREG      GEQU  $80           PROGRAM COUNTER
                PCREGH     GEQU  PCREG+1
                PCREGB     GEQU  PCREGH+1
                NCODE      GEQU  PCREGB+1
                OPCREG     GEQU  NCODE+1
                OPCREGH    GEQU  OPCREG+1
                OPCREGB    GEQU  OPCREGH+1     INCLUDING BANK
                CODE       GEQU  OPCREGB+1
                OPRNDL     GEQU  CODE+1        OPERANDS OF CURRENT
                OPRNDH     GEQU  OPRNDL+1      INSTRUCTION
                OPRNDB     GEQU  OPRNDH+1
                XREG       GEQU  OPRNDB+1      X REGISTER
                XREGH      GEQU  XREG+1
                YREG       GEQU  XREGH+1       Y REGISTER
                YREGH      GEQU  YREG+1
                AREG       GEQU  YREGH+1       ACCUMULATOR
                AREGH      GEQU  AREG+1
                STACK      GEQU  AREGH+1       STACK POINTER
                STACKH     GEQU  STACK+1
                DIRREG     GEQU  STACKH+1
                DIRREGH    GEQU  DIRREG+1
                DBREG      GEQU  DIRREGH+1
                PREG       GEQU  DBREG+1       P STATUS REGISTER
                EBIT       GEQU  PREG+1        E BIT
                TEMP       GEQU  EBIT+2        TEMPORARY
                TEMPH      GEQU  TEMP+1
                TEMPB      GEQU  TEMPH+1
                ADDRMODE   GEQU  TEMPB+1
                MNX        GEQU  ADDRMODE+1    MNEMONIC INDEX
                ;                              FROM ATTRIBUTE TABLE
                OPLEN      GEQU  MNX+2         LENGTH OF OPERATION,
                ;                              INCLUDING INSTRUCTION
                CR         GEQU  $8D           CARRIAGE RETURN
                M          GEQU  $20
                X          GEQU  $10
                C          GEQU  $01
0000  4C0080               JMP   TRACE
LIST
The program has two entry points, defined in the first routine. One is for listing (disassembling) a
program, the other for tracing. The first entry point, at the program's origin (default $8000), is a jump to the
actual entry point of the trace routine; the second, immediately past it (at $8003), is the beginning of the code
for the disassembler.
Since this is a bare-bones disassembler, intended to be expanded and perhaps integrated with a general
purpose machine language monitor, parameters such as the start address of the program to be traced are entered
by modifying the values of the register variables; for example, to begin disassembly of a program stored at
$800, the values $00, $08, and $00 are stored starting at PCREG. Since the direct page is relocated to page
three, the absolute location of this variable is $380.
Starting at the LIST entry, some basic initialization is performed: saving the status register, switching
to native mode, and then saving the previous operating mode (emulation/native) by pushing the status register a
second time (the carry flag now containing the previous contents of the e bit). Thus this program may be called
from either native or emulation mode.
The current value of the direct page register is saved on the stack, and then the new value DPAGE
is stored to the direct page register. The processor is now in native mode, using the debugger's own direct page.
Control now continues at TOP, the beginning of the main loop of the disassembler. The mode is set to
long accumulator, short index. This combination allows simple manipulation of both byte and double-byte
values. The value of PCREG is copied to OPCREG (old pcreg). OPCREG will contain the starting location
of the current instruction throughout the loop; PCREG will be modified to point to the next instruction.
However, it hasn't been modified yet, so it is used to load the accumulator with the opcode byte. Indirect long
addressing is used, so code anywhere within the sixteen-megabyte address space may be disassembled. Since
the accumulator is sixteen bits, a second byte is fetched as well, but ignored; the next instruction transfers the
opcode to the X register and then stores it at the location CODE.
The utility routine UPDATE is called next. This is common to both the disassembler and the tracer,
and determines the attributes of this instruction by looking the instruction up in a table; it also increments the
program counter to point to the next instruction.
The routines FLIST and FRMOPRND form the disassembled line, and PRINTLN displays it. For
each line, the routine PAUSE is called to check the keyboard to see if a key has been pressed,
signalling a pause. If PAUSE returns with the carry clear, it means the user has signalled to quit, and control
branches to QUIT; otherwise, the line is printed and the program loops to TOP again, where it repeats the process for the next
instruction.
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                ; LIST
                ;
                ; MAIN LOOP OF DISASSEMBLER FUNCTION
                ;
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
0003            LIST       ENTRY
0003  08                   PHP
0004  18                   CLC
0005  FB                   XCE
0006  08                   PHP
0007  0B                   PHD
0008  F40003               PEA   DPAGE
000B  2B                   PLD
000C            TOP        ANOP
000C  C220                 REP   #M
000E  E210                 SEP   #X
                           LONGA ON
                           LONGI OFF
0010  649D                 STZ   MNX
0012  A580                 LDA   PCREG
0014  8584                 STA   OPCREG
0016  A682                 LDX   PCREGB
0018  8686                 STX   OPCREGB
001A  A780                 LDA   [PCREG]
001C  AA                   TAX
001D  8687                 STX   CODE          SAVE AS CODE
001F  200080               JSR   UPDATE
0022  200080               JSR   FLIST
0025  200080               JSR   FRMOPRND
0028  200080               JSR   PAUSE
002B  9005                 BCC   QUIT
002D  200080               JSR   PRINTLN       PRINT IT
0030  80DA                 BRA   TOP
0032  2B        QUIT       PLD                 RESTORE ENVIRONMENT,
0033  28                   PLP                 RETURN TO CALLER
0034  FB                   XCE
0035  28                   PLP
0036  60                   RTS
0037                       END
Local Symbols
LIST       000003   QUIT       000032   TOP        00000C
FLIST
FLIST is called by both the disassembler and the tracer. This routine displays the current program
counter value, the object code of the instruction being disassembled in hexadecimal, and the mnemonic for the
opcode. The code required to do this is basically the same for any instruction, the only difference being the
length of the instruction, which has already been determined by UPDATE.
The first thing the code does is to blank the output buffer by calling CLRLN. Particularly since 6502
emulation-mode I/O routines are used, it is more efficient to build an output line first, then display it all at once,
rather than output the line on the fly. Characters are stored in the output buffer LINE via indexed absolute
addressing; the Y register contains a pointer to the current character position within the line, and is incremented
every time a character is stored. Since character manipulation is the primary activity in this routine, the
accumulator is set to eight bits for most of the routine.
The flow of the program proceeds to generate the line from left to right, as it is printed; the first
characters stored are therefore the current program counter values. Since UPDATE has already modified the
program counter variable to load the operands of the instruction, the value in the variable OPCREG is used.
The hex conversion routine, PUTHEX, converts the data in the accumulator into the ASCII characters that
represent the number's two hexadecimal digits, storing each character at the location pointed to by LINE,Y,
and then incrementing Y to point to the next character. A colon is printed between the bank byte and the
sixteen-bit program counter display to aid readability.
Next, some spaces are skipped by loading the Y register with a higher value, and the object code bytes
are displayed in hexadecimal. These values have already been stored in direct page memory locations CODE
and OPRNDL, OPRNDH, and OPRNDB by the UPDATE routine, which also determined the length of the
instruction and stored it at OPLEN. This length controls a loop that outputs the operand bytes; note that a
negative displacement of one is calculated by the assembler so that the loop is not executed when OPLEN is
equal to one.
All that remains is to print the instruction mnemonic. The characters for all of the mnemonics are
stored in a table called MN; at three characters per mnemonic (which as you may have noticed is the standard
length for all 65x mnemonics), the mnemonic index (MNX) determined by UPDATE from the instruction
attribute table must be multiplied by three. This is done by shifting left once (to multiply by two), and adding
the result to the original value of MNX. Note that this type of custom multiplication routine is much more
efficient than the generalized multiplication routines described in the previous chapter. The characters in the
mnemonic table are copied into the output line using the MVN instruction; the result just calculated is
transferred into the X register as the source of the move. It is the line-buffered output that allows use of the
block-move instruction; on-the-fly output would have required each character to be copied out of the mnemonic
table in a loop.
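The shift-and-add multiplication and the three-character copy can be modelled in a few lines. This is a Python sketch, not the FLIST code; the three-entry table MN_SAMPLE and the zero-based index are assumptions for illustration, since the real MN table and its index base are defined elsewhere in the program.

```python
MN_SAMPLE = "ADCANDASL"   # hypothetical table: three characters per mnemonic


def mnemonic(mnx, table=MN_SAMPLE):
    """Return the three-character mnemonic selected by index mnx."""
    offset = (mnx << 1) + mnx          # shift left once, then add the original
    return table[offset:offset + 3]    # the block move copies three bytes


if __name__ == "__main__":
    for index in range(3):
        print(index, mnemonic(index))
```

The slice plays the role of the MVN instruction, which is loaded with a count of two because the block move transfers the count plus one bytes.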
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                ; FLIST
                ;
                ; FORM IMAGE OF PROGRAM COUNTER,
                ; OBJECT CODE, AND MNEMONIC IN LINE
                ;
                ; REQUIRES ATTRIBUTE VARIABLES TO BE PREVIOUSLY INITIALIZED
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                FLIST      START
                           USING MN
0000  200080               JSR   CLRLN
0003  E230                 SEP   #M+X          SHORT REGISTERS
                           LONGA OFF
                           LONGI OFF
0005  A000                 LDY   #0
0007  A586                 LDA   OPCREGB
0009  200080               JSR   PUTHEX
000C  A9BA                 LDA   #':'
000E  990080               STA   LINE,Y
0011  C8                   INY
0012  A585                 LDA   OPCREGH
0014  200080               JSR   PUTHEX
0017  A584                 LDA   OPCREG
0019  200080               JSR   PUTHEX
001C  A00A                 LDY   #10
001E  A587                 LDA   CODE
0020  200080               JSR   PUTHEX
0023  A201                 LDX   #1
0025  E49F      MORE       CPX   OPLEN
0027  F008                 BEQ   DONE
0029  B587                 LDA   OPRNDL-1,X
002B  200080               JSR   PUTHEX
002E  E8                   INX
002F  80F4                 BRA   MORE
0031  C230      DONE       REP   #M+X
                           LONGA ON
                           LONGI ON
0033  A59D                 LDA   MNX
0035  0A                   ASL   A
0036  18                   CLC
0037  659D                 ADC   MNX
0039  18                   CLC
003A  690080               ADC   #MN
003D  AA                   TAX
003E  A01480               LDY   #LINE+20
0041  A90200               LDA   #2
0044            MOVE       ENTRY
0044  540000               MVN   0,0
0047  60                   RTS
0048                       END
Local Symbols
DONE       000031   MORE       000025   MOVE       000044
FRMOPRND
This is the second part of the line-disassembly pair. It performs the address-mode specific generation of
the disassembled operand field; the result is similar to the address mode specification syntax of a line of 65x
source code.
The Y register is loaded with the starting destination in LINE, and the attribute stored at ADDRMODE
is multiplied by two to form an index into a jump table. There is a separate routine for each addressing mode;
the addresses of these routines are stored in a table called MODES, in the order that corresponds to the attribute values
assigned to them in the attribute table.
The JMP indirect indexed instruction is used to transfer control through the jump table MODES to the
appropriate routine, whose index, times two, has been loaded into the X register.
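The doubling of the attribute exists only because each table entry is a two-byte address. The Python sketch below models this; it is not the FRMOPRND code, the two-entry table and its attribute numbering are hypothetical, and the handler functions are simplified stand-ins for the real formatting routines (the addresses $000B and $0023 are the local symbol values of FIMM and FDIR).

```python
import struct


def fimm(operand):
    """Stand-in for FIMM: immediate operand with a pound sign."""
    return "#$%02X" % operand


def fdir(operand):
    """Stand-in for FDIR: direct page operand."""
    return "$%02X" % operand


# Routine addresses mapped to the handlers that would live there.
HANDLERS = {0x000B: fimm, 0x0023: fdir}

# The jump table: one little-endian sixteen-bit address per attribute.
MODES = struct.pack("<2H", 0x000B, 0x0023)


def dispatch(addrmode, operand):
    """Model of ASL A / TAX / JMP (MODES,X)."""
    x = addrmode << 1                              # two bytes per entry
    (target,) = struct.unpack_from("<H", MODES, x)
    return HANDLERS[target](operand)


if __name__ == "__main__":
    print(dispatch(0, 0x05), dispatch(1, 0x80))
```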
Each of the routines is basically similar; they output any special characters and print the address of the
operand found in the instruction stream. There are three related routines, POB, PODB, and POTB (for put
operand byte, put operand double byte, and put operand triple byte), which output direct page, absolute, and
absolute long addresses.
The two routines FPCR and FPCRL, which handle the program counter relative instructions, however,
must first calculate the destination address (which is how an assembler would specify the operand, so this is
how they are disassembled) by adding the actual operand, a displacement, to the current program counter. The
operand of a short program counter relative instruction is sign-extended before adding, resulting in a sixteen-bit
signed displacement which is added to the program counter to find the destination address.
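The destination calculation can be expressed compactly. The following Python sketch is a model of the arithmetic, not the FPCR and FPCRL code itself; the function names are assumptions, and opcreg stands for the sixteen-bit address of the branch instruction.

```python
def short_branch_target(opcreg, operand):
    """Destination of a two-byte branch whose opcode is at opcreg."""
    # Sign-extend the eight-bit displacement to sixteen bits.
    disp = operand | 0xFF00 if operand & 0x80 else operand & 0x7F
    # The displacement is relative to the instruction that follows.
    return (opcreg + disp + 2) & 0xFFFF


def long_branch_target(opcreg, operand16):
    """Destination of a three-byte branch (sixteen-bit displacement)."""
    return (opcreg + operand16 + 3) & 0xFFFF


if __name__ == "__main__":
    print("$%04X" % short_branch_target(0x5007, 0xFA))
```

For example, assuming a BNE with operand $FA sits at $5007, the destination works out to $5003; the masking to sixteen bits makes the wraparound of negative displacements come out right without a separate subtraction.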
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                ; FRMOPRND
                ;
                ; FORMS OPERAND FIELD OF DISASSEMBLED INSTRUCTION
                ;
                ; OPLEN, ADDRMODE, AND OPRND MUST HAVE BEEN
                ; INITIALIZED BY UPDATE
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                FRMOPRND   START
                           USING MODES
0000  E230                 SEP   #M+X
                           LONGA OFF
                           LONGI OFF
0002  A01C                 LDY   #28
0004  A59C                 LDA   ADDRMODE
0006  0A                   ASL   A
0007  AA                   TAX
0008  7C0080               JMP   (MODES,X)
000B            FIMM       ENTRY               IMMEDIATE MODE
000B  A9A3                 LDA   #'#'          OUTPUT POUND SIGN,
000D  990080               STA   LINE,Y        ONE OR TWO
0010  C8                   INY                 OPERAND BYTES, DEPENDING
0011  A59F                 LDA   OPLEN         ON OPLEN
0013  C902                 CMP   #2
0015  F003                 BEQ   GOSHORT
0017  4C0080               JMP   PODB
001A  4C0080    GOSHORT    JMP   POB
001D            FABS       ENTRY               ABSOLUTE MODE
001D  4C0080               JMP   PODB          JUST OUTPUT A DOUBLE BYTE
0020            FABSL      ENTRY               ABSOLUTE LONG
0020  4C0080               JMP   POTB          OUTPUT A TRIPLE BYTE
0023            FDIR       ENTRY               DIRECT MODE
0023  4C0080               JMP   POB           OUTPUT A SINGLE BYTE
0026            FACC       ENTRY               ACCUMULATOR
0026  A9C1                 LDA   #'A'          JUST AN A
0028  990080               STA   LINE,Y
002B  60                   RTS
002C            FIMP       ENTRY               IMPLIED
002C  60                   RTS                 NO OPERAND
002D            FINDINX    ENTRY               INDIRECT INDEXED
002D  20B600               JSR   FIND          CALL INDIRECT, THEN FALL
                ;                              THROUGH TO INDEXED BY Y
0030            FINY       ENTRY               INDEXED BY Y MODES
0030  A9AC                 LDA   #','          TACK ON A COMMA,Y
0032  990080               STA   LINE,Y
0035  C8                   INY
0036  A9D9                 LDA   #'Y'
0038  990080               STA   LINE,Y
003B  60                   RTS
003C            FINDINXL   ENTRY
003C  20C600               JSR   FINDL
003F  4C3000               JMP   FINY
0042            FINXIND    ENTRY               INDEX INDIRECT
0042  A9A8                 LDA   #'('          PARENTHESIS
0044  990080               STA   LINE,Y
0047  C8                   INY
0048  200080               JSR   POB           A SINGLE BYTE
004B  206000               JSR   FINX          COMMA, X
004E  A9A9                 LDA   #')'          CLOSE.
0050  990080               STA   LINE,Y
0053  60                   RTS
0054            FDIRINXX   ENTRY               DIRECT INDEXED BY X
0054  200080               JSR   POB           OUTPUT A BYTE,
0057  4C6000               JMP   FINX          TACK ON COMMA, X
005A            FDIRINXY   ENTRY               DIRECT INDEXED BY Y
005A  200080               JSR   POB           OUTPUT A BYTE,
005D  4C3000               JMP   FINY          TACK ON COMMA, Y
0060            FINX       ENTRY               INDEXED BY X
0060  A9AC                 LDA   #','          TACK ON A
0062  990080               STA   LINE,Y        COMMA, X
0065  C8                   INY                 (USED BY SEVERAL
0066  A9D8                 LDA   #'X'          MODES)
0068  990080               STA   LINE,Y
006B  C8                   INY
006C  60                   RTS
006D            FABSX      ENTRY               ABSOLUTE INDEXED BY X
006D  200080               JSR   PODB          OUTPUT A DOUBLE BYTE,
0070  4C6000               JMP   FINX          TACK ON A COMMA, X
0073            FABSLX     ENTRY               ABSOLUTE LONG BY X
0073  200080               JSR   POTB          OUTPUT A TRIPLE BYTE,
0076  4C6000               JMP   FINX          TACK ON COMMA, X
0079            FABSY      ENTRY               ABSOLUTE Y
0079  200080               JSR   PODB          OUTPUT A DOUBLE BYTE,
007C  4C3000               JMP   FINY          TACK ON COMMA,Y
007F            FPCR       ENTRY
007F  A9FF                 LDA   #$FF
0081  EB                   XBA
0082  A588                 LDA   OPRNDL
0084  C221                 REP   #M+C
                           LONGA ON
0086  3003                 BMI   OK
0088  297F00               AND   #$7F
008B  6584      OK         ADC   OPCREG
008D  1A                   INC   A
008E  1A                   INC   A
008F  8588                 STA   OPRNDL
0091  E220                 SEP   #M
                           LONGA OFF
0093  4C0080               JMP   PODB
0096            FPCRL      ENTRY
0096  C221                 REP   #M+C
                           LONGA ON
0098  A588                 LDA   OPRNDL
009A  6584                 ADC   OPCREG
009C  18                   CLC
009D  690300               ADC   #3
00A0  8588                 STA   OPRNDL
00A2  E220                 SEP   #M
                           LONGA OFF
00A4  4C0080               JMP   PODB
00A7            FABSIND    ENTRY               ABSOLUTE INDIRECT
00A7  A9A8                 LDA   #'('          SURROUND A DOUBLE BYTE
00A9  990080               STA   LINE,Y        WITH PARENTHESES
00AC  C8                   INY
00AD  200080               JSR   PODB
00B0  A9A9                 LDA   #')'
00B2  990080               STA   LINE,Y
00B5  60                   RTS
00B6            FIND       ENTRY               INDIRECT
00B6  A9A8                 LDA   #'('          SURROUND A SINGLE BYTE
00B8  990080               STA   LINE,Y        WITH PARENTHESES
00BB  C8                   INY
00BC  200080               JSR   POB
00BF  A9A9                 LDA   #')'
00C1  990080               STA   LINE,Y
00C4  C8                   INY
00C5  60                   RTS
00C6            FINDL      ENTRY               INDIRECT LONG
00C6  A9DB                 LDA   #'['          SURROUND A SINGLE BYTE
00C8  990080               STA   LINE,Y        WITH SQUARE BRACKETS
00CB  C8                   INY
00CC  200080               JSR   POB
00CF  A9DD                 LDA   #']'
00D1  990080               STA   LINE,Y
00D4  C8                   INY
00D5  60                   RTS
00D6            FABSINXIND ENTRY
00D6  A9A8                 LDA   #'('
00D8  990080               STA   LINE,Y
00DB  C8                   INY
00DC  206D00               JSR   FABSX
00DF  A9A9                 LDA   #')'
00E1  990080               STA   LINE,Y
00E4  60                   RTS
00E5            FSTACK     ENTRY               STACK IMPLIED
00E5  60                   RTS
00E6            FSTACKREL  ENTRY               STACK RELATIVE
00E6  202300               JSR   FDIR          JUST LIKE
00E9  A9AC                 LDA   #','          DIRECT INDEXED, BUT WITH
00EB  990080               STA   LINE,Y        AN S
00EE  C8                   INY
00EF  A9D3                 LDA   #'S'
00F1  990080               STA   LINE,Y
00F4  C8                   INY
00F5  60                   RTS
00F6            FSRINDINX  ENTRY
00F6  A9A8                 LDA   #'('
00F8  990080               STA   LINE,Y
00FB  C8                   INY
00FC  20E600               JSR   FSTACKREL
00FF  A9A9                 LDA   #')'
0101  990080               STA   LINE,Y
0104  C8                   INY
0105  4C3000               JMP   FINY          TACK ON A COMMA,Y
0108            FBLOCK     ENTRY               BLOCK MOVE
0108  C220                 REP   #M
010A  A588                 LDA   OPRNDL        MAKE HUMAN-READABLE:
010C  EB                   XBA                 SWAP SOURCE, DEST
010D  8588                 STA   OPRNDL
010F  E220                 SEP   #M
0111  200080               JSR   POB           OUTPUT THE SOURCE
0114  A9AC                 LDA   #','          THEN COMMA
0116  990080               STA   LINE,Y
0119  C8                   INY
011A  EB                   XBA
011B  8588                 STA   OPRNDL
011D  4C0080               JMP   POB
0120                       END
Local Symbols
FABS       00001D   FABSIND    0000A7   FABSINXIND 0000D6   FABSL      000020
FABSLX     000073   FABSX      00006D   FABSY      000079   FACC       000026
FBLOCK     000108   FDIR       000023   FDIRINXX   000054   FDIRINXY   00005A
FIMM       00000B   FIMP       00002C   FIND       0000B6   FINDINX    00002D
FINDINXL   00003C   FINDL      0000C6   FINX       000060   FINXIND    000042
FINY       000030   FPCR       00007F   FPCRL      000096   FSRINDINX  0000F6
FSTACK     0000E5   FSTACKREL  0000E6   GOSHORT    00001A   OK         00008B
POB
This routine (put operand byte), with three entry points, outputs a dollar sign, followed by either one,
two, or three operand bytes in hexadecimal form; it calls the routine PUTHEX to output the operand bytes. It is
called by FRMOPRND.
Depending on the entry point, the X register is loaded with 0, 1, or 2, controlling the number of times
the loop at MORE is executed; on each iteration of the loop, an operand byte is loaded by indexing into
OPRNDL and then printed by PUTHEX.
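Because the loop counts X down to zero, the bytes come out most significant first, which is the order a reader expects even though they are stored low byte first. A minimal Python model follows; it is a sketch rather than the POB code, the function name put_operand is an assumption, and plain ASCII is used in place of the Apple II's high-bit-set characters.

```python
def put_operand(operand_bytes, count):
    """Format count operand bytes, stored low byte first, as $ plus hex."""
    out = "$"
    for x in range(count - 1, -1, -1):     # X = count-1 down to 0
        out += "%02X" % operand_bytes[x]   # LDA OPRNDL,X / JSR PUTHEX
    return out


if __name__ == "__main__":
    stored = [0xCB, 0x22, 0x01]            # low, high, bank
    print(put_operand(stored, 1), put_operand(stored, 2), put_operand(stored, 3))
```

The three values of count correspond to the POB, PODB, and POTB entry points.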
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                ; POB, PODB, POTB
                ; PUT OPERAND (DOUBLE, TRIPLE) BYTE
                ;
                ; PUTS OPRNDL (OPRNDH, OPRNDB) IN LINE AS HEX VALUE
                ; WITH $ PREFIX
                ;
                ; ASSUMES SHORT ACCUMULATOR AND INDEX REGISTERS
                ; (CALLED BY FRMOPRND)
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                POB        START
                           LONGA OFF
                           LONGI OFF
                ;
0000  A200                 LDX   #0            PRINT: ONE OPERAND BYTE
0002  8006                 BRA   IN            SKIP
0004            PODB       ENTRY
0004  A201                 LDX   #1
0006  8002                 BRA   IN
0008            POTB       ENTRY
0008  A202                 LDX   #2
                ;
000A  A9A4      IN         LDA   #'$'
000C  990080               STA   LINE,Y
000F  C8                   INY
0010  B588      MORE       LDA   OPRNDL,X
0012  200080               JSR   PUTHEX
0015  CA                   DEX
0016  10F8                 BPL   MORE
0018  60                   RTS
0019                       END
Local Symbols
IN         00000A   MORE       000010   PODB       000004   POTB       000008
STEP
This routine also contains the PAUSE entry point called by LIST; STEP waits for a keypress, while
PAUSE simply checks to see if a key has been pressed, and waits only if there has been an initial keypress. In
both cases, the wait loop continues until the next keypress. If the keypress that exits the wait loop was the
ESCAPE key, the carry is cleared, signalling the calling program that the user wants to quit rather than
continue. If it was RETURN, the code as assembled returns with both the carry and the overflow flag set; the tracer
uses the overflow flag to toggle between tracing and single stepping. Any other keypress causes the routine to
return with the carry set and the overflow flag clear.
The code in this listing is machine-dependent; it checks the keyboard locations of the Apple II. Since
this is a relatively trivial task, in-line code is used rather than a call to one of the existing 6502 monitor routines;
therefore, the processor remains in the native mode while it performs this I/O operation.
Like all utility routines, STEP saves and restores the status on entry and exit.
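The way the exit paths encode the user's choice in the flags can be summarized as a small decision function. This Python sketch follows the instructions as they are assembled in the listing for this routine (SEP #C+V on the RETURN path, SEC and CLV otherwise); the function name classify_key and the use of None for "left as the caller had it" are assumptions of the sketch.

```python
ESC = 0x9B   # ESCAPE with the high bit set, as read from the keyboard
CR = 0x8D    # RETURN with the high bit set


def classify_key(key):
    """Return (carry, overflow) for the key that ended the wait loop."""
    if key == ESC:
        return (False, None)   # CLC: quit; overflow is whatever PLP restored
    if key == CR:
        return (True, True)    # SEP #C+V
    return (True, False)       # SEC, CLV: continue


if __name__ == "__main__":
    for key in (ESC, CR, 0xA0):
        print("%02X" % key, classify_key(key))
```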
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                ; STEP - CHECKS FOR USER PAUSE SIGNAL
                ; (KEYSTROKE)
                ;
                ; CONTAINS MACHINE-DEPENDENT CODE
                ; FOR APPLE II
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                STEP       START
                KEYBD      EQU   $C000
                KEYSTB     EQU   $C010
                ESC        EQU   $9B
                V          EQU   $40
                           LONGA OFF
                           LONGI OFF
0000  08                   PHP                 SAVE MODES
0001  E230                 SEP   #M+X
0003  800B                 BRA   WAIT
0005            PAUSE      ENTRY
0005  08                   PHP
0006  E230                 SEP   #M+X
0008  AD00C0               LDA   KEYBD
000B  101B                 BPL   RETNCR
000D  8D10C0               STA   KEYSTB
                ;
0010  AD00C0    WAIT       LDA   KEYBD
0013  10FB                 BPL   WAIT
0015  8D10C0               STA   KEYSTB
0018  C99B                 CMP   #ESC
001A  D004                 BNE   RETNESC
001C  28        RETEQ      PLP
001D  EA                   NOP
001E  18                   CLC
001F  60                   RTS
0020  C98D      RETNESC    CMP   #CR
0022  D004                 BNE   RETNCR
0024  28                   PLP
0025  E241                 SEP   #C+V
0027  60                   RTS
0028  8D10C0    RETNCR     STA   KEYSTB
002B  28                   PLP
002C  38                   SEC                 ELSE SET
002D  B8                   CLV                 (CONTINUE)
002E  60                   RTS
002F                       END
Local Symbols
ESC        00009B   KEYBD      00C000   KEYSTB     00C010   PAUSE      000005
RETEQ      00001C   RETNCR     000028   RETNESC    000020   V          000040
WAIT       000010
PUTHEX
This utility routine, already referred to in several descriptions, is called whenever a hexadecimal value
needs to be output. It converts the character in the low byte of the accumulator into two hexadecimal
characters, and stores them in the buffer LINE at the position pointed to by the Y register.
PUTHEX calls an internal subroutine, MAKEHEX, which does the actual conversion. This call
(rather than in-line code) allows MAKEHEX to first call, then fall through into, an internal routine,
FORMNIB.
When MAKEHEX returns, it contains the two characters to be printed in the high and low bytes of the
accumulator; MAKEHEX was processed with the accumulator eight bits wide, so the sixteen-bit mode is
switched to, letting both bytes be stored in one instruction. The Y register is incremented twice, pointing it to
the space immediately past the second character printed.
FORMNIB is both called (for processing the first nibble) and fallen into (for processing the second).
Thus the RTS that exits FORMNIB returns variously to either MAKEHEX or PUTHEX. This technique
results in more compact code than if FORMNIB were called twice.
The conversion itself is done by isolating the respective bits, and then adding the appropriate offset to
form either the correct decimal or alphabetic (A-F) hexadecimal character.
Like all utility routines, the status is saved and restored on entry and exit.
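The offsets used in the conversion are easier to see with numbers. The sketch below is a Python model, not the PUTHEX code; it assumes the Apple II convention of characters with the high bit set ($B0 for the digit zero, $C1 for the letter A), which is what the assembled operands $B0 and $B6 imply, and the function names are illustrative.

```python
def form_nibble(n):
    """Convert a value 0..15 to a high-bit-set hexadecimal character code."""
    if n >= 0x0A:
        # The compare leaves the carry set, so ADC adds one extra:
        # n + $B6 + 1 gives $C1 ('A') for n = 10.
        return n + 0xB6 + 1
    return n + 0xB0                  # carry cleared first: '0' through '9'


def put_hex(value):
    """Return the two character codes in the order they land in LINE."""
    low = form_nibble(value & 0x0F)  # formed first, then swapped into the high byte
    high = form_nibble(value >> 4)   # formed second, left in the low byte
    return bytes([high, low])        # a sixteen-bit store writes the low byte first


if __name__ == "__main__":
    codes = put_hex(0x3F)
    print(codes.hex(), bytes(c & 0x7F for c in codes).decode("ascii"))
```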
                ;
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ; PUTHEX
                ;
                ; CONVERTS NUMBER IN ACCUMULATOR TO HEX STRING
                ; STORED AT LINE,Y
                ;
                ; SAVES AND RESTORES MODE FLAGS
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                PUTHEX     START
0000  08                   PHP
0001  200D00               JSR   MAKEHEX
0004  C220                 REP   #M
                           LONGA ON
0006  990080               STA   LINE,Y
0009  C8                   INY
000A  C8                   INY
000B  28                   PLP
000C  60                   RTS
000D  E230      MAKEHEX    SEP   #M+X
                           LONGA OFF
                           LONGI OFF
000F  48                   PHA
0010  290F                 AND   #$0F
0012  201B00               JSR   FORMNIB
0015  EB                   XBA
0016  68                   PLA
0017  4A                   LSR   A
0018  4A                   LSR   A
0019  4A                   LSR   A
001A  4A                   LSR   A
                ;
001B  C90A      FORMNIB    CMP   #$A
001D  B004                 BGE   HEXDIG
001F  18                   CLC
0020  69B0                 ADC   #'0'
0022  60                   RTS
0023  69B6      HEXDIG     ADC   #'A'-11
0025  60                   RTS
0026                       END
Local Symbols
FORMNIB    00001B   HEXDIG     000023   MAKEHEX    00000D
CLRLN
CLRLN performs the very straightforward task of clearing the output buffer, LINE, to blanks. It also
contains the global storage reserved for LINE.
Like the other utility routines, CLRLN saves and restores the status.
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                ; CLRLN
                ;
                ; CLEARS LINE WITH BLANKS
                ;
                ; SAVES AND RESTORES MODE FLAGS
                ;
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                CLRLN      START
0000  08                   PHP
0001  C230                 REP   #M+X
                           LONGA ON
                           LONGI ON
0003  A9A0A0               LDA   #'  '
0006  A24400               LDX   #68
0009  9D1200    LOOP       STA   LINE,X
000C  CA                   DEX
000D  CA                   DEX
000E  10F9                 BPL   LOOP
0010  28                   PLP
0011  60                   RTS
0012            LINE       ENTRY
0012  A0A0A0A0             DC    70C' '
0058  8D00                 DC    H'8D00'
005A                       END
Local Symbols
LINE       000012   LOOP       000009
UPDATE
This routine, common to both the disassembler and the tracer, updates the program counter and other
direct page variables (the address mode attribute, ADDRMODE, and the length, OPLEN) and, using the
length, reads the instruction operands into direct page memory.
The address mode and length attributes are stored in a table called ATRIBL, two bytes per instruction.
Since there are 256 different codes, the table size is 512 bytes. The current opcode itself, fetched previously, is
used as the index into the table. Since the table entries are two bytes each, the index is first multiplied by two
by shifting left. Since the sixteen-bit accumulator was used to calculate the index, both attribute bytes can be
loaded in a single operation; since their location in direct page memory is adjacent, they can be stored in a
single operation as well.
Normally, the value of OPLEN loaded from the attribute table is the correct one; in the case of the
immediate addressing mode, however, the length varies with the setting of the m and x flags. The opcodes for
the immediate instructions are trapped using just three comparisons, an AND, and four branches to test the
opcode bits. Note that the constants used in these tests are multiplied by two because the opcode already happens to
be shifted left once. If the current instruction uses immediate addressing, the stored value of the status register
is checked for the relevant flag setting; if m or x, as appropriate, is clear, then OPLEN is incremented. The
routines that output the immediate operand now know the correct number of operand bytes to print, and the
tracer knows where the next instruction begins.
The status is saved on entry and restored on exit.
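The opcode tests that decide whether OPLEN is incremented can be restated as a predicate. This is a Python sketch of the decision only, not the UPDATE routine; the function name immediate_is_long is an assumption, and preg and ebit stand for the saved status register and e bit of the program being examined.

```python
M_FLAG = 0x20   # accumulator/memory select bit in the status register
X_FLAG = 0x10   # index register select bit in the status register


def immediate_is_long(opcode, preg, ebit):
    """Return True when the opcode takes a sixteen-bit immediate operand."""
    v = (opcode & 0xFF) << 1           # the routine tests the opcode times two
    if ebit == 1:
        return False                   # emulation mode: all immediates short
    if v & 0x20:
        return False                   # high digit odd: cannot be immediate
    if v == 0xA2 * 2:
        return not (preg & X_FLAG)     # LDX #
    if (v & 0x1E) == 0 and v >= 0xA0 * 2:
        return not (preg & X_FLAG)     # LDY #, CPY #, CPX #
    if (v & 0x1E) == 0x9 * 2:
        return not (preg & M_FLAG)     # accumulator immediates end in 9
    return False


if __name__ == "__main__":
    # LDA # with m clear is long; with m set it is short.
    print(immediate_is_long(0xA9, 0x00, 0), immediate_is_long(0xA9, 0x20, 0))
```

An instruction such as REP (opcode $C2) also carries an immediate operand, but it falls through every test and keeps its table length, since its operand is always one byte.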
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                ; UPDATE
                ;
                ; UPDATES ATTRIBUTE VARIABLES BASED ON OPCODE
                ; PASSED IN ACCUMULATOR BY LOOKING IN ATTRIBUTE
                ; TABLES
                ;
                ; SAVES AND RESTORES MODE FLAGS
                ; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
                ;
                UPDATE     START
                           USING ATRIBL
                LDYI       EQU   $A0*2
                LDXI       EQU   $A2*2
0000  08                   PHP                 SAVE STATE
0001  C230                 REP   #M+X
                           LONGA ON
                           LONGI ON
0003  29FF00               AND   #$FF
0006  0A                   ASL   A
0007  A8                   TAY
0008  B90080               LDA   ATRIBL,Y
000B  EB                   XBA
000C  859C                 STA   ADDRMODE
000E  AA                   TAX
000F  98                   TYA
0010  E210                 SEP   #X
                           LONGI OFF
0012  BCFF7F               LDY   LENS-1,X
0015  849F                 STY   OPLEN
0017  A697                 LDX   EBIT          EMULATION MODE?
0019  E001                 CPX   #1            TEST BIT ZERO
001B  F02E                 BEQ   SHORT         YES - ALL IMMEDIATE ARE SHORT
                ;
001D  892000               BIT   #$20          IS MSD*2 EVEN?
0020  D029                 BNE   SHORT         NO, CAN'T BE IMMEDIATE
0022  C94401               CMP   #LDXI         IS IT LDX #?
0025  F00A                 BEQ   CHKX
0027  891E00               BIT   #$F*2         IS LSD*2 ZERO?
002A  D00E                 BNE   CHKA          CHECK ACCUMULATOR OPCODES
002C  C94001               CMP   #LDYI         MUST = LDY# OR GREATER
002F  9009                 BLT   CHKA          NO, MAYBE ACCUMULATOR
0031  A596      CHKX       LDA   PREG          IF IT IS, WHAT IS FLAG SETTING?
0033  291000               AND   #X
0036  F011                 BEQ   LONG          CLEAR 16 BIT MODE
0038  D011                 BNE   SHORT         SET 8 BIT MODE
003A  291E00    CHKA       AND   #$0F*2
003D  C91200               CMP   #$9*2
0040  D009                 BNE   SHORT
0042  A596                 LDA   PREG
0044  292000               AND   #M
0047  D002                 BNE   SHORT
0049  E69F      LONG       INC   OPLEN
                ;
004B  A000      SHORT      LDY   #0
004D  8005                 BRA   LOOPIN
004F  A780      LOOP       LDA   [PCREG]
0051  AA                   TAX
0052  9687                 STX   OPRNDL-1,Y
                ;
0054  E680      LOOPIN     INC   PCREG
0056  C8                   INY                 BYTE
0057  C49F                 CPY   OPLEN         MOVED ALL OPERAND BYTES?
0059  D0F4                 BNE   LOOP          NO, CONTINUE
005B  28        DONE       PLP
005C  60                   RTS
005D                       END
Local Symbols
CHKA       00003A   CHKX       000031   DONE       00005B   LDXI       000144
LDYI       000140   LONG       000049   LOOP       00004F   LOOPIN     000054
SHORT      00004B
PRINTLN
This is the output routine. In this version, an existing 6502 output routine is called, which requires a
switch back to emulation mode. Since this is the only place a 6502 routine is called, a simpler mode-switching
routine than the generalized one of the previous chapter is used. The user registers do not need to be preserved,
but zero must be loaded into the direct page register so that direct page addressing refers to page zero.
The main loop runs in emulation mode until the null terminating byte of LINE is encountered; on exit, the
native mode, direct page, and status are restored.
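The loop structure of PRINTLN can be modeled at a high level. The following is a minimal Python sketch, not part of the tracer; the names `print_line` and `cout` are hypothetical stand-ins for the routine and for the monitor's character output entry point, and the 8-bit wraparound of the Y register is not modeled.

```python
def print_line(line: bytes, cout) -> int:
    """Send bytes from `line` to `cout` until a zero byte is reached.

    Returns the number of characters sent (the final value of the index).
    """
    count = 0
    for byte in line:
        if byte == 0:      # BEQ DONE: the null terminator ends the loop
            break
        cout(byte)         # JSR COUT
        count += 1         # INY
    return count


# Example: characters with the high bit set, followed by the terminator.
out = []
sent = print_line(b"\xc1\xbd\xb0\xb0\x00ignored", out.append)
```

Anything after the zero byte is never sent, which is why LINE must always be terminated before PRINTLN is called.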
0696
0697
0698
0699
0700
0701
0702
0703
0704
0705
0706
0707
0708
0709
0710
0711
0712
0713
0714
0715
0716
0717
0718
0719
0720
0721
0722
0723
0724
0725
0726
0727
0728
0729
0730
0731
0732
0733
0734
0735
0736
0737
0738
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0001
0002
0005
0006
0006
0006
0006
0007
0008
0008
000A
000A
000D
000F
0012
0013
0015
0015
0016
0017
0018
0019
001A
001A
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
; PRINTLN
;
; MACHINE-DEPENDENT CODE TO OUTPUT
; THE STRING STORED AT LINE
;
; SAVES AND RESTORES MODE FLAGS
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
PRINTLN
COUT
START
EQU
$FDED
08
0B
F40000
2B
PHP
PHD
PEA
PLD
OFF
OFF
38
FB
LONGA
LONGI
SEC
XCE
A000
LDY
#0
LDA
BEQ
JSR
INY
BRA
LINE,Y
DONE
COUT
B90080
F006
20EDFD
C8
80F5
LOOP
18
FB
2B
28
60
DONE
CLC
XCE
PLD
PLP
RTS
SWITCH TO EMULATION
LOOP
RESTORE NATIVE MODE
RESTORE DIRECT PAGE
RESTORE MODE FLAGS
END
Local Symbols
COUT
00FDED
DONE
000015
LOOP
00000A
TRACE
This is the actual entry to the trace routine. It performs initialization similar to LIST, and additionally
sets up the BRK vectors, so they can point to locations within the tracer.
The e flag, direct page register and data bank register are all given initial values of zero. The program
counter and program counter bank are presumed to have been initialized by the user. The first byte of the
program to be traced is loaded; since indirect long addressing is used, this program can be used with the 65816
to debug programs located in any bank. It can, of course, also be used with the 65802.
The jump to TBEGIN enters the main loop of the trace routine in the middle; in other words,
between instructions.
0739
0740
0741
0742
0743
0744
0745
0746
0747
0748
0749
0750
0751
0752
0753
0754
0755
0756
0757
0758
0759
0760
0761
0762
0763
0764
0765
0766
0767
0768
0769
0770
0771
0772
0773
0774
0775
0776
0777
0778
0779
0780
0781
0782
0783
0784
0785
0786
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0001
0002
0003
0004
0004
0006
0006
0009
0009
000A
000D
000D
0010
0011
0011
0013
0013
0015
0015
0014
0017
0019
001B
001D
001F
0021
0021
0024
0024
0027
002A
002A
0787
0788
002D
0030
START
GEQU
GEQU
$3F0
$FFE6
08
18
FB
08
PHP
CLC
XCE
PHP
C210
F40000
REP
LONGI
PEA
#$10
ON
0
BA
8E3D00
TSX
STX
SAVSTACK
F40003
2B
PEA
PLD
DPAGE
8691
STX
STACK
E220
SEP
LONGA
#$20
OFF
A901
8597
6493
6494
6495
649E
LDA
STA
STZ
STZ
STZ
STZ
#1
EBIT
DIRREG
DIRREGH
DBREG
MNX+1
9C0080
STZ
STEPCNTRL
A20080
8EF003
LDX
STX
#EBRKN
USRBRKV
AEE6FF
LDX
BRKN
E000C0
9003
CPX
BLT
#$C000
OK
0032
4C0080
0790
0791
0792
0793
0794
0795
0796
0797
0798
0799
0035
0038
0038
003A
003D
003D
003D
003F
003F
0041
8E3F00
0800
0801
0041
0043
0000
OK
A780
4C0080
SAVSTACK
0000
USRBRKN
0000
SAVRAM
JMP
QUIT
STX
USRBRKN
LDA
JMP
[PCREG]
TBEGIN
ENTRY
DS
ENTRY
DS
ENTRY
DS
END
2
2
Local Symbols
OK
000035
SAVRAM
000041
SAVSTACK
00003D
USRBRKN
00003F
EBRKIN
This is the main loop of the tracer. It has three entry points: one each for the emulation and native
mode BRK vectors to point to, and a third (TBEGIN) which is entered when the program starts tracing and
there is no last instruction. This entry provides the logical point to begin examining the tracing process.
TRACE has performed some initialization, having loaded the opcode of the first instruction to be
traced into the accumulator. As with FLIST, UPDATE is called to update the program counter and copy the
instruction attributes and operand into direct page memory. The routine CHKSPCL is then called to handle the
flow-altering instructions; in these cases, it will modify PCREG to reflect the target address. In either case, the
opcode of the next instruction is loaded, and a BRK instruction (a zero) is stored in its place, providing a means
to regain control immediately after the execution of the current instruction.
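The save-and-patch step described above can be modeled in a few lines. This is a Python sketch under stated assumptions: memory is a flat `bytearray`, and `plant_break` and `remove_break` are hypothetical names, not routines from the listing (where the saved opcode lives in the direct page variable NCODE).

```python
BRK = 0x00  # the BRK opcode is a zero byte


def plant_break(memory: bytearray, next_pc: int) -> int:
    """Replace the opcode at next_pc with BRK and return the original opcode."""
    saved = memory[next_pc]
    memory[next_pc] = BRK
    return saved


def remove_break(memory: bytearray, next_pc: int, saved: int) -> None:
    """Put the original opcode back once the BRK has returned control."""
    memory[next_pc] = saved


# Example program: LDA #$01 (A9 01) followed by TAX (AA).
program = bytearray([0xA9, 0x01, 0xAA])
ncode = plant_break(program, 2)      # trace LDA #$01; break lands on TAX
patched = bytes(program)
remove_break(program, 2, ncode)
```

The patched address must be the one the processor will actually reach next, which is why flow-altering instructions have to be resolved before the BRK is stored.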
The contents of the RAM pointed to by the (arbitrary) ROM values in the native mode BRK vector are
temporarily saved, and the location is patched with a jump to the NBRKIN entry point.
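As an illustration of the patch just described, the following Python sketch saves the three bytes at the vector's target and overwrites them with a JMP absolute ($4C, low byte, high byte). The function names and the example addresses are assumptions made for the sketch; the listing keeps the saved bytes in SAVRAM.

```python
JMP_ABS = 0x4C  # opcode for JMP absolute


def patch_vector_target(memory: bytearray, target: int, handler: int) -> bytes:
    """Save three bytes at `target`, then store JMP `handler` there."""
    saved = bytes(memory[target:target + 3])
    memory[target] = JMP_ABS
    memory[target + 1] = handler & 0xFF          # address low byte
    memory[target + 2] = (handler >> 8) & 0xFF   # address high byte
    return saved


def restore_vector_target(memory: bytearray, target: int, saved: bytes) -> None:
    """Give the borrowed RAM back its original contents."""
    memory[target:target + 3] = saved


# Hypothetical addresses: the vector points at $0300, the handler is at $8010.
ram = bytearray(0x400)
ram[0x300:0x303] = b"\x11\x22\x33"
saved = patch_vector_target(ram, 0x300, 0x8010)
during = bytes(ram[0x300:0x303])
restore_vector_target(ram, 0x300, saved)
```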
The registers are then loaded with their user program values: these will have been preinitialized by
TRACE, or will contain the values saved at the end of the execution of the previous instruction. Note the order
in which the registers are loaded: some are loaded from direct page locations, while others are pushed onto the
stack directly from direct page locations and then pulled into the various registers. Once the user registers have been loaded with their
values, they cannot be used for data movement. The P status register must be pulled last, to prevent any other
instructions from modifying the flags.
The e bit is restored by loading the P register with a mask reflecting the value it should have; e is
exchanged with the carry, and a second PLP instruction restores the correct status register values.
The routine exits via a jump indirect long through the OPCREG (old PCREG) variable, which points to the current
instruction. It will be reentered (at either EBRKIN or NBRKIN) when the BRK instruction that immediately
follows the current instruction is executed.
Before this, however, the single instruction will be executed by the processor; any memory to be loaded
or stored, or any registers to be changed by the instruction, will be modified.
After the BRK is executed, control returns to the tracer either at EBRKIN, if the user program was in
emulation mode, or at NBRKIN if the user program was in native mode. The first thing that must be done is
preserve the state of the machine as it was at the end of the instruction.
The BRK instruction has put the program counter bank (only in native mode), the program counter, and
the status register on the stack. The program already knows the address of the next instruction, so the value on
the stack can be disregarded. The status register is needed, however.
Entry to EBRKIN is from the Apple II monitor user vector at $3F0 and $3F1. The Apple II monitor
handles emulation mode BRK instructions by storing the register values to its own zero page locations; it pulls
the program counter and status register from the stack and stores them, too. The code at EBRKIN dummies up
a native mode post-BRK stack by first pushing three place-holder bytes, then loading the status register from
where the Apple monitor stored it, and pushing it. The accumulator and X registers are reloaded from
monitor locations; Y has been left intact. A one is stored to the variable EBIT, which will be used to restore the
emulation mode when EBRKIN exits. The processor switches to native mode, and control falls through into
NBRKIN, the native mode break handler.
With the stack in the correct state for both emulation mode and native mode entries, the routine
proceeds to save the entire machine context. The register sizes are extended to sixteen bits to provide a standard
size which encompasses the maximum size possible. The data bank and direct page registers are pushed onto
the stack; the DPAGE value is pushed on immediately after, and pulled into the direct page, establishing the
local direct page. With this in place, the A, X, and Y registers can be stored at their direct page locations. The
register values pushed on the stack are picked off using stack-relative addressing. Since control is not returned
by execution of an RTI (as is usual for interrupt processing), but instead is returned by means of a JMP, the
stack must be cleaned up. Since seven bytes have been pushed, seven is added to the current stack pointer, and
then saved at the direct page variable STACK. This being done, a small local stack region at $140 can be
allocated.
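The count of seven follows from the stack layout: a native mode BRK pushes four bytes (program bank, program counter high and low, status), PHB pushes one, and PHD pushes two. A small Python model of picking values off that frame is given here as a sketch; `read_break_frame` is a hypothetical name and `memory` stands in for bank zero.

```python
def read_break_frame(memory: bytearray, s: int):
    """Model the stack-relative reads made after BRK, PHB, and PHD.

    `s` is the stack pointer after the pushes. Offsets from s:
      +1, +2  direct page register (low, high)   pushed by PHD
      +3      data bank register                 pushed by PHB
      +4      status register                    pushed by BRK
      +5, +6  program counter (low, high)        pushed by BRK
      +7      program bank                       pushed by BRK
    """
    direct_page = memory[s + 1] | (memory[s + 2] << 8)  # LDA 1,S
    data_bank = memory[s + 3]                           # low byte of LDA 3,S
    status = memory[s + 4]                              # high byte of LDA 3,S
    user_stack = (s + 7) & 0xFFFF                       # TSC / CLC / ADC #7
    return direct_page, data_bank, status, user_stack


bank0 = bytearray(0x200)
s = 0x01F0
bank0[s + 1:s + 5] = bytes([0x00, 0x03, 0x02, 0x30])
frame = read_break_frame(bank0, s)
```

Because the accumulator is sixteen bits wide at this point, the single load from 3,S fetches the data bank and the status register together.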
The memory borrowed as a RAM native-mode BRK vector is restored.
The current line is then disassembled in the same manner as LIST. The register values just stored into
memory are also displayed via the routine DUMPREGS.
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0819
0000
0820
0821
0822
0823
0824
0825
0826
0827
0828
0829
0830
0831
0832
0833
0834
0835
0836
0837
0838
0839
0840
0841
0842
0843
0844
0845
0846
0847
0848
0849
0850
0851
0852
0853
0854
0855
0856
0857
0858
0859
0860
0861
0862
0863
0864
0865
0866
0000
0000
0000
0003
0004
0006
0007
0009
000B
000B
000B
000B
000B
000B
000E
000E
000F
0010
0010
0010
0010
0010
0012
0012
0012
0012
0013
0014
0017
0018
0018
001A
001C
001E
001E
0020
0022
0022
0023
0024
0027
0029
0029
002B
002D
002D
0030
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
; EBRKIN, NBRKIN, TBEGIN
;
; ENTRY POINTS FOR TRACER MAIN LOOP
; EBRKIN AND NBRKIN RECOVER CONTROL AFTER
; BRK INSTRUCTION EXECUTED
; TBEGIN IS INITIAL ENTRY FROM TRACE
;
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
EBRKIN
;
START
LONG
A
LONGI
F40000
4848
A548
48
A545
A646
PEA
PHA
LDA
PHA
LDA
LDX
;
;
;
APPLE II MONITOR
LOCATIONS
FOR P, A
AND X
EE9703
INC
18
FB
CLC
XCE
GO NATIVE
ENTRY
NBRKIN
;
EBIT+DPAGE
C230
REP
LONGA
LONGI
#M+X
ON
ON
8B
0B
F40003
2B
PHB
PHD
PEA
PLD
858F
868B
848D
STA
STX
STY
AREG
XREG
YREG
A301
8593
LDA
STA
1,S
DIRREG
3B
18
690700
8595
TSC
CLC
ADC
STA
#7
STACK
SAVE AS STACK
A303
8595
LDA
STA
3,S
DBREG
A94001
1B
LDA
TCS
#$140
DPAGE
0031
0031
0032
0033
0036
0039
003C
003F
0042
0045
0048
0048
004B
004B
004E
0051
0054
0054
0056
0056
0056
0058
0058
0058
005B
005D
005D
0060
0062
0064
0066
0069
006B
006B
006E
0070
0072
0075
0075
0075
0077
0079
0079
0079
007A
007A
007A
007C
007E
0080
0082
0082
0083
0083
0085
0088
0088
0088
0088
008B
008B
008D
008D
008F
0091
0091
0093
0093
4B
AB
AE0080
AD0180
9D0100
AD0080
9D0100
200080
200080
PHK
PLB
LDX
LDA
STA
LDA
STA
JSR
JSR
USRBRKN
SAVRAM+1
!1,X
SAVRAM
!0,X
FLIST
FRMOPRND
200080
JSR
PRINTLN
PRINT IT
200080
200080
200080
JSR
JSR
JSR
CLRLN
DUMPREGS
PRINTLN
E220
SEP
LONGA
#M
ON
C210
REP
LONGI
STEPCNTRL
DOPAUSE
2CE000
300E
BIT
BMI
200080
9068
5011
A980
8DE000
800A
JSR
BCC
BVC
LDA
STA
BRA
STEP
QUIT
RESUME
#$80
STEPCNTRL
RESUME
JSR
BCC
BVC
STZ
PAUSE
QUIT
RESUME
STEPCNTRL
RESUME
LDA
STA
NCODE
[PCREG]
TBEGIN
ENTRY
TAY
200080
905A
5003
9CE000
DOPAUSE
;
A583
8780
AB
;
A680
8684
A582
8586
LDX
STX
LDA
STA
98
TYA
8587
200080
PCREG
OPCREG
PCREGB
OPCREGB
STA
JSR
CODE
UPDATE
JSR
CHKSPCL
LDA
[PCREG]
STA
LDA
NCODE
#0
STA
[PCREG]
;
;
200080
;
A780
;
8583
A900
;
8780
GO
ENTRY
0093
0095
0095
0095
0098
009B
009E
00A1
00A4
00A7
00AA
00AD
00B0
00B2
00B3
00B5
00B7
00B9
00B9
00BB
00BB
00BD
00BF
00C1
00C1
00C2
00C2
00C3
00C4
00C5
00C5
00C6
00C7
00C7
00CA
00CA
00CA
00CC
00CC
00CC
00CE
00D0
00D0
00D2
00D2
00D2
00D5
00D6
00D7
00D8
00D8
00DB
00DC
00DC
00DD
00DE
00DF
00E0
00E0
00E0
00E1
C230
AE0080
BD0000
8D0080
BD0100
8D0180
A94C00
9D0000
A91000
9D0100
A561
1B
D495
D496
D493
REP
LONGA
LONGI
LDX
LDA
STA
LDA
STA
LDA
STA
LDA
STA
LDA
TCS
PEI
PEI
PEI
#M+X
ON
ON
USRBRKN
!0,X
SAVRAM
!1,X
SAVRAM+1
#$4C
!0,X
#NBRKIN
!1,X
STACK
(DBREG)
(EBIT-1)
(DIRREG)
6497
STZ
EBIT
A58F
A48D
A68B
LDA
LDY
LDX
AREG
YREG
XREG
2B
PLD
28
28
FB
PLP
PLP
XCE
AB
28
PLB
PLP
DC8403
JMP
[DPAGE+OPCREG]
E220
ENTRY
SEP
LONGA
#$20
OFF
A583
8780
LDA
STA
NCODE
[PCREG]
C210
REP
LONGI
#$10
ON
AE0080
E8
E8
9A
LDX
INX
INX
TXS
SAVSTACK
F40000
2B
PEA
PLD
28
FB
28
60
PLP
XCE
PLP
RTS
QUIT
STEPCNTRL
00
ENTRY
DS
END
RESTORE STACK
POP IT AWAY!
ON TO THE NEXT!
Local Symbols
DOPAUSE
RESUME
00006B
000075
GO
STEPCNTRL
000093
0000E0
NBRKIN
TBEGIN
000010
000079
QUIT
0000CA
CHKSPCL
This routine checks the opcode about to be executed to see if it will cause a transfer of control. Is it a
branch, a jump, or a call? If it is any of the three, the destination of the transfer must be calculated and stored at
PCREG so that a BRK instruction can be stored there to maintain control after the current instruction is
executed.
A table that contains all of the opcodes which can cause a branch or jump (SCODES) is scanned. If a
match with the current instruction is not found, the routine exits and tracing resumes.
If a match is found, the value of the index into the table is checked. The opcodes for all the branches
are stored at the beginning of SCODES, so if the value of the index is less than 9, the opcode was a branch and
can be handled by the same general routine.
The first thing that must be determined if the opcode is a branch is whether or not the branch will be
taken. By shifting the index right (dividing by two) an index for each pair of different types of branches is
obtained. This index is used to get a mask for the bit in the status register to be checked. The value shifted into
the carry determines whether the branch is taken if the status bit is set or clear.
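The pairing trick can be checked with a short Python model. The table contents below are copied from the SCODES and PMASK listings (the nine branch opcodes and the five masks); the function name `branch_taken` is an assumption made for this sketch.

```python
# Branch opcodes in SCODES order: BPL BMI BVC BVS BCC BCS BNE BEQ BRA
BRANCH_CODES = [0x10, 0x30, 0x50, 0x70, 0x90, 0xB0, 0xD0, 0xF0, 0x80]
# One mask per pair: N, V, C, Z, and zero for BRA
PMASK = [0x80, 0x40, 0x01, 0x02, 0x00]


def branch_taken(opcode: int, p: int) -> bool:
    """Decide whether a branch opcode is taken for status register value p."""
    index = BRANCH_CODES.index(opcode)
    mask = PMASK[index >> 1]        # LSR A: one mask per pair of branches
    branch_if_set = index & 1       # the bit LSR shifted into the carry
    flag_is_set = (p & mask) != 0   # AND PREG
    return flag_is_set if branch_if_set else not flag_is_set


z_set = 0x02
results = (
    branch_taken(0xF0, z_set),   # BEQ with Z set
    branch_taken(0xD0, z_set),   # BNE with Z set
    branch_taken(0x80, 0xFF),    # BRA ignores the flags
)
```

BRA needs no special case: its mask is zero, so the masked result is always zero, and its even index selects the "branch if clear" sense.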
If a branch is not taken, the routine exits. If, however, a branch is taken, the new program counter value
must be calculated by sign extending the operand and adding it to the current program counter.
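The target calculation is ordinary two's-complement arithmetic. In this Python sketch, `next_pc` is assumed to be the address of the instruction following the branch (the value PCREG already holds after UPDATE has run), and `branch_target` is a hypothetical name.

```python
def branch_target(next_pc: int, operand: int) -> int:
    """Sign-extend an 8-bit branch operand and add it to the 16-bit PC."""
    if operand & 0x80:
        displacement = operand - 0x100   # negative: extend with ones
    else:
        displacement = operand           # positive: extend with zeros
    return (next_pc + displacement) & 0xFFFF   # wraps within the bank


backward = branch_target(0x2002, 0xFE)   # branch back by two bytes
forward = branch_target(0x2002, 0x10)    # branch ahead by sixteen bytes
```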
Each of the other opcodes (jumps and calls) is dispatched to a handler routine through a jump table.
Since only the new program counter values must be calculated, jumps and calls with the same addressing mode
can be handled by the same routine.
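The dispatch arithmetic can be sketched in Python. The opcode and handler orderings below follow the SCODES and SPJMP listings; the handlers are represented only by their label names, and `handler_for` is a hypothetical helper. The index is doubled because each table entry is a two-byte address, and the table base is biased by 18 because the nine branch opcodes have no entries.

```python
# Non-branch entries of SCODES, starting at table index 9
SPECIAL_CODES = [0x00, 0x20, 0x40, 0x60, 0x02, 0x22, 0x82,
                 0x6B, 0x4C, 0x5C, 0x6C, 0x7C, 0xDC, 0xFC]
# SPJMP entries, in the same order
HANDLERS = ["SBRK", "SJSRABS", "SRTI", "SRTS", "SCOP", "SJSRABSL", "SBRL",
            "SRTL", "SJMPABS", "SJMPABSL", "SJMPIND", "SJMPINDX",
            "SJMPINDL", "SJSRINDX"]


def handler_for(opcode: int) -> str:
    """Return the handler label selected by JMP (SPJMP-18,X)."""
    index = 9 + SPECIAL_CODES.index(opcode)   # position within SCODES
    byte_offset = index * 2 - 18              # ASL A, then the -18 bias
    return HANDLERS[byte_offset // 2]


jmp_abs = handler_for(0x4C)
jsr_indexed = handler_for(0xFC)
```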
Breaks, co-processor calls, and RTIs are not handled at all; a more robust tracer would handle BRKs by
letting breakpoints be set and cleared. Since the software interrupts are not implemented, and software tracing
of hardware interrupts is impractical, RTI is left unimplemented. The program counter is incremented by one,
causing these instructions to be bypassed completely.
All of the jumps and calls are straightforward. Long addressing is used to force the stack and indirect
addressing modes to access bank zero. Also notice the way the data bank register is copied to the program
counter bank for indirect indexed addressing. Finally, note how the long addressing modes call their absolute
analogs as subroutines, then handle the bank byte.
0000
0000
0000
0000
0000
0000
0000
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0003
0005
0005
0008
000A
000B
000D
000E
0003
0003
0010
0010
0010
0011
0013
0015
0015
0016
0017
001A
001A
001C
001C
001E
001E
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
; CHKSPCL
;
; CHECK CURRENT OPCODE (IN CODE) FOR SPECIAL CASES
; - - INSTRUCTIONS WHICH TRANSFER CONTROL (JMP, BRA, ETC.)
;
;
; ASSUMES SHORTA, LONGI - -CALLED BY EBRKIN
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
CHKSPCL
START
LONGA
LONGI
OFF
ON
LDX
LDA
#SCX-SCODES
CODE
CMP
BEQ
DEX
BPL
RTS
SCODES,X
HIT
SEP
LONGI
#X
OFF
8A
C909
B00F
TXA
CMP
BGE
#9
NOTBR
4A
AA
BD0080
LSR
TAX
LDA
PMASK,X
AND
PREG
BCS
BBS
F00B
BEQ
DOBRANCH
0020
0021
0021
0023
0024
0024
0025
0025
0026
0026
0028
0028
0028
002B
002D
002D
002E
60
RTS
EB
XBA
002E
0030
A588
LDA
A20000
A587
DD0080
F004
CA
10F8
60
LOOP
E210
HIT
LOOP
EXIT IF NOT
;
2596
B003
D008
60
BBS
BNE
RTS
DOBRANCH
0A
NOTBR
;
ASL
AA
TAX
C210
7CEE7F
REP
JMP
#X
(SPJMP-18,X)
ENTRY
LDA
#$FF
DOBRANCH
A9FF
;
OPRNDL
0030
0032
0032
0032
0032
0034
0034
0037
0037
0039
003B
003D
003E
C231
REP
LONGA
LONGI
#M+X+C
ON
ON
3003
BMI
OK
AND
#$7F
ADC
STA
SEP
RTS
END
PCREG
PCREG
#M
297F00
;
OK
6580
8580
E220
60
Local Symbols
BBS
NOTBR
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
000021
000024
0000
0000
0000
0000
0000
0001
0001
0001
0001
0003
0005
0006
0006
0006
0008
0008
000A
000C
000E
0010
0010
0011
0011
0011
0011
0013
0015
0017
0019
001A
001A
001A
001C
001F
0021
0024
0025
0025
0027
002B
002C
002E
0030
0030
0031
0031
0031
DOBRANCH
OK
00002B
000037
HIT
LOOP
000005
SBRK
SRTI
SCOP
START
ENTRY
ENTRY
RTS
SJSRABS
SJMPABS
ENTRY
ENTRY
LDX
STX
RTS
ABSOLUTES - -
60
A688
8680
60
SBRL
C221
A588
6580
8580
E220
60
SJSRABSL
SJMPABSL
A688
8680
A58A
8582
60
SRTS
ENTRY
REP
LONGA
LDA
ADC
STA
SEP
LONGA
RTS
ENTRY
ENTRY
LDX
STX
LDA
STA
RTS
OPRNDL
PCREG
#M+C
ON
OPRNDL
PCREG
PCREG
#M
OFF
OPRNDL
PCREG
OPRNDB
PCREGB
STACK
SAVSTACK
CONT
QUIT
C220
BF000000
1A
8580
E220
REP
LDA
INC
STA
SEP
#M
>0,X
A
PCREG
#M
60
RTS
CONT
SRTL
ENTRY
MOVE OPERAND TO PC
LONG BRANCH
LONG ACCUM AND CLEAR CARRY
ADD DISPLACEMENT TO
PROGRAM COUNTER
ABSOLUTE LONGS
ENTRY
LDX
CPX
BNE
JMP
INX
A691
EC0080
D003
4C0080
E8
00000E
RETURN
PEEK ON STACK
IF ORIGINAL STACK . . .
RETURN TO MONITOR
RETURN LONG
0031
0034
0034
0035
0036
003A
003C
003D
003D
003D
003D
003F
003F
201A00
JSR
E8
E8
BF000000
8582
60
INX
INX
LDA
STA
RTS
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
0041
0045
0047
0049
004A
004A
004A
004A
004D
004E
004F
0053
0055
0056
0056
0056
0056
0056
0058
005A
005C
005E
0060
0060
0062
0064
0066
0068
0068
0069
0069
0069
Local Symbols
CONT
SJMPABSL
SJSRABS
SRTL
SJMPIND
SRTS
DO NORMAL RETURN,
THEN GET BANK BYTE
>0,X
PCREGB
A688
ENTRY
LDX
OPRNDL
C220
REP
#M
BF000000
8580
E220
60
LDA
STA
SEP
RTS
>0,X
PCREG
#M
SJMPIND
>0,X
PCREGB
SJMPINDL
203D00
E8
E8
BF000000
8582
60
SJMPINDX
SJSRINDX
ENTRY
JSR
INX
INX
LDA
STA
RTS
A48B
A688
8699
A582
859B
ENTRY
ENTRY
LDY
LDX
STX
LDA
STA
XREG
OPRNDL
TEMP
PCREGB
TEMP+2
C220
B799
8680
E220
REP
LDA
STA
SEP
#M
[TEMP],Y
PCREG
#M
60
RTS
INDIRECT
GET OPERAND
INDEX JUMPS
LET CPU DO ADDITION
GET INDIRECT POINTER
INDEXED JUMPS ARE IN PROGRAM
BANK
Y IS X
END
000024
000011
000001
000031
SBRL
SJMPIND
SJSRABSL
SRTS
000006
00003D
000011
00001A
SCOP
SJMPINDL
SJSRINDX
000000
00004A
000056
SJMPABS
SJMPINDX
SRTI
000001
000056
000000
DUMPREGS
This routine forms an output line that will display the contents of the various registers. The routine is
driven in a loop by a table containing single-character register names (A, X, and so on) and the address of
the direct page variable that contains the corresponding register value. It is interesting in that a direct page
pointer to a direct page address is used, since the two index registers are occupied with accessing the table
entries and pointing to the next available location in the output buffer.
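The table-driven approach can be illustrated with a Python sketch. The register labels and direct page offsets are taken from the TABLE entries in the listing (each entry names the high byte of a 16-bit variable); the exact spacing of the output line and the name `dump_registers` are assumptions, and the 8-bit B, P, and e fields are left out.

```python
# (label, direct page offset of the high byte), in table order
TABLE = [("D", 0x94), ("S", 0x92), ("Y", 0x8E), ("X", 0x8C), ("A", 0x90)]


def dump_registers(direct_page: bytearray) -> str:
    """Format each 16-bit register variable as LABEL=hhhh."""
    fields = []
    # The listing walks the table from its end, so A is emitted first.
    for label, high in reversed(TABLE):
        value = (direct_page[high] << 8) | direct_page[high - 1]
        fields.append(f"{label}={value:04X}")
    return " ".join(fields)


dp = bytearray(256)
dp[0x8F], dp[0x90] = 0x34, 0x12   # AREG low and high bytes
line = dump_registers(dp)
```

Adding another register to the display then only requires another table entry, not another block of code.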
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0001
0003
0003
0003
0003
0005
0005
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
0007
0009
0009
000B
000B
000E
0010
0011
0014
0017
0018
001A
001A
001C
001E
0020
0023
0025
0027
0029
002C
002E
0031
0032
0034
0037
0038
0038
003A
003C
003E
003F
0042
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
; DUMPREGS
;
; DISPLAYS CONTENTS OF REGISTER VARIABLES IN LINE
;
; SAVES AND RESTORES MODE
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
DUMPREGS
START
PHP
SEP
LONGA
LONGI
#M+X
OFF
OFF
A000
LDY
#0
A903
LDA
#>DPAGE
859A
STA
TEMPH
A209
LDX
#ENDTABLE-TABLE
LDA
STA
DEX
LDA
JSR
DEX
BPL
TABLE,X
TEMP
TABLE,X
PUTREG16
1995
8599
A9C2
200080
A996
8599
A9D0
200080
A9C5
990080
C8
A9BA
990080
C8
LDA
STA
LDA
JSR
LDA
STA
LDA
JSR
LDA
STA
INY
LDA
STA
INY
#DBREG
TEMP
#B
PUTREG8
#PREG
TEMP
#P
PUTREG8
#E
LINE,Y
A9B0
A697
F001
1A
990080
LDA
LDX
BEQ
INC
STA
#0
EBIT
OK
A
LINE,Y
08
E230
BD4400
8599
CA
BD4400
200080
CA
10F1
LOOP
OK
LOOP
NOW ALL THE 8-BIT REGISTERS
#:
LINE,Y
0 BECOMES 1
0042
0042
0043
0044
0044
0046
0048
004A
004C
004D
004E
28
60
C494
D392
D98E
D88C
C1
90
PLP
RTS
TABLE
ENDTABLE
DC
DC
DC
DC
DC
DC
END
CD,I1DIRREGH
CS,I1STACKH
CY,I1YREGH
CX,I1XREGH
CA
I1AREGH
DIRECT PAGE
ADDRESS OF
REGISTER
VARIABLES
Local Symbols
ENDTABLE
00004D
LOOP
00000B
OK
00003F
TABLE
000044
PUTREG8
This routine, along with PUTREG16, is called by DUMPREGS to actually output a register value
once its label and storage location have been loaded from the table. Naturally, it calls PUTHEX to convert the
register values to hexadecimal.
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0003
0004
0006
0009
000A
000C
000C
000C
000F
0010
0012
0015
0016
0017
0019
001B
001E
001E
001F
0021
0024
0025
0026
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
; PUTREG8
;
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
PUTREG8
990080
C8
A9BC
990080
C8
8012
PUTREG16
990080
C8
A9BD
990080
C8
C8
B299
C699
200080
C8
B299
200080
C8
60
PRIN
START
STA
INY
LDA
STA
INY
BRA
ENTRY
STA
INY
LDA
STA
INY
INY
LDA
DEC
JSR
INY
LDA
JSR
INY
RTS
END
LINE,Y
#=
LINE,Y
EQUALS . .
PRIN
LINE,Y
#=
LINE,Y
EQUALS . .
(TEMP)
TEMP
PUTHEX
(TEMP)
PUTHEX
Local Symbols
PRIN
00001E
PUTREG16
00000C
TABLES
The next several pages list the tables used by the program: SPJMP, PMASK, SCODES, MN,
MODES, LENS, and ATRIBL.
SPJMP is a jump table of entry points to the trace handlers for those instructions which modify the
flow of control.
PMASK contains the masks used to check the status of individual flag bits to determine if a branch will
be taken.
SCODES is a table containing the opcodes of the special (flow-altering) instructions.
ATRIBL is the attribute table for all 256 opcodes. Each table entry is two bytes: one is an index into the
mnemonic table, the other the address mode. This information is the key to the other tables, all used by the
UPDATE routine, which puts a description of the current instruction's attributes into the respective direct page
variables. MN is the table of instruction mnemonics that the mnemonic index attribute points into. MODES
is a jump table with the addresses of the disassembly routines for each addressing mode, and LENS contains the
length of instructions for each addressing mode. Both of these tables are indexed directly by the address
mode attribute.
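How the tables chain together can be shown with a small Python model. Only a few entries are reproduced here: the first four three-byte MN slots (a reserved slot, then ADC, AND, ASL stored as ASCII with the high bit set), the first six LENS bytes, and three ATRIBL entries for ASL. The dictionary form of ATRIBL and the name `describe` are assumptions made for the sketch; the real table is a flat array with two bytes per opcode.

```python
MN = bytes.fromhex("000000" "C1C4C3" "C1CEC4" "C1D3CC")  # slot 0, ADC, AND, ASL
LENS = bytes.fromhex("020304020101")  # modes 1-6: IMM ABS ABSL DIR ACC IMP
ATRIBL = {
    0x06: (3, 4),   # ASL D    -> mnemonic 3, mode 4 (direct)
    0x0A: (3, 5),   # ASL ACC  -> mnemonic 3, mode 5 (accumulator)
    0x0E: (3, 2),   # ASL ABS  -> mnemonic 3, mode 2 (absolute)
}


def describe(opcode: int):
    """Return (mnemonic, address mode, instruction length) for an opcode."""
    mn_index, mode = ATRIBL[opcode]
    raw = MN[3 * mn_index: 3 * mn_index + 3]
    mnemonic = bytes(b & 0x7F for b in raw).decode("ascii")  # clear high bits
    length = LENS[mode - 1]          # LENS-1,X: modes are numbered from one
    return mnemonic, mode, length


asl_acc = describe(0x0A)
asl_abs = describe(0x0E)
```

The lengths in LENS are the defaults for each mode; immediate operands are the exception, since UPDATE adjusts the length when the relevant register is sixteen bits wide.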
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0002
0004
0006
0008
000A
000C
000E
0010
0012
0014
0016
0018
001A
001C
001C
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
; SPJMP
; JUMP TABLE FOR SPECIAL OPCODE HANDLERS
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
SPJMP
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
SCT
START
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
ASBRK
ASJSRABS
ASRTI
ASRTS
ASCOP
ASJSRABSL
ASBRL
ASRTL
ASJMPABS
ASJMPABSL
ASJMPIND
ASJMPINDX
ASJMPINDL
ASJSRINDX
END
Local Symbols
SCT
00001A
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0000
0001
0002
0003
0004
0005
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
0000
0000
0000
0000
0000
0000
0000
0000
0001
0002
0003
0004
0005
0006
0007
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
; PMASK
; STATUS REGISTER MASKS FOR BRANCH HANDLING CODE
; LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
;
PMASK
80
40
01
02
00
SCODES
10
30
50
70
90
B0
D0
F0
START
DC
DC
DC
DC
DC
END
H80
H40
H01
H02
H00
START
DC
DC
DC
DC
DC
DC
DC
DC
SPECIAL OPCODES
H10
H30
H50
H70
H90
HB0
HD0
HF0
BPL
BMI
BVC
BVS
BCC
BCS
BNE
BEQ
0008
0009
000A
000B
000C
000D
000E
000F
0010
0011
0012
0013
0014
0015
0016
0016
0017
80
00
20
40
60
02
22
82
6B
4C
5C
6C
7C
DC
SCX
FC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
ENTRY
DC
END
H80
H00
H20
H40
H60
H02
H22
H82
H6B
H4C
H5C
H6C
H7C
HDC
BRA
BRK
JSR
RTI
RTS
COP
JSR
BRL
RTL
JMP
JMP
JMP
JMP
JMP
HFC
JSR
ABSL
ABS
ABSL
()
(,X)
[]
(,X)
Local Symbols
SCX
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
000016
0000
0000
0000
0000
0000
0003
0006
0009
000C
000F
0012
0015
0018
001B
001E
0021
0024
0027
002A
002D
0030
0033
0036
0039
003C
003F
0042
0045
0048
004B
004E
0051
0054
0057
005A
005D
0060
0063
0066
0069
006C
006F
MN
000000
C1C4C3
C1C3C4
C1D3CC
C2C3C3
C2C3D3
C2C5D1
C2C9D4
C2CDC9
C2C3C5
C2D0CC
C2D2CB
C2D6C3
C2D6D3
C3CCC3
C3CCC4
C3CCC9
C3CCD6
C3CDD0
C3D0D8
C3D0D9
C4C5C3
C4C5D8
C4C5D9
C5CFD2
C9CEC3
C9C3D8
C9C3D9
CACDD0
CAD3D2
CCC4C1
CCC4D8
CCC9D9
CDC3D2
CECFD0
CFD2C1
D0C8C1
D0C8D0
DATA
DX
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
3
CADC
CAND
CASL
CBCC
CBCS
CBEQ
CBIT
CBMI
CBNE
CBPL
CBRK
CBVC
CBVS
CCLC
CCLD
CCLI
CCLV
CCMP
CCPX
CCPY
CDEC
CDEX
CDEY
CEOR
CINC
CINX
CINY
CJMP
CJSR
CLDA
CLDX
CLDY
CLSR
CNOP
CORA
CPHA
CPHP
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
0072
0075
0078
007B
007E
0081
0084
0087
008A
008D
0090
0093
0096
0099
009C
009F
00A2
00A5
00A8
00AB
00AE
00B1
00B4
00B7
00BA
00BD
00C0
00C3
00C3
00C6
00C9
00CC
00CF
00D2
00D5
00D8
00DB
00DB
00DE
00E1
00E1
00E4
00E7
00EA
00ED
00F0
00F3
00F6
00F9
00F9
00FC
00FF
0102
0105
0108
010B
010E
0111
0114
1439
1440
1441
1442
1443
1444
0000
0000
0000
0002
0004
0006
D0CCC1
D0CCD0
D2CFCC
D2CFD2
D2D4C9
D2D4D3
D3C2C3
D3C5C3
D3C5C4
D3C5C9
D3D4C1
D3D4D8
D3D4D9
D4C1D8
D4C1D9
D4D3D8
D4D8C1
D4D8D3
D4D9C1
C2D2C1
D0CCD8
D0CCD9
D0C8D8
D0D8D0
D3D4DA
D4D3C2
D4D3C2
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
CPLA
CPLP
CROL
CROR
CRTI
CRTS
CSBC
CSEC
CSED
CSEI
CSTA
CSTX
CSTY
CTAX
CTAY
CTSX
CTXA
CTXS
CTYA
CBRA
CPLX
CPLY
CPHX
CPHY
CSTZ
CTRB
CTSB
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
D0C5C1
D0C5C9
D0C5D2
D0CCC2
D0CCC4
D0C8C2
D0C8C4
D0C8CB
DC
DC
DC
DC
DC
DC
DC
DC
CPEA
CPEI
CPER
CPLB
CPLD
CPHB
CPHD
CPHK
65
66
67
68
69
70
71
72
D2C5D0
D3C5D0
DC
DC
CREP
CSEP
73
74
D4C3C4
D4C4C3
D4C3D3
D4D3C3
D4D8D9
D4D9D8
D8C2C1
D8C3C5
DC
DC
DC
DC
DC
DC
DC
DC
CTCD
CTDC
CTCS
CTSC
CTXY
CTYX
CXBA
CXCE
75
76
77
78
79
80
81
82
C2D2CC
CAD3CC
D2D4CC
CDD6CE
CDD6D0
C3CFD0
D7C1C9
D3D4D0
D7C4CD
DC
DC
DC
DC
DC
DC
DC
DC
DC
END
CBRL
CJSL
CRTL
CMVN
CMVP
CCOP
CWAI
CSTP
CWDM
83
84
85
86
87
88
89
100
101
2
AFIMM
AFABS
AFABSL
1
2
3
MODES
0000
0080
0080
0080
DATA
DS
DC
DC
DC
0008
000A
000C
000E
0010
0012
0014
0016
0018
001A
001C
001E
0020
0022
0024
0026
0028
002A
002C
002E
0030
0032
0032
1468
1469
1470
1471
1472
1473
1474
1475
1476
1477
1478
1479
1480
1481
1482
1483
1484
1485
1486
1487
1488
1489
1490
1491
1492
1493
1494
1495
1496
1497
1498
1499
1500
1501
1502
1503
1504
1505
1506
1507
1508
1509
0000
0000
0000
0001
0002
0003
0004
0005
0006
0007
0008
0009
000A
000B
000C
000D
000E
000F
0010
0011
0012
0013
0014
0015
0016
0017
0018
0000
0000
0000
0000
0000
0000
0002
0004
0006
0008
000A
000C
000E
0010
0012
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
0080
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
AFDIR
AFACC
AFIMP
AFINDINX
AFINDINXL
AFINXIND
AFDIRINXX
AFDIRINXY
AFABSX
AFABSLX
AFABSY
AFPCR
AFPCRL
AFABSIND
AFIND
AFINDL
AFABSINXIND
AFSTACK
AFSTACKREL
AFSRINDINX
AFBLOCK
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
H02
H03
H04
H02
H01
H01
H02
H02
H02
H02
H02
H03
H04
H03
H02
H03
H03
H02
H02
H03
H01
H02
H02
H03
IMM
ABS
ABS LONG
DIRECT
ACC
IMPLIED
DIR IND INX
DIR IND INX L
DIR INX IND
DIR INX X
DIR INX Y
ABS X
ABS L X
ABS Y
PCR
PCR L
ABS IND
DIR IND
DIR IND L
ABS INX IND
STACK
SR
SR INX
MOV
END
LENS
02
03
04
02
01
01
02
02
02
02
02
03
04
03
02
03
03
02
02
03
01
02
02
03
START
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
END
DATA
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
I111,6
I135,9
I188,4
I135,22
I164,4
I134,4
I13,4
I135,19
I137,21
I135,1
BRK
ORA D,X
COP (REALLY 2)
ORA-,X
TSB D
ORA D
ASL D
ORA [D]
PHP
ORA IMM
00
01
02
03
04
05
06
07
08
09
0014
0016
0018
001A
001C
001E
0020
0022
0024
0026
0028
002A
002C
002E
0030
0032
0034
0036
0038
003A
003C
003E
0040
0042
0044
0046
0048
004A
004C
004E
0050
0052
0054
0056
0058
005A
005C
005E
0060
0062
0064
0066
0068
006A
006C
006E
0070
0072
0074
0076
0078
007A
007C
007E
0080
0082
0084
0086
0088
008A
008C
008E
0090
0092
0094
0096
0098
009A
0305
4715
4002
2302
0302
2303
0A0F
2307
2312
2317
3F04
230A
030A
2308
0E06
230E
1905
4D06
3F02
230C
030C
230D
1D02
0207
1D03
0216
0704
0204
2804
0213
2706
0201
2805
4515
0705
0202
2805
0203
080F
020B
0212
0217
070A
020A
280A
0208
2D06
020E
1505
4E06
070C
020C
280C
020D
2A06
1809
6506
1816
5718
1804
2104
1813
2406
1801
2105
4806
1C02
1802
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
I13,5
I171,21
I164,2
I135,2
I13,2
I135,3
I110,15
I135,7
I135,18
I135,23
I163,4
I135,10
I13,10
I135,8
I114,6
I135,14
I125,5
I177,6
I163,2
I135,12
I13,12
I135,13
I129,2
I12,7
I129,3
I12,22
I17,4
I12,4
I140,4
I12,19
I139,6
I12,1
I140,5
I169,21
I17,2
I12,2
I140,5
I12,3
I18,15
I12,11
I12,18
I12,23
I17,10
I12,10
I140,10
I12,8
I145,6
I125,14
I121,5
I178,6
I17,12
I12,12
I140,12
I12,13
I142,6
I124,9
I1101,6
I124,22
I187,24
I124,4
I133,4
I124,19
I136,6
I124,1
I133,5
I172,6
I128,2
I124,2
ASL ACC
PHD
TSB ABS
ORA ABS
ASL ABS
ORA ABS L
BPL
ORA (D),Y
ORA (D)
ORA S,Y
TRB D
ORA D,X
ASL D,X
ORA (DL),Y
CLC
ORA ABS,Y
INC ACC
TCS
TRB ABS,X
ORA ABS,X
ASL ABS,X
ORA ABSL,X
JSR ABS
AND (D, X)
JSL ABS L
AND SR
BIT D
AND D
ROL D
AND (DL)
PLP
AND IMM
ROL ACC
PLD
BIT ABS
AND ABS
ROL A
AND ABS L
BMI
AND D,Y
AND (D)
AND (SR),Y
BIT D,X
AND D,X
ROL D,X
AND (DL),Y
SEC
AND ABS,Y
DEC
TSC
BIT A,X
AND ABS,X
ROL A,X
AND AL,X
RTI
EOR (D,X)
WDM
EOR (D,X)
MVP
EOR D
LSR D
EOR (DL)
PHA
EOR IMM
LSR ABS L
PHK
JMP ABS
EOR ABS
0A
0B
0C
0D
0E
0F
10
11
12
13
14
15
16
17
18
19
1A
1B
1C
1D
1E
1F
20
21
22
23
24
25
26
27
28
29
2A
2B
2C
2D
2E
2F
30
31
32
33
34
35
36
37
38
39
3A
3B
3C
3D
3E
3F
40
41
42
43
44
45
46
47
48
49
4A
4B
4C
4D
009C
009E
00A0
00A2
00A4
00A6
00A8
00AA
00AC
00AE
00B0
00B2
00B4
00B6
00B8
00BA
00BC
00BE
00C0
00C2
00C4
00C6
00C8
00CA
00CC
00CE
00D0
00D2
00D4
00D6
00D8
00DA
00DC
00DE
00E0
00E2
00E4
00E6
00E8
00EA
00EC
00EE
00F0
00F2
00F4
00F6
00F8
00FA
00FC
00FE
0100
0100
1630
1631
1632
1633
1634
1635
1636
1637
1638
1639
1640
1641
1642
1643
0000
0000
0000
0002
0004
0006
0008
000A
000C
000E
0010
0012
0014
0016
2102
1805
0C0F
1807
1812
1817
5618
180A
210A
1808
1006
180E
3D15
4B06
1C03
180C
210C
180D
2B06
0109
4310
0116
3E04
0104
2904
0113
2615
0101
2905
5506
1C11
0102
2902
0103
0D0F
0108
0112
0117
3E0A
010A
290A
0108
2F06
010E
3B15
4C06
1C14
010C
290C
010D
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
I133,2
I124,5
I112,15
I124,7
I124,18
I124,23
I186,24
I124,10
I133,10
I124,8
I116,6
I124,14
I161,21
I175,6
I128,3
I124,12
I133,12
I124,13
I143,6
I11,9
I167,16
I11,22
I162,4
I11,4
I141,4
I11,19
I138,21
I11,1
I141,5
I185,6
I128,17
I11,2
I141,2
I11,3
I113,15
I11,8
I11,18
I11,23
I162,10
I11,10
I141,10
I11,8
I147,6
I11,14
I159,21
I176,6
I128,20
I11,12
I141,12
I11,13
LSR ABS
EOR ABS L
BVC
EOR (D),Y
EOR (D)
EOR (SR),Y
MVN
EOR D,X
LSR D,X
EOR (DL),Y
CLI
EOR
PHY
TCD
JMP ABSL
EOR ABS,X
LSR ABS,X
EOR ABSL,X
RTS
ADC (D, X)
PER
ADC SR
STZ D
ADC D
ROR D
ADC (DL)
PLA
ADC
ROR ABSL
RTL
JMP (A)
ADC ABS
ROR ABS
ADC ABSL
BVS
ADC (D),Y
ADC (D)
ADC (SR),Y
STZ D,X
ADC D,X
ROR D,X
ADC (DL),Y
SEI
ADC ABS,Y
PLY
TDC
JMP (A, X)
ADC ABS,X
ROR ABS,X
ADC ABSL,X
4E
4F
50
51
52
53
54
55
56
57
58
59
5A
5B
5C
5D
5E
5F
60
61
62
63
64
65
66
67
68
69
6A
6B
6C
6D
6E
6F
70
71
72
73
74
75
76
77
78
79
7A
7B
7C
7D
7E
7F
I157,15
I148,9
I183,16
I148,22
I150,4
I148,4
I149,4
I148,19
I123,6
I17,1
I154,6
I170,21
BRA
STA (D, X)
BRL
STA-,S
STY D
STA D
STX D
STA [ D ]
DEY
BIT IMM
TXA
PHB
80
81
82
83
84
85
86
87
88
89
8A
8B
END
ATRIBH
390F
3009
5310
3016
3204
3004
3104
3013
1706
0701
3606
4615
START
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
0018
001A
001C
001E
0020
0022
0024
0026
0028
002A
002C
002E
0030
0032
0034
0036
0038
003A
003C
003E
0040
0042
0044
0046
0048
004A
004C
004E
0050
0052
0054
0056
0058
005A
005C
005E
0060
0062
0064
0066
0068
006A
006C
006E
0070
0072
0074
0076
0078
007A
007C
007E
0080
0082
0084
0086
0088
008A
008C
008E
0090
0092
0094
0096
0098
009A
009C
009E
3203
3002
3102
3003
040F
3007
3012
3017
320A
300A
310B
3008
3806
300E
3706
4F06
3E02
300C
3E0C
300D
2001
1E09
1F01
1E16
2004
1E04
1F04
1E13
3406
1E01
3306
4415
2002
1E02
1F02
1E03
050F
1E07
1E12
1E17
200A
1E0A
1E0B
1E08
1106
1E0E
3506
5006
200C
1E0C
1F0E
1E0D
1401
1209
4901
1216
1404
1204
1504
1213
1B06
1201
1606
5906
1402
1202
1502
1203
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
I150,2
I148,2
I149,2
I148,3
I14,15
I148,7
I148,18
I148,23
I150,10
I148,10
I149,11
I148,8
I156,6
I148,14
I155,6
I179,6
I162,2
I148,12
I162,12
I148,13
I132,1
I130,9
I131,1
I130,22
I132,4
I130,4
I131,4
I130,19
I152,6
I131,1
I151,6
I168,21
I132,2
I130,2
I131,2
I130,3
I15,15
I130,7
I130,18
I130,23
I132,10
I130,10
I130,11
I130,8
I117,6
I130,14
I153,6
I180,6
I132,12
I130,12
I131,14
I130,13
I130,13
I118,9
I173,1
I118,22
I120,4
I118,4
I121,4
I118,19
I127,6
I118,1
I122,6
I189,6
I120,2
I118,2
I121,2
I118,3
STY ABS
STA ABS
STX ABS
STA ABS L
BC
STA (D),Y
STA (D)
STA (SR),Y
STY D,X
STA D,X
STX D,Y
STA (DL),Y
TYA
STA ABS,Y
TXS D
TXY
STZ ABS
STA ABS,X
STZ ABS,X
STA ABSL,X
LDY IMM
LDA (D,X)
LDX IMM
LDA SR
LDY D
LDA D
LDX D
LDA (DL)
TAY
LDA IMM
TAX
PLB
LDY ABS
LDA ABS
LDX ABS
LDA ABS L
BCS
LDA (D),Y
LDA (D)
LDA (SR),Y
LDY D,X
LDA D,X
LDX D,Y
LDA (DL),Y
CLV
LDA ABS,Y
TSX
TYX
LDY ABS,X
LDA ABS,X
LDX ABS,Y
LDA ABSL,X
CPY
CMP (D,X)
REP
CMP
CPY D
CMP D
DEC D
CMP (DL)
INY
CMP IMM
DEX
WAI
CPY ABS
CMP ABS
DEC ABS
CMP ABSL
8C
8D
8E
8F
90
91
92
93
94
95
96
97
98
99
9A
9B
9C
9D
9E
9F
A0
A1
A2
A3
A4
A5
A6
A7
A8
A9
AA
AB
AC
AD
AE
AF
B0
B1
B2
B3
B4
B5
B6
B7
B8
B9
BA
BB
BC
BD
BE
BF
C0
C1
C2
C3
C4
C5
C6
C7
C8
C9
CA
CB
CC
CD
CE
CF
00A0
00A2
00A4
00A6
00A8
00AA
00AC
00AE
00B0
00B2
00B4
00B6
00B8
00BA
00BC
00BE
00C0
00C2
00C4
00C6
00C8
00CA
00CC
00CE
00D0
00D2
00D4
00D6
00D8
00DA
00DC
00DE
00E0
00E2
00E4
00E6
00E8
00EA
00EC
00EE
00F0
00F2
00F4
00F6
00F8
00FA
00FC
00FE
0100
090F
1207
1212
1217
4204
120A
150A
1208
0F06
120E
3C15
6406
1C11
120C
150C
120D
1301
2C09
4A01
2C16
1F04
2C04
1904
2C13
1A06
2C01
2206
5106
1302
2C02
1902
2C03
060F
2C07
2C12
2C17
4102
2C0A
190A
2C08
2E06
2C0E
3A15
5206
1D14
2C0C
190C
2C0D
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
DC
END
I19,15
I118,7
I118,18
I118,23
I166,4
I118,10
I121,10
I118,8
I115,6
I118,14
I160,21
I1100,6
I128,17
I118,12
I121,12
I118,13
I119,1
I144,9
I174,1
I144,22
I131,4
I144,4
I125,4
I144,19
I126,6
I144,1
I134,6
I181,6
I119,2
I144,2
I125,2
I144,3
I16,15
I144,7
I144,18
I144,23
I165,2
I144,10
I125,10
I144,8
I146,6
I144,14
I158,21
I182,6
I129,20
I144,12
I125,12
I144,13
BNE
CMP (D0,Y
CMP (D)
CMP
PEI D
CMP D,X
DEC D,X
CMP (DL),Y
CLD
CMP ABS,Y
PHX
STP
JMP (A)
CMP ABS,X
DEC ABS,X
CMP ABSL,X
CPX IMM
SBC (D,X)
SEP IMM
SBC SR
LDX D
SBC D
INC D
SBD (DL)
INX D
SBC IMM
NOP
XBA
CPX ABS
SBC ABS
INC ABS
SBC ABSL
BEQ
SBC (D),Y
SBC (D)
SBC (SR),Y
PEA
SBC D,X
INC D,X
SBC (DL),Y
SED
SBC ABS,Y
PLX
XCE
JSR (A,X)
SBC ABS,X
INC ABS,X
SBC ABSL,X
D0
D1
D2
D3
D4
D5
D6
D7
D8
D9
DA
DB
DC
DD
DE
DF
E0
E1
E2
E3
E4
E5
E6
E7
E8
E9
EA
EB
EC
ED
EE
EF
F0
F1
F2
F3
F4
F5
F6
F7
F8
F9
FA
FB
FC
FD
FE
FF
Global Symbols
ADDRMODE  00009C    AREG      00008F    AREGH     000090    BRKN      00FFE6
C         000001    CODE      000087    CR        00008D    DBREG     000095
DIRREG    000093    DIRREGH   000094    DPAGE     000300    EBIT      000097
M         000020    MNX       00009D    NCODE     000083    OPCREG    000084
OPCREGB   000086    OPCREGH   000085    OPLEN     00009F    OPRNDB    00008A
OPRNDH    000089    OPRNDL    000088    PCREG     000080    PCREGB    000082
PCREGH    000081    PREG      000096    STACK     000091    STACKH    000092
TEMP      000099    TEMPB     00009B    TEMPH     00009A    USRBRKV   003F0
X         000010    XREG      00008B    XREGH     00008C    YREG      00008D
YREGH     00008E
00000037
000000CF
00000048
00000120
00000019
0000002F
00000026
0000005A
0000005D
0000001A
00000043
0000003E
00000069
0000004E
00000026
0000001C
00000005
00000017
00000114
00000032
00000018
00000100
00000100
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Code:
Data:
Data:
Code:
Data:
Code:
MAIN
EBRKIN
FLIST
FRMOPRND
POB
STEP
PUTHEX
CLRLN
UPDATE
PRINTLN
TRACE
CHKSPCL
SBRK
DUMPREGS
PUTREG8
SPJMP
PMASK
SCODES
MN
MODES
LENS
ATRIBL
ATRIBH
0000009C
000087A1
00000001
00000087
00000093
00000300
00008037
00008224
000081BB
00008526
000081A8
00008204
00008214
0000817E
000081E4
00008233
00008689
00000020
0000009D
00008047
00000086
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
AREG
ATRIBL
CHKSPCL
CR
DIRREGH
DUMPREGS
FABS
FABSL
FABSY
FDIR
FIMM
FINDINX
FINX
FLIST
FRMOPRND
FSTACKREL
LINE
MAIN
MODES
NCODE
OPCREGH
0000008F
000086A1
00008DF0
0000008D
00000094
00008497
0000816B
0000816E
000081C7
00008171
00008159
0000817B
000081AE
00008106
0000814E
00008234
000082EE
00008000
00008657
00000083
00000085
00
03
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
02
00
00
AREGH
BRKN
CLRLN
DBREG
DOBRANCH
EBIT
FABSIND
FABSLX
FACC
FDIRINXX
FIMP
FINDINXL
FINXIND
FPCR
FSRINDINX
GO
LIST
MN
MOVE
OPCREG
OPLEN
00000090
0000FFE6
000082DC
00000095
0000841B
00000097
000081F5
000081C1
00008174
000081A2
0000817A
0000818A
00008190
000081CD
00008244
000080C4
00008003
00008543
0000814A
00000084
0000009F
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
00
01
00
00
00
Debugging Checklist
Program bugs fall into two categories: those specific to the particular processor you're writing assembly
code for, and those that are generic problems which can crop up in any assembly program for almost any
processor. This chapter will primarily consider bugs specific to the 65x processors, but will also discuss some
generic bugs as they specifically apply in 65x assembly programs.
You may want to put a checkmark beside the bugs listed here each time you find them in your
programs, giving you a personalized checklist of problems to look for. You may also want to add to the list
other bugs that you write frequently.
Decimal Flag
Seldom does the decimal flag d get mis-set, but when it does, arithmetic results may seem to
inexplicably go south. This can be the result of a typo, attempting to execute data, or some other execution
error. Or it can result from coding errors in which the decimal flag is set to enable decimal arithmetic, then
never reset. If branching occurs before the decimal flag is reset, be sure all paths ultimately result in the flag
being cleared. Branching while in decimal mode is almost as dangerous as branching after temporarily pushing
a value onto the stack; equal care must be taken to clear d and clean the stack.
This bug may be doubly hard to find on the 6502, which does not clear d on interrupt or, worse, on
reset. An instruction inadvertently or mistakenly executed which sets d (only SED, RTI, or PLP have the
capability on the 6502) would require you to specifically reclear the decimal flag or to power off and power
back on again. As a result, it is always a good idea to clear the decimal flag at the beginning of every 6502
program.
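One way to keep decimal mode from leaking out of a routine is to bracket the decimal arithmetic with a save and restore of the status register. The following is a minimal sketch, assuming an eight-bit accumulator; NUM1, NUM2, and SUM are hypothetical one-byte variables, and the comment syntax may need adapting to your assembler.

```asm
; Sketch only: NUM1, NUM2 and SUM are hypothetical one-byte variables.
         PHP               ; save the caller's status register, including d
         SED               ; select decimal mode
         CLC               ; no carry into the addition
         LDA   NUM1
         ADC   NUM2        ; decimal (BCD) add
         STA   SUM
         PLP               ; restore the saved status, so d returns to its prior state
```

Because PLP also restores the carry flag, any carry out of the addition must be tested or saved before the PLP. Any branch taken between the PHP and the PLP must still reach the PLP, or both the flag and the stack are left wrong.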
65x Branches
There are eight 65x conditional branches, each based on one of the two states of four condition code
flags. Remembering how to use them for arithmetic is necessary to code branches that work.
Keep in mind that compare instructions cannot be used for signed comparisons: they don't affect the
overflow flag. Only the subtract instruction can be used to compare two signed numbers directly (except for the
relationships equal and not equal).
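A signed comparison built on the subtract instruction might look like the following sketch. It assumes an eight-bit accumulator; VAL1 and VAL2 are hypothetical one-byte signed variables, RESULT is a hypothetical flag byte, and the labels are illustrative only.

```asm
; Sketch only: sets RESULT to 1 if VAL1 < VAL2 (signed), else to 0.
         SEC               ; prepare for subtraction with no borrow
         LDA   VAL1
         SBC   VAL2        ; A = VAL1 - VAL2; sets n and v
         BVC   NOOVFL      ; no overflow: n already holds the correct sign
         EOR   #$80        ; overflow: invert bit 7, which also corrects n
NOOVFL   BMI   LESS        ; n set: VAL1 is less than VAL2
         LDA   #0          ; VAL1 is greater than or equal to VAL2
         BEQ   DONE        ; always taken, since A is zero
LESS     LDA   #1
DONE     STA   RESULT
```

With a sixteen-bit accumulator the same pattern applies, but the sign bit to invert is bit fifteen, so the EOR operand becomes #$8000.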
Remember that if the z flag is set (one), then the result was zero; and if the zero flag is clear (zero), then
the result was other than zero, the opposite of most first guesses about it.
A common code sequence is to test a value, then branch on the basis of the result of the test. A
common mistake is to code an instruction between the test and the branch that also affects the very flag your
branch is based on (often because an instruction you don't expect to affect the flags does indeed do so).
Note that 65x pull instructions set the negative and zero flags, unlike 68xx and 8088/8086 processors;
that store instructions do not set any flags, unlike 68xx processors; that transfer and exchange instructions do set
flags, unlike Motorola and Intel processors; that load instructions do set flags, unlike the 8088; and increment
and decrement instructions do not affect the carry flag.
Also, in decimal mode on the 6502, the negative, overflow and zero flags are not valid.
Interrupt-Handling Code
To correctly handle 65x interrupts, you should generally, at the outset, save all registers and, on the
6502 and in emulation mode, clear the decimal flag (to provide a consistent binary approach to arithmetic in the
interrupt handler). Returning from the interrupt restores the status register, including the previous state of the
decimal flag.
During interrupt handling, once the previous environment has been saved and the new one is solid,
interrupts may be enabled.
At the end of handling interrupts, restore the registers in the correct order. RTI will pull the program
counter and status register from the stack, finishing the return to the previous environment, except that in
65802/65816 native mode it also pulls the program bank register from the stack. This means you must restore
the mode in which the interrupt occurred (native or emulation) before executing an RTI.
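The save and restore sequence described above can be sketched for a 65802/65816 native-mode handler as follows. IRQNAT is a hypothetical label and the device-servicing code is omitted; this is an outline under those assumptions, not a complete handler.

```asm
; Sketch only: a native-mode interrupt handler skeleton.
IRQNAT   REP   #$30        ; sixteen-bit accumulator and index registers
         PHA               ; save the full sixteen-bit registers
         PHX
         PHY
         PHB               ; save the data bank register
         PHD               ; save the direct page register
         NOP               ; placeholder: service the interrupt here
         REP   #$30        ; make sure registers are sixteen bits again before pulling
         PLD               ; restore in the reverse of the order pushed
         PLB
         PLY
         PLX
         PLA
         RTI               ; pulls status, program counter, and (native mode) program bank
```

Selecting sixteen-bit registers before pushing saves the high bytes whatever sizes the interrupted code was using; the RTI then restores the interrupted code's own m and x settings along with the rest of the status register.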
65802/65816: MVN/MVP
MVN and MVP require two operands, usually code or data labels from which the assembler strips the
bank bytes, in sourcebank,destbank order (opposite of object code order). Eight-bit index registers will cause
these two instructions to move only zero page memory. But eight-bit accumulator mode is irrelevant to the
count value; the accumulator is expanded to sixteen bits using the hidden B accumulator as the high byte of the
count. Finally, the count in the accumulator is one less than the count of bytes to be moved: five in the
accumulator means six bytes will be moved.
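As an illustration of these points, a six-byte block move might be coded as in the sketch below. SRC and DEST are hypothetical data labels, and LONGA and LONGI are assumed to be the ORCA/M directives that tell the assembler to generate sixteen-bit immediate operands; other assemblers spell these differently.

```asm
; Sketch only: copy the six bytes at SRC to DEST with MVN.
         LONGA ON          ; assumed directive: sixteen-bit accumulator immediates
         LONGI ON          ; assumed directive: sixteen-bit index immediates
         PHB               ; MVN leaves the data bank set to the destination bank
         REP   #$30        ; sixteen-bit accumulator and index registers
         LDX   #SRC        ; X = source address within its bank
         LDY   #DEST       ; Y = destination address within its bank
         LDA   #6-1        ; count is one less than the number of bytes to move
         MVN   SRC,DEST    ; source bank first, then destination bank
         PLB               ; restore the caller's data bank
```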
Return Address
If your program removes the return address from the stack in order to use it in some fashion
other than using an RTS or RTL instruction to return, remember that you must add one to the stacked
value to form the true return address (an operation the return-from-subroutine instructions execute
automatically).
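A minimal sketch of that adjustment, assuming native mode with a sixteen-bit accumulator and a routine that was entered with JSR; RETADDR is a hypothetical two-byte variable.

```asm
; Sketch only: recover the true return address from the stack.
         PLA               ; pull the stacked value: address of the last byte of the JSR
         INC   A           ; add one to form the true return address
         STA   RETADDR     ; RETADDR now holds the address of the instruction after the JSR
```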
Inconsistent Assembler Syntax
6502 assemblers have been wildly inconsistent in their syntax, and early 65802 assemblers
have not set a standard either. This book describes syntax recommended by the designers of the 65816,
the Western Design Center, as implemented in the ORCA/M assembler. Others, however, do and will
differ. For example, while many assemblers use the syntax of a pound sign (#) in front of a sixteen-bit
immediate value to specify that the low byte be accessed, with the greater-than sign (>) being used to
represent the high byte, at least one 6502 assembler uses the same two signs to mean just the opposite.
Syntax for the new block move instructions will undoubtedly vary from the recommended standard in
many assemblers. Beware, and keep your assembler's manual handy.
Generic Bugs: They Can Happen Anywhere
Uninitialized Variables
The code you wrote on paper is perfect. The problem is one or more lines that never
got typed in, or were typed in wrong. The solution is to compare your original handwritten
code with the typed-in version, or compare a disassembly with your original code.
More enigmatically, a line may be accidentally deleted or an opcode or operand inadvertently
changed by a keypress during a subsequent edit (usually in a section of code which has just been
proven to work flawlessly). Regular source backups and a program that can compare text to spot
changes will often solve the problem. Or you can compare a disassembly with the previous source
listing.
Failure to Increment the Index in a Loop
The symptoms are: everything stops, and typing at the keyboard has no effect. The
problem is an endless loop: your branch out of the loop is waiting for an index to reach
some specified value, but the index is never decremented or incremented and thus never
reaches the target value.
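The sketch below shows a loop whose exit depends entirely on the index being decremented. BUFFER and COUNT are hypothetical, eight-bit registers are assumed, and COUNT must be 128 or less for the BPL test to work.

```asm
; Sketch only: clear a COUNT-byte buffer, working from the last byte down.
COUNT    EQU   16
         LDA   #0
         LDX   #COUNT-1    ; start at the last byte
LOOP     STA   BUFFER,X
         DEX               ; omit this line and the loop never ends
         BPL   LOOP        ; repeat until X wraps below zero
```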
This problem is typically found in code in which first a value is pushed, then there is a
conditional branch, but all paths do not pull the value still on the stack. It may result in a
return address being pulled off the stack which is not really a return address (one or more bytes
of it are really previously pushed data bytes).
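One way to avoid the problem is to make every path rejoin at a single pull, as in this sketch. FLAG and DOWORK are hypothetical names.

```asm
; Sketch only: a pushed value that is pulled on every path.
         PHA               ; save the accumulator temporarily
         LDA   FLAG
         BEQ   SKIP        ; branch taken: the pushed value is still on the stack
         JSR   DOWORK
SKIP     PLA               ; both paths rejoin here, so the value is always pulled
         RTS               ; the return address is now on top of the stack
```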
Immediate Data Versus Memory Location
Failure to use the # sign to signify a constant (or whatever other syntax a particular assembler
requires) will instruct the assembler to load, not the constant, but data from a memory location that it
assumes the constant specifies. That is, #VAR means access a constant (or the address of a variable);
VAR, on the other hand, means access its contents.
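The difference is easiest to see side by side. In this sketch LIMIT is a hypothetical symbol equated to $20, an eight-bit accumulator is assumed, and EQU is assumed to be the equate directive of your assembler.

```asm
; Sketch only: the same symbol used as a constant and as an address.
LIMIT    EQU   $20
         LDA   #LIMIT      ; immediate: loads the constant $20 itself
         LDA   LIMIT       ; no # sign: loads the byte stored at location $20
```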
Initializing the Stack Pointer from a Subroutine
It won't take much thought to realize that you can't just reset the stack pointer from
within a subroutine and expect the return-from-subroutine instruction to work. The return
address was pointed to by the previous stack pointer. Who knows where it is in relation to the
newly set one?
Top-Down Design and Structured Programming
It's wise to carefully consider the design of a program before beginning to write any of it. The
goals of design are to minimize program errors, or bugs; to reduce complexity; to maximize
readability; and to increase the speed and ease of coding and testing and thus the productivity of
programmers.
The top-down approach to structured programming combines two major design concepts. This
approach is generally recognized as the method of design which best achieves these goals, particularly
when coding large programs. Top-down design suggests that programs should be broken into levels:
at the top level is a statement of the goal of the program; beneath it are second-level modules, which
are the main control sections of the program; the sections can be broken into their parts; and so on.
A blackjack game (twenty-one), for example, might be broken down into four second-level
modules, the goals of which are to deal the cards, take and place bets on the hands dealt, respond to
requests for more cards, and finally compare each player's hand with the dealer's to determine
winnings. The dealing module might be broken down into two third-level modules, the goals of which
are to shuffle the cards, and to deliver a card to each player (executed twice so that each player gets
two cards). The shuffling module might be broken into two fourth-level modules which assign a
number to each card and then create a random order to the numbers.
The makeup of each level is clear. At the top level, the makeup describes the program itself.
At lower levels, the makeup describes the subprocess. At the lowest levels, the work is actually done.
A top-down design is then implemented using subroutines. The top level of the program is a
very short straight-line execution routine (or loop in the case of programs that start over when they
reach the end), that does nothing more than call a set of subroutines, one for each second-level module
of the program. The second-level subroutines may call third-level subroutines, which may call fourth-level subroutines, and so on.
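For the blackjack example, the top two levels might be sketched as follows. All labels are hypothetical, and each stub stands for a module still to be written.

```asm
; Sketch only: top-down skeleton of the blackjack program.
MAIN     JSR   DEAL        ; second-level modules, called in order
         JSR   BETS
         JSR   HITS
         JSR   SETTLE
         RTS
DEAL     JSR   SHUFFLE     ; third-level modules of the dealing module
         JSR   DELIVER     ; executed twice so that each player gets two cards
         JSR   DELIVER
         RTS
BETS     RTS               ; stubs, to be filled in as the design is refined
HITS     RTS
SETTLE   RTS
SHUFFLE  RTS
DELIVER  RTS
```

Starting from stubs like these lets the control structure be assembled and tested before any of the lower-level work is coded.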
Structured programming is a design concept which calls for modules to have only one entry
point; jumping into the middle of a module is not permitted. (A structured approach to the problem of
needing an entry point to the middle of a module is to make that portion of the module a sub-module
with its own single entry and exit points.) A second rule is that all exits return control to the calling
module; all branches (selections) are internal; no branches are permitted to code outside the module.
One of the side benefits of modular programming is the ability to reuse previously coded
modules in other programs: the dealing module could be dropped into any card game program that
calls for shuffling followed by the dealing of one card at a time to each player. And its shuffling submodule could be borrowed for other card game programs which only need shuffling. This use of the
modularity principle should not be confused with the top-down structured design; they are distinct but
related concepts. Modular programming in itself is not the same as top-down design.
A software development team could, using top-down design, readily assign one programmer
the task of coding the deck-shuffling routine, another programmer the betting module, another
responsibility for the dealing routines, and a fourth with writing the code for the end-of-game
comparison of hands and determination of the winner.
A new programmer trying to understand a top-down program avoids becoming mired in detail
while trying to get an understanding of the structure, yet can very easily figure how to get to the degree
of detail which interests him.
Finally, debugging, the process of finding and removing programming mistakes, is
exceptionally straightforward with top-down design: on seeing that, after shuffling, one of the 52 cards
seems to be missing, the programmer can go directly to the shuffling subroutines to fix the problem.
Top-down design sometimes seems like a waste of time to programmers anxious to get the
bytes flying; complex programs can take days or weeks of concerted thinking to break down into the
subparts which fit together most logically and efficiently. But the savings in time spent coding and
recoding and in being able to understand, debug, and modify the program later well justify the time
spent on design.
Documentation
One of the most important elements of good programming practice is documentation. It is
remarkable how little one can recall about the nitty-gritty details of a program written just last month
(or sometimes even yesterday): the names of the key variables, their various settings and what each
means and how each interacts with other variables in various routines, and so on. Clever
programmers, those who bend programming principles to ends never anticipated, too often find they
(not to mention their co-workers) can no longer discover the meaning behind their cleverness when it
comes time to debug or modify that code.
The first principle of documentation is to make the program document itself. Choose labels
which are meaningful: DELLOOP is a much better label for the beginning of a loop which deals cards
in a card game than is LAB137. Substitute a label for all constants: branching if there's a 1 in some
register after writing a byte to disk is, by itself, meaningless; branching because there's a constant
named DISKFULL in the register provides clear documentation. When your program needs to
determine if an ASCII value is an upper-case letter, it's much clearer to compare with "greater than or
equal to 'A'" than with "greater than '@'". Who remembers that '@' precedes 'A' in the ASCII
chart?
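Both suggestions can be sketched in a few lines. DISKFULL, WRITEBYT, NOROOM, CHAR, and ATLEASTA are hypothetical names, WRITEBYT is assumed to return a status code in the accumulator, and the character-constant syntax #'A' may differ in your assembler.

```asm
; Sketch only: named constants and character constants as documentation.
DISKFULL EQU   1           ; status returned when no room is left on the disk
         JSR   WRITEBYT    ; assumed to return a status code in the accumulator
         CMP   #DISKFULL   ; self-documenting, unlike CMP #1
         BEQ   NOROOM
         LDA   CHAR
         CMP   #'A'        ; "greater than or equal to 'A'"
         BCS   ATLEASTA    ; carry set: CHAR is 'A' or above
```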
Variables should be commented when they're declared with a description of their purpose,
their potential settings, and any default states. And if any of that information changes during the
development of the program, the comment should be changed to match.
Routines should be commented when they're written: note the purpose of the routine, the
variables or parameters which need to be set before entry into the routine, and the variables or
parameters which will be passed back. If other data structures will be affected by the routine, this, too,
should be commented.
Nothing is as important both to debugging of code and to continuing development of programs
as documentation: self-documentation; a comment on every important line of code that explains and
expands it; a comment header on every routine; and a comment on every variable. While some
languages are said to be automatically self-documenting, no language can create documentation
which is half adequate compared to what the original programmer can provide while the program is
being written.
Operand symbols: addr (two-byte address), addr/const, const, destbk, dp, label, long, nearlabel, sr, srcebk.
[Figures: programming models. Registers shown: Accumulator (A, with B as the high byte; together C), X Index Register (X), Y Index Register (Y), Direct Page Register, Stack Pointer (S), Program Counter (PC), Data Bank Register (DBR), Program Bank Register (PBR), and the status register with the emulation bit e (0 = native mode).]
Status flags in the figures with register select flags: Carry (1 = carry), Zero (1 = result zero), IRQ Disable (1 = disabled), Decimal Mode (1 = decimal, 0 = binary), Index Register Select (1 = 8-bit, 0 = 16-bit), Memory/Accumulator Select (1 = 8-bit, 0 = 16-bit), Overflow (1 = overflow), Negative (1 = negative).
Status flags in the figures with a break flag: Carry, Zero, IRQ Disable, Decimal Mode, Break Instruction (1 = break caused interrupt), Overflow, Negative.
Absolute Addressing
Effective Address:
Bank: Data Bank Register (DBR) if locating data; Program Bank Register (PBR) if transferring
control.
High: Second operand byte.
Low: First operand byte.
Sample Syntax:
LDA addr
[Diagram: the opcode is followed by Operand Low and Operand High. The 24-bit effective address takes its bank from the Data Bank Register (DBR) if locating data, or from the program bank if transferring control; its high and low bytes are Operand High and Operand Low.]
LDY LSR ORA ROL ROR SBC STA STX STY STZ TRB TSB
Sample Syntax:
LDA addr, X
[Diagram: Operand High and Operand Low form a 16-bit address to which the X Index Register is added (8 bits if x = 1, 16 bits if x = 0) to give the effective address.]
LSR ORA ROL ROR SBC STA STZ
[Diagram: Operand High and Operand Low, in the bank given by the Data Bank Register (DBR), plus the Y Index Register (8 bits if x = 1, 16 bits if x = 0), give the effective address.]
ORA SBC STA
[Diagram: Operand High and Operand Low plus the X Index Register (8 bits if x = 1, 16 bits if x = 0) locate two bytes (+0 and +1) in program bank memory.]
[Diagram: Operand High and Operand Low locate an address held in bank 0 memory.]
Sample Syntax:
JMP [addr]
[Diagram: effective address for JMP [addr]: Operand High and Operand Low locate three bytes in bank 0 memory, supplying the low (+0), high (+1), and bank (+2) bytes of the effective address.]
Sample Syntax:
LDA long
[Diagram: effective address for LDA long: bank from Operand Bank, high byte from Operand High, low byte from Operand Low.]
LDA ORA SBC STA JSR (JSL)
The 24-bit operand is added to X (16 bits if 65802/65816 native mode, x = 0; else 8 bits).
Sample Syntax:
LDA long, X
[Diagram: effective address for LDA long, X: Operand Bank, Operand High, and Operand Low form a 24-bit address to which the X Index Register is added (8 bits if x = 1, 16 bits if x = 0).]
LDA ORA SBC STA
Accumulator Addressing
8-Bit Data (all processors): Data: Byte in accumulator A.
[Diagram: the accumulator, showing Accumulator B (high byte) and Accumulator A (low byte), together A or C.]
Sample Syntax:
ASL A
[Diagram: one-byte instruction (opcode only); the data is in the accumulator (8 bits if m = 1, 16 bits if m = 0).]
INC¹ LSR ROL ROR
Bank:
The 16-bit value in X; if X is only 8 bits (mode flag x = 1), the high byte is 0.
[Diagram: block move instruction: the opcode is followed by Destination Bank and Source Bank operand bytes; the X and Y Index Registers hold addresses (high byte 0 if x = 1) and the accumulator (C) holds the 16-bit count.]
Bank:
High/Low
Sample Syntax:
LDA dp
[Diagram: effective address for LDA dp: bank 00000000; high and low bytes from the Direct Page Register (D) plus the one-byte operand.]
LDY LSR ORA ROL ROR SBC STA STX STY STZ¹ TRB¹ TSB¹
Bank: Zero
High/Low: Direct Page Register plus Operand byte plus X (16 bits if 65802/65816 native mode, x =
0; else 8 bits).
Sample Syntax:
LDA dp, X
[Diagram: effective address for LDA dp, X: bank 00000000; high and low bytes from the Direct Page Register (D) plus the operand plus the X Index Register (8 bits if x = 1, 16 bits if x = 0).]
LSR ORA ROL ROR SBC STA STY STZ¹
[Diagram: bank 00000000; high and low bytes from the Direct Page Register (D) plus the operand plus the Y Index Register (8 bits if x = 1, 16 bits if x = 0).]
Sample Syntax:
LDA (dp, X)
[Diagram: effective address for LDA (dp, X): the Direct Page Register (D) plus the operand plus the X Index Register (8 bits if x = 1, 16 bits if x = 0) locates the low and high indirect address bytes in bank 0 memory; the bank comes from the Data Bank Register (DBR).]
LDA ORA SBC STA
Sample Syntax:
LDA (dp)
[Diagram: effective address for LDA (dp): the Direct Page Register (D) plus the operand locates the low (+0) and high (+1) indirect address bytes in bank 0 memory.]
LDA ORA SBC STA
Sample Syntax:
LDA [dp]
[Diagram: effective address for LDA [dp]: the Direct Page Register (D) plus the operand locates the low (+0), high (+1), and bank (+2) indirect address bytes in bank 0 memory.]
LDA ORA SBC STA
Found by concatenating the data bank to the double-byte indirect address, then
adding Y (16 bits if 65802/65816 native mode, x = 0; else 8).
Located in the Direct Page at the sum of the direct page register and the operand
byte, in bank zero.
Sample Syntax:
LDA (dp), Y
[Diagram: the Direct Page Register (D) plus the operand locates the low and high indirect address bytes in bank 0 memory; the Y Index Register (8 bits if x = 1, 16 bits if x = 0) is then added to form the effective address.]
LDA ORA SBC STA
Sample Syntax:
LDA [dp], Y
[Diagram: the Direct Page Register (D) plus the operand locates the low (+0), high (+1), and bank (+2) indirect address bytes in bank 0 memory; the Y Index Register (8 bits if x = 1, 16 bits if x = 0) is then added to form the effective address.]
LDA ORA SBC STA
Immediate Addressing
8-Bit Data (all processors): Data: Operand byte.
16-Bit Data (65802/65816, native mode, applicable mode flag m or x = 0):
Data High:
Second Operand byte.
Data Low:
First Operand byte.
Sample Syntax:
LDA #const
[Diagram: the instruction consists of the opcode followed by the operand; Data = Operand.]
LDX LDY ORA REP¹ SBC SEP¹
Implied Addressing
Type 1: Mnemonic specifies register(s) to be operated on
Type 2: Mnemonic specifies flag bit(s) to be operated on
Type 3: Mnemonic specifies operation; no data involved
Sample Syntax:
NOP
[Diagram: one-byte instruction (opcode only).]
TSX TXA TXS TXY TYA TYX XBA SEC SED SEI XCE WAI
65802/65816 only.
[Diagram: the one-byte operand, sign extended to 16 bits, is added to the Program Counter (PC).]
[Diagram: the two-byte operand (Operand Low, Operand High) is added to the Program Counter (PC).]
[Diagram: Data High and Data Low on the stack in bank 0.]
Sample Syntax:
PEI dp
[Diagram: for PEI dp, the Direct Page Register plus the operand gives a source effective address in bank 0; the bytes at the source effective address and at source effective address + 1 are pushed onto the stack in bank 0 (stack pointer shown before and after).]
Zero
High/Low:
Data: Source:
Program Bank, Program Counter, and Status Register.
Destination Effective Address: Provided by Stack Pointer.
Sample Syntax:
BRK
[Diagram: the opcode is followed by an optional signature byte; the Program Counter is loaded from the contents of the interrupt vector (low byte at the vector address, high byte at the vector address + 1), and the bank of the effective address is 00000000. Hardware interrupt addressing differs only in that there is no instruction involved.]
Vectors   6502/C02/emulation   Native
IRQ       00FFFE.F             00FFEE.F
RESET     00FFFC.D
NMI       00FFFA.B             00FFEA.B
ABORT     00FFF8.9             00FFE8.9
BRK       00FFFE.F             00FFE6.7
COP       00FFF4.5             00FFE4.5
[Diagram: stack in bank 0 before and after the interrupt: PC High, PC Low, and Status (P) are pushed; in the second figure the Program Bank (PBR) is pushed first.]
[Diagram: Operand Low and Operand High; Data High and Data Low are pushed onto the stack in bank 0 (stack pointer shown before and after).]
High
Low
15
X Index
Register (X)
Y Index
Register (Y)
Accumulator
(A or C)
0000 0000
Direct
0000
Stack
Pointer (S)
0000
Program
Counter (PC)
Status (P)
65802/65816 only.
8 bit register, except on 65802/816 may be either 8 or 16 bits, dependent on flag m.
8 bit register, except on 65802/816 may be either 8 or 16 bits, dependent on flag x.
16 bits always.
8 bits always.
65802/65816 only.
8 bit register, except on 65802/816, may be either 8 or 16 bits, dependent on flag m.
8 bit register, except on 65802/816, may be either 8 or 16 bits, dependent on flag x.
16 bit always.
8 bit always.
addr           two-byte address
addr / const   two-byte value: either an address or a constant
const          one- or two-byte constant
destbk         64K bank to which string will be moved
dp             one-byte direct page offset (6502/65C02: zero page)
label          label of code in same 64K bank as instruction
long           three-byte address (includes bank byte)
nearlabel      label of code close enough to instruction to be reachable by a one-byte signed offset
sr             one-byte stack relative offset
srcebk         64K bank from which string will be moved
Table 18-1 Operand Symbols
Flags:
bits:                          7 6 5 4 3 2 1 0
6502/65C02/6502 emulation:     n v - b d i z c
65802/65816 native:            n v m x d i z c

n   negative result
v   overflow
m   8-bit memory/accumulator
x   8-bit index registers
b   BRK caused interrupt
d   decimal mode
i   IRQ interrupt disable
z   zero result
c   carry
Table 18-2 65x Flags
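The bit layout in Table 18-2 can be illustrated with a short Python sketch that decodes a status register byte into flag letters. This is only a model for reading the table, not 65x code; the names `decode_status`, `EMULATION_FLAGS`, and `NATIVE_FLAGS` are hypothetical.

```python
# Flag letters for bits 7..0, copied from Table 18-2.
EMULATION_FLAGS = "nv-bdizc"   # 6502/65C02/6502 emulation
NATIVE_FLAGS = "nvmxdizc"      # 65802/65816 native


def decode_status(p, native=True):
    """Return the set of flag letters that are set in status byte p."""
    names = NATIVE_FLAGS if native else EMULATION_FLAGS
    flags = set()
    for bit in range(8):
        letter = names[7 - bit]          # index 0 of the string is bit 7
        if letter != "-" and p & (1 << bit):
            flags.add(letter)
    return flags
```

For instance, the value $30 reads as m and x set in native mode, but as b set (bit 5 is unused) when interpreted with the emulation layout.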
ADC
Add the data located at the effective address specified by the operand to the contents of the
accumulator; add one to the result if the carry flag is set, and store the final result in the accumulator.
The 65x processors have no add instruction that does not involve the carry. To avoid adding the carry
flag to the result, you must either be sure that it is already clear, or you must explicitly clear it (using CLC)
prior to executing the ADC instruction.
In a multi-precision (multi-word) addition, the carry should be cleared before the low-order words are
added; the addition of the low word will generate a new carry flag value based on the addition. This new value
in the carry flag is added into the next (middle-order or high-order) addition; each intermediate result will
correctly reflect the carry from the previous addition.
d flag clear: Binary addition is performed.
d flag set: Binary coded decimal (BCD) addition is performed.
8-bit accumulator (all processors): Data added from memory is eight-bit.
16-bit accumulator (65802/65816 only, m = 0): Data added from memory is sixteen-bit: the low-order
eight bits are located at the effective address; the high-order eight bits are located at the effective address plus
one.
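The carry behaviour described above can be sketched in Python. This is a model of binary-mode (d clear) addition only, under the assumption that words are stored low-order first; the helper names `adc` and `add_multiword` are hypothetical and decimal mode is not modelled.

```python
def adc(a, operand, carry, bits=8):
    """Binary-mode ADC model: returns (result, carry_out)."""
    mask = (1 << bits) - 1
    total = (a & mask) + (operand & mask) + (1 if carry else 0)
    return total & mask, total > mask


def add_multiword(x_words, y_words, bits=16):
    """Add two little-endian word lists as a CLC followed by a chain of ADCs would."""
    carry = False                      # CLC before the low-order words
    out = []
    for xw, yw in zip(x_words, y_words):
        word, carry = adc(xw, yw, carry, bits)   # carry flows into the next word
        out.append(word)
    return out, carry
```

With sixteen-bit words, adding $0001FFFF and $00000001 this way yields a low word of $0000 and a high word of $0002, the carry out of the low word having been added into the high word.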
Flags Affected: n v - - - - z c
n   Set if most significant bit of result is set; else cleared.
v   Set if signed overflow; cleared if valid signed result.
z   Set if result is zero; else cleared.
c   Set if unsigned overflow; cleared if valid unsigned result.
Codes:
                                                Opcode  Available on:            # of   # of
Addressing Mode ++            Syntax            (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Immediate                     ADC #const        69      x     x      x           2*     2 (1,4)
Absolute                      ADC addr          6D      x     x      x           3      4 (1,4)
Absolute Long                 ADC long          6F                   x           4      5 (1,4)
Direct Page (DP)              ADC dp            65      x     x      x           2      3 (1,2,4)
DP Indirect                   ADC (dp)          72            x      x           2      5 (1,2,4)
DP Indirect Long              ADC [dp]          67                   x           2      6 (1,2,4)
Absolute Indexed, X           ADC addr, X       7D      x     x      x           3      4 (1,3,4)
Absolute Long Indexed, X      ADC long, X       7F                   x           4      5 (1,4)
Absolute Indexed, Y           ADC addr, Y       79      x     x      x           3      4 (1,3,4)
DP Indexed, X                 ADC dp, X         75      x     x      x           2      4 (1,2,4)
DP Indexed Indirect, X        ADC (dp, X)       61      x     x      x           2      6 (1,2,4)
DP Indirect Indexed, Y        ADC (dp), Y       71      x     x      x           2      5 (1,2,3,4)
DP Indirect Long Indexed, Y   ADC [dp], Y       77                   x           2      6 (1,2,4)
Stack Relative (SR)           ADC sr, S         63                   x           2      4 (1,4)
SR Indirect Indexed, Y        ADC (sr, S), Y    73                   x           2      7 (1,4)
++ ADC, a Primary Group Instruction, has available all of the Primary Group addressing modes and bit patterns
* Add 1 byte if m = 0 (16-bit memory/accumulator)
1 Add 1 cycle if m = 0 (16-bit memory/accumulator)
2 Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
3 Add 1 cycle if adding index crosses a page boundary
4 Add 1 cycle if 65C02 and d = 1 (decimal mode, 65C02)
AND
Bitwise logical AND the data located at the effective address specified by the operand with the contents
of the accumulator. Each bit in the accumulator is ANDed with the corresponding bit in memory, with the
result being stored in the respective accumulator bit.
The truth table for the logical AND operation is:
                       Second Operand
                         0     1
First Operand      0     0     0
                   1     0     1
That is, a given result bit is 1 (logically true) only if both of the corresponding bits being ANDed
are 1s, or logically true.
8-bit accumulator (all processors): Data ANDed from memory is eight-bit.
16-bit accumulator (65802/65816 only, m = 0): Data ANDed from memory is sixteen-bit: the low-order byte is located at the effective address; the high-order byte is located at the effective address plus one.
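As a minimal sketch of the operation and its effect on the n and z flags, the following Python model may help. The function name `and_acc` and its `bits` parameter (8 or 16, standing in for the m flag) are hypothetical.

```python
def and_acc(a, operand, bits=8):
    """Model of AND: returns (new accumulator, n flag, z flag)."""
    mask = (1 << bits) - 1
    result = a & operand & mask
    n = bool(result & (1 << (bits - 1)))   # most significant bit of the result
    z = result == 0
    return result, n, z
```

A common use is masking: ANDing with $0F keeps only the low four bits of an eight-bit value, and the z flag then reports whether any of those bits were set.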
Flags Affected: n - - - - - z -
n   Set if most significant bit of result is set; else cleared.
z   Set if result is zero; else cleared.
Codes:
                                                Opcode  Available on:            # of   # of
Addressing Mode ++            Syntax            (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Immediate                     AND #const        29      x     x      x           2*     2 (1)
Absolute                      AND addr          2D      x     x      x           3      4 (1)
Absolute Long                 AND long          2F                   x           4      5 (1)
Direct Page (DP)              AND dp            25      x     x      x           2      3 (1,2)
DP Indirect                   AND (dp)          32            x      x           2      5 (1,2)
DP Indirect Long              AND [dp]          27                   x           2      6 (1,2)
Absolute Indexed, X           AND addr, X       3D      x     x      x           3      4 (1,3)
Absolute Long Indexed, X      AND long, X       3F                   x           4      5 (1)
Absolute Indexed, Y           AND addr, Y       39      x     x      x           3      4 (1,3)
DP Indexed, X                 AND dp, X         35      x     x      x           2      4 (1,2)
DP Indexed Indirect, X        AND (dp, X)       21      x     x      x           2      6 (1,2)
DP Indirect Indexed, Y        AND (dp), Y       31      x     x      x           2      5 (1,2,3)
DP Indirect Long Indexed, Y   AND [dp], Y       37                   x           2      6 (1,2)
Stack Relative (SR)           AND sr, S         23                   x           2      4 (1)
SR Indirect Indexed, Y        AND (sr, S), Y    33                   x           2      7 (1)
++ AND, a Primary Group Instruction, has available all of the Primary Group addressing modes and bit patterns
* Add 1 byte if m = 0 (16-bit memory/accumulator)
1 Add 1 cycle if m = 0 (16-bit memory/accumulator)
2 Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
3 Add 1 cycle if adding index crosses a page boundary
ASL
Shift the contents of the location specified by the operand left one bit. That is, bit one takes on the
value originally found in bit zero, bit two takes the value originally in bit one, and so on; the leftmost bit (bit 7
on the 6502 and 65C02 or if m = 1 on the 65802/65816, or bit 15 if m = 0) is transferred into the carry flag; the
rightmost bit, bit zero, is cleared. The arithmetic result of the operation is an unsigned multiplication by two.
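The shift can be modelled in a few lines of Python. This is a sketch only; the name `asl` and the `bits` parameter (8 or 16, standing in for the m flag) are hypothetical.

```python
def asl(value, bits=8):
    """Model of ASL: returns (shifted value, carry flag)."""
    mask = (1 << bits) - 1
    carry = bool(value & (1 << (bits - 1)))   # leftmost bit moves into the carry
    result = (value << 1) & mask              # bit zero is cleared by the shift
    return result, carry
```

When the carry comes back clear, the result is exactly twice the original unsigned value; a set carry signals that the doubled value did not fit.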
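[Figure: the operand is shifted left one bit; the leftmost bit moves into the carry flag and a zero enters bit zero.]
Figure 18-2 ASL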
Codes:
                                        Opcode  Available on:            # of   # of
Addressing Mode        Syntax           (hex)   6502  65C02  65802/816   Bytes  Cycles
Accumulator            ASL A            0A      x     x      x           1      2
Absolute               ASL addr         0E      x     x      x           3      6 (1)
Direct Page (DP)       ASL dp           06      x     x      x           2      5 (1,2)
Absolute Indexed, X    ASL addr, X      1E      x     x      x           3      7 (1,3)
DP Indexed, X          ASL dp, X        16      x     x      x           2      6 (1,2)
BCC
The carry flag in the P status register is tested. If it is clear, a branch is taken; if it is set, the instruction
immediately following the two-byte BCC instruction is executed.
If the branch is taken, a one-byte signed displacement, fetched from the second byte of the instruction,
is sign-extended to sixteen bits and added to the program counter. Once the branch address has been calculated,
the result is loaded into the program counter, transferring control to that location.
The allowable range of the displacement is -128 to +127 (from the instruction immediately following
the branch).
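The address calculation shared by all of the two-byte branches can be sketched as follows. This Python model assumes the branch opcode sits at `branch_addr` and that the result wraps within sixteen bits; the function name `branch_target` is hypothetical.

```python
def branch_target(branch_addr, displacement_byte):
    """Return the address a taken two-byte branch transfers control to."""
    # Sign-extend the one-byte displacement: $80-$FF mean -128 to -1.
    if displacement_byte & 0x80:
        offset = displacement_byte - 0x100
    else:
        offset = displacement_byte
    next_instruction = branch_addr + 2        # displacement is relative to this
    return (next_instruction + offset) & 0xFFFF
```

A displacement of $FE (that is, -2) therefore branches back to the branch instruction itself, while $00 simply falls through to the next instruction.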
BCC may be used in several ways: to test the result of a shift into the carry; to determine if the result of
a comparison is either less than (in which case a branch will be taken), or greater than or equal (which causes
control to fall through the branch instruction); or to determine if further operations are needed in multi-precision
arithmetic.
Because the BCC instruction causes a branch to be taken after a comparison or subtraction if the
accumulator is less than the memory operand (since the carry flag will always be cleared as a result), many
assemblers allow an alternate mnemonic for the BCC instruction: BLT, or Branch if Less Than.
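Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BCC nearlabel         90      x     x      x           2      2 (1,2)
                            (or BLT nearlabel)
1 Add 1 cycle if branch is taken
2 Add 1 more cycle if branch taken crosses page boundary on 6502, 65C02, or 65816/65802's 6502 emulation mode (e = 1)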
BCS
The carry flag in the P status register is tested. If it is set, a branch is taken; if it is clear, the instruction
immediately following the two-byte BCS instruction is executed.
If the branch is taken, a one-byte signed displacement, fetched from the second byte of the instruction,
is sign-extended to sixteen bits and added to the program counter. Once the branch address has been calculated,
the result is loaded into the program counter, transferring control to that location.
The allowable range of the displacement is -128 to +127 (from the instruction immediately following
the branch).
BCS is used in several ways: to test the result of a shift into the carry; to determine if the result of a
comparison is either greater than or equal (which causes the branch to be taken) or less than; or to determine if
further operations are needed in multi-precision arithmetic operations.
Because the BCS instruction causes a branch to be taken after a comparison or subtraction if the
accumulator is greater than or equal to the memory operand (since the carry flag will always be set as a result),
many assemblers allow an alternate mnemonic for the BCS instruction: BGE or Branch if Greater or Equal.
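Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BCS nearlabel         B0      x     x      x           2      2 (1,2)
                            (or BGE nearlabel)
1 Add 1 cycle if branch is taken
2 Add 1 more cycle if branch taken crosses page boundary on 6502, 65C02, or 65816/65802's 6502 emulation mode (e = 1)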
Branch if Equal
BEQ
The zero flag in the P status register is tested. If it is set, meaning that the last value tested (which
affected the zero flag) was zero, a branch is taken; if it is clear, meaning the value tested was non-zero, the
instruction immediately following the two-byte BEQ instruction is executed.
If the branch is taken, a one-byte signed displacement, fetched from the second byte of the instruction,
is sign-extended to sixteen bits and added to the program counter. Once the branch address has been calculated,
the result is loaded into the program counter, transferring control to that location.
The allowable range of the displacement is -128 to +127 (from the instruction immediately following
the branch).
BEQ may be used in several ways: to determine if the result of a comparison is zero (the two values
compared are equal), for example, or if a value just loaded, pulled, shifted, incremented or decremented is zero;
or to determine if further operations are needed in multi-precision arithmetic operations. Because testing for
equality to zero does not require a previous comparison with zero, it is generally most efficient for loop counters
to count downwards, exiting when zero is reached.
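Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BEQ nearlabel         F0      x     x      x           2      2 (1,2)
1 Add 1 cycle if branch is taken
2 Add 1 more cycle if branch taken crosses page boundary on 6502, 65C02, or 65816/65802's 6502 emulation mode (e = 1)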
BIT
BIT sets the P status register flags based on the result of two different operations, making it a dual-purpose instruction:
First, it sets or clears the n flag to reflect the value of the high bit of the data located at the effective
address specified by the operand, and sets or clears the v flag to reflect the contents of the next-to-highest bit of
the data addressed.
Second, it logically ANDs the data located at the effective address with the contents of the accumulator;
it changes neither value, but sets the z flag if the result is zero, or clears it if the result is non-zero.
BIT is usually used immediately preceding a conditional branch instruction: to test a memory value's
highest or next-to-highest bits; with a mask in the accumulator, to test any bits of the memory operand; or with a
constant as the mask (using immediate addressing) or a mask in memory, to test any bits in the accumulator.
All of these tests are non-destructive of the data in the accumulator or in memory. When the BIT instruction is
used with the immediate addressing mode, the n and v flags are unaffected.
8-bit accumulator/memory (all processors): Data in memory is eight-bit; bit 7 is moved into the n
flag; bit 6 is moved into the v flag.
16-bit accumulator/memory (65802/65816 only, m = 0): Data in memory is sixteen-bit: the low-order
eight bits are located at the effective address; the high-order eight bits are located at the effective address plus
one. Bit 15 is moved into the n flag; bit 14 is moved into the v flag.
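The two roles of BIT can be sketched together in Python. This is a model only; the function name `bit_test` and its parameters are hypothetical, with `bits` standing in for the m flag and `n` and `v` carrying the prior flag values, which are returned unchanged for immediate addressing.

```python
def bit_test(a, mem, bits=8, immediate=False, n=False, v=False):
    """Model of BIT: returns the (n, v, z) flags; a and mem are not changed."""
    mask = (1 << bits) - 1
    z = (a & mem & mask) == 0                 # z reflects accumulator AND memory
    if not immediate:
        n = bool(mem & (1 << (bits - 1)))     # high bit of the memory data
        v = bool(mem & (1 << (bits - 2)))     # next-to-highest bit of the memory data
    return n, v, z
```

Note that n and v come from the memory data alone, regardless of the accumulator, so a memory value of $C0 sets both even when the AND result is zero.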
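Flags Affected: n v - - - - z -   (other than immediate addressing)
                - - - - - - z -   (immediate addressing only)
n   Takes the value of the most significant bit of the memory data.
v   Takes the value of the next-to-highest bit of the memory data.
z   Set if the logical AND of memory and accumulator is zero; else cleared.
Codes:
                                        Opcode  Available on:            # of   # of
Addressing Mode        Syntax           (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Immediate              BIT #const       89            x      x           2*     2 (1)
Absolute               BIT addr         2C      x     x      x           3      4 (1)
Direct Page (DP)       BIT dp           24      x     x      x           2      3 (1,2)
Absolute Indexed, X    BIT addr, X      3C            x      x           3      4 (1,3)
DP Indexed, X          BIT dp, X        34            x      x           2      4 (1,2)
* Add 1 byte if m = 0 (16-bit memory/accumulator)
1 Add 1 cycle if m = 0 (16-bit memory/accumulator)
2 Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
3 Add 1 cycle if adding index crosses a page boundary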
Branch if Minus
BMI
The negative flag in the P status register is tested. If it is set, the high bit of the value which most
recently affected the n flag was set, and a branch is taken. A number with its high bit set may be interpreted as
a negative twos-complement number, so this instruction tests, among other things, for the sign of twos-complement
numbers. If the negative flag is clear, the high bit of the value which most recently affected the
flag was clear, or, in the twos-complement system, was a positive number, and the instruction immediately
following the two-byte BMI instruction is executed.
If the branch is taken, a one-byte signed displacement, fetched from the second byte of the instruction,
is sign-extended to sixteen bits and added to the program counter. Once the branch address has been calculated,
the result is loaded into the program counter, transferring control to that location.
The allowable range of the displacement is -128 to +127 (from the instruction immediately following
the branch).
BMI is primarily used to either determine, in twos-complement arithmetic, if a value is negative or, in
logic situations, if the high bit of the value is set. It can also be used when looping down through zero (the loop
counter must have a positive initial value) to determine if zero has been passed and to effect an exit from the
loop.
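Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BMI nearlabel         30      x     x      x           2      2 (1,2)
1 Add 1 cycle if branch is taken
2 Add 1 more cycle if branch taken crosses page boundary on 6502, 65C02, or 65816/65802's 6502 emulation mode (e = 1)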
BNE
The zero flag in the P status register is tested. If it is clear (meaning the value just tested is non-zero), a
branch is taken; if it is set (meaning the value tested is zero), the instruction immediately following the two-byte
BNE instruction is executed.
If the branch is taken, a one-byte signed displacement, fetched from the second byte of the instruction,
is sign-extended to sixteen bits and added to the program counter. Once the branch address has been calculated,
the result is loaded into the program counter, transferring control to that location.
The allowable range of the displacement is -128 to +127 (from the instruction immediately following
the branch).
BNE may be used in several ways: to determine if the result of a comparison is non-zero (the two
values compared are not equal), for example, or if the value just loaded or pulled from the stack is non-zero, or
to determine if further operations are needed in multi-precision arithmetic operations.
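Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BNE nearlabel         D0      x     x      x           2      2 (1,2)
1 Add 1 cycle if branch is taken
2 Add 1 more cycle if branch taken crosses page boundary on 6502, 65C02, or 65816/65802's 6502 emulation mode (e = 1)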
Branch if Plus
BPL
The negative flag in the P status register is tested. If it is clear, meaning that the last value which
affected the negative flag had its high bit clear, a branch is taken. In the twos-complement system, values with
their high bit clear are interpreted as positive numbers. If the flag is set, meaning the high bit of the last value
was set, the branch is not taken; it is a twos-complement negative number, and the instruction immediately
following the two-byte BPL instruction is executed.
If the branch is taken, a one-byte signed displacement, fetched from the second byte of the instruction,
is sign-extended to sixteen bits and added to the program counter. Once the branch address has been calculated,
the result is loaded into the program counter, transferring control to that location.
The allowable range of the displacement is -128 to +127 (from the instruction immediately following
the branch).
BPL is used primarily to determine, in twos-complement arithmetic, if a value is positive or not or, in
logic situations, if the high bit of the value is clear.
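Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BPL nearlabel         10      x     x      x           2      2 (1,2)
1 Add 1 cycle if branch is taken
2 Add 1 more cycle if branch taken crosses page boundary on 6502, 65C02, or 65816/65802's 6502 emulation mode (e = 1)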
Branch Always
BRA
A branch is always taken, and no testing is done: in effect, an unconditional JMP is executed, but since
signed displacements are used, the instruction is only two bytes, rather than the three bytes of a JMP.
Additionally, using displacements from the program counter makes the BRA instruction relocatable. Unlike a
JMP instruction, the BRA is limited to targets that lie within the range of the one-byte signed displacement of
the conditional branches: - 128 to + 127 bytes from the first byte following the BRA instruction.
To branch, a one-byte signed displacement, fetched from the second byte of the instruction, is sign-extended
to sixteen bits and added to the program counter. Once the branch address has been calculated, the
result is loaded into the program counter, transferring control to that location.
Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BRA nearlabel         80            x      x           2      3 (1)
1 Add 1 cycle if branch crosses page boundary on 65C02 or in 65816/65802's 6502 emulation mode (e = 1)
Software Break
BRK
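0000  68      PLA              ; pull the status byte pushed by the interrupt
0001  48      PHA              ; push it back so the stack is left unchanged
0002  2910    AND  #$10        ; isolate bit 4, the b (break) flag
0004  D007    BNE  ISBRK       ; b set: the interrupt was caused by BRK
Fragment 18.1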
65802/65816 Native Mode (e = 0): The program counter bank register is pushed onto the stack; the
program counter is incremented by two and pushed onto the stack; the status register is pushed onto the stack;
the interrupt disable flag is set; the program bank register is cleared to zero; and the program counter is loaded
from the break vector at $00FFE6-00FFE7.
6502: The d decimal flag is not modified after a break is executed.
65C02 and 65816/65802: The d decimal flag is reset to 0 after a break is executed.
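The native mode sequence above can be traced with a small Python model. It is a sketch under stated assumptions: bank 0 memory is a dictionary keyed by address, the stack pointer addresses the next free location and is decremented after each push, and `pc` is the address of the BRK opcode. The name `brk_native` and the constants `I_FLAG` and `D_FLAG` are hypothetical.

```python
I_FLAG = 0x04   # bit 2: IRQ interrupt disable
D_FLAG = 0x08   # bit 3: decimal mode


def brk_native(pbr, pc, p, s, memory):
    """Model BRK with e = 0; returns the new (pbr, pc, p, s)."""
    ret = (pc + 2) & 0xFFFF                   # skips the optional signature byte
    for byte in (pbr, ret >> 8, ret & 0xFF, p):
        memory[s] = byte                      # push, then decrement the pointer
        s = (s - 1) & 0xFFFF
    p = (p | I_FLAG) & ~D_FLAG & 0xFF         # set i, reset d
    new_pc = memory[0xFFE6] | (memory[0xFFE7] << 8)   # native break vector
    return 0x00, new_pc, p, s
```

After the call, the four pushed bytes sit just above the new stack pointer in the order program bank, address high, address low, status, which is the order RTI later pulls them back in reverse.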
[Figure: stack in bank 0 after BRK, holding bank address, address high, address low, and contents of status register, with the stack pointer below the last byte pushed.]
Flags Affected: - - - b - i - -   (6502)
                - - - b d i - -   (65C02, 65802/65816 emulation mode e = 1)
                - - - - d i - -   (65802/65816 native mode e = 0)
b   b in the P register value pushed onto the stack is set.
d   d is reset to 0, for binary arithmetic.
i   The interrupt disable flag is set, disabling hardware IRQ interrupts.
Codes:
                                        Opcode  Available on:            # of   # of
Addressing Mode        Syntax           (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Stack/Interrupt        BRK              00      x     x      x           2*     7 (1)
* BRK is 1 byte, but program counter value pushed onto stack is incremented by 2 allowing for optional signature byte.
1 Add 1 cycle for 65802/65816 native mode (e = 0)
BRL
A branch is always taken, similar to the BRA instruction. However, BRL is a three-byte instruction;
the two bytes immediately following the opcode form a sixteen-bit signed displacement from the program
counter. Once the branch address has been calculated, the result is loaded into the program counter, transferring
control to that location.
The allowable range of the displacement is anywhere within the current 64K program bank.
The long branch provides an unconditional transfer of control similar to the JMP instruction, with one
major advantage: the branch instruction is relocatable while jump instructions are not. However, the
(non-relocatable) jump absolute instruction executes one cycle faster.
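Why a sixteen-bit displacement reaches anywhere in the current bank can be seen from a short Python sketch. It assumes the displacement is stored low byte first after the opcode and that the sum wraps at the bank boundary; the function name `brl_target` is hypothetical.

```python
def brl_target(brl_addr, disp_low, disp_high):
    """Return the in-bank address a BRL at brl_addr transfers control to."""
    displacement = disp_low | (disp_high << 8)
    next_instruction = brl_addr + 3           # BRL is a three-byte instruction
    # Sixteen-bit wraparound makes signed and unsigned addition equivalent.
    return (next_instruction + displacement) & 0xFFFF
```

Because the arithmetic is modulo 64K, a displacement of $FFFD behaves as -3 and branches back to the BRL itself, and a forward branch near the top of the bank wraps around to its bottom.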
Flags Affected: - - - - - - - -
Codes:
                                                   Opcode  Available on:            # of   # of
Addressing Mode                  Syntax            (hex)   6502  65C02  65802/816   Bytes  Cycles
Program Counter Relative Long    BRL label         82                   x           3      4
BVC
The overflow flag in the P status register is tested. If it is clear, a branch is taken; if it is set, the
instruction immediately following the two-byte BVC instruction is executed.
If the branch is taken, a one-byte signed displacement, fetched from the second byte of the instruction,
is sign-extended to sixteen bits and added to the program counter. Once the branch address has been calculated,
the result is loaded into the program counter, transferring control to that location.
The allowable range of the displacement is -128 to +127 (from the instruction immediately following
the branch).
The overflow flag is altered by only four instructions on the 6502 and 65C02: addition, subtraction, the
CLV clear-the-flag instruction, and the BIT bit-testing instruction. In addition, all the flags are restored from
the stack by the PLP and RTI instructions. On the 65802/65816, however, the SEP and REP instructions can
also modify the v flag.
BVC is used almost exclusively to check that a twos-complement arithmetic calculation has not
overflowed, much as the carry is used to determine if an unsigned arithmetic calculation has overflowed. (Note,
however, that the compare instructions do not affect the overflow flag.) You can also use BVC to test the
second highest bit in a value by using it after the BIT instruction, which moves the second highest bit of the
tested value into the v flag.
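The difference between signed overflow (the v flag) and unsigned overflow (the carry) can be sketched in Python for binary-mode addition. This is a model only; the function name `adc_overflow` is hypothetical, and the sign test used is the usual one: overflow occurs when both operands have the same sign and the result has the other.

```python
def adc_overflow(a, operand, carry=False, bits=8):
    """Model of binary ADC: returns (result, v flag, c flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    total = (a & mask) + (operand & mask) + int(carry)
    result = total & mask
    # v: the result's sign differs from the sign of both operands.
    v = bool((a ^ result) & (operand ^ result) & sign)
    c = total > mask                          # unsigned overflow
    return result, v, c
```

For example, $7F + $01 sets v but not the carry (127 + 1 does not fit in a signed byte), while $FF + $01 sets the carry but not v (-1 + 1 is a valid signed result).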
The overflow flag can also be set by the Set Overflow hardware signal on the 6502, 65C02, and 65802;
on many systems, however, there is no connection to this pin.
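Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BVC nearlabel         50      x     x      x           2      2 (1,2)
1 Add 1 cycle if branch is taken
2 Add 1 more cycle if branch taken crosses page boundary on 6502, 65C02, or 65816/65802's 6502 emulation mode (e = 1)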
BVS
The overflow flag in the P status register is tested. If it is set, a branch is taken; if it is clear, the
instruction immediately following the two-byte BVS instruction is executed.
If the branch is taken, a one-byte signed displacement, fetched from the second byte of the instruction,
is sign-extended to sixteen bits and added to the program counter. Once the branch address has been calculated,
the result is loaded into the program counter, transferring control to that location.
The allowable range of the displacement is -128 to +127 (from the instruction immediately following
the branch).
The overflow flag is altered by only four instructions on the 6502 and 65C02: addition, subtraction, the
CLV clear-the-flag instruction, and the BIT bit-testing instruction. In addition, all the flags are restored from
the stack by the PLP and RTI instructions. On the 65802/65816, the SEP and REP instructions can also modify
the v flag.
BVS is used almost exclusively to determine if a twos-complement arithmetic calculation has
overflowed, much as the carry is used to determine if an unsigned arithmetic calculation has overflowed. (Note,
however, that the compare instructions do not affect the overflow flag.) You can also use BVS to test the
second-highest bit in a value by using it after the BIT instruction, which moves the second-highest bit of the
tested value into the v flag.
The overflow flag can also be set by the Set Overflow hardware signal on the 6502, 65C02, and 65802;
on many systems, however, there is no hardware connection to this signal.
Flags Affected: - - - - - - - -
Codes:
                                                  Opcode  Available on:            # of   # of
Addressing Mode             Syntax                (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Program Counter Relative    BVS nearlabel         70      x     x      x           2      2 (1,2)
1 Add 1 cycle if branch is taken
2 Add 1 more cycle if branch taken crosses page boundary on 6502, 65C02, or 65816/65802's 6502 emulation mode (e = 1)
CLC
Flags Affected: - - - - - - - c
c   carry flag cleared always.
Codes:
                                        Opcode  Available on:            # of   # of
Addressing Mode        Syntax           (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied                CLC              18      x     x      x           1      2
CLD
Flags Affected: - - - - d - - -
d   decimal mode flag cleared always.
Codes:
                                        Opcode  Available on:            # of   # of
Addressing Mode        Syntax           (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied                CLD              D8      x     x      x           1      2
CLI
Flags Affected: - - - - - i - -
i   interrupt disable flag cleared always.
Codes:
                                        Opcode  Available on:            # of   # of
Addressing Mode        Syntax           (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied                CLI              58      x     x      x           1      2
CLV
Flags Affected: - v - - - - - -
v   overflow flag cleared always.
Codes:
                                        Opcode  Available on:            # of   # of
Addressing Mode        Syntax           (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied                CLV              B8      x     x      x           1      2
CMP
Subtract the data located at the effective address specified by the operand from the contents of the
accumulator, setting the carry, zero, and negative flags based on the result, but without altering the contents of
either the memory location or the accumulator. That is, the result is not saved. The comparison is of unsigned
binary values only.
The CMP instruction differs from the SBC instruction in several ways. First, the result is not saved.
Second, the value in the carry prior to the operation is irrelevant to the operation; that is, the carry does not have
to be set prior to a compare as it is with 65x subtractions. Third, the compare instruction does not set the
overflow flag, so it cannot be used for signed comparisons. Although decimal mode does not affect the CMP
instruction, decimal comparisons are effective, since the equivalent binary values maintain the same magnitude
relationships as the decimal values have, for example, $99 > $04 just as 99 > 4.
The primary use for the compare instruction is to set the flags so that a conditional branch can then be
executed.
8-bit accumulator (all processors): Data compared is eight-bit.
16-bit accumulator (65802/65816 only, m = 0): Data compared is sixteen-bit: the low-order eight bits
of the data in memory are located at the effective address; the high-order eight bits are located at the effective
address plus one.
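How the three flags fall out of the internal subtraction can be sketched in Python. This is a model only; the function name `cmp_flags` is hypothetical, and `bits` stands in for the register width (the m flag for CMP).

```python
def cmp_flags(register, operand, bits=8):
    """Model of a compare: returns the (n, z, c) flags; nothing is stored."""
    mask = (1 << bits) - 1
    diff = (register - operand) & mask        # the result is discarded
    n = bool(diff & (1 << (bits - 1)))
    z = diff == 0
    c = (register & mask) >= (operand & mask) # set when no borrow is required
    return n, z, c
```

A following BCC (BLT) branches when c is clear, that is, when the register is lower than the operand; BCS (BGE) branches when it is higher or the same, and BEQ when z reports equality.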
Flags Affected: n - - - - - z c
n   Set if most significant bit of result is set; else cleared.
z   Set if result is zero; else cleared.
c   Set if no borrow required (accumulator value higher or same);
    cleared if borrow required (accumulator value lower).
Codes:
                                                Opcode  Available on:            # of   # of
Addressing Mode ++            Syntax            (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Immediate                     CMP #const        C9      x     x      x           2*     2 (1)
Absolute                      CMP addr          CD      x     x      x           3      4 (1)
Absolute Long                 CMP long          CF                   x           4      5 (1)
Direct Page (also DP)         CMP dp            C5      x     x      x           2      3 (1,2)
DP Indirect                   CMP (dp)          D2            x      x           2      5 (1,2)
DP Indirect Long              CMP [dp]          C7                   x           2      6 (1,2)
Absolute Indexed, X           CMP addr, X       DD      x     x      x           3      4 (1,3)
Absolute Long Indexed, X      CMP long, X       DF                   x           4      5 (1)
Absolute Indexed, Y           CMP addr, Y       D9      x     x      x           3      4 (1,3)
DP Indexed, X                 CMP dp, X         D5      x     x      x           2      4 (1,2)
DP Indexed Indirect, X        CMP (dp, X)       C1      x     x      x           2      6 (1,2)
DP Indirect Indexed, Y        CMP (dp), Y       D1      x     x      x           2      5 (1,2,3)
DP Indirect Long Indexed, Y   CMP [dp], Y       D7                   x           2      6 (1,2)
Stack Relative (also SR)      CMP sr, S         C3                   x           2      4 (1)
SR Indirect Indexed, Y        CMP (sr, S), Y    D3                   x           2      7 (1)
++ CMP, a Primary Group Instruction, has available all of the Primary Group addressing modes and bit patterns
* Add 1 byte if m = 0 (16-bit memory/accumulator)
1 Add 1 cycle if m = 0 (16-bit memory/accumulator)
2 Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
3 Add 1 cycle if adding index crosses a page boundary
Co-Processor Enable
COP
Execution of COP causes a software interrupt, similarly to BRK, but through the separate COP vector.
Alternatively, COP may be trapped by a co-processor, such as a floating point or graphics processor, to call a
co-processor function. COP is unaffected by the i interrupt disable flag.
COP is much like BRK, with the program counter value pushed on the stack being incremented by
two; this lets you follow the co-processor instruction with a signature byte to indicate to the co-processor or
co-processor handling routine which operation to execute. Unlike the BRK instruction, 65816 assemblers require
you to follow the COP instruction with such a signature byte. Signature bytes in the range $80-$FF are
reserved by the Western Design Center for implementation of co-processor control; signatures in the range
$00-$7F are available for use with software-implemented COP handlers.
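How a software COP handler might locate the signature byte can be sketched in Python. This is an illustration under stated assumptions: the handler has recovered the pushed program counter value and the bank the COP was executed in, and `read_byte` is a hypothetical callback that returns the byte at a given bank and address. The function names are likewise hypothetical.

```python
def cop_signature(pushed_pc, program_bank, read_byte):
    """Return the signature byte that follows the COP opcode."""
    # The pushed value is the COP address plus two, so the signature
    # byte sits one byte below it.
    return read_byte(program_bank, (pushed_pc - 1) & 0xFFFF)


def is_reserved_signature(signature):
    """True for $80-$FF, the range reserved for co-processor control."""
    return signature >= 0x80
```

A handler written this way would typically use signatures in the $00-$7F range as an index into its own table of routines.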
6502 Emulation Mode (65802/65816, e=1): The program counter is incremented by two and pushed
onto the stack; the status register is pushed onto the stack; the interrupt disable flag is set; and the program
counter is loaded from the emulation mode co-processor vector at $FFF4-FFF5. The d decimal flag is cleared
after a COP is executed.
65802/65816 Native Mode (e = 0): The program counter bank register is pushed onto the stack; the
program counter is incremented by two and pushed onto the stack; the status register is pushed onto the stack;
the interrupt disable flag is set; the program bank register is cleared to zero; and the program counter is loaded
from the native mode co-processor vector at $00FFE4-00FFE5. The d decimal flag is reset to 0 after a COP is
executed.
[Figure: stack in bank 0 holding bank address, address high, address low, and contents of status register, with the stack pointer below the last byte pushed.]
Figure 18-4 Stack after COP
Flags Affected: - - - - d i - -
d   d is reset to 0.
i   The interrupt disable flag is set, disabling hardware interrupts.
Codes:
                                        Opcode  Available on:            # of   # of
Addressing Mode        Syntax           (hex)   6502  65C02  65802/816   Bytes  Cycles (notes)
Stack/Interrupt        COP const        02                   x           2*     7 (1)
* COP is 1 byte, but program counter value pushed onto stack is incremented by 2 allowing for optional code byte
1 Add 1 cycle for 65816/65802 native mode (e = 0)
CPX
Subtract the data located at the effective address specified by the operand from the contents of the X
register, setting the carry, zero, and negative flags based on the result, but without altering the contents of either
the memory location or the register. The result is not saved. The comparison is of unsigned values only (except
for signed comparison for equality).
The primary use for the CPX instruction is to test the value of the X index register against loop
boundaries, setting the flags so that a conditional branch can be executed.
8-bit index registers (all processors): Data compared is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Data compared is sixteen-bit: the low-order eight
bits of the data in memory are located at the effective address; the high-order eight bits are located at the
effective address plus one.
Flags Affected:
n - - - - - z c
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
c Set if no borrow required (X register value higher or same); cleared if borrow required (X register value lower).
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Immediate                    CPX #const       E0      x    x     x          2*     2 [1]
Absolute                     CPX addr         EC      x    x     x          3      4 [1]
Direct Page (also DP)        CPX dp           E4      x    x     x          2      3 [1,2]
*   Add 1 byte if x = 0 (16-bit index registers)
[1] Add 1 cycle if x = 0 (16-bit index registers)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
CPY
Subtract the data located at the effective address specified by the operand from the contents of the Y
register, setting the carry, zero, and negative flags based on the result, but without altering the contents of either
the memory location or the register. The comparison is of unsigned values only (except for signed comparison
for equality).
The primary use for the CPY instruction is to test the value of the Y index register against loop
boundaries, setting the flags so that a conditional branch can be executed.
8-bit index registers (all processors): Data compared is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Data compared is sixteen-bit: the low-order eight
bits of the data in memory are located at the effective address; the high-order eight bits are located at the effective
address plus one.
Flags Affected:
n - - - - - z c
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
c Set if no borrow required (Y register value higher or same); cleared if borrow required (Y register value lower).
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Immediate                    CPY #const       C0      x    x     x          2*     2 [1]
Absolute                     CPY addr         CC      x    x     x          3      4 [1]
Direct Page (also DP)        CPY dp           C4      x    x     x          2      3 [1,2]
*   Add 1 byte if x = 0 (16-bit index registers)
[1] Add 1 cycle if x = 0 (16-bit index registers)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
Decrement
DEC
Decrement by one the contents of the location specified by the operand (subtract one from the value).
Unlike subtracting a one using the SBC instruction, the decrement instruction is neither affected by nor
affects the carry flag. You can test for wraparound only by testing after every decrement to see if the value is
zero or negative. On the other hand, you don't need to set the carry before decrementing.
DEC is unaffected by the setting of the d (decimal) flag.
8-bit accumulator/memory (all processors): Data decremented is eight-bit.
16-bit accumulator/memory (65802/65816 only, m = 0): Data decremented is sixteen-bit: if in
memory, the low-order eight bits are located at the effective address; the high-order eight bits are located at the
effective address plus one.
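The wraparound behavior can be illustrated with a short Python model. It is a sketch under stated assumptions: the name `decrement` and the `sixteen_bit` parameter (standing in for m = 0) are hypothetical, and the model assumes the n and z flags reflect the result, which is how the 65x decrement instructions behave.

```python
def decrement(value, carry, sixteen_bit=False):
    """Model a DEC-style decrement: wraps around, updates n and z, leaves carry alone."""
    mask = 0xFFFF if sixteen_bit else 0xFF
    sign = 0x8000 if sixteen_bit else 0x80
    result = (value - 1) & mask          # $00 wraps to $FF (or $0000 to $FFFF)
    flags = {
        "n": bool(result & sign),        # assumed: set from the high bit of the result
        "z": result == 0,                # assumed: set when the result is zero
        "c": carry,                      # carry passes through unchanged
    }
    return result, flags


print(decrement(0x01, carry=False))
print(decrement(0x00, carry=True))
```

Because the carry is never touched, the only way to detect the wrap from zero is to test n or z after each decrement.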
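Flags Affected:
n - - - - - z -
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Accumulator                  DEC A            3A           x     x          1      2
Absolute                     DEC addr         CE      x    x     x          3      6 [1]
Direct Page (also DP)        DEC dp           C6      x    x     x          2      5 [1,2]
Absolute Indexed, X          DEC addr, X      DE      x    x     x          3      7 [1,3]
DP Indexed, X                DEC dp, X        D6      x    x     x          2      6 [1,2]
[1] Add 2 cycles if m = 0 (16-bit memory/accumulator)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
[3] Subtract 1 cycle if 65C02 and no page boundary crossed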
DEX
Decrement by one the contents of index register X (subtract one from the value). This is a special
purpose, implied addressing form of the DEC instruction.
Unlike using SBC to subtract a one from the value, the DEX instruction does not affect the carry flag;
you can test for wraparound only by testing after every decrement to see if the value is zero or negative. On the
other hand, you don't need to set the carry before decrementing.
DEX is unaffected by the setting of the d (decimal) flag.
8-bit index registers (all processors): Data decremented is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Data decremented is sixteen-bit.
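Flags Affected:
n - - - - - z -
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Implied                      DEX              CA      x    x     x          1      2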
DEY
Decrement by one the contents of index register Y (subtract one from the value). This is a special
purpose, implied addressing form of the DEC instruction.
Unlike using SBC to subtract a one from the value, the DEY instruction does not affect the carry flag;
you can test for wraparound only by testing after every decrement to see if the value is zero or negative. On the
other hand, you don't need to set the carry before decrementing.
DEY is unaffected by the setting of the d (decimal) flag.
8-bit index registers (all processors): Data decremented is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Data decremented is sixteen-bit.
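Flags Affected:
n - - - - - z -
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Implied                      DEY              88      x    x     x          1      2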
EOR
Bitwise logical Exclusive-OR the data located at the effective address specified by the operand with the
contents of the accumulator. Each bit in the accumulator is exclusive-ORed with the corresponding bit in
memory, and the result is stored into the same accumulator bit.
The truth table for the logical exclusive-OR operation is:
                       First Operand
                         0       1
Second Operand    0      0       1
                  1      1       0
A 1 or logical true results only if the two elements of the Exclusive-OR operation are different.
8-bit accumulator (all processors): Data exclusive-ORed from memory is eight-bit.
16-bit accumulator (65802/65816 only, m = 0): Data exclusive-ORed from memory is sixteen-bit: the
low-order eight bits are located at the effective address; the high-order eight bits are located at the effective
address plus one.
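The bit-by-bit operation can be modeled in a few lines of Python. This is a sketch, not a definitive implementation: the function name `eor` and the `sixteen_bit` parameter (standing in for m = 0) are hypothetical, and the model assumes n and z are set from the result, as they are for the other accumulator logic instructions.

```python
def eor(accumulator, operand, sixteen_bit=False):
    """Model EOR: exclusive-OR the operand into the accumulator, bit by bit."""
    mask = 0xFFFF if sixteen_bit else 0xFF
    sign = 0x8000 if sixteen_bit else 0x80
    result = (accumulator ^ operand) & mask   # a bit is 1 only where the inputs differ
    flags = {
        "n": bool(result & sign),             # assumed: high bit of the result
        "z": result == 0,                     # assumed: result is zero
    }
    return result, flags


print(eor(0b1100, 0b1010))
print(eor(0xFF, 0xFF))
print(eor(0x00FF, 0xFF00, sixteen_bit=True))
```

Exclusive-ORing a value with all ones complements it, and exclusive-ORing a value with itself yields zero.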
Flags Affected:
n - - - - - z -
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
Codes:
Addressing Mode ++           Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Immediate                    EOR #const       49      x    x     x          2*     2 [1]
Absolute                     EOR addr         4D      x    x     x          3      4 [1]
Absolute Long                EOR long         4F                 x          4      5 [1]
Direct Page (also DP)        EOR dp           45      x    x     x          2      3 [1,2]
DP Indirect                  EOR (dp)         52           x     x          2      5 [1,2]
DP Indirect Long             EOR [dp]         47                 x          2      6 [1,2]
Absolute Indexed, X          EOR addr, X      5D      x    x     x          3      4 [1,3]
Absolute Long Indexed, X     EOR long, X      5F                 x          4      5 [1]
Absolute Indexed, Y          EOR addr, Y      59      x    x     x          3      4 [1,3]
DP Indexed, X                EOR dp, X        55      x    x     x          2      4 [1,2]
DP Indexed Indirect, X       EOR (dp, X)      41      x    x     x          2      6 [1,2]
DP Indirect Indexed, Y       EOR (dp), Y      51      x    x     x          2      5 [1,2,3]
DP Indirect Long Indexed, Y  EOR [dp], Y      57                 x          2      6 [1,2]
Stack Relative (also SR)     EOR sr, S        43                 x          2      4 [1]
SR Indirect Indexed, Y       EOR (sr, S), Y   53                 x          2      7 [1]
++  EOR, a Primary Group Instruction, has available all of the Primary Group addressing modes and bit patterns
*   Add 1 byte if m = 0 (16-bit memory/accumulator)
[1] Add 1 cycle if m = 0 (16-bit memory/accumulator)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
[3] Add 1 cycle if adding index crosses a page boundary
Increment
INC
Increment by one the contents of the location specified by the operand (add one to the value).
Unlike adding a one with the ADC instruction, however, the increment instruction is neither affected by
nor affects the carry flag. You can test for wraparound only by testing after every increment to see if the result
is zero or positive. On the other hand, you don't have to clear the carry before incrementing.
The INC instruction is unaffected by the d (decimal) flag.
8-bit accumulator/memory (all processors): Data incremented is eight-bit.
16-bit accumulator/memory (65802/65816 only, m=0): Data incremented is sixteen-bit: if in
memory, the low-order eight bits are located at the effective address; the high-order eight bits are located at the
effective address plus one.
Flags Affected:
n - - - - - z -
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Accumulator                  INC A            1A           x     x          1      2
Absolute                     INC addr         EE      x    x     x          3      6 [1]
Direct Page (also DP)        INC dp           E6      x    x     x          2      5 [1,2]
Absolute Indexed, X          INC addr, X      FE      x    x     x          3      7 [1,3]
DP Indexed, X                INC dp, X        F6      x    x     x          2      6 [1,2]
[1] Add 2 cycles if m = 0 (16-bit memory/accumulator)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
[3] Subtract 1 cycle if 65C02 and no page boundary crossed
INX
Increment by one the contents of index register X (add one to the value). This is a special purpose,
implied addressing form of the INC instruction.
Unlike using ADC to add a one to the value, the INX instruction does not affect the carry flag. You can
execute it without first clearing the carry. But you can test for wraparound only by testing after every increment
to see if the result is zero or positive. The INX instruction is unaffected by the d (decimal) flag.
8-bit index registers (all processors): Data incremented is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Data incremented is sixteen-bit.
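Flags Affected:
n - - - - - z -
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Implied                      INX              E8      x    x     x          1      2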
INY
Increment by one the contents of index register Y (add one to the value). This is a special purpose,
implied addressing form of the INC instruction.
Unlike using ADC to add one to the value, the INY instruction does not affect the carry flag. You can
execute it without first clearing the carry. But you can test for wraparound only by testing after every increment
to see if the value is zero or positive. The INY instruction is unaffected by the d (decimal) flag.
8-bit index registers (all processors): Data incremented is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Data incremented is sixteen-bit.
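Flags Affected:
n - - - - - z -
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Implied                      INY              C8      x    x     x          1      2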
Jump
JMP
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Absolute                     JMP addr         4C      x    x     x          3      3
Absolute Indirect            JMP (addr)       6C      x    x     x          3      5 [1,2]
Absolute Indexed Indirect    JMP (addr, X)    7C           x     x          3      6
Absolute Long                JMP long         5C                 x          4      4
                             (or JML long)
Absolute Indirect Long       JMP [addr]       DC                 x          3      6
                             (or JML [addr])
[1] Add 1 cycle if 65C02
[2] 6502: If low byte of addr is $FF (i.e., addr is $xxFF): yields incorrect result
JSL
Jump-to-subroutine with long (24-bit) addressing: transfer control to the subroutine at the 24-bit address
which is the operand, after first pushing a 24-bit (long) return address onto the stack. This return address is the
address of the last instruction byte (the fourth instruction byte, or the third operand byte), not the address of the
next instruction; it is the return address minus one.
The current program counter bank is pushed onto the stack first, then the high-order byte of the return
address and then the low-order byte of the address are pushed on the stack in standard 65x order (low byte in the
lowest address, bank byte in the highest address). The stack pointer is adjusted after each byte is pushed to
point to the next lower byte (the next available stack location). The program counter bank register and program
counter are then loaded with the operand values, and control is transferred to the specified location.
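The push order and the "return address minus one" convention can be sketched in Python. This is an illustrative model only: the function name `jsl`, the dictionary used as memory, and the convention that `pc` holds the address of the JSL opcode are all assumptions made for the example.

```python
def jsl(memory, sp, pbr, pc, target):
    """Model JSL: push the 24-bit return address (minus one), then load PBR and PC.

    pc is assumed to be the address of the JSL opcode itself; target is a
    24-bit address. memory is a dict standing in for bank-zero stack memory.
    """
    last_byte = (pc + 3) & 0xFFFF            # address of the fourth instruction byte
    for byte in (pbr, last_byte >> 8, last_byte & 0xFF):
        memory[sp] = byte                    # bank first, then high byte, then low byte
        sp = (sp - 1) & 0xFFFF               # stack pointer moves to the next lower byte
    return sp, (target >> 16) & 0xFF, target & 0xFFFF


stack = {}
sp, pbr, pc = jsl(stack, sp=0x01FF, pbr=0x02, pc=0x8000, target=0x123456)
print(hex(sp), hex(pbr), hex(pc), {hex(k): hex(v) for k, v in stack.items()})
```

The low byte ends up at the lowest address and the bank byte at the highest, which is the standard 65x order the text describes.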
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Absolute Long                JSL long         22                 x          4      8
                             (or JSR long)
Jump to Subroutine
JSR
Transfer control to the subroutine at the location specified by the operand, after first pushing onto the
stack, as a return address, the current program counter value, that is, the address of the last instruction byte (the
third byte of a three-byte instruction, the fourth byte of a four-byte instruction), not the address of the next
instruction.
If an absolute operand is coded and is less than or equal to $FFFF, absolute addressing is assumed by
the assembler; if the value is greater than $FFFF, absolute long addressing is used.
If long addressing is used, the current program counter bank is pushed onto the stack first. Next (or
first, in the more normal case of intra-bank addressing) the high-order byte of the return address is pushed,
followed by the low-order byte. This leaves it on the stack in standard 65x order (lowest byte at the lowest
address, highest byte at the highest address). After the return address is pushed, the stack pointer points to the
next available location (next lower byte) on the stack. Finally, the program counter (and, in the case of long
addressing, the program counter bank register) is loaded with the values specified by the operand, and control is
transferred to the target location.
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Absolute                     JSR addr         20      x    x     x          3      6
Absolute Indexed Indirect    JSR (addr, X)    FC                 x          3      8
Absolute Long                JSR long         22                 x          4      8
                             (or JSL long)
LDA
Load the accumulator with the data located at the effective address specified by the operand.
8-bit accumulator (all processors): Data is eight-bit.
16-bit accumulator (65802/65816 only, m = 0): Data is sixteen-bit; the low-order eight bits are
located at the effective address; the high-order eight bits are located at the effective address plus one.
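The low-byte-first layout of sixteen-bit data can be shown with a minimal Python sketch. The function name `load`, the dictionary used as memory, and the `sixteen_bit` parameter (standing in for m = 0) are hypothetical, and the sketch ignores bank boundaries and flag updates.

```python
def load(memory, address, sixteen_bit=False):
    """Model the data fetch of an LDA-style load from the effective address."""
    low = memory[address]                 # low-order byte at the effective address
    if not sixteen_bit:
        return low
    high = memory[address + 1]            # high-order byte at effective address plus one
    return (high << 8) | low


memory = {0x2000: 0x34, 0x2001: 0x12}
print(hex(load(memory, 0x2000)))
print(hex(load(memory, 0x2000, sixteen_bit=True)))
```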
Flags Affected:
n - - - - - z -
n Set if most significant bit of loaded value is set; else cleared.
z Set if value loaded is zero; else cleared.
Codes:
Addressing Mode ++           Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Immediate                    LDA #const       A9      x    x     x          2*     2 [1]
Absolute                     LDA addr         AD      x    x     x          3      4 [1]
Absolute Long                LDA long         AF                 x          4      5 [1]
Direct Page (also DP)        LDA dp           A5      x    x     x          2      3 [1,2]
DP Indirect                  LDA (dp)         B2           x     x          2      5 [1,2]
DP Indirect Long             LDA [dp]         A7                 x          2      6 [1,2]
Absolute Indexed, X          LDA addr, X      BD      x    x     x          3      4 [1,3]
Absolute Long Indexed, X     LDA long, X      BF                 x          4      5 [1]
Absolute Indexed, Y          LDA addr, Y      B9      x    x     x          3      4 [1,3]
DP Indexed, X                LDA dp, X        B5      x    x     x          2      4 [1,2]
DP Indexed Indirect, X       LDA (dp, X)      A1      x    x     x          2      6 [1,2]
DP Indirect Indexed, Y       LDA (dp), Y      B1      x    x     x          2      5 [1,2,3]
DP Indirect Long Indexed, Y  LDA [dp], Y      B7                 x          2      6 [1,2]
Stack Relative (also SR)     LDA sr, S        A3                 x          2      4 [1]
SR Indirect Indexed, Y       LDA (sr, S), Y   B3                 x          2      7 [1]
++  LDA, a Primary Group Instruction, has available all of the Primary Group addressing modes and bit patterns
*   Add 1 byte if m = 0 (16-bit memory/accumulator)
[1] Add 1 cycle if m = 0 (16-bit memory/accumulator)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
[3] Add 1 cycle if adding index crosses a page boundary
LDX
Load index register X with the data located at the effective address specified by the operand.
8-bit index registers (all processors): Data is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Data is sixteen-bit: the low-order eight bits are
located at the effective address; the high-order eight bits are located at the effective address plus one.
Flags Affected:
n - - - - - z -
n Set if most significant bit of loaded value is set; else cleared.
z Set if value loaded is zero; else cleared.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Immediate                    LDX #const       A2      x    x     x          2*     2 [1]
Absolute                     LDX addr         AE      x    x     x          3      4 [1]
Direct Page (also DP)        LDX dp           A6      x    x     x          2      3 [1,2]
Absolute Indexed, Y          LDX addr, Y      BE      x    x     x          3      4 [1,3]
DP Indexed, Y                LDX dp, Y        B6      x    x     x          2      4 [1,2]
*   Add 1 byte if x = 0 (16-bit index registers)
[1] Add 1 cycle if x = 0 (16-bit index registers)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
[3] Add 1 cycle if adding index crosses a page boundary
LDY
Load index register Y with the data located at the effective address specified by the operand.
8-bit index registers (all processors): Data is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Data is sixteen-bit: the low-order eight bits are
located at the effective address; the high-order eight bits are located at the effective address plus one.
Flags Affected:
n - - - - - z -
n Set if most significant bit of loaded value is set; else cleared.
z Set if value loaded is zero; else cleared.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Immediate                    LDY #const       A0      x    x     x          2*     2 [1]
Absolute                     LDY addr         AC      x    x     x          3      4 [1]
Direct Page (also DP)        LDY dp           A4      x    x     x          2      3 [1,2]
Absolute Indexed, X          LDY addr, X      BC      x    x     x          3      4 [1,3]
DP Indexed, X                LDY dp, X        B4      x    x     x          2      4 [1,2]
*   Add 1 byte if x = 0 (16-bit index registers)
[1] Add 1 cycle if x = 0 (16-bit index registers)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
[3] Add 1 cycle if adding index crosses a page boundary
LSR
Logical shift the contents of the location specified by the operand right one bit. That is, bit zero takes
on the value originally found in bit one, bit one takes the value originally found in bit two, and so on; the
leftmost bit (bit 7 if the m memory select flag is one when the instruction is executed or bit 15 if it is zero) is
cleared; the rightmost bit, bit zero, is transferred to the carry flag. This is the arithmetic equivalent of unsigned
division by two.
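The shift can be modeled directly in Python. This is a sketch under stated assumptions: the function name `lsr` and the `sixteen_bit` parameter (standing in for m = 0) are hypothetical names chosen for the example.

```python
def lsr(value, sixteen_bit=False):
    """Model LSR: shift right one bit, clear the high bit, move bit zero into carry."""
    mask = 0xFFFF if sixteen_bit else 0xFF
    value &= mask
    carry = bool(value & 1)          # bit zero is transferred to the carry flag
    result = value >> 1              # a zero enters the leftmost bit
    flags = {"n": False, "z": result == 0, "c": carry}
    return result, flags


print(lsr(0x81))
print(lsr(0x01))
print(lsr(0x8000, sixteen_bit=True))
```

The result is the unsigned quotient of the original value divided by two, and the carry holds the remainder.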
[Figure 18-6 LSR. The operand shifts right one bit; bit zero moves into the Carry Flag.]
Flags Affected:
n - - - - - z c
n Cleared.
z Set if result is zero; else cleared.
c Low bit becomes carry: set if low bit was set; cleared if low bit was zero.
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Accumulator                  LSR A            4A      x    x     x          1      2
Absolute                     LSR addr         4E      x    x     x          3      6 [1]
Direct Page (also DP)        LSR dp           46      x    x     x          2      5 [1,2]
Absolute Indexed, X          LSR addr, X      5E      x    x     x          3      7 [1,3]
DP Indexed, X                LSR dp, X        56      x    x     x          2      6 [1,2]
[1] Add 2 cycles if m = 0 (16-bit memory/accumulator)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
[3] Subtract 1 cycle if 65C02 and no page boundary crossed
MVN
Moves (copies) a block of memory to a new location. The source, destination and length operands of
this instruction are taken from the X, Y, and C (double accumulator) registers; these should be loaded with the
correct values before executing the MVN instruction.
The source address for MVN, taken from the X register, should be the starting address (lowest in
memory) of the block to be moved. The destination address, in the Y register, should be the new starting
address for the moved block. The length, loaded into the double accumulator (the value in C is always used,
regardless of the setting of the m flag) should be the length of the block to be moved minus one; if C contains
$0005, six bytes will be moved. The two operand bytes of the MVN instruction specify the banks holding the
two blocks of memory: the first operand byte (of object code) specifies the destination bank; the second operand
byte specifies the source bank.
The execution sequence is: the first byte is moved from the address in X to the address in Y; then X and
Y are incremented, C is decremented, and the next byte is moved; this process continues until the number of
bytes specified by the value in C plus one is moved. In other words, until the value in C is $FFFF.
If the source and destination blocks do not overlap, then the source block remains intact after it has been
copied to the destination.
If the source and destination blocks do overlap, then MVN should be used only if the destination is
lower than the source, to avoid overwriting source bytes before they've been copied to the destination. If the
destination is higher, then the MVP instruction should be used instead.
When execution is complete, the value in C is $FFFF, registers X and Y each point one byte past the
end of the blocks to which they were pointing, and the data bank register holds the destination bank value (the
first operand byte).
Assembler syntax for the block move instruction calls for the operand field to be coded as two
addresses, source first, then destination (the more intuitive ordering, but the opposite of the actual operand
order in the object code). The assembler strips the bank bytes from the addresses (ignoring the rest) and reverses
them to object code order. If a block move instruction is interrupted, it may be resumed automatically via
execution of an RTI if all of the registers are restored or intact. The value pushed onto the stack when a block
move is interrupted is the address of the block move instruction. The current byte-move is completed before the
interrupt is serviced.
If the index registers are in eight-bit mode (x = 1), or the processor is in 6502 emulation mode (e = 1),
then the blocks being specified must necessarily be in page zero since the high bytes of the index registers will
contain zeroes.
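The execution sequence of MVN can be sketched as a Python loop. This is an illustrative model that assumes sixteen-bit index registers (x = 0); the function name `mvn` and the dictionary keyed by 24-bit address are hypothetical, and interrupts are not modeled.

```python
def mvn(memory, src_bank, dst_bank, x, y, c):
    """Model MVN: copy C+1 bytes upward from src_bank:X to dst_bank:Y.

    Returns the final (X, Y, C, data bank register) values.
    """
    while True:
        memory[(dst_bank << 16) | y] = memory.get((src_bank << 16) | x, 0)
        x = (x + 1) & 0xFFFF          # both index registers are incremented
        y = (y + 1) & 0xFFFF
        c = (c - 1) & 0xFFFF          # C counts down and wraps to $FFFF at the end
        if c == 0xFFFF:
            break
    return x, y, c, dst_bank          # data bank register holds the destination bank


memory = {0x012000 + i: i + 1 for i in range(6)}
print(mvn(memory, src_bank=0x01, dst_bank=0x02, x=0x2000, y=0x3000, c=0x0005))
print([memory[0x023000 + i] for i in range(6)])
```

Because bytes are copied from the lowest address upward, an overlapping move is safe only when the destination is lower than the source, which is the case the text assigns to MVN.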
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Block Move                   MVN srcbk,destbk 54                 x          3      *
*   7 cycles per byte moved
MVP
Moves (copies) a block of memory to a new location. The source, destination and length operands of
this instruction are taken from the X, Y, and C (double accumulator) registers; these should be loaded with the
correct values before executing the MVP instruction.
The source address for MVP, taken from the X register, should be the ending address (highest in
memory) of the block to be moved. The destination address, in the Y register, should be the new ending address
for the moved block. The length, loaded into the double accumulator (the value in C is always used, regardless
of the setting of the m flag) should be the length of the block to be moved minus one; if C contains $0005, six
bytes will be moved. The two operand bytes of the MVP instruction specify the banks holding the two blocks
of memory: the first operand byte (of object code) specifies the destination bank; the second operand byte
specifies the source bank.
The execution sequence is: the first byte is moved from the address in X to the address in Y; then X and
Y are decremented, C is decremented, and the previous byte is moved; this process continues until the number
of bytes specified by the value in C plus one is moved. In other words, until the value in C is $FFFF.
If the source and destination blocks do not overlap, then the source block remains intact after it has been
copied to the destination.
If the source and destination blocks do overlap, then MVP should be used only if the
destination is higher than the source, to avoid overwriting source bytes before they've been copied to the
destination. If the destination is lower, then the MVN instruction should be used instead.
When execution is complete, the value in C is $FFFF, registers X and Y each point one byte past (that is, below) the
beginning of the blocks to which they were pointing, and the data bank register holds the destination bank
value (the first operand byte).
Assembler syntax for the block move instruction calls for the operand field to be coded as two
addresses, source first, then destination (the more intuitive ordering, but the opposite of the actual operand
order in the object code). The assembler strips the bank bytes from the addresses (ignoring the rest) and reverses
them to object code order. If a block move instruction is interrupted, it may be resumed automatically via
execution of an RTI if all of the registers are restored or intact. The value pushed onto the stack when a block
move is interrupted is the address of the block move instruction. The current byte-move is completed before the
interrupt is serviced.
If the index registers are in eight-bit mode (x = 1), or the processor is in 6502 emulation mode
(e = 1), then the blocks being specified must necessarily be in page zero since the high bytes of the index registers
will contain zeroes.
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Block Move                   MVP srcbk,destbk 44                 x          3      *
*   7 cycles per byte moved
No Operation
NOP
Executing a NOP takes no action; it has no effect on any 65x registers or memory, except the program
counter, which is incremented once to point to the next instruction.
Its primary uses are during debugging, where it is used to patch out unwanted code, or as a placeholder,
included in the assembler source, where you anticipate you may have to patch in instructions and
want to leave a hole for the patch.
NOP may also be used to expand timing loops: each NOP instruction takes two cycles to execute, so
adding one or more may help fine-tune a timing loop.
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Implied                      NOP              EA      x    x     x          1      2
ORA
Bitwise logical OR the data located at the effective address specified by the operand with the contents
of the accumulator. Each bit in the accumulator is ORed with the corresponding bit in memory. The result is
stored into the same accumulator bit.
The truth table for the logical OR operation is:
                      Second Operand
                         0       1
First Operand     0      0       1
                  1      1       1
A 1 or logical true results if either of the two operands of the OR operation is true.
8-bit accumulator (all processors): Data ORed from memory is eight-bit.
16-bit accumulator (65802/65816 only, m=0): Data ORed from memory is sixteen-bit: the low-order
eight bits are located at the effective address; the high-order eight bits are located at the effective address plus
one.
Flags Affected:
n - - - - - z -
n Set if most significant bit of result is set; else cleared.
z Set if result is zero; else cleared.
Codes:
Addressing Mode ++           Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Immediate                    ORA #const       09      x    x     x          2*     2 [1]
Absolute                     ORA addr         0D      x    x     x          3      4 [1]
Absolute Long                ORA long         0F                 x          4      5 [1]
Direct Page (also DP)        ORA dp           05      x    x     x          2      3 [1,2]
DP Indirect                  ORA (dp)         12           x     x          2      5 [1,2]
DP Indirect Long             ORA [dp]         07                 x          2      6 [1,2]
Absolute Indexed, X          ORA addr, X      1D      x    x     x          3      4 [1,3]
Absolute Long Indexed, X     ORA long, X      1F                 x          4      5 [1]
Absolute Indexed, Y          ORA addr, Y      19      x    x     x          3      4 [1,3]
DP Indexed, X                ORA dp, X        15      x    x     x          2      4 [1,2]
DP Indexed Indirect, X       ORA (dp, X)      01      x    x     x          2      6 [1,2]
DP Indirect Indexed, Y       ORA (dp), Y      11      x    x     x          2      5 [1,2,3]
DP Indirect Long Indexed, Y  ORA [dp], Y      17                 x          2      6 [1,2]
Stack Relative (also SR)     ORA sr, S        03                 x          2      4 [1]
SR Indirect Indexed, Y       ORA (sr, S), Y   13                 x          2      7 [1]
++  ORA, a Primary Group Instruction, has available all of the Primary Group addressing modes and bit patterns
*   Add 1 byte if m = 0 (16-bit memory/accumulator)
[1] Add 1 cycle if m = 0 (16-bit memory/accumulator)
[2] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
[3] Add 1 cycle if adding index crosses a page boundary
PEA
Push the sixteen-bit operand (typically an absolute address) onto the stack. The stack pointer is
decremented twice. This operation always pushes sixteen bits of data, irrespective of the settings of the m and x
mode select flags.
Although the mnemonic suggests that the sixteen-bit value pushed on the stack be considered an
address, the instruction may also be considered a push sixteen-bit immediate data instruction, although the
syntax of immediate addressing is not used. The assembler syntax is that of the absolute addressing mode, that
is, a label or sixteen-bit value in the operand field. Unlike all other instructions that use this assembler syntax,
the effective address itself, rather than the data stored at the effective address, is what is accessed (and in this
case, pushed onto the stack).
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Stack (Absolute)             PEA addr         F4                 x          3      5
PEI
Push the sixteen-bit value located at the address formed by adding the direct page offset
specified by the operand to the direct page register. The mnemonic implies that the sixteen-bit data
pushed is considered an address, although it can be any sixteen-bit data. This operation always pushes
sixteen bits of data, irrespective of the settings of the m and x mode select flags.
The first byte pushed is the byte at the direct page offset plus one (the high byte of the double
byte stored at the direct page offset). The byte at the direct page offset itself (the low byte) is pushed
next. The stack pointer now points to the next available stack location, directly below the last byte
pushed.
The assembler syntax is that of direct page indirect; however, unlike other instructions which
use this assembler syntax, the effective indirect address, rather than the data stored at that address, is
what is accessed and pushed onto the stack.
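The fetch-and-push sequence can be sketched in Python. This is a hedged illustration: the function name `pei`, the dictionary used as bank-zero memory, and the parameter names are hypothetical.

```python
def pei(memory, sp, d, dp_offset):
    """Model PEI: push the sixteen-bit value stored at direct page offset dp_offset."""
    address = (d + dp_offset) & 0xFFFF        # direct page register plus the operand
    low = memory[address]                     # low byte at the direct page offset
    high = memory[(address + 1) & 0xFFFF]     # high byte at the offset plus one
    memory[sp] = high                         # high byte is pushed first
    sp = (sp - 1) & 0xFFFF
    memory[sp] = low                          # then the low byte
    sp = (sp - 1) & 0xFFFF
    return sp                                 # next available stack location


memory = {0x0310: 0x78, 0x0311: 0x56}
print(hex(pei(memory, sp=0x01FF, d=0x0300, dp_offset=0x10)))
print(hex(memory[0x01FF]), hex(memory[0x01FE]))
```

The value pushed is the pointer held in the direct page, not the data that pointer refers to.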
Flags Affected:
- - - - - - - -
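Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Stack (Direct Page Indirect) PEI (dp)         D4                 x          2      6 [1]
[1] Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)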
PER
Add the current value of the program counter to the sixteen-bit signed displacement in the operand, and
push the result on the stack. This operation always pushes sixteen bits of data, irrespective of the settings of the
m and x mode select flags.
The high byte of the sum is pushed first, then the low byte is pushed. After the instruction is completed,
the stack pointer points to the next available stack location, immediately below the last byte pushed.
Because PER's operand is a displacement relative to the current value of the program counter (as with
the branch instructions), this instruction is helpful in writing self-relocatable code in which an address within
the program (typically of a data area) must be accessed. The address pushed onto the stack will be the run-time
address of the data area, regardless of where the program was loaded in memory; it may be pulled into a
register, stored in an indirect pointer, or used on the stack with the stack relative indirect indexed addressing
mode to access the data at that location.
As is the case with the branch instructions, the syntax used is to specify as the operand the label of the
data area you want to reference. This location must be in the program bank, since the displacement is relative
to the program counter. The assembler converts the assembly-time label into a displacement from the
assembly-time address of the next instruction.
The value of the program counter used in the addition is the address of the next instruction, that is, the
instruction following the PER instruction.
PER may also be used to push return addresses on the stack, either as part of a simulated branch-to-subroutine
or to place the return address beneath the stacked parameters to a subroutine call; always remember
that a pushed return address should be the desired return address minus one.
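The displacement arithmetic behind PER can be worked through in Python. The sketch assumes PER is a three-byte instruction, so the next instruction lies three bytes past the PER opcode; the function names are hypothetical.

```python
def per_displacement(per_address, label_address):
    """Assembly time: sixteen-bit displacement from the next instruction to the label."""
    next_instruction = (per_address + 3) & 0xFFFF
    return (label_address - next_instruction) & 0xFFFF


def per_pushed_value(per_address, displacement):
    """Run time: the value PER pushes, wherever the code happens to be loaded."""
    return (per_address + 3 + displacement) & 0xFFFF


# Assembled at $8000 with the data area at $8100, then loaded $1000 higher.
disp = per_displacement(0x8000, 0x8100)
print(hex(disp), hex(per_pushed_value(0x9000, disp)))
```

The same displacement yields the relocated data address when the program runs at a different location, which is what makes the instruction useful for self-relocatable code.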
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode                          Syntax      Opcode  Available on:         # of   # of
                                                     (hex)   6502 65C02 65802/816  Bytes  Cycles
Stack (Program Counter Relative Long)    PER label   62                 x          3      6
Push Accumulator
PHA
Push the accumulator onto the stack. The accumulator itself is unchanged.
8-bit accumulator (all processors): The single-byte contents of the accumulator are pushed:
they are stored to the location pointed to by the stack pointer, and the stack pointer is decremented.
16-bit accumulator (65802/65816 only, m = 0): Both accumulator bytes are pushed. The
high byte is pushed first, then the low byte. The stack pointer now points to the next available stack
location, directly below the last byte pushed.
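The push order for the two accumulator sizes can be sketched in Python. The function name `pha`, the dictionary used as stack memory, and the `sixteen_bit` parameter (standing in for m = 0) are hypothetical names for this illustration.

```python
def pha(memory, sp, accumulator, sixteen_bit=False):
    """Model PHA: store the accumulator at the stack pointer, then decrement it."""
    if sixteen_bit:
        memory[sp] = (accumulator >> 8) & 0xFF   # high byte is pushed first
        sp = (sp - 1) & 0xFFFF
    memory[sp] = accumulator & 0xFF              # low (or only) byte
    sp = (sp - 1) & 0xFFFF
    return sp                                    # next available stack location


stack = {}
print(hex(pha(stack, 0x01FF, 0x1234, sixteen_bit=True)))
print({hex(k): hex(v) for k, v in stack.items()})
```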
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Stack (Push)                 PHA              48      x    x     x          1      3 [1]
[1] Add 1 cycle if m = 0 (16-bit memory/accumulator)
PHB
Push the contents of the data bank register onto the stack.
The single-byte contents of the data bank register are pushed onto the stack; the stack pointer now
points to the next available stack location, directly below the byte pushed. The data bank register itself is
unchanged. Since the data bank register is an eight-bit register, only one byte is pushed onto the stack,
regardless of the settings of the m and x mode select flags.
While the 65816 always generates 24-bit addresses, most memory references are specified by a sixteen-bit
address. These addresses are concatenated with the contents of the data bank register to form a full 24-bit
address. This instruction lets the current value of the data bank register be saved prior to loading a new value.
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Stack (Push)                 PHB              8B                 x          1      3
PHD
Push the contents of the direct page register D onto the stack.
Since the direct page register is always a sixteen-bit register, this is always a sixteen-bit operation,
regardless of the settings of the m and x mode select flags. The high byte of the direct page register is pushed
first, then the low byte. The direct page register itself is unchanged. The stack pointer now points to the next
available stack location, directly below the last byte pushed.
By pushing the D register onto the stack, the local environment of a calling subroutine may easily be
saved by a called subroutine before modifying the D register to provide itself with its own direct page memory.
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Stack (Push)                 PHD              0B                 x          1      4
PHK
Flags Affected:
- - - - - - - -
Codes:
Addressing Mode              Syntax           Opcode  Available on:         # of   # of
                                              (hex)   6502 65C02 65802/816  Bytes  Cycles
Stack (Push)                 PHK              4B                 x          1      3
PHP
Push the contents of the processor status register P onto the stack.
Since the status register is always an eight-bit register, this is always an eight-bit operation, regardless
of the settings of the m and x mode select flags on the 65802/65816. The status register contents are not
changed by the operation. The stack pointer now points to the next available stack location, directly below the
byte pushed.
This provides the means for saving either the current mode settings or a particular set of status flags so
they may be restored or in some other way used later.
Note, however, that the e bit (the 6502 emulation mode flag on the 65802/65816) is not pushed onto the
stack or otherwise accessed or saved. The only access to the e flag is via the XCE instruction.
Flags Affected:
- - - - - - - -
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Push)      PHP       08      x     x      x           1      3
PHX
Push the contents of the X index register onto the stack. The register itself is unchanged.
8-bit index registers (all processors): The eight-bit contents of the index register are pushed onto the
stack. The stack pointer now points to the next available stack location, directly below the byte pushed.
16-bit index registers (65802/65816 only, x=0): The sixteen-bit contents of the index register are
pushed. The high byte is pushed first, then the low byte. The stack pointer now points to the next available
stack location, directly below the last byte pushed.
Flags Affected:
- - - - - - - -
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Push)      PHX       DA            x      x           1      3 (1)
(1) Add 1 cycle if x=0 (16-bit index registers)
PHY
Push the contents of the Y index register onto the stack. The register itself is unchanged.
8-bit index registers (all processors): The eight-bit contents of the index register are pushed onto the
stack. The stack pointer now points to the next available stack location, directly below the byte pushed.
16-bit index registers (65802/65816 only, x = 0): The sixteen-bit contents of the index register are
pushed. The high byte is pushed first, then the low byte. The stack pointer now points to the next available
stack location, directly below the last byte pushed.
Flags Affected:
- - - - - - - -
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Push)      PHY       5A            x      x           1      3 (1)
(1) Add 1 cycle if x=0 (16-bit index registers)
Pull Accumulator
PLA
Pull the value on the top of the stack into the accumulator. The previous contents of the accumulator
are destroyed.
8-bit accumulator (all processors): The stack pointer is first incremented. Then the byte pointed to
by the stack pointer is loaded into the accumulator.
16-bit accumulator (65802/65816 only, m = 0): Both accumulator bytes are pulled. The
accumulator's low byte is pulled first, then the high byte is pulled.
Note that unlike some other microprocessors, the 65x pull instructions set the negative and zero flags.
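The pull sequence and its effect on the flags can be sketched as follows. This is a hedged Python model rather than assembler code; pull_accumulator and the dictionary used as stack memory are hypothetical names introduced for this example.

```python
def pull_accumulator(memory, sp, m_flag):
    """Model PLA. Returns (value, new_sp, n, z).

    memory is a dict standing in for stack RAM (an assumption of this sketch).
    m_flag = 1 selects an eight-bit accumulator, m_flag = 0 a sixteen-bit one.
    """
    sp = (sp + 1) & 0xFFFF             # stack pointer is incremented first
    value = memory[sp]                 # low (or only) byte is pulled
    width = 8
    if m_flag == 0:
        sp = (sp + 1) & 0xFFFF
        value |= memory[sp] << 8       # high byte is pulled second
        width = 16
    n = (value >> (width - 1)) & 1     # negative flag mirrors the top bit
    z = 1 if value == 0 else 0         # zero flag is set if the value is zero
    return value, sp, n, z


stack = {0x01FE: 0x00, 0x01FF: 0x80}
wide = pull_accumulator(stack, 0x01FD, m_flag=0)
narrow = pull_accumulator(stack, 0x01FD, m_flag=1)
```

With the same stack contents, the sixteen-bit pull yields $8000 with n set, while the eight-bit pull yields $00 with z set.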
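Flags Affected:
n - - - - - z -
n  Set if most significant bit of pulled value is set; else cleared.
z  Set if value pulled is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Pull)      PLA       68      x     x      x           1      4 (1)
(1) Add 1 cycle if m=0 (16-bit memory/accumulator)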
PLB
Pull the eight-bit value on top of the stack into the data bank register B, switching the data bank to that
value. All instructions which reference data that specify only sixteen-bit addresses will get their bank address
from the value pulled into the data bank register. Apart from the block move instructions (MVN and MVP),
which leave the destination bank in the data bank register, this is the only instruction that can modify the
data bank register.
Since the bank register is an eight-bit register, only one byte is pulled from the stack, regardless of the
settings of the m and x mode select flags. The stack pointer is first incremented. Then the byte pointed to by
the stack pointer is loaded into the register.
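Flags Affected:
n - - - - - z -
n  Set if most significant bit of pulled value is set; else cleared.
z  Set if value pulled is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Pull)      PLB       AB                   x           1      4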
PLD
Pull the sixteen-bit value on top of the stack into the direct page register D, switching the direct page to
that value.
PLD is typically used to restore the direct page register to a previous value.
Since the direct page register is a sixteen-bit register, two bytes are pulled from the stack, regardless of
the settings of the m and x mode select flags. The low byte of the direct page register is pulled first, then the
high byte. The stack pointer now points to where the high byte just pulled was stored; this is now the next
available stack location.
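Flags Affected:
n - - - - - z -
n  Set if most significant bit of pulled value is set; else cleared.
z  Set if value pulled is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Pull)      PLD       2B                   x           1      5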
PLP
Pull the eight-bit value on the top of the stack into the processor status register P, switching the status
byte to that value.
Since the status register is an eight-bit register, only one byte is pulled from the stack, regardless of the
settings of the m and x mode select flags on the 65802/65816. The stack pointer is first incremented. Then the
byte pointed to by the stack pointer is loaded into the status register.
This provides the means for restoring either previous mode settings or a particular set of status flags that
reflect the result of a previous operation.
Note, however, that the e flag (the 6502 emulation mode flag on the 65802/65816) is not on the stack, so
it cannot be pulled from it. The only means of setting the e flag is the XCE instruction.
Flags Affected:
n v - b d i z c   (6502, 65C02, 65802/65816 emulation mode e=1)
n v m x d i z c   (65802/65816 native mode e=0)
All flags are replaced by the values in the byte pulled from the stack.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Pull)      PLP       28      x     x      x           1      4
PLX
Pull the value on the top of the stack into the X index register. The previous contents of the register are
destroyed.
8-bit index registers (all processors): The stack pointer is first incremented. Then the byte pointed to
by the stack pointer is loaded into the register.
16-bit index registers (65802/65816 only, x = 0): Both bytes of the index register are pulled. First the
low-order byte of the index register is pulled, then the high-order byte of the index register is pulled.
Unlike some other microprocessors, the 65x instructions to pull an index register affect the negative and
zero flags.
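Flags Affected:
n - - - - - z -
n  Set if most significant bit of pulled value is set; else cleared.
z  Set if value pulled is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Pull)      PLX       FA            x      x           1      4 (1)
(1) Add 1 cycle if x = 0 (16-bit index registers)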
PLY
Pull the value on the top of the stack into the Y index register. The previous contents of the register are
destroyed.
8-bit index registers (all processors): The stack pointer is first incremented. Then the byte pointed to
by the stack pointer is loaded into the register.
16-bit index registers (65802/65816 only, x = 0): Both bytes of the index register are pulled. First the
low-order byte of the index register is pulled, then the high-order byte of the index register is pulled.
Unlike some other microprocessors, the 65x instructions to pull an index register affect the negative and
zero flags.
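Flags Affected:
n - - - - - z -
n  Set if most significant bit of pulled value is set; else cleared.
z  Set if value pulled is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (Pull)      PLY       7A            x      x           1      4 (1)
(1) Add 1 cycle if x = 0 (16-bit index registers)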
REP
For each bit set to one in the operand byte, reset the corresponding bit in the status register to zero. For
example, if bit three is set in the operand byte, bit three in the status register (the decimal flag) is reset to zero by
this instruction. Zeroes in the operand byte cause no change to their corresponding status register bits.
This instruction lets you reset any flag or flags in the status register with a single two-byte instruction.
Further, it is the only direct means of resetting several of the flags, including the m and x mode select flags
(although instructions that pull the P status register affect the m and x mode select flags).
6502 emulation mode (65802/65816, e=1): Neither the break flag nor bit five (the 6502's undefined
flag bit) is affected by REP.
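The masking rule for REP can be written out as a one-line computation. The following Python sketch is only a model of the behavior described above; the function name rep and its emulation parameter are assumptions introduced for illustration.

```python
def rep(p, operand, emulation=False):
    """Model REP: clear every status-register bit whose operand bit is one.

    p is the eight-bit status register. In emulation mode, bits five and
    four (the undefined bit and the break flag) are left untouched.
    """
    if emulation:
        operand &= 0xCF                # ignore operand bits 5 and 4
    return p & ~operand & 0xFF


native_p = rep(0xFF, 0x30)             # like REP #$30: clears m and x
emulated_p = rep(0xFF, 0x30, emulation=True)
decimal_cleared = rep(0x0B, 0x08)      # bit three (decimal flag) reset
```

In native mode an operand of $30 clears both mode select flags, which selects sixteen-bit accumulator and index registers; in the emulation-mode case the same operand changes nothing.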
Flags Affected:
n v - - d i z c   (65802/65816 emulation mode e=1)
n v m x d i z c   (65802/65816 native mode e=0)
All flags for which an operand bit is set are reset to zero.
All other flags are unaffected by the instruction.
Codes:
                                Opcode  Available on:            # of   # of
Addressing Mode   Syntax        (hex)   6502  65C02  65802/816   Bytes  Cycles
Immediate         REP #const    C2                   x           2      3
ROL
Rotate the contents of the location specified by the operand left one bit. Bit one takes on the
value originally found in bit zero, bit two takes the value originally in bit one, and so on; the rightmost
bit, bit zero, takes the value in the carry flag; the leftmost bit (bit 7 on the 6502 and 65C02 or if m = 1
on the 65802/65816, or bit 15 if m = 0) is transferred into the carry flag.
Figure 18-8 ROL (operand bits rotate left through the carry flag)
8-bit accumulator/memory (all processors): Data rotated is eight bits, plus carry.
16-bit accumulator/memory (65802/65816 only, m=0): Data rotated is sixteen bits, plus carry: if in
memory, the low-order eight bits are located at the effective address; the high eight bits are located at the
effective address plus one.
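A rotate through carry is easiest to follow with concrete values. The sketch below is a small Python model of ROL, not processor code; the function name rol and the width parameter (8 for m = 1, 16 for m = 0) are assumptions made for this example.

```python
def rol(value, carry, width=8):
    """Model ROL: rotate value left one bit through the carry.

    Returns (result, new_carry, n, z).
    """
    mask = (1 << width) - 1
    new_carry = (value >> (width - 1)) & 1   # old high bit moves into the carry
    result = ((value << 1) | carry) & mask   # old carry moves into bit zero
    n = (result >> (width - 1)) & 1
    z = 1 if result == 0 else 0
    return result, new_carry, n, z


eight_bit = rol(0x41, 1)                     # %01000001 with carry set
sixteen_bit = rol(0x8000, 1, width=16)
```

Rotating $41 with the carry set gives $83 and clears the carry; rotating the sixteen-bit value $8000 with the carry set gives $0001 and sets the carry again.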
Flags Affected:
n - - - - - z c
n  Set if most significant bit of result is set; else cleared.
z  Set if result is zero; else cleared.
c  High bit becomes carry: set if high bit was set; cleared if high bit was clear.
Codes:
                                    Opcode  Available on:            # of   # of
Addressing Mode       Syntax        (hex)   6502  65C02  65802/816   Bytes  Cycles
Accumulator           ROL A         2A      x     x      x           1      2
Absolute              ROL addr      2E      x     x      x           3      6 (1)
Direct Page (also DP) ROL dp        26      x     x      x           2      5 (1,2)
Absolute Indexed, X   ROL addr, X   3E      x     x      x           3      7 (1,3)
DP Indexed, X         ROL dp, X     36      x     x      x           2      6 (1,2)
(1) Add 2 cycles if m=0 (16-bit memory/accumulator)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
(3) Subtract 1 cycle if 65C02 and no page boundary crossed
ROR
Rotate the contents of the location specified by the operand right one bit. Bit zero takes on the value
originally found in bit one, bit one takes the value originally in bit two, and so on; the leftmost bit (bit 7 on the
6502 and 65C02 or if m = 1 on the 65802/65816, or bit 15 if m = 0) takes the value in the carry flag; the
rightmost bit, bit zero, is transferred into the carry flag.
8-bit accumulator/memory (all processors): Data rotated is eight bits, plus carry.
16-bit accumulator/memory (65802/65816 only, m=0): Data rotated is sixteen bits, plus carry: if in
memory, the low-order eight bits are located at the effective address; the high-order eight bits are located at the
effective address plus one.
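The right rotate mirrors the left rotate. As a hedged illustration, here is a short Python model of ROR; the name ror and the width parameter (8 for m = 1, 16 for m = 0) are assumptions of this sketch, not part of any assembler.

```python
def ror(value, carry, width=8):
    """Model ROR: rotate value right one bit through the carry.

    Returns (result, new_carry, n, z).
    """
    new_carry = value & 1                            # old bit zero moves into the carry
    result = (value >> 1) | (carry << (width - 1))   # old carry moves into the high bit
    n = (result >> (width - 1)) & 1
    z = 1 if result == 0 else 0
    return result, new_carry, n, z


eight_bit = ror(0x02, 1)
sixteen_bit = ror(0x0001, 1, width=16)
```

Rotating $02 with the carry set gives $81 with the carry clear; rotating the sixteen-bit value $0001 with the carry set gives $8000 and leaves the carry set.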
Flags Affected:
n - - - - - z c
n  Set if most significant bit of result is set; else cleared.
z  Set if result is zero; else cleared.
c  Low bit becomes carry: set if low bit was set; cleared if low bit was clear.
Codes:
                                    Opcode  Available on:            # of   # of
Addressing Mode       Syntax        (hex)   6502  65C02  65802/816   Bytes  Cycles
Accumulator           ROR A         6A      x     x      x           1      2
Absolute              ROR addr      6E      x     x      x           3      6 (1)
Direct Page (also DP) ROR dp        66      x     x      x           2      5 (1,2)
Absolute Indexed, X   ROR addr, X   7E      x     x      x           3      7 (1,3)
DP Indexed, X         ROR dp, X     76      x     x      x           2      6 (1,2)
(1) Add 2 cycles if m = 0 (16-bit memory/accumulator)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
(3) Subtract 1 cycle if 65C02 and no page boundary crossed
RTI
Pull the status register and the program counter from the stack. If the 65802/65816 is set to
native mode (e = 0), also pull the program bank register from the stack.
RTI pulls values off the stack in the reverse order they were pushed onto it by hardware or
software interrupts. The RTI instruction, however, has no way of knowing whether the values pulled
off the stack into the status register and the program counter are valid or even, for that matter, that an
interrupt has ever occurred. It blindly pulls the first three (or four) bytes off the top of the stack and
stores them into the various registers.
Unlike the RTS instruction, the program counter address pulled off the stack is the exact
address to return to; the value on the stack is the value loaded into the program counter. It does not
need to be incremented as a subroutine's return address does.
Pulling the status register gives the status flags the values they had immediately prior to the
start of interrupt-processing.
One more byte is pulled in 65802/65816 native mode than in emulation mode: the program bank
register, the same extra byte that is pushed by interrupts in native mode. It is therefore
essential that the return from interrupt be executed in the same mode (emulation or native) as the
original interrupt.
6502, 65C02, and Emulation Mode (e = 1): The status register is pulled from the stack, then
the program counter is pulled from the stack (three bytes are pulled).
65802/65816 Native Mode (e = 0): The status register is pulled from the stack, then the
program counter is pulled from the stack, then the program bank register is pulled from the stack (four
bytes are pulled).
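The difference between the three-byte and four-byte pulls can be sketched in a few lines. This is a simplified Python model under stated assumptions: the name rti and the dictionary used as stack memory are hypothetical, the program counter is taken to be stacked low byte below high byte, and the emulation-mode confinement of the stack to page one is not modeled.

```python
def rti(memory, sp, native):
    """Model RTI. Returns (p, pc, pbr, new_sp); pbr is None in emulation mode."""
    def pull():
        nonlocal sp
        sp = (sp + 1) & 0xFFFF
        return memory[sp]

    p = pull()                         # status register comes off first
    pc = pull()                        # then the program counter, low byte...
    pc |= pull() << 8                  # ...followed by its high byte
    pbr = pull() if native else None   # native mode also pulls the program bank
    return p, pc, pbr, sp


stack = {0x01FC: 0x30, 0x01FD: 0x34, 0x01FE: 0x12, 0x01FF: 0x05}
native_result = rti(stack, 0x01FB, native=True)
emulation_result = rti(stack, 0x01FB, native=False)
```

Note that the program counter value is used exactly as stacked; no increment is applied, in contrast to RTS and RTL.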
Figure 18-10 Native Mode Stack before RTI (stack in bank 0)
Flags Affected:
n v - - d i z c   (6502, 65C02, 65802/65816 emulation mode e = 1)
n v m x d i z c   (65802/65816 native mode e = 0)
All flags are restored to their values prior to interrupt (each flag takes
the value of its corresponding bit in the stacked status byte, except that
the Break flag is ignored).
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (RTI)       RTI       40      x     x      x           1      6 (1)
(1) Add 1 cycle for 65802/65816 native mode (e=0)
RTL
Pull the program counter (incrementing the stacked, sixteen-bit value by one before loading the
program counter with it), then the program bank register from the stack.
When a subroutine in another bank is called (via a jump to subroutine long instruction), the
current bank address is pushed onto the stack along with the return address. To return to the calling
bank, a long return instruction must be executed, which first pulls the return address from the stack,
increments it, and loads the program counter with it, then pulls the calling bank from the stack and
loads the program bank register. This transfers control to the instruction immediately following the
original jump to subroutine long.
Figure 18-11 Stack before RTL (stack in bank 0)
Flags Affected:
- - - - - - - -
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (RTL)       RTL       6B                   x           1      6
RTS
Pull the program counter, incrementing the stacked, sixteen-bit value by one before loading the
program counter with it.
When a subroutine is called (via a jump to subroutine instruction), the current return address is
pushed onto the stack. To return to the code following the subroutine call, a return instruction must be
executed, which pulls the return address from the stack, increments it, and loads the program counter
with it, transferring control to the instruction immediately following the jump to subroutine.
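The increment applied to the stacked address is the detail most often overlooked. The following Python sketch models it under stated assumptions: the name rts and the dictionary used as stack memory are hypothetical, and the return address is taken to be stacked with its low byte below its high byte.

```python
def rts(memory, sp):
    """Model RTS. Returns (new_pc, new_sp)."""
    sp = (sp + 1) & 0xFFFF
    low = memory[sp]                   # low byte of the stacked address
    sp = (sp + 1) & 0xFFFF
    high = memory[sp]                  # high byte of the stacked address
    stacked = (high << 8) | low
    return (stacked + 1) & 0xFFFF, sp  # stacked value is incremented before use


stack = {0x01FE: 0x02, 0x01FF: 0x80}   # stacked value $8002
new_pc, new_sp = rts(stack, 0x01FD)
```

Here the stacked value $8002 results in execution resuming at $8003.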
Figure 18-12 Stack before RTS (stack in bank 0)
Flags Affected:
- - - - - - - -
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Stack (RTS)       RTS       60      x     x      x           1      6
SBC
Subtract the data located at the effective address specified by the operand from the contents of the
accumulator; subtract one more if the carry flag is clear, and store the result in the accumulator.
The 65x processors have no subtract instruction that does not involve the carry. To avoid subtracting
the carry flag from the result, either you must be sure it is set or you must explicitly set it (using SEC) prior to
executing the SBC instruction.
In a multi-precision (multi-word) subtract, you set the carry before the low words are subtracted. The
low word subtraction generates a new carry flag value based on the subtraction. The carry is set if no borrow
was required and cleared if borrow was required. The complement of the new carry flag (one if the carry is
clear) is subtracted during the next subtraction, and so on. Each result thus correctly reflects the borrow from
the previous subtraction.
Note that this use of the carry flag is the opposite of the way the borrow flag is used by some other
processors, which clear (not set) the carry if no borrow was required.
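The way the carry chains a borrow from one word to the next can be checked with a small model. This Python sketch covers binary mode only (d flag clear); the names sbc and subtract32 are assumptions introduced for illustration, and the identity used (adding the complement of the operand plus the carry) is a standard way of expressing subtract-with-borrow rather than a statement about the processor's internals.

```python
def sbc(a, operand, carry, width=16):
    """Binary-mode SBC: a - operand - (1 - carry). Returns (result, carry_out).

    carry_out is 1 if no borrow was required and 0 if a borrow was required.
    """
    mask = (1 << width) - 1
    total = a + (~operand & mask) + carry
    return total & mask, 1 if total > mask else 0


def subtract32(x, y):
    """Subtract two 32-bit values as two sixteen-bit SBC steps."""
    low, carry = sbc(x & 0xFFFF, y & 0xFFFF, 1)                   # carry set first, as with SEC
    high, carry = sbc((x >> 16) & 0xFFFF, (y >> 16) & 0xFFFF, carry)
    return (high << 16) | low, carry


difference, final_carry = subtract32(0x00010000, 0x00000001)
```

In this example the low-word subtraction ($0000 minus $0001) requires a borrow and clears the carry; the high-word subtraction then subtracts the extra one, giving $0000FFFF with the carry set at the end.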
d flag clear: Binary subtraction is performed.
d flag set: Binary coded decimal (BCD) subtraction is performed.
8-bit accumulator (all processors): Data fetched from memory and subtracted is eight-bit.
16-bit accumulator (65802/65816 only, m=0): Data fetched from memory and subtracted is sixteen-bit: the low
eight bits are located at the effective address; the high eight bits are located at the effective address plus one.
Flags Affected:
n v - - - - z c
n  Set if most significant bit of result is set; else cleared.
v  Set if signed overflow; cleared if valid sign result.
z  Set if result is zero; else cleared.
c  Set if unsigned borrow not required; cleared if unsigned borrow.
Codes:
                                             Opcode  Available on:            # of   # of
Addressing Mode ++           Syntax          (hex)   6502  65C02  65802/816   Bytes  Cycles
Immediate                    SBC #const      E9      x     x      x           2*     2 (1,4)
Absolute                     SBC addr        ED      x     x      x           3      4 (1,4)
Absolute Long                SBC long        EF                   x           4      5 (1,4)
Direct Page (also DP)        SBC dp          E5      x     x      x           2      3 (1,2,4)
DP Indirect                  SBC (dp)        F2            x      x           2      5 (1,2,4)
DP Indirect Long             SBC [dp]        E7                   x           2      6 (1,2,4)
Absolute Indexed, X          SBC addr, X     FD      x     x      x           3      4 (1,3,4)
Absolute Long Indexed, X     SBC long, X     FF                   x           4      5 (1,4)
Absolute Indexed, Y          SBC addr, Y     F9      x     x      x           3      4 (1,3,4)
DP Indexed, X                SBC dp, X       F5      x     x      x           2      4 (1,2,3,4)
DP Indexed Indirect, X       SBC (dp, X)     E1      x     x      x           2      6 (1,2,4)
DP Indirect Indexed, Y       SBC (dp), Y     F1      x     x      x           2      5 (1,2,3,4)
DP Indirect Long Indexed, Y  SBC [dp], Y     F7                   x           2      6 (1,2,4)
Stack Relative (also SR)     SBC sr, S       E3                   x           2      4 (1,4)
SR Indirect Indexed, Y       SBC (sr, S), Y  F3                   x           2      7 (1,4)
++  SBC, a Primary Group Instruction, has available all of the Primary Group addressing modes and bit patterns
*   Add 1 byte if m=0 (16-bit memory/accumulator)
(1) Add 1 cycle if m=0 (16-bit memory/accumulator)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
(3) Add 1 cycle if adding index crosses a page boundary
(4) Add 1 cycle if 65C02 and d=1 (decimal mode, 65C02)
SEC
Flags Affected:
- - - - - - - c
c Carry flag set always.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           SEC       38      x     x      x           1      2
SED
Flags Affected:
- - - - d - - -
d  Decimal mode flag set always.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           SED       F8      x     x      x           1      2
SEI
Flags Affected:
- - - - - i - -
i  Interrupt disable flag set always.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           SEI       78      x     x      x           1      2
SEP
For each one-bit in the operand byte, set the corresponding bit in the status register to one. For
example, if bit three is set in the operand byte, bit three in the status register (the decimal flag) is set to one by
this instruction. Zeroes in the operand byte cause no change to their corresponding status register bits.
This instruction lets you set any flag or flags in the status register with a single two-byte instruction.
Furthermore, it is the only direct means of setting the m and x mode select flags. (Instructions that pull the P
status register indirectly affect the m and x mode select flags).
6502 emulation mode (65802/65816, e=1): Neither the break flag nor bit five (the 6502's non-flag bit)
is affected by SEP.
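The bit-setting rule for SEP can be expressed as a single OR. The Python sketch below is only a model of the behavior described above; the function name sep and its emulation parameter are assumptions made for illustration.

```python
def sep(p, operand, emulation=False):
    """Model SEP: set every status-register bit whose operand bit is one.

    p is the eight-bit status register. In emulation mode, bits five and
    four (the non-flag bit and the break flag) are left untouched.
    """
    if emulation:
        operand &= 0xCF                # ignore operand bits 5 and 4
    return (p | operand) & 0xFF


native_p = sep(0x00, 0x30)             # like SEP #$30: sets m and x
emulated_p = sep(0x00, 0x38, emulation=True)
```

In native mode an operand of $30 sets both mode select flags, which selects eight-bit accumulator and index registers; in the emulation-mode case only bit three (the decimal flag) of the operand $38 takes effect.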
Flags Affected:
n v - - d i z c   (65802/65816 emulation mode e=1)
n v m x d i z c   (65802/65816 native mode e=0)
All flags for which an operand bit is set are set to one.
All other flags are unaffected by the instruction.
Codes:
                                Opcode  Available on:            # of   # of
Addressing Mode   Syntax        (hex)   6502  65C02  65802/816   Bytes  Cycles
Immediate         SEP #const    E2                   x           2      3
STA
Store the value in the accumulator to the effective address specified by the operand.
8-bit accumulator (all processors): Value is eight-bit.
16-bit accumulator (65802/65816 only, m=0): Value is sixteen-bit: the low-order eight bits are stored
to the effective address; the high-order eight bits are stored to the effective address plus one.
The 65x flags are unaffected by store instructions.
Flags Affected:
- - - - - - - -
Codes:
                                             Opcode  Available on:            # of   # of
Addressing Mode ++           Syntax          (hex)   6502  65C02  65802/816   Bytes  Cycles
Absolute                     STA addr        8D      x     x      x           3      4 (1)
Absolute Long                STA long        8F                   x           4      5 (1)
Direct Page (also DP)        STA dp          85      x     x      x           2      3 (1,2)
DP Indirect                  STA (dp)        92            x      x           2      5 (1,2)
DP Indirect Long             STA [dp]        87                   x           2      6 (1,2)
Absolute Indexed, X          STA addr, X     9D      x     x      x           3      5 (1)
Absolute Long Indexed, X     STA long, X     9F                   x           4      5 (1)
Absolute Indexed, Y          STA addr, Y     99      x     x      x           3      5 (1)
DP Indexed, X                STA dp, X       95      x     x      x           2      4 (1,2)
DP Indexed Indirect, X       STA (dp, X)     81      x     x      x           2      6 (1,2)
DP Indirect Indexed, Y       STA (dp), Y     91      x     x      x           2      6 (1,2)
DP Indirect Long Indexed, Y  STA [dp], Y     97                   x           2      6 (1,2)
Stack Relative (also SR)     STA sr, S       83                   x           2      4 (1)
SR Indirect Indexed, Y       STA (sr, S), Y  93                   x           2      7 (1)
++  STA, a Primary Group Instruction, has available all of the Primary Group addressing modes and bit patterns
(1) Add 1 cycle if m=0 (16-bit memory/accumulator)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
STP
During the processor's next phase 2 clock cycle, stop the processor's oscillator input; the processor is
effectively shut down until a reset occurs (until the RES pin is pulled low).
STP is designed to put the processor to sleep while it's not (actively) in use in order to reduce power
consumption. Since power consumption is a function of frequency with CMOS circuits, stopping the clock cuts
power to almost nil.
Your reset handling routine (pointed to by the reset vector, $00:FFFC-FD) should be designed to either
reinitialize the system or resume control through a previously-installed reset handler.
Remember that reset is an interrupt-like signal that causes the emulation bit to be set to one. It also
causes the direct page register to be reset to zero; stack high to be set to one (forcing the stack pointer to page
one); and the mode select flags to be set to one (eight-bit registers; a side effect is that the high bytes of the
index registers are zeroed). STP is useful only in hardware systems (such as battery-powered systems)
specifically designed to support a low-power mode.
Flags Affected:
- - - - - - - -
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           STP       DB                   x           1      3 (1)
(1) Uses 3 cycles to shut the processor down; additional cycles are required by reset to restart it
STX
Store the value in index register X to the effective address specified by the operand.
8-bit index registers (all processors): Value is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Value is sixteen-bit: the low-order eight bits are
stored to the effective address; the high-order eight bits are stored to the effective address plus one.
The 65x flags are unaffected by store instructions.
Flags Affected:
- - - - - - - -
Codes:
                                    Opcode  Available on:            # of   # of
Addressing Mode         Syntax      (hex)   6502  65C02  65802/816   Bytes  Cycles
Absolute                STX addr    8E      x     x      x           3      4 (1)
Direct Page             STX dp      86      x     x      x           2      3 (1,2)
Direct Page Indexed, Y  STX dp, Y   96      x     x      x           2      4 (1,2)
(1) Add 1 cycle if x=0 (16-bit index registers)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
STY
Store the value in index register Y to the effective address specified by the operand.
8-bit index registers (all processors): Value is eight-bit.
16-bit index registers (65802/65816 only, x = 0): Value is sixteen-bit: the low-order eight bits are
stored to the effective address; the high-order eight bits are stored to the effective address plus one.
The 65x flags are unaffected by store instructions.
Flags Affected:
- - - - - - - -
Codes:
                                    Opcode  Available on:            # of   # of
Addressing Mode         Syntax      (hex)   6502  65C02  65802/816   Bytes  Cycles
Absolute                STY addr    8C      x     x      x           3      4 (1)
Direct Page             STY dp      84      x     x      x           2      3 (1,2)
Direct Page Indexed, X  STY dp, X   94      x     x      x           2      4 (1,2)
(1) Add 1 cycle if x=0 (16-bit index registers)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
STZ
Flags Affected:
- - - - - - - -
Codes:
                                     Opcode  Available on:            # of   # of
Addressing Mode         Syntax       (hex)   6502  65C02  65802/816   Bytes  Cycles
Absolute                STZ addr     9C            x      x           3      4 (1)
Direct Page             STZ dp       64            x      x           2      3 (1,2)
Absolute Indexed, X     STZ addr, X  9E            x      x           3      5 (1)
Direct Page Indexed, X  STZ dp, X    74            x      x           2      4 (1,2)
(1) Add 1 cycle if m=0 (16-bit memory/accumulator)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
TAX
Transfer the value in the accumulator to index register X. If the registers are different sizes, the nature
of the transfer is determined by the destination register. The value in the accumulator is not changed by the
operation.
8-bit accumulator, 8-bit index registers (all processors): Value transferred is eight-bit.
8-bit accumulator, 16-bit index registers (65802/65816 only, m = 1, x = 0): Value transferred is
sixteen-bit; the eight-bit A accumulator becomes the low byte of the index register; the hidden eight-bit B
accumulator becomes the high byte of the index register.
16-bit accumulator, 8-bit index registers (65802/65816 only, m=0, x=1): Value transferred to the
eight-bit index register is eight-bit, the low byte of the accumulator.
16-bit accumulator, 16-bit index registers (65802/65816 only, m=0, x=0): Value transferred to the
sixteen-bit index register is sixteen-bit, the full sixteen-bit accumulator.
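The four cases above reduce to one rule: the size of the destination decides how much is transferred. A minimal Python sketch of that rule follows; the function name tax and the representation of the accumulator as a single sixteen-bit value (B in the high byte, A in the low byte) are assumptions made for illustration.

```python
def tax(c, x_flag):
    """Model the value placed in X by TAX.

    c is the full sixteen-bit accumulator (hidden B in the high byte, A in
    the low byte). x_flag = 1 selects eight-bit index registers, x_flag = 0
    sixteen-bit ones. The m flag does not appear because it does not change
    what is transferred.
    """
    if x_flag == 1:
        return c & 0xFF                # eight-bit destination: low byte only
    return c & 0xFFFF                  # sixteen-bit destination: B and A together


x_eight = tax(0x12FF, x_flag=1)
x_sixteen = tax(0x12FF, x_flag=0)
```

With $12 in the hidden B accumulator and $FF in A, an eight-bit X receives $FF, while a sixteen-bit X receives $12FF whether or not the accumulator itself is set to eight bits.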
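Flags Affected:
n - - - - - z -
n  Set if most significant bit of transferred value is set; else cleared.
z  Set if value transferred is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           TAX       AA      x     x      x           1      2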
TAY
Transfer the value in the accumulator to index register Y. If the registers are different sizes, the nature
of the transfer is determined by the destination register. The value in the accumulator is not changed by the
operation.
8-bit accumulator, 8-bit index registers (all processors): Value transferred is eight-bit.
8-bit accumulator, 16-bit index registers (65802/65816 only, m = 1, x = 0): Value transferred is
sixteen-bit; the eight-bit A accumulator becomes the low byte of the index register; the hidden eight-bit B
accumulator becomes the high byte of the index register.
16-bit accumulator, 8-bit index registers (65802/65816 only, m=0, x=1): Value transferred to the
eight-bit index register is eight-bit, the low byte of the accumulator.
16-bit accumulator, 16-bit index registers (65802/65816 only, m=0, x=0): Value transferred to the
sixteen-bit index register is sixteen-bit, the full sixteen-bit accumulator.
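Flags Affected:
n - - - - - z -
n  Set if most significant bit of transferred value is set; else cleared.
z  Set if value transferred is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           TAY       A8      x     x      x           1      2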
TCD
Transfer the value in the sixteen-bit accumulator C to the direct page register D, regardless of the
setting of the accumulator/memory mode flag.
An alternate mnemonic is TAD, (transfer the value in the A accumulator to the direct page register).
In TCD, the C is used to indicate that sixteen bits are transferred regardless of the m flag. If the A
accumulator is set to just eight bits (whether because the m flag is set, or because the processor is in 6502
emulation mode), then its value becomes the low byte of the direct page register and the value in the hidden B
accumulator becomes the high byte of the direct page register.
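The accumulator's sixteen-bit value is unchanged by the operation.
Flags Affected:
n - - - - - z -
n  Set if most significant bit of transferred value is set; else cleared.
z  Set if value transferred is zero; else cleared.
Codes:
                                Opcode  Available on:            # of   # of
Addressing Mode   Syntax        (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           TCD (or TAD)  5B                   x           1      2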
TCS
Transfer the value in the accumulator to the stack pointer S. The accumulator's value is unchanged by
the operation.
An alternate mnemonic is TAS (transfer the value in the A accumulator to the stack pointer).
In TCS, the C is used to indicate that, in native mode, sixteen bits are transferred regardless of the m
flag. If the A accumulator is set to just eight bits (because the m flag is set), then its value is transferred to the
low byte of the stack pointer and the value in the hidden B accumulator is transferred to the high byte of the
stack pointer. In emulation mode, only the eight-bit A accumulator is transferred, since the high stack pointer
byte is forced to one (the stack is confined to page one).
TCS and TXS are the only two instructions that load a new value directly into the stack pointer. They
are also the only two transfer instructions that do not alter the flags.
Flags Affected:
- - - - - - - -
Codes:
                                Opcode  Available on:            # of   # of
Addressing Mode   Syntax        (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           TCS (or TAS)  1B                   x           1      2
TDC
Transfer the value in the sixteen-bit direct page register D to the sixteen-bit accumulator C, regardless
of the setting of the accumulator/memory mode flag.
An alternate mnemonic is TDA (transfer the value in the direct page register to the A accumulator).
In TDC, the C is used to indicate that sixteen bits are transferred regardless of the m flag. If the A
accumulator is set to just eight bits (whether because the m flag is set, or because the processor is in 6502
emulation mode), then it takes the value of the low byte of the direct page register and the hidden B accumulator
takes the value of the high byte of the direct page register.
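The direct page register's sixteen-bit value is unchanged by the operation.
Flags Affected:
n - - - - - z -
n  Set if most significant bit of transferred value is set; else cleared.
z  Set if value transferred is zero; else cleared.
Codes:
                                Opcode  Available on:            # of   # of
Addressing Mode   Syntax        (hex)   6502  65C02  65802/816   Bytes  Cycles
Implied           TDC (or TDA)  7B                   x           1      2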
TRB
Logically AND together the complement of the value in the accumulator with the data at the effective
address specified by the operand. Store the result at the memory location.
This has the effect of clearing each memory bit for which the corresponding accumulator bit is set,
while leaving unchanged all memory bits in which the corresponding accumulator bits are zeroes.
Unlike the BIT instruction, TRB is a read-modify-write instruction, not only calculating a result and
modifying a flag, but also storing the result to memory as well.
The z zero flag is set based on a second and different operation: the ANDing of the accumulator value
(not its complement) with the memory value (the same way the BIT instruction affects the zero flag). The
result of this second operation is not saved; only the zero flag is affected by it.
8-bit accumulator/memory (65C02; 65802/65816, m = 1): Values in accumulator and memory are
eight-bit.
16-bit accumulator/memory (65802/65816 only, m = 0): Values in accumulator and memory are
sixteen-bit: the low-order eight bits are located at the effective address; the high-order eight bits are at the
effective address plus one.
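The two separate operations performed by TRB (the stored result and the zero-flag test) are easy to confuse, so a small model may help. This is a hedged Python sketch; the function name trb and the width parameter (8 for m = 1, 16 for m = 0) are assumptions made for this example.

```python
def trb(a, mem, width=8):
    """Model TRB. Returns (new_memory_value, z)."""
    mask = (1 << width) - 1
    z = 1 if (a & mem & mask) == 0 else 0   # flag comes from A AND memory
    result = mem & ~a & mask                # memory keeps only bits not set in A
    return result, z


cleared = trb(0x0F, 0x3C)
untouched = trb(0x03, 0x3C)
```

With $0F in the accumulator and $3C in memory, the low four bits of memory are cleared (leaving $30) and z is clear because the two values shared set bits. With $03 in the accumulator, memory is unchanged and z is set.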
Flags Affected:
- - - - - - z -
z  Set if memory value AND'ed with accumulator value is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Absolute          TRB addr  1C            x      x           3      6 (1)
Direct Page       TRB dp    14            x      x           2      5 (1,2)
(1) Add 2 cycles if m = 0 (16-bit memory/accumulator)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
TSB
Logically OR together the value in the accumulator with the data at the effective address specified by
the operand. Store the result at the memory location.
This has the effect of setting each memory bit for which the corresponding accumulator bit is set, while
leaving unchanged all memory bits in which the corresponding accumulator bits are zeroes.
Unlike the BIT instruction, TSB is a read-modify-write instruction, not only calculating a result and
modifying a flag, but storing the result to memory as well.
The z zero flag is set based on a second different operation, the ANDing of the accumulator value with
the memory value (the same way the BIT instruction affects the zero flag). The result of this second operation
is not saved; only the zero flag is affected by it.
8-bit accumulator/memory (65C02; 65802/65816, m = 1): Values in accumulator and memory are
eight-bit.
16-bit accumulator/memory (65802/65816 only, m = 0): Values in accumulator and memory are
sixteen-bit: the low-order eight bits are located at the effective address; the high-order eight bits are at the
effective address plus one.
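As with TRB, the value stored and the value tested for the zero flag come from different operations. The following Python sketch models TSB; the function name tsb and the width parameter (8 for m = 1, 16 for m = 0) are assumptions made for illustration.

```python
def tsb(a, mem, width=8):
    """Model TSB. Returns (new_memory_value, z)."""
    mask = (1 << width) - 1
    z = 1 if (a & mem & mask) == 0 else 0   # flag comes from A AND memory
    result = (mem | a) & mask               # memory gains every bit set in A
    return result, z


overlapping = tsb(0x0F, 0x3C)
disjoint = tsb(0x03, 0x30)
```

With $0F in the accumulator and $3C in memory, memory becomes $3F and z is clear; with $03 and $30, which share no set bits, memory becomes $33 and z is set.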
Flags Affected:
- - - - - - z -
z  Set if memory value AND'ed with accumulator value is zero; else cleared.
Codes:
                            Opcode  Available on:            # of   # of
Addressing Mode   Syntax    (hex)   6502  65C02  65802/816   Bytes  Cycles
Absolute          TSB addr  0C            x      x           3      6 (1)
Direct Page       TSB dp    04            x      x           2      5 (1,2)
(1) Add 2 cycles if m = 0 (16-bit memory/accumulator)
(2) Add 1 cycle if low byte of Direct Page register is other than zero (DL<>0)
TSC
Transfer the value in the sixteen-bit stack pointer S to the sixteen-bit accumulator C, regardless of the
setting of the accumulator/memory mode flag.
An alternate mnemonic is TSA (transfer the value in the stack pointer to the A accumulator).
In TSC, the C is used to indicate that sixteen bits are transferred regardless of the m flag. If the A
accumulator is set to just eight bits (whether because the m flag is set, or because the processor is in 6502
emulation mode), then it takes the value of the low byte of the stack pointer and the hidden B accumulator takes
the value of the high byte of the stack pointer. (In emulation mode, B will always take a value of one, since the
stack is confined to page one.)
The stack pointer's value is unchanged by the operation.
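As a rough illustration of how the sixteen-bit transfer lands in A and the hidden B accumulator, here is a small Python sketch. The function name tsc and the emulation parameter are hypothetical; the model simply forces the high byte to one in emulation mode, as the text describes.

```python
def tsc(s, emulation=False):
    """Model TSC: copy the 16-bit stack pointer into C (B:A), whatever m is."""
    if emulation:
        s = 0x0100 | (s & 0xFF)     # emulation mode confines the stack to page one
    c = s & 0xFFFF
    a = c & 0xFF                    # A accumulator receives the low byte
    b = c >> 8                      # hidden B accumulator receives the high byte
    return c, a, b


native = tsc(0x1234)
emulated = tsc(0x34, emulation=True)
```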
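Flags Affected:   n - - - - - z -
    n  Set if most significant bit of transferred value is set; else cleared.
    z  Set if value transferred is zero; else cleared.
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           TSC (or TSA)    3B                            x           1            2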
TSX
Transfer the value in the stack pointer S to index register X. The stack pointer's value is not changed
by the operation.
8-bit index registers (all processors): Only the low byte of the value in the stack pointer is transferred
to the X register. In the 6502, the 65C02, and the 6502 emulation mode, the stack pointer and the index
registers are only a single byte each, so the byte in the stack pointer is transferred to the eight-bit X register. In
65802/65816 native mode, the stack pointer is sixteen bits, so its most significant byte is not transferred if the
index registers are in eight-bit mode.
16-bit index registers (65802/65816 only, x=0): The full sixteen-bit value in the stack pointer is
transferred to the X register.
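The effect of the x flag on this transfer can be sketched as follows. The function name tsx and its x_flag parameter are assumptions made for illustration only.

```python
def tsx(s, x_flag):
    """Model TSX: x_flag = 1 means eight-bit index registers, 0 means sixteen-bit."""
    if x_flag:
        return s & 0xFF         # only the low byte of S reaches X
    return s & 0xFFFF           # the full sixteen-bit stack pointer is copied


x_short = tsx(0x1FF0, 1)
x_long = tsx(0x1FF0, 0)
```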
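Flags Affected:   n - - - - - z -
    n  Set if most significant bit of transferred value is set; else cleared.
    z  Set if value transferred is zero; else cleared.
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           TSX             BA             x      x       x           1            2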
TXA
Transfer the value in index register X to the accumulator. If the registers are different sizes, the nature
of the transfer is determined by the destination (the accumulator). The value in the index register is not changed
by the operation.
8-bit index registers, 8-bit accumulator (all processors): Value transferred is eight-bit.
16-bit index registers, 8-bit accumulator (65802/65816 only, x=0, m=1): Value transferred to the
eight-bit accumulator is eight-bit, the low byte of the index register; the hidden eight-bit accumulator B is not
affected by the transfer.
8-bit index registers, 16-bit accumulator (65802/65816 only, x=1, m=0): The eight-bit index register
becomes the low byte of the accumulator; the high accumulator byte is zeroed.
16-bit index registers, 16-bit accumulator (65802/65816 only, x=0, m=0): Value transferred to the
sixteen-bit accumulator is sixteen-bit, the full sixteen-bit index register.
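The four size combinations above can be condensed into a short Python sketch that returns the new sixteen-bit C accumulator (B in the high byte, A in the low byte). The function name txa and the flag parameters are hypothetical; the same model would apply to TYA with Y in place of X.

```python
def txa(x, c, x_flag, m_flag):
    """Return the new 16-bit C (B:A) after TXA.

    x_flag / m_flag = 1 selects eight-bit index registers / accumulator.
    """
    if m_flag:
        # Eight-bit accumulator: only A changes, hidden B is preserved.
        return (c & 0xFF00) | (x & 0xFF)
    if x_flag:
        # Eight-bit X into sixteen-bit accumulator: high byte is zeroed.
        return x & 0xFF
    return x & 0xFFFF               # full sixteen-bit transfer


case_16x_8a = txa(0x1234, 0xABCD, x_flag=0, m_flag=1)
case_8x_16a = txa(0x34, 0xABCD, x_flag=1, m_flag=0)
case_16x_16a = txa(0x1234, 0xABCD, x_flag=0, m_flag=0)
```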
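Flags Affected:   n - - - - - z -
    n  Set if most significant bit of transferred value is set; else cleared.
    z  Set if value transferred is zero; else cleared.
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           TXA             8A             x      x       x           1            2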
TXS
Transfer the value in index register X to the stack pointer, S. The index register's value is not changed
by the operation.
TXS and TCS are the only two instructions for changing the value in the stack pointer. They are
also the only two transfer instructions that do not alter the flags.
6502, 65C02, and 6502 emulation mode (65802/65816, e=1): The stack pointer is only eight bits (it is
concatenated to a high byte of one, confining the stack to page one), and the index registers are only eight bits.
The byte in X is transferred to the eight-bit stack pointer.
8-bit index registers (65802/65816 native mode, x=1): The stack pointer is sixteen bits but the index
registers are only eight bits. A copy of the byte in X is transferred to the low stack pointer byte and the high
stack pointer byte is zeroed.
16-bit index registers (65802/65816 native mode, x=0): The full sixteen-bit value in X is transferred
to the sixteen-bit stack pointer.
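The three cases can be sketched in Python as a function returning the resulting sixteen-bit stack address. The name txs and its parameters are hypothetical, and the emulation-mode case returns the pointer already concatenated with its page-one high byte.

```python
def txs(x, x_flag, emulation):
    """Model TXS: return the stack address that results from the transfer."""
    if emulation:
        return 0x0100 | (x & 0xFF)  # eight-bit pointer, high byte fixed at one
    if x_flag:
        return x & 0xFF             # native mode, eight-bit X: high byte zeroed
    return x & 0xFFFF               # native mode, sixteen-bit X


s_emulation = txs(0xFF, x_flag=1, emulation=True)
s_native_8 = txs(0xFF, x_flag=1, emulation=False)
s_native_16 = txs(0x1FFF, x_flag=0, emulation=False)
```

The second case is worth noticing: in native mode with eight-bit index registers, TXS moves the stack into page zero rather than page one.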
Flags Affected:   - - - - - - - -
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           TXS             9A             x      x       x           1            2
TXY
Transfer the value in index register X to index register Y. The value in index register X is not changed
by the operation. Note that the two registers are never different sizes.
8-bit index registers (x=1): Value transferred is eight-bit.
16-bit index registers (x=0): Value transferred is sixteen-bit.
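Flags Affected:   n - - - - - z -
    n  Set if most significant bit of transferred value is set; else cleared.
    z  Set if value transferred is zero; else cleared.
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           TXY             9B                            x           1            2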
TYA
Transfer the value in index register Y to the accumulator. If the registers are different sizes, the nature
of the transfer is determined by the destination (the accumulator). The value in the index register is not changed
by the operation.
8-bit index registers, 8-bit accumulator (all processors): Value transferred is eight-bit.
16-bit index registers, 8-bit accumulator (65802/65816 only, x=0, m=1): Value transferred to the
eight-bit accumulator is eight-bit, the low byte of the index register; the hidden eight-bit accumulator B is not
affected by the transfer.
8-bit index registers, 16-bit accumulator (65802/65816 only, x=1, m=0): The eight-bit index register
becomes the low byte of the accumulator; the high accumulator byte is zeroed.
16-bit index registers, 16-bit accumulator (65802/65816 only, x=0, m=0): Value transferred to the
sixteen-bit accumulator is sixteen-bit, the full sixteen-bit index register.
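Flags Affected:   n - - - - - z -
    n  Set if most significant bit of transferred value is set; else cleared.
    z  Set if value transferred is zero; else cleared.
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           TYA             98             x      x       x           1            2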
TYX
Transfer the value in index register Y to index register X. The value in index register Y is not changed
by the operation. Note that the two registers are never different sizes.
8-bit index registers (x=1): Value transferred is eight-bit.
16-bit index registers (x=0): Value transferred is sixteen-bit.
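Flags Affected:   n - - - - - z -
    n  Set if most significant bit of transferred value is set; else cleared.
    z  Set if value transferred is zero; else cleared.
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           TYX             BB                            x           1            2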
WAI
Pull the RDY pin low. Power consumption is reduced and RDY remains low until an external
hardware interrupt (NMI, IRQ, ABORT, or RESET) is received.
WAI is designed to put the processor to sleep during an external event to reduce its power
consumption, to allow it to be synchronized with an external event, and/or to reduce interrupt latency (an
interrupt occurring during execution of an instruction is not acted upon until execution of the instruction is
complete, perhaps many cycles later; WAI ensures that an interrupt is recognized immediately).
Once an interrupt is received, control is vectored through one of the hardware interrupt vectors; an RTI
from the interrupt handling routine will return control to the instruction following the original WAI. However,
if by setting the i flag, interrupts have been disabled prior to the execution of the WAI instruction, and IRQ is
asserted, the wait condition is terminated and control resumes with the next instruction, rather than through
the interrupt vectors. This provides the quickest response to an interrupt, allowing synchronization with
external events. WAI also frees up the bus; since RDY is pulled low in the third instruction cycle, the
processor may be disconnected from the bus if BE is also pulled low.
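The two ways a WAI can end may be summarized in a small Python sketch. It models only the control-flow decision described above, not bus timing; the function name and the string results are hypothetical.

```python
def resume_after_wai(interrupt, i_flag):
    """Say where control goes when `interrupt` ends a WAI.

    interrupt is one of "NMI", "IRQ", "ABORT", "RESET";
    i_flag is the interrupt disable flag at the time WAI was executed.
    """
    if interrupt == "IRQ" and i_flag:
        # IRQ is masked: the wait ends, but no vector is taken.
        return "instruction after WAI"
    return interrupt + " vector"


masked_irq = resume_after_wai("IRQ", i_flag=1)
enabled_irq = resume_after_wai("IRQ", i_flag=0)
nmi = resume_after_wai("NMI", i_flag=1)
```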
Flags Affected:   - - - - - - - -
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           WAI             CB                            x           1            3 (1)
(1) Uses 3 cycles to shut the processor down; additional cycles are required by interrupt to restart it
WDM
The 65802 and 65816 use 255 of the 256 possible eight-bit opcodes. One was reserved; it provides an
escape hatch for future 65x processors to expand their instruction set to sixteen-bit opcodes; this opcode
would signal that the next byte is an opcode in the expanded instruction set. This reserved byte for future
two-byte opcodes was given a temporary mnemonic, WDM, which happens to be the initials of the processor's
designer, William D. Mensch, Jr.
WDM should never be used in a program, since it would render the object program incompatible with
any future 65x processors.
If the 65802/65816 WDM instruction is accidentally executed, it will act like a two-byte NOP
instruction.
Flags Affected*:
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
                  WDM             42                            x           2*           *
* Byte and cycle counts subject to change in future processors which expand WDM into 2-byte
opcode portions of instructions of varying lengths
XBA
B represents the high-order byte of the sixteen-bit C accumulator, and A in this case represents the
low-order byte. XBA swaps the contents of the low-order and high-order bytes of C.
An alternate mnemonic is SWA (swap the high and low bytes of the sixteen-bit A accumulator).
XBA can be used to invert the low-order, high-order arrangement of a sixteen-bit value, or to
temporarily store an eight-bit value from the A accumulator into B. Since it is an exchange, the previous
contents of both accumulators are changed, replaced by the previous contents of the other.
Neither the mode select flags nor the emulation mode flag affects this operation.
The flags are changed based on the new value of the low byte, the A accumulator (that is, on the former
value of the high byte, the B accumulator), even in sixteen-bit accumulator mode.
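A minimal Python sketch of the exchange follows, including the point that the flags follow the new low byte. The function name xba and the tuple it returns are hypothetical conveniences; n and z are modeled here as the flags the text says are changed.

```python
def xba(c):
    """Swap the bytes of the 16-bit C accumulator; return (new_c, n, z)."""
    a = c & 0xFF                    # old A, the low byte
    b = (c >> 8) & 0xFF             # old B, the high byte
    new_c = (a << 8) | b
    n = bool(b & 0x80)              # flags reflect the new A (the former B)
    z = b == 0
    return new_c, n, z


swapped = xba(0x12FF)
zero_low = xba(0x0034)
```

The second call shows the caveat: the sixteen-bit result $3400 is not zero, yet z is set because the new low byte is zero.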
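Flags Affected:   n - - - - - z -
    n  Set if most significant bit of new 8-bit value in A accumulator is set; else cleared.
    z  Set if new 8-bit value in A accumulator is zero; else cleared.
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           XBA (or SWA)    EB                            x           1            3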
XCE
This instruction is the only means provided by the 65802 and 65816 to shift between 6502 emulation
mode and the full, sixteen-bit native mode.
The emulation mode is used to provide hardware and software compatibility between the 6502 and
65802/65816.
If the processor is in emulation mode, then to switch to native mode, first clear the carry bit, then
execute an XCE. Since it is an exchange operation, the carry flag will reflect the previous state of the
emulation bit. Switching to native mode causes bit five to stop functioning as the break flag, and function
instead as the x mode select flag. A second mode select flag, m, uses bit six, which was unused in emulation
mode. Both mode select flags are initially set to one (eight-bit modes). There are also other differences
described in the text.
If the processor is in native mode, then to switch to emulation mode, you first set the carry bit, then
execute an XCE. Switching to emulation mode causes the mode select flags (m and x) to be lost from the status
register, with x replaced by the b break flag. This forces the accumulator to eight bits, but the high accumulator
byte is preserved in the hidden B accumulator. It also forces the index registers to eight bits, causing the loss of
values in their high bytes, and the stack to page one, causing the loss of the high byte of the previous stack
address. There are also other differences described in the text.
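The exchange and its side effects on the registers can be sketched with a small Python model. The Cpu class and its field names (xr, yr, sp) are hypothetical; once in emulation mode the model treats m and x as one (eight-bit), which is a simplification of the flags being lost from the status register. Other differences between the modes are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    e: int = 1          # emulation flag: 1 = 6502 emulation mode
    c: int = 0          # carry flag
    m: int = 1          # accumulator/memory select (native mode only)
    x: int = 1          # index register select (native mode only)
    xr: int = 0         # X index register
    yr: int = 0         # Y index register
    sp: int = 0x01FF    # stack pointer


def xce(cpu):
    """Exchange carry and emulation bits, then apply the mode side effects."""
    was_emulation = cpu.e
    cpu.c, cpu.e = cpu.e, cpu.c
    if cpu.e or was_emulation:
        cpu.m = cpu.x = 1                   # eight-bit modes on either switch
    if cpu.e:
        cpu.xr &= 0xFF                      # index high bytes are lost
        cpu.yr &= 0xFF
        cpu.sp = 0x0100 | (cpu.sp & 0xFF)   # stack forced to page one
    return cpu


cpu = Cpu()             # starts in emulation mode
cpu.c = 0               # clear carry, then XCE: switch to native mode
xce(cpu)
state_native = (cpu.e, cpu.c, cpu.m, cpu.x)

cpu.x, cpu.xr, cpu.sp = 0, 0x1234, 0x1FF0
cpu.c = 1               # set carry, then XCE: back to emulation mode
xce(cpu)
state_emulation = (cpu.e, cpu.c, cpu.xr, cpu.sp)
```

After each switch the carry holds the previous mode, so a routine can test it to restore the caller's mode later.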
Flags Affected:                 e
                  - - m b/x - - - c
    e  Takes carry's previous value: set if carry was set; else cleared.
    c  Takes emulation's previous value: set if previous mode was emulation; else cleared.
    m  m is a native mode flag only; switching to native mode sets it to 1.
    x  x is a native mode flag only; it becomes the b flag in emulation.
    b  b is an emulation mode flag only; it is set to 1 to become the x flag in native.
Codes:
Addressing Mode   Syntax          Opcode (hex)   6502   65C02   65802/816   # of Bytes   # of Cycles
Implied           XCE             FB                            x           1            2
Numbers in parentheses are note references.

Hex  Mnemonic  Addressing Mode                6502  65C02  65802/816  Bytes  Cycles
00   BRK       Stack/Interrupt                x     x      x          2**    7 (9)
01   ORA       DP Indexed Indirect, X         x     x      x          2      6 (1,2)
02   COP       Stack/Interrupt                             x          2**    7 (9)
03   ORA       Stack Relative                              x          2      4 (1)
04   TSB       Direct Page                          x      x          2      5 (2,5)
05   ORA       Direct Page                    x     x      x          2      3 (1,2)
06   ASL       Direct Page                    x     x      x          2      5 (2,5)
07   ORA       DP Indirect Long                            x          2      6 (1,2)
08   PHP       Stack (Push)                   x     x      x          1      3
09   ORA       Immediate                      x     x      x          2*     2 (1)
0A   ASL       Accumulator                    x     x      x          1      2
0B   PHD       Stack (Push)                                x          1      4
0C   TSB       Absolute                             x      x          3      6 (5)
0D   ORA       Absolute                       x     x      x          3      4 (1)
0E   ASL       Absolute                       x     x      x          3      6 (5)
0F   ORA       Absolute Long                               x          4      5 (1)
10   BPL       Program Counter Relative       x     x      x          2      2 (7,8)
11   ORA       DP Indirect Indexed, Y         x     x      x          2      5 (1,2,3)
12   ORA       DP Indirect                          x      x          2      5 (1,2)
13   ORA       SR Indirect Indexed, Y                      x          2      7 (1)
14   TRB       Direct Page                          x      x          2      5 (2,5)
15   ORA       DP Indexed, X                  x     x      x          2      4 (1,2)
16   ASL       DP Indexed, X                  x     x      x          2      6 (2,5)
17   ORA       DP Indirect Long Indexed, Y                 x          2      6 (1,2)
18   CLC       Implied                        x     x      x          1      2
19   ORA       Absolute Indexed, Y            x     x      x          3      4 (1,3)
1A   INC       Accumulator                          x      x          1      2
1B   TCS       Implied                                     x          1      2
1C   TRB       Absolute                             x      x          3      6 (5)
1D   ORA       Absolute Indexed, X            x     x      x          3      4 (1,3)
1E   ASL       Absolute Indexed, X            x     x      x          3      7 (5,6)
1F   ORA       Absolute Long Indexed, X                    x          4      5 (1)
20   JSR       Absolute                       x     x      x          3      6
21   AND       DP Indexed Indirect, X         x     x      x          2      6 (1,2)
22   JSR       Absolute Long                               x          4      8
23   AND       Stack Relative                              x          2      4 (1)
24   BIT       Direct Page                    x     x      x          2      3 (1,2)
25   AND       Direct Page                    x     x      x          2      3 (1,2)
26   ROL       Direct Page                    x     x      x          2      5 (2,5)
27   AND       DP Indirect Long                            x          2      6 (1,2)
28   PLP       Stack (Pull)                   x     x      x          1      4
29   AND       Immediate                      x     x      x          2*     2 (1)
2A   ROL       Accumulator                    x     x      x          1      2
2B   PLD       Stack (Pull)                                x          1      5
2C   BIT       Absolute                       x     x      x          3      4 (1)
2D   AND       Absolute                       x     x      x          3      4 (1)
2E   ROL       Absolute                       x     x      x          3      6 (5)
2F   AND       Absolute Long                               x          4      5 (1)
30   BMI       Program Counter Relative       x     x      x          2      2 (7,8)
31   AND       DP Indirect Indexed, Y         x     x      x          2      5 (1,2,3)
32   AND       DP Indirect                          x      x          2      5 (1,2)
33   AND       SR Indirect Indexed, Y                      x          2      7 (1)
34   BIT       DP Indexed, X                        x      x          2      4 (1,2)
35   AND       DP Indexed, X                  x     x      x          2      4 (1,2)
36   ROL       DP Indexed, X                  x     x      x          2      6 (2,5)
37   AND       DP Indirect Long Indexed, Y                 x          2      6 (1,2)
38   SEC       Implied                        x     x      x          1      2
39   AND       Absolute Indexed, Y            x     x      x          3      4 (1,3)
3A   DEC       Accumulator                          x      x          1      2
3B   TSC       Implied                                     x          1      2
3C   BIT       Absolute Indexed, X                  x      x          3      4 (1,3)
3D   AND       Absolute Indexed, X            x     x      x          3      4 (1,3)
3E   ROL       Absolute Indexed, X            x     x      x          3      7 (5,6)
3F   AND       Absolute Long Indexed, X                    x          4      5 (1)
40   RTI       Stack/RTI                      x     x      x          1      6 (9)
41   EOR       DP Indexed Indirect, X         x     x      x          2      6 (1,2)
42   WDM                                                   x          2 (16) (16)
43   EOR       Stack Relative                              x          2      4 (1)
44   MVP       Block Move                                  x          3      (13)
45   EOR       Direct Page                    x     x      x          2      3 (1,2)
46   LSR       Direct Page                    x     x      x          2      5 (2,5)
47   EOR       DP Indirect Long                            x          2      6 (1,2)
48   PHA       Stack (Push)                   x     x      x          1      3 (1)
49   EOR       Immediate                      x     x      x          2*     2 (1)
4A   LSR       Accumulator                    x     x      x          1      2
4B   PHK       Stack (Push)                                x          1      3
4C   JMP       Absolute                       x     x      x          3      3
4D   EOR       Absolute                       x     x      x          3      4 (1)
4E   LSR       Absolute                       x     x      x          3      6 (5)
4F   EOR       Absolute Long                               x          4      5 (1)
50   BVC       Program Counter Relative       x     x      x          2      2 (7,8)
51   EOR       DP Indirect Indexed, Y         x     x      x          2      5 (1,2,3)
52   EOR       DP Indirect                          x      x          2      5 (1,2)
53   EOR       SR Indirect Indexed, Y                      x          2      7 (1)
54   MVN       Block Move                                  x          3      (13)
55   EOR       DP Indexed, X                  x     x      x          2      4 (1,2)
56   LSR       DP Indexed, X                  x     x      x          2      6 (2,5)
57   EOR       DP Indirect Long Indexed, Y                 x          2      6 (1,2)
58   CLI       Implied                        x     x      x          1      2
59   EOR       Absolute Indexed, Y            x     x      x          3      4 (1,3)
5A   PHY       Stack (Push)                         x      x          1      3 (10)
5B   TCD       Implied                                     x          1      2
5C   JMP       Absolute Long                               x          4      4
5D   EOR       Absolute Indexed, X            x     x      x          3      4 (1,3)
5E   LSR       Absolute Indexed, X            x     x      x          3      7 (5,6)
5F   EOR       Absolute Long Indexed, X                    x          4      5 (1)
60   RTS       Stack (RTS)                    x     x      x          1      6
61   ADC       DP Indexed Indirect, X         x     x      x          2      6 (1,2,4)
62   PER       Stack (PC Relative Long)                    x          3      6
63   ADC       Stack Relative                              x          2      4 (1,4)
64   STZ       Direct Page                          x      x          2      3 (1,2)
65   ADC       Direct Page                    x     x      x          2      3 (1,2,4)
66   ROR       Direct Page                    x     x      x          2      5 (2,5)
67   ADC       DP Indirect Long                            x          2      6 (1,2,4)
68   PLA       Stack (Pull)                   x     x      x          1      4 (1)
69   ADC       Immediate                      x     x      x          2*     2 (1,4)
6A   ROR       Accumulator                    x     x      x          1      2
6B   RTL       Stack (RTL)                                 x          1      6
6C   JMP       Absolute Indirect              x     x      x          3      5 (11,12)
6D   ADC       Absolute                       x     x      x          3      4 (1,4)
6E   ROR       Absolute                       x     x      x          3      6 (5)
6F   ADC       Absolute Long                               x          4      5 (1,4)
70   BVS       Program Counter Relative       x     x      x          2      2 (7,8)
71   ADC       DP Indirect Indexed, Y         x     x      x          2      5 (1,2,3,4)
72   ADC       DP Indirect                          x      x          2      5 (1,2,4)
73   ADC       SR Indirect Indexed, Y                      x          2      7 (1,4)
74   STZ       DP Indexed, X                        x      x          2      4 (1,2)
75   ADC       DP Indexed, X                  x     x      x          2      4 (1,2,4)
76   ROR       DP Indexed, X                  x     x      x          2      6 (2,5)
77   ADC       DP Indirect Long Indexed, Y                 x          2      6 (1,2,4)
78   SEI       Implied                        x     x      x          1      2
79   ADC       Absolute Indexed, Y            x     x      x          3      4 (1,3,4)
7A   PLY       Stack (Pull)                         x      x          1      4 (10)
7B   TDC       Implied                                     x          1      2
7C   JMP       Absolute Indexed Indirect            x      x          3      6
7D   ADC       Absolute Indexed, X            x     x      x          3      4 (1,3,4)
7E   ROR       Absolute Indexed, X            x     x      x          3      7 (5,6)
7F   ADC       Absolute Long Indexed, X                    x          4      5 (1,4)
80   BRA       Program Counter Relative             x      x          2      3 (8)
81   STA       DP Indexed Indirect, X         x     x      x          2      6 (1,2)
82   BRL       Program Counter Relative Long               x          3      4
83   STA       Stack Relative                              x          2      4 (1)
84   STY       Direct Page                    x     x      x          2      3 (2,10)
85   STA       Direct Page                    x     x      x          2      3 (1,2)
86   STX       Direct Page                    x     x      x          2      3 (2,10)
87   STA       DP Indirect Long                            x          2      6 (1,2)
88   DEY       Implied                        x     x      x          1      2
89   BIT       Immediate                            x      x          2*     2 (1)
8A   TXA       Implied                        x     x      x          1      2
8B   PHB       Stack (Push)                                x          1      3
8C   STY       Absolute                       x     x      x          3      4 (10)
8D   STA       Absolute                       x     x      x          3      4 (1)
8E   STX       Absolute                       x     x      x          3      4 (10)
8F   STA       Absolute Long                               x          4      5 (1)
90   BCC       Program Counter Relative       x     x      x          2      2 (7,8)
91   STA       DP Indirect Indexed, Y         x     x      x          2      6 (1,2)
92   STA       DP Indirect                          x      x          2      5 (1,2)
93   STA       SR Indirect Indexed, Y                      x          2      7 (1)
94   STY       DP Indexed, X                  x     x      x          2      4 (2,10)
95   STA       DP Indexed, X                  x     x      x          2      4 (1,2)
96   STX       DP Indexed, Y                  x     x      x          2      4 (2,10)
97   STA       DP Indirect Long Indexed, Y                 x          2      6 (1,2)
98   TYA       Implied                        x     x      x          1      2
99   STA       Absolute Indexed, Y            x     x      x          3      5 (1)
9A   TXS       Implied                        x     x      x          1      2
9B   TXY       Implied                                     x          1      2
9C   STZ       Absolute                             x      x          3      4 (1)
9D   STA       Absolute Indexed, X            x     x      x          3      5 (1)
9E   STZ       Absolute Indexed, X                  x      x          3      5 (1)
9F   STA       Absolute Long Indexed, X                    x          4      5 (1)
A0   LDY       Immediate                      x     x      x          2+     2 (10)
A1   LDA       DP Indexed Indirect, X         x     x      x          2      6 (1,2)
A2   LDX       Immediate                      x     x      x          2+     2 (10)
A3   LDA       Stack Relative                              x          2      4 (1)
A4   LDY       Direct Page                    x     x      x          2      3 (2,10)
A5   LDA       Direct Page                    x     x      x          2      3 (1,2)
A6   LDX       Direct Page                    x     x      x          2      3 (2,10)
A7   LDA       DP Indirect Long                            x          2      6 (1,2)
A8   TAY       Implied                        x     x      x          1      2
A9   LDA       Immediate                      x     x      x          2*     2 (1)
AA   TAX       Implied                        x     x      x          1      2
AB   PLB       Stack (Pull)                                x          1      4
AC   LDY       Absolute                       x     x      x          3      4 (10)
AD   LDA       Absolute                       x     x      x          3      4 (1)
AE   LDX       Absolute                       x     x      x          3      4 (10)
AF   LDA       Absolute Long                               x          4      5 (1)
B0   BCS       Program Counter Relative       x     x      x          2      2 (7,8)
B1   LDA       DP Indirect Indexed, Y         x     x      x          2      5 (1,2,3)
B2   LDA       DP Indirect                          x      x          2      5 (1,2)
B3   LDA       SR Indirect Indexed, Y                      x          2      7 (1)
B4   LDY       DP Indexed, X                  x     x      x          2      4 (2,10)
B5   LDA       DP Indexed, X                  x     x      x          2      4 (1,2)
B6   LDX       DP Indexed, Y                  x     x      x          2      4 (2,10)
B7   LDA       DP Indirect Long Indexed, Y                 x          2      6 (1,2)
B8   CLV       Implied                        x     x      x          1      2
B9   LDA       Absolute Indexed, Y            x     x      x          3      4 (1,3)
BA   TSX       Implied                        x     x      x          1      2
BB   TYX       Implied                                     x          1      2
BC   LDY       Absolute Indexed, X            x     x      x          3      4 (3,10)
BD   LDA       Absolute Indexed, X            x     x      x          3      4 (1,3)
BE   LDX       Absolute Indexed, Y            x     x      x          3      4 (3,10)
BF   LDA       Absolute Long Indexed, X                    x          4      5 (1)
C0   CPY       Immediate                      x     x      x          2+     2 (10)
C1   CMP       DP Indexed Indirect, X         x     x      x          2      6 (1,2)
C2   REP       Immediate                                   x          2      3
C3   CMP       Stack Relative                              x          2      4 (1)
C4   CPY       Direct Page                    x     x      x          2      3 (2,10)
C5   CMP       Direct Page                    x     x      x          2      3 (1,2)
C6   DEC       Direct Page                    x     x      x          2      5 (2,5)
C7   CMP       DP Indirect Long                            x          2      6 (1,2)
C8   INY       Implied                        x     x      x          1      2
C9   CMP       Immediate                      x     x      x          2*     2 (1)
CA   DEX       Implied                        x     x      x          1      2
CB   WAI       Implied                                     x          1      3 (15)
CC   CPY       Absolute                       x     x      x          3      4 (10)
CD   CMP       Absolute                       x     x      x          3      4 (1)
CE   DEC       Absolute                       x     x      x          3      6 (5)
CF   CMP       Absolute Long                               x          4      5 (1)
D0   BNE       Program Counter Relative       x     x      x          2      2 (7,8)
D1   CMP       DP Indirect Indexed, Y         x     x      x          2      5 (1,2,3)
D2   CMP       DP Indirect                          x      x          2      5 (1,2)
D3   CMP       SR Indirect Indexed, Y                      x          2      7 (1)
D4   PEI       Stack (Direct Page Indirect)                x          2      6 (2)
D5   CMP       DP Indexed, X                  x     x      x          2      4 (1,2)
D6   DEC       DP Indexed, X                  x     x      x          2      6 (2,5)
D7   CMP       DP Indirect Long Indexed, Y                 x          2      6 (1,2)
D8   CLD       Implied                        x     x      x          1      2
D9   CMP       Absolute Indexed, Y            x     x      x          3      4 (1,3)
DA   PHX       Stack (Push)                         x      x          1      3 (10)
DB   STP       Implied                                     x          1      3 (14)
DC   JMP       Absolute Indirect Long                      x          3      6
DD   CMP       Absolute Indexed, X            x     x      x          3      4 (1,3)
DE   DEC       Absolute Indexed, X            x     x      x          3      7 (5,6)
DF   CMP       Absolute Long Indexed, X                    x          4      5 (1)
E0   CPX       Immediate                      x     x      x          2+     2 (10)
E1   SBC       DP Indexed Indirect, X         x     x      x          2      6 (1,2,4)
E2   SEP       Immediate                                   x          2      3
E3   SBC       Stack Relative                              x          2      4 (1,4)
E4   CPX       Direct Page                    x     x      x          2      3 (2,10)
E5   SBC       Direct Page                    x     x      x          2      3 (1,2,4)
E6   INC       Direct Page                    x     x      x          2      5 (2,5)
E7   SBC       DP Indirect Long                            x          2      6 (1,2,4)
E8   INX       Implied                        x     x      x          1      2
E9   SBC       Immediate                      x     x      x          2*     2 (1,4)
EA   NOP       Implied                        x     x      x          1      2
EB   XBA       Implied                                     x          1      3
EC   CPX       Absolute                       x     x      x          3      4 (10)
ED   SBC       Absolute                       x     x      x          3      4 (1,4)
EE   INC       Absolute                       x     x      x          3      6 (5)
EF   SBC       Absolute Long                               x          4      5 (1,4)
F0   BEQ       Program Counter Relative       x     x      x          2      2 (7,8)
F1   SBC       DP Indirect Indexed, Y         x     x      x          2      5 (1,2,3,4)
F2   SBC       DP Indirect                          x      x          2      5 (1,2,4)
F3   SBC       SR Indirect Indexed, Y                      x          2      7 (1,4)
F4   PEA       Stack (Absolute)                            x          3      5
F5   SBC       DP Indexed, X                  x     x      x          2      4 (1,2,4)
Continued.
Processor:
    Each opcode or instruction is marked as first introduced on the 65C02 or as first introduced
    on the 65816/65802; unmarked entries were first introduced on the NMOS 6502.
Addressing mode box:
    Each box gives the addressing mode (for example, Immediate), the assembler operand syntax
    (for example, # const), the number of bytes, and the number of cycles.

Key to detailed instruction operation chart (see Appendix E: 65816 Data Sheet)
Operation column:
    A       Accumulator
    X       Index register X
    Y       Index register Y
    M       Contents of memory location specified by effective address
    M(d)    Contents of direct page memory location pointed to by operand
    M(s)    Contents of memory location pointed to by stack pointer
    M(pc)   Current opcode pointed to by the program counter
    PC      Memory location of current opcode pointed to by the program counter
    rl      Two-byte operand of relative long addressing mode instruction
    +       Add
            Subtract
            And
            Or
            Exclusive Or
            Logical complement of a value or status bit (A with an overbar indicates the
            complement of the value in the accumulator)
    Φ2      Phase 2 clock (hardware signal)
    RDY     Ready (hardware signal)
symbol     addressing mode
#          immediate
A          accumulator
r          program counter relative
rl         program counter relative long
i          implied
s          stack
d          direct
d,x        direct indexed (with x)
d,y        direct indexed (with y)
(d)        direct indirect
(d,x)      direct indexed indirect
(d),y      direct indirect indexed
[d]        direct indirect long
[d],y      direct indirect long indexed
a          absolute
a,x        absolute indexed (with x)
a,y        absolute indexed (with y)
al         absolute long
al,x       absolute long indexed
d,s        stack relative
(d,s),y    stack relative indirect indexed
(a)        absolute indirect
(a,x)      absolute indexed indirect
xyc        block move
BASIC, 19
Binary arithmetic, 19
binary digit, 11
binary-coded decimal, 11, 18
bit, 11
bitwise, 15
branch, 23
conditional, 23
macro assemblers, 21
number systems, 11
Table of Contents
Appendices
A. 65x Signal Description
    6502 Signals
        Address Bus
        Clock Signals
        Data Bus
        Data Bus Enable
        Read/Write
        Ready
        Interrupt Request
        Sync
        Reset
    65C02 Signals
        Memory Lock
        Notes
    65802 Signals
    65816 Signals
        Bank Address
        Vector Pull
        Abort
        Valid Program Address and Valid Data Address
        Memory and Index
        Emulation
        Bus Enable
Table of Tables
    Table D-1
    Table D-2
    Table D-3
    Table D-4
    Table D-5
    Table D-6
    Table D-7
Figure: 40-pin pinouts of the 6502, 65C02, W65C802, and W65C816.

Pin   6502    65C02   W65C802   W65C816
1     VSS     VSS     VSS       VPB
2     RDY     RDY     RDY       RDY
3     PHI1O   PHI1O   PHI1O     ABORTB
4     IRQB    IRQB    IRQB      IRQB
5     NC      MLB     NC        MLB
6     NMIB    NMIB    NMIB      NMIB
7     SYNC    SYNC    SYNC      VPA
8     VDD     VDD     VDD       VDD
9     A0      A0      A0        A0
10    A1      A1      A1        A1
11    A2      A2      A2        A2
12    A3      A3      A3        A3
13    A4      A4      A4        A4
14    A5      A5      A5        A5
15    A6      A6      A6        A6
16    A7      A7      A7        A7
17    A8      A8      A8        A8
18    A9      A9      A9        A9
19    A10     A10     A10       A10
20    A11     A11     A11       A11
21    VSS     VSS     VSS       VSS
22    A12     A12     A12       A12
23    A13     A13     A13       A13
24    A14     A14     A14       A14
25    A15     A15     A15       A15
26    D7      D7      D7        D7/BA7
27    D6      D6      D6        D6/BA6
28    D5      D5      D5        D5/BA5
29    D4      D4      D4        D4/BA4
30    D3      D3      D3        D3/BA3
31    D2      D2      D2        D2/BA2
32    D1      D1      D1        D1/BA1
33    D0      D0      D0        D0/BA0
34    RWB     RWB     RWB       RWB
35    NC      NC      NC        E
36    NC      NC      NC        BE
37    PHI2I   PHI2I   PHI2I     PHI2I
38    SOB     SOB     SOB       M/X
39    PHI2O   PHI2O   PHI2O     VDA
40    RESB    RESB    RESB      RESB
6502 Signals
The 6502 defines the basic set of signals.
Address Bus
Pins A0-A15 are the address lines. Every time an address is generated (opcode fetch, operand read, intermediate
address, or effective address of a read or write operation), the binary value of the address appears on these pins, A0 representing the
low-order bit of the address, and A15 representing the high-order bit. These outputs are TTL compatible.
Clock Signals
All of the 65x series processors operate on a two-phase external cycle; a 65x processor's frequency, expressed in
megahertz, or millions of cycles per second, also determines its memory-access cycle time. The 6502 has an internal clock generator based
on the phase zero input signal, a time base typically provided by a crystal oscillator. The two output signals, phase one and phase
two, are derived from this signal. Phase one goes high when phase zero is low; phase two goes low on the rising edge of phase one.
Data Bus
Pins D0-D7 are the data lines; these eight pins form a bi-directional data bus to read and write data between the processor
and memory and the peripheral devices. Like the address lines, the outputs can drive one standard TTL load.
Read/Write
R/W is high when data is being read from memory or peripherals into the processor, low when the processor is writing
data. When in the low state, data and address lines have valid data and addresses.
Ready
The RDY signal enables the processor to be single-stepped on all cycles except write cycles. When enabled during phase
one, the processor is halted and the address lines maintain the current address; this lets the processor interface with lower-speed
read-only memory devices, and can also be used in direct memory access implementations.
Interrupt Request
The IRQ signal requests that an interrupt-service cycle be initiated. This signal is connected to peripheral devices that are
designed to be interrupt-driven. This is the maskable interrupt signal, so the interrupt disable flag in the status register must be zero
for the interrupt to be effective. The RDY signal must be high for an interrupt to be recognized. IRQ is sampled during phase 2.
Non-maskable Interrupt
NMI is basically identical to IRQ, except that it causes an unconditional interrupt when it is asserted, and
control vectors through the NMI vector rather than IRQ.
Sync
This line goes high during phase one of those cycles that are opcode fetches. When used with the RDY signal, this allows
hardware implementation of a single-step debugging capability.
Reset
RESET reinitializes the processor, either at power-up or to restart the system from a known state. RESET must be held
low for at least two cycles after a power down. When it is asserted, an interrupt-like service routine begins (although the status and
program counter are not stacked), with the result that control is transferred through the RESET vector.
65C02 Signals
The 65C02 pinout is identical to the 6502, with the exception of the memory lock signal and the differences noted below.
Memory Lock
The ML output signal assures the integrity of read-modify-write instructions by signaling other devices, for example,
other processors in a multiprocessor environment, that the bus may not be claimed until completion of the read-modify-write
operation. This signal goes low during the execution of the memory-referencing (non-register operand) ASL, DEC, INC, LSR,
ROL, ROR, TRB, and TSB instructions.
Notes
The 65C02, unlike the 6502, responds to RDY during a write cycle as well as a read, halting the processor.
Response of the 65C02 to a reset is different from the 6502 in that the 65C02's program counter and status register are
written to the stack. Additionally, the 65C02 decimal flag is cleared after reset or interrupt; on the 6502, its value is indeterminate
after reset and not modified after interrupt.
When an interrupt occurs immediately after the fetch of a BRK instruction on the 6502, the BRK is ignored; on the 65C02,
the BRK is executed, then the interrupt is executed.
Finally, the 65C02 R/W line is high during the modify (internal operation) cycle of the read-modify-write operations; on
the 6502, it is low.
65802 Signals
The 65802 signals are by definition 6502 pin-compatible. The 65C02 ML (memory lock) signal is not on the standard
pinout, although it is available as a special-order mask option. Like the 6502, and unlike the 65C02, the 65802 does not write to the
stack during a reset.
Some of the enhancements of the 65C02 are available on the 65802 in the native mode, while in emulation mode the system
behaves as a 6502. R/W is low during the modify cycle of read-modify-write operations in the emulation mode; high in the native
mode.
65816 Signals
Most of the signals behave as on the 65802, with the following additions and changes:
Vector Pull
The VP signal is asserted whenever any of the vector addresses ($00:FFE4-FFEF, $00:FFF4-FFFF) are being accessed as
part of an interrupt-type service cycle. This lets external hardware modify the interrupt vector, eliminating the need for software
polling for interrupt sources.
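The two address ranges can be expressed as a simple predicate, sketched here in Python. The function name is hypothetical, and the check covers only the address test: on the real part VP is asserted only while such an address is read as part of an interrupt-type service cycle, which this model does not track.

```python
def is_vector_address(addr24):
    """True if a 24-bit address falls in one of the bank-zero vector ranges."""
    bank = addr24 >> 16
    offset = addr24 & 0xFFFF
    in_first_range = 0xFFE4 <= offset <= 0xFFEF
    in_second_range = 0xFFF4 <= offset <= 0xFFFF
    return bank == 0 and (in_first_range or in_second_range)


hit = is_vector_address(0x00FFEA)
gap = is_vector_address(0x00FFF0)           # between the two ranges
wrong_bank = is_vector_address(0x01FFFC)    # vectors live in bank zero only
```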
Abort
The ABORT input pin, when it is asserted, causes the current instruction to be aborted. Unlike an interrupt, none of the
registers are updated and the instruction quits execution from the cycle where the ABORT signal was received. No registers are
modified. In other words, the processor is left in the state it was in before the instruction that was aborted. Control is shifted to the
ABORT vector after an interrupt-like context-saving cycle.
The ABORT signal lets external hardware abort instructions on the basis of undesirable address bus conditions; memory
protection and paged virtual memory systems can be fully implemented using this signal.
ABORT should be held low for only one cycle; if held low during the ABORT interrupt sequence, the ABORT interrupt
will be aborted.
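The VPA (valid program address) and VDA (valid data address) outputs together indicate the type of the current bus cycle:

VPA   VDA
 0     0     Internal operation
 0     1     Valid data address
 1     0     Valid program address
 1     1     Opcode fetch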
During internal operations, the output buffers may be disabled by external logic, making the address bus available for
transparent direct memory access. Also, since the 65816 sometimes generates a false read during instructions that cross page
boundaries, these may be trapped via these two signals if this is desirable. Note, however, that addresses should not be qualified in
emulation mode if hardware such as the Apple II disk controller is used, which requires false reads to operate.
The other states may be used for virtual memory implementation and high-speed data or instruction cache control. VPA
and VDA high together are equivalent to the 6502 SYNC output.
Bus Enable
This signal replaces the data bus enable signal of the 6502; when asserted, it disables the address buffers and R/W as well
as the data buffers.
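COMPORT  GEQU  $C0A8          ; base address of the 6551 registers

         SEP   #$20           ; set 8-bit accumulator
         LONGA OFF
         LDA   #0
         STA   COMPORT+1      ; write to the status register: programmed reset
         LDA   #$1E
         STA   COMPORT+3      ; control register
         LDA   #$0B
         STA   COMPORT+2      ; command register
         RTS
Fragment B.1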
Actually, any value can be written to the status register to cause a programmed reset; this operation is done to reinitialize
the I/O registers. The three figures each show, for one of the non-data registers, the effect of a reset on each of its bits.
6551 Status Register

Bit  Status                         Set by                           Cleared by
 0   Parity Error                   0 = No Error, 1 = Error          Self Clearing
 1   Framing Error                  0 = No Error, 1 = Error          Self Clearing
 2   Overrun                        0 = No Error, 1 = Error          Self Clearing
 3   Receive Data Register Full     0 = Not Full, 1 = Full           Read Receive Data Register
 4   Transmit Data Register Empty   0 = Not Empty, 1 = Empty         Write Transmit Data Register
 5   DCD                            0 = DCD Low, 1 = DCD High        Not Resettable; Reflects DCD State
 6   DSR                            0 = DSR Low, 1 = DSR High        Not Resettable; Reflects DSR State
 7   IRQ                            0 = No Interrupt, 1 = Interrupt  Read Status Register
6551 Command Register

Parity check controls
Bit 7  Bit 6  Bit 5   Operation
  -      -      0     Parity Disabled: No Parity Bit Generated, No Parity Bit Received
  0      0      1     Odd Parity Receiver and Transmitter
  0      1      1     Even Parity Receiver and Transmitter
  1      0      1     Mark Parity Bit Transmitted, Parity Check Disabled
  1      1      1     Space Parity Bit Transmitted, Parity Check Disabled

Normal/echo mode for receiver (bit 4)
  0 = Normal
  1 = Echo (bits 2 and 3 must be 0)

Transmitter controls
Bit 3  Bit 2   Transmit Interrupt   RTS Level   Transmitter
  0      0     Disabled             High        Off
  0      1     Enabled              Low         On
  1      0     Disabled             Low         On
  1      1     Disabled             Low         Transmit BRK
Stop bits (bit 7)
  0 = 1 Stop Bit
  1 = 2 Stop Bits
      1 Stop Bit if Word Length = 8 Bits and Parity
      1 1/2 Stop Bits if Word Length = 5 Bits and No Parity

Word length
Bit 6  Bit 5   Data Word Length
  0      0     8
  0      1     7
  1      0     6
  1      1     5

Baud rate generator
Bit 3  Bit 2  Bit 1  Bit 0   Baud Rate
  0      0      0      0     16x External Clock
  0      0      0      1     50
  0      0      1      0     75
  0      0      1      1     109.92
  0      1      0      0     134.58
  0      1      0      1     150
  0      1      1      0     300
  0      1      1      1     600
  1      0      0      0     1200
  1      0      0      1     1800
  1      0      1      0     2400
  1      0      1      1     3600
  1      1      0      0     4800
  1      1      0      1     7200
  1      1      1      0     9600
  1      1      1      1     19,200

Bit              7  6  5  4  3  2  1  0
Hardware Reset   0  0  0  0  0  0  0  0
Program Reset    -  -  -  -  -  -  -  -

Figure B-3 Control Register
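As an illustration of how the control register fields combine, the value $1E that Fragment B.1 stores to COMPORT+3 can be
built up from named fields. This is only a sketch: the field names below are invented for the example, and the meaning of bit
four (receiver clock source, where a one selects the internal baud rate generator) is taken from the 6551 data sheet rather
than from the figure.

```asm
; Hypothetical field names; only COMPORT and the value $1E come from Fragment B.1.
COMPORT  GEQU  $C0A8
STOP1    GEQU  %00000000      ; bit 7 = 0: one stop bit
WORD8    GEQU  %00000000      ; bits 6-5 = 00: eight-bit data word
RXBAUD   GEQU  %00010000      ; bit 4 = 1: receiver clock from baud rate generator (data sheet)
B9600    GEQU  %00001110      ; bits 3-0 = 1110: 9600 baud

         SEP   #$20           ; set 8-bit accumulator
         LONGA OFF
         LDA   #STOP1+WORD8+RXBAUD+B9600   ; = %00011110 = $1E
         STA   COMPORT+3      ; control register
```

Changing the baud rate field to %00001000 would select 1200 baud and leave the other fields as they were.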
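COMPORT  GEQU  $C0A8

         SEP   #$20           ; set 8-bit accumulator
         LONGA OFF
AWAITCH  LDA   COMPORT+1      ; read the status register
         AND   #8             ; bit three: receive data register full?
         BEQ   AWAITCH        ; no: keep waiting
         LDA   COMPORT        ; yes: read the character
         RTS
Fragment B.2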
Similarly, as Fragment B.3 shows, writing a byte out to the communications line is a matter of (once the 6551 has been
initialized) waiting until status register bit four (transmitter data register empty) is set, then writing the byte to the data
register. Neither routine does any error checking using the other status register bits.
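COMPORT  GEQU  $C0A8

         PHA                  ; save the byte to be sent
WAITRDY  LDA   COMPORT+1      ; read the status register
         AND   #$10           ; bit four: transmitter data register empty?
         BEQ   WAITRDY        ; no: keep waiting
         PLA                  ; yes: recover the byte
         STA   COMPORT        ; and write it to the data register
         RTS
Fragment B.3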
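Since neither routine looks at the error bits, a receive routine that does might look like the following sketch. It assumes
the usual 6551 status register layout (bits zero through two are parity error, framing error, and overrun; bit three is receive
data register full). The labels GETCHK, RXWAIT, RXERR, and ERRBITS, and the convention of returning with the carry set on an
error, are inventions for this example.

```asm
COMPORT  GEQU  $C0A8
ERRBITS  GEQU  %00000111      ; parity, framing, and overrun error bits

GETCHK   SEP   #$30           ; set 8-bit accumulator and index registers
         LONGA OFF
         LONGI OFF
RXWAIT   LDA   COMPORT+1      ; read the status register once
         TAX                  ; keep a copy for the error test
         AND   #8             ; receive data register full?
         BEQ   RXWAIT         ; no: keep waiting
         TXA                  ; recover the full status byte
         AND   #ERRBITS       ; any error bit set?
         BNE   RXERR
         LDA   COMPORT        ; no errors: fetch the character
         CLC                  ; carry clear = good character
         RTS
RXERR    LDA   COMPORT        ; read the suspect character to empty the register
         SEC                  ; carry set = receive error
         RTS
```

A caller would test the carry after the call and decide for itself whether to discard the character or ask for it to be sent
again.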
The data direction register is generally initialized for an application just once; then the data register is selected. Each data
direction register bit controls the same-numbered bit in the data register: if a data direction register bit is set, the corresponding data
register bit becomes an output line; if a data direction register bit is clear, the corresponding data register bit becomes an input line.
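         SEP   #$20           ; set 8-bit accumulator
         LONGA OFF
         LDA   PORTACTRL
         AND   #%11111011     ; clear bit two: select data direction register
         STA   PORTACTRL
         LDA   #$FF           ; all eight lines of Port A are outputs
         STA   PORTA
         LDA   PORTACTRL
         ORA   #%00000100     ; set bit two: select data register
         STA   PORTACTRL
         LDA   PORTBCTRL
         AND   #%11111011     ; clear bit two: select data direction register
         STA   PORTBCTRL
         LDA   #1             ; bit zero of Port B is an output, the rest inputs
         STA   PORTB
         LDA   PORTBCTRL
         ORA   #%00000100     ; set bit two: select data register
         STA   PORTBCTRL
         LDA   #1             ; initialize the Data Strobe line to one
         STA   PORTB
         RTS
Fragment B.4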
PORTACTRL, PORTA, PORTBCTRL, and PORTB must be elsewhere equated to the addresses at which each is located. The
value in the control register is loaded and bit two is ANDed out with the mask, then stored back to select the data direction
register in each port. All ones are stored to Port A's data direction register, selecting all eight lines as outputs. One is
stored to Port B's data direction register, selecting bit zero as an output and the rest of the port as inputs. Then the control
registers are loaded again, this time ORing bit two back on before re-storing them, to select the data register in each
port. Finally, one is written out to Port B, to the printer's Data Strobe, to initialize the line.
Now bytes can be written to the printer by waiting for a zero on the Printer Busy Line (bit seven of Port B was chosen so
that a positive/negative test could be made to test the bit), then storing the byte to be written to Port A, and finally toggling
the Data Strobe to zero and then back to one to inform the printer that a new character is ready to be printed.
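POUT     BIT   PORTB          ; test Printer Busy (bit seven of Port B)
         BMI   POUT           ; busy: keep waiting
         STA   PORTA          ; put the character on the data lines
         LDA   #0
         STA   PORTB          ; Data Strobe to zero
         NOP
         LDA   #1
         STA   PORTB          ; Data Strobe back to one
         RTS
Fragment B.5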
You must be sure, in toggling the Strobe by writing to it, that the zero written to bit seven (zeroes are written to bits one
through seven during both writes to Port B) not be read back as though it is a value being sent by the printer's Busy Line
indicating the printer is not busy.
Remember that it is always important to have a data sheet for each peripheral support chip you attempt to write code for.
The specified bit in the zero page location specified in the operand is tested. If it is clear (reset), a branch is
taken; if it is set, the instruction immediately following the three-byte BBRx instruction is executed. The bit is specified
by a number (0 through 7) concatenated to the end of the mnemonic.
If the branch is performed, the third byte of the instruction is used as a signed displacement from the program
counter; that is, it is added to the program counter: a positive value (numbers less than or equal to $7F; that is, numbers
with the high-order bit clear) results in a branch to a higher location; a negative value (greater than or equal to $80, with
the high-order bit set) results in a branch to a lower location. Once the branch address is calculated, the result is loaded
into the program counter, transferring control to that location.
Most assemblers calculate the displacement for you: you must specify as the operand not the displacement but
rather the label to which you wish to branch. The assembler then calculates the correct offset.
Flags Affected: none
Codes:
                                                            Opcode  Available to:                  # of   # of
Addressing Mode                         Syntax              (hex)   6502  65C02  R65C02  65802     Bytes  Cycles
Direct Page / Program Counter Relative  BBR0 dp, nearlabel  0F                   x                 3      5
Direct Page / Program Counter Relative  BBR1 dp, nearlabel  1F                   x                 3      5
Direct Page / Program Counter Relative  BBR2 dp, nearlabel  2F                   x                 3      5
Direct Page / Program Counter Relative  BBR3 dp, nearlabel  3F                   x                 3      5
Direct Page / Program Counter Relative  BBR4 dp, nearlabel  4F                   x                 3      5
Direct Page / Program Counter Relative  BBR5 dp, nearlabel  5F                   x                 3      5
Direct Page / Program Counter Relative  BBR6 dp, nearlabel  6F                   x                 3      5
Direct Page / Program Counter Relative  BBR7 dp, nearlabel  7F                   x                 3      5
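Because this instruction exists only on the Rockwell R65C02, code meant for the 65802 or 65816 needs a substitute. The
sketch below, in which FLAGS, SKIP, SKIP2, and the addresses are made up for illustration, shows a BBR3 with the bytes it
would assemble to and one possible replacement. The replacement is not exact: it uses the accumulator (assumed here to be
eight bits wide) and alters the n, v, and z flags, which BBR3 does not.

```asm
FLAGS    GEQU  $80            ; hypothetical direct page flag byte

; R65C02 only. Assembled at $2000 this is 3F 80 02: the displacement is
; $2005 - $2003 = 2, counted from the address after the three-byte instruction.
         BBR3  FLAGS,SKIP     ; $2000: branch if bit 3 of FLAGS is clear
         INC   FLAGS+1        ; $2003: two bytes (E6 81), skipped when bit 3 is clear
SKIP     RTS                  ; $2005

; A possible 65802/65816 substitute with the same branch condition
         LONGA OFF            ; assumes the m flag is already set (8-bit accumulator)
         LDA   #%00001000     ; mask for bit 3
         BIT   FLAGS          ; z flag is set if (A AND FLAGS) is zero
         BEQ   SKIP2          ; bit 3 clear: take the branch
         INC   FLAGS+1
SKIP2    RTS
```

Branching on a set bit instead, as BBSx does, would only require changing the BEQ to a BNE.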
The specified bit in the zero page location specified in the operand is tested. If it is set, a branch is taken; if it is
clear (reset), the instruction immediately following the three-byte BBSx instruction is executed. The bit is specified by a
number (0 through 7) concatenated to the end of the mnemonic.
If the branch is performed, the third byte of the instruction is used as a signed displacement from the program
counter; that is, it is added to the program counter: a positive value (numbers less than or equal to $7F; that is, numbers
with the high-order bit clear) results in a branch to a higher location; a negative value (greater than or equal to $80, with
the high-order bit set) results in a branch to a lower location. Once the branch address is calculated, the result is loaded
into the program counter, transferring control to that location.
Most assemblers calculate the displacement for you: you must specify as the operand not the displacement but
rather the label to which you wish to branch. The assembler then calculates the correct offset.
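Flags Affected: none
Codes:
                                                            Opcode  Available to:                  # of   # of
Addressing Mode                         Syntax              (hex)   6502  65C02  R65C02  65802     Bytes  Cycles
Direct Page / Program Counter Relative  BBS0 dp, nearlabel  8F                   x                 3      5
Direct Page / Program Counter Relative  BBS1 dp, nearlabel  9F                   x                 3      5
Direct Page / Program Counter Relative  BBS2 dp, nearlabel  AF                   x                 3      5
Direct Page / Program Counter Relative  BBS3 dp, nearlabel  BF                   x                 3      5
Direct Page / Program Counter Relative  BBS4 dp, nearlabel  CF                   x                 3      5
Direct Page / Program Counter Relative  BBS5 dp, nearlabel  DF                   x                 3      5
Direct Page / Program Counter Relative  BBS6 dp, nearlabel  EF                   x                 3      5
Direct Page / Program Counter Relative  BBS7 dp, nearlabel  FF                   x                 3      5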
Clear the specified bit in the zero page memory location specified in the operand. The bit to clear is specified by
a number (0 through 7) concatenated to the end of the mnemonic.
Flags Affected: none
Codes:
                           Opcode  Available to:                  # of   # of
Addressing Mode  Syntax    (hex)   6502  65C02  R65C02  65802     Bytes  Cycles
Direct Page      RMB0 dp   07                   x                 2      5
Direct Page      RMB1 dp   17                   x                 2      5
Direct Page      RMB2 dp   27                   x                 2      5
Direct Page      RMB3 dp   37                   x                 2      5
Direct Page      RMB4 dp   47                   x                 2      5
Direct Page      RMB5 dp   57                   x                 2      5
Direct Page      RMB6 dp   67                   x                 2      5
Direct Page      RMB7 dp   77                   x                 2      5
Set the specified bit in the zero page memory location specified in the operand. The bit to set is
specified by a number (0 through 7) concatenated to the end of the mnemonic.
Flags Affected: none
Codes:
                           Opcode  Available to:                  # of   # of
Addressing Mode  Syntax    (hex)   6502  65C02  R65C02  65802     Bytes  Cycles
Direct Page      SMB0 dp   87                   x                 2      5
Direct Page      SMB1 dp   97                   x                 2      5
Direct Page      SMB2 dp   A7                   x                 2      5
Direct Page      SMB3 dp   B7                   x                 2      5
Direct Page      SMB4 dp   C7                   x                 2      5
Direct Page      SMB5 dp   D7                   x                 2      5
Direct Page      SMB6 dp   E7                   x                 2      5
Direct Page      SMB7 dp   F7                   x                 2      5
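On the 65802 and 65816, which lack RMBx and SMBx, the TRB and TSB instructions can make the same change to memory. The
following is a sketch only: the direct page location FLAGS is invented for the example and an eight-bit accumulator is
assumed. Unlike RMBx and SMBx, these sequences use the accumulator and affect the z flag.

```asm
FLAGS    GEQU  $80            ; hypothetical direct page flag byte

         SEP   #$20           ; set 8-bit accumulator
         LONGA OFF
         LDA   #%00001000     ; mask for bit 3
         TRB   FLAGS          ; clear bit 3 of FLAGS, as RMB3 FLAGS would
         LDA   #%00100000     ; mask for bit 5
         TSB   FLAGS          ; set bit 5 of FLAGS, as SMB5 FLAGS would
```

Because the mask is an ordinary byte, one TRB or TSB can also change several bits at once, which RMBx and SMBx cannot do.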
D. Instruction Groups
The 65x instructions can be divided into three groups, on the basis of both the types of actions of each
instruction and the addressing modes each can use. The opcodes in the first group and some in the second have similar
bit patterns, the same addressing modes available, and regularity which can make remembering the capabilities of a
particular instruction or creating a compiler generator much easier.
Group I instructions are the most commonly used load, store, logic, and arithmetic instructions, and have by far
the most addressing modes available to them. Group II instructions are mostly read-modify-write instructions, such as
increment, decrement, shift, and rotate, which both access and change one and only one register or memory location.
Group III is a catch-all for the remaining instructions, such as index register comparisons and stack operations.
Group I Instructions
The 65x Group I instructions, with their opcode bit patterns, are shown in Table D.1. The a's (aaaaa) are filled
with addressing mode bit patterns; there is one pattern for each addressing mode available to Group I instructions.
Add with Carry to the Accumulator (ADC)            011a aaaa
And the Accumulator (AND)                          001a aaaa
Compare the Accumulator (CMP)                      110a aaaa
Exclusive Or the Accumulator (EOR)                 010a aaaa
Load the Accumulator (LDA)                         101a aaaa
Or the Accumulator (ORA)                           000a aaaa
Subtract with Borrow from the Accumulator (SBC)    111a aaaa
Store the Accumulator (STA)                        100a aaaa
Table D.1
The 6502 addressing modes available to the Group I instructions have bit patterns that all end in 01. These bit
patterns are found in Table D.2. The exception to this scheme is STA immediate; since it is not possible to use
immediate addressing with a store instruction, its logical opcode 1000 1001 is used by a non-Group-I instruction.
Immediate                                                    0 1001
Direct (Zero) Page                                           0 0101
Absolute                                                     0 1101
Direct (Zero) Page Indexed by X                              1 0101
Absolute Indexed by X                                        1 1101
Absolute Indexed by Y                                        1 1001
Direct (Zero) Page Indexed Indirect with X (pre-indexed)     0 0001
Direct (Zero) Page Indirect Indexed with Y (post-indexed)    1 0001
Table D.2
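To see the regularity in practice, the listing below shows LDA (101a aaaa) in each of the eight 6502 addressing modes,
with the opcode that results from dropping in each five-bit addressing mode pattern. The operand values are arbitrary and an
eight-bit accumulator is assumed; only the opcodes matter here.

```asm
         LONGA OFF
         LDA   #$12           ; 101 0 1001 = $A9  immediate
         LDA   $12            ; 101 0 0101 = $A5  direct page
         LDA   $1234          ; 101 0 1101 = $AD  absolute
         LDA   $12,X          ; 101 1 0101 = $B5  direct page indexed by X
         LDA   $1234,X        ; 101 1 1101 = $BD  absolute indexed by X
         LDA   $1234,Y        ; 101 1 1001 = $B9  absolute indexed by Y
         LDA   ($12,X)        ; 101 0 0001 = $A1  direct page indexed indirect with X
         LDA   ($12),Y        ; 101 1 0001 = $B1  direct page indirect indexed with Y
```

Replacing the leading 101 with 011 gives the ADC opcodes for the same modes ($69, $65, $6D, and so on).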
The 65C02 adds one more addressing mode for Group I instructions; it has the only Group I
addressing mode bit pattern to end in a zero:
Direct (Zero) Page Indirect                                  1 0010
The 65802 and 65816 add the six addressing modes for Group I instructions found in Table D.3.
Direct Page Indirect Long Indexed with Y (post-indexed long)    1 0111
Direct Page Indirect Long                                       0 0111
Absolute Long                                                   0 1111
Absolute Long Indexed with X                                    1 1111
Stack Relative                                                  0 0011
Stack Relative Indirect Indexed with Y                          1 0011
Table D.3
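Group II Instructions
ASL    000b bc10
DEC    110b b110
INC    111b b110
LSR    010b bc10
ROL    001b bc10
ROR    011b bc10
STX    100b b110
STY    100b b100

Accumulator                        0 10
Direct (Zero) Page                 0 01
Absolute                           0 11
Direct (Zero) Page Indexed by X    1 01
Absolute Indexed by X              1 11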
Notice how the four bb1 addressing modes have the same bit patterns as the first three bits of their
corresponding bit patterns for the Group I instruction addressing mode.
There are a few exceptions.
Absolute indexing is not available for storing either index register. Furthermore, since the register cannot use
itself, the STX instruction can't use direct page, X; instead, direct page, Y substitutes for this instruction's direct page,
indexed store.
The two 65C02 instructions to increment and decrement the accumulator do not follow this scheme at all;
giving these instructions that addressing mode clearly was not planned when the 6502 was designed, since their opcodes
were assigned to other instructions. Nor does the 65C02's STZ (store zero to memory) instruction, which uses the main
four addressing modes, follow the scheme, even though it seems clearly to be a Group II instruction of this type. But
four of the five addressing modes of the BIT instruction on the 65C02, 65802, and 65816 (the 6502 has only two
addressing modes for this instruction), namely the four bb1 addressing modes above, follow this scheme (its bit pattern is
001b b100). It also has an immediate addressing mode, however, which is in no way regular.
LDX    101d dd10
LDY    101d dd00

Immediate                     0 00
Direct (Zero) Page            0 01
Absolute                      0 11
Direct (Zero) Page Indexed    1 01
Absolute Indexed              1 11
Table D-6 Address Mode Patterns for Load Index Register Instructions
The two indexed modes use the Y index register for indexing when loading the X register and vice versa.
CPX    1110 ee00
CPY    1100 ee00

Immediate             00
Direct (Zero) Page    01
Absolute              11
Table D-7 Address Mode Patterns for Compare Index Register Instructions
Test-and-Change-Bits Instructions
The two test-and-change-bits instructions each have two addressing modes that they use in a regular manner.
The two instructions are:
Test and Reset Memory Bits (TRB)    0001 x100
Test and Set Memory Bits (TSB)      0000 x100

Direct Page    x=0
Absolute       x=1
Hex  Hex  Character   Names
00   80   Control-@   NUL, null
01   81   Control-A
02   82   Control-B
03   83   Control-C   Break
04   84   Control-D
05   85   Control-E
06   86   Control-F
07   87   Control-G   BEL, bell
08   88   Control-H   BS, backspace
09   89   Control-I   HT, horizontal tab
0A   8A   Control-J   LF, line feed
0B   8B   Control-K   VT, vertical tab
0C   8C   Control-L   FF, form feed, Page
0D   8D   Control-M   CR, carriage return
0E   8E   Control-N
0F   8F   Control-O
10   90   Control-P
11   91   Control-Q   XON, resume
12   92   Control-R
13   93   Control-S   XOFF, screen pause
14   94   Control-T
15   95   Control-U
16   96   Control-V
17   97   Control-W
18   98   Control-X
19   99   Control-Y
1A   9A   Control-Z
1B   9B   Control-[
1C   9C   Control-\
1D   9D   Control-]
1E   9E   Control-^
1F   9F   Control-_
20   A0               Space
21   A1   !           Exclamation point
22   A2   "           Quote
23   A3   #           Pound sign
24   A4   $           Dollar sign
25   A5   %           Percent sign
26   A6   &           Ampersand
27   A7   '           Apostrophe
28   A8   (           Left parenthesis
29   A9   )           Right parenthesis
2A   AA   *           Asterisk
2B   AB   +           Plus sign
2C   AC   ,           Comma
2D   AD   -           Minus sign, dash
2E   AE   .           Period
2F   AF   /           Slash
30   B0   0
31   B1   1
32   B2   2
33   B3   3
34   B4   4
35   B5   5
36   B6   6
37   B7   7
38   B8   8
39   B9   9
3A   BA   :           Colon
3B   BB   ;           Semicolon
3C   BC   <           Less than
3D   BD   =           Equal
3E   BE   >           Greater than
3F   BF   ?           Question mark
40   C0   @           At sign
41   C1   A
42   C2   B
43   C3   C
44   C4   D
45   C5   E
46   C6   F
47   C7   G
48   C8   H
49   C9   I
4A   CA   J
4B   CB   K
4C   CC   L
4D   CD   M
4E   CE   N
4F   CF   O
50   D0   P
51   D1   Q
52   D2   R
53   D3   S
54   D4   T
55   D5   U
56   D6   V
57   D7   W
58   D8   X
59   D9   Y
5A   DA   Z
5B   DB   [           Left bracket
5C   DC   \           Backslash
5D   DD   ]           Right bracket
5E   DE   ^           Caret
5F   DF   _           Underscore
60   E0   `           Accent grave
61   E1   a
62   E2   b
63   E3   c
64   E4   d
65   E5   e
66   E6   f
67   E7   g
68   E8   h
69   E9   i
6A   EA   j
6B   EB   k
6C   EC   l
6D   ED   m
6E   EE   n
6F   EF   o
70   F0   p
71   F1   q
72   F2   r
73   F3   s
74   F4   t
75   F5   u
76   F6   v
77   F7   w
78   F8   x
79   F9   y
7A   FA   z
7B   FB   {           Left brace
7C   FC   |           Vertical line
7D   FD   }           Right brace
7E   FE   ~           Tilde
7F   FF   DEL         delete, rubout