6502 Assembly Language Programming for Apple, Commodore, and Atari Computers
by Christopher Lampton
CHAPTER ONE
The One True Language
CHAPTER TWO
The Microprocessor Zone
CHAPTER THREE
Down Memory Lane
CHAPTER FOUR
A 6502 Vocabulary
CHAPTER FIVE
Addressing Modes
CHAPTER SIX
Using the Assembler
CHAPTER SEVEN
Waving the Flags
CHAPTER EIGHT
Decisions, Decisions
CHAPTER NINE
Input and Output
CHAPTER TEN
Doing It All
APPENDIX
The 6502 Instruction Set
For Further Reading
Index
THE ONE TRUE LANGUAGE
It may shock you to hear this, but there is only one programming
language that is understood by your computer.
Oh, yes, we know that you've been told about BASIC
and FORTRAN and Pascal and Logo and any of a number
of so-called high-level computer languages, with which one
can construct the complex lists of instructions known as
computer programs. You probably even know how to write
programs in one or more of these languages. But, truth to
tell, none of these languages is actually "understood" by the
computer. They are designed to be understood by human
beings. To the computer, programs written in these languages
are pure gibberish.
The language that is understood by a computer is called
machine language, and it is made up of electronic signals,
as would befit the language of an electronic device. All
computer programs must be written in, or translated into,
machine language before they are acceptable to a computer.
********
The key to understanding machine language is the micro-
processor. This tiny chip of silicon is the core of the central
processing unit (or CPU) of a microcomputer; within it
reside thousands of interconnected transistors, too small to
be seen with a conventional microscope but capable of pro-
cessing millions of electronic messages in a second. It is the
microprocessor that actually "obeys" the instructions in
our machine-language programs, turning those instructions
into actions. The microprocessor, the internal memory
(more about that in a moment), and a few other electronic
chips and parts make up the CPU board of a
microcomputer.
This book is concerned primarily with computers based
on a microprocessor called the 6502, and other microprocessors
(such as the 6510, 6509, and 7501) in its immediate
family. These computers include the Apple II series, the
Commodore PET, VIC, 64, and Plus/4, and the Atari 400,
800, and XL series.
A microprocessor alone does not a computer make,
01011110
Byte: Eight bits.
Instruction set: The set of all machine-language instructions
that can be executed by a given CPU.
10101000
168
TAY
01101001 00001111
$5A80
TAX: Transfer A to X.
TXA: Transfer X to A.
TAY: Transfer A to Y.
TYA: Transfer Y to A.
LDA $5608
This tells the microprocessor to copy the number currently
stored at memory address $5608 into the A register.
LDY $8A38
LDX #$A4
This loads the hexadecimal number $A4 into register X. The
number sign (#) tells us that this is the actual number to be
loaded rather than the address of the number to be loaded.
(The dollar sign [$], of course, tells us that it is in hexade-
cimal.)
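The difference between loading from an address and loading an actual number can be modelled in a few lines of Python. This is only an illustrative sketch: the `memory` dictionary, the `load` helper, and the sample value placed at $5608 are assumptions made for the example, not part of the 6502 or of any assembler.

```python
# Toy model of 6502 load instructions (illustrative only).
memory = {0x5608: 0x2A}   # assumed sample contents of address $5608


def load(operand, immediate=False):
    """Return the value a load instruction would place in a register."""
    if immediate:
        return operand & 0xFF          # '#': the operand is the number itself
    return memory.get(operand, 0)      # otherwise the operand is an address


a = load(0x5608)                 # like LDA $5608: copies the byte stored at $5608
x = load(0xA4, immediate=True)   # like LDX #$A4: loads the number $A4 itself
```

With the number sign, the register receives the operand itself; without it, the register receives whatever byte happens to be stored at that address.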
Instructions that copy a number from a microprocessor
register to a memory address are called store instructions,
because they "store" the number in memory for safekeep-
ing. These instructions all begin with the letters ST (for
"store"), followed by a letter identifying the register that
contains the value to be stored. Here are the mnemonics for
6502 store instructions:
STA $0801
LDY $9C0E
STY $3344
ADC $45FF
ADC #$F5
SBC $0C00
SBC #$0B
INC $5C93
DEC $ABC9
AND $4C50
AND #$77
LDA #$0F
AND $9999
STA $9999
ORA $4000
ORA #$34
LDA #$20
ORA $0405
STA $0405
EOR $4110
EOR #$45
LDA #$FF
ORA $C91C
STA $C91C
ASL A
will shift every digit in the A register one position to the left.
If the A register contains the number $D7 (or 11010111)
before this instruction is executed, then it will contain the
number $AE (or 10101110) after it is executed. Note that the
digit in the leftmost position disappears completely after
the shift takes place (though it is not completely lost, as we
shall see in the chapter on arithmetic). Similarly, the
instruction:
ASL $898A
LSR A
LSR $0AAA
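The shift instructions can be modelled with ordinary integer arithmetic. The following Python sketch is an illustration only; the function names `asl` and `lsr` are hypothetical helpers, and each returns the shifted byte together with the bit that was pushed out, which the 6502 places in its carry flag.

```python
def asl(value):
    """Shift an 8-bit value one position left; return (result, carry)."""
    carry = (value >> 7) & 1            # the leftmost bit is shifted out
    return (value << 1) & 0xFF, carry


def lsr(value):
    """Shift an 8-bit value one position right; return (result, carry)."""
    carry = value & 1                   # the rightmost bit is shifted out
    return (value & 0xFF) >> 1, carry


result, carry = asl(0xD7)   # $D7 (11010111) becomes $AE (10101110), carry = 1
```

The bit that leaves the byte is not discarded; it is kept in the carry, which is why it is "not completely lost."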
JMP $3990
JSR $C09F
more than one return address in the stack. How does the
6502 know which one to retrieve?
The secret is in the SP register. SP is short for stack
pointer; as its name implies, the SP register "points" at the
stack. That is, it always contains the address within the
stack at which the next return address will be stored,
assuming an RTS instruction is not encountered in the
meantime. For instance, if the SP register currently contains
the number $E4 and we execute a JSR instruction, the
two bytes of the return address for that JSR will be placed
in the stack at address $01E4 (the address "pointed" to by
the stack pointer) and address $01E3 (the address immediately
before it). The SP register will then automatically be
decremented twice so that it will point at address $01E2, the
address at which the next return address will be stored. If
another JSR is encountered before an RTS is encountered,
the return address for that JSR is placed at $01E2, and the
SP register will be decremented to point at $01E0.
Suppose, now, that an RTS is encountered. The 6502 will
increment the value of the SP register twice, to point at the
most recent return address. It will fetch the address from
the stack and return program control to that address. When
the second RTS is encountered, the 6502 will increment the
SP register yet again, to point to the earlier return address.
It will then return to that address.
The point of using a stack is that the most recent value
placed on the stack must always be the first value removed
from it-what programmers call a LIFO ("last-in first-
out") structure. This ensures that RTS return addresses are
always executed in the correct order. Note, however, that
the stack is only capable of holding 128 return addresses at
one time. Thus, it is impossible to nest subroutines more
than 128 levels deep. (It is unlikely that you will ever need
to nest subroutines this deeply, however.)
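The behavior of the stack pointer described above can be sketched in Python. This is a simplified model, not a description of the chip's internals: the class name `Stack6502` and its methods are hypothetical, and the sketch stores the return address exactly as given, whereas the real 6502 stores a value one less and corrects it on return.

```python
class Stack6502:
    """Toy model of the 6502 stack page ($0100-$01FF) and SP register."""

    def __init__(self, sp=0xFF):
        self.page = [0] * 256     # index n stands for address $0100 + n
        self.sp = sp

    def push(self, byte):
        self.page[self.sp] = byte & 0xFF
        self.sp = (self.sp - 1) & 0xFF    # SP moves down after each store

    def pull(self):
        self.sp = (self.sp + 1) & 0xFF    # SP moves up before each fetch
        return self.page[self.sp]

    def jsr(self, return_address):
        self.push(return_address >> 8)    # high byte first
        self.push(return_address & 0xFF)  # then low byte

    def rts(self):
        low = self.pull()
        high = self.pull()
        return (high << 8) | low


stack = Stack6502(sp=0xE4)
stack.jsr(0x4003)          # bytes land at $01E4 and $01E3; SP becomes $E2
stack.jsr(0x5010)          # bytes land at $01E2 and $01E1; SP becomes $E0
first_return = stack.rts()     # most recent address comes back first
second_return = stack.rts()    # then the earlier one
```

Because each return address occupies two of the 256 bytes in the stack page, the limit of 128 nested subroutines follows directly.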
As we shall see in a moment, there are instructions that
allow the programmer to access the 6502 stack directly.
These instructions should be used with great care, however,
BEQ: branch on equal
BNE: branch on not equal
BCC: branch on carry clear
BCS: branch on carry set
BMI: branch on minus
BPL: branch on plus
BVS: branch on overflow set
BVC: branch on overflow clear
JMP $8392
4C 92 83
F0 3E
********
These are the major categories of 6502 instructions. Here
are a few miscellaneous instructions that may prove useful
from time to time:
Or, in hexadecimal:
8D 0B EC
or, alternatively:
STA $45
10000101 01000101
85 45
LDX $0980,X
ter as index: ADC, AND, CMP, EOR, LDA, LDX, ORA, SBC,
STA.
Indexed addressing may be used with zero-page
addressing (that is, a zero-page address plus an index val-
ue), but in this case only the X register may be used as an
index. The following instructions can be used with indexed
zero-page addressing: ADC, AND, ASL, CMP, DEC, EOR, INC,
LDA, LDY, LSR, ORA, ROL, ROR, SBC, STA, STY.
JMP ($A890)
Note that the index notation (,X) is now inside the paren-
theses rather than outside them. This indicates that the
index value is to be added to the zero-page address itself,
rather than to the address contained at that address.
Suppose that the X register contains the
number $30. We add this to address $00A4, which produces
an address of $00D4. This is the address that actually contains
the indirect address. If address $00D4 contains the
number $67, and address $00D5 contains the number $B4,
then the effective address of the instruction is $B467, and it
is at this address that we will find (or place) our data.
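The address calculation for indexed indirect addressing can be written out as a short Python sketch. The function name `indexed_indirect` and the dictionary standing in for memory are assumptions for illustration; the numbers use a zero-page base of $A4 and an index of $30, whose sum is $D4.

```python
def indexed_indirect(memory, zero_page_base, x):
    """Return the effective address for an operand of the form (base,X)."""
    pointer = (zero_page_base + x) & 0xFF     # index is added first, staying on the zero page
    low = memory[pointer]                     # low byte of the indirect address
    high = memory[(pointer + 1) & 0xFF]       # high byte is in the next location
    return (high << 8) | low


memory = {0xD4: 0x67, 0xD5: 0xB4}             # assumed sample contents
effective = indexed_indirect(memory, 0xA4, 0x30)
```

Note that the two bytes are stored low byte first, so $67 followed by $B4 yields the address $B467.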
Indexed indirect addressing allows us to store entire
tables of addresses on the zero page. However, because
zero-page space is at a premium on most 6502-based com-
puters, it is unlikely that you will often have a chance to do
so. This is probably the least used of all 6502 addressing
modes.
Like indirect indexed addressing, indexed indirect
addressing has its limitations. The indirect address must be
stored on the zero page and only the X register can be used
for indexing. The instructions that can use indexed indirect
addressing are: ADC, AND, CMP, EOR, LDA, ORA, SBC,
STA.
LDY #$45
CMP #$BB
********
Those are the 6502 addressing modes. We will be using
most of them in upcoming chapters. If you don't under-
stand how they work now, be prepared to refer back to this
chapter at appropriate moments. The 6502 addressing
modes are every bit as important to the programmer as the
actual instructions in the instruction set, and a good under-
standing of them will make you a good assembly-language
programmer.
USING THE ASSEMBLER
; DEMONSTRATION PROGRAM
; Illustrates assembler program file format
WMSTRT=0
OUTCHR=$FF56
*=$4000
        LDY #0                  ; Initialize index
LOOP    LDA MESSG1,Y            ; Get next character
        CMP #0                  ; Are we done yet?
        BEQ REBOOT              ; If so, return to system
        JSR OUTCHR              ; Else print it
        INY                     ; Advance to next character
        JMP LOOP                ; And go for more
REBOOT  JMP WMSTRT              ; Reinitialize system
MESSG1  .BYTE 'HELLO, THERE!',0
********
Perhaps the most important feature of any assembler is the
ability it gives us to define labels. A label is a word that
represents a number. We may assign a number to a label
either through an equate statement or by referring to the
label in the label field.
Equate statements resemble assignment statements in
BASIC. They usually consist of the name of the label, an
equals sign, and the numeric value we wish to assign to the
label. There are two equate statements in the demonstration
program. The first looks like this: WMSTRT=0. This
assigns a value of 0 to the label WMSTRT. Similarly, the
equate statement OUTCHR=$FF56 assigns the value $FF56
to the label OUTCHR.
Once this assignment has been made, the assembler will
treat the labels as though they were these numbers. We may
.WORD $4005,7
LDA STORE-9
THERE=HERE-$25
This sets the label THERE equal to the value of label HERE
minus $25.
Most assemblers offer the standard operators for addition
(+), subtraction (-), multiplication (*), and division
(/), though the division is usually integer division; that is, any
fractional portion of the result will be dropped. In addition,
some assemblers offer further operators such as
AND and OR. Check your manual for details.
Two more directives that you will find useful are < and
>. These directives, when used at the beginning of a label
representing a sixteen-bit number, tell the assembler that
you only wish to use the low byte (that is, the rightmost
eight bits) or the high byte (the leftmost eight bits) of the
number, respectively. For instance, suppose that label
FIRST is equal to the 16-bit number $5690. If we write the
label as <FIRST, the assembler will treat it as though it
were the number $90, the low byte of $5690. Similarly, if
we write the label as >FIRST, the assembler will treat it as
though it were the number $56, the high byte of $5690.
Why would we want to do this? Well, suppose that we
wanted to store the value of the label FIRST at a zero-page
memory location, where it would become a pointer to a
value elsewhere in memory. We cannot place it in memory
with a single sequence of load and store instructions
because it is a sixteen-bit number and load and store
instructions can only manipulate eight-bit numbers. Thus,
we must break it into two halves, each eight bits long, and
store them separately, like this:
LDA #<FIRST
STA $05
LDA #>FIRST
STA $06
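The effect of the < and > operators, and of storing the two halves in consecutive zero-page locations, can be sketched in Python. The helper names `low_byte` and `high_byte` and the `zero_page` dictionary are illustrative assumptions; FIRST is given the value $5690 used in the text.

```python
def low_byte(value):
    """Equivalent of the assembler's < operator: the rightmost eight bits."""
    return value & 0xFF


def high_byte(value):
    """Equivalent of the assembler's > operator: the leftmost eight bits."""
    return (value >> 8) & 0xFF


FIRST = 0x5690
zero_page = {}
zero_page[0x05] = low_byte(FIRST)     # like LDA #<FIRST / STA $05
zero_page[0x06] = high_byte(FIRST)    # like LDA #>FIRST / STA $06

# Reassembling the pointer the way the 6502 reads it: low byte first.
pointer = zero_page[0x05] | (zero_page[0x06] << 8)
```

The low byte goes in the lower address and the high byte in the one after it, which is the order the 6502 expects for any sixteen-bit pointer.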
LDA #$45
STA $5600
LDA #$3F
CLC
ADC $5600
56
23
You would then add the two numbers a column at a time,
adding the numbers in the rightmost column (6 and 3) and
writing the result (9) in the rightmost column of the answer,
then adding the numbers in the second column (5 and 2)
and placing the result (7) in the leftmost column of the
answer. The answer, of course, is 79.
Suppose, however, that the result of one of these two
additions had been a two-digit number, that is, a number
that would not fit into the single digit position that we allotted
for it in the answer. What we would do, as you probably
know, is take the extra digit and use it as a carry; that is,
add it into the total in the next column. For instance, if we
wished to add the numbers 45 and 98, we would place them,
one above the other, like this:
45
98
4A91
334A
7DDB
ADDEM   CLC
        LDA NUMBR1
        ADC NUMBR2
        STA NUMBR1
        LDA NUMBR1+1
        ADC NUMBR2+1
        STA NUMBR1+1
        RTS
SUBEM   SEC
        LDA NUMBR1
        SBC NUMBR2
        STA NUMBR1
        LDA NUMBR1+1
        SBC NUMBR2+1
        STA NUMBR1+1
        RTS
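The two routines above work a byte at a time, passing the carry from the low byte to the high byte. A Python sketch of the same idea follows. The function names are hypothetical, and the model covers binary arithmetic only; it ignores the 6502's decimal mode and its other flags.

```python
def adc(a, operand, carry):
    """Add with carry on one byte; return (result, new carry)."""
    total = a + operand + carry
    return total & 0xFF, 1 if total > 0xFF else 0


def sbc(a, operand, carry):
    """Subtract with carry on one byte; a clear carry means 'borrow'."""
    total = a - operand - (1 - carry)
    return total & 0xFF, 0 if total < 0 else 1


def add16(n1, n2):
    low, carry = adc(n1 & 0xFF, n2 & 0xFF, 0)      # CLC, then add low bytes
    high, carry = adc(n1 >> 8, n2 >> 8, carry)     # high bytes plus any carry
    return (high << 8) | low


def sub16(n1, n2):
    low, carry = sbc(n1 & 0xFF, n2 & 0xFF, 1)      # SEC, then subtract low bytes
    high, carry = sbc(n1 >> 8, n2 >> 8, carry)     # high bytes minus any borrow
    return (high << 8) | low


total = add16(0x4A91, 0x334A)    # $4A91 + $334A = $7DDB
```

Clearing the carry before the first addition and setting it before the first subtraction ensures that nothing left over from earlier instructions leaks into the low byte.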
ASL A
ASL A
ASL A
ASL A
ASL A
ASL A
LSR A
LSR A
JMP NEXT
ZERO
NEXT
is one of the instructions that affects the zero flag. The BEQ
instruction tests the zero flag and branches to routine ZERO
if the value just tested is equal to 0 (that is, if the zero flag
is set). Otherwise, the flow of the
program automatically proceeds to routine NOZERO. At the
end of NOZERO, the instruction JMP NEXT guides the flow
of the program around routine ZERO so that it will not be
executed as well.
This is roughly equivalent to this statement in
BASIC:
IF N1=0 THEN ... ELSE ...
LDA NUMBR1
BNE NOZERO
ZERO
JMP NEXT
NOZERO
NEXT
where the statements following the THEN and the ELSE are
equivalent to ZERO and NOZERO, respectively.
We can also simulate more complex IF-THEN logic.
Consider this BASIC statement:
IF N1<N2 THEN ... ELSE ...
LDA NUMBR1
CMP NUMBR2
BCC LESTHN
NOLESS
JMP NEXT
LESTHN
NEXT
LDA NUMBR1
CMP NUMBR2
BCS NOLESS
LESTHN
JMP NEXT
NOLESS
NEXT
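Both listings rely on the way CMP sets the carry flag: after a comparison, the carry is set if the value in A is greater than or equal to the operand and clear if it is less. The Python sketch below models only that one flag for unsigned bytes; the function names are hypothetical, and the routine names returned are the labels used in the listings.

```python
def cmp_carry(a, operand):
    """Carry flag after CMP: 1 when A >= operand (unsigned), else 0."""
    return 1 if (a & 0xFF) >= (operand & 0xFF) else 0


def routine_taken(numbr1, numbr2):
    """Which routine runs for LDA NUMBR1 / CMP NUMBR2 / BCC LESTHN."""
    carry = cmp_carry(numbr1, numbr2)
    return "LESTHN" if carry == 0 else "NOLESS"   # BCC branches when carry is clear


branch = routine_taken(3, 7)    # 3 is less than 7, so LESTHN runs
```

Using BCS instead of BCC reverses the test, which is why the second listing can place LESTHN directly after the branch.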
CMP NUMBR1
BCS NEXT
JMP HELLO
NEXT
10 I = 0
20 .. (body of the loop) ..
30 I = I + 1
40 IF I<10 THEN GOTO 20
50 .. (rest of the program) ..
        LDA #$00
        STA INDEX
        LDA #$20
        STA INDEX+1
LOOP    .. (body of loop) ..
        DEC INDEX
        BNE LOOP
        DEC INDEX+1
        BNE LOOP
NEXT
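To see how many times the body of this loop runs, the two-byte counter can be traced in Python. This is a sketch for checking the arithmetic; the function name `count_iterations` is hypothetical, and the starting values are the $00 and $20 loaded in the listing.

```python
def count_iterations(low=0x00, high=0x20):
    """Trace the DEC/BNE loop and count executions of the loop body."""
    iterations = 0
    while True:
        iterations += 1                  # body of loop
        low = (low - 1) & 0xFF           # DEC INDEX ($00 wraps around to $FF)
        if low != 0:                     # BNE LOOP
            continue
        high = (high - 1) & 0xFF         # DEC INDEX+1
        if high != 0:                    # BNE LOOP
            continue
        return iterations                # fall through to NEXT


passes = count_iterations()              # 8192 passes, which is $2000
```

Because the low byte starts at zero, each decrement of the high byte corresponds to exactly 256 passes, giving 32 x 256 passes in all.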
; SUBROUTINE OUTPUT-LINE
; Outputs a string of characters
; to the video display. The address
; of the string must be stored at
; location PNTER. The string must
; be terminated with a 0 byte.
LDA #<MSG1 and LDA #>MSG1 load the low and high
byte, respectively, of the address of MSG1 into PNTER.
Just as important as output to the video display is input
from the keyboard. Without it, we could have only limited
real-time interaction with the computer (though some
recent computer models make extensive and creative uses
of input devices such as the mouse, which point to options
presented on the screen).
The Commodore Kernal contains a routine called
GETIN, at memory address $FFE4, which handles keyboard
input. Call this routine with a JSR instruction and the
ASCII code of the character most recently typed at the key-
board will be placed in the A register. If no character has
been typed, a 0 will be placed in the A register.
We can combine this routine with the CHROUT routine
to create a short, simple program that will allow us to type
on the keyboard and see the characters echoed to the dis-
play. Here's the program:
*=$4000
;
SCECHO  JSR GETIN
        CMP #0
        BEQ SCECHO
        JSR CHROUT
        JMP SCECHO
; SUBROUTINE INPUT-LINE
; Accepts a string of characters from
; the keyboard terminating with a
; carriage return. Stores string
; at location STRING and auto-
; matically terminates string
; with 0 byte.
;
GETIN=$FFE4
;
;
*=$4500
INLINE  LDA #<STRING    ; GET LOW BYTE OF STORAGE ADDRESS
        STA PNTER       ; STORE IN POINTER
        LDA #>STRING    ; GET HIGH BYTE OF STORAGE ADDRESS
        STA PNTER+1     ; STORE IN POINTER
        LDY #0          ; INITIALIZE INDEX
INLOOP  JSR GETIN       ; GET CHARACTER
        CMP #0          ; NOTHING TYPED?
        BEQ INLOOP      ; IF SO, TRY AGAIN
        JSR CHROUT      ; ELSE SHOW US WHAT WE TYPED
        STA (PNTER),Y   ; AND SAVE IT
        INY             ; POINT TO NEXT STORAGE ADDRESS
        BNE INNEXT      ; AND CONTINUE
REPEAT
  Prompt user for input.
  REPEAT
    Get number from keyboard.
    Store in array.
  UNTIL 10 NUMBERS ARE RECEIVED
  Initialize total to 0.
  Initialize pointer to first array element.
  REPEAT
    Add current array element to total.
    Advance pointer to next array element.
  UNTIL COMPLETE ARRAY TOTALED
  Display total.
  Ask if user wants to repeat program.
UNTIL USER DOES NOT WANT TO REPEAT
This, of course, is only one possible outline for the pro-
gram. We could, for instance, add each number to the pre-
vious number as they are input at the keyboard, thus
removing the need to store the numbers in an array. How-
ever, retaining the array allows us to illustrate certain prin-
ciples of data storage.
To contrast our machine-language solution to this prob-
lem with a typical high-level language solution, let's trans-
late this program first into BASIC. A BASIC version of this
outline might look like this:
10 DIM A(10)
20 PRINT "Type ten numbers, pressing RETURN after each."
30 FOR I=0 TO 9
40 INPUT A(I)
50 NEXT I
60 T=0
70 FOR I=0 TO 9
80 T=T+A(I)
90 NEXT I
100 PRINT "The total of the ten numbers is ";T
        STA TOTAL+1
        INX
        CPX #10
        BNE LOOP2
        LDA #<MSG2      ; POINT TO 2ND PROMPT
        STA PNTER
        LDA #>MSG2
        STA PNTER+1
        JSR OUTLIN      ; PRINT IT
        LDA #<BUFFER    ; POINT TO BUFFER
        STA PNTER       ; (WHERE TOTAL WILL BE)
        LDA #>BUFFER
        STA PNTER+1
        LDA TOTAL+1     ; GET HI-BYTE OF TOTAL
        JSR BINHEX      ; CONVERT TO HEX
        JSR OUTLIN      ; AND PRINT IT
        LDA TOTAL       ; GET LO-BYTE OF TOTAL
        JSR BINHEX      ; CONVERT TO HEX
        JSR OUTLIN      ; AND PRINT IT
        LDA #CR         ; ADD A CARRIAGE RETURN
        JSR CHROUT
        LDA #<MSG3      ; POINT TO 'AGAIN? ' PROMPT
        STA PNTER
        LDA #>MSG3
        STA PNTER+1
        JSR OUTLIN      ; PRINT IT
LOOP3   JSR GETIN       ; LOOK FOR RESPONSE
        CMP #0          ; NOTHING TYPED?
        BEQ LOOP3       ; LOOK AGAIN
        CMP #'Y         ; WAS IT 'Y'?
        BNE NEXT        ; IF NOT, SKIP
        JMP ADDTEN      ; ELSE REPEAT PROGRAM
NEXT    CMP #'N         ; WAS IT 'N'?
        BNE LOOP3       ; IF NOT, LOOK AGAIN
        JMP WMSTRT      ; ELSE END PROGRAM
;
; **LINE OUTPUT ROUTINE**
;
; OUTPUTS A STRING OF TEXT TO THE
; VIDEO DISPLAY. ADDRESS OF
; STRING MUST BE AT LOCATION
; 'PNTER'. STRING MUST TERMINATE
; WITH A 0 BYTE.
;
OUTLIN  LDY #0          ; POINT TO FIRST CHARACTER
OLLOOP  LDA (PNTER),Y   ; GET CHARACTER
        CMP #0          ; ARE WE FINISHED?
;
BINHEX  TAY             ; SAVE BINARY NUMBER
        AND #$F0        ; REMOVE LEAST SIGNIFICANT DIGIT
        LSR A           ; SHIFT HIGH HALF TO LOW HALF
        LSR A
        LSR A
        LSR A
        JSR CONVT1      ; ADD ASCII VALUE
        STA BUFFER      ; SAVE IN BUFFER
        TYA             ; GET BINARY NUMBER
        AND #$0F        ; REMOVE MOST SIGNIFICANT DIGIT
        JSR CONVT1      ; ADD ASCII VALUE
        STA BUFFER+1    ; SAVE IN BUFFER
        LDA #0
        STA BUFFER+2
        RTS
CONVT1  CLC             ; ADD ASCII VALUE
        ADC #48
        CMP #58         ; WAS IT 'A' OR GREATER?
        BCC CVT12       ; IF NOT, RETURN
        CLC
        ADC #7          ; ELSE ADD ANOTHER 7
CVT12   RTS
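The logic of BINHEX and CONVT1 can be checked with a short Python sketch. This is an illustration of the same steps rather than a replacement for the routine; the function names simply mirror the labels in the listing, and the result is returned as a byte string instead of being stored at BUFFER.

```python
def convt1(nibble):
    """Turn a value from 0 to 15 into the ASCII code of its hex digit."""
    value = nibble + 48          # ADC #48: '0' has ASCII code 48
    if value >= 58:              # past '9' (ASCII 57)?
        value += 7               # skip ahead so 10 becomes 'A' (ASCII 65)
    return value


def binhex(byte):
    """Return the two hex digits for a byte, followed by a terminating 0."""
    high = convt1((byte & 0xF0) >> 4)    # AND #$F0, then four LSR A instructions
    low = convt1(byte & 0x0F)            # AND #$0F
    return bytes([high, low, 0])


text = binhex(0x3F)              # the characters '3' and 'F', then a 0 byte
```

The extra 7 is needed because the ASCII codes for the letters A through F do not follow directly after the codes for the digits 0 through 9.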
;
; **HEX-BINARY CONVERSION ROUTINE**
;
; CONVERTS A 2-DIGIT HEXADECIMAL
; STRING TO AN 8-BIT BINARY
; NUMBER. EXPECTS HEX STRING AT
; LOCATION 'BUFFER'. LEAVES
; BINARY NUMBER IN ACCUMULATOR.
;
HEXBIN  LDA BUFFER      ; GET 1ST CHARACTER
        JSR CONVT2      ; SUBTRACT ASCII VALUE
        ASL A           ; SHIFT TO HIGH HALF
        ASL A
        ASL A
        ASL A
        STA TEMP        ; SAVE HIGH HALF
        LDA BUFFER+1    ; GET NEXT CHARACTER
        JSR CONVT2      ; SUBTRACT ASCII VALUE
        ORA TEMP        ; COMBINE WITH HIGH HALF
        RTS
CONVT2  SEC             ; SUBTRACT ASCII VALUE
        SBC #48
        CMP #10         ; WAS IT 'A' TO 'F'?
        BCC CVT22       ; IF NOT, RETURN
        SEC
immediate: I
zero page: Zp
zero page, X: Zpx
zero page, Y: Zpy
absolute: A
absolute, X: Ax
absolute, Y: Ay
implied: Im
relative: R
indirect, X: Ix
indirect, Y: Iy
indirect: In
negative: N
zero: Z
carry: C
interrupt: I
decimal mode: D
overflow: V