Assembly Language Programming
programs, even when using HLLs.

Assembly language is the uncontested speed champion among programming
languages. An expert assembly language programmer will almost always produce a
faster program than an expert C programmer. While certain programs may not
benefit much from implementation in assembly, you can speed up many programs by
a factor of five or ten over their HLL counterparts by careful coding in assembly
language; even greater improvement is possible if you're not using an
optimizing compiler. Alas, speedups on the order of five to ten times are generally
not achieved by beginning assembly language programmers. However, if you spend
the time to learn assembly language really well, you too can achieve these
impressive performance gains.

Despite some people's claims that programmers no longer have to worry about
memory constraints, there are many programmers who need to write smaller
programs. Assembly language programs are often less than one-half the size of
comparable HLL programs. This is especially impressive when you consider the fact
that data items generally consume the same amount of space in both types of
programs, and that data is responsible for a good amount of the space used by a
typical application. Saving space saves money. Pure and simple. If a program
requires 1.5 megabytes, it will not fit on a 1.44 Mbyte floppy. Likewise, if an
application requires 2 megabytes RAM, the user will have to install an extra
megabyte if there is only one available in the machine. Even on big machines with
32 or more megabytes, writing gigantic applications isn't excusable. Most users put
more than eight megabytes in their machines so they can run multiple programs
from memory at one time. The bigger a program is, the fewer applications will be
able to coexist in memory with it. Virtual memory isn't a particularly attractive
solution either. With virtual memory, the bigger an application is, the slower the
system will run as a result of that program's size.

Capability is another reason people resort to assembly language. HLLs are an
abstraction of a typical machine architecture. They are designed to be independent of
the particular machine architecture. As a result, they rarely take into account any
special features of the machine, features which are available to assembly language
programmers. If you want to use such features, you will need to use assembly
language. A really good example is the input/output instructions available
on the 80x86 microprocessors. These instructions let you directly access certain I/O
devices on the computer. In general, such access is not part of any high level
language. Indeed, some languages like C pride themselves on not supporting any
specific I/O operations. In assembly language you have no such restrictions.
Anything you can do on the machine you can do in assembly language. This is
definitely not the case with most HLLs.

Of course, another reason for learning assembly language is just for the knowledge.
Now some of you may be thinking, "Gee, that would be wonderful, but I've got lots
to do. My time would be better spent writing code than learning assembly language."
There are some practical reasons for learning assembly, even if you never intend to
write a single line of assembly code. If you know assembly language well, you'll
have an appreciation for the compiler, and you'll know exactly
what the compiler is doing with all those HLL statements. Once you see how
compilers translate seemingly innocuous statements into a ton of machine code,
you'll want to search for better ways to accomplish the same thing. Good assembly
language programmers make better HLL programmers because they understand the
limitations of the compiler and they know what it's doing with their code. Those who
don't know assembly language will accept the poor performance their
compiler produces and simply shrug it off.


Representation of numbers in binary
---------------------------------------------------------------------

Before we begin to understand how to program in assembly it is best to try to
understand how numbers are represented in computers. Numbers are stored in
binary, base two. There are several terms which are used to describe different size
numbers and I will describe what these mean.

1 BIT: 0

One bit is the simplest piece of data that exists. It's either a one or a zero.

1 NIBBLE: 0000
4 BITS

The nibble is four bits or half a byte. Note that it has a maximum value of 15 (1111
= 15). This is the basis for the hexadecimal (base 16) number system which is used
as it is far easier to understand.
Hexadecimal digits go from 0 to F, and hexadecimal numbers are followed by an h to
state that they are in hex, e.g. Fh = 15 decimal. Hexadecimal numbers that begin with
a letter are prefixed with a 0 (zero).

1 BYTE 00000000
2 NIBBLES
8 BITS

A byte is 8 bits or 2 nibbles. A byte has a maximum value of FFh (255 decimal).
Because a byte is 2 nibbles the hexadecimal representation is two hex digits in a row,
e.g. 3Dh. The byte is also the size of the 8-bit registers which we will be covering
later.

1 WORD 0000000000000000
2 BYTES
4 NIBBLES
16 BITS

A word is two bytes that are stuck together. A word has a maximum value of FFFFh
(65,535). Since a word is four nibbles, it is represented by four hex digits. This is
the size of the 16-bit registers.


Registers
---------------------------------------------------------------------

Registers are a place in the CPU where a number can be stored and manipulated.
There are three sizes of registers: 8-bit, 16-bit and on 386 and above 32-bit. There
are four different types of registers: general purpose registers, segment registers,
index registers and stack registers. Firstly here are descriptions of the main registers.
Stack registers and segment registers will be covered later.

General Purpose Registers
---------------------------------------------------------------------

These are 16-bit registers. There are four general purpose registers:
AX, BX, CX and DX.
They are split up into 8-bit registers. AX is split up into AH which contains the high
byte and AL which contains the low
byte. On 386's and above there are also 32-bit registers; these have the same names
as the 16-bit registers but with an 'E' in front, e.g. EAX. You can use AL, AH, AX and
EAX separately and treat them as separate registers for some tasks.

CPU registers are very special memory locations constructed from flip-flops. They
are not part of main memory; the CPU implements them on-chip. Various members
of the 80x86 family have different register sizes. The 886, 8286, 8486, and 8686
(x86 from now on) CPUs have exactly four registers, all 16 bits wide.
All arithmetic and logical operations occur in the CPU registers.

Because the x86 processor has so few registers, we'll give each register its own name
and refer to it by that name rather than its address. The names for the x86
registers are

AX - The accumulator register
BX - The base address register
CX - The count register
DX - The data register

Besides the above registers, which are visible to the programmer, the x86 processors
also have an instruction pointer register which contains the address of the next
instruction to execute. There is also a flags register that holds the result of a
comparison. The flags register remembers if one value was less than, equal to, or
greater than another value.

Because registers are on-chip and handled specially by the CPU, they are much
faster than memory. Accessing a memory location requires one or more clock cycles.
Accessing data in a register usually takes zero clock cycles. Therefore, you should
try to keep variables in the registers. Register sets are very small and most registers
have special purposes which limit their use as variables, but they are still an
excellent place to store temporary data.

If AX contained 24689 decimal:

   AH       AL
01100000 01110001

AH would be 96 and AL would be 113. If you added one to AL it would be 114 and
AH would be unchanged.

SI, DI, SP and BP can also be used as general purpose registers but have more
specific uses. They are not split into two halves.
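The AH/AL split in the 24689 example can be tried out with a few instructions. This is only a minimal sketch in the same TASM-style syntax used elsewhere in this text, assuming a 16-bit DOS program built with the small memory model; the label Start and the use of BX as a scratch register are illustrative choices.

```asm
;Sketch: AH and AL are the two halves of AX.
        .model small
        .stack 100h
        .code
Start:
        mov     ax,24689        ;AX = 6071h, so AH = 60h (96), AL = 71h (113)
        inc     al              ;AL = 72h (114); AH is unchanged
        mov     bl,ah           ;BL = 96, a copy of the high byte
        mov     bh,al           ;BH = 114, a copy of the low byte

        mov     ax,4C00h        ;return to DOS
        int     21h
        end     Start
```

Stepping through this in a debugger shows AX changing from 6071h to 6072h after the INC, which is the same result as adding one to the whole 16-bit register as long as AL does not wrap past FFh.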
Index Registers
---------------------------------------------------------------------

These are sometimes called pointer registers and they are 16-bit registers. They are
mainly used for string instructions. There are three index registers: SI (source index),
DI (destination index) and IP (instruction pointer). On 386's and above there are also
32-bit index registers: EDI and ESI. You can also use BX to index strings. IP is an
index register but it can't be manipulated directly as it stores the address of the next
instruction.


Stack registers
---------------------------------------------------------------------

BP and SP are stack registers and are used when dealing with the stack. They will be
covered when we talk about the stack later on.


Segments and offsets
---------------------------------------------------------------------

You cannot discuss memory addressing on the 80x86 processor family without first
discussing segmentation. Among other things, segmentation provides a powerful
memory management mechanism. It allows programmers to partition their programs
into modules that operate independently of one another. Segments provide a way to
easily implement object-oriented programs. Segments allow two processes to easily
share data. All in all, segmentation is a really neat feature. On the other hand, if you
ask ten programmers what they think of segmentation, at least nine of the ten will
claim it's terrible. Why such a response?

Well, it turns out that segmentation provides one other nifty feature: it allows you to
extend the addressability of a processor. In the case of the 8086, segmentation let
Intel's designers extend the maximum addressable memory from 64K to one
megabyte. Gee, that sounds good. Why is everyone complaining? Well, a little
history lesson is in order to understand what went wrong.

In 1976, when Intel began designing the 8086 processor, memory was very
expensive. Personal computers, such as they were at the time, typically had four
thousand bytes of memory. Even when IBM introduced the PC five years later, 64K
was still quite a bit of memory; one megabyte was a tremendous amount. Intel's
designers felt that 64K memory would remain a large amount throughout the
lifetime of the 8086. The only mistake they made was completely underestimating
the lifetime of the 8086. They figured it would last about five years, like their earlier
8080 processor. They had plans for lots of other processors at the time, and "86" was
not a suffix on the names of any of those. Intel figured they were set. Surely one
megabyte would be more than enough to last until they came out with something
better.

Unfortunately, Intel didn't count on the IBM PC and the massive amount of software
to appear for it. By 1983, it was very clear that Intel could not abandon the 80x86
architecture. They were stuck with it, but by then people were running up against the
one megabyte limit of 8086. So Intel gave us the 80286. This processor could
address up to 16 megabytes of memory. Surely more than enough. The only problem
was that all that wonderful software written for the IBM PC
was written in such a way that it couldn't take advantage of any memory beyond one
megabyte.

It turns out that the maximum amount of addressable memory is not everyone's main
complaint. The real problem is that the 8086 was a 16 bit processor, with 16 bit
registers and 16 bit addresses. This limited the processor to addressing 64K chunks
of memory. Intel's clever use of segmentation extended this to one megabyte, but
addressing more than 64K at one time takes some effort. Addressing more than
256K at one time takes a lot of effort.

Despite what you might have heard, segmentation is not bad. In fact, it is a really
great memory management scheme. What is bad is Intel's 1976 implementation of
segmentation still in use today. You can't blame Intel for this - they fixed the
problem in the 80's with the release of the 80386. The real culprit is MS-DOS that
forces programmers to continue to use 1976 style segmentation. Fortunately, newer
operating systems such as Linux, UNIX, Windows 9x, Windows NT, and
OS/2 don't suffer from the same problems as MS-DOS. Furthermore, users finally
seem to be more willing to switch to these newer operating systems so programmers
can take advantage of the new features of the 80x86 family.

With the history lesson aside, it's probably a good idea to figure out what
segmentation is all about. Consider the current view of memory: it looks like a linear
array of bytes. A single index (address) selects some particular byte from that array.
Let's call this type of addressing linear or flat addressing. Segmented addressing uses
two components to specify a memory location: a segment value and an offset within
that segment. Ideally, the segment and offset values are independent of one another.
The best way to describe segmented addressing is with a two-dimensional array. The
segment provides one of the indices into the array, the offset provides the other.

Now you may be wondering, "Why make this process more complex?" Linear
addresses seem to work fine, why bother with this two dimensional addressing
scheme? Well, let's consider the way you typically write a program. If you were to
write, say, a SIN(X) routine and you needed some temporary variables, you probably
would not use global variables. Instead, you would use local variables inside the
SIN(X) function. In a broad sense, this is one of the features that segmentation offers
- the ability to attach blocks of variables (a segment) to a particular piece of code.
You could, for example, have a segment containing local variables for SIN, a
segment for SQRT, a segment for DRAWWindow, etc. Since the variables for SIN
appear in the segment for SIN, it's less likely your SIN routine will affect the
variables belonging to the SQRT routine. Indeed, on the 80286 and later operating in
protected mode, the CPU can prevent one routine from accidentally modifying the
variables in a different segment.

The size of the offset limits the maximum size of a segment. On the 8086 with 16 bit
offsets, a segment may be no longer than 64K; it could be smaller (and most
segments are), but never larger. The 80386 and later processors allow 32 bit offsets
with segments as large as four gigabytes.

The segment portion is 16 bits on all 80x86 processors. This lets a single program
have up to 65,536 different segments in the program. Most programs have fewer than
16 segments (or thereabouts) so this isn't a practical limitation.

Of course, despite the fact that the 80x86 family uses segmented addressing, the
actual (physical) memory connected to the CPU is still a linear array of bytes. There
is a function that converts the segment value to a physical memory address. The
processor then adds the offset to this physical address to obtain the actual address of
the data in memory. This text will refer to addresses in your programs as segmented
addresses or logical addresses. The actual linear address that
appears on the address bus is the physical address.

On the 8086, 8088, 80186, and 80188 (and other processors operating in real mode),
the function that maps a segment to a physical address is very simple. The CPU
multiplies the segment value by sixteen (10h) and adds the offset portion. For
example, consider the segmented address: 1000:1F00. To convert this to a physical
address you multiply the segment value (1000h) by sixteen. Multiplying by the radix
is very easy. Just append a zero to the end of the number. Appending a zero to 1000h
produces 10000h. Add 1F00h to this to obtain 11F00h. So 11F00h is the physical
address that corresponds to the segmented address 1000:1F00.
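The real-mode conversion of 1000:1F00 can also be worked through in code. The fragment below is a rough sketch rather than anything the CPU exposes to programs: it repeats the multiply-by-sixteen-and-add arithmetic by hand, assuming TASM-style syntax and the small memory model, and keeps the 20-bit result in the register pair DX:AX because it does not fit in 16 bits.

```asm
;Sketch: physical address = segment * 10h + offset, done by hand.
        .model small
        .stack 100h
        .code
Start:
        mov     ax,1000h        ;segment value
        mov     bx,10h          ;sixteen
        mul     bx              ;DX:AX = 1000h * 10h = 0001:0000h (10000h)
        add     ax,1F00h        ;add the offset to the low word
        adc     dx,0            ;carry any overflow into the high word
                                ;DX:AX = 0001:1F00h, i.e. 11F00h

        mov     ax,4C00h        ;return to DOS
        int     21h
        end     Start
```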
Segment registers are: CS, DS, ES, SS. On the 386+ there are also FS and GS.

Offset registers are: BX, DI, SI, BP, SP, IP. In 386+ protected mode, ANY general
register (not a segment register) can be used as an Offset register. (Except IP, which
you can't manipulate directly).


THE STACK
---------------------------------------------------------------------

As there are only six registers that are used for most operations, you're probably
wondering how do you get around that. It's easy. There is something called a stack
which is an area of memory which you can save and restore values to.
This is an area of memory that is like a stack of plates. The last one you put on is the
first one that you take off. This is sometimes referred to as Last In First Out (LIFO) or
First In Last Out (FILO).

If another piece of data is put on the stack it grows downwards.
As you can see the stack starts at a high address and grows downwards. You have to
make sure that you don't put too much data in the stack or it will overflow.

The STACK Segment

In general, a stack is a single ended data structure with a first in, last out data
ordering. That means that if item A is placed
into the stack ("Pushed"), and then item B is placed on the stack, then item B must
be removed ("Popped") from the stack
before item A may be retrieved. This may seem kind of pointless, but the
usefulness will be demonstrated soon.
The system implements a stack for all running programs. The implementation of a
stack is very simple: one variable kept
inside the processor itself and a region of memory.

Yes, the processor has a small, limited set of variables inside it. These variables
are called registers. The
number of registers is very limited, often less than ten. In most high-level
languages, register access
isn't permitted, but don't worry; the code the compiler generates uses registers
heavily (and for yet more
reasons not discussed here, it kind of has to). But back to the topic at hand.
The register used for the stack is called the Stack Pointer, or SP for short. The
region of memory is the very same Stack
segment we've been talking about. Since a stack only has two operations, pushing
(adding an item to the stack), and
popping (removing an item from the stack), the easiest way to describe a stack is to
just describe both of these operations.

When the program is loaded, the stack pointer is set to the highest address of the
stack segment. When an item is pushed
onto the stack, two things occur. First of all, the size of the item, in bytes, is
subtracted from the stack pointer. Then, all the
bytes the item consists of are copied into the region of the stack segment that the
stack pointer now points to. When an item
is popped from the stack, the size of the item is added to the stack pointer. The copy
of the item still resides on the stack,
but will be overwritten when the next push occurs.

Why We Have A Stack

The stack is useful for storing context. If a procedure simply pushes all its local
variables onto the stack when it enters,
and pops them off when it's done, its complete context is nicely cleaned up
afterwards. What's nice is that when a
procedure calls another procedure, the called procedure can do the same with its
context, leaving the calling procedure's
data completely alone.

Before we go any further into the stack, we have to define the Instruction
Pointer, IP for short. This is
another register, like the Stack Pointer, which holds the address of the currently
executing instruction in
memory. This register is maintained by the processor. In fact, the processor runs
by basically doing the
following:

1. Read instruction that IP points to
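The push and pop behaviour described in this section can be sketched with two registers. This is a minimal illustration only, assuming TASM-style syntax and the small memory model; the values 1234h and 5678h are arbitrary, and the comments follow the rule above that SP drops by the size of the item on a push and rises again on a pop.

```asm
;Sketch: last in, first out with PUSH and POP.
        .model small
        .stack 100h
        .code
Start:
        mov     ax,1234h
        mov     cx,5678h
        push    ax              ;SP = SP - 2, 1234h stored at SS:SP
        push    cx              ;SP = SP - 2, 5678h stored at SS:SP
        pop     ax              ;AX = 5678h (pushed last), SP = SP + 2
        pop     cx              ;CX = 1234h (pushed first), SP = SP + 2

        mov     ax,4C00h        ;return to DOS
        int     21h
        end     Start
```

Because the items come back in the reverse of the order they went in, popping into the registers in the same order as the pushes swaps the two values.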
Notice that the values of CX and AX will be exchanged. There is an
instruction to exchange two registers: XCHG, which would reduce the
previous fragment to "xchg ax,cx".


TYPES OF OPERAND
---------------------------------------------------------------------

There are three types of operands in assembler: immediate, register
and memory. Immediate is a number which will be known at compilation
and will always be the same for example '20' or 'A'. A register
operand is any general purpose or index register for example AX or
SI. A memory operand is a variable which is stored in memory which
will be covered later.

For example:

    int 21h ;Calls DOS service
    int 10h ;Calls the Video BIOS interrupt

Most interrupts have more than one function; this means that you
have to pass a number to the function you want. This is usually put
in AH. To print a message on the screen all you need to do is this:

    mov ah,9 ;subroutine number 9
    int 21h ;call the interrupt

But first you have to specify what to print. This function needs
DS:DX to be a far pointer to where the string is. The string has to
be terminated with a dollar sign ($). This would be easy if DS could
be manipulated directly; to get round this we have to use AX.

This example shows how it works:
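A minimal sketch of such a program, assuming TASM-style syntax and the small memory model. The names Message and Start are illustrative and the exact layout is an assumption; the point is only that the segment of the string reaches DS by way of AX, the offset goes in DX, and the string ends with a dollar sign.

```asm
;Sketch: print a $-terminated string with DOS function 9.
        .model small
        .stack 100h
        .data
Message db      "Hello World!$" ;message ends with a dollar sign
        .code
Start:
        mov     dx,offset Message ;offset of the message into DX
        mov     ax,seg Message  ;segment of the message into AX...
        mov     ds,ax           ;...because DS cannot be loaded directly
        mov     ah,9            ;function 9 - display string
        int     21h             ;call DOS service

        mov     ax,4C00h        ;return to DOS
        int     21h
        end     Start
```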
easily be written as:

    Number1 db 0
    Number2 dw 1

This time Number1 is equal to 0 and Number2 is equal to 1 when your
program loads. Your program will also be three bytes longer. If you
declare a variable as a word you cannot move the value of this
variable into an 8-bit register and you can't declare a variable
as a byte and move the value into a 16-bit register. For example:

    mov al,Number1 ;ok
    mov ax,Number1 ;error
    mov bx,Number2 ;ok
    mov bl,Number2 ;error

All you have to remember is that you can only put bytes into 8-bit
registers and words into 16-bit registers.

ADD Add the contents of one number to another

multiplied if a byte sized operand is given and the result is stored
in AX. If the operand is word sized AX is multiplied and the result
is placed in DX:AX.

On a 386, 486 or Pentium the EAX register can be used and the answer
is stored in EDX:EAX.

DIV Divides two unsigned integers (always positive)
IDIV Divides two signed integers (either positive or negative)

Syntax:
    DIV register or variable
    IDIV register or variable

This works in the same way as MUL and IMUL by dividing the number in
AX by the register or variable given. The answer is stored in two
places. AL stores the answer and the remainder is in AH. If the
operand is a 16 bit register then the number in DX:AX is
divided by the operand and the answer is stored in AX and remainder
in DX.
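The register rules for MUL and DIV are easier to remember with concrete numbers. The following is a small sketch, assuming TASM-style syntax and the small memory model; the operand values are arbitrary, and DX is cleared before the word-sized divide because the dividend is taken from DX:AX.

```asm
;Sketch: where MUL and DIV put their results.
        .model small
        .stack 100h
        .code
Start:
        mov     al,20
        mov     bl,30
        mul     bl              ;byte operand: AX = AL * BL = 600

        mov     ax,100
        mov     bl,7
        div     bl              ;byte operand: AL = 14, remainder AH = 2

        xor     dx,dx           ;high word of the dividend = 0
        mov     ax,50000
        mov     bx,300
        div     bx              ;word operand: AX = 166, remainder DX = 200

        mov     ax,4C00h        ;return to DOS
        int     21h
        end     Start
```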
Syntax:

This subtracts operand2 from operand1. Immediate data cannot be used
as operand1 but can be used as operand2.

---------------------------------------------------------------------

These are some instructions to compile and link programs. If you
have a compiler other than TASM or A86 then see your instruction
manual.

Turbo Assembler:

The /t switch makes a .COM file. This will only work if the memory
model is declared as tiny in the source file.

A86:

    a86 file.asm

This will compile your program to a .COM file. It doesn't matter
what the memory model is.


MAKING THINGS EASIER
---------------------------------------------------------------------

The way we entered the address of the message we wanted to print was
a bit cumbersome. It took three lines and it isn't the easiest thing
to remember.


YOUR FIRST ASSEMBLY PROGRAM

;This is a simple program which displays "Hello World!" on the
;screen.

To make this work at the beginning of your code add these lines:

    mov ax,@data
    mov ds,ax

Note: for A86 you need to change the first line to:

This is because all the data in the segment has the same SEG value.
Putting this in DS saves us reloading this every time we want to
use another thing in the same segment.


KEYBOARD INPUT
---------------------------------------------------------------------

We are going to use interrupt 16h, function 00h to read the keyboard. This gets a key
from the keyboard buffer. If there isn't
one, it waits until there is. It returns the SCAN code in AH and the ASCII translation
in AL.

    xor ah,ah ;function 00h - get character
    int 16h ;interrupt 16h

All we need to worry about for now is the ASCII value which is in AL.

Note: XOR performs a Boolean Exclusive OR. It is commonly used to
erase a register or variable.
FLOW CONTROL
---------------------------------------------------------------------

In assembly there is a set of commands for control flow like in any
other language. Firstly the most basic command:

    jmp label

All this does is move to the label specified and start executing
the code there. For example:

    jmp ALabel
    .
    .
    .
    ALabel:

What do we do if we want to compare something? We have just got a
key from the user but we want to do something with it. Let's print
something out if it is equal to something else. How do we do that?

Syntax:
    CMP register or variable, value
    jxx destination

An example of this is:

    cmp al,'Y' ;compare the value in al with Y
    je ItsYES ;if it is equal then jump to ItsYES

Every instruction takes up a certain amount of code space. You will
get a warning from the compiler if you try and jump over 127 bytes in
either direction. You can solve this by changing a sequence like this:

    cmp ax,10 ;is AX 10?
    je done ;yes, lets finish

to something like this:

    cmp ax,10 ;is AX 10?
    jne notdone ;no it is not
    jmp done ;we are now done
    notdone: ;execution carries on here when AX is not 10
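Putting the keyboard read, CMP and a conditional jump together gives a complete answer to the question of how to act on a key. This is only a sketch, assuming TASM-style syntax and the small memory model; the labels ItsYES and Print and the two message names are illustrative.

```asm
;Sketch: read a key and print a different message if it is 'Y'.
        .model small
        .stack 100h
        .data
YesMsg  db      "You pressed Y$"
NoMsg   db      "You pressed something else$"
        .code
Start:
        mov     ax,@data        ;initialise data segment
        mov     ds,ax

        xor     ah,ah           ;function 00h - get character
        int     16h             ;AL = ASCII value of the key

        cmp     al,'Y'          ;compare the value in AL with Y
        je      ItsYES          ;if it is equal then jump to ItsYES
        mov     dx,offset NoMsg ;otherwise choose the other message
        jmp     Print
ItsYES:
        mov     dx,offset YesMsg
Print:
        mov     ah,9            ;function 9 - display string
        int     21h

        mov     ax,4C00h        ;return to DOS
        int     21h
        end     Start
```

The comparison is case sensitive, so a lowercase y takes the "something else" path.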
;This is a simple program to demonstrate procedures. It should
;print Hello World! on the screen when ran.

.model tiny
.data
HI DB "Hello World!$" ;define a message

.code
org 100h

Start:
    mov ax,@data ;Initialise Data Seg
    mov ds,ax
    call Display_Hi ;Call the procedure
    mov ax,4C00h ;return to DOS
    int 21h ;interrupt 21h function 4Ch

Display_Hi PROC
    mov dx,OFFSET HI ;put offset of message into DX
    mov ah,9 ;function 9 - display string
    int 21h ;call DOS service
    ret
Display_Hi ENDP

end Start

Procedures wouldn't be so useful unless you could pass parameters to
modify or use inside the procedure. There are three ways of doing
this and I will cover all three methods: in registers, in memory and
in the stack.

There are three example programs which all accomplish the same task.
They print a square block (ASCII value 254) in a specified place.

The sizes of the files when compiled are: 38 for register, 69 for
memory and 52 for stack.

In registers

The advantages of this is that it is easy to do and is fast. All you
have to do is move the parameters into registers before
calling the procedure.

Listing 4: PROC1.ASM

;this a procedure to print a block on the screen using
;registers to pass parameters (cursor position of where to
;print it and colour).

.model tiny
.code
org 100h

Start:
    mov dh,4 ;row to print character on
    mov dl,5 ;column to print character on
    mov al,254 ;ascii value of block to display
    mov bl,4 ;colour to display character
    call PrintChar ;print our character
    mov ax,4C00h ;terminate program
    int 21h

    pop bx ;restore bx
    xor bh,bh ;display page - 0
    mov ah,9 ;function 09h write char & attrib
    mov cx,1 ;display it once
    int 10h ;call bios service
    pop cx ;restore registers
    ret ;return to where it was called
PrintChar ENDP

end Start
PASSING THROUGH MEMORY

To pass parameters through memory all you need to do is copy them to
a variable which is stored in memory. You can use a variable in the
same way that you can use a register but commands with registers are
a lot faster.

Listing 5: PROC2.ASM

    pop bx cx ax ;restore registers
    ret ;return to where it was called
PrintChar ENDP

end Start

Passing through Stack
WHAT ARE MEMORY MODELS?
---------------------------------------------------------------------

We have been using the .MODEL directive to specify what type of
memory model we use, but what does this mean?

Syntax:
    .MODEL MemoryModel

Where MemoryModel can be SMALL, COMPACT, MEDIUM, LARGE, HUGE,
TINY OR FLAT.

Tiny

This means that there is only one segment for both code and data.
This type of program can be a .COM file.

This is the opposite to compact. Data elements are NEAR and
procedures are FAR.

Large

This means that both procedures and variables are FAR. You have to
point at both the segment and offset addresses.

Flat

This isn't used much as it is for 32 bit unsegmented memory space.
For this you need a DOS extender. This is what you would have to use
if you were writing a program to interface with a C/C++ program that
used a DOS extender such as DOS4GW or PharLap.


MACROS (in Turbo Assembler)
---------------------------------------------------------------------
(All code examples given are for macros in Turbo Assembler.)

Macros are very useful for doing something that is done often but
for which a procedure can't be used. Macros are substituted when the
program is compiled to the code which they contain.

This is the syntax for defining a macro:

Name_of_macro macro
;
;a sequence of instructions
;
endm

These two examples are for macros that take away the boring job of
pushing and popping certain registers:

    jmp SkipData
    PrintMe db SomeText,'$'
    SkipData:
    push ax dx ds cs
    pop ds
    mov dx,OFFSET cs:PrintMe
    mov ah,9
    int 21h
    pop ds dx ax
endm
endm

The only problems with macros are that if you overuse them it leads
to your program getting bigger and bigger and that you have problems
with multiple definition of labels and variables. The correct way to
solve this problem is to use the LOCAL directive for declaring names
inside macros.
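As a sketch of what the LOCAL directive does, here is a string-printing macro in the same spirit, written out in full. The macro name OutMsg is an assumption made for this illustration, and the program assumes Turbo Assembler in its default MASM-compatible mode with the small memory model. Without the LOCAL line, the second use of the macro would define PrintMe and SkipData a second time and the assembler would report duplicate symbols.

```asm
;Sketch: a macro that uses LOCAL so it can be expanded more than once.
OutMsg  macro   SomeText
        local   PrintMe,SkipData ;fresh names for every expansion
        jmp     SkipData        ;jump over the embedded string
PrintMe db      SomeText,'$'
SkipData:
        push    ax
        push    dx
        push    ds
        push    cs
        pop     ds              ;DS = CS, the string is in the code segment
        mov     dx,offset PrintMe
        mov     ah,9            ;function 9 - display string
        int     21h
        pop     ds
        pop     dx
        pop     ax
        endm

        .model small
        .stack 100h
        .code
Start:
        OutMsg  "First"         ;each use gets its own PrintMe/SkipData
        OutMsg  "Second"

        mov     ax,4C00h        ;return to DOS
        int     21h
        end     Start
```

Each expansion still copies the whole body into the program, so the size cost mentioned above applies here too.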
Function 3Dh: open file

Opens an existing file for reading, writing or appending on the
specified drive and filename.

INPUT:
    AH = 3Dh
    AL = bits 0-2 Access mode
             000 = read only
             001 = write only
             010 = read/write
         bits 4-6 Sharing mode (DOS 3+)
             000 = compatibility mode
             001 = deny all
             010 = deny write
             011 = deny read
             100 = deny none
    DS:DX = segment:offset of ASCIIZ pathname

The disadvantage of the first method is that you will have to
remember not to use the register you saved it in and it wastes a
register that can be used for something more useful. We are going to
use the second. This is how it is done:

    FileHandle DW 0 ;use this for saving the file handle
    .
    .
    .
    mov FileHandle,ax ;save the file handle

Function 3Eh: close file

Closes a file that has been opened.

INPUT:
    AH = 3Eh
    BX = file handle

OUTPUT:
    CF = 0 function is successful
        AX = destroyed
    CF = 1 function not successful
        AX = error code - 06h file not opened or unauthorised handle.

.model small
.stack
.data
start:
    mov dx,offset OpenError ;display an error
    mov ah,09h ;using function 09h
    int 21h ;call DOS service
    mov ax,4C01h ;end program with an errorlevel =1
    int 21h

ErrorReading:
    mov dx,offset ReadError ;display an error
    mov ah,09h ;using function 09h
    int 21h ;call DOS service
    mov ax,4C02h ;end program with an errorlevel =2
    int 21h

END start

Function 3Ch: Create file

Creates a new empty file on a specified drive with a specified pathname.

INPUT:
    AH = 3Ch
    CX = file attribute
        bit 0 = 1 read-only file
        bit 1 = 1 hidden file
        bit 2 = 1 system file
        bit 3 = 1 volume (ignored)
        bit 4 = 1 reserved (0) - directory
        bit 5 = 1 archive bit
        bits 6-15 reserved (0)
    DS:DX = segment:offset of ASCIIZ pathname

OUTPUT:
    CF = 0 function is successful
        AX = handle
    CF = 1 an error has occurred
        03h path not found
        04h no available handle
        05h access denied

Important: If a file of the same name exists then it will be lost.
Make sure that there is no file of the same name. This can be done
with the function below.

Function 4Eh: find first matching file

Searches for the first file that matches the filename given.

INPUT:
    AH = 4Eh
    CX = file attribute mask (bits can be combined)
        bit 0 = 1 read only
        bit 1 = 1 hidden
        bit 2 = 1 system
        bit 3 = 1 volume label
        bit 4 = 1 directory
        bit 5 = 1 archive
        bit 6-15 reserved
    DS:DX = segment:offset of ASCIIZ pathname

OUTPUT:
    CF = 0 function is successful
        [DTA] Disk Transfer Area = FindFirst data block

The DTA block

    Offset  Size in bytes  Meaning
    0       21             Reserved
    21      1              File attributes
    22      2              Time last modified
    24      2              Date last modified
    26      4              Size of file (in bytes)
    30      13             File name (ASCIIZ)

An example of checking if file exists:

    File DB "C:\file.txt",0 ;name of file that we want

    mov dx,OFFSET File ;address of filename
    mov cx,3Fh ;file mask 3Fh - any file
    mov ah,4Eh ;function 4Eh - find first file
    int 21h ;call DOS service
    jc NoFile
    ;print message saying file exists
    NoFile:
    ;continue with creating file
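To show how the DTA block is used after a successful search, here is a rough sketch that prints the name of the first matching file. It makes several assumptions that go beyond the text above: the search pattern *.TXT is hypothetical, the program reserves its own 43-byte buffer (the sum of the field sizes in the table) and points DOS at it with function 1Ah (set DTA address), and the name is printed one character at a time with function 02h. TASM-style syntax and the small memory model are assumed.

```asm
;Sketch: find the first matching file and print its name from the DTA.
        .model small
        .stack 100h
        .data
FileSpec db     "*.TXT",0       ;hypothetical search pattern (ASCIIZ)
DTA      db     43 dup (0)      ;21 + 1 + 2 + 2 + 4 + 13 bytes
        .code
Start:
        mov     ax,@data        ;initialise data segment
        mov     ds,ax

        mov     dx,offset DTA   ;DS:DX points to our buffer
        mov     ah,1Ah          ;function 1Ah - set DTA address
        int     21h

        mov     dx,offset FileSpec ;DS:DX points to the pattern
        mov     cx,3Fh          ;file mask 3Fh - any file
        mov     ah,4Eh          ;function 4Eh - find first file
        int     21h
        jc      Done            ;carry set - nothing matched

        cld                     ;make LODSB move forwards
        mov     si,offset DTA+30 ;DS:SI points to the file name
NextChar:
        lodsb                   ;AL = next character of the name
        or      al,al           ;zero marks the end of the name
        jz      Done
        mov     dl,al
        mov     ah,2            ;function 02h - display character
        int     21h
        jmp     NextChar
Done:
        mov     ax,4C00h        ;return to DOS
        int     21h
        end     Start
```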
This is an example of creating a file and then writing to it.

Listing 8: CREATE.ASM

;This example program creates a file and then writes to it.

StartMessage  DB "This program creates a file called NEW.TXT"
              DB ,"on the C drive.$"
EndMessage    DB CR,LF,"File create OK, look at file to"
              DB ,"be sure.$"

WriteMessage  DB "An error has occurred (WRITING)$"
OpenMessage   DB "An error has occurred (OPENING)$"
CreateMessage DB "An error has occurred (CREATING)$"

WriteMe       DB "HELLO, THIS IS A TEST, HAS IT WORKED?",0

FileName      DB "C:\new.txt",0   ;name of file to open
Handle        DW ?                ;to store file handle

.code

   mov ax,@data              ;base address of data segment
   mov ds,ax                 ;put it in ds

   ....

   int 21h                   ;call dos service
   jc WriteError             ;jump if there is an error
   cmp ax,cx                 ;was all the data written?
   jne WriteError            ;no it wasn't - error!

   mov bx,Handle             ;put file handle in bx
   mov ah,3Eh                ;function 3Eh - close a file
   int 21h                   ;call dos service

WriteError:
   mov dx,offset WriteMessage ;display an error message
   jmp EndError

OpenError:
   mov dx,offset OpenMessage ;display an error message
   jmp EndError

CreateError:
   mov dx,offset CreateMessage ;display an error message

EndError:
   mov ah,09h                ;using function 09h
   int 21h                   ;call dos service
   mov ax,4C01h              ;terminate program
   int 21h
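As a rough sketch of the create, write and close sequence that
CREATE.ASM is built around, the program below uses functions 3Ch, 40h
and 3Eh one after the other. It is not the original listing: the
labels, the messages OkMsg and ErrMsg and the constant WriteLen are
made up for this example, and it assumes attribute 0 (a normal file)
is wanted.

```asm
;sketch only: create a file, write a string to it and close it
;OkMsg, ErrMsg, WriteLen and the labels are hypothetical names
.model small
.stack
.data
FileName DB "C:\NEW.TXT",0        ;ASCIIZ name of file to create
WriteMe  DB "HELLO, THIS IS A TEST, HAS IT WORKED?"
WriteLen EQU $ - WriteMe          ;number of bytes to write
Handle   DW ?                     ;to store file handle
OkMsg    DB "File written$"
ErrMsg   DB "An error has occurred$"
.code
start:
   mov ax,@data
   mov ds,ax

   mov dx,OFFSET FileName         ;DS:DX points to ASCIIZ pathname
   xor cx,cx                      ;attribute 0 - normal file
   mov ah,3Ch                     ;function 3Ch - create file
   int 21h
   jc Failed                      ;CF = 1 means the create failed
   mov Handle,ax                  ;save the handle

   mov bx,Handle                  ;BX = file handle
   mov cx,WriteLen                ;CX = number of bytes to write
   mov dx,OFFSET WriteMe          ;DS:DX points to the data
   mov ah,40h                     ;function 40h - write to file
   int 21h
   jc WriteFailed
   cmp ax,cx                      ;was all the data written?
   jne WriteFailed

   mov bx,Handle
   mov ah,3Eh                     ;function 3Eh - close file
   int 21h
   jc Failed
   mov dx,OFFSET OkMsg
   xor bl,bl                      ;errorlevel 0
   jmp Report

WriteFailed:
   mov bx,Handle
   mov ah,3Eh                     ;close anyway, result ignored
   int 21h
Failed:
   mov dx,OFFSET ErrMsg
   mov bl,1                       ;errorlevel 1

Report:
   mov ah,09h                     ;function 09h - display string
   int 21h
   mov al,bl                      ;function 09h changes AL, so the
   mov ah,4Ch                     ;errorlevel was kept in BL
   int 21h
end start
```

Remember that function 3Ch truncates an existing file of the same
name, so a real program would probably check for the file first.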
EndOk:
   mov dx,OFFSET Deleted     ;display message
   jmp Endit

ErrorDeleting:
   mov dx,OFFSET ErrDel      ;display message
   jmp Endit

FileDontExist:
   mov dx,OFFSET NoFile      ;display message

EndIt:
   mov ah,9
   int 21h

   mov ax,4C00h              ;terminate program and exit to DOS
   int 21h                   ;call DOS service

end

USING THE FINDFIRST AND FINDNEXT FUNCTIONS
---------------------------------------------------------------------

LoopCycle:
   mov dx,OFFSET FileName    ;DS:DX points to file name
   mov ah,4Fh                ;function 4fh - find next
   int 21h                   ;call DOS service
   jc exit                   ;exit if carry flag is set

   mov cx,13                 ;length of filename
   mov si,OFFSET DTA+30      ;DS:SI points to filename in DTA
   xor bh,bh                 ;video page - 0
   mov ah,0Eh                ;function 0Eh - write character

NextChar:
   lodsb                     ;AL = next character in string
   int 10h                   ;call BIOS service
   loop NextChar

   mov di,OFFSET DTA+30      ;ES:DI points to DTA
   mov cx,13                 ;length of filename
   xor al,al                 ;fill with zeros
   rep stosb                 ;erase DTA
In assembly there are some very useful instructions for dealing with
strings. Here is a list of the instructions and the syntax for using
them:

MOVS* Move String: moves byte, word or double word at DS:SI
      to ES:DI
Syntax:

   movsb                     ;move byte
   movsw                     ;move word
   movsd                     ;move double word

Note: This instruction is normally used with the REP prefix.

SCAS* Search string: search for AL, AX, or EAX in string at ES:DI

Syntax:

   scasb                     ;search for AL
   scasw                     ;search for AX
   scasd                     ;search for EAX

STOS* Move byte, word or double word from AL, AX or EAX to ES:DI
Syntax:

   stosb                     ;move AL into ES:DI
   stosw                     ;move AX into ES:DI
   stosd                     ;move EAX into ES:DI

LODS* Move byte, word or double word from DS:SI to AL, AX or EAX
Syntax:

   lodsb                     ;move byte at DS:SI into AL
   lodsw                     ;move word at DS:SI into AX
   lodsd                     ;move double word at DS:SI into EAX

String1 db "This is a string!$"
String2 db 18 dup(0)

Diff1 db "This string is nearly the same as Diff2$"
Diff2 db "This string is nearly the same as Diff1$"

Equal1 db "The strings are equal$"
Equal2 db "The strings are not equal$"

SearchString db "1293ijdkfjiu938uHello983fjkfjsi98934$"

Message db "This is a message"
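As a small sketch of a string instruction with the REP prefix, the
program below copies String1 into String2 with rep movsb and then
prints the copy. The program around the two strings is an assumption
made for this example; the important details are that ES has to point
at the destination segment and that the direction flag is cleared so
that SI and DI count upwards.

```asm
;sketch only: copy String1 to String2 using rep movsb
.model small
.stack
.data
String1 db "This is a string!$"   ;18 bytes including the $
String2 db 18 dup(0)              ;room for the copy
.code
start:
   mov ax,@data
   mov ds,ax                      ;DS:SI is the source
   mov es,ax                      ;ES:DI is the destination

   cld                            ;direction flag = 0, count upwards
   mov si,OFFSET String1          ;DS:SI points to source string
   mov di,OFFSET String2          ;ES:DI points to destination
   mov cx,18                      ;number of bytes to copy
   rep movsb                      ;copy CX bytes from DS:SI to ES:DI

   mov dx,OFFSET String2          ;DS:DX points to the copy
   mov ah,9                       ;function 9 - display string
   int 21h

   mov ax,4C00h                   ;exit to DOS
   int 21h
end start
```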
   jmp Next_Operation
Not_Equal:

   mov ah,9                  ;function 9 - display string
   mov dx,OFFSET Message5    ;ds:dx points to message
   int 21h                   ;call dos function

In many programs it is necessary to find out what the DOS version
is. This could be because you are using a DOS function that needs
the revision to be over a certain level.

Firstly this method simply finds out what the version is.

SETVER can change the version that is returned. The way to get round
this is to use this method.

   mov ah,33h                ;function 33h - actual DOS version
   mov al,06h                ;subfunction 06h
   int 21h                   ;call interrupt 21h

This will only work on DOS version 5 and above so you need to check
using the former method. This will return the actual version of DOS
even if SETVER has changed the version. This returns the major
version in BL and the minor version in BH.

You can push and pop more than one register on a line in TASM and
A86. This makes your code easier to understand.

   push ax bx cx dx          ;save registers
   pop dx cx bx ax           ;restore registers

PUSHA pushes all of the 16-bit general purpose registers in one
instruction. It does the same as:

   temp = SP
   push ax
   push cx
   push dx
   push bx
   push temp
   push bp
   push si
   push di

The main advantage is that it is less typing, a smaller instruction
and it is a lot faster. POPA does the reverse and pops these
registers off the stack. PUSHAD and POPAD do the same but with the
32-bit registers EAX, ECX, EDX, EBX, ESP, EBP, ESI and EDI.

Using MULs and DIVs is very slow and they should only be used when
speed is not needed. For faster multiplication and division you can
shift numbers left or right one or more binary positions. Each shift
multiplies or divides by a power of 2. This is the same as the << and
>> operators in C. There are four different ways of shifting numbers
either left or right one binary position.
---------------------------------------------------------------------

Using LOOP is a better way of making a loop than using JMPs. You
place the number of times you want it to loop in the CX register and
every time it reaches the LOOP statement it decrements CX (CX-1) and
then, if CX is not zero, does a short jump to the label indicated. A
short jump means that it can only jump 128 bytes before or 127 bytes
after the LOOP instruction.

This does the same job as the following piece of code without
using LOOP:

   mov cx,100                ;100 times to loop
Label:
   dec cx                    ;CX = CX-1
   jnz Label                 ;continue until done

Which do you think is easier to understand? Using DEC/JNZ is faster
on 486s and above and it is useful as you don't have to use CX.
This works because JNZ will jump if the zero flag has not been set.
DEC sets this flag when CX reaches 0.
---------------------------------------------------------------------

This is a good time to use a debugger to find out what your program
is actually doing. I am going to demonstrate how to use Turbo
Debugger to check what the program is actually doing. First we need
a program which we can look at.

;example program to demonstrate how to use a debugger
.model tiny
.code
org 100h
start:
   ....
   pop cx                    ;restore cx
   pop bx                    ;restore bx
   pop ax                    ;restore ax

   mov ax,4C00h              ;exit to dos
   int 21h

ChangeNumbers PROC
   add ax,bx                 ;adds number in bx to ax
   mul cx                    ;multiply ax by cx
   mov dx,ax                 ;return answer in dx
   ret
ChangeNumbers ENDP
end start

Now compile it to a .COM file and then type:

Turbo Debugger then loads. You can see the instructions that make up
your programs, for example the first few lines of this program are
shown as:

   cs:0000 50 push ax
   cs:0001 53 push bx
   cs:0002 51 push cx

   F5 Size Window
   F7 Next Instruction
At the left of this display there is a box showing the contents of
the registers. At this time all the main registers are empty. Now
press F7; this means that the first line of the program is run. As
the first line pushed the AX register onto the stack, you can see
that the stack pointer (SP) has changed. Press F7 until the line
which contains mov ax,000A is highlighted. Now press it again. Now
if you look at the box which contains the contents of the registers
you can see that AX contains A. Press it again and BX now contains
14; press it again and CX contains 3. Now if you press F7 again you
can see that AX now contains 1E, which is A+14. Press it again and
now AX contains 5A, which is 1E multiplied by 3. Press F7 again and
you will see that DX now also contains 5A. Press it three more times
and you can see that CX, BX and AX are all set back to their original
values of zero.

I am going to cover some more ways of outputting text in text modes.
This first program is an example of how to move the cursor to
display a string of text where you want it to go.

Listing 12: TEXT1.ASM

.model tiny
.code
org 100h
start:
   mov dh,12                 ;cursor row
   mov dl,32                 ;cursor column
   mov ah,02h                ;move cursor to the right place
   xor bh,bh                 ;video page 0
   ....

This next example demonstrates how to write to the screen using the
file function 40h of interrupt 21h.

Listing 13: TEXT2.ASM

.model small
.stack
.code
   mov ax,@data              ;set up ds as the segment for data
   mov ds,ax                 ;put this in ds
   mov ah,40h                ;function 40h - write file
   mov bx,1                  ;handle = 1 (screen)
   mov cx,17                 ;length of string
   mov dx,OFFSET Text        ;DS:DX points to string
   int 21h                   ;call DOS service
   ....
end

The next program shows how to set up and call function 13h of
interrupt 10h - write string. This has the advantage of being able
to write a string anywhere on the screen in a specified colour but
it is hard to set up.

Listing 14: TEXT3.ASM

.model small
.stack
.code
   mov ax,@data              ;set up ds as the segment for data
   ....
.data

Text DB "This is some text"
end

The next program demonstrates how to write to the screen using rep
stosw to put the writing in video memory.

Wr_Char:
   lodsb                     ;put next character into al
   mov es:[di],al            ;output character to video memory
   inc di                    ;move along to next column
   mov es:[di],ah            ;output attribute to video memory
   inc di
   loop Wr_Char              ;loop until done

   mov ax,4C00h              ;return to DOS
   int 21h
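As a rough sketch of the whole idea, the program below writes a
string straight into text mode video memory. It assumes a colour text
mode, where the screen starts at segment 0B800h, is 80 columns wide
and uses two bytes per character (the character, then its attribute).
TextLen and the attribute value 07h are choices made for this
example. It uses stosw inside the loop, which stores AL and AH and
moves DI on by two in one instruction.

```asm
;sketch only: write a string directly to text mode video memory
;assumes a colour text mode with the screen at segment 0B800h
.model small
.stack
.data
Text    DB "This is some text"
TextLen EQU $ - Text              ;length of string (hypothetical name)
.code
start:
   mov ax,@data
   mov ds,ax                      ;DS:SI will point to the string
   mov ax,0B800h
   mov es,ax                      ;ES:DI will point to video memory

   mov di,(12*80+32)*2            ;row 12, column 32, 2 bytes per cell
   mov si,OFFSET Text
   mov cx,TextLen                 ;number of characters to write
   mov ah,07h                     ;attribute - light grey on black
   cld                            ;make SI and DI count upwards

Wr_Char:
   lodsb                          ;AL = next character
   stosw                          ;store AL and AH at ES:DI, DI = DI+2
   loop Wr_Char                   ;loop until done

   mov ax,4C00h                   ;return to DOS
   int 21h
end start
```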
It would be polite to tell the user if their computer cannot
support mode 13h instead of just crashing it without
warning. This is how it is done.

.code
   mov ax,@data              ;set up DS to point to data segment
   mov ds,ax                 ;use ax

   call Check_Mode_13h       ;check if mode 13h is possible
   jc Error                  ;if cf=1 there is an error

   mov ah,9                  ;function 9 - display string
   mov dx,OFFSET Supported   ;DS:DX points to message
   int 21h                   ;call DOS service

   jmp To_DOS                ;exit to DOS
   ....
Mode_13h_OK:
   ret
Check_Mode_13h ENDP

end

Just use this to check if mode 13h is supported at the beginning of
your program to make sure that you can go into that mode.

Once we are in mode 13h and have finished what we are doing we need
to set it to the video mode that it was in previously.
This is done in two stages. Firstly we need to save the video mode
and then reset it to that mode.

VideoMode db ?
....
   mov ah,0Fh                ;function 0Fh - get current mode
   int 10h                   ;Bios video service call
   mov VideoMode,al          ;save current mode

This makes a colour dot on the screen at the specified graphics
coordinates.

INPUT:
   AH = 0Ch
   AL = Color of the dot
   CX = Screen column (x coordinate)
   DX = Screen row (y coordinate)

OUTPUT:
   Nothing except pixel on screen.

Note: This function performs exclusive OR (XOR) with the new colour
value and the current contents of the pixel if bit 7 of AL is set.

This program demonstrates how to plot pixels. It should plot four
red pixels into the middle of the screen.

Listing 17: PIXINT.ASM

;example of plotting pixels in mode 13 using bios services -
;INT 10h
.model tiny
.code
org 100h
start:
   mov ax,13h                ;mode = 13h
   int 10h                   ;call bios service

   mov ah,0Ch                ;function 0Ch
   mov al,4                  ;color 4 - red
   mov cx,160                ;x position = 160
   mov dx,100                ;y position = 100
   int 10h                   ;call BIOS service
   inc dx                    ;plot pixel downwards
   int 10h                   ;call BIOS service
   inc cx                    ;plot pixel to right
   int 10h                   ;call BIOS service
   dec dx                    ;plot pixel up
   int 10h                   ;call BIOS service

   xor ax,ax                 ;function 00h - get a key
   int 16h                   ;call BIOS service
   mov ax,3                  ;mode = 3
   int 10h                   ;call BIOS service

   mov ax,4C00h              ;exit to DOS
   int 21h
end start

SOME OPTIMIZATIONS
---------------------------------------------------------------------

This method isn't too fast and we could make it a lot faster. How?
By writing direct to video memory. This is done quite easily.

The VGA segment is 0A000h. To work out where each pixel goes you use
this simple formula to get the offset.

   Offset = X + ( Y x 320 )

All we do is to put a number at this location and there is now a
pixel on the screen. The number is what colour it is.

There are two instructions that we can use to put a pixel on the
screen. Firstly we could use stosb to put the value in AL to ES:DI,
or we can use a new form of the MOV instruction like this:

   mov es:[di], color

Which should we use? When we are going to write pixels to the screen
we need to do so as fast as possible. Clock cycles taken by each
instruction:

   Instruction          Pentium   486   386   286   86

   STOSB                3         5     4     3     11
   MOV AL to SEG:OFF    1         1     4     3     10

If you use the MOV method you may need to increment DI (which STOSB
does).

   [ put pixel instruction]

If we had a program which used sprites that need to be continuously
drawn, erased and then redrawn, we would have problems with flicker.
To avoid this you can use a 'double buffer'. This is another part of
memory which you write to and then copy all the information onto the
screen.
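As a rough sketch of the direct method, the program below works out
Offset = X + ( Y x 320 ) and writes one red pixel with the MOV form.
The multiplication by 320 is done with shifts, using the fact that
Y x 320 = Y x 64 + Y x 256, instead of MUL. The register choices are
assumptions made for this example, and the shifts are written so that
they also assemble for an 8086 (shift by CL or by 1 only).

```asm
;sketch only: plot one pixel in mode 13h by writing to video memory
.model tiny
.code
org 100h
start:
   mov ax,13h                ;mode = 13h
   int 10h                   ;call BIOS service

   mov ax,0A000h
   mov es,ax                 ;ES = VGA segment

   mov bx,160                ;X position = 160
   mov dx,100                ;Y position = 100

   mov ax,dx                 ;AX = Y
   mov cl,6
   shl ax,cl                 ;AX = Y x 64
   mov di,ax                 ;DI = Y x 64
   shl ax,1
   shl ax,1                  ;AX = Y x 256
   add di,ax                 ;DI = Y x 320
   add di,bx                 ;DI = X + ( Y x 320 )

   mov al,4                  ;color 4 - red
   mov es:[di],al            ;put the pixel on the screen

   xor ax,ax                 ;function 00h - get a key
   int 16h                   ;call BIOS service
   mov ax,3                  ;mode = 3
   int 10h                   ;call BIOS service

   mov ax,4C00h              ;exit to DOS
   int 21h
end start
```

For X = 160 and Y = 100 the offset comes to 6400 + 25600 + 160 =
32160, which is the middle of the 320 x 200 screen.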