Coa Unit 3
Introduction
Assembly language, like any other computer language, represents a compromise between
people and machines. People regularly process information in what are frequently called
natural languages, such as English, French, or German. Computers process information in
what are commonly known as machine languages. The route from human thought to
machine language is traveled halfway by people, who translate their thoughts into a computer
language, and halfway by computers, which use programs called translators to convert
computer language into machine language.
There are different kinds of translators. Translators that take an entire program and
translate it as a body into machine language are called compilers. Translators that process
programs one line at a time are called interpreters. Special‐purpose translators that are
specifically designed to translate assembly language programs into machine language are
called assemblers.
Figure 5.1: Translators process program instructions into machine language at different levels of
organization.
Aside from the practical advantages of programming in assembly language, there are several
emotionally satisfying reasons for learning this language. Assembly language gives you
complete control over the machine.
Perhaps the greatest value in learning to program in assembly language lies in the satisfaction
that comes from understanding computers and how they work. An understanding of assembly
language is in many respects an understanding of the computer itself.
Objectives……
Starting with any new language can be a little intimidating at first. It is often reassuring to begin
by composing and executing a simple program just for the sake of demonstrating to yourself
that you can get the thing to work. Program 5.1 shows a short program that does nothing more
than display the phrase “Have a nice day!” on the screen.
Each step in the process creates a separate image of the program. The first step, entering the
program, creates a source file that contains the source code of the program. The source code is
the assembly language image of the program. Program 5.1 shows the source code that
displays the text “Have a nice day!”.
Assembling a program converts its source code into an OBJect file. An OBJect file contains the
machine language image of the source code of a program in skeletal form. It also contains
enough information about the program to enable the linker and the loader to fill in the rest of
the program.
Program 5.1
Hex $
Code segment
mov ax , data
mov ds, ax
int $21
int $21
Data segment
end
The loader copies the EXEcutable image of the program into memory and fills in the few
remaining gaps in the program with the very last details necessary to accommodate the
program in the exact place in memory to which it has been copied. After the loader has done
its work, the final image of the program exists in memory and is ready to be run. When the
loader is invoked directly from DOS, it automatically begins running the program
immediately after installing it. Figure 5.2 shows the progress of an assembly language program
from source code to machine language.
Hardware Requirements
Program 5.1 and all the other programs discussed in this text are designed to run on an
8086/8088‐based computer running MS‐DOS or PC‐DOS version 2.10 or later. Intel
Corporation, which manufactures the 8086/8088 chip, has also produced a series of increasingly
powerful chips in the same family as the 8086/8088, designated the 80186, the 80188, the
80286, the 80386, and the 80486. The assembly language for the 8086/8088 is a subset of the
assembly language for each of these other chips. Programs written for the 8086/8088 should
run without modification on machines based on any chip in this family.
Software Requirements
Getting your program into the computer, assembling it, linking it, and, if necessary, debugging it
are tasks that each require a separate software package. The first thing you will need is a word
processor or text editor for writing and editing your programs. Almost any word processor will
suffice, so long as it produces what are called standard ASCII text files.
Once a program is written, you will have to assemble it. To assemble the program, you need a
full‐featured 8086/8088 macro assembler, ELASS.EXE, in the current DOS directory,
where the program is saved. All of the examples in this chapter are designed to be run
through ELASS.
After the program is assembled, it has to be linked. To link the assembled program, you
also need the executable file ELINK.EXE in the current DOS directory.
Entering a program
To enter a program, first open the DOS command prompt, then change to
the directory that contains the assembler (ELASS.EXE) and the linker (ELINK.EXE), and then
issue the “edit” command with the required file name.
Choose Start → All Programs → Accessories → Command Prompt to open the command
prompt window.
To change the directory to C:\asm, enter the cd command (for example, cd C:\asm) at the
prompt.
The current directory is now asm, which contains the assembler (ELASS.EXE) and the linker
(ELINK.EXE).
(You can also configure the PATH environment variable so that you do not depend on the
directory that contains the assembler and the linker.)
Finally, open the editor to write the assembly language source code by entering the
“edit” command followed by the file name.
Notice:
• We assume here that the user name is “user” and that the directory
that contains the assembler and the linker is the “asm” folder, but you can put the assembler
and the linker in any folder (directory) you like. Just do not forget to check,
before assembling the code, that the current directory contains both the assembler and
the linker.
To assemble an assembly language program, enter the following command at the DOS prompt:
Here progname is the name of the source file in which the program to be assembled is stored.
There is no need to include the .asm file name extension following progname. The assembler
will infer the extension if you omit it.
Once a program has been assembled and its object file has been recorded on the default
device, it is ready to be linked. To link a program using LINK.EXE, enter the following command
at the DOS prompt:
C:\asm>link progname;
The semicolon following the progname parameter directs LINK to apply its standard defaults. If
you inadvertently omit the semicolon, LINK will prompt for a series of inputs. You can ignore
those prompts and invoke the defaults by repeatedly pressing the enter key. Eventually, LINK
will report that it has finished linking the program. When it has finished linking, it will produce a
file named PROGNAME.EXE on the default device and return to DOS.
But if the linker that we have is ELINK.EXE, enter the following command at the DOS prompt to
link the program:
C:\asm>elink progname;
Under ELINK you can omit the trailing semicolon without effect. When ELINK begins linking
your program, it will report that it is WRITING PROGNAME.EXE. When ELINK finishes, it will display a
brief index of the program it has just linked.
Once an assembly language program has been linked into an executable file, it can be loaded
and executed just like any other executable file. To load and execute an executable file, enter
the following command at the DOS prompt:
C:\asm>progname
This command instructs DOS to invoke the loader. The loader will load PROGNAME.EXE from
the default device into memory, add a few finishing touches to it, and automatically begin
running it.
There are two possible kinds of errors you can run into in the course of executing an assembly
language program: processing errors and system errors. Processing errors produce invalid
outputs. They are the familiar errors you can encounter in the course of executing a program
written in any language. System errors, on the other hand, are unique to languages that permit
low‐level operations. A system error is one that corrupts or disables the system under which
the program is running.
This topic provides a statement‐by‐statement analysis of the program. There are three kinds of
statements in the source code of an 8086/8088 assembly language program: instruction
statements, data allocation statements, and directives. The assembler translates each
instruction into machine language. It reserves and initializes data space in memory for each
data allocation statement. Directives define the context in which instructions and data
allocation statements are processed. This topic begins by introducing the LIST directive.
There are several ways to generate a listing of an assembly language program. You could
take advantage of the listing and printing capabilities of whatever word processor you
used to write the program, you could use the DOS TYPE command, or you could insert a
LIST directive into the program to force the assembler to generate a listing.
The DOS command for displaying a program source file on the video screen is
C:\asm>type progname.asm
Alternatively, you might choose to generate a program listing by using a LIST directive. A LIST
directive causes the assembler to produce an annotated listing on the printer, the video screen,
a disk drive, or some combination of the three. An annotated listing shows the text of the
assembly language program, numbers each statement in the program, and, subject to certain
limitations, shows the offset associated with each instruction and each datum. It also displays
the machine language associated with each instruction. The advantage of using a LIST directive
instead of your word processor or DOS is that the LIST directive produces much more
informative output.
The SCR parameter directs the listing to the video screen. The LPN parameter generates a similar
listing on the printer (PRN or LPT1). The TXT parameter creates an ASCII‐format image of the
annotated listing on the default device under the name progname.TXT. If progname.ASM contains a
device name and/or subdirectory path specification, then progname.TXT will be written to the
specified device and subdirectory rather than to the default device. The TOF (Top Of Form)
parameter directs the assembler to begin each page of the listing with a header that gives the
name of the program, the current page number, the date, and the time of day.
When a LIST directive is used, it normally appears as the first statement in a program. When the
assembler encounters a LIST directive, it begins listing and continues until it encounters the end
of the file or a NOLIST directive. The general format for a NOLIST directive is:
NOLIST
An annotated program listing consists of two sections: the annotation of the program, which
appears to the left, and the source code of the program, which appears to the right.
The program annotation consists of three columns of data: line numbers, offsets, and
machine codes.
The leftmost column in the program annotation contains line numbers. The assembler assigns
line numbers to the statements in the source file sequentially. If the assembler has
occasion to issue an error message, the message will contain a reference to one of these line
numbers.
The second column from the left contains offsets. Each offset indicates the address of an
instruction or a datum as an offset from the base of its logical segment. For example, the
statement at line number 0004 produces machine language at offset $0000 of the code
segment, and the instruction at line 0005 produces machine language at offset $0003.
The third column in the annotation displays the machine language produced by each instruction
in the program. In program5.2, line 0005 contains the instruction
Mov DS, AX
The machine language image for this instruction is shown to be 8ED8. Each machine language
image of an instruction statement consists of an opcode byte, which is usually but not always
followed by one or more additional bytes. The opcode is always the first byte of an instruction.
It is also the first byte that the 8086/8088 “reads” as it prepares to execute an instruction. The
opcode tells the 8086/8088 what to do and whether the instruction contains additional bytes.
The opcode in the machine code 8ED8 is 8E. That opcode says to the 8086/8088, “this is a MOV
instruction; read another byte for more details.” The byte D8 (in the context of being prepared
by the opcode 8E) says: “The full text of this instruction reads MOV DS, AX.”
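The way the second byte completes the meaning of the opcode can be sketched in a few lines of Python. This is only an illustrative model, not part of ELASS or of the program listing: the function name decode_mov_to_segment is hypothetical, and the sketch handles only the register‐to‐register form of opcode 8E. It relies on the standard 8086 encoding, in which the second byte packs a 2‐bit mode field, a 3‐bit register field, and a 3‐bit register/memory field.

```python
# Illustrative model (an assumption for teaching purposes, not assembler code):
# decode the two-byte instruction 8E D8 discussed above.
# Opcode 8E means "MOV segment register, 16-bit register or memory".
# The second byte packs three fields: mod (2 bits), reg (3 bits), r/m (3 bits).

SEGMENT_REGS = ["ES", "CS", "SS", "DS"]                        # reg field 0..3
WORD_REGS = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]   # r/m field when mod == 3


def decode_mov_to_segment(code: bytes) -> str:
    """Return the assembly text for a register-to-register 8E xx instruction."""
    opcode, second = code[0], code[1]
    if opcode != 0x8E:
        raise ValueError("this sketch only handles opcode 8E")
    mod = second >> 6              # top two bits
    reg = (second >> 3) & 0b111    # middle three bits: destination segment register
    rm = second & 0b111            # low three bits: source register
    if mod != 0b11:
        raise ValueError("this sketch only handles register operands")
    if reg >= len(SEGMENT_REGS):
        raise ValueError("reg field does not name an 8086 segment register")
    return f"MOV {SEGMENT_REGS[reg]}, {WORD_REGS[rm]}"


print(decode_mov_to_segment(bytes.fromhex("8ED8")))
```

The byte D8 is 11 011 000 in binary: mode 11 (register operand), register field 011 (DS), and register/memory field 000 (AX), which is why the pair 8E D8 reads as MOV DS, AX.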
As you can see, the column of machine language is not altogether complete. The listings for
lines 0004 and 0006 indicate that the assembler did not know what the machine language
images for those instructions were going to be. The xxxx at line 0004 occupies a place in the
Missing offsets: The xxxx in the machine language for the instruction at line 0006 is there
because the assembler does not know the offset of the text of the message “Have a nice
day!$”. The linker must supply that value. It may occur to you to ask why the assembler does
not know the offset of that datum when it is so clearly identified as $0000 in the annotation of
line 0013.
The answer is that the assembler did not know that program5.2 was going to be the only
module in your program. The source code of a program can be divided into several modules,
and each module can be assembled separately. Then, after each has been assembled, they can
all be linked together into one single program.
When the assembler reports that the offset of the datum defined at line 0013 is $0000, it is just
fudging. The assembler reports offsets on the assumption that it is processing the one and only
module in a program. This is actually just as well for beginning programmers. Most of the
programs beginners write consist of only a single module, so the annotation is quite suitable for
their purposes.
LSB/MSB order: The 8086/8088 stores all word‐sized values in memory in LSB (Least Significant
Byte)/MSB (Most Significant Byte) order. The 8086/8088 will store a number such as $1234
in a word of memory with the value $34 in the first byte of the word and the value $12 in the
second byte of that word. For example, the machine language image of the instruction at line
0009 consists of an opcode, $B8, followed by a word of memory containing the value $4C00. The
full text of the machine language for that instruction reads B8004C because the value is stored
in LSB/MSB order.
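The byte ordering described above can be checked with a short Python sketch. It is an illustration only: the helper name encode_mov_ax_imm16 is hypothetical, and Python's standard struct module is used here simply because its "<H" format packs a 16‐bit value low byte first, which is the same LSB/MSB order.

```python
import struct


def encode_mov_ax_imm16(value: int) -> bytes:
    """Opcode B8 followed by the 16-bit immediate, least significant byte first."""
    return bytes([0xB8]) + struct.pack("<H", value)


# The word $1234 is stored as the bytes 34 12.
print(struct.pack("<H", 0x1234).hex().upper())

# The instruction at line 0009: opcode B8, then $4C00 stored as 00 4C.
print(encode_mov_ax_imm16(0x4C00).hex().upper())
```

Reading the result left to right gives B8, 00, 4C: the opcode first, then the low byte of $4C00, then its high byte.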
The right half of an annotated program listing shows the source code of the program itself. Each
assembly language statement appears as some variation on the same basic format:
The elements of a statement must appear in their appropriate order, but no significance is
attached to the column in which an element begins. Each statement must end with a carriage
return, a line feed, or a combination of the two, but the task of managing that is really the
province of the word processor and should be transparent to the programmer. The assembler is
entirely indifferent to case. Any given token could be entered in one part of a program in lower
Keywords: A keyword is at the heart of every assembly language statement. The keyword in a
statement defines the nature of that statement. If the statement is an assembly language
instruction, the keyword will be an instruction mnemonic; if the statement is a directive, the
keyword will be the title of the directive; if the statement is a data allocation statement, the
keyword will be a data‐definition‐type.
For example, the keyword in line 0001 of program5.2 is LIST, the keyword in line 0002 is HEX,
the keyword in line 0003 is SEGMENT, and the keyword in line 0004 is MOV.
Identifiers are composed of the letters of the alphabet, the digits 0 through 9, and the special
characters @, _, ?, !, and $. The first character in an identifier, however, may not be one of the
digits 0 through 9. An identifier may not be one of the assembler’s reserved words.
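The character rules for identifiers can be expressed as a small Python check. This is a sketch of the rule as stated in this text, not the assembler's actual logic: the name is_valid_identifier is hypothetical, and the check deliberately ignores the reserved‐word restriction and the effect of the HEX directive.

```python
import re

# Letters, digits, and the special characters @ _ ? ! $ are allowed,
# but the first character may not be a digit.
# Assumption: reserved words and the HEX directive are NOT modeled here.
IDENTIFIER = re.compile(r"[A-Za-z@_?!$][A-Za-z0-9@_?!$]*")


def is_valid_identifier(token: str) -> bool:
    """Return True if the token satisfies the character rules for identifiers."""
    return IDENTIFIER.fullmatch(token) is not None


for token in ["Message", "APPLE_PIE", "2nd", "my-label"]:
    print(token, is_valid_identifier(token))
```

Here Message and APPLE_PIE pass, 2nd fails because it begins with a digit, and my-label fails because the hyphen is not one of the permitted characters.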
Comment: A comment is a string of text that clarifies the program but is not part of the
program. A semicolon identifies all subsequent text in a statement as a comment. The
assembler ignores comments and does not process them at all.
Assembly language directives are statements that describe the context in which the instructions
in a program are to be assembled into machine language and in which the data allocation
statements are to be processed into data space. The ELASS assembler supports 28 different
directives, four of which appear in program5.2: LIST, HEX, SEGMENT, and END.
The HEX directive at line 0002 of program5.2 is there to facilitate the coding of hexadecimal
values in the body of the program. That statement directs the assembler to treat tokens in the
source file that begin with a dollar sign as numeric constants in hexadecimal notation. A HEX
directive affects only the source code that follows it, so it is customary to place the HEX
directive at the beginning of the program. If the HEX directive had not been included in
program5.2, the assembler would have processed the tokens beginning with dollar signs as
identifiers instead of as numeric values in hexadecimal notation. As a result of the HEX directive
at line 0002, the assembler recognizes the token $21 in lines 0008 and 0010 and the tokens $4C00
in line 0009 and $0400 in line 0011 as hexadecimal numbers.
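The effect of the HEX directive on a single token can be modeled roughly in Python. The function classify_token and its hex_prefix parameter are hypothetical names introduced for illustration; the sketch only mirrors the behavior described above and makes no claim about how ELASS is implemented.

```python
from typing import Optional, Union


def classify_token(token: str, hex_prefix: Optional[str]) -> Union[int, str]:
    """Return an int for a hexadecimal constant, or the token itself as an identifier.

    hex_prefix models the character named by a HEX directive ("$" in program5.2);
    None models a source file that contains no HEX directive.
    """
    if hex_prefix and token.startswith(hex_prefix) and len(token) > len(hex_prefix):
        # int(..., 16) raises ValueError if the rest is not valid hexadecimal.
        return int(token[len(hex_prefix):], 16)
    return token


print(classify_token("$21", "$"))     # treated as a hexadecimal number
print(classify_token("$4C00", "$"))
print(classify_token("$21", None))    # without HEX, treated as an identifier
```

With the prefix in effect, $21 becomes the number 33 and $4C00 becomes 19456; without it, the same token is passed through unchanged as an identifier.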
A hex directive is only one of several techniques that you can employ to force the assembler to
recognize a numeric value represented in hexadecimal notation.
A segment directive defines the logical segment to which subsequent instructions and data
allocation statements belong. It also gives a segment name to the base of that segment. This is
critically important. The address of every element in a program must be represented to the
8086/8088 in segment‐relative format. That means every address must be expressed in terms
of a segment register and an offset from the base of the segment addressed by that register. By
defining the base of a logical segment, a SEGMENT directive makes it possible to set a segment
register to address that base and also makes it possible to calculate the offset of each element
in that segment from a common base.
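Segment‐relative addressing can be made concrete with a short Python sketch. On the 8086/8088 the value in a segment register counts 16‐byte paragraphs, so the 20‐bit physical address is the segment value times 16 plus the offset. The function name physical_address and the sample segment value $1234 are assumptions chosen only for illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment value and a 16-bit offset into a 20-bit address."""
    # The segment value is shifted left four bits (multiplied by 16);
    # the result wraps at 20 bits, the width of the 8086/8088 address bus.
    return ((segment << 4) + offset) & 0xFFFFF


# A hypothetical data segment loaded at paragraph $1234, datum at offset $0010.
print(hex(physical_address(0x1234, 0x0010)))
```

Because the offset is always measured from the base of the segment, the same offsets remain valid wherever the loader places the segment; only the segment register value changes.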
Typically, an 8086/8088 assembly language program will consist of three logical segments: a
code segment, a stack segment, and a data segment. Also typically, though not at all necessarily,
the three segments will be named CODE, STACK, and DATA respectively. Don’t confuse the name of
a segment with its role. The code segment is the code segment because it contains program
code, not because it is named CODE. You could edit program5.2 and replace the identifier CODE
in line 0003 with FRED or GEORGE or APPLE_PIE and the program would work just as well.
A segment directive indicates that all statements following it in the source file through and until
an ENDS (EndSegment) directive or until another segment directive are a part of that logical
segment. In program5.2 the code segment extends from line 0003 through line 0010, the stack
segment consists of line 0011, and the data segment consists of lines 0012 and 0013.
In program5.2 the end of each segment is marked implicitly by the presence of another SEGMENT
directive. Alternatively, the end of the segment can be marked explicitly with an ENDS directive:
The first segment directive in program5.2 introduces a logical segment named CODE. By default
the linker assumes that the first segment in a program is its code segment. When the linker
links a program, it makes a note in the header section of the program’s executable file
describing the location of the code segment. When DOS invokes the loader to load an
executable file into memory, the loader reads that note. As it loads the program into memory,
the loader also makes notes to itself of exactly where in memory it actually places each of the
program’s other logical segments. As the loader turns execution over to the program it has just
loaded, it sets the CS (Code Segment) register to address the base of the segment identified by
the linker as the code segment. This renders every instruction in the code segment addressable
in segment‐relative terms in the form CS:XXXX.
The linker also assumes by default that the first instruction in the code segment is intended to
be the first instruction to be executed. That instruction will appear in memory at an offset of
$0000 from the base of the code segment, so the linker passes that value on to the loader by
leaving another note in the header of the program’s executable file. The loader sets the IP
(Instruction Pointer) register to that value. This sets CS:IP to the segment relative address of the
first instruction in the program.
The architecture of the 8086/8088 is such that it automatically executes the instruction at the
address defined by CS:IP. The CPU steps through a program by executing an instruction,
updating the contents of the IP register so that CS:IP points to the next instruction, executing
that instruction, and so forth. Normally the CPU adjusts only the contents of the IP register as it
steps through a program, but a few instructions can cause the CPU to adjust both the CS
register and the IP register at once. The loader’s act of setting the CS and IP registers in
accordance with the directions of the linker turns control of the CPU over to the program it has
just loaded.
The second segment directive in program5.2 appears at line 0011. That statement defines the
program’s stack segment. The name of the stack segment, like the name of the code segment,
is altogether arbitrary. The identifier STACK to the left of the keyword SEGMENT is the
segment’s name. Line 0011 would have served just as well in this program if it had been
written:
QWERTY SEGMENT STACK $0400
One of the many reasons a program must have a stack is that the computer is continuously
carrying on several background operations that are completely transparent, even to an
assembly language programmer. Every 55 milliseconds the CPU has to drop what it is doing,
make a note of the address of the instruction it was about to execute, make a note of the state
of all its registers, and then go about updating the system clock. When it finishes servicing the
system clock, it has to read all those notes, restore all its registers, and go back to doing
whatever it was doing when the interruption occurred. All those notes are recorded on the stack.
The size of the stack segment is defined to be $0400 bytes by the set‐aside parameter in line
0011. The linker notes the location and size of a program’s stack segment in that program’s
executable file. The loader uses that information to initialize the SS (Stack Segment) register
and the SP (Stack Pointer) register just before it sets the CS and IP registers to address the first
instruction in the program. The loader sets the SS register to address the base of the stack
segment. That makes every byte in the stack segment addressable in segment‐relative format
as SS:XXXX. The loader then sets the contents of the SP register equal to the size of the stack
segment in bytes. That initializes SS:SP to address the byte of memory just beyond the last
byte in the program stack.
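Why SP starts just beyond the last byte of the stack becomes clearer with a small model. The Python class below is a hypothetical sketch, not DOS or loader code. It assumes the standard 8086 behavior that a push first decreases SP by 2 and then stores a word, low byte first, so the first word pushed lands in the last two bytes of the segment.

```python
STACK_SIZE = 0x0400  # the set-aside value from line 0011


class StackModel:
    """Minimal model of a stack segment addressed as SS:SP (offsets only)."""

    def __init__(self, size: int) -> None:
        self.memory = bytearray(size)
        self.sp = size  # the loader sets SP to the size of the stack segment

    def push(self, word: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF            # SP moves down first
        self.memory[self.sp] = word & 0xFF          # low byte at the lower address
        self.memory[self.sp + 1] = (word >> 8) & 0xFF

    def pop(self) -> int:
        word = self.memory[self.sp] | (self.memory[self.sp + 1] << 8)
        self.sp = (self.sp + 2) & 0xFFFF            # SP moves back up
        return word


stack = StackModel(STACK_SIZE)
stack.push(0x1234)
print(hex(stack.sp))   # SP now addresses the last word in the segment
```

After one push, SP holds $03FE, and the bytes at offsets $03FE and $03FF hold $34 and $12. Popping the word restores SP to $0400, the empty‐stack position.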
The third and last segment in program5.2 begins at line 0012. This is the data segment. It
contains a single data allocation statement at line 0013. Once again the name of this segment is
entirely arbitrary. It might just as well have been named RALPH, as in
RALPH SEGMENT
But if you wish to change the name of this segment, remember that its name is
referenced in line 0004, so you would have to change it there too:
The END directive at line 0014 is simply an advisory to the assembler alerting it that it has
reached the end of the program.
Data allocation statements set aside and initialize one or more bytes of memory for use as data
space. The general format for a data allocation statement is:
allocates a block of memory 17 bytes long, initializes it to the text string “Have a nice day!$”,
and assigns the identifier Message to it.
Zero, one, or two operands, and if there are two operands, they must be separated by a
comma. When a mnemonic takes two operands, the first operand is called the destination
operand, and the second operand is called the source operand. Following the mnemonic and
the operands, if any, there may be a comment, which, if present, must be preceded by a
semicolon.
The MOV (MOVe) instruction at line 0004 is the first instruction in the code segment. The
general format for a MOV instruction is:
The MOV instruction copies the contents of the source operand into the destination operand. It
is more or less like the assignment operator in a higher‐level language. The MOV instruction in
line 0004,
DATA is defined as a segment name on line 0012. The value of a segment name is the
paragraph number at which that segment is loaded by the loader.
Line 0005 reads MOV DS, AX. Here the AX register is the source operand and the DS register is the
destination operand. This instruction directs the CPU to copy the contents of the AX register
into the DS register. The combined effect of lines 0004 and 0005 is to move the segment number of
the data segment into the DS register. You may well wonder why this was not done in a single
statement:
The reason is that immediate (constant) values cannot be transferred
directly to segment registers such as DS. To load a segment register with an immediate value, we
must first transfer the value into a general‐purpose register and then copy it from that register
into the segment register. Immediate values can, however, be transferred directly to a
general‐purpose register.
This is another “move immediate to register” instruction. In this case, the immediate value of
the offset of Message is moved to the DX register. Offset Message refers to the offset of Message.
The offset of Message is the distance in bytes from the beginning of the data segment to the
first byte in Message.
The mnemonic INT stands for INTerrupt. This mnemonic appears at lines 0008 and 0010 of
program5.2. The INT instruction is a kind of subroutine call. The 8086/8088 sets up 256 special
subroutine calls called software interrupts, which are generally used by the operating system
and low‐level applications. Put somewhat loosely, INT $21 means “call special subroutine
number $21.”
INT $21 has more than 100 functions supplied by DOS for most input, output, and other
essential machine functions. Taken collectively, those functions are called DOS function calls.
The DOS function calls are numbered sequentially. The contents of the AH register are used to
specify the function to be invoked. When the INT $21 is invoked, it directs program flow to the
function whose number is then present in the AH register.
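The idea that a single interrupt fans out to many functions according to the AH register can be modeled as a table lookup. The Python sketch below is an illustration only: the table lists just the function numbers named in this text, the descriptions are paraphrased, and int21_target is a hypothetical name, not a DOS interface.

```python
# Function numbers named in this text; everything else is left unmodeled.
DOS_FUNCTIONS = {
    0x01: "keyboard input with echo",
    0x02: "display a character",
    0x08: "keyboard input without echo",
    0x09: "display a string",
    0x4C: "return control to DOS",
}


def int21_target(ah: int) -> str:
    """Return a description of the DOS function selected by the value in AH."""
    return DOS_FUNCTIONS.get(ah, f"function ${ah:02X} (not modeled here)")


print(int21_target(0x09))
print(int21_target(0x4C))
```

The program never names the function in the INT instruction itself. It loads AH first, and the same INT $21 then reaches whichever function that number selects.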
Lines 0009 and 0010 of program5.2 invoke DOS function $4C of INT $21. That function returns
system control to DOS. The protocol of DOS function $4C specifies that the AL
register must contain a return code. A return code of $00 indicates an error‐free execution.
Since the AL and AH registers are the lower and upper halves of the AX register, they can both
be set in one single instruction.
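The relationship between AX and its two halves can be verified with simple bit arithmetic. The helper names split_ax and join_ax below are hypothetical and serve only to illustrate how loading $4C00 into AX sets AH and AL at the same time.

```python
from typing import Tuple


def split_ax(ax: int) -> Tuple[int, int]:
    """Return (AH, AL): the upper and lower bytes of the 16-bit AX value."""
    return (ax >> 8) & 0xFF, ax & 0xFF


def join_ax(ah: int, al: int) -> int:
    """Rebuild AX from its two 8-bit halves."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)


ah, al = split_ax(0x4C00)
print(f"AH=${ah:02X} AL=${al:02X}")
```

Loading $4C00 into AX therefore places the function number $4C in AH and the return code $00 in AL in one step.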
This topic presents two programs designed to introduce the four basic DOS function calls for
console I/O:
Analysis of program5.3
The code in program5.3 extends the programming task introduced in program5.2. Program5.3
does two things. It reads a character from the keyboard, and it displays that character
embedded in a message that reads, “the letter you typed was x”. Program5.3 is designed to
introduce DOS function $08 and DOS function $02.
Program 5.3
Stylistically, the most striking difference between program5.2 and program5.3 is the presence
of comments in program5.3. There are two ways to document a statement of source code. A
comment can either precede the statement on a separate line, or it can be appended to the line
on which the statement appears.
The first line of program5.2 could have been documented in either of the following manners:
LIST SCR
Or
The code at lines 0010 and 0011 and at lines 0036 and 0037 is called boilerplate code.
Boilerplate code is code that is present in more or less the same form in every assembly
language program. Lines 0010 and 0011 set the DS register so that the program can access the
data segment, and lines 0036 and 0037 return processing control to DOS when the program
concludes.
DOS function $08 is invoked at lines 0014 and 0015. This function waits for input from the
keyboard and returns the ASCII value of that input in the AL register. DOS function $08 is
invoked when an INT $21 instruction is executed with the value $08 stored in the AH register.
The instruction at line 0018 copies the contents of the AL register into the BL register. This
operation is necessary because the program is going to use those contents in the call to
function $02 at lines 0026 through 0028. Before it can make that call, however, the contents of
the AL register will be contaminated (i.e., changed) by the call to function $09 at lines 0021
through 0023.
As a rule, the DOS function calls preserve the contents of all registers except for the AX register
and any other register or registers in which they explicitly return data. Consequently, the
contents of the AL register, which is part of the AX register, will be undefined after the execution
of the INT $21 instruction at line 0023, but the contents of the BL register will be unaffected.
Composing output
Program5.3 produces its output in three separate steps. First, it outputs the message “the letter
you typed was ”. Second, it outputs the character that the user typed. Third and finally, it
outputs two spaces and a period. The first and third parts of the output are constant string
images. They are generated with the same DOS function $09 that was used in program5.2. The
second part of the output uses DOS function $02 to output a single character. To invoke DOS
function $02, a program must execute an INT $21 instruction with the character to be displayed
contained in the DL register and the value $02 contained in the AH register. The sequence of
instructions at lines 0026 through 0028 does just that.
Analysis of program5.4
The code at lines 0016 through 0018 of program5.4 invokes DOS function $09 to display the
user prompt “type a letter, please. ”. This line appears on screen immediately below the
command line that invokes the program:
C:\asm>program5.4
The combined action of the carriage return/line feed (CR/LF) sequence positions the cursor at
the start of the following line. The text of the user prompt, “type a letter, please. ”, appears at
the far left of the line immediately below the command line, because that is where the
command line left the DOS cursor just before program5.4 took over.
Lines 0021 and 0022 of program5.4 invoke DOS function $01: Keyboard Input with Echo. This
function is invoked when the INT $21 instruction is executed with a value of $01 in the AH
register. DOS function $01 waits for a keystroke at the keyboard. When a key is pressed, it
returns with the ASCII code for that key stored in the AL register, and echoes that keystroke to
the video screen.
DOS functions $01 and $08 are identical except that function $01 displays the image of the key
that was pressed while function $08 does not. When the user presses a key, function $01
echoes its image to the video screen at the current location of the cursor and advances the
cursor one position to the right:
Before program5.4 displays the next line of its output, it must first generate a CR/LF sequence.
Otherwise, the next line would begin where the call to function $01 left the cursor:
There are several ways to generate a CR/LF. The most straightforward one involves using DOS
function $02 to output a carriage return character, and then using it again to output a line feed
character. The ASCII code for a carriage return is $0D. The ASCII code for a line feed is $0A. The
following code generates a CR/LF sequence:
Int $21
Int $21
The INC (Increment) instruction, which appears at line 0077, has the general form
INC operand
This instruction increases the contents of the operand by 1. In this case INC DL adds 1
to the contents of the DL register. The combined effect of the instructions at lines 0025 and
0076 is to set the contents of the DL register to the ASCII code for the key read in by the call to
DOS function $01 at lines 0021 and 0022.
The collective effect of lines 0075 through 0078 is to display the image of the letter that is
alphabetically one position after the letter whose ASCII code is in the BL register.
When you execute program5.4 from the DOS prompt, the screen appears something like this:
C:\asm>program5.4
The DEC (Decrement) instruction is the counterpart of the INC instruction: it decreases the contents of its operand by 1. The
general format for a DEC instruction is:
DEC operand
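The effect of INC and DEC on a register can be traced with a short fragment such as the one below. The starting value $41 is an illustrative choice, and the HEX $ directive is assumed to be in effect.

```asm
mov dl, $41   ; DL = $41, the ASCII code for 'A'
inc dl        ; DL = $42, the ASCII code for 'B'
dec dl        ; DL = $41 again
dec dl        ; DL = $40, the ASCII code for '@'
```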
The control transfer instructions consist of calls, returns, jumps, loops, and interrupts. These
instructions intercept the flow of program control and redirect it elsewhere in a program. They
make it possible for a program to branch, to loop, and to execute subroutines.
The control transfer instructions are a special class of instructions that, when executed, adjust
the contents of the IP register and in some cases the contents of the CS register as well. By the
time the CPU has finished executing one of these instructions, CS:IP is no longer necessarily
pointing to the next instruction in sequence, but may point to an instruction somewhere else in
the program.
4.8.1 Subroutines
Program5.4 is an example of a program that could be improved with the use of a subroutine. As it
now stands, program5.4 contains the same fragment of code twice. The code that generates the
CR/LF sequence at lines 0028 through 0033 appears again at lines 0051 through 0056. The same
program could have been coded more succinctly as it appears in program5.5. In program5.5 the
CR/LF generating code appears as a subroutine at lines 0080 through 0089. This subroutine
appears only once, but it is called from two separate points in the program, once at line 0032
and again at line 0048.
Procedures
The program code in program5.5 is divided into two separate procedures: a procedure named
MAIN, which runs from line 0014 through line 0078, and a procedure named CRLF, which runs
from line 0080 through line 0089. Each procedure begins with a PROC directive of the form
procname PROC
and ends with an ENDP directive of the form
procname ENDP
The CALL instruction is used to invoke the code in a subroutine. The general format for an
instruction that calls a subroutine in a procedure is
CALL procname
program5.5
0001 list scr
0002 ;**********program5.5***********************************************
0003 ;*this program is the same as program5.4 except that *
0004 ;* it employs Procedures and Calls. *
0005 ;* *
0006 ;*Asks the user to input a letter from *
0007 ;*the keyboard and responds: *
0008 ;* "The letter you typed was x ." (CR/LF) *
0009 ;* "The letter after x is y ." *
0010 ;*******************************************************************
0011 hex $
0012 code segment
0013 ;*******************************************************************
0014 main proc
0015 ;set the DS register.
0016 0000 B8XXXX mov ax,data
0017 0003 8ED8 mov ds,ax
0018
0019 ;display user prompt
0020 0005 B409 mov ah, $09
0021 0007 BAXXXX mov dx, offset user_promt
0022 000A CD21 int $21
0023
0024 ; read keyboard with echo
0025 000C B401 mov ah, $01
The CPU does several things in the course of executing a CALL instruction. First, it adjusts the
contents of IP register as if it were about to execute the next instruction, but then, instead of
going on to do so, it records the contents of the IP register in the program stack. Then it adjusts
the contents of the IP register again, this time to point to the first instruction in the named
procedure.
A RET (RETurn) instruction transfers program flow back from a subroutine to its caller. When
the CPU encounters a RET instruction, it recovers the return address that the CALL instruction
left on the program stack. Then it puts that address into the IP
register and continues execution. This gets the CPU back to where it was when the CALL
instruction sidetracked it.
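The round trip made by CALL and RET can be sketched as follows. This is an illustrative fragment meant to sit inside a code segment, not a copy of program5.5; the segment directives are omitted and the HEX $ directive is assumed to be in effect.

```asm
main proc
        call crlf        ; push the address of the next instruction, then jump to crlf
        mov  ax, $4C00   ; execution resumes here after RET; exit to DOS
        int  $21
main endp

crlf proc
        mov  ah, $02     ; DOS function $02: character output
        mov  dl, $0D     ; carriage return
        int  $21
        mov  ah, $02
        mov  dl, $0A     ; line feed
        int  $21
        ret              ; pop the saved address into IP and resume in main
crlf endp
```

Because the return address is saved at the moment of the call, the same procedure can be called from any number of places and will always return to the instruction after the CALL that invoked it.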
Placements of procedures
4.8.2 Jumps
A program jump transfers program flow to the instruction at some specified location in
memory. An assembly language jump is analogous to a GOTO command in a higher‐level
language. The format for an unconditional jump to an address specified by an assembly
language label is:
JMP label
where label is a program address identifier. A label consists of any valid assembler identifier
followed by a colon. Any instruction statement can be prefixed by a label. The assembler treats
a reference to a label as a reference to the address of the instruction to which that label is
affixed. The JMP instruction in the following sequence transfers program flow to the MOV
instruction labeled AX_ZERO, skipping the instructions in between:
JMP AX_ZERO
.
.
.
AX_ZERO: MOV AX, $0000
4.8.3 Branches
A program branch is a point in a program at which program flow can continue in either of two
paths. The path actually taken at a branch is selected under program control based on the state
of some condition. In higher level languages program branches are usually represented as
IF/THEN constructs. In a statement such as
IF A==B THEN GOTO 100
IF A==B is the test clause and THEN GOTO 100 is the operational clause. In the assembly language
analogue of an IF/THEN statement, the role of the test clause is performed by one instruction
and the role of the operational clause is performed by another. The 8086/8088 mediates the
transfer of information concerning the result of the test from the first instruction to the second
via the flag register.
The flag register is a 16-bit register. Six of its bits are devoted to status flags and three
are devoted to control flags. The remaining seven bits in the flag register are undefined.
Generally speaking, the CPU adjusts the status flags in the course of executing arithmetic
operations. The status flags reflect the outcome of an operation and record information about
that outcome in a manner that makes it accessible to subsequent
instructions. The control flags control the operation of the CPU in certain circumstances.
In general, most of the instructions that perform arithmetic calculations such as addition or
subtraction adjust some subset of the status flags to reflect the outcome of their operation.
Two flags of particular interest to programmers are the zero flag and the carry flag. In general the
zero flag is set when an arithmetic operation produces a result of zero and cleared
when the result is nonzero. The carry flag is set when an operation produces a carry out of, or
a borrow into, the most significant bit of the result and cleared otherwise.
Comparison
The CMP (CoMPare) instruction is frequently used to compare two values and to adjust the
status flags accordingly. The general format for the CMP instruction is:
CMP destination, source
The CMP instruction accomplishes its task by subtracting the value represented by the contents
of the source operand from the value represented by the contents of the destination operand,
but it does not store the result of that subtraction or affect the contents of either operand. It
merely reflects the status of the result it obtains in the status flags. If the contents of the
destination operand are equal to the contents of source operand, the zero flag will be set;
otherwise the zero flag will be cleared. If the contents of the destination operand are below the
contents of the source operand, the carry flag will be set; if the contents of the destination
operand are above or equal to the contents of the source operand, carry flag will be cleared.
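The three possible outcomes can be traced with a small fragment like the one below. The values are illustrative, and the HEX $ directive is assumed to be in effect. AL still holds $05 after every comparison, because CMP discards the result of its subtraction.

```asm
mov al, $05
cmp al, $05   ; 5 - 5 = 0: zero flag set, carry flag cleared
cmp al, $09   ; 5 - 9 needs a borrow: zero flag cleared, carry flag set
cmp al, $03   ; 5 - 3 = 2: zero flag cleared, carry flag cleared
```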
Conditional jumps
A conditional jump instruction tests the state of some specified status flag or flags and directs
program flow accordingly. In assembly language a conditional jump resembles the second half
of an
IF ... THEN GOTO ...
statement in a higher level language. In 8086/8088 assembly language, this pair of instructions:
CMP AX, BX
JZ Label
transfers program flow to Label if the contents of AX are equal to the contents of BX. The
general format for a conditional jump instruction is:
Jxxx label
Where xxx describes the condition for which you are testing. The syntax of 8086/8088 assembly
language supports 31 conditional jump instructions. Fifteen of those instructions test a status
Six of the conditional jumps are specifically designed for branching based on the outcome of
arithmetic comparisons of unsigned numbers.
JB Jump if Below
JBE Jump if Below or Equal
JE Jump if Equal
JNE Jump if Not Equal
JAE Jump if Above or Equal
JA Jump if Above
These six conditional jumps test either the Zero flag or the carry flag or both the Zero and Carry
flags. For example, the test performed by the JB instruction will prove true if the carry flag is
set, and the test performed by the JBE instruction will prove true if either the carry flag or the
zero flag is set.
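As one possible use of these unsigned jumps, the sketch below echoes the character in AL only if it is a decimal digit. The label not_digit is a hypothetical name, AL is assumed to already hold an ASCII code, and the HEX $ directive is assumed to be in effect.

```asm
        cmp al, $30      ; compare AL with '0'
        jb  not_digit    ; carry flag set: AL is below '0'
        cmp al, $39      ; compare AL with '9'
        ja  not_digit    ; carry and zero flags both clear: AL is above '9'
        mov dl, al       ; AL is in the range '0' through '9'
        mov ah, $02
        int $21          ; display the digit with DOS function $02
not_digit:
```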
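Four of the other conditional jumps are variations on those six instructions:
JNAE Jump if Not Above or Equal
JNA Jump if Not Above
JNB Jump if Not Below
JNBE Jump if Not Below or Equal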
But each of these variations is really just a synonym for one of the original six. JNB, for example,
is equivalent to JAE.
The 31st conditional jump is a bit anomalous. Instead of testing the status flags, this instruction
tests the contents of the CX register for Zero. The mnemonic for this instruction is JCXZ. It
forces a jump to the address of an indicated label if the contents of the CX register are zero.
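A hedged sketch of JCXZ in use: the fragment below displays Z if CX is zero and N otherwise. The labels cx_is_zero and show_it and the choice of letters are hypothetical, and the HEX $ directive is assumed to be in effect.

```asm
        jcxz cx_is_zero  ; no CMP is needed: JCXZ examines CX directly
        mov  dl, $4E     ; 'N': CX is not zero
        jmp  show_it
cx_is_zero:
        mov  dl, $5A     ; 'Z': CX is zero
show_it:
        mov  ah, $02     ; DOS function $02: character output
        int  $21
```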
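A two-way branch based on a comparison of AX and BX can be coded with a conditional jump and
an unconditional jump:
CMP AX, BX
JZ EQUAL
. ; Statements executed if AX is not equal to BX
.
JMP END_IF
EQUAL: . ; Statements executed if AX is equal to BX
.
END_IF: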
A program loop can be implemented with either an explicit counter or an implicit counter.
Explicitly counted loops usually rely upon the fact that the DEC (Decrement) instruction not only
subtracts 1 from the contents of its operand but also sets the Zero flag if that subtraction
results in a zero and clears the zero flag if it does not. The skeleton of an explicitly counted loop
might look like this:
Loop_Label:
DEC CX
JNZ Loop_Label
The DEC CX instruction will subtract 1 from the contents of the CX register and adjust the Zero
flag accordingly. Program flow will continue to cycle through the loop until the contents of the
CX register are reduced to zero.
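Filling in that skeleton with a loop body gives something like the following sketch, which displays five asterisks. The label next_star, the count, and the character are illustrative choices, and the HEX $ directive is assumed to be in effect.

```asm
        mov cx, $0005    ; loop counter: five passes
next_star:
        mov ah, $02      ; DOS function $02: character output
        mov dl, $2A      ; ASCII code for '*'
        int $21
        dec cx           ; count down; sets the zero flag when CX reaches zero
        jnz next_star    ; repeat while CX is not zero
```

Note that the counter must be loaded before the loop label; if the MOV CX instruction were inside the loop, CX would be reset on every pass and the loop would never end.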
Example 4.1
Write a program that displays the following output using a loop, without using any data allocation
statements:
ABCD
ABCD
ABCD
ABCD
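list scr
hex $
code segment
main proc
mov ax,data
mov ds,ax
mov cx,$04 ;outer counter: four lines
loop_outer:
mov bl,$04 ;inner counter: four letters per line
mov bh,$41 ;start each line with 'A'
loop_inner:
mov dl,bh
mov ah,$02 ;display the current letter
int $21
inc bh ;advance to the next letter
dec bl
jnz loop_inner
call CRLF
dec cx
jnz loop_outer
;exit to DOS
mov ax,$4c00
int $21
main endp
;generate CR/LF
CRLF proc
mov ah,$02
mov dl,$0d
int $21
mov ah,$02
mov dl,$0a
int $21
ret
CRLF endp
code ends
data segment
data ends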
• Translators that take an entire program and translate it as a body into machine
language are called compilers.
• Translators that process programs one line at a time are called interpreters.
• Special purpose translators that are specifically designed to translate assembly language
programs into machine language are called assemblers.
• Assembling a program converts its source code into an OBJect file. An OBJect file
contains the machine language image of the source code of a program in skeletal form.
• There are three kinds of statements in the source code of an 8086/8088 assembly
language program: instruction statements, data allocation statements, and directives.
• A comment is a string of text that explains the program but is not part of the
program. A semicolon identifies all subsequent text in a statement as a comment.
• The HEX directive directs the assembler to treat tokens in the source file that begin with
a dollar sign as numeric constants in hexadecimal notation. A HEX directive affects
only the source code that follows it, so it is customary to place the HEX directive at the
beginning of the program. If the HEX directive had not been included in program5.2,
the assembler would have processed the tokens beginning with dollar signs as
identifiers instead of as numeric values in hexadecimal notation.
• A segment directive defines the logical segment to which subsequent instructions and
data allocation statements belong. It also gives a segment name to the base of that
segment.
• The first segment directive in the program introduces a logical segment named CODE. By
default the linker assumes that the first segment in a program is its code segment.
• The second segment directive in the program defines the program’s stack segment.
• The third and last segment in the program is a data segment. It contains a single
data allocation statement.
• The general format for a data allocation statement is :
{ Varname } data‐definition‐type { init {{,init}} }
• The MOV instruction copies the contents of the source operand into the destination
operand
MOV destination, source
• The offset of Message is the distance in bytes from the beginning of the data segment to
the first byte in Message.
Mov dx, Offset message
• DOS function $01: keyboard input with echo.
• DOS function $02: character output.
• DOS function $08: keyboard input without echo.
Or
as appropriate.
The point of this exercise is to give you practice with program loops, so try to write this
program without using any data allocation statements.
14. Write a program that displays the following ten lines of output:
The point of this exercise is to give you practice with program loops, so try to write this
program without using any data allocation statements.