CH-06 8086 Assembly
(ECEg-4161)
Lecture 06
Assembler
An assembler is a program that converts source-code programs written in
assembly language into object files in machine language.
Popular assemblers include MASM, NASM, the GNU Assembler (GAS) and EMU8086.
Linker
A linker program combines your program's object file created by the assembler
with other object files and link libraries, producing a single executable
program.
You need a linker utility to produce executable files.
8086 Assembly language
Each personal computer has a microprocessor that manages the
computer's arithmetical, logical, and control activities.
Each family of processors has its own set of instructions for handling various operations. This
set of instructions is called 'machine language instructions'.
Assembly language is a low-level language designed for a specific family of processors; it
represents the various instructions in symbolic code, which is a more understandable form.
Advantages of Assembly Language
• It requires less memory and execution time;
• It allows hardware-specific complex jobs in an easier way;
• It is suitable for time-critical jobs;
• It is most suitable for writing interrupt service routines and other memory
resident programs.
Assembly language instructions equate one-to-one to machine-level instructions, except they
use a mnemonic to assist the memory.
Each instruction is represented by one assembly language statement.
There may be only one statement per line.
A statement may start in any column.
A statement is either
• An 8086 instruction: Executable statements that actually do something or
• An Assembler directive: provides directions to the assembler program.
Example:
Start: MOV AX, BX ; copy BX into AX
INC CX ; increment CX
RET ; return from procedure
In the general statement format, [label:] mnemonic [operands] [;comment], the entities enclosed in square brackets are optional.
Start is a user-defined name; put a label in a statement only when
necessary.
The symbol : is used to indicate that it is a label.
The symbol ; is used to indicate that it is a comment.
Format of 8086 Instruction...
Names/labels Field
Used for instruction labels, procedure names, and variable names.
Names are 1 to 31 characters long and may include letters, digits and the special characters ? . @ _ $ %
Names are case insensitive.
Names can be used as labels for statements, acting as place markers that indicate where to jump to.
A label can also stand alone on an otherwise blank line.
• For example:
label2:
Comments field
Any text can be written after the statement as long as it is preceded by a semicolon (;).
A semicolon marks the beginning of a comment.
A semicolon at the beginning of a line makes the whole line a comment.
Good programming practice dictates a comment on every line.
Assembler directives (Preprocessors)
Assembler directives (pseudo-instructions) give directions to the assembler about how it
should translate the assembly language instructions into machine code.
They provide information that assists the assembler in producing executable code.
They can also create storage for a variable and initialize it.
Assembler directives are specific for a particular assembler.
Used to:
• Specify the start and end of a program
• Attach values to variables
• Allocate storage locations for input/output data
• Define the start and end of segments, procedures, macros, etc.
Example
ASSUME CS:ACODE, DS:ADATA ; tells the assembler that the instructions of the program are stored in
                          ; the segment ACODE and the data are stored in the segment ADATA
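A minimal sketch of how the segment, ASSUME and END directives fit together in one module, assuming MASM-style full segment definitions. ACODE and ADATA come from the example above; the variable COUNT, the label START and the DOS exit call (function 4CH) are illustrative additions, not part of the lecture.

```asm
ADATA   SEGMENT                 ; start of the data segment
COUNT   DW   5                  ; hypothetical 16-bit variable
ADATA   ENDS                    ; end of the data segment

ACODE   SEGMENT                 ; start of the code segment
        ASSUME CS:ACODE, DS:ADATA
START:  MOV  AX, ADATA          ; ASSUME does not load DS, so load it explicitly
        MOV  DS, AX
        MOV  CX, COUNT          ; the assembler addresses COUNT through DS
        MOV  AX, 4C00H          ; DOS function 4CH: terminate, return code 0
        INT  21H
ACODE   ENDS                    ; end of the code segment
        END  START              ; end of module; execution begins at START
```

ASSUME only informs the assembler which segment register to use for each name; the two instructions at START are what actually point DS at the data segment.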
EXTRN
Used to tell the assembler that the names or labels following the directive are in
some other assembly module.
GLOBAL
• Can be used in place of the PUBLIC or EXTRN directives. It is used to make the symbol
available to other modules.
INCLUDE
• Includes source code from a file.
• Example: INCLUDE "program_2.asm"
LENGTH
• Is an operator, which tells the assembler to determine the number of elements
in some named data item, such as a string or an array.
• Example: MOV CX, LENGTH string1 ; CX = number of elements in string1
NAME
Used to give a specific name to each assembly module when programs consisting of
several modules are written.
Example: NAME “Main”
OFFSET
Is an operator which tells the assembler to determine the offset or the displacement of a
named data item (variable) or procedure from start of the segment which contains it.
Example: MOV BX, OFFSET TABLE
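The two operators are often used together to walk through an array. The sketch below assumes MASM-style syntax; the segment names, the label FILL and the stored value 0FFFFH are hypothetical. In classic MASM, LENGTH reports the DUP count of the definition, so check how your own assembler treats items defined without DUP.

```asm
ADATA   SEGMENT
TABLE   DW   10 DUP(0)          ; ten words, all zero
ADATA   ENDS

ACODE   SEGMENT
        ASSUME CS:ACODE, DS:ADATA
START:  MOV  AX, ADATA
        MOV  DS, AX
        MOV  BX, OFFSET TABLE   ; BX = offset of TABLE from the start of ADATA
        MOV  CX, LENGTH TABLE   ; CX = number of elements (the DUP count)
FILL:   MOV  WORD PTR [BX], 0FFFFH ; store a value in the current element
        ADD  BX, 2              ; advance to the next word
        LOOP FILL               ; decrement CX and repeat until CX = 0
        MOV  AX, 4C00H          ; DOS function 4CH: terminate program
        INT  21H
ACODE   ENDS
        END  START
```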
PROC
Indicates the beginning of a procedure
ENDP
End of procedure
FAR
Intersegment call
NEAR
Intra-segment call
MACRO
Indicate the beginning of a macro
ENDM
End of a macro
General form: <Macro-name> MACRO [<Arg 1> <,Arg 2>…<,Arg n>], followed by the body and closed by ENDM.
Defining Data in a program
Data is usually stored in the data segment.
You can define constants, work areas (a chunk of memory).
Data can be defined in different lengths (8-bit, 16-bit)
Each character is stored in one byte as its ASCII code.
The definition for data:
<Name> Dx <Expression>
where Dx is DB (define byte, 8 bits), DW (define word, 16 bits) or DD (define doubleword, 32 bits).
The EQU directive does not define a data item; instead, it defines a value that the assembler
can substitute in other instructions (similar to defining a constant in C
programming with #define).
FACT EQU 12 ; defines FACT as a named constant
MOV CX, FACT ; loads the value 12 directly, without dereferencing memory
No memory is allocated.
Strings are also possible.
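As a rough illustration of these definitions, the sketch below assumes an emu8086-style .COM layout (ORG 100h, with RET returning to DOS). Apart from FACT, every name (VAL8, VAL16, LETTERS, ARRAY, START) is hypothetical.

```asm
        ORG  100h               ; .COM program: code starts at offset 100h
        JMP  START              ; skip over the data so it is never executed

FACT    EQU  12                 ; named constant, occupies no memory
VAL8    DB   25H                ; one byte
VAL16   DW   1234H              ; one word, stored low byte first (34H, 12H)
LETTERS DB   'AB'               ; two bytes holding the ASCII codes 41H, 42H
ARRAY   DB   FACT DUP(0)        ; work area of FACT zero bytes

START:  MOV  AL, VAL8           ; load the byte variable into AL
        MOV  BX, VAL16          ; load the word variable into BX
        MOV  CX, FACT           ; FACT is substituted as an immediate value
        RET                     ; return to DOS
```

The first two MOV instructions read memory, while the third is assembled as a move of the immediate value 12, which is why EQU allocates nothing.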
User defined Data definition directives
Structure:
Struct myStruct ;declares myStruct as a structure
var1 DB 0 ; Var1 data byte initialized with 0
var2 DB 1 ; Var2 data byte initialized with 1
Ends myStruct
Structure variable:
structVar myStruct ? ;creates structure variable
Accessing a structure member:
MOV [structVar.var1], 20 ;move 20 into var1 of structVar
Program Examples
Example 1
Determine the contents of the general-purpose registers and flags after each of these programs.
Program 1:
ORG 40H
MOV AX,1234H
MOV BX,5678H
PUSH AX
PUSH BX
POP AX
POP BX
AX ?
BX ?
Program 2:
JMP START
V DB 5, 8, 12, 1
START:
MOV AL, 0
MOV BX, 0
MOV DL, 24
DO: ADD AL, V[BX]
INC BX
CMP AL, DL
JLE DO
Program 3:
JMP START
NAME DB "Abebe"
START:
MOV SI, OFFSET NAME
INC SI
LODSB
MOV CX, 05H
REPEAT: INC SI
CMP AL, [SI]
LOOPE REPEAT
• The CPU pushes the current program address and the status flags onto the stack before executing the interrupt service routine (ISR).
• In DOS, INT 21H is used for basic I/O functions such as display and print.
DOS function calls are used for standard input/output in 8086 assembly language.
To use a DOS function call in a DOS program,
1. Place the function number in AH (8 bit register) and other data that might be necessary
in other registers.
2. Once everything is loaded, execute the INT 21H instruction to perform the task.
After execution of a DOS function, it may return results in some specific registers.
Standard I/O
INT 21h/01H: Read from the keyboard
Reads a character from standard input; the result is returned in AL as an ASCII code.
If there is no character in the keyboard buffer, the function waits until any key is
pressed.
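A minimal sketch of function 01H, assuming an emu8086-style .COM layout (ORG 100h, with RET returning to DOS). The variable KEY and the label START are hypothetical names used only to show where the result ends up.

```asm
        ORG  100h               ; .COM program: code starts at offset 100h
        JMP  START              ; skip over the data
KEY     DB   ?                  ; hypothetical variable that keeps the key

START:  MOV  AH, 01H            ; DOS function 01H: read a character, with echo
        INT  21H                ; waits for a key; its ASCII code returns in AL
        MOV  KEY, AL            ; save the character for later use
        RET                     ; return to DOS
```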
Example (INT 21h/02H, which displays the character held in DL):
MOV DL, 'A' ;load the ASCII code of the character 'A' in DL
MOV AH,02H ;load DOS function number in AH
INT 21H ;access DOS
INT 21h/09H: Display a character String
This function displays a character string on the standard output.
The character string must end with the '$' character (ASCII 24H).
The character string can be of any length and may contain control characters such as
carriage return (0DH) and line feed (0AH).
DX must contain address of the character string.
Example:
Buf DB "Hello World$" ;define character string
MOV DX, offset Buf ;load address of the string in DX, offset gives address of the Buf.
MOV AH,09H ;load DOS function number in AH
INT 21H ;access DOS
INT 21h/0AH: Buffered keyboard input
This function continues to read the keyboard (displaying data as typed) until either the
specified number of characters are typed or until the enter key is typed.
The first byte of the buffer contains the size of the buffer (up to 255).
The second byte is filled with the number of characters typed upon return.
The third byte through the end of the buffer contains the character string typed, followed
by a carriage return (0DH).
Buffer e.g.: Buf DB 13, 10, "Welcome$"
Example:
Buf DB 10, ?, 10 dup(0) ;declare a buffer.
MOV DX, offset Buf ;load address of the buffer in DX, offset gives address of the Buf.
MOV AH,0AH ;load DOS function number in AH
INT 21H ;access DOS
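Functions 0AH and 09H can be combined to read a line and print it back. The sketch below assumes an emu8086-style .COM layout; the names NCHARS, CHARS, CRLF and START are hypothetical, and the buffer is split into three definitions only to give each field its own name.

```asm
        ORG  100h               ; .COM program: code starts at offset 100h
        JMP  START              ; skip over the data

BUF     DB   20                 ; byte 0: buffer size, including the final 0DH
NCHARS  DB   ?                  ; byte 1: DOS stores the number of characters typed
CHARS   DB   20 DUP(?)          ; byte 2 onward: the characters, then 0DH
CRLF    DB   0DH, 0AH, '$'      ; carriage return + line feed string

START:  MOV  DX, OFFSET BUF     ; DX = address of the input buffer
        MOV  AH, 0AH            ; DOS function 0AH: buffered keyboard input
        INT  21H
        MOV  BL, NCHARS         ; BX = number of characters typed
        MOV  BH, 0
        MOV  BYTE PTR CHARS[BX], '$' ; replace the trailing 0DH with '$'
        MOV  DX, OFFSET CRLF    ; move the cursor to the start of the next line
        MOV  AH, 09H
        INT  21H
        MOV  DX, OFFSET CHARS   ; print the text that was typed
        MOV  AH, 09H
        INT  21H
        RET                     ; return to DOS
```

The count in the second byte does not include the carriage return, so CHARS[BX] is exactly the position of the 0DH that function 09H needs replaced by its '$' terminator.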
Example:
mov al, 5 ; 05h, or 00000101b
mov bl, 10 ; 0Ah, or 00001010b
add bl, al ; 5 + 10 = 15 (decimal), 0Fh, or 00001111b
; print result in binary
mov cx, 8 ; for the 8 bits
print: mov ah, 2 ; print function
mov dl, '0'
test bl, 10000000b ; test first bit.
jz zero
mov dl, '1'
zero: int 21h ; print to console
shl bl, 1
loop print
; print binary suffix:
mov dl, 'b'
int 21h

Example:
; print the string "hello world"
mov dx, offset msg
mov ah, 9
int 21h
; wait for any key press
mov ah, 1
int 21h
ret ; return to the operating system
msg db "hello world ", 0DH, 0AH, "$" ; data placed after ret so it is never executed
N.B
Carriage return (ASCII 0DH) is the control character that brings the cursor to the start of the line.
Line feed (ASCII 0AH) is the control character that brings the cursor down to the next line on the screen.
Procedure and MACRO
While writing programs, a particular sequence of instructions may be
used several times throughout a program.
To avoid writing the sequence of instructions again and again in the program, the same
sequence can be written once and reused in one of two ways:
• Use a separate procedure
• Use a macro
Procedure/Subroutines
Write it in a separate subprogram and call that subprogram whenever necessary.
Avoid writing the same sequence of instruction again and again.
CALL instruction transfers the execution control to procedure.
In the procedure, a RET instruction is used at the end.
This will cause the execution to be transferred to caller program.
At the time of invoking a procedure the address of the next instruction of the program is kept on
the stack so that, once the flow of the program has been transferred and the procedure is done,
one can return to the next line of the original program, the one which called the procedure.
• Intra-segment procedure (IP saved on the stack)
• Intersegment procedure (CS:IP saved on the stack)
The PROC and ENDP directives indicate the start and end of a procedure (subroutine).
Both the PROC and ENDP directives require a label to indicate the name of the procedure.
Syntax:
<Procedure-name> PROC [NEAR|FAR] ;starts procedure
.. ;body of the procedure
..
RET
<Procedure-name> ENDP ;close of the procedure
NEAR | FAR is optional and gives the type of the procedure. If it is omitted, the assembler assumes
a near procedure. All procedures are defined in the code segment. The directive ENDP
indicates the end of a procedure.
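A minimal sketch of a near procedure and two calls to it, assuming an emu8086-style .COM layout (ORG 100h, with RET returning to DOS). The procedure name PUTCH is hypothetical; it wraps DOS function 02H, which displays the character held in DL.

```asm
        ORG  100h               ; .COM program: code starts at offset 100h
        MOV  DL, 'A'
        CALL PUTCH              ; pushes the return address (IP), jumps to PUTCH
        MOV  DL, 'B'            ; execution resumes here after the RET in PUTCH
        CALL PUTCH
        RET                     ; return to DOS

PUTCH   PROC NEAR               ; starts the procedure
        MOV  AH, 02H            ; DOS function 02H: display the character in DL
        INT  21H
        RET                     ; pops the return address back into IP
PUTCH   ENDP                    ; closes the procedure
```

The body of PUTCH exists once in memory no matter how many times it is called; each CALL costs only the stack push and the jump.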
Advantages:
• Programming becomes simple.
• Reduced development time as each module can be implemented by different
persons.
• Debugging of smaller programs and procedures is easy.
• Reuse of procedures is possible.
• A library of procedures can be created to use and distribute.
Disadvantages:
• Extra code may be required to integrate procedures.
• Linking of procedures may be required.
• The processor needs to do extra work to save the status of the current procedure and load
the status of the called procedure. The instruction queue must be emptied so that the
instructions of the procedure can be loaded into it.
Macros
A macro is a group of instructions that perform one task, just as a procedure
performs one task.
The difference is that a procedure is accessed via a CALL instruction, whereas a
macro, with all the instructions defined in it, is inserted in the program at
the point of usage.
Creating a macro is very similar to creating a new opcode, which is actually a
sequence of instructions, in this case, that can be used in the program.
You type the name of the macro and any parameters associated with it, and the
assembler then inserts them into the program.
Macro sequences execute faster than procedures because there is no CALL or RET
instruction to execute.
The instructions of the macro are placed in your program by the assembler at the
point where they are invoked.
Using macro avoids the overhead time involved in calling and returning
from a procedure.
One drawback of macros is that in-line code is generated at every point of use, which
consumes more memory.
The MACRO and ENDM directives delineate a macro sequence.
The first statement of a macro is the MACRO directive, which contains the name of the
macro and any parameters associated with it.
An example is MOVE MACRO A,B
which defines the macro name as MOVE. This new pseudo opcode uses two parameters: A
and B.
The last statement of a macro is the ENDM directive, which is placed on a line by itself.
Never place a label in front of the ENDM statement. If a label appears before ENDM, the
macro will not assemble.
Syntax:
<Macro-name> MACRO [<Arg 1> <,Arg 2>…<,Arg n>]
..
ENDM
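As an illustration, the MOVE macro named above could be given a body such as the one below. This is a sketch under assumptions: the body, the variables VAR1 and VAR2 and the .COM layout (ORG 100h, emu8086 style) are illustrative and not taken from the lecture. The 8086 has no memory-to-memory MOV, so the macro routes the word through AX and preserves AX around the copy.

```asm
        ORG  100h               ; .COM program: code starts at offset 100h

MOVE    MACRO A, B              ; macro header: name MOVE, parameters A and B
        PUSH AX                 ; preserve AX
        MOV  AX, B              ; load the source word
        MOV  A, AX              ; store it in the destination
        POP  AX                 ; restore AX
        ENDM                    ; end of macro, with no label in front of it

        JMP  START              ; skip over the data
VAR1    DW   0                  ; hypothetical destination variable
VAR2    DW   1234H              ; hypothetical source variable

START:  MOVE VAR1, VAR2         ; expands in line to the four instructions above
        MOVE BX, VAR1           ; a parameter can also be a register
        RET                     ; return to DOS
```

Each use of MOVE inserts its own copy of the four instructions, which is the memory cost that macros trade for avoiding CALL and RET.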
Main parts of a macro:
• MACRO header (MACRO)
• Text or body
• Pseudo-instruction marking the end of the macro (e.g., ENDM)