Comparch
2
Learning assembly language programming helps in
understanding the operations of the microprocessor.
It allows faster and shorter programs:
compilers do not always generate optimum code.
Small controllers are embedded in many products; they
have specialized functions and
rely heavily on input/output functionality.
3
MASM
Microsoft: Macro Assembler
TASM
Borland: Turbo Assembler
NASM
Free, open source (originally LGPL, now 2-clause BSD): Netwide Assembler
4
.model small
.stack 100h
.data
message db 'Hello World', 13, 10, '$'
.code
start:
mov ax, @data
mov ds, ax
mov dx, offset message ; copy address of message to dx
mov ah, 9h ; string output
int 21h ; display string
mov ax, 4c00h
int 21h
end start
5
TITLE PRGM1
.MODEL SMALL
.STACK 100H
.DATA
A DW 2
B DW 5
SUM DW ?
.CODE
MAIN PROC
; initialize DS
MOV AX, @DATA
MOV DS, AX
; add the numbers
MOV AX, A
ADD AX, B
MOV SUM, AX
; exit to DOS
MOV AX, 4C00H
INT 21H
MAIN ENDP
END MAIN
6
An instruction is a statement that becomes
executable when a program is assembled.
The assembler translates it into machine code.
An instruction contains:
Label (optional)
Mnemonic (required)
Operand (depends on the instruction)
Comment (optional)
Basic syntax
[label:] mnemonic [operands] [ ; comment]
Eg
Start: mov ax,bx ; this instru…
7
Labels act as place markers
A label marks the address (offset) of code or data
Follow identifier rules
Data label
must be unique
example: myArray (not followed by colon)
count DWORD 100
Code label
target of jump and loop instructions
example: L1: (followed by colon)
target:
mov ax, bx
…
jmp target
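The two kinds of label can be seen side by side in a small program. This is a minimal sketch, assuming MASM-style 16-bit syntax and the small memory model used elsewhere in these slides; the names total and next_n are made up for illustration.

```asm
.model small
.stack 100h
.data
total   dw 0            ; data label: no colon, names a word in memory
.code
start:                  ; code label: colon, names an instruction address
        mov ax, @data
        mov ds, ax
        mov cx, 5       ; loop counter
        mov ax, 0       ; running sum
next_n:                 ; code label used as the loop target
        add ax, cx      ; adds 5, 4, 3, 2, 1 in turn
        loop next_n     ; decrement CX, jump to next_n while CX != 0
        mov total, ax   ; store the sum (15) through the data label
        mov ax, 4c00h   ; exit to DOS
        int 21h
end start
```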
8
Instruction Mnemonics
memory aid
examples: MOV, ADD, SUB, MUL, INC, DEC
Operands
constant 96
constant expression 2+4
register ax
memory (data label) count
9
•No-operand mnemonics
•E.g.
•stc ; set Carry flag
•One-operand mnemonics
•inc ax ; add 1 to AX
•Two-operand mnemonics
•mov count, bx ; move BX to count
10
Comments are good!
explain the program's purpose
when it was written, and by whom
revision information
tricky coding techniques
application-specific explanations
Single-line comments
begin with semicolon (;)
11
.model small
.stack 100h
.data
message db 'Hello World', 13, 10, '$'
.code
main proc
mov ax, @data
mov ds, ax
mov dx, offset message ; copy address of message to dx
mov ah, 9h ; string output
int 21h ; display string
mov ax, 4c00h
int 21h
main endp
end main
12
Directives are commands that are recognized and acted upon by
the assembler
Not part of the instruction set
Used to declare code and data areas, select the memory model,
declare procedures, etc.
Not case sensitive
Different assemblers have different directives
NASM's directives are not the same as MASM's, for example
13
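The EQU pseudo-op assigns a name to a constant.
It makes assembly language easier to understand.
No memory is allocated for EQU names.
LF EQU 0AH
After this definition, the following two instructions are equivalent:
MOV DL, 0AH
MOV DL, LF
PROMPT EQU "Type your name"
After this definition, the following two definitions are equivalent:
MSG DB "Type your name"
MSG DB PROMPT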
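A complete program can make the effect of EQU easier to see. The sketch below assumes MASM syntax and DOS; the constant name dos_print is hypothetical and simply stands for the number 9.

```asm
.model small
.stack 100h
cr        equ 0dh               ; carriage return, no storage allocated
lf        equ 0ah               ; line feed, no storage allocated
dos_print equ 9                 ; hypothetical name for INT 21h function 9
.data
msg     db 'Type your name', cr, lf, '$'
.code
start:
        mov ax, @data
        mov ds, ax
        mov ah, dos_print       ; assembles exactly like MOV AH, 9
        mov dx, offset msg
        int 21h
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```

Only msg occupies memory here; cr, lf and dos_print exist only while the program is being assembled.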
14
The DUP operator defines arrays whose elements share a common
initial value.
It has the form: repeat_count DUP (value)
Numbers DB 100 DUP(0)
Allocates an array of 100 bytes, each initialized to 0.
Names DW 200 DUP(?)
Allocates an array of 200 uninitialized words.
Two equivalent definitions
Line DB 5, 4, 3 DUP(2, 3 DUP(0), 1)
Line DB 5, 4, 2, 0, 0, 0, 1, 2, 0, 0, 0, 1, 2, 0, 0, 0, 1
15
Use DUP to allocate (create space for) an array
or string. Syntax: counter DUP ( argument )
Counter and argument must be constants or
constant expressions
var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zero
var2 BYTE 20 DUP(?) ; 20 bytes, uninitialized
var3 BYTE 4 DUP("STACK") ; 20 bytes: "STACKSTACKSTACKSTACK"
var4 BYTE 10,3 DUP(0),20 ; 5 bytes
var4 10
0
0
0
20
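An array reserved with DUP(?) is typically filled in at run time. The following is a minimal sketch under the same MASM and small-model assumptions as the other programs; the names table and fill are made up for illustration.

```asm
.model small
.stack 100h
.data
table   db 10 dup(?)            ; 10 uninitialized bytes
.code
start:
        mov ax, @data
        mov ds, ax
        mov bx, offset table    ; BX points at the first element
        mov cx, 10              ; one pass per element
        mov al, 1               ; first value to store
fill:
        mov [bx], al            ; store the current value
        inc al                  ; next value
        inc bx                  ; next element
        loop fill               ; repeat until CX reaches 0
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```

After the loop, table holds the values 1 through 10.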
16
The PTR operator overrides the declared type of an address
expression.
Examples:
MOV [BX], 1            ; illegal: the operand size is ambiguous
MOV BYTE PTR [BX], 1   ; legal
MOV WORD PTR [BX], 1   ; legal
Let J be defined as follows
J DW 10
MOV AL, J              ; illegal: AL is a byte, J is a word
MOV AL, BYTE PTR J     ; legal
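The sketch below puts both uses of PTR into one program: reading the individual bytes of a word variable, and choosing the size of a store through a register pointer. It assumes MASM syntax; the variable buf and the value 1234h are chosen only for illustration.

```asm
.model small
.stack 100h
.data
j       dw 1234h                ; stored low byte first: 34h, 12h
buf     db 4 dup(0)
.code
start:
        mov ax, @data
        mov ds, ax
        mov al, byte ptr j      ; AL = 34h (low byte of j)
        mov ah, byte ptr [j+1]  ; AH = 12h (high byte of j)
        mov bx, offset buf
        mov byte ptr [bx], 1    ; writes one byte:  buf = 01 00 00 00
        mov word ptr [bx+2], 1  ; writes two bytes: buf = 01 00 01 00
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```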
19
Identifiers
Programmer-chosen name to identify a variable, constant,
procedure, or code label
1-247 characters, including digits
not case sensitive
first character must be a letter, _, @, ?, or $
Subsequent characters may also be digits
Cannot be the same as a reserved word
@ is used by the assembler as a prefix for predefined symbols,
so avoid it in your own identifiers
Examples
Var1, Count, $first, _main, MAX, open_file, myFile, xVal,
_12345
20
Reserved words cannot be used as identifiers
Instruction mnemonics
MOV, ADD, MUL, …
Register names
Directives – tell MASM how to assemble programs
Type attributes – provide size and usage information
BYTE, WORD
Operators – used in constant expressions
Predefined symbols – @data
21
A data definition statement sets aside storage in memory for a
variable.
May optionally assign a name (label) to the data
Syntax:
[name] directive initializer [,initializer] . . .
value1 BYTE 10
22
Each variable has a type and is assigned a memory
address.
Data-defining pseudo-ops
DB define byte
DW define word
DD define double word (two consecutive words)
DQ define quad word (four consecutive words)
DT define ten bytes (five consecutive words)
Each pseudo-op can be used to define one or more
data items of given type.
23
Assembler directive format for defining a byte variable
name DB initial_value
A question mark ("?") in place of the initial value leaves the
variable uninitialized
I DB 4                   ; define variable I with initial value 4
J DB ?                   ; define variable J, uninitialized
CourseName DB "Course"   ; allocates 6 bytes (NAME itself is a reserved word in MASM)
K DB 5, 3, -1            ; allocates 3 bytes
K 05
03
FF
24
Offset  Value
list1   0000    10
        0001    20

Definition      Bytes in memory (lowest address first)
I DW 4          04 00
J DW -2         FE FF
K DW 1ABCH      BC 1A
L DW "01"       31 30
26
Enclose character in single or double quotes
'A', "x"
ASCII character = 1 byte
Enclose strings in single or double quotes
"ABC"
'xyz'
Each character occupies a single byte
Embedded quotes:
'Say "Goodnight," Gracie'
27
A string is implemented as an array of characters
For convenience, it is usually enclosed in quotation marks
It is often terminated by a sentinel: a null byte (,0) or, for DOS function 9, a '$'
Examples:
28
End-of-line character sequence:
0Dh = carriage return
0Ah = line feed
str1 BYTE "Enter your name: ",0Dh,0Ah
BYTE "Enter your address: ",0
Idea: Define all strings used by your program in the same area of
the data segment.
29
CPU communicates with peripherals through I/O registers
called I/O ports.
Two instructions access I/O ports directly: IN and OUT.
Used when fast I/O is essential, e.g. games.
Most programs do not use IN/OUT instructions
port addresses vary among computer models
much easier to program I/O with service routines provided by
manufacturer
Two categories of I/O service routines
Basic input/output system (BIOS) routines
Disk operating system (DOS) routines
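The two categories of service routine can be compared by printing one character through each. This is a minimal sketch assuming a PC-compatible BIOS and DOS; it uses BIOS video function 0Eh (teletype output) of INT 10h and DOS function 2 of INT 21h.

```asm
.model small
.stack 100h
.code
start:
        ; BIOS routine: INT 10h, function 0Eh (teletype output)
        mov ah, 0eh
        mov al, 'A'             ; character to write
        mov bh, 0               ; display page 0
        int 10h
        ; DOS routine: INT 21h, function 2 (single-character output)
        mov ah, 2
        mov dl, 'B'             ; character to write
        int 21h
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```

Neither call needs to know a port address; the routine behind the interrupt deals with the hardware.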
31
System Hardware
Non-standard interface
BIOS
Standard interface
Operating System
Standard interface
Application Program
32
INT 21H is used to invoke a large number of DOS
functions.
The function to call is selected by putting its number
in the AH register.
AH=1 single-key input with echo
AH=2 single-character output
AH=9 character string output
AH=8 single-key input without echo
AH=0Ah character string input
34
Input: AH=2, DL= ASCII code of character to be output
Output: AL=ASCII code of character
To display a character
MOV AH, 2
MOV DL, '?' ; display the character '?'
INT 21H
To read a character and display it
MOV AH, 1
INT 21H
MOV AH, 2
MOV DL, AL
INT 21H
36
Input: AH=1
Output: AL = ASCII code if a character key is pressed,
otherwise 0.
To input a character with echo:
MOV AH, 1
INT 21H ; the character read is returned in AL
To input a character without echo:
MOV AH, 8
INT 21H ; the character read is returned in AL
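Input without echo is useful when the typed key should not appear on screen. The sketch below reads four keys and shows an asterisk for each; the length of four and the label next_key are assumptions made for the example.

```asm
.model small
.stack 100h
.code
start:
        mov cx, 4               ; hypothetical number of keys to read
next_key:
        mov ah, 8               ; read a key without echo; result in AL
        int 21h
        mov ah, 2               ; single-character output
        mov dl, '*'             ; show a mask instead of the key itself
        int 21h
        loop next_key           ; repeat until CX reaches 0
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```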
37
.model small
.stack 100h
.data
message db 'Hello World', 13, 10, '$'
.code
start:
mov ax, @data
mov ds, ax
mov dx, offset message ; copy address of message to dx
mov ah, 9h ; string output
int 21h ; display string
mov ax, 4c00h
int 21h
end start
38
Input: AH=9, DX = offset address of a string.
The string must end with a '$' character.
To display the message Hello!
MSG DB "Hello!$"
MOV AH, 9
MOV DX, OFFSET MSG
INT 21H
The OFFSET operator returns the address of a variable
The instruction LEA (load effective address) loads the
destination with the address of the source
LEA DX, MSG
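OFFSET is resolved when the program is assembled, while LEA computes the address when the instruction runs, so LEA can also handle addresses that depend on a register. A minimal sketch, assuming MASM syntax; the message text and the index 7 are chosen only for illustration.

```asm
.model small
.stack 100h
.data
msg     db 'Hello, World!$'
.code
start:
        mov ax, @data
        mov ds, ax
        mov ah, 9
        mov dx, offset msg      ; address fixed at assembly time
        int 21h                 ; prints "Hello, World!"
        mov bx, 7               ; run-time index of the 'W'
        mov ah, 9
        lea dx, msg[bx]         ; address computed at run time: msg + BX
        int 21h                 ; prints "World!"
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```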
39
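Input: AH=10 (0AH), DX = offset address of a buffer that receives the
string read.
The first byte of the buffer should contain the maximum string size+1
(the extra byte holds the carriage return that ends the input).
The second byte of the buffer is reserved for storing the size of the
string read.
To read a name of at most 20 characters and display it
UserName DB 21, 0, 22 DUP("$") ; 21 input bytes plus one '$' that always survives
; (NAME itself is a reserved word in MASM, hence UserName)
MOV AH, 10
LEA DX, UserName
INT 21H
MOV AH, 9
LEA DX, UserName+2 ; skip the two header bytes
INT 21H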
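Instead of pre-filling the buffer with '$', the length byte that DOS stores can be used to place the terminator exactly after the typed text. The following is a sketch under that assumption; the names buf and newline are made up for illustration.

```asm
.model small
.stack 100h
.data
buf     db 21                   ; byte 0: capacity, carriage return included
        db ?                    ; byte 1: DOS stores the typed length here
        db 21 dup(?)            ; bytes 2 onward: the characters plus final CR
newline db 0dh, 0ah, '$'
.code
start:
        mov ax, @data
        mov ds, ax
        mov ah, 0ah             ; buffered keyboard input
        lea dx, buf
        int 21h
        mov bl, buf+1           ; BL = number of characters typed
        mov bh, 0               ; BX = the same count as a 16-bit index
        mov buf[bx+2], '$'      ; overwrite the trailing CR with '$'
        mov ah, 9
        lea dx, newline         ; move to a fresh line first
        int 21h
        mov ah, 9
        lea dx, buf+2           ; print only what was typed
        int 21h
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```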
40
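.model small
.stack 100
.data
array db 21,?,22 dup('$') ; 21 input bytes plus one '$' that always survives
n1 db 10,13,'$'
.code
start:
mov ax,@data
mov ds,ax
mov ah,10 ; read a string into array
lea dx,array
int 21h
mov dx,offset n1 ; move to a new line
mov ah,9h
int 21h
mov dx,offset array+2 ; display the string that was read
mov ah,9h
int 21h
mov ax,4c00h ; exit to DOS
int 21h
end start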
41
Prompt the user to enter a lowercase letter, and on the
next line display another message with the letter in
uppercase.
Enter a lowercase letter: a
In upper case it is: A
TITLE CASE CONVERSION PROGRAM
.MODEL SMALL
.STACK 100H
.DATA
MSG1 DB 'ENTER A LOWER CASE LETTER: $'
MSG2 DB 0DH,0AH, 'IN UPPER CASE IT IS: '
CHAR DB ?,'$'
42
.CODE
MAIN PROC
;initialize DS
MOV AX, @DATA ;get data segment
MOV DS, AX ;initialize DS
;print user prompt
LEA DX, MSG1 ;get first message
MOV AH, 9 ;display string function
INT 21H ;display first message
;input a char and convert to upper case
MOV AH, 1 ;read character function
INT 21H ;read a small letter into AL
SUB AL, 20H ;convert it to upper case
MOV CHAR, AL ;and store it
;display on the next line
LEA DX, MSG2 ;get second message
MOV AH, 9 ;display message and the uppercase
INT 21H ;letter that follows it
;return to DOS
MOV AX, 4C00H
INT 21H
MAIN ENDP
END MAIN
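The conversion above subtracts 20H from whatever key is typed, so a digit or an uppercase letter would be turned into some other character. One possible refinement, sketched here with the conditional jumps JB and JA and a made-up label show, converts only characters in the range 'a' to 'z'.

```asm
.model small
.stack 100h
.code
start:
        mov ah, 1               ; read a character with echo into AL
        int 21h
        cmp al, 'a'
        jb  show                ; below 'a': not a lowercase letter
        cmp al, 'z'
        ja  show                ; above 'z': not a lowercase letter
        sub al, 20h             ; 'a'..'z' becomes 'A'..'Z'
show:
        mov dl, al
        mov ah, 2               ; display the (possibly converted) character
        int 21h
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```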
43
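.model small
.stack 100h
.DATA
String1 DB "Hello"
String2 DB 5 dup(?)
.CODE
start:
MOV AX, @DATA
MOV DS, AX ; DS:SI addresses the source
MOV ES, AX ; ES:DI addresses the destination
CLD ; clear direction flag: SI and DI count upward
MOV CX, 5 ; number of bytes to copy
LEA SI, String1
LEA DI, String2
REP MOVSB ; copy CX bytes from DS:SI to ES:DI
MOV AX, 4C00H ; exit to DOS
INT 21H
end start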
49
Procedure
Used to define subroutines; supports modular programming.
A call to a procedure transfers control to the called
procedure at run time.
PROC: indicates the beginning of a procedure.
The procedure type helps the assembler decide whether to code the return
as near or far.
The NEAR or FAR keyword following PROC indicates the type of procedure (NEAR by
default).
ENDP: tells the assembler where the procedure ends
50
Procedure Declaration
Name PROC type
;body of the procedure
RET
Name ENDP
Procedure type
NEAR (the calling statement is in the same segment
as the procedure)
FAR (the calling statement is in a different
segment)
The default type is NEAR
Procedure Invocation
CALL Name
51
Executing a CALL instruction
Saves the return address on the stack
Near procedure: PUSH IP
Far procedure: PUSH CS; PUSH IP
IP gets the offset address of the first instruction of the
procedure
CS gets the new segment number if the procedure is far
Executing a RET instruction
Transfers control back to the calling procedure
Near procedure: POP IP
Far procedure: POP IP; POP CS
RET n (near): return, then discard n bytes of parameters
IP ← [SP+1:SP]
SP ← SP + 2 + n
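The stack effects of CALL and RET n can be traced in a procedure that takes its arguments on the stack. This is a minimal sketch, assuming a near procedure in the small model; the procedure name add2 and the values 2 and 5 are made up for illustration.

```asm
.model small
.stack 100h
.data
sum     dw ?
.code
start:
        mov ax, @data
        mov ds, ax
        mov ax, 2
        push ax                 ; first argument
        mov ax, 5
        push ax                 ; second argument
        call add2               ; near call: pushes IP only
        mov sum, ax             ; sum = 7; SP is back where it started
        mov ax, 4c00h           ; exit to DOS
        int 21h

add2 proc near                  ; hypothetical helper: AX = arg1 + arg2
        push bp
        mov bp, sp              ; [bp] = old BP, [bp+2] = return IP
        mov ax, [bp+6]          ; first argument (pushed first)
        add ax, [bp+4]          ; second argument (pushed last)
        pop bp
        ret 4                   ; pop IP, then discard 4 bytes of arguments
add2 endp
end start
```

With a plain RET the caller would have to remove the two arguments itself, for example with ADD SP, 4 after the CALL.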
52
.model small
.stack 100h
cr equ 13
lf equ 10
.data
msg1 db 'enter an upper case letter: $'
result db cr,lf,'The lower case equivalent is:', '$'
.code
start:
mov ax , @data
mov ds,ax
mov dx, offset msg1
53
call outputs
call getc
mov bl,al
add bl,32
mov dx,offset result
call outputs
mov dl,bl
call putc
mov ax,4c00h
int 21h
54
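putc proc ; display the character in DL
mov ah,2h
int 21h
ret
putc endp
getc proc ; read a character with echo into AL
mov ah,01h
int 21h
ret
getc endp
outputs proc ; display the '$'-terminated string at DS:DX
mov ah,9h
int 21h
ret
outputs endp
end start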
55
MACRO definition directive
Used to define macros and macro constants.
A call to a macro is replaced by its body at
assembly time.
EQU: defines a macro symbol (a named constant)
MACRO: tells the assembler where a macro begins. A macro is an
open subroutine: it is expanded in place wherever it is
invoked.
MacroName MACRO [arg1,arg2…argn]
Advantage: saves a great amount of effort and time by
avoiding the overhead of writing a repeated pattern of code.
ENDM: tells the assembler where the macro ends.
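A short program shows how a macro is defined once and expanded at each use. The sketch assumes MASM syntax; the macro name print_str and its parameter text are hypothetical.

```asm
.model small
.stack 100h

print_str macro text            ; hypothetical macro: print a '$'-terminated string
        mov ah, 9
        lea dx, text
        int 21h
endm

.data
msg1    db 'First line', 0dh, 0ah, '$'
msg2    db 'Second line', 0dh, 0ah, '$'
.code
start:
        mov ax, @data
        mov ds, ax
        print_str msg1          ; expands to the three instructions above
        print_str msg2          ; expands again: no CALL, no RET
        mov ax, 4c00h           ; exit to DOS
        int 21h
end start
```

Each use costs three instructions of code space, where a procedure would cost one CALL; in exchange there is no call and return overhead at run time.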