Chapter 2 - Assembly Language Programming


Assembly Language Programming
Dr. Ali Mohamed
Faculty of Engineering
Aswan University
The 80x86 Microprocessors 1.1 Assembly Language

Objectives

Program template


Program Statements

 An Assembly language program is a series of statements, each of which is
either an Assembly language instruction or a directive.
 Assembly language instructions are translated by the assembler into
machine code:
 MOV AX, 0F8E1H
 ADD AX, BX
 Directives (pseudo-instructions) give directions to the assembler
about how it should translate the Assembly language instructions
into machine code:
 DATA1 DB 52H
 An Assembly language instruction consists of four fields:
[label:] mnemonic [operands] [;comment]

Fields of Assembly language instructions
[label:] mnemonic [operands] [;comment]
 Brackets indicate that a field is optional; the brackets themselves should
not be typed.
 The label field allows the program to refer to a line of code by name.
 The label field cannot exceed 31 characters.
 Labels for directives do not need to end with a colon.
 A label must end with a colon when it refers to an opcode-generating
instruction. The colon indicates to the assembler that the label refers to
code within this code segment.
 The mnemonic and operand(s) fields together perform the real work of the
program and accomplish the tasks for which the program was written.
 ADD AL,BL ; ADD is the mnemonic and "AL,BL" are the operands
 MOV AX,6764 ; MOV is the mnemonic and "AX,6764" are the operands
 The comment field begins with a ";". Comments are ignored by the assembler
and are used to describe the program, making it easier for someone to read
and understand.


Memory model

 SMALL MODEL: This is one of the most widely used memory models for
Assembly language programs and is sufficient for small programs.
 The SMALL model uses a maximum of 64K bytes of memory for code and
another 64K bytes for data.
 MEDIUM MODEL: In this model, the data must also fit into 64K bytes, but
the code can exceed 64K bytes of memory.
 COMPACT MODEL: This is the opposite of the MEDIUM model. While
the data can exceed 64K bytes of memory, the code cannot.
 LARGE MODEL: Combining the two preceding models gives the LARGE
model. Although this model allows both data and code to exceed 64K
bytes of memory, no single set of data (such as an array) should exceed
64K bytes.
 HUGE MODEL: Both code and data can exceed 64K bytes of memory,
and a single set of data, such as an array, can exceed 64K bytes as well.
 There is another memory model called TINY. This model is used with
COM files in which the total memory for both code and data must fit into
64K bytes. This model cannot be used with the simplified segment
definition.

Simple SEGMENT Definition

 An assembly program usually consists of three segments (code, data, and
stack), introduced by the following directives:
 .CODE
 .DATA
 .STACK
 The stack segment defines the storage area for the stack.
 .STACK 64
 It reserves 64 bytes as stack storage.
 The data segment defines the data that the program uses.
 The code segment contains the program instructions.

DATA Segment

.DATA
DATA1 DB 52H
DATA2 DB 29H
SUM DB ?
 When the program starts, only the CS and SS registers have the
proper values (set by the OS). The DS register must be initialized by
the program:

MOV AX,@DATA
MOV DS,AX

CODE Segment

 A procedure is a set of instructions that accomplishes a specific task.
 A code segment can consist of one or more procedures.
 Each procedure must have a name defined by the PROC directive.
 It must also end with the ENDP directive.
 The PROC directive has two options, FAR and NEAR; we use FAR for
now.
 The instructions are placed between the two directives.
 The last two instructions of the main procedure return control to the OS:
MOV AH,4CH
INT 21H
 The last two lines of the program (the ENDP and END directives) end the
procedure and the program.
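The pieces described so far (memory model, simplified segment directives, DS initialization, and the return to the OS) can be put together as follows. This is a minimal sketch in MASM syntax that reuses the DATA1, DATA2, and SUM items from the DATA segment slide; the addition in the middle is only an illustrative task.

```asm
        .MODEL SMALL            ;one 64K code segment, one 64K data segment
        .STACK 64               ;reserve 64 bytes for the stack
        .DATA
DATA1   DB 52H
DATA2   DB 29H
SUM     DB ?                    ;storage for the result
        .CODE
MAIN    PROC FAR                ;program entry point
        MOV AX,@DATA            ;DS is not set by the OS,
        MOV DS,AX               ;so the program initializes it
        MOV AL,DATA1            ;AL = 52H
        MOV BL,DATA2            ;BL = 29H
        ADD AL,BL               ;AL = 52H + 29H = 7BH
        MOV SUM,AL              ;store the result in memory
        MOV AH,4CH              ;return control to the OS
        INT 21H
MAIN    ENDP
        END MAIN
```

The operand of END names the procedure where execution starts, which is why MAIN appears on the last line.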


Program Structure

Program Creation Cycle


Program Files

 The .asm file is the source code file.
 The MASM program accepts the .asm file and generates the .obj
file.
 The .obj file is the object (machine code) file.
 The .lst file contains all opcodes and offset addresses, as well as
program errors detected by MASM.
 The .crf file contains an alphabetical list of all symbols and labels.
 The linker produces the ready-to-run version of the program (the .exe
file).
 The .map file lists all the segment names, where each one starts and stops
in memory, and the size of each segment in bytes.


Sample Program

Control Transfer Instructions


Conditional Jump

 A jump transfers the control flow of the program to a new location; a
conditional jump does so only when a given condition is met.
 If control is transferred within the same code segment, it is called a NEAR
jump (intrasegment).
 If control is transferred outside the current code segment, it is
called a FAR jump (intersegment).
 Since CS:IP always points to the next instruction to be
executed, these registers must be updated when a jump instruction is
executed.
 In a NEAR jump only the IP register has to be updated, while in a FAR
jump both CS and IP are updated.
 The flag register indicates the current conditions that the conditional
jumps test.
 On the 8086, all conditional jumps are short jumps: the target must be
within -128 to +127 bytes of the value in the IP register, and the
instruction is two bytes long (the opcode followed by a one-byte signed
displacement).
 The displacement is positive for a forward jump and negative for a
backward jump.
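As an illustration of a conditional backward jump, the sketch below adds five bytes in a loop. The data values and the names DATA_IN, SUM, and AGAIN are assumptions chosen for the example, not taken from the slides.

```asm
        .MODEL SMALL
        .STACK 64
        .DATA
DATA_IN DB 25H,12H,15H,1FH,2BH  ;five bytes to add (example values)
SUM     DB ?
        .CODE
MAIN    PROC FAR
        MOV AX,@DATA
        MOV DS,AX
        MOV CX,5                ;loop counter
        MOV BX,OFFSET DATA_IN   ;BX points to the first byte
        MOV AL,0                ;clear the running total
AGAIN:  ADD AL,[BX]             ;add the current byte to AL
        INC BX                  ;point to the next byte
        DEC CX                  ;decrement the counter (updates ZF)
        JNZ AGAIN               ;jump back while the counter is not zero
        MOV SUM,AL              ;SUM = 96H for the values above
        MOV AH,4CH
        INT 21H
MAIN    ENDP
        END MAIN
```

JNZ tests the zero flag set by DEC CX. The four instructions from AGAIN through JNZ occupy 6 bytes, so the displacement stored in the JNZ instruction is -6 (FAH), a negative value because the jump goes backward.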


Backward Jump

Forward Jump


Unconditional Jump

 Program control transfers unconditionally to the target location. It has the
following forms:
 Short Jump. The target is within -128 to +127 bytes of the current location
(IP). In this case the opcode is EB and the one-byte operand ranges from 00
to FF. The operand is a signed displacement that is added to IP to calculate
the jump address.
 JMP SHORT Label
 Near Jump. The target is within the current code segment. The target
address can be specified in one of three ways:
 Direct. Same as the short jump, except that the opcode is E9 and the target
address is within -32768 to +32767 bytes of the current location.
 JMP Label
 Register Indirect. The target address is in a register.
 JMP BX
 Memory Indirect. The target address is the content of a memory location.
 JMP [DI]

 FAR Jump. The target is outside the current code segment, so both CS and
IP are changed.
 JMP FAR PTR Label


CALL statements

 CALLs to procedures are used to perform tasks that need to be performed
frequently.
 The target address can be in the current segment, in which case it is a
NEAR call, or outside the current code segment, in which case it is a FAR
call.
 So that the microprocessor knows where to come back after the called
subroutine has executed, it automatically saves the address of the
instruction following the call on the stack. In a NEAR call only the IP is
saved on the stack; in a FAR call both CS and IP are saved.
 When a subroutine is called, control is transferred to that subroutine and
the processor begins to fetch instructions from the new location.
 The last instruction executed in the subroutine must be RET (return).
 The assembler generates different opcodes for the RET of a NEAR procedure
and the RET of a FAR procedure. For a NEAR return only the IP is restored
from the stack; for a FAR return both CS and IP are restored.
 This ensures that control is given back to the caller.
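The CALL and RET mechanism can be sketched with one NEAR procedure called from the main procedure. The procedure name DOUBLE_AL and the variable RESULT are hypothetical names used only for this example.

```asm
        .MODEL SMALL
        .STACK 64
        .DATA
RESULT  DB ?
        .CODE
MAIN    PROC FAR
        MOV AX,@DATA
        MOV DS,AX
        MOV AL,05H
        CALL DOUBLE_AL          ;NEAR call: IP of the next instruction is pushed
        MOV RESULT,AL           ;execution resumes here after RET (AL = 0AH)
        MOV AH,4CH
        INT 21H
MAIN    ENDP

DOUBLE_AL PROC NEAR             ;subroutine in the same code segment
        ADD AL,AL               ;double the value in AL
        RET                     ;NEAR return: pops IP from the stack
DOUBLE_AL ENDP
        END MAIN
```

Because DOUBLE_AL is declared NEAR, the assembler encodes a NEAR CALL and a NEAR RET, so only IP is saved and restored.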
Rules for names in Assembly language

 First, each label name must be unique.
 The names used for labels in Assembly language
programming consist of:
 alphabetic letters in both upper and lower case,
 the digits 0 through 9, and
 the special characters question mark (?), period (.), at sign
(@), underscore (_), and dollar sign ($).
 The first character of the name must be an
alphabetic character or a special character. It
cannot be a digit.


DATA TYPES AND DATA DEFINITION

 The assembler supports all the various data types of the 80x86
microprocessor by providing data directives that define the data types
and set aside memory for them.
 The 8088/86 microprocessor supports many data types, but none are
longer than 16 bits wide since the size of the registers is 16 bits.
 It is the job of the programmer to break down data larger than 16 bits to be
processed by the CPU.
 The data types used by the 8088/86 can be 8-bit or 16-bit, positive or
negative.
 If a number is less than 8 bits wide, it must still be coded as an 8-bit
value with the higher bits set to zero. For example, the number 5 (101 in
binary) is coded as 0000 0101, or 05H.
 Similarly, if a number is less than 16 bits wide, it must use all 16 bits,
with the unused higher bits being 0s.

ORG (origin) Directive

 ORG is used to indicate the beginning of the offset address.

 The number that comes after ORG can be either in hex or in decimal.

 The ORG directive is used extensively in the data segment to separate fields of
data and make the listing more readable. It can also be used to set the offset
within the code segment (IP).


DB (define byte) Directive

 The DB directive allows allocation of memory in byte-sized chunks.
 DB can be used to define numbers in decimal, binary, hex, and ASCII.
 DATA1 DB 25D ; DECIMAL
 DATA2 DB 10001001B ; BINARY
 DATA3 DB 12H ; HEX
 DATA4 DB 'M' ; ASCII
 Regardless of which one is used, the assembler will convert them into
hex.
 ORG 0010H
 DATA DB '2591'
 DATA5 DB ?
 DATA6 DB 'My name is Joe' ;ASCII CHARACTERS
 Memory contents starting at offset 0010H: '2', '5', '9', '1' (DATA),
00 (DATA5, no initial value), then 'M', 'y', ' ', ... (DATA6).

DUP (duplicate) Directive

 DUP is used to duplicate a given number of characters, which avoids a lot of
typing.
 For example, contrast the following two methods of filling six memory
locations with FFH:
 ORG 0030H
 DATA7 DB 0FFH, 0FFH, 0FFH, 0FFH, 0FFH, 0FFH ;fill 6 bytes with FF
 ORG 38H
 DATA8 DB 6 DUP(0FFH) ;fill 6 bytes with FF
 Reserve 32 bytes of memory with no initial value given:
 ORG 40H
 DATA9 DB 32 DUP(?) ;set aside 32 bytes
 DUP can be used inside another DUP:
 DATA10 DB 5 DUP (2 DUP (99)) ; fill 10 bytes with 99


DW (define word) Directive


 DW is used to allocate memory 2 bytes (one word) at a time.
 DW is used widely in the 8088/8086 and 80286 microprocessors since the
registers are 16 bits wide.
 ORG 70H
 DATA11 DW 954
 DATA12 DW 100101010100B
 DATA13 DW 253FH
 DATA14 DW 9,2,7,0CH,00100000B,5,'HI'
 DATA15 DW 8 DUP(?)
 Memory contents starting at offset 0070H (the low byte of each word is
stored at the lower address):
0070 BA 03 ;DATA11 (954 = 03BAH)
0072 54 09 ;DATA12 (0954H)
0074 3F 25 ;DATA13
0076 09 00 02 00 07 00 0C 00 20 00 05 00 'I' 'H' ;DATA14
0084 00 00 ... ;DATA15 (16 bytes, no initial value)

EQU (equate) Directive


 EQU is used to define a constant without occupying a memory location.
 EQU does not set aside storage for a data item but associates a constant value
with a data label, so that when the label appears in the program its constant
value is substituted for the label.
 EQU can also be used outside the data segment, even in the middle of a code
segment.
 Using EQU for the counter constant in the immediate addressing mode:
 COUNT EQU 25
 MOV CX,COUNT ;same as MOV CX,25: the register CX is loaded with the value 25
 This is in contrast to defining COUNT as a variable with a data directive
such as DB or DW:
 An instruction that reads such a variable, for example "MOV CX,COUNT" with
COUNT defined by DW, uses the direct addressing mode and fetches the value
from memory.
 Now what is the real advantage of EQU?
 First, note that EQU can also be used in the data segment:
 COUNT EQU 25
 COUNTER1 DB COUNT
 For a constant (a fixed value) used in many different places in the data and
code segments, EQU lets the programmer change the value once and have the
assembler update every place where the label appears.
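The difference between a constant defined with EQU and a variable defined with a data directive can be sketched as follows. COUNTER2 is a hypothetical word variable added for the comparison; it is defined with DW so that its size matches the 16-bit register it is loaded into.

```asm
        .MODEL SMALL
        .STACK 64
COUNT   EQU 25                  ;constant: no memory location is allocated
        .DATA
COUNTER1 DB COUNT               ;byte in memory initialized to 25
COUNTER2 DW COUNT               ;word in memory initialized to 25
        .CODE
MAIN    PROC FAR
        MOV AX,@DATA
        MOV DS,AX
        MOV CX,COUNT            ;immediate addressing: assembled as MOV CX,25
        MOV DX,COUNTER2         ;direct addressing: DX is loaded from memory
        MOV AH,4CH
        INT 21H
MAIN    ENDP
        END MAIN
```

Changing the single EQU line to another value updates COUNTER1, COUNTER2, and the MOV CX,COUNT instruction the next time the program is assembled.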

DD (define double word) Directive

 The DD directive is used to allocate memory locations that are 4 bytes
(two words) in size.
 Again, the data can be in decimal, binary, or hex.
 In any case the data is converted to hex and placed in memory locations
according to the rule of low byte to low address and high byte to high
address.
 ORG 00A0H
 DATA16 DD 1023 ;DECIMAL
 DATA17 DD 10001001011001011100B ;BINARY
 DATA18 DD 5C2A57F2H ;HEX
 DATA19 DD 23H,34789H,65533
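The low-byte-to-low-address rule can be checked by hand for the four DD lines above. The fragment below, meant to be placed inside a data segment, repeats those definitions and shows in comments the bytes that each one occupies; the byte values are worked out here from the hex form of each number.

```asm
        ORG 00A0H
DATA16  DD 1023                     ;000003FFH -> 00A0: FF 03 00 00
DATA17  DD 10001001011001011100B    ;0008965CH -> 00A4: 5C 96 08 00
DATA18  DD 5C2A57F2H                ;5C2A57F2H -> 00A8: F2 57 2A 5C
DATA19  DD 23H,34789H,65533         ;00000023H -> 00AC: 23 00 00 00
                                    ;00034789H -> 00B0: 89 47 03 00
                                    ;0000FFFDH -> 00B4: FD FF 00 00
```

In every case the least significant byte lands at the lowest offset, and values shorter than 32 bits are padded with zero bytes at the higher addresses.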


DQ (define quadword) Directive

 DQ is used to allocate memory 8 bytes (four words) in size.
 This can be used to represent any variable up to 64 bits wide:
 ORG 00C0H
 DATA20 DQ 4523C2H ;HEX
 DATA21 DQ 'HI' ;ASCII CHARACTERS
 DATA22 DQ ? ;NOTHING

DT (define ten bytes) Directive

 DT is used for memory allocation of packed BCD numbers.
 The application of DT will be seen in the multibyte addition of BCD
numbers in Chapter 3.
 For now, observe how they are located in memory.
 Notice that the "H" after the data is not needed. DT allocates 10 bytes,
but a maximum of 18 digits can be entered.
 ORG 00E0H
 DATA23 DT 867943569829 ;BCD
 DATA24 DT ? ;NOTHING
 DT can also be used to allocate 10-byte integers by using the "D" option:
 DATA25 DT 65535d ;the assembler will convert the decimal number to
hex and store it


Full SEGMENT Definition

label SEGMENT [options]
;place the statements belonging to this segment here
label ENDS
 The "SEGMENT" and "ENDS" directives indicate to the assembler the
beginning and ending of a segment.
 Assembly language statements are grouped into segments in order to be
recognized by the assembler and consequently by the CPU.
 The stack segment defines storage for the stack
 The data segment defines the data that the program will use
 The code segment contains the Assembly language instructions.

Stack Segment definition

STSEG SEGMENT
DB 64 DUP (?)
STSEG ENDS

 DB directive reserves 64 bytes of memory for the stack.


 SEGMENT directive begins the segment
 ENDS directive ends the segment


Data Segment Definition

DTSEG SEGMENT
DATA1 DB 52H
DATA2 DB 29H
SUM DB ?
DTSEG ENDS
 The data segment defines three data items
 DATA1, DATA2, and SUM
 The DB directive is used by the assembler to allocate memory in byte-
sized chunks.
 The data items defined in the data segment will be accessed in the code
segment by their labels.
 DATA1 and DATA2 are given initial values in the data section.
 SUM is not given an initial value, but storage is set aside for later use by
the program.

Code Segment Definition
CDSEG SEGMENT
MAIN PROC FAR
ASSUME CS:CDSEG, DS:DTSEG, SS:STSEG
MOV AX,DTSEG
MOV DS,AX
MOV AL,DATA1
………
 The first line of the segment after the SEGMENT directive is the PROC
directive.
 A procedure is a group of instructions designed to accomplish a specific
function.
 Every procedure must have a name defined by the PROC directive,
followed by the assembly language instructions and closed by the ENDP
directive.
 The PROC and ENDP statements must have the same label.
 The PROC directive may have the option FAR or NEAR.
 The operating system that controls the computer must be directed to the
beginning of the program in order to execute it.


The Form of an Assembly Language Program


STSEG SEGMENT
DB 64 DUP (?)
STSEG ENDS
DTSEG SEGMENT
DATA1 DB 52H
DATA2 DB 29H
SUM DB ?
DTSEG ENDS
CDSEG SEGMENT
MAIN PROC FAR
ASSUME CS:CDSEG, DS:DTSEG, SS:STSEG
MOV AX,DTSEG
MOV DS,AX
MOV AL,DATA1
MOV BL,DATA2
ADD AL,BL
MOV SUM,AL
MOV AH,4CH
INT 21H
MAIN ENDP
CDSEG ENDS
END MAIN ;this is the program exit point

ASSUME Directive
ASSUME CS:CDSEG, DS:DTSEG, SS:STSEG
 The ASSUME directive associates segment registers with specific
segments by assuming that the segment register is equal to the segment
labels used in the program.
 If an extra segment had been used, ES would also be included in the
ASSUME statement.
 The ASSUME statement is needed because a given Assembly language
program can have several code segments, several data segments, and more
than one stack segment, but only one of each can be addressed by the CPU
at a given time, since there is only one of each of the segment registers
inside the CPU.
 The ASSUME directive tells the assembler which of the segments defined by
the SEGMENT directives should be used.
 The ASSUME directive also helps the assembler to calculate the offset
addresses from the beginning of that segment.
 For example, in "MOV AL,[BX]" the BX register holds an offset into the data
segment.


ASSUME

 What value is actually assigned to the CS, DS, and SS registers for
execution of the program?
 The operating system must pass control to the program so that it may
execute, but before it does that it assigns values for the segment
registers.
 The operating system must do this because it knows how much memory
is installed in the computer, how much of it is used by the system, and
how much is available.
 One cannot tell DOS to give the program a specific area of memory, say
from 25FFF to 289E2. Therefore, it is the job of DOS to assign exact
values for the segment registers.
 Upon taking control from DOS, of the three segment registers, only CS
and SS have the proper values.
 The DS value (and ES, if used) must be initialized by the program.
 This is done as follows:
 MOV AX,DTSEG
 MOV DS,AX

The Form of an Assembly Language Program
STSEG SEGMENT
DB 64 DUP (?)
STSEG ENDS

DTSEG SEGMENT
;place data here
DTSEG ENDS

CDSEG SEGMENT
MAIN PROC FAR ;this is the program entry point
ASSUME CS:CDSEG, DS:DTSEG, SS:STSEG
MOV AX,DTSEG ;bring in the segment for data
MOV DS,AX ;assign the DS value

;place code here

MOV AH,4CH
INT 21H
MAIN ENDP
CDSEG ENDS
END MAIN


Return Control to the Operating System

MOV AH,4CH
INT 21H

These two instructions return control to the operating system.

INT 21H Option 01H
Inputting a single character with echo

MOV AH, 01
INT 21H
 This function waits until a character is input from the keyboard, then echoes
it to the monitor.
 The input character (ASCII) will be in AL.

 To input a character without echo, use option 07H:
MOV AH, 07
INT 21H

INT 21H Option 02H


Outputting a single character to monitor

MOV AH, 02
MOV DL, 'M'
INT 21H
 DL is loaded with the character to be displayed.
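Options 01H and 02H can be combined in a short program that reads one key and displays it again. This is a minimal sketch; it needs no data segment because the character travels only through registers.

```asm
        .MODEL SMALL
        .STACK 64
        .CODE
MAIN    PROC FAR
        MOV AH,01H              ;wait for a key and echo it
        INT 21H                 ;ASCII code of the key is returned in AL
        MOV DL,AL               ;copy the character to DL for output
        MOV AH,02H              ;display the character in DL
        INT 21H
        MOV AH,4CH              ;return control to the OS
        INT 21H
MAIN    ENDP
        END MAIN
```

The typed character appears twice on the monitor: once from the echo of option 01H and once from option 02H.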

INT 21H Option 0AH
Inputting a string of data from keyboard, with echo

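DATA DB 6,?,6 DUP(0FFH)

MOV AH, 0AH
MOV DX, OFFSET DATA
INT 21H
 Gets data from the keyboard and stores it in a predefined area of memory
(a buffer) in the data segment.
 DX holds the offset address of the buffer.
 The first byte of the buffer specifies its size: the maximum number of
characters that can be entered, including the carriage return.
 The OS puts the number of characters that came in through the keyboard
(not counting the carriage return) in the second byte.
 The keyed-in data is placed in the buffer starting from the third byte.
 The last character in the data is the carriage return (0D).
 Buffer contents in four cases:
06 00 FF FF FF FF FF FF ;before input
06 04 'S' 'I' 'N' 'E' 0D FF ;after typing SINE
06 05 'M' 'a' 'h' 'm' 'o' 0D ;after typing Mahmoud (only the first five characters are accepted)
06 00 0D FF FF FF FF FF ;after pressing Enter only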

INT 21H Option 09H


outputting a string of data to the monitor

DATA DB 'My name is Ahmed','$'

MOV AH, 09H
MOV DX, OFFSET DATA
INT 21H
 Displays the ASCII data string pointed to by DX until it encounters '$'.
 DX holds the offset address of the data to be displayed.
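String input (option 0AH) and string output (option 09H) can be used together, as in the sketch below, which reads up to five characters and displays them on a new line. The names PROMPT, BUFFER, and NEWLINE are hypothetical. Since option 09H stops at '$', the program overwrites the carriage return stored by option 0AH with '$' before printing.

```asm
        .MODEL SMALL
        .STACK 64
        .DATA
PROMPT  DB 'Type up to 5 characters: ','$'
BUFFER  DB 6,?,6 DUP(0FFH)      ;size byte, count byte, 6 data bytes
NEWLINE DB 0DH,0AH,'$'
        .CODE
MAIN    PROC FAR
        MOV AX,@DATA
        MOV DS,AX
        MOV AH,09H              ;display the prompt
        MOV DX,OFFSET PROMPT
        INT 21H
        MOV AH,0AH              ;read a line into BUFFER
        MOV DX,OFFSET BUFFER
        INT 21H
        MOV BL,BUFFER+1         ;BL = number of characters typed
        MOV BH,0                ;BX = same count as a 16-bit index
        MOV SI,OFFSET BUFFER+2  ;SI points to the first typed character
        MOV BYTE PTR [SI+BX],'$' ;replace the carriage return with '$'
        MOV AH,09H              ;move to a new line
        MOV DX,OFFSET NEWLINE
        INT 21H
        MOV AH,09H              ;display the typed text
        MOV DX,OFFSET BUFFER+2
        INT 21H
        MOV AH,4CH
        INT 21H
MAIN    ENDP
        END MAIN
```

The count in the second byte of the buffer gives the position of the carriage return relative to the third byte, which is why it can be used directly as the index for the '$' terminator.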

HW

 1, 8, 9, 10, 13, 15, 16
