Coen3114 Intro To Assembly Language Programming PDF
5.1 Introduction
5.2 The Computer Organization - Intel PC
5.3 Instruction Format
5.4 Addressing Mode
5.5 DEBUG program
Introduction
Levels of Programming Languages
1) Machine Language
Consists of individual instructions that are executed by the CPU one at a time.
(ii) 8086
- Is similar to the 8088 but has a 16-bit data bus and runs faster
(iii) 80286
- Runs faster than the 8086 and 8088
- Can address up to 16 MB of internal memory
- Supports multitasking => more than 1 task can run simultaneously
(iv) 80386
- Has 32-bit registers and a 32-bit data bus
- Can address up to 4 billion bytes of memory
- Supports virtual mode, whereby it can swap portions of memory onto disk; in this way, programs running concurrently have space to operate
(v) 80486
- Has 32-bit registers and a 32-bit data bus
- Has an on-chip cache
(vi) Pentium
- Has 32-bit registers and a 64-bit data bus
- Has separate caches for data and instructions
- The processor can decode and execute more than one instruction in one clock cycle (superscalar pipelines)
In performing its task, the processor (CPU) is partitioned into two logical units:
1) An Execution Unit (EU)
2) A Bus Interface Unit (BIU)
EU
- Is responsible for program execution
- Contains an Arithmetic Logic Unit (ALU), a Control Unit (CU) and a number of registers
BIU
- Delivers data and instructions to the EU
- Manages the bus control unit, segment registers and instruction queue
- Controls the buses that transfer data to the EU, to memory and to external input/output devices, whereas the segment registers control memory addressing
The EU and BIU work in parallel, with the BIU keeping one step ahead. The EU notifies the BIU when it needs to access data in memory or an I/O device, or to obtain an instruction from the BIU instruction queue. While the EU executes an instruction, the BIU fetches the next instruction from memory and inserts it into the instruction queue.
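[Figure: the Execution Unit (EU) registers AX, BX, CX and DX (each divided into a high and a low byte: AH/AL, BH/BL, CH/CL, DH/DL), SP, BP, SI and DI, together with the Instruction Queue (entries 1 to 4)]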
Example: the register holds the word 0529H, which is stored in memory with its bytes reversed.
register: 05 29
memory:   29 at address 04A26H (low-order/least significant byte)
          05 at address 04A27H (high-order/most significant byte)
When the processor fetches the data (a word, or 2 bytes) from memory, it reverses the bytes again to restore the actual order, 0529H.
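The reversed-byte storage in this example can be checked with a short sketch. Python is used here only for illustration; the helper names `store_word` and `load_word` are hypothetical, and the dictionary merely stands in for main memory.

```python
import struct

def store_word(memory, address, value):
    """Store a 16-bit word with the low-order byte at the lower address."""
    low, high = struct.pack("<H", value)   # "<H" = little-endian unsigned 16-bit
    memory[address] = low
    memory[address + 1] = high

def load_word(memory, address):
    """Read two consecutive bytes and restore the original byte order."""
    raw = bytes([memory[address], memory[address + 1]])
    return struct.unpack("<H", raw)[0]

memory = {}                                # stand-in for main memory (assumption)
store_word(memory, 0x04A26, 0x0529)

print(f"{memory[0x04A26]:02X} {memory[0x04A27]:02X}")   # bytes as held in memory
print(f"{load_word(memory, 0x04A26):04X}")              # word as seen in the register
```

The first line printed shows the bytes in memory order (low byte first); the second shows the word after the processor has put the bytes back in their actual order.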
(ii) Data Segment (DS)
Contains the program's defined data, constants and work areas. The DS register stores the starting address of the data segment.
(iii) Stack Segment (SS)
Contains any data or addresses that the program needs to save temporarily, or that are used by your own called subroutines. The SS register holds the starting address of this segment.
[Figure: the SS, DS and CS registers each point to the start of their segment in main memory (MM)]
Segment Offsets
Within a program, all memory locations within a segment are relative to the segment's starting address. The distance in bytes from the segment address to another location within the segment is expressed as an offset (or displacement). Thus the first byte of the code segment is at offset 00, the second byte is at offset 01, and so forth. To reference any memory location in a segment (the actual address), the processor combines the segment address in a segment register with the offset value of that location.
actual address = segment address + offset
Eg: The starting address of the data segment is 038E0H, so the 16-bit DS register holds 038EH (the rightmost hex digit, which is always 0, is not stored). An instruction references a location with an offset of 0032H bytes from the start of the data segment.
actual address = DS segment address + offset = 038E0H + 0032H = 03912H
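The segment-plus-offset calculation can be sketched as follows. This is a minimal illustration, assuming 8086 real-mode addressing in which the 16-bit segment register value is shifted left by one hex digit (4 bits) to give the segment's starting address; the function name `physical_address` is hypothetical.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment register value with a 16-bit offset."""
    # Shifting left by 4 bits appends the implied rightmost hex digit 0,
    # giving the 20-bit starting address of the segment.
    segment_start = segment << 4
    # The 8086 address bus is 20 bits wide, so the sum is kept to 20 bits.
    return (segment_start + offset) & 0xFFFFF

# Data segment starting at 038E0H, offset 0032H
print(f"{physical_address(0x038E, 0x0032):05X}")
```

With the values from the example, the sketch prints 03912, the same actual address obtained by hand.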
Registers
Registers are used to control instructions being executed, to handle addressing of memory, and to provide arithmetic capability. Registers of Intel processors can be categorized into:
1. Segment register
2. Pointer register
3. General purpose register
4. Index register
5. Flag register
i) Segment register
There are 6 segment registers; CS, DS, SS and ES are described here:
(a) CS register
Contains the starting address of the program's code segment. The content of the CS register is added to the content of the Instruction Pointer (IP) register to obtain the address of the instruction that is to be fetched for execution. (Note: a common name for IP is PC (Program Counter))
(b) DS register
Contains the starting address of a program's data segment. The address in the DS register is added to the value in the address field (in the instruction format) to obtain the real address of the data in the data segment.
(c) SS register
Contains the starting address of the stack segment. The content of this register is added to the content of the Stack Pointer (SP) register to obtain the required word.
(d) ES register
Used by some string (character data) operations to handle memory addressing. The ES register is associated with the Destination Index (DI) register.
ii) Pointer register
(a) Instruction Pointer register
The 16-bit IP register contains the offset address (displacement) of the next instruction that will be executed by the CPU. The value in the IP register is added to the address in the CS register to obtain the real address of an instruction.
Example:
Segment address from the CS register =   39B40H
Offset in the IP register            = +   514H
Address of the next instruction      =   3A054H
Intel 80386 introduced a 32-bit IP, known as EIP (Extended IP).
(b) Stack Pointer Register (SP)
The 16-bit SP register stores the displacement value that is combined with the value in the SS register to obtain the required word in the stack. Intel 80386 introduced a 32-bit SP, known as ESP (Extended SP).
Example:
Segment address from the SS register =   4BB30H
Offset in the SP register            = +   412H
Address of the word in the stack     =   4BF42H
(c) Base Pointer Register
The 16-bit BP register facilitates referencing parameters, which are data and addresses that a program passes via the stack. The processor combines the address in SS with the offset in BP.
(iii) General Purpose Registers
There are 4 general-purpose registers: AX, BX, CX and DX.
(a) AX register
Acts as the accumulator and is used in operations that involve input/output and arithmetic. The AX register is divided as follows:
EAX : 32 bit
AX  : 16 bit (rightmost 16-bit portion of EAX)
AH  : 8 bit => leftmost 8 bits of AX (high portion)
AL  : 8 bit => rightmost 8 bits of AX (low portion)
(b) BX Register
o Known as the base register, since it is the only general purpose register that can be used as an index to extend addressing
o This register can also be used for computations
o BX can also be combined with the DI or SI register as a base register for special addressing
o Like AX, BX is divided into EBX (32 bit), BX (16 bit), BH (8 bit, high portion) and BL (8 bit, low portion)
(c) CX Register
o Known as the count register
o May contain a value to control the number of times a loop is repeated, or a value by which to shift bits left or right
o CX can also be used for many computations
o Like AX, CX is divided into ECX (32 bit), CX (16 bit), CH (8 bit, high portion) and CL (8 bit, low portion)
(d) DX Register
o Known as the data register
o Some I/O operations require its use
o Multiply and divide operations that involve large values assume the use of DX and AX together as a pair to hold the data or the result of the operation
o Like AX, DX is divided into EDX (32 bit), DX (16 bit), DH (8 bit, high portion) and DL (8 bit, low portion)
(iv) Index Register
There are 2 index registers, SI and DI.
(a) SI Register
o Needed in operations that involve strings (character data) and is usually associated with the DS register
o SI : 16 bit
o ESI : 32 bit (80386 and above)
(b) DI Register
o Also used in operations that involve strings (character data) and is associated with the ES register
o DI : 16 bit
o EDI : 32 bit (80386 and above)
(v) FLAG Register
o The flags register contains bits that show the status of some activities
o Instructions that involve comparison and arithmetic change the flag status, and some instructions refer to the value of a specific flag bit to decide the subsequent action
Flag:  -  -  -  -  O  D  I  T  S  Z  -  A  -  P  -  C
Bit:  15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
- 9 of its 16 bits indicate the current status of the computer and the results of processing
- the above diagram shows the stated 9 bits
OF (overflow): indicates overflow of a high-order (leftmost) bit following arithmetic
DF (direction): determines the left or right direction for moving or comparing string (character) data
IF (interrupt): indicates whether external interrupts, such as keyboard entry, are to be processed or ignored
TF (trap): permits operation of the processor in single-step mode; usually used in debugging
SF (sign): contains the resulting sign of an arithmetic operation (0 = +ve, 1 = -ve)
ZF (zero): indicates the result of an arithmetic or comparison operation (0 = nonzero result; 1 = zero result)
AF (auxiliary carry): contains a carry out of bit 3 into bit 4 in an arithmetic operation, for specialized arithmetic
PF (parity): indicates the number of 1-bits that result from an operation. An even number of bits causes so-called even parity and an odd number causes odd parity
CF (carry): contains carries from a high-order (leftmost) bit following an arithmetic operation; also contains the content of the last bit of a shift or rotate operation
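How the arithmetic flags follow from a result can be sketched for a 16-bit addition. This is an illustrative sketch only: the function name `add16_flags` is hypothetical, and it assumes the usual 8086 conventions that PF is set to 1 for even parity and is computed from the low-order byte of the result.

```python
def add16_flags(a, b):
    """Add two 16-bit values and derive the arithmetic flags for the result."""
    full = a + b
    result = full & 0xFFFF
    flags = {
        "CF": int(full > 0xFFFF),                           # carry out of bit 15
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),  # 1 = even parity (low byte)
        "AF": int(((a & 0xF) + (b & 0xF)) > 0xF),           # carry out of bit 3 into bit 4
        "ZF": int(result == 0),                             # 1 = zero result
        "SF": (result >> 15) & 1,                           # copy of the leftmost bit
        # Signed overflow: both operands differ in sign from the result.
        "OF": int(((a ^ result) & (b ^ result) & 0x8000) != 0),
    }
    return result, flags

for a, b in [(0x0123, 0x0025), (0x7FFF, 0x0001), (0xFFFF, 0x0001)]:
    result, flags = add16_flags(a, b)
    print(f"{a:04X} + {b:04X} = {result:04X}", flags)
```

The three sample additions are chosen to show a plain result, a signed overflow (OF and SF set) and an unsigned carry with a zero result (CF and ZF set).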
Instruction Format
The operation of the CPU is determined by the instructions it executes (machine or computer instructions). The CPU's instruction set is the collection of different instructions that the CPU can execute. Each instruction must contain the information required by the CPU for execution:
1. Operation code (opcode) -- specifies the operation to be performed (eg: ADD, I/O): "do this"
2. Source operand reference -- the operation may involve one or more source operands (input for the operation): "to this"
3. Result operand reference -- the operation may produce a result: "put the answer here"
4. Next instruction reference -- tells the CPU where to fetch the next instruction after the execution of this instruction is complete: "do this when you have done that"
Operands (source & result) can be in one of 3 areas:
- Main or virtual memory
- CPU register
- I/O device
It is not efficient to put all the information required by the CPU in a machine instruction. Each instruction is represented by a sequence of bits and is divided into 2 fields, opcode and address:
| opcode | address |
Processing becomes faster if all the information required by the CPU is in one instruction (one instruction format):
| opcode | address for Operand 1 | address for Operand 2 | address for Result | address for Next instruction |
Problem: the instruction becomes long (it takes a few words in main memory to store 1 instruction).
Solution: provide a few instruction formats (1-, 2- and 3-address) and addressing modes. The 2-address instruction format is the one most commonly used, e.g. by INTEL processors.
Opcodes are represented by abbreviations, called mnemonics, that indicate the operation. Common examples:
ADD    Add
SUB    Subtract
DIV    Divide
LOAD   Load data from memory
STOR   Store data to memory
Instruction-3-address
| opcode | address for Result | address for Operand 1 | address for Operand 2 |
Example: SUB Y A B   (Y = A - B)
- Result: Y
- Operand 1: A
- Operand 2: B
- Operation: subtract
- Address for next instruction? Program counter (PC)
Instruction-2-address
| opcode | address for Operand 1 & Result | address for Operand 2 |
Example: SUB Y B   (Y = Y - B)
- Operand 1: Y
- Operand 2: B
- The result replaces operand 1
- Address for next instruction? Program counter (PC)
Instruction-1-address
| opcode | address for Operand 2 |
Example:
LOAD A
ADD B   (AC = AC + B)
or
SUB B   (AC = AC - B)
- Operand 1 and the result are in AC (accumulator), a register
- SUB B: B is subtracted from AC and the result is stored in AC
- Address for next instruction? Program counter (PC)
- Short instructions (fewer bits required), but more instructions are needed for an arithmetic problem
Three-address instructions for Y = (A-B) / (C+D x E)
Instruction      Comment
SUB Y, A, B      Y <- A - B
MPY T, D, E      T <- D x E
ADD T, T, C      T <- T + C
DIV Y, Y, T      Y <- Y / T
Two-address instructions for Y = (A-B) / (C+D x E)
Instruction      Comment
MOVE Y, A        Y <- A
SUB Y, B         Y <- Y - B
MOVE T, D        T <- D
MPY T, E         T <- T x E
ADD T, C         T <- T + C
DIV Y, T         Y <- Y / T
One-address instructions for Y = (A-B) / (C+D x E)
Instruction      Comment
LOAD D           AC <- D
MPY E            AC <- AC x E
ADD C            AC <- AC + C
STOR Y           Y <- AC
LOAD A           AC <- A
SUB B            AC <- AC - B
DIV Y            AC <- AC / Y
STOR Y           Y <- AC
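To see that the one-address sequence really evaluates Y = (A-B) / (C+D x E), the accumulator machine can be mimicked in a few lines. This is only a sketch: the interpreter `run_one_address`, the dictionary used as memory and the sample values of A to E are all assumptions made for illustration.

```python
def run_one_address(program, memory):
    """Execute one-address instructions using a single accumulator (AC)."""
    ac = 0
    for opcode, name in program:
        if opcode == "LOAD":
            ac = memory[name]          # AC <- operand
        elif opcode == "STOR":
            memory[name] = ac          # operand <- AC
        elif opcode == "ADD":
            ac = ac + memory[name]
        elif opcode == "SUB":
            ac = ac - memory[name]
        elif opcode == "MPY":
            ac = ac * memory[name]
        elif opcode == "DIV":
            ac = ac / memory[name]
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return memory

program = [
    ("LOAD", "D"), ("MPY", "E"), ("ADD", "C"), ("STOR", "Y"),
    ("LOAD", "A"), ("SUB", "B"), ("DIV", "Y"), ("STOR", "Y"),
]
memory = {"A": 20, "B": 4, "C": 2, "D": 3, "E": 2, "Y": 0}   # sample values (assumed)
run_one_address(program, memory)
print(memory["Y"])   # (20 - 4) / (2 + 3 * 2)
```

Note how Y is used as a temporary for the denominator before it receives the final quotient, which is why the one-address version needs eight instructions where the three-address version needs four.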
Addressing Mode
| opcode | address |
Address field: holds the address of the operand & result. Its size is the number of bits available to specify where the data (operand & result) is stored.
Eg: if the field size for an operand is 4 bits, then 2^4 = 16 address locations can be used to store an operand.
How is the address of an operand specified? Addressing mode: a technique to identify the address used to access operands.
Immediate Addressing
Eg: ADD 20
- Operand = 20: add 20 to the contents of the accumulator (AC)
- If the value in AC = 10, the result is 30
Direct Addressing
- One memory reference to access data
- No additional calculations to work out the effective address (EA = A)
- Limited address space
[Figure: the address field A of the instruction points directly to the operand in memory]
Register Addressing
- Only a small address field is needed
- No memory references are required
- Fast execution
- But the address space is limited
[Figure: the register address field R of the instruction selects the register that holds the operand]
[Figures: the address field or the register holds a pointer to the operand in memory]
Base-Register Addressing
- A holds the displacement
- R holds a pointer to the base address
- R may be explicit or implicit, e.g. segment registers in 80x86
- EA = A + R
Indexed Addressing
- A = base
- R = displacement
- EA = A + R
- Good for accessing arrays: EA = A + R, then R++
Displacement Addressing
[Figures: the address field A of the instruction is combined with the contents of a register to locate the operand in memory. Relative Addressing uses the Program Counter (PC); the other diagrams use the Base Register and the Index Register.]
A better distinction between base and indexed addressing might be who or what makes the reference. Examples:
- Indexing is used within programs for accessing data structures
- Base addressing is used as a control measure by the OS to implement segmentation
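The addressing modes in this section differ only in how the operand is located, which a small lookup sketch can make concrete. Everything here is hypothetical: the `operand` function, the memory and register contents, and the mode names. The "indirect" and "register_indirect" cases are included on the assumption that the "pointer to operand" diagrams refer to those modes.

```python
memory = {20: 99, 30: 20, 105: 7, 120: 55}    # address -> contents (made-up values)
registers = {"R1": 20, "BX": 100}             # register -> contents (made-up values)

def operand(mode, a=None, r=None, pc=0):
    """Return the operand selected by address field a and/or register r."""
    if mode == "immediate":            # operand is the address field itself
        return a
    if mode == "direct":               # EA = A
        return memory[a]
    if mode == "indirect":             # memory[A] holds a pointer to the operand
        return memory[memory[a]]
    if mode == "register":             # operand is in the register
        return registers[r]
    if mode == "register_indirect":    # register holds a pointer to the operand
        return memory[registers[r]]
    if mode == "displacement":         # base-register or indexed: EA = A + R
        return memory[a + registers[r]]
    if mode == "relative":             # EA = A + PC
        return memory[a + pc]
    raise ValueError(f"unknown mode {mode}")

print(operand("immediate", a=20))
print(operand("direct", a=20))
print(operand("displacement", a=5, r="BX"))
print(operand("relative", a=20, pc=100))
```

Base-register and indexed addressing share the same arithmetic in this sketch; as noted above, the difference lies in which part is treated as the base and who makes the reference.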
P : Proceed or execute a set of related instructions
Q : Quit the DEBUG session
R : Display the contents of one or more registers in hex format
T : Trace the execution of one instruction
U : Unassemble (or disassemble) machine code into symbolic code
Note: refer to Appendix C (from the main reference), pg 513-519, for the complete DEBUG commands.
Example: To display the content of segment FE00H beginning from the first byte of the segment, in DEBUG mode type:
D FE00:0 or d fe00:0   (0 = first byte of the segment)
c:\>DEBUG
-d fe00:0
Address    Hexadecimal Representation                         ASCII Code
FE00:0000  41 77 61 72 64 20 53 6F-66 74 77 61 72 65 49 42   Award SoftwareIB
FE00:0010  4D 20 43 4F 4D 50 41 54-49 42 4C 45 20 34 38 36   M COMPATIBLE 486
FE00:0020  20 42 49 4F 53 20 43 4F-50 59 52 49 47 48 54 20    BIOS COPYRIGHT
FE00:0030  41 77 61 72 64 20 53 6F-66 74 77 61 72 65 20 49   Award Software I
FE00:0040  6E 63 2E 6F 66 74 77 61-72 65 20 49 6E 63 2E 20   nc.oftware Inc.
FE00:0050  41 77 03 0C 04 01 01 6F-66 74 77 E9 11 14 20 43   Aw.....oftw... C
FE00:0060  18 41 77 61 72 64 20 4D-6F 64 75 6C 61 72 20 42   .Award Modular B
FE00:0070  49 4F 53 20 76 36 2E 30-?? A6 32 EC 33 EC 35 EC   IOS v6.0..2.3.5.
- The DEBUG program can also be used to enter program code into memory and trace its execution.
- Below is an example of a program in machine language (written in hexadecimal) and assembly language (symbolic code), together with a description of the instructions.
Machine Instruction   Symbolic Code (Assembly Language)   Explanation
B82301                MOV AX,0123                         Move value 0123H to AX
052500                ADD AX,0025                         Add value 0025H to AX
8BD8                  MOV BX,AX                           Move contents of AX to BX
03D8                  ADD BX,AX                           Add contents of AX to BX
8BCB                  MOV CX,BX                           Move contents of BX to CX
2BC8                  SUB CX,AX                           Subtract contents of AX from CX
2BC0                  SUB AX,AX                           Subtract AX from AX
EBEE                  JMP 100                             Go back to the start
- The first and second instructions in the program above use immediate addressing mode, where the actual data value is in the address field:
MOV AX, 0123
ADD AX, 0025
- The other instructions use register addressing mode (general purpose registers).
- To enter instructions in machine language into memory (the code segment), use the e or E command followed by the code segment address, starting at CS:100 (the beginning address of instructions in a code segment; 100H = 256 bytes). Refer to Table 1 below. (E = Enter)
- To trace the execution of the program, use r or R first to view the contents of the CPU registers, followed by the command t or T. Refer to Table 2 below. (R = Register; T = Trace)
-e CS:100 B8 23 01 05 25 00
-e CS:106 8B D8 03 D8 8B CB
-e CS:10C 2B C8 2B C0 EB EE
Table 1
-r
AX=0000 BX=0000 DS=2090 ES=2090
2090:0100 B82301
-t
AX=0123 BX=0000 DS=2090 ES=2090
2090:0103 052500
-t
AX=0148 BX=0000 DS=2090 ES=2090
2090:0106 8BD8
-t
AX=0148 BX=0148 DS=2090 ES=2090
2090:0108 03D8
-t
AX=0148 BX=0290 DS=2090 ES=2090
2090:010A 8BCB
-t
AX=0148 BX=0290 DS=2090 ES=2090
2090:010C 2BC8
-t
AX=0148 BX=0290 DS=2090 ES=2090
2090:010E 2BC0
-t
AX=0000 BX=0290 DS=2090 ES=2090
2090:0110 EBEE
Table 2
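The register values shown by the trace can be reproduced away from DEBUG with a small simulation of the seven instructions that precede the JMP. This is a sketch under stated assumptions: the tuple encoding of the program and the `run` function are hypothetical, and only AX, BX and CX are modelled, each as a 16-bit value.

```python
PROGRAM = [
    ("MOV", "AX", 0x0123),
    ("ADD", "AX", 0x0025),
    ("MOV", "BX", "AX"),
    ("ADD", "BX", "AX"),
    ("MOV", "CX", "BX"),
    ("SUB", "CX", "AX"),
    ("SUB", "AX", "AX"),
]

def run(program):
    """Execute each instruction and record the registers after every step."""
    regs = {"AX": 0, "BX": 0, "CX": 0}
    history = []
    for op, dst, src in program:
        # A string source names a register; anything else is an immediate value.
        value = regs[src] if isinstance(src, str) else src
        if op == "MOV":
            regs[dst] = value
        elif op == "ADD":
            regs[dst] = (regs[dst] + value) & 0xFFFF   # keep to 16 bits
        elif op == "SUB":
            regs[dst] = (regs[dst] - value) & 0xFFFF
        else:
            raise ValueError(f"unknown instruction {op}")
        history.append(dict(regs))
    return history

for step in run(PROGRAM):
    print(" ".join(f"{name}={value:04X}" for name, value in step.items()))
```

Each printed line corresponds to the register display after one T command, so the AX and BX columns can be compared directly with the trace.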
- To view the instructions entered in the code segment, use the d command followed by cs:100 (100H is the starting byte address for instructions in the code segment). Refer to Table 3 below.
-d cs:100
2090:0100  B8 23 01 05 25 00 8B D8-03 D8 8B CB 2B C8 2B C0
2090:0110  EB EE E8 59 00 5F 5E 59-5B 58 5A 1F 34 00 7F 20
2090:0120  D5 E2 00 74 F7 1E 0E 1F-BE D5 E2 E8 98 02 2E A1   ...t............
[remaining rows of the display, which lie beyond the end of the program, not reproduced]
Table 3
5.5.4) Machine Language Example : Using Defined Data (Direct Addressing Mode)
- The example above uses immediate addressing mode for the MOV and ADD instructions (the first 2 instructions). The following is an example of entering a program in machine language using direct addressing mode (the data are in main memory, in the Data Segment). In this case, the data need to be entered into the Data Segment first. Assume the data position in the Data Segment is as below:
Table 4
EBF4
Table 5
- Enter the instructions (Table 5) and data (Table 4) above using the E or e command. Refer to Table 6 below.
- The first 2 rows are the commands that enter the program, which starts at byte 100H in the Code Segment (CS), whereas the last 2 rows are the commands that enter the data, which start at byte 200H in the Data Segment (DS).
- To view the instructions and data that are already entered, use the d or D command. Refer to Table 7 below.
-D CS:100,10B
2090:0100  A1 00 02 03 06 02 02 A3 04 02 EB F4
-D DS:200,208
2090:0200  23 01 25 00 00 00 2A 2A 2A
Table 7
Note: Rows 1 and 3 above are typed by the programmer, whereas rows 2 and 4 are what is displayed by the computer.
To view machine code for the assembly language entered, use the u or U command. (U Un-assemble)
-U 100,107
2090:0100
2090:0102
2090:0104
2090:0106
(each address is followed by the machine code for the instruction entered)
- To execute the above program, as usual, use the r or R command followed by the T or t command.
5.5.5)
- The DEBUG program can also be used to request information about the system by using the INT (interrupt) instruction.
- The INT instruction exits from a program, enters a DOS or BIOS routine, performs the requested function, and returns to the program.
Example 1: Getting the Current Date and Time
- The instruction to access the current date is INT 21H with function code 2AH. The function code 2AH must be moved to the AH register. The instructions are as follows:
MOV AH, 2A
INT 21
JMP 100
- Use the A command to enter the above instructions into the code segment.
- Type R to display the registers and T to execute the MOV.
- Type P to proceed directly through the interrupt routine; the operation stops at the JMP.
- The registers then contain the following information in hex format:
AL: Day of the week, where 0 = Sunday
CX: Year (for example, 07D4H = 2004)
DH: Month (01H through 0CH)
DL: Day of the month (01H through 1FH)
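Reading the date back from the hex register display takes a little decoding, which the following sketch illustrates. The function `decode_date` and the sample register values are hypothetical (they are not output from a real DEBUG session); the field layout is the one listed above, with DH and DL being the high and low bytes of DX.

```python
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

def decode_date(al, cx, dx):
    """Decode the date registers into (weekday name, year, month, day)."""
    dh = (dx >> 8) & 0xFF    # high byte of DX: month
    dl = dx & 0xFF           # low byte of DX: day of the month
    return DAYS[al], cx, dh, dl

# Made-up register contents: AL=03, CX=07D4, DX=0C0F
print(decode_date(0x03, 0x07D4, 0x0C0F))
```

With these sample values the sketch reports a Wednesday, year 2004, month 12, day 15.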
Example 2: Displaying
- To display data on the screen, enter the following instructions using the A 100 command:
100 MOV AH, 09
102 MOV DX, 109     (starting address of the data to display)
105 INT 21
107 JMP 100
109 DB 'MY NAME IS YANI', '$'
- Key in R to display the registers and the first instruction, and key in T commands for the two MOVs. Key in P to execute INT 21, and MY NAME IS YANI will be displayed on the screen.
<enter> twice
- The first instruction, MOV, provides function code 10H, which tells INT 16H to accept data from the keyboard.
- The operation delivers the character from the keyboard to the AL register.
- Key in R to display the registers and the first instruction, and key in a T command to execute the MOV.
- Type P for INT 16H; the system waits for you to press a key.
- If you press the number 1, the operation delivers 31H (hex for ASCII 1) to AL.