2 CPE 413 Intro To Assembly Lang Programming

This document provides an introduction to assembly language. It discusses assembly language statements like instructions, directives, and macros. It describes different types of assembly language instructions, constants, addressing modes, and data transfer instructions. It also explains concepts like data allocation, defining constants, and the structure of a basic assembly language program using directives like TITLE, MODEL, STACK, DATA, CODE, INCLUDE, PROC, ENDP, and END.

Uploaded by

Samuel jidayi
Introduction to Assembly Language

Prof. Christopher U. Ngene email: [email protected] 1


Outline

• Assembly language statements
• Data allocation
• Where are the operands?
• Addressing modes
• Register
• Immediate
• Direct
• Indirect
• Data transfer instructions
• mov, xchg, and xlat
• PTR directive
• Overview of assembly language instructions
• Arithmetic
• Conditional
• Logical
• Shift
• Rotate
• Defining constants
• EQU and = directives
• Illustrative examples
• Performance: When to use the xlat instruction
Assembly Language Statements
• Three different classes
• Instructions
• Tell CPU what to do
• Executable instructions with an op-code
• Directives (or pseudo-ops)
• Provide information to assembler on various aspects of the
assembly process
• Non-executable
• Do not generate machine language instructions
• Macros
• A shorthand notation for a group of statements
• A sophisticated text substitution mechanism with
parameters
Constants
• Integer Constants
• Examples: -10, 42d, 10001101b, 0FF3Ah, 777o
• Radix: b = binary, d = decimal, h = hexadecimal, and o = octal
• If no radix is given, the integer constant is decimal
• A hexadecimal beginning with a letter must have a leading 0
• Character and String Constants
• Enclose character or string in single or double quotes
• Examples: 'A', "d", 'ABC', "ABC", '4096'
• Embedded quotes: "single quote ' inside", 'double quote " inside'
• Each ASCII character occupies a single byte
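The radix rules above can be checked with a small Python model. This is only a sketch of the suffix convention described on this slide, not the assembler's actual parser; the function name `parse_masm_int` is made up for illustration.

```python
def parse_masm_int(text):
    """Convert an integer constant with an optional radix suffix to an int.

    Hypothetical helper: it models only the rules listed on the slide
    (b, o, d, h suffixes; decimal by default; leading digit required).
    """
    radix_by_suffix = {"b": 2, "o": 8, "d": 10, "h": 16}
    s = text.strip().lower()
    sign = 1
    if s.startswith("-"):
        sign, s = -1, s[1:]
    # A constant must begin with a digit, which is why hexadecimal
    # values that start with a letter need a leading 0.
    if not s or not s[0].isdigit():
        raise ValueError("constant must begin with a digit: " + text)
    radix = radix_by_suffix.get(s[-1])
    if radix is None:
        radix = 10          # no suffix: decimal
    else:
        s = s[:-1]          # drop the radix suffix
    return sign * int(s, radix)


for constant in ["-10", "42d", "10001101b", "0FF3Ah", "777o"]:
    print(constant, "=", parse_masm_int(constant))
```

Writing `FF3Ah` without the leading 0 raises an error here, mirroring the reason the assembler needs the 0: otherwise the text looks like an identifier.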
Instructions
• Assembly language instructions have the format:
[label:] mnemonic [operands] [;comment]
• Instruction Label (optional)
• Marks the address of an instruction, must have a colon :
• Used to transfer program execution to a labeled instruction
• Mnemonic
• Identifies the operation (e.g. MOV, ADD, SUB, JMP, CALL)
• Operands
• Specify the data required by the operation
• Executable instructions can have zero to three operands
• Operands can be registers, memory variables, or constants



Instruction Examples
• No operands
stc ; set carry flag
• One operand
inc eax ; increment register eax
call Clrscr ; call procedure Clrscr
jmp L1 ; jump to instruction with label L1
• Two operands
add ebx, ecx ; register ebx = ebx + ecx
sub var1, 25 ; memory variable var1 =
; var1 - 25
• Three operands
imul eax,ebx,5 ; register eax = ebx * 5
Comments
• Comments are very important!
• Explain the program's purpose
• When it was written, revised, and by whom
• Explain data used in the program
• Explain instruction sequences and algorithms used
• Application-specific explanations
• Single-line comments
• Begin with a semicolon ; and terminate at end of line
• Multi-line comments
• Begin with COMMENT directive and a chosen character
• End with the same chosen character
Flat Memory Program Template
TITLE Flat Memory Program Template (Template.asm)

; Program Description:
; Author: Creation Date:
; Modified by: Modification Date:

.386
.MODEL FLAT, STDCALL
.STACK

INCLUDE \masm32\INCLUDE\windows.inc
.DATA
; (Your initialized variables here)
.DATA?
; (Your uninitialized variables here)
.CONST
;(Your constants here)
.CODE
main PROC
; (insert executable instructions here)
exit
main ENDP
; (insert additional procedures here)
END main



TITLE and .MODEL Directives
• TITLE line (optional)
• Contains a brief heading of the program and the disk file name
• .MODEL directive
• Specifies the memory configuration
• For our purposes, the FLAT memory model will be used
• Linear 32-bit address space (no segmentation)
• The STDCALL option tells the assembler to use standard conventions for names and procedure calls
• .386 processor directive
• Used before the .MODEL directive
• Program can use the instructions of the 80386 processor (the .686 directive would be needed for the Pentium P6 instruction set)
• At least the .386 directive should be used with the FLAT model
.STACK, .DATA, & .CODE Directives
• .STACK directive
• Tells the assembler to define a runtime stack for the program
• The size of the stack can be optionally specified by this directive
• The runtime stack is required for procedure calls
• .DATA, .DATA?, .CONST, and .CODE directives
• All four directives define sections. Remember that there are no segments in Win32, but you can divide your entire address space into logical sections.
• The start of one section denotes the end of the previous section.
• There are two groups of sections: data and code. Data sections are divided into 3 categories:
• .DATA
• This section contains initialized data of your program.
• Assembler will allocate and initialize the storage of variables


.STACK, .DATA, & .CODE Directives…
• .DATA?
• This section contains uninitialized data of your program.
• Sometimes you just want to pre-allocate some memory but don't want to initialize it.
• The advantage of uninitialized data is that it doesn't take space in the executable file. For example, if you allocate 10,000 bytes in your .DATA? section, your executable is not bloated by 10,000 bytes. Its size stays much the same. You only tell the assembler how much space you need when the program is loaded into memory.
• .CONST
• This section contains declarations of constants used by your program. Constants in this section can never be modified in your program.
• .CODE directive
• Defines the code section of a program containing instructions
• Assembler will place the instructions in the code area in memory
You don't have to use all of the .DATA, .DATA? and .CONST sections in your program. Declare only the section(s) you want to use.


INCLUDE, PROC, ENDP, and END
• INCLUDE directive
• Causes the assembler to include code from another file
• We will include windows.inc
• Declares procedures implemented in the windows.lib library
• To use this library, you should link windows.lib to your programs
• PROC and ENDP directives
• Used to define procedures
• As a convention, we will define main as the first procedure
• Additional procedures can be defined after main
• END directive
• Marks the end of a program
• Identifies the name (main) of the program’s startup procedure
Data Allocation
• Variable declaration in a high-level language such as C
• char response
• int value
• float total
• double average_value
• specifies
• Amount of storage required (1 byte, 2 bytes, …)
• Label to identify the storage allocated (response, value, …)
• Interpretation of the bits stored (signed, floating point, …)
• Bit pattern 1000 1101 1011 1001 is interpreted as
• -29,255 as a signed number
• 36,281 as an unsigned number
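The two readings of that bit pattern can be reproduced in Python. This is a minimal sketch, assuming a 16-bit two's-complement interpretation for the signed case; the helper names are illustrative only.

```python
def as_unsigned(bits):
    """Read a binary string (spaces allowed) as an unsigned integer."""
    return int(bits.replace(" ", ""), 2)


def as_signed(bits):
    """Read the same bits as a two's-complement signed integer."""
    clean = bits.replace(" ", "")
    value = int(clean, 2)
    if clean[0] == "1":              # sign bit set: subtract 2**width
        value -= 1 << len(clean)
    return value


pattern = "1000 1101 1011 1001"
print(as_unsigned(pattern))  # same storage, read as unsigned
print(as_signed(pattern))    # same storage, read as signed
```

The stored bits never change; only the interpretation chosen by the code that uses them does.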
Data Allocation…
• In assembly language, we use the define directive
• Define directive can be used
• To reserve storage space
• To label the storage space
• To initialize
• But no interpretation is attached to the bits stored
• Interpretation is up to the program code
• Define directive goes into the .DATA part of the assembly
language program

• Define directive format


[var-name] D? init-value [,init-value],...



Data Allocation….
• Five define directives
DB Define Byte ;allocates 1 byte
DW Define Word ;allocates 2 bytes
DD Define Doubleword ;allocates 4 bytes
DQ Define Quadword ;allocates 8 bytes
DT Define Ten bytes ;allocates 10 bytes
• Examples
sorted DB 'y'
response DB ? ;no initialization
value DW 25159
float1 DQ 1.234



Data Allocation….
• Multiple definitions can be abbreviated
• Example
message DB 'B'
DB 'y'
DB 'e'
DB 0DH
DB 0AH
can be written as
message DB 'B', 'y', 'e', 0DH, 0AH
• More compactly as
message DB 'Bye', 0DH, 0AH



Data Allocation…
• Multiple definitions can be cumbersome to initialize data
structures such as arrays
• Example
• To declare and initialize an integer array of 8 elements
marks DW 0, 0, 0, 0, 0, 0, 0, 0
• What if we want to declare and initialize to zero an array
of 200 elements?
• There is a better way of doing this than repeating zero 200
times in the above statement
• Assembler provides a directive to do this (DUP directive)



Data Allocation…
• Multiple initializations
• The DUP assembler directive allows multiple initializations to
the same value
• Previous marks array can be compactly declared as
marks DW 8 DUP (0)
• Examples
table1 DW 10 DUP (?) ;10 words, uninitialized
message DB 3 DUP ('Bye!') ;12 bytes, initialized
; as Bye!Bye!Bye!
Name1 DB 30 DUP ('?') ;30 bytes, each
; initialized to ?


Data Allocation….
• The DUP directive may also be nested
• Example
stars DB 4 DUP (3 DUP ('*'), 2 DUP ('?'), 5 DUP ('!'))
Reserves 40-bytes space and initializes it as
***??!!!!!***??!!!!!***??!!!!!***??!!!!!
• Example
matrix DW 10 DUP (5 DUP (0))
defines a 10 x 5 matrix and initializes its elements to
zero. This declaration can also be done by

matrix DW 50 DUP (0)
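To see how nested DUP expressions expand, the following Python sketch models `count DUP (items)` as list repetition. The `dup` function is a hypothetical stand-in for the directive and says nothing about how the assembler implements it.

```python
def dup(count, *items):
    """Model `count DUP (item, item, ...)`.

    Each item is either a single value or the list returned by a
    nested dup() call; the whole item sequence is repeated count times.
    """
    expanded = []
    for _ in range(count):
        for item in items:
            if isinstance(item, list):
                expanded.extend(item)   # nested DUP result
            else:
                expanded.append(item)   # single initial value
    return expanded


stars = dup(4, dup(3, "*"), dup(2, "?"), dup(5, "!"))
print(len(stars), "".join(stars))

# A 10 x 5 matrix of zeros occupies the same storage as 50 zeros.
print(dup(10, dup(5, 0)) == dup(50, 0))
```

The last comparison illustrates why the two matrix declarations are interchangeable: the assembler only sees a flat run of 50 words.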



Data Allocation…
Symbol Table
• Assembler builds a symbol table
• so we can refer to the allocated storage space by the associated
label
• Assembler keeps track of each name and its offset
• Offset of a variable is relative to the address of the first variable.



Data Allocation…

Correspondence to C Data Types



Data Allocation….
LABEL Directive
• LABEL directive provides another way to name a memory location
• Format:
name LABEL type
• Type can be:
BYTE 1 byte
WORD 2 bytes
DWORD 4 bytes
QWORD 8 bytes
• Example
• count refers to the 16-bit value
• Lo_count refers to the low byte
• Hi_count refers to the high byte
Where Are the Operands?
• Operands required by an operation can be specified in a variety of ways
• A few basic ways are:
• operand in a register
• register addressing mode
• operand in the instruction itself
• immediate addressing mode
• operand in memory
• variety of addressing modes
• direct and indirect addressing modes
• operand at an I/O port
Addressing mode refers to the specification of the location of data required by an operation
Register addressing mode
• Most efficient way of specifying an operand
• operand is in an internal register
• Examples
mov EAX, EBX
mov BX, CX
• The mov instruction
mov destination, source
• copies data from source to destination
Where Are the Operands?...
Immediate addressing mode
• Data is part of the instruction
• operand is located in the code segment along with the instruction
• Efficient as no separate operand fetch is needed
• Typically used to specify a constant
• Example
mov AL, 75
• This instruction uses register addressing mode for specifying the destination and immediate addressing mode to specify the source
Direct addressing mode
• Data is in the data segment
• Need a logical address to access data
• Two components: segment:offset
• Various addressing modes to specify the offset component
• offset part is referred to as the effective address
• The offset is specified directly as part of the instruction
• We write assembly language programs using memory labels (e.g., declared using DB, DW, LABEL,...)
• Assembler computes the offset value for the label
• Uses symbol table to compute the offset of a label
Addressing Modes: Direct Addressing
Direct addressing mode
Examples
mov AL, response
» Assembler replaces response by its effective address (i.e., its offset value
from the symbol table)

mov table1, 56
» table1 is declared as

table1 DW 20 DUP (0)

» Since the assembler replaces table1 by its effective address, this instruction
refers to the first element of table1
– In C, it is equivalent to

table1[0] = 56
Addressing Modes: Direct Addressing…

• Problem with direct addressing


• Useful only to specify simple variables
• Causes serious problems in addressing data types
such as arrays
• As an example, consider adding elements of an array
• Direct addressing does not facilitate using a loop structure to
iterate through the array
• We have to write an instruction to add each element of the array

• Indirect addressing mode remedies this problem



Addressing Modes: Indirect Addressing

• The offset is specified indirectly via a register


• Sometimes called register indirect addressing mode
• For 16-bit addressing, the offset value can be in one of the three
registers: BX, SI, or DI
• For 32-bit addressing, all 32-bit registers can be used
• Example
mov EBX, OFFSET array ; EBX contains the address of the operand
mov AX, [EBX] ; EBX used to access memory
• Square brackets [ ] are used to indicate that EBX is holding an offset value
• EBX contains a pointer to the operand, not the operand itself


Addressing Modes: Indirect Addressing…
• Using indirect addressing mode, we can process
arrays using loops

• Example: Summing array elements

• Load the starting address (i.e., offset) of the array into BX


• Loop for each element in the array
• Get the value using the offset in BX
• Use indirect addressing
• Add the value to the running total
• Update the offset in BX to point to the next element of the
array
Indirect Addressing: Array Sum Example
• Indirect addressing is ideal for traversing an array
.data
array DWORD 10000h,20000h,30000h
.code
mov esi, OFFSET array ; esi = array address
mov eax,[esi] ; eax = [array] = 10000h
add esi,4 ; why 4?
add eax,[esi] ; eax = eax + [array+4]
add esi,4 ; why 4?
add eax,[esi] ; eax = eax + [array+8]

Note that ESI register is used as a pointer to array


ESI must be incremented by 4 to access the next array element
Because each array element is 4 bytes (DWORD) in memory
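The pointer arithmetic in this example can be traced with a Python model of memory. This is a sketch under stated assumptions: memory is a little-endian byte string, the array starts at offset 0, and the names `sum_dwords`, `esi`, and `eax` merely mirror the registers in the listing.

```python
import struct


def sum_dwords(memory, offset, count):
    """Sum `count` 32-bit values starting at `offset`, pointer style."""
    esi = offset                      # esi = address of first element
    eax = 0
    for _ in range(count):
        (value,) = struct.unpack_from("<I", memory, esi)   # mov/add eax,[esi]
        eax = (eax + value) & 0xFFFFFFFF                   # keep 32 bits
        esi += 4                      # next DWORD is 4 bytes further on
    return eax


# array DWORD 10000h,20000h,30000h
memory = struct.pack("<3I", 0x10000, 0x20000, 0x30000)
print(hex(sum_dwords(memory, 0, 3)))
```

Stepping `esi` by 1 instead of 4 would read overlapping bytes of neighbouring elements, which is the mistake the "why 4?" comments point at.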



Loading offset value into a register

• Suppose we want to load BX with the offset value of table1
• We cannot write
mov BX, table1
• because this copies the word stored at table1 into BX, not its offset
• Two ways of loading offset value
• Using OFFSET assembler directive
• Executed only at the assembly time
• Using lea instruction
• This is a processor instruction
• Executed at run time



OFFSET and LEA
Loading offset value into a register

• Using OFFSET assembler directive


• The previous example can be written as
mov BX, OFFSET table1

• Using lea (Load Effective Address) instruction


• The format of lea instruction is
lea register, source
• The previous example can be written as
lea BX, table1
OFFSET and LEA…
Loading offset value into a register
• Which one to use -- OFFSET or lea?
• Use OFFSET if possible
• OFFSET incurs only one-time overhead (at assembly time)
• lea incurs run time overhead (every time you run the program)
• May have to use lea in some instances
• When the needed data is available at run time only
• An index passed as a parameter to a procedure
• We can write
lea BX, table1[SI]
• to load BX with the address of an element of table1 whose
index is in SI register
• We cannot use the OFFSET directive in this case



LEA Example
.data
array WORD 1000 DUP(?)

.code ; Equivalent to . . .
lea eax, array ; mov eax, OFFSET array

lea eax, array[esi] ; mov eax, esi
; add eax, OFFSET array

lea eax, array[esi*2] ; mov eax, esi
; add eax, eax
; add eax, OFFSET array

lea eax, [ebx+esi*2] ; mov eax, esi
; add eax, eax
; add eax, ebx


DATA TRANSFER INSTRUCTIONS



Data Transfer Instructions

• We will look at three instructions
• mov (move)
• Actually copy
• xchg (exchange)
• Exchanges two operands
• xlat (translate)
• Translates byte values using a translation table
• Other data transfer instructions such as
• movsx (move sign extended)
• movzx (move zero extended)
The mov instruction
• The format is
mov destination, source
• Copies the value from source to destination
• source is not altered as a result of copying
• Both operands should be of same size
• source and destination cannot both be in memory
• Most Pentium instructions do not allow both operands to be located in memory
• Pentium provides special instructions to facilitate memory-to-memory block copying of data
Data Transfer Instructions…

Mov Instruction Rules


• Both operands must be of same size
• No memory to memory moves
• Destination cannot be CS, EIP, or IP
• No immediate to segment moves

Programs running in protected mode should not modify the segment registers
Data Transfer Instructions…
The mov instruction

• Five types of operand combinations are allowed (destination, source):
• register, register
• register, immediate
• register, memory
• memory, register
• memory, immediate
• The operand combinations are valid for all instructions that require two operands
Data Transfer Instructions…
Zero Extension
• MOVZX Instruction
• Fills (extends) the upper part of the destination with zeros
• Used to copy a small source into a larger destination
• Destination must be a register

Example:
mov bl, 8Fh
movzx ax, bl



Data Transfer Instructions…
Sign Extension
• MOVSX Instruction
• Fills (extends) the upper part of the destination register with a
copy of the source operand's sign bit
• Used to copy a small source into a larger destination

Example:
mov bl, 8Fh
movsx ax, bl
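The difference between the two extensions shows up clearly when the source byte has its top bit set, as 8Fh does. The Python sketch below models both operations on plain integers; the function names and the explicit bit-width parameters are illustrative assumptions, not part of the instruction set.

```python
def zero_extend(value, from_bits, to_bits):
    """Model movzx: keep the low from_bits, upper bits become 0."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits, to_bits):
    """Model movsx: copy the source sign bit into all upper bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):                 # sign bit set?
        upper_mask = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper_mask                            # fill upper bits with 1s
    return value


bl = 0x8F
print(format(zero_extend(bl, 8, 16), "04X"))  # movzx ax, bl
print(format(sign_extend(bl, 8, 16), "04X"))  # movsx ax, bl
```

For a source such as 7Fh, whose sign bit is clear, both functions return the same value.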



Data Transfer Instructions…
Ambiguous moves: PTR directive
• For the following data definitions
.DATA
table1 DW 20 DUP (0)
status DB 7 DUP (1)
• the last two mov instructions are ambiguous
mov BX, OFFSET table1
mov SI, OFFSET status
mov [BX], 100
mov [SI], 100
• Not clear whether the assembler should use byte or word equivalent of 100
• PTR assembler directive can be used to clarify the operand size.
• The last two mov instructions can be written as
mov WORD PTR [BX], 100
mov BYTE PTR [SI], 100
• WORD and BYTE are called type specifiers
• We can also use the following type specifiers:
• DWORD for doubleword values
• QWORD for quadword values
• TWORD for ten byte values


Data Transfer Instructions…
The xchg instruction
• The syntax is
xchg operand1, operand2
• Exchanges the values of operand1 and operand2
• Examples
xchg EAX, EDX
xchg response, CL
xchg total, DX
• Without the xchg instruction, we need a temporary register to exchange values using only the mov instruction
• The xchg instruction is useful for conversion of 16-bit data between little endian and big endian forms
• Example:
xchg AL, AH
• converts the data in AX into the other endian form
• Pentium provides bswap instruction to do similar conversion on 32-bit data
bswap 32-bit register
• bswap works only on data located in a 32-bit register
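The effect of exchanging AL with AH, and of bswap on a 32-bit register, can be modelled on plain integers. This is a sketch for checking the byte order by hand; `swap16` and `bswap32` are illustrative names, not library functions.

```python
def swap16(ax):
    """Model `xchg AL, AH`: exchange the two bytes of a 16-bit value."""
    low = ax & 0xFF
    high = (ax >> 8) & 0xFF
    return (low << 8) | high


def bswap32(eax):
    """Model `bswap`: reverse the four bytes of a 32-bit value."""
    return int.from_bytes(eax.to_bytes(4, "little"), "big")


print(format(swap16(0x1234), "04X"))
print(format(bswap32(0x12345678), "08X"))
```

Applying either function twice returns the original value, which is why one instruction serves for conversion in both directions.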
Data Transfer Instructions…
The xlat instruction
• The xlat instruction translates bytes
• The format is
xlatb
• To use xlat instruction
• BX should be loaded with the starting address of the translation table
• AL must contain an index into the table
• Index value starts at zero
• The instruction reads the byte at this index in the translation table and stores this value in AL
• The index value in AL is lost
• Translation table can have at most 256 entries (due to AL)
• Example: Encrypting digits
Input digits: 0 1 2 3 4 5 6 7 8 9
Encrypted digits: 4 6 9 5 0 3 1 8 7 2
.DATA
xlat_table DB '4695031872'
...
.CODE
mov BX, OFFSET xlat_table
GetCh AL
sub AL, '0' ; converts input character to index
xlatb ; AL = encrypted digit character
PutCh AL
...
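The table lookup performed by xlat is easy to reproduce in Python with the same translation table. This is a minimal sketch of the digit-encryption example; `xlat` and `encrypt_digit` are hypothetical helper names, and the table plays the role of the memory that BX points to.

```python
XLAT_TABLE = b"4695031872"   # same contents as xlat_table in the slide


def xlat(table, al):
    """Model xlatb: return the table byte at index AL (0..255)."""
    if not 0 <= al <= 0xFF:
        raise ValueError("AL holds only 8 bits")
    return table[al]


def encrypt_digit(ch):
    al = ord(ch) - ord("0")          # sub AL, '0' : character to index
    return chr(xlat(XLAT_TABLE, al)) # xlatb       : AL = table[AL]


print("".join(encrypt_digit(c) for c in "0123456789"))
```

Because the index lives in an 8-bit register, the range check above corresponds to the 256-entry limit on the table.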
The xchg instruction…

XCHG Rules
• Operands must be of the same size
• At least one operand must be a register
• No immediate operands are permitted
• No exchange of two memory operands



MORE ADDRESSING MODES



Addressing Modes: Index Addressing
• Combines a variable's name with an index register
• Assembler converts variable's name into a constant offset
• Constant offset is added to register to form an effective address
• Syntax: [name + index] or name [index]
.data
array DWORD 10000h,20000h,30000h
.code
mov esi, 0 ; esi = array index
mov eax,array[esi] ; eax = array[0] = 10000h
add esi,4
add eax,array[esi] ; eax = eax + array[4]
add esi,4
add eax,[array+esi] ; eax = eax + array[8]
Addressing Modes: Index Addressing...
Index Scaling
• Useful to index array elements of size 2, 4, and 8 bytes
• Syntax: [name + index * scale] or name [index * scale]
• Effective address is computed as follows:
• Name's offset + Index register * Scale factor

.DATA
arrayB BYTE 10h,20h,30h,40h
arrayW WORD 100h,200h,300h,400h
arrayD DWORD 10000h,20000h,30000h,40000h
.CODE
mov esi, 2
mov al, arrayB[esi] ; AL = 30h
mov ax, arrayW[esi*2] ; AX = 300h
mov eax, arrayD[esi*4] ; EAX = 30000h
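The effective-address formula can be checked against the three arrays with a Python model. This is a sketch under stated assumptions: the arrays are packed back to back starting at offset 0, so the offsets 0, 4, and 12 are hypothetical values standing in for whatever the assembler would assign.

```python
import struct


def effective_address(displacement, index, scale=1):
    """Name's offset + index register * scale factor."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4, or 8")
    return displacement + index * scale


# Data section laid out contiguously (assumed offsets shown on the right).
memory = struct.pack(
    "<4B4H4I",
    0x10, 0x20, 0x30, 0x40,                  # arrayB at offset 0
    0x100, 0x200, 0x300, 0x400,              # arrayW at offset 4
    0x10000, 0x20000, 0x30000, 0x40000,      # arrayD at offset 12
)
ARRAY_B, ARRAY_W, ARRAY_D = 0, 4, 12

esi = 2
al = memory[effective_address(ARRAY_B, esi)]
(ax,) = struct.unpack_from("<H", memory, effective_address(ARRAY_W, esi, 2))
(eax,) = struct.unpack_from("<I", memory, effective_address(ARRAY_D, esi, 4))
print(hex(al), hex(ax), hex(eax))
```

With the scale factor matching the element size, the same index value selects the third element of each array.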



Addressing Modes: Based Addressing
• Syntax: [Base + Offset]
• Effective Address = Base register + Constant Offset
• Useful to access fields of a structure or an object
• Base register: points to the base address of the structure
• Constant offset: relative offset within the structure

mystruct is a structure consisting of 3 fields: a word, a double word, and a byte
.DATA
mystruct WORD 12
DWORD 1985
BYTE 'M'
.CODE
mov ebx, OFFSET mystruct
mov eax, [ebx+2] ; EAX = 1985
mov al, [ebx+6] ; AL = 'M'
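The field offsets 2 and 6 follow from the sizes of the preceding fields, which the Python sketch below makes explicit. It assumes the three fields are stored back to back with no padding, as the data definitions above imply; `ebx` is just a variable standing in for the base register.

```python
import struct

# mystruct: WORD 12, DWORD 1985, BYTE 'M' packed without padding.
mystruct = struct.pack("<HIB", 12, 1985, ord("M"))

ebx = 0   # base address of the structure inside this byte string

(word_field,) = struct.unpack_from("<H", mystruct, ebx + 0)
(dword_field,) = struct.unpack_from("<I", mystruct, ebx + 2)   # after the 2-byte word
(byte_field,) = struct.unpack_from("<B", mystruct, ebx + 6)    # after word + dword

print(word_field, dword_field, chr(byte_field), len(mystruct))
```

A high-level language compiler may insert alignment padding between such fields, so the offsets 2 and 6 apply to this assembly layout specifically.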



Addressing Modes: Based-Indexed Addressing
• Syntax: [Base + (Index * Scale) + Offset]
• Scale factor is optional and can be 1, 2, 4, or 8
• Useful in accessing two-dimensional arrays
• Offset: array address => we can refer to the array by name
• Base register: holds row address => relative to start of array
• Index register: selects an element of the row => column index
• Scaling factor: when array element size is 2, 4, or 8 bytes
• Useful in accessing arrays of structures (or objects)
• Base register: holds the address of the array
• Index register: holds the element address relative to the base
• Offset: represents the offset of a field within a structure
Addressing Modes: Based-Indexed Addressing…
• Example
.data
matrix DWORD 0, 1, 2, 3, 4 ; 4 rows, 5 cols
DWORD 10,11,12,13,14
DWORD 20,21,22,23,24
DWORD 30,31,32,33,34

ROWSIZE EQU SIZEOF matrix ; 20 bytes per row

.code
mov ebx, 2*ROWSIZE ; row index = 2
mov esi, 3 ; col index = 3
mov eax, matrix[ebx+esi*4] ; EAX = matrix[2][3]

mov ebx, 3*ROWSIZE ; row index = 3
mov esi, 1 ; col index = 1
mov eax, matrix[ebx+esi*4] ; EAX = matrix[3][1]
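The two loads above can be traced in Python by laying the matrix out as bytes and computing the same base, index, and scale. This is a sketch, assuming the matrix starts at offset 0 of a little-endian byte string; `load_element` is a hypothetical helper.

```python
import struct

COLS, ELEMENT_SIZE = 5, 4
ROWSIZE = COLS * ELEMENT_SIZE        # 20 bytes per row, as in the slide

values = [0, 1, 2, 3, 4,
          10, 11, 12, 13, 14,
          20, 21, 22, 23, 24,
          30, 31, 32, 33, 34]
memory = struct.pack("<20I", *values)   # row-major storage


def load_element(row, col):
    """Model `mov eax, matrix[ebx+esi*4]`."""
    ebx = row * ROWSIZE               # byte offset of the row
    esi = col                         # column index, scaled by 4 below
    (eax,) = struct.unpack_from("<I", memory, ebx + esi * ELEMENT_SIZE)
    return eax


print(load_element(2, 3), load_element(3, 1))
```

The values were chosen in the slide so that each element encodes its own row and column, which makes a wrong offset easy to spot.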



Summary of Addressing Modes
Assembler converts a variable name into a constant offset (also called a displacement)
For indirect addressing, a base/index register contains an address/index
CPU computes the effective address of a memory operand


Registers Used in 32-Bit Addressing
• 32-bit addressing modes use the following 32-bit registers
Base + ( Index * Scale ) + displacement
• Base: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
• Index: EAX, EBX, ECX, EDX, ESI, EDI, EBP
• Scale: 1, 2, 4, 8
• Displacement: no displacement, 8-bit displacement, 32-bit displacement
• Only the index register can have a scale factor
• ESP can be used as a base register, but not as an index
Differences between 16- and 32-bit Modes



One-Dimensional Arrays

• Array declaration in HLL (such as C)
int test_marks [10];
• specifies a lot of information about the array:
• Name of the array (test_marks)
• Number of elements (10)
• Element size (2 bytes, assuming a 16-bit int)
• Interpretation of each element (int i.e., signed integer)
• Index range (0 to 9 in C)
• You get very little help in assembly language!
• In assembly language, declaration such as
test_marks DW 10 DUP (?)
• only assigns name and allocates storage space.
• You, as the assembly language programmer, have to “properly” access the array elements by taking into account the element size and the range of subscripts.
• Accessing an array element requires its displacement or offset relative to the start of the array in bytes
One-Dimensional Arrays…
• To compute displacement, we need to know how the array is laid out
• Simple for 1-D arrays
• Assuming C style subscripts (i.e., subscript starts at zero)
displacement = subscript * element size in bytes
• If the element size is 2, 4, or 8 bytes, a scale factor can be used to avoid counting displacement in bytes
Multidimensional Arrays
• We focus on two-dimensional arrays
• Our discussion can be generalized to higher dimensions
• A 5x3 array can be declared in C as
int class_marks [5] [3]; /*5 rows and 3 columns*/
• Two dimensional arrays can be stored in one of two ways:
• Row-major order
• Array is stored row by row starting with the first row
• Most HLL including C and Pascal use this method
• Column-major order
• Array is stored column by column starting with the first column.
• FORTRAN uses this method



Multidimensional Arrays…

(a) Row-major order (b) Column-major order



Multidimensional Arrays…
• Why do we need to know the underlying storage
representation?
• In a HLL, we really don’t need to know
• In assembly language, we need this information as we have to
calculate displacement of element to be accessed

• In assembly language,
class_marks DW 5*3 DUP (?)
• allocates 30 bytes of storage
• There is no support for using row and column subscripts
• Need to translate these subscripts into a displacement value
Multidimensional Arrays…
• Assuming C language subscript convention, we can
express displacement of an element in a 2-D array at row
i and column j as
displacement = (i * COLUMNS + j) * ELEMENT_SIZE
• where
• COLUMNS = number of columns in the array
• ELEMENT_SIZE = element size in bytes

• Example: the displacement of element class_marks[3][1] in the 5 x 3 array is (3*3 + 1) * 2 = 20
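The displacement formula can be written as a one-line Python function and checked against the worked example. The column-major variant is not given in the slides; it is included here as the standard counterpart, on the assumption that columns are stored one after another, and both function names are illustrative.

```python
def row_major_displacement(i, j, columns, element_size):
    """Byte displacement of element [i][j] when rows are stored in order."""
    return (i * columns + j) * element_size


def column_major_displacement(i, j, rows, element_size):
    """Byte displacement of element [i][j] when columns are stored in order.

    Assumed counterpart of the slide formula, not taken from the source.
    """
    return (j * rows + i) * element_size


# class_marks is 5 rows x 3 columns of 2-byte elements.
print(row_major_displacement(3, 1, columns=3, element_size=2))
print(column_major_displacement(3, 1, rows=5, element_size=2))
```

The same subscripts map to different byte offsets under the two layouts, which is why the assembly programmer has to know which one is in use.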
Arrays: Example
• Example 1
• One-dimensional array
• Computes array sum (each element is 4 bytes long e.g., long
integers)
• Uses scale factor 4 to access elements of the array by using a
32-bit addressing mode (uses ESI rather than SI)
• Also illustrates the use of predefined location counter $
• Example 2
• Two-dimensional array
• Finds sum of a column
• Uses “based-indexed addressing with scale factor” to access
elements of a column



Example 1
1: TITLE Sum of a long integer array ARRAY_SUM.ASM
2: COMMENT |
3: Objective: To find sum of all elements of an array.
4: Input: None
5: | Output: Displays the sum.
6: .386
7: .MODEL FLAT, STDCALL
8: .STACK 100H
9: .DATA
10: test_marks DD 90,50,70,94,81,40,67,55,60,73
11: NO_STUDENTS EQU ($ - test_marks)/4 ; number of students
12: sum_msg DB 'The sum of test marks is: ', 0
13:
14: .CODE
15:
Example 1…
18: mov ECX, NO_STUDENTS ; loop iteration count (loop uses ECX in 32-bit code)
19: sub EAX, EAX ; sum := 0
20: sub ESI, ESI ; array index := 0
21: add_loop:
22: mov EBX, test_marks [ESI*4]
23: PutLInt EBX
24: nwln
25: add EAX, test_marks [ESI*4]
26: inc ESI
27: loop add_loop
28:
29: PutStr sum_msg
30: PutLInt EAX
31: nwln
32: .EXIT
Example 1…
• Each element of the test_marks array, declared on line 10,
requires 4 bytes.
• The array size NO_STUDENTS is computed on line 11 using
the predefined location counter symbol $.
• The predefined symbol $ is always set to the current offset in
the segment.
• Thus, on line 11, $ points to the byte after the array storage
space. Therefore, ($-test_marks) gives the storage space in bytes
and dividing this by four gives the number of elements in the
array.
• We are using the indexed addressing mode on lines 22 and 25
where a scale factor of 4 is used.
• Remember that scale factor is only allowed in the 32-bit mode. As a
result, we have to use ESI rather than the SI register.
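The role of the location counter and the scaled index in Example 1 can be mirrored in Python. This is a sketch, assuming test_marks sits at offset 0 of the data section; `location_counter` stands in for `$` and `array_sum` for the loop on lines 21 to 27.

```python
import struct

test_marks = [90, 50, 70, 94, 81, 40, 67, 55, 60, 73]
data_section = struct.pack("<10I", *test_marks)    # DD = 4 bytes per element

start_offset = 0                                   # offset of test_marks (assumed)
location_counter = start_offset + len(data_section)   # value of $ on line 11
NO_STUDENTS = (location_counter - start_offset) // 4


def array_sum(memory, count):
    """Model the add_loop: eax += test_marks[esi*4] for each index."""
    eax = 0
    for esi in range(count):
        (mark,) = struct.unpack_from("<I", memory, esi * 4)  # scale factor 4
        eax = (eax + mark) & 0xFFFFFFFF
    return eax


print(NO_STUDENTS, array_sum(data_section, NO_STUDENTS))
```

Adding or removing a mark changes the element count automatically, which is the point of deriving it from the location counter instead of writing 10 by hand.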
Example 2
1: TITLE Sum of a column in a 2-dimensional array TEST_SUM.ASM
2: COMMENT |
3: Objective: To demonstrate array index manipulation
4: in a two-dimensional array of integers.
5: Input: None
6: | Output: Displays the sum.
7: .MODEL FLAT, STDCALL
8: .STACK 100H
9: .DATA
10: NO_ROWS EQU 5
11: NO_COLUMNS EQU 3
12: NO_ROW_BYTES EQU NO_COLUMNS * 2 ; number of bytes per row
13: class_marks DW 90, 89, 99
14: DW 79, 66, 70
15: DW 70, 60, 77
16: DW 60, 55, 68
Example 2…
19: sum_msg DB 'The sum of the last test marks is: ',0
20:
21: .CODE
22: .386
23: INCLUDE io.mac
24: main PROC
25: .STARTUP
26: mov ECX, NO_ROWS ; loop iteration count (loop uses ECX in 32-bit code)
27: sub AX, AX ; sum := 0
28: ; ESI := index of class_marks[0,2]
29: sub EBX, EBX
30: mov ESI, NO_COLUMNS-1
Example 2…

31: sum_loop:
32: add AX, class_marks[EBX+ESI*2]
33: add EBX, NO_ROW_BYTES
34: loop sum_loop
35:
36: PutStr sum_msg
37: PutInt AX
38: nwln
39: done:
40: .EXIT
41: main ENDP
42: END main
Example 2…
• To access individual test marks, we use based-indexed addressing with a displacement on line 32.
• Note that even though we have used
class_marks [EBX+ESI*2]
• it is translated by the assembler as
[EBX+(ESI*2)+constant]
• where the constant is the offset of class_marks. For this to work, EBX should store the offset of the row in which we are interested.
• For this reason, after initializing EBX to zero to point to the first row (line 29), NO_ROW_BYTES is added in the loop body (line 33).
• The ESI register is used as a column index. This works for row-major ordering.
Calculating the Sizes of Arrays and Strings
• When using an array, we usually like to know its size. The following example
uses a constant named ListSize to declare the size of list:
list BYTE 10,20,30,40
ListSize = 4
• Explicitly stating an array’s size can lead to programming error, particularly if
you should later insert or remove array elements.

• A better way to declare an array size is to let the assembler calculate its value for
you

• The $ operator (current location counter) returns the offset associated with the
current program statement.

• In the following example, ListSize is calculated by subtracting the offset of list from the current location counter ($):
list BYTE 10,20,30,40
ListSize = ($ - list)
• ListSize must follow immediately after list
Calculating the Sizes of Arrays and Strings…
• Rather than calculating the length of a string manually, let the assembler do it:
myString BYTE "This is a long string, containing"
BYTE "any number of characters"
myString_len = ($ - myString)
• Arrays of Words and DoubleWords
• When calculating the number of elements in an array containing words or doublewords, always divide the total array size (in bytes) by the size of the individual array elements.
• The following code, for example, divides the address range by 2 because each word in the array occupies 2 bytes (16 bits):
list WORD 1000h, 2000h, 3000h, 4000h
ListSize = ($ - list) / 2
• Each element of an array of doublewords is 4 bytes long, so its overall length must be divided by four to produce the number of array elements:
list DWORD 10000000h, 20000000h, 30000000h, 40000000h
ListSize = ($ - list) / 4