2 CPE 413 Intro To Assembly Lang Programming

Introduction to Assembly Language

Prof. Christopher U. Ngene email: [email protected] 1


Outline

• Assembly language statements
• Data allocation
• Where are the operands?
• Addressing modes
• Register
• Immediate
• Direct
• Indirect
• Data transfer instructions
• mov, xchg, and xlat
• PTR directive
• Overview of assembly language instructions
• Arithmetic
• Conditional
• Logical
• Shift
• Rotate
• Defining constants
• EQU and = directives
• Illustrative examples
• Performance: When to use the xlat instruction
Assembly Language Statements
• Three different classes
• Instructions
• Tell CPU what to do
• Executable instructions with an op-code
• Directives (or pseudo-ops)
• Provide information to assembler on various aspects
of the assembly process
• Non-executable
• Do not generate machine language instructions
• Macros
• A shorthand notation for a group of statements
• A sophisticated text substitution mechanism with
parameters
Constants
• Integer Constants
• Examples: -10, 42d, 10001101b, 0FF3Ah, 777o
• Radix: b = binary, d = decimal, h = hexadecimal, and o = octal
• If no radix is given, the integer constant is decimal
• A hexadecimal beginning with a letter must have a leading 0
• Character and String Constants
• Enclose character or string in single or double quotes
• Examples: 'A', "d", 'ABC', "ABC", '4096'
• Embedded quotes: "single quote ' inside", 'double quote " inside'
• Each ASCII character occupies a single byte
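The radix-suffix rule can be checked with a small C sketch. The helper name parse_masm_int is hypothetical (it is not part of any assembler), and it handles only the four suffixes listed above; it assumes well-formed input.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert an integer constant written with a
   radix suffix (b, d, h, o) to its value. No suffix means decimal. */
long parse_masm_int(const char *text) {
    char digits[64];
    size_t len = strlen(text);
    int base = 10;

    if (len == 0 || len >= sizeof digits) {
        return 0;                       /* sketch only: no error reporting */
    }
    switch (tolower((unsigned char)text[len - 1])) {
    case 'h': base = 16; len--; break;  /* hexadecimal */
    case 'o': base = 8;  len--; break;  /* octal */
    case 'b': base = 2;  len--; break;  /* binary */
    case 'd': base = 10; len--; break;  /* explicit decimal */
    default:  break;                    /* no suffix: decimal */
    }
    memcpy(digits, text, len);
    digits[len] = '\0';
    return strtol(digits, NULL, base);  /* strtol also accepts a leading '-' */
}
```

For instance, parse_masm_int("0FF3Ah") yields 65338 and parse_masm_int("777o") yields 511.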



Instructions
• Assembly language instructions have the format:
[label:] mnemonic [operands] [;comment]
• Instruction Label (optional)
• Marks the address of an instruction, must have a colon :
• Used to transfer program execution to a labeled instruction
• Mnemonic
• Identifies the operation (e.g. MOV, ADD, SUB, JMP, CALL)
• Operands
• Specify the data required by the operation
• Executable instructions can have zero to three operands
• Operands can be registers, memory variables, or constants



Instruction Examples
• No operands
stc ; set carry flag
• One operand
inc eax ; increment register eax
call Clrscr ; call procedure Clrscr
jmp L1 ; jump to instruction with label L1
• Two operands
add ebx, ecx ; register ebx = ebx + ecx
sub var1, 25 ; memory variable var1 = var1 - 25
• Three operands
imul eax,ebx,5 ; register eax = ebx * 5



Comments
• Comments are very important!
• Explain the program's purpose
• When it was written, revised, and by whom
• Explain data used in the program
• Explain instruction sequences and algorithms used
• Application-specific explanations
• Single-line comments
• Begin with a semicolon ; and terminate at end of line
• Multi-line comments
• Begin with COMMENT directive and a chosen character
• End with the same chosen character
Flat Memory Program Template
TITLE Flat Memory Program Template (Template.asm)

; Program Description:
; Author: Creation Date:
; Modified by: Modification Date:

.386
.MODEL FLAT, STDCALL
.STACK

INCLUDE \masm32\INCLUDE\windows.inc
.DATA
; (Your initialized variables here)
.DATA?
; (Your uninitialized variables here)
.CONST
;(Your constants here)
.CODE
main PROC
; (insert executable instructions here)
exit
main ENDP
; (insert additional procedures here)
END main



TITLE and .MODEL Directives
• TITLE line (optional)
• Contains a brief heading of the program and the disk file name
• .MODEL directive
• Specifies the memory configuration
• For our purposes, the FLAT memory model will be used
• Linear 32-bit address space (no segmentation)
• STDCALL option tells the assembler to use standard conventions for names and procedure calls
• .386 processor directive
• Used before the .MODEL directive
• Program can use instructions of the 80386 processor (32-bit registers and addressing)
• At least the .386 directive should be used with the FLAT model



.STACK, .DATA, & .CODE Directives
• .STACK directive
• Tells the assembler to define a runtime stack for the program
• The size of the stack can be optionally specified by this directive
• The runtime stack is required for procedure calls

• .DATA, .DATA?, .CONST, and .CODE directives
• All four directives introduce sections. Win32 has no segments, but you can divide your entire address space into logical sections.
• The start of one section denotes the end of the previous section.
• There are two groups of sections: data and code. Data sections are divided into three categories:

• .DATA
• This section contains initialized data of your program.
• Assembler will allocate and initialize the storage of variables



.STACK, .DATA, & .CODE Directives…
• .DATA?
• This section contains uninitialized data of your program.
• Sometimes you just want to pre-allocate some memory but don't want to initialize it.
• The advantage of uninitialized data is: it doesn't take space in the executable file.
For example, if you allocate 10,000 bytes in your .DATA? section, your executable is
not bloated up 10,000 bytes. Its size stays much the same. You only tell the
assembler how much space you need when the program is loaded into memory,
that's all.

• .CONST
• This section contains declaration of constants used by your program. Constants in
this section can never be modified in your program. They are just *constant*.
• .CODE directive
• Defines the code section of a program containing instructions
• Assembler will place the instructions in the code area in memory
• You don't have to use all of the .DATA, .DATA?, and .CONST sections in your program.
• Declare only the section(s) you want to use.
INCLUDE, PROC, ENDP, and END
• INCLUDE directive
• Causes the assembler to include code from another file
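• We will include windows.inc
• Defines the constants and structures of the Win32 API
• Procedure prototypes come from companion include files (e.g., kernel32.inc); to call those procedures, link the matching import library (e.g., kernel32.lib) to your programs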
• PROC and ENDP directives
• Used to define procedures
• As a convention, we will define main as the first procedure
• Additional procedures can be defined after main
• END directive
• Marks the end of a program
• Identifies the name (main) of the program’s startup procedure



Data Allocation
• Variable declaration in a high-level language such as C
• char response
• int value
• float total
• double average_value
• specifies
• Amount of storage required (1 byte, 2 bytes, …)
• Label to identify the storage allocated (response, value, …)
• Interpretation of the bits stored (signed, floating point, …)
• Bit pattern 1000 1101 1011 1001 is interpreted as
• -29,255 as a signed number
• 36,281 as an unsigned number
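The two readings of the same bit pattern can be reproduced in C. This is a minimal sketch; the function names are illustrative and a 16-bit two's-complement representation is assumed for the signed case.

```c
#include <stdint.h>

/* Read the 16 bits as an unsigned number: the value is the pattern itself. */
uint16_t as_unsigned16(uint16_t bits) {
    return bits;
}

/* Read the same 16 bits as a two's-complement signed number:
   when the sign bit (bit 15) is set, the value is bits - 2^16. */
int16_t as_signed16(uint16_t bits) {
    if (bits & 0x8000u) {
        return (int16_t)((int32_t)bits - 65536);
    }
    return (int16_t)bits;
}
```

The pattern 1000 1101 1011 1001 is 8DB9h, so as_unsigned16(0x8DB9) gives 36281 and as_signed16(0x8DB9) gives -29255. The stored bits never change; only the interpretation does.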



Data Allocation…
• In assembly language, we use the define directive
• Define directive can be used
• To reserve storage space
• To label the storage space
• To initialize
• But no interpretation is attached to the bits stored
• Interpretation is up to the program code
• Define directive goes into the .DATA part of the assembly language program
• Define directive format
[var-name] D? init-value [,init-value],...



Data Allocation….
• Five define directives
DB Define Byte ;allocates 1 byte
DW Define Word ;allocates 2 bytes
DD Define Doubleword ;allocates 4 bytes
DQ Define Quadword ;allocates 8 bytes
DT Define Ten bytes ;allocates 10 bytes
• Examples
sorted DB 'y'
response DB ? ;no initialization
value DW 25159
float1 DQ 1.234



Data Allocation….
• Multiple definitions can be abbreviated
• Example
message DB 'B'
DB 'y'
DB 'e'
DB 0DH
DB 0AH
can be written as
message DB 'B', 'y', 'e', 0DH, 0AH
• More compactly as
message DB 'Bye', 0DH, 0AH



Data Allocation…
• Multiple definitions can be cumbersome to initialize
data structures such as arrays
• Example
• To declare and initialize an integer array of 8 elements
marks DW 0, 0, 0, 0, 0, 0, 0, 0
• What if we want to declare and initialize to zero an
array of 200 elements?
• There is a better way of doing this than repeating zero 200
times in the above statement
• Assembler provides a directive to do this (DUP directive)



Data Allocation…
• Multiple initializations
• The DUP assembler directive allows multiple initializations
to the same value
• Previous marks array can be compactly declared as
marks DW 8 DUP (0)
• Examples
table1 DW 10 DUP (?) ;10 words, uninitialized
message DB 3 DUP ('Bye!') ;12 bytes, initialized as Bye!Bye!Bye!
Name1 DB 30 DUP ('?') ;30 bytes, each initialized to ?



Data Allocation….
• The DUP directive may also be nested
• Example
stars DB 4 DUP (3 DUP ('*'), 2 DUP ('?'), 5 DUP ('!'))
Reserves 40 bytes of space and initializes it as
***??!!!!!***??!!!!!***??!!!!!***??!!!!!
• Example
matrix DW 10 DUP (5 DUP (0))
defines a 10 x 5 matrix and initializes its elements to
zero. This declaration can also be done by

matrix DW 50 DUP (0)
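To see why the nested stars declaration occupies 40 bytes, the expansion can be mimicked in C. This is only an illustration of the repetition; the helper names dup_byte and expand_stars are hypothetical, and the terminating zero is added for C's benefit (the assembler does not add one).

```c
#include <stddef.h>
#include <string.h>

/* Write `count` copies of `value` starting at buf[pos]; return the new position. */
size_t dup_byte(char *buf, size_t pos, size_t count, char value) {
    memset(buf + pos, value, count);
    return pos + count;
}

/* Expand  stars DB 4 DUP (3 DUP ('*'), 2 DUP ('?'), 5 DUP ('!')).
   buf must hold at least 41 bytes: 40 data bytes plus a C terminator.
   Returns the number of data bytes produced. */
size_t expand_stars(char *buf) {
    size_t pos = 0;
    for (int outer = 0; outer < 4; outer++) {  /* outer 4 DUP */
        pos = dup_byte(buf, pos, 3, '*');      /* 3 DUP ('*') */
        pos = dup_byte(buf, pos, 2, '?');      /* 2 DUP ('?') */
        pos = dup_byte(buf, pos, 5, '!');      /* 5 DUP ('!') */
    }
    buf[pos] = '\0';
    return pos;
}
```

Each pass of the outer loop writes 3 + 2 + 5 = 10 bytes, and four passes give the 40 bytes stated above.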



Data Allocation…
Symbol Table
• Assembler builds a symbol table
• so we can refer to the allocated storage space by the
associated label
• Assembler keeps track of each name and its offset
• Offset of a variable is relative to the address of the first variable.



Data Allocation…

Correspondence to C Data Types


Data Allocation….
LABEL Directive
• LABEL directive provides another way to name a memory location
• Format:
name LABEL type
• Type can be:
BYTE 1 byte
WORD 2 bytes
DWORD 4 bytes
QWORD 8 bytes
TWORD 10 bytes
• Example
• count refers to the 16-bit value
• Lo_count refers to the low byte
• Hi_count refers to the high byte
Where Are the Operands?
• Operands required by an operation can be specified in a variety of ways
• A few basic ways are:
• operand in a register
• register addressing mode
• operand in the instruction itself
• immediate addressing mode
• operand in memory
• variety of addressing modes
• direct and indirect addressing modes
• operand at an I/O port
• Addressing mode refers to the specification of the location of data required by an operation
Register addressing mode
• Most efficient way of specifying an operand
• operand is in an internal register
• Examples
mov EAX, EBX
mov BX, CX
• The mov instruction
mov destination, source
• copies data from source to destination
Where Are the Operands?...
Immediate addressing mode
• Data is part of the instruction
• operand is located in the code segment along with the instruction
• Efficient as no separate operand fetch is needed
• Typically used to specify a constant
• Example
mov AL, 75
• This instruction uses register addressing mode for specifying the destination and immediate addressing mode to specify the source
Direct addressing mode
• Data is in the data segment
• Need a logical address to access data
• Two components: segment:offset
• Various addressing modes to specify the offset component
• offset part is referred to as the effective address
• The offset is specified directly as part of the instruction
• We write assembly language programs using memory labels (e.g., declared using DB, DW, LABEL,...)
• Assembler computes the offset value for the label
• Uses symbol table to compute the offset of a label
Addressing Modes: Direct Addressing
Direct addressing mode
Examples
mov AL, response
• Assembler replaces response by its effective address (i.e., its offset value from the symbol table)
mov table1, 56
• table1 is declared as
table1 DW 20 DUP (0)
• Since the assembler replaces table1 by its effective address, this instruction refers to the first element of table1
• In C, it is equivalent to
table1[0] = 56
Addressing Modes: Direct Addressing…

• Problem with direct addressing


• Useful only to specify simple variables
• Causes serious problems in addressing data types
such as arrays
• As an example, consider adding elements of an array
• Direct addressing does not facilitate using a loop structure to
iterate through the array
• We have to write an instruction to add each element of the array

• Indirect addressing mode remedies this problem



Addressing Modes: Indirect Addressing

• The offset is specified indirectly via a register
• Sometimes called register indirect addressing mode
• For 16-bit addressing, the offset value can be in one of the three registers: BX, SI, or DI
• For 32-bit addressing, all 32-bit registers can be used
• Example
mov EBX, OFFSET array ; EBX contains the address of the operand
mov AX, [EBX] ; EBX used to access memory
• Square brackets [ ] are used to indicate that EBX is holding an offset value
• EBX contains a pointer to the operand, not the operand itself


Addressing Modes: Indirect Addressing…
• Using indirect addressing mode, we can process
arrays using loops

• Example: Summing array elements

• Load the starting address (i.e., offset) of the array into BX
• Loop for each element in the array
• Get the value using the offset in BX
• Use indirect addressing
• Add the value to the running total
• Update the offset in BX to point to the next element of the array
Indirect Addressing: Array Sum Example
• Indirect addressing is ideal for traversing an array
.data
array DWORD 10000h,20000h,30000h
.code
mov esi, OFFSET array ; esi = array address
mov eax,[esi] ; eax = [array] = 10000h
add esi,4; why 4?
add eax,[esi] ; eax = eax + [array+4]
add esi,4; why 4?
add eax,[esi] ; eax = eax + [array+8]

• Note that ESI register is used as a pointer to array
• ESI must be incremented by 4 to access the next array element
• Because each array element is 4 bytes (DWORD) in memory
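The same traversal can be sketched in C with an explicit byte pointer, which makes the "why 4?" visible. The function name sum_dwords is illustrative; the pointer p plays the role of ESI and total plays the role of EAX.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum `count` 32-bit elements by stepping a byte pointer through memory. */
uint32_t sum_dwords(const void *array, size_t count) {
    const unsigned char *p = array;       /* like: mov esi, OFFSET array */
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t value;
        memcpy(&value, p, sizeof value);  /* like: read [esi] */
        total += value;                   /* like: add eax, [esi] */
        p += sizeof(uint32_t);            /* like: add esi, 4 */
    }
    return total;
}
```

For the array 10000h, 20000h, 30000h the result is 60000h. Advancing p by anything other than the element size would read bytes that straddle two elements.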


Loading offset value into a register

• Suppose we want to load BX with the offset value of table1
• We cannot write
mov BX, table1
• Two ways of loading offset value
• Using OFFSET assembler directive
• Executed only at the assembly time
• Using lea instruction
• This is a processor instruction
• Executed at run time



OFFSET and LEA
Loading offset value into a register

• Using OFFSET assembler directive
• The previous example can be written as
mov BX, OFFSET table1
• Using lea (Load Effective Address) instruction
• The format of lea instruction is
lea register, source
• The previous example can be written as
lea BX, table1
OFFSET and LEA…
Loading offset value into a register
• Which one to use -- OFFSET or lea?
• Use OFFSET if possible
• OFFSET incurs only one-time overhead (at assembly time)
• lea incurs run time overhead (every time you run the program)
• May have to use lea in some instances
• When the needed data is available at run time only
• An index passed as a parameter to a procedure
• We can write
lea BX, table1[SI]
• to load BX with the address of an element of table1 whose
index is in SI register
• We cannot use the OFFSET directive in this case



LEA Example
.data
array WORD 1000 DUP(?)

.code ; Equivalent to . . .
lea eax, array ; mov eax, OFFSET array

lea eax, array[esi] ; mov eax, esi
; add eax, OFFSET array

lea eax, array[esi*2] ; mov eax, esi
; add eax, eax
; add eax, OFFSET array

lea eax, [ebx+esi*2] ; mov eax, esi
; add eax, eax
; add eax, ebx
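What lea computes in each of these cases is one arithmetic expression. A minimal C sketch follows; the function name effective_address is illustrative, and the array offset used in the note after it is a made-up value.

```c
#include <stdint.h>

/* Effective address = base + index * scale + displacement,
   computed with 32-bit wraparound as the address unit does. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, uint32_t displacement) {
    return base + index * scale + displacement;
}
```

If array were at the hypothetical offset 1000h and ESI held 3, then lea eax, array[esi*2] would correspond to effective_address(0, 3, 2, 0x1000), which is 1006h. lea only calculates this number; it does not read memory.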



DATA TRANSFER INSTRUCTIONS


Data Transfer Instructions

• We will look at three instructions
• mov (move)
• Actually copy
• xchg (exchange)
• Exchanges two operands
• xlat (translate)
• Translates byte values using a translation table
• Other data transfer instructions include
• movsx (move sign extended)
• movzx (move zero extended)
The mov instruction
• The format is
mov destination, source
• Copies the value from source to destination
• source is not altered as a result of copying
• Both operands should be of same size
• source and destination cannot both be in memory
• Most Pentium instructions do not allow both operands to be located in memory
• Pentium provides special instructions to facilitate memory-to-memory block copying of data


Data Transfer Instructions…

Mov Instruction Rules


• Both operands must be of same size
• No memory to memory moves
• Destination cannot be CS, EIP, or IP
• No immediate to segment moves

• Programs running in protected mode should not modify the segment registers
Data Transfer Instructions…
The mov instruction
• Five types of operand combinations are allowed:
mov register, register
mov register, immediate
mov register, memory
mov memory, register
mov memory, immediate
• The operand combinations are valid for all instructions that require two operands
Data Transfer Instructions…
Zero Extension
• MOVZX Instruction
• Fills (extends) the upper part of the destination with zeros
• Used to copy a small source into a larger destination
• Destination must be a register

Example:
mov bl, 8Fh ; BL = 8Fh (10001111b)
movzx ax, bl ; AX = 008Fh



Data Transfer Instructions…
Sign Extension
• MOVSX Instruction
• Fills (extends) the upper part of the destination register
with a copy of the source operand's sign bit
• Used to copy a small source into a larger destination

Example:
mov bl, 8Fh ; BL = 8Fh (sign bit is 1)
movsx ax, bl ; AX = FF8Fh
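Zero extension and sign extension of the same byte can be compared side by side in C. This is a small sketch with illustrative function names, extending 8 bits to 16 bits as in the movzx and movsx examples.

```c
#include <stdint.h>

/* movzx ax, bl : the upper byte of the result is filled with zeros. */
uint16_t zero_extend8(uint8_t bl) {
    return (uint16_t)bl;
}

/* movsx ax, bl : the upper byte is filled with copies of bit 7 of bl. */
uint16_t sign_extend8(uint8_t bl) {
    if (bl & 0x80u) {
        return (uint16_t)(0xFF00u | bl);
    }
    return (uint16_t)bl;
}
```

With BL = 8Fh, zero extension gives 008Fh and sign extension gives FF8Fh. For a byte whose sign bit is clear, such as 7Fh, both give 007Fh.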



Data Transfer Instructions…
Ambiguous moves: PTR directive
• For the following data definitions
.DATA
table1 DW 20 DUP (0)
status DB 7 DUP (1)
• the last two mov instructions are ambiguous
mov BX, OFFSET table1
mov SI, OFFSET status
mov [BX], 100
mov [SI], 100
• Not clear whether the assembler should use byte or word equivalent of 100
• PTR assembler directive can be used to clarify the operand size.
• The last two mov instructions can be written as
mov WORD PTR [BX], 100
mov BYTE PTR [SI], 100
• WORD and BYTE are called type specifiers
• We can also use the following type specifiers:
• DWORD for doubleword values
• QWORD for quadword values
• TWORD for ten byte values


Data Transfer Instructions…
The xchg instruction
• The syntax is
xchg operand1, operand2
• Exchanges the values of operand1 and operand2
• Examples
xchg EAX, EDX
xchg response, CL
xchg total, DX
• Without the xchg instruction, we need a temporary register to exchange values using only the mov instruction
• The xchg instruction is useful for conversion of 16-bit data between little endian and big endian forms
• Example:
xchg AL, AH
• converts the data in AX into the other endian form
• Pentium provides bswap instruction to do similar conversion on 32-bit data
bswap 32-bit register
• bswap works only on data located in a 32-bit register
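The effect of the two endian conversions can be written out in C. The function names swap16 and swap32 are illustrative; swap16 models exchanging AL with AH, and swap32 models bswap on a 32-bit register.

```c
#include <stdint.h>

/* Exchange the low and high bytes of a 16-bit value (xchg AL, AH). */
uint16_t swap16(uint16_t ax) {
    return (uint16_t)((ax << 8) | (ax >> 8));
}

/* Reverse the order of the four bytes of a 32-bit value (bswap). */
uint32_t swap32(uint32_t eax) {
    return (eax >> 24)
         | ((eax >> 8) & 0x0000FF00u)
         | ((eax << 8) & 0x00FF0000u)
         | (eax << 24);
}
```

For example, swap16(0x1234) is 3412h and swap32(0x12345678) is 78563412h. Applying either function twice returns the original value.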



Data Transfer Instructions…
The xlat instruction
• The xlat instruction translates bytes
• The format is
xlatb
• To use xlat instruction
• BX should be loaded with the starting address of the translation table
• AL must contain an index into the table
• Index value starts at zero
• The instruction reads the byte at this index in the translation table and stores this value in AL
• The index value in AL is lost
• Translation table can have at most 256 entries (due to AL)
• Example: Encrypting digits
Input digits: 0 1 2 3 4 5 6 7 8 9
Encrypted digits: 4 6 9 5 0 3 1 8 7 2
.DATA
xlat_table DB '4695031872'
...
.CODE
mov BX, OFFSET xlat_table
GetCh AL
sub AL, '0' ; converts input character to index
xlatb ; AL = encrypted digit character
PutCh AL
...
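The translation step is a single table lookup, which the following C sketch mirrors. The function names are illustrative, the table is the one from the example, and the caller is assumed to pass a digit character '0' to '9'.

```c
#include <stdint.h>

/* Translation table from the example: entry i is the encrypted form of digit i. */
const uint8_t xlat_table[] = "4695031872";

/* AL = table[AL], which is what xlatb does with the table address in BX. */
uint8_t xlat(const uint8_t *table, uint8_t al) {
    return table[al];
}

/* Encrypt one digit character; assumes ch is in the range '0'..'9'. */
char encrypt_digit(char ch) {
    uint8_t index = (uint8_t)(ch - '0');   /* like: sub AL, '0' */
    return (char)xlat(xlat_table, index);  /* like: xlatb */
}
```

So '0' becomes '4', '3' becomes '5', and '9' becomes '2', matching the two rows of digits above.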



The xchg instruction…
XCHG Rules
• Operands must be of the same size
• At least one operand must be a register
• No immediate operands are permitted
• No exchange of two memory operands



MORE ADDRESSING MODES



Addressing Modes: Index Addressing
• Combines a variable's name with an index register
• Assembler converts variable's name into a constant offset
• Constant offset is added to register to form an effective address
• Syntax: [name + index] or name [index]
.data
array DWORD 10000h,20000h,30000h
.code
mov esi, 0 ; esi = array index
mov eax,array[esi] ; eax = array[0] = 10000h
add esi,4
add eax,array[esi] ; eax = eax + array[4]
add esi,4
add eax,[array+esi] ; eax = eax + array[8]
Addressing Modes: Index Addressing...
Index Scaling
• Useful to index array elements of size 2, 4, and 8 bytes
• Syntax: [name + index * scale] or name [index * scale]
• Effective address is computed as follows:
• Name's offset + Index register * Scale factor

.DATA
arrayB BYTE 10h,20h,30h,40h
arrayW WORD 100h,200h,300h,400h
arrayD DWORD 10000h,20000h,30000h,40000h
.CODE
mov esi, 2
mov al, arrayB[esi] ; AL = 30h
mov ax, arrayW[esi*2] ; AX = 300h
mov eax, arrayD[esi*4] ; EAX = 30000h



Addressing Modes: Based Addressing
• Syntax: [Base + Offset]
• Effective Address = Base register + Constant Offset
• Useful to access fields of a structure or an object
• Base Register: points to the base address of the structure
• Constant Offset: relative offset within the structure
• mystruct is a structure consisting of 3 fields: a word, a double word, and a byte
.DATA
mystruct WORD 12
DWORD 1985
BYTE 'M'
.CODE
mov ebx, OFFSET mystruct
mov eax, [ebx+2] ; EAX = 1985
mov al, [ebx+6] ; AL = 'M'
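The field offsets 2 and 6 follow from the sizes of the preceding fields. A C sketch of the same memory image is given below; it assumes little-endian byte order and no padding between fields (as in the assembly declaration), and the load helpers are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte image of: WORD 12, DWORD 1985, BYTE 'M' (little endian, no padding). */
const unsigned char mystruct[7] = {
    0x0C, 0x00,              /* offset 0: WORD 12 */
    0xC1, 0x07, 0x00, 0x00,  /* offset 2: DWORD 1985 (07C1h) */
    'M'                      /* offset 6: BYTE 'M' */
};

/* Read a little-endian doubleword at base + offset, like mov eax, [ebx+offset]. */
uint32_t load32(const unsigned char *base, size_t offset) {
    const unsigned char *p = base + offset;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read one byte at base + offset, like mov al, [ebx+offset]. */
uint8_t load8(const unsigned char *base, size_t offset) {
    return base[offset];
}
```

Here load32(mystruct, 2) returns 1985 and load8(mystruct, 6) returns 'M'. A C struct with the same three fields would normally be padded by the compiler, which is why a raw byte array is used in this sketch.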



Addressing Modes: Based-Indexed Addressing
• Syntax: [Base + (Index * Scale) + Offset]
• Scale factor is optional and can be 1, 2, 4, or 8
• Useful in accessing two-dimensional arrays
• Offset: array address => we can refer to the array by name
• Base register: holds row address => relative to start of array
• Index register: selects an element of the row => column index
• Scaling factor: when array element size is 2, 4, or 8 bytes
• Useful in accessing arrays of structures (or objects)
• Base register: holds the address of the array
• Index register: holds the element address relative to the base
• Offset: represents the offset of a field within a structure



Addressing Modes: Based-Indexed Addressing…
• Example
.data
matrix DWORD 0, 1, 2, 3, 4 ; 4 rows, 5 cols
DWORD 10,11,12,13,14
DWORD 20,21,22,23,24
DWORD 30,31,32,33,34

ROWSIZE EQU SIZEOF matrix ; 20 bytes per row

.code
mov ebx, 2*ROWSIZE ; row index = 2
mov esi, 3 ; col index = 3
mov eax, matrix[ebx+esi*4] ; EAX = matrix[2][3]

mov ebx, 3*ROWSIZE ; row index = 3
mov esi, 1 ; col index = 1
mov eax, matrix[ebx+esi*4] ; EAX = matrix[3][1]
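The address arithmetic of matrix[ebx+esi*4] can be replayed in C on the same data. This sketch uses an illustrative function name, element_at, and computes the byte offset by hand instead of relying on C's two-dimensional indexing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { COLS = 5, ELEM_SIZE = 4, ROWSIZE = COLS * ELEM_SIZE };  /* 20 bytes per row */

const int32_t matrix[4][COLS] = {
    { 0,  1,  2,  3,  4},
    {10, 11, 12, 13, 14},
    {20, 21, 22, 23, 24},
    {30, 31, 32, 33, 34}
};

/* Fetch the element at (row, col) through its byte offset from the array start. */
int32_t element_at(size_t row, size_t col) {
    const unsigned char *base = (const unsigned char *)matrix;
    size_t ebx = row * ROWSIZE;   /* base register: byte offset of the row */
    size_t esi = col;             /* index register: column index */
    int32_t value;
    memcpy(&value, base + ebx + esi * ELEM_SIZE, sizeof value);
    return value;
}
```

element_at(2, 3) reads offset 2*20 + 3*4 = 52 and returns 23; element_at(3, 1) reads offset 64 and returns 31.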



Summary of Addressing Modes
• Assembler converts a variable name into a constant offset (also called a displacement)
• For indirect addressing, a base/index register contains an address/index
• CPU computes the effective address of a memory operand


Registers Used in 32-Bit Addressing
• 32-bit addressing modes use the following 32-bit registers
Base + ( Index * Scale ) + displacement
Base: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
Index: EAX, EBX, ECX, EDX, ESI, EDI, EBP
Scale: 1, 2, 4, 8
Displacement: no displacement, 8-bit displacement, 32-bit displacement
• Only the index register can have a scale factor
• ESP can be used as a base register, but not as an index
Differences between 16- and 32-bit Modes



One-Dimensional Arrays

• Array declaration in HLL (such as C)
int test_marks [10];
• specifies a lot of information about the array:
• Name of the array (test_marks)
• Number of elements (10)
• Element size (2 bytes)
• Interpretation of each element (int i.e., signed integer)
• Index range (0 to 9 in C)
• You get very little help in assembly language!
• In assembly language, declaration such as
test_marks DW 10 DUP (?)
• only assigns name and allocates storage space.
• You, as the assembly language programmer, have to "properly" access the array elements by taking into account the element size and the range of subscripts.
• Accessing an array element requires its displacement or offset relative to the start of the array in bytes
One-Dimensional Arrays…
• To compute displacement, we need to know how the array is laid out
• Simple for 1-D arrays
• Assuming C style subscripts (i.e., subscript starts at zero)
displacement = subscript * element size in bytes
• If the element size is 2, 4, or 8 bytes, a scale factor can be used to avoid counting displacement in bytes
Multidimensional Arrays
• We focus on two-dimensional arrays
• Our discussion can be generalized to higher dimensions
• A 5x3 array can be declared in C as
int class_marks [5] [3]; /*5 rows and 3 columns*/
• Two dimensional arrays can be stored in one of two
ways:
• Row-major order
• Array is stored row by row starting with the first row
• Most HLL including C and Pascal use this method
• Column-major order
• Array is stored column by column starting with the first column.
• FORTRAN uses this method
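The difference between the two storage orders comes down to which subscript is multiplied. The following C sketch gives the position of element (i, j), counted in elements from the start of the array, under each order. The function names are illustrative, zero-based subscripts are assumed, and the column-major formula is the standard one rather than something stated on the slide.

```c
#include <stddef.h>

/* Row-major: skip i complete rows of `cols` elements, then j elements. */
size_t row_major_index(size_t i, size_t j, size_t rows, size_t cols) {
    (void)rows;  /* the row count is not needed in row-major order */
    return i * cols + j;
}

/* Column-major: skip j complete columns of `rows` elements, then i elements. */
size_t col_major_index(size_t i, size_t j, size_t rows, size_t cols) {
    (void)cols;  /* the column count is not needed in column-major order */
    return j * rows + i;
}
```

For the 5x3 array class_marks, element (3, 1) is the 10th element from the start in row-major order and the 8th in column-major order, so the same subscripts lead to different memory locations.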



Multidimensional Arrays…

(a) Row-major order (b) Column-major order



Multidimensional Arrays…
• Why do we need to know the underlying storage
representation?
• In a HLL, we really don’t need to know
• In assembly language, we need this information as we have to
calculate displacement of element to be accessed

• In assembly language,
class_marks DW 5*3 DUP (?)
• allocates 30 bytes of storage
• There is no support for using row and column subscripts
• Need to translate these subscripts into a displacement value


Multidimensional Arrays…
• Assuming C language subscript convention, we can
express displacement of an element in a 2-D array at
row i and column j as
displacement = (i * COLUMNS + j) * ELEMENT_SIZE
• where
• COLUMNS = number of columns in the array
• ELEMENT_SIZE = element size in bytes

• Example: Displacement of
class_marks [3, 1] ; for 5 x 3 array
• element is (3*3 + 1) * 2 = 20
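The displacement formula translates directly into a one-line C function. The function name is illustrative; the parameters correspond to i, j, COLUMNS, and ELEMENT_SIZE in the formula, with zero-based subscripts and row-major storage assumed.

```c
#include <stddef.h>

/* Byte displacement of element (i, j) in a row-major 2-D array. */
size_t displacement(size_t i, size_t j, size_t columns, size_t element_size) {
    return (i * columns + j) * element_size;
}
```

displacement(3, 1, 3, 2) evaluates to (3*3 + 1) * 2 = 20, the value worked out for class_marks[3, 1]. The last element of the 5x3 word array, (4, 2), is at displacement 28, which is 2 bytes before the end of the 30-byte storage.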
Arrays: Example
• Example 1
• One-dimensional array
• Computes array sum (each element is 4 bytes long e.g.,
long integers)
• Uses scale factor 4 to access elements of the array by
using a 32-bit addressing mode (uses ESI rather than SI)
• Also illustrates the use of predefined location counter $
• Example 2
• Two-dimensional array
• Finds sum of a column
• Uses “based-indexed addressing with scale factor” to
access elements of a column



Example 1
1: TITLE Sum of a long integer array ARRAY_SUM.ASM
2: COMMENT |
3: Objective: To find sum of all elements of an array.
4: Input: None
5: | Output: Displays the sum.
6: .386
7: .MODEL FLAT, STDCALL
8: .STACK 100H
9: .DATA
10: test_marks DD 90,50,70,94,81,40,67,55,60,73
11: NO_STUDENTS EQU ($ - test_marks)/4 ; number of students
12: sum_msg DB 'The sum of test marks is: ', 0
13:
14: .CODE
15: INCLUDE io.mac ; I/O macros (PutStr, PutLInt, nwln), as in Example 2
16: main PROC
17: .STARTUP
Example 1…
18: mov ECX, NO_STUDENTS ; loop iteration count (loop uses ECX in 32-bit mode)
19: sub EAX, EAX ; sum := 0
20: sub ESI, ESI ; array index := 0
21: add_loop:
22: mov EBX, test_marks [ESI*4]
23: PutLInt EBX
24: nwln
25: add EAX, test_marks [ESI*4]
26: inc ESI
27: loop add_loop
28:
29: PutStr sum_msg
30: PutLInt EAX
31: nwln
32: .EXIT
33: main ENDP
34: END main
Example 1…
• Each element of the test_marks array, declared on line
10, requires 4 bytes.
• The array size NO_STUDENTS is computed on line 11
using the predefined location counter symbol $.
• The predefined symbol $ is always set to the current offset
in the segment.
• Thus, on line 11, $ points to the byte after the array storage
space. Therefore, ($-test_marks) gives the storage space in
bytes and dividing this by four gives the number of elements
in the array.
• We are using the indexed addressing mode on lines 22 and
25 where a scale factor of 4 is used.
• Remember that scale factor is only allowed in the 32-bit mode. As a result, we have to use ESI rather than the SI register.
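For comparison, the same computation in C is sketched below. The sizeof expression plays the role of ($ - test_marks)/4: total bytes divided by bytes per element. The names mirror the listing, and sum_marks is an illustrative function name.

```c
#include <stddef.h>
#include <stdint.h>

const int32_t test_marks[] = {90, 50, 70, 94, 81, 40, 67, 55, 60, 73};

/* Number of elements = total size in bytes / size of one element. */
enum { NO_STUDENTS = sizeof test_marks / sizeof test_marks[0] };

/* Add up all marks; the array subscript does the scaling by 4 for us. */
int32_t sum_marks(void) {
    int32_t sum = 0;                              /* like: sub EAX, EAX */
    for (size_t i = 0; i < NO_STUDENTS; i++) {    /* like: loop add_loop */
        sum += test_marks[i];                     /* like: add EAX, test_marks[ESI*4] */
    }
    return sum;
}
```

NO_STUDENTS comes out as 10 and sum_marks() returns 680. As with the $ symbol, the element count stays correct if marks are added to or removed from the initializer list.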



Example 2
1: TITLE Sum of a column in a 2-dimensional array TEST_SUM.ASM
2: COMMENT |
3: Objective: To demonstrate array index manipulation
4: in a two-dimensional array of integers.
5: Input: None
6: | Output: Displays the sum.
7: .MODEL FLAT, STDCALL
8: .STACK 100H
9: .DATA
10: NO_ROWS EQU 5
11: NO_COLUMNS EQU 3
12: NO_ROW_BYTES EQU NO_COLUMNS * 2 ; number of bytes per row
13: class_marks DW 90, 89, 99
14: DW 79, 66, 70
15: DW 70, 60, 77
16: DW 60, 55, 68
17: DW 51, 59, 57
18:
Example 2…
19: sum_msg DB 'The sum of the last test marks is: ',0
20:
21: .CODE
22: .386
23: INCLUDE io.mac
24: main PROC
25: .STARTUP
26: mov ECX, NO_ROWS ; loop iteration count (loop uses ECX in 32-bit mode)
27: sub AX, AX ; sum := 0
28: ; ESI := index of class_marks[0,2]
29: sub EBX, EBX
30: mov ESI, NO_COLUMNS-1
31: sum_loop:



Example 2…

32: add AX, class_marks[EBX+ESI*2]
33: add EBX, NO_ROW_BYTES
34: loop sum_loop
35:
36: PutStr sum_msg
37: PutInt AX
38: nwln
39: done:
40: .EXIT
41: main ENDP
42: END main
Example 2…
• To access individual test marks, we use based-indexed addressing
with a displacement on line 32.

• Note that even though we have used
class_marks [EBX+ESI*2]
• it is translated by the assembler as
[EBX+(ESI*2)+constant]
• where the constant is the offset of class_marks. For this to work, EBX should store the offset of the row in which we are interested.

• For this reason, after initializing EBX to zero to point to the first row
(line 29), NO_ROW_BYTES is added in the loop body (line 33).

• The ESI register is used as a column index. This works for row-major
ordering.
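The loop's address arithmetic can be traced in C on the same marks. This is a sketch with an illustrative function name; the variables ebx and esi stand in for the registers, and the byte offset is formed by hand as [EBX + ESI*2 + offset of class_marks].

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { NO_ROWS = 5, NO_COLUMNS = 3, NO_ROW_BYTES = NO_COLUMNS * 2 };

const int16_t class_marks[NO_ROWS][NO_COLUMNS] = {
    {90, 89, 99},
    {79, 66, 70},
    {70, 60, 77},
    {60, 55, 68},
    {51, 59, 57}
};

/* Sum the last column by stepping a row offset down the array. */
int sum_last_column(void) {
    const unsigned char *base = (const unsigned char *)class_marks;
    size_t ebx = 0;                 /* byte offset of the current row */
    size_t esi = NO_COLUMNS - 1;    /* column index of the last test */
    int sum = 0;
    for (int count = NO_ROWS; count > 0; count--) {
        int16_t mark;
        memcpy(&mark, base + ebx + esi * 2, sizeof mark);
        sum += mark;                /* like: add AX, class_marks[EBX+ESI*2] */
        ebx += NO_ROW_BYTES;        /* like: add EBX, NO_ROW_BYTES */
    }
    return sum;
}
```

The marks visited are 99, 70, 77, 68, and 57, so sum_last_column() returns 371. Only the row offset changes between iterations; the column index stays fixed.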



Calculating the Sizes of Arrays and Strings
• When using an array, we usually like to know its size. The following example uses a
constant named ListSize to declare the size of list:
list BYTE 10,20,30,40
ListSize = 4
• Explicitly stating an array’s size can lead to programming error, particularly if you
should later insert or remove array elements.

• A better way to declare an array size is to let the assembler calculate its value for
you

• The $ operator (current location counter) returns the offset associated with the
current program statement.

• In the following example, ListSize is calculated by subtracting the offset of list from
the current location counter ($):
list BYTE 10,20,30,40
ListSize = ($ - list)
• ListSize must follow immediately after list
Calculating the Sizes of Arrays and Strings…
• Rather than calculating the length of a string manually, let the assembler do it:
myString BYTE "This is a long string, containing"
BYTE "any number of characters"
myString_len = ($ - myString)

• Arrays of Words and DoubleWords
• When calculating the number of elements in an array containing words or doublewords, always divide the total array size (in bytes) by the size of the individual array elements.
• The following code, for example, divides the address range by 2 because each word in the array occupies 2 bytes (16 bits):
list WORD 1000h, 2000h, 3000h, 4000h
ListSize = ($ - list) / 2
• Each element of an array of doublewords is 4 bytes long, so its overall length must be divided by four to produce the number of array elements:
list DWORD 10000000h, 20000000h, 30000000h, 40000000h
ListSize = ($ - list) / 4
