Assembly Strings and Arrays
String Instructions
A string is a collection of bytes, words, or long-words (doublewords) that can be up to 64KB in length.
String instructions can have at most two operands: one is the source string and the other is the destination string.
The source string must be located in the Data Segment; the SI register points to its current element.
The destination string must be located in the Extra Segment; the DI register points to its current element.

Source String (DS:SI)
Address     Byte  Char
0510:0000   53    S
0510:0001   48    H
0510:0002   4F    O
0510:0003   50    P
0510:0004   50    P
0510:0005   45    E
0510:0006   52    R

Destination String (ES:DI)
Address     Byte  Char
02A8:2000   53    S
02A8:2001   48    H
02A8:2002   4F    O
02A8:2003   50    P
02A8:2004   50    P
02A8:2005   49    I
02A8:2006   4E    N
The MOVSB, MOVSW, and MOVSD instructions copy data from the memory location pointed to by DS:SI to the memory location pointed to by ES:DI.
.data
source DWORD 0FFFFFFFFh
target DWORD ?
.code
mov si,OFFSET source
mov di,OFFSET target
movsd
Direction Flag
Direction Flag (DF) is used to control the way SI and DI are adjusted during the execution of a string instruction
When DF = 0, SI and DI auto-increment during execution; when DF = 1, they auto-decrement.
Instruction to set DF: STD
Instruction to clear DF: CLD
Example:
Source String (DS:SI), stepping forward with DF = 0:
Address     Byte  Char  SI points here when
0510:0000   53    S     CX=5
0510:0001   48    H     CX=4
0510:0002   4F    O     CX=3
0510:0003   50    P     CX=2
0510:0004   50    P     CX=1
0510:0005   45    E     CX=0
0510:0006   52    R
.data
source DWORD 20 DUP(?)
target DWORD 20 DUP(?)
.code
cld                      ; direction = forward
mov cx,LENGTHOF source   ; set REP counter
mov si,OFFSET source
mov di,OFFSET target
rep movsd
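To make the roles of SI, DI, CX, and DF concrete, here is a minimal C sketch of what a repeated MOVS instruction does. It is a model under stated assumptions, not the processor's definition: memory is a flat byte array, DS and ES are assumed to refer to the same memory, and the names `StringRegs` and `rep_movs` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register snapshot: byte offsets SI and DI, repeat count CX. */
typedef struct { size_t si, di, cx; } StringRegs;

/* Model of REP MOVSB / MOVSW / MOVSD on a flat byte array.
 * width: element size in bytes (1, 2, or 4)
 * df   : direction flag (0 after CLD, 1 after STD)
 * Returns the register values left behind when CX reaches zero. */
static StringRegs rep_movs(uint8_t *mem, StringRegs r, size_t width, int df)
{
    assert(width == 1 || width == 2 || width == 4);
    while (r.cx != 0) {
        memmove(mem + r.di, mem + r.si, width);   /* copy one element */
        if (df == 0) { r.si += width; r.di += width; }  /* auto-increment */
        else         { r.si -= width; r.di -= width; }  /* auto-decrement */
        r.cx--;
    }
    return r;
}
```

Calling `rep_movs(mem, (StringRegs){0, 8, 5}, 1, 0)` copies five bytes forward and leaves SI and DI five bytes past where they started, which is the behaviour the CX countdown above illustrates.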
MOVSB (MOVSW) Example
MOV AX, 0510H
MOV DS, AX
MOV SI, 0
MOV AX, 0300H
MOV ES, AX
MOV DI, 100H
CLD
MOV CX, 5
REP MOVSB

Source String (DS:SI)
Address     Byte  Char
0510:0000   53    S
0510:0001   48    H
0510:0002   4F    O
0510:0003   50    P
0510:0004   50    P
0510:0005   45    E
0510:0006   52    R

Destination String (ES:DI) starts at 0300:0100
Your turn . . .
Use MOVSD to delete the first element of the following double-word array. All subsequent array values must be moved one position forward toward the beginning of the array:
.data
array DWORD 1,1,2,3,4,5,6,7,8,9,10
.code
cld
mov cx,(LENGTHOF array) - 1
mov si,OFFSET array+4
mov di,OFFSET array
rep movsd
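The same shift can be sketched in C to show why a forward copy is safe even though source and destination overlap. The function name `delete_first` is hypothetical, and the sketch assumes 32-bit elements to match DWORD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift every element one slot toward the start, overwriting array[0].
 * Copying in ascending order (as with CLD) is safe here because each
 * destination slot sits just below a source slot that was already read.
 * Returns the number of live elements; the last slot keeps its old value. */
static size_t delete_first(uint32_t *array, size_t length)
{
    assert(array != NULL);
    if (length == 0)
        return 0;
    for (size_t i = 1; i < length; i++)
        array[i - 1] = array[i];
    return length - 1;
}
```

As in the assembly version, the final slot is not cleared; the caller is expected to treat the array as one element shorter.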
CMPSB (CMPSW) Example
Assume (source string as in the earlier DS:SI diagram):
Source String:       53 48 4F 50 50 45 52   S H O P P E R
Destination String:  53 48 4F 50 50 49 4E   S H O P P I N
Comparing Arrays
Use a REPE (repeat while equal) prefix to compare corresponding elements of two arrays.
.data
source DWORD COUNT DUP(?)
target DWORD COUNT DUP(?)
.code
mov ecx,COUNT            ; repetition count
mov esi,OFFSET source
mov edi,OFFSET target
cld                      ; direction = forward
repe cmpsd               ; repeat while equal
This program compares two strings (source and destination). It displays a message indicating whether the lexical value of the source string is less than that of the destination string.
.data
source BYTE "MARTIN "
dest BYTE "MARTINEZ"
str1 BYTE "Source is smaller",0dh,0ah,0
str2 BYTE "Source is not smaller",0dh,0ah,0

Screen output:
Source is smaller
.code
main PROC
    cld                       ; direction = forward
    mov si,OFFSET source
    mov di,OFFSET dest
    mov cx,LENGTHOF source
    repe cmpsb
    jb source_smaller
    mov dx,OFFSET str2        ; "source is not smaller"
    jmp done
source_smaller:
    mov dx,OFFSET str1        ; "source is smaller"
done:
    call WriteString
    exit
main ENDP
END main
The following diagram shows the final values of SI and DI after comparing the strings:
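String   Before                     After
Source   SI points to the first M   SI points one byte past the blank that failed to match
Dest     DI points to the first M   DI points to Z, one byte past the mismatching E
Both pointers advance after every comparison, including the one that fails, so each ends one position beyond the first difference.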
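The final pointer positions can be checked with a small C model of REPE CMPSB. This is a sketch under assumptions, not the instruction's formal definition: both strings are plain byte arrays indexed from 0, the direction flag is clear, and the names `CmpsResult` and `repe_cmpsb` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t si, di, cx;  /* index and counter values after the instruction */
    int zf;             /* 1 if the last pair compared equal */
    int cf;             /* 1 if the last source byte was below the destination byte */
} CmpsResult;

/* Model of CLD / REPE CMPSB. ZF = 1 and CF = 0 are assumed on entry;
 * a real CPU leaves the flags untouched when CX starts at zero. */
static CmpsResult repe_cmpsb(const uint8_t *src, const uint8_t *dst, size_t cx)
{
    assert(src != NULL && dst != NULL);
    CmpsResult r = {0, 0, cx, 1, 0};
    while (r.cx != 0) {
        uint8_t a = src[r.si], b = dst[r.di];
        r.zf = (a == b);
        r.cf = (a < b);      /* unsigned "below", the condition JB tests */
        r.si++;              /* pointers advance even when the pair differs */
        r.di++;
        r.cx--;
        if (!r.zf)
            break;           /* REPE stops as soon as ZF = 0 */
    }
    return r;
}
```

With the source and destination strings from the program above and a count of 7, the model stops with both indices at 7 and the carry flag set, which is why `jb source_smaller` is taken.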
Your turn . . .
Modify the String Comparison program from the previous two slides. Prompt the user for both the source and destination strings. Sample output:
Input first string: ABCDEFG
Input second string: ABCDDG
The first string is not smaller.
SCASB Example
Search for the letter 'F' in a string named alpha:
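.data
alpha BYTE "ABCDEFGH",0
.code
mov di,OFFSET alpha      ; ES:DI points to the string
mov al,'F'               ; search for 'F'
mov cx,LENGTHOF alpha    ; set the search count
cld                      ; direction = forward
repne scasb              ; repeat while not equal
jnz quit                 ; quit if the letter was not found
dec di                   ; found: back up DI to the match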
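The reason for the trailing `dec di` is easier to see in a C model of the scan. This is a rough sketch with hypothetical names (`scan_byte`), assuming the direction flag is clear and the string is a plain byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of CLD / REPNE SCASB followed by JNZ and DEC DI.
 * Returns the index of the first byte equal to al, or -1 if none is found. */
static long scan_byte(const uint8_t *str, size_t cx, uint8_t al)
{
    assert(str != NULL);
    size_t di = 0;
    int zf = 0;                /* assumed clear on entry */
    while (cx != 0) {
        zf = (str[di] == al);  /* compare AL with the byte at ES:DI */
        di++;                  /* DI moves past the byte just examined */
        cx--;
        if (zf)
            break;             /* REPNE stops once a match sets ZF */
    }
    if (!zf)
        return -1;             /* corresponds to "jnz quit" */
    return (long)(di - 1);     /* corresponds to "dec di" */
}
```

Because DI has already stepped past the matching byte when the scan stops, the index must be backed up by one to point at the match itself.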
Arrays
One-Dimensional Arrays
Array declaration in HLL (such as C)
int test_marks[10];
In assembly language, a declaration such as
test_marks DW 10 DUP (?)
only names the array and reserves 20 bytes (10 words) of storage.
Accessing an array element requires its displacement (offset), in bytes, relative to the start of the array.
To compute displacement, we need to know how the array is laid out
Simple for 1-D arrays
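For a one-dimensional array the displacement is simply the index times the element size. The C sketch below spells that out for a word array like test_marks; the helper names `displacement_1d` and `read_word` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte displacement of element `index` from the start of a 1-D array. */
static size_t displacement_1d(size_t index, size_t element_size)
{
    assert(element_size > 0);
    return index * element_size;
}

/* Read a 16-bit element by adding its byte displacement to the base address,
 * which is what an assembly program does by hand. */
static uint16_t read_word(const uint16_t *base, size_t index)
{
    const uint8_t *bytes = (const uint8_t *)base;
    uint16_t value;
    memcpy(&value, bytes + displacement_1d(index, sizeof(uint16_t)), sizeof value);
    return value;
}
```

For a word array, element 3 therefore lives 6 bytes from the start, and `read_word(test_marks, 3)` returns the same value as the C expression `test_marks[3]`.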
Multidimensional Arrays
We focus on two-dimensional arrays
Our discussion can be generalized to higher dimensions
Column-major order
Array is stored column by column
FORTRAN uses this method
In assembly language,
class_marks DW 5*3 DUP (?)
allocates 30 bytes of storage
There is no support for using row and column subscripts
Need to translate these subscripts into a displacement value. With the array stored row by row:
displacement = (i * COLUMNS + j) * ELEMENT_SIZE
where
i = row subscript, j = column subscript
COLUMNS = number of columns in the array
ELEMENT_SIZE = element size in bytes
Example: Displacement of class_marks[3,1] element is (3*3 + 1) * 2 = 20
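The subscript-to-displacement translation can be sketched in C as follows. The row-major function matches the class_marks example (3 columns, 2-byte elements); the column-major variant is included for comparison and is an addition based on the standard formula, not something the slides spell out. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major: rows are stored one after another. */
static size_t row_major_displacement(size_t i, size_t j,
                                     size_t columns, size_t element_size)
{
    assert(j < columns);
    return (i * columns + j) * element_size;
}

/* Column-major: columns are stored one after another. */
static size_t column_major_displacement(size_t i, size_t j,
                                        size_t rows, size_t element_size)
{
    assert(i < rows);
    return (j * rows + i) * element_size;
}
```

For class_marks[3,1], `row_major_displacement(3, 1, 3, 2)` gives 20, agreeing with the worked example, while the same subscripts in column-major order with 5 rows give 16.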
Examples
Reverse an array
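Reverse an array of N elements. BX holds the number of elements N, and SI points to the array (byte-sized elements are assumed here, since SI and DI step by one).
        MOV  DI, SI
        DEC  BX
        ADD  DI, BX        ; DI -> last element
        INC  BX
        SHR  BX, 1         ; BX = N/2 = number of swaps
        JZ   DONE          ; nothing to swap when N < 2
NEXT:
        MOV  AL, [SI]
        XCHG AL, [DI]
        MOV  [SI], AL
        INC  SI
        DEC  DI
        DEC  BX
        JNZ  NEXT
DONE: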
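The same two-pointer idea reads as follows in C. This is a sketch assuming byte-sized elements, with `lo` playing the role of SI and `hi` the role of DI; the function name `reverse_bytes` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse n bytes in place by swapping from both ends toward the middle. */
static void reverse_bytes(uint8_t *array, size_t n)
{
    assert(array != NULL);
    if (n < 2)
        return;                       /* zero or one element: nothing to do */
    uint8_t *lo = array;              /* like SI: first element */
    uint8_t *hi = array + (n - 1);    /* like DI: last element */
    for (size_t swaps = n / 2; swaps != 0; swaps--) {
        uint8_t tmp = *lo;
        *lo = *hi;
        *hi = tmp;
        lo++;
        hi--;
    }
}
```

Only N/2 swaps are needed; with an odd count the middle element stays where it is.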
Alternative format:
table BYTE 10h,20h,30h,40h,50h,60h,70h,
           80h,90h,0A0h,
           0B0h,0C0h,0D0h,
           0E0h,0F0h
NumCols = 5
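Layout of table in memory (NumCols = 5):
        col 0  col 1  col 2  col 3  col 4
row 0   10     20     30     40     50
row 1   60     70     80     90     A0
row 2   B0     C0     D0     E0     F0
table is the address of the first byte (10). With a row offset in EBX, table[ebx] is the first byte of that row; with a column index in ESI, table[ebx + esi] is the element in that column of the row.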
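A short C sketch shows how the two registers of a base-index operand combine. The variable names `ebx` and `esi` mirror the registers, and `table_at` is a hypothetical helper; the row offset is assumed to be the row number times NumCols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_COLS = 5 };

static const uint8_t table[] = {
    0x10, 0x20, 0x30, 0x40, 0x50,
    0x60, 0x70, 0x80, 0x90, 0xA0,
    0xB0, 0xC0, 0xD0, 0xE0, 0xF0
};

/* Fetch table[row][col] the way table[ebx + esi] would. */
static uint8_t table_at(size_t row, size_t col)
{
    assert(col < NUM_COLS && row < sizeof table / NUM_COLS);
    size_t ebx = row * NUM_COLS;   /* base: offset of the row */
    size_t esi = col;              /* index: column within the row */
    return table[ebx + esi];
}
```

For row 1 and column 2 the row offset is 5 and the column index is 2, so the byte fetched is 80h.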
Binary Search
A simple searching algorithm that works well for large arrays of values that have been placed in either ascending or descending order
Bubble Sort
Each pair of adjacent values is compared, and exchanged if the values are not ordered correctly:
One Pass (Bubble Sort)
3 1 7 5 2 9 4 3
1 3 7 5 2 9 4 3
1 3 7 5 2 9 4 3
1 3 5 7 2 9 4 3
1 3 5 2 7 9 4 3
1 3 5 2 7 9 4 3
1 3 5 2 7 4 9 3
1 3 5 2 7 4 3 9
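The pass shown for the bubble sort can be sketched in C as below. The split into `bubble_pass` and `bubble_sort` and the early exit when a pass makes no exchanges are illustrative choices, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* One pass: compare each adjacent pair and exchange it if out of order.
 * Returns the number of exchanges made. */
static int bubble_pass(int *a, size_t n)
{
    assert(a != NULL);
    int swaps = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (a[i] > a[i + 1]) {
            int tmp = a[i];
            a[i] = a[i + 1];
            a[i + 1] = tmp;
            swaps++;
        }
    }
    return swaps;
}

/* Repeat passes over a shrinking prefix; each pass moves the largest
 * remaining value to the end of that prefix. */
static void bubble_sort(int *a, size_t n)
{
    for (size_t unsorted = n; unsorted > 1; unsorted--) {
        if (bubble_pass(a, unsorted) == 0)
            break;                 /* already in order */
    }
}
```

One pass over 3 1 7 5 2 9 4 3 makes five exchanges and leaves 9 in the last position; repeating the pass on the shrinking front portion sorts the whole array.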
Binary Search
Searching algorithm, well-suited to large ordered data sets
Divide and conquer strategy
Each "guess" divides the list in half
Classified as an O(log n) algorithm:
The number of comparisons grows with log2 n for an array of n elements, so doubling the array size adds only about one more comparison.
; if ( EDX < searchval(EDI) )
;    first = mid + 1;
    cmp edx,edi
    jge L2
    mov eax,mid        ; first = mid + 1
    inc eax
    mov first,eax
    jmp L4             ; continue the loop
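The divide-and-conquer search described above can be sketched in C as follows. This version assumes ascending order and uses a half-open range [first, last), which is a variation chosen for the sketch rather than the exact loop the assembly fragment comes from; the function name `binary_search` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of search_val in an ascending array, or -1 if absent. */
static long binary_search(const int32_t *values, size_t count, int32_t search_val)
{
    assert(values != NULL || count == 0);
    size_t first = 0;
    size_t last = count;                          /* one past the last candidate */
    while (first < last) {
        size_t mid = first + (last - first) / 2;  /* avoids overflow of first + last */
        if (values[mid] < search_val)
            first = mid + 1;                      /* discard the lower half */
        else if (values[mid] > search_val)
            last = mid;                           /* discard the upper half */
        else
            return (long)mid;                     /* found */
    }
    return -1;                                    /* range is empty: not found */
}
```

Each iteration halves the remaining range, so a six-element array needs at most three comparisons of the middle element before the search ends.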
Summary
String primitives are optimized for efficiency
Strings and arrays are essentially the same
Keep code inside loops simple
Use base-index operands with two-dimensional arrays
Avoid the bubble sort for large arrays
Use binary search for large sequentially ordered arrays