Requirements For Coding in Assembly Language
Chapter 4
Reserved Words
Certain words in assembly language are reserved for the assembler's own purposes and may be used only under special conditions.
The four categories of reserved words are:
- instructions, such as MOV and ADD
- directives, such as END and SEGMENT
- operators, such as FAR and SIZE, which are used in expressions
- predefined symbols, such as @Data and @Model, which return information during assembly
Using a reserved word for the wrong purpose causes the assembler to generate an error message. Appendix D lists the
reserved words.
Identifiers
An identifier is a name that you apply to items in your program. The two types of identifiers are:
- name, which refers to the address of a data item
- label, which refers to the address of an instruction
An identifier can use the following characters:
Alphabetic letters: A-Z and a-z
Digits: 0-9 (may not be the first character)
Special characters: question mark (?), underline (_), dollar ($), at (@), and period (.) (the period may not be the first character)
The first character of an identifier must be an alphabetic letter or a special character, except for the period. Since the
assembler uses some special words that begin with the @ symbol, you should avoid using it for your own definitions.
The assembler is not case sensitive. The maximum length of an identifier is 31 characters (247 since MASM 6.0).
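The naming rules above can be illustrated with a few data definitions. This is only a sketch: the names are invented for illustration, and the lines are assumed to sit inside a data segment.

```asm
; Valid identifiers (all names are hypothetical)
TOTAL     DB      0                 ; letters only
TOTAL_2   DW      0                 ; underline and a digit after the first character
?FLAG     DB      0                 ; question mark as the first character
$AMOUNT   DW      0                 ; dollar sign as the first character

; Invalid identifiers, kept as comments so the fragment still assembles
; 2TOTAL  DB      0                 ; a digit may not be the first character
; MOV     DB      0                 ; MOV is a reserved word (an instruction)
```

Because the assembler is not case sensitive, TOTAL and total would refer to the same item.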
Statement
An assembly language program consists of a set of statements. The three types of statements are:
- instructions, such as MOV and ADD, which the assembler translates to object code (machine code) that is
executed at run time.
- directives, which tell the assembler to perform a specific action, such as defining a data item; in most cases
they generate no object code. For example, the .586 directive tells the assembler to accept Pentium instructions, including those with 32-bit operands.
- macros, which are shorthand for a sequence of other statements. The assembler expands a macro into the
statements it represents and then processes the resulting statements.
General format for a statement:
[identifier] operation [operand(s)] [;comment]
An identifier (if any), operation, and operand (if any) are separated by at least one blank or tab character. There is a
maximum of 132 characters on a line (512 since MASM 6.0).
Example:
Directive: COUNT DB 1 ;name, operation, operand
Instruction: MOV AX,0 ;operation, two operands
Operation
The operation, which must be coded, is most commonly used for defining data areas and coding instructions. For a data
item, an operation such as DB or DW defines a field, work area, or constant. For an instruction, an operation such as
MOV or ADD indicates an action to perform.
Operand
The operand (if any) provides information for the operation to act on.
For a data item, the operand defines its initial value.
Example:
COUNTER DB 0
For an instruction, an operand indicates where to perform the action. An instruction may have zero, one, two, or three
operands.
Examples:
No operands: RET
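The remaining operand counts might look as follows. The particular instructions and registers are illustrative choices, not taken from the original examples.

```asm
          INC     BX                ; one operand: add 1 to BX
          ADD     CX,25             ; two operands: add 25 to CX
          IMUL    CX,DX,3           ; three operands: CX = DX * 3
                                    ; (this form needs an 80186 or later, e.g. a .186 directive)
```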
Directives
Assembly language supports a number of statements that enable you to control the way in which a program assembles
and lists. These statements act during the assembling of the program and generate no executable code.
The TITLE directive causes a title for a program to print on line 2 of each page of the program listing. You may code it
once at the start of the program. Syntax:
TITLE text
Segment Directive
An assembly program in .EXE format consists of one or more segments. A stack segment defines stack storage, a data
segment defines data items, and a code segment provides for executable code. The SEGMENT and ENDS
directives have the following format:
segment-name SEGMENT [align] [combine] ['class']
segment-name ENDS
The SEGMENT statement defines the start of a segment. The segment name must be present, must be unique, and must
follow the naming conventions of the language. The ENDS statement indicates the end of the segment and contains the
same name as the SEGMENT statement. The maximum size of a segment is 64K.
Align type. The align entry (alignment) indicates the boundary on which the segment is to begin. PARA is typically used and is the
default.
The others are BYTE, WORD, DWORD, and PAGE (an address divisible by 256).
Combine type. The combine entry indicates whether to combine the segment with other segments when they are linked
after assembly. Combine types are STACK, COMMON, PUBLIC, and AT expression.
Example
name SEGMENT PARA STACK
You may use PUBLIC and COMMON where you intend to combine separately assembled programs when linking.
When a program is not to be linked with others, this option may be omitted or coded as NONE.
Class type. The class entry, which is enclosed in apostrophes, is used to group related segments when linking. This book uses
'code' for the code segment, 'data' for the data segment, and 'stack' for the stack segment.
Example
name SEGMENT PARA STACK 'Stack'
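Putting the align, combine, and class entries together, a stack segment and a data segment could be defined as in the sketch below. The segment names STACKSEG and DATASEG, the stack size, and the data item are assumptions chosen for illustration.

```asm
STACKSEG  SEGMENT PARA STACK 'Stack'
          DW      32 DUP(0)         ; reserve 32 words of stack storage
STACKSEG  ENDS

DATASEG   SEGMENT PARA 'Data'
COUNT     DB      1                 ; a single one-byte data item
DATASEG   ENDS
```

The data segment omits the combine entry, which is acceptable when the program is not linked with others.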
PROC Directive
The code segment contains the executable code for a program. It also contains one or more procedures, defined with
the PROC directive.
Example
segname SEGMENT PARA
procname PROC FAR
procname ENDP
segname ENDS
The procedure name must be present, must be unique, and must follow the naming conventions of the language. The
FAR operand in this case marks the procedure as the entry point for program execution.
The ENDP directive indicates the end of a procedure and contains the same name as the PROC statement, which enables the assembler to
relate the two.
The code segment may contain any number of procedures used as subroutines, each with its own set of PROC and
ENDP statements. Each of these is usually coded with the NEAR operand (the default).
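A code segment with a FAR main procedure and one NEAR subroutine might be laid out as follows. This is a sketch: the names CODESEG, MAIN, and CLEAR_AX are hypothetical, and ending the program with DOS function 4CH assumes a DOS environment, which the text above does not state.

```asm
CODESEG   SEGMENT PARA 'Code'
          ASSUME  CS:CODESEG        ; lets the assembler resolve labels in this segment
MAIN      PROC    FAR               ; entry point for execution
          CALL    CLEAR_AX          ; near call to the subroutine below
          MOV     AX,4C00H          ; DOS function 4CH: end the program (assumed environment)
          INT     21H
MAIN      ENDP

CLEAR_AX  PROC    NEAR              ; NEAR is the default and could be omitted
          MOV     AX,0
          RET                       ; near return to the caller
CLEAR_AX  ENDP
CODESEG   ENDS
```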
ASSUME Directive
This statement associates the name of a segment with a segment register.
Syntax:
ASSUME SS:stackname,DS:datasegname,CS:codesegname...
ASSUME may also contain an entry for the ES register. If the program does not use ES, you can code ES:NOTHING or simply leave it out.
Like other directives, ASSUME is just a message that helps the assembler convert symbolic code to machine code.
Note that you may still have to code instructions that load addresses into the segment registers during execution.
END Directive
ENDS ends a segment, ENDP ends a procedure. An END directive ends an entire program.
Syntax:
END [procname]
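The directives covered so far can be combined into a skeleton .EXE program. The sketch below uses hypothetical names (STACKSEG, DATASEG, CODESEG, MAIN, COUNT) and assumes a DOS environment for the terminating INT 21H call; it is one possible layout, not the only one.

```asm
          TITLE   Skeleton of an .EXE program
STACKSEG  SEGMENT PARA STACK 'Stack'
          DW      32 DUP(0)         ; stack storage
STACKSEG  ENDS

DATASEG   SEGMENT PARA 'Data'
COUNT     DB      1
DATASEG   ENDS

CODESEG   SEGMENT PARA 'Code'
MAIN      PROC    FAR
          ASSUME  SS:STACKSEG,DS:DATASEG,CS:CODESEG
          MOV     AX,DATASEG        ; ASSUME does not load DS, so do it here
          MOV     DS,AX             ; a segment register is loaded through a general register
          MOV     AL,COUNT          ; the data item is now addressable through DS
          MOV     AX,4C00H          ; DOS function 4CH: end the program (assumed environment)
          INT     21H
MAIN      ENDP
CODESEG   ENDS
          END     MAIN              ; ends the program and names MAIN as the entry point
```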
Coding a processor directive such as .386 before the .MODEL statement causes the assembler to assume 32-bit addressing. The
entry STDCALL tells the assembler to use standard conventions for names and procedure calls. The processor
operates more efficiently because it does not have to convert segment:offset addresses to actual addresses.
The use of DWORD to align segments on a doubleword address speeds up memory access on 32-bit data buses.
The .386 directive tells the assembler to accept instructions that are unique to the 80386 and later processors; the USE32 use type
tells the assembler to generate code appropriate to 32-bit protected mode:
.386
segname SEGMENT DWORD USE32
Note: On these processors the DS register is still 16 bits in size, so only the low-order 16 bits are loaded into it:
MOV EAX,DATASEG ;get address of data segment
MOV DS,AX ;load 16-bit portion
Model    Code segments    Data segments
TINY     *                *
SMALL    1                1
You may use any of these models for a stand-alone program (one not linked to another program). The TINY model is intended
for the exclusive use of .COM programs. The SMALL model requires that code fits within one 64K segment and data fits
within another 64K segment.
The general formats for the directives that define the stack, data, and code segments are:
.STACK [size]
.DATA
.CODE [name]
Each of these directives causes the assembler to generate the required SEGMENT statement and its matching ENDS.
The default segment names are STACK, _DATA, and _TEXT. The default stack size is 1024 bytes, which you
may override.
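With the simplified directives, a small stand-alone program might be sketched as follows. The names MAIN and COUNT and the stack size are illustrative, and the terminating INT 21H call assumes a DOS environment.

```asm
          .MODEL  SMALL
          .STACK  64                ; request a 64-byte stack
          .DATA
COUNT     DB      1
          .CODE
MAIN      PROC    FAR
          MOV     AX,@data          ; predefined symbol: address of the data segment
          MOV     DS,AX
          MOV     AL,COUNT
          MOV     AX,4C00H          ; DOS function 4CH: end the program (assumed environment)
          INT     21H
MAIN      ENDP
          END     MAIN
```

No SEGMENT, ENDS, or ASSUME statements appear here because the assembler generates them from .STACK, .DATA, and .CODE.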
Data Definition
A data item may contain an undefined value, or a constant, or a character string, or a numeric value.
Syntax:
[name] Dn expression
Name. A program that references a data item does so by means of a name.
Directive. The directives that define data items are DB, DW, DD, DF, DQ, and DT.
Expression. The expression in an operand may contain a question mark to indicate an uninitialized item. It may contain a
constant. It may contain multiple constant values separated by commas. It also allows repeated duplication of the
same value with the DUP operator.
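The forms of expression described above could be coded as in this sketch. The field names FLD1 through FLD6 and all values are hypothetical; duplication uses the assembler's DUP operator.

```asm
FLD1      DB      ?                 ; uninitialized byte
FLD2      DB      25                ; numeric constant
FLD3      DB      'Hello'           ; character string, one byte per character
FLD4      DW      11, 12, 13        ; three word constants separated by commas
FLD5      DB      10 DUP(?)         ; ten uninitialized bytes
FLD6      DW      5 DUP(14)         ; five words, each containing 14
```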
Equate Directives
Equal-Sign Directive
The equal-sign directive enables you to assign the value of an expression to a name, and you may redefine that name any
number of times in a program. Example:
VALUE_OF_PI = 3.1416
RIGHT_COL = 79
SCREEN_POSITION = 80*25
MOV CX,CONTR
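Redefinition with the equal sign might look like the following sketch. The values assigned to CONTR are assumptions for illustration only; the original definition of CONTR is not shown in the text.

```asm
CONTR     =       10                ; first definition (value is hypothetical)
          MOV     CX,CONTR          ; assembles as MOV CX,10
CONTR     =       CONTR + 5         ; the equal sign allows the name to be redefined
          MOV     CX,CONTR          ; assembles as MOV CX,15
```

The assembler substitutes the current value of the name at each point of use, so the two MOV instructions load different constants.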