Module 3
opcode   n i x b p e   disp
0000 00  0 1 0 0 0 0   0000 0000 0011
Object Code = 010003
opcode   n i x b p e   disp
0011 11  1 0 0 0 1 0   0000 0000 0011
Object Code = 3E2003
Format 4 Instruction
• Line Loc. Label Instruction Operand Object Code
• 133 103C +LDT #4096 75101000
• Opcode=74
• n=0, i=1: Immediate addressing
• e=1: Format 4 instruction
• nixbpe=010001
opcode   n i x b p e   address
0111 01  0 1 0 0 0 1   0000 0001 0000 0000 0000
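The bit layouts above can be checked by packing the fields mechanically. The following is a minimal sketch, not a full assembler: the function name `encode` and its parameter order are assumptions made for illustration, and the opcode values are read directly off the bit patterns shown on the slides.

```python
def encode(opcode, n, i, x, b, p, e, value):
    """Pack SIC/XE Format 3/4 fields into an object-code hex string.

    Hypothetical helper for illustration only.
    """
    # The low two bits of the 8-bit opcode are replaced by the n and i flags.
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1) | e
    if e:
        # Format 4: 4 bytes with a 20-bit address field.
        word = (first_byte << 24) | (flags << 20) | (value & 0xFFFFF)
        return f"{word:08X}"
    # Format 3: 3 bytes with a 12-bit displacement field.
    word = (first_byte << 16) | (flags << 12) | (value & 0xFFF)
    return f"{word:06X}"


# +LDT #4096: opcode 74, immediate (n=0, i=1), extended (e=1), address 4096
print(encode(0x74, 0, 1, 0, 0, 0, 1, 4096))   # 75101000
# Opcode bits 0011 11 (3C), indirect (n=1, i=0), PC-relative (p=1), disp 003
print(encode(0x3C, 1, 0, 0, 0, 1, 0, 0x003))  # 3E2003
```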
• Starting address 0
15 0006 CLOOP +JSUB RDREC 4B101036
• Relocate the program to 1000
15 0006 CLOOP +JSUB RDREC 4B102036
• Relocate the program to 3000
15 0006 CLOOP +JSUB RDREC 4B104036
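The three object codes above differ only in the 20-bit address field, which grows by the load address. A small sketch of that adjustment follows; `relocate_format4` is a hypothetical name, and the code assumes the whole Format 4 instruction is given as an 8-digit hex string.

```python
def relocate_format4(object_code, load_address):
    """Add the load address to the 20-bit address field of a Format 4 instruction."""
    word = int(object_code, 16)
    address = word & 0xFFFFF                      # low 20 bits hold the address
    new_address = (address + load_address) & 0xFFFFF
    # Keep the opcode and nixbpe bits, replace only the address field.
    return f"{(word & ~0xFFFFF) | new_address:08X}"


print(relocate_format4("4B101036", 0x1000))  # 4B102036
print(relocate_format4("4B101036", 0x3000))  # 4B104036
```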
• An ORG statement affects the values of all symbols defined until the
next ORG, since the values of symbols used as labels are taken from LOCCTR
Expressions
• Assemblers allow an expression to be used wherever a single operand is permitted
• Each expression will be evaluated by the assembler to produce a single
operand or address value
• Operators: +, -, *, /
• Division produces integer result
• Individual terms in the expression may be constants, user-defined symbols or
special terms
• The most common special term is the current value of the location counter (often
designated by *). This term represents the value of the next unassigned
memory location
106 BUFEND EQU *
• Gives BUFEND a value that is the address of the next byte after the buffer
area
Expressions
• Expressions can be classified as absolute expressions or relative
expressions
• Absolute term: constant
• Relative term: Labels on instructions and data areas, references to location
counter
• MAXLEN EQU BUFEND-BUFFER
• BUFEND and BUFFER both are relative terms, representing addresses within
the program
• However the expression BUFEND-BUFFER represents an absolute value
• When relative terms are paired with opposite signs, the dependency
on the program starting address is canceled out; the result is an
absolute value
Expressions
• None of the relative terms may enter into a multiplication or division
operation
• Errors:
• BUFEND+BUFFER
• 100-BUFFER
• 3*BUFFER
• To determine the type of an expression
• keep track of the types of all symbols defined in the program
• Flag in the symbol table indicates the type of value (absolute or
relative) in addition to the value itself
Symbol   Type   Value
RETADR   R      30
BUFFER   R      36
BUFEND   R      1036
MAXLEN   A      1000
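The pairing rule can be sketched as a count of relative terms: each relative term contributes +1 or -1 according to its sign, a net count of 0 means absolute, +1 means relative, and anything else is an error. This is a simplified illustration under stated assumptions: the function name `classify` and the tuple layout of the table are hypothetical, parentheses and the `*` location-counter term are not handled, and the table values are the ones listed in the symbol table above, read as hexadecimal.

```python
import re

# Symbol table with a type flag (R = relative, A = absolute) and a value.
SYMTAB = {
    "RETADR": ("R", 0x0030),
    "BUFFER": ("R", 0x0036),
    "BUFEND": ("R", 0x1036),
    "MAXLEN": ("A", 0x1000),
}


def classify(expression, symtab=SYMTAB):
    """Return 'A' or 'R' for an expression, or raise ValueError if it is illegal."""
    tokens = re.findall(r"[A-Za-z]\w*|\d+|[-+*/]", expression)
    net_relative = 0
    sign = 1
    for pos, tok in enumerate(tokens):
        if tok in ("+", "-"):
            sign = 1 if tok == "+" else -1
        elif tok in ("*", "/"):
            continue
        elif tok[0].isalpha() and symtab[tok][0] == "R":
            # A relative term may not be an operand of * or /.
            neighbours = tokens[max(pos - 1, 0):pos] + tokens[pos + 1:pos + 2]
            if "*" in neighbours or "/" in neighbours:
                raise ValueError(f"relative term {tok} in multiplication or division")
            net_relative += sign
    if net_relative == 0:
        return "A"
    if net_relative == 1:
        return "R"
    raise ValueError("relative terms are not properly paired")


print(classify("BUFEND-BUFFER"))  # A
print(classify("BUFFER+100"))     # R
```

Under this rule the three error cases listed above are rejected: BUFEND+BUFFER has a net count of 2, 100-BUFFER has a net count of -1, and 3*BUFFER multiplies a relative term.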
Program Blocks
• The programs considered so far logically contained subroutines, data areas, etc.
• They were treated as a single unit by the assembler, resulting in a
single block of object code.
• Within the object program, the generated machine instructions and data
appeared in the same order in which they were written in the source program
• Many assemblers provide features that allow more flexible handling of the
source and object programs.
• Some features allow the generated machine instructions and data to appear in a
different order from the source program: program blocks
• Other features allow the creation of independent program units: control
sections
Program Blocks
• Program blocks refer to segments of code that are rearranged within
a single object program unit
• Assembler directive USE indicates which portions of the program
belong to which blocks
USE [blockname]
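One way to picture the effect of USE is a separate location counter per block, with the blocks laid out one after another once their lengths are known. The sketch below is an assumption-laden illustration: `layout_blocks`, the statement sizes and the block name CDATA in the usage lines are hypothetical, and the unnamed default block is written as an empty string.

```python
def layout_blocks(statements):
    """statements: list of (block_name, size_in_bytes) in source order.

    Returns the final address of each statement and the start address of each block.
    """
    locctr = {}     # one location counter per block, kept in first-use order
    relative = []   # (block, offset within that block) for each statement
    for block, size in statements:
        offset = locctr.setdefault(block, 0)
        relative.append((block, offset))
        locctr[block] = offset + size
    # Blocks are placed one after another in order of first appearance.
    start, next_free = {}, 0
    for block, length in locctr.items():
        start[block] = next_free
        next_free += length
    addresses = [start[block] + offset for block, offset in relative]
    return addresses, start


# Source order alternates between the default block and a data block.
addresses, start = layout_blocks([("", 3), ("CDATA", 3), ("", 3), ("CDATA", 1)])
print(addresses)  # [0, 6, 3, 9]
print(start)      # {'': 0, 'CDATA': 6}
```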
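Control Sections and Program Linking
• Modification record for an external reference: M^000004^05^+^RDREC
• The value of the external symbol RDREC is added to the 5-half-byte address
field that begins at relative location 000004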
• There is a separate set of object program records (from Header
through End) for each control section
• The object program of each control section will be the same as if the
sections were assembled separately
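How a loader might act on a record such as M^000004^05^+^RDREC can be sketched as follows. This is an illustration only: `apply_modification`, the `estab` dictionary of external symbol values, the sample bytes and the address 4040 chosen for RDREC are all hypothetical.

```python
def apply_modification(memory, record, estab, section_start=0):
    """Apply one Modification record of the form M^addr^len^sign^symbol to memory."""
    _, addr, length, sign, symbol = record.split("^")
    addr = int(addr, 16) + section_start
    half_bytes = int(length, 16)           # field length in half-bytes
    nbytes = (half_bytes + 1) // 2         # bytes that contain the field
    field = int.from_bytes(memory[addr:addr + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1     # the field occupies the low-order half-bytes
    value = estab[symbol] if sign == "+" else -estab[symbol]
    updated = (field & ~mask) | ((field + value) & mask)
    memory[addr:addr + nbytes] = updated.to_bytes(nbytes, "big")


# Bytes 0-2 are an unrelated instruction; bytes 3-6 are +JSUB RDREC with address 00000.
memory = bytearray.fromhex("1720274B100000")
apply_modification(memory, "M^000004^05^+^RDREC", {"RDREC": 0x4040})
print(memory.hex().upper())  # 1720274B104040
```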
Control Section | Program Block
CSECT directive | USE directive
Refers to segments of code that are translated into independent object program units | Refers to segments of code that are rearranged within a single object program unit
Handled separately by the assembler | Not handled separately
Control sections need not all be assembled at the same time | All program blocks must be assembled at the same time
Assembler Design Options
• One pass Assemblers
• Main problem is with forward references
• Instruction operands may be symbols that have not yet been defined in the source program
• Solution
• Define data before they are referenced
• Place all storage reservation statements at the start of the program rather
than at the end
• But forward reference to labels cannot be eliminated easily
• Assembler must make special provision for handling forward references
• Many one pass assemblers prohibit forward references to data items
• Two main types of one pass assembler
• Load and Go Assembler: produces object program in memory for immediate
execution
• Object Program Output: produces the usual kind of object program for later
execution
Load and Go Assembler
• Useful for program development and testing
• Avoids the overhead of writing the object program out and reading it
back
• No loader is required
• For a load-and-go assembler, the actual address must be known at
assembly time, so an absolute program can be used
• As it scans the source program, the assembler generates the object
code instructions directly in memory for immediate
execution
Load and Go Assembler
• If the instruction operand is an undefined symbol
• omit the operand address during instruction translation
• insert the symbol into SYMTAB, and mark this symbol as undefined
• the address that refers to the undefined symbol is added to a list of forward
references associated with the symbol table entry
• when the definition for a symbol is encountered, the forward reference list for
that symbol is scanned and the proper address for the symbol is then inserted
into any instructions previously generated
• At the end of the program
• any SYMTAB entries still flagged as undefined (marked with *) indicate undefined symbols, which are errors
• search SYMTAB for the symbol named in the END statement and jump to this
location to begin execution
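The forward-reference handling described above can be sketched with a symbol table plus a per-symbol list of addresses waiting to be patched. This is a simplified model, not an actual assembler: the class and method names are hypothetical, memory is modelled as a dictionary from operand address to operand value, and the addresses in the usage lines are made up.

```python
class LoadAndGoAssembler:
    """Toy model of forward-reference backpatching in a one-pass assembler."""

    def __init__(self):
        self.memory = {}        # operand address -> operand value
        self.symtab = {}        # symbol -> address, or None while undefined
        self.forward_refs = {}  # symbol -> operand addresses awaiting its value

    def reference(self, operand_addr, symbol):
        value = self.symtab.get(symbol)
        if value is None:
            # Undefined so far: enter it in SYMTAB and remember where to patch.
            self.symtab.setdefault(symbol, None)
            self.forward_refs.setdefault(symbol, []).append(operand_addr)
            self.memory[operand_addr] = 0    # operand address omitted for now
        else:
            self.memory[operand_addr] = value

    def define(self, symbol, address):
        self.symtab[symbol] = address
        # Scan the forward-reference list and insert the proper address.
        for operand_addr in self.forward_refs.pop(symbol, []):
            self.memory[operand_addr] = address

    def undefined_symbols(self):
        return sorted(name for name, value in self.symtab.items() if value is None)


asm = LoadAndGoAssembler()
asm.reference(0x2013, "RDREC")
asm.reference(0x201F, "RDREC")
asm.reference(0x2016, "ENDFIL")
asm.define("RDREC", 0x203D)
print(hex(asm.memory[0x2013]), hex(asm.memory[0x201F]))  # 0x203d 0x203d
print(asm.undefined_symbols())                           # ['ENDFIL']
```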
Object Program Output
• Used on systems where external working-storage devices (for the intermediate file
between the two passes) are not available or are too slow
• Solution:
• Forward references are entered into lists as before
• When definition of a symbol is encountered, instructions that made forward
references to that symbol will no longer be available in memory for
modification
• They will already have been written out as part of the Text record in the
object program
• The assembler must generate another Text record with the correct operand
address
• When the program is loaded, the address will be inserted into the instruction
by the action of the loader
• The object program records must be kept in their original order when they
are presented to the loader
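The extra Text records can be pictured as one short record per pending forward reference, each carrying only the now-known operand address. The sketch below assumes a 2-byte operand address field and the `^`-separated layout T^start^length^code; the function name `fixup_records` and the addresses are hypothetical.

```python
def fixup_records(forward_refs, symbol, address):
    """Build one Text record per forward reference to a newly defined symbol.

    forward_refs maps each symbol to the addresses of the operand fields
    that were written out before the symbol was defined.
    """
    return [
        f"T^{operand_addr:06X}^02^{address:04X}"
        for operand_addr in forward_refs.get(symbol, [])
    ]


# ENDFIL was referenced at 201C before being defined at 2024.
print(fixup_records({"ENDFIL": [0x201C]}, "ENDFIL", 0x2024))  # ['T^00201C^02^2024']
```

Because a later record overwrites bytes loaded by an earlier one, this scheme only works when the loader sees the records in their original order.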
Multi Pass Assembler
• In our definition of the EQU directive, any symbol used on the right-hand side
must be defined previously in the source program
• Consider the sequence
ALPHA EQU BETA
BETA EQU DELTA
DELTA RESW 1
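• BETA cannot be assigned a value in Pass 1 because DELTA is not yet defined, so
ALPHA cannot be evaluated in Pass 2; an assembler that makes only two passes
cannot resolve this sequence
[Figure: symbol table entry for each undefined symbol, with its list of dependent symbols]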
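The dependency-list idea can be sketched as follows: each undefined symbol keeps a list of the symbols waiting on it, and defining it triggers the definition of everything on that list. This is a minimal illustration restricted to right-hand sides that are a single symbol or a number; the name `resolve` and the address 1000 given to DELTA are assumptions.

```python
def resolve(definitions):
    """definitions: list of (symbol, right_hand_side) in source order.

    The right-hand side is either an int (a known value) or the name of
    another symbol. Returns the symbol table and any unresolved dependencies.
    """
    symtab = {}       # symbol -> resolved value
    dependents = {}   # undefined symbol -> symbols that depend on it

    def define(symbol, value):
        symtab[symbol] = value
        # Defining a symbol may allow its dependents to be evaluated in turn.
        for waiting in dependents.pop(symbol, []):
            define(waiting, value)

    for symbol, rhs in definitions:
        if isinstance(rhs, int):
            define(symbol, rhs)
        elif rhs in symtab:
            define(symbol, symtab[rhs])
        else:
            dependents.setdefault(rhs, []).append(symbol)
    return symtab, dependents


symtab, pending = resolve([("ALPHA", "BETA"), ("BETA", "DELTA"), ("DELTA", 0x1000)])
print(symtab)   # {'DELTA': 4096, 'BETA': 4096, 'ALPHA': 4096}
print(pending)  # {}
```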
MASM Assembler
• The Microsoft Macro Assembler (MASM) is an x86 assembler that uses Intel
syntax and targets MS-DOS and Microsoft Windows
• SEGMENT
• A MASM assembly language program is written as a collection of segments
• Each segment is defined as belonging to a particular class, CODE, DATA, CONST,
STACK
• Segments are addressed via registers: CS (code), SS (stack), DS (data), ES, FS, GS
• similar to program blocks in SIC/XE
• ASSUME directive tells the assembler to associate segment name with a
register
• ASSUME ES:DATASEG2
• associates ES register with the segment DATASEG2
• Similar to the BASE directive in SIC/XE
MASM Assembler
• JUMP instructions are assembled in two different ways:
• Near jump: jump to a target in the same code segment
• Assembled using same code segment register
• Assembled instruction: 2 or 3 bytes
• Far jump: jump to a target in a different code segment
• Assembled using different segment register
• Assembled instruction: 5 bytes
• e.g. JMP TARGET
• By default the assembler assumes a near jump
• If the target is in another code segment, the programmer must warn the assembler by writing
JMP FAR PTR TARGET
• If the jump address is within 128 bytes of the current instruction, the programmer
can specify the shorter (2-byte) jump by writing
JMP SHORT TARGET
• Pass 1 reserves 3 bytes for a jump instruction unless FAR PTR or SHORT is specified
• Phase error: occurs if the target address requires a far jump and the programmer does not
specify FAR PTR
• A far jump is similar to an extended format instruction in SIC/XE
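The relationship between the size reserved in Pass 1 and the phase error can be sketched as a simple size check. This models only the sizes quoted above (2, 3 and 5 bytes) and is not MASM's actual algorithm; `JUMP_SIZES`, `check_jump` and the segment names are hypothetical.

```python
# Bytes reserved in Pass 1 for each way of writing the jump.
JUMP_SIZES = {"SHORT": 2, None: 3, "FAR PTR": 5}


def check_jump(modifier, source_segment, target_segment):
    """Return the reserved size, or raise if the jump actually needs more room."""
    reserved = JUMP_SIZES[modifier]
    # A target in a different code segment needs the 5-byte far form.
    needed = 5 if source_segment != target_segment else reserved
    if needed > reserved:
        raise ValueError("phase error: far jump required but FAR PTR not specified")
    return reserved


print(check_jump(None, "CODE1", "CODE1"))       # 3
print(check_jump("FAR PTR", "CODE1", "CODE2"))  # 5
```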
Microsoft MASM Assembler
• Length of the assembled instruction depends on the types of
operands used
• Operands may be registers, immediate operands (1 to 4 bytes) or memory locations (the space
required depends on the location of the operand)
• Segments can be written in more than one part.
• If the SEGMENT directive specifies the same name as a previously
defined segment, it is considered as a continuation of the previous
segment
• All parts of the segments are gathered together during the assembly
process
• Similar to Program blocks in SIC/XE
• References between segments are handled by the assembler
MASM Assembler
• External references between separately assembled modules are
handled by Linker
• MASM directive PUBLIC is similar to EXTDEF in SIC/XE
• MASM directive EXTRN is similar to EXTREF in SIC/XE
• Produces object code in different formats to allow execution of the
program in different operating systems
• Also produces an instruction timing listing that shows the number of
clock cycles required to execute each machine instruction