SP Unit1 Assembler MRD
RESHMA PISE 21
Assembly Language Programming (ALP)
Assembly language is an example of middle-level language.
We use predefined words called mnemonics.
Binary code instructions in machine language are replaced with
mnemonics and operands.
As the computer cannot understand mnemonics, we use a translator
called Assembler to translate mnemonics into machine language.
Three features of Assembly Language:
➢ Mnemonic Operation Codes
➢ Symbolic operands
➢ Data declarations
Assembly Language Advantages
Mnemonic operation codes:
• Writing instructions in a middle-level language is easier than writing instructions in
machine language.
• Assembly language is more readable than machine language.
• Errors in a program are easier to find and fix.
• It is not necessary to memorize numeric operation codes.
Symbolic operands:
Symbolic names can be associated with data or instructions.
Symbolic names can be used as operands in assembly instructions as data operands or
address operands (need not know memory address).
Data declarations: Data can be declared in a variety of notations, including the
decimal notation.
Avoids conversion of constants into their machine representation.
ALP Statement Format
<Label> : Mnemonic <operand spec>[,<operand spec>…]
•AREA(4)
AREA: This directive is used to declare a code or data section. It typically includes
attributes that specify the characteristics of the area, such as its name, type (code
or data), and whether it is readable, writable, or executable.
(4): This is an offset from the base address of the defined area. In this case, it
indicates an address that is 4 units past the start of the area.
Comments: You need the comments for the readability of the program and debugging.
ALP:Mnemonic Operation Codes
Each statement has two operands, the first operand is a register and the
second operand refers to a memory word using a symbolic name and
optional displacement.
• The values are not protected by the assembler and can be changed by moving a new value into the
memory word.
• In the above example, the value of NUM can be changed by executing an instruction
MOVEM BREG, NUM
The DS (declare storage) statement reserves memory words and
associates names with them. Ex:
This is translated into an instruction with two operands – BREG and the value 6 as
an immediate operand.
2) A literal is an operand with the syntax ='<value>'.
➢ It differs from a constant because its location cannot be specified in the assembly
program.
➢ Its value does not change during the execution of the program.
➢ Literals are allocated memory addresses at the end of the program, or at a place
specified with the LTORG assembler directive.
Prof. Madhavi Dachawar
Comp Engg. Dept
Vishwakarma University
Language Processing
There are two phases in the design of an assembler:
Language processing = Analysis of source program + Synthesis of target program
Analysis of source program is specification of the source program
Basically it involves :
▪ Checking if program is syntactically and semantically correct.
N DC 5 101
509
4. Reserve storage for data.
M DS 1 510 ---
SUM DS 1 511 ---
END
Analysis Phase : Tasks Performed
The main function of the analysis phase is to build the symbol table and the literal table.
➢ Determine the addresses with which the symbolic names used in a program are associated.
To do this, we need to fix the addresses of all program instructions. This function is
called memory allocation.
➢ To implement memory allocation, a data structure called the location counter (LC) is maintained.
It is initialized to the constant specified in the START statement, or to 0 by default if
START specifies nothing.
To update the contents of LC, the analysis phase needs to know the lengths of the different instructions.
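The memory allocation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the statement tuples, the `INSTRUCTION_LENGTH` table (one word per instruction) and the function name `allocate_addresses` are hypothetical and do not come from the slides.

```python
# Hypothetical instruction lengths in words; a real assembler would read
# these from its mnemonic table.
INSTRUCTION_LENGTH = {"MOVER": 1, "MOVEM": 1, "ADD": 1, "SUB": 1, "STOP": 1}

def allocate_addresses(statements):
    """Step a location counter (LC) over (label, opcode, operand) tuples
    and return {symbol: address}."""
    lc = 0                      # default when START gives no constant
    symtab = {}
    for label, opcode, operand in statements:
        if opcode == "START":
            lc = int(operand) if operand else 0
            continue
        if opcode == "END":
            break
        if label:
            symtab[label] = lc  # the symbol gets the current LC value
        if opcode == "DS":
            lc += int(operand)  # reserve <operand> words
        elif opcode == "DC":
            lc += 1             # one word holding the constant
        else:
            lc += INSTRUCTION_LENGTH[opcode]
    return symtab

program = [
    (None, "START", "200"),
    (None, "MOVER", "AREG, N"),
    ("AGAIN", "ADD", "AREG, N"),
    (None, "STOP", ""),
    ("N", "DS", "1"),
    (None, "END", ""),
]
print(allocate_addresses(program))
```

With START 200 and one-word instructions, AGAIN lands on the second instruction and N on the word after STOP.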
[Figure: data structures of the assembler. Legend: → data access, --> control access]
Mnemonic Table
Symbol Table
Symbol    Address
N         210
AGAIN     230
Literal Table
Literal   Address
‘5’       230
‘8’       231
Design of Assembler : Pass structure
N DC 10
Two Pass Assembler
[Figure: Assembly program → Pass 1 → Intermediate file → Pass 2 → Object codes; the Literal Table is among the data structures shown]
Advanced Assembler Directives
➢ ORG/ ORIGIN
Syntax : ORG <address spec>
This directive indicates that LC should be set to the address given by <address spec>.
Ex : ORG 400
➢ EQU
Syntax : <symbol> EQU <address spec>
where <address spec> is a <constant> or an <operand spec>
Ex : MAX EQU 100 or A EQU B
➢ LTORG
A programmer can specify where literals should be placed.
By default, assembler places all the literals after the END statement.
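A minimal sketch of how ORG/ORIGIN and EQU might be processed, assuming a symbol table held as a Python dict. The helper names `eval_address_spec` and `apply_directive` are hypothetical, and the support for `<symbol>+<constant>` / `<symbol>-<constant>` is an assumption about what an operand spec may contain.

```python
def eval_address_spec(spec, symtab):
    """Evaluate '<constant>', '<symbol>' or (assumed) '<symbol>+/-<constant>'."""
    spec = spec.replace(" ", "")
    for sign in "+-":
        if sign in spec:
            name, offset = spec.split(sign, 1)
            base = symtab[name]
            return base + int(offset) if sign == "+" else base - int(offset)
    return int(spec) if spec.isdigit() else symtab[spec]

def apply_directive(label, opcode, operand, lc, symtab):
    """Process ORG/ORIGIN or EQU and return the location counter afterwards."""
    value = eval_address_spec(operand, symtab)
    if opcode in ("ORG", "ORIGIN"):
        return value              # LC is set to the given address
    if opcode == "EQU":
        symtab[label] = value     # the symbol gets the address; LC is unchanged
        return lc
    raise ValueError(f"not a handled directive: {opcode}")

symtab = {"B": 210}
lc = apply_directive(None, "ORG", "400", 200, symtab)   # ORG 400
lc = apply_directive("MAX", "EQU", "100", lc, symtab)   # MAX EQU 100
lc = apply_directive("A", "EQU", "B", lc, symtab)       # A EQU B
print(lc, symtab)
```

Note that EQU only adds a symbol table entry, while ORG only changes LC; neither generates machine code.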
Literals
The assembler allocates memory to the literals of a literal pool at every LTORG statement and
at the END statement.
The pool contains all literals used in the program since the start of the program or since the
last LTORG statement.
LC    Statement
      START 200
200   ADD AREG, =‘5’
202   SUB BREG, =‘8’
      ……
230   LTORG               First literal pool: the LTORG statement allocates the addresses 230 and 231 to the literals ‘5’ and ‘8’.
232   MOVER AREG, BREG
234   MOVER CREG, =‘3’
      …..
250   END                 Second literal pool: the END statement allocates the address 250 to the literal ‘3’.
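The pooling rule in this example can be replayed with a short Python sketch. The `(lc, opcode, literal)` tuple layout and the function name `allocate_literals` are assumptions made for illustration; the LC values are copied from the listing, and the elided statements are simply left out.

```python
def allocate_literals(statements):
    """Return one {literal: address} dict per literal pool.

    statements: (lc, opcode, literal_or_None) tuples in program order.
    """
    pools, pending = [], []
    for lc, opcode, literal in statements:
        if literal is not None and literal not in pending:
            pending.append(literal)          # literal joins the current pool
        if opcode in ("LTORG", "END"):
            # the pool is placed starting at the current LC, one word each
            pools.append({lit: lc + i for i, lit in enumerate(pending)})
            pending = []                     # a new pool starts here
    return pools

example = [
    (200, "ADD", "5"),
    (202, "SUB", "8"),
    (230, "LTORG", None),
    (232, "MOVER", None),
    (234, "MOVER", "3"),
    (250, "END", None),
]
print(allocate_literals(example))
```

The first pool receives addresses 230 and 231, the second receives 250, as in the listing.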
Data Structures in Pass I
1. Location Counter : LC
2. OPTAB – a table of mnemonic op codes, static table
Implemented as an array or hash table, so that it is easy to search
◦ Contains mnemonic op code, class and mnemonic info
◦ Class field indicates whether the op code corresponds to an imperative statement (IS),
a declaration statement (DL) or an assembler directive (AD)
◦ For IS, mnemonic info field contains the pair (machine opcode, instruction length)
◦ Else, it contains the ID of the routine to handle the AD or DL statement.
Ex : MOVER IS (04, 1)
BC IS (07, 1)
LTORG AD R#5
DS DL R#1
DC DL R#2
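The example entries above could be held in a hash table. The following Python sketch uses a dict for that; the `lookup` function and the shape of its result are hypothetical, only the five entries come from the slide.

```python
# OPTAB as a static dict: mnemonic -> (class, mnemonic info)
OPTAB = {
    "MOVER": ("IS", (4, 1)),   # (machine opcode, instruction length)
    "BC":    ("IS", (7, 1)),
    "LTORG": ("AD", "R#5"),    # ID of the routine that handles the directive
    "DS":    ("DL", "R#1"),
    "DC":    ("DL", "R#2"),
}

def lookup(mnemonic):
    """Return the OPTAB entry as a dict whose fields depend on the class."""
    cls, info = OPTAB[mnemonic]
    if cls == "IS":
        opcode, length = info
        return {"class": cls, "opcode": opcode, "length": length}
    return {"class": cls, "routine": info}

print(lookup("MOVER"))
print(lookup("LTORG"))
```

Pass I would use the `length` field to advance LC for imperative statements and dispatch on `routine` for the other two classes.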
Data Structures in Pass I
3. SYMTAB - Symbol Table
◦ A SYMTAB entry contains primarily symbol name and address . Implemented as hash table.
Index no. Symbol Address
1 LOOP 502
2 AGAIN 514
3 NUM 550
5. POOLTAB – Pool Table
POOLTAB contains the LITTAB index of the first literal of each literal pool.
When an LTORG / END statement is processed, literals in the current pool are
allocated addresses starting with the current LC value.
Literal no.
#1
#3
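One possible way to keep SYMTAB, a literal table (LITTAB) and POOLTAB together is sketched below in Python. The class name `PassOneTables` and its methods are hypothetical. Python lists are 0-based, so the POOLTAB entries #1 and #3 shown above appear as 0 and 2 here.

```python
class PassOneTables:
    def __init__(self):
        self.symtab = {}          # symbol -> address
        self.littab = []          # [literal, address] in order of appearance
        self.pooltab = []         # LITTAB index of the first literal of each pool
        self._unallocated_from = 0

    def define_symbol(self, name, address):
        if name in self.symtab:
            raise ValueError(f"duplicate symbol: {name}")
        self.symtab[name] = address

    def note_literal(self, literal):
        if self._unallocated_from == len(self.littab):
            self.pooltab.append(len(self.littab))   # first literal of a new pool
        self.littab.append([literal, None])         # address filled in later

    def close_pool(self, lc):
        """On LTORG / END: allocate the current pool from lc, return the new lc."""
        for entry in self.littab[self._unallocated_from:]:
            entry[1] = lc
            lc += 1
        self._unallocated_from = len(self.littab)
        return lc

tables = PassOneTables()
tables.define_symbol("LOOP", 502)
tables.note_literal("5")
tables.note_literal("8")
after_ltorg = tables.close_pool(230)   # LTORG with LC = 230
tables.note_literal("3")
after_end = tables.close_pool(250)     # END with LC = 250
print(tables.littab, tables.pooltab)
```

A pool start is recorded only when the first literal of a new pool is seen, so an LTORG with no pending literals adds nothing to POOLTAB.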
Algorithm - First Pass of 2- Pass Assembler
For an imperative statement, code is the instruction opcode in the machine language, taken from
the MOT.
e.g. MOVER => (IS, 04) // 04 is the opcode of MOVER
For declarations and assembler directives, code is the index number within the class.
Thus, (AD, 05) stands for assembler directive number 5, which is LTORG.
Intermediate Code: Variant I
Representation of First operand :
A single digit number which is a code for a register or the
condition code.
(Ex: 1 – AREG or 1 – LT etc.)
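A small Python sketch of how one statement might be turned into Variant I intermediate code. Several details are assumptions, not taken from the slides: the register and condition code numbering beyond AREG = 1 and LT = 1, the `(S, n)` form for a symbolic second operand (a common textbook convention), and the function name `encode`.

```python
# (class, code) pairs; MOVER and BC follow the OPTAB example entries.
OPCODES = {"MOVER": ("IS", 4), "BC": ("IS", 7)}

# Assumed numbering; the slides only state AREG = 1 and LT = 1.
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}
CONDITIONS = {"LT": 1, "LE": 2, "EQ": 3, "GT": 4, "GE": 5, "ANY": 6}

def encode(mnemonic, first, second, symtab_index):
    """Build a Variant I style string for '<mnemonic> <first>, <second>'.

    symtab_index maps a symbol to its entry number in the symbol table.
    """
    cls, code = OPCODES[mnemonic]
    # the first operand is a single digit: a register or a condition code
    op1 = CONDITIONS[first] if mnemonic == "BC" else REGISTERS[first]
    # assumed form of the second operand: (S, symbol table entry number)
    return f"({cls},{code:02d}) {op1} (S,{symtab_index[second]})"

print(encode("MOVER", "AREG", "N", {"N": 3}))
print(encode("BC", "LT", "LOOP", {"LOOP": 1}))
```

The same digit 1 stands for AREG in the first call and for LT in the second, which is why the opcode decides how the first operand is read.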
Pass 2 of 2–Pass Assembler
Functions :
➢ Process IC to synthesize the machine code.
➢ LC Processing
➢ Error Reporting
f) If size≠ 0 then
i) Move contents of machine_code_buffer to the
address code_area_address + Loc_cntr;
ii) Loc_cntr = Loc_cntr + size;
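Step (f) can be sketched in Python as follows. Here the code area is modelled as a list of words, so `code_area_address` is implicit (index 0 of the list); that modelling choice and the function name `emit` are assumptions for illustration.

```python
def emit(code_area, loc_cntr, machine_code_buffer):
    """If the buffer is non-empty, copy it into the code area at loc_cntr
    and return the advanced location counter."""
    size = len(machine_code_buffer)
    if size != 0:
        # move the buffer contents to code_area_address + loc_cntr
        code_area[loc_cntr:loc_cntr + size] = machine_code_buffer
        loc_cntr = loc_cntr + size
    return loc_cntr

code_area = [None] * 4
lc = emit(code_area, 0, ["04 1 003"])   # an imperative statement: one word
lc = emit(code_area, lc, [])            # a directive: nothing to move
lc = emit(code_area, lc, ["00 0 010"])  # a DC statement: one word
print(lc, code_area)
```

Statements that generate no code leave both the code area and the location counter untouched.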
N DC 10
Single Pass Assembler:Backpatching
The problem of forward references is handled using a process called backpatching.
The operand field of an instruction containing a forward reference is left blank.
Ex: MOVER AREG, N
is only partially synthesized, since N is a forward reference. The second operand field in
the machine code is left blank; it is filled in later.
◦ Each entry in the TII (Table of Incomplete Instructions) is a pair of the form (<instruction address>, <symbol referenced>)
◦ When the END statement is processed, the symbol table would contain the addresses of all
symbols defined in the source program.
Single Pass Assembler:Backpatching
◦ When the END statement is processed, the TII contains information about all forward
references.
◦ At the end of the program, report an error if a symbol is not found in the symbol table
(it means the symbol is undefined).
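Backpatching with a TII can be sketched in Python as below. The statement layout `(label, opcode, register, operand)`, the one-word instruction length and the function name `assemble_single_pass` are assumptions for illustration, not the algorithm as given in the slides.

```python
def assemble_single_pass(statements):
    """Return (code, errors); code maps address -> [opcode, register, address]."""
    symtab, tii, code, errors = {}, [], {}, []
    lc = 0
    for label, opcode, register, operand in statements:
        if label:
            symtab[label] = lc
        if opcode == "END":
            break
        if opcode == "DS":
            lc += int(operand)             # reserve storage, no code generated
            continue
        if operand in symtab:
            address = symtab[operand]      # backward reference: already known
        else:
            address = None                 # forward reference: leave blank
            tii.append((lc, operand))      # (instruction address, symbol)
        code[lc] = [opcode, register, address]
        lc += 1
    # END reached: fill in the blanks using the completed symbol table
    for instruction_address, symbol in tii:
        if symbol in symtab:
            code[instruction_address][2] = symtab[symbol]
        else:
            errors.append(f"undefined symbol: {symbol}")
    return code, errors

source = [
    (None, "MOVER", "AREG", "N"),   # N is a forward reference
    (None, "ADD", "BREG", "X"),     # X is never defined
    ("N", "DS", None, "1"),
    (None, "END", None, None),
]
code, errors = assemble_single_pass(source)
print(code, errors)
```

The operand of MOVER is patched to the address of N once END is reached, while the reference to X stays blank and is reported as undefined.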
Mind Map of Assembler