Module 2

The document provides an overview of assemblers, detailing their role in translating assembly language into machine code and the structure of assembly language itself. It discusses the advantages and disadvantages of using assembly language, types of assembly statements, and assembler directives, along with the design procedures for single and multi-pass assemblers. Additionally, it outlines the algorithms for both passes of an assembler, addressing issues such as forward references and modularity in design.

Uploaded by

mralpha263

ASSEMBLERS

Objectives and Outcome


Objectives-
The main objective of this unit is to introduce
assemblers.
It highlights assembly language syntax and
semantics, and the design of an assembler.
It explores the data structures, databases, and algorithms
for Pass 1 and Pass 2 of an assembler.
Introduction
• An assembler is a program that accepts an
assembly language program as input and produces its
machine language equivalent, along with
information for the loader.
Role of Assembler

Source Program -> Assembler -> Object Code -> Linker -> Executable Code -> Loader
What is assembly language?

Assembly language is a family of low-level languages for
programming computers, microprocessors, microcontrollers, etc.
It is a symbolic representation of machine code that uses symbolic
codes, or mnemonics, as instructions.
This representation is usually defined by the hardware manufacturer
and is based on abbreviations (called mnemonics) that help the
programmer remember individual instructions, registers, etc.
To process an assembly language program we need a language
translator called an assembler.
Assembler: a translator that converts assembly
language code into machine code.
Advantages of Assembly Language

Writing a program in assembly language is more convenient than
writing it in machine language, because it frees the programmer from the
burden of remembering operation codes and the addresses of
memory locations.
An assembly program is written using symbols (mnemonics).
An assembly program is more readable than machine code.
Assembly language is machine dependent, so it gives direct access to the
registers and instructions of the target machine.
Disadvantages of Assembly Language

It is a machine-oriented language: it requires familiarity with the
machine architecture and an understanding of the available
instruction set.
Getting an assembly language program to run takes longer
than with machine language, because a separate language
translator (the assembler) must first translate the
assembly program into binary machine code.
Assembly Language Programming
Types of Assembly Language statements:
• Imperative statements
– An imperative statement indicates an action to be
performed during execution of the assembled program.
Ex: ADD 1,FOUR

• Declarative statements
– These statements declare storage areas or
constants in the program.
Syntax:
[Label] DS <constant>
[Label] DC '<value>'
Ex: A   DS 1F
    ONE DC F'1'
Assembler Directives
– Are not translated into machine instructions
– Provide information to the assembler
Ex: START 100
    USING *, 15
Advanced Assembler Directives

ORIGIN-
Syntax: ORIGIN <address specification>
This directive instructs the assembler to put the address given by <address
specification> in the location counter.

EQU-
Syntax: <symbol> EQU <address specification>
The statement simply associates the name <symbol> with the address
specified by <address specification>. The address in the location
counter is not affected.

LTORG-
The LTORG directive, which stands for 'origin for literals', allows a
programmer to specify where literals should be placed.
– If a program does not use an LTORG statement, the assembler
enters all literals used in the program into a single pool and allocates
memory to them when it encounters the END statement.
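The effect of LTORG on literal placement can be sketched as follows. This is a minimal illustration, not a real assembler: it assumes every statement and every literal occupies one word, and the function name `allocate_literals` and the tuple-based statement list are hypothetical.

```python
def allocate_literals(statements, start=0):
    """Assign addresses to literals, pool by pool.

    `statements` is a list of (mnemonic, literal_or_None) pairs.
    Assumption: each statement and each literal occupies one word.
    """
    loc_cntr = start
    pending = []   # literals seen since the last LTORG
    placed = []    # (literal, address) in allocation order
    for mnemonic, literal in statements:
        if mnemonic in ("LTORG", "END"):
            # Flush the current pool at the current location counter.
            for lit in pending:
                placed.append((lit, loc_cntr))
                loc_cntr += 1
            pending = []
        else:
            if literal is not None and literal not in pending:
                pending.append(literal)
            loc_cntr += 1
    return placed


program = [
    ("MOVER", "='5'"),
    ("ADD", "='1'"),
    ("LTORG", None),
    ("SUB", "='2'"),
    ("STOP", None),
    ("END", None),
]
print(allocate_literals(program, start=200))
```

The first two literals are placed where LTORG appears; the third is placed only when END is reached. Removing the LTORG entry from `program` puts all three literals in a single pool after STOP.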
Simple set of instructions & their opcodes

Opcode  Mnemonic  Comment
00      STOP      Stops execution
01      ADD       Performs addition
02      SUB       Performs subtraction
03      MULT      Performs multiplication
04      MOVER     Moves contents to register
05      MOVEM     Moves contents to memory reference
06      COMP      Compares
07      BC        Branch on condition
08      DIV       Performs division
09      READ      Reads input
10      PRINT     Prints output
11      JMP       Jumps to specified location
12      XCHG      Exchanges contents
13      STORE     Stores data
14      INT       Calls interrupts
Sample Assembly prog. And its code
Machine code format: <opcode> <reg-operand> <mem>

St  Label   Mnemonic  Operands      LC   Machine Code
1           START     501
2           READ      NUM           501  09 0 513
3           MOVER     REG1,ONE      502  04 1 515
4           MOVEM     REG1,TEMP     503  05 1 516
5   REPEAT  MULT      REG1,TEMP     504  03 1 516
6           MOVER     REG2,TEMP     505  04 2 516
7           ADD       REG2,ONE      506  01 2 515
8           MOVEM     REG2,TEMP     507  05 2 516
9           COMP      REG2,NUM      508  06 2 513
10          BC        LE,REPEAT     509  07 2 504
11          MOVEM     REG1,ANSWER   510  05 1 514
12          PRINT                   511  10 0 514
13          STOP                    512  00 0 000
14  NUM     DS        1             513
15  ANSWER  DS        1             514
16  ONE     DC        1             515  00 0 001
17  TEMP    DS        1             516
18          END
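The machine code column of the sample program can be reproduced from the opcode table and the symbol addresses. The sketch below is illustrative only: the register codes (REG1 = 1, REG2 = 2) and the condition code (LE = 2) are inferred from the listing, and the function name `encode` is hypothetical.

```python
# Opcode table from the slides (mnemonic -> opcode).
OPCODES = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
           "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10}
# Register and condition codes as they appear in the sample listing (inferred).
REGISTERS = {"REG1": 1, "REG2": 2}
CONDITIONS = {"LE": 2}
# Symbol addresses taken from the LC column of the sample program.
SYMBOLS = {"REPEAT": 504, "NUM": 513, "ANSWER": 514, "ONE": 515, "TEMP": 516}


def encode(mnemonic, operands=""):
    """Return the '<opcode> <reg> <mem>' string for one imperative statement."""
    reg, mem = 0, 0
    for op in filter(None, (o.strip() for o in operands.split(","))):
        if op in REGISTERS:
            reg = REGISTERS[op]
        elif op in CONDITIONS:
            reg = CONDITIONS[op]
        else:
            mem = SYMBOLS[op]   # anything else is treated as a symbol
    return f"{OPCODES[mnemonic]:02d} {reg} {mem:03d}"


print(encode("MOVER", "REG1,ONE"))   # statement 3
print(encode("BC", "LE,REPEAT"))     # statement 10
print(encode("STOP"))                # statement 13
```

Each call looks up the opcode, the register or condition code, and the memory address, then joins them in the order used by the Machine Code column.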
Assembler Directives
START <constant>
END <operand spec>
ORIGIN <address specification>
<symbol> EQU <address specification>
LTORG
USING <symbol>,<base register>
PUBLIC & EXTRN
SEGMENT, ENDS, ASSUME
PURGE
Pre-defined Tables
Machine Instruction Format
Intermediate Representation

Intermediate code can be in Variant I or Variant II form.

Variant I
The mnemonic field contains a pair of the form
(statement class, code)
– where statement class is one of IS (imperative statement),
DL (declaration), or AD (assembler directive).
• For an imperative statement, code is the instruction opcode
in the machine language.
• For declarations and assembler directives, code is
an ordinal number within the class.
• For example, (AD, 01) stands for assembler directive number 1, which
is the directive START.
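The (statement class, code) pair can be sketched as a simple table lookup. Only (AD, 01) = START is given in the text; the other ordinal numbers in `AD` and `DL` below are assumptions made for illustration, and `variant1` is a hypothetical name.

```python
# Imperative statements: code is the machine opcode (from the opcode table).
IS = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5}
# Only (AD, 01) = START is stated in the text; the other ordinals are assumed.
AD = {"START": 1, "END": 2, "ORIGIN": 3, "EQU": 4, "LTORG": 5}
DL = {"DC": 1, "DS": 2}


def variant1(mnemonic):
    """Return the (statement class, code) pair for a mnemonic."""
    for cls, table in (("IS", IS), ("DL", DL), ("AD", AD)):
        if mnemonic in table:
            return (cls, table[mnemonic])
    raise KeyError(f"unknown mnemonic: {mnemonic}")


for m in ("START", "MOVER", "DS"):
    cls, code = variant1(m)
    print(f"{m:6} -> ({cls}, {code:02d})")
```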
Intermediate Representation

Variant II
This variant differs from Variant I of the intermediate
code in that operands that are symbols, condition codes, or
CPU registers are not processed: they stay in their source form.
General Design Procedure of an Assembler
1. Specify the problem
2. Specify data structures
3. Define format of data structures
4. Specify algorithm
5. Look for modularity (the capability of one program to be
subdivided into independent programming units)
6. Repeat 1 through 5 on modules.
Statement of Problem
• The assembler must do the following:

1) Generate instructions
a) Evaluate the mnemonic in the operation field.
b) Evaluate the subfields.
2) Process pseudo-ops.
Types of Assembler

•Single pass Assembler


•Multi-pass Assembler
Problem of Forward Reference
• A forward reference occurs when a symbol is used
before it is defined. At that point the assembler does
not yet know the symbol's address.
Example: FIVE, FOUR, and TEMP are all used before they are defined.
JOHN     START  0
         USING  *, 15
         L      1, FIVE
         A      1, FOUR
         ST     1, TEMP
FOUR     DC     F'4'
FIVE     DC     F'5'
TEMP     DS     1F
         END
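The forward-reference problem in the JOHN program can be demonstrated with a short sketch. It assumes every statement occupies 4 bytes and omits START, USING, and END; the function names are hypothetical.

```python
# The JOHN program as (label, mnemonic, operand) triples.
PROGRAM = [
    (None,   "L",  "FIVE"),
    (None,   "A",  "FOUR"),
    (None,   "ST", "TEMP"),
    ("FOUR", "DC", "F'4'"),
    ("FIVE", "DC", "F'5'"),
    ("TEMP", "DS", "1F"),
]
WORD = 4  # assumed size in bytes of every statement above


def single_pass_unresolved(program):
    """Return operands that are not yet defined when they are first used."""
    symtab, unresolved = {}, []
    for index, (label, mnemonic, operand) in enumerate(program):
        if label:
            symtab[label] = index * WORD
        if mnemonic not in ("DC", "DS") and operand not in symtab:
            unresolved.append(operand)
    return unresolved


def two_pass_addresses(program):
    """Pass 1 collects every label; pass 2 can then resolve every operand."""
    symtab = {label: i * WORD for i, (label, _, _) in enumerate(program) if label}
    return [symtab[op] for _, m, op in program if m not in ("DC", "DS")]


print(single_pass_unresolved(PROGRAM))
print(two_pass_addresses(PROGRAM))
```

A single scan finds none of the three operands in the symbol table at the point of use. Collecting all labels first, as a two-pass assembler does, lets every operand be resolved.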
Steps for design procedure

1) Specify the problem

2) Specify Data Structures

3) Define format of Data Structures

4) Specify Algorithms

5) Look for modularity

6) Repeat 1 through 5 on modules


Step 1- Specify the problem
Pass1: Define symbols & literals.
1) Determine length of m/c instruction [MOT]
2) Keep track of Location Counter [LC]
3) Remember values of symbols [ST]
4) Process some pseudo ops [EQU, DS etc.] [POT]
5) Remember literals [LT]


Step 2- Specify Data structure

Pass1: Databases
• Input source program.
• Location counter (LC), used to keep track of each
instruction's address.
• Machine operation table (MOT) [symbolic mnemonic & length].
• Pseudo operation table (POT) [symbolic mnemonic &
action].
• Symbol table (ST), to store each label & its value.
• Literal table (LT), to store each literal & its assigned
location.
• Copy of the input, to be used later by Pass 2.
Pass2: Generate object program
1) Look up value of symbols [ST]
2) Generate instruction [MOT]
3) Generate data (for DS, DC & literals)
4) Process pseudo ops[POT]
Step 2- Specify Data structure

Pass2: Databases
• Copy of the source program input to Pass 1.
• Location counter (LC).
• MOT [mnemonic, length, binary m/c op code, format].
• POT [mnemonic & action to be taken in Pass 2].
• ST [prepared by Pass 1: label & value].
• Base table (or register table), which indicates which registers
are currently specified by the USING pseudo-op and what
their contents are.
• Literal table prepared by Pass 1 [literal name & value].
Step 3 -Format of Data Structures
• Machine Operation Table
– The mnemonic op-code is the key, and its value is the binary
op-code equivalent, which is used in
generating machine code.
– The instruction length is stored for updating the
location counter.
– The instruction format is used in forming the m/c
language equivalent.
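One possible layout for a Machine Operation Table entry is sketched below. The slides do not list MOT contents; the IBM System/360 entries here are added only as an illustration, and the names `MotEntry` and `MOT` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotEntry:
    opcode: int   # binary machine op-code
    length: int   # instruction length in bytes, added to the location counter
    fmt: str      # instruction format, used when assembling the operands


# Illustrative IBM System/360 entries (not listed in the slides).
MOT = {
    "L":  MotEntry(0x58, 4, "RX"),
    "A":  MotEntry(0x5A, 4, "RX"),
    "ST": MotEntry(0x50, 4, "RX"),
    "AR": MotEntry(0x1A, 2, "RR"),
}

loc_cntr = 0
for mnemonic in ("L", "AR", "ST"):
    entry = MOT[mnemonic]   # the mnemonic is the lookup key
    print(f"{loc_cntr:3} {mnemonic:3} opcode={entry.opcode:02X} format={entry.fmt}")
    loc_cntr += entry.length
```

The loop shows the three uses named above: the mnemonic is the key, the length advances the location counter, and the format tells the assembler how to lay out the operands.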
[Figure: Pass-II assembler flow chart]
Assembler Pass-I (Algorithm)
1. loc_cntr = 0; (default value)
   pooltab_ptr = 1;
   POOLTAB[1] = 1;
   littab_ptr = 1;
2. While next statement is not an END statement
   (a) If a label is present then
         this_label = symbol in label field;
         Enter (this_label, loc_cntr) in SYMTAB.
   (b) If an LTORG statement then
         Process the literals LITTAB[POOLTAB[pooltab_ptr]] ... LITTAB[littab_ptr - 1]
         to allocate memory and put the address in the address field.
         Update loc_cntr accordingly.
         pooltab_ptr = pooltab_ptr + 1;
         POOLTAB[pooltab_ptr] = littab_ptr;
   (c) If a START or ORIGIN statement then
         loc_cntr = value specified in operand field;
   (d) If an EQU statement then
         this_addr = value of <address spec>;
         Correct the SYMTAB entry for this_label to (this_label, this_addr);
   (e) If a declaration statement then
         code = code of the declaration statement;
         size = size of memory area required by DC/DS;
         loc_cntr = loc_cntr + size;
         Generate IC.
   (f) If an imperative statement then
         code = machine opcode from OPTAB;
         loc_cntr = loc_cntr + instruction length from OPTAB;
         If operand is a literal then
           this_literal = literal in operand field;
           LITTAB[littab_ptr] = this_literal;
           littab_ptr = littab_ptr + 1;
         else if operand is a symbol then
           this_entry = SYMTAB entry number of operand;
         Generate IC.
3. Processing the END statement
   (a) Perform step 2(b).
   (b) Generate IC.
   (c) Go to Pass II.
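The Pass I steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the algorithm's definitive form: every instruction, constant, and literal occupies one word; tables are 0-based lists instead of the 1-based arrays above; EQU and ORIGIN operands are a previously defined symbol and a plain number respectively; and the intermediate code is a simple tuple. The name `pass1` and the statement layout are hypothetical.

```python
IS = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
      "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10}
DL = {"DC": 1, "DS": 2}


def pass1(statements):
    """Build SYMTAB, LITTAB, POOLTAB and a simplified intermediate code."""
    loc_cntr = 0
    symtab = {}      # symbol -> address
    littab = []      # [literal, address]; address is filled at LTORG/END
    pooltab = [0]    # index in littab where each literal pool starts
    ic = []          # (loc_cntr, class, mnemonic, operand)

    def flush_pool():
        nonlocal loc_cntr
        for entry in littab[pooltab[-1]:]:
            entry[1] = loc_cntr
            loc_cntr += 1
        pooltab.append(len(littab))

    for label, mnemonic, operand in statements:
        if label and mnemonic != "EQU":
            symtab[label] = loc_cntr                     # step 2(a)
        if mnemonic in ("LTORG", "END"):                 # steps 2(b) and 3
            ic.append((loc_cntr, "AD", mnemonic, None))
            flush_pool()
        elif mnemonic in ("START", "ORIGIN"):            # step 2(c)
            loc_cntr = int(operand)
            ic.append((loc_cntr, "AD", mnemonic, operand))
        elif mnemonic == "EQU":                          # step 2(d)
            symtab[label] = symtab[operand]
        elif mnemonic in DL:                             # step 2(e)
            ic.append((loc_cntr, "DL", mnemonic, operand))
            loc_cntr += int(operand) if mnemonic == "DS" else 1
        elif mnemonic in IS:                             # step 2(f)
            mem = operand.split(",")[-1] if operand else None
            if mem and mem.startswith("="):
                littab.append([mem, None])
            ic.append((loc_cntr, "IS", mnemonic, operand))
            loc_cntr += 1
        else:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
    return symtab, littab, pooltab, ic


source = [
    (None,   "START", "200"),
    (None,   "MOVER", "REG1,='5'"),
    ("LOOP", "ADD",   "REG1,X"),
    (None,   "LTORG", None),
    ("X",    "DS",    "1"),
    (None,   "STOP",  None),
    (None,   "END",   None),
]
symtab, littab, pooltab, ic = pass1(source)
print(symtab)
print(littab)
print(pooltab)
```

Note that X is used by ADD before it is defined. Pass I records the operand in the intermediate code without resolving it and leaves the lookup to Pass II.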
Assembler Pass-II (Algorithm)
1. code_area_address = address of code_area;
   pooltab_ptr = 1;
   loc_cntr = 0;
2. While next statement is not an END statement
   (a) Clear machine_code_buffer;
   (b) If an LTORG statement
         Process the literals in LITTAB[POOLTAB[pooltab_ptr]] ...
         LITTAB[POOLTAB[pooltab_ptr + 1]] - 1 similar to the processing of
         constants in a DC statement, i.e. assemble the literals in
         machine_code_buffer.
         size = size of memory area required for literals;
         pooltab_ptr = pooltab_ptr + 1;
   (c) If a START or ORIGIN statement
         loc_cntr = value specified in operand field;
         size = 0;
   (d) If a declaration statement
         If a DC statement: assemble the constant in
         machine_code_buffer.
         size = size of memory area required by DC/DS;
   (e) If an imperative statement
         Get operand address from SYMTAB or LITTAB;
         Assemble instruction in machine_code_buffer;
         size = size of instruction;
   (f) If size != 0 then
         Move contents of machine_code_buffer to the address
         code_area_address + loc_cntr;
         loc_cntr = loc_cntr + size;
3. Processing the END statement
   (a) Perform steps 2(b) and 2(f).
   (b) Write code_area into an output file.
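The Pass II steps can be sketched in the same spirit. The sketch assumes one-word instructions and constants, 0-based tables, machine words written as '<opcode> <reg> <mem>' strings like the sample listing, and a POOLTAB with one more entry than the number of LTORG and END statements. The symbol and literal tables are written by hand here to stand in for the output of Pass I; `pass2` and the register codes are hypothetical.

```python
OPCODES = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
           "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10}
REGISTERS = {"REG1": 1, "REG2": 2}   # assumed register codes


def pass2(statements, symtab, littab, pooltab):
    """Return {address: machine word} for a list of (mnemonic, operand) pairs."""
    lit_addr = dict(littab)            # literal -> address assigned in Pass I
    code_area, loc_cntr, pool = {}, 0, 0
    for mnemonic, operand in statements:
        buffer, size = [], 0                               # step 2(a)
        if mnemonic in ("LTORG", "END"):                   # steps 2(b), 3(a)
            for literal, _ in littab[pooltab[pool]:pooltab[pool + 1]]:
                value = int(literal.strip("='"))
                buffer.append(f"00 0 {value:03d}")
            pool += 1
            size = len(buffer)
        elif mnemonic in ("START", "ORIGIN"):              # step 2(c)
            loc_cntr = int(operand)
        elif mnemonic == "DC":                             # step 2(d)
            buffer, size = [f"00 0 {int(operand):03d}"], 1
        elif mnemonic == "DS":
            size = int(operand)        # reserve space, assemble nothing
        else:                                              # step 2(e)
            reg, mem = 0, 0
            for op in (operand.split(",") if operand else []):
                if op in REGISTERS:
                    reg = REGISTERS[op]
                elif op.startswith("="):
                    mem = lit_addr[op]
                else:
                    mem = symtab[op]
            buffer, size = [f"{OPCODES[mnemonic]:02d} {reg} {mem:03d}"], 1
        for offset, word in enumerate(buffer):             # step 2(f)
            code_area[loc_cntr + offset] = word
        loc_cntr += size
    return code_area


statements = [("START", "200"), ("MOVER", "REG1,='5'"), ("ADD", "REG1,X"),
              ("LTORG", None), ("DS", "1"), ("STOP", None), ("END", None)]
code = pass2(statements,
             symtab={"LOOP": 201, "X": 203},
             littab=[("='5'", 202)],
             pooltab=[0, 1, 1])
for address in sorted(code):
    print(address, code[address])
```

Address 203 is reserved by DS but receives no machine word, which mirrors the rule that only DC statements and literals assemble data.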
V. Look for Modularity
This means identifying functions that can be treated as
independent units and taken through the entire design
process on their own. These functions can be implemented as separate
external subroutines, as internal subroutines, or as sections of
Pass I and Pass II.
Such functions are classified as:
• Multi-use
• Unique
