Module 3 (Part2)

This document discusses machine-independent features in programming, including literals, symbol defining statements, expressions, program blocks, and control sections. It provides details on literals, including how they are implemented using literal pools and handled by the assembler. Symbol defining statements like EQU and ORG are also covered, with EQU used to define symbols and assign them values in the symbol table. The document explains how these machine-independent features do not depend on the architecture and are more related to software.

Uploaded by

vidhya_bineesh

MODULE 3

part2
Machine-Independent features:
• These are the features which do not depend on the architecture
of the machine. Such features are more related to software
than to machine architecture. These are:
• Literals
• Symbol defining statements
• Expressions
• Program blocks
• Control sections


Literals
• It is often convenient for a programmer to write the value of a constant
operand as part of the instruction that uses it.
• This avoids defining the constant elsewhere in the program and making up a
label for it. Such an operand is called a literal because the value is stated
literally in the instruction.
• A literal is written with the prefix = followed by a specification of the
literal value.

Example:
001A  ENDFIL  LDA    =C'EOF'    032010
001D          ...
              LTORG
002D  *       =C'EOF'           454F46

• The example above shows a 3-byte operand whose value is the character
string EOF. The object code for the instruction is also shown.
• The object code contains the displacement of the location where the literal
value is stored, relative to the program counter.
• In the example the literal is at location 002D and the program counter holds
001D once the instruction has been fetched, so the displacement is
002D - 001D = 010.
• As another example, the statement below uses a 1-byte literal with the
hexadecimal value 05:
215  1062  WLOOP  TD  =X'05'  E32011
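The displacement arithmetic in these two examples can be checked with a short sketch. It assumes SIC/XE format 3 with simple addressing (n=1, i=1) and PC-relative mode (p=1). The function name is hypothetical; the opcodes 00 (LDA) and E0 (TD) are the standard SIC/XE values.

```python
def format3_pc_relative(opcode, instr_addr, target_addr):
    """Assemble a 3-byte SIC/XE instruction using PC-relative addressing."""
    pc = instr_addr + 3                # PC already points past this instruction
    disp = target_addr - pc            # signed 12-bit displacement
    if not -2048 <= disp <= 2047:
        raise ValueError("target is out of PC-relative range")
    first_byte = opcode | 0b11         # n=1, i=1: simple addressing
    flags = 0b0010                     # x=0, b=0, p=1, e=0
    word = (first_byte << 16) | (flags << 12) | (disp & 0xFFF)
    return f"{word:06X}"


# ENDFIL LDA =C'EOF' at 001A, literal at 002D
print(format3_pc_relative(0x00, 0x001A, 0x002D))   # 032010
# WLOOP TD =X'05' at 1062, literal at 1076
print(format3_pc_relative(0xE0, 0x1062, 0x1076))   # E32011
```

Both results match the object code shown in the listings.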
Literals vs. Immediate Operands
Immediate Operands
• The operand value is assembled as part of the machine instruction
• e.g. 55  0020  LDA  #3  010003
Literals
• The assembler generates the specified value as a constant at some other
memory location
• e.g. ENDFIL  LDA   =C'EOF'
• Same as:
       ENDFIL  LDA   EOF
       EOF     BYTE  C'EOF'
Literal - Implementation
Literal pools
• All the literal operands used in a program are gathered together into one or
more literal pools. Normally the pool is placed at the end of the program. The
assembly listing of a program containing literals usually includes a listing of
this literal pool, which shows the assigned addresses and the generated data
values.
Eg: 1076  *  =X'05'  05

• In some cases it is desirable to place literals at some other location in the
object program. The assembler directive LTORG is used for this.
• When the assembler encounters LTORG, it creates a literal pool that contains
all the literal operands used since the previous LTORG (or since the beginning
of the program).
• The pool is placed in the object program at the point where LTORG is
encountered. This makes it possible to keep the literals close to the
instructions that use them.

              LTORG
002D  *       =C'EOF'    454F46
Recognizing duplicate literals
• The assembler should recognize when the same literal is used in more than
one place in a program and store only one copy of the data value.
• For example, the literal =X'05' is used in different instructions in a
program, but only one data area with this value is created.
215  1062  WLOOP  TD  =X'05'
           ...
230  106B         WD  =X'05'
           ...
     1076  *      =X'05'      05

• Duplicate literals can be identified by comparing the character strings that
define them, e.g. =C'EOF'.
• Alternatively, the generated data values can be compared. For example, the
literals =C'EOF' and =X'454F46' specify identical operand values.
• The value of some literals depends on their location in the program. An
example is a literal that refers to the current value of the location counter
(denoted by the symbol *). Such literals are useful for loading base registers.
Eg:     BASE  *
        LDB   =*
• Such literal operands have different values in different places of the
program, since they hold the current value of the location counter. Two
occurrences of =* must therefore not be treated as duplicates.
Structure of LITTAB
• LITTAB contains, for each literal, the literal name, the operand value and
length, and the address assigned to the operand.
• The literal table is usually organized as a hash table, using the literal name
or value as the key.
Handling of literals by the assembler
• During Pass 1:
  o Each literal operand encountered is searched for in the literal table.
  o If the literal already exists, no action is taken; if it is not present, the
  literal is added to LITTAB (leaving the address unassigned).
  o When Pass 1 encounters an LTORG statement or the end of the program,
  the assembler scans the literal table. At this time each literal currently in
  the table that has no address yet is assigned one.
  o As addresses are assigned, the location counter is updated to reflect the
  number of bytes occupied by each literal.
• During Pass 2:
  o The assembler searches LITTAB for each literal operand encountered in an
  instruction and uses the address assigned to it when generating the object code.
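A minimal sketch of this Pass 1 bookkeeping, assuming literals are written as =C'...' or =X'...'. The names Littab, add, assign_pool and literal_bytes are hypothetical, and a Python dict stands in for the hash table. The pool addresses 002D and 1076 are the ones used in the listings of this module.

```python
def literal_bytes(literal):
    """Return the data value of a literal such as =C'EOF' or =X'05'."""
    kind, body = literal[1], literal[3:-1]
    if kind == "C":
        return body.encode("ascii")
    if kind == "X":
        return bytes.fromhex(body)
    raise ValueError(f"unsupported literal: {literal}")


class Littab:
    def __init__(self):
        # literal name -> value, length, address (None until a pool is built)
        self.entries = {}

    def add(self, literal):
        """Pass 1: enter a literal operand; duplicates by name are ignored."""
        if literal not in self.entries:
            value = literal_bytes(literal)
            self.entries[literal] = {
                "value": value, "length": len(value), "address": None}

    def assign_pool(self, locctr):
        """On LTORG or END: give addresses to literals that have none yet."""
        for entry in self.entries.values():
            if entry["address"] is None:
                entry["address"] = locctr
                locctr += entry["length"]
        return locctr                  # updated location counter


littab = Littab()
littab.add("=C'EOF'")
littab.add("=C'EOF'")                  # duplicate, not stored again
locctr = littab.assign_pool(0x002D)    # LTORG encountered at 002D
littab.add("=X'05'")
locctr = littab.assign_pool(0x1076)    # end of program at 1076
```

This sketch detects duplicates by name only. Comparing `literal_bytes` results instead would also treat =C'EOF' and =X'454F46' as the same literal.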
Symbol-Defining Statements
1. EQU Statement
2. ORG Statement

Define symbol table using EQU and ORG

EQU Statement
• Most assemblers provide an assembler directive that allows the programmer to
define symbols and specify their values. The directive used for this is EQU (equate).
• The general form of the statement is
        symbol  EQU  value
• This statement defines the given symbol (i.e., enters it in SYMTAB) and assigns
to it the value specified. The value can be a constant or an expression involving
constants and previously defined symbols. One common usage is to define symbolic
names that can be used in place of numeric values to improve readability.
• For example, instead of
        +LDT  #4096
we can write
MAXLEN  EQU   4096
        +LDT  #MAXLEN
• When the assembler encounters the EQU statement, it enters the symbol
MAXLEN in the symbol table along with its value.
• During the assembly of the LDT instruction the assembler searches SYMTAB
for this symbol and uses its value as the operand in the instruction.
• If the maximum length has to change from 4096 to 1024 and 4096 is written as
an immediate value in each instruction, we have to scan the whole program and
change every place where 4096 is used.
• If the instructions refer to the value through the symbol defined by EQU,
only the value of MAXLEN in the EQU statement has to be changed (once).

• Another common usage of the EQU statement is defining mnemonic names for
the general-purpose registers.
• An assembler may recognize register mnemonics such as A (accumulator) and
X (index register) directly. Some instructions, however, require register
numbers in place of names, for example RMO 0,1 instead of RMO A,X. The
programmer can assign the numeric values to the register names using EQU:
A  EQU  0
X  EQU  1
• These statements cause the symbols A and X to be entered into the symbol table
with their respective values. An instruction such as RMO A,X would then be allowed.

• As another usage, consider a machine that has many general-purpose registers
named R1, R2, ...; some may be used as base registers and some as
accumulators. Their usage may change from one program to another. In
this case we can define these assignments using EQU statements:

BASE   EQU  R1
INDEX  EQU  R2
COUNT  EQU  R3

• One restriction on EQU is that every symbol occurring on the right-hand side
must already be defined. For example, the following sequence is not valid:
BETA   EQU   ALPHA
ALPHA  RESW  1
• Here ALPHA is used to define BETA before ALPHA itself is defined, so the value
of ALPHA is not known when the EQU statement is processed.

ORG Statement
• This directive can be used to assign values to symbols indirectly. It
changes the value of the location counter. The directive is
usually called ORG (for origin). Its general format is:
        ORG  value
• Here value is a constant or an expression involving constants and
previously defined symbols. When this statement is encountered during
assembly of a program, the assembler resets its location counter (LOCCTR)
to the specified value.
• Since the values of symbols used as labels are taken from LOCCTR, the
ORG statement affects the values of all labels defined until the next ORG
is encountered. ORG is used to control the assignment of storage in the object
program.
• As with EQU, every symbol used in an ORG statement must already be
defined. For example, the sequence of statements below could not be processed:

        ORG   ALPHA
ALPHA   RESB  1
• The symbol used to set the new location counter value is not yet defined.
During the first pass the assembler would not know what value to assign to the
location counter, so the labels defined after the ORG could not be given
addresses. This is a forward-reference problem.
Define symbol table using EQU and ORG

Field    Length
SYMBOL   6 bytes
VALUE    3 bytes
FLAG     2 bytes

• The SYMBOL field contains a 6-byte user-defined symbol; VALUE is a one-
word representation of the value assigned to the symbol; FLAG is a 2-byte
field that specifies the symbol type and other information. The space for the
table (100 entries of 11 bytes each) can be reserved by the statement:
STAB  RESB  1100

Using EQU:
STAB    RESB  1100
SYMBOL  EQU   STAB
VALUE   EQU   STAB+6
FLAG    EQU   STAB+9

Using ORG:
STAB    RESB  1100
        ORG   STAB
SYMBOL  RESB  6
VALUE   RESW  1
FLAG    RESB  2
        ORG   STAB+1100
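The two ways of defining the table fields can be compared with a small Pass 1 sketch. It is an illustration under stated assumptions: the function name `layout` is hypothetical, only RESB, RESW, EQU and ORG are handled, and operands are limited to a constant, a symbol, or symbol+constant.

```python
def layout(statements, start=0):
    """Return the symbol table produced by a list of (label, directive, operand)."""
    symtab, locctr = {}, start

    def value_of(operand):
        # Every symbol must already be in symtab; a forward reference
        # raises KeyError, mirroring the EQU/ORG restriction.
        base, _, offset = operand.partition("+")
        value = int(base) if base.isdigit() else symtab[base]
        return value + (int(offset) if offset else 0)

    for label, directive, operand in statements:
        if directive == "ORG":
            locctr = value_of(operand)         # reset the location counter
        elif directive == "EQU":
            symtab[label] = value_of(operand)  # value comes from the operand
        else:
            symtab[label] = locctr             # value comes from LOCCTR
            size = 3 if directive == "RESW" else 1
            locctr += size * int(operand)
    return symtab


using_equ = [("STAB", "RESB", "1100"), ("SYMBOL", "EQU", "STAB"),
             ("VALUE", "EQU", "STAB+6"), ("FLAG", "EQU", "STAB+9")]
using_org = [("STAB", "RESB", "1100"), (None, "ORG", "STAB"),
             ("SYMBOL", "RESB", "6"), ("VALUE", "RESW", "1"),
             ("FLAG", "RESB", "2"), (None, "ORG", "STAB+1100")]

print(layout(using_equ) == layout(using_org))   # True
```

Both versions give SYMBOL, VALUE and FLAG the offsets 0, 6 and 9 from STAB. With ORG the offsets follow from the field lengths instead of being written out by hand.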
Expressions:
• Assemblers also allow the use of expressions in place of operands in an instruction.

• Each such expression must be evaluated by the assembler to produce a single
operand value or address.
• Assemblers generally allow arithmetic expressions formed according to the normal
rules using the operators +, -, *, and /. Division is usually defined to produce an
integer result.

• Individual terms may be constants, user-defined symbols, or special terms.

• The only special term used is * (the current value of the location counter), which
indicates the address of the next unassigned memory location. Thus the statement
BUFEND  EQU  *
assigns to BUFEND a value that is the address of the next byte following the buffer
area.

• Some values in the object program are relative to the beginning of the program
and some are absolute (independent of the program location, like constants).
• Expressions are classified as absolute expressions, relative expressions, or
neither absolute nor relative, depending on the type of value they produce.

Absolute Expressions:
• An expression that uses only absolute terms is an absolute expression.

• An absolute expression may also contain relative terms, provided the relative
terms occur in pairs and the terms in each pair have opposite signs.

• None of the relative terms may enter into a multiplication or division operation.

Example:
MAXLEN  EQU  BUFEND-BUFFER
• In the statement above both terms are relative, but their difference does not
depend on where the program is loaded, so the expression yields an absolute value
regardless of relocation.
• An expression may also contain only absolute terms.
Example:
MAXLEN  EQU  1000

Relative Expressions:
• All the relative terms except one can be paired with opposite signs. The
remaining unpaired relative term must have a positive sign.

• None of the relative terms may enter into a multiplication or division operation.

• A relative term represents some location within the program.

• Example:
STAB  EQU  OPTAB + (BUFEND - BUFFER)

Neither absolute nor relative
• Legal expressions are those whose value remains meaningful when the program
is relocated.
• Expressions that do not meet the conditions for either absolute or relative
expressions are neither absolute nor relative.
• They are flagged by the assembler as errors.

Eg: BUFEND + BUFFER, 100 - BUFFER, 3 * BUFFER
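The pairing rules can be expressed as a count of relative terms: each relative term adds +1 or -1 according to its sign, and the net count decides the type. The sketch below assumes expressions built only from +, -, symbols and decimal constants (no parentheses, * or /). The function name and the symbol-type table are hypothetical.

```python
import re


def classify(expr, symtypes):
    """Classify expr as 'absolute', 'relative' or 'error'.

    symtypes maps each symbol to 'A' (absolute) or 'R' (relative).
    """
    text = expr.replace(" ", "")
    tokens = re.findall(r"[+-]|[A-Z][A-Z0-9]*|\d+", text)
    if "".join(tokens) != text:
        raise ValueError(f"unsupported expression: {expr}")
    net_relative, sign = 0, 1
    for token in tokens:
        if token == "+":
            sign = 1
        elif token == "-":
            sign = -1
        else:
            if token[0].isalpha() and symtypes[token] == "R":
                net_relative += sign   # constants and 'A' symbols add nothing
            sign = 1
    if net_relative == 0:
        return "absolute"              # relative terms cancel in pairs
    if net_relative == 1:
        return "relative"              # one unpaired positive relative term
    return "error"                     # neither absolute nor relative


types = {"BUFFER": "R", "BUFEND": "R", "MAXLEN": "A"}
print(classify("BUFEND - BUFFER", types))   # absolute
print(classify("BUFEND + BUFFER", types))   # error
print(classify("100 - BUFFER", types))      # error
```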

Handling the type of expressions:

• To determine the type of an expression, the assembler must keep track of the
types of the symbols used in it.
• This can be achieved by recording the type of each symbol in the symbol table,
as shown in the table below:

Program Blocks:
• Program blocks allow the generated machine instructions and data to appear in the
object program in a different order from the source statements, by using separate
blocks for code, data, stack, and larger data areas.
• Program blocks are segments of code that are rearranged within a single object
program unit.

• The assembler directive USE indicates which portions of the program belong to the
various blocks:
        USE  [blockname]
• At the beginning, statements are assumed to be part of the unnamed (default) block.

• If no USE statements are included, the entire program belongs to this single block.

• Program readability is better if data areas are placed in the source program
close to the statements that reference them. In the example below three
blocks are used:
Default: executable instructions
CDATA: all data areas that are a few words or less in length
CBLKS: all data areas that consist of larger blocks of memory

Fig: Program blocks traced through the assembly and loading processes

How does the assembler handle program blocks?
• Pass 1
• Pass 2

Pass 1

• A separate location counter is maintained for each program block.
• The location counter for a block is initialized to zero when the block is first
begun.
• The current value of this location counter is saved when switching to
another block.
• The saved value is restored when resuming a previous block.
• After Pass 1, each label in the symbol table has the number of the block that
contains it along with its address relative to the start of that block. (For an
absolute symbol there is no block number.)
• At the end of Pass 1, the latest value of the location counter for each block
gives the length of that block.
• The assembler then constructs a block table that contains the starting
addresses and lengths of all blocks.

Pass 2

• For code generation during Pass 2, the assembler needs the address of each
symbol relative to the start of the object program (not the start of an
individual program block).
• The assembler obtains it by adding the symbol's address within its block to
the starting address of that block.

Advantage
• Separating the program into blocks reduces the addressing problems.
• Since the larger buffer area is moved to the end of the object program, extended
format instructions need not be used.
• The use of program blocks achieves the effect of rearranging the source
statements without actually rearranging them. The loader will load the object
program at the indicated address.
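The block table and the Pass 2 address calculation can be sketched as follows. The function names are hypothetical, and the block lengths (66, 0B and 1000 hexadecimal) are illustrative values chosen for this sketch, not figures stated in this module.

```python
def build_block_table(lengths):
    """Build the block table from each block's final LOCCTR value.

    lengths maps block name -> length, in order of first appearance.
    """
    table, start = {}, 0
    for number, (name, length) in enumerate(lengths.items()):
        table[name] = {"number": number, "start": start, "length": length}
        start += length                # next block begins where this one ends
    return table


def program_address(table, block, offset):
    """Pass 2: address relative to the start of the object program."""
    return table[block]["start"] + offset


# Illustrative lengths for the default, CDATA and CBLKS blocks
table = build_block_table({"(default)": 0x66, "CDATA": 0x0B, "CBLKS": 0x1000})

# A symbol at offset 0003 within CDATA
print(f"{program_address(table, 'CDATA', 0x0003):04X}")   # 0069
```

Because CBLKS comes last, the large buffer no longer sits between the instructions and the small data areas in CDATA.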
Control Sections:
• A control section is a part of the program that maintains its identity after assembly;
each control section can be loaded and relocated independently of the others.
• Different control sections are most often used for subroutines or other logical
subdivisions of a program. The programmer can assemble, load, and manipulate each
of these control sections separately.
• Because of this, there must be some means for linking control sections together.
For example, instructions in one control section may refer to data or
instructions in other control sections.
• Since control sections are independently loaded and relocated, the assembler is
unable to process these references in the usual way. Such references between
different control sections are called external references.

• The assembler generates information about each external reference that
will allow the loader to perform the required linking.
• When a program is written using multiple control sections, the beginning of each
control section is indicated by the assembler directive CSECT.
• The syntax is:
        controlsectionname  CSECT
• The assembler maintains a separate location counter for each control section.

• Control sections differ from program blocks in that they are handled separately by the
assembler. Symbols that are defined in one control section may not be used directly
in another control section; they must be identified as external references for the
loader to handle. External references are indicated by two assembler directives:

– EXTDEF (external definition): This statement names symbols, called external
symbols, that are defined in this control section and may be used by other control
sections. Control section names do not need to be named in an EXTDEF statement, as
they are automatically considered to be external symbols.

– EXTREF (external reference): This statement names symbols that are used in this
control section but are defined in some other control section. The order in which
these symbols are listed is not significant. The assembler must include information
about the external references in the object program that will cause the loader to
insert the proper values where they are required.

Handling External Reference

Case 1
15  0003  CLOOP  +JSUB  RDREC  4B100000
• The operand RDREC is an external reference.
• The assembler has no idea where RDREC will be loaded, so it
  o inserts an address of zero
  o must use extended format to provide enough room for the address (relative
  addressing cannot be used for an external reference)
• The assembler generates information for each external reference that will
allow the loader to perform the required linking.

Case 2
190  0028  MAXLEN  WORD  BUFEND-BUFFER  000000
• There are two external references in the expression, BUFEND and BUFFER.
• The assembler inserts a value of zero.
• The assembler passes information to the loader, telling it to:
  o add to this data area the address of BUFEND
  o subtract from this data area the address of BUFFER

Case 3
• On line 107, BUFEND and BUFFER are defined in the same control section as the
statement, so the expression can be calculated immediately.
107  1000  MAXLEN  EQU  BUFEND-BUFFER

Define record (EXTDEF)
Col. 1       D
Col. 2-7     Name of external symbol defined in this control section
Col. 8-13    Relative address of the symbol within this control section (hexadecimal)
Col. 14-73   Repeat information in Col. 2-13 for other external symbols

Refer record (EXTREF)
Col. 1       R
Col. 2-7     Name of external symbol referred to in this control section
Col. 8-73    Names of other external reference symbols

Modification record
Col. 1       M
Col. 2-7     Starting address of the field to be modified (hexadecimal)
Col. 8-9     Length of the field to be modified, in half-bytes (hexadecimal)
Col. 10      Modification flag (+ or -)
Col. 11-16   External symbol whose value is to be added to or subtracted from
             the indicated field
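As a sketch of how the Modification record layout applies to Cases 1 and 2, the function below formats one record per external reference. The function name is hypothetical. The field address 0004 for Case 1 follows from the instruction format: +JSUB at 0003 is 4 bytes long, and its 20-bit address field occupies the last 5 half-bytes, beginning in the byte at 0004.

```python
def modification_record(field_addr, half_bytes, sign, symbol):
    """Format a Modification record for one external reference."""
    if sign not in ("+", "-"):
        raise ValueError("modification flag must be + or -")
    if len(symbol) > 6:
        raise ValueError("symbol names are limited to 6 characters")
    # Col 1: M, Col 2-7: address, Col 8-9: length, Col 10: flag, Col 11-16: symbol
    return f"M{field_addr:06X}{half_bytes:02X}{sign}{symbol:<6}"


# Case 1: +JSUB RDREC at 0003
print(modification_record(0x0004, 5, "+", "RDREC"))    # M00000405+RDREC

# Case 2: MAXLEN WORD BUFEND-BUFFER at 0028 (a full 3-byte word, 6 half-bytes)
print(modification_record(0x0028, 6, "+", "BUFEND"))   # M00002806+BUFEND
print(modification_record(0x0028, 6, "-", "BUFFER"))   # M00002806-BUFFER
```

The expression in Case 2 needs two records for the same field, one adding BUFEND and one subtracting BUFFER.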
Assembler Design Options

There are two design options for the assembler.

• One-pass assembler: used when it is necessary to avoid a second pass over
the source program.

• Multipass assembler: allows the assembler to handle forward references in
symbol definitions.

One-Pass Assembler

• The main problem in designing a single-pass assembler is resolving forward
references. We can avoid forward references to some extent:
  o Forward references to data items can be eliminated by defining all the
  storage reservation statements at the beginning of the program rather than
  at the end.
  o Unfortunately, forward references to labels on instructions cannot be
  avoided (forward jumps).
• The assembler must therefore make some provision for handling forward
references, while prohibiting forward references to data items.
• There are two types of one-pass assemblers:
  o One produces object code directly in memory for immediate
  execution (load-and-go assemblers).
  o The other produces the usual kind of object program for later
  execution.

Load-and-Go Assembler

• A load-and-go assembler generates its object code in memory for immediate execution.
• No object program is written out and no loader is needed.
• It is useful in a system with frequent program development and testing.
• Programs are re-assembled nearly every time they are run, so the efficiency of
the assembly process is an important consideration.
• Forward references in one-pass assemblers: when a load-and-go assembler
encounters a forward reference, it:
  o omits the operand address if the symbol has not yet been defined
  o enters this undefined symbol into SYMTAB and flags it as undefined
  o adds the address of the operand field to a list of forward
  references associated with the SYMTAB entry
  o when the definition of the symbol is encountered, scans the
  reference list and inserts the address into each listed operand field
  o at the end of the program, reports an error if any SYMTAB entries are
  still marked as undefined
• If there is no error, the load-and-go assembler then:
  o searches SYMTAB for the symbol named in the END statement
  and jumps to this location to begin execution
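The forward-reference list handling can be sketched as below. This is a simplified model, not a full assembler: memory is a dict holding only operand address fields, and the class name, method names and all addresses are hypothetical.

```python
class LoadAndGoSymtab:
    def __init__(self):
        self.memory = {}      # operand field location -> address stored there
        self.symtab = {}      # symbol -> address, or None while undefined
        self.forward = {}     # undefined symbol -> operand fields to patch

    def reference(self, name, operand_loc):
        """An instruction whose operand field is at operand_loc uses name."""
        addr = self.symtab.get(name)
        if addr is None:
            self.symtab.setdefault(name, None)     # flag as undefined
            self.forward.setdefault(name, []).append(operand_loc)
            self.memory[operand_loc] = 0           # address omitted for now
        else:
            self.memory[operand_loc] = addr

    def define(self, name, addr):
        """The label name is encountered at address addr."""
        self.symtab[name] = addr
        for loc in self.forward.pop(name, []):     # scan the reference list
            self.memory[loc] = addr                # insert the address

    def undefined(self):
        """Symbols still undefined at the end of the program."""
        return sorted(n for n, a in self.symtab.items() if a is None)


asm = LoadAndGoSymtab()
asm.reference("RDREC", 0x2013)     # forward reference
asm.reference("RDREC", 0x201F)     # another one
asm.define("RDREC", 0x203D)        # both operand fields are patched
asm.reference("RDREC", 0x2050)     # backward reference, filled in at once
print(asm.undefined())             # []
```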
One-Pass Assembler that generates object code:

• If the operand contains an undefined symbol, the assembler uses 0 as the
address and writes the instruction to a Text record in the object program.
• Forward references are entered into lists, as in the load-and-go
assembler.
• When the definition of a symbol is encountered, the assembler
generates another Text record with the correct operand address for
each entry in the reference list.
• When the program is loaded, the placeholder address 0 is overwritten by the
later Text record, which carries the correct address.
Algorithm for one pass assembler
MultiPass Assembler:

• In a two-pass assembler, the EQU assembler directive requires that any symbol
on the right-hand side be defined previously in the program, because a two-pass
assembler must finish defining all symbols in its first pass. If more passes are
allowed, this restriction can be avoided. Eg:

ALPHA  EQU   BETA
BETA   EQU   DELTA
DELTA  RESW  1

• Three passes are required to resolve these definitions.
Working of Multipass Assembler:
• A multipass assembler can make as many passes as are needed to process the
definitions of symbols.
• For a forward reference in a symbol definition, we store in SYMTAB:
  o the symbol name
  o the defining expression
  o the number of undefined symbols in the defining expression
• Each undefined symbol (marked with a flag *) is also entered, together with a
list of the symbols that depend on this undefined symbol.
• When a symbol is defined, we can recursively evaluate the symbol
expressions that depend on the newly defined symbol.
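A minimal sketch of this recursive resolution, applied to the ALPHA/BETA/DELTA example. It assumes defining expressions that use only +, -, symbols and decimal constants. The class and method names are hypothetical, and the address 2000 (hexadecimal) given to DELTA is made up for the illustration.

```python
import re

TOKEN = re.compile(r"[+-]|[A-Z][A-Z0-9]*|\d+")


class MultiPassSymtab:
    def __init__(self):
        self.values = {}      # defined symbol -> value
        self.pending = {}     # symbol -> defining expression not yet evaluable
        self.waiting = {}     # undefined symbol -> symbols that depend on it

    def define_value(self, name, value):
        """Record a value, then re-evaluate every symbol waiting on name."""
        self.values[name] = value
        for dependent in self.waiting.pop(name, []):
            if dependent in self.pending:
                self.define_expr(dependent, self.pending.pop(dependent))

    def define_expr(self, name, expr):
        """Handle 'name EQU expr', deferring it if operands are undefined."""
        tokens = TOKEN.findall(expr.replace(" ", ""))
        missing = [t for t in tokens
                   if t[0].isalpha() and t not in self.values]
        if missing:
            self.pending[name] = expr
            for symbol in missing:
                dependents = self.waiting.setdefault(symbol, [])
                if name not in dependents:
                    dependents.append(name)
            return
        total, sign = 0, 1
        for t in tokens:
            if t == "+":
                sign = 1
            elif t == "-":
                sign = -1
            else:
                total += sign * (self.values[t] if t[0].isalpha() else int(t))
                sign = 1
        self.define_value(name, total)


symtab = MultiPassSymtab()
symtab.define_expr("ALPHA", "BETA")     # BETA undefined: ALPHA is deferred
symtab.define_expr("BETA", "DELTA")     # DELTA undefined: BETA is deferred
symtab.define_value("DELTA", 0x2000)    # resolves BETA, which resolves ALPHA
print(symtab.values)
```

Keeping dependency lists lets each deferred definition be re-evaluated as soon as its operands become known, instead of rescanning the whole source.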


Location counter value
for the line 4 is 1034

Multi-Pass Assembler Example Program


Multi-Pass Assembler : Example for forward reference in Symbol Defining Statements:
Implementation Example: MASM
ASSEMBLER
