System Programming Concept
An understanding towards a compiler
Published by Notion Press Xpress Publishing
www.notionpress.com
This book has been published with all reasonable efforts taken to make the
material error-free after the consent of the author. The author of this book is
solely responsible and liable for its content including but not limited to the
views, representations, descriptions, statements, information, opinions and
references [“Content”]. The publisher does not endorse or approve the
Content of this book or guarantee the reliability, accuracy or completeness
of the Content published herein. The publisher and the author make no
representations or warranties of any kind with respect to this book or its
contents. The author and the publisher disclaim all such representations
and warranties, including for example warranties of merchantability and
educational or medical advice for a particular purpose. In addition, the
author and the publisher do not represent or warrant that the information
accessible via this book is accurate, complete or current.
ISBN: 9781637817858
Price ₹ 295.00
First Published – 2021
PREFACE
This book is for system programmers, computer engineers, and others who
want to be able to understand program code by learning what is going
on "under the hood" of a computer system. We aim to explain the
enduring concepts underlying all computer systems, and to show you the
concrete ways that these ideas affect the correctness, performance, and
utility of your application programs. The book treats Computer Science
as an interesting subject for undergraduate and postgraduate students.
The proposed book is not only student-friendly but also contains all
advanced domains of Computer Science in day to day practice. This book is
written from a programmer’s perspective, describing how application
programmers can use their knowledge of a system to write better
programs. Of course, learning what a system is supposed to do provides a
good first step in learning how to build one, and so this book also serves as
a valuable introduction to those who go on to implement systems hardware
and software. As there is a common syllabus for all graduate students, the
course material has been designed accordingly. The syllabus designed by
universities for 'System Programming' is not only an introductory
computing course but also focuses on new industrial skills in Computer
Science. This book does not claim to be the ultimate textbook on the subject,
but sincere efforts have been made to bring it on par with the much sought-
after texts in this field. If you study and learn the concepts in this book, you
will be on your way to becoming the rare “power programmer” who
knows how things work and how to fix them when they break. Our aim is
to present the fundamental concepts in ways that you will find useful right
away. You will also be prepared to delve deeper, studying such topics as
compilers, computer architecture, operating systems, embedded systems,
and networking. Finally, we dedicate this book to all the students who
would be using this text and wish them all our best.
Acknowledgments
From Biswaranjan Mishra
The writing of this book was a massive task for me. I have
given my sincere efforts to put it down from my mind onto paper.
A lot of help was offered by many people, whom I need to
mention. First of all, I want to show my gratitude to my
Family for their constant support and inspiration. My
special thanks to mother Sakuntala Paikaray, father
Shyama Sundar Paikaray and my sister Susama Manjari Parida who have
always motivated me and encouraged me to explore my interests.
Last but not the least, we are thankful to Sri Saswat Chatterjee, who helped
us in preparing the manuscript of this book.
Organization of the Book
• The first chapter introduces the reader to the different types of software and
their components in detail.
• Chapter three describes computer languages and the phases and
passes of a compiler.
• Chapter five deals with the linker and the loader and their use in
program loading schemes.
• The last chapter, chapter nine, describes system development and its
applications.
About the Authors
Biswaranjan Mishra is working as Faculty of Computer Science at the All
India Institute of Medical Science, Bhubaneswar, Odisha. In addition to
taking classes there, he has served as guest faculty at institutions such as
Centurion University of Technology and Management, Bhubaneswar, and
NIPS School of Management, Bhubaneswar. He has completed his B.Tech and
M.Tech in Computer Science and Engineering from BPUT, and is pursuing his
Ph.D. in Computer Science and Engineering at GIET University, Odisha. He
has more than a decade of teaching, research experience and industrial
expertise in many premier institutes. His areas of research interest include
Blockchain and cryptocurrency, Machine Learning, and IoT.
Chapter 1
Introduction to Software
1.1 Introduction to Software
An instruction is an internal command or an external input received from devices such
as a mouse or keyboard. A program is a set of instructions to perform specific tasks, and
software is a collection of one or many programs for a specific purpose. In computing
terms, software is programming code executed on a computer processor; code written
in the processor's own instruction set is called machine code. Examples of software
are Photoshop, MySQL, Google Chrome, Microsoft Word, Excel, PowerPoint, etc.
(Figure: layered organization of a computer system, with application software on top, utility software and system software (including the OS) below it, and hardware at the bottom.)
1.3.4 Computer Hardware
Hardware refers to the physical components of a system. Computer hardware is any
part of the system that we can physically touch. These are the primary electronic devices
used to build the computer. Examples of hardware in a computer are the Processor,
Memory Devices, Monitor, Printer, Keyboard, Mouse, and the Central Processing Unit.
(Figure: categories of software with examples; application software such as Internet browsers, games, multimedia tools and spreadsheets; system software such as the operating system and utilities; and hardware such as the CPU, mouse and printer.)
1. (…) collaboration of software developers on the Internet.
2. Later specified by the Open Source Initiative (OSI).
3. It does not explicitly state ethical values, besides those directly associated with software development.
2. Packaged Software vs. Custom Software
Exercise
Short Questions
1. What is a kernel?
2. Can application software be installed without an operating system? Why or why not?
3. What are the phases of Traditional Software Development?
4. What are the phases of Agile Software Development?
5. What are the tasks of an operating system?
Long Questions
Chapter 2
System Programming
2.1 Introduction to System Programming
System programming is characterized by the fact that it is aimed at producing system
software that provides services to the computer hardware or specialized system
services. Many a time, system programming directly deals with the peripheral devices
with a focus on input, process (storage), and output.
• Programmers are expected to know the hardware and internal behavior of the
computer system on which the program will run. System programmers
explore these known hardware properties and write software for specific
hardware using efficient algorithms.
• Uses a low-level programming language or some programming dialect.
• Requires little runtime overhead and can execute in a resource-constrained
environment.
• These are very efficient programs with small or no runtime library
requirements.
• Has access to system resources, including memory.
• Can be written in assembly language.
2.2 Machine Structure of System Program
A generic computer system comprises hardware components, collection of system
programs, and a set of application programs.
(Figure: layered view of a generic computer system; application programs such as a hospital management system, a banking system and a web browser at the top; system programs such as compilers, editors, interpreters and the command interpreter below them; the operating system over the machine-language and microarchitecture levels; and the physical hardware devices at the bottom.)

(Figure: basic machine structure; a central processing unit containing the ALU, connected to memory and to input/output devices.)
1) Software Interface
• A software interface comprises a set of statements, predefined functions, user
options, and other methods of conveying instructions and data, provided by a
programming language for programmers.
• Access to resources such as the CPU, memory and storage is facilitated by
software interfaces for the underlying computer system.
• While programming, the interface between software components makes use of
program and language facilities such as constants, various data types, libraries
and procedures, and specifications for exception and method handling.
• The operating system provides the interface that allows access to the system
resources from applications. This interface is called the Application Programming
Interface (API). These APIs contain collections of functions, type definitions,
and constants, and also include some variable definitions. While developing
software applications, the APIs can be used to access and implement
functionalities; a small illustration follows this list.
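To make the idea of an API concrete, the following C sketch uses the POSIX file API (open, write, close) to access a file resource through the operating system's interface; the file name demo.txt is only an illustrative assumption.

    #include <fcntl.h>      /* open() and the O_* flags          */
    #include <unistd.h>     /* write(), close()                  */
    #include <stdio.h>      /* perror()                          */

    int main(void)
    {
        /* Ask the OS, through its API, for access to a file resource. */
        int fd = open("demo.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1) {
            perror("open");          /* the API reports failures via errno */
            return 1;
        }

        const char msg[] = "written through the OS API\n";
        /* write() is the API call; the kernel performs the actual I/O. */
        if (write(fd, msg, sizeof msg - 1) == -1)
            perror("write");

        close(fd);                   /* release the resource */
        return 0;
    }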
2) Hardware Interface
• Hardware interfaces are primarily designed to exchange data and information
among various hardware components of the system, including internal and
external devices.
• This type of interface is seen between buses, across storage devices, and other
I/O and peripheral devices.
• A hardware interface provides access to electrical, mechanical, and logical
signals and implements signaling protocols for reading and sequencing them.
• These hardware interfaces may be designed to support either parallel or serial
data transfer or both. Hardware interfaces with parallel implementations allow
more than one connection to carry data simultaneously, while serial allows
data to be sent one bit at a time.
• One of the popular standard interfaces is Small Computer System Interface
(SCSI) that defines the standards for physically connecting and
communicating data between peripherals and computers.
3) User Interface
• User interface allows interaction between a computer and a user by providing
various modalities of interaction including graphics, sound, position,
movement, etc. These interfaces facilitate transfer of data between the user and
the computing system. User interface is very important for all systems that
require user inputs.
2.6 Address Space in System Programming
The amount of space allocated for all possible addresses for data and other
computational entities is called address space. The address space is governed by the
architecture and managed by the operating system. Computational entities such as
a server, a networked computer, or a file are all addressed within this space.
(Figure: from source program to load module; the source program is translated by an assembler or compiler into object code at compile time, the linkage editor combines object modules into a load module, and the load module is placed in memory at load time.)
2.8 System Software Development
Software development process follows the Software Development Life Cycle (SDLC),
which has each step doing a specific activity till the final software is built. The system
software development process also follows all the stages of SDLC, which are as
follows:
• Preliminary investigation: It determines what problems need to be fixed by the
system software being developed and what would be the better way of solving those
problems.
• System analysis: It investigates the problem on a large scale and gathers all the
information. It identifies the execution environment and interfaces required by
the software to be built.
• System design: This is concerned with designing the blueprint of system
software that specifies how the system software looks like and how it will
perform.
• System tool acquisition: It decides and works around the software tools to
develop the functionalities of the system software.
• Implementation: It builds the software using the software tools with all the
functionality, interfaces, and support for the execution. This may be very specific
as the system software adheres to the architecture. Operating system support is
sought for allocations and other related matters.
• System maintenance: Once the system software is ready, it is installed and used.
Maintenance includes timely updating of the software that is already installed.
Current trends in system software development include:
• Android versus iOS
• Moving towards the GPU from the CPU
• Renting versus buying (cloud services)
• Web interfaces rather than IDEs
• Agile development
(Figure: components of system software; the operating system, utility programs, library programs, assemblers, interpreters, the pre-processor and the loader, which together produce and run target machine code.)
Long Questions
Chapter 3
Computer Language
3.1 Introduction to Computer Language
A computer needs language to communicate across its components and devices and
to carry out instructions. A computer acts on a specific sequence of instructions written
by a programmer for a specific job. This sequence of instructions is known as a
program.
There are mainly three different categories of languages with the help of which we can
develop computer programs: High Level Language, Low Level Language, and
Advanced High Level Language.
(Figure: classification of computer languages.)
1) Machine Language
Machine language is the language that the processor can execute directly; programs in
it are written using binary code. It is also known as machine code. A machine
instruction encompasses a function (opcode) part and an operand (address) part. Since a
computer understands machine code directly, programs written in machine language
can be executed immediately without the requirement of any language translator.
2) Assembly Language
This is a type of low-level programming language that uses symbolic codes or
mnemonics as instructions. Some examples of mnemonics include ADD, SUB, LDA,
and STA, which stand for addition, subtraction, load accumulator, and store
accumulator, respectively. A program written in this language is known as an assembly
language program; it is processed by a language translator, hereafter called an
assembler, which plays the vital role of translating assembly language code into
machine code. An assembly language program must be translated into the equivalent
machine language code (binary code) before execution.
1) Problem-Oriented Language
Problem-oriented language is a programming model that evolved from
structured programming procedures. A procedure, also known as a function,
routine, or subroutine, is a series of computational steps to be carried out.
During a program's execution, any given procedure might be called at any point,
including by other procedures or by itself. Languages used in problem-oriented
programming include FORTRAN, COBOL, ALGOL, Pascal, BASIC and C.
A language translator converts the source program into machine code (sequences of
1 and 0 bits). The computer processes the machine code to perform the
corresponding tasks.
Role of Compiler
• Compilers read the source code and output executable code.
• A compiler translates software written in a higher-level language into instructions that
the computer can understand. It converts the text that a programmer writes into a
format the CPU can understand.
• The process of compilation is relatively complicated. It spends a lot of time
analyzing and processing the program.
• The executable result is some form of machine-specific binary code.
Role of Interpreter
• The interpreter converts the source code line by line at run time.
• The interpreter carries out a program written in a high-level language without
producing a separate machine language program.
• The interpreter allows evaluation and modification of the program while it is
executing.
• Relatively less time is spent on analyzing and processing the program.
• Program execution is relatively slow compared to a compiler.
• Execution gap: The gap between the semantics of programs written in different
programming languages.
• Language processor: Language processor is software which bridges a
specification or execution gap.
• Language translator: Language translator bridges an execution gap to the
machine language of a computer system.
• Detranslator: It bridges the same execution gap as language translator, but in
the reverse direction.
• Preprocessor: It is a language processor which bridges an execution gap but is
not a language translator.
• Language migrator: It bridges the specification gap between two
programming languages.
• Interpreter: An interpreter is a language processor which bridges an execution
gap without generating a machine language program.
• Source language: The program which forms the input to a language processor
is the source program. The language in which the source program is written is
known as the source language.
• Target language: The output of a language processor is known as the target
program. The language to which the target program belongs is called the target
language.
• Forward Reference: A forward reference of a program entity is a reference to
the entity in some statement of the program that occurs before the statement
containing the definition or declaration of the entity.
• Language processor pass: A Language processor pass is the processing of
every statement in a source program, or in its equivalent representation, to
perform a language processing function (a set of language processing
functions).
• Intermediate representation (IR): An intermediate representation is a
representation of a source program which reflects the effect of some, but not
all analysis and synthesis functions performed during language processing.
An intermediate representation should have the following three properties:
1. Ease of use: It should be easy to construct the intermediate representation
and analyze it.
2. Processing efficiency: Efficient algorithms should be available for accessing
the data structures used in the intermediate representation.
3. Memory efficiency: The intermediate representation should be compact so
that it does not occupy much memory.
3.6.2 Program Execution Activities
Two popular models for program execution are translation and interpretation.
1) Translation
The program translation model bridges the execution gap by translating a
program written in PL, called source program, into an equivalent program in
machine or assembly language of the computer system, called target program.
2) Interpretation
The interpreter reads the source program and stores it in its memory. Just as the CPU
uses the program counter (PC) to note the address of the next instruction to be
executed, the interpreter keeps track of the next statement to be interpreted. Each
statement is subjected to the interpretation cycle, which consists of the following steps:
1. Fetch the statement.
2. Analyze the statement and determine its meaning.
3. Execute the actions that implement its meaning.
A symbol table supports the following operations:
1. Add a symbol and its attributes: Make a new entry in the symbol table.
2. Locate a symbol’s entry: Find a symbol’s entry in the symbol table.
3. Delete a symbol’s entry: Remove the symbol’s information from
the table.
4. Access a symbol’s entry: Access the entry and set, modify or copy its attribute
information.
The symbol table consists of a set of entries organized in memory. Two kinds of data
structures can be used for organizing its entries:
1. Linear data structure: Entries in the symbol table occupy adjoining areas of
memory. This property is used to facilitate search.
2. Nonlinear data structure: Entries in the symbol table do not occupy
contiguous areas of memory. The entries are searched and accessed using
pointers.
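As a rough illustration of the linear organization, the C sketch below keeps symbol table entries in adjoining array elements and supports the add and locate operations described earlier; the field names and table size are assumed for the example and are not prescribed by the text.

    #include <stdio.h>
    #include <string.h>

    #define MAX_SYMS 100

    struct sym_entry {                 /* one entry: name plus attributes */
        char name[16];
        int  address;
        int  length;
    };

    static struct sym_entry symtab[MAX_SYMS];   /* adjoining areas of memory */
    static int sym_count = 0;

    /* Add a symbol and its attributes: make a new entry in the table. */
    int sym_add(const char *name, int address, int length)
    {
        if (sym_count == MAX_SYMS)
            return -1;
        strncpy(symtab[sym_count].name, name, sizeof symtab[0].name - 1);
        symtab[sym_count].name[sizeof symtab[0].name - 1] = '\0';
        symtab[sym_count].address = address;
        symtab[sym_count].length  = length;
        return sym_count++;
    }

    /* Locate a symbol's entry with a simple linear search. */
    struct sym_entry *sym_locate(const char *name)
    {
        for (int i = 0; i < sym_count; i++)
            if (strcmp(symtab[i].name, name) == 0)
                return &symtab[i];
        return NULL;
    }

    int main(void)
    {
        sym_add("LOOP", 202, 1);
        sym_add("A",    217, 1);
        struct sym_entry *e = sym_locate("A");
        if (e)
            printf("%s -> %d\n", e->name, e->address);   /* prints: A -> 217 */
        return 0;
    }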
(Figure: two-pass schematic of a language processor; the front end analyzes the source program and produces an intermediate representation, and the back end reads the intermediate representation and synthesizes the target program.)
The first pass performs analysis of the source program and reflects its results in the
intermediate representation. The second pass reads and analyzes the intermediate
representation to perform synthesis of the target program.
1) Lexical Analysis
Example-1
Consider the following code:
   i : integer;
   a, b : real;
   a = b + i;
Here a, b and i are the identifiers declared in the code, so they are recorded as Id#1,
Id#2 and Id#3 respectively. In the statement a = b + i, the operators '=' and '+' are
recorded as Op#1 and Op#2 respectively.
2) Syntax Analysis
Example-1
Consider the statement a = b + i, which can be represented in tree form as:

        =
       / \
      a   +
         / \
        b   i

Here b and i are added and the result is assigned to a, so the two sides of the
statement are represented as the two subtrees of the tree.
3) Semantic Analysis
Semantic analysis determines the meaning of a statement by applying the
semantic rules to the structure of the statement. While processing a declaration
statement, it adds information concerning the type, length and dimensionality of a
symbol to the symbol table. While processing an imperative statement, it
determines the sequence of actions that would have to be performed for
implementing the meaning of the statement and represents them in the
intermediate code.
Example-1
Considering the tree structure for the statement a = b + i
1. If a node is an operand, then the type of the operand is added in the description field of
the operand.
2. While evaluating the expression, the type of b is real and the type of i is int, so the
type of i is converted to real, denoted i*.
The analysis ends when the tree has been completely processed.
4) Intermediate Representation
The IR contains the intermediate code and the symbol table.
Example-1
The intermediate code refers to entries in the symbol table, whose symbols and types
are as shown in the table:

Entry   Symbol   Type
1       i        int
2       a        real
3       b        real
4       i*       real
5       temp     real
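One way to picture such an intermediate representation in code is the small C sketch below, which pairs the symbol table above with simple quadruple-style records for the statement a = b + i; the exact record layout is an assumption chosen for illustration, not the book's prescribed format.

    #include <stdio.h>

    /* Symbol table entries, as in the table above (indexes 0..4). */
    struct symbol { const char *name; const char *type; };

    /* A quadruple-style intermediate code record: result := op(arg1, arg2). */
    struct quad { const char *op; int result; int arg1; int arg2; };

    int main(void)
    {
        struct symbol symtab[] = {
            {"i", "int"}, {"a", "real"}, {"b", "real"},
            {"i*", "real"}, {"temp", "real"}
        };

        /* a = b + i becomes: i* := CONV(i); temp := b + i*; a := temp */
        struct quad code[] = {
            {"CONV", 3, 0, -1},
            {"+",    4, 2,  3},
            {":=",   1, 4, -1},
        };

        for (size_t k = 0; k < sizeof code / sizeof code[0]; k++) {
            struct quad q = code[k];
            printf("%-5s %-5s %-5s -> %s\n",
                   q.op,
                   symtab[q.arg1].name,
                   q.arg2 >= 0 ? symtab[q.arg2].name : "-",
                   symtab[q.result].name);
        }
        return 0;
    }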
5) Memory Allocation
Example-1
Memory is allocated to each symbol at a specific address, for example address 2000 for
the symbol i of type int:

Entry   Symbol   Type   Address
1       i        int    2000
2       a        real   2001
3       b        real   2002
6) Code Generation
The synthesis phase may decide to hold the value of i* and temp in machine
registers and may generate the assembly code
Example-1
     CONV_R   AREG, I
     ADD_R    AREG, B
     MOVEM    AREG, A
Here CONV_R converts the value of i to its real representation and holds it in the
register AREG, ADD_R adds the value of B to it, and MOVEM stores the result in A;
AREG serves as the register holding the temporary values i* and temp.
Exercise
Short Questions
Long Questions
Chapter 4
Introduction of Assembler
4.1 Introduction of Assembler
An assembler is a program which generates information for the loader by converting
the instructions written in low-level assembly code into relocatable machine code.
● Pass-1:
1. Define symbols and literals and remember them in symbol table and literal table
respectively.
2. Keep track of location counter.
3. Process pseudo-operations.
● Pass-2:
1. Generate object code by converting symbolic op-code into respective numeric
op-code
2. Generate data for literals and look up the values of symbols.
First, we will take a small assembly language program to understand the working of
the respective passes. An assembly language statement has the general format
[Label] <opcode> <operand specification> [, <operand specification> ...], and statements
are of three kinds, described below.
1. Imperative Statement
An imperative statement indicates an action to be performed during the execution of
the program. Examples:
MOVER BREG, X
STOP
READ X
PRINT Y
ADD AREG, Z
2. Declaration Statement
Declaration statements are for reserving memory for variables.
[Label] DS <constant>
[Label] DC '<value>'
DS stands for Declare Storage; DC stands for Declare Constant. The DS statement
reserves areas of memory and associates names with them.
Example: A DS 10
The above statement reserves a memory area of 10 words and associates the name A with it.
Example: ONE DC '1'
The above statement associates the name ONE with a memory word containing the value
'1'.
An assembly program can use constants in two ways: as immediate operands and as
literals. Many machines support immediate operands in machine instructions, but our
hypothetical machine does not support immediate operands as part of a machine
instruction; it can still handle literals. A literal is an operand with the
syntax ='<value>'.
It differs from a constant because its location cannot be specified in the assembly program.
3. Assembler Directive
Assembler directives instruct the assembler to perform certain actions during the
assembly of the program.
1) START
This directive indicates that the first word of the target program should be placed in the
memory word with address <constant>.
START <constant>
For example, START 500 indicates that the first word of the target program is to be
stored from memory location 500 onwards.
2) END
This directive indicates end of the source program. The operand indicates address of
the instruction where the execution of program should begin.
Example-1
• START: This instruction indicates that the program starts from location 200, and the
label on the START statement provides a name for the program (here JOHN is the name of the program).
• MOVER: It moves the content of literal (=’3′) into register operand R1.
• MOVEM: It moves the content of register into memory operand(X).
• MOVER: It again moves the content of literal (=’2′) into register operand R2 and its
label is specified as L1.
• LTORG: It assigns address to literals (current LC value).
• DS (Data Space): It assigns a data space of 1 to Symbol X.
• END: It finishes the program execution.
START 200: Here no symbol or literal has been encountered yet, so both tables are empty.

After the MOVER and MOVEM statements (LC = 200, 201), the literal ='3' and the symbol X have been entered but not yet assigned addresses:

LITERAL   ADDRESS        SYMBOL   ADDRESS
='3'      ---            X        ---

L1 MOVER R2, ='2' (LC = 202): L1 is a label and ='2' is a literal, so store them in the respective tables.

LITERAL   ADDRESS        SYMBOL   ADDRESS
='3'      ---            X        ---
='2'      ---            L1       202

LTORG (LC = 203): Assign an address to the first literal, i.e., the current LC value 203.

LITERAL   ADDRESS        SYMBOL   ADDRESS
='3'      203            X        204
='2'      ---            L1       202

END (LC = 205): The program finishes, and the remaining literal gets the address given by the LC value at the END instruction. Here are the complete symbol and literal tables made by pass 1 of the assembler:

SYMBOL   ADDRESS        LITERAL   ADDRESS
X        204            ='3'      203
L1       202            ='2'      205
Now the tables generated by pass 1, along with the LC values, go to pass 2 of the
assembler for further processing of pseudo-opcodes and machine opcodes.
(Figure: two-pass assembler; pass 1 reads the assembly program and produces the intermediate representation (IR) and the symbol table, and pass 2 uses them to produce the target program.)
The flowchart of pass 1 can be summarized as follows: READ a statement; search the pseudo-operation table (POT); for a DS or DC statement, evaluate the operands using the symbol table (ST); otherwise assemble the instruction; then update the LC and repeat.
4.5.1 Statement Format
An assembly language statement has the following format:
[Label] <opcode> <operand specification> [, <operand specification> ...]
where the notation [..] indicates that the enclosed specification is optional. If a label is
specified in a statement, it is associated as a symbolic name with the memory word
generated for the statement. If more than one memory word is generated for a
statement, the label would be associated with the first of these memory words.
• The operand AREA refers to the memory word with which the name AREA is
associated.
• The operand AREA+5 refers to the memory word that is 5 words away from the
word with the name AREA. Here '5' is the displacement or offset from AREA.
• The operand AREA(4) implies indexing the operand AREA with index register 4—
that is, the operand address is obtained by adding the contents of index register 4 to
the address of AREA.
• The operand AREA+5 (4) is a combination of the previous two specifications.
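The following C sketch shows one way such operand specifications could be evaluated: the effective address is the symbol's address plus an optional displacement plus the contents of an optional index register. The address of AREA and the index-register contents are assumed values, used purely for illustration.

    #include <stdio.h>

    #define ADDR_AREA 300                 /* assumed address associated with AREA */

    static int index_reg[5] = {0, 0, 0, 0, 7};   /* index register 4 holds 7 (assumed) */

    /* AREA+disp(idx): symbol address + displacement + contents of index register. */
    int effective_address(int sym_addr, int disp, int idx)
    {
        int ea = sym_addr + disp;
        if (idx != 0)
            ea += index_reg[idx];
        return ea;
    }

    int main(void)
    {
        printf("AREA       -> %d\n", effective_address(ADDR_AREA, 0, 0));  /* 300 */
        printf("AREA+5     -> %d\n", effective_address(ADDR_AREA, 5, 0));  /* 305 */
        printf("AREA(4)    -> %d\n", effective_address(ADDR_AREA, 0, 4));  /* 307 */
        printf("AREA+5(4)  -> %d\n", effective_address(ADDR_AREA, 5, 4));  /* 312 */
        return 0;
    }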
Tasks of Analysis phase
1. Isolate the label, mnemonic opcode, and operand fields of a statement.
2. If a label is present, enter the symbol and the address of the next available memory word into the symbol table.
3. Validate the mnemonic opcode by looking it up in the mnemonics table.
4. Perform LC processing.
Two data structures support these tasks:
1. Symbol table: Each entry in the symbol table has two primary fields, name and
address. This table is built by the analysis phase.
2. Mnemonics table: An entry in the mnemonics table has two primary fields,
mnemonic and opcode.
Tasks of Synthesis phase
1. Obtain machine opcode through look up in the mnemonics table.
2. Obtain address of memory operand from the symbol table.
3. Synthesize a machine instruction.
4.6.2 Single pass translation
A one-pass assembler requires only one scan of the source program to generate machine code.
The problem of forward references is tackled using a technique called back patching. The
operand field of an instruction containing a forward reference is left blank initially. A
table of instructions containing forward references, called the table of incomplete
instructions (TII), is maintained separately. This table is used to fill up the addresses in
incomplete instructions: the address of each forward-referenced symbol is put in the
blank field with the help of this back patching list.
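A hedged C sketch of the back patching idea is given below: the operand field of a forward-referencing instruction is left blank, its position is noted in a table of incomplete instructions (TII), and the blank is filled in once the symbol's address becomes known. The table layout and sizes are assumptions.

    #include <stdio.h>
    #include <string.h>

    struct tii_entry {               /* table of incomplete instructions */
        int  instr_index;            /* which generated instruction is incomplete */
        char symbol[8];              /* the forward-referenced symbol             */
    };

    static int  code[10];                    /* operand/address field of each instruction */
    static struct tii_entry tii[10];
    static int  tii_count = 0;

    static void forward_ref(int instr_index, const char *symbol)
    {
        code[instr_index] = 0;               /* leave the operand field blank for now */
        tii[tii_count].instr_index = instr_index;
        strncpy(tii[tii_count].symbol, symbol, sizeof tii[0].symbol - 1);
        tii[tii_count].symbol[sizeof tii[0].symbol - 1] = '\0';
        tii_count++;
    }

    /* Called when the symbol's definition is reached and its address is known. */
    static void backpatch(const char *symbol, int address)
    {
        for (int i = 0; i < tii_count; i++)
            if (strcmp(tii[i].symbol, symbol) == 0)
                code[tii[i].instr_index] = address;   /* fill in the blank field */
    }

    int main(void)
    {
        forward_ref(0, "NEXT");      /* e.g. a branch to NEXT before NEXT is defined */
        backpatch("NEXT", 214);      /* NEXT turns out to be at address 214          */
        printf("operand field of instruction 0 = %d\n", code[0]);   /* 214 */
        return 0;
    }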
1. ORIGIN
The ORIGIN directive has the syntax
ORIGIN <address specification>
where <address specification> is an <operand specification> or <constant>. This directive
instructs the assembler to put the address given by <address specification> in the location
counter. The ORIGIN statement is useful when the target program does not consist of a
single contiguous area of memory. The ability to use an <operand specification> in the
ORIGIN statement makes it possible to change the address in the location counter in
a relative rather than absolute manner.
2. EQU
The EQU directive has the syntax
<symbol> EQU <address specification>
where <address specification> is an <operand specification> or <constant> ±
<displacement>. The EQU statement simply associates the name <symbol> with the
address specified by <address specification>. However, the address in the location
counter is not affected.
3. LTORG
The LTORG directive, which stands for 'origin for literals', allows a programmer to
specify where literals should be placed. The assembler uses the following scheme for
placement of literals: When the use of a literal is seen in a statement, the assembler
enters it into a literal pool unless a matching literal already exists in the pool. At every
LTORG statement, as also at the END statement, the assembler allocates memory to the
literals of the literal pool and clears the literal pool. This way, a literal pool would
contain all literals used in the program since the start of the program or since the
previous LTORG statement. Thus, all references to literals are forward references by
definition. If a program does not use an LTORG statement, the assembler would enter
all literals used in the program into a single pool and allocate memory to them when it
encounters the END statement.
Consider the following assembly program to understand ORIGIN, EQU and LTORG
1 START 200
5 MOVER CREG, B 203) +04 3 218
7 …
13 LTORG
14 …
18 ORIGIN LOOP + 2
20 ORIGIN LAST + 1
21 A DS 1 217)
23 B DS 1 218)
24 END
ORIGIN: Statement number 18 of the above program viz. ORIGIN LOOP + 2 puts the address
204 in the location counter because symbol LOOP is associated with the address 202. The next
statement MULT CREG, B is therefore given the address 204.
EQU: On encountering the statement BACK EQU LOOP, the assembler associates the symbol
BACK with the address of LOOP i.e. with 202.
LTORG: In the assembly program, the literals ='5' and ='1' are added to the literal pool in
Statements 2 and 6, respectively. The first LTORG statement (Statement 13) allocates the
addresses 211 and 212 to the values '5' and '1'. A new literal pool is now started. The value '1' is
put into this pool in Statement 15. This value is allocated the address 219 while processing the
END statement. The literal ='1' used in Statement 15 therefore refers to location 219 of the
second pool of literals rather than location 212 of the first pool.
OPTAB (selected entries)
Mnemonic opcode   Class   Mnemonic info
MOVER             IS      (04,1)
DS                DL      R#7
START             AD      R#11
SYMTAB
A SYMTAB entry contains the symbol name, address and length fields. Some addresses
can be determined directly, e.g., the address of the first instruction in the program;
however, others must be inferred. To find the address of a program element we must fix
the addresses of all program elements preceding it. This function is called memory allocation.
Symbol Address Length
LOOP 202 1
NEXT 214 1
LAST 216 1
A 217 1
BACK 202 1
B 218 1
LITTAB
A table of the literals used in the program. A LITTAB entry contains the literal and
address fields. The first pass uses LITTAB to collect all literals used in a program.
POOLTAB
Awareness of different literal pools is maintained using the auxiliary table POOLTAB.
This table contains the literal number of the starting literal of each literal pool. At any
stage, the current literal pool is the last pool in the LITTAB. On encountering an
LTORG statement (or the END statement), literals in the current pool are allocated
addresses starting with the current value in LC and LC is appropriately incremented.
LITTAB                                POOLTAB
Literal no.   Literal   Address       Literal no.
1             ='5'      211           #1
2             ='1'      212           #3
3             ='1'      219
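A rough C sketch of how LITTAB and POOLTAB could cooperate is shown below: literals are appended to LITTAB, POOLTAB remembers where each pool starts, and an LTORG (or END) assigns addresses from the current LC to the literals of the current pool. The table sizes and function names are assumptions; the addresses 211 and 212 follow the example above.

    #include <stdio.h>
    #include <string.h>

    struct lit { char text[8]; int address; };

    static struct lit littab[20];
    static int lit_count = 0;

    static int pooltab[10] = {0};     /* index in LITTAB where each pool starts */
    static int pool_count = 1;        /* the current pool is the last one       */

    static void enter_literal(const char *text)
    {
        /* avoid duplicates within the current pool only */
        for (int i = pooltab[pool_count - 1]; i < lit_count; i++)
            if (strcmp(littab[i].text, text) == 0)
                return;
        strncpy(littab[lit_count].text, text, sizeof littab[0].text - 1);
        littab[lit_count].address = -1;     /* not yet allocated */
        lit_count++;
    }

    /* Called at LTORG or END: allocate memory to literals of the current pool. */
    static int process_ltorg(int lc)
    {
        for (int i = pooltab[pool_count - 1]; i < lit_count; i++)
            littab[i].address = lc++;
        pooltab[pool_count++] = lit_count;  /* start a fresh pool */
        return lc;                          /* updated location counter */
    }

    int main(void)
    {
        int lc = 211;                 /* LC value reached just before the LTORG (from the example) */
        enter_literal("='5'");
        enter_literal("='1'");
        lc = process_ltorg(lc);       /* ='5' -> 211, ='1' -> 212, lc becomes 213 */
        enter_literal("='1'");        /* goes into the second pool */
        printf("='5' at %d, first ='1' at %d, next LC %d\n",
               littab[0].address, littab[1].address, lc);
        return 0;
    }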
Each intermediate code unit for a source statement has three fields:
1. Address
2. Representation of the mnemonic opcode
3. Representation of the operands
Mnemonics field
The mnemonics field contains a pair of the form (statement class, code). Where
statement class can be one of IS, DL, and AD standing for imperative statement,
declaration statement and assembler directive respectively. For imperative statement,
code is the instruction opcode in the machine language. For declarations and assembler
directives, code is an ordinal number within the class. Thus, (AD, 01) stands for
assembler directive number 1, which is the directive START. The codes for the various
declaration statements and assembler directives are shown below.
The information in the mnemonics field is assumed to have the same representation in
all the variants.
Declaration statements        Assembler directives
DC      01                    START    01
DS      02                    END      02
                              ORIGIN   03
                              EQU      04
                              LTORG    05
The second operand, which is a memory operand, is represented by a pair of the form
(operand class, code), where the operand class is one of C, S and L, standing for
constant, symbol and literal respectively. For a constant, the code field contains the internal
representation of the constant itself. Ex: the operand descriptor for the statement
START 200 is (C, 200). For a symbol or literal, the code field contains the ordinal
number of the operand’s entry in SYMTAB or LITTAB.
Condition codes               Register codes
LT      01                    AREG     01
LE      02                    BREG     02
EQ      03                    CREG     03
GT      04                    DREG     04
GE      05
ANY     06
Variant II
This variant differs from variant I of the intermediate code because in variant II the
symbols, condition codes and CPU registers are not processed, so no IC units are
generated for them during pass I.
Statement                 Variant I                 Variant II
START 200                 (AD, 01) (C, 200)         (AD, 01) (C, 200)
READ A                    (IS, 09) (S, 01)          (IS, 09) A
LOOP MOVER AREG, A        (IS, 04) (1)(S, 01)       (IS, 04) AREG, A
...                       ...                       ...
SUB AREG, ='1'            (IS, 02) (1)(L, 01)       (IS, 02) AREG, (L, 01)
BC GT, LOOP               (IS, 07) (4)(S, 02)       (IS, 07) GT, LOOP
STOP                      (IS, 00)                  (IS, 00)
A DS 1                    (DL, 02) (C, 1)           (DL, 02) (C, 1)
LTORG                     (AD, 05)                  (AD, 05)
...
4.9.4 Comparison of the variants

Variant I                                      Variant II
IS, DL and AD statements all contain the       DL and AD statements contain the processed
processed form.                                form, while for IS statements the operand
                                               field is processed only to identify literal
                                               references.
Extra work in pass I.                          Extra work in pass II.
Simplifies tasks in pass II.                   Simplifies tasks in pass I.
Occupies more memory than variant II.          Memory utilization of the two passes is
                                               better balanced.
It has been assumed that the target code is to be assembled in the area named
code_area.
ii) Loc_cntr=loc_cntr+size;
3. Processing end statement
a) Perform steps 2(b) and 2(f)
b) Write code_area into output file.
Listing errors in the first pass has the advantage that the source program need not be
preserved till pass II. However, the listing produced in pass I can report only certain errors, not
all of them. In the program below, errors are detected at statements 9 and 21. Statement 9 gives an
invalid opcode error because MVER does not match any mnemonic in OPTAB.
Statement 21 gives a duplicate definition error because an entry for A already exists in the
symbol table. The undefined symbol B at statement 10 is harder to detect during pass I; this
error can be detected only after completing pass I.
4.10.2 Error Reporting in Pass II
During pass II, data structures like SYMTAB are available. Error indication at statement 10
is also easy because the symbol table is searched for an entry B; if a match is not found, the error
is reported.
3) Segment Register Table (SRTAB)
The segment register table (SRTAB) contains four entries, one for each segment register.
Each entry shows the SYMTAB entry # of the segment whose address is contained in
the segment register. SRTAB_ARRAY is an array of SRTABs.
stmt_no : = 1;
SYMTAB_segment_entry : = 0; Clear ERRTAB, SRTAB_ARRAY.
2) While the next statement is not an END statement
a) Clear machine_code_buffer.
b) If a symbol is present in the label field then this_label := symbol in the label field;
c) If an EQU statement
i) this_address := value of <address specification>;
ii) Make an entry for this_label in SYMTAB with offset := this_addr;
Defined: = ‘yes’
owner_segment:= owner_segment in SYMTAB entry of the symbol in the
operand field. source_stmt_# := stmt_no;
iii) Enter stmt_no in the CRT list of the label in the operand field.
iv) Process forward references to this_label;
v) Size := 0;
d) If an ASSUME statement
i) Copy the SRTAB in SRTAB_ARRAY[srtab_no] into SRTAB_ARRAY
[srtab_no+1]
ii) srtab_no := srtab_no+1;
iii) For each specification in the ASSUME statement
(a) this_register := register mentioned in the specification.
(b) This_segment:= entry number of SYMTAB entry of the segment
appearing in the specification.
(c) Make the entry (this_register, this_segment) in SRTAB_ARRAY
[srtab_no]. (It overwrites an existing entry for this_register.)
(d) size: = 0;
e) If a SEGMENT statement
i) Make an entry for this_label in SYMTAB and note the entry number.
ii) Set segment name? := true;
iii) SYMTAB_segment_entry := entry no. in SYMTAB;
iv) LC :=0;
v) size := 0;
f) If an ENDS statement then SYMTAB_segment_entry :=0;
g) If a declaration statement
i) Align LC according to the specification in the operand field.
ii) Assemble the constant(s), if any, in the machine_code_buffer.
iii) size: = size of memory area required;
h) If an imperative statement
i) If the operand is a symbol symb then enter stmt_no in CRT list of symb.
ii) If the operand symbol is already defined then check its alignment and
addressability. Generate the address specification (segment register, offset) for
the symbol using its SYMTAB entry and SRTAB_ARRAY[srtab_no].
Else
Make an entry for symb in SYMTAB with defined :=’no’;
Make the entry (srtab_no, LC, usage code, stmt_no) in FRT of symb.
iii) Assemble instruction in machine_code_buffer.
iv) size := size of the instruction;
i) If size ≠ 0 then
i) If label is present then
Make an entry for this_label in
SYMTAB with owner_segment := SYMTAB_segment_entry;
Defined := ‘yes’;
offset := LC;
source_stmt_# := stmt_no;
ii) Move contents of machine_code_buffer to the address code_area_address +
<LC>;
iii) LC := LC + size;
iv) Process forward references to the symbol. Check for alignment and
addressability errors. Enter errors in the ERRTAB.
v) List the statement along with errors pertaining to it found in the ERRTAB.
vi) Clear ERRTAB.
3) (Processing of END statement)
a) Report undefined symbols from the SYMTAB.
b) Produce cross reference listing.
c) Write code_area into the output file.
Example 1:
Assembly program to compute N! and the equivalent machine language program
(machine code format: opcode, 2 digits; register operand, 1 digit; memory operand, 3 digits)
1 START 101
3 MOVER BREG, ONE 102) 04 2 115
18 N DS 1 113)
19 RESULT DS 1 114)
21 TERM DS 1 116)
22 END
OPTAB
Mnemonic opcode   Class   Mnemonic info
START             AD      R#11
READ              IS      (09,1)
MOVER             IS      (04,1)
MOVEM             IS      (05,1)
ADD               IS      (01,1)
BC                IS      (07,1)
DC                DL      R#5
SUB               IS      (02,1)
STOP              IS      (00,1)
COMP              IS      (06,1)
DS                DL      R#7
PRINT             IS      (10,1)
END               AD
MULT              IS      (03,1)

SYMTAB
Symbol    Address   Length
AGAIN     104       1
N         113       1
RESULT    114       1
ONE       115       1
TERM      116       1
LITTAB and POOLTAB are empty as there are no literals defined in the program.

Statement                    Variant I                 Variant II
1 START 101                  (AD, 01) (C, 101)         (AD, 01) (C, 101)
3 MOVER BREG, ONE (IS, 04) (2)(S, 04) (IS, 04) (2) ONE
4 MOVEM BREG, TERM (IS, 05) (2)(S, 05) (IS, 05) (2) TERM
5 AGAIN MULT BREG,TERM (IS, 03) (2)(S, 05) (IS, 03) (2) TERM
6 MOVER CREG, TERM (IS, 04) (3)(S, 05) (IS, 04) (3) TERM
7 ADD CREG, ONE (IS, 01) (3)(S, 04) (IS, 01) (3) ONE
12 MOVEM CREG, TERM (IS, 05) (3)(S, 05) (IS, 05) (3) TERM
15 MOVEM BREG, RESULT (IS, 05) (2)(S, 03) (IS, 05) (2)RESULT
Exercise
Short Questions
Long Questions
Chapter 5
Introduction to Linker and Loader
5.1 Introduction to Linker
The linker is a system program that links the object modules of a program into a single
object file; it performs the process of linking. The linker is also known as a link editor.
Linking is the process of collecting and combining pieces of code and data into a single
file. The linker also links specific modules from the system library. It takes the object
modules produced by the assembler as input and produces an executable file as output
for the loader.
Linking can be carried out at compile time, when the source code is translated into
machine code, as well as at load time, when the program is loaded into memory by the
loader. Linking is performed as the final step in building an application.
Source code -> compiler -> Assembler -> Object code -> Linker -> Executable file -> Loader
Linking can be categorized into two types: static linking and dynamic linking.
5.1.1 Static linking
In static linking, the linker copies all library routines used in the program into the
executable program. Therefore, it requires extra memory space. Since it does not require
the presence of the library on the system when the program is run, it is faster and more
convenient, with fewer chances of failure and less risk of errors.
5.1.2 Dynamic linking
Dynamic linking is performed at run time. This linking is done by placing the name of
a shareable library in the executable image. There is a greater likelihood of errors and
failures, but it requires less memory space because multiple programs can share a single
copy of the library.
Dynamic linking enables code sharing when the same object is used several times in
programs: instead of linking the same object again and again into every program, each
module shares one copy of the object with the other modules that use it. The shared
library needed for the linking is kept in virtual memory to save RAM. With this kind of
linking we can also relocate the code for smooth execution, but not all of the code is
relocatable; the addresses are fixed at run time.
The translator converts the source program into its corresponding object modules,
which are stored in files for future use. At the time of linking, the linker combines
all these object modules and converts them into their respective binary modules. These
binary modules are in ready-to-execute form; they are also stored in files for future use.
At the time of execution, the loader takes these binary modules and loads them to the
correct memory locations, and the required binary program is obtained. The binary
program, in turn, receives input from the user in the form of data, and the result is obtained.
5.3.2 Linking
• An external reference to a symbol alpha can be resolved only if alpha is
declared as a public definition in some object module.
• Using this observation as the basis, program linking can be performed as
follows:
• The linker would process the linking tables (LINKTABs) of all object modules
that are to be linked and copy the information about public definitions found
in the min to a table called the name table (NTAB).
• The external reference to alpha would be resolved simply by searching for
alpha in this table, obtaining its linked address, and copying it into the word
that contains the external reference.
• Accordingly, each entry of the NTAB would contain the following fields:
• Symbol: Symbolic name of an external reference or an object module.
• Linked-address: For a public definition, this field contains the linked address
of the symbol. For an object module, it contains the linked origin of the object
module.
Algorithm: Program Linking
1. program_linked_origin := <link origin> from the linker command.
2. For each object module mentioned in the linker command
(a) t_origin := translated origin of the object module; OM_size := size of the object module;
(b) relocation_factor := program_linked_origin - t_origin;
(c) Read the machine language program contained in the program component of the
object module into the work area.
(d) Read the LINKTAB of the object module.
(e) Enter (object module name, program_linked_origin) in NTAB.
(f) For each LINKTAB entry with type = PD: name := symbol field of the LINKTAB entry;
linked_address := translated_address + relocation_factor; enter (name, linked_address)
in a new entry of the NTAB.
(g) program_linked_origin := program_linked_origin + OM_size;
3. For each object module mentioned in the linker command
(a) t_origin := translated origin of the object module; program_linked_origin :=
linked_address from NTAB;
(b) For each LINKTAB entry with type = EXT
i. address_in_work_area := address of work_area + program_linked_origin -
<link origin> in linker command + translated address - t_origin;
ii. Search the symbol found in the symbol field of the LINKTAB entry in
NTAB and note its linked address. Copy this address into the operand
address field in the word that has the address address_in_work_area.
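The C sketch below isolates the two central calculations of the algorithm: computing an object module's relocation factor, and resolving an external reference by looking up the symbol's linked address in the NTAB. All of the concrete origins and addresses are assumed values for illustration.

    #include <stdio.h>
    #include <string.h>

    struct ntab_entry { char symbol[16]; int linked_address; };

    static struct ntab_entry ntab[20];
    static int ntab_count = 0;

    static void ntab_enter(const char *symbol, int linked_address)
    {
        strncpy(ntab[ntab_count].symbol, symbol, sizeof ntab[0].symbol - 1);
        ntab[ntab_count].symbol[sizeof ntab[0].symbol - 1] = '\0';
        ntab[ntab_count].linked_address = linked_address;
        ntab_count++;
    }

    static int ntab_lookup(const char *symbol)
    {
        for (int i = 0; i < ntab_count; i++)
            if (strcmp(ntab[i].symbol, symbol) == 0)
                return ntab[i].linked_address;
        return -1;
    }

    int main(void)
    {
        int link_origin       = 900;   /* <link origin> from the linker command (assumed) */
        int t_origin          = 500;   /* translated origin of the object module (assumed) */
        int relocation_factor = link_origin - t_origin;              /* 400 */

        /* A public definition alpha translated to address 538 gets linked address 938. */
        ntab_enter("alpha", 538 + relocation_factor);

        /* Resolving an external reference to alpha simply copies its linked address. */
        int resolved = ntab_lookup("alpha");
        printf("relocation factor = %d, alpha linked at %d\n", relocation_factor, resolved);
        return 0;
    }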
• An Intel 8088 object module is a sequence of object records, each object
record describing specific aspects of the programs in the object module.
• There are 14 types of object records containing the following five basic
categories of information:
• Binary image (i.e. code generated by a translator)
• External references
• Public definitions
• Debugging information (e.g. line number in source program).
• Miscellaneous information (e.g. comments in the source program).
Execution of an overlay structured program
• For linking and execution of an overlay structured program in MS-DOS, the
linker produces a single executable file at the output, which contains two
provisions to support overlays.
• First, an overlay manager module is included in the executable file.
• This module is responsible for loading the overlays when needed.
• Second, all calls that cross overlay boundaries are replaced by an interrupt
producing instruction.
• To start with, the overlay manager receives control and loads the root.
• A procedure call that crosses overlay boundaries leads to an interrupt.
• This interrupt is processed by the overlay manager and the appropriate overlay
is loaded into memory.
• When each overlay is structured into a separate binary program, as in IBM's
mainframe systems, a call that crosses overlay boundaries leads to an interrupt
which is attended by the OS kernel.
• Control is now transferred to the OS loader to load the appropriate binary
program.
(Figure: compile-and-go loading; the source program is translated by the "compile & go" compiler/assembler and placed directly in memory, where it is executed immediately.)
Advantages of Compile-and-Go loaders
• The user need not be troubled with the separate steps of compilation,
assembling, linking, loading, and execution.
• Execution speed is generally much superior to that of interpreted systems.
• They are easy and simple to implement.
• A Header record showing the load origin, length, and load time execution start
address of the program.
• A sequence of binary image records containing the program’s code. Each
binary image record contains a part of the program’s code in the form of a
sequence of bytes, the load address of the first byte of this code, and a count of
the number of bytes of code.
• The absolute loader notes the load origin and the length of the program
mentioned in the header record.
• It then enters a loop that reads a binary image record and moves the code
contained in it to the memory area beginning at the address mentioned in
the binary image record.
• At the end, it transfers control to the execution start address of the program.
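A minimal C sketch of the absolute-loader loop, assuming a simplified record format and a simulated memory array; it only demonstrates the copy-to-load-address step and does not actually transfer control.

    #include <stdio.h>
    #include <string.h>

    struct image_record {            /* one binary image record (assumed layout) */
        int           load_address;
        int           count;         /* number of bytes of code */
        unsigned char bytes[8];
    };

    static unsigned char memory[1024];    /* simulated main memory */

    int main(void)
    {
        /* Header record: load origin, length, execution start address (assumed values). */
        int load_origin = 100, length = 6, start_address = 100;

        struct image_record records[] = {
            {100, 4, {0x04, 0x02, 0x00, 0x73}},
            {104, 2, {0x00, 0x00}},
        };

        /* Loader loop: move each record's code to the address named in the record. */
        for (size_t i = 0; i < sizeof records / sizeof records[0]; i++)
            memcpy(&memory[records[i].load_address], records[i].bytes, records[i].count);

        printf("loaded %d bytes at origin %d; would now branch to %d\n",
               length, load_origin, start_address);
        return 0;
    }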
• A Header record showing the load origin, length, and load time execution start
address of the program.
• A sequence of binary image records containing the program's code. Each
binary image record contains a part of the program's code in the form of a
sequence of bytes, the load address of the first byte of this code, and a count of
the number of bytes of code.
• A table analogous to RELOCTAB giving the linked addresses of the
address-sensitive instructions in the program.
In the above program, the address of variable X in the instruction ADD AREG, X
will be 30. If this program is loaded from memory location 500 for execution,
then the address of X in the instruction ADD AREG, X must become 530.

(Figure: relocation example; the instruction ADD AREG, X is at offset 10 and X is at offset 30 in the object program. When the program is loaded at address 500, the instruction is placed at address 510 and the operand address of X becomes 530.)
Exercise
Short Questions
Long Questions
Chapter 6
Macro Processors
6.1 Introduction to Macro
Formally, macro instructions (often called macros) are single-line abbreviations for
groups of instructions. For every occurrence of such a one-line macro instruction
within a program, the instruction is replaced by the entire block.
The advantages of using macro are as follows:
• Simplify and reduce the amount of repetitive coding.
• Reduce the possibility of errors caused by repetitive coding.
• Make an assembly program more readable.
6.2.2 Salient Features of Macro Processor
A macro can be called by writing the name of the macro in the mnemonic field of
the assembly language.
The syntax of a typical macro call can be of the following form:
<name_of_macro> [<actual_parameter_spec> [,…]]
The MACRO directive in the mnemonic field specifies the start of the macro
definition and it should compulsorily have the macro name in the label field. The
MEND directive specifies the end of the macro definition. The statements between
MACRO and MEND directives define the body (model statements) of the macro
and can appear in the expanded code.
Macro Definition Example:
MACRO
INCR    &MEM_VAL, &INC_VAL, &REG
        MOVER   &REG, &MEM_VAL
        ADD     &REG, &INC_VAL
        MOVEM   &REG, &MEM_VAL
MEND
START 100
A DS 1
B DS 1
INCR A, B, AREG
PRINT A
STOP
END
The preceding example uses a statement that calls the macro. The assembly code
sequence INCR A, B, AREG is an example of a macro call, with A, B and AREG being the
actual parameters of the macro. While passing over the assembly program, the
assembler recognizes INCR as the name of a macro and expands the macro, placing a
copy of the macro body with the actual parameters substituted for the formal
parameters. The expanded code is shown below.
START 100
A DS 1
B DS 1
+       MOVER   AREG, A
+       ADD     AREG, B
+       MOVEM   AREG, A
PRINT A
STOP
END
The statements marked with a ‘+’ sign in the preceding label field denote the expanded
code and differentiate them from the original statements of the program.
Macro: A macro name in the mnemonic field leads to an expansion only.
Subroutine: A subroutine name in a call statement in the program leads to execution.

Macro: Macros are completely handled by the assembler during assembly time.
Subroutine: Subroutines are completely handled by the hardware at runtime.

Macro: Macro definition and macro expansion are executed by the assembler. So, the assembler has to know all the features, options, and exceptions associated with them.
Subroutine: The hardware executes the subroutine call instruction. So, it has to know how to save the return address and how to branch to the subroutine.

Macro: The hardware knows nothing about macros.
Subroutine: The assembler knows nothing about subroutines.

Macro: The macro processor generates a new copy of the macro and places it in the program.
Subroutine: The subroutine call instruction is assembled in the usual way and treated by the assembler as any other instruction.

Macro: Macro processing increases the size of the resulting code but results in faster execution of the program for expanded programs.
Subroutine: The use of subroutines does not result in bulky object code but has substantial overheads of control transfer during execution.
6.5 Types of formal parameters
6.5.1 Positional parameters
For positional formal parameters, the specification <parameter kind> of syntax rule is
simply omitted. Thus, a positional formal parameter is written as &<parameter name>,
e.g., &SAMPLE where SAMPLE is the name of a parameter. In a call on a macro using
positional parameters (see syntax rule (4.2)), the <actual parameter specification> is an
ordinary string.
The value of a positional formal parameter XYZ is determined by the rule of positional
association as follows:
• Find the ordinal position of XYZ in the list of formal parameters in the macro
prototype statement.
• Find the actual parameter specification that occupies the same ordinal position in
the list of actual parameters in the macro call statement. If it is an ordinary string
ABC, the value of the formal parameter XYZ would be ABC.
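The rule of positional association can be sketched in C as shown below, using the formal parameters of the INCR macro and the actual parameters of the call INCR A, B, AREG from the earlier example; the lookup function itself is an illustrative assumption.

    #include <stdio.h>
    #include <string.h>

    /* Formal parameters from the macro prototype, in order. */
    static const char *formals[] = { "&MEM_VAL", "&INC_VAL", "&REG" };

    /* Actual parameters from the macro call INCR A, B, AREG, in order. */
    static const char *actuals[] = { "A", "B", "AREG" };

    /* Value of a positional formal parameter = actual in the same ordinal position. */
    static const char *value_of(const char *formal)
    {
        for (size_t i = 0; i < sizeof formals / sizeof formals[0]; i++)
            if (strcmp(formals[i], formal) == 0)
                return actuals[i];
        return NULL;
    }

    int main(void)
    {
        printf("&MEM_VAL -> %s\n", value_of("&MEM_VAL"));   /* A    */
        printf("&REG     -> %s\n", value_of("&REG"));       /* AREG */
        return 0;
    }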
AGO
An AGO statement has the syntax: AGO <sequencing symbol>
It unconditionally transfers expansion time control to the statement containing
<sequencing symbol> in its label field.
Example
MACRO
MEND
The LCL statement declares M to be a local EV. At the start of the expansion of the call,
M is initialized to zero. The expansion of the model statement MOVEM AREG, &X+&M
thus leads to the generation of the statement MOVEM AREG, B. The value of M is
incremented by 1, and the MOVEM model statement is expanded repeatedly until the
value of M equals the value of N.
Example
MACRO
CONSTANTS
     LCL   &A
&A   SET   1
     DB    &A
&A   SET   &A+1
     DB    &A
MEND
The local EV A is created. The first SET statement assigns the value '1' to it. The first
DB statement thus declares a byte constant ‘1’. The second SET statement assigns the
value '2' to A and the second DB statement declares a constant '2'.
The general design of a macro preprocessor is shown below.

(Figure: a macro preprocessor takes a program containing macro definitions and macro calls, performs macro expansion, and passes the expanded program to the assembler.)
The design of a macro preprocessor is influenced by the provisions for performing the
following tasks involved in macro expansion:
• Recognize macro calls: A table is maintained to store names of all macros
defined in a program. Such a table is called Macro Name Table (MNT) in which
an entry is made for every macro definition being processed. During processing
program statements, a match is done to compare strings in the mnemonic field
with entries in the MNT. A successful match in the MNT indicates that the
statement is a macro call.
• Determine the values of formal parameters: A table called Actual Parameter
Table (APT) holds the values of formal parameters during the expansion of a
macro call. The entry into this table will be in pair of the form (<formal
parameter name>, <value>). A table called Parameter Default Table (PDT)
contains information about default parameters stored as pairs of the form
(<formal parameter name>, <default value>) for each macro defined in the
program. If the programmer does not specify the value for any or some
parameters, its corresponding default value is copied from PDT to APT.
• Maintain the values of expansion time variables declared in a macro: A table
called Expansion time Variable Table (EVT) maintains information about
expansion variables in the form (<EV name>, <value>). It is used when a
preprocessor statement or a model statement during expansion refers to an EV.
• Organize expansion time control flow: A table called the Macro Definition Table
(MDT) is used to store the body of a macro. The flow of control determines when
a model statement from the MDT is to be visited for expansion during macro
expansion. An MEC (Macro Expansion Counter) is defined and initialized to the
first statement of the macro body in the MDT. The MEC is updated after each
model statement is expanded by the macro preprocessor.
• Determine the values of sequencing symbols: A table called Sequencing Symbols
Table (SST) maintains information about sequencing symbols in pairs of the form
(<sequencing symbol name>, <MDT entry #>)
Where <MDT entry #> denotes the index of the MDT entry containing the model
statement with the sequencing symbol. Entries are made on encountering a
statement with the sequencing symbol in their label field or on reading a
reference before its definition.
• Perform expansion of a model statement: The expansion task has the following
steps:
• MEC points to the entry in the MDT table with the model statements.
• APT and EVT provide the values of the formal parameters and EVs,
respectively.
• SST enables identifying the model statement and defining sequencing.
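As a rough picture of how these tables cooperate, the C sketch below keeps a tiny MNT and MDT for the INCR macro, binds the actual parameters of a call into an APT, and expands the stored model statements with the parameters substituted. The encoding of formal parameters as &0, &1, &2 and the table formats are assumptions made for this illustration.

    #include <stdio.h>
    #include <string.h>

    /* MDT: model statements of the macro body, formal params written as &0, &1, &2. */
    static const char *mdt[] = {
        "MOVER &2, &0",
        "ADD   &2, &1",
        "MOVEM &2, &0",
    };

    /* MNT: one entry mapping a macro name to its body in the MDT. */
    static const char *mnt_name  = "INCR";
    static const int   mnt_start = 0, mnt_len = 3;

    /* Expand one macro call: the APT holds the actual parameters, by position. */
    static void expand(const char *name, const char *apt[], int nparams)
    {
        if (strcmp(name, mnt_name) != 0)       /* search the MNT for the mnemonic */
            return;
        for (int s = mnt_start; s < mnt_start + mnt_len; s++) {
            const char *p = mdt[s];
            printf("+ ");                      /* mark expanded statements with '+' */
            for (; *p; p++) {
                if (*p == '&' && p[1] >= '0' && p[1] < '0' + nparams) {
                    fputs(apt[p[1] - '0'], stdout);   /* substitute actual parameter */
                    p++;
                } else {
                    putchar(*p);
                }
            }
            putchar('\n');
        }
    }

    int main(void)
    {
        const char *apt[] = { "A", "B", "AREG" };   /* from the call INCR A, B, AREG */
        expand("INCR", apt, 3);
        /* prints:
           + MOVER AREG, A
           + ADD   AREG, B
           + MOVEM AREG, A   */
        return 0;
    }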
• Continually reads statements and writes them to the MDT until the MEND
directive is found.
When a MEND directive is encountered in the source file, the assembler reverts to
the normal mode. If the MEND directive is missing, the assembler will stay in the
macro definition mode and continue to save program statements in the MDT until
an obvious error occurs. This will happen in cases such as reading another MACRO
directive or an END statement, where the assembler will generate an error
(runaway definition) and abort the assembly process.
definition.
• Numeric values of arguments: Although most macro processors normally treat
arguments as strings, some assemblers, like the VAX assembler, optionally allow
using the value, rather than the name, of the argument.
• Comments in macros: Comments are printed with macro definition, but they
might or might not be with each expansion. Some comments are meant only for
definitions, while some are expected in the expanded code.
6.10.1 Design Features of Macro Processor
The one-pass macro processor scheme is presented below.

(Figure: one-pass macro processor, operating as pass 0 of the assembler; GETLINE reads the next statement and PROCESSLINE examines it, a macro definition being handled by DEFINE and a macro call by EXPAND.)
The activities of the pass-0 macro processor are given in the following steps:
1. Read and examine the next source statement.
2. If it is a MACRO statement, continue reading the source and copy the entire macro
definition to the MDT. Go to Step 1.
3. If the statement is a pass-0 directive, execute it. Go to Step 1. (These directives
are written to the new source file in a unique manner, different from normal
directives; they are only needed for the listing in pass 2.)
4. If the statement contains a macro name, it must perform expansion, that is, read
model statements from the MDT corresponding to the call, substitute
parameters, and write each statement to the new source file (or execute it if it is a
pass-0 directive). Go to Step 1.
5. For any other statement, write the statement to the new source file. Go to Step 1.
6. If the current statement contains the END directive, stop (end of pass 0).
• In the macro expansion mode, the assembler will read statements from the MDT,
substitute parameters, and write them to the new source file. Nested macros can be
implemented using the Definition and Expansion (DE) mode.
Exercise
Short Questions
Long Questions
Chapter 7
Introduction to Compilers
7.1 Introduction to Compiler
A compiler is commonly known as a translator that transforms a high-level
language into machine-oriented language. A high-level language program is written by a
developer, and machine language is what the processor can understand. The compiler
also reports errors to the programmer. The main purpose of the compiler is to
translate code written in one language into another language without changing the
meaning of the program. When you execute a program written in a high-level
programming language, this happens in two parts. In the first part, the source program
is compiled and translated into an object program (low-level language). In the
second part, the object program is translated into the target program through the
assembler.
Memory allocation is mainly divided into two types:
1. Static binding
2. Dynamic binding
Delimiters mark the beginning and the end of a block. Blocks can be nested; for
example, block B2 can be defined completely within block B1. A block-structured
language uses dynamic memory allocation. Finding the scope of a variable means
determining its visibility within the blocks.
Following are the rules used to determine the scope of the variable:
1. A variable x declared in block B1 can be accessed by any statement situated in
block B1.
2. The variable x can also be accessed by any statement in block B2 if block B2 is
situated within block B1.
Procedure A
{
    int x, y, z;
    Procedure B
    {
        int a, b;
    }
    Procedure C
    {
        int m, n;
    }
}
Variables x, y, and z are local to procedure A but non-local to blocks B and C,
because they are not defined within B and C yet remain accessible inside those
blocks (a small sketch of this block scoping follows). Dynamic allocation, which is
automatic, is implemented using the extended stack model. Each record within the stack
has two reserved pointers rather than one. Each stack record contains the variables
for one activation of a block and is called an activation record (AR).
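A minimal C sketch of block-structured scoping, using nested blocks rather than nested procedures (which C does not support): names declared in an outer block remain visible in inner blocks unless re-declared there.

#include <stdio.h>

int main(void)
{
    int x = 1, y = 2, z = 3;        /* local to the outer block (like A)   */
    {                               /* inner block (like B)                */
        int a = x + y;              /* x and y are non-local but visible   */
        printf("a = %d\n", a);
    }
    {                               /* inner block (like C)                */
        int m = z * 2;              /* z is non-local but visible          */
        printf("m = %d\n", m);
    }
    return 0;
}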
7.3.4 Dynamic Pointer
The first reserved pointer in a block's AR points to the activation record of its dynamic
parent. It is referred to as the dynamic pointer and is located at offset 0 in the AR,
i.e., at address 0(ARB). The dynamic pointer plays a vital role in de-allocating an AR.
Activation Record
The activation record is a block of memory used for managing the information
needed by a single execution of a procedure.
[Figure: Layout of an activation record, with fields for the return value, actual parameters, control link, access link, saved machine registers, local variables, and temporaries.]
1. Temporary variables: Such types of variables are needed during the evaluation
of expressions. These types of variables are stored in the temporary field of the
activation record.
2. Local variables: Data that is local to the execution of the procedure is stored in
this field of the activation record.
3. Saved machine registers: This field holds information about the status of the
machine just before the procedure is called, including the contents of the
registers and the program counter.
4. Control link: It’s an optional field which points to the activation record of the
calling procedure. This link is alternatively known as a dynamic link.
5. Access link: This field is also optional. It refers to the non-local data in another
activation record. This field is also called a static link field.
6. Actual parameters: This field holds information about the actual parameters.
These actual parameters are passed to the called procedure.
7. Return values: This field is used to store the result of a function call. (A struct sketch of these fields follows.)
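The following struct is a minimal sketch, with illustrative field names and sizes, of the activation record fields listed above. A real compiler lays these fields out at fixed offsets within a stack frame rather than declaring a heap structure.

#define MAX_ARGS   4
#define MAX_LOCALS 8
#define MAX_TEMPS  8

struct activation_record {
    long return_value;                       /* result of the function call      */
    long actual_params[MAX_ARGS];            /* arguments passed to the callee   */
    struct activation_record *control_link;  /* dynamic link: caller's AR        */
    struct activation_record *access_link;   /* static link: enclosing scope     */
    void *saved_registers;                   /* saved machine status / registers */
    long locals[MAX_LOCALS];                 /* local data of the procedure      */
    long temporaries[MAX_TEMPS];             /* temporaries used in expressions  */
};

The control_link and access_link members correspond to the dynamic and static links described above.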
Example: a * b
Operand descriptors for a * b:
Attribute    Addressability
(int, 1)     Address(a)
(int, 1)     Address(b)
(int, 1)     Address(AREG)
1) Postfix notation
• Postfix notation is a linearized representation of a syntax tree.
• It is a list of the nodes of the tree in which a node appears immediately after its
children.
• The postfix notation of x = -a*b + -a*b is: x a - b * a - b * + =   (where - denotes unary minus).
Quadruples for x = -a*b + -a*b:
t1 := -a        (0)  uminus  a        t1
t2 := t1 * b    (1)  *       t1   b   t2
t3 := -a        (2)  uminus  a        t3
t4 := t3 * b    (3)  *       t3   b   t4
t5 := t2 + t4   (4)  +       t2   t4  t5
x  := t5        (5)  :=      t5       x
(A small C sketch of the quadruple representation follows.)
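A minimal C sketch of the quadruple representation above, with each quadruple stored as a record of operator, arguments and result; the field names are illustrative.

#include <stdio.h>

struct quad {
    const char *op;      /* operator                      */
    const char *arg1;    /* first operand                 */
    const char *arg2;    /* second operand (may be empty) */
    const char *result;  /* result location               */
};

static const struct quad quads[] = {
    { "uminus", "a",  "",   "t1" },
    { "*",      "t1", "b",  "t2" },
    { "uminus", "a",  "",   "t3" },
    { "*",      "t3", "b",  "t4" },
    { "+",      "t2", "t4", "t5" },
    { ":=",     "t5", "",   "x"  },
};

int main(void)
{
    for (size_t i = 0; i < sizeof quads / sizeof quads[0]; i++)
        printf("(%zu) %-6s %-3s %-3s %s\n",
               i, quads[i].op, quads[i].arg1, quads[i].arg2, quads[i].result);
    return 0;
}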
7.5.3 Indirect Triples Representation
In the indirect triples representation, a listing of the triples is maintained, and the
statements in this listing are pointers to triples rather than the triples themselves.
b) Common Sub Expression Elimination
A common sub-expression is an expression that appears repeatedly in the program
and whose value has already been computed. If the operands of this sub-expression
are not changed in between, the previously computed result is used instead of
recomputing it each time.
Example:
t1 := 4 * i
t2 := a[t1]
t3 := 4 * j
t4 := 4 * i
t5 := n
t6 := b[t4] + t5
The above code can be optimized using common sub expression elimination
t1 := 4 * i
t2 := a[t1]
t3 := 4 * j
t5 := n
t6 := b[t1] + t5
The common sub-expression t4 := 4 * i is eliminated, as its value is already available
in t1 and the value of i has not changed between its definition and its use.
d) Strength Reduction
The strength (cost) of certain operators is higher than that of others. For instance,
the strength of * is higher than that of +. In this technique, higher-strength
operators are replaced by lower-strength operators.
Example:
for(i=1; i<=50; i++)
{
    count = i * 7;
}
Here we get the count values 7, 14, 21, and so on, for each i up to 50.
This code can be replaced using strength reduction as follows:
temp = 7;
for(i=1; i<=50; i++)
{
    count = temp;
    temp = temp + 7;
}
Dead Code Elimination
Example:
i = 0;
if(i == 1)
{
    a = x + 5;
}
The 'if' statement is dead code, since its condition can never be satisfied; hence the
statement can be eliminated and the code optimized.
Compiler: Generates a target (output) program, which may then be run separately
from the source program written in the source language.
Interpreter: Does not generate any output program; rather, it evaluates the source
program each time it is to be executed.
• If a 400-statement program is executed on test data with only 80 statements
being visited during the test run, the total CPU time for compilation followed by
execution of the program is 400 * tc + 80 * te, while the total CPU time for
interpretation of the program is 80 * ti, which is roughly 80 * tc since ti is of the
same order as tc. This shows that interpretation is cheaper in such cases.
• However, if many more statements are executed (more than about 400 statement
executions in this example), compilation followed by execution becomes cheaper; in
other words, the interpreter is advantageous only up to roughly 400 statement
executions. This indicates that, in terms of CPU time, interpreters are a better choice
at least for the program development environment. A small sketch of this break-even
arithmetic follows.
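A small sketch of the break-even arithmetic above; the time values tc, te and ti are illustrative assumptions (arbitrary units), with ti taken equal to tc.

#include <stdio.h>

int main(void)
{
    double tc = 1.0, te = 0.05, ti = 1.0;    /* assumed per-statement times   */
    int program_size = 400;                  /* statements in the program     */
    int executed     = 80;                   /* statement executions on test  */

    double compile_and_run = program_size * tc + executed * te;
    double interpret       = executed * ti;

    printf("compile + execute: %.1f\n", compile_and_run);   /* 400 + 4 = 404 */
    printf("interpret        : %.1f\n", interpret);         /* 80            */
    return 0;
}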
Exercise
Short Questions
1. What is a compiler?
2. Explain operand and register descriptors.
3. Write about the intermediate code for an expression.
4. What is the quadruple representation?
5. Explain the indirect triples representation.
6. Compare compilers and interpreters.
7. Write about the benefits of interpretation.
Long Questions
Chapter 8
Programming Language Grammars
8.1 Programming Language
8.2 Programming Language Grammars
8.3 Classification of Grammar
8.4 Operator Grammars
8.4.1 Ambiguous Grammar
8.4.2 Scanning
8.5 Parsing
8.5.1 Bottom-Up Parsing
8.5.2 Shift Reduce Parsing
8.5.3 Operator Precedence Parsing
8.6 Operator Precedence Parsing Algorithm using Stack
8.7 Language Processor Development Tools
Exercise
8.1 Programming Language
A programming language is defined by its syntax and its semantics. The syntax gives us
the structure of programs, and the semantics gives us the meaning of every
construct of the programming language. The semantics of a programming language can
be described in many different ways, but there is essentially one standard formalism for
describing its syntax: context-free grammars.
Overview of Grammar
A program is normally represented as a linear sequence of ASCII characters, which a
grammar allows us to transform into a syntax tree; such a tree exists only for
syntactically valid programs. This tree is the main data structure that a compiler or
interpreter works with. By traversing the tree the compiler can, for instance,
type-check the program or produce machine code, and by traversing the very same
tree an interpreter can simulate the execution of the program.
Example
<Noun Phrase> → <Article><Noun>
<Article> → a | an | the
<Noun> → boy | apple
A derivation of "the boy":
<Noun Phrase> => <Article> <Noun> => the <Noun> => the boy
Parse trees: The tree representation of the sequence of derivations that produces a
string from the distinguished (start) symbol is termed a parse tree.
Example:
        <Noun Phrase>
       /             \
  <Article>        <Noun>
      |                |
     the              boy
8.3 Classification of Grammar
Type-0 grammars
These grammars are known as phrase structure grammars. Their productions are
of the form
α→β
Where both α and β can be strings of the terminal and nonterminal symbols. Such
productions permit the arbitrary substitution of strings during derivation or
reduction, hence they are not relevant to the specification of programming
languages.
Type-1 grammars
Productions of Type-1 grammars specify that derivation or reduction of strings
can take place only in specific contexts. Hence these grammars are also known as
context-sensitive grammars. A production of a Type-1 grammar has the form
αAβ → απβ
Here, a string π can be replaced by 'A' (or vice versa) only when it is enclosed by
the strings α and β in a sentential form. These grammars are also not relevant for
programming language specification since recognition of programming language
constructs is not context-sensitive.
Type-2 grammars
These grammars do not impose any context requirements on derivations or
reductions. A typical Type-2 production is of the form
A→π
This can be applied independently of its context. These grammars are therefore
known as context-free grammars (CFG). CFGs are ideally suited for programming
language specifications.
Type-3 grammars
Type-3 grammars are characterized by productions of the form
A → tB|t or A → Bt|t
Note that these productions also satisfy the requirements of type-2 grammars.
The specific form of the RHS alternatives, namely a single nonterminal symbol
or a string containing a single terminal symbol and a single nonterminal symbol,
gives some practical advantages in scanning. However, the nature of the
productions restricts the expressive power of these grammars, e.g., nesting of
constructs or matching of parentheses cannot be specified using such
productions. Hence the use of Type-3 productions is restricted to the
specification of lexical units, e.g., identifiers, constants, labels, etc. Type-3
grammars are also known as linear grammars or regular grammars.
Example
<exp> → <exp> + <exp> | <exp> * <exp> | <id>
Two parse trees can be built for the source string a + b * c according to this
grammar: one in which a + b is first reduced to <exp>, and another in which b * c is
first reduced to <exp>. Such a grammar is ambiguous.
8.5 Parsing
LL(1) parsing is a top-down, non-recursive parsing method. In the name LL(1), the
first L indicates that the input is scanned from left to right, the second L indicates
that a leftmost derivation is produced for the input string, and the 1 indicates that
one symbol of lookahead is used to predict the parsing action. An LL(1) parser uses
an input buffer, a stack, and a parsing table as its data structures.
The parsing action is determined by two symbols: the symbol on top of the stack and
the current input symbol. The parser consults the LL(1) parsing table each time it
takes a parsing action; hence this method is also called table-driven parsing. The
input is successfully parsed if the parser reaches the halting configuration: when the
stack is empty and the next token is $, the parse has succeeded.
The steps for constructing a predictive (LL(1)) parser are:
1. Eliminate left recursion from the grammar.
2. Compute the FIRST and FOLLOW sets.
3. Construct a predictive parsing table.
4. Parse the input string with the help of the parsing table.
Example:
Original grammar:
E → E + T | T
T → T * F | F
F → (E) | id
After eliminating left recursion:
E → T E'
E' → + T E' | ϵ
T → F T'
T' → * F T' | ϵ
F → (E) | id
        FIRST          FOLLOW
E       { (, id }      { $, ) }
E'      { +, ϵ }       { $, ) }
T       { (, id }      { +, $, ) }
T'      { *, ϵ }       { +, $, ) }
F       { (, id }      { *, +, $, ) }
        id          +            *            (           )          $
E       E → TE'                               E → TE'
E'                  E' → +TE'                             E' → ϵ     E' → ϵ
T       T → FT'                               T → FT'
T'                  T' → ϵ       T' → *FT'                T' → ϵ     T' → ϵ
F       F → id                                F → (E)
(A sketch of the table-driven parsing loop using this table follows.)
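A minimal C sketch of a table-driven LL(1) parser for this grammar. Nonterminals are encoded as single characters (E, e for E', T, t for T', F) and the token id as 'i'; the table() function encodes the parsing table above, with an empty string standing for an ϵ-production. This illustrates only the driver loop, not error recovery.

#include <stdio.h>
#include <string.h>

/* Return the RHS chosen by the parsing table for (nonterminal, lookahead),
 * or NULL if the table entry is empty (a syntax error). */
static const char *table(char nt, char look)
{
    switch (nt) {
    case 'E': return (look == 'i' || look == '(') ? "Te" : NULL;
    case 'e': return look == '+' ? "+Te"
                   : (look == ')' || look == '$') ? "" : NULL;
    case 'T': return (look == 'i' || look == '(') ? "Ft" : NULL;
    case 't': return look == '*' ? "*Ft"
                   : (look == '+' || look == ')' || look == '$') ? "" : NULL;
    case 'F': return look == 'i' ? "i" : (look == '(') ? "(E)" : NULL;
    default:  return NULL;
    }
}

static int parse(const char *input)            /* input ends with '$' */
{
    char stack[64] = "$E";                     /* bottom marker, then start symbol */
    int top = 1;                               /* index of the stack top           */
    const char *ip = input;

    for (;;) {
        char X = stack[top], a = *ip;
        if (X == '$' && a == '$') return 1;    /* halting configuration: accept    */
        if (X == a) { top--; ip++; continue; } /* terminal on top: match and pop   */
        const char *rhs = table(X, a);
        if (rhs == NULL) return 0;             /* empty table entry: reject        */
        top--;                                 /* pop the nonterminal              */
        for (int i = (int)strlen(rhs) - 1; i >= 0; i--)
            stack[++top] = rhs[i];             /* push the RHS in reverse order    */
    }
}

int main(void)
{
    printf("%d\n", parse("i+i*i$"));           /* prints 1: accepted */
    printf("%d\n", parse("i+*i$"));            /* prints 0: rejected */
    return 0;
}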
8.5.1 Bottom-Up Parsing
Shift-reduce parsing of the string id + id * id (for the grammar E → E + T | T, T → T * F | F, F → id):
Stack        Input buffer    Action
$ id+id*id$ Shift
$id +id*id$ Reduce F->id
$F +id*id$ Reduce T->F
$T +id*id$ Reduce E->T
$E +id*id$ Shift
$E+ id*id$ shift
$E+ id *id$ Reduce F->id
$E+F *id$ Reduce T->F
$E+T *id$ Shift
$E+T* id$ Shift
$E+T*id $ Reduce F->id
$E+T*F $ Reduce T->T*F
$E+T $ Reduce E->E+T
$E $ Accept
Relation Meaning
a <. b     a “yields precedence to” b
a =. b     a “has the same precedence as” b
a .> b     a “takes precedence over” b
Table 8.5.3 Precedence relations between terminals a and b
Example:
E → E + T | T
T → T * F | F
F → id
1. a <. b: for an operator Op followed by a nonterminal NT in a production, Op <. Leading(NT):
   + T  gives  + <. { *, id }
   * F  gives  * <. { id }
2. a .> b: for a nonterminal NT followed by an operator Op in a production, Trailing(NT) .> Op.
3. $ <. { +, *, id }
4. { +, *, id } .> $
Step 4: Parsing of the string <id> + <id> * <id> using the
precedence table. We follow the steps below to parse the given string:
$ <.id .> + <.id .> * <.id .> $ Handle id is obtained between <..>
Reduce this by F->id
$ F+ <.id .> * <.id .> $ Handle id is obtained between <..>
Reduce this by F->id
$ F + F * <.id .> $ Handle id is obtained between <..>
Reduce this by F->id
$ F + F * F $      Perform the appropriate reductions of all non-terminals.
$ E + T * F $      Remove all non-terminals.
$ + * $            Place relations between the operators.
$ <. + <. * .> $   The * operator is surrounded by <. and .>; this indicates that
                   * becomes the handle, so we reduce T*F.
$ <. + .> $        + becomes the handle; hence reduce E+T.
$ $                Parsing done.
[Figure: Stack configurations (a) through (e) during operator precedence parsing of a + b * c, showing the current operator, the stack contents (SB and TOS entries), and the operand and operator entries |- a, + b, * c as they are shifted and reduced.]
7. If TOS operator =. current operator, then
   a. if the TOS operator is |-, exit successfully;
   b. if the TOS operator is '(', then
      temp := TOS.operand_pointer;
      pop an entry off the stack;
      TOS.operand_pointer := temp;
      go to Step 3.
8. If no precedence relation is defined between the TOS operator and the current
operator, then report an error and exit unsuccessfully.
A minimal sketch of such a precedence-driven parser is shown below.
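The following C sketch parses strings over id (written 'i'), +, * and the end marker $, using the relations derived above together with the assumed left-associativity relations (+ .> +, * .> +, * .> *, and id .> every operator). It checks only the precedence relations and pops a single terminal at each reduction; it does not verify handles against the productions, build a tree, or handle parentheses, so it illustrates the control flow rather than a complete operator precedence parser.

#include <stdio.h>
#include <string.h>

/* Precedence relation between stack terminal a and lookahead b:
 * '<' = yields precedence, '>' = takes precedence, 0 = no relation (error). */
static char rel(char a, char b)
{
    static const char *terms = "i+*$";
    static const char table[4][4] = {
        /*        i    +    *    $   */
        /* i */ { 0,  '>', '>', '>' },
        /* + */ {'<', '>', '<', '>' },
        /* * */ {'<', '>', '>', '>' },
        /* $ */ {'<', '<', '<',  0  },
    };
    return table[strchr(terms, a) - terms][strchr(terms, b) - terms];
}

static int parse(const char *input)            /* input ends with '$' */
{
    char stack[64]; int top = 0;
    stack[0] = '$';
    const char *ip = input;
    for (;;) {
        char a = stack[top], b = *ip;
        if (a == '$' && b == '$') return 1;    /* accept                        */
        char r = rel(a, b);
        if (r == '<') { stack[++top] = b; ip++; }  /* shift the lookahead       */
        else if (r == '>') { top--; }          /* reduce: pop the topmost terminal
                                                  (handle checking omitted)     */
        else return 0;                         /* no relation: error            */
    }
}

int main(void)
{
    printf("%d\n", parse("i+i*i$"));           /* 1: accepted                   */
    printf("%d\n", parse("ii$"));              /* 0: rejected (id id invalid)   */
    return 0;
}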
LEX
• The input to LEX consists of two components.
• The first component is a specification of strings that represents the
lexical units in L.
• This specification is in the form of regular expressions.
• The second component is a specification of semantic actions that are
aimed at building the intermediate representation.
• The intermediated representation produced by a scanner would
consist of a set of tables of lexical units and a sequence of tokens for
the lexical units occurring in a source statement.
• The scanner generated by LEX would be invoked by a parser
whenever the parser needs the next token.
• Accordingly, each semantic action would perform some table-building
actions and return a single token. (A sketch of such a specification follows.)
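A minimal sketch of what such a LEX specification might look like (not taken from the text). The token codes ID and NUM and the table-building helper enter_symbol() are hypothetical; in practice they would be shared with the parser, for example through a YACC-generated header.

%{
#define ID  257                      /* assumed token codes            */
#define NUM 258
int enter_symbol(const char *name);  /* hypothetical table-building action, defined elsewhere */
%}

letter   [A-Za-z]
digit    [0-9]

%%
{letter}({letter}|{digit})*   { enter_symbol(yytext); return ID; }
{digit}+                      { return NUM; }
[ \t\n]+                      { /* skip white space */ }
.                             { return yytext[0]; }   /* operators, etc. */
%%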
YACC
• Each translation rule input to YACC has a string specification that
resembles a production of a grammar: it has a nonterminal on the
LHS and one or more alternatives on the RHS.
• For simplicity, we will refer to a string specification as a production.
YACC generates an LALR (1) parser for language L from the
productions, which is a bottom-up parser.
• The parser would operate as follows: For a shift action, it would
invoke the scanner to obtain the next token and continue the parse by
using that token. While performing a reduce action according to a
production, it would perform the semantic action associated with
that production.
• The semantic actions associated with productions achieve the
building of an intermediate representation or target code as follows:
Every nonterminal symbol in the parser has an attribute. The
semantic action associated with production can access attributes of
nonterminal symbols used in that production—a symbol '$n' in the
semantic action, where n is an integer, designates the attribute of the nth
nonterminal symbol in the RHS of the production and the symbol '$$'
designates the attribute of the LHS nonterminal symbol of the production.
The semantic action uses the values of these attributes for building the
intermediate representation or target code. The attribute type can be
declared in the specification input to YACC. A sketch of such a specification is shown below.
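A minimal sketch of a YACC specification (not from the text) showing how '$$' and '$1'/'$3' access the attributes of the LHS and RHS symbols of a production. The token NUM, the integer attributes, and the use of %left to resolve the ambiguity of the grammar are assumptions made for the illustration; the scanner would supply the attribute of NUM through yylval.

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
%}

%token NUM
%left '+'
%left '*'

%%
line : expr '\n'        { printf("value = %d\n", $1); }
     ;
expr : expr '+' expr    { $$ = $1 + $3; }   /* LHS attribute built from RHS attributes */
     | expr '*' expr    { $$ = $1 * $3; }
     | NUM              { $$ = $1; }
     ;
%%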
Exercise
Short Questions
Long Questions
Chapter 9
Systems Development
9.1 Introduction Systems Development
The process of defining, designing, testing, and implementing a new program or
software application is called systems development. It includes the internal
development of customized systems, the creation of database systems, and the
acquisition of software developed by third parties. Written standards and procedures
must guide all information systems processing functions. A system development life
cycle methodology governs the process of acquiring, developing, implementing, and
maintaining computerized information systems and the related technology.
The class loader locates the desired class file and passes it to the Java bytecode verifier.
[Figure 9.2 Java Language Environments: the Java compiler translates a Java source program into Java bytecode; the class loader and Java bytecode verifier pass the bytecode to the Java virtual machine, which either interprets it, uses a just-in-time compiler for mixed-mode execution, or relies on a Java native code compiler to produce a machine language program.]
When the just-in-time compiler has compiled some parts of the program, those parts of
the Java source program have been converted into machine language, while the
remainder of the program still exists in bytecode form. Hence the Java virtual machine uses
a mixed-mode execution approach. The other compilation option uses the Java
native code compiler shown in the lower part of Figure 9.2. It simply compiles the
complete Java program into the machine language of a computer. This scheme provides
fast execution of the Java program; however, it cannot provide any of the benefits of
interpretation or just-in-time compilation.
9.3 Types of Errors
9.3.1 Syntax Error
• Syntax errors occur because the syntax of the programming language is not
followed.
• The errors in token formation, missing operators, unbalanced parenthesis,
etc., constitute syntax errors.
• These are generally programmer induced due to mistakes and negligence
while writing a program.
• Syntax errors are detected early during the compilation process and prevent
the compiler from proceeding to code generation.
• Let us see the syntax errors with Java language in the following examples.
Example 1: Missing punctuation (semicolon)
int age = 50    // note: the semicolon is missing here
return a + b ;
} // this method returns an incorrect value with respect to the specification, which
requires the two integers to be multiplied
Example:
Non-terminating loops
String str = br.readLine();
while (str != null)
{
System.out.println(str);
} // the loop in the code did not terminate
Logical errors may cause undesirable effects and program behaviors. Sometimes,
these errors remain undetected unless the results are analyzed carefully.
A debugging system, using the information produced by the translator, generates a
table that stores information about the variables used in a program and their addresses.
Breakpoints: Breakpoints specify the position within a program until which the
program gets executed without disturbance. Once the control reaches such a
position, it allows the user to verify the contents of the variables declared in the
program.
Tracing: Tracing monitors step by step the execution of all executable statements
present in a program. The other name for this process is "step into". Another
possible variation is "step over" debugging that can be executed at the level of
procedure or function. This can be implemented by adding a breakpoint at the
last executable statement in a program.
Traceback: This gives a user the chance to trace back over the functions, and the
traceback utility uses a stack data structure. Traceback utility should show the
path by which the current statement in the program was reached.
Multilingual capability: The debugging system must also consider the language
in which the debugging is done. Generally, different programming languages
involve different user environments and applications systems.
Exercise
Short Questions
Long Questions
Miscellaneous Problems
1. (0+1)*010(0+1)*
ϵ- closure (0) = {0, 1, 2, 4, 7} A
Move (A, 0) = {3, 8}
ϵ- closure (Move (A, 0)) = {3, 6, 7, 1, 2,4,8} ----- B
Move (A, 1) = {5}
ϵ- closure (Move (A, 1)) = {5, 6, 7, 1,2,4} ------ C
[Figure: NFA for (0+1)*010(0+1)* obtained by Thompson's construction, with states 0-17 and ϵ-transitions.]
Move (E, 0) = {3, 8, 13}
ϵ- closure (Move (E, 0)) = {1, 2, 3, 4, 6, 7, 8, 13, 16, 17, 11,12,14} ----- F
Move (E, 1) = {5, 9, 15}
ϵ- closure (Move (E, 1)) = {1, 2, 4, 5, 6, 7, 9, 15, 16, 17, 11,12,14} ----- G
Transition Table (a small simulation of the resulting DFA follows the table):
States 0 1
A B C
B B D
C B C
D E C
E F G
F F G
G H I
H F I
I F I
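A minimal C sketch that simulates the DFA of this transition table. The transition entries are copied from the table above; the set of accepting states (E through I, i.e., the states reached once the substring 010 has been seen) is an inference from the subset construction and is not stated explicitly in the text.

#include <stdio.h>

int main(void)
{
    /* States A..I mapped to indices 0..8; next[state][symbol].              */
    const int next[9][2] = {
        /* A */ {1, 2},  /* B */ {1, 3},  /* C */ {1, 2},
        /* D */ {4, 2},  /* E */ {5, 6},  /* F */ {5, 6},
        /* G */ {7, 8},  /* H */ {5, 8},  /* I */ {5, 8},
    };
    const int accepting[9] = {0, 0, 0, 0, 1, 1, 1, 1, 1};  /* assumed: E..I  */

    const char *inputs[] = { "0101", "0011", "010" };
    for (int k = 0; k < 3; k++) {
        int s = 0;                               /* start in state A          */
        for (const char *p = inputs[k]; *p; p++)
            s = next[s][*p - '0'];               /* follow one transition     */
        printf("%s -> %s\n", inputs[k], accepting[s] ? "accepted" : "rejected");
    }
    return 0;        /* prints: 0101 accepted, 0011 rejected, 010 accepted   */
}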
DFA:
[Figure: DFA for (0+1)*010(0+1)* with states A-I, constructed from the transition table above; A is the start state.]
2. (010+00)*(10)*
[Figure: NFA for (010+00)*(10)* obtained by Thompson's construction, with states 0-14 and ϵ-transitions.]
ϵ- closure (Move (B, 0)) = {8, 9, 10, 11, 14, 1,2,6} -------- D
Move (B, 1) = {4}
ϵ- closure (Move (B, 1))={4} -------- E
States 0 1
A B C
B D E
C F φ
D B C
E G φ
F φ C
G B C
DFA:
[Figure: DFA for (010+00)*(10)* with states A-G, constructed from the transition table above; A is the start state.]
3. (a+b)*a(a+b)
[Figure: NFA for (a+b)*a(a+b) obtained by Thompson's construction, with states 0-13 and ϵ-transitions.]
ϵ- closure (Move (A, b))={5,6,7,1,2,4} --- C
Transition Table:
States a b
A B C
B D E
C B C
D D E
E B C
DFA:
[Figure: DFA for (a+b)*a(a+b) with states A-E, constructed from the transition table above; A is the start state.]
4. (a+b)*abb(a+b)*
[Figure: NFA for (a+b)*abb(a+b)* obtained by Thompson's construction, with states 0-17 and ϵ-transitions.]
Move (B, a) ={3,8}
ϵ- closure (Move (B, a)) = {3, 6, 1, 2, 4,7,8} ----- B
Move (B, b) = {5, 9}
ϵ- closure (Move (B, b)) = {5, 6, 7, 1, 2,4,9} ---- D
Move (I, b) = {5, 15}
ϵ- closure (Move (I, b)) = {5, 6, 7, 1, 2, 4, 15, 16, 17, 11,12,14} ----- G
Transition Table:
States a b
A B C
B B D
C B C
D B E
E F G
F F H
G F G
H F I
I F G
DFA:
[Figure: DFA for (a+b)*abb(a+b)* with states A-I, constructed from the transition table above; A is the start state.]
5. 10(0+1)*1
[Figure: NFA for 10(0+1)*1 obtained by Thompson's construction, with states 0-10 and ϵ-transitions.]
Move (A, 0) = φ
ϵ- closure (Move (A, 0)) = φ
Move (A, 1) = {1}
ϵ- closure (Move (A, 1)) = {1} ----- B
Transition Table:
States 0 1
A φ B
B C φ
C D E
D D E
E D E
DFA:
[Figure: DFA for 10(0+1)*1 with states A-E, constructed from the transition table above; A is the start state.]
MCQ Practice Questions
1. In a two pass assembler the object code generation is done during the?
A Second pass
B First pass
C Zeroeth pass
D Not done by assembler
2. Which of the following is not a type of assembler?
A one pass
B two pass
C three pass
D load and go
3. In a two pass assembler, adding literals to the literal table and address
resolution of local symbols are done in?
A First and second pass, respectively
B Both in the second pass
C Second and first pass, respectively
D Both in the first pass
Answers
1-A, 2-C, 3-D, 4-A, 5-A, 6-A, 7-C, 8-B, 9-D, 10-A, 11-D, 12-B, 13-C, 14-B, 15-D,
16-A, 17-C, 18-A, 19-C, 20-B, 21-C, 22-D, 23-B, 24-B, 25-D, 26-A, 27-A, 28-D, 29-
D, 30-C, 31-D, 32-D, 33-C, 34-D, 35-D, 36-C, 37-B, 38-B, 39-A, 40-D, 41-C, 42-B,
43-A, 44-D, 45-D
Index