Unit 2

Adama Science and Technology University

Unit 2: Overview of Language Processors

System Programming
COURSE CODE:CSEg4303
Version:1.01
Programming Languages

What are programming languages?

Programming languages allow a programmer to give the computer instructions in a
manner that the programmer can understand

Assembly language is a low-level language in which the programmer uses a mnemonic
(abbreviation) to communicate an instruction to the computer.

Example: “A” or “ADD” for addition.

This mnemonic is translated into machine language (binary) by a program called an
assembler

Assembly language maps one-to-one onto the machine language, so a program has to be
rewritten for each machine (not all machine languages are the same).

An assembler translates assembly code into machine language that a computer can
execute.

High-level languages were created to allow programmers to write at a level where one
programming instruction is converted into several machine language instructions, and to
avoid having to rewrite the program for every machine.

A compiler converts a high-level language program into machine language that a computer
can execute.
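
The one-to-one mapping from mnemonics to machine instructions can be illustrated with a minimal sketch of an assembler. The mnemonics, the opcode numbers and the 8-bit word format used here are invented for illustration and do not correspond to any real machine.

```python
# Hypothetical instruction set: mnemonic -> 4-bit opcode (invented for illustration).
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011, "HALT": 0b1111}


def assemble(source):
    """Translate each assembly line into exactly one 8-bit machine word.

    Assumed word format: 4-bit opcode followed by a 4-bit operand (0-15).
    """
    words = []
    for line in source.strip().splitlines():
        parts = line.split()
        mnemonic = parts[0].upper()
        operand = int(parts[1]) if len(parts) > 1 else 0
        words.append(format((OPCODES[mnemonic] << 4) | operand, "08b"))
    return words


program = """
LOAD 5
ADD 6
STORE 7
HALT
"""
machine_code = assemble(program)
print(machine_code)
```

Each source line yields one machine word, which is what "one-to-one" means here. A different machine would need a different opcode table and, in general, a rewritten program.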
Language Processing Activities

Language Processing activities arise due to the differences between the manner
in which a software designer describes the ideas concerning the behavior of a
software and the manner in which these ideas are implemented in a computer
system.

The designer expresses the ideas in terms related to the application domain of
the software.

To implement these ideas, their description has to be interpreted in terms related
to the execution domain.

These ideas are actually implemented in the execution domain.

The gap between the application domain and the execution domain is called the
semantic gap.
Semantic Gap

If the semantic gap has to be bridged by the programmer, the designer must
specify his/her ideas directly in the machine language of the computer.

This has three adverse consequences for software development:

Large development time

Large development effort

Poor quality and less reliable software
Specification and Execution Gaps

The software engineering steps aimed at the use of a programming language (PL) can
be grouped into:

Specification, design and coding steps.

PL implementation steps.

Software implementation using a PL introduces a new domain, the PL domain,
between the application domain and the execution domain.

The semantic gap is now split into:
1. Specification gap between the application domain and the PL domain.
2. Execution gap between the PL domain and execution domain.

Specification Gap

It is the semantic gap between two specifications of the same task.

Execution Gap

It is the gap between the semantics of programs (that perform the same
task) written in different programming languages.

The specification language of the application domain is some specification
language of the user, for example:

Data flow diagrams

Flow charts

Decision tables

Decision trees

The specification language of PL domain is the programming language itself.

The specification language of execution domain is the machine language of
the computer system.
Language Processors

“A language processor is a software which bridges a specification or
execution gap”.


The input to the language processor is the source program while the output
of the language processor is the target program.


The languages in which these programs are written are known as source
language and target language respectively.
Types of Language Processors

There are four important kinds of language processor:

A language translator bridges an execution gap to the machine
language (or assembly language) of a computer system. E.g.
Assembler, Compiler.

A detranslator bridges the same execution gap as the language
translator, but in the reverse direction.

A preprocessor is a language processor which bridges an execution
gap but is not a language translator.

A language migrator bridges the specification gap between two PLs.
Language Processing Activities

Language processing activities can be divided into two groups:
– Program generation activities
– Program execution activities

Fig.: Language processing activities, divided into program generation and program
execution (using a compiler or using an interpreter)
1. Program generation activities

A program generation activity aims at the automatic generation of a program.

The source language is the specification language of an application domain, while
the target language is a procedure-oriented programming language.

A program execution activity organizes the execution of a program written
in a programming language on a computer system.

The source language here is a procedure-oriented or problem-oriented language.
Fig.1: Program generator (a program specification is input to the program generator
(PG), which outputs a program in the target PL and error messages)



The program generator is software that accepts the specification of the program
to be generated and generates a program in the target programming language.


Execution gap is bridged by the compiler or interpreter.

The program generator bridges the gap between the application domain and the
programming language domain.
2. Program execution activities
● Two popular models for program execution are translation and interpretation.
Program translation:

● The program translation model bridges the execution gap by translating a program
written in a PL, called the source program (SP), into an equivalent program in the
machine or assembly language of the computer system, called the target program (TP).
● Characteristics of the program translation model are:
– A program must be translated before it can be executed.
– The translated program may be saved in a file.
– The saved program may be executed repeatedly.
Program Interpretation:
● The interpreter reads the source program and stores it in its memory.
● During interpretation it takes a source statement, determines its meaning and
performs actions which implement it. This includes computational and input-output
actions.

● This is analogous to the way a CPU executes machine instructions: the CPU uses a
program counter (PC) to note the address of the next instruction to be executed.
● This instruction is subjected to the instruction execution cycle, which consists of
the following steps:

1. Fetch the instruction.

2. Decode the instruction to determine the operation to be performed and its operands.

3. Execute the instruction.

● At the end of the cycle, the instruction address in the PC is updated and the cycle is
repeated for the next instruction.
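
The fetch, decode and execute cycle can be sketched as a loop driven by a program counter. This is only an illustration: the three-field statement format and the operation names SET, ADD and PRINT are hypothetical, chosen to keep the example small.

```python
def interpret(program):
    """Run a list of (operation, operand1, operand2) statements."""
    variables = {}
    output = []
    pc = 0  # program counter: index of the next statement to execute
    while pc < len(program):
        statement = program[pc]      # 1. fetch
        op, a, b = statement         # 2. decode the operation and its operands
        if op == "SET":              # 3. execute
            variables[a] = b
        elif op == "ADD":
            variables[a] = variables[a] + variables[b]
        elif op == "PRINT":
            output.append(variables[a])
        else:
            raise ValueError(f"unknown operation: {op}")
        pc += 1                      # update the PC, then repeat the cycle
    return output


result = interpret([
    ("SET", "x", 2),
    ("SET", "y", 3),
    ("ADD", "x", "y"),
    ("PRINT", "x", None),
])
print(result)
```

No target program is produced or saved. The statements are analysed and acted upon each time the program is run, which is the key difference from the translation model.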
Fundamentals of Language Processing

Language Processing = Analysis of SP + Synthesis of TP

The collection of LP components engaged in analyzing a source program constitutes
the analysis phase, and the components engaged in synthesizing a target program
constitute the synthesis phase.

SP = source program
TP = target program
Analysis Phase

The analysis phase uses each component of the source language specification to
determine relevant information concerning a statement in the source program.

Thus, analysis of a source statement consists of lexical, syntax and semantic
analysis.
1. Lexical rules govern the formation of valid lexical units in the source language.

Example: percent_profit = (profit * 100) / cost_price;
Lexical analysis identifies =, * and / as operators,
100 as a constant,
and the remaining strings as identifiers.
2. Syntax rules govern the formation of valid statements in the source language.

Example: syntax analysis identifies percent_profit as the left-hand side and
(profit * 100) / cost_price as the expression on the right-hand side.
3. Semantic rules associate meaning with valid statements of the language.

Example: percent_profit = (profit * 100) / cost_price;
Semantic analysis determines the meaning of the statement to be the
assignment of (profit * 100) / cost_price to percent_profit.
Synthesis Phase

The synthesis phase is concerned with the construction of target language
statement(s) which have the same meaning as a source statement.

It performs two main activities:
– Creation of data structures in the target program
– Generation of target code

We refer to these activities as memory allocation and code generation,
respectively.
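
As a rough sketch of the two synthesis activities, the function below reserves storage for the variables of percent_profit = (profit * 100) / cost_price and then emits target statements with the same meaning. The target mnemonics (MOVER, MULT, DIV, MOVEM), the register name AREG and the DW directive are assumptions standing in for some assembly language; they are not taken from a real machine.

```python
def synthesize(target, operand1, constant, operand2):
    """Sketch of synthesis for: target = (operand1 * constant) / operand2."""
    # Memory allocation: reserve one word per variable (hypothetical DW directive).
    storage = [f"{name} DW 1" for name in (target, operand1, operand2)]
    # Code generation: emit target statements with the same meaning as the source.
    code = [
        f"MOVER AREG, {operand1}",   # load operand1 into a register
        f"MULT AREG, {constant}",    # multiply by the constant
        f"DIV AREG, {operand2}",     # divide by operand2
        f"MOVEM AREG, {target}",     # store the result in the target variable
    ]
    return code + storage


target_program = synthesize("percent_profit", "profit", 100, "cost_price")
print("\n".join(target_program))
```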
Lexical Analysis (Scanning)

It identifies the lexical units in a source statement. It then classifies the units
into different lexical classes, e.g. identifiers, constants, reserved identifiers, etc.
and enters them into different tables.

It reads the source code and divides it into tokens.

A token is a group of characters that has a meaning as a unit.

The lexical analyzer takes a stream of characters as input, groups them into
lexemes, and outputs a stream of tokens.

It builds a descriptor, called a token, for each lexical unit. A token contains two
fields: class code and number in class.

The class code identifies the class to which a lexical unit belongs. The number in
class is the entry number of the lexical unit in the relevant table.

Tokens include keywords, identifiers and operators.

Non-tokens include comments, preprocessor directives, macros and
whitespace.
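
A minimal sketch of a scanner that builds (class code, number in class) tokens is given below. The class codes Id, Const and Op, the contents of the operator table, and the decision to place parentheses and the semicolon in that table are assumptions made for the example.

```python
import re

# Assumed operator/punctuation table; a token's number is its 1-based position here.
OPERATORS = ["=", "*", "/", "(", ")", ";"]


def tokenize(statement):
    """Return (tokens, identifier_table, constant_table) for one statement."""
    identifiers, constants, tokens = [], [], []
    # Whitespace matches none of the patterns, so it is skipped (a non-token).
    # Characters outside these patterns are silently ignored in this sketch.
    for lexeme in re.findall(r"[A-Za-z_]\w*|\d+|[=*/();]", statement):
        if lexeme in OPERATORS:
            tokens.append(("Op", OPERATORS.index(lexeme) + 1))
        elif lexeme.isdigit():
            if lexeme not in constants:
                constants.append(lexeme)
            tokens.append(("Const", constants.index(lexeme) + 1))
        else:
            if lexeme not in identifiers:
                identifiers.append(lexeme)
            tokens.append(("Id", identifiers.index(lexeme) + 1))
    return tokens, identifiers, constants


tokens, identifiers, constants = tokenize(
    "percent_profit = (profit * 100) / cost_price;"
)
print(tokens)
print(identifiers, constants)
```

Here the token ("Id", 2) stands for profit, the second entry of the identifier table, so later phases can work with table entries instead of character strings.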
Example

Can you identify the tokens and non-tokens in this source code?

#include <bits/stdc++.h>
using namespace std;

int maximum(int x, int y) {
    // This will compare 2 numbers
    if (x > y)
        return x;
    else {
        return y;
    }
}
Syntax Analysis (Parsing)

Syntax analysis processes the string of tokens built by lexical analysis to
determine the statement class, e.g. assignment statement, if statement,
etc.

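It checks the source code for grammatical (syntax) errors.

It takes a stream of tokens as input and generates a syntax tree.

Example: for a = b * c, the syntax tree has = at the root, with a as its left child
and the subtree for b * c (operator * with children b and c) as its right child.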
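
The following is a minimal sketch of a parser that turns an assignment statement into a syntax tree represented as nested tuples. The small grammar it accepts (identifiers combined with + and *) and the tuple representation are assumptions chosen for brevity, not a description of any particular compiler.

```python
import re


def parse_assignment(text):
    """Parse 'id = expr' (expr uses + and * on identifiers) into a nested-tuple tree."""
    tokens = re.findall(r"[A-Za-z_]\w*|[=+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        token = advance()
        if token is None or not re.fullmatch(r"[A-Za-z_]\w*", token):
            raise SyntaxError(f"identifier expected, found {token!r}")
        return token

    def term():                      # term := factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                      # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if advance() != "=":
        raise SyntaxError("'=' expected")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment("a=b*c")
print(tree)
```

A statement such as a = * b is rejected with a syntax error, because an identifier is expected after the = sign.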
Semantic analysis

It verifies that each statement is meaningful, given its tokens and syntax
structure.

It helps interpret symbols, their types, and their relations with each other.

It also verifies whether an operator is applied to the required number of operands.
Example: int a = "code";
– Explanation: this statement should not produce an error in the lexical or syntax
analysis phase, because it is lexically and structurally correct, but it should
produce a semantic error because the type of the assigned value differs from the
declared type of the variable.

The following tasks should be performed in semantic analysis:
– Scope resolution
– Type checking
– Array-bound checking
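
Type checking of a declaration with an initializer can be sketched as follows. The function names and the restriction to integer and string literals are assumptions made to keep the example short.

```python
def literal_type(literal):
    """Infer the type of a literal (only int and string literals are handled)."""
    if literal.startswith('"') and literal.endswith('"'):
        return "string"
    if literal.lstrip("-").isdigit():
        return "int"
    raise ValueError(f"unsupported literal: {literal}")


def check_declaration(declared_type, name, literal):
    """Return None if the initializer matches the declared type, else an error message."""
    actual = literal_type(literal)
    if actual != declared_type:
        return (f"semantic error: cannot assign {actual} "
                f"to {declared_type} variable '{name}'")
    return None


error = check_declaration("int", "a", '"code"')   # models: int a = "code";
ok = check_declaration("int", "b", "42")          # models: int b = 42;
print(error)
print(ok)
```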
Semantic Errors

Some of the semantic errors that the semantic analyzer is expected
to recognize:
– Type mismatch
– Undeclared variable
– Reserved identifier misuse.
– Multiple declaration of variable in a scope.
– Accessing an out of scope variable.
– Actual and formal parameter mismatch.
Data Structures for Language Processing: Symbol Tables

A symbol table is an advanced data structure used by the compiler to store
information about the source code, available at any phase.

When a new variable is encountered, it is entered into the symbol table.

The symbol table interacts with every phase of the compiler.
Criteria for classifying the data structures of a language processor

Nature of a data structure
– Linear or
– Non-linear

Purpose of a data structure
– Search
– Allocation
Linear Data Structure

Consists of a linear arrangement of elements in memory.
– Examples:

Stacks

Linked lists

Queues

Advantage:
– Facilitates efficient search.

Disadvantages:
– Requires a contiguous area of memory (for array-based implementations; a
linked list does not).
– Can waste memory, because the area has to be sized in advance.
Non-Linear Data structure

Overcome the disadvantage of the linear DS.

The elements of a non-linear DS are accessed using pointers.

Elements need not occupy a contiguous area of memory.
– Examples of non-linear DS:

Trees

Graphs
Symbol Table
➢ Symbol table is an important data structure used in a compiler.
➢ Symbol table is used to store information about the occurrence of various
entities such as objects, classes, variable names, interfaces, function names, etc.
➢ It is used by both the analysis and synthesis phases.
➢ The symbol table is used for the following purposes:
● It is used to store the name of all entities in a structured form at one place.
● It is used to verify if a variable has been declared.
● It is used to determine the scope of a name.
● It is used to implement type checking by verifying assignments and
expressions in the source code are semantically correct.
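
These purposes can be illustrated with a minimal sketch of a symbol table that handles nested scopes. The class name, the method names and the use of a list of dictionaries are assumptions for the example; real compilers use a variety of organizations.

```python
class SymbolTable:
    """Symbol table with nested scopes, kept as a list of dictionaries."""

    def __init__(self):
        self.scopes = [{}]  # scopes[0] is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        current = self.scopes[-1]
        if name in current:
            raise NameError(f"multiple declaration of '{name}' in one scope")
        current[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outwards to determine the scope of a name.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"undeclared variable '{name}'")


table = SymbolTable()
table.declare("x", "int")
table.enter_scope()
table.declare("x", "float")      # shadows the outer x inside this scope
inner_type = table.lookup("x")
table.exit_scope()
outer_type = table.lookup("x")
print(inner_type, outer_type)
```

Looking up a name that was never declared raises an error, which corresponds to the "undeclared variable" check. The type returned by lookup is what a type checker would compare against.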
Symbol Table Organization
1) Sequential Search Organization

It is a method for finding a particular value in a list that consists of checking
every one of its elements, one at a time and in sequence, until the desired one
is found.

Search for a symbol: all active entries of the table are assumed to have the same
probability of being accessed.

Add a symbol: the symbol is added to the first free entry of the table.
● Delete a symbol: a symbol can be deleted in two ways:
– Physical deletion: permanently removing the entry from the
symbol table.
– Logical deletion: marking the entry as inactive without
physically removing it from the symbol table.
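
A minimal sketch of a sequentially searched table with both kinds of deletion follows. The fixed table size, the use of None for a free entry and the method names are assumptions made for illustration.

```python
class SequentialTable:
    """Fixed-size symbol table searched one entry at a time."""

    def __init__(self, size):
        self.entries = [None] * size          # None marks a free entry

    def add(self, symbol):
        # The symbol goes into the first free entry of the table.
        for i, entry in enumerate(self.entries):
            if entry is None:
                self.entries[i] = {"symbol": symbol, "active": True}
                return i
        raise OverflowError("symbol table is full")

    def search(self, symbol):
        # Check every entry in sequence until the desired one is found.
        for i, entry in enumerate(self.entries):
            if entry is not None and entry["active"] and entry["symbol"] == symbol:
                return i
        return -1

    def delete_logical(self, symbol):
        i = self.search(symbol)
        if i != -1:
            self.entries[i]["active"] = False  # entry stays in the table

    def delete_physical(self, symbol):
        i = self.search(symbol)
        if i != -1:
            self.entries[i] = None             # entry becomes free again


table = SequentialTable(4)
table.add("a")
table.add("b")
table.add("c")
table.delete_logical("b")
table.delete_physical("a")
slot_for_d = table.add("d")
print(table.search("b"), table.search("c"), slot_for_d)
```

After logical deletion the entry for b still occupies its slot but is no longer found. After physical deletion the slot of a is reused by the next symbol added.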
2)Binary Search Organization

It is a highly effective method for searching an ordered list.

All entries of the table are assumed to satisfy an ordering relation.

For example, the < relation implies that the symbol occupying an entry is smaller
than the symbol occupying the next entry.
● A new symbol that is smaller than an existing entry is placed to its left;
otherwise it is placed to its right.

Whenever a new symbol is added, existing symbols may have to be shifted to
preserve the ordering relation.
● Hence, this organization is not a good choice for a symbol table to which symbols
are added during processing.
● It is suitable for a table which has a fixed set of symbols.
– Ex: the table of reserved words of a programming language.
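
Binary search over a fixed, ordered table of reserved words can be sketched as follows. The particular list of reserved words is an assumption chosen for the example.

```python
# Assumed reserved-word table, kept in sorted order so binary search applies.
RESERVED_WORDS = sorted(["if", "else", "while", "return", "int", "for"])


def binary_search(table, symbol):
    """Return the index of symbol in the ordered table, or -1 if it is absent."""
    low, high = 0, len(table) - 1
    while low <= high:
        mid = (low + high) // 2
        if table[mid] == symbol:
            return mid
        if symbol < table[mid]:
            high = mid - 1    # continue in the left half
        else:
            low = mid + 1     # continue in the right half
    return -1


index_of_int = binary_search(RESERVED_WORDS, "int")
index_of_maximum = binary_search(RESERVED_WORDS, "maximum")
print(RESERVED_WORDS)
print(index_of_int, index_of_maximum)
```

Each comparison halves the part of the table still to be searched. Because the table never changes after it is built, the cost of shifting entries on insertion does not arise.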
Allocation of Data Structures
1)Stacks

A stack is a linear data structure which satisfies the following properties:
– Allocation and de-allocation are performed in a last-in, first-out (LIFO) manner.
– Only the last element is accessible at any time.

It has limited size.
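
The LIFO discipline and the size limit of stack allocation can be sketched as below. The class name, the method names and the word-based addressing are assumptions for illustration only.

```python
class StackAllocator:
    """LIFO allocation within a fixed-size area of memory."""

    def __init__(self, size):
        self.size = size
        self.top = 0          # address of the first free word
        self.blocks = []      # start address of each live allocation

    def allocate(self, words):
        if self.top + words > self.size:
            raise MemoryError("stack overflow")
        start = self.top
        self.blocks.append(start)
        self.top += words
        return start

    def free_last(self):
        """De-allocate the most recently allocated block, the only one that can be freed."""
        self.top = self.blocks.pop()


stack = StackAllocator(10)
first = stack.allocate(4)
second = stack.allocate(3)
stack.free_last()             # releases the 3-word block
third = stack.allocate(5)     # reuses the space just released
print(first, second, third, stack.top)
```

A request that does not fit in the remaining space fails, which reflects the limited size of a stack.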
2)Heaps

A heap is a non-linear data structure used for dynamic memory allocation.

Entities can be allocated and de-allocated in any order.

Its size is not fixed in advance.

Entities are accessed through pointers, so access is slower than with a stack.
