Unit 2
System Programming
COURSE CODE:CSEg4303
Version:1.01
Programming Languages
●
What are programming languages?
●
Programming languages allow a programmer to give the computer instructions in a
manner that the programmer can understand
●
Assembly Language is a low level language, where the programmer uses a mnemonic
(abbreviation) to communicate an instruction to the computer
●
EX: “A” or “add” for addition
●
This mnemonic is translated into machine language (binary) by a program called an
assembler
●
Assembly language is a one-to-one mapping to the machine language and has to be
rewritten for each machine (not all machine languages are the same)
●
An assembler translates assembly code into machine language that a computer may
execute
●
High level languages were created to allow programmers to write at a level where one
programming instruction is converted into several machine language instructions and to
avoid having to constantly rewrite the program
●
A compiler converts high level language into machine language that a computer may
execute
Language Processing Activities
●
Language Processing activities arise due to the differences between the manner
in which a software designer describes the ideas concerning the behavior of a
software and the manner in which these ideas are implemented in a computer
system.
●
The designer expresses the ideas in terms related to the application domain of
the software.
●
To implement these ideas, their description has to be interpreted in terms related
to the execution domain.
●
These ideas are actually implemented in the execution domain.
●
The gap between the application domain and execution domain is called
semantic gap.
Semantic Gap
●
If a semantic gap is to be covered by the programmer, it means that the
designer has to specify his/her ideas directly in the machine language of the
computer.
●
This has three adverse consequences as far as software design is concerned:
●
Large development time
●
Large development effort
●
Poor quality and less reliable software
Specification and Execution Gaps
●
The software engineering steps aimed at the use of a PL can be grouped into:
●
Specification, design and coding steps.
●
PL implementation steps
●
Software implementation using a PL introduces a new domain, the PL domain,
between the application and execution domains.
●
The semantic gap is now split into:
1. Specification gap between the application domain and the PL domain.
2. Execution gap between the PL domain and execution domain.
Specification and Execution Gaps
●
Specification Gap
●
It is the semantic gap between two specifications of the same task.
●
Execution Gap
●
It is the gap between the semantics of programs (that perform the same
task) written in different programming languages.
Specification and Execution Gaps
●
The specification language of the application domain is some specification language of the user, such as:
●
data flow diagrams
●
Flow-charts
●
decision tables and
●
decision trees.
●
The specification language of PL domain is the programming language itself.
●
The specification language of execution domain is the machine language of
the computer system.
Language Processors
●
“A language processor is a software which bridges a specification or
execution gap”.
●
The input to the language processor is the source program while the output
of the language processor is the target program.
●
The languages in which these programs are written are known as source
language and target language respectively.
Types of Language Processors
●
There are four important kinds of language processor:
●
A language translator bridges an execution gap to the machine
language (or assembly language) of a computer system. E.g.
Assembler, Compiler.
●
A detranslator bridges the same execution gap as the language
translator, but in the reverse direction.
●
A preprocessor is a language processor which bridges an execution
gap but is not a language translator.
●
A language migrator bridges the specification gap between two PLs.
Language Processing Activities
●
It can be divided into two groups:
– Program generation activities
– Program execution activities
[Figure: language processing activities split into program generation and program execution; program execution is carried out using a compiler or using an interpreter.]
1. Program generation activities
●
A program generation activity aims at automatic generation of a program.
●
The source language is the specification language of an application domain, while
the target language is typically a procedure-oriented programming language.
●
The program execution activity organizes the execution of the program written
in the programming language on a computer system.
●
The source language here is procedure or problem oriented language.
Error Messages
●
The execution gap is bridged by a compiler or an interpreter.
●
The program generator bridges the gap between the application domain and
the programming language domain.
2. Program Execution activities
● Two popular models for program execution are translation and interpretation.
Program translation:
● The program translation model bridges the execution gap by translating a program
written in a PL, called the source program (SP), into an equivalent program in the
machine or assembly language of the computer system, called the target program (TP).
● Characteristics of the program translation model are:
– A program must be translated before it can be executed.
– The translated program may be saved in a file.
– The saved program may be executed repeatedly.
2. Program Execution activities
Program Interpretation:
The interpreter reads the source program and stores it in its memory.
During interpretation it takes a source statement, determines its meaning and
performs actions which implement it. This includes computational and input-output
actions.
The CPU uses a program counter (PC) to note the address of the next
instruction to be executed.
This instruction is subjected to the instruction execution cycle, which consists of
the following steps: fetch the instruction, decode it to determine the operation and
its operands, and execute it.
SP = source program, TP = target program
Analysis Phase
●
The analysis phase uses each component of the source language specification to
determine relevant information concerning a statement in the source program.
●
Thus, analysis of a source statement consists of lexical, syntax and semantic
analysis.
1. Lexical rules govern the formation of valid lexical units in the source language.
●
Example: in percent_profit = (profit * 100) / cost_price; lexical analysis
identifies =, * and / as operators,
100 as a constant,
and the remaining strings as identifiers.
2. Syntax rules govern the formation of valid statements in the source language.
●
Example: syntax analysis identifies percent_profit as the left-hand side and
(profit * 100) / cost_price as the expression on the right-hand side.
3. Semantic rules associate meaning with valid statements of the language.
●
Example: for percent_profit = (profit * 100) / cost_price; semantic analysis
determines the meaning of the statement as the assignment of
(profit * 100) / cost_price to percent_profit.
Synthesis Phase
●
The synthesis phase is concerned with the construction of target language
statement(s) which have the same meaning as a source statement.
●
It performs two main activities:
– Creation of data structures in the target program (memory
allocation)
– Generation of target code (code generation)
●
We refer to these activities as memory allocation and code generation,
respectively.
Lexical Analysis (Scanning)
●
It identifies the lexical units in a source statement. It then classifies the units
into different lexical classes, e.g. ids, constants, reserved ids, etc., and enters
them into different tables.
–
It reads the source code and divides it into tokens.
–
A token is a group of characters with meaning.
–
The lexical analyzer takes a stream of characters as input, groups them into
lexemes, and outputs a stream of tokens.
●
It builds a descriptor, called token, for each lexical unit. A token contains two
fields – class code and number in class.
●
class code identifies the class to which a lexical unit belongs. number in class is
the entry number of the lexical unit in the relevant table.
●
Tokens include keywords, identifiers and operators.
●
Non-tokens include comments, preprocessor directives, macros and
whitespace.
Example
Can you identify the tokens and the non-tokens in this source code?
#include <bits/stdc++.h>
using namespace std;
int maximum(int x, int y) {
    // This will compare 2 numbers
    if (x > y)
        return x;
    else {
        return y;
    }
}
Syntax Analysis (Parsing)
●
Syntax analysis processes the string of tokens built by lexical analysis to
determine the statement class, e.g. assignment statement, if statement,
etc.
●
It checks the source code for grammatical (syntax) errors.
●
It takes a stream of tokens as input and generates a syntax tree.
●
Example: a=b*c
Semantic analysis
●
It verifies the meaning of each statement, given its tokens and syntactic
structure.
●
It helps interpret symbols, their types, and their relations with each other.
●
It verifies, for example, whether an operator is applied to the required number of
operands.
Example: int a = "code";
– Explanation: this should not raise an error in the lexical or syntax analysis
phase, as it is lexically and structurally correct, but it should generate a
semantic error because the type of the assigned value differs from the declared type.
●
The following tasks should be performed in semantic analysis:
– Scope resolution
– Type checking
– Array-bound checking
Semantic Errors
●
Some of the semantic errors that the semantic analyzer is expected
to recognize:
– Type mismatch
– Undeclared variable
– Reserved identifier misuse.
– Multiple declaration of variable in a scope.
– Accessing an out of scope variable.
– Actual and formal parameter mismatch.
Symbol Tables Data Structures for Language Processing
●
An advanced data structure used by the compiler to store information
about the source code at any phase.
●
When a new variable is encountered, it is stored in the symbol table.
●
It communicates with every phase of the compiler.
Criteria for classifying the data structures of a language processor (LP)
●
Nature of a data structure
– Linear or
– Non-linear
●
Purpose of a data structure
– Search
– Allocation
Linear Data structure
●
Consists of a linear arrangement of elements in memory.
– Example:
●
Stacks
●
Linked list
●
Queues
●
Advantage:
– Facilitates efficient search.
●
Disadvantage:
– Requires a contiguous area of memory.
– Can waste memory.
Non-Linear Data structure
●
Overcomes the disadvantages of linear DS.
●
The elements of a non-linear DS are accessed using pointers.
●
Elements need not occupy a contiguous area of memory.
– Examples of non-linear DS:
●
Trees
●
Graphs
Symbol Table
➢ Symbol table is an important data structure used in a compiler.
➢ The symbol table is used to store information about the occurrence of various
entities such as:
➢
objects, classes, variable names, interfaces, function names, etc.
➢ It is used by both the analysis and synthesis phases.
➢ The symbol table is used for the following purposes:
● It is used to store the names of all entities in a structured form in one place.
● It is used to verify whether a variable has been declared.
● It is used to determine the scope of a name.
● It is used to implement type checking by verifying that assignments and
expressions in the source code are semantically correct.
Symbol Table Organization
1) Sequential Search Organization
●
It is a method for finding a particular value in a list that consists of checking
every one of its elements, one at a time and in sequence, until the desired one
is found.
●
Search for a symbol: all active entries of the table are assumed to have the same
probability of being accessed.
●
Add a symbol: the symbol is added to the first free entry in the table.
● Delete a symbol: a symbol can be deleted in two ways:
– Physical deletion: permanently removes the entry from the
symbol table.
– Logical deletion: marks the entry as inactive without
physically removing it from the symbol table.
Symbol Table Organization
2) Binary Search Organization
●
It is a highly effective method for searching an ordered list.
●
All entries of the table are assumed to satisfy an ordering relation.
●
For example, the < relation implies that the symbol occupying an entry is smaller
than the symbol occupying the next entry.
● A new symbol is placed to the left of an existing symbol if it is smaller,
and to the right otherwise.
●
For every new symbol, existing symbols may have to be shifted to preserve the
ordering relation.
● Hence, this organization is not well suited to a symbol table that changes often.
● It is suitable for a table which has a fixed set of symbols.
– Ex: the table of reserved words in a programming language.
Allocation of Data Structures
1) Stacks
●
A linear DS which satisfies the following properties:
– Allocation and de-allocation are performed in a LIFO (last in, first out) manner.
– Only the last element is accessible at any time.
●
It has a limited size.
Allocation of Data Structures
2) Heaps
●
A non-linear DS used for dynamic memory allocation.
●
Entities can be allocated and de-allocated in any order.
●
Its size is not fixed in advance.
●
Entities are accessed through pointers, so access is slower.