Module 1

The document outlines the course structure for System Software and Compiler Design, detailing prerequisites and learning objectives. It covers the phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation, along with the role of symbol tables and compiler-construction tools. Additionally, it discusses the evolution of programming languages from machine language to higher-level languages and their classifications.


System Software and

Compiler Design
BCSSS602
3:0:2
Pre-requisites
• Computer Organization
• Any programming language
• Data Structures
• Automata Theory
Course Learning Objectives (CLO)
• Understand the phases of the compiler
• Generate parse table, Intermediate Code, and Target Code
• Learn the concepts of System software – Assemblers and Loaders
MODULE 1
• Introduction
• Language Processors
• Structure of Compiler
• Evolution of programming languages
• Science of building a compiler

• Lexical Analysis
• Role of Lexical Analyzer
• Input Buffering
• Specifications of Token
• Recognition of Tokens
Language Processors
• A compiler is a program that can read a program
in one language - the source language - and
translate it into an equivalent program in another
language - the target language;
• An important role of the compiler is to report any
errors in the source program that it detects during the
translation process.

• If the target program is an executable machine-language program, it can then be called by the user to process inputs and produce outputs.
• An interpreter is another common kind of language processor.
• Instead of producing a target program as a translation, an interpreter appears to
directly execute the operations specified in the source program on inputs supplied
by the user

• An interpreter executes the source program line by line.


• Only on successful execution of the current line does it move to the next line.
• The machine-language target program produced by a compiler is much faster than
an interpreter at mapping inputs to outputs.
• An interpreter, gives better error diagnostics than a compiler, because it executes
the source program statement by statement.
Example
• Java language processors combine compilation and interpretation, as shown in Figure.
• A Java source program may first be compiled into an intermediate form called bytecodes.
• The bytecodes are then interpreted by a virtual machine.
• The advantage is that bytecodes compiled on one machine can be interpreted on another
machine, across a network too.
• To achieve faster processing of inputs to outputs, some Java compilers, called just-in-time
compilers, translate the bytecodes into machine language immediately before they run the
intermediate program to process the input.
A Language Processing System
• In addition to a compiler, several other programs are required to create an executable target program, as shown in Figure.
• A source program is divided into modules stored in separate files.
• A preprocessor collects the source program and expands shorthands, called macros, into source-language statements.
• The modified source program is then fed to a compiler
which produces an assembly-language program as its
output
• Assembler processes the assembly code and produces
relocatable machine code as its output.
• Large programs are often compiled in pieces, so the
relocatable machine code may have to be linked
together with other relocatable object files and library
files into the code that runs on the machine.
• The linker resolves external memory addresses, where
the code in one file may refer to a location in another
file.
• The loader then puts together all of the executable object files into memory for execution.
The Structure of a Compiler
• A compiler can be considered as a two-part program: analysis and synthesis.
• The analysis part breaks up the source program into constituent pieces and imposes a
grammatical structure on them.
• It then uses this structure to create an intermediate representation of the source program.
• If the analysis part detects that the source program is syntactically ill formed or semantically unsound, then it must provide informative messages so the user can take corrective action.
• The analysis part also collects information about the source program and stores it in a data
structure called a symbol table, which is passed along with the intermediate representation to
the synthesis part.
• The synthesis part constructs the desired target program from the intermediate
representation and the information in the symbol table.
• The analysis part is often called the front end of the compiler; the synthesis part is the back
end.
Structure
• The compilation process
operates as a sequence of
phases, each of which
transforms one representation
of the source program to
another.
• A decomposition of a compiler
into phases is shown in Figure
• The symbol table, which
stores information about the
entire source program, is used
by all phases of the compiler.
Lexical Analysis/Scanning
• The lexical analyzer reads the stream of characters from the source program and groups the
characters into meaningful sequences called lexemes.
• For each lexeme, the lexical analyzer produces an output - a token of the form
(token-name, attribute-value) that is passed on to the next phase, syntax analysis.
• Token-name is an abstract symbol that is used during syntax analysis, and attribute-value
points to an entry in the symbol table for this token.
• Information from the symbol-table entry is needed for semantic analysis and code generation.
• For example, consider the assignment statement: position = initial + rate * 60 -----(1)
• The characters in this assignment could be grouped into the following lexemes and mapped
into the following tokens passed on to the syntax analyzer:
1. position is a lexeme that is mapped into a token (id, 1), where id stands for identifier and 1 points to the symbol-table entry for position.
• The symbol-table entry for an identifier holds information about the identifier, such as its
name and type.
2. The assignment symbol = is a lexeme that is mapped into the token (=).
• Since this token needs no attribute value, the second component is omitted.
3. initial is a lexeme that is mapped into the token (id, 2), where 2 points to the symbol-table
entry for initial.
4. + is a lexeme that is mapped into the token (+).
5. rate is a lexeme that is mapped into the token (id, 3), where 3 points to the symbol-table
entry for rate.
6. * is a lexeme that is mapped into the token (*).
7. 60 is a lexeme that is mapped into the token (60).
• Blanks separating the lexemes are discarded by the lexical analyzer.
• Figure shows the representation of the assignment statement (1) after lexical analysis as the sequence of tokens:
<id, 1> <=> <id, 2> <+> <id, 3> <*> <60> ---- (2)

• In this representation, the token names =, +, and * are abstract symbols for the assignment,
addition, and multiplication operators, respectively.
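As a rough sketch of this phase, the statement above can be tokenized with a few regular expressions. The token patterns and the list-based symbol table below are illustrative assumptions, not the exact scheme used in the text:

```python
import re

# Illustrative token patterns; a real lexer would cover keywords, floats, etc.
TOKEN_SPEC = [
    ("id",     r"[A-Za-z_][A-Za-z0-9_]*"),
    ("number", r"\d+"),
    ("op",     r"[=+\-*/]"),
    ("ws",     r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    symbol_table = []          # identifier lexemes; index + 1 = entry number
    tokens = []
    for m in MASTER.finditer(source):
        kind, lexeme = m.lastgroup, m.group()
        if kind == "ws":
            continue           # blanks separating lexemes are discarded
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "number":
            tokens.append(("number", int(lexeme)))
        else:                  # operators carry no attribute value
            tokens.append((lexeme,))
    return tokens, symbol_table

tokens, table = tokenize("position = initial + rate * 60")
print(tokens)  # [('id', 1), ('=',), ('id', 2), ('+',), ('id', 3), ('*',), ('number', 60)]
print(table)   # ['position', 'initial', 'rate']
```

The output matches the seven lexemes enumerated above, with symbol-table entries 1, 2, and 3 for position, initial, and rate.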
Syntax Analysis
• The second phase of the compiler is syntax analysis or parsing.
• The parser uses the first components of the tokens produced by the lexical
analyzer to create a tree-like intermediate representation that depicts the
grammatical structure of the token stream.
• A typical representation is a syntax tree in which each interior node represents
an operation and the children of the node represent the arguments of the
operation.
• A syntax tree for the token stream (2) is shown as the output of the syntactic
analyzer in Figure
• This tree shows the order in which the operations in the assignment are to be
performed.
• The tree has an interior node labeled * with
<id, 3> as its left child and the integer 60 as
its right child.
• The node <id, 3> represents the identifier -
rate.
• The node labeled * makes it explicit that
first multiply the value of rate by 60.
• The node labeled + indicates that the result
of this multiplication is added to the value
initial.
• The root of the tree, labeled = , indicates that the result of addition is stored into
the location for the identifier position.
• This ordering of operations is consistent with the usual conventions of
arithmetic, where multiplication has higher precedence than addition, and hence
that the multiplication is to be performed before the addition.
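The tree just described can be written down concretely. The nested-tuple node shape (operator, left child, right child) is an illustrative choice, and the tree is built by hand here rather than by a parser:

```python
# Syntax tree for: position = initial + rate * 60
tree = ("=",
        ("id", 1),                      # position
        ("+",
         ("id", 2),                     # initial
         ("*",
          ("id", 3),                    # rate
          ("number", 60))))

def evaluate_order(node, order=None):
    """Post-order walk: lists the operators in the order they would execute."""
    if order is None:
        order = []
    if node[0] in ("=", "+", "*"):
        evaluate_order(node[1], order)
        evaluate_order(node[2], order)
        order.append(node[0])
    return order

print(evaluate_order(tree))   # ['*', '+', '='] : multiply, then add, then assign
```

The post-order walk makes the precedence explicit: the multiplication is performed before the addition, and the assignment comes last.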
Semantic Analysis
• The semantic analyzer uses the syntax tree and the information in the symbol table to
check the source program for semantic consistency with the language definition.
• It gathers type information and saves it in either the syntax tree or the symbol table, for
subsequent use during intermediate-code generation.
• An important part of semantic analysis is type checking, where the compiler checks that
each operator has matching operands.
• For example, the compiler reports an error if a floating-point number is used to index an
array.
• The language specification may permit some type conversions called coercions.
• For example, a binary arithmetic operator may be applied to either a pair of integers or
to a pair of floating-point numbers.
• If the operator is applied to a floating-point number and an integer, the compiler may
convert or coerce the integer into a floating-point number.
• Suppose that position, initial, and rate have been declared to be floating-point
numbers, and that the lexeme 60 by itself forms an integer.
• The type checker in the semantic analyzer in Figure discovers that the operator *
is applied to a floating-point number rate and an integer 60.
• In this case, the integer is converted into a floating-point number.
• In Figure, notice that the output of the semantic analyzer has an extra node for
the operator inttofloat, which explicitly converts its integer argument into a
floating-point number.
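The coercion step can be sketched as a tree walk that infers a type for each node and wraps an integer operand of a mixed operation in an inttofloat node. The declared types in SYMBOL_TYPES are assumptions matching the example:

```python
# Assumed declarations: position, initial, rate are floats (entries 1-3).
SYMBOL_TYPES = {1: "float", 2: "float", 3: "float"}

def annotate(node):
    """Return (typed_tree, type), inserting inttofloat where needed."""
    kind = node[0]
    if kind == "id":
        return node, SYMBOL_TYPES[node[1]]
    if kind == "number":
        return node, "int"
    op, (lhs, lt), (rhs, rt) = kind, annotate(node[1]), annotate(node[2])
    if lt != rt:                       # coerce the integer side to float
        if lt == "int":
            lhs, lt = ("inttofloat", lhs), "float"
        else:
            rhs, rt = ("inttofloat", rhs), "float"
    return (op, lhs, rhs), lt

tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("number", 60))))
typed, t = annotate(tree)
print(typed)
# ('=', ('id', 1), ('+', ('id', 2), ('*', ('id', 3), ('inttofloat', ('number', 60)))))
```

The extra inttofloat node appears exactly where the type checker found * applied to a float (rate) and an integer (60).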
Intermediate Code Generation
• In the process of translating a source program into target code, a compiler may
construct one or more intermediate representations, which can have a variety of
forms.
• Syntax trees are a form of intermediate representation; they are commonly used
during syntax and semantic analysis.
• After syntax and semantic analysis of the source program, many compilers
generate an explicit low-level or machine-like intermediate representation.
• This intermediate representation should have two important properties: it should
be easy to produce and it should be easy to translate into the target machine.
• An intermediate form called three-address code consists of a sequence of assembly-like instructions with three operands per instruction.
• Each operand can act like a register.
• The output of the intermediate code generator in Figure consists of the three-
address code sequence
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2 ---- (3)
id1 = t3
• Each three-address assignment instruction has at most one operator on the right
side.
• Thus, these instructions fix the order in which operations are to be done; the multiplication precedes the addition, as in the source program.
• The compiler then must generate a temporary name to hold the value computed
by a three-address instruction.
• Finally, some "three-address instructions" like the first and last in the sequence
(3), above, have fewer than three operands.
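Sequence (3) can be produced by a post-order walk of the typed syntax tree, generating a fresh temporary for each operator. This is a sketch, not the book's algorithm; the temporary numbering is chosen to reproduce (3):

```python
def gen_tac(node, code, counter):
    """Emit three-address instructions into code; return the operand name."""
    kind = node[0]
    if kind == "id":
        return f"id{node[1]}"
    if kind == "number":
        return str(node[1])
    if kind == "inttofloat":
        arg = gen_tac(node[1], code, counter)
        counter[0] += 1
        t = f"t{counter[0]}"
        code.append(f"{t} = inttofloat({arg})")
        return t
    lhs = gen_tac(node[1], code, counter)
    rhs = gen_tac(node[2], code, counter)
    if kind == "=":
        code.append(f"{lhs} = {rhs}")       # fewer than three operands
        return lhs
    counter[0] += 1
    t = f"t{counter[0]}"                    # temporary to hold the value
    code.append(f"{t} = {lhs} {kind} {rhs}")
    return t

typed = ("=", ("id", 1),
         ("+", ("id", 2),
          ("*", ("id", 3), ("inttofloat", ("number", 60)))))
code = []
gen_tac(typed, code, [0])
print("\n".join(code))
# t1 = inttofloat(60)
# t2 = id3 * t1
# t3 = id2 + t2
# id1 = t3
```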
Code Optimization
• The machine-independent code-optimization phase attempts to improve the
intermediate code so that better target code will result.
• Better target code is faster, shorter, or consumes less power.
• For example, an algorithm generates the intermediate code (3), using an
instruction for each operator in the tree representation that comes from the
semantic analyzer.
• The optimizer can deduce that the conversion of 60 from integer to floating
point can be done once and for all at compile time, so the inttofloat operation
can be eliminated by replacing the integer 60 by the floating-point number 60.0.
• Moreover, t3 is used only once to transmit its value to id1 so the optimizer can
transform (3) into the shorter sequence
t1 = id3 * 60.0 -------------(4)
id1 = id2 + t1
Code Generation
• The code generator takes as input an intermediate representation of the source program and
maps it into the target language.
• If the target language is machine code, registers or memory locations are selected for each of
the variables used by the program.
• A crucial aspect of code generation is the judicious assignment of registers to hold variables.
• For example, using registers R1 and R2, the intermediate code in (4) might get translated
into the machine code
LDF R2, id3
MULF R2, R2, #60.0 ------ (5)
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
• The first operand of each instruction specifies a destination. The F in each instruction tells us that it deals with floating-point numbers.
• The code in (5) loads the contents of address id3 into register R2, then multiplies it by the floating-point constant 60.0. (The # signifies that 60.0 is to be treated as an immediate constant.)
• The third instruction moves id2 into register R1, and the fourth adds to it the value previously computed in register R2.
• Finally, the value in register R1 is stored into the address of id1, so the code correctly implements the assignment statement (1).
Symbol-Table Management
• Symbol tables are data structures that are used by compilers to hold information about
source-program constructs.
• The symbol table is a data structure containing a record for each variable name, with fields
for the attributes of the name.
• The data structure should be designed to allow the compiler to find the record for each name
quickly and to store or retrieve data from that record quickly
• The information is collected incrementally by the analysis phases of a compiler and used by
the synthesis phases to generate the target code.
• Entries in the symbol table contain information about an identifier such as its character
string (or lexeme), its type, its position in storage, and any other relevant information.
• Symbol tables typically need to support multiple declarations of the same identifier within a
program (scope - where in the program its value may be used), and in the case of procedure
names, such things as the number and types of its arguments, the method of passing each
argument (for example, by value or by reference), and the type returned.
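The scoping behaviour described above is often implemented as a chain of tables searched innermost-out. The sketch below uses a chain of dictionaries; the field names and the interface are illustrative assumptions:

```python
class SymbolTable:
    """A scope in a chain of symbol tables; lookup searches outward."""

    def __init__(self, parent=None):
        self.entries = {}
        self.parent = parent

    def insert(self, lexeme, **attrs):
        self.entries[lexeme] = dict(lexeme=lexeme, **attrs)

    def lookup(self, lexeme):
        table = self
        while table is not None:
            if lexeme in table.entries:
                return table.entries[lexeme]
            table = table.parent
        return None                     # undeclared identifier

globals_ = SymbolTable()
globals_.insert("rate", type="float")
inner = SymbolTable(parent=globals_)
inner.insert("rate", type="int")        # redeclaration in a nested scope
print(inner.lookup("rate")["type"])     # int : innermost declaration wins
print(globals_.lookup("rate")["type"])  # float
```

Keeping one table per scope lets multiple declarations of the same identifier coexist, with lookup resolving each use to the nearest enclosing declaration.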
The Grouping of Phases into Passes
• In an implementation, activities from several phases may be grouped together into a
pass that reads an input file and writes an output file.
• For example, the front-end phases of lexical analysis, syntax analysis, semantic
analysis, and intermediate code generation might be grouped together into one pass.
• Code optimization might be an optional pass.
• The back-end pass consists of code generation for a particular target machine.
• Some compiler collections have been created around carefully designed intermediate
representations that allow the front end for a particular language to interface with the
back end for a certain target machine.
• With these collections, compilers for different source languages for one target
machine can be designed by combining different front ends with the back end for that
target machine.
• Similarly, compilers for different target machines can be designed by combining a front end with back ends for those target machines.
Compiler-Construction Tools
• Some commonly used compiler-construction tools include
1. Parser generators that automatically produce syntax analyzers from a grammatical
description of a programming language.
2. Scanner generators that produce lexical analyzers from a regular-expression description
of the tokens of a language.
3. Syntax-directed translation engines that produce collections of routines for walking a
parse tree and generating intermediate code.
4. Code-generator generators that produce a code generator from a collection of rules for
translating each operation of the intermediate language into the machine language for a
target machine.
5. Data-flow analysis engines that facilitate the gathering of information about how values are transmitted from one part of a program to each other part; data-flow analysis is a key part of code optimization.
6. Compiler-construction toolkits that provide an integrated set of routines for
constructing various phases of a compiler
The Evolution of Programming Languages
• The first programming language was machine language, which used sequences of 0s and 1s that explicitly told the computer what operations to execute and in what order.
• Programming languages can be classified in a variety of ways –
• by generation.
• First-generation languages are the machine languages,
• second-generation the assembly languages, and
• third-generation the higher-level languages like Fortran, Cobol, Lisp, C, C++, C#, and
Java.
• Fourth-generation languages are languages designed for specific applications like
NOMAD for report generation, SQL for database queries, and Postscript for text
formatting.
• Fifth-generation languages are logic- and constraint-based languages like Prolog and OPS5.
• By Computation
• Imperative for languages in which a program specifies how a computation is to be done
• Languages such as C, C++, C#, and Java are imperative languages, which have a notion of program state
and statements that change the state.
• Declarative for languages in which a program specifies what computation is to be done.
• Functional languages such as ML and Haskell and constraint logic languages such as Prolog are often
considered to be declarative languages.
• By Architecture
• Von Neumann language is a programming language whose computational model is based
on the von Neumann computer architecture.
• Languages, such as Fortran and C are von Neumann languages.
• An object-oriented language is one that supports object-oriented programming, a
programming style in which a program consists of a collection of objects that interact with
one another.
• Simula 67 and Smalltalk are the earliest major object-oriented languages. Languages such as C++, C#,
Java, and Ruby are more recent object-oriented languages.
• Scripting languages are interpreted languages with high-level operators designed for "gluing
together" computations.
• Awk, JavaScript, Perl, PHP, Python, Ruby, and Tcl are examples of scripting languages
Modeling in Compiler Design and Implementation
• The study of compilers is mainly a study of how to design the right
mathematical models and choose the right algorithms, while balancing the need
for generality and power against simplicity and efficiency.
• Some of the most fundamental models are:
• finite-state machines and regular expressions, which are useful for describing the lexical
units of programs (keywords, identifiers, and such) and for describing the algorithms
used by the compiler to recognize those units.
• context-free grammar, used to describe the syntactic structure of programming
languages such as the nesting of parentheses or control constructs.
• trees for representing the structure of programs and their translation into object code.
The Science of Code Optimization
• The term "optimization" in compiler design refers to the attempts that a compiler makes to
produce code that is more efficient than the obvious code.
• "Optimization” is thus a misnomer, since there is no way that the code produced by a
compiler can be guaranteed to be as fast or faster than any other code that performs the same
task.
• In modern times, the optimization of code that a compiler performs has become both more
important and more complex.

WHY?
• It is more complex because processor architectures have become more complex, presenting more opportunities to improve the way code executes.
• It is more important because massively parallel computers require substantial optimization, or their performance suffers by orders of magnitude.
• The use of a rigorous mathematical foundation shows that optimization is
correct and that it produces the desirable effect for all possible inputs.
• Models such as graphs, matrices, and linear programs are necessary for the
compiler to produce optimized code.
• Compiler optimizations must meet the following design objectives:
• The optimization must be correct semantically, that is, preserve the meaning of the
compiled program,
• The optimization must improve the performance of many programs- high speed, low
power consumption
• The compilation time must be kept reasonable – debugging & testing cannot be
exhaustive,
• The engineering and maintenance effort required must be manageable
Lexical Analysis – Role of Lexical Analyzer
• First phase of a compiler
• The main task is to read the input characters of the source program, group them into lexemes, and produce as output a sequence of tokens, one for each lexeme in the source program; identifier lexemes are also stored in the symbol table.
• The stream of tokens is sent to the parser for syntax analysis.
• When the lexical analyzer discovers a lexeme constituting an identifier, it stores that lexeme
into the symbol table.
• It also reads information regarding the kind of identifier from the symbol table to assist it in
determining the proper token it must pass to the parser.
• Lexical analyzers are divided into a cascade of two processes:
a) Scanning consists of the simple processes such as deletion of comments and compaction of
consecutive whitespace characters into one.
b) Lexical analysis proper is the more complex portion, where the scanner produces the
sequence of tokens as output.
Interactions between the lexical analyzer and the parser
• The interaction is implemented by having the parser call the lexical analyzer.
• The call, suggested by the getNextToken command, causes the lexical analyzer
to read characters from its input until it can identify the next lexeme and
produce for it the next token, which it returns to the parser.
• The lexical analyser also removes comments and whitespace (blank, newline,
tab, and other characters that are used to separate tokens in the input).
• It also correlates error messages generated by the compiler with the source program.
• For instance, the lexical analyzer keeps track of the number of newline characters seen,
and associates a line number with each error message.
• The lexical analyzer makes a copy of the source program with the error
messages inserted at the appropriate positions.
• If the source program uses a macro-preprocessor, the expansion of macros is
performed by the lexical analyzer.
Tokens, Patterns, and Lexemes
• A token is a pair consisting of a token name and an optional attribute value.
• The token name is an abstract symbol representing a kind of lexical unit, e.g., a
particular keyword, or a sequence of input characters denoting an identifier.
• The token names are the input symbols that the parser processes; a token is often referred to by its token name.
• A pattern is a description of the form that the lexemes of a token may take.
• In the case of a keyword as a token, the pattern is just the sequence of characters
that form the keyword.
• For identifiers and some other tokens, the pattern is a more complex structure that
is matched by many strings.
• A lexeme is a sequence of characters in the source program that matches the
pattern for a token and is identified by the lexical analyzer as an instance of that
token.
Example
Attributes for Tokens
• When more than one lexeme can match a pattern, the lexical analyzer provides the
subsequent compiler phases additional information about the particular lexeme that
matched.
• For example, the pattern for token number matches both 0 and 1, but it is extremely
important for the code generator to know which lexeme was found in the source program.
• Thus, the lexical analyzer returns to the parser not only a token name, but an attribute
value that describes the lexeme represented by the token;
• Information about an identifier - e.g., its lexeme, its type, and the location at which it is
first found (in case an error message about that identifier must be issued) - is kept in the
symbol table.
• Thus, the appropriate attribute value for an identifier is a pointer to the symbol-table
entry for that identifier.
• The token name influences parsing decisions, while the attribute value influences
translation of tokens after the parse.
• The token names and associated attribute values for the Fortran statement are
written below as a sequence of pairs.
E = M * C ** 2
<id, pointer to symbol-table entry for E>
< assign-op >
<id, pointer to symbol-table entry for M>
<mult -op>
<id, pointer to symbol-table entry for C>
<exp-op>
<number , integer value 2 >
Lexical Errors

Problems in recognition of tokens


• DO 5 I = 1,25 is the required looping statement in FORTRAN
• DO is Looping statement,
• I is the counter variable
• 5 is the label of a statement till where the looping has to take place
• 1,25 is the range from 1 to 25

• If it was given as DO 5 I=1.25, then?


Lexical Errors
• It is hard for a lexical analyzer to tell, without the aid of other components, that
there is a source-code error.
• For instance, when the string fi is encountered for the first time in a C program in the context: fi ( a == f(x))
• a lexical analyzer cannot tell whether fi is a misspelling of the keyword if or an
undeclared function identifier.
• Since fi is a valid lexeme for the token id, the lexical analyzer must return the
token id to the parser and let some other phase of the compiler - probably the
parser - handle an error due to transposition of the letters.
Input Buffering - Buffer Pairs
• Specialized buffering techniques have been developed to reduce the amount of overhead
required to process a single input character.
• An important scheme involves two buffers that are alternately reloaded as shown
• Each buffer is of the same size N, and N is usually the size of a disk block, e.g., 4096 bytes.
• Using one system read command, N characters are read into a buffer, rather than using one
system call per character.
• If fewer than N characters remain in the input file, then a special character, represented by
eof, marks the end of the source file and is different from any possible character of the source
program.
• Two pointers to the input are maintained:
1. Pointer lexemeBegin marks the beginning of the current lexeme, whose extent we are attempting to determine.
2. Pointer forward scans ahead until a pattern match is found.
• Once the next lexeme is determined, forward is set to the character at its right
end.
• Then, after the lexeme is recorded as an attribute value of a token returned to
the parser, lexemeBegin is set to the character immediately after the lexeme just
found.
• In Figure, it is shown that forward has passed the end of the next lexeme, ** (the
Fortran exponentiation operator), and must be retracted one position to its left.
• Advancing forward requires that first test whether reached the end of one of the
buffers, and if so, reload the other buffer from the input, and move forward to
the beginning of the newly loaded buffer.
Sentinels
• The sentinel is a special character that cannot be part of the source program, and
a natural choice is the character eof.
• Figure shows the same arrangement as previous Figure, but with the sentinels
added.
• Note that eof retains its use as a marker for the end of the entire input.
• Any eof that appears other than at the end of a buffer means that the input is at
an end.
• The algorithm for advancing forward
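The buffer-pair scheme with sentinels can be sketched as follows. The buffer size, the "\0" stand-in for eof, and the class interface are illustrative assumptions; a production lexer would work over raw disk blocks:

```python
EOF = "\0"   # stands in for the special eof sentinel character
N = 4        # tiny buffer size for illustration; normally a disk-block size

class TwoBufferInput:
    """Sketch of the buffer-pair scheme: two buffers, each ending in a sentinel."""

    def __init__(self, text):
        self._chunks = [text[i:i + N] for i in range(0, len(text), N)]
        self.buffers = [EOF, EOF]    # empty input is just a sentinel
        self.buf, self.pos = 1, 0
        self._reload()               # fill the first buffer

    def _reload(self):
        """Refill the other buffer with the next N characters, if any remain."""
        if not self._chunks:
            return False             # nothing left to load: real end of input
        self.buf = 1 - self.buf      # switch buffers
        self.buffers[self.buf] = self._chunks.pop(0) + EOF
        self.pos = 0
        return True

    def next_char(self):
        c = self.buffers[self.buf][self.pos]
        self.pos += 1
        if c != EOF:
            return c                 # common case: one test per character
        if self.pos == len(self.buffers[self.buf]) and self._reload():
            return self.next_char()  # sentinel marked a buffer end, not input end
        return EOF

reader = TwoBufferInput("E = M * C ** 2")
chars = []
while (c := reader.next_char()) != EOF:
    chars.append(c)
print("".join(chars))   # E = M * C ** 2
```

The common path costs a single comparison per character; the expensive end-of-buffer test runs only when the sentinel is actually seen.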
Specification of Tokens
• Regular expressions are an important notation for specifying lexeme patterns.
• Strings and Languages
• An alphabet is any finite set of symbols Example letters, digits, and punctuation.
• A string over an alphabet is a finite sequence of symbols drawn from that alphabet
• A language is any countable set of strings over some fixed alphabet.
• Terms for Parts of Strings
1. A prefix of string is any string obtained by removing zero or more symbols from the end.
2. A suffix of string is any string obtained by removing zero or more symbols from the
beginning.
3. A substring of s is obtained by deleting any prefix and any suffix from s
4. The proper prefixes, suffixes, and substrings of a string s are those prefixes, suffixes, and substrings, respectively, of s that are neither ε nor s itself.
5. A subsequence of s is any string formed by deleting zero or more not necessarily
consecutive positions of s.
Operations on Languages
Regular Expressions
• BASIS: There are two rules that form the basis:
1. ε is a regular expression, and L(ε) is {ε}, that is, the language whose sole member is the empty string.
2. If a is a symbol in Σ, then a is a regular expression, and L(a) = {a}, that is, the language with one string, of length one, with a in its one position.
• INDUCTION: There are four parts to the induction whereby larger regular expressions are
built from smaller ones.
• Suppose r and s are regular expressions denoting languages L(r) and L(s), respectively.
1. (r)+ (s) is a regular expression denoting the language L(r) U L(s).
2. (r) . (s) is a regular expression denoting the language L(r) L(s) .
3. (r) * is a regular expression denoting (L (r)) * .
4. (r) is a regular expression denoting L(r), i.e., additional pairs of parentheses can be added around expressions without changing the language they denote.
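The three induction rules correspond directly to the |, concatenation, and * operators of practical regex engines. Python's re module illustrates the languages they denote (a sketch; ε itself is just the empty match):

```python
import re

def in_lang(pattern, s):
    """True iff the whole string s is in the language of the pattern."""
    return re.fullmatch(pattern, s) is not None

assert in_lang("a|b", "a") and in_lang("a|b", "b")       # union: L(a) ∪ L(b)
assert in_lang("ab", "ab") and not in_lang("ab", "a")    # concatenation: L(a)L(b)
assert in_lang("a*", "") and in_lang("a*", "aaa")        # Kleene closure: (L(a))*
assert in_lang("(a|b)*abb", "aabb")                      # a composite example
print("all regular-expression examples hold")
```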
Recognition of Tokens
Recognition of Tokens- relop
• Transition diagram that
recognizes the lexemes
matching the token relop.
• Note that states 4 and 8 have a * to indicate that we must retract the input one position.
• Implementation
for relop
transition diagram
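The relop transition diagram can be coded directly: each state inspects one character, and the * retraction of states 4 and 8 corresponds to simply not consuming the lookahead character. The function shape and attribute names (LT, LE, ...) are illustrative:

```python
def relop(s, i=0):
    """Return ((token, attribute), next_index) for a relop at s[i], else None."""
    n = len(s)
    c = s[i] if i < n else ""
    if c == "<":                            # states 1-4 of the diagram
        nxt = s[i + 1] if i + 1 < n else ""
        if nxt == "=":
            return ("relop", "LE"), i + 2
        if nxt == ">":
            return ("relop", "NE"), i + 2
        return ("relop", "LT"), i + 1       # state 4: lookahead not consumed
    if c == "=":                            # state 5
        return ("relop", "EQ"), i + 1
    if c == ">":                            # states 6-8
        nxt = s[i + 1] if i + 1 < n else ""
        if nxt == "=":
            return ("relop", "GE"), i + 2
        return ("relop", "GT"), i + 1       # state 8: lookahead not consumed
    return None

print(relop("<=rest"))   # (('relop', 'LE'), 2)
print(relop("<rest"))    # (('relop', 'LT'), 1)
print(relop(">x"))       # (('relop', 'GT'), 1)
```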
Recognition of Reserved Words and Identifiers
• There are two ways that we can handle reserved words that look like identifiers:
1. Install the reserved words in the symbol table initially.
• A field of the symbol-table entry indicates that these strings are never ordinary identifiers,
and tells which token they represent.
• When an identifier is found, a call to installID places it in the symbol table if it is not already
there and returns a pointer to the symbol-table entry for the lexeme found.
• The function getToken examines the symbol table entry for the lexeme found, and returns
whatever token name the symbol table says this lexeme represents - either id or one of the
keyword tokens that was initially installed in the table.
• Create separate transition diagrams for each keyword
• Note that such a transition diagram consists of states representing the situation
after each successive letter of the keyword is seen, followed by a test for a
"nonletter-or-digit,“ i.e., any character that cannot be the continuation of an
identifier.
• It is necessary to check that the identifier has ended, or else token then is
returned in situations where the correct token was id, with a lexeme like
thenextvalue that has then as a proper prefix.
• If this approach is adopted, then the tokens must be prioritized so that the
reserved-word tokens are recognized in preference to id, when the lexeme
matches both patterns.
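Approach 1 can be sketched with a dictionary-backed symbol table: reserved words are installed first, and installID/getToken then distinguish keywords from ordinary identifiers. The keyword set chosen here is an assumption:

```python
symbol_table = {}

def install_keywords():
    # Reserved words installed up front; their entries name the keyword token.
    for kw in ("if", "then", "else", "while"):   # an assumed keyword set
        symbol_table[kw] = {"token": kw}

def installID(lexeme):
    """Place the lexeme in the symbol table if absent; return its entry."""
    if lexeme not in symbol_table:
        symbol_table[lexeme] = {"token": "id"}
    return symbol_table[lexeme]

def getToken(entry):
    """Return whatever token name the symbol table says this lexeme represents."""
    return entry["token"]

install_keywords()
print(getToken(installID("then")))          # then : a reserved word
print(getToken(installID("thenextvalue")))  # id   : an ordinary identifier
```

Because thenextvalue is looked up as a whole lexeme, the keyword then embedded as its proper prefix causes no confusion.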
Unsigned floating Numbers
• 123.56E-36
• 3E54
• 567
• 12E-6
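A single regular expression covering the four forms listed above: an integer part, an optional fraction, and an optional signed exponent (a sketch of one plausible pattern; a real language definition may differ in details such as optional digits after the point):

```python
import re

# digits, optional ".digits", optional "E" with optional sign and digits
NUMBER = re.compile(r"\d+(\.\d+)?(E[+-]?\d+)?")

for lexeme in ["123.56E-36", "3E54", "567", "12E-6"]:
    print(lexeme, bool(NUMBER.fullmatch(lexeme)))   # each prints True
```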
Transition diagram for white space
Lexical Analyzer Generator
Exercise
• What is the difference between a compiler and an interpreter?
• What are the advantages of (a) a compiler over an interpreter (b) an interpreter over a
compiler?
• What advantages are there to a language-processing system in which the compiler produces
assembly language rather than machine language?
• Indicate which of the following terms apply to which of the following languages:
• a) imperative b) declarative c) von Neumann d) object-oriented e) functional
f) third-generation g) fourth-generation h) scripting
• 1) C 2) C++ 3) Cobol 4) Fortran 5) Java 6) Lisp 7) ML 8) Perl 9) Python 10) VB.
Exercise
• Divide the following C++ program that returns x-squared, but never more than
100 into appropriate lexemes:
• float limitedSquare(x) float x { return (x <= -10.01 || x >= 10.01) ? 100 : x*x; }
Exercise
• Describe the languages denoted by the following regular expressions:

• In a string of length n, how many of the following are there?


• a) Prefixes. b) Suffixes. c) Proper prefixes. d) Substrings. e) Subsequences.
• The SQL keyword SELECT can also be written select, Select, or sElEcT, for instance. Show
how to write a regular expression for a keyword in a case insensitive language. Illustrate the
idea by writing the expression for "select“ in SQL.
THANK YOU
