CD Unit1 Notes
Compiler: A program that reads a program written in one high-level (source) language and translates it into an equivalent
program in another (object) language, which is ready to be executed on a computer.
Related Topics
Compilers vs. Translators
Compiler typically refers to translation from high-level source code to low-level code; translator is the more general term, and the examples below include translations between languages at the same level.
Examples
Typical compilers: gcc, javac…
Non-typical compilers:
o Latex (document compiler).
o C-to-silicon compiler.
Translators
o F2c: Fortran-to-C translator (both high-level).
o Latex2html (both documents).
o Dvips2ps (both low-level).
Compiler vs. Interpreter
1) Compiler: It translates the source code into object code as a whole.
   Interpreter: It translates the statements of the source code one by one and executes them immediately.
2) Compiler: The translator program is not required each time you want to run the program.
   Interpreter: The translator program is required each time you want to run the program.
3) Compiler: It does not make it easy to correct mistakes in the source code.
   Interpreter: It makes it easier to correct mistakes in the source code.
4) Compiler: Most high-level programming languages have a compiler.
   Interpreter: A few high-level programming languages have an interpreter.
Assembler: A program that translates an assembly-language program into relocatable machine code.
In addition to a compiler, several other programs may be required to create an executable target program.
A source program may be divided into modules stored in separate files. The task of collecting the source program is sometimes
entrusted to a separate program, called a preprocessor. The preprocessor may also expand shorthands, called macros, into source
language statements.
The modified source program is then fed to a compiler. The compiler may produce an assembly-language program as its output,
because assembly language is easier to produce as output and is easier to debug. The assembly language is then processed by a
program called an assembler that produces relocatable machine code as its output.
Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other
relocatable object files and library files into the code that actually runs on the machine. The linker resolves external memory
addresses, where the code in one file may refer to a location in another file. The loader then puts all of the executable
object files together into memory for execution.
Fig: Language Processing System
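As a concrete illustration, on a Unix-like system with the GNU toolchain these stages can be driven one at a time (a sketch; exact commands and flags vary by platform):
$ gcc -E prog.c -o prog.i     # preprocessor: expands #include directives and macros
$ gcc -S prog.i -o prog.s     # compiler proper: produces assembly language
$ as prog.s -o prog.o         # assembler: produces relocatable machine code
$ gcc prog.o -o prog          # driver invokes the linker: resolves external references, links libraries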
Structure of a Compiler
There are two parts to compilation: analysis and synthesis.
The analysis part breaks up the source program into constituent pieces and imposes a grammatical structure on them. It
then uses this structure to create an intermediate representation of the source program. If the analysis part detects that the
source program is either syntactically ill formed or semantically unsound, then it must provide informative messages, so
the user can take corrective action. The analysis part also collects information about the source program and stores it in a
data structure called a symbol table, which is passed along with the intermediate representation to the synthesis part.
The synthesis part constructs the desired target program from the intermediate representation and the information in the
symbol table.
The analysis part is often called the front end of the compiler; the synthesis part is the back end.
Phases Of Compiler:
The compilation process is a sequence of various phases. Each phase transforms the source program from one representation
to another: it takes its input from the previous phase, has its own representation of the source program, and feeds its output to the
next phase of the compiler.
Fig: Phases of Compiler
Lexical Analysis (also called Scanner)
This phase works as a text scanner: it reads the source code as a stream of characters and groups it into meaningful lexemes. The
lexical analyzer represents these lexemes in the form of tokens:
<token-name, attribute-value>
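For example, for the assignment statement position = initial + rate * 60, the lexical analyzer might produce the token stream below, where 1, 2, and 3 are symbol-table entries for position, initial, and rate:
<id,1> <=> <id,2> <+> <id,3> <*> <60>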
Syntax Analysis
The next phase is called syntax analysis or parsing. It takes the tokens produced by lexical analysis as input and generates a parse
tree (or syntax tree). In this phase, token arrangements are checked against the source code grammar, i.e., the parser checks if the
expression made by the tokens is syntactically correct.
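For the token stream above, the parser would group rate * 60 before the addition, producing a syntax tree equivalent to the bracketing <id,1> = (<id,2> + (<id,3> * 60)).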
Semantic Analysis
Semantic analysis checks whether the parse tree constructed follows the rules of the language, for example, that values are assigned
only between compatible data types and that a string is not added to an integer. The semantic analyzer also keeps track of identifiers,
their types and expressions, and whether identifiers are declared before use. It produces an annotated syntax tree as its
output.
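A small C fragment showing the kind of errors a semantic analyzer reports (this is deliberately erroneous code; the names are arbitrary):
void demo(void) {
    int x;
    y = 10;         /* semantic error: y is used without being declared */
    x = 3.5 % 2;    /* semantic error: the % operator requires integer operands */
}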
Intermediate Code Generation
After semantic analysis, the compiler generates an intermediate code of the source code for the target machine. It represents a program
for some abstract machine. It is in between the high-level language and the machine language. This intermediate code should be
generated in such a way that it makes it easier to be translated into the target machine code.
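Continuing the running example position = initial + rate * 60, and assuming rate is a floating-point variable so that 60 must be converted, a typical three-address intermediate code is:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3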
Code Optimization
The next phase does code optimization of the intermediate code. Optimization can be assumed as something that removes unnecessary
code lines, and arranges the sequence of statements in order to speed up the program execution without wasting resources (CPU,
memory).
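For the same fragment, the optimizer can perform the int-to-float conversion of 60 once at compile time and eliminate the temporary t3, which is used only once:
t1 = id3 * 60.0
id1 = id2 + t1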
Code Generation
In this phase, the code generator takes the optimized representation of the intermediate code and maps it to the target machine language.
The code generator translates the intermediate code into a sequence of (generally) relocatable machine code.
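One possible target code for the optimized fragment above, using registers R1 and R2 (the mnemonics are illustrative; a real instruction set will differ):
LDF  R2, id3
MULF R2, R2, #60.0
LDF  R1, id2
ADDF R1, R1, R2
STF  id1, R1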
Symbol Table
It is a data structure maintained throughout all the phases of a compiler. All identifier names, along with their types, are stored here.
The symbol table makes it easier for the compiler to quickly search the identifier record and retrieve it.
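A minimal sketch in C of one way a symbol table can be organized, as a hash table with entries chained per bucket; the struct layout, sizes, and function names here are assumptions for illustration, not a prescribed design:

#include <string.h>

#define TABLE_SIZE 211                 /* number of hash buckets (an arbitrary prime) */

struct symbol {
    char name[64];                     /* the identifier's lexeme */
    char type[16];                     /* e.g., "int", "float" */
    struct symbol *next;               /* chain for entries that hash to the same bucket */
};

static struct symbol *buckets[TABLE_SIZE];

static unsigned hash(const char *s) {  /* simple string hash over the lexeme */
    unsigned h = 0;
    while (*s) h = h * 31 + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

struct symbol *lookup(const char *name) {   /* return the entry for name, or NULL if absent */
    for (struct symbol *p = buckets[hash(name)]; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}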
Fig: Phases of Compiler with Example
Pass: several phases may be grouped into one pass.
For example, the front-end phases of lexical analysis, syntax analysis, semantic analysis, and intermediate code generation might
be grouped together into one pass. Code optimization might be an optional pass. Then there could be a back-end pass consisting of
code generation for a particular target machine.
Compiler-Construction Tools:
These are tools used to implement the various phases of a compiler.
They are also called compiler-compilers, compiler-generators, or translator-writing systems.
1. Scanner generators that produce lexical analyzers from a regular-expression description of the tokens of a language.
2. Parser generators that automatically produce syntax analyzers from a grammatical description of a programming
language.
3. Syntax-directed translation engines that produce collections of routines for walking a parse tree and generating
intermediate code.
4. Code-generator generators that produce a code generator from a collection of rules for translating each operation of the
intermediate language into the machine language for a target machine.
5. Data-flow analysis engines that facilitate the gathering of information about how values are transmitted from one part of a
program to each other part. Data-flow analysis is a key part of code optimization.
6. Compiler-construction toolkits that provide an integrated set of routines for constructing various phases of a compiler.
Today, there are thousands of programming languages. They can be classified in a variety of ways.
One classification is by generation
Another classification of languages uses the term imperative for languages in which a program specifies how a
computation is to be done, and declarative for languages in which a program specifies what computation is to be done.
Declarative
o Functional : Lisp/Scheme, ML, Haskell
o Dataflow: Id, Val
o Logic, constraint-based: Prolog, spreadsheets
o Template-based: XSLT
Imperative
o Von Neumann: C, Ada, Fortran, . . .
o Scripting: Perl, Python, PHP, . . .
o Object-oriented: Smalltalk, Eiffel, C++, Java, . . .
The environment is a mapping from names to locations in the store. Since variables refer to locations
("l-values" in the terminology of C), we could alternatively define an environment as a mapping from
names to variables.
The state is a mapping from locations in the store to their values. That is, the state maps l-values to their
corresponding r-values, in the terminology of C.
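A two-line C fragment that makes the two mappings concrete (the address shown is purely illustrative):
int x = 5;    /* environment: the name x is bound to a storage location, say location 1000; state: location 1000 holds the r-value 5 */
x = x + 1;    /* the environment is unchanged, but the state now maps location 1000 to 6 */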
An identifier is a string of characters, typically letters or digits, that refers to (identifies) an entity, such
as a data object, a procedure, a class, or a type. All identifiers are names, but not all names are
identifiers. Names can also be expressions
A variable refers to a particular location of the store.
The scope rules for C are based on program structure; the scope of a declaration is determined implicitly
by where the declaration appears in the program. Later languages, such as C++, Java, and C# also
provide explicit control over scopes through the use of keywords like public, private, and protected.
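A small C example in which the scope of each declaration of i is fixed implicitly by where it appears (the names are only for illustration):
int i = 1;              /* visible from here to the end of the file, except where hidden */
void f(void) {
    int i = 2;          /* hides the global i throughout the body of f */
    {
        int i = 3;      /* hides the outer declarations, but only inside this inner block */
    }
}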
A function generally returns a value of some type (the "return type"), while a procedure does not return
any value. C and similar languages, which have only functions, treat procedures as functions that have the
special return type "void" to signify no return value. Object-oriented languages like Java and C++ use
the term "methods."
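In C terms, a minimal sketch of the distinction:
#include <stdio.h>

int square(int n) { return n * n; }         /* a function: returns a value of its return type, int */
void report(int n) { printf("%d\n", n); }   /* procedure-like: the return type void signals no return value */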
Through keywords like public, private, and protected, object-oriented languages such as C++ or Java provide
explicit control over access to member names in a superclass. These keywords support encapsulation by
restricting access.
Dynamic scope resolution is also essential for polymorphic procedures, those that have two or more
definitions for the same name.
All programming languages have a notion of a procedure, but they can differ in how these procedures
get their arguments (a sketch contrasting the first two mechanisms follows the list below).
o Call by value
o Call by reference
o Call by name
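A short C sketch contrasting the first two mechanisms. C itself passes arguments by value, so call by reference is simulated here with a pointer; call by name has no direct C equivalent:

#include <stdio.h>

void by_value(int n)      { n = 99; }    /* changes only the local copy of the argument */
void by_reference(int *n) { *n = 99; }   /* changes the caller's variable through its address */

int main(void) {
    int a = 1, b = 1;
    by_value(a);          /* a is still 1 afterwards */
    by_reference(&b);     /* b is now 99 */
    printf("%d %d\n", a, b);             /* prints: 1 99 */
    return 0;
}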
Token: A token is a pair consisting of a token name and an optional attribute value. The token name is an abstract symbol
representing a kind of lexical unit, e.g., a particular keyword, or a sequence of input characters denoting an identifier
The token names are the input symbols that the parser processes
Pattern: A pattern is a description of the form that the lexemes of a token may take (i.e., the regular expression that the
lexemes should match).
Lexeme: A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the
lexical analyzer as an instance of that token.
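For example, in the statement count = count + 1, the character sequence count is a lexeme; it matches the pattern letter (letter | digit)* and is therefore reported to the parser as an instance of the token id, while the lexeme 1 matches the pattern for the token number.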
Lexical Errors
A lexical error occurs when the scanner encounters a sequence of characters that cannot form any valid token, for example an illegal
character in the input. A common recovery strategy is panic mode: delete successive characters from the remaining input until a
well-formed token can be found.
Input Buffering
Input buffering is used in the lexical analyzer to speed up the task of reading the source program.
Without buffering, the scanner reads every character directly from secondary storage, which is very time consuming; a buffering
technique is used to avoid this.
In this technique, a block of data is first read into a buffer and then scanned by the lexical analyzer. Using one system read command
we can read N characters (the block size) into a buffer, rather than using one system call per character. If fewer than N characters
remain in the input file, then a special character, represented by eof, marks the end of the source file.
Two pointers to the input are maintained:
1. Pointer lexemeBegin marks the beginning of the current lexeme.
2. Pointer forward scans ahead until a pattern match for the next lexeme is found.
Once the next lexeme is determined, forward is set to the character at its right end. Then, after the lexeme is recorded as an
attribute value of a token returned to the parser, lexemeBegin is set to the character immediately after the lexeme just found. In
below Figure, we see forward has passed the end of the next lexeme, ** (the Fortran exponentiation operator), and must be
retracted one position to its left.
Fig: Using a pair of input buffers
If only one buffer is used and the length of a lexeme exceeds the length of the buffer, then the buffer has to be refilled to scan
the rest of the lexeme, which overwrites the first part of the lexeme.
To overcome this, a two-buffer scheme is used. In this scheme, advancing forward requires that we first test whether we have
reached the end of one of the buffers; if so, we must reload the other buffer from the input and move forward to the beginning
of the newly loaded buffer.
To identify the end of a buffer, we place an eof character at the end, called a sentinel (i.e., a sentinel is a special character that
marks the end of the buffer).
Fig: Look ahead code with sentinels
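A C-flavored sketch of the buffer-end test that the figure describes; the buffer size N, the reload helper, and the use of '\0' as the eof sentinel are assumptions made for illustration:

#include <stdio.h>

#define N 4096                      /* block size; each buffer holds N characters plus a sentinel */
char buf[2][N + 1];                 /* the pair of input buffers */
char *forward;                      /* the scanning pointer, assumed to start at buf[0] after the first reload */

void reload(char *buffer);          /* assumed helper: reads the next block and appends the '\0' sentinel */

int advance(void) {                 /* return the next input character, or EOF at end of input */
    int c = (unsigned char)*forward++;
    if (c == '\0') {                              /* reached a sentinel */
        if (forward == buf[0] + N + 1) {          /* sentinel at the end of the first buffer */
            reload(buf[1]);
            forward = buf[1];
        } else if (forward == buf[1] + N + 1) {   /* sentinel at the end of the second buffer */
            reload(buf[0]);
            forward = buf[0];
        } else {
            return EOF;                           /* a sentinel inside a buffer marks the real end of input */
        }
        c = (unsigned char)*forward++;            /* (a second check would be needed if input ends exactly on a buffer boundary) */
    }
    return c;
}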
Specification of Tokens
Regular expressions are an important notation for specifying lexeme patterns. While they cannot express all possible patterns, they
are very effective in specifying those types of patterns that we actually need for tokens
An alphabet is any finite set of symbols. Typical examples of symbols are letters, digits, and punctuation. The set {0,1} is the
binary alphabet.
A string over an alphabet is a finite sequence of symbols drawn from that alphabet. The length of a string s is usually written |s|;
banana is a string of length six. The empty string, denoted by ε, is the string of length zero.
Regular Expressions (R.E.) are useful for representing sets of strings of a specific language. They provide a convenient and useful
notation for representing tokens.
A Regular Expression can be defined recursively as follows (examples applying these rules appear after the list):
1. Any element x ∈ ∑ is a regular expression.
2. The null string ε is a R.E.
3. The union of two R.E.'s R1 and R2 is also a R.E.: (R1+R2) or (R1|R2).
4. The concatenation of two R.E.'s R1 and R2 is also a R.E.: (R1.R2) or (R1R2).
5. The iteration (closure) of a R.E. R is also a R.E.: (R*).
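For example, over the alphabet ∑ = {a, b}: a|b denotes the set {a, b}; (a|b)(a|b) denotes {aa, ab, ba, bb}; a* denotes {ε, a, aa, aaa, ...}; and (a|b)* denotes the set of all strings of a's and b's, including the empty string.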
Regular definition:
If ∑ is an alphabet of basic symbols, then a regular definition is a sequence of definitions of the form:
d1 → r1
d2 → r2
...
dn → rn
where:
1. Each di is a new symbol, not in ∑ and not the same as any other of the d’s, and
2. Each ri is a regular expression over the alphabet ∑ U {d1,d2,.. . ,di-1}.
By restricting ri to ∑ and the previously defined d’s, we avoid recursive definitions
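A standard example is the regular definition for identifiers:
letter → A | B | ... | Z | a | b | ... | z
digit  → 0 | 1 | ... | 9
id     → letter ( letter | digit )*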
The terminals of the grammar, which are if, then, else, relop, id, and number, are the names of tokens as far as the lexical
analyzer is concerned. The patterns for these tokens are described using regular definitions, as shown in the figure below.
In addition, we assign the lexical analyzer the job of stripping out whitespace, by recognizing the "token" ws, defined by ws → ( blank | tab | newline )+.
Token ws is different from the other tokens in that, when we recognize it, we do not return it to the parser, but rather restart the
lexical analysis from the character that follows the whitespace. It is the following token that gets returned to the parser.
Fig: Regular Expression Patterns for Tokens
Transition Diagrams
As an intermediate step in the construction of a lexical analyzer, we first convert patterns into stylized flowcharts, called
"transition diagrams."
Here we perform the conversion from regular-expression patterns to transition diagrams by hand, but there is also a mechanical
way to construct these diagrams from collections of regular expressions.
Fig: Transition Diagram for Relational Operators
Here states 4 and 8 have a * to indicate that we must retract the input by one position.
There are two ways that we can handle reserved words that look like identifiers:
1. Install the reserved words in the symbol table initially. When we find an identifier, a call to installID
places it in the symbol table if it is not already there and returns a pointer to the symbol-table entry for the lexeme found.
The function getToken examines the symbol-table entry for the lexeme found and returns whatever token name the symbol table
says this lexeme represents: either id or one of the keyword tokens that was initially installed in the table (see the sketch after this list).
2. Create separate transition diagrams for each keyword
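A minimal C sketch of the first approach; the token codes and the install/lookup helpers are assumptions standing in for a real symbol-table interface:

enum token { ID, IF, THEN, ELSE };                          /* illustrative token codes */

struct entry { char name[64]; enum token tok; };
struct entry *install(const char *name, enum token tok);   /* assumed: adds an entry and returns it */
struct entry *lookup(const char *name);                    /* assumed: finds an entry or returns NULL */

void init_reserved(void) {              /* install the reserved words before scanning starts */
    install("if", IF);
    install("then", THEN);
    install("else", ELSE);
}

enum token get_token(const char *lexeme) {  /* classify an identifier-shaped lexeme */
    struct entry *e = lookup(lexeme);
    if (e == 0)
        e = install(lexeme, ID);        /* not seen before: an ordinary identifier */
    return e->tok;                      /* ID, or a keyword token if it was pre-installed */
}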
The C source code for the lexical analyzer is generated when you enter
$ lex lex.l
where lex.l is the file containing your lex specification
The lexical analyzer code stored in lex.yy.c (or the .c file to which it was redirected) must be compiled to generate the executable
object program, or scanner, that performs the lexical analysis of an input text.
The lex library supplies a default main() that calls the function yylex(), so you need not supply your own main(). The library is
accessed by invoking the -ll option to cc:
$ cc lex.yy.c -ll
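A minimal lex specification of the kind lex.l might contain (a sketch only; a real scanner would return token codes to a parser instead of printing):

%{
#include <stdio.h>
%}
%%
[0-9]+                    { printf("NUMBER: %s\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]*    { printf("ID: %s\n", yytext); }
[ \t\n]+                  { /* skip whitespace */ }
.                         { printf("OTHER: %s\n", yytext); }
%%

Built with the commands shown above, the resulting a.out reads its standard input, reports each number and identifier it finds, and skips whitespace.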