UNIT-1
A programming language defines a set of instructions that are combined to perform a
specific task on the CPU (Central Processing Unit). The term mainly refers
to high-level languages such as C, C++, Pascal, Ada, COBOL, etc.
Each programming language contains a unique set of keywords and syntax, which are used to
create a set of instructions. Thousands of programming languages have been developed to
date, but each language has its specific purpose. These languages vary in the level of
abstraction they provide from the hardware. Some programming languages provide less or no
abstraction while some provide higher abstraction. Based on the levels of abstraction, they can
be classified into two categories:
o Low-level language
o High-level language
In terms of abstraction from the hardware, machine language provides no abstraction,
assembly language provides little abstraction, and high-level languages provide a higher
level of abstraction.
Low-level language
The low-level language is a programming language that provides no abstraction from the
hardware, and it is represented in the form of 0s and 1s, i.e., machine instructions. The
languages that come under this category are machine-level language and assembly
language.
Machine-level language
The machine-level language consists of a set of instructions in binary form, i.e., 0s and 1s.
Since computers can understand only machine instructions in binary digits, the instructions
given to the computer can only be in binary code. Creating a program in a machine-level
language is very difficult, as it is not easy for programmers to write in machine instructions.
It is error-prone, hard to understand, and its maintenance cost is very high. A machine-level
language is also not portable: each computer has its own machine instructions, so a program
written on one computer will not run on another.
Different processor architectures use different machine codes; for example, a PowerPC
processor uses a RISC architecture, which requires different code than an Intel x86 processor,
which uses a CISC architecture.
Assembly Language
The assembly language contains human-readable commands such as mov, add, and sub.
The problems we faced with machine-level language are reduced to some extent by using
this extended form of machine-level language. Since assembly language instructions are
written in English-like words such as mov, add, and sub, it is easier to write and understand.
Since computers can only understand machine-level instructions, we require a translator
that converts the assembly code into machine code. The translator used for this purpose is
known as an assembler.
The assembly language code is still not portable, because it refers to specific processor
registers, and different computers have different register sets.
Assembly code is also no faster than machine code: assembly language sits above machine
language in the hierarchy, which means it has some abstraction from the hardware, while
machine language has zero abstraction.
The following are the differences between machine-level language and assembly language:
The machine-level language comes at the The assembly language comes above the
lowest level in the hierarchy, so it has zero machine language means that it has less
abstraction level from the hardware. abstraction level from the hardware.
It does not require any translator as the In assembly language, the assembler is used to
machine code is directly executed by the convert the assembly code into machine code.
computer.
High-Level Language
The high-level language is a programming language that allows a programmer to write
programs that are independent of a particular type of computer. High-level languages are
considered high-level because they are closer to human languages than machine-level
languages.
When writing a program in a high-level language, the programmer's whole attention can be
paid to the logic of the problem.
o The high-level language is easy to read, write, and maintain, as it is written in English-like
words.
o The high-level languages are designed to overcome the main limitation of low-level
languages, namely the lack of portability. The high-level language is portable; i.e., these
languages are machine-independent.
The following are the differences between low-level language and high-level language:
o Translation: A low-level language requires an assembler to convert the assembly code into
machine code, whereas a high-level language requires a compiler to convert its instructions
into machine code.
o Portability: Machine code cannot run on all machines, so a low-level language is not
portable, whereas high-level code can run on all platforms, so it is portable.
o Maintenance: Debugging and maintenance are harder in a low-level language and easier
in a high-level language.
Compiler
A compiler is a translator that converts a program written in a high-level language into
machine code, processing the whole program at once.
Interpreter
An interpreter is a translator that translates and executes a high-level program statement by
statement, without producing a separate machine-code file.
Assembler
An assembler is a translator that is used to translate assembly language code into machine
language code.
Phases of Compiler
A compiler operates in various phases; each phase transforms the source program from one
representation to another. Every phase takes its input from the previous stage and feeds its
output to the next phase of the compiler.
There are six phases in a compiler, and each of these phases helps in converting the
high-level language into machine code. The phases of a compiler are:
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Intermediate code generator
5. Code optimizer
6. Code generator
Together, these phases convert the source code by dividing it into tokens, creating parse
trees, and transforming and optimizing the code step by step.
Phase 1: Lexical Analysis
Lexical analysis is the first phase, in which the compiler scans the source code. The scan
proceeds left to right, character by character, grouping the characters into tokens.
Here, the character stream from the source program is grouped into meaningful sequences
by identifying the tokens. The lexical analyzer makes entries for the corresponding tokens
in the symbol table and passes each token to the next phase.
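As a rough illustration of this phase, the following C sketch scans a hardcoded statement
left to right and groups its characters into tokens. The token classes ID, NUM, and OP are
our own simplification, not from any particular compiler:
#include <stdio.h>
#include <ctype.h>

int main(void) {
    const char *src = "total = count + rate * 5;";
    const char *p = src;
    while (*p) {
        if (isspace((unsigned char)*p)) {
            p++;                                  /* skip whitespace */
        } else if (isalpha((unsigned char)*p) || *p == '_') {
            const char *start = p;                /* identifier lexeme */
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            printf("ID(%.*s) ", (int)(p - start), start);
        } else if (isdigit((unsigned char)*p)) {
            const char *start = p;                /* numeric constant */
            while (isdigit((unsigned char)*p)) p++;
            printf("NUM(%.*s) ", (int)(p - start), start);
        } else {
            printf("OP(%c) ", *p);                /* operator or punctuation */
            p++;
        }
    }
    printf("\n");
    return 0;
}
Running it on the statement above prints ID(total) OP(=) ID(count) OP(+) ID(rate) OP(*)
NUM(5) OP(;).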
Phase 2: Syntax Analysis
Syntax analysis (parsing) is the second phase. It checks whether the tokens produced by the
lexical analyzer follow the grammar of the source language and arranges them into a parse
tree. In the parse tree:
Interior node: a record with an operator field and fields for its children
Leaf: a record with two or more fields; one for the token and the others for information
about the token
This phase:
Ensures that the components of the program fit together meaningfully
Gathers type information and checks for type compatibility
Checks that the operands are permitted by the source language
Phase 3: Semantic Analysis
Semantic analysis checks the semantic consistency of the code. It uses the syntax tree of the
previous phase along with the symbol table to verify that the given source code is semantically
consistent. It also checks whether the code is conveying an appropriate meaning.
The semantic analyzer will check for type mismatches, incompatible operands, a function
called with improper arguments, an undeclared variable, etc.
Functions of the semantic analysis phase are:
Stores the type information gathered in the symbol table or the syntax tree
Performs type checking
Reports a semantic error when there is a type mismatch for which no type-correction
(coercion) rule satisfies the desired operation
Collects type information and checks for type compatibility
Checks whether the source language permits the operands or not
Example
float x = 20.2;
float y = x*30;
In the above code, the semantic analyzer will typecast the integer 30 to the float 30.0 before
the multiplication.
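The same coercion can be observed in C itself. In this minimal sketch, the compiler
implicitly converts the integer constant 30 to floating point before the multiplication,
exactly as described above:
#include <stdio.h>

int main(void) {
    float x = 20.2f;
    /* The integer constant 30 is implicitly converted to floating point
       before the multiplication, just as the semantic analyzer above
       inserts an int-to-float conversion. */
    float y = x * 30;
    printf("y = %f\n", y);
    return 0;
}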
Phase 4: Intermediate Code Generation
Once the semantic analysis phase is over, the compiler generates intermediate code for the
target machine. The intermediate code represents a program for some abstract machine.
Intermediate code is between the high-level and machine level language. This intermediate
code needs to be generated in such a manner that makes it easy to translate it into the target
machine code.
Functions of intermediate code generation:
It should be generated from the semantic representation of the source program
It holds the values computed during the process of translation
It makes it easy to translate the intermediate code into the target language
It maintains the precedence ordering of the source language
It holds the correct number of operands for each instruction
Example
For the statement
total = count + rate * 5
the intermediate code, using the three-address code method, is:
t1 := int_to_float(5)
t2 := rate * t1
t3 := count + t2
total := t3
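Such three-address instructions are often stored as quadruples (operator, two arguments,
result). The C sketch below, whose field and variable names are illustrative only, stores and
prints the sequence above:
#include <stdio.h>

/* A three-address instruction stored as a quadruple; the field names
   are illustrative, not from any particular compiler. */
typedef struct {
    const char *op;      /* operator, e.g. "*", "+", "int_to_float" */
    const char *arg1;    /* first operand */
    const char *arg2;    /* second operand ("" if unused) */
    const char *result;  /* temporary or variable receiving the value */
} Quad;

int main(void) {
    /* total = count + rate * 5, as in the example above */
    Quad code[] = {
        {"int_to_float", "5",     "",   "t1"},
        {"*",            "rate",  "t1", "t2"},
        {"+",            "count", "t2", "t3"},
        {"copy",         "t3",    "",   "total"},
    };
    for (size_t i = 0; i < sizeof code / sizeof code[0]; i++) {
        const Quad *q = &code[i];
        if (q->arg2[0] != '\0')  /* binary operation */
            printf("%s := %s %s %s\n", q->result, q->arg1, q->op, q->arg2);
        else                     /* unary operation or copy */
            printf("%s := %s(%s)\n", q->result, q->op, q->arg1);
    }
    return 0;
}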
Phase 5: Code Optimization
The next phase is optimization of the intermediate code. This phase removes unnecessary
code lines and arranges the sequence of statements to speed up the execution of the program
without wasting resources. The main goal of this phase is to improve on the intermediate
code to generate code that runs faster and occupies less space.
The primary functions of this phase are:
It helps you to establish a trade-off between execution speed and compilation speed
It improves the running time of the target program
It generates streamlined code that is still in the intermediate representation
It removes unreachable code and gets rid of unused variables
It moves statements that are not altered inside a loop out of the loop
Example:
Consider the following code:
a = intofloat(10)
b = c * a
d = e + b
f = d
It can become:
b = c * 10.0
f = e + b
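The effect of this transformation can be pictured in C. In the hypothetical sketch below, the
two functions compute the same value; the second corresponds to the optimized intermediate
code, with the constant folded in and the redundant copies removed:
#include <stdio.h>

/* Before optimization: mirrors the intermediate code above; 'a' holds a
   compile-time constant, and 'd' and 'f' are redundant copies. */
static float unoptimized(float c, float e) {
    float a = (float)10;   /* a = intofloat(10) */
    float b = c * a;       /* b = c * a */
    float d = e + b;       /* d = e + b */
    float f = d;           /* f = d */
    return f;
}

/* After optimization: constant folding and copy propagation leave only
   the two statements shown above. */
static float optimized(float c, float e) {
    float b = c * 10.0f;   /* b = c * 10.0 */
    return e + b;          /* f = e + b */
}

int main(void) {
    printf("%f %f\n", unoptimized(2.0f, 3.0f), optimized(2.0f, 3.0f));
    return 0;
}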
Phase 6: Code Generation
Code generation is the last and final phase of a compiler. It takes its input from the code
optimization phase and produces the target code or object code as a result. The objective of
this phase is to allocate storage and generate relocatable machine code.
It also allocates memory locations for the variables. The instructions in the intermediate
code are converted into machine instructions. This phase converts the optimized
intermediate code into the target language.
The target language is the machine code. Therefore, all the memory locations and registers are
also selected and allotted during this phase. The code generated by this phase is executed to
take inputs and generate expected outputs.
Example:
a = b + 60.0
would possibly be translated into register-based instructions such as:
MOVF b, R1
ADDF #60.0, R1
MOVF R1, a
Symbol Table Management
A symbol table contains a record for each identifier with fields for the attributes of the
identifier. This component makes it easier for the compiler to search the identifier record and
retrieve it quickly. The symbol table also helps with scope management. The symbol table
and the error handler interact with all the phases, and the symbol table is updated
correspondingly.
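A minimal sketch of such a table, assuming a simple linear array of records (real compilers
typically use hash tables and nested scopes), might look like this in C:
#include <stdio.h>
#include <string.h>

/* A record for each identifier, with fields for its attributes. */
typedef struct {
    char name[32];   /* identifier name */
    char type[16];   /* attribute: data type */
    int  scope;      /* attribute: scope level */
} Symbol;

static Symbol table[100];
static int nsyms = 0;

/* Insert a new identifier record into the table. */
static void insert(const char *name, const char *type, int scope) {
    if (nsyms < 100) {
        strcpy(table[nsyms].name, name);
        strcpy(table[nsyms].type, type);
        table[nsyms].scope = scope;
        nsyms++;
    }
}

/* Search for an identifier record, most recent entry first. */
static Symbol *lookup(const char *name) {
    for (int i = nsyms - 1; i >= 0; i--)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

int main(void) {
    insert("count", "int", 0);
    insert("rate", "float", 0);
    Symbol *s = lookup("rate");
    if (s != NULL)
        printf("%s : %s (scope %d)\n", s->name, s->type, s->scope);
    return 0;
}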
Lexical Error
Lexical errors are errors that your lexer throws when it is unable to continue. This means that there's no
way to recognize a lexeme as a valid token for your lexer. If you consider a lexer to be a finite state
machine that accepts valid input strings, errors are any input strings that do not result in that finite state
machine reaching an accepting state.
Lexical errors occur during the lexical analysis phase, in which the program is converted
into a stream of tokens and identifiers are recognized by matching patterns.
A lexical error is a sequence of characters that does not match the pattern of any token.
Such errors are detected while the program is being scanned.
A lexical phase error can be:
A spelling error
Exceeding the length limit of an identifier or numeric constant
The appearance of an illegal character
Replacement of a character with an incorrect character
Transposition of two characters
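For illustration, the short C sketch below flags the appearance of an illegal character, one of
the errors listed above; the set of legal characters used here is deliberately simplified and
our own choice:
#include <stdio.h>
#include <ctype.h>
#include <string.h>

int main(void) {
    const char *line = "int a@ = 5;";  /* '@' is not a legal character */
    for (int i = 0; line[i] != '\0'; i++) {
        unsigned char c = (unsigned char)line[i];
        /* legal set here: letters, digits, and a few punctuation marks */
        if (!isalnum(c) && strchr(" =+-*/;(),_", c) == NULL)
            printf("lexical error: illegal character '%c' at column %d\n",
                   c, i + 1);
    }
    return 0;
}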
Syntactic Error
In computer science, a syntactic error is an error in the syntax of a sequence of characters or
tokens intended to be written in a specific programming language. This type of error appears
during the syntax analysis phase and is therefore detected at compile time.
Semantic Error
This type of error appears during the semantic analysis phase, and such errors are detected
during the compilation process. This is the phase where the declared identifiers are verified.
The majority of compile-time errors are scope and declaration errors, for example,
undeclared identifiers or multiply declared identifiers. Semantic errors can also occur when
an invalid variable or operator is used, or when operations are performed in the incorrect
order.
There can be different types of compilation errors depending on the program you’ve written.
Some examples of semantic errors are:
Operands of incompatible types
Variable not declared
The failure to match the actual argument with the formal argument
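The small C program below is a runnable sketch of these cases; each commented-out line,
if enabled, would produce a typical semantic (compile-time) error:
#include <stdio.h>

static int square(int n) { return n * n; }

int main(void) {
    int declared = 4;
    printf("%d\n", square(declared));

    /* Each line below, if uncommented, would produce a typical
       semantic (compile-time) error:

       undeclared = 5;     // undeclared variable
       int declared = 1;   // identifier declared twice in one scope
       square("four");     // actual argument does not match formal argument
    */
    return 0;
}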
One-Pass Compiler
A one-pass compiler reads the code only once and translates it at the same time. It passes
through the parts of each compilation unit only once, translating each part into its final
machine code. In a one-pass compiler, as each source line is processed, it is scanned and its
tokens are extracted. This is in contrast to a multi-pass compiler, which transforms the
program into one or more intermediate representations in steps between the source program
and the machine program, and which converts the whole compilation unit in each sequential
pass.
A one-pass compiler is fast, since all the compiler code is loaded into memory at once. It
can process the source text without the overhead of the operating system having to shut
down one process and start another. A one-pass compiler tends to impose some restrictions
on the program: constants, types, variables, and procedures must be defined before they are
used.
Multi-Pass Compiler
A multi-pass compiler can process the source code of a program multiple times. In the first
pass, the compiler can read the source code, scan it, extract the tokens and save the result in
an output file.
In the second pass, the compiler can read the output file produced by the first pass, build the
syntactic tree and implement the syntactical analysis. The output of this phase is a file that
includes the syntactical tree.
In the third pass, the compiler can read the output file produced by the second pass and
check whether the tree follows the rules of the language. The output of the semantic
analysis phase is the annotated syntax tree. The passes continue until the target output is
produced.
Comparison between One-Pass and Multi-Pass Compiler:
o A one-pass compiler reads the code only once and translates it at the same time, whereas
a multi-pass compiler reads the code multiple times, each time changing it into numerous
forms.
o One-pass compilers are faster; multi-pass compilers are slower, as more passes mean
more execution time.
o A one-pass compiler performs less efficient code optimization and code generation; a
multi-pass compiler performs better code optimization and code generation.
o A one-pass compiler is also called a "narrow compiler," as it has limited scope; a
multi-pass compiler is also called a "wide compiler," as it can scan every portion of the
program.
o A one-pass compiler requires large memory; in a multi-pass compiler, the memory
occupied by one pass can be reused by a subsequent pass, so the compiler needs less
memory.
Lexical Analysis
Lexical analysis is the starting phase of the compiler. It takes the modified source code,
written in the form of sentences, from the language preprocessor. The lexical analyzer is
responsible for breaking these sentences into a series of tokens, removing the whitespace in
the source code. If the lexical analyzer encounters an invalid token, it generates an error. It
reads the stream of characters, identifies the legal tokens, and passes the data to the syntax
analyzer when asked for it.
Terminologies
There are three terminologies-
Token
Pattern
Lexeme
Token: It is a sequence of characters that represents a unit of information in the source code.
Pattern: The rule that describes the form of a token is known as a pattern.
Lexeme: A sequence of characters in the source code, as per the matching pattern of a token,
is known as lexeme. It is also called the instance of a token.
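For example, in the statement int rate = 5; the lexeme rate matches the identifier pattern and
is reported as an identifier token, while the lexeme 5 matches the number pattern and is
reported as a number token (the token names here are illustrative).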
Roles of the lexical analyzer:
It removes the white spaces and comments from the source program.
It correlates error messages with the source program.
It helps to identify the tokens.
It reads the input characters from the source code.
Advantages of lexical analysis:
It helps browsers format and display a web page with the help of parsed data.
It is a necessary step toward creating a compiled binary executable.
It helps to create a more efficient and specialised processor for the task.
Disadvantages of lexical analysis:
It requires additional runtime overhead to generate the lexer table and construct the tokens.
Much effort is needed to debug and develop the lexer and its token descriptions.
Significant time is required to read the source code and partition it into tokens.
Lexical Analyzer: Input Buffering
Without buffering, the lexical analyzer would have to access secondary memory each time to
identify tokens, which is time-consuming and costly. So, the input string is stored in a buffer
and then scanned by the lexical analyzer.
The lexical analyzer scans the input string from left to right, one character at a time, to
identify tokens. It uses two pointers to scan tokens −
Begin Pointer (bptr) − It points to the beginning of the string to be read.
Look-Ahead Pointer (lptr) − It moves ahead to search for the end of the token.
Example − For the statement int a, b;
Both pointers start at the beginning of the string, which is stored in the buffer.
The character ("blank space") beyond the token ("int") has to be examined before the token
("int") can be determined.
After processing the token ("int"), both pointers are set to the start of the next token ('a'),
and this process is repeated for the whole program.
A buffer can be divided into two halves. If the look-ahead pointer reaches the end of the
first half, the second half is filled with new characters to be read. If the look-ahead pointer
reaches the right end of the second half, the first half is refilled with new characters, and
so on.
Sentinels − A sentinel is a special character (such as eof) placed at the end of each buffer
half. Each time the forward pointer is advanced, a single check against the sentinel shows
whether the end of a buffer half has been reached; if it has, the other half is reloaded.
Buffer Pairs − A specialized buffering technique can decrease the overhead needed to
process an input character. It uses two buffers, each of N characters, which are reloaded
alternately.
Two pointers, lexemeBegin and forward, are maintained. lexemeBegin points to the start of
the current lexeme being discovered, while forward scans ahead until a match for a pattern
is found. Once a lexeme is found, lexemeBegin is set to the character immediately after the
lexeme just found, and forward is set to the character at its right end.
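The C sketch below shows the reload logic of this buffer-pair scheme, assuming a
deliberately tiny half size N and using '\0' as the sentinel in place of an eof character; the
buffer layout and function names are our own:
#include <stdio.h>
#include <string.h>

#define N 8   /* size of each buffer half (deliberately tiny here) */

/* Reading from a string stands in for reading from a file. */
static const char *input = "int a, b; float rate;";
static size_t inpos = 0;

static char buf[2 * (N + 1)];  /* two halves, each with a sentinel slot */
static char *forward;

/* Reload one half of the buffer and place the sentinel after the data. */
static void fill(int half) {
    char *dst = buf + half * (N + 1);
    size_t n = strlen(input + inpos);
    if (n > N) n = N;
    memcpy(dst, input + inpos, n);
    inpos += n;
    dst[n] = '\0';             /* sentinel marks the end of this half */
}

/* Advance the forward pointer; one sentinel test per character decides
   whether a half must be reloaded. */
static char next_char(void) {
    if (*forward == '\0') {
        if (forward == buf + N) {                /* end of first half */
            fill(1);
            forward = buf + N + 1;
        } else if (forward == buf + 2 * N + 1) { /* end of second half */
            fill(0);
            forward = buf;
        } else {
            return '\0';                         /* true end of input */
        }
    }
    return *forward++;
}

int main(void) {
    fill(0);
    forward = buf;
    for (char c = next_char(); c != '\0'; c = next_char())
        putchar(c);
    putchar('\n');
    return 0;
}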
Bootstrapping:
Bootstrapping is a process in which a simple language is used to translate a more
complicated program, which in turn may handle an even more complicated program, and
so on.
Writing a compiler for any high-level language is a complicated process, and it takes a lot
of time to write one from scratch. Hence, a simple language is used to generate the target
code in stages. To clearly understand the bootstrapping technique, consider the following
scenario.
Suppose we want to write a cross compiler for a new language X. The implementation
language of this compiler is Y, and the target code being generated is in language Z; that is,
we create XYZ. Now, if an existing compiler for Y runs on machine M and generates code
for M, it is denoted as YMM. If we run XYZ using YMM, we get a compiler XMZ, that is,
a compiler for source language X that generates target code in language Z and runs on
machine M.
The following diagram illustrates the above scenario.
Example:
Compilers can be created in many different forms. Here we generate a compiler that takes
C language as input and produces assembly language as output, given a machine that runs
assembly language.
Step-1: First we write a compiler for a small subset of C (call it C0) in assembly language;
this is compiler 1.
Step-2: Then, using this subset C0, we write a compiler for the full C language; this is
compiler 2.
Step-3: Finally, we compile the second compiler: using compiler 1, compiler 2 is compiled.
Step-4: Thus we get a compiler written in ASM which compiles C and generates code
in ASM.
Compiler Construction Tools
The compiler writer can use some specialized tools that help in implementing the various
phases of a compiler. These tools assist in the creation of an entire compiler or its parts.
Some commonly used compiler construction tools include:
1. Parser Generator –
It produces syntax analyzers (parsers) from input based on a grammatical description of a
programming language or a context-free grammar. It is useful because the syntax analysis
phase is highly complex and would otherwise consume a great deal of manual effort and
time.
Example: Yacc, Bison
2. Scanner Generator –
It generates lexical analyzers from the input that consists of regular expression description based
on tokens of a language. It generates a finite automaton to recognize the regular expression.
Example: Lex
3. Syntax directed translation engines –
It generates intermediate code with three address format from the input that consists of a parse
tree. These engines have routines to traverse the parse tree and then produces the intermediate
code. In this, each node of the parse tree is associated with one or more translations.
4. Automatic code generators –
It generates the machine language for a target machine. Each operation of the intermediate
language is translated using a collection of rules and then is taken as an input by the code
generator. A template matching process is used. An intermediate language statement is replaced
by its equivalent machine language statement using templates.
5. Data-flow analysis engines –
These are used in code optimization. Data-flow analysis is a key part of code optimization
that gathers information about the values that flow from one part of a program to another.
6. Compiler construction toolkits –
It provides an integrated set of routines that aids in building compiler components or in the
construction of various phases of compiler.
LEX
LEX is a tool that automatically generates a lexical analyzer (a finite automaton). It takes a
LEX source program as its input and produces a lexical analyzer as its output. The lexical
analyzer then converts the input string entered by the user into tokens.
LEX is a program generator designed for lexical processing of character input streams. It
can be used to build anything from a simple text-search program that looks for patterns in
its input file to a component of a C compiler that transforms a program into optimized code.
Use of Lex
• lex.l is an input file written in a language that describes the generation of a lexical
analyzer. The lex compiler transforms lex.l into a C program known as lex.yy.c.
• lex.yy.c is compiled by the C compiler to a file called a.out.
• The output of the C compiler is the working lexical analyzer, which takes a stream of input
characters and produces a stream of tokens.
• yylval is a global variable which is shared by the lexical analyzer and the parser to return
the name and an attribute value of a token.
• The attribute value can be numeric code, pointer to symbol table or nothing.
• Another tool for lexical analyzer generation is Flex.
Structure of Lex Programs
A Lex program has the following form:
declarations
%%
translation rules
%%
auxiliary functions
Declarations: This section includes declarations of variables and constants, and regular
definitions.
Translation rules: This section contains regular expressions and code segments.
Form: Pattern {Action}
Pattern is a regular expression or regular definition.
Action refers to the segment of code to run when the pattern matches.
Auxiliary functions: This section holds additional functions that are used in the actions.
These functions are compiled separately and loaded with the lexical analyzer.
The lexical analyzer produced by Lex reads its input one character at a time until a valid
match for a pattern is found.
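As a concrete illustration of this structure, here is a minimal Lex specification sketch
(the patterns and actions are illustrative only) that counts the identifiers and numbers in its
input; it could be built with a command like lex count.l && cc lex.yy.c -o count (the file
name is assumed):
%{
/* Declarations section: C code and counters used by the actions. */
#include <stdio.h>
int ids = 0, nums = 0;
%}
%%
[a-zA-Z_][a-zA-Z0-9_]*   { ids++; }   /* identifier pattern and action */
[0-9]+                   { nums++; }  /* number pattern and action */
.|\n                     { ; }        /* ignore everything else */
%%
/* Auxiliary functions section. */
int yywrap(void) { return 1; }

int main(void) {
    yylex();
    printf("identifiers: %d, numbers: %d\n", ids, nums);
    return 0;
}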