CD Experiments 1,2
INDEX
S. No.   Name of Experiment   Date   Grade   Signature
1    Case Study on Compiler. (Types of Compiler, Recent Compiler etc.)
2    Case Study on phases of compilation
3    Write a program to recognize white spaces, count the number of identifiers, characters, tabs and the length of the input string.
4    Write a C program to identify whether a given line is a comment or not.
5    Write a program to remove left recursion.
6    Write a program for left factoring
7    Write a program to find FIRST and FOLLOW of non terminals of any given grammar.
8    Write a C program for constructing LL(1) parsing.
9    Write a program for Parser (Lexparser).
10   Write a program for LR Parser.
11   Write a program for leading and trailing.
12   Case study of Lex and Yacc
13   Program to Check Whether a string is valid or invalid.
14   Program to check if the string is valid or not according to Grammar.
15   Write a program to convert the Infix expression to postfix expression.
Introduction
A compiler is software that converts a program written in a high-level language (the source language) into a low-level language (the object, target, or machine language, ultimately a sequence of 0s and 1s).
The program written in the high-level language is known as the source program, and the program produced in the low-level language is known as the object (or target) program. A program written in a high-level language cannot be executed directly by the hardware; it must first be translated. Every programming language has its own compiler; however, the basic tasks performed by every compiler are the same. The process of translating source code into machine code involves several stages, including lexical analysis, syntax analysis, semantic analysis, intermediate code generation, optimization, and code generation.
A compiler is a more sophisticated program than an assembler. It checks limits, ranges, and many kinds of errors. Because it reads and analyzes the entire program before translating it, a compiler typically takes more time to run and uses more memory than simpler system software such as an assembler. When a compiler runs on the same machine for which it produces machine code, it is called a self compiler or resident compiler. When a compiler runs on one machine and produces machine code for another, it is called a cross compiler.
High-Level Language
A High-Level Language is a programming language that lets people write computer programs in terms that are close to human language. These languages are considered ‘High-Level’ because their keywords and syntax resemble natural language, which makes them easier to learn and understand. They are largely independent of the underlying computer system and offer development aids such as built-in functions and libraries.
When writing code in a high-level language, the programmer can concentrate on the logic of the problem rather than on hardware details.
C++, Java, and Python are popular high-level programming languages.
Low-Level Language
A Low-Level Language deals directly with the computer’s hardware and its components. It provides little or no abstraction from the hardware. It generally gives specific instructions to the processor and, in the case of machine language, is represented in binary form (‘0’ and ‘1’). While high-level languages are independent of the computer system, a low-level program can only be executed by the processor for which it is written. These languages require the programmer to understand the architecture of the machine instead of relying on high-level programming constructs.
Machine code (binary) and assembly language are typical examples of Low-Level programming languages.
Working of a Compiler
There are six significant steps involved in the working of a compiler, namely:
1. Lexical Analysis: This is the first stage that involves scanning the source code. The compiler scans
the source code character by character and performs tokenization. Tokenization refers to breaking
the source code into tokens like keywords, operators, and identifiers.
2. Syntax Analysis: Parsing is done in this phase. The compiler verifies the syntax of the code and checks whether the rules of the programming language are followed. It builds a parse tree from the tokens of the program; a token sequence for which no parse tree can be built contains a syntax error.
3. Semantic Analysis: This is the third stage, where the compiler checks whether the parse tree follows the semantic rules of the language. Type checking is performed, which ensures that operations are applied to compatible data types. The compiler also looks for errors such as incorrect function calls or undeclared variables. An annotated syntax tree is produced as output.
4. Intermediate Code Generation: The compiler generates an intermediate code from the source
code. An intermediate code is in between source code and machine code. The intermediate code is
generated in a way that makes it easier to convert this code into target machine code.
5. Code Optimization: In this phase, the intermediate code is optimized. The optimization involves
changing the organization of statements, removing unnecessary code lines, etc., which improves the
overall code performance.
6. Code Generation: This is the final stage, in which the target machine code is generated. The code generator takes the optimized intermediate code as input and maps it to machine instructions.
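Step 4 above can be made concrete with a small sketch that turns one assignment into three-address code. This is only an illustration under stated assumptions: it borrows Python's own `ast` module as a stand-in for the lexical and syntax phases, and the function name `to_three_address` and the temporary names `t1`, `t2`, ... are hypothetical, not part of any real compiler.

```python
import ast
import itertools

# Map Python AST operator node types to the symbols used in three-address code.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_three_address(statement):
    """Translate one assignment, e.g. 'd = (a - b) + (a - c)', into a list
    of three-address instructions (illustrative helper)."""
    tree = ast.parse(statement).body[0]
    if not isinstance(tree, ast.Assign) or len(tree.targets) != 1:
        raise ValueError("expected a single assignment")
    counter = itertools.count(1)
    code = []

    def emit(node):
        # Leaves (names and constants) need no instruction of their own.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        # Each binary operation becomes one instruction with a fresh temporary.
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = emit(node.left)
            right = emit(node.right)
            temp = f"t{next(counter)}"
            code.append(f"{temp} = {left} {OPS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    result = emit(tree.value)
    code.append(f"{tree.targets[0].id} = {result}")
    return code


for line in to_three_address("d = (a - b) + (a - c)"):
    print(line)
```

Each instruction has at most one operator on its right-hand side, which is what makes this form easy to optimize and to map to machine instructions in the later steps.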
There are many different types of Compilers. A few of them are mentioned
below.
Traditional Compiler
Traditional Compilers simply convert a high-level language program code into its corresponding
machine code.
For example, a traditional compiler converts C++ source code to machine or assembly code.
Incremental Compiler
Incremental Compilers generate machine code for each statement independently of the machine code generated for other statements. They recompile only those lines of source code that have been modified, and the recompiled code is merged with the previously compiled code to produce the new target code.
The GNU C/C++ compiler is commonly cited as an example; in practice it recompiles whole source files, and a build tool such as make decides which files have changed.
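The idea of recompiling only what changed can be sketched with a cache keyed by a hash of each unit's source text. This is a minimal sketch, not how any particular compiler is implemented: the class `IncrementalCompiler` and the `compile_unit` callback are hypothetical names, and a "unit" here stands for whatever granularity (statement, function, or file) the compiler tracks.

```python
import hashlib


class IncrementalCompiler:
    """Recompile a unit only when its source text has changed (sketch)."""

    def __init__(self, compile_unit):
        # compile_unit: hypothetical function mapping source text to object code.
        self.compile_unit = compile_unit
        self.cache = {}  # unit name -> (source hash, compiled output)

    def build(self, units):
        """units maps unit name -> source text.
        Returns (outputs, names of the units that were recompiled)."""
        outputs = {}
        recompiled = []
        for name, source in units.items():
            digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
            cached = self.cache.get(name)
            if cached is None or cached[0] != digest:
                # New or modified unit: compile it and refresh the cache.
                self.cache[name] = (digest, self.compile_unit(source))
                recompiled.append(name)
            # Merge step: unchanged units reuse the previously compiled code.
            outputs[name] = self.cache[name][1]
        # Forget units that no longer exist in the program.
        for name in list(self.cache):
            if name not in units:
                del self.cache[name]
        return outputs, recompiled


compiler = IncrementalCompiler(str.upper)  # str.upper stands in for real compilation
compiler.build({"a": "x = 1", "b": "y = 2"})
outputs, recompiled = compiler.build({"a": "x = 1", "b": "y = 3"})
print(recompiled)
```

On the second build only unit `b` is recompiled, because the hash of `a` is unchanged.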
JIT Compiler
Just-In-Time (JIT) Compilers are run-time compilers that produce executable machine code from intermediate code (bytecode) while the program is running. These compilers perform specific optimizations while compiling a series of bytecodes. They also perform type-based verification, which makes the generated machine code more reliable and better optimized.
Cross Compilers
A cross-compiler creates executable code for a platform other than the one on which it is running. For example, a cross-compiler running on an x86 desktop machine can produce executable code for an ARM-based embedded device.
Single-pass Compiler
A Single-pass Compiler combines all the compiler phases in a single module and processes the source code in one pass.
The tokens are extracted from the source code, and then the syntax of the source code is checked. A parse tree is created, and semantic analysis is done, which checks the meaning and correctness of the source code. Finally, the machine code is generated.
Two-pass Compiler
A Two-pass Compiler requires two passes to scan the source code and perform its translation.
1. Front End: This is the analysis phase of the compiler, which involves scanning the source code, performing lexical analysis, and finding syntax errors. This phase generates an intermediate code, which is passed to the back end.
2. Back End: The back end, or synthesis phase, generates the machine code from the intermediate code with the help of the symbol table.
Operations of Compiler
These are some of the operations performed by a compiler.
➔ It breaks the source program into smaller parts.
➔ It creates symbol tables and intermediate representations.
➔ It compiles the code and detects errors.
➔ It keeps a record of all code and variables.
➔ It analyzes the full program and translates it.
➔ It converts source code to machine code.
Applications of Compilers
● Compiler technology is required in implementing high-level programming languages to transform
them into a low-level language that can be understood by the machine.
● Optimizing compilers improve the overall performance of the program and thus reduce the inefficiency introduced by high-level abstractions.
● Compiler technology is also useful in designing computer architectures. Earlier, compilers were developed after the machines had been built. Today, compilers are developed during the processor-design stage of modern computer architectures.
● Compiler technology also helps parallelize applications so that several threads can run on different processors.
● Compiler technology is also used in many program translations such as binary translation, hardware
synthesis, database query interpretation, etc.
The purpose of a compiler is to let its users write programs in a language that is user-friendly and convenient. The compiler then converts the program into an equivalent program in a language that is closer to the machine and more efficient to execute. Compilation proceeds through several phases.
1. Lexical Analysis:
Lexical analysis, the first phase of the compiler, receives the source code of the program as input. Lexical analysis is also referred to as linear analysis or scanning. It is the process of tokenizing.
The lexer scans the input source code one character at a time. As soon as it identifies the end of a lexeme, it transforms the lexeme into a token. In this manner the input is transformed into a sequence of tokens. A token is a meaningful group of characters from the source that the compiler recognizes. The lexical analyzer then passes these tokens to the next phase of the compiler. Scanning also eliminates non-token structures from the input stream, such as comments and unnecessary white space. The program that implements lexical analysis is known as a lexer, lexical analyzer, or scanner.
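The scanning process described above can be sketched with regular expressions. This is a minimal illustration for a small C-like language: the token categories, the keyword list, and the function name `tokenize` are assumptions chosen for the example, not a complete lexer.

```python
import re

# Token categories, tried in this order (comments before operators, so that
# "/" at the start of a comment is not mistaken for a division operator).
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/|//[^\n]*"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=<>!]=?|[(){};,]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
KEYWORDS = {"int", "float", "if", "else", "while", "return"}
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(source):
    """Turn source text into a list of (token kind, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("SKIP", "COMMENT"):
            continue  # non-token structures are dropped by the scanner
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {lexeme!r} at offset {match.start()}"
            )
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens


print(tokenize("int x = a + 42; // sum"))
```

The comment and the white space disappear from the output, and `int` is reported as a keyword rather than an identifier.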
2. Syntax Analysis:
Syntax analysis, the second phase of the compiler, receives the stream of tokens as input and produces a parse tree as output. The parse tree is generated with the help of the predetermined grammar rules of the source language.
The syntax analyzer checks whether or not a given program follows the rules of the language's context-free grammar. If it does, the syntax analyzer creates the parse tree for the input source program.
Syntax analysis is also known as hierarchical analysis, or parsing. The program responsible for performing syntax analysis is referred to as a parser. During parsing, the parser determines the syntactic validity of the source program.
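A recursive-descent parser is one simple way to carry out this check. The sketch below assumes a small arithmetic grammar (expr → term {(+|-) term}, term → factor {(*|/) factor}, factor → NUMBER | ID | ( expr )) and returns the tree as nested tuples, a compact form of the parse tree; the function name `parse` and the tuple layout are illustrative choices.

```python
import re

TOKEN = r"\d+|[A-Za-z_]\w*|[-+*/()]"
OPERATORS = ("+", "-", "*", "/", ")")


def parse(text):
    """Parse an arithmetic expression into nested (op, left, right) tuples."""
    # Reject characters that belong to no token before parsing starts.
    if re.sub(rf"{TOKEN}|\s+", "", text):
        raise SyntaxError("illegal character in input")
    tokens = re.findall(TOKEN, text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():  # expr -> term {(+|-) term}
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():  # term -> factor {(*|/) factor}
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():  # factor -> NUMBER | ID | ( expr )
        token = advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token in OPERATORS:
            raise SyntaxError(f"unexpected token {token!r}")
        return token

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("a + b * c"))
```

Because `term` is called from `expr`, multiplication binds tighter than addition; an input such as `a + * b` raises `SyntaxError`, which is how the parser reports that the program is syntactically invalid.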
3. Semantic Analysis:
In the third phase of compilation, semantic analysis checks whether the parse tree it receives as input abides by the rules of the source language. The semantic analyzer also records all the identifiers, their types, expressions, etc. The semantic analysis phase generates an annotated syntax tree as output.
The semantics of a language give meaning to its constructs, such as tokens and syntax structures. Semantics enables interpreting symbols, their types, and the relations among them. Semantic analysis determines whether the syntax structure of the source code has any meaning.
Certain rules defined by the source language are evaluated during semantic analysis. Semantic analysis performs scope resolution, type checking, and array-bound checking.
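Two of these checks, undeclared identifiers and type compatibility, can be sketched as a walk over an expression tree. The tree layout (nested `(op, left, right)` tuples with string leaves), the symbol table as a plain dictionary, the three type names, and the typing rules are all assumptions made for the illustration; a real language defines its own rules.

```python
class SemanticError(Exception):
    """Raised when the tree is syntactically valid but has no valid meaning."""


def annotate(node, symbols):
    """Return a copy of the tree in which every node carries its type as the
    last element, e.g. ('a', 'int') or ('+', left, right, 'float')."""
    if isinstance(node, tuple):
        op, left, right = node
        left_node = annotate(left, symbols)
        right_node = annotate(right, symbols)
        left_type, right_type = left_node[-1], right_node[-1]
        if "string" in (left_type, right_type):
            # Assumed rule: strings may only be concatenated with strings.
            if op == "+" and left_type == right_type:
                result = "string"
            else:
                raise SemanticError(
                    f"operator {op!r} cannot combine {left_type} and {right_type}"
                )
        else:
            # Assumed rule: mixing int and float widens the result to float.
            result = "float" if "float" in (left_type, right_type) else "int"
        return (op, left_node, right_node, result)
    if node.isdigit():
        return (node, "int")
    if node not in symbols:
        raise SemanticError(f"undeclared variable {node!r}")
    return (node, symbols[node])


symbols = {"a": "int", "b": "float", "s": "string"}
print(annotate(("+", "a", ("*", "b", "2")), symbols))
```

The result is an annotated tree: the same structure the parser produced, with a type attached to every node.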
5. Code Optimization
This phase alters the intermediate code it receives as input so that the resulting program becomes more efficient in terms of both run time and memory consumption. These changes include, but are not limited to, removing unnecessary code and rearranging statements.
Code optimization may or may not depend on the machine. In machine-independent optimization, the compiler transforms the intermediate code without involving any CPU registers or absolute memory locations.
Machine-dependent optimization is done after the target code has been generated. The code is transformed according to the architecture of the target machine. This optimization involves CPU registers and absolute memory references.
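Two common machine-independent optimizations, constant folding and dead-code elimination, can be sketched on straight-line intermediate code. The instruction layout `(dest, left, op, right)`, with `op` set to `None` for a plain copy, and the function name `optimize` are assumptions for this example; it handles a single basic block only and leaves out division to keep the folding step simple.

```python
import operator

FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def optimize(code, live_out):
    """Fold constants, then drop instructions whose result is never used.
    code: list of (dest, left, op, right); live_out: names needed afterwards."""
    constants = {}  # name -> value known at compile time
    folded = []
    for dest, left, op, right in code:
        # Constant propagation: replace names whose value is already known.
        left = constants.get(left, left)
        right = constants.get(right, right)
        if op is None and isinstance(left, int):
            value = left
        elif op in FOLD and isinstance(left, int) and isinstance(right, int):
            value = FOLD[op](left, right)  # evaluated at compile time
        else:
            value = None
        if value is not None:
            constants[dest] = value
            folded.append((dest, value, None, None))
        else:
            constants.pop(dest, None)  # dest no longer holds a known constant
            folded.append((dest, left, op, right))

    # Dead-code elimination: walk backwards, keeping only needed results.
    live = set(live_out)
    kept = []
    for dest, left, op, right in reversed(folded):
        if dest not in live:
            continue
        live.discard(dest)
        live.update(x for x in (left, right) if isinstance(x, str))
        kept.append((dest, left, op, right))
    kept.reverse()
    return kept


code = [
    ("t1", 4, "*", 2),       # t1 = 4 * 2
    ("t2", "t1", "+", "x"),  # t2 = t1 + x
    ("t3", "t1", "-", 3),    # t3 = t1 - 3   (never used)
    ("z", "t2", None, None), # z = t2
]
print(optimize(code, {"z"}))
```

Nothing in this transformation refers to registers or memory addresses, which is what makes it machine-independent.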
6. Code Generation
In the sixth and final phase of the compiler, code generation receives the optimized intermediate code as input and translates it into the target machine language. The target code may be either machine code or assembly code. Each statement of the optimized code may be mapped to one or more instructions of machine or assembly code.
Statement   Code generated   Register descriptor   Address descriptor
t = a - b   SUB b, R0
u = a - c   SUB c, R1        R1 contains u         u in R1
v = t + u                    R1 contains u         v in R0
d = v + u
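The mapping from the four statements above to register-based instructions can be sketched with a very small code generator that tracks which register holds which variable. This is a simplified illustration, not a full register allocator: it assumes two-address instructions written as `OP source, destination` (as in `SUB b, R0`), the mnemonics `MOV`, `ADD`, and `MUL` are assumed, each name is taken to be assigned once, operands are variable names, and running out of registers is reported instead of spilling. The function name `generate` is hypothetical.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(code, live_out, num_regs=2):
    """Translate (dest, left, op, right) statements into assembly lines."""
    asm = []
    holder = {}  # register descriptor, inverted: variable -> register
    free = [f"R{i}" for i in range(num_regs)]

    def used_later(var, index):
        # A value is still needed if it is live on exit or read later on.
        if var in live_out:
            return True
        return any(var in (l, r) for _, l, _, r in code[index + 1:])

    for i, (dest, left, op, right) in enumerate(code):
        # Locate the right operand first: a register if it has one, else memory.
        right_loc = holder.get(right, right)
        if left in holder and not used_later(left, i):
            reg = holder.pop(left)  # left is dead afterwards: reuse its register
        else:
            if not free:
                raise RuntimeError("out of registers (spilling is not modelled)")
            reg = free.pop(0)
            asm.append(f"MOV {holder.get(left, left)}, {reg}")
        asm.append(f"{OPCODES[op]} {right_loc}, {reg}")
        old = holder.pop(dest, None)
        if old is not None and old != reg:
            free.append(old)
        holder[dest] = reg  # the register now contains dest
        # Release registers whose values are no longer needed.
        for var in [v for v in holder if not used_later(v, i)]:
            free.append(holder.pop(var))

    # Store results that must survive the block back to memory.
    for var in sorted(live_out):
        if var in holder:
            asm.append(f"MOV {holder[var]}, {var}")
    return asm


code = [
    ("t", "a", "-", "b"),
    ("u", "a", "-", "c"),
    ("v", "t", "+", "u"),
    ("d", "v", "+", "u"),
]
for line in generate(code, {"d"}):
    print(line)
```

Under these assumptions `t` is computed in `R0` and `u` in `R1`; `v` and then `d` reuse `R0` because `t` and `v` are not needed again, and only `d` is stored back to memory at the end.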