
Compilers

A compiler is a software tool that translates high-level programming code written in languages like C, C++, Java, or Python into machine code that a computer's processor can execute directly. It takes the source code of a program as input and produces an executable file or binary that can run on a specific hardware platform.

How a Compiler Works

1. **Parsing**: The compiler first parses the source code to understand its structure and syntax.

2. **Semantic Analysis**: It then performs semantic analysis to check for any logical errors or inconsistencies in the code.

3. **Optimization**: Many compilers include optimization techniques to improve the efficiency and performance of the generated machine code. This optimization can involve rearranging code, eliminating redundant operations, and reducing memory usage.

4. **Code Generation**: Finally, the compiler translates the parsed and optimized code into machine code instructions that the target processor can understand. This machine code is typically in the form of binary files or executable programs.

Overall, compilers play a crucial role in the software development
process by translating human-readable code into instructions that
computers can execute efficiently. They enable programmers to write
code in high-level languages and abstract away the complexities of
hardware architecture and instruction sets.

Compiler Phases
The compilation process is a sequence of phases. Each phase takes the program in one representation and produces output in another representation, taking its input from the output of the previous phase.

The phases of a compiler are as follows:

Fig: Phases of a compiler

Lexical Analysis:
Lexical analysis is the first phase of the compilation process. It takes source code as input, reads the source program one character at a time, and converts it into meaningful lexemes. The lexical analyzer represents these lexemes in the form of tokens.

Syntax Analysis
Syntax analysis is the second phase of the compilation process. It takes tokens as input and generates a parse tree as output. In the syntax analysis phase, the parser checks whether the expression formed by the tokens is syntactically correct.

Semantic Analysis
Semantic analysis is the third phase of the compilation process. It checks whether the parse tree follows the rules of the language. The semantic analyzer keeps track of identifiers, their types, and expressions. The output of the semantic analysis phase is the annotated syntax tree.

Intermediate Code Generation
In the intermediate code generation phase, the compiler translates the source code into an intermediate code, which lies between the high-level language and the machine language. The intermediate code should be generated in such a way that it can easily be translated into the target machine code.

Code Optimization
Code optimization is an optional phase. It is used to improve the intermediate code so that the program runs faster and takes less space. It removes unnecessary lines of code and rearranges the sequence of statements to speed up program execution.

Code Generation
Code generation is the final stage of the compilation process. It takes the optimized intermediate code as input and maps it to the target machine language. The code generator translates the intermediate code into the machine code of the specified computer.
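
To see the phases end to end, here is a compact hedged walkthrough of the classic statement position = initial + rate * 60 (the exact intermediate forms and instructions vary by compiler; the target code shown is hypothetical assembly):

Source:            position = initial + rate * 60
Tokens:            id(position)  =  id(initial)  +  id(rate)  *  num(60)
Syntax tree:       an assignment node with position on the left and the
                   expression initial + (rate * 60) on the right
Intermediate       t1 = rate * 60
code:              t2 = initial + t1
                   position = t2
After              t1 = rate * 60
optimization:      position = initial + t1
Target code:       MUL t1, rate, 60
                   ADD t2, initial, t1
                   MOV position, t2
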
Lexical Analysis:
A lexical analyzer is also called a "scanner". Given a statement or input string, it reads the statement from left to right, character by character. The input to a lexical analyzer is the pure high-level code from the preprocessor. It identifies valid lexemes from the program and returns tokens to the syntax analyzer, one after the other, each corresponding to a getNextToken request from the syntax analyzer.

There are three important terms to grasp:

1. Tokens: A token is a pre-defined sequence of characters that cannot be broken down further. It is like an abstract symbol that represents a unit. A token can have an optional attribute value. There are different types of tokens:
o Identifiers (user-defined)
o Delimiters/punctuation (;, ,, {}, etc.)
o Operators (+, -, *, /, etc.)
o Special symbols
o Keywords
o Numbers

2. Lexemes: A lexeme is a sequence of characters in the source program that matches the pattern of a token. For example, ( and ) are lexemes of type punctuation, where punctuation is the token.
3. Patterns: A pattern is the set of rules a scanner follows to match a lexeme in the input program and identify a valid token. It is like the lexical analyzer's description of a token, used to validate a lexeme. For example, the characters of a keyword form the pattern that identifies a keyword, and the pre-defined rules for forming an identifier are the pattern that identifies an identifier.
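
To make these terms concrete, here is a minimal hedged scanner sketch in C. It is illustrative only: the token kinds, the Token struct, and the get_next_token function are assumptions for this example, not the interface of any particular compiler.

#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical token kinds for this sketch. */
typedef enum { TOK_ID, TOK_NUM, TOK_OP, TOK_PUNCT, TOK_EOF } TokenKind;

typedef struct {
    TokenKind kind;
    char lexeme[64];   /* the matched characters */
} Token;

/* Returns the next token from the input, advancing *p.
   Patterns: identifier = letter (letter|digit)*, number = digit+. */
Token get_next_token(const char **p) {
    Token t = { TOK_EOF, "" };
    while (isspace((unsigned char)**p)) (*p)++;   /* skip whitespace */
    const char *start = *p;
    if (**p == '\0') return t;
    if (isalpha((unsigned char)**p)) {            /* identifier or keyword */
        while (isalnum((unsigned char)**p)) (*p)++;
        t.kind = TOK_ID;
    } else if (isdigit((unsigned char)**p)) {     /* number */
        while (isdigit((unsigned char)**p)) (*p)++;
        t.kind = TOK_NUM;
    } else if (strchr("+-*/=", **p)) {            /* operator */
        (*p)++;
        t.kind = TOK_OP;
    } else {                                      /* punctuation: ; ( ) { } , */
        (*p)++;
        t.kind = TOK_PUNCT;
    }
    size_t n = (size_t)(*p - start);
    if (n >= sizeof t.lexeme) n = sizeof t.lexeme - 1;
    memcpy(t.lexeme, start, n);
    t.lexeme[n] = '\0';
    return t;
}

int main(void) {
    const char *src = "count = count + 60;";
    Token t;
    while ((t = get_next_token(&src)).kind != TOK_EOF)
        printf("kind=%d lexeme=%s\n", t.kind, t.lexeme);
    return 0;
}
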

Advantages and Disadvantages of Lexical Analysis in Compilers

**Advantages:**

1. **Simplification of Parsing:** Lexical analysis simplifies the parsing process by breaking down the source code into smaller, more manageable tokens. This makes it easier for the parser to understand the structure and syntax of the program, leading to faster and more efficient parsing.

2. **Error Detection:** Lexical analysis helps in detecting errors such as syntax errors, misspelled keywords, or invalid characters early in the compilation process. By identifying these errors at the lexical level, the compiler can provide more accurate error messages to the programmer, facilitating debugging and troubleshooting.

3. **Tokenization:** Lexical analysis tokenizes the source code, converting it into a sequence of tokens representing keywords, identifiers, literals, and operators. This tokenized representation simplifies subsequent compilation phases, such as parsing, semantic analysis, and code generation, by providing a structured and standardized input format.

4. **Efficient Memory Usage:** Lexical analysis can optimize memory usage by discarding whitespace characters, comments, and other non-essential elements from the source code. This reduces the memory footprint of the compiler and improves overall compilation performance.

**Disadvantages:**

1. **Complexity:** Lexical analysis adds complexity to the compiler design and implementation. Developing a robust lexical analyzer requires careful consideration of language syntax, tokenization rules, and error handling mechanisms. Managing the interactions between the lexical analyzer and other compiler components can be challenging, especially in large-scale projects.

2. **Performance Overhead:** Lexical analysis introduces performance overhead, particularly in the tokenization phase, where the source code is scanned and analyzed character by character. The efficiency of the lexical analyzer directly impacts the overall compilation time and resource utilization. Inefficient lexical analyzers may slow down the compilation process and hinder developer productivity.

3. **Language Flexibility:** Lexical analyzers are designed based on specific language specifications and tokenization rules. Adapting a lexical analyzer to support new languages or language variants may require significant effort and expertise. Changes in language syntax or tokenization rules may necessitate modifications to the lexical analyzer, leading to maintenance challenges and compatibility issues.

4. **Error Handling:** While lexical analysis helps in detecting and reporting syntax errors, it may struggle with certain types of errors, such as ambiguous or context-sensitive constructs. Resolving these errors at the lexical level can be difficult, requiring additional checks and validations in subsequent compilation phases. In some cases, errors detected at later stages may trace back to deficiencies in the lexical analysis process.

In summary, while lexical analysis offers several advantages such as simplification
of parsing, error detection, tokenization, and efficient memory usage, it also
poses challenges related to complexity, performance overhead, language
flexibility, and error handling. Effective design and implementation of lexical
analyzers require careful consideration of these factors to ensure robustness,
accuracy, and efficiency in the compilation process.

Parser
A parser is the compiler component that breaks the data coming from the lexical analysis phase into smaller elements.

A parser takes input in the form of a sequence of tokens and produces output in the form of a parse tree.

Parsing is of two types: top-down parsing and bottom-up parsing.

Top-down parsing

o Top-down parsing is also known as recursive parsing or predictive parsing.
o Top-down parsing is used to construct a parse tree for an input string.
o In top-down parsing, the parsing starts from the start symbol and transforms it into the input symbols. (A recursive-descent sketch follows the bottom-up description below.)

Fig: Parse tree for the input string "acdb"


Bottom-up parsing
o Bottom-up parsing is also known as shift-reduce parsing.
o Bottom-up parsing is used to construct a parse tree for an input string.
o In bottom-up parsing, the parsing starts with the input symbols and constructs the parse tree up to the start symbol by tracing out the rightmost derivation of the string in reverse.
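
As a hedged illustration of top-down parsing, the C sketch below is a recursive-descent parser for a tiny assumed grammar (expr -> term { '+' term }, term -> DIGIT); the grammar and names are inventions for this example. Each nonterminal becomes one function, and parsing starts from the start symbol and works toward the input symbols:

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

static const char *input;   /* cursor into the input string */

static void error(void) {
    printf("syntax error\n");
    exit(1);
}

/* term -> DIGIT */
static void term(void) {
    if (isdigit((unsigned char)*input)) input++;   /* match one digit */
    else error();
}

/* expr -> term { '+' term } */
static void expr(void) {
    term();
    while (*input == '+') {   /* repetition of '+' term */
        input++;
        term();
    }
}

int main(void) {
    input = "1+2+3";
    expr();
    if (*input == '\0') printf("accepted\n");   /* all input consumed */
    else error();
    return 0;
}
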

Storage Allocation Strategies in Compiler Design



A compiler is a program that converts a high-level language (HLL) into a low-level language (LLL) such as machine language. Storage allocation strategies matter in compiler design because choosing the right strategy for storage allocation directly affects the performance of the software.
Storage Allocation Strategies
There are mainly three types of Storage Allocation Strategies:
1. Static Allocation
2. Heap Allocation
3. Stack Allocation
1. Static Allocation
Static allocation lays out, or assigns, the storage for all data objects at compile time. In static allocation, names are bound to storage, and the addresses of these identifiers remain the same throughout execution. The memory is allocated in a fixed location determined at compile time. C and C++ use static allocation for global and static variables. For example:
int number = 1;
static int digit = 1;
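
A slightly fuller hedged sketch (the names are illustrative) showing that a variable with static storage duration keeps its value and its address across calls, unlike a stack-allocated local:

#include <stdio.h>

void count_calls(void) {
    static int calls = 0;   /* storage laid out once, at compile/link time */
    int local = 0;          /* stack-allocated, re-created on every call */
    calls++;
    local++;
    printf("calls=%d local=%d address=%p\n", calls, local, (void *)&calls);
}

int main(void) {
    count_calls();   /* prints calls=1 local=1 */
    count_calls();   /* prints calls=2 local=1: the static value persists */
    return 0;
}
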
Advantages of Static Allocation
1. It is easy to understand.
2. The memory is allocated once, at compile time, and remains in place until the program completes.
3. Memory allocation is done before the program starts; memory is reserved at compile time only.
Disadvantages of Static Allocation
1. Not highly scalable.
2. Static storage allocation is not very efficient.
3. The size of the data must be known at the compile time.
2. Heap Allocation
Heap allocation is used where stack allocation falls short: if we want to retain the values of local variables after their activation record ends, stack allocation cannot do this, because its LIFO scheme governs the allocation and de-allocation of activation records. The heap is the most flexible storage allocation strategy: we can dynamically allocate and de-allocate variables whenever needed, according to the user's requirements at run time. C, C++, Python, and Java all support heap allocation.
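
As a minimal hedged sketch in C (the function and variable names are illustrative), heap storage lets a value outlive the activation record that created it:

#include <stdio.h>
#include <stdlib.h>

/* The buffer outlives the activation record of make_buffer,
   which a stack-allocated local array could not do. */
int *make_buffer(int n) {
    int *buf = malloc(n * sizeof *buf);   /* allocated at run time */
    if (buf == NULL) exit(1);
    for (int i = 0; i < n; i++) buf[i] = i;
    return buf;   /* still valid after this function returns */
}

int main(void) {
    int *data = make_buffer(4);
    printf("%d %d %d %d\n", data[0], data[1], data[2], data[3]);
    free(data);   /* forgetting this call would leak memory */
    return 0;
}
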
Advantages of Heap Allocation
1. Heap allocation is useful when we have data whose size is not fixed and can
change during the run time.
2. We can retain the values of variables even if the activation records end.
3. Heap allocation is the most flexible allocation scheme.
Disadvantages of Heap Allocation
1. Heap allocation is slower as compared to stack allocation.
2. There is a chance of memory leaks.
3. Stack Allocation
Stack allocation is commonly known as dynamic allocation, meaning the allocation of memory at run time. The stack is a data structure that follows the LIFO principle, so when multiple activation records are created, they are pushed onto and popped off the stack as activations begin and end. Local variables are bound to new storage each time an activation record begins, because storage is allocated at run time on every procedure or function call. When the activation record is popped, the values of its local variables are erased, because the storage allocated for the record is released. Both C and C++ support stack allocation.
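
A minimal hedged sketch in C (names are illustrative): each call pushes a fresh activation record with its own copy of the locals, and the records are popped in LIFO order as the calls return:

#include <stdio.h>

void descend(int n) {
    int local = n * 10;    /* bound to fresh storage on every call */
    if (n > 0) descend(n - 1);
    /* Each frame still holds its own value until it is popped. */
    printf("n=%d local=%d\n", n, local);
}

int main(void) {
    descend(3);   /* prints the frames as they are popped: n=0,1,2,3 */
    return 0;
}
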
Conclusion
In conclusion, storage allocation strategies play an important role in determining how memory is allocated and deallocated, and the best-fit strategy depends on the needs of the user. Each strategy has its own advantages and disadvantages, and the choice depends on factors like speed, memory usage, and efficiency, so the allocation strategy can be chosen according to the requirements.
Incremental compiler
An incremental compiler is like a smart helper that remembers what you've
already done when you're writing code. Instead of recompiling everything from
scratch every time you make a change, it only recompiles the parts that have
actually changed. This saves a lot of time and makes the process much faster,
especially for big projects. It's like having someone who remembers where all the
puzzle pieces fit so you don't have to start from the beginning every time you
want to make a small adjustment to your picture.
Functions of an Incremental Compiler
1. **Change Detection:** The incremental compiler analyzes the source code to
detect any changes made by the programmer since the last compilation. It
compares timestamps or checksums of files to determine which parts of the code
need to be recompiled.

2. **Dependency Tracking:** It tracks dependencies between different modules or files in the codebase. This allows the compiler to identify which files or modules are affected by a change and prioritize their compilation accordingly.

3. **Selective Compilation:** Instead of recompiling the entire codebase, the incremental compiler selectively compiles only the files or modules that have changed or are dependent on changed files. This minimizes compilation time and improves developer productivity.

4. **Caching:** The compiler may cache intermediate results, such as object files or compiled code, to reuse them in subsequent compilations. This reduces redundant work and speeds up the compilation process, especially for frequently used libraries or modules.

5. **Incremental Linking:** In addition to incremental compilation, some compilers support incremental linking, which updates only the parts of the executable that have changed since the last build. This further reduces build times and allows for faster iteration during development.
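
As a minimal hedged sketch of the simplest of these strategies, timestamp-based change detection, the C fragment below (using POSIX stat; the file names and rebuild policy are assumptions for illustration) decides whether a translation unit must be recompiled:

#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if the source file is newer than its object file
   (or the object file is missing), i.e. the unit needs rebuilding. */
int needs_rebuild(const char *src, const char *obj) {
    struct stat s, o;
    if (stat(src, &s) != 0) return 0;   /* no source: nothing to compile */
    if (stat(obj, &o) != 0) return 1;   /* never compiled before */
    return s.st_mtime > o.st_mtime;     /* source changed since last build */
}

int main(void) {
    if (needs_rebuild("main.c", "main.o"))
        printf("recompile main.c\n");
    else
        printf("main.o is up to date\n");
    return 0;
}
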

Advantages:

1. Faster compile times: Incremental compilers can save significant time by only recompiling the parts of the program that have changed, rather than the entire program.
2. Improved development speed: Incremental compilers can speed up the
development process by allowing developers to see the effects of their
changes more quickly.
3. More efficient use of resources: By only recompiling the modified parts of
the program, incremental compilers can use fewer resources, such as
CPU cycles and memory.
4. Better error reporting: Incremental compilers can provide more detailed
error reporting and debugging information than traditional compilers,
because they can track program changes more closely.

Disadvantages:

1. Increased complexity: Incremental compilers are typically more complex to implement than traditional compilers, because they require more sophisticated tracking and management of program changes.
2. Increased memory usage: Depending on the implementation, incremental
compilers may require more memory to store information about the
program’s changes and dependencies.
3. Risk of errors: Incremental compilers can introduce the risk of errors if
changes to the program are not properly tracked or dependencies are not
correctly resolved.
4. Platform-specific limitations: Some programming languages or platforms
may not support incremental compilation, or may have limitations that
make it less effective or efficient.
Overall, incremental compilers can be a useful tool for improving the efficiency and speed of the compilation process, but they require careful implementation and management to realize these benefits.

Cross Compiler

Compilers are tools used to translate a high-level programming language into a low-level programming language. A simple compiler works on one system only, but what happens if we need a compiler that can compile code for another platform? To perform such compilation, the cross compiler was introduced. In this article, we discuss the cross-compiler.

A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running. For example, a cross compiler executes on machine X and produces machine code for machine Y.
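
For instance, the same source file can be compiled natively on an x86-64 host and cross-compiled for ARM with a toolchain such as arm-linux-gnueabihf-gcc. The hedged C sketch below uses macros predefined by GCC and Clang for the target architecture, so a cross-compiled binary reports the platform it was built for, not the machine the compiler ran on:

#include <stdio.h>

int main(void) {
    /* These macros describe the *target* of the compilation. */
#if defined(__x86_64__)
    printf("compiled for x86-64\n");
#elif defined(__arm__) || defined(__aarch64__)
    printf("compiled for ARM\n");
#else
    printf("compiled for another target\n");
#endif
    return 0;
}
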

Where is the cross compiler used?

 In bootstrapping, a cross-compiler is used for transitioning to a new platform. When developing software for a new platform, a cross-compiler is used to compile necessary tools such as the operating system and a native compiler.
 For microcontrollers, we use a cross compiler because a microcontroller typically does not support an operating system or a native toolchain.
 It is useful for embedded computers, which have limited computing resources.
 To compile for a platform where it is not practical to do the compiling, a cross-compiler is used.
 When direct compilation on the target platform is not feasible, we can use a cross compiler.
 It helps to keep the target environment separate from the build environment.

Fig: T-diagram for a cross-compiler

Advantages and Disadvantages of Cross Compilers

**Advantages:**

1. **Platform Independence:** Cross compilers allow developers to write and compile code on one platform (host) for execution on a different platform (target). This enables software development for diverse hardware architectures and operating systems, expanding the reach of applications across various devices and platforms.

2. **Efficient Development:** Cross compilers facilitate efficient development by providing a unified development environment where developers can write, debug, and compile code for multiple target platforms without needing access to each platform's native development tools or hardware.

3. **Performance Optimization:** Cross compilers often include optimization features tailored for specific target platforms, such as instruction set optimization, memory layout optimization, and target-specific code generation. This can lead to improved performance and efficiency of compiled code on the target platform.

4. **Portability:** Cross compilers enable portability of software across different platforms by generating executable code that is compatible with the target platform's hardware and operating system. This allows developers to create applications that can run on a wide range of devices without modification.

**Disadvantages:**

1. **Complexity:** Cross compiling introduces additional complexity into the development process due to differences in hardware architectures, operating systems, and development environments between the host and target platforms. Managing cross-platform dependencies, compatibility issues, and debugging across multiple platforms can be challenging.

2. **Toolchain Limitations:** Cross compilers may have limitations compared to native compilers, such as incomplete language support, missing libraries, or differences in compiler optimizations. This can impact the performance, functionality, or compatibility of compiled code on the target platform.

3. **Build Environment Setup:** Setting up a cross compiling environment requires configuring and integrating cross compiler toolchains, libraries, headers, and build scripts for each target platform. This setup process can be time-consuming and error-prone, especially for complex projects with multiple dependencies.

4. **Debugging Challenges:** Debugging cross-compiled code on the target platform may be more challenging compared to native development environments. Debugging tools and techniques may vary between the host and target platforms, requiring additional effort to diagnose and resolve issues.

In summary, while cross compilers offer advantages such as platform
independence, efficient development, performance optimization, and portability,
they also present challenges related to complexity, toolchain limitations, build
environment setup, and debugging. Effective use of cross compilers requires
careful consideration of these factors and appropriate strategies to mitigate
potential drawbacks.
