What Compiler Is
The role of computers in daily life is growing each year. Modern microprocessors are found in cars,
microwave ovens, dishwashers, mobile telephones, GPS navigation systems, video games and personal
computers. Each of these devices must be programmed to perform its job. Those programs are written in
some “programming” language – a formal language with mathematical properties and well-defined
meanings – rather than a natural language with evolved properties and many ambiguities. Programming
languages are designed for expressiveness, conciseness, and clarity. A program written in a programming
language must be translated before it can execute directly on a computer; this translation is accomplished
by a software system called a compiler.
Definition (Compiler)
A compiler is a computer program that takes as input a program written in one language (the source
language) and produces as output an equivalent program in another language (the target language).
While many issues in compiler design are amenable to several different solutions, there are two principles
that should not be compromised.
The first principle that a well-designed compiler must observe is inviolable.
“The compiler must preserve the meaning of the program being compiled.”
The code produced by the compiler must faithfully implement the “meaning” of the source-code
program being compiled. If the compiler can take liberties with meaning, then it can always generate the
same code, independent of input. For example, the compiler could simply emit a nop or a return
instruction.
The second principle that a well-designed compiler must observe is quite practical.
“The compiler must improve the source code in some discernible way.”
If the compiler does not improve the code in some way, why should anyone invoke it? A
traditional compiler improves the code by making it directly executable on some target machine. Other
“compilers” improve their input in different ways. For example, tpic is a program that takes the
specification for a drawing written in the graphics language pic and converts it into LaTeX; the
“improvement” lies in LaTeX’s greater availability and generality. Some compilers produce output
programs in the same language as their input; we call these “source-to-source” translators. In general,
these systems try to restate the program in a way that will lead, eventually, to an improvement.
You may never write a commercial compiler, but that's not why we study compilers. We study
compiler construction for the following reasons:
1. Writing a compiler gives a student experience with large-scale applications development. Your
compiler program may be the largest program you write as a student. Experience working with
really big data structures and complex interactions between algorithms will help you out on your
next big programming project.
2. Compiler writing is one of the shining triumphs of CS theory. It demonstrates the value of theory
over the impulse to just "hack up" a solution.
3. Compiler writing is a basic element of programming language research. Many language researchers
write compilers for the languages they design.
4. Many applications have similar properties to one or more phases of a compiler, and compiler
expertise and tools can help an application programmer working on other projects besides
compilers.
The name "compiler" is primarily used for programs that translate source code from a high-level
programming language to a lower level language (e.g., assembly language or machine language).
A program that translates from a low level language to a higher level one is a decompiler.
A program that translates between high-level languages is usually called a language translator,
source to source translator, or language converter.
A language rewriter is usually a program that translates the form of expressions without a change
of language.
History
Software for early computers was exclusively written in assembly language for many years. Higher level
programming languages were not invented until the benefits of being able to reuse software on different
kinds of CPUs started to become significantly greater than the cost of writing a compiler. The very limited
memory capacity of early computers also created many technical problems when implementing a
compiler.
Towards the end of the 1950s, machine-independent programming languages were first proposed.
Subsequently, several experimental compilers were developed. The first compiler was written by Grace
Hopper, in 1952, for the A-0 programming language. The FORTRAN team led by John Backus at IBM is
generally credited as having introduced the first complete compiler, in 1957. COBOL was an early language
to be compiled on multiple architectures, in 1960.
In many application domains the idea of using a higher level language quickly caught on. Because of the
expanding functionality supported by newer programming languages and the increasing complexity of
computer architectures, compilers have become more and more complex.
Early compilers were written in assembly language. The first self-hosting compiler — capable of compiling
its own source code in a high-level language — was created for Lisp by Hart and Levin at MIT in 1962. Since
the 1970s it has become common practice to implement a compiler in the language it compiles, although
both Pascal and C have been popular choices for implementation language. Building a self-hosting compiler
is a bootstrapping problem -- the first such compiler for a language must either be compiled by a
compiler written in a different language or (as in Hart and Levin's Lisp compiler) be compiled by running
the compiler in an interpreter.
Compilers in education
Compiler construction and compiler optimization are taught at universities as part of the computer science
curriculum. Such courses are usually supplemented with the implementation of a compiler for an
educational programming language. A well-documented example is Niklaus Wirth's PL/0 compiler, which
Wirth used to teach compiler construction in the 1970s. [3] In spite of its simplicity, the PL/0 compiler
introduced several influential concepts to the field:
1. Program development by stepwise refinement (also the title of a 1971 paper by Wirth)
2. The use of a recursive descent parser
3. The use of EBNF to specify the syntax of a language
4. A code generator producing portable P-code
5. The use of T-diagrams in the formal description of the bootstrapping problem
Compiler output
One method used to classify compilers is by the platform on which their generated code executes. This is
known as the target platform.
A native or hosted compiler is one whose output is intended to directly run on the same type of computer
and operating system as the compiler itself runs on. The output of a cross compiler is designed to run on a
different platform. Cross compilers are often used when developing software for embedded systems that
are not intended to support a software development environment.
The output of a compiler that produces code for a virtual machine (VM) may or may not be executed on
the same platform as the compiler that produced it. For this reason such compilers are not usually
classified as native or cross compilers.
Higher-level programming languages are generally divided for convenience into compiled languages and
interpreted languages. However, there is rarely anything about a language that requires it to be exclusively
compiled, or exclusively interpreted. The categorization usually reflects the most popular or widespread
implementations of a language — for instance, BASIC is thought of as an interpreted language, and C a
compiled one, despite the existence of BASIC compilers and C interpreters.
In a sense, all languages are interpreted, with "execution" being merely a special case of interpretation
performed by transistors switching on a CPU. Modern trends toward just-in-time compilation and bytecode
interpretation also blur the traditional categorizations.
There are exceptions. Some language specifications spell out that implementations must include a
compilation facility; for example, Common Lisp. Other languages have features that are very easy to
implement in an interpreter, but make writing a compiler much harder; for example, APL, SNOBOL4, and
many scripting languages allow programs to construct arbitrary source code at runtime with regular string
operations, and then execute that code by passing it to a special evaluation function. To implement these
features in a compiled language, programs must usually be shipped with a runtime library that includes a
version of the compiler itself.
Hardware compilation
The output of some compilers may target hardware at a very low level, for example a Field Programmable
Gate Array (FPGA) or a structured Application-Specific Integrated Circuit (ASIC). Such compilers are said to be
hardware compilers or synthesis tools because the programs they compile effectively control the final
configuration of the hardware and how it operates; the output of the compilation is not a sequence of
instructions to be executed but an interconnection of transistors or lookup tables. For example, XST is the
Xilinx Synthesis Tool used for configuring FPGAs. Similar tools are available from Altera, Synplicity,
Synopsys and other vendors.
Compiler design
The approach taken to compiler design is affected by the complexity of the processing that needs to be
done, the experience of the person(s) designing it, and the resources (e.g., people and tools) available.
A compiler for a relatively simple language written by one person might be a single, monolithic piece of
software. When the source language is large and complex and high-quality output is required, the design
may be split into a number of relatively independent phases, or passes. Having separate phases means
development can be parceled up into small parts and given to different people. It also becomes much
easier to replace a single phase by an improved one, or to insert new phases later (e.g., additional
optimizations).
The division of the compilation process into phases (or passes) was championed by the Production Quality
Compiler-Compiler Project (PQCC) at Carnegie Mellon University. This project introduced the terms front
end, middle end (rarely heard today), and back end.
All but the smallest of compilers have more than two phases. However, these phases are usually regarded
as being part of the front end or the back end. The exact point where these two ends meet is open to
debate. The front end is generally considered to be where syntactic and semantic processing takes place,
along with translation to a lower level of representation (than source code).
The middle end is usually designed to perform optimizations on a form other than the source code or
machine code. This source code/machine code independence is intended to enable generic optimizations
to be shared between versions of the compiler supporting different languages and target processors.
The back end takes the output from the middle end. It may perform more analysis, transformations and
optimizations that are for a particular computer. Then, it generates code for a particular processor and OS.
This front-end/middle/back-end approach makes it possible to combine front ends for different languages
with back ends for different CPUs. Practical examples of this approach are the GNU Compiler Collection,
LLVM, and the Amsterdam Compiler Kit, which have multiple front-ends, shared analysis and multiple
back-ends.
Classifying compilers by number of passes has its background in the hardware resource limitations of
computers. Compiling involves performing lots of work and early computers did not have enough memory
to contain one program that did all of this work. So compilers were split up into smaller programs which
each made a pass over the source (or some representation of it) performing some of the required analysis
and translations.
The ability to compile in a single pass is often seen as a benefit because it simplifies the job of writing a
compiler, and one-pass compilers generally compile faster than multi-pass compilers. Many languages were
designed so that they could be compiled in a single pass (e.g., Pascal).
In some cases the design of a language feature may require a compiler to perform more than one pass
over the source. For instance, consider a declaration appearing on line 20 of the source which affects the
translation of a statement appearing on line 10. In this case, the first pass needs to gather information
about declarations appearing after statements that they affect, with the actual translation happening
during a subsequent pass.
The disadvantage of compiling in a single pass is that it is not possible to perform many of the sophisticated
optimizations needed to generate high quality code. It can be difficult to count exactly how many passes an
optimizing compiler makes. For instance, different phases of optimization may analyse one expression
many times but only analyse another expression once.
Splitting a compiler up into small programs is a technique used by researchers interested in producing
provably correct compilers. Proving the correctness of a set of small programs often requires less effort
than proving the correctness of a larger, single, equivalent program.
While the typical multi-pass compiler outputs machine code from its final pass, there are several other
types:
A "source-to-source compiler" is a type of compiler that takes a high-level language as its input and
outputs a high-level language. For example, an automatic parallelizing compiler will frequently take
in a high-level language program as an input and then transform the code and annotate it with
parallel code annotations (e.g., OpenMP) or language constructs (e.g., Fortran's DOALL statements).
A "stage compiler" compiles to the assembly language of a theoretical machine, as some Prolog
implementations do. This Prolog machine is also known as the Warren Abstract Machine (WAM).
Bytecode compilers for Java, Python, and many other languages are also a subtype of this.
A "just-in-time compiler" is used by Smalltalk and Java systems, and also by Microsoft .NET's Common
Intermediate Language (CIL). Applications are delivered in bytecode, which is compiled to native
machine code just prior to execution.
There are two main stages in the compiling process: analysis and synthesis. The analysis stage breaks up
the source program into pieces and creates a generic (language-independent) intermediate representation
of the program. Then, the synthesis stage constructs the desired target program from the intermediate
representation. Typically, a compiler's analysis stage is called its front end and the synthesis stage its back
end. Each of the stages is broken down into a set of "phases" that handle different parts of the task. (Why
do you think typical compilers separate the compilation process into front-end and back-end phases?)
1) Lexical Analysis or Scanning: The stream of characters making up a source program is read from left to
right and grouped into tokens, which are sequences of characters that have a collective meaning.
Examples of tokens are identifiers (user-defined names), reserved words, integers, doubles or floats,
delimiters, operators, and special symbols.
Example of lexical analysis:
int a;
a = a + 2;
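Scanning this fragment produces a stream of tokens roughly like the following (the token names are
illustrative; every compiler uses its own conventions):

int    reserved word
a      identifier
;      special symbol
a      identifier
=      operator (assignment)
a      identifier
+      operator (addition)
2      integer constant
;      special symbol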
2) Syntax Analysis or Parsing: The tokens found during scanning are grouped together
using a context-free grammar. A grammar is a set of rules that define valid structures in the programming
language. Each token is associated with a specific rule, and grouped together accordingly. This process is
called parsing. The output of this phase is called a parse tree or a derivation, i.e., a record of which
grammar rules were used to create the source program.
Part of a grammar for simple arithmetic expressions in C might look like this:
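One illustrative fragment (the exact nonterminal names vary from grammar to grammar):

expression -> expression + term
expression -> term
term -> term * factor
term -> factor
factor -> ( expression )
factor -> identifier
factor -> constant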
The symbol on the left side of the "->" in each rule can be replaced by the symbols on the right. To parse a
+ 2, we would apply the following rules:
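expression
-> expression + term      (apply expression -> expression + term)
-> term + term            (apply expression -> term)
-> factor + term          (apply term -> factor)
-> a + term               (apply factor -> identifier, matching the token a)
-> a + factor             (apply term -> factor)
-> a + 2                  (apply factor -> constant, matching the token 2)

(This derivation uses the illustrative grammar fragment above.)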
When we reach a point in the parse where we have only tokens, we have finished.
By knowing which rules are used to parse, we can determine the structures present in the source program.
A source program which can be parsed is syntactically correct.
3) Semantic Analysis: The parse tree or derivation is checked for semantic errors. A statement can be
syntactically correct (it associates with a grammar rule correctly) and yet disobey the semantic rules of the
source language.
Semantic analysis is the phase where we detect such things as use of an undeclared variable, a function
called with improper arguments, access violations, and incompatible operands and type mismatches.
int arr[2], c;
c = arr * 10;
A lot of the semantic analysis work pertains to type checking. Although the C fragment above will scan into
valid tokens and successfully match the rules for a valid expression, it isn't semantically valid. In the
semantic analysis phase, the compiler checks the types and reports that you cannot use an array variable in
a multiplication expression.
4) Intermediate Code Generation: This is where the intermediate representation of the source program is
created. We want this representation to be easy to generate, and easy to translate into the target
program. The representation can have a variety of forms, but a common one is called three-address code
(TAC), which is a lot like a generic assembly language. Three-address code is a sequence of simple
instructions, each of which can have at most three operands.
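For example, consider this C assignment statement and one possible TAC translation of it (the temporary
names _t1, _t2, _t3 are invented by the compiler):

a = b * c + d / e;

_t1 = b * c
_t2 = d / e
_t3 = _t1 + _t2
a = _t3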
The single C statement above is translated into a sequence of four instructions in three-address code.
Note the use of temporary variables that are created by the compiler as needed to keep the number of
operands down to three. Of course, it's a little more complicated than this, because we have to
translate branching and looping instructions, as well as function calls. Here is some TAC for a branching
translation:
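if (a <= b)
    max = b;
else
    max = a;

_t1 = a <= b
ifz _t1 goto L1
max = b
goto L2
L1:
max = a
L2:

(The label names and the ifz "branch if zero" instruction are one common TAC convention; the details vary
from compiler to compiler.)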
1) Intermediate Code Optimization: The optimizer accepts input in the intermediate representation (e.g.,
TAC) and outputs a streamlined version still in the intermediate representation. In this phase, the compiler
attempts to produce the smallest, fastest and most efficient code by applying various techniques
such as:
• inhibiting code generation of unreachable code segments
• getting rid of unused variables
• eliminating multiplication by 1 and addition by 0
• loop optimization (e.g., moving computations whose values do not change inside the loop out of the loop)
• common sub-expression elimination
• strength reduction
.....
The optimization phase can really slow down a compiler, so most compilers allow this feature to be
suppressed. The compiler may have fine-grain controls that allow a developer to make tradeoffs between
compilation time and optimization quality.
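For instance (the variable and temporary names are illustrative):

Before optimization:
_t1 = x + y
_t2 = _t1 + 0
_t3 = x + y
_t4 = _t2 * _t3
a = _t4

After optimization:
_t1 = x + y
_t2 = _t1 * _t1
a = _t2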
In the example shown above, the optimizer eliminated an addition of zero and a re-evaluation of the same
expression, allowing the original five TAC statements to be rewritten as just three statements that use two
fewer temporary variables.
2) Object Code Generation: This is where the target program is generated. The output of this phase is
usually machine code or assembly code. Memory locations are selected for each variable. Instructions are
chosen for each operation. The three-address code is translated into a sequence of assembly or machine
language instructions that perform the same tasks.
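As a rough illustration (the register choices and stack offsets below are invented for the example), consider
the TAC statements _t1 = b + c and a = _t1:

ld   [%fp-8], %l0     ! load b from its stack slot
ld   [%fp-12], %l1    ! load c
add  %l0, %l1, %l2    ! _t1 = b + c
st   %l2, [%fp-4]     ! a = _t1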
In the example above, the code generator translated the TAC input into SPARC assembly output.
3) Object Code Optimization: There may also be another optimization pass that follows code generation,
this time transforming the object code into tighter, more efficient object code. This is where we consider
features of the hardware itself to make efficient usage of the processor(s) and registers. The compiler can
take advantage of machine-specific idioms (specialized instructions, instruction scheduling for the pipeline,
branch handling) and peephole optimizations in reorganizing and streamlining the object code itself. As with IR
optimization, this phase of the compiler is usually configurable or can be skipped entirely.
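A classic peephole example (SPARC-style syntax, register names illustrative): a store immediately followed
by a load of the same location can drop the load, because the value is already in a register:

st  %l2, [%fp-4]      ! store a
ld  [%fp-4], %l2      ! redundant load of a -- removed by the peephole optimizer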
Front end
The front end analyzes the source code to build an internal representation of the program, called the
intermediate representation or IR. It also manages the symbol table, a data structure mapping each symbol
in the source code to associated information such as location, type and scope. This is done over several
phases, which include some of the following:
1. Line reconstruction. Languages which strop their keywords or allow arbitrary spaces within
identifiers require a phase before parsing, which converts the input character sequence to a
canonical form ready for the parser. The top-down, recursive-descent, table-driven parsers used in
the 1960s typically read the source one character at a time and did not require a separate
tokenizing phase. Atlas Autocode, and Imp (and some implementations of Algol and Coral66) are
examples of stropped languages whose compilers would have a Line Reconstruction phase.
2. Lexical analysis breaks the source code text into small pieces called tokens. Each token is a single
atomic unit of the language, for instance a keyword, identifier or symbol name. The token syntax is
typically a regular language, so a finite state automaton constructed from a regular expression can
be used to recognize it. This phase is also called lexing or scanning, and the software doing lexical
analysis is called a lexical analyzer or scanner. (A minimal scanner sketch in C appears after this list.)
3. Preprocessing. Some languages, e.g., C, require a preprocessing phase which supports macro
substitution and conditional compilation. Typically the preprocessing phase occurs before syntactic
or semantic analysis; e.g. in the case of C, the preprocessor manipulates lexical tokens rather than
syntactic forms. However, some languages such as Scheme support macro substitutions based on
syntactic forms.
4. Syntax analysis involves parsing the token sequence to identify the syntactic structure of the
program. This phase typically builds a parse tree, which replaces the linear sequence of tokens with
a tree structure built according to the rules of a formal grammar which define the language's
syntax. The parse tree is often analyzed, augmented, and transformed by later phases in the
compiler.
5. Semantic analysis is the phase in which the compiler adds semantic information to the parse tree
and builds the symbol table. This phase performs semantic checks such as type checking (checking
for type errors), or object binding (associating variable and function references with their
definitions), or definite assignment (requiring all local variables to be initialized before use),
rejecting incorrect programs or issuing warnings. Semantic analysis usually requires a complete
parse tree, meaning that this phase logically follows the parsing phase, and logically precedes the
code generation phase, though it is often possible to fold multiple phases into one pass over the
code in a compiler implementation.
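To make the lexical-analysis phase concrete, here is a minimal scanner sketch in C. It recognizes only
identifiers, integer constants and single-character operator/symbol tokens; the token names are invented
for the example, and a real scanner would also look identifiers up in a table of reserved words.

#include <ctype.h>
#include <stdio.h>

/* Token categories for this toy scanner (names are illustrative). */
enum token { TOK_IDENT, TOK_NUMBER, TOK_OP, TOK_EOF };

static char lexeme[64];                      /* text of the most recent token */

/* Return the next token from stdin, skipping whitespace. */
static enum token next_token(void)
{
    int c = getchar();
    while (c == ' ' || c == '\t' || c == '\n' || c == '\r')
        c = getchar();
    if (c == EOF)
        return TOK_EOF;

    size_t n = 0;
    if (isalpha(c) || c == '_') {            /* identifier: letter (letter | digit | _)* */
        do {
            if (n < sizeof lexeme - 1) lexeme[n++] = (char)c;
            c = getchar();
        } while (isalnum(c) || c == '_');
        ungetc(c, stdin);
        lexeme[n] = '\0';
        return TOK_IDENT;
    }
    if (isdigit(c)) {                        /* integer constant: digit+ */
        do {
            if (n < sizeof lexeme - 1) lexeme[n++] = (char)c;
            c = getchar();
        } while (isdigit(c));
        ungetc(c, stdin);
        lexeme[n] = '\0';
        return TOK_NUMBER;
    }
    lexeme[0] = (char)c;                     /* anything else: one-character operator or symbol */
    lexeme[1] = '\0';
    return TOK_OP;
}

int main(void)
{
    enum token t;
    while ((t = next_token()) != TOK_EOF)
        printf("%-8s %s\n",
               t == TOK_IDENT ? "IDENT" : t == TOK_NUMBER ? "NUMBER" : "OP",
               lexeme);
    return 0;
}

Fed the fragment int a; a = a + 2; it prints one line per token (int comes out as IDENT because this sketch
has no reserved-word table), which is essentially the stream the parser consumes.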
Back end
The term back end is sometimes confused with code generator because of the overlapped functionality of
generating assembly code. Some literature uses middle end to distinguish the generic analysis and
optimization phases in the back end from the machine-dependent code generators.
1. Analysis: This is the gathering of program information from the intermediate representation
derived from the input. Typical analyses are data flow analysis to build use-define chains,
dependence analysis, alias analysis, pointer analysis, escape analysis etc. Accurate analysis is the
basis for any compiler optimization. The call graph and control flow graph are usually also built
during the analysis phase.
2. Optimization: the intermediate language representation is transformed into functionally equivalent
but faster (or smaller) forms. Popular optimizations are inline expansion, dead code elimination,
constant propagation, loop transformation, register allocation or even automatic parallelization.
3. Code generation: the transformed intermediate language is translated into the output language,
usually the native machine language of the system. This involves resource and storage decisions,
such as deciding which variables to fit into registers and memory and the selection and scheduling
of appropriate machine instructions along with their associated addressing modes (see also Sethi-
Ullman algorithm).
Compiler analysis is the prerequisite for any compiler optimization, and they tightly work together. For
example, dependence analysis is crucial for loop transformation.
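As a small illustration of how the two work together, constant propagation relies on data-flow analysis to
prove that a variable holds a known constant at a use; only then can the use be replaced and the now-dead
assignment removed:

Before:
x = 3
y = x + 4
return y

After constant propagation and dead code elimination:
return 7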
In addition, the scope of compiler analysis and optimization varies greatly, from as small as a basic block to
the procedure/function level, or even over the whole program (interprocedural optimization). Obviously, a
compiler can potentially do a better job using a broader view. But that broad view is not free: large scope
analysis and optimizations are very costly in terms of compilation time and memory space; this is especially
true for interprocedural analysis and optimizations.
Interprocedural analysis and optimizations are common in modern commercial compilers from HP, IBM,
SGI, Intel, Microsoft, and Sun Microsystems. The open source GCC was criticized for a long time for lacking
powerful interprocedural optimizations, but it is changing in this respect. Another good open source
compiler with full analysis and optimization infrastructure is Open64, which is used by many organizations
for research and commercial purposes.
Due to the extra time and space needed for compiler analysis and optimizations, some compilers skip them
by default. Users have to use compilation options to explicitly tell the compiler which optimizations should
be enabled.
Related techniques
Assembly language is not a high-level language, and a program that translates it into machine code is more
commonly known as an assembler, with the inverse program known as a disassembler.
A program that translates from a low level language to a higher level one is a decompiler.
A program that translates between high-level languages is usually called a language translator, source to
source translator, language converter, or language rewriter. The last term is usually applied to translations
that do not involve a change of language.
Cross compiler
A cross compiler is a compiler capable of creating executable code for a platform other than the one on
which the compiler is run. Cross compilers are generally used to build code for embedded systems or for
multiple platforms. They are necessary when it is inconvenient or impossible to compile on the target
platform itself, as with microcontrollers that have only a minimal amount of memory. Cross compilation has
also become more common for paravirtualization, where a single system may run software built for several
platforms.
Source-to-source translators, which are sometimes loosely referred to as cross compilers, are not covered
by this definition.
The fundamental use of a cross compiler is to separate the build environment from the target
environment. This is useful in a number of situations:
Embedded computers where a device has extremely limited resources. For example, a microwave
oven will have an extremely small computer to read its touchpad and door sensor, provide output
to a digital display and speaker, and to control the machinery for cooking food. This computer will
not be powerful enough to run a compiler, a file system, or a development environment. Since
debugging and testing may also require more resources than are available on an embedded system,
cross-compilation can be less involved and less prone to errors than native compilation.
Compiling for multiple machines. For example, a company may wish to support several different
versions of an operating system or to support several different operating systems. By using a cross
compiler, a single build environment can be set up to compile for each of these targets.
Compiling on a server farm. Similar to compiling for multiple machines, a complicated build that
involves many compile operations can be executed across any machine that is free regardless of its
brand or current version of an operating system.
Bootstrapping to a new platform. When developing software for a new platform, or the emulator of
a future platform, one uses a cross compiler to compile necessary tools such as the operating
system and a native compiler.
Compiling native code for emulators for older, now-obsolete platforms like the Commodore 64 or
Apple II by enthusiasts who use cross compilers that run on a current platform (such as Aztec C's
MS DOS 6502 cross compilers running under Windows XP).
Use of virtual machines (such as Java's JVM) resolves some of the reasons for which cross compilers were
developed. The virtual machine paradigm allows the same compiler output to be used across multiple
target systems.
Typically the hardware architecture differs (e.g. compiling a program destined for the MIPS architecture on
an x86 computer) but cross-compilation is also applicable when only the operating system environment
differs, as when compiling a FreeBSD program under Linux, or even just the system library, as when
compiling programs with uClibc on a glibc host.
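For example, on an x86 Linux host with an ARM cross toolchain installed (the toolchain prefix below is one
common naming convention, not the only one), cross-compiling looks like an ordinary compile invoked
under a different compiler name:

arm-linux-gnueabihf-gcc -O2 -o hello hello.c   # produces an ARM executable on the x86 host
file hello                                     # reports an ARM binary, not an x86 one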
Bootstrapping (compilers)
Bootstrapping is a term used in computer science to describe the techniques involved in writing a compiler
(or assembler) in the same programming language that it is intended to compile.
One may then wonder how the chicken and egg problem of creating the compiler was solved: if one needs
a compiler for language X to obtain a compiler for language X, how did the first compiler get written?
Possible methods include:
Implementing an interpreter or compiler for language X in language Y. Niklaus Wirth reported that
he wrote the first Pascal compiler in Fortran. Language Y could also be hand coded machine code or
assembly language.
Another interpreter or compiler for X has already been written in another language Y; this is how
Scheme is often bootstrapped.
Earlier versions of the compiler were written in a subset of X for which there existed some other
compiler; this is how some supersets of Java are bootstrapped.
The compiler for X is cross compiled from another architecture where there exists a compiler for X;
this is how compilers for C are usually ported to other platforms.
Writing the compiler in X; then hand-compiling it from source (most likely in a non-optimized way)
and running that on the code to get an optimized compiler. Donald Knuth used this for his WEB
literate programming system.
Methods for distributing compilers in source code include providing a portable bytecode version of the
compiler, so as to bootstrap the process of compiling the compiler with itself.
The first language to provide such a bootstrap was NELIAC. The first commercial language to do so was
PL/I. Today, a large proportion of programming languages are bootstrapped, including Basic, C, Pascal,
Factor, Haskell, Modula-2, Oberon, OCaml, Common Lisp, Scheme and more.
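In practice a self-hosting compiler is usually built in stages; the three-stage pattern below (used, for
example, by GCC) is a common way to both bootstrap and sanity-check the result:

1. Build the new compiler from its source using an existing compiler (written in another language or an
earlier release).
2. Use that stage-1 compiler to compile the same source again, producing a stage-2 compiler.
3. Use the stage-2 compiler to compile the source once more; the stage-2 and stage-3 binaries should be
identical, which is a useful consistency check.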
Assembler
Typically a modern assembler creates object code by translating assembly instruction mnemonics into
opcodes, and by resolving symbolic names for memory locations and other entities. [1] The use of symbolic
references is a key feature of assemblers, saving tedious calculations and manual address updates after
program modifications. Most assemblers also include macro facilities for performing textual substitution—
e.g., to generate common short sequences of instructions to run inline, instead of in a subroutine.
Assemblers are generally simpler to write than compilers for high-level languages, and have been available
since the 1950s. Modern assemblers, especially for RISC based architectures, such as MIPS, Sun SPARC and
HP PA-RISC, optimize instruction scheduling to exploit the CPU pipeline efficiently.
Note that, in normal professional usage, the term assembler is often used ambiguously: It is frequently
used to refer to an assembly language itself, rather than to the assembler utility. Thus: "CP/CMS was
written in S/360 assembler" as opposed to "ASM-H was a widely-used S/370 assembler."
Linker or link editor
In computer science, a linker or link editor is a program that takes one or more object files generated by a
compiler and combines them into a single executable program.
In IBM mainframe environments such as OS/360 this program is known as a linkage editor.
On Unix variants the term loader is often used as a synonym for linker. Because this usage blurs the
distinction between the compile-time process and the run-time process, this article will use linking for the
former and loading for the latter. However, in some operating systems the same program handles both the
jobs of linking and loading a program; see dynamic linking.
Computer programs are typically composed of several parts or modules; these parts, if not all contained
within a single object file, refer to each other by means of symbols. Typically, an object file can contain
three kinds of symbols: defined symbols, which it exports so that other modules may refer to them;
undefined symbols, which name things it uses but expects to find defined elsewhere; and local symbols,
used internally within the object file to simplify relocation.
When a program comprises multiple object files, the linker combines these files into a unified executable
program, resolving the symbols as it goes along.
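A minimal C illustration (the file and symbol names are invented for the example): main.o contains an
undefined reference to counter, which the linker resolves against the definition in counter.o.

/* counter.c -- defines the symbol "counter" */
int counter = 42;

/* main.c -- uses "counter", which is undefined within main.o */
extern int counter;

int main(void)
{
    return counter;
}

Compiling and linking with a standard Unix-style driver:

cc -c main.c counter.c           # produces main.o and counter.o
cc -o program main.o counter.o   # the link step resolves the reference to counter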
Linkers can take objects from a collection called a library. Some linkers do not include the whole library in
the output; they only include its symbols that are referenced from other object files or libraries. Libraries
exist for diverse purposes, and one or more system libraries are usually linked in by default.
The linker also takes care of arranging the objects in a program's address space. This may involve
relocating code that assumes a specific base address to another base. Since a compiler seldom knows
where an object will reside, it often assumes a fixed base location (for example, zero). Relocating machine
code may involve re-targeting of absolute jumps, loads and stores.
The executable output by the linker may need another relocation pass when it is finally loaded into
memory (just before execution). This pass is usually omitted on hardware offering virtual memory — every
program is put into its own address space, so there is no conflict even if all programs load at the same base
address. This pass may also be omitted if the executable is a position independent executable.
Dynamic linking
Modern operating system environments allow dynamic linking, that is, postponing the resolution of some
undefined symbols until a program is run. That means that the executable still contains undefined
symbols, plus a list of objects or libraries that will provide definitions for these. Loading the program will
load these objects/libraries as well, and perform a final linking.
Dynamic linking has two main advantages. Often-used libraries (for example the standard system libraries)
need to be stored in only one location, not duplicated in every single binary. And if an error in a library
function is corrected by replacing the library, all programs using it dynamically will benefit from the
correction after being restarted; programs that included this function by static linking would have to be
re-linked first.
Loader
In computing, a loader is the part of an operating system that is responsible for loading programs from
executables (i.e., executable files) into memory, preparing them for execution and then executing them.
The loader is usually a part of the operating system's kernel and usually is loaded at system boot time and
stays in memory until the system is rebooted, shut down, or powered off. Some operating systems that
have a pageable kernel may have the loader in the pageable part of memory and thus the loader
sometimes may be swapped out of memory. All operating systems that support program loading have
loaders. Some embedded operating systems in highly specialized computers run only one program and
have no program loading capabilities and thus no loaders, for example embedded systems in cars or stereo
equipment.
In Unix, the loader is the handler for the system call execve().[1] The loader's tasks under Unix include: (1)
validation (permissions, memory requirements etc.); (2) copying the program image from the disk into
main memory; (3) copying the command-line arguments onto the stack; (4) initializing registers (e.g., the
stack pointer); (5) jumping to the program entry point (_start).
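A small C illustration (the program path and arguments are made up for the example): a process asks the
loader to replace its own image by calling execve():

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char *argv[] = { "/bin/ls", "-l", NULL };   /* program to load, plus its arguments */
    char *envp[] = { NULL };                    /* empty environment */

    execve("/bin/ls", argv, envp);              /* on success this call never returns */
    perror("execve");                           /* reached only if the load failed */
    return 1;
}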
Loader programs are useful for prototyping, testing, and one-off applications. One such program was an
integral part of Gene Amdahl's original OS/360 operating system, and this loader facility was continued
through OS/360's descendants including MVT, MVS and z/OS.
Runtime
In computer science, runtime or run time describes the operation of a computer program, the duration of
its execution, from beginning to termination (compare compile time). The term runtime can also refer to a
virtual machine that manages a program written in a computer language while it is running. Run time is
sometimes also used to mean a runtime library, a library of basic support code used with a particular
compiler, but when that is the intended meaning, the term runtime library is more accurate.
A runtime environment is a virtual machine state which provides software services for processes or
programs while a computer is running. It may pertain to the operating system itself, or the software that
runs beneath it. Its primary purpose is to enable "platform independent" programming.
Runtime activities include loading and linking of the classes needed to execute a program, optional
machine code generation and dynamic optimization of the program, and actual program execution.
For example, a program written in Java would receive services from the Java Runtime Environment by
issuing commands from which the expected result is returned by the Java software. By providing these
services, the Java software is considered the runtime environment of the program. Both the program and
the Java software combined request services from the operating system. The operating system kernel
provides services for itself and all processes and software running under its control. The Operating System
may be considered as providing a runtime environment for itself.
Compile time
In computer science, compile time refers to either the operations performed by a compiler (the "compile-
time operations") or programming language requirements that must be met by source code for it to be
successfully compiled (the "compile-time requirements").
The operations performed at compile time usually include syntax analysis, various kinds of semantic
analysis (e.g., type checks and template instantiation) and code generation.
The definition of a programming language will specify compile-time requirements that source code must
meet to be successfully compiled.
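For instance, in C a type mismatch is a compile-time error, while an error that depends on values computed
during execution only shows up at runtime (the variable names below are illustrative):

int n = "hello";   /* rejected at compile time: the initializer has the wrong type */

int d = 0;
int q = 10 / d;    /* compiles, but dividing by zero fails (undefined behaviour) at runtime */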
Compile time occurs before link time (when the outputs of one or more compiled files are joined together)
and runtime (when a program is executed). In some programming languages it may be necessary for some
compilation and linking to occur at runtime.