A Practical Approach To Compiler Construction 1st Edition Des Watson (Auth.)
Undergraduate Topics in Computer Science
Des Watson
A Practical Approach to Compiler Construction
Undergraduate Topics in Computer Science
Undergraduate Topics in Computer Science (UTiCS) delivers high-quality
instructional content for undergraduates studying in all areas of computing and
information science. From core foundational and theoretical material to final-year
topics and applications, UTiCS books take a fresh, concise, and modern approach
and are ideal for self-study or for a one- or two-semester course. The texts are all
authored by established experts in their fields, reviewed by an international advisory
board, and contain numerous examples and problems. Many include fully worked
solutions.
Des Watson
Department of Informatics
Sussex University
Brighton, East Sussex
UK
Series editor
Ian Mackie
Advisory Board
Samson Abramsky, University of Oxford, Oxford, UK
Karin Breitman, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
Chris Hankin, Imperial College London, London, UK
Dexter Kozen, Cornell University, Ithaca, USA
Andrew Pitts, University of Cambridge, Cambridge, UK
Hanne Riis Nielson, Technical University of Denmark, Kongens Lyngby, Denmark
Steven Skiena, Stony Brook University, Stony Brook, USA
Iain Stewart, University of Durham, Durham, UK
Preface
compiler is important knowledge for any user of a compiler. Compilers are complex
pieces of code and an awareness of how they work can be very helpful. The algorithms
used in a compiler are relevant to many other application areas such as aspects of
text decoding and analysis and the development of command-driven interfaces. The
need for simple domain-specific languages occurs frequently and the knowledge of
compiler design can facilitate their rapid implementation.
Writing a simple compiler is an excellent educational project and enhances skills
in programming language understanding and design, data structure and algorithm
design and a wide range of programming techniques. Understanding how a
high-level language program is translated into a form that can be executed by the
hardware gives a good insight into how a program will behave when it runs, where
the performance bottlenecks will be, the costs of executing individual high-level
language statements and so on. Studying compiler design makes you a better
programmer.
Why is there now yet another book on compiler design? Many detailed and
comprehensive textbooks in this field have already been published. This book is a
little different from most of the others. Hopefully, it presents key aspects of the
subject in an accessible way, using a practical approach. The algorithms shown
are all capable of straightforward implementation in almost any programming
language, and the reader is strongly encouraged to read the text and in parallel
produce code for implementations of the compiler modules being described. These
practical examples are concentrated in areas of compiler design that have general
applicability. For example, the algorithms shown for performing lexical and syntax
analysis are not restricted for use in compilers alone. They can be applied to the
analysis required in a wide range of text-based software.
The field of programming language implementation is huge and this book covers
only a small part of it. Just the basic principles, potentially applicable to all
compilers, are explained in a practical way.
This book introduces the topic of compiler construction using many programmed
examples, showing code that could be used in a range of compiler and
compiler-related projects. The code examples are nearly all written in C, a mature
language and still in widespread use. Translating them into another programming
language should not cause any real difficulty. Many existing compiler projects are
written in C, many new compiler projects are being written in C and there are many
compiler construction tools and utilities designed to support compiler
Turning the compiler construction project into a group project worked well.
Programming teams can be made responsible for the construction of a complete
compiler. The development can be done entirely by members of the team or it may
be possible for teams to trade with other teams. This is a good test of
well-documented interfaces. Producing a set of good test programs to help verify
that a compiler works is an important part of the set of software modules produced
by each team.
Generating standard-format object code files for real machines in an introductory
compilers course may be trying to go a little too far. Generating assembly code for a
simple processor or for a simple subset of a processor’s features is probably a better
idea. Coding an emulator for a simple target machine is not difficult—just use the
techniques described in this book, of course. Alternatively, there are many virtual
target architecture descriptions with corresponding emulator software freely available.
The MIPS architecture, with the associated SPIM software [1], despite its age,
is still very relevant today and is a good target for code generation. The pleasure of
writing a compiler that produces code that actually runs is considerable!
Reference
1. Larus JR (1990) SPIM S20: a MIPS R2000 simulator. Technical Report 966. University of
Wisconsin-Madison, Madison, WI, Sept 1990
Contents
1 Introduction
  1.1 High-Level Languages
    1.1.1 Advantages of High-Level Languages
    1.1.2 Disadvantages of High-Level Languages
  1.2 High-Level Language Implementation
    1.2.1 Compilers
    1.2.2 Compiler Complexity
    1.2.3 Interpreters
  1.3 Why Study Compilers?
  1.4 Present and Future
  1.5 Conclusions and Further Reading
  References
2 Compilers and Interpreters
  2.1 Approaches to Programming Language Implementation
    2.1.1 Compile or Interpret?
  2.2 Defining a Programming Language
    2.2.1 BNF and Variants
    2.2.2 Semantics
  2.3 Analysis of Programs
    2.3.1 Grammars
    2.3.2 Chomsky Hierarchy
    2.3.3 Parsing
  2.4 Compiler and Interpreter Structure
    2.4.1 Lexical Analysis
    2.4.2 Syntax Analysis
    2.4.3 Semantic Analysis
    2.4.4 Machine-Independent Optimisation
    2.4.5 Code Generation
    2.4.6 Machine-Dependent Optimisation
Chapter 1
Introduction
The high-level language is the central tool for the development of today’s software.
The techniques used for the implementation of these languages are therefore very
important. This book introduces some of the practicalities of producing implementations
of high-level programming languages on today’s computers. The idea of a
compiler, traditionally translating from the high-level language source program to
machine code for some real hardware processor, is well known but there are other
routes for language implementation. Many programmers regard compilers as being
deeply mysterious pieces of software—black boxes which generate runnable code—
but some insight into the internal workings of this process may help towards their
effective use.
Programming language implementation has been studied for many years and it is
one of the most successful areas of computer science. Today’s compilers can generate
highly optimised code from complex high-level language programs. These compilers
are large and extremely complex pieces of software. Understanding what they do and
how they do it requires some background in programming language theory as well
as processor design together with a knowledge of how best to structure the processes
required to translate from one computer language to another.
Even in the earliest days of electronic computing in the 1940s it was clear that there
was a need for software tools to support the programming process. Programming was
done in machine code, it required considerable skill and was hard work, slow and
error prone. Assembly languages were developed, relieving the programmer from
having to deal with much of the low-level detail, but requiring an assembler, a piece
of software to translate from assembly code to machine code. Giving symbolic names
to instructions, values, storage locations, registers and so on allows the programmer
© Springer International Publishing AG 2017 1
D. Watson, A Practical Approach to Compiler Construction, Undergraduate
Topics in Computer Science, DOI 10.1007/978-3-319-52789-5_1
to concentrate on the coding of the algorithms rather than on the details of the binary
representation required by the hardware and hence to become more productive. The
abstraction provided by the assembly language allows the programmer to ignore the
fine detail required to interact directly with the hardware.
The development of high-level languages gathered speed in the 1950s and beyond.
In parallel there was a need for compilers and other tools for the implementation
of these languages. The importance of formal language specifications was recognised
and the correspondence between particular grammar types and straightforward
implementation was understood. The extensive use of high-level languages prompted
the rapid development of a wide range of new languages, some designed for particular
application areas such as COBOL for business applications [1] and FORTRAN for
numerical computation [2]. Others such as PL/I (then called NPL) [3] tried to be very
much more general-purpose. Large teams developed compilers for these languages
in an environment where target machine architectures were changing fast too.
The difficulties of programming in low-level languages are easy to see and the need
for more user-friendly languages is obvious. A programming notation much closer
to the problem specification is required. Higher level abstractions are needed so that
the programmer can concentrate more on the problem rather than the details of the
implementation of the solution.
High-level languages can offer such abstraction. They offer many potential advantages
over low-level languages including:
so that, for example, moving a Java program from one machine to another with
different architectures and operating systems should be an easy task.
• Compile-time checking can remove many bugs at an early stage, before the program
actually runs. Checking variable declarations, type checking, ensuring that
variables are properly initialised, checking for compatibility in function arguments
and so on are often supported by high-level languages. Furthermore, the compiler
can insert runtime code such as array bound checking. The additional runtime
cost may be a small price to pay for early removal of errors.
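As a rough illustration of the last point, the following C sketch shows the kind of guard a compiler might insert around an array access. The function names index_in_bounds and checked_load are hypothetical, chosen for this example; a real compiler would typically emit the check inline and report a failure through the language's runtime system rather than return a status code.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if index is a valid subscript for an array of `length`
   elements, 0 otherwise. (Hypothetical helper, for illustration only.) */
int index_in_bounds(long index, size_t length)
{
    return index >= 0 && (size_t)index < length;
}

/* What the programmer wrote:      value = table[i];
   What a compiler might generate: a guarded load.
   Returns 0 on success, or -1 if the bounds check fails, in which case
   *value is left untouched. */
int checked_load(const int *table, size_t length, long i, int *value)
{
    assert(table != NULL && value != NULL);
    if (!index_in_bounds(i, length))
        return -1;   /* a real runtime system would raise an error here */
    *value = table[i];
    return 0;
}
```

The cost of the check is one or two comparisons per access, which is the "additional runtime cost" referred to above.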
Despite these significant advantages, there may be circumstances where the use of a
low-level language (typically an assembly language) may be more appropriate. We
can identify possible advantages of the low-level language approach.
1.1.2.2 Efficiency
A simplistic but not inaccurate view of the language implementation process suggests
that some sort of translator program (a compiler) is required to transform the high-level
language program into a semantically equivalent machine code program that
can run on the target machine. Other software, such as libraries, will probably also
be required. As the source language becomes “higher and higher-level”, closer to
human expression, its complexity increases and one would expect the complexity
of the translator to increase too.
Many programming languages have been and are implemented in this way, and
this book concentrates on this implementation route. But other routes are possible,
and it may be the characteristics of the high-level language that force different
approaches. For example, the traditional way of implementing Java makes use of
the Java Virtual Machine (JVM) [5], where the compiler translates from Java source
code into JVM code and a separate program (an interpreter) reads these virtual
machine instructions, emulating the actions of the virtual machine, effectively running
the Java program. This seemingly contrary implementation method does have
significant benefits. In particular it supports Java’s feature of dynamic class loading.
Without such an architecture-neutral virtual machine code, implementing dynamic
class loading would be very much more difficult. More generally, it allows the support
of reflection, where a Java program can examine or modify at runtime the internal
properties of the executing program.
Interpreted approaches are very appropriate for the implementation of some programming
languages. Compilation overheads are reduced at the cost of longer runtimes.
The programming language implementation field is full of tradeoffs. These
issues of compilers versus interpreters are investigated further in Chap. 2.
To make effective use of a high-level language, it is essential to know something
about its implementation. In some demanding application areas such as embedded
systems where a computer system with a fixed function is controlling some electronic
or mechanical device, there may be severe demands placed on the embedded
controller and the executing code. There may be real-time constraints (for example,
when controlling the ignition timing in a car engine where a predefined set of operations
has to complete in the duration of a spark), memory constraints (can the whole
program fit in the 64 kbytes available on the cheap version of the microcontroller
chip?) or power consumption constraints (how often do I have to charge the batteries
in my mobile phone?). These constraints make demands on the performance of
the hardware but also on the way in which the high-level language implementing
1.2.1 Compilers
The language implementation does not stop at the compiler. The support of collections
of library routines is always required, providing the environment in which
code generated by the compiler can run. Other tools such as debuggers, linkers, documentation
aids and interactive development environments are needed too. This is
no easy task.
Dealing with this complexity requires a strict approach to design in the structuring
of the compiler construction project. Traditional techniques of software engineering
are well applied in compiler projects, ensuring appropriate modularisation, testing,
interface design and so on. Extensive stage-by-stage testing is vital for a compiler.
A compiler may produce highly optimised code, but if that code produces the wrong
answers when it runs, the compiler is not of much use. To ease the task of producing
a programming language implementation, many software tools have been developed
to help generate parts of a compiler or interpreter automatically. For example, lexical
analysers and syntax analysers (two early stages of the compilation process) are
often built with the help of tools taking the formal specification of the syntax of the
programming language as input and generating code to be incorporated in the compiler
as output. The modularisation of compilers has also helped to reduce workload.
For example, many compilers have been built using a target-machine-independent
front-end and a source-language-independent back-end using a standard intermediate
representation as the interface between them. Then front-ends and back-ends can be
mixed and matched to produce a variety of complete compilers. Compiler projects
rarely start from scratch today.
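The mix-and-match idea can be sketched in a few lines of C. Everything here is invented for illustration: the ir_add structure stands in for a standard intermediate representation, parse_add is a toy front-end that accepts only statements of the form x = y + n with single-letter variable names, and the two emit functions are back-ends for imaginary register and stack machines whose instruction names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical intermediate representation for:  dst = src + imm */
struct ir_add {
    char dst;   /* single-letter destination variable */
    char src;   /* single-letter source variable      */
    int  imm;   /* integer constant to add            */
};

/* Toy front-end: source text -> IR. Returns 1 on success, 0 otherwise. */
int parse_add(const char *source, struct ir_add *ir)
{
    assert(source != NULL && ir != NULL);
    return sscanf(source, " %c = %c + %d", &ir->dst, &ir->src, &ir->imm) == 3;
}

/* Back-end 1: IR -> text for an invented register machine. */
int emit_register_machine(const struct ir_add *ir, char *out, size_t size)
{
    assert(ir != NULL && out != NULL);
    return snprintf(out, size, "load r1,%c; addi r1,%d; store r1,%c",
                    ir->src, ir->imm, ir->dst);
}

/* Back-end 2: IR -> text for an invented stack machine. */
int emit_stack_machine(const struct ir_add *ir, char *out, size_t size)
{
    assert(ir != NULL && out != NULL);
    return snprintf(out, size, "push %c; push %d; add; pop %c",
                    ir->src, ir->imm, ir->dst);
}
```

Because both back-ends consume the same ir_add value, either can be paired with parse_air's replacement for another source language without change, provided that front-end also produces ir_add. Supporting a new target means writing only a new emit function.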
Fortunately, in order to learn about the principles of language implementation,
compiler construction can be greatly simplified. If we start off with a simple programming
language and generate code for a simple, maybe virtual, machine, not
worrying too much about high-quality code, then the compiler construction project
should not be too painful or protracted.
1.2.3 Interpreters
possibly multiple statement analysis followed by the interpreter emulating the action
of the statement will be many times greater than the cost of executing a few machine
instructions obtained from a compilation of a = b + 1. However, this cost can
be reduced fairly easily by only doing the analysis of the program once, translating
it into an intermediate form that is subsequently interpreted. Many languages have
been implemented in this way, using an interpreted intermediate form, despite the
overhead of interpretation.
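A minimal sketch of an interpreted intermediate form, written in C, is given here. The stack-machine opcodes, the instr structure and the run function are assumptions made for this example and are not taken from any particular implementation. The statement a = b + 1 is analysed once (by hand, in this case) into five instructions, and the interpreter then simply loops over them with no further analysis of source text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stack-machine opcodes, invented for this sketch. */
enum opcode { OP_LOAD, OP_PUSH, OP_ADD, OP_STORE, OP_HALT };

struct instr {
    enum opcode op;
    int operand;   /* variable slot for LOAD/STORE, literal for PUSH */
};

/* Slots in the variable table. */
enum { VAR_A, VAR_B, NUM_VARS };

enum { STACK_SIZE = 16 };

/* Intermediate code for the statement:  a = b + 1 */
const struct instr a_equals_b_plus_1[] = {
    { OP_LOAD,  VAR_B },   /* push the value of b          */
    { OP_PUSH,  1     },   /* push the constant 1          */
    { OP_ADD,   0     },   /* replace top two by their sum */
    { OP_STORE, VAR_A },   /* pop the result into a        */
    { OP_HALT,  0     }
};

/* Interpret `code` until OP_HALT, updating `vars` in place. */
void run(const struct instr *code, int *vars)
{
    int stack[STACK_SIZE];
    size_t sp = 0;   /* number of values currently on the stack */

    for (const struct instr *pc = code; pc->op != OP_HALT; pc++) {
        switch (pc->op) {
        case OP_LOAD:
            assert(sp < STACK_SIZE);
            stack[sp++] = vars[pc->operand];
            break;
        case OP_PUSH:
            assert(sp < STACK_SIZE);
            stack[sp++] = pc->operand;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_STORE:
            assert(sp >= 1);
            vars[pc->operand] = stack[--sp];
            break;
        case OP_HALT:
            break;   /* not reached: the loop condition stops first */
        }
    }
}
```

Each intermediate instruction still costs a fetch, a switch and some stack traffic, so the result is slower than compiled code, but the expensive analysis of the source statement is no longer repeated every time the statement executes.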
The second problem concerns the need for the presence of an interpreter at runtime.
When the program is “executing” it is located in the memory of the target system in
source or in a post-analysis intermediate form, together with the interpreter program.
It is likely that the total memory footprint is much larger than that of equivalent
compiled code. For small, embedded systems with very limited memory this may be
a decisive disadvantage.
All programming language implementations are in some sense interpreted. With
source code interpretation, the interpreter is complex because it has to analyse the
source language statements and then emulate their execution. With intermediate code
interpretation, the interpreter is simpler because the source code analysis has been
done in advance. With the traditional compiled approach with the generation of
target machine code, the interpretation is done entirely by the target hardware; there
is no software interpretation and hence no overhead. Looking at these three levels of
interpretation in greater detail, one can easily identify tradeoffs:
The different memory requirements of the three approaches are somewhat harder
to quantify and depend on implementation details. In the source-level interpretation
case, a simplistic implementation would require both the text of the source code and
the (complex) interpreter to be in main memory. The intermediate code interpretation
case would require the intermediate code version of the program and the (simpler)
interpreter to be in main memory. And in the full compilation case, just the compiled
target code would need to be in main memory. This, of course, takes no account of the
memory requirements of the running program—space for variables, data structures,
buffers, library code, etc.
There are other tradeoffs. For example, when the source code is modified, there is
no additional compilation overhead in the source-level interpretation case, whereas in
the full compilation case, it is usual for the entire program or module to be recompiled.
In the intermediate code interpretation case, it may be possible to just recompile the
source statements that have changed, avoiding a full recompilation to intermediate
code.
Finally, it should be emphasised that this issue of lower efficiency of interpreted
implementations is rarely a reason to dismiss the use of an interpreter. The interpreting
overhead in time and space may well be irrelevant, particularly in larger computer
systems, and the benefits offered may well overwhelm any efficiency issues.
Today’s compilers and language tools can deal with complex (both syntactically and
semantically) programming languages, generating code for a huge range of computer
architectures, both real and virtual. The quality of generated code from many of
today’s compilers is astonishingly good, often far better than that generated by a
competent assembly/machine code programmer. The compiler can cope well with
the complex interacting features of computer architectures. But there are practical
limits. For example, the generation of truly optimal code (optimising for speed, size,
power consumption, etc.) may in practice be at best time consuming or more likely
impossible. Where we need to make the best use of parallel architectures, today’s
compilers can usually make a good attempt, but not universally. There are many
unsolved optimisation-related problems. Also, surely there must be better processor
architectures for today’s and tomorrow’s programming languages?
Compilers are not just about generating target code from high-level language
programs. Programmers need software tools, probably built on compiler technology,
to support the generation of high-quality and reliable software. Such tools have
been available for many years to perform very specific tasks. For example, consider
the lint tool [6] to highlight potential trouble spots in C programs. Although huge
advances have been made, much more is required and this work clearly interacts with
programming language design as well as with diverse areas of software engineering.