Compiler & Interpreter
By
Dr. Tariq R. Soomro
WEEK-3: INTRODUCTION TO STATEMENTS AND TYPES:
• The facilities provided by Pascal and C are developed enough that
latter languages have simply borrowed them, and built on them.
• In imperative programming:
1) The values of variables can change as a program runs, and
2) Programs – what we write – are not the same as
computations – the actions that occur when a program
runs.
• These motivate the two key concepts: Structure Statements and
invariants (relate programs and computations).
The need for structured programming:
• The basic unit of imperative programming are actions, which can
change the values of variables.
• A typical action is an assignment.
• The following assignment changes the value of variable a:
a = 2+3; or a := 2+3
• The assignment symbol := or = appears between the variable a
and the expression 2+3.
• This assignment computes the value 5 of the expression 2+3 and
associates it with a; the old value of a is forgotten (in whatever the
case – even if a not assign any value – garbage)
• An Array supports random access to a sequence of elements of the
same type.
• Random access means that elements can be selected by their
position in the sequence.
• Static programs, Dynamic Computations: A program is a concise
representation of the computation that occurs when the program
can be executed over and over again, possibly leading to a long
computation.
• For example: Dynamic allocation of memory or gotoxy(x,y) –
depends on value of x and y.
• The static text of a program is distinct from the dynamic
computations that occur when the program runs.
• Problems occur if a reader of a program cannot readily
understand the actions that occur when the program runs.
• The control-flow constructs are designed with the following
principle in mind:
- Structure programming: The structure of the program text
should be help us understand what the program does
• The emphasis on structure and correctness is independent of
efficiency.
• Carefully designed structured programs can be just as efficient as
unstructured ones.
• In fact, the readability of structured program can make them
easier to modify and tune for efficiency.
• In reality, efficient use of the underlying machine has always been
a primary design goal for imperative languages.
• Language designers have achieved efficiency by staying true to
the machine, using the following design principle:
- Efficiency: A language must allow an underlying
assignment-oriented machine to be used directly and
efficiently.
• Efficiency is likely to remain a driving influence.
• For example, consider the following problem:
• Remove adjacent duplicated from a list of elements.
• For removal of adjacent duplicates from:
112223144
• yields the list:
12314
• The remaining occurrences of 1 are not adjacent.
• This problem can be solved by treating the input not as a list of
elements, but as a list of runs, where run consists of one or more
copies of the same element.
• The approach is to repeatedly
Read element x;
Write x;
Skip past the duplicates of x;
Invariants: Program Design:
• Programming language design must deal at some level with
program design.
• The purpose of programming languages is to make programming
easier.
• Invariants are a key concept for designing imperative programs.
• An invariant at some point in a program is an assertion that holds
whenever that point is reached at run time – that is, whenever
control reaches that point.
• An assertion is a true/false condition about the state of a
computation.
• Some of the difficulty in writing correct code is that correctness is
a property not of the static source text we write but of its dynamic
computations – when the program is run it must do such and
such.
• Technically we work with assertions, yet we talk about invariants,
since the intent is to work with properties that hold every time
control reaches a program point.
Types: Data Representation:
• The emphasis in imperative languages is on data structures with
assign-able components; that is, components that can hold
individual values.
• The size and layout of data structures tends to be fixed at compile
time in imperative languages, before a program is run.
• Data structures that grow and shrink are typically implemented
using fixed-sized cells and pointers.
• Storage in imperative languages must typically be allocated and
de-allocated explicitly.
• By contrast, in functional languages, the emphasis is on values,
independent of how they are stored; operations of values
implicitly grow and shrink the underlying data structure; and,
storage is recycled automatically, using garbage collection.
• Role of Types:
• The term object or data object refers to something meaningful to
an application; the term representation or data representation
refers to the organization of values in a program.
• Thus, objects in an application have corresponding
representations in a program.
• For example, the objects in this example are days like May 6 and
their representations are integers like 126.
• A parking-lot company counts the number of cars that enter the
lot each day.
• A day of the year is a data object in this parking-lot application.
• In a program, a day can be represented by an integer between 1 to
366.
• January 1 is represented by 1, January 31 by 31, February 1 by
32, and so on.
• The correspondence between application and program is as
follows:
Application Program
Data … January 31 31
Data … May 6 126
Variable … d n
Operation tomorrow(d) n+1
• Once days are represented as integers, the integers can be
manipulated using arithmetic operation like = and + on integers.
• For example, if d is a day and n is its integer representation, then
the next day, tomorrow(d), is represented by n+1.
• Confusion between objects and their representations can lead to
errors.
• It makes sense to multiply two integers, but it does not make sense
to multiply two days.
• Furthermore, the representation may not be unique; the day May
6 is represented by the integer 126 – or is it 127?
• The answer depends on whether the month of February has 28 or
29 days that year.
• Data representations in imperative language are built up from
values that can be manipulated directly by the underlying
machine.
• Values held in machine locations can be classified into basic types,
such as integers, characters, reals, and booleans.
• A type expression describes how a data representation is built up.
• The term “type expression” will be abbreviated to type to avoid
confusion between a type expression like:
int abc[10]; or array [0..99] of char
• and an arithmetic expression like
a+b*c;
Basic Types:
• Values associated with basic types can be used freely and
efficiently in imperative languages.
• They can be compared for equality, they can appear on the right
sides of assignments, and they can be passed as parameters.
• Operations on basic values often correspond to single machine
instructions or short sequences of machine instructions, so the
operations are implemented efficiently.
• Enumerations:
• An enumeration is a finite sequence of names written between
parentheses.
• The declaration makes day an enumeration with seven elements
as:
type day = (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
• Names like Mon are treated as constants.
• Enumerated names are constants.
• The basic type boolean is treated as the pre-declared
enumeration:
type boolean = (true, false);
• Integer & Reals:
• Integer and reals get machine support.
• Short-circuit evaluation of boolean expression:
• C uses short-circuit evaluation for the boolean operations | | and
&&; that is the second operand is evaluated only if necessary.
• For example:
While (i >= 0 && x != a[i] ) i = i - 1 ;
• Subranges:
• Subranges specify bounds.
• Subranges are a special case of basic types because they restrict
the range of values of an existing type.
• Arrays:
• An array is a data structure that holds a sequence of elements of
the same type.
• Array types are often indexed by integers, and they can be
indexed by enumerations (e.g. Pascal language includes array
bounds).
• Records / Structures:
• Example is structure in C.
• Also Union in C is example.
• Sets:
• Sets can be implemented efficiently using bits in the underlying
machine.
• Operations of sets turn into bit operations.
• Pointers: Efficiency and Dynamic Allocation:
• A pointer type is a value that provides indirect access to elements
of a known type.
• Pointer are motivated by indirect addresses in machine language,
the main difference being that a pointer p points only to objects of
a specific type T.
• Pointers are used as follows:
• Efficiency: Rather than move or copy a large data structure in
memory, it is more efficient to move or copy a pointer to the data
structure.
• Dynamic data: Data Structures that grow and shrink during
execution can be implemented using records and pointers.
• Pointers serve as links between cells.
• String:
• String in C.