CSC 332 Study Session 2
Introduction
A description language is a computer language designed to communicate the various aspects of
printed material from a computer to a printing machine. Text and graphics are laid out on a
computer screen, but sending only output bitmaps limits the amount of information that a printer
can receive.
A description language relays more information faster, gives higher-quality output, and makes
printing more efficient by instructing the printer on the quantity of materials to be printed upon.
This study session will introduce you to syntactic structure and language semantics.
CSC 332: Survey of Programming Languages
Language description is a type of “computer language” used primarily in the printing industry.
Some description languages are thorough enough to be considered programming languages, but a
large percentage of them are not; these are known as mark-up languages. Like the hypertext mark-up
language (HTML), description mark-up languages are capable of speaking to only a limited number of
programs.
Just as HTML is used primarily to speak with web browsers, most description languages can be read
only by a particular machine, program, or computer. Description languages are often created using
binary or textual commands. In the case of a binary language, “graphical and textual formatting” is
turned into a series of ones and zeros using the description language.
This code is sent to the printer and then translated back into a visual graphic as it is printed.
Computers are usually connected to large printing machines, and a specialized language is used so
that both devices can communicate.
The efficiency of setting and printing materials is greatly increased. There is a wide variety of
description languages, but the most common is Adobe® PostScript®. Binary codes are the same ones
used by every computer. When a key is pressed on a keyboard, for example, a series of ones (1s) and
zeros (0s) representing the key tells the computer what to do.
Since information is better received by a printer using description languages, the colour, layout,
and resolution are often of better quality than they would be if the same item were printed without
them.
In many cases, a programmer is needed to create new description language code. Some of the easier
mark-up languages may be learned by anyone, although they are limited in their use.
Syntax is concerned with the structure of programs, and layout with their appearance. The syntactic
elements of a programming language are determined by the “computation model” and “pragmatic
concerns”. There are well-developed tools for the description of the syntax of programming
languages. Examples of such tools are regular, context-free and attribute grammars.
________________ is concerned with the structure of programs and layout with their appearance.
The purpose of syntactic structure is to determine the structure of the input text. This structure
consists of a hierarchy of phrases, the smallest of which are the basic symbols and the largest of
which is the sentence.
The structure can be described by a tree with one node for each phrase. Basic symbols are
represented by values stored at the nodes. The root of the tree represents the sentence.
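The tree just described can be sketched in a few lines of Python. This is only an illustration: the `Node` class, the `leaves` helper and the sample sentence "x + 1" are assumptions made for the example and do not come from the course material.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One phrase of the input; `symbol` is set only for basic symbols."""
    phrase: str
    symbol: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Collect the basic symbols of the tree from left to right."""
    if node.symbol is not None:
        return [node.symbol]
    result: List[str] = []
    for child in node.children:
        result.extend(leaves(child))
    return result


# Hypothetical sentence "x + 1": the root is the sentence,
# and the basic symbols are stored as values at the leaf nodes.
sentence = Node("sentence", children=[
    Node("operand", symbol="x"),
    Node("operator", symbol="+"),
    Node("operand", symbol="1"),
])
print(leaves(sentence))  # ['x', '+', '1']
```

Reading the leaves from left to right gives back the input text, while the inner nodes record how the basic symbols group into phrases.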
However, programmers and compiler writers need to know the actual symbols used in programs
(that is, the concrete syntax). For instance, a grammar defining the concrete syntax of arithmetic
expressions is grammar G1.
Example 2.1:
V={E}
T = { c, id, +, *, (, ) }
P = {E --> c,
E --> id,
E --> (E),
E --> E + E,
E --> E * E }
S=E
Let’s assume that c and id stand for any constants and identifiers respectively. Concrete syntax is
concerned with the hierarchical relationships and the particular symbols used.
The abstract syntax describes the structure of an “abstract syntax tree”, much the way the concrete
syntax describes the “phrase structure of the input”. Thus, computations over the input can be
written with attribute grammar specifications that are based on an abstract syntax.
A tool called “Map tool” automatically generates the abstract syntax tree, based on an analysis of
the concrete and abstract syntaxes and on user specifications given in files of type `.map'.
This tool can convert ASDL descriptions into the appropriate data-structure definitions and into
functions that convert the data structures to and from a standard flattened representation. This
makes it easier to build compiler components that interoperate.
The Abstract Syntax Description Language (ASDL) describes the abstract syntax of compiler
intermediate representations (IRs) and other tree-like data structures. The abstract syntax, in
turn, describes the structure of an abstract syntax tree, much the way the concrete syntax describes
the phrase structure of the input.
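The idea of pairing data-structure definitions with functions that convert to and from a flattened form can be sketched as follows. This is a hand-written illustration, not the output of any ASDL tool: the `Const` and `Add` types and the prefix-list format used by `flatten` and `unflatten` are assumptions chosen for the example, and real ASDL tools define their own flattened format.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Const:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Add]


def flatten(e: Expr) -> List[str]:
    """Write a tree as a flat list of strings in prefix order."""
    if isinstance(e, Const):
        return ["const", str(e.value)]
    return ["add"] + flatten(e.left) + flatten(e.right)


def unflatten(items: List[str]) -> Expr:
    """Rebuild the tree from the flat list produced by flatten()."""
    def read(pos: int) -> Tuple[Expr, int]:
        if items[pos] == "const":
            return Const(int(items[pos + 1])), pos + 2
        left, pos = read(pos + 1)   # skip the "add" tag
        right, pos = read(pos)
        return Add(left, right), pos

    tree, _ = read(0)
    return tree


tree = Add(Const(1), Add(Const(2), Const(3)))
flat = flatten(tree)
print(flat)                     # ['add', 'const', '1', 'add', 'const', '2', 'const', '3']
print(unflatten(flat) == tree)  # True
```

Two components that agree on the flattened form can exchange trees without sharing any in-memory representation, which is the interoperability benefit mentioned above.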
The tool converts ASDL into C, C++, Java, and ML data-structure definitions and conversion
functions; a graphical browser-editor of ASDL data structures has also been built. ASDL shares
features found in many network interface description languages (IDLs), algebraic data types, and
languages such as ASN.1 and SGML. Compared to other alternatives, ASDL is simple and powerful. The
main point of abstract syntax is to:
Example 2.2:
V={E}
T = { c, id, add, mult}
P = {E --> c,
E --> id,
E --> add E E ,
E --> mult E E }
S=E
The key difference in the use of concrete and abstract grammars is best illustrated by comparing the
derivation tree and the abstract syntax tree for the expression id + (id * id) in Table 2.1 below.
Table 2.1: The Key Differences between Concrete and Abstract Syntax
2. Concrete syntax: defines the way programs are written.
   Abstract syntax: describes the pure structure of a program by specifying the logical relation
   between parts of the program.
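The step from concrete to abstract syntax can be sketched with a small parser that reads the concrete symbols of Example 2.1 and builds a tree shaped like the abstract grammar of Example 2.2. The sketch rests on assumptions not stated in the text: because grammar G1 is ambiguous, the parser gives * higher precedence than + and groups both operators to the left, and the names `tokenize` and `parse` are illustrative.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Split the input into the terminals of G1: c, id, +, *, ( and )."""
    return re.findall(r"id|c|[+*()]", text)


def parse(tokens: List[str]):
    """Return the abstract syntax tree as nested lists, e.g. ['add', 'id', 'id']."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        tok = peek()
        if tok in ("c", "id"):
            advance()
            return tok
        if tok == "(":
            advance()
            tree = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return tree  # parentheses leave no trace in the abstract tree
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        tree = factor()
        while peek() == "*":
            advance()
            tree = ["mult", tree, factor()]
        return tree

    def expr():
        tree = term()
        while peek() == "+":
            advance()
            tree = ["add", tree, term()]
        return tree

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree


print(parse(tokenize("id + (id * id)")))  # ['add', 'id', ['mult', 'id', 'id']]
```

The concrete symbols +, *, ( and ) disappear from the result; only the logical relation between the parts of the expression remains.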
Context-free grammars are sufficient to describe most programming language constructs. However,
they cannot specify the context-sensitive aspects of a language. A grammar consists of:
1. Grammatical categories (e.g. noun phrase, verb phrase, article, noun, verb etc.),
2. Individual words (elements of the alphabet),
3. Rules for describing the order in which elements of the grammatical categories must appear,
4. A most general grammatical category.
For the grammar G0, the grammatical categories are: S, NP, VP, D, N, V. The words are: a, the, cat,
mouse, ball, boy, girl, ran, bounced, caught. The grammar rules are below:
S --> NP VP
NP --> N
NP --> D N
VP --> V
VP --> V NP
V --> ran | bounced | caught
D --> a | the
N --> cat | mouse | ball | boy | girl
Using the grammar G0, the sentence “the cat caught the mouse” can be generated as follows:
S ==> NP VP
  ==> D N VP
  ==> the N VP
  ==> the cat VP
  ==> the cat V NP
  ==> the cat caught NP
  ==> the cat caught D N
  ==> the cat caught the N
  ==> the cat caught the mouse
This derivation is performed in a leftmost manner. That is, in each step the leftmost variable in the
sentential form is replaced. Sometimes a derivation is more readable if it is displayed in the form of
a derivation tree.
                 S
               /   \
           NP         VP
          /  \      /    \
         D    N  V         NP
         |    |  |        /  \
        the  cat caught  D    N
                         |    |
                        the  mouse
The notion of a tree-based derivation can be formalized. When there are two or more leftmost
derivations of a string in a given grammar or, equivalently, two distinct derivation trees for the
same sentence, the grammar is said to be ambiguous. In some instances, ambiguity may be eliminated
by selecting another grammar for the language or by adding rules, which may not be context-free
rules.
A context-free grammar G is said to be ambiguous if there exists some w in L(G) which has two
distinct derivation trees.
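The definition can be checked mechanically for the expression grammar G1 of Example 2.1. The sketch below counts the distinct derivation trees of a token sequence; the function name `count_trees` and the counting scheme (try every operator as the top-level split) are choices made for this illustration.

```python
from functools import lru_cache
from typing import Sequence


def count_trees(tokens: Sequence[str]) -> int:
    """Count the derivation trees of `tokens` under grammar G1."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of ways to derive tokens[i:j] from E.
        n = 0
        if j - i == 1 and tokens[i] in ("c", "id"):
            n += 1                                   # E --> c, E --> id
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            n += count(i + 1, j - 1)                 # E --> (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                n += count(i, k) * count(k + 1, j)   # E --> E + E, E --> E * E
        return n

    return count(0, len(tokens))


print(count_trees(["id", "+", "id", "*", "id"]))            # 2
print(count_trees(["id", "+", "(", "id", "*", "id", ")"]))  # 1
```

The sentence id + id * id has two derivation trees, so G1 is ambiguous; with parentheses the grouping is fixed and only one tree remains.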
Grammars are rewriting rules. They are independent of computational models and are useful for
describing the structure of languages in general.
Grammars may be used both for the generation and for the recognition (parsing) of sentences. Both
generation and recognition require finding a rewriting sequence, consisting of applications of the
rewriting rules, which begins with the grammar's start symbol and ends with the sentence.
The recognition of a program in terms of the grammar is called parsing. An algorithm which
recognizes programs is called a parser. A parser either implicitly or explicitly builds a derivation
tree for the sentence.
There are two approaches to parsing: top-down parsing and bottom-up parsing. In top-down parsing,
the parser begins with the start symbol of the grammar and attempts to generate the same sentence
that it is attempting to recognize. In bottom-up parsing, it tries to match the input to the
right-hand sides of the productions, building a derivation tree in reverse.
The top-down parse above displays both the parse tree and the remaining unrecognized input.
The input is scanned from left to right one token at a time. Each line in the figure represents a
single step in the parse. Each non-terminal is replaced by the right-hand side defining it. Each time
a terminal matches the input, the corresponding token is removed from the input.
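These steps can be sketched as a small backtracking top-down parser for the grammar G0. The dictionary `RULES` and the function `parse_top_down` are names invented for this sketch, and backtracking is only one of several ways to choose between alternative right-hand sides.

```python
from typing import Dict, List, Optional

# Grammar G0: each non-terminal maps to its alternative right-hand sides.
RULES: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"], ["N"]],
    "VP": [["V", "NP"], ["V"]],
    "V": [["ran"], ["bounced"], ["caught"]],
    "D": [["a"], ["the"]],
    "N": [["cat"], ["mouse"], ["ball"], ["boy"], ["girl"]],
}


def parse_top_down(form: List[str], words: List[str],
                   steps: List[str]) -> Optional[List[str]]:
    """Expand the leftmost symbol of `form` until it matches `words`.

    Returns the list of productions applied, or None if there is no parse.
    """
    if not form:
        return steps if not words else None
    head, rest = form[0], form[1:]
    if head not in RULES:
        # Terminal: it must match the next token, which is then removed.
        if words and words[0] == head:
            return parse_top_down(rest, words[1:], steps)
        return None
    for rhs in RULES[head]:
        # Non-terminal: replace it by one of its right-hand sides.
        step = f"{head} --> {' '.join(rhs)}"
        result = parse_top_down(rhs + rest, words, steps + [step])
        if result is not None:
            return result
    return None


for step in parse_top_down(["S"], "the cat caught the mouse".split(), []):
    print(step)
```

The productions are printed in the order in which they were applied, starting with S --> NP VP; a string that is not a sentence of G0 makes the function return None.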
The bottom-up parse above displays both the parse tree and the remaining unrecognized input.
Note that the parse tree is constructed upside down, i.e., the parse tree is built in reverse.
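A bottom-up parse of the same grammar G0 can be sketched as a shift-reduce search. The sketch makes two assumptions of its own: the names `RULES` and `parse_bottom_up` are illustrative, and the parser backtracks when a reduction leads nowhere instead of using the lookahead tables found in practical bottom-up parsers.

```python
from typing import List, Optional, Tuple

# Grammar G0 as (left-hand side, right-hand side) pairs.
RULES: List[Tuple[str, List[str]]] = [
    ("S", ["NP", "VP"]),
    ("NP", ["D", "N"]), ("NP", ["N"]),
    ("VP", ["V", "NP"]), ("VP", ["V"]),
    ("V", ["ran"]), ("V", ["bounced"]), ("V", ["caught"]),
    ("D", ["a"]), ("D", ["the"]),
    ("N", ["cat"]), ("N", ["mouse"]), ("N", ["ball"]),
    ("N", ["boy"]), ("N", ["girl"]),
]


def parse_bottom_up(stack: List[str], words: List[str],
                    steps: List[str]) -> Optional[List[str]]:
    """Shift tokens onto `stack` and reduce right-hand sides until only S is left."""
    if stack == ["S"] and not words:
        return steps
    for lhs, rhs in RULES:
        # Reduce: the top of the stack matches the right-hand side of a production.
        if stack[-len(rhs):] == rhs:
            step = f"reduce {' '.join(rhs)} to {lhs}"
            result = parse_bottom_up(stack[:-len(rhs)] + [lhs], words, steps + [step])
            if result is not None:
                return result
    if words:
        # Shift: move the next input token onto the stack.
        return parse_bottom_up(stack + [words[0]], words[1:],
                               steps + [f"shift {words[0]}"])
    return None


for step in parse_bottom_up([], "the cat caught the mouse".split(), []):
    print(step)
```

The final step printed is the reduction of NP VP to S, which is the root of the tree; this is the sense in which the tree is built in reverse.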
2.1.4 Expression
An expression is one of the most basic building blocks of computer programs. It is central to virtually
all computer programs. Expressions can enable you to:
1+1
An expression like this is not very useful by itself in a computer language. Unlike people, who can
easily recognize “one plus one” and fill in the blank (“equals two”), computers are not capable of
that kind of leap of logic. For the expression to be useful, it needs to tell a computer not just to add
one and one, but to store the result somewhere so that we can make use of it later (either by
displaying it to the user or using it in another expression later).
An expression is defined as a number of operands or data items combined using several operators.
There are basically three types of notation for an expression:
1) Infix notation
2) Prefix notation
3) Postfix notation
Infix notation: This is the most common notation, in which the operator is written in between the
two operands. For example, the expression to add two numbers A and B is written in infix notation
as A+B, where A and B are the operands and + is the operator. Because the operator is placed in
between the operands A and B, this notation is called infix.
Prefix notation: This is also called Polish notation. It refers to the notation in which the
operator is placed before the operands, as in +AB. As the operator ‘+’ is placed before the
operands A and B, this notation is called prefix (pre means before).
Postfix notation: In the postfix notation the operators are written after the operands, as in AB+,
so it is called the postfix notation (post means after). It is also known as suffix notation or
reverse Polish notation.
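The three notations are different ways of writing out the same expression tree. As a minimal sketch, assume an expression is either an operand (a string) or a tuple of the form (operator, left, right); this representation and the function names are choices made for the example.

```python
def infix(e) -> str:
    """Operator between the operands, fully parenthesized."""
    if isinstance(e, str):
        return e
    op, left, right = e
    return f"({infix(left)} {op} {infix(right)})"


def prefix(e) -> str:
    """Operator before the operands (Polish notation)."""
    if isinstance(e, str):
        return e
    op, left, right = e
    return f"{op} {prefix(left)} {prefix(right)}"


def postfix(e) -> str:
    """Operator after the operands (reverse Polish notation)."""
    if isinstance(e, str):
        return e
    op, left, right = e
    return f"{postfix(left)} {postfix(right)} {op}"


expression = ("+", "A", ("*", "B", "C"))
print(infix(expression))    # (A + (B * C))
print(prefix(expression))   # + A * B C
print(postfix(expression))  # A B C * +
```

Only the infix form needs parentheses to show the grouping; in prefix and postfix notation the position of each operator already determines its operands.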
In computer science, the term semantics refers to the meaning of languages, as opposed to their
form (syntax). According to Euzenat, semantics "provides the rules for interpreting the syntax
which do not provide the meaning directly but constrains the possible interpretations of what is
declared." In other words, semantics is about interpretation of an expression.
4. The relation between computation and the underlying mathematical structures from fields such as
logic, set theory, model theory, category theory etc.
Structural operational semantics (or small-step semantics) formally describes how the individual
steps of a computation take place in a computer-based system. The operational semantics of a
programming language defines how a valid program is interpreted as sequences of computational
steps. These sequences are the meaning of the program. In the context of functional programs, the
final step in a terminating sequence returns the value of the program.
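In opposition, natural semantics (or big-step semantics) describes how the overall results of the
executions are obtained.
In general, there can be many return values for a single program, because the program could be
nondeterministic; even a deterministic program can have many computation sequences, since the
semantics may not specify exactly which sequence of operations arrives at a value.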
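The difference between the two styles can be sketched for the small expression language of Example 2.2. In this illustration an expression is an integer constant or a tuple ("add" or "mult", left, right); the functions `step`, `run_small_step` and `evaluate`, and the choice to reduce the left operand first, are assumptions of the sketch.

```python
def step(e):
    """Small-step: perform exactly one computation step on a non-constant expression."""
    op, left, right = e
    if not isinstance(left, int):
        return (op, step(left), right)    # first finish the left operand
    if not isinstance(right, int):
        return (op, left, step(right))    # then the right operand
    return left + right if op == "add" else left * right


def run_small_step(e):
    """Return the whole sequence of steps; the last element is the value."""
    trace = [e]
    while not isinstance(e, int):
        e = step(e)
        trace.append(e)
    return trace


def evaluate(e):
    """Big-step: relate an expression directly to its overall result."""
    if isinstance(e, int):
        return e
    op, left, right = e
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "add" else a * b


program = ("add", 1, ("mult", 2, 3))
print(run_small_step(program))  # [('add', 1, ('mult', 2, 3)), ('add', 1, 6), 7]
print(evaluate(program))        # 7
```

The small-step version exposes every intermediate expression, and its final step returns the value of the program; the big-step version yields only the overall result.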
Therefore, the distinctions between the three broad classes of approaches can sometimes be
ambiguous, but all known approaches to formal semantics use the above techniques, or some
combination thereof.
1. Language description is a type of “computer language” used primarily in the printing
industry. Description languages are often created using binary or textual commands.
2. Like the hypertext mark-up language (HTML), description mark-up languages are capable
of speaking to a limited number of programs.
3. Abstract Syntax Description Language (ASDL) describes the abstract syntax of compiler
intermediate representations (IRs) and other tree-like data structures.
4. The ordering of symbols within a token (lexical units) is described by regular expressions
while the ordering of symbols within a program is described by context-free grammars.
5. An expression is defined as a “number of operands or data items combined using several
operators”. There are basically three types of notation for an expression: 1) Infix notation,
2) Prefix notation, and 3) Postfix notation.
Pilot Answers
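b. Axiomatic Semantics: This gives meaning to phrases by describing the logical axioms that apply
to them. Axiomatic semantics makes no distinction between a phrase's meaning and
the logical formulas that describe it. Its meaning is exactly what can be proven about it in
some logic. The canonical example of axiomatic semantics is Hoare logic.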
c. Operational Semantics: The execution of the language is described directly (rather than by
translation). Operational semantics loosely corresponds to interpretation, although the
"implementation language" of the interpreter is generally a mathematical formalism.
Operational semantics may define an abstract machine such as the SECD (Stack,
Environment, Code, and Dump - the internal registers of the machine). This gives meaning
to phrases by describing the transitions they induce on states of the machine.
Operational semantics can be classified in two categories:
Glossary of Terms
Axiomatic Semantics: This gives meaning to phrases by describing the logical axioms that apply
to them.
Binary language is a basic form of computer code in which information is represented as a series
of ones and zeros.
Concrete syntax is concerned with the hierarchical relationships and the particular symbols used.
Hoare logic is a formal system with a set of logical rules for reasoning rigorously about the
correctness of computer programs.
Language description is a type of “computer language” used primarily for the printing industry.
Natural Semantics (or big-step semantics) describes how the overall results of the executions are
obtained.
Structural Operational Semantics (or small-step semantics) formally describes how the individual
steps of a computation take place in a computer-based system.
2. Just as HTML is used primarily to communicate with web browsers, most language description
varieties can be read by a particular ………...
3. When a command is typed into a keyboard, what series tells the computer what to do?
9. What is the key difference between regular expressions and context-free grammars?
2. List and explain briefly the three approaches to formal semantics of programming language.
SAQ 2.2