E6998-3: Advanced Topics in Programming Languages and Compilers
2 Al Aho
Outline
• Course objectives
• Course requirements
• Computational thinking
• Issues in language design and specification
• Overview of a compiler
• Overview of lambda calculus
Course Objectives
• Computational thinking in language design
• Understanding how modern language and compiler
technology can be used to make software reliably,
securely, and efficiently
• Motivation for new programming languages such as Go
and Clojure
• Understanding modern program translation techniques
and tools
• Awareness of language and compiler issues in dealing
with parallelism and concurrency
• A highlight of this course is a semester-long project in
which you can explore in some depth an advanced topic in
programming languages and compilers of your own
choosing.
Potential Project Topics
• Lambda calculus and functional languages
• Concurrency and parallelism
• Program analysis techniques
• New programming languages such as Go and Clojure
• New program translation tools such as KPEG and PLY
• Model checking
• Satisfiability modulo theory solvers
• Abstract interpretation
• Report on a “most influential PLDI paper”
– http://www.sigplan.org/Awards/Conferences/PLDI/Main
Recent Most Influential PLDI Papers
• Extended Static Checking for Java
• Automatic predicate abstraction of C programs
• Dynamo: A Transparent Dynamic Optimization System
• A Fast Fourier Transform Compiler
• The Implementation of the Cilk-5 Multithreaded Language
• Exploiting Hardware Performance Counters with Flow and Context Sensitive
Profiling
• TIL: A Type-Directed Optimizing Compiler for ML
• Selective Specialization for Object-Oriented Languages
• ATOM: a system for building customized program analysis tools
• Space Efficient Conservative Garbage Collection
• Lazy Code Motion
• A data locality optimizing algorithm
[http://www.sigplan.org/Awards/Conferences/PLDI/Main]
Additional Project Topics
• Garbage collection
• Data-flow analysis schemas
• Instruction-level parallelism
• Optimizing for parallelism and locality
• Interprocedural analysis
• Intermediate representations
– Functional IRs
• New compiler development tools
• Compiler collections
– GCC
– LLVM
– .NET
Prerequisites and Background Text
Course Project and Grade
The Age of Computational Thinking
Computational advertising
Computational biology
Computational chemistry
Computational linguistics
Computational neuroscience
Computational physics
Computational science
Computational thinking in programming
language design
Computational Thinking – Jeannette Wing
A. V. Aho
Computation and Computational Thinking
The Computer Journal 55:12, pp. 832-835, 2012
A Good Way to Learn Computational Thinking
Computational Thinking in Language Design
Problem Domain
Computational Model
Algorithms
Programming Language
Computational Model of AWK
AWK is a simple language designed to perform routine
data-processing tasks on strings and numbers
{ total[$1] += $2 }
END { for (x in total) print x, total[x] }
eve 20
bob 15
alice 40
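The computational model behind this AWK program (an implicit loop over input records, automatic field splitting, and an associative array) can be sketched in Python. This is only an illustration: the input records below are hypothetical, since the slide shows just the three output-like lines, and the names `sum_by_key` and `report` are made up for this sketch. It also assumes every record has at least two fields.

```python
from collections import defaultdict


def sum_by_key(lines):
    """Mimic the AWK action { total[$1] += $2 } over a sequence of records."""
    total = defaultdict(float)       # like an AWK array: missing entries start at 0
    for line in lines:               # AWK's implicit loop over input records
        fields = line.split()        # default field splitting on whitespace
        total[fields[0]] += float(fields[1])   # assumes two fields per record
    return dict(total)


def report(total):
    """Mimic END { for (x in total) print x, total[x] }."""
    # AWK leaves the iteration order unspecified; Python keeps insertion order.
    return [f"{name} {value:g}" for name, value in total.items()]


# Hypothetical input records (not taken from the slide).
records = ["alice 10", "eve 20", "bob 15", "alice 30"]
totals = sum_by_key(records)
lines_out = report(totals)
```

With these assumed records the totals are eve 20, bob 15 and alice 40, the same three lines shown on the slide, although possibly in a different order.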
Theory in Practice: Regular Expression Pattern
Matching in Perl, Python, Ruby vs. AWK
Time to check whether $a?^n a^n$ matches $a^n$
Russ Cox, Regular expression matching can be simple and fast (but is slow in Java,
Perl, PHP, Python, Ruby, ...) [http://swtch.com/~rsc/regexp/regexp1.html, 2007]
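A minimal sketch of the automaton-based approach that Cox's article contrasts with backtracking: instead of trying one path at a time, track the set of all pattern positions reachable after each input character. The function below is hard-coded for the pattern family n optional a's followed by n mandatory a's; it is not a general regular expression compiler, and the name `nfa_match` is an assumption of this sketch.

```python
import re


def nfa_match(n, text):
    """Match text against n optional a's followed by n mandatory a's.

    State i means the first i pattern items have been handled; items
    0..n-1 are the optional "a?" and items n..2n-1 are the mandatory "a".
    """
    def closure(states):
        # An optional item may be skipped without consuming input.
        result = set(states)
        for i in range(n):           # increasing order follows chains of skips
            if i in result:
                result.add(i + 1)
        return result

    states = closure({0})
    for ch in text:
        if ch != "a":
            return False
        # Every unfinished item can consume one 'a'.
        states = closure({i + 1 for i in states if i < 2 * n})
    return 2 * n in states


def backtracking_match(n, text):
    """The same question answered by Python's backtracking re module."""
    return re.fullmatch("a?" * n + "a" * n, text) is not None
```

Each input character costs at most O(n) set operations here, so the whole match is polynomial in n, whereas a backtracking matcher may explore exponentially many ways of skipping the optional items.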
Evolution of Programming Languages
1970          2012
Fortran       C
Lisp          Java
Cobol         Objective-C
Algol 60      C++
APL           C#
Snobol 4      PHP
Simula 67     Basic
Basic         Python
PL/1          Perl
Pascal        Ruby
Issues in Language Design
• Domain of application
– exploit domain restrictions for expressiveness, performance
• Computational model
– simplicity, ease of expression
• Abstraction mechanisms
– reuse, suggestivity
• Type system
– reliability, security
• Usability
– readability, writability, efficiency
Computational Models in Languages
Prolog: Logic
Evolutionary Forces
Target Languages
CISCs
RISCs
Parallel machines
Multicores
GPUs
Quantum computers
Programming Languages Today
“99 Bottles of Beer”
99 bottles of beer on the wall, 99 bottles of beer.
Take one down and pass it around, 98 bottles of beer on the wall.
“99 Bottles of Beer” in Perl
''=~( '(?{' .('`' |'%') .('[' ^'-')
.('`' |'!') .('`' |',') .'"'. '\\$'
.'==' .('[' ^'+') .('`' |'/') .('['
^'+') .'||' .(';' &'=') .(';' &'=')
.';-' .'-'. '\\$' .'=;' .('[' ^'(')
.('[' ^'.') .('`' |'"') .('!' ^'+')
.'_\\{' .'(\\$' .';=('. '\\$=|' ."\|".( '`'^'.'
).(('`')| '/').').' .'\\"'.+( '{'^'['). ('`'|'"') .('`'|'/'
).('['^'/') .('['^'/'). ('`'|',').( '`'|('%')). '\\".\\"'.( '['^('(')).
'\\"'.('['^ '#').'!!--' .'\\$=.\\"' .('{'^'['). ('`'|'/').( '`'|"\&").(
'{'^"\[").( '`'|"\"").( '`'|"\%").( '`'|"\%").( '['^(')')). '\\").\\"'.
('{'^'[').( '`'|"\/").( '`'|"\.").( '{'^"\[").( '['^"\/").( '`'|"\(").(
'`'|"\%").( '{'^"\[").( '['^"\,").( '`'|"\!").( '`'|"\,").( '`'|(',')).
'\\"\\}'.+( '['^"\+").( '['^"\)").( '`'|"\)").( '`'|"\.").( '['^('/')).
'+_,\\",'.( '{'^('[')). ('\\$;!').( '!'^"\+").( '{'^"\/").( '`'|"\!").(
'`'|"\+").( '`'|"\%").( '{'^"\[").( '`'|"\/").( '`'|"\.").( '`'|"\%").(
'{'^"\[").( '`'|"\$").( '`'|"\/").( '['^"\,").( '`'|('.')). ','.(('{')^
'[').("\["^ '+').("\`"| '!').("\["^ '(').("\["^ '(').("\{"^ '[').("\`"|
')').("\["^ '/').("\{"^ '[').("\`"| '!').("\["^ ')').("\`"| '/').("\["^
'.').("\`"| '.').("\`"| '$')."\,".( '!'^('+')). '\\",_,\\"' .'!'.("\!"^
'+').("\!"^ '+').'\\"'. ('['^',').( '`'|"\(").( '`'|"\)").( '`'|"\,").(
'`'|('%')). '++\\$="})' );$:=('.')^ '~';$~='@'| '(';$^=')'^ '[';$/='`';
Grammars are Used for Specifying Syntax
The grammar S → aSbS | bSaS | ε generates all strings of
a’s and b’s with the same number of a’s as b’s.
This grammar is ambiguous: abab has two parse trees.
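Parse tree 1: S → aSbS, where the first S → bSaS (both inner S → ε) and the second S → ε.
Parse tree 2: S → aSbS, where the first S → ε and the second S → aSbS (both inner S → ε).
$(ab)^n$ has $\frac{1}{n+1}\binom{2n}{n}$ parse trees.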
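The parse-tree count can be checked for small n with a short sketch. It counts trees directly from the grammar S → aSbS | bSaS | ε: a nonempty string must start with some letter, and every later occurrence of the other letter is a candidate split point. The function names are assumptions of this sketch, and the check covers only the values of n actually tried.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_trees(s):
    """Number of parse trees for s under S -> aSbS | bSaS | epsilon."""
    if s == "":
        return 1                     # S -> epsilon
    if s[0] not in "ab":
        return 0
    other = "b" if s[0] == "a" else "a"
    total = 0
    for j in range(1, len(s)):
        if s[j] == other:
            # s = s[0] . s[1:j] . other . s[j+1:], both parts derived from S
            total += count_trees(s[1:j]) * count_trees(s[j + 1:])
    return total


def catalan(n):
    """The closed form from the slide: (1 / (n + 1)) * C(2n, n)."""
    return comb(2 * n, n) // (n + 1)
```

For example, `count_trees("abab")` is 2, matching the two parse trees described above.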
Natural Languages are Inherently Ambiguous
Programming Languages are not
Inherently Ambiguous
This grammar G generates the same language
S → aAbS | bBaS | ε
A → aAbA | ε
B → bBaB | ε
G is unambiguous and has only one parse tree for every sentence in L(G).
Parse tree for abab: S → aAbS, where A → ε and the inner S → aAbS with A → ε and S → ε.
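The difference between the two grammars can be observed on small inputs by counting leftmost derivations, which correspond one-to-one with parse trees. The sketch below is a generic counter for grammars in which every nonempty production starts with a terminal, as is the case for both grammars here; the dictionary encoding and the names `G`, `AMBIGUOUS` and `count_parse_trees` are assumptions of this illustration. It tests ambiguity on specific strings only and proves nothing in general.

```python
from functools import lru_cache

# Nonterminals are the dictionary keys; every other character is a terminal.
G = {
    "S": ("aAbS", "bBaS", ""),
    "A": ("aAbA", ""),
    "B": ("bBaB", ""),
}
AMBIGUOUS = {"S": ("aSbS", "bSaS", "")}


def count_parse_trees(grammar, start, text):
    """Count leftmost derivations (equivalently, parse trees) of text."""
    @lru_cache(maxsize=None)
    def count(form, rest):
        # form: the part of the sentential form still to be matched
        # rest: the unread part of the input
        if not form:
            return 1 if not rest else 0
        if sum(sym not in grammar for sym in form) > len(rest):
            return 0                 # more terminals pending than input left
        head, tail = form[0], form[1:]
        if head in grammar:          # nonterminal: try every production
            return sum(count(body + tail, rest) for body in grammar[head])
        # terminal: it must match the next input character
        return count(tail, rest[1:]) if rest[:1] == head else 0

    return count(start, text)
```

On abab the counter reports one tree for G and two for the earlier grammar.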
Methods for Specifying the Semantics of
Programming Languages
Operational semantics
Program constructs are translated to an understood language.
Axiomatic semantics
Assertions called preconditions and postconditions specify
the properties of statements.
Denotational semantics
Semantic functions map syntactic objects to semantic values.
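The three styles can be contrasted on a toy expression language. The sketch below is an illustration under stated assumptions: the tuple encoding of expressions, the stack machine, and all function names are invented here. The runtime assertions in `abs_value` only mimic the flavor of axiomatic reasoning, which in practice proves such conditions rather than testing them.

```python
# Abstract syntax: a number, or a tuple (op, left, right) with op in {"+", "*"}.


def denote(expr):
    """Denotational style: map a syntactic object to its semantic value."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    lhs, rhs = denote(left), denote(right)
    return lhs + rhs if op == "+" else lhs * rhs


def compile_to_stack(expr):
    """Operational style: translate to a simple, well-understood stack language."""
    if isinstance(expr, (int, float)):
        return [("push", expr)]
    op, left, right = expr
    last = ("add",) if op == "+" else ("mul",)
    return compile_to_stack(left) + compile_to_stack(right) + [last]


def run(code):
    """Execute stack-machine code and return the value left on the stack."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "add" else a * b)
    return stack.pop()


def abs_value(x):
    """Axiomatic flavor: a precondition and a postcondition around a statement."""
    assert isinstance(x, int)                    # precondition
    y = x if x >= 0 else -x
    assert y >= 0 and (y == x or y == -x)        # postcondition
    return y
```

For the expression `("*", 2, ("+", 3, 4))`, both `denote` and `run(compile_to_stack(...))` yield 14, which is the agreement one expects between two semantic descriptions of the same language.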
Phases of a Compiler
source program
→ Lexical Analyzer → token stream
→ Syntax Analyzer → syntax tree
→ Semantic Analyzer → annotated syntax tree
→ Intermediate Code Generator → intermediate representation
→ Code Optimizer → intermediate representation
→ Code Generator → target program
Symbol Table (shared by all phases)
[A. V. Aho, M. S. Lam, R. Sethi, J. D. Ullman, Compilers: Principles, Techniques, & Tools, 2007]
Compiler Component Generators
lex specification → Lexical Analyzer Generator (lex)
yacc specification → Syntax Analyzer Generator (yacc)
Lex Specification for a Desk Calculator
number [0-9]+\.?|[0-9]*\.[0-9]+
%%
[ ] { /* skip blanks */ }
{number} { sscanf(yytext, "%lf", &yylval);
return NUMBER; }
\n|. { return yytext[0]; }
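For readers without lex at hand, roughly the same token rules can be sketched with Python's `re` module. One difference needs care: lex picks the longest match among alternatives, while Python takes the first alternative that matches, so the two halves of the number pattern are swapped below to keep 3.4 as one token. The token representation and the name `tokenize` are assumptions of this sketch.

```python
import re

NUMBER = "NUMBER"

# Rule order mirrors the lex specification: blanks, numbers, then any single
# character (including newline, which "." alone does not match).
TOKEN_RE = re.compile(
    r"(?P<blank> )"
    r"|(?P<number>[0-9]*\.[0-9]+|[0-9]+\.?)"
    r"|(?P<char>\n|.)"
)


def tokenize(text):
    """Return (kind, value) pairs; literal characters are their own kind."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        if m.lastgroup == "blank":
            continue                               # skip blanks
        if m.lastgroup == "number":
            tokens.append((NUMBER, float(m.group())))   # like sscanf with %lf
        else:
            tokens.append((m.group(), None))       # like return yytext[0]
    return tokens
```

Calling `tokenize("1.2 * (3.4 + 5.6)\n")` yields three NUMBER tokens interleaved with the literal tokens `*`, `(`, `+`, `)` and the newline.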
Yacc Specification for a Desk Calculator
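%{
#include <stdio.h>
#define YYSTYPE double   /* assumed: the lex rule stores a double in yylval via "%lf" */
%}
%token NUMBER
%left '+'
%left '*'                /* declared later, so '*' binds tighter than '+' */
%%
lines : lines expr '\n'  { printf("%g\n", $2); }
      | /* empty */
      ;
expr  : expr '+' expr    { $$ = $1 + $3; }
      | expr '*' expr    { $$ = $1 * $3; }
      | '(' expr ')'     { $$ = $2; }
      | NUMBER
      ;
%%
#include "lex.yy.c"
[Stephen C. Johnson, Yacc: Yet Another Compiler-Compiler]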
Creating the Desk Calculator
1.2 * (3.4 + 5.6) → Desk Calculator → Result: 10.8
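The behavior of the generated desk calculator can be approximated by a hand-written evaluator. The sketch below uses precedence climbing rather than the LALR parsing that yacc generates, so it is a swapped-in technique that only reproduces the same precedence and left associativity for + and *. All names are assumptions of this sketch.

```python
import re

# One number or one other non-blank character per match.
TOKEN_RE = re.compile(r"\s*(?:(\d*\.\d+|\d+\.?)|(.))")
PRECEDENCE = {"+": 1, "*": 2}        # '*' binds tighter than '+'


def tokenize(line):
    return [float(num) if num else ch for num, ch in TOKEN_RE.findall(line)]


def evaluate(line):
    """Evaluate one line of +, *, parentheses and unsigned numbers."""
    tokens = tokenize(line)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def primary():
        tok = advance()
        if isinstance(tok, float):
            return tok
        if tok == "(":
            value = expr(1)
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

    def expr(min_prec):
        left = primary()
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            right = expr(PRECEDENCE[op] + 1)   # +1 gives left associativity
            left = left + right if op == "+" else left * right
        return left

    result = expr(1)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result
```

Formatting `evaluate("1.2 * (3.4 + 5.6)")` with `"%g"` gives 10.8, the result shown on the slide.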
Some Computational Thinking Lessons
Learned in COMS W4115
• “Designing a language is hard and designing a simple
language is extremely hard!”
Lambda Calculus − A Quick Overview
Grammar for Lambda Calculus
Function Application and Currying
Lambda Calculus Conventions
Examples of Free and Bound Variables
The Set of Free Variables
Renaming Bound Variables by
Alpha Conversion
• The name of a formal parameter in a function definition
is arbitrary. We can use any variable to name a
parameter, so that the function λx.x is equivalent to
λy.y and λz.z. This kind of renaming is called alpha
conversion.
• Note that we cannot rename free variables in
expressions.
• Also note that we cannot change the name of a bound
variable in an expression to conflict with the name of a
free variable in that expression.
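The three points above can be checked mechanically. In the sketch below, lambda terms are encoded as tuples (an encoding assumed here, not taken from the slides), and two terms are compared up to alpha conversion by replacing each bound variable with its distance to its binder, a de Bruijn-style representation. Free variables keep their names, so renaming them changes the term.

```python
# Term encoding (assumed): ("var", x), ("lam", x, body), ("app", fun, arg).


def free_vars(t):
    """The set of variables occurring free in t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def to_de_bruijn(t, bound=()):
    """Replace bound variable names by binder distances; keep free names."""
    if t[0] == "var":
        name = t[1]
        if name in bound:
            return ("bound", bound.index(name))   # innermost binder comes first
        return ("free", name)
    if t[0] == "lam":
        return ("lam", to_de_bruijn(t[2], (t[1],) + bound))
    return ("app", to_de_bruijn(t[1], bound), to_de_bruijn(t[2], bound))


def alpha_equivalent(s, t):
    """True if s and t differ only in the names of bound variables."""
    return to_de_bruijn(s) == to_de_bruijn(t)
```

Under this encoding λx.x and λy.y are alpha-equivalent, while λx.y and λy.y are not, because renaming x to y would conflict with the free y.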
Substitution
Evaluation of Function Applications by
Beta Reductions
• A function application fg is evaluated by substituting
the argument g for the formal parameter in the body of
the function definition f.
• Example: (λx.x)y → [y/x]x = y
• This substitution in a function application is called a
beta reduction and we use a right arrow to indicate a
beta reduction.
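A single beta reduction step can be sketched as a substitution over a tuple encoding of terms. The encoding and function names are assumptions of this illustration. The substitution here is deliberately naive: it assumes that no free variable of the argument is bound inside the function body, so it does not handle name conflicts.

```python
# Term encoding (assumed): ("var", x), ("lam", x, body), ("app", fun, arg).


def substitute(term, name, value):
    """Compute [value/name]term, assuming no variable capture can occur."""
    if term[0] == "var":
        return value if term[1] == name else term
    if term[0] == "lam":
        if term[1] == name:          # name is rebound here; nothing free inside
            return term
        return ("lam", term[1], substitute(term[2], name, value))
    return ("app",
            substitute(term[1], name, value),
            substitute(term[2], name, value))


def beta_step(term):
    """Reduce a top-level redex (λx.M) N to [N/x]M; leave other terms alone."""
    if term[0] == "app" and term[1][0] == "lam":
        _, (_, param, body), arg = term
        return substitute(body, param, arg)
    return term


# The slide's example: (λx.x) y → [y/x]x = y
identity_applied = ("app", ("lam", "x", ("var", "x")), ("var", "y"))
reduced = beta_step(identity_applied)
```

Here `reduced` is `("var", "y")`, the encoding of y.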
Function Application by Beta Reductions
Eta Conversion and Beta Abstraction
Evaluating Expressions using Renaming
Examples of Evaluating Expressions
using Renaming
• The expression (λx.(λy.xy))y contains a bound y
in the middle and a free y at the right. We can rename
the bound variable y to a new variable, say z, to
evaluate the expression with no name conflicts:
(λx.(λy.xy))y = (λx.(λz.xz))y →
[y/x](λz.xz) = (λz.yz)
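The renaming in this example can be automated with a capture-avoiding substitution: before substituting under a binder whose name occurs free in the argument, the binder is renamed to a fresh variable. The sketch below is one possible formulation; the tuple encoding, the helper names and the fresh-name scheme (z, z1, z2, ...) are assumptions chosen so that the output matches the slide.

```python
import itertools

# Term encoding (assumed): ("var", x), ("lam", x, body), ("app", fun, arg).


def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def fresh(avoid):
    """Pick the first of z, z1, z2, ... that is not in avoid."""
    candidates = itertools.chain(["z"], (f"z{i}" for i in itertools.count(1)))
    return next(name for name in candidates if name not in avoid)


def substitute(term, name, value):
    """Compute [value/name]term, renaming bound variables to avoid capture."""
    if term[0] == "var":
        return value if term[1] == name else term
    if term[0] == "app":
        return ("app",
                substitute(term[1], name, value),
                substitute(term[2], name, value))
    param, body = term[1], term[2]
    if param == name:
        return term                  # name is rebound; no free occurrences inside
    if param in free_vars(value) and name in free_vars(body):
        # Alpha-convert the binder so it cannot capture a free variable of value.
        new = fresh(free_vars(value) | free_vars(body) | {name})
        body = substitute(body, param, ("var", new))
        param = new
    return ("lam", param, substitute(body, name, value))


def beta_step(term):
    """Reduce a top-level redex (λx.M) N to [N/x]M; leave other terms alone."""
    if term[0] == "app" and term[1][0] == "lam":
        return substitute(term[1][2], term[1][1], term[2])
    return term


# (λx.(λy.xy))y from the slide
example = ("app",
           ("lam", "x", ("lam", "y", ("app", ("var", "x"), ("var", "y")))),
           ("var", "y"))
result = beta_step(example)
```

Here `result` encodes λz.yz: the bound y was renamed to z before the free y was substituted for x.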
Remarkable Properties of Lambda Calculus
Evaluation Strategies
Normal Form Evaluation
Applicative Order Evaluation
• References
– Simon Peyton Jones, The Implementation of Functional
Languages, Prentice-Hall, 1987
– Stephen Edwards, The Lambda Calculus
http://www.cs.columbia.edu/~sedwards/classes/2012/w4115-fall/index.html