0% found this document useful (0 votes)
49 views

Lecture 01

The document provides an overview of the CS143 Compilers course at Stanford University. It includes information about the instructor, teaching assistants, schedule, assignments, exams, textbooks, course structure, goals, and academic honesty policy. It also provides a high-level introduction to the major components of a compiler, including lexical analysis, parsing, semantic analysis, optimization, and code generation.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
49 views

Lecture 01

The document provides an overview of the CS143 Compilers course at Stanford University. It includes information about the instructor, teaching assistants, schedule, assignments, exams, textbooks, course structure, goals, and academic honesty policy. It also provides a high-level introduction to the major components of a compiler, including lexical analysis, parsing, semantic analysis, optimization, and code generation.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 35

Compilers

CS143
Tuesday/Thursday 9:45–11:15

Instructor: Fredrik Kjolstad


1
The slides in this course are designed by
Alex Aiken,
with modifications by Fredrik Kjolstad.

2
Staff

• Instructor
– Fredrik Kjolstad

• TAs
– Scott Kovach
– Wonyeol Lee
– Nikhil Raghuraman
– Toby Bell
– Timothy Gu

3
Administrivia

• Syllabus is on-line
– cs143.stanford.edu
– Assignment dates will not change
– Midterm (Thursday April 28)
– Final

• Office hours
– 22 office hours spread throughout the week
– Some zoom office hours where SCPD students get preference
– My office hours this week: Thursday 4-5pm (zoom) and Friday 10-11am
(Gates 486)
– Office hours starting next week to be announced

• Communication
– Use discussion forum, email, zoom, office hours
4
Webpages and servers

• Course webpage at cs143.stanford.edu


– Syllabus, lecture slides, handouts, assignments, and policies

• Canvas at canvas.stanford.edu
– Lecture recordings available under the Panopto Course Videos tab

• Ed Discussion at https://edstem.org/us/courses/21322/discussion/
– This is where you should ask most questions
– Also accessible from Canvas

• Gradescope at gradescope.com
– This is where you will hand in written assignments

• Computing Resources at myth.stanford.edu


– We will use myth for the programming assignments
– Class folder: /afs/ir/class/cs143/
5
Text

• The Purple Dragon Book

• Aho, Lam, Sethi & Ullman

• Not required
– But a useful reference

6
Course Structure

• Course has theoretical and practical aspects

• Need both in programming languages!

• Written assignments + exams = theory

• Programming assignments = practice

7
Course Goal

• Open the lid of compilers and see inside


– Understand what they do
– Understand how they work
– Understand how to build them

• Correctness over performance


– Correctness is essential in compilers
– They must produce correct code
– CS143 is more like CS103+CS110 than CS107
– Other classes focus on performance (CS149, CS243)

8
Academic Honesty

• Don’t use work from uncited sources

• We may use plagiarism detection software


– many cases in past offerings

PLAGIARISM

9
The Course Project

• You will write your own compiler!

• One big project

• … in 4 parts

• Start early!

PA1 PA2 PA3 PA4


lexer parser type checker code generation
10
How are Languages Implemented?

• Two major strategies:


– Interpreters run your program
– Compilers translate your program

Program
Program Compiler Binary Code
Interpreter
Machine Machine
Machine

11
Language Implementations

• Compilers dominate low-level languages


– C, C++, Go, Rust

• Interpreters dominate high-level languages


– Python, Ruby

• Some language implementations provide both


– Java, Javascript, WebAssembly
– Interpreter + Just in Time (JIT) compiler

12
History of High-Level Languages

• 1954: IBM develops the 704

• Problem
– Software costs exceeded
hardware costs!

• All programming done in


assembly

13
The Solution

• Enter “Speedcoding”

• An interpreter

• Ran 10-20 times slower than hand-written


assembly

14
FORTRAN I

• Enter John Backus

• Idea
– Translate high-level code to
assembly

– Many thought this


impossible

– Had already failed in other


projects

15
FORTRAN I (Cont.)

• 1954-7
– FORTRAN I project

• 1958
– >50% of all software is in
FORTRAN

• Development time halved

• Performance close to
hand-written assembly!

16
FORTRAN I

• The first compiler


– Huge impact on computer science

• Led to an enormous body of theoretical and practical


work

• Modern compilers preserve the outlines of FORTRAN I

• Can you name a modern compiler?

17
The Structure of a Compiler

1. Lexical Analysis — identify words


2. Parsing — identify sentences
3. Semantic Analysis — analyse sentences
4. Optimization — editing
5. Code Generation — translation

Can be understood by analogy to how


humans comprehend English.

18
Lexical Analysis

• First step: recognize words.


– Smallest unit above letters

This is a sentence.

19
More Lexical Analysis

• Lexical analysis is not trivial. Consider:


ist his ase nte nce

20
And More Lexical Analysis

• Lexical analyzer divides program text into


“words” or “tokens”
If x == y then z = 1; else z = 2;

• Units:

21

Parsing

• Once words are understood, the next step is to


understand sentence structure

• Parsing = Diagramming Sentences


– The diagram is a tree

22
Diagramming a Sentence

This line is a longer sentence

article noun verb article adjective noun

subject object

sentence

23
Parsing Programs

• Parsing program expressions is the same


• Consider:
If x == y then z = 1; else z = 2;
• Diagrammed:
x == y z 1 z 2
relation assign assign

predicate then-stmt else-stmt


if-then-else
24

Semantic Analysis

• Once sentence structure is understood, we can


try to understand “meaning”
– But meaning is too hard for compilers

• Compilers perform limited semantic analysis to


catch inconsistencies

25
Semantic Analysis in English

• Example:
Jack said Jerry left his assignment at home.
What does “his” refer to? Jack or Jerry?

• Even worse:
Jack said Jack left his assignment at home?
How many Jacks are there?
Which one left the assignment?

26
Semantic Analysis in Programming

• Programming {
languages define strict int Jack = 3;
rules to avoid such {
ambiguities
int Jack = 4;
cout << Jack;
• This C++ code prints
“4”; the inner definition }
is used }

27





More Semantic Analysis

• Compilers perform many semantic checks


besides variable bindings

• Example:
Jack left her homework at home.

• Possible type mismatch between her and Jack


– If Jack is male

28
Optimization

• Akin to editing
– Minimize reading time
– Minimize items the reader must keep in short-term
memory

• Automatically modify programs so that they


– Run faster
– Use less memory
– In general, to use or conserve some resource

• The project has no optimization component


– CS243: Program Analysis and Optimization 29
Optimization Example

X = Y * 0 is the same as X = 0

(the * operator is annihilated by zero)

Is this optimization legal?

30

Code Generation

• Typically produces assembly code

• Generally a translation into another language


– Analogous to human translation

31
Intermediate Representations

• Many compilers perform translations between


successive intermediate languages
– All but first and last are intermediate representations
(IR) internal to the compiler
Source

• IRs are generally ordered in


descending level of abstraction IR

– Highest is source …
– Lowest is assembly IR

Assembly
32
Intermediate Representations (Cont.)

• IRs are useful because lower levels expose


features hidden by higher levels
– registers
– memory layout
– raw pointers
– etc.

• But lower levels obscure high-level meaning


– Classes
– Higher-order functions
– Even loops…

33
Issues

• Compiling is almost this simple, but there are


many pitfalls

• Example: How to handle erroneous programs?

• Language design has big impact on compiler


– Determines what is easy and hard to compile
– Course theme: many trade-offs in language design

34
Compilers Today

• The overall structure of almost every compiler


adheres to our outline

• The proportions have changed since FORTRAN


– Early: lexing and parsing most complex/expensive

– Today: optimization dominates all other phases, lexing


and parsing are well understood and cheap

• Compilers are now also found inside libraries

35

You might also like