10 TypeCheck

The document discusses type checking in comparative programming languages, focusing on abstract semantics, operational semantics, and typing rules. It covers concepts such as type safety, type inference, and the differences between static and dynamic type checking, using examples from a simple language syntax. Additionally, it explores the FUN language and its type system, illustrating how types can represent sets of values and the implications for program correctness.
CMPT383 Comparative Programming Languages

Lecture 10: Type Checking

Yuepeng Wang
Spring 2025

1
Overview
• Abstract semantics

• Types and typing

• Operational semantics and typing

• Typing rules for lambdas

• Typing rules for let bindings

• Type safety

2
Motivation
• (Operational) semantics is useful for proving some properties or correctness of certain programs
• Example: prove "if x > 0 then x else -x" computes the absolute value of x

• Rice's Theorem: any non-trivial property of a language recognized by a Turing machine is undecidable

• In other words, given a non-trivial property of a language recognized by a Turing machine, there is no algorithm that can check if the property holds for every program of the language
• Example: check if a C program has a null pointer dereference

3
Abstract Semantics
• Even though we cannot prove some behavior of the original program, we can still prove something over an abstraction of the program

• Instead of concrete operational semantics, we can define abstract semantics to over-approximate the behavior of a program
• Concrete: "let x = 1 in x" evaluates to 1
• Abstract: "let x = 1 in x" evaluates to an integer

• For example, we may not be able to prove that an expression evaluates to a certain value, but we can prove what type of value it evaluates to, e.g., integer, bool, string, etc.

4
Abstract Semantics
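• The 3x+1 problem. Define a function f

$$f(n) = \begin{cases} n / 2 & \text{if } n \text{ is even} \\ 3n + 1 & \text{if } n \text{ is odd} \end{cases}$$

• Question: given a positive integer n, consider the sequence of numbers $n, f(n), f^2(n), f^3(n), \ldots$ Does 1 always occur in the sequence for all n?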

• Example: n = 5, sequence = 5, 16, 8, 4, 2, 1, 4, 2, 1, ...

• Unfortunately, we don't know the answer :(

• But we know every number in the sequence is an integer :)
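
As a small illustration, the 3x+1 step can be sketched in Haskell. The names `collatzStep` and `collatzSeq` are hypothetical and not from the lecture. The point is that the type signature alone guarantees every element of the sequence is an `Int`, even though we cannot say whether 1 always occurs.

```haskell
-- One step of the 3x+1 function: halve even numbers, map odd n to 3n + 1.
collatzStep :: Int -> Int
collatzStep n
  | even n    = n `div` 2
  | otherwise = 3 * n + 1

-- The infinite sequence n, f(n), f^2(n), f^3(n), ...
-- Its type [Int] is the "abstract" fact we can establish without running it.
collatzSeq :: Int -> [Int]
collatzSeq = iterate collatzStep
```

For instance, `take 9 (collatzSeq 5)` reproduces the prefix 5, 16, 8, 4, 2, 1, 4, 2, 1 from the example above.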


5
Over-Approximation
• Under concrete semantics, a program evaluates to a concrete value
• Under abstract semantics, a program evaluates to an abstract value, i.e., a set of concrete values
• Over-approximation requires that the values obtained by the abstract semantics are a superset of those obtained by the concrete semantics

[Figure: the result by concrete semantics, contained in the result by abstract semantics]

• Example: positive integer does not over-approximate value -1, but integer over-approximates -1
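
The superset condition can be sketched in Haskell under some assumptions: the type `AbsInt` and the functions `member` and `overApproximates` are hypothetical names introduced only for this illustration, and only two abstract values are modelled.

```haskell
-- Hypothetical abstract values: each one stands for a set of concrete Ints.
data AbsInt = PositiveInt | AnyInt
  deriving (Eq, Show)

-- Is the concrete value inside the set denoted by the abstract value?
member :: Int -> AbsInt -> Bool
member n PositiveInt = n > 0
member _ AnyInt      = True

-- An abstract result over-approximates some concrete results
-- if it contains every one of them.
overApproximates :: AbsInt -> [Int] -> Bool
overApproximates a = all (`member` a)
```

With these definitions, `overApproximates PositiveInt [-1]` is `False` and `overApproximates AnyInt [-1]` is `True`, matching the example above.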
6
Abstract Semantics
• The abstract semantics is less precise than the concrete operational
semantics

• But if the abstract semantics is "precise enough" for the property that we
want to prove, then it is fine

• Example: if we know x > 10, then we know "if x>10 then x else -1"
evaluates to a positive number

• But we need to know the exact value of x to know the evaluation result of
"if x>10 then x else -1"

7
Abstract Semantics
• If the semantics is abstract enough, then checking properties of the
abstraction of the program is decidable

• i.e., we can find an algorithm that works for all programs

• but the abstract semantics needs to be useful for checking the properties

• The tradeoff is to find a useful but decidable abstraction

8
Types and Abstract Semantics
• A type represents a set of values
• Int represents integers within a range
• Bool represents {true, false}
• String represents all possible strings

• A type can be viewed as an abstract value

• Typing of a program can be viewed as executing the program using abstract semantics
• We can use judgments of the form ⊢ e : T to denote that the type of expression e is T
9
Case Study: Syntax
• Consider a simple language with the following syntax
e ::= 'true' | 'false' | '0'
| 'succ' e | 'pred' e
| 'iszero' e
| 'if' e 'then' e 'else' e

• This language defines a class of simple expressions

• Syntactically, the expression can only use three constants: true, false, 0

• But the expression can evaluate to two types of values
• Bool values: true, false
• Int values: …, −3, −2, −1, 0, 1, 2, 3, …
10
Case Study: Data Type for Programs
• Haskell data type for programs (expressions)

data Expr = CBool Bool
          | CZero
          | Succ Expr
          | Pred Expr
          | IsZero Expr
          | ITE Expr Expr Expr
          deriving (Eq, Ord, Read, Show)

11
Case Study: Operational Semantics
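• Provide an operational semantics without environments

$$\vdash \texttt{true} \Downarrow \texttt{true} \qquad \vdash \texttt{false} \Downarrow \texttt{false} \qquad \vdash 0 \Downarrow 0$$

$$\frac{\vdash e \Downarrow v}{\vdash \texttt{succ}\ e \Downarrow v + 1} \qquad \frac{\vdash e \Downarrow v}{\vdash \texttt{pred}\ e \Downarrow v - 1}$$

$$\frac{\vdash e \Downarrow 0}{\vdash \texttt{iszero}\ e \Downarrow \texttt{true}} \qquad \frac{\vdash e \Downarrow c \quad \text{Int}\ c \quad c \neq 0}{\vdash \texttt{iszero}\ e \Downarrow \texttt{false}}$$

$$\frac{\vdash e_1 \Downarrow \texttt{true} \quad \vdash e_2 \Downarrow v}{\vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 \Downarrow v} \qquad \frac{\vdash e_1 \Downarrow \texttt{false} \quad \vdash e_3 \Downarrow v}{\vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 \Downarrow v}$$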

12
Case Study: Interpreter
• The operational semantics is closely related to an interpreter
Data type for values
data Value = VBool Bool
           | VInt Int
           deriving (Eq, Ord, Read, Show)

Auxiliary functions
inc :: Value -> Value
inc (VInt x) = VInt (x + 1)
dec :: Value -> Value
dec (VInt x) = VInt (x - 1)

Interpreter
eval :: Expr -> Maybe Value
eval (CBool b) = Just (VBool b)
eval CZero = Just (VInt 0)
eval (Succ e) = fmap inc (eval e)
eval (Pred e) = fmap dec (eval e)
eval (IsZero e) = case (eval e) of
    Just (VInt 0) -> Just (VBool True)
    Just (VInt _) -> Just (VBool False)
    _ -> Nothing
eval (ITE c t e) = case (eval c) of
    Just (VBool True) -> eval t
    Just (VBool False) -> eval e
    _ -> Nothing
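
One detail worth noting: `inc` and `dec` above are only defined for `VInt`, so an expression such as `succ true` produces a pattern-match failure when its result is forced, rather than `Nothing`. The sketch below is one possible variant, not the lecture's interpreter; `onInt` and `evalTotal` are hypothetical names. It reports every stuck case as `Nothing`.

```haskell
data Expr = CBool Bool | CZero | Succ Expr | Pred Expr
          | IsZero Expr | ITE Expr Expr Expr
  deriving (Eq, Show)

data Value = VBool Bool | VInt Int
  deriving (Eq, Show)

-- Apply an Int operation, or get stuck (Nothing) on a non-Int value.
onInt :: (Int -> Int) -> Value -> Maybe Value
onInt f (VInt x) = Just (VInt (f x))
onInt _ _        = Nothing

-- Same structure as eval, but succ/pred check that their operand is an Int.
evalTotal :: Expr -> Maybe Value
evalTotal (CBool b)   = Just (VBool b)
evalTotal CZero       = Just (VInt 0)
evalTotal (Succ e)    = evalTotal e >>= onInt (+ 1)
evalTotal (Pred e)    = evalTotal e >>= onInt (subtract 1)
evalTotal (IsZero e)  = case evalTotal e of
  Just (VInt n) -> Just (VBool (n == 0))
  _             -> Nothing
evalTotal (ITE c t e) = case evalTotal c of
  Just (VBool True)  -> evalTotal t
  Just (VBool False) -> evalTotal e
  _                  -> Nothing
```

Here `evalTotal (Succ (CBool True))` is `Nothing`, which matches the reading that getting stuck is a runtime error.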
13
Case Study: Typing Rules
• Provide typing rules without environments

$$\vdash \texttt{true} : \text{Bool} \qquad \vdash \texttt{false} : \text{Bool} \qquad \vdash 0 : \text{Int}$$

$$\frac{\vdash e : \text{Int}}{\vdash \texttt{succ}\ e : \text{Int}} \qquad \frac{\vdash e : \text{Int}}{\vdash \texttt{pred}\ e : \text{Int}} \qquad \frac{\vdash e : \text{Int}}{\vdash \texttt{iszero}\ e : \text{Bool}}$$

$$\frac{\vdash e_1 : \text{Bool} \quad \vdash e_2 : t \quad \vdash e_3 : t}{\vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 : t}$$

• Recall that there are two types of values: Int and Bool
• Getting stuck in typing rules means there is a type error
• If-then-else requires that the condition is a boolean expression and that the two branches have the same type
14
Case Study: Typing Function
Data type for types
data Type = TBool
          | TInt
          deriving (Eq, Ord, Read, Show)

Typing function
typing :: Expr -> Maybe Type
typing (CBool _) = Just TBool
typing CZero = Just TInt
typing (Succ e) = case (typing e) of
    Just TInt -> Just TInt
    _ -> Nothing
typing (Pred e) = case (typing e) of
    Just TInt -> Just TInt
    _ -> Nothing
typing (IsZero e) = case (typing e) of
    Just TInt -> Just TBool
    _ -> Nothing
typing (ITE c t e) = case (typing c) of
    Just TBool -> let v1 = typing t
                      v2 = typing e
                  in if v1 == v2 then v1 else Nothing
    _ -> Nothing
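
To see the typing function at work, here is a self-contained sketch that restates it compactly and applies it to two sample expressions. The helper `intTo` and the names `wellTyped` and `illTyped` are assumptions made for this example; the behaviour is intended to match the function above.

```haskell
data Expr = CBool Bool | CZero | Succ Expr | Pred Expr
          | IsZero Expr | ITE Expr Expr Expr
  deriving (Eq, Show)

data Type = TBool | TInt
  deriving (Eq, Show)

typing :: Expr -> Maybe Type
typing (CBool _)   = Just TBool
typing CZero       = Just TInt
typing (Succ e)    = intTo TInt e
typing (Pred e)    = intTo TInt e
typing (IsZero e)  = intTo TBool e
typing (ITE c t e) = case typing c of
  Just TBool -> let v1 = typing t
                    v2 = typing e
                in if v1 == v2 then v1 else Nothing
  _          -> Nothing

-- The operand must be an Int; the result type is supplied by the caller.
intTo :: Type -> Expr -> Maybe Type
intTo result e = case typing e of
  Just TInt -> Just result
  _         -> Nothing

-- "if iszero 0 then succ (succ 0) else pred 0"
wellTyped :: Expr
wellTyped = ITE (IsZero CZero) (Succ (Succ CZero)) (Pred CZero)

-- "if 0 then true else false": the condition is not a Bool
illTyped :: Expr
illTyped = ITE CZero (CBool True) (CBool False)
```

`typing wellTyped` yields `Just TInt`, while `typing illTyped` yields `Nothing`, i.e. the typing rules get stuck and a type error is reported.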
15
Operational Semantics and Typing
• A type represents a set of values. It can be viewed as an abstract value

• Computing the type of an expression (typing) can be viewed as executing the expression over abstract semantics

• In operational semantics, E ⊢ e ⇓ v means expression e evaluates to value v under evaluation environment E
• E can be omitted if no rules use it
• In typing rules, Γ ⊢ e : T means expression e has type T under typing environment Γ
• Γ can also be omitted if no rules use it
16
Operational Semantics and Typing
• Getting stuck in operational semantics means there is a runtime error

• Getting stuck in typing rules means there is a type error

• Why is typing useful?

• It can avoid runtime type errors

17
Example: Simple Expressions
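• Provide an operational semantics without environments

$$\vdash \texttt{true} \Downarrow \texttt{true} \qquad \vdash \texttt{false} \Downarrow \texttt{false} \qquad \vdash 0 \Downarrow 0$$

$$\frac{\vdash e \Downarrow v}{\vdash \texttt{succ}\ e \Downarrow v + 1} \qquad \frac{\vdash e \Downarrow v}{\vdash \texttt{pred}\ e \Downarrow v - 1}$$

$$\frac{\vdash e \Downarrow 0}{\vdash \texttt{iszero}\ e \Downarrow \texttt{true}} \qquad \frac{\vdash e \Downarrow c \quad \text{Int}\ c \quad c \neq 0}{\vdash \texttt{iszero}\ e \Downarrow \texttt{false}}$$

$$\frac{\vdash e_1 \Downarrow \texttt{true} \quad \vdash e_2 \Downarrow v}{\vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 \Downarrow v} \qquad \frac{\vdash e_1 \Downarrow \texttt{false} \quad \vdash e_3 \Downarrow v}{\vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 \Downarrow v}$$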

18
Example: Simple Expressions
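• Provide typing rules without environments

$$\vdash \texttt{true} : \text{Bool}\ \ (\text{T-True}) \qquad \vdash \texttt{false} : \text{Bool}\ \ (\text{T-False}) \qquad \vdash 0 : \text{Int}\ \ (\text{T-Zero})$$

$$\frac{\vdash e : \text{Int}}{\vdash \texttt{succ}\ e : \text{Int}}\ (\text{T-Succ}) \qquad \frac{\vdash e : \text{Int}}{\vdash \texttt{pred}\ e : \text{Int}}\ (\text{T-Pred}) \qquad \frac{\vdash e : \text{Int}}{\vdash \texttt{iszero}\ e : \text{Bool}}\ (\text{T-IsZero})$$

$$\frac{\vdash e_1 : \text{Bool} \quad \vdash e_2 : T \quad \vdash e_3 : T}{\vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 : T}\ (\text{T-ITE})$$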

19
Proof Example
• Prove the type of "if iszero 0 then succ (succ 0) else pred 0" is Int with
respect to the typing rules

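$$
\dfrac{
  \dfrac{\dfrac{}{\vdash 0 : \text{Int}}\ (\text{T-Zero})}{\vdash \texttt{iszero}\ 0 : \text{Bool}}\ (\text{T-IsZero})
  \qquad
  \dfrac{\dfrac{\dfrac{}{\vdash 0 : \text{Int}}\ (\text{T-Zero})}{\vdash \texttt{succ}\ 0 : \text{Int}}\ (\text{T-Succ})}{\vdash \texttt{succ}\ (\texttt{succ}\ 0) : \text{Int}}\ (\text{T-Succ})
  \qquad
  \dfrac{\dfrac{}{\vdash 0 : \text{Int}}\ (\text{T-Zero})}{\vdash \texttt{pred}\ 0 : \text{Int}}\ (\text{T-Pred})
}{\vdash \texttt{if}\ \texttt{iszero}\ 0\ \texttt{then}\ \texttt{succ}\ (\texttt{succ}\ 0)\ \texttt{else}\ \texttt{pred}\ 0 : \text{Int}}\ (\text{T-ITE})
$$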

20
Strategies to Compute Types
• What are strategies to compute types for more realistic languages?

• Ask the programmer to provide declarations

• Infer types of expressions from known types of values

• Most popular languages take the first strategy

• Some languages support the second strategy

21
Type Checking
• Type checking is the process of verifying the types are compatible with
certain constraints (e.g., programmer declarations)

• Static type checking is a type checking process at compile time

• Type errors are detected before running the program

• Example languages: C, C++, Java, Haskell, ...

• Dynamic type checking is a type checking process at runtime

• Example language: Python
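
Haskell itself is statically checked, but the `Data.Dynamic` module in `base` can mimic a runtime check, which may help contrast the two styles. This is only a sketch; the names `boxed`, `asInt`, and `asBool` are hypothetical.

```haskell
import Data.Dynamic (Dynamic, toDyn, fromDynamic)

-- A value whose static type is hidden; its type is inspected only at runtime.
boxed :: Dynamic
boxed = toDyn (42 :: Int)

-- The runtime check succeeds: the dynamic value really is an Int.
asInt :: Maybe Int
asInt = fromDynamic boxed

-- The runtime check fails: asking for a Bool compiles fine
-- but yields Nothing when the program runs.
asBool :: Maybe Bool
asBool = fromDynamic boxed
```

By contrast, writing `not (42 :: Int)` directly is rejected by the compiler before the program ever runs.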

22
Type Inference
• Type inference means the automatic detection of the type of an expression in a program

• The compiler automatically computes an appropriate type for every expression and reports an error if types are incompatible

• Example languages with type inference: C++, Java, Haskell, ...

23
The FUN Language
• Consider the FUN language again
e ::= c | b | x | '(' e ')'
| e '+' e | e '-' e | ... (arithmetic op)
| e '==' e | ... (logical op)
| 'if' e 'then' e 'else' e (conditional)
| 'lambda' x ':' t '.' e (fun abstraction)
| 'app' e e (fun application)
| 'let' x ':' t '=' e 'in' e (let binding)
t ::= 'Int' | 'Bool' | '(' t ')'
| t '->' t (fun type)

• where c is an Int constant, b is a Bool constant, and x is an identifier

• Identifiers in function abstraction and let bindings are augmented with types
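
Following the earlier case study, the FUN grammar could be represented with Haskell data types along these lines. This is a sketch under assumptions: all constructor names (`CInt`, `Var`, `Equal`, `Lam`, and so on) are hypothetical, only `+`, `-`, and `==` are modelled, and parentheses are left to the parser.

```haskell
-- t ::= Int | Bool | t -> t
data Type = TInt | TBool | TFun Type Type
  deriving (Eq, Show)

data Expr
  = CInt Int                   -- integer constant c
  | CBool Bool                 -- boolean constant b
  | Var String                 -- identifier x
  | Add Expr Expr              -- e + e
  | Sub Expr Expr              -- e - e
  | Equal Expr Expr            -- e == e
  | If Expr Expr Expr          -- if e then e else e
  | Lam String Type Expr       -- lambda x : t . e
  | App Expr Expr              -- app e e
  | Let String Type Expr Expr  -- let x : t = e in e
  deriving (Eq, Show)

-- "lambda x: Int. x + 1"
incr :: Expr
incr = Lam "x" TInt (Add (Var "x") (CInt 1))

-- "let x: Int = 2 in x - 1"
letExample :: Expr
letExample = Let "x" TInt (CInt 2) (Sub (Var "x") (CInt 1))
```

The type annotations on `Lam` and `Let` correspond to the augmented identifiers mentioned above.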
24
Types in the FUN Language
• Types in the FUN language are consistent with a small Haskell fragment

t ::= 'Int' | 'Bool' | '(' t ')'
| t '->' t

• Int and Bool are primitive types

• A type of the form T1 → T2 is a function type, which is different from the return type T2 of a function

• T1 → T2 can represent a type for higher-order functions, where T1 or T2 (or both) is a function type
• The → for types is right-associative, i.e., T1 → T2 → T3 means T1 → (T2 → T3)
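
Since FUN types follow a Haskell fragment, right-associativity and higher-order function types can be illustrated directly in Haskell. The functions `add`, `add'`, and `twice` are hypothetical examples, not part of FUN.

```haskell
-- Int -> Int -> Int is read as Int -> (Int -> Int):
-- a function that takes an Int and returns a function.
add :: Int -> Int -> Int
add x y = x + y

-- The same type with the implicit parentheses written out.
add' :: Int -> (Int -> Int)
add' = add

-- A higher-order function: the parameter type is itself a function type,
-- so these parentheses cannot be dropped.
twice :: (Int -> Int) -> Int -> Int
twice f x = f (f x)
```

For example, `add' 1 2` is `3`, and `twice (add 1) 0` is `2`.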
25
Example FUN Programs
• "let f: Int -> Int = lambda x: Int. if x == 0 then 0 else x + app f (x-1) in app f 3" is a FUN program that evaluates to 6

• "let f: Int -> Int -> Int = (+) in app (app f 1) 2" is not a well-formed FUN program (syntax error)

• "if 0 then 10 else 20" is a FUN program that has a type error (also a runtime error)

• Suppose there is a production e ::= e '/' e for integer division, then "let x: Int = 0 in 10 / x" is a FUN program that type checks but has a runtime error

26
Operational Semantics of FUN
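$$\frac{\text{Int}\ c}{E \vdash c \Downarrow c}\ (\text{E-Int}) \qquad \frac{\text{Bool}\ b}{E \vdash b \Downarrow b}\ (\text{E-Bool}) \qquad \frac{\text{Ident}\ x \quad E(x) = v}{E \vdash x \Downarrow v}\ (\text{E-Ident})$$

$$\frac{E \vdash e_1 \Downarrow v_1 \quad E \vdash e_2 \Downarrow v_2}{E \vdash e_1 + e_2 \Downarrow v_1 + v_2}\ (\text{E-Plus}) \qquad \frac{E \vdash e_1 \Downarrow v_1 \quad E \vdash e_2 \Downarrow v_2}{E \vdash e_1 - e_2 \Downarrow v_1 - v_2}\ (\text{E-Minus})$$

$$\frac{E \vdash e_1 \Downarrow v \quad E \vdash e_2 \Downarrow v \quad \text{Int}\ v \text{ or } \text{Bool}\ v}{E \vdash e_1 == e_2 \Downarrow \texttt{true}}\ (\text{E-Eq1})$$

$$\frac{E \vdash e_1 \Downarrow v_1 \quad E \vdash e_2 \Downarrow v_2 \quad \text{Int}\ v_1, v_2 \text{ or } \text{Bool}\ v_1, v_2 \quad v_1 \neq v_2}{E \vdash e_1 == e_2 \Downarrow \texttt{false}}\ (\text{E-Eq2})$$

$$\frac{E \vdash e_1 \Downarrow \texttt{true} \quad E \vdash e_2 \Downarrow v_2}{E \vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 \Downarrow v_2}\ (\text{E-ITE1}) \qquad \frac{E \vdash e_1 \Downarrow \texttt{false} \quad E \vdash e_3 \Downarrow v_3}{E \vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 \Downarrow v_3}\ (\text{E-ITE2})$$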
27
Operational Semantics of FUN
$$\frac{}{E \vdash \texttt{lambda}\ x : T .\ e \Downarrow \texttt{lambda}\ x : T .\ e}\ (\text{E-Abs})$$

$$\frac{E \vdash e_1 \Downarrow \texttt{lambda}\ x : T .\ e_3 \quad E \vdash e_3[x \mapsto e_2] \Downarrow v}{E \vdash \texttt{app}\ e_1\ e_2 \Downarrow v}\ (\text{E-App})$$

$$\frac{E \vdash e_1 \Downarrow v_1 \quad E[x \triangleleft v_1] \vdash e_2 \Downarrow v}{E \vdash \texttt{let}\ x : T = e_1\ \texttt{in}\ e_2 \Downarrow v}\ (\text{E-Let})$$

28
Basic Typing Rules of FUN
$$\frac{\text{Int}\ c}{\Gamma \vdash c : \text{Int}}\ (\text{T-Int}) \qquad \frac{\text{Bool}\ b}{\Gamma \vdash b : \text{Bool}}\ (\text{T-Bool})$$

$$\frac{\Gamma \vdash e_1 : \text{Int} \quad \Gamma \vdash e_2 : \text{Int}}{\Gamma \vdash e_1 + e_2 : \text{Int}}\ (\text{T-Plus}) \qquad \frac{\Gamma \vdash e_1 : \text{Int} \quad \Gamma \vdash e_2 : \text{Int}}{\Gamma \vdash e_1 - e_2 : \text{Int}}\ (\text{T-Minus})$$

• The type of integer constants is Int. The type of boolean constants is Bool

• If expressions e1 and e2 have type Int, then e1 + e2 and e1 − e2 have type Int

29
Basic Typing Rules of FUN
$$\frac{\Gamma \vdash e_1 : T \quad \Gamma \vdash e_2 : T \quad T \in \{\text{Int}, \text{Bool}\}}{\Gamma \vdash e_1 == e_2 : \text{Bool}}\ (\text{T-Eq})$$

$$\frac{\Gamma \vdash e_1 : \text{Bool} \quad \Gamma \vdash e_2 : T \quad \Gamma \vdash e_3 : T}{\Gamma \vdash \texttt{if}\ e_1\ \texttt{then}\ e_2\ \texttt{else}\ e_3 : T}\ (\text{T-ITE})$$

• If both e1 and e2 have type T (Int or Bool), then e1 == e2 has type Bool
• Note that we cannot compare two function abstractions
• If e1 has type Bool, and e2 and e3 have type T (including function types), then "if e1 then e2 else e3" has type T
30
Typing Rules for Abstractions
$$\frac{\Gamma[x \triangleleft T_1] \vdash e : T_2}{\Gamma \vdash \texttt{lambda}\ x : T_1 .\ e : T_1 \to T_2}\ (\text{T-Abs})$$

• To know the type of lambda x : T1 . e, we only need to know the type of body e, because the type T1 of parameter x is provided by programmers

• Since e can use variable x, we need to compute the type of e based on the knowledge that the type of x is T1
31
Typing Environments
$$\frac{\Gamma[x \triangleleft T_1] \vdash e : T_2}{\Gamma \vdash \texttt{lambda}\ x : T_1 .\ e : T_1 \to T_2}\ (\text{T-Abs})$$

• In operational semantics, the evaluation environment E is a map data structure that maps identifiers to their values
• In typing rules, the typing environment Γ is a map data structure that maps identifiers to their types
• In typing rules, the environment Γ should be considered as universally quantified, i.e., the rule holds for all possible environments Γ
• You can take any instantiation for better understanding, e.g., the empty environment
• At the beginning of the typing process, environment Γ is empty
32
Typing Environments
$$\frac{\Gamma[x \triangleleft T_1] \vdash e : T_2}{\Gamma \vdash \texttt{lambda}\ x : T_1 .\ e : T_1 \to T_2}\ (\text{T-Abs})$$

• Γ[x ◃ T] does not modify environment Γ
• Γ[x ◃ T] is a new environment with all entries in Γ and mapping x ↦ T added
• If x ↦ T′ exists in Γ, then Γ[x ◃ T] overwrites the type of x to T
• Γ(x) means to look up the type of x in Γ
• If x ↦ T exists in Γ, then Γ(x) = T
• If there is no entry for x in Γ, it gets stuck, leading to a type error
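
The environment operations can be sketched in Haskell as follows. This assumes a representation as an association list with the newest binding first; the names `Env`, `emptyEnv`, `extend`, and `lookupType` are hypothetical. In this representation a new binding shadows an older one rather than physically overwriting it, which gives the same lookup result.

```haskell
data Type = TInt | TBool | TFun Type Type
  deriving (Eq, Show)

-- Assumed representation: newest binding first.
type Env = [(String, Type)]

emptyEnv :: Env
emptyEnv = []

-- Γ[x ◃ T]: builds a new environment; the argument env is left unchanged.
extend :: String -> Type -> Env -> Env
extend x t env = (x, t) : env

-- Γ(x): Nothing models getting stuck when x has no entry.
-- Prelude's lookup returns the first (newest) match.
lookupType :: String -> Env -> Maybe Type
lookupType = lookup
```

Because `extend` returns a fresh list, an environment captured before the extension still reports the old type of `x`.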

33
Typing Rules for Identifiers
$$\frac{\text{Ident}\ x \quad \Gamma(x) = T}{\Gamma \vdash x : T}\ (\text{T-Ident})$$

• The type of identifiers is stored in the typing environment Γ

• If the environment has entry x ↦ T, then we say the type of x is T

34
Proof Example
• Prove the type of "lambda x: Int. x + 1" is Int → Int

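$$
\dfrac{
  \dfrac{
    \dfrac{\text{Ident}\ x \quad \Gamma[x \triangleleft \text{Int}](x) = \text{Int}}{\Gamma[x \triangleleft \text{Int}] \vdash x : \text{Int}}\ (\text{T-Ident})
    \qquad
    \dfrac{\text{Int}\ 1}{\Gamma[x \triangleleft \text{Int}] \vdash 1 : \text{Int}}\ (\text{T-Int})
  }{\Gamma[x \triangleleft \text{Int}] \vdash x + 1 : \text{Int}}\ (\text{T-Plus})
}{\Gamma \vdash \texttt{lambda}\ x : \text{Int} .\ x + 1 : \text{Int} \to \text{Int}}\ (\text{T-Abs})
$$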

35
Proof Example
• Prove the type of "lambda x: Int. lambda y: Int. x + y" is Int → Int → Int

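$$
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{\text{Ident}\ x \quad \Gamma[x \triangleleft \text{Int}][y \triangleleft \text{Int}](x) = \text{Int}}{\Gamma[x \triangleleft \text{Int}][y \triangleleft \text{Int}] \vdash x : \text{Int}}\ (\text{T-Ident})
      \qquad
      \dfrac{\text{Ident}\ y \quad \Gamma[x \triangleleft \text{Int}][y \triangleleft \text{Int}](y) = \text{Int}}{\Gamma[x \triangleleft \text{Int}][y \triangleleft \text{Int}] \vdash y : \text{Int}}\ (\text{T-Ident})
    }{\Gamma[x \triangleleft \text{Int}][y \triangleleft \text{Int}] \vdash x + y : \text{Int}}\ (\text{T-Plus})
  }{\Gamma[x \triangleleft \text{Int}] \vdash \texttt{lambda}\ y : \text{Int} .\ x + y : \text{Int} \to \text{Int}}\ (\text{T-Abs})
}{\Gamma \vdash \texttt{lambda}\ x : \text{Int} .\ \texttt{lambda}\ y : \text{Int} .\ x + y : \text{Int} \to \text{Int} \to \text{Int}}\ (\text{T-Abs})
$$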

36
Typing Rules for Applications
$$\frac{\Gamma \vdash e_1 : T_1 \to T_2 \quad \Gamma \vdash e_2 : T_1}{\Gamma \vdash \texttt{app}\ e_1\ e_2 : T_2}\ (\text{T-App})$$

• For function application app e1 e2, the typing rule needs to check that e1 has a function type of the form T1 → T2, and that the actual parameter e2 has type T1

• Then the application result has type T2

37
Typing Rules for Let Bindings
$$\frac{\Gamma[x \triangleleft T_1] \vdash e_1 : T_1 \quad \Gamma[x \triangleleft T_1] \vdash e_2 : T_2}{\Gamma \vdash \texttt{let}\ x : T_1 = e_1\ \texttt{in}\ e_2 : T_2}\ (\text{T-Let})$$

• Given the type of x is T1, the rule needs to check the type of e1 is indeed T1

• Given e2 can use identifier x, if the type of e2 is T2 based on the knowledge that x has type T1, then the entire let-in expression has type T2
• The rule handles recursive functions, because e1 is also checked under Γ[x ◃ T1]
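
Putting the FUN typing rules together, a checker might look like the sketch below. It is not the lecture's implementation: the AST constructors, the association-list `Env`, and the names `typeOf` and `arith` are assumptions made for this example. Each clause is labelled with the rule it is meant to mirror, and `Nothing` stands for getting stuck.

```haskell
data Type = TInt | TBool | TFun Type Type
  deriving (Eq, Show)

data Expr
  = CInt Int | CBool Bool | Var String
  | Add Expr Expr | Sub Expr Expr | Equal Expr Expr
  | If Expr Expr Expr
  | Lam String Type Expr
  | App Expr Expr
  | Let String Type Expr Expr
  deriving (Eq, Show)

-- Typing environment, newest binding first (assumed representation).
type Env = [(String, Type)]

typeOf :: Env -> Expr -> Maybe Type
typeOf _   (CInt _)    = Just TInt                 -- T-Int
typeOf _   (CBool _)   = Just TBool                -- T-Bool
typeOf env (Var x)     = lookup x env              -- T-Ident
typeOf env (Add a b)   = arith env a b             -- T-Plus
typeOf env (Sub a b)   = arith env a b             -- T-Minus
typeOf env (Equal a b) = do                        -- T-Eq
  ta <- typeOf env a
  tb <- typeOf env b
  if ta == tb && ta `elem` [TInt, TBool] then Just TBool else Nothing
typeOf env (If c t e) = do                         -- T-ITE
  TBool <- typeOf env c
  tt <- typeOf env t
  te <- typeOf env e
  if tt == te then Just tt else Nothing
typeOf env (Lam x t1 body) = do                    -- T-Abs
  t2 <- typeOf ((x, t1) : env) body
  Just (TFun t1 t2)
typeOf env (App f a) = do                          -- T-App
  TFun t1 t2 <- typeOf env f
  ta <- typeOf env a
  if ta == t1 then Just t2 else Nothing
typeOf env (Let x t1 e1 e2) = do                   -- T-Let
  let env' = (x, t1) : env                         -- x is visible in e1 too
  t <- typeOf env' e1
  if t == t1 then typeOf env' e2 else Nothing

-- Both operands must be Int; a failed pattern match yields Nothing.
arith :: Env -> Expr -> Expr -> Maybe Type
arith env a b = do
  TInt <- typeOf env a
  TInt <- typeOf env b
  Just TInt
```

Starting from the empty environment, `typeOf [] (Let "x" TInt (CInt 2) (Sub (Var "x") (CInt 1)))` gives `Just TInt`, and `typeOf [] (If (CInt 0) (CInt 10) (CInt 20))` gives `Nothing`. Because `e1` is checked under the extended environment, the recursive `let f` example from earlier is accepted as well.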
38
Proof Example
• Prove the type of "let x: Int = 2 in x - 1" is Int

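$$
\dfrac{
  \dfrac{\text{Int}\ 2}{\Gamma[x \triangleleft \text{Int}] \vdash 2 : \text{Int}}\ (\text{T-Int})
  \qquad
  \dfrac{
    \dfrac{\text{Ident}\ x \quad \Gamma[x \triangleleft \text{Int}](x) = \text{Int}}{\Gamma[x \triangleleft \text{Int}] \vdash x : \text{Int}}\ (\text{T-Ident})
    \qquad
    \dfrac{\text{Int}\ 1}{\Gamma[x \triangleleft \text{Int}] \vdash 1 : \text{Int}}\ (\text{T-Int})
  }{\Gamma[x \triangleleft \text{Int}] \vdash x - 1 : \text{Int}}\ (\text{T-Minus})
}{\Gamma \vdash \texttt{let}\ x : \text{Int} = 2\ \texttt{in}\ x - 1 : \text{Int}}\ (\text{T-Let})
$$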

39
Typing Rules for Parenthesized Exprs
• Typically, the parentheses are handled in the parsing process
• After we obtain the tree representation (called abstract syntax tree, or
AST) of a program from the parser, the parentheses are gone
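[Figure: two ASTs. (1+2)+3 has root + whose left child is + (over 1 and 2) and whose right child is 3. 1+(2+3) has root + whose left child is 1 and whose right child is + (over 2 and 3).]

• In case we want to make parenthesized expressions explicit, here is the typing rule

$$\frac{\Gamma \vdash e : T}{\Gamma \vdash (e) : T}\ (\text{T-Paren})$$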
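
The two trees can be written down as Haskell values to show that grouping survives in the AST shape even though no parenthesis node exists. The type `Expr` here is a deliberately tiny, hypothetical one with only literals and addition.

```haskell
data Expr = Lit Int | Add Expr Expr
  deriving (Eq, Show)

-- AST of (1+2)+3: the inner addition is the left child.
leftNested :: Expr
leftNested = Add (Add (Lit 1) (Lit 2)) (Lit 3)

-- AST of 1+(2+3): the inner addition is the right child.
rightNested :: Expr
rightNested = Add (Lit 1) (Add (Lit 2) (Lit 3))

-- Evaluate by structural recursion over the tree.
evalSum :: Expr -> Int
evalSum (Lit n)   = n
evalSum (Add a b) = evalSum a + evalSum b
```

The two values are different trees (`leftNested /= rightNested`), yet both evaluate to 6.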

40
Type Safety
• We collectively refer to a set of typing rules as a type system

• One of the most basic properties of a type system is safety (also called soundness): well-typed programs do not go wrong

• What is "go wrong"?
• The evaluation gets stuck, i.e., it has not reached one of the designated values, but no evaluation rule applies
• The evaluation goes through but its result is inconsistent with the type

41
Safety = Progress + Preservation
• Type safety of a language can be proved in two steps

• Progress: a well-typed program is not stuck
• If Γ ⊢ e : T, then there is an evaluation rule that applies to e

• Preservation: over-approximation between types and concrete values is preserved by the evaluation and typing rules
• If E ⊢ e ⇓ v and Γ ⊢ e : T, then v is in the set of values represented by type T
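
For the simple expression language, both properties can be spot-checked by brute force. The sketch below is testing, not a proof, and it makes several assumptions: `eval` is a total variant of the interpreter in which succ and pred check for an Int, and `hasType`, `safeOn`, and `exprs` are hypothetical helpers.

```haskell
data Expr = CBool Bool | CZero | Succ Expr | Pred Expr
          | IsZero Expr | ITE Expr Expr Expr
  deriving (Eq, Show)
data Value = VBool Bool | VInt Int deriving (Eq, Show)
data Type  = TBool | TInt deriving (Eq, Show)

eval :: Expr -> Maybe Value
eval (CBool b)   = Just (VBool b)
eval CZero       = Just (VInt 0)
eval (Succ e)    = case eval e of { Just (VInt n) -> Just (VInt (n + 1)); _ -> Nothing }
eval (Pred e)    = case eval e of { Just (VInt n) -> Just (VInt (n - 1)); _ -> Nothing }
eval (IsZero e)  = case eval e of { Just (VInt n) -> Just (VBool (n == 0)); _ -> Nothing }
eval (ITE c t e) = case eval c of
  Just (VBool True)  -> eval t
  Just (VBool False) -> eval e
  _                  -> Nothing

typing :: Expr -> Maybe Type
typing (CBool _)   = Just TBool
typing CZero       = Just TInt
typing (Succ e)    = intTo TInt e
typing (Pred e)    = intTo TInt e
typing (IsZero e)  = intTo TBool e
typing (ITE c t e) = case typing c of
  Just TBool | typing t == typing e -> typing t
  _                                 -> Nothing

-- The operand must be an Int; the result type is supplied by the caller.
intTo :: Type -> Expr -> Maybe Type
intTo result e = case typing e of { Just TInt -> Just result; _ -> Nothing }

-- Is the value in the set of values represented by the type?
hasType :: Value -> Type -> Bool
hasType (VBool _) TBool = True
hasType (VInt _)  TInt  = True
hasType _         _     = False

-- If e is well typed, evaluation must succeed (progress)
-- and produce a value of that type (preservation).
safeOn :: Expr -> Bool
safeOn e = case typing e of
  Nothing -> True   -- ill-typed: safety makes no claim
  Just t  -> maybe False (`hasType` t) (eval e)

-- All expressions up to a given nesting depth.
exprs :: Int -> [Expr]
exprs 0 = [CBool True, CBool False, CZero]
exprs n = s ++ map Succ s ++ map Pred s ++ map IsZero s
            ++ [ITE c t e | c <- s, t <- s, e <- s]
  where s = exprs (n - 1)
```

Evaluating `all safeOn (exprs 2)` checks every expression up to depth 2; a structural-induction proof is still needed to cover all expressions.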

42
Safety = Progress + Preservation
• The proof methodology is usually structural induction
• Prove the property holds for base cases (e.g., constants)
• Assuming the property holds for all sub-expressions, prove it holds for recursive cases (e.g., if-then-else)

• Example type-safe language
• The language for simple expressions

• Example type-unsafe languages
• C, C++
• int buf[10]; buf[10] = 100; is undefined behavior (an out-of-bounds write)
43
