Toc 1
Theory of Computation or Automata
Some devices we will see
finite automata: Devices with a finite amount of memory. Used to model “small” computers.
push-down automata: Devices with infinite memory that can be accessed in a restricted way. Used to model parsers, etc.
Turing Machines: Devices with infinite memory. Used to model any computer.
time-bounded Turing Machines: Infinite memory, but bounded running time. Used to model any computer program that runs in a “reasonable” amount of time.

• In addition to the usefulness of various tools (regular expressions, context-free grammars, state machines, etc.) in your daily life as a programmer,
– a good theoretical computer science course will have taught you how to model certain problems in a way that you can tackle effectively.
• For example, you may be tempted to parse some programming language, given as input to your program, with regular expressions.
– CS theory proves why this is a bad idea (most programming language syntaxes are not regular), and this limitation can never be overcome no matter how much you'd like to try.
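The point about regular expressions and programming-language syntax can be illustrated with nested parentheses. The sketch below is only an illustration: the names `DEPTH_2` and `is_balanced` are made up for this example. A pattern can be written for any nesting depth fixed in advance, but arbitrary depth needs an unbounded counter, which is the kind of memory a push-down automaton provides.

```python
import re

# Hypothetical pattern: balanced parentheses with nesting depth at most 2.
# Every deeper level would need another hand-written layer in the pattern.
DEPTH_2 = re.compile(r"(?:\((?:\(\))*\))*")

def is_balanced(s: str) -> bool:
    """Check balance with an unbounded counter (the role a stack plays in a PDA)."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a ")" with no matching "("
                return False
        else:
            return False           # symbol outside the alphabet {(, )}
    return depth == 0

print(DEPTH_2.fullmatch("(())") is not None)    # depth 2: handled by the pattern
print(DEPTH_2.fullmatch("((()))") is not None)  # depth 3: the pattern gives up
print(is_balanced("((()))"))                    # the counter handles any depth
```

The pattern fails on depth 3 not because it was written carelessly, but because no single regular expression covers every depth.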
• Computer language syntax is generally distinguished into three levels:
– Words: the lexical level, determining how characters form tokens;
• In theoretical computer science and formal language theory, a regular expression (sometimes called a rational expression) is a sequence of characters that defines a search pattern,
– mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations.
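As a small illustration of "find and replace"-like operations, here is a sketch using Python's standard `re` module. The sample text and the pattern are invented for this example.

```python
import re

# Hypothetical pattern: one or more lowercase letters followed by one or more digits.
pattern = re.compile(r"[a-z]+[0-9]+")

text = "x1 = 10; y22 = 7"

found = pattern.findall(text)      # "find": every non-overlapping match
replaced = pattern.sub("ID", text) # "replace": substitute each match

print(found)
print(replaced)
```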
• Automata theory is the study of abstract computational devices
• Abstract devices are (simplified) models of real computations
• Computations happen everywhere: On your laptop, on your cell phone, in nature, …
• What are the mathematical properties of computer hardware and software?
• What is a computation and what is an algorithm? Can we give rigorous mathematical definitions of these notions?
• What are the limitations of computers? Can “everything” be computed? (As we will see, the answer to this question is “no”.)
• Nowadays, the Theory of Computation can be divided into the following three areas: Complexity Theory, Computability Theory, and Automata Theory.
• Complexity Theory: Classify problems according to their degree of “difficulty”. Give a rigorous proof that problems that seem to be “hard” are really “hard”.
• Computability Theory: Classify problems as being solvable or unsolvable.
• Automata Theory: Define and study different models of computation. Do these models have the same power, or can one model solve more problems than the other?
What is Automata Theory?
Study of abstract computing devices, or “machines”
Automaton = an abstract computing device
Note: A “device” need not even be physical hardware!
A fundamental question in computer science:
Find out what different models of machines can do and cannot do
The theory of computation
Model of Computation
[Figure: a CPU connected to memory, input, output, and a program.]
[Figure: worked example. The program computes f(x) = x^3 as x*x, then x^2*x. For input x = 2 the memory holds z = 2*2 = 4, then f(x) = z*2 = 8, and the output is f(x) = 8.]
Automaton
[Figure: the same picture with the CPU and program replaced by an automaton with states and transitions, together with input, output, and memory.]
Why do we need abstract models?
A design problem
• Can you design a circuit where the light is on if and only if all the switches were flipped exactly the same number of times?
• Such devices are difficult to reason about, because they can be designed in an infinite number of ways
• By representing them as abstract computational devices, or automata, we will learn how to answer such questions
[Figure: switches numbered 1, 2, 3 and a light that is either off or on.]
Preliminaries of automata theory
• How do we formalize the question
Can device A solve problem B?
• First, we need a formal way of describing the problems that we are interested in solving
Strings
• Examples
abfbz is a string over Σ1 = {a, b, c, d, …, z}
9021 is a string over Σ2 = {0, 1, …, 9}
ab#bc is a string over Σ3 = {a, b, …, z, #}
))()(() is a string over Σ4 = {(, )}
Length of a string w, denoted by “|w|”, is equal to the number of (non-ε) characters in the string
E.g., x = 010100, |x| = 6
x = 01ε0ε1ε00, |x| = ?
xy = concatenation of two strings x and y
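The string notions above map directly onto ordinary Python strings. This is a minimal sketch, assuming the empty string ε is represented by `""`. The helper name `is_string_over` is hypothetical.

```python
EPSILON = ""  # the empty string ε

def is_string_over(w, alphabet):
    """True if every symbol of w belongs to the given alphabet."""
    return all(symbol in alphabet for symbol in w)

SIGMA_1 = set("abcdefghijklmnopqrstuvwxyz")
SIGMA_2 = set("0123456789")
SIGMA_4 = set("()")

x = "010100"
# Writing ε between symbols adds nothing: this is the same string as x.
y = "01" + EPSILON + "0" + EPSILON + "1" + EPSILON + "00"

print(is_string_over("abfbz", SIGMA_1))  # a string over Σ1
print(is_string_over("ab#bc", SIGMA_1))  # "#" is not in Σ1
print(len(x), len(y))                    # |x| and |y|
print(len(x + "9021") == len(x) + len("9021"))  # |xy| = |x| + |y|
```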
The * Operation
Σ*: the set of all possible strings from alphabet Σ
Σ = {a, b}
Σ* = {ε, a, b, aa, ab, ba, bb, aaa, aab, …}
Powers of an alphabet
Let Σ be an alphabet.
Σ^k = the set of all strings of length k
Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ …
For Σ = {a, b}: Σ+ = {a, b, aa, ab, ba, bb, aaa, aab, …}
Definition: L* = L^0 ∪ L^1 ∪ L^2 ∪ …
Example:
{a, bb}* = {ε, a, bb, aa, abb, bba, bbbb, aaa, aabb, abba, abbbb, …}
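Since Σ* and L* are infinite, a program can only enumerate them up to some bound. The sketch below is one way to do that. The function names (`sigma_k`, `sigma_star`, `star_up_to`) are chosen for this example, and the empty string stands for ε.

```python
from itertools import count, islice, product

def sigma_k(alphabet, k):
    """Σ^k: all strings of length k over the alphabet, in sorted order."""
    return ["".join(t) for t in product(sorted(alphabet), repeat=k)]

def sigma_star(alphabet):
    """Generator over Σ*, shortest strings first (it never terminates)."""
    for k in count(0):
        yield from sigma_k(alphabet, k)

def star_up_to(language, n):
    """L^0 ∪ L^1 ∪ … ∪ L^n: a finite approximation of L*."""
    layer = {""}                 # L^0 = {ε}
    result = {""}
    for _ in range(n):
        layer = {u + v for u in layer for v in language}  # next power of L
        result |= layer
    return result

print(sigma_k({"a", "b"}, 2))
print(list(islice(sigma_star({"a", "b"}), 7)))
print(sorted(star_up_to({"a", "bb"}, 2), key=lambda s: (len(s), s)))
```

Dropping the first element of `sigma_star` (the empty string) gives an enumeration of Σ+.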
Introduction
What are formal languages
a language designed for use in situations in which natural language is unsuitable, as for example in mathematics, logic, or computer programming. The symbols and formulas of such languages stand in precisely specified syntactic and semantic relations to one another.
any language of symbols and formulas developed for systems which cannot work with natural language, such as computer programming and mathematics.
In the study of formal languages (FL) we care only about well-formedness/membership, not about the meaning of sentences in a language.
Ex1: Our usual decimal language of positive numbers?
Problem: Which of the following are well-formed [representations of] numbers:
(1) 128 (2) 0023 (3) 44ac (4) 3327
Let L be the set of all well-formed [representations of] numbers. ==> 128, 3327 in L but 0023, 44ac not in L.
So according to the view of FL, the usual decimal language of positive numbers (i.e., L) is just the set:
{ x | x is a nonempty finite sequence of digits without leading zeros }.
Note: FL does not care that the string '128' corresponds to the (abstract) positive number whose binary representation is 10000000. That is the job of semantics.
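Membership in the decimal language L can be checked purely syntactically. The sketch below assumes L is the set of nonempty digit strings without a leading zero. The function name `in_decimal_language` is hypothetical.

```python
DIGITS = "0123456789"

def in_decimal_language(x: str) -> bool:
    """Syntax only: nonempty, all digits, and no leading zero."""
    return x != "" and all(c in DIGITS for c in x) and x[0] != "0"

for candidate in ["128", "0023", "44ac", "3327"]:
    print(candidate, in_decimal_language(candidate))

# Semantics is a separate step: only here does the string '128' become a number.
print(int("128") == 0b10000000)
```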
Introduction
Definition 2.1 (cont'd)
Σ* =def the set of all strings over Σ.
Ex: {a,b}* = {ε,a,b,aa,ab,ba,bb,aaa,...}
{a}* = {ε,a,aa,aaa,aaaa,...} = {a^n | n ≥ 0}.
{}* = ? ( {} or {ε} or ε ?)
Note the difference between sets and strings:
{a,b} = {b,a} but ab ≠ ba.
{a,a,b} = {a,b} but aab ≠ ab
So what's a (formal) language?
A language over Σ is a set of strings over Σ (i.e., a subset of Σ*). Ex: let Σ = {0,...,9}; then all of the following are languages over Σ.
1. {} 2. {ε} 3. {0,...,9} = Σ 4. {x | x ∈ Σ* and has no leading 0s} 5. Σ^5 = {x | |x| = 5} 6. Σ* = {x | |x| is finite}
Examples of practical formal languages
Ex: Let Σ be the set of all ASCII codes.
a C program is simply a finite string over Σ satisfying all syntax rules of C.
C-language =def { x | x is a well-formed C program over Σ }.
PASCAL-language = { x | x is a well-formed PASCAL program over Σ }.
Similarly, let ENG-dict = the set of all English words
= { John, Mary, is, are, a, an, good, bad, boy, girl,..}
an English sentence is simply a string over ENG-dict
==> English =def { x | x is a legal English sentence over ENG-dict } ==>
1. John is a good boy . ∈ English.
2. |John is a good boy . | = ?
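The difference between sets and strings, and between the empty language {} and the language {ε}, can be checked directly. This is a small sketch using Python sets and strings. The names below are illustrative only.

```python
# Sets ignore order and repetition; strings do not.
sets_equal = {"a", "b"} == {"b", "a"}
strings_equal = "ab" == "ba"

SIGMA = set("0123456789")   # the alphabet {0,...,9}
EMPTY_LANGUAGE = set()      # {}: contains no strings at all
ONLY_EPSILON = {""}         # {ε}: contains exactly one string, the empty one

def in_sigma_5(x: str) -> bool:
    """Membership in Σ^5 = {x | |x| = 5} over Σ = {0,...,9}."""
    return len(x) == 5 and set(x) <= SIGMA

print(sets_equal, strings_equal)
print(len(EMPTY_LANGUAGE), len(ONLY_EPSILON))
print(in_sigma_5("00123"), in_sigma_5("123"))
```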
Introduction
Issues about formal languages
Why do we need formal languages?
for specification (specifying programs, etc.), i.e., basic tools for communication between people and machines.
although FL does not provide all of the theoretical framework needed for subsequent processing, it provides a necessary start, without which subsequent processing would be impossible: the first level of abstraction.
Many basic problems [about computation] can be investigated at this level.
How to specify (or represent) a language?
Note: All useful natural and programming languages contain an infinite number of strings (or programs and sentences)
How to specify a language
principles: 1. must be precise and unambiguous among users of the language; 2. efficient for machine processing
tools:
1. traditional mathematical notations:
A = {x | |x| < 3 and x ∈ {a,b}*} = {ε,a,b,aa,ab,ba,bb}
problem: in general not machine understandable.
2. via programs (or machines):
P: a program; L(P) =def {x | P returns 'ok' on input string x}
precise, no ambiguity, machine understandable.
hard to understand for human users !!
3. via grammars: (easy for humans to understand)
Ex: noun := book | boy | girl | John | Mary
art := a | an | the ; prep := on | under | of | ...
adj := good | bad | smart | ...
NP := noun | art noun | NP PP | ...
PP := prep NP ==> 'the man on the bridge' ∈ NP (taking man and bridge as nouns).
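Specifying a language via a program (tool 2 above) can be sketched as follows. The program `P` below is a hypothetical acceptor for the set A from tool 1, and `language_of` is an illustrative helper that lists the accepted strings up to a length bound, since L(P) may be infinite in general.

```python
from itertools import product

def P(x: str) -> str:
    """Hypothetical program: returns 'ok' exactly when |x| < 3 and x is over {a, b}."""
    return "ok" if len(x) < 3 and set(x) <= {"a", "b"} else "reject"

def language_of(program, alphabet, max_len):
    """Finite slice of L(program): accepted strings of length at most max_len."""
    accepted = set()
    for k in range(max_len + 1):
        for symbols in product(alphabet, repeat=k):
            x = "".join(symbols)
            if program(x) == "ok":
                accepted.add(x)
    return accepted

print(sorted(language_of(P, "ab", 4), key=lambda s: (len(s), s)))
```

The definition is precise and machine-checkable, but a reader has to run or study `P` to see which set it describes, which is the drawback the slide points out.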
Decimal numbers: alphabet Σ = {0, 1, 2, …, 9}
102345    567463386
Binary numbers: alphabet Σ = {0, 1}
100010001    101101111
Languages are used to describe computation problems:
PRIMES = {2, 3, 5, 7, 11, 13, 17, …}
EVEN = {0, 2, 4, 6, …}
Alphabet: Σ = {0, 1, 2, …, 9}
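Viewing PRIMES and EVEN as languages means that solving the problem is the same as deciding membership of a decimal string. A minimal sketch, assuming numbers are written over Σ = {0,…,9} and that leading zeros are tolerated. The helper names are hypothetical, and trial division is used only for simplicity.

```python
DIGITS = "0123456789"

def decode(x: str):
    """Return the number written by x, or None if x is not a string over Σ."""
    if x == "" or any(c not in DIGITS for c in x):
        return None
    return int(x)

def in_even(x: str) -> bool:
    n = decode(x)
    return n is not None and n % 2 == 0

def in_primes(x: str) -> bool:
    n = decode(x)
    if n is None or n < 2:
        return False
    d = 2
    while d * d <= n:          # trial division up to sqrt(n)
        if n % d == 0:
            return False
        d += 1
    return True

print([s for s in ["2", "9", "13", "15", "17"] if in_primes(s)])
print([s for s in ["0", "3", "4", "6", "7"] if in_even(s)])
```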
Unary numbers: alphabet Σ = {1}
Unary number:   1  11  111  1111  11111
Decimal number: 1  2   3    4     5
String Operations
w = a1 a2 … an, e.g. abba
v = b1 b2 … bm, e.g. bbbaaa
Concatenation
wv = a1 a2 … an b1 b2 … bm, e.g. abbabbbaaa
String Length
w = a1 a2 … an, e.g. ababaaabbb
Length: |w| = n
|uv| = |u| + |v|
A string with no letters is denoted: λ or ε
ADDITION = {x + y = z : x = 1^n, y = 1^m, z = 1^k, n + m = k}
SQUARES = {x # y : x = 1^n, y = 1^m, m = n^2}
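The ADDITION and SQUARES languages encode arithmetic questions as membership questions over unary strings. The sketch below assumes the separators are the literal characters `+`, `=`, and `#`. The function names are made up for this example.

```python
import re

def unary(n: int) -> str:
    """Unary encoding: n is written as n copies of the symbol 1."""
    return "1" * n

def in_addition(s: str) -> bool:
    """s has the form x+y=z with x = 1^n, y = 1^m, z = 1^k and n + m = k."""
    m = re.fullmatch(r"(1*)\+(1*)=(1*)", s)
    return m is not None and len(m.group(1)) + len(m.group(2)) == len(m.group(3))

def in_squares(s: str) -> bool:
    """s has the form x#y with x = 1^n, y = 1^m and m = n^2."""
    m = re.fullmatch(r"(1*)#(1*)", s)
    return m is not None and len(m.group(2)) == len(m.group(1)) ** 2

print(in_addition(unary(2) + "+" + unary(3) + "=" + unary(5)))  # 2 + 3 = 5
print(in_addition("11+111=1111"))                               # 2 + 3 = 4
print(in_squares(unary(3) + "#" + unary(9)))                    # 3^2 = 9
```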
Another Operation
Definition: L^n = L L … L (n times)
{a, b}^3 = {a, b}{a, b}{a, b} = {aaa, aab, aba, abb, baa, bab, bba, bbb}
Special case: L^0 = {ε}
{a, bba, aaa}^0 = {ε}
L = {a^n b^n : n ≥ 0}
L^2 = {a^n b^n a^m b^m : n, m ≥ 0}
aabbaaabbb ∈ L^2
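For finite languages, the power L^n can be computed by repeated concatenation. For an infinite language such as {a^n b^n : n ≥ 0}, membership in its square can be tested by trying every split point. This is a sketch with hypothetical helper names. The empty string `""` plays the role of the empty word.

```python
def power(language, n):
    """L^n for a finite language L, with L^0 = {empty string}."""
    result = {""}
    for _ in range(n):
        result = {u + v for u in result for v in language}
    return result

def in_anbn(w: str) -> bool:
    """Membership in {a^n b^n : n >= 0}."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def in_anbn_squared(w: str) -> bool:
    """Membership in the square: w = uv with both u and v of the form a^n b^n."""
    return any(in_anbn(w[:i]) and in_anbn(w[i:]) for i in range(len(w) + 1))

print(sorted(power({"a", "b"}, 3)))
print(power({"a", "bba", "aaa"}, 0))
print(in_anbn_squared("aabbaaabbb"))
```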
Theory of Computation: A Historical Perspective
1930s • Alan Turing studies Turing machines
• Decidability
• Halting problem
1940-1950s • “Finite automata” machines studied
• Noam Chomsky proposes the “Chomsky Hierarchy” for formal languages
1971 Cook introduces “intractable” problems or “NP-Hard” problems
1970- Modern computer science: compilers, computational & complexity theory evolve
The Chomsky Hierarchy
• A containment hierarchy of classes of formal languages
Regular (DFA) ⊂ Context-free (PDA) ⊂ Context-sensitive (LBA) ⊂ Recursively-enumerable (TM)
[Figure: a finite automaton with input and output.]
• Pushdown Automata: stack
• Turing Machines: random access memory
Power of Automata
[Figure: scale of increasing power, from simple problems to more complex problems to the hardest problems.]
Turing Machine is the most powerful computational model known
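The weakest device in this comparison, the finite automaton, has no memory beyond its current state. A minimal sketch of a deterministic finite automaton follows. The function `run_dfa` and the example machine (which accepts binary strings with an even number of 1s) are invented for illustration.

```python
def run_dfa(transitions, start, accepting, w):
    """Read w one symbol at a time; the only memory is the current state."""
    state = start
    for symbol in w:
        state = transitions[(state, symbol)]
    return state in accepting

# Hypothetical example machine: two states that track the parity of 1s seen so far.
EVEN_ONES = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

print(run_dfa(EVEN_ONES, "even", {"even"}, "1001"))  # two 1s
print(run_dfa(EVEN_ONES, "even", {"even"}, "10"))    # one 1
```

A pushdown automaton would add a stack to this loop, and a Turing machine a read/write tape. The control structure stays the same.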
NP-complete problems
Believed to take exponential time to be solved
P problems
Solved in polynomial time
Next Lecture: Lecture-2.ppt