
PPT6 Chomsky Hierarchy PS

The Chomsky Hierarchy categorizes languages into four types based on their grammars and computational power: Regular, Context-Free, Context-Sensitive, and Recursively Enumerable languages. Noam Chomsky, a linguist and philosopher, proposed this hierarchy to illustrate that language is a cognitive ability rather than a learned behavior. Each type of language corresponds to specific automata, with Regular languages being the simplest and Recursively Enumerable languages being the most complex.

Uploaded by

h282y87ykk

The Chomsky Hierarchy

Who is Noam Chomsky Anyway?
▪ Philosopher of Languages
▪ Professor of Linguistics at MIT
▪ Constructed the idea that language was
not a learned “behavior”, but that it was
cognitive and innate; versus stimulus-
response driven
▪ In an effort to explain these theories, he
developed the Chomsky Hierarchy
Chomsky Hierarchy
• Comprises four types of languages and
their associated grammars and machines.
Type 3: Regular Languages
Type 2: Context-Free Languages
Type 1: Context-Sensitive Languages
Type 0: Recursively Enumerable Languages
• These languages form a strict hierarchy
Chomsky Hierarchy

Language                          Grammar                                         Machine                                                   Example

Regular language                  Regular grammar (right-linear or left-linear)   Deterministic or nondeterministic finite-state acceptor   a*
Context-free language             Context-free grammar                            Nondeterministic pushdown automaton                       a^n b^n
Context-sensitive language        Context-sensitive grammar                       Linear-bounded automaton                                  a^n b^n c^n
Recursively enumerable language   Unrestricted grammar                            Turing machine                                            Any computable function
The Chomsky Hierarchy

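[Figure: nested language classes, innermost to outermost: Regular, Context-free, Context-sensitive, decidable, Turing-Acceptable. Everything outside the outermost class is Non Turing-Acceptable.]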
9.7: Chomsky Hierarchy
Turing Machine
Turing Machine (II)
Unrestricted grammar
Recognized by Turing machine
It consists of a read-write head that can be
positioned anywhere along an infinite tape.
It is not a useful class of language for
compiler design.
Linear-Bounded Automata
Linear-Bounded Automata
Context-sensitive
Restrictions
Left-hand side of each production must have at least
one nonterminal in it
Right-hand side must not have fewer symbols
than the left
There can be no empty productions (N → λ)
Push-Down Automata
Push-Down Automata (II)
Context-free
Recognized by push-down automata
Can only read its input tape but has a stack that can grow to
arbitrary depth where it can save information
An automaton with a read-only tape and two independent
stacks is equivalent to a Turing machine.
A context-free grammar allows exactly one nonterminal (and no terminal) on
the left-hand side of each production.
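As a rough illustration of how a stack lets a machine save information while it only reads its input, here is a minimal sketch of a recognizer for a^n b^n. The function name `accepts_anbn` is hypothetical, the sketch assumes n ≥ 0, and it is deterministic, which happens to be enough for this particular language.

```python
def accepts_anbn(word):
    """Stack-based recognizer for a^n b^n (n >= 0 is an assumption of this sketch)."""
    stack = []
    seen_b = False
    for symbol in word:
        if symbol == "a" and not seen_b:
            stack.append("A")   # remember one marker per a
        elif symbol == "b" and stack:
            seen_b = True
            stack.pop()         # match this b against one remembered a
        else:
            return False        # wrong symbol, wrong order, or too many b's
    return not stack            # accept only if every a was matched


print(accepts_anbn("aabb"))  # True
print(accepts_anbn("aab"))   # False
```

The stack depth grows with the number of a's read, which is exactly what a finite number of states cannot track.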
Finite-State Automata
Finite State Automata (II)
Regular language
Anything that must be remembered about
the context of a symbol on the input tape
must be preserved in the state of the
machine.
A regular grammar allows only one symbol (a nonterminal) on
the left-hand side, and only one or two symbols
on the right.
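The point that the state is the machine's only memory can be sketched as a small table-driven acceptor. The names `accepts` and `A_STAR`, the state name `q0`, and the choice of alphabet {a, b} are assumptions made for illustration.

```python
def accepts(transitions, start, accepting, word):
    """Run a deterministic finite-state acceptor on word."""
    state = start
    for symbol in word:
        # The current state is all the machine remembers about the input so far.
        state = transitions.get((state, symbol))
        if state is None:       # no transition defined: reject
            return False
    return state in accepting


# Hypothetical acceptor for a* over the alphabet {a, b}: one state, one loop.
A_STAR = {("q0", "a"): "q0"}

print(accepts(A_STAR, "q0", {"q0"}, "aaa"))  # True
print(accepts(A_STAR, "q0", {"q0"}, "aab"))  # False
```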
Linear-Bounded Automata:

Same as Turing Machines with one difference:
the tape space occupied by the input string is the only tape space the machine is allowed to use
Linear Bounded Automaton (LBA)

Input string, held between a left-end marker [ and a right-end marker ]:

[ a b c d e ]

The cells between the markers are the only working space in the tape.
All computation is done between end markers
We define LBA’s as NonDeterministic

Open Problem: do NonDeterministic LBA’s have the same power as Deterministic LBA’s?
Example languages accepted by LBAs:

L = {a^n b^n c^n}
L = {a^(n!)}

LBA’s have more power than PDA’s (pushdown automata)

LBA’s have less power than Turing Machines
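To make the "no extra tape" idea concrete, the following sketch recognizes a^n b^n c^n while writing only into the cells that hold the input. It is an ordinary Python function rather than a formal automaton; the name `lba_accepts` and the crossing-off symbol `x` are assumptions, and the sketch takes n ≥ 1.

```python
def lba_accepts(word):
    """Recognize a^n b^n c^n (n >= 1) using only the input cells as workspace."""
    tape = list(word)                # the working space is exactly the input
    # First scan: symbols must appear in the order a...a b...b c...c.
    order = "abc"
    phase = 0
    for symbol in tape:
        if symbol not in order:
            return False
        index = order.index(symbol)
        if index < phase:
            return False             # e.g. an a after a b
        phase = index
    # Repeated scans: cross off one a, one b and one c per round, in place.
    while True:
        crossed = 0
        for target in order:
            for i, symbol in enumerate(tape):
                if symbol == target:
                    tape[i] = "x"    # overwrite the cell, no new cells needed
                    crossed += 1
                    break
        if crossed == 0:
            return len(tape) > 0     # everything matched; reject the empty input
        if crossed != 3:
            return False             # the counts of a, b and c differ


print(lba_accepts("aabbcc"))  # True
print(lba_accepts("aabbc"))   # False
```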

Unrestricted Grammars:

Productions
u → v

where u and v are each a string of variables and terminals
Example unrestricted grammar:

S → aBc
aB → cA
Ac → d
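One way to see this grammar at work is to treat each production as a string rewrite. The sketch below is an assumption-laden illustration: `RULES` encodes the three productions as pairs, `successors` is a hypothetical helper, and upper-case letters stand for variables.

```python
RULES = [("S", "aBc"), ("aB", "cA"), ("Ac", "d")]


def successors(form, rules):
    """All strings obtainable from form by applying one rule at one position."""
    result = set()
    for lhs, rhs in rules:
        start = form.find(lhs)
        while start != -1:
            result.add(form[:start] + rhs + form[start + len(lhs):])
            start = form.find(lhs, start + 1)
    return result


form = "S"
derivation = [form]
while successors(form, RULES):
    # This particular grammar offers exactly one choice at every step.
    (form,) = successors(form, RULES)
    derivation.append(form)

print(" => ".join(derivation))  # S => aBc => cAc => cd
```

The last step replaces the two symbols Ac by the single symbol d, so the string gets shorter, something only an unrestricted grammar permits.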

Theorem:
A language L is Turing-Acceptable
if and only if L is generated by an
unrestricted grammar

Context-Sensitive Grammars:

Productions
u → v

where u and v are each a string of variables and terminals,
and: |u| ≤ |v|
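The length condition is easy to check mechanically. In this sketch, productions are written as (u, v) pairs and `is_noncontracting` is a hypothetical helper; the two rule sets are transcribed from the example unrestricted grammar and from the grammar for {a^n b^n c^n} in these slides.

```python
def is_noncontracting(rules):
    """True if every production u -> v satisfies |u| <= |v|."""
    return all(len(u) <= len(v) for u, v in rules)


UNRESTRICTED = [("S", "aBc"), ("aB", "cA"), ("Ac", "d")]
CONTEXT_SENSITIVE = [
    ("S", "abc"), ("S", "aAbc"),
    ("Ab", "bA"), ("Ac", "Bbcc"), ("bB", "Bb"),
    ("aB", "aa"), ("aB", "aaA"),
]

print(is_noncontracting(UNRESTRICTED))       # False: Ac -> d shrinks the string
print(is_noncontracting(CONTEXT_SENSITIVE))  # True
```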


The language {a^n b^n c^n}
is context-sensitive:

S → abc | aAbc
Ab → bA
Ac → Bbcc
bB → Bb
aB → aa | aaA
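Because no production of this grammar shortens a string, every sentential form longer than a chosen bound can be discarded, which makes an exhaustive search finite. The sketch below uses that idea to list the words of the grammar up to a given length. The name `words_up_to` is hypothetical, and upper-case letters are assumed to be the variables.

```python
from collections import deque

RULES = [
    ("S", "abc"), ("S", "aAbc"),
    ("Ab", "bA"), ("Ac", "Bbcc"), ("bB", "Bb"),
    ("aB", "aa"), ("aB", "aaA"),
]


def words_up_to(max_len, rules, start="S"):
    """Breadth-first search over sentential forms of length <= max_len."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if form.islower():           # only terminals left: a word of the language
            words.add(form)
            continue
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:
                new = form[:pos] + rhs + form[pos + len(lhs):]
                # Pruning is safe because |u| <= |v| for every production.
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return sorted(words, key=len)


print(words_up_to(9, RULES))  # ['abc', 'aabbcc', 'aaabbbccc']
```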
Theorem:
A language L is context-sensitive
if and only if
it is accepted by a Linear-Bounded automaton

Observation:
Every context-sensitive language is decidable, but
there is a decidable language which is not context-sensitive
Intro to Languages
English grammar tells us if a given combination of words is a valid sentence.
The syntax of a sentence concerns its form while the semantics concerns its meaning.
e.g. the mouse wrote a poem

From a syntax point of view this is a valid sentence.

From a semantics point of view, not so fast… perhaps in Disneyland.

Natural languages (English, French, Portuguese, etc.) have very complex rules of syntax, and these rules are not necessarily well-defined.
Formal Language
Formal language – is specified by a well-defined set of rules of syntax

We describe the sentences of a formal language using a grammar.

Two key questions:

1 - Is a combination of words a valid sentence in a formal language?
2 - How can we generate the valid sentences of a formal language?

Formal languages provide models for both natural languages and programming languages.
Grammars
A formal grammar G is any compact, precise
mathematical definition of a language L.
As opposed to just a raw listing of all of the language’s
legal sentences, or just examples of them.
A grammar implies an algorithm that would
generate all legal sentences of the language.
Often, it takes the form of a set of recursive
definitions.
A popular way to specify a grammar recursively is
to specify it as a phrase-structure grammar.
Grammars (Semi-formal)

Example: A grammar that generates a subset of the English language

sentence → noun_phrase predicate

noun_phrase → article noun

predicate → verb

article → a
article → the

noun → boy
noun → dog

verb → runs
verb → sleeps
A derivation of “the boy sleeps”:

sentence ⇒ noun_phrase predicate
         ⇒ noun_phrase verb
         ⇒ article noun verb
         ⇒ the noun verb
         ⇒ the boy verb
         ⇒ the boy sleeps
A derivation of “a dog runs”:

sentence ⇒ noun_phrase predicate
         ⇒ noun_phrase verb
         ⇒ article noun verb
         ⇒ a noun verb
         ⇒ a dog verb
         ⇒ a dog runs
Language of the grammar:
L = { “a boy runs”,
“a boy sleeps”,
“the boy runs”,
“the boy sleeps”,
“a dog runs”,
“a dog sleeps”,
“the dog runs”,
“the dog sleeps” }
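The eight sentences can be reproduced by expanding every variable in every possible way. The sketch below assumes a dictionary encoding of the productions (`GRAMMAR`) and a hypothetical helper `expand`. It terminates only because this grammar has no recursive productions.

```python
from itertools import product

# Each variable maps to its alternative right-hand sides.
GRAMMAR = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["boy"], ["dog"]],
    "verb": [["runs"], ["sleeps"]],
}


def expand(symbol):
    """Return every sequence of terminal words derivable from symbol."""
    if symbol not in GRAMMAR:        # a terminal stands for itself
        return [[symbol]]
    results = []
    for body in GRAMMAR[symbol]:
        # Combine every expansion of each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in body)):
            results.append([word for part in parts for word in part])
    return results


sentences = [" ".join(words) for words in expand("sentence")]
print(len(sentences))  # 8
print(sentences[0])    # a boy runs
```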

Notation

noun → boy
noun → dog

Each line is a production rule. noun is a variable (or non-terminal); boy and dog are terminals. Variables and terminals alike are symbols of the vocabulary.
Basic Terminology
► A vocabulary/alphabet, V is a finite nonempty set of
elements called symbols.
Example: V = {a, b, c, A, B, C, S}
► A word/sentence over V is a string of finite length of
elements of V.
Example: Aba
► The empty/null string, λ is the string with no symbols.
► V* is the set of all words over V.
Example: V* = {λ, Aba, BBa, bAA, cab, …}
► A language over V is a subset of V*.
We can give some criteria for a word to be in a
language.
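These definitions can be tried out directly: V* can be enumerated shortest word first, and a language is whatever part of V* passes some criterion. The sketch assumes a small two-symbol vocabulary rather than the seven-symbol example above, and both `kleene_star` and the "equally many a's and b's" criterion are hypothetical choices for illustration.

```python
from itertools import count, islice, product


def kleene_star(vocabulary):
    """Yield every word over vocabulary, shortest first, starting with the empty string."""
    for length in count(0):
        for symbols in product(vocabulary, repeat=length):
            yield "".join(symbols)


V = ["a", "b"]                       # assumed vocabulary for this sketch
print(list(islice(kleene_star(V), 7)))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']

# A language over V: the words of V* with equally many a's and b's.
balanced = (w for w in kleene_star(V) if w.count("a") == w.count("b"))
print(list(islice(balanced, 4)))
# ['', 'ab', 'ba', 'aabb']
```

V* is infinite, so the generator never finishes on its own; `islice` takes only a finite prefix.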
Context-Sensitive Languages

The language { a^n b^n c^n | n ≥ 1 } is context-sensitive but not context free.
A grammar for this language is given by:
S → aSBC | aBC
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc
The left-hand sides aB, bB, bC and cC each pair a terminal and a non-terminal.
Context-Sensitive Languages
Example
A derivation from this grammar is:-
S ⇒ aSBC
  ⇒ aaBCBC (using S → aBC)
  ⇒ aabCBC (using aB → ab)
  ⇒ aabBCC (using CB → BC)
  ⇒ aabbCC (using bB → bb)
  ⇒ aabbcC (using bC → bc)
  ⇒ aabbcc (using cC → cc)
which derives a^2 b^2 c^2.
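A derivation like this one can be checked step by step: each line must follow from the previous one by a single application of a single production. In the sketch below the productions are encoded as pairs, and `one_step` is a hypothetical helper.

```python
RULES = [
    ("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
    ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]


def one_step(before, after, rules):
    """True if after results from before by one application of one rule."""
    for lhs, rhs in rules:
        pos = before.find(lhs)
        while pos != -1:
            if before[:pos] + rhs + before[pos + len(lhs):] == after:
                return True
            pos = before.find(lhs, pos + 1)
    return False


DERIVATION = ["S", "aSBC", "aaBCBC", "aabCBC", "aabBCC", "aabbCC", "aabbcC", "aabbcc"]

valid = all(one_step(x, y, RULES) for x, y in zip(DERIVATION, DERIVATION[1:]))
print(valid)  # True
```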
