Unit-III
FORMAL LANGUAGES and GRAMMARS
Introduction to Languages
• English grammar tells us if a given combination of words is a valid sentence.
• The syntax of a sentence concerns its form, while the semantics concerns its meaning.
• e.g. the mouse wrote a poem
• From a syntax point of view this is a valid sentence.
• From a semantic point of view it is questionable, except perhaps in Disneyland.
• Natural languages (English, French, Portuguese, etc.) have very complex rules of syntax, and those rules are not necessarily well defined.
Formal Language
• A formal language is specified by a well-defined set of rules of syntax.
• We describe the sentences of a formal language using a grammar.
• Two key questions:
• 1 - Is a combination of words a valid sentence in a formal language?
• 2 - How can we generate the valid sentences of a formal language?
• Formal languages provide models for both natural languages and
programming languages.
Grammars
• A formal grammar G is any compact, precise mathematical definition
of a language L.
• As opposed to just a raw listing of all of the language’s legal sentences, or just
examples of them.
• A grammar implies an algorithm that would generate all legal
sentences of the language.
• Often, it takes the form of a set of recursive definitions.
• A popular way to specify a grammar recursively is to specify it as a
phrase-structure grammar.
Grammars (Semi-formal)
• Example: A grammar that generates a subset of
the English language
sentence → noun_phrase predicate
noun_phrase → article noun
predicate → verb
article → a
article → the
noun → boy
noun → dog
verb → runs
verb → sleeps
• A derivation of “the boy sleeps”:
sentence ⇒ noun_phrase predicate
⇒ noun_phrase verb
⇒ article noun verb
⇒ the noun verb
⇒ the boy verb
⇒ the boy sleeps
• A derivation of “a dog runs”:
sentence ⇒ noun_phrase predicate
⇒ noun_phrase verb
⇒ article noun verb
⇒ a noun verb
⇒ a dog verb
⇒ a dog runs
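The derivations above can be reproduced mechanically. The following is a minimal sketch, assuming the grammar is stored as a Python dictionary; the names `PRODUCTIONS` and `leftmost_derivation` are illustrative and not part of the slides. Note that it always expands the leftmost variable first, so the intermediate steps appear in a different order than in the slide derivations, although the start and end are the same.

```python
# Productions of the toy English grammar (dictionary layout is an assumption).
PRODUCTIONS = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["boy"], ["dog"]],
    "verb": [["runs"], ["sleeps"]],
}


def leftmost_derivation(target):
    """Return the sentential forms of a leftmost derivation of target, or None."""
    goal = target.split()

    def expand(form):
        # Locate the leftmost variable in the current sentential form.
        for i, symbol in enumerate(form):
            if symbol in PRODUCTIONS:
                break
        else:
            # Only terminals remain: succeed if the form equals the target.
            return [form] if form == goal else None
        # Terminals before the leftmost variable must already match the target.
        if form[:i] != goal[:i]:
            return None
        # Try each production for that variable and backtrack on failure.
        for rhs in PRODUCTIONS[symbol]:
            rest = expand(form[:i] + rhs + form[i + 1:])
            if rest is not None:
                return [form] + rest
        return None

    return expand(["sentence"])


steps = leftmost_derivation("a dog runs")
for form in steps:
    print(" ".join(form))
```

The search terminates here because no variable of this grammar can reach itself again; a recursive grammar would need a bound on the length of the sentential forms.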
• Language of the grammar:
L = { “a boy runs”,
“a boy sleeps”,
“the boy runs”,
“the boy sleeps”,
“a dog runs”,
“a dog sleeps”,
“the dog runs”,
“the dog sleeps” }
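The eight sentences listed above can be generated directly from the productions. The sketch below assumes the same dictionary representation of the grammar; `GRAMMAR` and `sentences` are hypothetical names chosen for illustration. It relies on the grammar being non-recursive, so the language is finite.

```python
from itertools import product

# Toy grammar from the slides; any symbol that is not a key is a terminal.
GRAMMAR = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["boy"], ["dog"]],
    "verb": [["runs"], ["sleeps"]],
}


def sentences(symbol):
    """Return every terminal string derivable from symbol, as lists of words."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # a terminal derives only itself
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every possible expansion of each right-hand-side symbol.
        for parts in product(*(sentences(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results


language = {" ".join(words) for words in sentences("sentence")}
for sentence in sorted(language):
    print(sentence)
```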
Production rules example:
noun → boy
noun → dog
• noun is a variable (non-terminal symbol).
• boy and dog are terminals (symbols of the vocabulary).
• Each line, such as noun → boy, is a production rule.
Phrase-Structure Grammars
A phrase-structure grammar (abbr. PSG) G = (V, Σ, S, P) is a 4-tuple, in which:
• V is a set of variables (non-terminals).
• Σ is a set of symbols called terminals.
• S ∈ V is a special non-terminal, the start symbol.
• P is a set of productions (to be defined).
• Productions are rules for substituting one sentence fragment for another.
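The 4-tuple can be written down as a small data structure. This is a sketch under stated assumptions: the class name `Grammar`, its field names, and the checks that V and Σ are disjoint and that every left-hand side contains a variable are standard textbook conventions, not requirements stated on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    variables: frozenset   # V
    terminals: frozenset   # Σ
    start: str             # S
    productions: tuple     # P, as (lhs, rhs) pairs of symbol tuples

    def __post_init__(self):
        if self.start not in self.variables:
            raise ValueError("start symbol must belong to V")
        if self.variables & self.terminals:
            raise ValueError("V and Σ must be disjoint")
        symbols = self.variables | self.terminals
        for lhs, rhs in self.productions:
            # Assumed convention: a left-hand side contains at least one variable.
            if not any(s in self.variables for s in lhs):
                raise ValueError("left side needs at least one variable")
            if not all(s in symbols for s in lhs + rhs):
                raise ValueError("production uses an unknown symbol")


# The toy English grammar from the earlier slides as a 4-tuple.
toy = Grammar(
    variables=frozenset(
        {"sentence", "noun_phrase", "predicate", "article", "noun", "verb"}
    ),
    terminals=frozenset({"a", "the", "boy", "dog", "runs", "sleeps"}),
    start="sentence",
    productions=(
        (("sentence",), ("noun_phrase", "predicate")),
        (("noun_phrase",), ("article", "noun")),
        (("predicate",), ("verb",)),
        (("article",), ("a",)),
        (("article",), ("the",)),
        (("noun",), ("boy",)),
        (("noun",), ("dog",)),
        (("verb",), ("runs",)),
        (("verb",), ("sleeps",)),
    ),
)
```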
Grammars
• Used to generate sentences of a language and to determine
if a given sentence is in a language
• Formal languages, generated by grammars, provide models for programming languages (Java, C, etc.) as well as natural languages; this is important for constructing compilers.
Finding a language L(G) corresponding to given grammar G
Finding a grammar G corresponding to given language L(G)
CHOMSKY HIERARCHY/CLASSIFICATION OF GRAMMARS
Languages                      Machines
Type-0 (Unrestricted)          TM (Turing machine)
Type-1 (Context Sensitive)     LBA (linear bounded automaton)
Type-2 (Context Free)          PDA (pushdown automaton)
Type-3 (Regular)               FA (finite automaton)
Type-0 Grammar
Any phrase structure grammar without any restrictions.
A grammar in which the productions are of the following form:
φAψ → φαψ
where
A is a variable,
φ is the left context,
ψ is the right context,
φαψ is the replacement string.
Examples:
Type-1 Grammar (Context Sensitive or Context Dependent)
A grammar in which the productions are of the following form:
φAψ → φαψ
with the condition that
α ≠ ε (the empty string),
i.e. erasing A is not permitted.
Examples:
Type-2 Grammar (Context Free Grammars)
A grammar in which the productions are of the following form:
A → α
where
A ∈ V and α ∈ (V ∪ Σ)*
Type-3 Grammar (Regular Grammars)
A grammar in which the productions are of the following form:
A → a or A → aB
where
A, B ∈ V and a ∈ Σ
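The four production forms above can be checked mechanically. The following is a rough sketch, assuming every symbol is a single character, uppercase letters are variables, and everything else is a terminal; the function names are hypothetical. It tests the forms exactly as written on the slides, so an erasing production such as A → ε is reported as type 2, and a rule that is not in the context form φAψ → φαψ with non-empty α is reported as type 0.

```python
def is_variable(symbol):
    # Assumed convention: uppercase letters are variables.
    return symbol.isupper()


def production_type(lhs, rhs):
    """Return the most restrictive type (3, 2, 1 or 0) that one production fits."""
    if len(lhs) == 1 and is_variable(lhs):
        # Type 3: A -> a or A -> aB
        if len(rhs) == 1 and not is_variable(rhs):
            return 3
        if len(rhs) == 2 and not is_variable(rhs[0]) and is_variable(rhs[1]):
            return 3
        # Type 2: a single variable on the left, any string on the right.
        return 2
    # Type 1: lhs = φAψ and rhs = φαψ with α non-empty.
    for i, symbol in enumerate(lhs):
        if not is_variable(symbol):
            continue
        left, right = lhs[:i], lhs[i + 1:]
        if (len(rhs) > len(left) + len(right)
                and rhs.startswith(left) and rhs.endswith(right)):
            return 1
    # Anything else is unrestricted.
    return 0


def grammar_type(productions):
    """A grammar is only as restricted as its least restricted production."""
    return min(production_type(lhs, rhs) for lhs, rhs in productions)


print(grammar_type([("S", "aS"), ("S", "b")]))      # regular
print(grammar_type([("S", "aSb"), ("S", "ab")]))    # context free
print(grammar_type([("S", "aAb"), ("aA", "aab")]))  # context sensitive
print(grammar_type([("AB", "BA")]))                 # unrestricted
```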
Example: Construct a regular grammar for the following finite automaton.
Solution:
Example:
Recursively Enumerable Languages (RE) and Recursive Languages (REC)
Recursively Enumerable (RE) or Type-0 Languages
• RE languages, or type-0 languages, are generated by type-0 grammars.
• An RE language can be accepted (recognized) by a Turing machine: the machine enters a final state for every string in the language, but for a string that is not in the language it may either enter a rejecting state or loop forever.
• RE languages are also called Turing-recognizable languages.
Recursive Languages (REC)
• A recursive language can be decided by a Turing machine: the machine enters a final state for every string in the language and a rejecting state for every string that is not in the language. Every recursive language is also RE.
• e.g. L = {a^n b^n c^n | n ≥ 1} is recursive because we can construct a Turing machine that moves to a final state if the string is of the form a^n b^n c^n and to a non-final state otherwise. So the TM always halts in this case.
• REC languages are also called Turing-decidable languages.
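To illustrate what "always halts" means for this example, here is a sketch of a decider written as an ordinary Python function rather than an actual Turing machine. The crossing-off strategy imitates how such a machine is commonly described; the function name and the marker symbol "X" are assumptions made for illustration.

```python
def decide_anbncn(s):
    """Decide membership in {a^n b^n c^n | n >= 1}; always returns True or False."""
    tape = list(s)

    # Phase 1: the input must look like a...ab...bc...c, each block non-empty.
    i = 0
    for letter in "abc":
        start = i
        while i < len(tape) and tape[i] == letter:
            i += 1
        if i == start:
            return False  # a block is missing
    if i != len(tape):
        return False      # extra symbols after the c block

    # Phase 2: repeatedly cross off one a, one b and one c.
    while "a" in tape:
        for letter in "abc":
            if letter not in tape:
                return False  # ran out of b's or c's before the a's
            tape[tape.index(letter)] = "X"

    # All a's are crossed off; accept only if no b or c is left over.
    return "b" not in tape and "c" not in tape


for word in ["abc", "aabbcc", "aabbc", "abcabc", ""]:
    print(repr(word), decide_anbncn(word))
```

Each pass of the second loop crosses off one a, so the loop runs at most n times and the function halts on every input, which is the defining property of a decider.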
Right Linear and Left Linear Grammars
Right Linear Grammar
• In this type of grammar, every non-terminal (i.e. variable) on the right-hand side of a production is at the rightmost place, i.e. the right end.
Example-1:
A ⇢ a,
A ⇢ aB,
A ⇢ ε
where A and B are variables, a is a terminal, and ε is the empty string.
Example-2:
S ⇢ 00B | 11S
B ⇢ 0B | 1B | 0 | 1
where S and B are variables, and 0 and 1 are terminals.
Left Linear Grammar
• In this type of grammar, every non-terminal (i.e. variable) on the right-hand side of a production is at the leftmost place, i.e. the left end.
Example-1:
A ⇢ a,
A ⇢ Ba,
A ⇢ ε
where A and B are variables, a is a terminal, and ε is the empty string.
Example-2:
S ⇢ B00 | S11
B ⇢ B0 | B1 | 0 | 1
where S and B are variables, and 0 and 1 are terminals.
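Because the only variable in a right-linear production sits at the right end, a string can be checked from left to right while tracking just the current variable, much like a finite automaton tracks its state. The sketch below applies this idea to the right-linear Example-2; the dictionary layout and the names `RIGHT_LINEAR` and `accepts` are assumptions for illustration.

```python
# Right-linear Example-2: S -> 00B | 11S, B -> 0B | 1B | 0 | 1
RIGHT_LINEAR = {
    "S": ["00B", "11S"],
    "B": ["0B", "1B", "0", "1"],
}


def accepts(grammar, start, text):
    """Return True if the right-linear grammar derives text from start."""
    seen = set()
    stack = [(start, 0)]  # (current variable, characters consumed so far)
    while stack:
        variable, pos = stack.pop()
        if (variable, pos) in seen:
            continue
        seen.add((variable, pos))
        for rhs in grammar[variable]:
            # Split the right-hand side into terminals and an optional
            # trailing variable.
            if rhs and rhs[-1] in grammar:
                terminals, nxt = rhs[:-1], rhs[-1]
            else:
                terminals, nxt = rhs, None
            if not text.startswith(terminals, pos):
                continue
            end = pos + len(terminals)
            if nxt is None:
                if end == len(text):
                    return True  # derivation ends exactly at the end of text
            else:
                stack.append((nxt, end))
    return False


for word in ["000", "11001", "00", "1100"]:
    print(word, accepts(RIGHT_LINEAR, "S", word))
```

The `seen` set keeps the search finite even when a grammar contains productions such as A ⇢ B that consume no input.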