LanguagesandGrammars Unit 3

Language

Uploaded by

Suman Sk

• UNIT 3
• FORMAL LANGUAGES and GRAMMARS
• CSE322
• Formal Languages and Automata Theory

Intro to Languages
• English grammar tells us whether a given combination of words is a valid sentence.
• The syntax of a sentence concerns its form, while the semantics concerns its meaning.
• e.g. the mouse wrote a poem
• From a syntax point of view, this is a valid sentence.
• From a semantics point of view, not so fast… perhaps in Disneyland.
• Natural languages (English, French, Portuguese, etc.) have very complex rules of syntax, and these rules are not necessarily well defined.

Formal Language

• Formal language – is specified by a well-defined set of rules of syntax.
• We describe the sentences of a formal language using a grammar.
• Two key questions:
• 1 – Is a combination of words a valid sentence in a formal language?
• 2 – How can we generate the valid sentences of a formal language?
• Formal languages provide models for both natural languages and programming languages.

Grammars
• A formal grammar G is any compact, precise
mathematical definition of a language L.
– As opposed to just a raw listing of all of the language’s
legal sentences, or just examples of them.
• A grammar implies an algorithm that would
generate all legal sentences of the language.
– Often, it takes the form of a set of recursive
definitions.
• A popular way to specify a grammar recursively is
to specify it as a phrase-structure grammar.
Grammars (Semi-formal)
Example: A grammar that generates a subset of the English language

sentence → noun_phrase predicate
noun_phrase → article noun
predicate → verb
article → a
article → the
noun → boy
noun → dog
verb → runs
verb → sleeps

• A derivation of “the boy sleeps”:

sentence ⇒ noun_phrase predicate
 ⇒ noun_phrase verb
 ⇒ article noun verb
 ⇒ the noun verb
 ⇒ the boy verb
 ⇒ the boy sleeps

• A derivation of “a dog runs”:

sentence ⇒ noun_phrase predicate
 ⇒ noun_phrase verb
 ⇒ article noun verb
 ⇒ a noun verb
 ⇒ a dog verb
 ⇒ a dog runs

• Language of the grammar:
L = { “a boy runs”,
“a boy sleeps”,
“the boy runs”,
“the boy sleeps”,
“a dog runs”,
“a dog sleeps”,
“the dog runs”,
“the dog sleeps” }
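
The eight sentences above can be reproduced mechanically. The following is a minimal sketch, assuming the productions are stored in a dictionary (the name RULES and the helper expand are illustrative, not part of the slides):

```python
from itertools import product

# Productions of the example grammar; this dictionary layout is an assumption.
RULES = {
    "sentence": [["noun_phrase", "predicate"]],
    "noun_phrase": [["article", "noun"]],
    "predicate": [["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["boy"], ["dog"]],
    "verb": [["runs"], ["sleeps"]],
}

def expand(symbol):
    """Return every sequence of terminal words derivable from symbol."""
    if symbol not in RULES:  # a terminal expands only to itself
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine every choice for each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

language = sorted(" ".join(words) for words in expand("sentence"))
print(len(language), language)
```

This only terminates because the grammar has no recursive rule; a recursive grammar needs a length bound.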

Notation

noun → boy
noun → dog

Each line is a production rule. The symbol on the left (noun) is a variable, or non-terminal; the symbol on the right (boy, dog) is a terminal. Both are symbols of the vocabulary.
Basic Terminology

► A vocabulary/alphabet, V, is a finite nonempty set of elements called symbols.
• Example: V = {a, b, c, A, B, C, S}
► A word/sentence over V is a string of finite length of elements of V.
• Example: Aba
► The empty/null string, λ, is the string with no symbols.
► V* is the set of all words over V.
• Example: V* = {λ, Aba, BBa, bAA, cab, …}
► A language over V is a subset of V*.
• We can give some criteria for a word to be in a language.
Phrase-Structure Grammars

• A phrase-structure grammar (abbr. PSG) G = (V,T,S,P) is a 4-tuple, in which:
– V is a set of nonterminals (represented by upper-case letters).
– T is a set of symbols called terminals.
– S is the start symbol.
• In our example the start symbol was “sentence”.
– P is a set of productions (to be defined).
• Rules for substituting one sentence fragment for another.
• Every production rule must contain at least one nonterminal on its left side.
Phrase-structure Grammar

► EXAMPLE:

• Let G = (V, T, S, P),
• where V = {A, B, S},
• T = {a, b},
• S is the start symbol,
• P = {S → ABa, A → BB, B → ab, A → Bb}.

G is a phrase-structure grammar.
What sentences can be generated with this grammar?
Derivation
• Definition
• Let G = (V,T,S,P) be a phrase-structure grammar.
• Let w0 = l z0 r (the concatenation of l, z0, and r) and w1 = l z1 r be strings over V ∪ T.
• If z0 → z1 is a production of G, we say that w1 is directly derivable from w0 and we write w0 ⇒ w1.
• If w0, w1, …, wn are strings over V ∪ T such that w0 ⇒ w1, w1 ⇒ w2, …, wn-1 ⇒ wn, then we say that wn is derivable from w0, and write w0 ⇒* wn.
• The sequence of steps used to obtain wn from w0 is called a derivation.
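
The "directly derivable" relation can be sketched as a small function. This is an illustration only: it assumes every symbol is a single character and that productions are given as (left side, right side) pairs; the names direct_derivations and P are hypothetical. The productions are those of the phrase-structure grammar example with S → ABa, A → BB, B → ab, A → Bb.

```python
def direct_derivations(w, productions):
    """Return every string w1 such that w => w1 in exactly one step."""
    results = set()
    for lhs, rhs in productions:
        start = w.find(lhs)
        while start != -1:
            # w = l + lhs + r becomes l + rhs + r
            results.add(w[:start] + rhs + w[start + len(lhs):])
            start = w.find(lhs, start + 1)
    return results

# Productions as (left side, right side) pairs.
P = [("S", "ABa"), ("A", "BB"), ("B", "ab"), ("A", "Bb")]

print(sorted(direct_derivations("ABa", P)))
```

Each occurrence of a left side gives one possible step, so one string can have several direct successors.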


Language
• Let G = (V,T,S,P) be a phrase-structure grammar.
• The language generated by G (or the language of G), denoted by L(G), is the set of all strings of terminals that are derivable from the start symbol S.

• L(G) = {w ∈ T* | S ⇒* w}

Language L(G)
► EXAMPLE:

• Let G = (V, T, S, P), where V = {A, S}, T = {a, b}, S is the start symbol, and P = {S → aA, S → b, A → aa}.

• The language of this grammar is given by L(G) = {b, aaa};

• we can derive aA from S using S → aA, and then derive aaa using A → aa.

• We can also derive b using S → b.
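
As a cross-check, L(G) can be collected by a breadth-first search over sentential forms. This is a sketch under stated assumptions: symbols are single characters, productions are (left, right) pairs, and the max_len cut-off (an illustrative parameter) is only safe for grammars whose rules never shorten a string.

```python
from collections import deque

def language(start, productions, terminals, max_len=10):
    """Collect the terminal strings derivable from start, up to max_len symbols."""
    seen, found = {start}, set()
    queue = deque([start])
    while queue:
        w = queue.popleft()
        if all(ch in terminals for ch in w):
            found.add(w)  # only terminals left: w is a sentence
            continue
        for lhs, rhs in productions:
            i = w.find(lhs)
            while i != -1:
                w1 = w[:i] + rhs + w[i + len(lhs):]
                if len(w1) <= max_len and w1 not in seen:
                    seen.add(w1)
                    queue.append(w1)
                i = w.find(lhs, i + 1)
    return found

P = [("S", "aA"), ("S", "b"), ("A", "aa")]
print(sorted(language("S", P, "ab")))
```

The same function applied to the productions S → ABa, A → BB, B → ab, A → Bb yields the two sentences abababa and abbaba.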


Another example
• Grammar: G = (V,T,S,P) with V = {S}, T = {a,b}, and P:
S → aSb
S → λ
• Derivation of sentence ab:
S ⇒ aSb ⇒ ab
(using S → aSb, then S → λ)
• Derivation of sentence aabb:
S ⇒ aSb ⇒ aaSbb ⇒ aabb
(using S → aSb twice, then S → λ)
• Other derivations:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ aaaabbbb
• So, what’s the language of the grammar with the productions S → aSb and S → λ?
• Language of the grammar with these productions:
L = {a^n b^n : n ≥ 0}

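
The pattern behind these derivations can be sketched in a few lines: apply S → aSb n times, then S → λ. The function names derive_anbn and in_language are illustrative, and the empty Python string stands in for λ.

```python
def derive_anbn(n):
    """Apply S -> aSb n times, then S -> lambda, recording each sentential form."""
    w = "S"
    steps = [w]
    for _ in range(n):
        w = w.replace("S", "aSb")
        steps.append(w)
    steps.append(w.replace("S", ""))  # S -> lambda
    return steps

def in_language(w):
    """Check membership in {a^n b^n : n >= 0} directly, without the grammar."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

print(" => ".join(derive_anbn(2)))
```

Every string produced by derive_anbn passes in_language, while a string such as abab does not.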
Another Example

• Let G = (V, T, S, P) with V = {A, B, S}, T = {a, b}, start symbol S, and
• P = {S → ABa, A → BB, B → ab, AB → b}.

• One possible derivation in this grammar is:
S ⇒ ABa ⇒ Aaba ⇒ BBaba ⇒ Bababa ⇒ abababa.
Types of Grammars -
Chomsky hierarchy of languages
• Venn diagram of grammar types (each class contains the classes listed below it):
Type 0 – Phrase-structure Grammars
Type 1 – Context-Sensitive
Type 2 – Context-Free
Type 3 – Regular
Defining the PSG (Phrase-Structure Grammar) Types and Rules for the Type of Productions in these Grammars

Here N denotes the set of nonterminals and T the set of terminals.

• Type 0: Phrase-structure grammars – no restrictions on the production rules.

• Type 1: Context-Sensitive PSG:
– All after fragments are either at least as long as the corresponding before fragments, or empty:
if b → a, then |b| ≤ |a| ∨ a = ε.

• Type 2: Context-Free PSG:
– All before fragments have length 1 and are nonterminals:
if b → a, then |b| = 1 (b ∈ N).

• Type 3: Regular PSGs:
– All before fragments have length 1 and are nonterminals.
– All after fragments are either single terminals, or a pair of a terminal followed by a nonterminal:
if b → a, then a ∈ T ∨ a ∈ TN.
Classifying grammars

• Given a grammar, we need to be able to find the smallest class to which it belongs. This can be determined by answering three questions:
• 1 – Are the left-hand sides of all of the productions single non-terminals?
• 2 – If yes, does each of the productions create at most one non-terminal, and is it on the right?
– Yes: regular. No: context-free.
• 3 – If not, can any of the rules reduce the length of a string of terminals and non-terminals?
– Yes: unrestricted. No: context-sensitive.
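
The three questions translate almost directly into code. The sketch below assumes single-character symbols and (left, right) production pairs; the function name classify is illustrative, and it follows the questions as stated on this slide, not a more refined textbook definition.

```python
def classify(productions, nonterminals):
    """Return the smallest class according to the three questions above."""
    def single_nonterminal(lhs):
        return len(lhs) == 1 and lhs in nonterminals

    def at_most_one_nonterminal_on_right(rhs):
        count = sum(ch in nonterminals for ch in rhs)
        return count == 0 or (count == 1 and rhs[-1] in nonterminals)

    # Question 1: are all left-hand sides single non-terminals?
    if all(single_nonterminal(lhs) for lhs, _ in productions):
        # Question 2: at most one non-terminal, and it is rightmost?
        if all(at_most_one_nonterminal_on_right(rhs) for _, rhs in productions):
            return "regular"
        return "context-free"
    # Question 3: can any rule shorten the string?
    if any(len(rhs) < len(lhs) for lhs, rhs in productions):
        return "unrestricted"
    return "context-sensitive"

print(classify([("S", "aSb"), ("S", "")], {"S"}))
```

For example, the productions S → ABa, A → BB, B → ab, AB → b come out as unrestricted, because AB → b shortens the string.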
Linear Grammars
Grammars with at most one variable on the right side of a production.

Examples:
S → aSb
S → λ

S → Ab
A → aAb
A → λ

A Non-Linear Grammar
Grammar G:
S → SS
S → λ
S → aSb
S → bSa

L(G) = {w : n_a(w) = n_b(w)}, where n_a(w) and n_b(w) are the numbers of a's and b's in w.

Another Linear Grammar

Grammar G:
S → A
A → aB | λ
B → Ab

L(G) = {a^n b^n : n ≥ 0}

Left-Linear Grammars
All productions have the form A → Bx or A → x, where x is a string of terminals.

Example:
S → Aab
A → Aab | B
B → a
Right-Linear Grammars
All productions have the form A → xB or A → x, where x is a string of terminals.

Example:
S → abS
S → a

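
The linear, left-linear, and right-linear conditions can be tested mechanically. A minimal sketch, assuming single-character symbols, (left, right) production pairs, and single non-terminals on every left side; the function name linear_kind is illustrative.

```python
def linear_kind(productions, nonterminals):
    """Report which of the linear forms every production satisfies."""
    def count(rhs):
        return sum(ch in nonterminals for ch in rhs)

    # Linear: at most one variable on every right side.
    linear = all(count(rhs) <= 1 for _, rhs in productions)
    # Right-linear: the variable, if any, is the last symbol (A -> xB).
    right = linear and all(
        count(rhs) == 0 or rhs[-1] in nonterminals for _, rhs in productions)
    # Left-linear: the variable, if any, is the first symbol (A -> Bx).
    left = linear and all(
        count(rhs) == 0 or rhs[0] in nonterminals for _, rhs in productions)
    return {"linear": linear, "left_linear": left, "right_linear": right}

print(linear_kind([("S", "abS"), ("S", "a")], {"S"}))
```

The grammar S → A, A → aB | λ, B → Ab is linear but neither left-linear nor right-linear, since it mixes both forms.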
Definition: Context-Free Grammars
Grammar G = (V, T, S, P)
– V: vocabulary
– T: terminal symbols
– S: start variable
– P: productions of the form A → x, where A is a non-terminal and x is a string of variables and terminals
Derivation Tree of A Context-free Grammar

► Represents the language using an ordered rooted tree.
► Root represents the starting symbol.
► Internal vertices represent the nonterminal symbols that arise in the production.
► Leaves represent the terminal symbols.
► If the production A → w arises in the derivation, where w is a word, the vertex that represents A has as children vertices that represent each symbol in w, in order from left to right.
Language Generated by a Grammar
• Example: Let G = ({S,A},{a,b}, S, {S → aA, S → b, A → aa}). What is L(G)?
• Easy: We can just draw a tree of all possible derivations.
– We have: S ⇒ aA ⇒ aaa,
– and S ⇒ b.
• Answer: L = {aaa, b}.
• Tree of derivations: the root S has the children aA and b, and aA has the child aaa. This is an example of a derivation tree, or parse tree, or sentence diagram.
Example: Derivation Tree

► Let G be a context-free grammar with the productions P = {S → aAB, A → Bba, B → bB, B → c}. The word w = acbabc can be derived from S as follows:
S ⇒ aAB ⇒ a(Bba)B ⇒ acbaB ⇒ acba(bB) ⇒ acbabc
Thus, the derivation tree is given as follows:

        S
      / | \
     a  A    B
      / | \  | \
     B  b  a b  B
     |          |
     c          c
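
The same tree can be written down as nested data and checked. A minimal sketch, assuming each vertex is a (symbol, children) pair with an empty child list for leaves; the names tree, tree_yield, and productions_used are illustrative.

```python
# Derivation tree for w = acbabc; leaves have an empty list of children.
tree = ("S", [
    ("a", []),
    ("A", [("B", [("c", [])]), ("b", []), ("a", [])]),
    ("B", [("b", []), ("B", [("c", [])])]),
])

def tree_yield(node):
    """Concatenate the leaves from left to right."""
    symbol, children = node
    if not children:
        return symbol
    return "".join(tree_yield(child) for child in children)

def productions_used(node):
    """List the productions applied at internal vertices, in preorder."""
    symbol, children = node
    if not children:
        return []
    used = [(symbol, "".join(child[0] for child in children))]
    for child in children:
        used.extend(productions_used(child))
    return used

print(tree_yield(tree))
print(productions_used(tree))
```

Reading the leaves left to right gives the derived word, and every internal vertex corresponds to one production of P.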
Generating Infinite Languages
• A simple PSG can easily generate an infinite language.
• Example: S → 11S, S → 0 (T = {0,1}).
• The derivations are:
– S ⇒ 0
– S ⇒ 11S ⇒ 110
– S ⇒ 11S ⇒ 1111S ⇒ 11110
– and so on…
• L = {(11)*0} – the set of all strings consisting of some number of concatenations of 11 with itself, followed by 0.
Another example
• Construct a PSG that generates the language L = {0^n 1^n | n ∈ N}.
– 0 and 1 here represent symbols being concatenated n times, not integers being raised to the nth power.
• Solution strategy: Each step of the derivation should preserve the invariant that the number of 0’s = the number of 1’s in the template so far, and all 0’s come before all 1’s.
• Solution: S → 0S1, S → λ.
• Context-Sensitive Languages

• The language {a^n b^n c^n | n ≥ 1} is context-sensitive but not context-free.
• A grammar for this language is given by:
• S → aSBC | aBC
• CB → BC
• aB → ab
• bB → bb
• bC → bc
• cC → cc
• Note that a left side may combine a terminal and a non-terminal, as in bB, bC, and cC.
• A derivation from this grammar is:
• S ⇒ aSBC
• ⇒ aaBCBC (using S → aBC)
• ⇒ aabCBC (using aB → ab)
• ⇒ aabBCC (using CB → BC)
• ⇒ aabbCC (using bB → bb)
• ⇒ aabbcC (using bC → bc)
• ⇒ aabbcc (using cC → cc)
• which derives a^2 b^2 c^2.
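
The derivation can be replayed step by step to confirm that each production applies where claimed. A minimal sketch, assuming single-character symbols and always rewriting the leftmost occurrence of the left side; apply_leftmost and steps are illustrative names.

```python
def apply_leftmost(w, lhs, rhs):
    """Rewrite the leftmost occurrence of lhs in w; fail if lhs is absent."""
    if lhs not in w:
        raise ValueError(f"{lhs} does not occur in {w}")
    return w.replace(lhs, rhs, 1)

# The productions used, in the order of the derivation above.
steps = [("S", "aSBC"), ("S", "aBC"), ("aB", "ab"), ("CB", "BC"),
         ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]

w = "S"
trace = [w]
for lhs, rhs in steps:
    w = apply_leftmost(w, lhs, rhs)
    trace.append(w)

print(" => ".join(trace))
```

The rule CB → BC is what moves each B to the left of the C's, so that the b's end up before the c's.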
• Language of Grammar

• The language of a grammar is the set of all strings that can be generated from that grammar.
• If the language consists of a finite number of strings, it is called a finite language.
• If the language consists of an infinite number of strings, it is called an infinite language.

Example-01:

Consider a grammar G = (V, T, P, S) where
V = {S}
T = {a, b}
P = {S → aSbS, S → bSaS, S → ε}
S is the start symbol.

This grammar generates the strings having an equal number of a’s and b’s.

So, the language of this grammar is

L(G) = {ε, ab, ba, aabb, bbaa, abab, baba, …}

This language consists of an infinite number of strings.
Therefore, the language of the grammar is infinite.

Consider a grammar G = (V, T, P, S) where
V = {S, A, B, C}
T = {a, b, c}
P = {S → ABC, A → a, B → b, C → c}
S is the start symbol.

This grammar generates only one string, “abc”.

So, the language of this grammar is

L(G) = {abc}
This language consists of a finite number of strings.
Therefore, the language of the grammar is finite.
Here, both grammars generate a unique language.
But given a language L(G) = {ab}, we have two different grammars generating that language.
This justifies the above concept.

Step by step, find the RLG and LLG for the regex 0*(1(0+1))*0*(1(0+1))*. At each step, the same color is used to match the part of the regex being translated with the corresponding part of the grammar.

Conversion from RG to FA

• Let G = (V, T, P, S) be a regular grammar, where


• V = {A0, A1, A2}
• T = {0, 1}
• S is the start symbol of the grammar.
• P is the set of production rules defined as:
• A0 -> 0A1
• A0 -> 1A2
• A1 -> 0A2
• A2 -> 0
• Construct a finite automaton that accepts the language generated by the given grammar G.
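
One common construction can be sketched as follows; it is an illustration, not necessarily the solution shown on the original slide. It assumes A0 is the start symbol, adds one extra final state (called F here, a hypothetical name), maps each rule A -> aB to a transition from A to B on a, and maps each rule A -> a to a transition from A to F on a.

```python
def grammar_to_nfa(productions, final="F"):
    """Build NFA transitions from right-linear rules A -> aB and A -> a."""
    delta = {}
    for lhs, (symbol, target) in productions:
        # A target of None encodes A -> a, which leads to the added final state.
        delta.setdefault((lhs, symbol), set()).add(target or final)
    return delta

def accepts(delta, start, word, final="F"):
    """Simulate the NFA on word and report whether it can end in the final state."""
    current = {start}
    for symbol in word:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return final in current

# The productions of G, written as (left side, (terminal, next non-terminal)).
P = [("A0", ("0", "A1")), ("A0", ("1", "A2")),
     ("A1", ("0", "A2")), ("A2", ("0", None))]

delta = grammar_to_nfa(P)
print([w for w in ["000", "10", "00", "100"] if accepts(delta, "A0", w)])
```

Under these assumptions the automaton accepts exactly the two strings 000 and 10, which are also the only strings derivable from A0.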
Conversion of RG from given FA

• Conversion from regular grammar to regular expression

If A -> aA | b, then the regular expression is a*.b

If A -> Aa | b, then the regular expression is b.a*

EX#1:

S -> 1A | 10B
A -> 0A | 1
B -> 11B | 0
Solution: A = (0)*1, B = (11)*0
S = 1(0)*1 + 10(11)*0
EX#2: S -> bB
B -> aB | bB | b
Solution: b.(a+b)*.b
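
Such answers can be sanity-checked by comparing, up to a length bound, the strings the grammar generates with the strings the regular expression matches. The sketch below assumes single-letter non-terminals, no ε-rules, and Python regex syntax (| for +); the names generate and matching are illustrative.

```python
import re
from itertools import product

def generate(productions, start, terminals, max_len):
    """Terminal strings of length <= max_len derivable from start."""
    found, seen, frontier = set(), {start}, [start]
    while frontier:
        w = frontier.pop()
        # Position of the first non-terminal, if any.
        i = next((k for k, ch in enumerate(w) if ch not in terminals), None)
        if i is None:
            found.add(w)
            continue
        for lhs, rhs in productions:
            if lhs == w[i]:
                w1 = w[:i] + rhs + w[i + 1:]
                # Without epsilon-rules a form never shrinks, so long forms can be dropped.
                if len(w1) <= max_len and w1 not in seen:
                    seen.add(w1)
                    frontier.append(w1)
    return found

def matching(pattern, alphabet, max_len):
    """All strings over alphabet of length <= max_len that fully match pattern."""
    return {"".join(t) for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)
            if re.fullmatch(pattern, "".join(t))}

EX1 = [("S", "1A"), ("S", "10B"), ("A", "0A"), ("A", "1"), ("B", "11B"), ("B", "0")]
print(generate(EX1, "S", "01", 7) == matching(r"10*1|10(11)*0", "01", 7))
```

A bounded comparison like this cannot prove equivalence, but a mismatch at small lengths quickly exposes a wrong expression.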
Conversion from R.E to R.G

Method 2: First convert the given R.E to an F.A, then convert the F.A to an R.G.

a*.b.(a+b)*
Solution is: S -> aS | bA
A -> aA | bA | epsilon
(where S = q0 in the FA and A = q1, the final state, in the FA)

Ex#2: (a+ba)* b

RLG: A -> aA | bB | bC
B -> aA
C -> epsilon

LLG: here the final state becomes the start symbol (the opposite of the RLG):
C -> Ab, B -> Ab, A -> Ba | Aa | epsilon

• Equivalence of 2 finite automata

1. Two finite automata (FA) are said to be equivalent if both automata accept the same set of strings over an input alphabet Σ.

2. When two FA’s are equivalent, then for every string x over Σ, if one FA reaches a final state on x, the other FA also reaches a final state.
• Method
• The method for comparing two FA’s is explained below:
• Let M and M1 be the two FA’s and Σ be the set of input symbols.
• Step 1 − Construct a transition table that has pairwise entries (q, q1), where q ∈ M and q1 ∈ M1, for each input symbol.
• Step 2 − If we get a pair in which one state is final and the other is non-final, then we terminate the construction of the transition table and declare that the two FA’s are not equivalent.
• Step 3 − The construction of the transition table terminates when no new pair appears in the transition table. If no mismatched pair was found, the two FA’s are equivalent.
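
The three steps can be sketched as a search over state pairs. This is an illustration under stated assumptions: both automata are complete DFAs given as (start, finals, delta) with delta[(state, symbol)] returning the next state, and the automata M, M1, and M2 below are hypothetical examples, not taken from the slides.

```python
from collections import deque

def equivalent(dfa1, dfa2, alphabet):
    """Compare two complete DFAs by building the table of reachable state pairs."""
    (s1, f1, d1), (s2, f2, d2) = dfa1, dfa2
    seen = {(s1, s2)}
    queue = deque([(s1, s2)])
    while queue:
        q1, q2 = queue.popleft()
        # Step 2: one state final and the other non-final means not equivalent.
        if (q1 in f1) != (q2 in f2):
            return False
        # Step 1: add the pair reached on each input symbol.
        for a in alphabet:
            pair = (d1[(q1, a)], d2[(q2, a)])
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    # Step 3: no new pair appears, so the table is complete.
    return True

# Hypothetical automata: M and M1 both accept strings with an even number of a's.
M = ("p0", {"p0"}, {("p0", "a"): "p1", ("p1", "a"): "p0",
                    ("p0", "b"): "p0", ("p1", "b"): "p1"})
M1 = ("q0", {"q0", "q2"}, {("q0", "a"): "q1", ("q1", "a"): "q2", ("q2", "a"): "q1",
                           ("q0", "b"): "q0", ("q1", "b"): "q1", ("q2", "b"): "q2"})
M2 = ("r0", {"r0"}, {("r0", "a"): "r0", ("r0", "b"): "r0"})  # accepts every string

print(equivalent(M, M1, "ab"), equivalent(M, M2, "ab"))
```

Only pairs reachable from the two start states are examined, which is why the table stays finite.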

If there is a grammar

G: N = {S, A, B}, T = {a, b}, P = {S → AB, A → a, B → b}

Find the language of the given grammar.

