0% found this document useful (0 votes)
46 views51 pages

Computation Theory: Expressions Languages Grammar

The document summarizes operations on formal languages including union, concatenation, and Kleene star. It defines regular expressions as a shorthand for describing regular languages in a similar way that arithmetic expressions describe numbers. Kleene's theorem states that the class of languages represented by regular expressions is equivalent to the class of languages accepted by finite automata, known as regular languages. The document provides examples of constructing a nondeterministic finite automaton equivalent to a given regular expression using the operations of union, concatenation, and Kleene star.

Uploaded by

Osama Alkhatib
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
46 views51 pages

Computation Theory: Expressions Languages Grammar

The document summarizes operations on formal languages including union, concatenation, and Kleene star. It defines regular expressions as a shorthand for describing regular languages in a similar way that arithmetic expressions describe numbers. Kleene's theorem states that the class of languages represented by regular expressions is equivalent to the class of languages accepted by finite automata, known as regular languages. The document provides examples of constructing a nondeterministic finite automaton equivalent to a given regular expression using the operations of union, concatenation, and Kleene star.

Uploaded by

Osama Alkhatib
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 51

Computation Theory

Chapter 3

Regular expressions , Regular languages and Regular Grammar

1
Chap3: Regular expressions, Regular languages and Regular Grammar
Operations on formal languages (1)
Purpose : We define operators that allow us to combine elementary
languages ​to create more complex languages.

Let L1={abc, cab}, L2={ab, ac}


Union: L=L 1 U L2={abc, cab, ab, ac}
Concatenation: L1={abc, cab}, L2={ab, ac}
L=L1L2={abcab, abcac, cabab, cabac}
Kleene Star: L1={abc, cab}
L1*={abc, cab, abcabc, abccab, cababc, cabcab,
abcabcabc, abcabccab,…}
Other operations: intersection, complement, difference
Chap3: Regular expressions, Regular languages and Regular Grammar 2
Operations on formal languages (2)
Let L1={abc, cab}, L2={ab, ac}
Union: L=L 1 U L2={abc, cab, ab, ac}

Concatenation: L1={abc, cab}, L2={ab, ac}


L=L1L2={abcab, abcac, cabab, cabac}

Kleene Star: L1={abc, cab}


L1*={abc, cab, abcabc, abccab, cababc, cabcab,
abcabcabc, abcabccab,…}
Other operations: intersection, complement, difference

Chap3: Regular expressions, Regular languages and Regular Grammar 3


Elementary Language
• An elementary language has length one

• Exemple : on  ={a, b} the elementary languages are ,


{λ}, {a}, and {b}

• An elementary language is a regular language

• A regular language is a language that can be built from


elementary languages, by using the three operations of
union, concatenation, and Kleene star.

Chap3: Regular expressions, Regular languages and Regular Grammar 4


Regular Languages : Another definition
Let L1= {a}, L2= {b}
▪ the union of two regular languages ({a, b}) is a regular
language
▪ the concatenation of two regular languages ({ab}) is a
regular language
▪ the kleene’s closure of a regular language (a*={a, aa,
aaa, aaaa, …}) is a regular language
• Ø is RL
• elementary language is RL
• L1  L2 is RL
• L1L2 is RL
• L1* is RL Chap3: Regular expressions, Regular languages and Regular Grammar 5
Regular expressions
▪ A useful shorthand for describing regular languages.
▪ Compare to arithmetic expressions, such as (x + 3)/2.
An arithmetic expression is constructed using arithmetic
operators, such as addition and division.
▪A regular expression is constructed using operations on
languages, such as concatenation, union, and Kleene star.
▪ The value of an arithmetic expression is a number.
▪ The value of a regular expression is a language.

Chap3: Regular expressions, Regular languages and Regular Grammar 6


Regular Languages correspond to
Regular Expressions

• RL = Ø RE is Ø (rule 1)
• RL = {} RE =  (rule 2)
• RL = {a} RE = a (rule 3)
• RL = L1  L2 RE = (r1 + r2) (rule 4)
• RL = L1 L2 RE = (r1r2) (rule 5)
• RL = L1* RE = (r1*) (rule 6)

Chap3: Regular expressions, Regular languages and Regular Grammar 7


L=(ab  b)* is it regular?
L={, ab, b, abab, abb, bb, ababab, ababb, abbb,…}

Language Rule Expression


RL is elementary language Rule 3 R1= a R2= b
RL = L1 L2 Rule 5 R3= ab
RL = L1  L2 Rule 4 R4= ab  b
RL = L1* Rule 6 R5= (ab  b)*

From R1 to R5 are all regular languages (regular expressions)

Chap3: Regular expressions, Regular languages and Regular Grammar 8


Hints for writing regular expressions
Assume  = {a, b, c}.

Zero or more a’s: a*


One or more a’s: aa*
Any string at all: (a + b + c)*
Any nonempty string: (a + b + c)(a + b + c)*
Any string that does not contain a: (b + c)*
Any string containing exactly one a: (b + c)*a(b + c)*
Chap3: Regular expressions, Regular languages and Regular Grammar 9
Examples of Regular Expressions
• a*+ b = {λ, a, b, aa, aaa, aaaa, aaaaa, … }
• a*ba* = {w  * : w has exactly one b}
• (a + b)* = any string of a’s and b’s
• (a + b)*aa (a + b)* = {w  * : w contains aa}
• (a + b)*aa (a + b)* + (a + b)*bb (a + b)* =
{w  * : w contains aa or bb}
• (a + λ)b* = {abn : n  0} + {bn : n  0}
• (b + c)*(λ + a)(b + c)*(λ + a)(b + c)*=
all strings containing no more than two a’s:
▪ As with arithmetic expressions, there is an order of precedence for operators The order is:
star closure first, then concatenation, then union.
Chap3: Regular expressions, Regular languages and Regular Grammar 10
Do these strings match the regular
expression?

Regular expression String

(01* + 1) 0101

(a + λ)b b

(ab)*a* λ

(a + b)(ab) bb
Chap3: Regular expressions, Regular languages and Regular Grammar 11
Kleene’s theorem
(1909-1994)

1) For any regular expression r that represents language


L(r), there is a finite automaton that accepts that same
language.

2) For any finite automaton M that accepts language L(M),


there is a regular expression that represents the same
language.

Therefore, the class of languages that can be represented by


regular expressions is equivalent to the class of languages
accepted by finite automata -- the regular languages.
Chap3: Regular expressions, Regular languages and Regular Grammar 12
Kleene’s theorem part 1

NDFA regular
expression

Kleene’s
DFA Theorem part 2

Chap3: Regular expressions, Regular languages and Regular Grammar 13


Kleene’s Theorem part1(1)
Regular expression NDFA

Let r be a regular expression. Then there exists some


nondeterministic regular accepter that accepts L(r).
Consequently, L(r) is a regular language.

Proof strategy: for any regular expression, we show how to


construct an equivalent NDFA.

Chap3: Regular expressions, Regular languages and Regular Grammar 14


Kleene’s Theorem part1(2)
Regular expression NDFA

Base step: Give a NDFA that accepts each of the


elementary languages :

:

{λ} :

{a} for each a   : a

Chap3: Regular expressions, Regular languages and Regular Grammar 15


Kleene’s Theorem part1(3)
Regular expression NDFA
Inductive step: For each of the operations – union,
concatenation and Kleene star – show how to construct an
accepting NDFA.
Closure under union:

M1
λ
λ

λ M2
λ

Chap3: Regular expressions, Regular languages and Regular Grammar 16


Kleene’s Theorem part1(4)
Regular expression NDFA
Closure under concatenation:
M1 M2
λ

Closure under Kleene Star:


λ

λ M1

Chap3: Regular expressions, Regular languages and Regular Grammar 17


Example: how to construct an equivalent
NDFA for R=(abUb)* (1)

1) A NDFA that accepts each of the elementary languages


b
a
R1= a R2 = b
q0 q1 q2 q3

2) Closure under concatenation

a b
λ
R3 = ab q0 q1 q2 q3

a b
R3 = ab q0 q2 q3

Chap3: Regular expressions, Regular languages and Regular Grammar 18


Example: how to construct an equivalent
NDFA for R=(abUb)* (2)

b
3) Closure under union: R3 = ab a
q0 q1 q2

b
R2 = b q3 q4

R4 = R3 U R2 a b
= ab U b λ q0 q1 q2 λ

q5
λ b q6
λ
q3 q4

a b
R4 = R3 U R2 q5 q1
= ab U b
b q6

Chap3: Regular expressions, Regular languages and Regular Grammar 19


Example: how to construct an equivalent
NDFA for R=(abUb)* (3)
4) Closure under Kleene Star:
R5 = (R4)* = (ab U b)* R4 = ab U b
a b
q0 q1
λ - transition elimination :
q3q0 q1 : q3 q1 : a λ
b q2
q3q0 q2: q3 q2 : b
λ a b
q3 q0 q1

b q2

a a
λ q3 q1
λ - transition elimination :
q2q3 q1 : q2 q1 : a b
q2q3 q2 : q2 q2 : b a b q2
b
q3 q1
b
b q2
R5 = (ab U b)*
L5 = {λ, ab, b, abab, abb, bab, bb,
ababab, ababb,…}

Chap3: Regular expressions, Regular languages and Regular Grammar 20


Kleene’s Theorem part2(1)
NDFA Regular expression
(1909-1994)

▪ Let L be a regular language. Then there exists a regular


expression r such that L = L(r).

▪ Any language accepted by a finite automaton can be


represented by a regular expression.

▪ The proof strategy: For any DFA, we show how create


an equivalent regular expression. In other words, we
describe an algorithm for converting any DFA to a regular
expression.
Chap3: Regular expressions, Regular languages and Regular Grammar 21
Kleene’s Theorem part2(2)
NDFA Regular expression

• An Expression diagram is a labeled directed graph


(similar to a finite state diagram) in which
transitions are labeled by regular expressions
• Has a single start state with no incoming transitions
• Has a single accepting state with no outgoing
transitions
• Example: ab

(a+b) a*

Chap3: Regular expressions, Regular languages and Regular Grammar 22


Kleene’s Theorem part2(3)
NDFA Regular Expression
▪ Algorithm for converting a DFA into Regular Expression
• Initial step:
1) Change every transition labeled a, b to (a + b).
2) Add a single start state with an outgoing λ-transition to the current
start state.
3) Add a single final state with incoming λ-transitions from every
previous final state.
• Main step: Until expression diagram has only two states (initial state
and final state), repeat the following:
4) Pick some non-start, non-final state.
5) Remove it from the diagram and re-label transitions with regular
expressions so that the same language is accepted.
Chap3: Regular expressions, Regular languages and Regular Grammar 23
Kleene’s Theorem part2(4)
NDFA Regular expression
▪ The key step is removing states and re-labeling transitions with
regular expressions. Here are some examples of how to do this.
b
a a ab*a

b ab*a
a
b ab*b
a
b
b ab*b
a
a ab*a
Chap3: Regular expressions, Regular languages and Regular Grammar 24
Kleene’s Theorem part2(5)
NDFA Regular expression
▪ The key step is removing states and re-labeling transitions with
regular expressions. Here are some examples of how to do this.
Remove state 2

4 pathes passing by state 2 :

Chap3: Regular expressions, Regular languages and Regular Grammar 25


Example: how to construct an equivalent
R. Expression for A NDFA (1)
a a,b a (a+b)

b λ b λ

(a+b)

a* b λ a*b (a+b)*

Chap3: Regular expressions, Regular languages and Regular Grammar 26


Applications of regular expressions
• Validation
– checking that an input string is in valid format
– example 1: checking format of email address on
WWW entry form
– example 2: UNIX regex command
• Search and selection
– looking for strings that match a certain pattern
– example: UNIX grep command
• Tokenization
– converting sequence of characters (a string) into sequence of
tokens (e.g., keywords, identifiers)
– used in lexical analysis phase of compiler
Chap3: Regular expressions, Regular languages and Regular Grammar 27
3 ways of specifying regular
languages

Regular expressions
describe

accept
DFA NDFA Regular languages

generate
Regular grammars

Chap3: Regular expressions, Regular languages and Regular Grammar 28


Grammar
• A grammar G = (V, T, S, P) consists of the
following quadruple:
– a set V of variables (non-terminal symbols),
including a starting symbol S
– a set T of terminals (same as an alphabet, )
– a start symbol S  V
– a set P of production rules
• Example:
S → aS | A
A→ bA | λ Chap3: Regular expressions, Regular languages and Regular Grammar 29
Derivation
• Strings are “derived” from a grammar
• Example of a derivation
S  aS  aaS  aaA  aabA  aab
• At each step, a nonterminal is replaced by
the sentential form on the right-hand side
of a rule (a sentential form can contain
nonterminals and/or terminals)
• Automata recognize languages; grammars
generate languages
Chap3: Regular expressions, Regular languages and Regular Grammar 30
Context-free grammar
• A grammar is said to be context-free if every
rule has a single non-terminal on the left-hand
side
• This means you can apply the rule in any
context. More complicated languages (such as
English) have context-dependent rules. A
language generated from a context-free
grammar is called a context-free language
Chap3: Regular expressions, Regular languages and Regular Grammar 31
Regular grammar
• A grammar is said to be right-linear if all
productions are of the form A→xB or A→x,
where A and B are variables and x is a string of
terminals
• A grammar is said to be left-linear if all
productions are of the form A→Bx or A→x
• A regular grammar is either right-linear or left-
linear. (λ can appear in RHS)
Chap3: Regular expressions, Regular languages and Regular Grammar 32
Linear grammar
• A grammar can be linear without being right-
or left-linear.
• A linear grammar is a grammar in which at
most one variable can occur on the right side of
any production rule, without any restriction on
the position of the variable.
• Example:
S → aS | A
A→ Ab | λ Chap3: Regular expressions, Regular languages and Regular Grammar 33
Another formalism for regular languages

• Every regular grammar generates a regular


language, and every regular language can be
generated by a regular grammar.
• A regular grammar is a simpler, special-
case of a context-free grammar
• The regular languages are a proper subset of
the context-free languages

Chap3: Regular expressions, Regular languages and Regular Grammar 34


Example 1
• The following grammar generates the
language on  = {a,b} consisting of all
strings with no more than three a’s

S → bS | aA | λ
A → bA | aB | λ
B → bB | aC | λ
C → bC | λ
Chap3: Regular expressions, Regular languages and Regular Grammar 35
Example 2 (1)
• Given a grammar, you should be able to say
what language it generates
• Use set notation to define the language
generated by the following grammar
S → aaSB | λ
B → bB | b

Chap3: Regular expressions, Regular languages and Regular Grammar 36


Example 2 (2)
S → aaSB | λ
B → bB | b
It helps to list some of the strings that can be formed:
S  aaSB  aaB  aab
S  aaSB  aaB  aabB  aabb
S  aaSB  aaB  aabB  aabbB  aabbb
S  aaSB  aaB  aabB  aabbB  aabbbB  aabbbb
S  aaSB  aaaaSBB  aaaaBB  aaaaBb  aaaabb
S  aaSB  aaaaSBB  aaaaBB  aaaaBbB  aaaaBbb 
aaaabbb
What is the pattern?
L = {(aa)nbnb*} Chap3: Regular expressions, Regular languages and Regular Grammar 37
Example 3 (1)
• Given a language, you should be able give a
grammar that generates it.
• For example, give a regular (right-linear)
grammar for the language consisting of all
strings over {a, b, c} that begin with a,
contain exactly two b’s, and end with cc.

Chap3: Regular expressions, Regular languages and Regular Grammar 38


Example 3 (2)
• Give a regular (right-linear) grammar for
the language consisting of all strings over
{a, b, c} that begin with a, contain exactly
two b’s, and end with cc
S → aA
A → bB | aA | cA
B → bC | aB | cB
C → aC | cC | cD
D→c Chap3: Regular expressions, Regular languages and Regular Grammar 39
Theorem 1(1)
G NDFA

▪ Every language generated by a right-


linear grammar is regular.
▪ Proof:
– Specify a procedure for automatically
constructing an NDFA that mimics the
derivations of a right-linear grammar.

Chap3: Regular expressions, Regular languages and Regular Grammar 40


Theorem 1(2)
G NDFA
▪ Justification:
• The sentential forms produced by a right linear grammar
have exactly one variable, which occurs as the rightmost
symbol.

• Assume that our grammar has a production rule


D → dE and that, during the derivation of a string, there is a
step wcD  wcdE

• We can construct an NDFA which has states D and E, and


an arc labeled d from D to E.

• NDFAs can be converted to DFAs.

• All languages accepted by DFAs are regular.


Chap3: Regular expressions, Regular languages and Regular Grammar 41
Theorem 1(3)
G NDFA

▪ Construction:
• For each variable Vi in the grammar there will be a state in
the automaton labeled Vi.

• The initial state of the automaton will be labeled V0 and


will correspond to the S variable in the grammar.

• For each production rule Vi → a1a2…amVj the automaton


will have transitions such that δ*(Vi → a1a2…am) = Vj
• For each production rule Vi → a1a2…am the automaton will
have transitions such that δ*(Vi → a1a2…am) = Vfinal

Chap3: Regular expressions, Regular languages and Regular Grammar 42


Theorem 1(4)
G NDFA

Construct an NDFA that accepts the language


generated by the grammar:
S → aA convert to: V0 → aV1
A →abS | b V1 → abV0 | b

a b
V0 V1 Vf

b a

Chap3: Regular expressions, Regular languages and Regular Grammar 43


Theorem 2(1)
DFA G

• Every regular language can be generated


by a right-linear grammar.
• Proof:
Specify a procedure for automatically
constructing a right-linear grammar from
the DFA.

Chap3: Regular expressions, Regular languages and Regular Grammar 44


Theorem 2(2)
DFA G

• Given a regular language L, let M = (Q, , δ,


q0, F) be a DFA that accepts L. Let Q = {q0,
q1, …, qn} and  = {a1, a2, …, am}.
• Construct the grammar G = (V, T, S, P) with:
V = {q0, q1, …, qn}
T = {a1, a2, …, am}
S = q0.
P = {} initially.
• P, the set of production rules, is constructed
as follows:
Chap3: Regular expressions, Regular languages and Regular Grammar 45
Theorem 2(3)
DFA G

• For each transition


δ(qi, aj) = qk
in the transition table of M, add to P the
production:
qi → ajqk
• If qk is in F, then add to P the production:
qk → λ

Chap3: Regular expressions, Regular languages and Regular Grammar 46


Theorem 2(4)
DFA G

• Example: Construct a right-linear grammar for


the language L = L(aab*a)
• First, build an NDFA for L:

a a a
q0 q1 q2 qf

Chap3: Regular expressions, Regular languages and Regular Grammar 47


Theorem 2(5)
DFA G

q0 a q1 a q2 a qf

P = {} initially.
Add to P a rule for each transition in the NDFA:
q0 → aq1
q1 → aq2
q2 → bq2
q2 → aqf
Since qf is in F, add to P the production:
qf → λ Chap3: Regular expressions, Regular languages and Regular Grammar 48
Theorem 2(6)
DFA G

q0 a q1 a q2 a qf

Now P = You can convert to normal grammar notation:


{q0 → aq1 S → aA
q1 → aq2 A → aB
q2 → bq2 B → bB
q2 → aqf B → aC
qf → λ } C→λ

Chap3: Regular expressions, Regular languages and Regular Grammar 49


Theorem 3
▪ A language L is regular if and only if there exists a
regular grammar G such that L = L(G).

▪ Proof:
Combine our definition of regular grammars,
which includes the statement, “A regular grammar is
right-linear, with theorem 2.

Chap3: Regular expressions, Regular languages and Regular Grammar 50


Questions

Chap3: Regular expressions, Regular 51


languages and Regular Grammar

You might also like