Languages, Automata and Grammars Lecture Notes

- Languages are sets of strings over an alphabet. Strings are finite sequences of characters from the alphabet. - Finite state automata are mathematical models that determine if a string is part of a language. They consist of a finite set of states, transitions between states, an initial state, accepting states, and an input alphabet. - The behavior of an automaton is defined by its state transition diagram, which is a directed graph showing the transitions between states based on input characters. Automata read input strings and change states according to the transitions, accepting the string if an accepting state is reached.

Uploaded by

Tinashe Nyamuona

LANGUAGES


Definitions

Consider a nonempty set A of symbols. A string (or word) w on the set A is a finite
sequence of its elements.

 For example, suppose A = {a, b, c}. Then the following sequences are strings on A:

u = ababb and v = accbaaa

When discussing strings/words on A, we frequently call A the alphabet, and its
elements are called characters.

 We will also abbreviate our notation and write a² for aa, a³ for aaa, and so on. Thus, for the
above words, u = abab² and v = ac²ba³.
 The empty string has no characters and is denoted by ε (Greek letter epsilon) or
by λ (Greek letter lambda).

The set of all strings/words on A is denoted by A* (read: “A star”).

 The length of a string/word u, written |u| or ℓ(u), is the number of elements in its
sequence of letters. For the above strings/words u and v, we have ℓ(u) = 5 and ℓ(v) = 7.
Also, ℓ(ε) = 0, where ε is the empty word.

Concatenation of Strings
 Consider two strings/words u and v on the alphabet A. The concatenation of u and
v, written uv, is the word obtained by writing down the letters of u followed by the
letters of v.
Example
For the strings u = ababb and v = accbaaa,
uv = ababbaccbaaa = abab²ac²ba³

 The concatenation operation for strings/words on an alphabet A is associative.

 The empty string/word ε is an identity element for the concatenation operation.

Adjoining the empty word before or after a word u does not change the word u; that is, εu = uε = u.

The operation is not commutative, e.g., uv ≠ vu for the above words u and v.
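
The three properties above can be checked on concrete words. The following is a minimal Python sketch, assuming words are modelled as ordinary strings and taking u = ababb and v = accbaaa (the words consistent with uv = ababbaccbaaa and the lengths 5 and 7 given earlier); the name EMPTY is introduced here only for illustration.

```python
# Words are modelled as Python strings; concatenation is the + operator.
u = "ababb"
v = "accbaaa"
EMPTY = ""  # stands for the empty word ε (illustrative name)

uv = u + v
vu = v + u

print(uv)                            # ababbaccbaaa
print(len(uv) == len(u) + len(v))    # True: lengths add under concatenation
print((u + v) + u == u + (v + u))    # True: associativity
print(EMPTY + u == u == u + EMPTY)   # True: ε is an identity element
print(uv == vu)                      # False: concatenation is not commutative
```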

Subwords and Initial Segments
 Consider any string/word u = a₁a₂ . . . aₙ on an alphabet A. Any sequence
w = aⱼaⱼ₊₁ . . . aₖ is called a subword/substring of u.
 In particular, the subword/substring w = a₁a₂ . . . aₖ, beginning with the first letter of u, is
called an initial segment of u.
 In other words, w is a subword of u if u = v₁wv₂ for some words v₁ and v₂, and w is an
initial segment of u if u = wv for some word v.
 Observe that ε and u are both subwords of u, since u = εu.
Example
Consider the word u = abca. The subwords and initial segments of u follow:
 Subwords: ε, a, b, c, ab, bc, ca, abc, bca, abca = u
 Initial segments: ε, a, ab, abc, abca = u
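
The example can be reproduced mechanically. This is a small Python sketch, assuming words are plain strings; the function names subwords and initial_segments are chosen here for illustration.

```python
def subwords(u):
    """All contiguous subwords of u, including the empty word ''."""
    n = len(u)
    return {u[j:k] for j in range(n + 1) for k in range(j, n + 1)}


def initial_segments(u):
    """All initial segments (prefixes) of u, shortest first."""
    return [u[:k] for k in range(len(u) + 1)]


u = "abca"
# Sort by length, then alphabetically, to mirror the listing in the example.
print(sorted(subwords(u), key=lambda w: (len(w), w)))
print(initial_segments(u))
```

The empty string '' plays the role of ε, so the first call yields the ten subwords listed above and the second yields the five initial segments.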
 Recall that A* denotes the set of all strings/words on an alphabet A.

A language over an alphabet A is a collection of words on A.

 Thus a language L is simply a subset of A*.

Example
Let A = {a, b}. The following are languages over A.
 L₁ = {a, ab, ab², . . .}
consists of all words beginning with an a and followed by zero or more b’s.
 L₂ = {aᵐbⁿ | m > 0, n > 0}
consists of all words beginning with one or more a’s followed by one or more b’s.
 L₃ = {aᵐbᵐ | m > 0}
consists of all words beginning with one or more a’s and followed by the same number of
b’s.
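
Since these languages are infinite, a program cannot list them, but it can decide membership. The sketch below follows the three verbal descriptions in the example (one a then zero or more b's; one or more a's then one or more b's; equally many a's and b's). The helper split_ab and the predicate names are assumptions made for illustration.

```python
def split_ab(w):
    """Return (m, n) if w consists of m a's followed by n b's, else None."""
    m = len(w) - len(w.lstrip("a"))   # number of leading a's
    rest = w[m:]
    if rest.strip("b") != "":          # anything other than b's remains
        return None
    return m, len(rest)


def in_first(w):
    """A single a followed by zero or more b's."""
    counts = split_ab(w)
    return counts is not None and counts[0] == 1


def in_second(w):
    """One or more a's followed by one or more b's."""
    counts = split_ab(w)
    return counts is not None and counts[0] > 0 and counts[1] > 0


def in_third(w):
    """One or more a's followed by the same number of b's."""
    counts = split_ab(w)
    return counts is not None and counts[0] > 0 and counts[0] == counts[1]


for w in ["a", "abb", "aab", "aabb", "aba", ""]:
    print(repr(w), in_first(w), in_second(w), in_third(w))
```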

Language Concatenation
 Suppose L and M are languages over an alphabet A. Then the “concatenation” of L and M,
denoted by LM, is the language defined as follows:
LM = {uv | u ∈ L, v ∈ M}

Example
If L₁ = { a, ba, bb } and L₂ = { aa, bb }, then
L₁L₂ = { aaa, abb, baaa, babb, bbaa, bbbb }

Language Exponentiation
 We can define what it means to “exponentiate” a language as follows:
L⁰ = {ε}
The set containing just the empty string.
This means any string formed by concatenating zero strings together is the empty
string.
Lⁿ⁺¹ = LⁿL
This means concatenating (n+1) strings together works by concatenating
n strings, then concatenating one more.
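
For finite languages, both operations can be computed directly. The following Python sketch assumes languages are finite sets of strings; concat and power are illustrative names, and the first language {a, ba, bb} is the one consistent with the product listed in the example above.

```python
def concat(L, M):
    """Language concatenation: every word of L followed by every word of M."""
    return {u + v for u in L for v in M}


def power(L, n):
    """L to the n-th power, built up from L^0 = {''} by repeated concatenation."""
    result = {""}                    # L^0 contains only the empty word
    for _ in range(n):
        result = concat(result, L)   # one more factor of L per step
    return result


L1 = {"a", "ba", "bb"}
L2 = {"aa", "bb"}
print(sorted(concat(L1, L2)))          # ['aaa', 'abb', 'baaa', 'babb', 'bbaa', 'bbbb']
print(sorted(power({"a", "bb"}, 2)))   # ['aa', 'abb', 'bba', 'bbbb']
```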

 An important operation on languages is the Kleene closure, which is defined as
L* = { w ∈ A* | ∃n ∈ ℕ. w ∈ Lⁿ }
 Mathematically: w ∈ L* if and only if ∃n ∈ ℕ. w ∈ Lⁿ
 Intuitively, L* contains all possible ways of concatenating zero or more strings in L together,
possibly with repetition.

Theorem on Kleene Closure of L :

Example
If L = { a, bb }, then L* = {
ε,
a, bb,
aa, abb, bba, bbbb,
aaa, aabb, abba, abbbb, bbaa, bbabb, bbbba, bbbbbb,
. . .
}
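
Because L* is infinite whenever L contains a nonempty word, a program can only enumerate it up to some bound. The sketch below, with the illustrative name kleene_upto, lists the words of L* of length at most max_len for a finite language L.

```python
def kleene_upto(L, max_len):
    """Words of the Kleene closure of L whose length is at most max_len."""
    result = {""}        # zero factors give the empty word
    frontier = {""}      # words found in the latest round
    while frontier:
        # Extend the newest words by one more factor from L, within the bound.
        frontier = {u + v for u in frontier for v in L
                    if len(u + v) <= max_len} - result
        result |= frontier
    return result


words = kleene_upto({"a", "bb"}, 3)
print(sorted(words, key=lambda w: (len(w), w)))
# ['', 'a', 'aa', 'bb', 'aaa', 'abb', 'bba']
```

The loop stops because only finitely many words fit under the length bound, so eventually no new word is produced.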
Summary
 Languages are sets of strings.
 Strings are sequences of characters.
 Characters are individual symbols.
 Alphabets are sets of characters.
Regular Expressions
 Regular expressions are a way of describing a language via a string representation.
 Regular expressions match strings in the language. They describe the general shape
of all strings in the language.
 They’re used extensively in software systems for string processing.
 Conceptually, regular expressions are strings describing how to assemble a larger
language out of smaller pieces.

Each of the following is a regular expression over an alphabet A.
 The symbol ε is a regular expression that represents the language {ε}.
 The symbol ∅ is a regular expression that represents the empty language ∅. It is also written as
the pair ( ) (the empty expression).
 For any a ∈ A, the symbol a is a regular expression for the language {a}.
 If r is a regular expression, r* is a regular expression for the Kleene closure of the
language of r.
 If r₁ and r₂ are regular expressions, r₁r₂ is a regular expression for the concatenation of
the languages of r₁ and r₂.
 If r₁ and r₂ are regular expressions, r₁ ∪ r₂ is a regular expression for the union of the
languages of r₁ and r₂.
 If r is a regular expression, (r) is a regular expression with the same meaning as r.

 All regular expressions are formed in this way.

 Observe that a regular expression r is a special kind of a word (string) which uses
the letters of A and the five symbols: ( ) * ∪ ε
 The operator precedence for regular expressions, from highest to lowest:
(r), r*, r₁r₂, r₁ ∪ r₂
 The language of a regular expression is the language described by that regular
expression.
The language over A defined by a regular expression r over A is as follows:
 L(ε) = {ε}
 L(∅) = ∅, the empty set.
 L(a) = {a}, where a is a letter in A.
 L(r*) = (L(r))* (the Kleene closure of L(r)).
 L(r₁ ∪ r₂) = L(r₁) ∪ L(r₂) (the union of the languages).
 L(r₁r₂) = L(r₁)L(r₂) (the concatenation of the languages).
 L((r)) = L(r)
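 Parentheses will be omitted from regular expressions when possible. Since concatenation and
union of languages are associative, many parentheses may be dropped. Others may be dropped by
the convention that * takes precedence over concatenation, and concatenation takes precedence over ∪.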

Let L be a language over A. If L is a regular language, then there is a regular expression
r over A such that L = L(r).
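
The recursive definition of L(r) translates almost line by line into a program. The following Python sketch is one possible encoding, not a standard one: regular expressions are nested tuples with the hypothetical tags "eps", "empty", "sym", "star", "cat" and "union", and the function language returns only the words of L(r) up to a length bound, since L(r) may be infinite.

```python
# Hypothetical encoding of regular expressions as nested tuples:
#   ("eps",)  ("empty",)  ("sym", "a")  ("star", r)  ("cat", r1, r2)  ("union", r1, r2)

def language(r, max_len):
    """Return the words of L(r) whose length is at most max_len."""
    kind = r[0]
    if kind == "empty":
        return set()
    if kind == "eps":
        return {""}
    if kind == "sym":
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "cat":
        left = language(r[1], max_len)
        right = language(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":
        base = language(r[1], max_len)
        result = {""}
        frontier = {""}
        while frontier:  # keep appending factors until nothing new fits
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind!r}")


a, b = ("sym", "a"), ("sym", "b")
r = ("cat", ("star", ("union", a, b)), ("cat", b, b))   # encodes (a ∪ b)*bb
print(sorted(language(r, 3)))   # ['abb', 'bb', 'bbb']
```

Parentheses need no tag of their own in this encoding because the tuple nesting already records the grouping.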

Example

Let A = {a, b}. Each of the following is an expression r and its corresponding language L(r):
(a) Let r = a*.
Then L(r) consists of all powers of a, including the empty word ε.
(b) Let r = aa*.
Then L(r) consists of all positive powers of a, excluding the empty word.
(c) Let r = a ∪ b*.
Then L(r) consists of a or any word in b, that is, L(r) = {a, ε, b, b², . . .}.
(d) Let r = (a ∪ b)*.
Note L(a ∪ b) = {a} ∪ {b} = A; hence L(r) = A*, all words over A.
(e) Let r = (a ∪ b)*bb.
Then L(r) consists of the concatenation of any word in A* with bb, that is, all words ending in b².
(f) Let r = a ∩ b*.
L(r) does not exist since r is not a regular expression. (Specifically, ∩ is not one of the symbols
used for regular expressions.)
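
Examples (a) to (e) can be tried out with Python's standard re module, which writes union as | and supports the same * and concatenation. This is only a sketch: Python's pattern syntax is much richer than the formal definition, and the helper name matches is an assumption. re.fullmatch is used so that the whole word, not just a part of it, must belong to the language.

```python
import re


def matches(pattern, word):
    """True if the whole word belongs to the language of the pattern."""
    return re.fullmatch(pattern, word) is not None


print(matches("a*", ""))           # True:  (a) includes the empty word
print(matches("aa*", ""))          # False: (b) excludes the empty word
print(matches("a|b*", "bbb"))      # True:  (c) any power of b
print(matches("a|b*", "ab"))       # False: (c) a alone, or b's alone
print(matches("(a|b)*", "abba"))   # True:  (d) every word over {a, b}
print(matches("(a|b)*bb", "abb"))  # True:  (e) words ending in bb
print(matches("(a|b)*bb", "ab"))   # False
```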
FINITE STATE AUTOMATA
 An automaton (plural: automata) is a mathematical model of a computing device.
 A finite automaton is a simple type of mathematical machine for determining
whether a string is contained within some language.

A finite state automaton (FSA) or, simply, an automaton M, consists of five parts:
 A finite set (alphabet) A of inputs.
 A finite set S of (internal) states.
 A subset Y of S (called accepting or “yes” states).
 An initial state s₀ in S.
 A next-state function F from S × A into S.

Such an automaton M is denoted by M = (A, S, Y, s₀, F)

 Some texts define the next-state function F : S × A → S by means of a collection of functions
fₐ : S → S, one for each a ∈ A. Setting F(s, a) = fₐ(s) shows that both definitions are equivalent.
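
The five parts map naturally onto a small record type. The sketch below is one way to write this in Python; the class name Automaton, its field names, and the two-state example machine (which tracks whether the number of b's read so far is even) are all hypothetical and are not taken from these notes.

```python
from dataclasses import dataclass


@dataclass
class Automaton:
    alphabet: frozenset      # A, the input symbols
    states: frozenset        # S, the internal states
    accepting: frozenset     # Y, a subset of S
    initial: str             # the initial state, an element of S
    next_state: dict         # F, stored as {(state, letter): state}

    def is_well_formed(self):
        """Check that F is defined on all of S x A and maps into S."""
        total = all((s, a) in self.next_state
                    for s in self.states for a in self.alphabet)
        closed = all(t in self.states for t in self.next_state.values())
        return (total and closed
                and self.initial in self.states
                and self.accepting <= self.states)

    def f(self, a):
        """The one-letter function for input a, sending s to F(s, a)."""
        return lambda s: self.next_state[(s, a)]


# Hypothetical example: accept words with an even number of b's.
M = Automaton(
    alphabet=frozenset("ab"),
    states=frozenset({"even", "odd"}),
    accepting=frozenset({"even"}),
    initial="even",
    next_state={("even", "a"): "even", ("even", "b"): "odd",
                ("odd", "a"): "odd", ("odd", "b"): "even"},
)
print(M.is_well_formed())   # True
print(M.f("b")("even"))     # odd
```

The method f illustrates the equivalence of the two definitions: each letter a yields a function on states that agrees with F(s, a).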

State Diagram of an Automaton M
 An automaton M is usually defined by means of its state diagram D = D(M) rather
than by listing its five parts. The state diagram D = D(M) is a labeled directed graph
as follows.
 The vertices of D(M) are the states in S and an accepting state is denoted by means of a
double circle.
 There is an arrow (directed edge) in D(M) from state sⱼ to state sₖ labeled by an input a
if F(sⱼ, a) = sₖ or, equivalently, if fₐ(sⱼ) = sₖ.
 The initial state s₀ is indicated by means of a special arrow which terminates at s₀ but has
no initial vertex.
 For each vertex sⱼ and each letter a in the alphabet A, there will be an arrow leaving
sⱼ which is labeled by a; hence the outdegree of each vertex is equal to the number of
elements in A. For notational convenience, we label a single arrow by all the inputs
which cause the same change of state rather than having an arrow for each such
input.
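
The rule for drawing the diagram can be sketched in code: group the entries of the next-state function by (source, target) pair so that each arrow carries all of its labels. The function names and the three-state table below are hypothetical, chosen so that one state has a single arrow labeled by both inputs.

```python
from collections import defaultdict


def diagram_edges(next_state):
    """Map each arrow (source, target) to the list of inputs labeling it."""
    edges = defaultdict(list)
    for (src, letter), dst in sorted(next_state.items()):
        edges[(src, dst)].append(letter)
    return dict(edges)


def outdegrees(next_state):
    """Count the labeled arrows leaving each state (one per input letter)."""
    out = defaultdict(int)
    for (src, _letter) in next_state:
        out[src] += 1
    return dict(out)


# Hypothetical next-state table over the inputs a and b.
F = {("q0", "a"): "q1", ("q0", "b"): "dead",
     ("q1", "a"): "q1", ("q1", "b"): "q1",
     ("dead", "a"): "dead", ("dead", "b"): "dead"}

for (src, dst), labels in diagram_edges(F).items():
    print(f"{src} --{','.join(labels)}--> {dst}")
print(outdegrees(F))
```

Every state has outdegree 2 here, the size of the input alphabet, even though merged labels make some states show fewer drawn arrows.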

Example

The following defines an automaton M with two input symbols and three states:
 A = {a, b}, input symbols.
 S = {s₀, s₁, s₂}, internal states.
 Y = {, }, “yes” states.
 s₀, initial state.
 Next-state function F : S × A → S defined explicitly by a list of functions or by a table.

[Table: next-state function F, with a row for each state and columns a and b]
Example

The state diagram D = D(M) of the automaton M above.

 Note that a single arrow in the diagram carries both labels a and b, since F sends its source
state to the same target state on either input.
Note also that the outdegree of each vertex is 2, the number of elements in A.

Language L(M) Determined by an Automaton M
 Each automaton M with input alphabet A defines a language over A, denoted by
L(M), as follows.
 Let w = a₁a₂ . . . aₙ be a string on A. Then w determines the following path in the state
diagram graph D(M), where s₀ is the initial state and F(sᵢ₋₁, aᵢ) = sᵢ for i ≥ 1:
P = (s₀, a₁, s₁, a₂, s₂, . . . , aₙ, sₙ)
 We say that M recognizes (or accepts) the string/word w if the final state sₙ is an accepting state
in Y.
 The language L(M) of M is the set of all strings on A which are accepted by M.
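
The path determined by a word, and the acceptance test, can be sketched as follows. The function names and the two-state machine (accepting words with an even number of b's) are hypothetical illustrations, not the automaton from these notes.

```python
def path(next_state, initial, word):
    """Return the alternating list of states and inputs traced by word."""
    p = [initial]
    state = initial
    for letter in word:
        state = next_state[(state, letter)]   # apply F to the current state
        p += [letter, state]
    return p


def accepts(next_state, initial, accepting, word):
    """A word is accepted when the final state of its path is accepting."""
    return path(next_state, initial, word)[-1] in accepting


# Hypothetical automaton: "even" is both the initial and the only accepting state.
F = {("even", "a"): "even", ("even", "b"): "odd",
     ("odd", "a"): "odd", ("odd", "b"): "even"}

print(path(F, "even", "abb"))
# ['even', 'a', 'even', 'b', 'odd', 'b', 'even']
print(accepts(F, "even", {"even"}, "abb"))   # True
print(accepts(F, "even", {"even"}, "ab"))    # False
print(accepts(F, "even", {"even"}, ""))      # True: the path is just the initial state
```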

Example
Determine whether or not the automaton M in the figure accepts each of three given strings/words,
the last of which is the empty word.

Path for the first string: P = (s₀, a, . . .)

The final state in the path is not an accepting state; hence the first string is not accepted by M.

Path for the second string: P = (s₀, b, . . .)

The final state in the path is an accepting state; hence the second string is accepted by M.

The final state determined by the empty word is the initial state s₀. Thus the empty word is
accepted by M since s₀ ∈ Y.
GRAMMAR
Grammar

A phrase structure grammar or, simply, a grammar G consists of four parts:

 A set N of nonterminal symbols (also called variables)
 A set T of terminal symbols (the alphabet of the Grammar)
 A set P of production rules saying how each nonterminal can be replaced by a
string of terminals and nonterminals
 A start symbol S (which must be a nonterminal) that begins the derivation.

Such a grammar G is denoted by G = G(V, T, S, P) when we want to indicate its four
parts. Here V is a finite set (the vocabulary) such that T is a subset of V and N = V − T.
 Terminals will be denoted by lowercase letters a, b, c, . . .
 Nonterminals will be denoted by uppercase letters A, B, C, . . . , with S as the start
symbol.
 Also, Greek letters α, β, . . . will denote strings over V, that is, arbitrary strings/words of
terminals and nonterminals.

Example
The following defines a grammar G with S as the start symbol:

The productions may be abbreviated as follows:


Language L(G) of a Grammar
 Suppose w and w′ are words over the vocabulary set V of a grammar G.
 We write w ⇒ w′ if w′ can be obtained from w by using one of the productions; that
is, if there exist words u and v such that w = uαv and w′ = uβv and there is a
production α → β.
 Furthermore, we write
w ⇒⇒ w′ or w ⇒* w′
if w′ can be obtained from w using a finite number of productions.
 A sequence of steps where nonterminals are replaced by the right-hand side of a
production is called a derivation.

Now let G be a grammar with terminal set T. The language of G, denoted by L(G),
consists of all strings/words in T* that can be obtained from the start symbol S by the
above process; that is, L(G) = {w ∈ T* | S ⇒⇒ w}

 That is, L(G) is the set of strings of terminals derivable from the start symbol.
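
Derivations can be explored by a breadth-first search over words in V*. The sketch below uses a hypothetical grammar with productions S → aSb and S → ab (not the grammar from the example in these notes) and lists the terminal strings derivable within a length bound. It assumes every symbol is a single character and that no production shortens a word, so that over-long words can be safely discarded.

```python
from collections import deque


def language_upto(productions, start, terminals, max_len):
    """Terminal strings of length <= max_len derivable from the start symbol.

    Assumes no production shortens a word, so over-long words are pruned.
    """
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        w = queue.popleft()
        if all(c in terminals for c in w):
            found.add(w)          # w consists of terminals only
            continue
        for alpha, beta in productions:
            i = w.find(alpha)
            while i != -1:        # rewrite each occurrence of alpha in turn
                w2 = w[:i] + beta + w[i + len(alpha):]
                if len(w2) <= max_len and w2 not in seen:
                    seen.add(w2)
                    queue.append(w2)
                i = w.find(alpha, i + 1)
    return found


# Hypothetical grammar: S -> aSb | ab
P = [("S", "aSb"), ("S", "ab")]
print(sorted(language_upto(P, "S", {"a", "b"}, 6), key=len))
# ['ab', 'aabb', 'aaabbb']
```

Each step of the search applies one production to one occurrence of its left side, which is exactly the relation w ⇒ w′ defined above.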

Types of Grammars
 Grammars are classified according to the kinds of production which are allowed.
The following grammar classification is due to Noam Chomsky.
 A Type 0 grammar has no restrictions on its productions.
 Types 1, 2, and 3 are defined as follows:
 A grammar G is said to be of Type 1 if every production is of the form α → β
where |α| ≤ |β|, or of the form α → ε.
 A grammar G is said to be of Type 2 if every production is of the form A → β
where the left side A is a nonterminal.
 A grammar G is said to be of Type 3 if every production is of the form A → a or A → aB, that is,
where the left side A is a single nonterminal and the right side is a single
terminal or a terminal followed by a nonterminal, or of the form S → ε.
 Observe that the grammars form a hierarchy; that is, every Type 3 grammar is a
Type 2 grammar, every Type 2 grammar is a Type 1 grammar, and every Type 1
grammar is a Type 0 grammar.
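
The classification can be sketched as a small checker that returns the most restrictive type a grammar satisfies. This is an illustration only: it assumes productions are given as (left, right) pairs of strings with single-character symbols, the empty string stands for ε, and the function name grammar_type and the sample grammars are hypothetical.

```python
def grammar_type(productions, nonterminals, start="S"):
    """Return 3, 2, 1 or 0: the most restrictive type the productions satisfy."""

    def is_type3(a, b):
        if len(a) != 1 or a not in nonterminals:
            return False
        if b == "":
            return a == start                     # only S -> ε is allowed
        if len(b) == 1:
            return b not in nonterminals          # A -> a
        return (len(b) == 2 and b[0] not in nonterminals
                and b[1] in nonterminals)         # A -> aB

    def is_type2(a, b):
        return len(a) == 1 and a in nonterminals  # A -> β

    def is_type1(a, b):
        return len(a) <= len(b) or b == ""        # |α| <= |β|, or α -> ε

    for level, test in ((3, is_type3), (2, is_type2), (1, is_type1)):
        if all(test(a, b) for a, b in productions):
            return level
    return 0


print(grammar_type([("S", "aS"), ("S", "b")], {"S"}))                  # 3
print(grammar_type([("S", "aSb"), ("S", "ab")], {"S"}))                # 2
print(grammar_type([("S", "aSBc"), ("S", "abc"),
                    ("cB", "Bc"), ("bB", "bb")], {"S", "B"}))          # 1
print(grammar_type([("AB", "a")], {"A", "B"}))                         # 0
```

Testing from Type 3 downward reflects the hierarchy: any grammar that passes a stricter test also passes the looser ones.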
