0% found this document useful (0 votes)
14 views30 pages

Chapter 1

The document outlines a course on Formal Language Theory, covering topics such as set theory, relations, functions, grammars, and automata. It distinguishes between natural and formal languages, detailing their characteristics, advantages, and disadvantages. Additionally, it introduces mathematical induction, graphs, and trees, along with their definitions and properties.

Uploaded by

epohulll
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views30 pages

Chapter 1

The document outlines a course on Formal Language Theory, covering topics such as set theory, relations, functions, grammars, and automata. It distinguishes between natural and formal languages, detailing their characteristics, advantages, and disadvantages. Additionally, it introduces mathematical induction, graphs, and trees, along with their definitions and properties.

Uploaded by

epohulll
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 30

Introduction to

Formal Language Theory


Course outline
Chapter 1: Basics
Set theory
Relations & functions
Mathematical induction
Graphs & trees
Strings & languages
Chapter 2: Introduction to grammars
Chapter 3: Regular languages
Regular grammar
Automata
Regular expressions
Chapter 4 :Context Free Languages
Context free grammars
Normal forms
Chapter 5: Push Down Automata (PDA)
NPDA
DPDA

October 14, 2024 Formal Language Theory 2


Basics: outline

 Overview of languages: natural vs formal


 Review of set theory and relations
 Set theory
 Relations and functions
 Mathematical induction
 Graphs and trees
 Strings and languages

October 14, 2024 Formal Language Theory 3


Overview of languages : natural Vs formal
 Language is a set of strings or sentences.
 Natural Languages
 rules come after the language
 evolve and develop

 highly flexible

 quite powerful

 no special learning effort needed

Disadvantages
 vague

 imprecise

 ambiguous

 user and context dependent

 Ex. Amharic, English, French, …

October 14, 2024 Formal Language Theory 4


Overview of languages: cont’d

 Formal Languages
 developed with strict rules
predefined syntax and semantics
 precise
 unambiguous

can be processed by machines!


Disadvantages
 unfamiliar notation

 initial learning effort

 Ex. Programming languages: Pascal, C++, …

October 14, 2024 Formal Language Theory 5


Overview of languages: cont’d

 Sentences: the basic building blocks of


languages
 Sentence = Syntax + Semantics
 Grammar: the study of the structure of a
sentence
 Ex:
<simple sentence> ::= <noun phrase><verb><noun phrase>
<noun phrase> ::= <article><noun>

A person entered the room


October 14, 2024 Formal Language Theory 6
Overview of languages: cont’d
<simple sentence>

<noun phrase> <verb> <noun phrase>

<article> <noun> <article> <noun>

A person entered the room

Derivation tree for the simple sentence: A person entered the room.

October 14, 2024 Formal Language Theory 7


Overview of languages: cont’d

 In Pascal (as well as in many other


languages), for example, an identifier is
specified as follows:

<identifier> ::= <letter> | <letter> {<letter> | <digit>}*


<letter> ::= a | b| c …
<digit> ::= 0 | 1| 2 | … | 9

Ex. a, x1, num, count1, …

October 14, 2024 Formal Language Theory 8


Review of set theory and relations
 Sets
 A well defined collection of objects (called members or elements)

 Notation: a Є S  a is an element of the set S

 Operation on sets
Let A and B be two sets and U the universal set
 Subset: A C B

 Proper subset: A c B

 Equality: A = B

 Union: A U B

 Intersection: A ∩ B

 Set difference: A \ B or A – B

 Complement: A’ or A bar

 Cartesian product: A X B = {(a,b) | a Є A and b Є B}

Note: (a,b) is called an ordered pair, and is different from (b,a)

October 14, 2024 Formal Language Theory 9


Set theory and relations: cont’d

 Properties
Let A, B, C be sets and U the universal set
 Associative property: A U (B U C) = ( A U B) U C

 Commutative property: A U B = B U A

 Demorgan’s laws: (A U B)’ = A’ ∩ B’, ...

 Involution law: (A’)’ = A

Definitions:
 Let A be a set. The cardinality of a set A is a measure
of the "number of elements of the set" and denoted by
|A| or #(A).
 The set of all subsets of a set A is called the power set
of A, denoted by 2A.
October 14, 2024 Formal Language Theory 10
Set theory and relations: cont’d
Definition:
Let S be a set. A collection {A1, A2, …, An} of subsets of S is called a partition
if Ai ∩ Aj = Ø, i≠j and S = A1 U A2 U … U An.
Ex. S = {1, 2, …, 10}
Let A1 ={1, 3, 5, 7, 9} and A2 ={2, 4, 6, 8, 10}, then {A1 , A2} = {{1, 3, 5, 7, 9},{2, 4,
6, 8, 10}} is a partition of S.
Q. Find other partitions of S
 Countability
 A finite set is countable
 If the elements of set A can be associated with 1st,2nd, …, ith, … elements of
the set of Natural Numbers, then A is countable.
Note: that in this case A may not be finite.
Ex.
1. N = {1, 2, …, ith, …} is countable
2. Z = {…, -3, -2, -1, 0, 1, 2, 3, …} = {0, 1, -1, 2, -2, 3, -3, …} is countable
3. [0, 3] is uncountable (not countable)

October 14, 2024 Formal Language Theory 11


Relations and functions

 Relations
 Definition: A relation R is a set of ordered pairs of elements
in S. (i.e is a subset of S X S)
Notation: (x, y) Є R or x R y
 Properties of relations
 Let R be a relation on a set A, then
a. R is reflexive if for all a Є A, a R a or (a, a) Є R
b. R is symmetric if a R b => b R a
c. R is transitive if a R b and b R c => a R c, for all a, b, c Є R
d. R is an equivalence relation if (a), (b) and (c) above hold.
 Let R be an equivalence relation on set A and let a Є A, then
the equivalence class of a, denoted by [a], is defined as:
[a] = {b ЄA | a R b}

October 14, 2024 Formal Language Theory 12


Relations and functions: cont’d

Examples:
Check whether the following relations are
reflexive, symmetric, and transitive

1. Let R be a relation in {1, 2, 3, 4, 5, 6} is given by


{(1,2), (2, 3), (3, 4), (4, 4), (4, 5)}
2. Let R be a relation in {1, 2, 3, …, 10} defined as
a R b if a divides b
3. Let R be defined on a set S such that aRb if a=b
4. Let R be defined on all people in Addis Ababa by
aRb if a and b have the same date of birth.
October 14, 2024 Formal Language Theory 13
October 14, 2024 Formal Language Theory 14
Relations and functions: cont’d

 Functions
 Definition: A function f from a set X to a set Y is a rule that
associates to every element x in X a unique element in Y,
which is denoted by f(x).
 The element f(x) is called the image of x under f.
 The function is denoted by f: X  Y
 Functions can be defined in the following two ways:
1. By giving the images of all elements of X
Ex. f:{1, 2, 3, 4}  {2, 4, 6} can be defined by
f(1) = 2, f(2) = 4, f(3) = 6, f(4) = 6
2. By a computational rule which computes f(x) once x is given
Ex. f:R  R can be defined by f(x) = x2 + 2x + 1, x Є R (R =
the set of all real numbers)

October 14, 2024 Formal Language Theory 15


Relations and functions: cont’d

 Let f: A  B be a function
1. f is an into function if Rf C B
2. f is an onto function if Rf = B
3. f is a one-to-one function
if for x1 & x2 Є A, x1 ≠ x2 => f(x1) ≠ f(x2)
4. f is bijective (one-to-one correspondence) if it satisfies (2)
and (3) above.
Ex. f:Z  Z is given by f(x) = 2x
Show that f is one-to-one but not onto.

 Definition: A set A is said to be countable iff there exists a


function f:A  N such that f is bijective. (N=the set of
natural numbers)

October 14, 2024 Formal Language Theory 16


Mathematical induction

 Let Pn be a proposition that depends on nЄZ+.


Then Pn is true for all +ve n provided that:
i. Pi is true
ii. If Pk is true, so is Pk+1, for some kЄZ+.
Three steps:
1. Base case: verify that P1 holds
2. Inductive hypothesis: assume that Pk holds, for some
kЄZ+
3. Inductive step: show that Pk+1 holds
Ex. Show that 1+2+…+n = n(n+1)/2, for all nЄZ+.

October 14, 2024 Formal Language Theory 17


Mathematical induction: cont’d
Solution:
Let Pn: 1+2+…+n = n(n+1)/2
Step1: for n = 1, P1 holds
Step2: for some kЄZ+, assume Pk is true
i.e. Pk: 1+2+…+k = k(k+1)/2
Step3: WTS Pk+1 is true
Pk+1 : 1+2+…+k+(k+1) = (k+1)(k+2)/2
: Pk + (k+1) = (k+1)(k+2)/2
: k(k+1)/2 + (k+1) = (k+1)(k+2)/2
: [k(k+1) + 2(k+2)]/2 = (k+1)(k+2)/2
: (k+1)(k+2)/2 = (k+1)(k+2)/2
Therefore, Pn holds for all n ЄZ+

Ex. Show that Pn = ∑(i=1,n)(i2) = (n+1)(n)(2n+1)/6 for all n

October 14, 2024 Formal Language Theory 18


Graphs and trees

 Graphs
 Definition: A graph (undirected graph) consists
of:
a. A non-empty set v called the set of vertices,
b. A set E called the set of edges, and
c. A map Φ (phi) which assigns to every edge a unique
unordered pair of vertices
e1 e6
v2 e1 = {v1, v2}
v1 e2 = {v1, v3}
e2 e3 e5 …
v3 v4 e6 = {v2, v2} (a self loop)
e4

October 14, 2024 Formal Language Theory 19


Graphs and trees: cont’d

 Definition: A directed graph (digraph) consists of:


a. A non-empty set v called the set of vertices,
b. A set E called the set of edges, and
c. A map Φ (phi) which assigns to every edge a unique
ordered pair of vertices

e1
v2 e1 = (v1, v2)
v1 v1 : a predecessor of v2
e2 e3 e5 v2 : a successor of v1
v3 e4 v4

October 14, 2024 Formal Language Theory 20


Graphs and trees: cont’d
 Definition: The degree of a vertex v in a graph (directed or
undirected) is the number of edges with v as an end vertex.
Note: that a self loop is counted twice when calculating the
degree of a vertex.
Ex. In the previous graph, deg(v1) = ? deg(v2) = ?

 Definition: A path in a graph (directed or undirected) is an


alternating sequence of vertices and edges of the form
v1e1v2e2…en-1vn, beginning and ending with vertices such that ei
has vi and vi+1 as its end vertices and no edge or vertex is
repeated in the sequence.
The path is said to be from v1 to vn.

Ex. In the previous graph, v1e1v2e3v3e4v4 is a path from v1 to v4.


Note: that a path may be directed (if all the edges in the path
have the same direction.)

October 14, 2024 Formal Language Theory 21


Graphs and trees: cont’d

 Definition: A graph (directed or undirected) is connected if


there is a path between every pair of vertices.
Q. Are the previous two graphs connected?

 Definition: A circuit in a graph is an alternating sequence


v1e1v2e2…en-1v1 of vertices and edges starting and ending
with the same vertex such that ei has vi and vi+1 as end
vertices and no edge or vertex other than v1 is repeated.
Ex. V2e3v3e4v4e5v2 is a circuit in the previous graph

October 14, 2024 Formal Language Theory 22


Graphs and trees: cont’d

 Trees
 Definition: A graph (directed or undirected) is
called a tree if it is connected and has no circuits.
Q. Are the previous two graphs trees?
 Properties of trees:
 In a tree there is one and only one path between every
pair of vertices (nodes)
 A tree with n vertices has n-1 edges
 A leaf in a tree can be defined as a vertex of degree one
 Vertices other than leaves are called internal vertices

October 14, 2024 Formal Language Theory 23


Graphs and trees: cont’d
 Definition: An ordered directed tree is a digraph satisfying the
following conditions:
 There is one vertex called the root of the tree which is distinguished
from all other vertices and the root has no predecessors.
 There is a directed path from the root to every other vertex.
 Every vertex except the root has exactly one predecessor.
(For the sake of simplicity, we refer to ordered directed trees as
simply trees.)
 The number of edges in a path is called the length of the path.
 The height of a tree is the length of the longest path from the root.
 A vertex v in a tree is at level k if there is a path of length k from
the root to the vertex v.
Q. what is the maximum possible level in a tree?
 There are several types of trees: binary, balanced binary, binary
search tree, heap, general tree, …

October 14, 2024 Formal Language Theory 24


Graphs and trees: cont’d
1
Ex. 1. List the leaves.
3 2. List the internal nodes.
2
3. What is the length of the
path from 1 to 9?
10 4 5 6
4. What is the height of the
tree?
7 8

 Note: a path from vertex (node) n1 to node nk can


be simply expressed as the sequence of nodes ni,
i=1,…,k such that ni is the parent (predecessor) of
ni+1 (1<= I <=k)

October 14, 2024 Formal Language Theory 25


Strings and languages
 Strings
 An alphabet, ∑, is a set of finite symbols.

 A string over an alphabet ∑ is a sequence of symbols from ∑.

 An empty string is a string without symbols, and is denoted by λ.

 Let w be a string, then its length, denoted by /w/, is the number of


symbols of w.
Ex. Let ∑ = {0, 1}, the following are some strings over ∑
w = λ, /w/ = 0; w = 01, /w/ = 2; w = 010110, /w/ = 6
*
 Given an alphabet ∑, ∑ denotes the set of all strings (including
λ) over ∑.
*
 ∑+ = ∑ - {λ}

Ex. ∑ = {0, 1} => ∑* = {λ, 0, 1, 01, 00, 11, 111, 0101, 0000, …}
 ∑i is a set of strings of length i, i = 0, 1, 2, …
*
 Let x Є ∑ and /x/ = n, then x = a1a2…an, ai Є ∑

October 14, 2024 Formal Language Theory 26


Strings and languages: cont’d

 Operations on strings
 Concatenation operation
 Let x, y Є ∑* and /x/ = n and /y/ = m. Then xy,
concatenation of x and y, = a1a2…anb1b2…bm, ai, bi Є ∑
 The set ∑* has an identity element λ with respect to the
binary operation of concatenation.
Ex. x Є ∑* , xλ = λx = x
 ∑* has left and right cancellation
For x, y, z Є ∑*,
zx = zy => x = y (left cancellation)
xz = yz => x = y (right cancellation)
 For x, y Є ∑* , we have /xy/ = /x/ + /y/

October 14, 2024 Formal Language Theory 27


Strings and languages: cont’d

 Transpose operation
 For any x in ∑* and a in ∑, (xa)T = a(x)T
Ex. (aaabab)T = babaaa
 A palindrome of even length can be obtained by the
concatenation of a string and its transpose.
 A prefix of a string is a substring of leading symbols of that
string.
w is a prefix of y if there exists y’ in ∑* such that y=wy’
Ex. y = 123, list all prefixes of y.
 A suffix of a string is a substring of trailing symbols of that
string.
w is a prefix of y if there exists y’ in ∑* such that y=y’w
Ex. y = 123, list all suffixes of y.

October 14, 2024 Formal Language Theory 28


Strings and languages: cont’d

 A terminal symbol is a unique indivisible object


used in the generation of strings.
 A nonterminal symbol is a unique object but
divisible, used in the generation of strings.
Ex. In English, a, b, A, B, etc are terminals and
the words boy, cat, dog, … are nonterminals.
In programming languages, a, A, :, ;, =, if, then, …
are terminals

October 14, 2024 Formal Language Theory 29


Strings and languages: cont’d
 Languages
 Definition: A language, L, is a set (collection) of strings over a given
alphabet, ∑.
 A string in L is called a sentence or word.
Ex. ∑ = {0, 1}, ∑* = {λ, 0, 1, 01, 00, 11, …}
L1 = {λ}, L2 = {0, 1, 01} over ∑
L3 = {an | n>= 0} over ∑ = {a}

 Let L1 , L2 be languages over ∑, then


 L1L2 = {xy | xЄL1, yЄL2}
 L{λ} = {λ}L = L, for any language L
 L0 = {λ}
 L1 = L
 L2 = LL ≡ {xx | xЄL}
 …
 Li = LLi-1, for i>=2
 L* = U(i=0,∞)(Li)

October 14, 2024 Formal Language Theory 30

You might also like