Lecture#01,2
Course Contents
Introduction to the course title
Formal and informal languages
Alphabets
Strings
Null string
Words
Valid and invalid alphabets
Length of a string
Reverse of a string
Defining languages
Descriptive definition of languages
EQUAL
EVEN-EVEN
INTEGER
EVEN
{a^n b^n}
{a^n b^n a^n}
FACTORIAL
DOUBLEFACTORIAL
SQUARE
DOUBLESQUARE
PRIME
PALINDROME
[Figure: classes of languages and their machines: Regular (DFA), Context-free (PDA), Context-sensitive (LBA), Recursively-enumerable (TM)]
Automata
Automata is the plural of automaton, and it means “something that works
automatically”.
Types of languages
1. Formal Languages (Syntactic languages)
2. Informal Languages (Semantic languages)
Alphabets
A finite non-empty set of symbols (called letters) is called an alphabet. It is
denoted by Σ (the Greek letter sigma).
Example:
1. Binary: Σ = {0,1}
2. All lower case letters: Σ = {a, b, c, …, z}
3. Alphanumeric: Σ = {a-z, A-Z, 0-9}
4. DNA molecule letters: Σ = {a, c, g, t}
Note: Σ (alphabet) includes letters, digits and a variety of operators
Strings
Definition
A concatenation of a finite number of letters from the alphabet is called a string.
Example
If Σ = {a, b} then a, abab, aaabb, ababababababababab are all strings
Note
Empty string or null string
A string with no symbols at all is called the empty string or null string. It is
denoted by λ (small Greek letter lambda) or Λ (capital Greek letter lambda).
The capital lambda will mostly be used to denote the empty string in further
discussion.
Words
Definition
Words are strings belonging to some language
Example
If Σ= {x} then a language L can be defined as
L = {x^n : n = 1, 2, 3, …} or L = {x, xx, xxx, …}
Here x, xx, xxx, … are the words of L
Note: All words are strings, but not all strings are words
Grammar
“A grammar can be regarded as a device that enumerates the sentences
of a language” - nothing more, nothing less
Alphabet,
String,
Word,
Grammar,
Language
Power of an alphabet
Let Σ be an alphabet:
Σ^k = the set of all strings of length k over Σ
Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ … (the set of all strings over Σ, including the empty string)
Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ … (the set of all strings over Σ of length 1 or more)
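The powers of an alphabet can be enumerated mechanically. The following is a minimal sketch, assuming single-character letters; the function names power and star_up_to are illustrative and not part of the lecture. Since Σ* is infinite, only a finite slice of it (strings up to a chosen length) is built.

```python
from itertools import product

def power(sigma, k):
    """Return Σ^k: the set of all strings of length k over sigma."""
    return {"".join(p) for p in product(sorted(sigma), repeat=k)}

def star_up_to(sigma, max_len):
    """Return a finite slice of Σ*: the union Σ^0 ∪ Σ^1 ∪ ... ∪ Σ^max_len."""
    result = set()
    for k in range(max_len + 1):
        result |= power(sigma, k)
    return result

sigma = {"0", "1"}
print(power(sigma, 0))            # {''}  (Σ^0 contains only the empty string)
print(sorted(power(sigma, 2)))    # ['00', '01', '10', '11']
print(len(star_up_to(sigma, 3)))  # 15, i.e. 1 + 2 + 4 + 8
```

Dropping Σ^0 from the union gives the corresponding slice of Σ+.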
Languages
L is said to be a language over alphabet Σ only if L ⊆ Σ*
Why?
Examples:
1. Let L be the language of all strings consisting of n 0’s
followed by n 1’s: L = {Λ, 01, 0011, 000111, …}
2. Let L be the language of all strings with an equal number
of 0’s and 1’s: L = {Λ, 01, 10, 0011, 1100, 0101, 1010, 1001, 0110, …}
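Each of the two example languages is a subset of Σ* for Σ = {0, 1}, so each can be described by a membership test on strings. A small sketch follows; the function names are hypothetical and chosen only for illustration.

```python
def in_zeros_then_ones(s):
    """True if s consists of n 0's followed by n 1's, for some n >= 0."""
    n = len(s) // 2
    return s == "0" * n + "1" * n

def in_equal_zeros_ones(s):
    """True if s is a string over {0, 1} with as many 0's as 1's."""
    return set(s) <= {"0", "1"} and s.count("0") == s.count("1")

print(in_zeros_then_ones("000111"))  # True
print(in_zeros_then_ones("0101"))    # False
print(in_equal_zeros_ones("0101"))   # True
print(in_equal_zeros_ones("011"))    # False
```

Every string accepted by the first test is also accepted by the second, so the first language is a subset of the second.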
Finite Automata Uses
Some Applications:
Software for designing and checking the behavior of digital circuits
Lexical analyzer of a typical compiler
Software for scanning large bodies of text (e.g., web pages) for
pattern finding
Software for verifying systems of all types that have a finite number
of states (e.g., stock market transaction, communication/network
protocol)
[Figure: example automata: an on/off switch; a recognizer for strings that start with a letter, continue with a string of other letters (possibly empty), and should end with a 2-letter state code]
Now consider an alphabet Σ2 = {B, Ba, bab, d} and the string BababB.
This string can be grouped in two different ways:
1. (Ba), (bab), (B)
2. (B), (abab), (B)
In the second grouping, abab is not a letter of Σ2, so that grouping cannot be
identified as a string defined over Σ2. Because a reader cannot tell whether the
leading B is the letter B or the start of the letter Ba, Σ2 is an invalid alphabet.
Valid / Invalid Alphabet
When defining an alphabet whose letters consist of more than one
symbol, no letter should start with another letter of the same alphabet,
i.e. no letter should be a prefix of another.
Conclusion
Σ1= {B, aB, bab, d}
Σ2= {B, Ba, bab, d}
Σ1 is a valid alphabet while Σ2 is an invalid alphabet. Why?
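The prefix rule can be checked directly. The sketch below assumes an alphabet is given as a list of Python strings; the function name is_valid_alphabet is illustrative only.

```python
def is_valid_alphabet(letters):
    """Return True if no letter is a prefix of another letter."""
    letters = list(letters)
    for x in letters:
        for y in letters:
            if x != y and y.startswith(x):
                return False  # x is a prefix of y
    return True

sigma1 = ["B", "aB", "bab", "d"]
sigma2 = ["B", "Ba", "bab", "d"]
print(is_valid_alphabet(sigma1))  # True
print(is_valid_alphabet(sigma2))  # False, because B is a prefix of Ba
```

The comparison is case-sensitive, so the letter B is not a prefix of bab.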
Length of a String
Definition
The length of string s, denoted by |s|, is the number of letters in the
string.
Example 1
Σ = {a, b} and s = ababa is a string over Σ; then |s| = 5
Example 2
Σ = {B, aB, bab, d} and s = BaBbabBd is a string over Σ; then |s| = 5
Tokenizing: (B), (aB), (bab), (B), (d)
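The length computation can be sketched as a left-to-right tokenizer. This assumes the alphabet is valid (no letter is a prefix of another), so at most one letter can match at each position; the function name tokenize is hypothetical.

```python
def tokenize(s, letters):
    """Split s into letters of a valid (prefix-free) alphabet, left to right."""
    tokens = []
    i = 0
    while i < len(s):
        # In a prefix-free alphabet at most one letter matches at position i.
        match = next((x for x in letters if s.startswith(x, i)), None)
        if match is None:
            raise ValueError(f"cannot tokenize at position {i}")
        tokens.append(match)
        i += len(match)
    return tokens

sigma = ["B", "aB", "bab", "d"]
tokens = tokenize("BaBbabBd", sigma)
print(tokens)       # ['B', 'aB', 'bab', 'B', 'd']
print(len(tokens))  # 5, the length |s| counted in letters
```

Note that |s| counts letters of the alphabet, not keyboard characters: the same string has 8 characters but only 5 letters.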
Reverse of a String
Definition
The reverse of a string s, denoted by Rev(s) or s^r, is obtained by writing
the letters of s in reverse order.
Example
If s = abc is a string defined over Σ = {a, b, c} then Rev(s) or s^r = cba
Example
Σ= {B, aB, bab, d} and s=BaBbabBd then Rev(s)=dBbabaBB
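Reversal works on letters, not on individual characters, so the string has to be tokenized first. A minimal sketch under the assumption of a valid (prefix-free) alphabet; the helper names are illustrative.

```python
def tokenize(s, letters):
    """Split s into letters of a valid (prefix-free) alphabet."""
    tokens, i = [], 0
    while i < len(s):
        match = next((x for x in letters if s.startswith(x, i)), None)
        if match is None:
            raise ValueError(f"cannot tokenize at position {i}")
        tokens.append(match)
        i += len(match)
    return tokens

def rev(s, letters):
    """Reverse s letter by letter (letters may span several characters)."""
    return "".join(reversed(tokenize(s, letters)))

sigma = ["B", "aB", "bab", "d"]
print(rev("BaBbabBd", sigma))  # dBbabaBB
print("BaBbabBd"[::-1])        # dBbabBaB (character-wise reversal, a different string)
```

The two outputs differ because the letter aB stays intact under Rev, whereas character-wise reversal turns it into Ba.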
Defining a Language
Languages can be defined in different ways:
1. Descriptive definition
2. Recursive definition
3. Using regular expressions (RE)
4. Using finite automata (FA), etc.
Descriptive Definition of a Language
Definition: The language is defined by describing the conditions imposed on its words.
Example 1: The language L of strings of odd length, defined over Σ = {a}, can be written
as
L = {a, aaa, aaaaa, …}
Example 2: The language L of strings that do not start with a, defined over Σ
= {a, b, c}, can be written as
L = {Λ, b, c, ba, bb, bc, ca, cb, cc, …}
Example 3: The language L of strings of length 2, defined over Σ ={0,1,2}, can be
written as
L={00, 01, 02,10, 11,12,20,21,22}
Example 4: The language L of strings ending in 0, defined over Σ ={0,1}, can be
written as
L={0,00,10,000,010,100,110,…}
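A descriptive definition is a condition on words, so it can be modelled as a predicate applied to the strings of Σ*. The sketch below assumes single-character letters and lists only words up to a chosen length; all function names are hypothetical.

```python
from itertools import product

def words_up_to(sigma, max_len):
    """Yield every string over sigma of length 0..max_len, shortest first."""
    for k in range(max_len + 1):
        for p in product(sigma, repeat=k):
            yield "".join(p)

def language(sigma, condition, max_len):
    """List the words of length <= max_len that satisfy the condition."""
    return [w for w in words_up_to(sigma, max_len) if condition(w)]

def odd_length(w):
    return len(w) % 2 == 1

def not_starting_with_a(w):
    return not w.startswith("a")

def ends_in_0(w):
    return w.endswith("0")

print(language("a", odd_length, 5))
# ['a', 'aaa', 'aaaaa']
print(language("abc", not_starting_with_a, 2))
# ['', 'b', 'c', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
print(language("01", ends_in_0, 3))
# ['0', '00', '10', '000', '010', '100', '110']
```

The empty Python string '' plays the role of Λ in the second output.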
Exercises: Descriptive Definition of a Language
Define languages L1, L2, L3, L4, L5 over alphabet Σ ={a,b,c} as
L1 = Language of all words with length two or less
L2 = Language of all words not ending in b
L3 = Language of all words of odd length
L4 = Language of all words not starting with a
L5 = Language of all words with letter b appearing in even chunks
Exercises: Descriptive Definition of a Language
Define languages L1, L2, L3, L4, L5 over alphabet Σ ={a,b} as
L1 = Language of all words of even length
L2 = The language EVEN-EVEN (all words with an even number of a's and an even number of b's)
L3 = The language EQUAL (all words with an equal number of a's and b's)
L4 = The language {a^n b^n}
L5 = The language {a^n b^n a^n}, of strings defined as {a^n b^n a^n : n = 1, 2, 3, …}
Define languages L1, L2, L3, L4 over alphabet Σ ={-, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9} as
L1 = Language of all words as INTEGERS
L2 = Language of all words belonging to the set of natural numbers
L3 = Language of all words with even values
L4 = The language factorial
Kleene Star Closure
Given an alphabet Σ, the Kleene Star Closure of Σ, denoted by Σ*, is the
collection of all strings defined over Σ, including Λ.
Note: Kleene Star Closure can be defined over any set of strings.
Examples
If Σ = {x} then Σ* = {Λ, x, xx, xxx, xxxx, ….}
If Σ = {0,1} Then Σ* = {Λ, 0, 1, 00, 01, 10, 11, ….}
If Σ = {aaB, c} Then Σ* = {Λ, aaB, c, aaBaaB, aaBc, caaB, cc, ….}
Note: Languages generated by the Kleene Star Closure of a set of strings are infinite
languages. (An infinite language is one that contains infinitely many words,
each of finite length.)
Plus Operation (+)
The Plus Operation is the same as Kleene Star Closure except that it does not
automatically generate Λ (the null string).
Example:
If Σ = {0,1} Then Σ+ = {0, 1, 00, 01, 10, 11, ….}
If Σ = {aab, c} Then Σ+ = {aab, c, aabaab, aabc, caab, cc, ….}
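Both closures can be approximated by concatenating a bounded number of strings from the set. The following sketch lists the first few words of the closure of {aab, c}; the function name closure and the max_pieces bound are assumptions made for illustration, since the full closures are infinite.

```python
from itertools import product

def closure(strings, max_pieces, include_empty=True):
    """List concatenations of up to max_pieces strings from the given set.

    include_empty=True gives a finite slice of the Kleene star closure,
    include_empty=False gives the same slice of the plus operation.
    """
    start = 0 if include_empty else 1
    result = []
    for k in range(start, max_pieces + 1):
        for pieces in product(strings, repeat=k):
            word = "".join(pieces)
            if word not in result:  # keep each word once, in order of discovery
                result.append(word)
    return result

print(closure(["aab", "c"], 2))
# ['', 'aab', 'c', 'aabaab', 'aabc', 'caab', 'cc']
print(closure(["aab", "c"], 2, include_empty=False))
# ['aab', 'c', 'aabaab', 'aabc', 'caab', 'cc']
```

The only difference between the two outputs is the empty string '', which stands for Λ.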
Remark
Kleene Star can also be applied to a single string, i.e. a* can be
considered to be all possible strings defined over {a}, so a* generates Λ,
a, aa, aaa, …
Similarly, a+ can be considered to be all possible non-empty strings
defined over {a}, so a+ generates a, aa, aaa, aaaa, …