
Automata Theory

This document discusses computability and formal languages. It defines an algorithm as an effective procedure that always halts. Some problems are undecidable or noncomputable, such as determining if two procedures solve the same problem. A Turing machine is introduced as a model of computation with a tape and read/write head. Formal languages are defined using grammars consisting of terminals, nonterminals, productions, and a start symbol. Context-free grammars are discussed as one type of grammar.


AUTOMATA THEORY, COMPUTABILITY AND FORMAL LANGUAGES

I - COMPUTABILITY
ALGORITHMS AND ALGORITHMIC LANGUAGE:
An effective procedure is a finite, unambiguous description of a finite set of operations. The
operations must be effective in the sense that there is a strictly mechanical procedure for
completing them. Likewise, the process of deciding the order in which to perform operations must be
effective. An effective procedure may perform a task, compute a function or just perform a
sequence of unrelated operations; computer programs are examples.

An effective procedure which specifies a sequence of operations which always halts is called an
algorithm. Thus, computer programs which always halt for any input, i.e., never go into an
infinite loop, are algorithms.

Effective procedures will be written as finite sets of sentences, called statements, of languages
which are designed to expedite the representation of effective procedures. These algorithmic
languages may either be carefully chosen unambiguous subsets of natural languages or artificial
languages such as ALGOL, PASCAL or the λ-calculus.

The format of effective procedures in a particular algorithmic language is specified by a set of
rules called its syntax. All algorithmic languages have the following properties:
1. There is a finite set of symbols called the alphabet of the language.
2. The number of statements in an effective procedure is finite.
3. There is a mechanical procedure for checking whether an arbitrary finite sequence of
alphabet symbols is a statement, and whether an arbitrary finite sequence of statements has
the form of a program (effective procedure).

There are many interesting problems which are not solvable by effective procedures. For example,
there is no effective procedure for deciding whether or not any two arbitrarily
chosen effective procedures solve the same problem (i.e., are equivalent). Also, there is no
effective solution for deciding whether an effective procedure written in an algorithmic language
is actually an algorithm, that is, will always halt.
Problems which have an effective solution are said to be decidable, effectively solvable or
computable. All other problems, such as the halting problem for algorithmic languages, are said to
be undecidable or noncomputable.

FUNDAMENTAL DEFINITIONS AND NOTATIONAL CONVENTIONS


i. ALPHABET: An alphabet Σ is a nonempty finite set of symbols or letters. In many
cases, it will be convenient to use the symbols 1, 2, 3, …, n for the n members of Σ, and the
alphabet {1, 2, …, n} will be designated by Σn. In other cases, it may be convenient to use
a, b, +, 1, or other symbols as members of Σ.
ii. WORD: A word over Σ is any finite string of symbols from Σ; e.g., 2131 and 112 are
words over Σ3. The set of words over Σ will be denoted by W.
iii. LENGTH OF A WORD: The length of a word x is the number of symbols in x; e.g.,
length(2131) is four and length(112) is three. A word of length zero is called the null
word. The symbol 0 will be used to denote the null word.
iv. SUBWORD: The word x is a subword of y, written x ≤ y, if there are words u and v
such that u.x.v = y. For example, 2123 ≤ 3212343, where u = 3 and v = 43; and 212 ≤ 2123, where
u = 0 and v = 3. Note that 0 ≤ x and x ≤ x for all x ∈ W.
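As an illustrative sketch (not part of the source text), the subword relation can be checked in Python by treating words as strings:

```python
def is_subword(x: str, y: str) -> bool:
    # x <= y iff there exist words u and v with u + x + v == y,
    # i.e., x occurs as a contiguous substring of y.
    return any(y[i:i + len(x)] == x for i in range(len(y) - len(x) + 1))

# The null word is a subword of every word, and every word is a subword of itself.
assert is_subword("", "2123")
assert is_subword("212", "2123")    # u = null word, v = "3"
assert is_subword("2123", "2123")
assert not is_subword("31", "2123")
```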

SOME SPECIAL FUNCTIONS


Given a function f : W^r → W^s, where W = Σ* and Σ is a finite alphabet, the following functions
are defined.
i. i : W → W, where i(x) = x, is the identity function.
ii. π : W → W^0, where π(x) = (), is the projection function.
iii. ξ : W^0 → W, where ξ(()) = 0, is the zero function.
iv. ς : W → W, where ς(x) = x + 1, is the successor function.
v. x ∸ y = x − y if x ≥ y, and 0 otherwise, is proper subtraction.
vi. x^(k+1) = x^k . x, where x^0 = 0, is the k-th power of x under concatenation.
Note that 0.x = x.0 = x for all x ∈ W, and x.(y.z) = (x.y).z for all x, y, z ∈ W. Concatenation is
commutative (so that x.y = y.x) iff Σ contains only one letter, but it is always associative.
vii. ρ : W → W is the reversal function, which reverses the order of the symbols in a word;
e.g., ρ(1212) = 2121.
PL – COMPUTABLE FUNCTIONS
The PL program P computes the function f : W^r → W^s if, when started with x1, …, xr as the values
of the variables x1, …, xr and with all other program variables initially 0:
1. P halts with y1, …, ys as the values of the variables y1, …, ys respectively, if
(x1, …, xr) ∈ dom f and f(x1, …, xr) = (y1, …, ys); or
2. P never halts if f(x1, …, xr) is undefined.
x1, …, xr and y1, …, ys are called the input and output variables, respectively.
If there is a P which computes f, then f is said to be PL-computable.

According to the definition given, each PL program computes a different function f : W^r → W^s for
each r, s ≥ 0. There is no requirement that the variables x1, …, xr and y1, …, ys actually appear
in the program, since, by convention, every variable except the input variables x1, …, xr is
initialized to 0.

Example: Below is a simple PL program which sets its output V to z if W = 0, or to x + z if W ≠ 0.

V ← 0;
LOOP W;
  LOOP X;
    V ← V + 1;
  END;
  GOTO L;
END;
L: LOOP Z;
  V ← V + 1;
END;
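The program's behavior can be mirrored in Python (an illustrative sketch; the GOTO L is modeled by the fact that the outer LOOP body is abandoned after its first pass):

```python
def pl_example(w: int, x: int, z: int) -> int:
    v = 0                        # V <- 0
    if w != 0:                   # LOOP W: the body runs, but GOTO L exits it
        for _ in range(x):       # LOOP X
            v = v + 1            # V <- V + 1
    for _ in range(z):           # L: LOOP Z
        v = v + 1                # V <- V + 1
    return v

assert pl_example(0, 4, 3) == 3   # W = 0: output is z
assert pl_example(2, 4, 3) == 7   # W != 0: output is x + z
```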

If P computes f : W^r → W^s, then P is defined on (x1, …, xr) if (x1, …, xr) ∈ dom f, that is,
f(x1, …, xr) is defined; otherwise P is undefined on (x1, …, xr). If, for fixed r and s, a PL
program P halts on all possible input values, then it computes a total function and P is an
algorithm. All PL programs are effective procedures. Every PL-computable function is effectively
computable in the intuitive sense of the previous discussion.

CHURCH’S THESIS: A total function is effectively computable if and only if it is PL-computable
(recursive, Markov-algorithm-computable, etc.). However, since the notion of effective
computability is not mathematical in nature, Church’s thesis can never be proved.

UNDECIDABLE PROBLEMS: By definition, a set S is countable if there exists a 1-1
correspondence between S and the natural numbers, or equivalently, the set W. Hence, if S is
countable, there is an enumeration a0, a1, a2, … of S. This does not necessarily mean that it is
possible to effectively enumerate the set S, i.e., there need not exist an effective enumeration
procedure which successively generates members of S in such a way that each member of S is
eventually generated. As an informal illustration, consider the following cardinality argument.
There are only a countable number of effective procedures for enumerating sets. Hence, only a
countable number of sets may be effectively enumerated. However, there are an uncountable
number of subsets of W, so there are countable subsets of W which cannot be effectively
enumerated.

TURING MACHINE: The Turing machine has a finite control unit and a single two-way work tape,
divided into squares each of which can contain a symbol from a finite alphabet. At each instant of
time, the tape's read/write head scans a single square. Based on the contents of the square
scanned and the current state of the control unit, the machine selects an instruction which may
change the symbol scanned and either move the head one square right or left or leave it in place.

[Figure: a finite control unit attached to a read/write head scanning a two-way infinite tape.]

More formally, it can be described as a set of quintuples of the form (q, σ, σ′, M, q′). Here q
and q′ are states of the machine. The symbol M stands for a MOVE, which may be R or L (and
sometimes C), meaning move Right or Left (or do not move). The symbol σ represents the
symbol on the square of the tape scanned by the machine, and σ′ is the symbol to be written in its
place. Thus (q, σ, σ′, R, q′) indicates that if the machine is in state q and σ is the symbol
scanned on the tape, it replaces σ with the symbol σ′, moves one tape square to the right, and
goes to state q′.

II FORMAL LANGUAGES

FUNDAMENTAL DEFINITIONS AND NOTATIONS

Grammars and Languages


Grammar
A grammar G is any notation used for specifying the syntax of a language. An alphabet
or vocabulary is any finite set of symbols, e.g., the binary alphabet {0, 1}.
Sentence
A sentence over an alphabet is any string of finite length composed of symbols from the
alphabet.
Language
Informally, a language is any set of sentences over an alphabet. More formally, a
language is defined as L = {x | φ(x)}, where φ(x) is some statement about x, i.e., φ(x) is the
condition, given by the grammar, that allows us to make sentences (words, vocabulary). Any
sentence satisfying φ(x) is a member of the language L.
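The set-builder definition translates directly into a set comprehension; a small Python sketch over a finite slice of W (the predicate φ here, "x has even length", is a hypothetical example):

```python
from itertools import product

def phi(x: str) -> bool:
    # A hypothetical predicate: phi(x) holds iff x has even length.
    return len(x) % 2 == 0

# All words over the alphabet {0, 1} of length <= 3 ...
words = ["".join(p) for n in range(4) for p in product("01", repeat=n)]
# ... and the language L = {x | phi(x)} restricted to that slice.
L = {x for x in words if phi(x)}
```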
A context-free grammar has four components:
1. A set of tokens known as terminal symbols.
2. A set of non-terminals.
3. A set of productions, where each production consists of a non-terminal called the left side
of the production, an arrow, and a sequence of tokens and/or non-terminals called the
right side of the production.
4. A designation of one of the non-terminals as the start symbol.
Thus, formally, a grammar is defined as:
G(φ(x)) = {VN, VT, P, S}
where VN = set of non-terminals
VT = set of terminals
P = set of productions (rules)
S = start (distinguished) symbol.
Note that VN ∩ VT = ∅.
Ex1: A grammar to generate expressions consisting of digits, plus signs and minus signs.
list → list + digit
list → list − digit
list → digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Equivalently: list → list + digit | list − digit | digit.
Here the non-terminals are list and digit, while the terminals (tokens) are +, −, 0, 1, …, 9.
Illustration 1: To construct a parse tree for 9 − 5 + 2 according to the grammar in Ex1:

            list
          /  |   \
      list   +   digit
     /  |  \       |
 list   −  digit   2
   |         |
 digit       5
   |
   9
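Since list is just digits separated by + and −, strings of this grammar can be recognized and evaluated by a left-to-right scan; a Python sketch (not part of the source):

```python
def eval_list(s: str) -> int:
    # Recognizes and evaluates "digit ((+|-) digit)*", matching the grammar
    # list -> list + digit | list - digit | digit, grouping to the left.
    assert s[0].isdigit()
    value = int(s[0])
    i = 1
    while i < len(s):
        op, d = s[i], s[i + 1]
        assert op in '+-' and d.isdigit()
        value = value + int(d) if op == '+' else value - int(d)
        i += 2
    return value

# Left-recursive grouping: (9 - 5) + 2
assert eval_list('9-5+2') == 6
```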

Types of Grammar
Grammars can be classified into different types according to the form of productions allowed.
This is because imposing restrictions on productions makes grammars much easier to
parse. The following is a summary of the Chomsky classification.
TYPE 0 GRAMMARS: These are the most general and are also known as 'free' (unrestricted)
grammars. Productions are of the form u → v, where both u and v are arbitrary strings of symbols
in V, with u non-null.
TYPE 1 GRAMMARS: These are known as 'context-sensitive' grammars. Productions are of the
form uXw → uvw, where u, v, and w are arbitrary strings of symbols in V, with v non-null, and X
is a single non-terminal. Thus, X may be replaced by v, but only when it is found surrounded by
u and w.
TYPE 2 GRAMMARS: These are known as 'context-free' grammars. Productions are of the
form X → v, where v is an arbitrary string of symbols in V and X is a single non-terminal. Thus,
X may be replaced by v wherever it is found. The grammars used by syntax analyzers in
compilers are type 2 grammars. They are normally defined using BNF (Backus-Naur Form)
notation. All languages derived from type 2 grammars can be parsed, but there are some subsets,
commonly used for programming language definition, that can be parsed particularly efficiently.
TYPE 3 GRAMMARS: These are more commonly called 'regular' grammars. Productions are of
the form X → a or X → aY, where X and Y are non-terminals and a is a terminal. These grammars
are widely used in the lexical analyzers of compilers to describe the basic entities that make up
a programming language, though they are not general enough to describe the syntax of a
complete programming language. Languages defined by type 3 grammars can be parsed very
efficiently. In particular, they can always be recognized by a finite state automaton.
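A type 3 grammar maps directly onto a finite state recognizer: each non-terminal becomes a state, X → aY a transition from X to Y on a, and X → a a transition into an accepting state. A sketch for the hypothetical regular grammar S → aS | b, which generates a*b (not an example from the source):

```python
def accepts(s: str) -> bool:
    # States: 'S' (the non-terminal) and 'ACCEPT' (after using S -> b).
    state = 'S'
    for ch in s:
        if state == 'S' and ch == 'a':
            state = 'S'              # S -> aS
        elif state == 'S' and ch == 'b':
            state = 'ACCEPT'         # S -> b
        else:
            return False             # no transition: reject
    return state == 'ACCEPT'

assert accepts('b')
assert accepts('aaab')
assert not accepts('aba')   # nothing may follow the b
assert not accepts('aa')    # never reached an accepting state
```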
PROPERTIES OF GRAMMARS
1. EQUIVALENT GRAMMARS: Two grammars G and G′ are said to be equivalent if the
languages they generate, L(G) and L(G′), are the same, though they need not give the same parse
tree for each sentence. E.g., G: A → Ax | y; G′: A → yB; B → xB | ε.

Note that the parse trees for the sentence yxx are markedly different. Since the parse tree often
reflects the semantic structure of a language, rewriting a grammar to make it easier to parse
may result in the loss of semantic information. In that case, a different parsing technique should
be used rather than rewriting or transforming the grammar.
2. AMBIGUOUS GRAMMARS: A grammar is said to be ambiguous if it has more than one
parse tree generating a given string of tokens. Ambiguity can be resolved either by rewriting the
grammar or by using disambiguating rules.

Illustration: The string 9 + 5 ∗ 2 is ambiguous under the following grammar productions, since
it has one parse tree grouping it as (9 + 5) ∗ 2 and another grouping it as 9 + (5 ∗ 2):
E → E + E | E ∗ E | F
F → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
3. RECURSIVE GRAMMARS: A production of a grammar is said to be recursive if the
non-terminal it defines appears in one of its alternatives on the right-hand side. E.g.,
variable_list → variable | variable_list , variable. Productions of this form are said to be
left-recursive, and have the general form A → u | Av, where A is a non-terminal and u and v are
arbitrary strings of terminals and non-terminals. Similarly, right-recursive productions have the
defined non-terminal occurring at the right end of one of their right-hand-side alternatives; they
have the general form A → u | vA. Left-recursive productions are a source of problems with some
parsing methods. There is, therefore, a standard transformation to give an equivalent grammar
that is right-recursive. E.g., A → u | Av can be replaced by: A → uB; B → vB | ε.
4. LEFTMOST AND RIGHTMOST DERIVATIONS: A leftmost derivation is a derivation
in which we always expand the leftmost non-terminal first; a derivation in which we always
expand the rightmost non-terminal first is a rightmost derivation.
5. LEFT FACTORING: This is a method by which an equivalent grammar is generated from
productions whose alternative right-hand sides have identical first parts. E.g., X → vw | vz can
be written equivalently as: X → vB; B → w | z. Left factoring can be very helpful in syntax
analysis.
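The left-recursion transformation in item 3 can be sanity-checked by brute-force derivation: expand each grammar a bounded number of steps and compare the short strings generated. An illustrative sketch (not from the source) with u = 'y' and v = 'x', where upper-case letters are non-terminals:

```python
def derive(productions, start, steps):
    # Repeatedly expand the first non-terminal (upper-case letter) in every
    # sentential form, collecting the fully terminal strings produced.
    frontier, terminal = {start}, set()
    for _ in range(steps):
        nxt = set()
        for s in frontier:
            nt = next((c for c in s if c.isupper()), None)
            if nt is None:
                terminal.add(s)
            else:
                for rhs in productions[nt]:
                    nxt.add(s.replace(nt, rhs, 1))
        frontier = nxt
    terminal |= {s for s in frontier if not any(c.isupper() for c in s)}
    return terminal

G1 = {'A': ['y', 'Ax']}              # A -> u | A v   (left-recursive)
G2 = {'A': ['yB'], 'B': ['xB', '']}  # A -> u B; B -> v B | ε
```

Comparing the strings of length ≤ 3 that each grammar derives shows the two generate the same short sentences, as equivalence requires.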
