
Lecture 12 14

Uploaded by

Sumit Kumar
Copyright
© © All Rights Reserved

Natural Language Processing

- Formal Language -

(formal) Language
(formal) Grammar
Formal Language
A formal language L is a set of finite-length
words (or "strings") over some finite
alphabet A. ε is the empty word.
Example:
A = {a, b, c}
L1 = {ab, c}
Formal Languages - Examples
Some examples of formal languages:
• the set of all words over {a, b},
• the set { a^n | n is a prime number },
• the set of syntactically correct programs in
some programming language, or
• the set of inputs upon which a certain
Turing machine halts.
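An infinite language cannot be listed, but it can often be given as a membership test. The sketch below does this for the second example, { a^n | n is a prime number }. The function name in_prime_language is a hypothetical helper chosen for illustration.

```python
from math import isqrt

def in_prime_language(word):
    """True if word = a^n for some prime n (hypothetical helper)."""
    if set(word) - {"a"}:          # any symbol other than 'a' rules the word out
        return False
    n = len(word)
    if n < 2:                      # 0 and 1 are not prime
        return False
    # n is prime if no d in 2..sqrt(n) divides it
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))

print(in_prime_language("aaa"))   # True, since 3 is prime
print(in_prime_language("aaaa"))  # False, since 4 = 2 * 2
```

The last example on the slide (inputs on which a Turing machine halts) has no such total membership test in general, which is why it is a useful contrast.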
Several operations can be used to produce new languages from
given ones. Suppose L1 and L2 are languages over some common
alphabet.
• The concatenation L1L2 consists of all strings of the form vw
where v is a string from L1 and w is a string from L2.
• The intersection of L1 and L2 consists of all strings which are
contained in L1 and also in L2.
• The union of L1 and L2 consists of all strings which are contained
in L1 or in L2.
• The complement of the language L1 consists of all strings over the
alphabet which are not contained in L1.
• The Kleene star L1* consists of all strings which can be written in
the form w1w2...wn with strings wi in L1 and n ≥ 0. Note that this
includes the empty string ε because n = 0 is allowed.
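For finite languages these operations map directly onto Python sets. The following is a minimal sketch, not a general implementation: L2 is an assumed second language made up for the example, and because the Kleene star and the complement are usually infinite, the helpers kleene_star and complement (hypothetical names) only list strings up to a length bound.

```python
from itertools import product

def concat(l1, l2):
    """All strings vw with v in l1 and w in l2."""
    return {v + w for v in l1 for w in l2}

def kleene_star(lang, max_len):
    """Strings of lang* with length <= max_len."""
    result = {""}                  # n = 0 contributes the empty string
    frontier = {""}
    while frontier:
        # extend every newly found string by one more word from lang
        frontier = {u + w for u in frontier for w in lang
                    if len(u + w) <= max_len} - result
        result |= frontier
    return result

def complement(lang, alphabet, max_len):
    """Strings over alphabet with length <= max_len that are not in lang."""
    universe = {"".join(p) for n in range(max_len + 1)
                for p in product(sorted(alphabet), repeat=n)}
    return universe - lang

A = {"a", "b", "c"}
L1 = {"ab", "c"}
L2 = {"c", "ca"}                   # assumed example language

print(sorted(concat(L1, L2)))      # ['abc', 'abca', 'cc', 'cca']
print(sorted(L1 & L2))             # intersection: ['c']
print(sorted(L1 | L2))             # union: ['ab', 'c', 'ca']
print(sorted(kleene_star(L1, 3)))  # ['', 'ab', 'abc', 'c', 'cab', 'cc', 'ccc']
```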
A formal language can be specified in a great
variety of ways, such as:
• Strings produced by some formal grammar (see
Chomsky hierarchy)
• Strings produced by a regular expression
• Strings accepted by some automaton, such as a
Turing machine or finite state automaton
• From a set of related YES/NO questions those
ones for which the answer is YES, see decision
problem
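As an illustration that different specifications can describe the same language, the sketch below gives the language (ab)* twice: once as a regular expression and once as a finite state automaton. The language, the state numbering and the function names are assumptions chosen for the example.

```python
import re

# Specification 1: a regular expression
PATTERN = re.compile(r"(ab)*")

def accepts_regex(word):
    return PATTERN.fullmatch(word) is not None

# Specification 2: a deterministic finite state automaton.
# State 0 = "expecting a" (also accepting), state 1 = "expecting b".
TRANSITIONS = {(0, "a"): 1, (1, "b"): 0}
START, ACCEPTING = 0, {0}

def accepts_dfa(word):
    state = START
    for symbol in word:
        state = TRANSITIONS.get((state, symbol))
        if state is None:          # no transition defined: reject
            return False
    return state in ACCEPTING

for w in ["", "ab", "abab", "aba", "ba"]:
    print(repr(w), accepts_regex(w), accepts_dfa(w))
```

Both functions answer the same YES/NO question for every input, which is the decision-problem view of the language.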
Formal Grammar - Definition

A formal grammar G = (N, Σ, P, S) consists of:


• A finite set N of nonterminal symbols.
• A finite set Σ of terminal symbols that is disjoint from
N.
• A finite set P of production rules, where each rule has the form
    string in (Σ ∪ N)* -> string in (Σ ∪ N)*
  – where * is the Kleene star and ∪ is set union;
  – the left-hand side of a rule must contain at least one
    nonterminal symbol.
• A symbol S in N that is indicated as the start symbol.
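The four components and their side conditions can be written down as a small data structure. This is a sketch under two assumptions: every symbol is a single character, and the class name Grammar and the method problems are hypothetical names chosen for illustration. The sample grammar uses N = {S, B}, Σ = {a, b, c} and the rules S -> aBSc, S -> abc, Ba -> aB, Bb -> bb.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # Σ
    productions: tuple       # P, as (lhs, rhs) pairs of strings
    start: str               # S

    def problems(self):
        """Return the violated conditions (empty list if well-formed)."""
        found = []
        symbols = self.nonterminals | self.terminals
        if self.nonterminals & self.terminals:
            found.append("N and Σ are not disjoint")
        if self.start not in self.nonterminals:
            found.append("start symbol is not in N")
        for lhs, rhs in self.productions:
            if not all(s in symbols for s in lhs + rhs):
                found.append(f"{lhs} -> {rhs}: unknown symbol")
            if not any(s in self.nonterminals for s in lhs):
                found.append(f"{lhs} -> {rhs}: no nonterminal on the left")
        return found

G = Grammar(frozenset("SB"), frozenset("abc"),
            (("S", "aBSc"), ("S", "abc"), ("Ba", "aB"), ("Bb", "bb")), "S")
print(G.problems())  # []
```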
Language of a Formal Grammar

The language of a formal grammar G = (N, Σ, P, S),
denoted as L(G), is defined as all those
strings over Σ that can be generated by starting
with the start symbol S and then applying the
production rules in P until no more nonterminal
symbols are present.
Language of a Formal Grammar

Example
Consider, for example, the grammar G with N =
{S, B}, Σ = {a, b, c}, P consisting of the
following production rules
1. S -> aBSc
2. S -> abc
3. Ba -> aB
4. Bb -> bb

This grammar defines the language {a^n b^n c^n | n > 0}
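For instance, n = 2 is obtained by S => aBSc => aBabcc => aaBbcc => aabbcc (rules 1, 2, 3, 4). The sketch below automates this by a breadth-first search over all strings derivable from S. It relies on the observation that no rule of this particular grammar shortens a string, so strings longer than the bound can be discarded. The function names are hypothetical.

```python
from collections import deque

RULES = [("S", "aBSc"), ("S", "abc"), ("Ba", "aB"), ("Bb", "bb")]
NONTERMINALS = {"S", "B"}

def successors(form):
    """All strings reachable from form by applying one rule once."""
    for lhs, rhs in RULES:
        pos = form.find(lhs)
        while pos != -1:
            yield form[:pos] + rhs + form[pos + len(lhs):]
            pos = form.find(lhs, pos + 1)

def generate(max_len):
    """Words of L(G) with length <= max_len."""
    seen = {"S"}
    queue = deque(["S"])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(s in NONTERMINALS for s in form):
            words.add(form)        # only terminals left: a word of L(G)
            continue
        for nxt in successors(form):
            # safe to prune: no rule here makes a string shorter
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return words

print(sorted(generate(9), key=len))  # ['abc', 'aabbcc', 'aaabbbccc']
```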


Chomsky's four types of grammars
• Type-0 grammars (unrestricted grammars):
languages recognized by a Turing machine
• Type-1 grammars (context-sensitive grammars):
languages recognized by a Turing machine with bounded tape
• Type-2 grammars (context-free grammars):
languages recognized by a non-deterministic pushdown automaton
• Type-3 grammars (regular grammars):
languages described by regular expressions and recognized by a
finite state automaton
Grammars, Languages, Machines

Grammar   Language                 Machine                                          Production rules
Type-0    Recursively enumerable   Turing machine                                   No restrictions
Type-1    Context-sensitive        Linear-bounded non-deterministic Turing machine  αAβ -> αγβ
Type-2    Context-free             Non-deterministic pushdown automaton             A -> γ
Type-3    Regular                  Finite state automaton                           A -> aB, A -> a
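The rule shapes in the table can be checked mechanically. The sketch below returns the most restrictive type whose shape every rule of a grammar satisfies. It assumes single-character symbols, takes the rule shapes literally as written in the table (with γ non-empty for Type-1), and uses hypothetical function names.

```python
def is_regular(lhs, rhs, nonterminals):
    """A -> a or A -> aB."""
    return (lhs in nonterminals and len(rhs) in (1, 2)
            and rhs[0] not in nonterminals
            and (len(rhs) == 1 or rhs[1] in nonterminals))

def is_context_free(lhs, rhs, nonterminals):
    """A -> γ: the left-hand side is a single nonterminal."""
    return lhs in nonterminals

def is_context_sensitive(lhs, rhs, nonterminals):
    """αAβ -> αγβ with γ non-empty."""
    for i, sym in enumerate(lhs):
        if sym not in nonterminals:
            continue
        alpha, beta = lhs[:i], lhs[i + 1:]
        gamma_len = len(rhs) - len(alpha) - len(beta)
        if gamma_len >= 1 and rhs.startswith(alpha) and rhs.endswith(beta):
            return True
    return False

def classify(rules, nonterminals):
    """Highest type number whose rule shape all rules match."""
    checks = [(3, is_regular), (2, is_context_free), (1, is_context_sensitive)]
    for type_number, check in checks:
        if all(check(lhs, rhs, nonterminals) for lhs, rhs in rules):
            return type_number
    return 0

print(classify([("S", "aS"), ("S", "b")], {"S"}))      # 3
print(classify([("S", "aSb"), ("S", "ab")], {"S"}))    # 2
print(classify([("S", "aBSc"), ("S", "abc"),
                ("Ba", "aB"), ("Bb", "bb")], {"S", "B"}))  # 0
```

The last result is 0 only because the rule Ba -> aB does not literally have the shape αAβ -> αγβ. The rule does not shorten the string, and grammars whose rules never shorten a string are known to generate context-sensitive languages, so the language {a^n b^n c^n | n > 0} is still context-sensitive. The classifier judges the grammar as written, not the language it generates.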
Example
