Grammar Based Language Model
Human Communication 1
Lecture 6
• Two kinds of language models
  • Grammar-based models
  • Statistical models
• Today we continue with grammar-based models
Special symbols
• Formal grammars usually have two special symbols
  • S: the start symbol
  • ε: the empty string (sometimes: λ)

Terminology
• Alphabet: A set of (terminal and nonterminal) symbols
• Word: A string of symbols from an alphabet (what we also called ‘sentence’)
• Grammar: A set of rules defined on an alphabet
• Language: The set of words defined by a grammar
Representing formal grammar

Formal definition
A grammar G = ⟨Φ, Σ, R, S⟩ consists of four components.
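The 4-tuple can be mirrored directly in code. The sketch below is only an illustration: it assumes the conventional reading in which Φ is the set of nonterminal symbols, Σ the set of terminal symbols, R the set of rewrite rules and S the start symbol, and the class name `Grammar` and its field names are hypothetical. The toy grammar uses the rules S → aS, S → abS, S → c.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A formal grammar as a 4-tuple (field meanings are an assumed reading)."""
    nonterminals: frozenset  # assumed to correspond to Φ
    terminals: frozenset     # assumed to correspond to Σ
    rules: tuple             # R, stored as (left-hand side, right-hand side) pairs
    start: str               # S

    def __post_init__(self):
        # Basic sanity checks on the four components.
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a nonterminal")
        if self.nonterminals & self.terminals:
            raise ValueError("terminals and nonterminals must be disjoint")
        alphabet = self.nonterminals | self.terminals
        for lhs, rhs in self.rules:
            if not lhs:
                raise ValueError("left-hand side must not be empty")
            for symbol in lhs + rhs:
                if symbol not in alphabet:
                    raise ValueError(f"unknown symbol {symbol!r}")


# Toy grammar with one nonterminal and three rules.
toy = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b", "c"}),
    rules=(("S", "aS"), ("S", "abS"), ("S", "c")),
    start="S",
)
print(toy.start, len(toy.rules))
```

Storing each rule as a pair of strings keeps the sketch short; it only works while every symbol is a single character.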
Example
• Grammar
  • Alphabet: a, b
  • Start symbol: S
  • Rules:

Solution
• Apply rewrite rules until the result contains only symbols from the alphabet.
• We can rewrite
  • S to aSb by replacing S with aSb (rule 1);
  • aSb to aaSbb (rule 1)
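The rewriting steps above can be sketched in a few lines of code. Rule 1 (S → aSb) is taken from the solution text; the terminating rule S → ab is a hypothetical stand-in, added only so that the derivation can end in a string of alphabet symbols. The function names `rewrite` and `derive` are likewise illustrative.

```python
def rewrite(form: str, lhs: str, rhs: str) -> str:
    """Replace the leftmost occurrence of lhs in form with rhs."""
    if lhs not in form:
        raise ValueError(f"{lhs!r} does not occur in {form!r}")
    return form.replace(lhs, rhs, 1)


def derive(start: str, steps) -> list:
    """Apply the given rules one after another and record every intermediate form."""
    forms = [start]
    for lhs, rhs in steps:
        forms.append(rewrite(forms[-1], lhs, rhs))
    return forms


RULE_1 = ("S", "aSb")  # rule 1, as used in the solution
RULE_2 = ("S", "ab")   # ASSUMPTION: hypothetical terminating rule, not from the slides

derivation = derive("S", [RULE_1, RULE_1, RULE_2])
print(" => ".join(derivation))  # S => aSb => aaSbb => aaabbb
```

The first two steps reproduce the rewrites listed in the solution; the last step only shows how a derivation stops once no start symbol remains.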
Chomsky hierarchy
Type-0 · recursively enumerable
Type-1 · context-sensitive
Type-2 · context-free
Type-3 · regular

What type?
S → aS
S → abS
S → c
What type?
S → aS
S → aABb
A → Aa
A → a
B → Bb
B → b

What type?
S → aS
S → aABb
aA → Aa
Aa → a
B → Bb
B → b
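One way to check answers to the "What type?" exercises is to test every rule against the rule shape that each type allows. The sketch below makes several simplifying assumptions: uppercase letters are nonterminals and lowercase letters are terminals; Type-3 is checked as right-linear only (A → w or A → wB, where w is a string of terminals), so a left-linear grammar would not be reported as regular; Type-1 is checked as "no rule makes the string shorter". The function names are illustrative. The result describes the grammar as written, not necessarily the language it generates.

```python
def is_nonterminal(symbol: str) -> bool:
    # ASSUMPTION: uppercase letters are nonterminals, everything else is a terminal.
    return symbol.isupper()


def rule_type(lhs: str, rhs: str) -> int:
    """Return the most restrictive Chomsky type (3, 2, 1 or 0) whose shape this rule fits."""
    if len(lhs) == 1 and is_nonterminal(lhs):
        # Right-linear shape: terminals only, optionally followed by one nonterminal.
        body = rhs[:-1] if rhs and is_nonterminal(rhs[-1]) else rhs
        if all(not is_nonterminal(s) for s in body):
            return 3
        return 2  # single nonterminal on the left: context-free shape
    if len(lhs) <= len(rhs):
        return 1  # the rule does not shorten the string
    return 0      # unrestricted


def grammar_type(rules) -> int:
    """A grammar is only as restricted as its least restricted rule."""
    return min(rule_type(lhs, rhs) for lhs, rhs in rules)


G1 = [("S", "aS"), ("S", "abS"), ("S", "c")]
G2 = [("S", "aS"), ("S", "aABb"), ("A", "Aa"), ("A", "a"), ("B", "Bb"), ("B", "b")]
G3 = [("S", "aS"), ("S", "aABb"), ("aA", "Aa"), ("Aa", "a"), ("B", "Bb"), ("B", "b")]

for name, grammar in [("G1", G1), ("G2", G2), ("G3", G3)]:
    print(name, "is Type-%d" % grammar_type(grammar))
```

Under these assumptions the three grammars come out as Type-3, Type-2 and Type-0 respectively; the last one drops to Type-0 because the rule Aa → a shortens the string.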
Why the distinction?
• Why is this distinction relevant?
• Answer: different computational complexity
• This means: different amounts of resources needed
• This means in particular: different execution times
• Generally: The simpler the grammar type, the faster the parsing and generation

Human grammar
• What type of grammar is human grammar?
• Probably in between context-free and context-sensitive: mildly context-sensitive grammars (MCSG)
• MCSGs
  • allow certain kinds of context dependencies
  • have low computational complexity (they can be parsed in polynomial time)
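As a rough illustration of why the grammar type matters for speed: the regular grammar S → aS, S → abS, S → c from the first "What type?" exercise can be recognised by a finite automaton that reads each input symbol exactly once, so the running time grows linearly with the length of the word. The automaton below was built by hand for this sketch; the state names `q0`, `q1`, `qf` and the function `accepts` are assumptions, not part of the slides.

```python
# Hand-built deterministic automaton for the words generated by
#   S -> aS, S -> abS, S -> c
# q0: ready to start a new rule application
# q1: just read an 'a' (it may be a complete "a" or the start of "ab")
# qf: read the final 'c'; no further input is allowed
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q0", "c"): "qf",
    ("q1", "a"): "q1",
    ("q1", "b"): "q0",
    ("q1", "c"): "qf",
}


def accepts(word: str) -> bool:
    """Recognise a word in a single left-to-right pass (one table lookup per symbol)."""
    state = "q0"
    for symbol in word:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition defined: reject immediately
            return False
    return state == "qf"


for word in ["c", "ac", "abc", "aabac", "bc", "abbc", "ca", ""]:
    print(repr(word), accepts(word))
```

General context-free grammars need more work than this single pass; the standard CYK algorithm, for example, takes time cubic in the length of the input.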