CYK Algorithm & Tree Based Language Models
Key Concepts:
• Context-Free Grammar (CFG): A grammar where each production rule has a single
non-terminal on the left-hand side.
• Chomsky Normal Form (CNF): A restricted form of CFG where every production is
either:
o A → BC (two non-terminals)
o A → a (a terminal)
Given:
• A CFG in CNF
• An input string w = w[1...n]
Step-by-Step Process:
1. Initialize a table T[n][n], where each cell T[i][j] holds the set of non-terminals that can
generate the substring w[i...j].
2. Base Case (Length = 1): For each position i, add every non-terminal A with a rule
A → w[i] to T[i][i].
3. Recursive Step (Length > 1): For each substring length l = 2 to n, and each starting
position i, compute the possible non-terminals for substring w[i...i+l-1] by:
o Trying every split point k: if B ∈ T[i][k], C ∈ T[k+1][i+l-1], and A → BC is a
production rule, add A to T[i][i+l-1].
4. Accept: The string is in the language if and only if the start symbol S ∈ T[1][n].
Time Complexity: O(n³ · |G|), where n is the length of the input string and |G| is the
number of production rules in the grammar.
Example:
Grammar in CNF:
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
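The steps above can be sketched as a small recognizer in Python. The test string "baaba" is an assumed example (it does not appear in the notes), chosen because this grammar derives it from S:

```python
# Minimal CYK recognizer for a grammar in CNF.
# terminal_rules: terminal -> non-terminals; binary_rules: (B, C) -> non-terminals.

def cyk(word, terminal_rules, binary_rules, start="S"):
    n = len(word)
    # T[i][j] = set of non-terminals deriving word[i..j] (0-indexed, inclusive)
    T = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):                  # base case: substrings of length 1
        T[i][i] = set(terminal_rules.get(ch, []))
    for length in range(2, n + 1):                 # substring lengths 2..n
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):                  # every split point
                for B in T[i][k]:
                    for C in T[k + 1][j]:
                        T[i][j] |= set(binary_rules.get((B, C), []))
    return start in T[0][n - 1]

# The example grammar from the notes.
terminal_rules = {"a": ["A", "C"], "b": ["B"]}
binary_rules = {
    ("A", "B"): ["S", "C"],
    ("B", "C"): ["S"],
    ("B", "A"): ["A"],
    ("C", "C"): ["B"],
}

print(cyk("baaba", terminal_rules, binary_rules))  # True: "baaba" is derivable from S
```

The three nested loops over length, start position, and split point give the n³ factor of the complexity bound; the rule lookups account for the |G| factor.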
Applications in NLP:
Tree-based language models aim to improve upon traditional n-gram or sequential neural
models (like RNNs or LSTMs) by explicitly modeling the grammatical structure of a sentence
using constituency trees or dependency trees.
Probabilistic Context-Free Grammars (PCFGs):
• How it works: Each rule (e.g., NP → DT NN) is assigned a probability, typically estimated
from rule frequencies in a treebank.
• Used in: Parsing and modeling of languages with complex syntax (such as German or Hindi).
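As a minimal sketch of this estimation (the rule counts below are made up, not taken from a real treebank): the probability of each rule A → β is its count divided by the total count for A, and the probability of a parse is the product of its rule probabilities.

```python
from collections import Counter

# Toy rule counts (assumed, not from a real treebank).
rule_counts = Counter({
    ("NP", ("DT", "NN")): 30,
    ("NP", ("NNP",)): 10,
    ("VP", ("VB", "NP")): 25,
    ("VP", ("VB",)): 5,
})

# Maximum-likelihood estimate: P(A -> beta) = count(A -> beta) / count(A).
lhs_totals = Counter()
for (lhs, _), c in rule_counts.items():
    lhs_totals[lhs] += c
rule_prob = {r: c / lhs_totals[r[0]] for r, c in rule_counts.items()}

# Probability of a parse = product of the probabilities of the rules it uses.
parse_rules = [("NP", ("DT", "NN")), ("VP", ("VB", "NP"))]
p = 1.0
for r in parse_rules:
    p *= rule_prob[r]
print(p)  # 0.75 * (25/30) = 0.625
```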
Recursive Neural Networks (Tree-RNNs):
• Structure: A neural model that recursively combines child node vectors to form
parent node vectors.
• Parse Tree Usage: Applies the same composition function at each node of a syntax tree.
• Limitation: Shallow structure and hard to train; replaced in many areas by Tree-LSTMs.
Tree-LSTMs:
• Introduced by: Kai Sheng Tai, Richard Socher, and Christopher Manning (2015).
• Used in:
o Sentiment analysis
o Sentence similarity
o Syntax-aware classification
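A toy sketch of the recursive composition (random vectors stand in for trained embeddings and weights; the parse tree and dimension are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                        # embedding dimension (assumed)
W = rng.standard_normal((d, 2 * d)) * 0.1    # one composition matrix, shared by all nodes
b = np.zeros(d)

def compose(left, right):
    """Parent vector from two child vectors: the same function at every tree node."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Word vectors for "the movie was great" (random stand-ins for embeddings).
the, movie, was, great = (rng.standard_normal(d) for _ in range(4))

# Follow an assumed parse tree: ((the movie) (was great))
np_vec = compose(the, movie)       # NP node
vp_vec = compose(was, great)       # VP node
s_vec = compose(np_vec, vp_vec)    # S node: the sentence representation

print(s_vec.shape)  # (4,)
```

A Tree-LSTM replaces this single tanh composition with an LSTM-style cell that has one forget gate per child, which makes deep trees easier to train.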
Dependency-Based Language Models:
• Focus: Dependency parse trees, where each word is connected to other words through
grammatical relations (e.g., subject, object).
• Model: Predicts words conditioned on their syntactic heads and dependents rather than
on the left-to-right sequence.
• Examples: Eisner's Dependency Model (1996); the Structured Language Model of
Chelba & Jelinek (1998).
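A minimal sketch of the conditioning idea (the dependency arcs below are made up for illustration and are not a real model of the cited papers): a word is scored given its head, estimated from arc counts rather than from its left context.

```python
from collections import Counter

# Toy dependency arcs (head, dependent), assumed for illustration:
# e.g., in "the dog chased the cat", "chased" heads "dog" and "cat".
arcs = [
    ("chased", "dog"), ("dog", "the"),
    ("chased", "cat"), ("cat", "the"),
    ("chased", "dog"), ("dog", "a"),
]

# P(dependent | head): estimated from arc counts, not from left-to-right context.
pair_counts = Counter(arcs)
head_counts = Counter(h for h, _ in arcs)

def p_dep(dep, head):
    return pair_counts[(head, dep)] / head_counts[head]

print(p_dep("dog", "chased"))  # 2/3: two of the three "chased" arcs point to "dog"
```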
Benefits:
• Captures long-distance dependencies between syntactically related words that
n-gram models miss.
Tree Transformers:
• What it is: Transformer models that integrate syntactic trees (parse trees) into the
attention mechanism.
• How: Tree structure is injected into self-attention, for example by constraining or
biasing attention weights according to the parse tree.
Examples:
• Syntax-Aware Transformers
• TreeFormer
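One illustrative way to inject a tree into attention (a sketch of the general idea, not the actual mechanism of the models listed above) is a constituent mask: a token may attend to another token only if some constituent span contains both.

```python
import numpy as np

# Tokens of "the dog barked" with assumed constituent spans:
# NP = tokens 0..1, VP = token 2.
n, d = 3, 4
spans = [(0, 1), (2, 2)]

# Additive mask: -inf blocks attention across constituents, 0 allows it.
mask = np.full((n, n), -np.inf)
for lo, hi in spans:
    mask[lo:hi + 1, lo:hi + 1] = 0.0

rng = np.random.default_rng(1)
Q = rng.standard_normal((n, d))
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))

scores = Q @ K.T / np.sqrt(d) + mask               # tree mask biases the attention logits
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)      # row-wise softmax
out = weights @ V

print(out.shape)  # (3, 4); token 2 attends only to itself under this mask
```

Real syntax-aware transformers typically soften this, e.g., biasing logits by tree distance instead of hard-masking, or applying the mask only in some heads or layers.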
Summary:
The CYK algorithm decides membership for any CFG in Chomsky Normal Form via dynamic
programming in O(n³ · |G|) time. Tree-based language models (PCFGs, Recursive Neural
Networks, Tree-LSTMs, dependency-based models, and Tree Transformers) share the idea of
modeling explicit syntactic structure rather than treating sentences as flat sequences.