Coding Theory and Linguistics
Matilde Marcolli
CS101: Mathematical and Computational Linguistics
Winter 2015
Code parameters
• R = k/n = transmission rate of the code
• δ = d/n = relative minimum distance of the code
Small R: fewer code words and easier decoding, but a longer encoded
signal. Small δ: too many code words lie close to the received one, so
decoding is more difficult.
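As a minimal sketch of these two parameters, the snippet below computes R and δ for a small block code given as a list of words. It assumes k = log_q(#C) (so k need not be an integer for non-linear codes) and that d is the minimum Hamming distance between distinct code words; the function names and the example code are illustrative, not taken from the slides.

```python
import math
from itertools import combinations


def hamming(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))


def code_parameters(code, q):
    """Return (n, k, d, R, delta) for a block code over an alphabet of size q.

    Assumptions: all words have the same length n, the code has at least
    two words, and k is defined as log_q of the number of code words.
    """
    n = len(code[0])
    k = math.log(len(code), q)
    d = min(hamming(u, v) for u, v in combinations(code, 2))
    return n, k, d, k / n, d / n


# Illustrative example: binary repetition code of length 3.
rep3 = ["000", "111"]
n, k, d, R, delta = code_parameters(rep3, q=2)
print(f"n={n}, k={k:.3f}, d={d}, R={R:.3f}, delta={delta:.3f}")
```

The repetition code illustrates the trade-off: a low rate R buys a large relative distance δ.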
$$\kappa(w) = \liminf_{w_n \to w} \frac{K(w_n)}{\ell(w_n)}$$
• Levin (semi)measure
$$\kappa(w) = \liminf_{w_n \to w} \frac{-\log_q \mu_U(w_n)}{\ell(w_n)}$$
$$\kappa(x) \le \lim \frac{-\log_q \mu(w)}{\ell(w)} = R(C)$$
$(a_1, \dots, a_n) \in C_3$ iff $a_i = a$.
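• Plotkin bound: $\alpha_q(\delta) = 0$ for $\delta \ge \frac{q-1}{q}$
• Singleton bound: $\alpha_q(\delta) \le 1 - \delta$
• Hamming bound: $\alpha_q(\delta) \le 1 - H_q\!\left(\frac{\delta}{2}\right)$
• Gilbert–Varshamov bound: $\alpha_q(\delta) \ge 1 - H_q(\delta)$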
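The bounds above can be evaluated numerically. The sketch below assumes the standard q-ary entropy H_q(x) = x log_q(q−1) − x log_q x − (1−x) log_q(1−x), which the slides use but do not define here; all function names are illustrative.

```python
import math


def hq(x, q):
    """q-ary entropy H_q(x) for 0 <= x <= 1 (standard definition, assumed)."""
    if x == 0:
        return 0.0
    if x == 1:
        return math.log(q - 1, q)
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))


def plotkin_vanishes(delta, q):
    """True when the Plotkin bound forces alpha_q(delta) = 0."""
    return delta >= (q - 1) / q


def singleton_upper(delta):
    """Singleton upper bound on alpha_q(delta)."""
    return 1 - delta


def hamming_upper(delta, q):
    """Hamming upper bound on alpha_q(delta)."""
    return 1 - hq(delta / 2, q)


def gv_lower(delta, q):
    """Gilbert-Varshamov lower bound, meaningful for delta < (q-1)/q."""
    return 1 - hq(delta, q)


q, delta = 2, 0.2
print("GV lower bound:      ", round(gv_lower(delta, q), 4))
print("Hamming upper bound: ", round(hamming_upper(delta, q), 4))
print("Singleton upper bound:", round(singleton_upper(delta), 4))
print("Plotkin forces zero: ", plotkin_vanishes(delta, q))
```

For any δ below (q−1)/q, the asymptotic bound α_q(δ) lies between the Gilbert–Varshamov value and the smaller of the two upper bounds.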
• variant with prefix-free complexity: $Z_P(X, \beta) = \sum_{x \in X} KP(x)^{-\beta}$
$$\delta_H(L_i, L_j) = \frac{1}{N} \sum_{\ell=1}^{N} \left| \Pi_\ell(L_i) - \Pi_\ell(L_j) \right|$$
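A minimal sketch of this distance, assuming each language L is represented by a vector of N binary values Π_1(L), ..., Π_N(L). The two vectors below are hypothetical and chosen only for illustration; they are not data for any real language.

```python
def relative_hamming(params_i, params_j):
    """Relative Hamming distance between two parameter vectors.

    Computes (1/N) * sum over l of |Pi_l(L_i) - Pi_l(L_j)|, assuming both
    vectors have the same length N and binary (0/1) entries.
    """
    if len(params_i) != len(params_j):
        raise ValueError("parameter vectors must have the same length")
    n = len(params_i)
    return sum(abs(a - b) for a, b in zip(params_i, params_j)) / n


# Hypothetical parameter vectors (illustrative values only).
lang_a = [1, 0, 1, 1, 0, 0]
lang_b = [1, 1, 1, 0, 0, 0]
print(relative_hamming(lang_a, lang_b))
```

With binary entries, each term of the sum is 1 exactly when the two languages disagree on that parameter, so the result is the fraction of parameters on which they differ.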