Codes, Curves and Cryptography: Informal Notes. I

"God created the integers. Everything else is the work of Man."
Leopold Kronecker
Viet Nguyen-Khac
Hanoi Institute of Mathematics
Galois Fields
$\mathbb{F}_p$, $p$ a prime (easy to imagine)
$\mathbb{F}_q$, $q = p^e$, $e > 1$ (a bit more difficult)
(Hint: $\mathbb{F}_q \cong \mathbb{F}_p[x]/(f(x))$ for some irreducible (better: primitive) polynomial $f$ of degree $e$)
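The hint can be made concrete with a small sketch. It builds $\mathbb{F}_8 = \mathbb{F}_2[x]/(x^3 + x + 1)$, storing a polynomial as an integer bitmask. The choice $f(x) = x^3 + x + 1$ and the helper names `gf_mul` and `powers_of_x` are assumptions made for illustration; they do not come from the notes.

```python
def gf_mul(a, b, f=0b1011, e=3):
    """Multiply a and b in F_2[x]/(f), with polynomials stored as bitmasks.

    Defaults (assumed for illustration): f = x^3 + x + 1, so the field is F_8.
    """
    result = 0
    while b:
        if b & 1:          # current coefficient of b is 1: add the shifted a
            result ^= a
        b >>= 1
        a <<= 1            # multiply a by x
        if (a >> e) & 1:   # degree reached e: reduce modulo f
            a ^= f
    return result


def powers_of_x(f=0b1011, e=3):
    """List 1, x, x^2, ... until the powers return to 1."""
    out, p = [], 1
    while True:
        out.append(p)
        p = gf_mul(p, 0b10, f, e)
        if p == 1:
            return out
```

With these defaults the powers of $x$ run through all 7 non-zero elements, which is what "primitive" means for $f$.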
Finite Metric Spaces
$A$: an alphabet, usually $A = \mathbb{F}_q$
$A^n := A \times \cdots \times A$ ($n$ times)
Hamming distance: for $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n) \in \mathbb{F}_q^n$,
$d(x, y) := \#\{i : x_i \neq y_i\}$
$(\mathbb{F}_q^n, d)$ is a finite metric space (and an $\mathbb{F}_q$-vector space)
The ISBN Code: $a_1$-$a_2 a_3 a_4$-$a_5 a_6 a_7 a_8 a_9$-$a_{10}$
ISBN (International Standard Book Number)
$\sum_{i=1}^{10} i\,a_i \equiv 0 \pmod{11}$ ($a_{10}$ is the check digit)
If $a_{10} \equiv 10 \pmod{11}$, then it is written as the symbol X (see also the Maple worksheet).
The code can detect a single error.
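A minimal sketch of the ISBN check, assuming the weights $i = 1, \ldots, 10$ exactly as in the congruence above. The function names are hypothetical, and the sample number 0-306-40615-2 is a commonly quoted valid ISBN-10 used here only as test data.

```python
def isbn10_check_digit(first9):
    """Return the check symbol a10 for the first nine digits (given as a string)."""
    # 10 = -1 (mod 11), so sum_{i<=10} i*a_i = 0 forces a10 = sum_{i<=9} i*a_i (mod 11).
    r = sum(i * int(c) for i, c in enumerate(first9, start=1)) % 11
    return "X" if r == 10 else str(r)


def isbn10_is_valid(isbn):
    """Check sum_{i=1}^{10} i*a_i = 0 (mod 11); hyphens and spaces are ignored."""
    chars = [c for c in isbn if c not in "- "]
    if len(chars) != 10 or not all(c.isdigit() for c in chars[:9]):
        return False
    if not (chars[9].isdigit() or chars[9] in "xX"):
        return False
    last = 10 if chars[9] in "xX" else int(chars[9])
    values = [int(c) for c in chars[:9]] + [last]
    return sum(i * v for i, v in enumerate(values, start=1)) % 11 == 0
```

Changing any single digit alters the weighted sum by $i \cdot e$ with $1 \leq i \leq 10$ and $e \not\equiv 0$, which is non-zero modulo the prime 11; this is why one error is always detected.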
The Repetition Code
Instead of an information bit a, just send aaa (three times): this code can detect 2 errors and correct 1 error.
Quiz: How about sending n times?
In general it is inefficient (Topological Economy Principle in AG, V. I. Arnold?!)
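As a hedged answer to the quiz, here is a sketch of majority-vote decoding for the binary repetition code of odd length $n$; it corrects up to $(n-1)/2$ errors at the cost of rate $1/n$. The names `rep_encode` and `rep_decode` are illustrative only.

```python
def rep_encode(bit, n=3):
    """Send the information bit n times."""
    return [bit] * n


def rep_decode(word):
    """Majority vote; assumes odd length, so no ties occur."""
    return int(sum(word) > len(word) // 2)
```

For example, with $n = 5$ the word `[0, 1, 1, 0, 0]` (two flipped bits) still decodes to 0.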
Basic Concepts
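A code $C$ = a subset of $\mathbb{F}_q^n$
Codewords = elements of $C$
A linear code $C$ = an $\mathbb{F}_q$-linear subspace of $\mathbb{F}_q^n$
$w(x) := \#\{i : x_i \neq 0\}$: the Hamming weight
The minimum distance:
$d_{\min}(C) := \min\{d(x, y) : x \neq y \in C\}$
$d_{\min}(C) = \min\{w(a) : a \neq 0 \in C\}$ for linear $C$
Fact. An $[n, k, d]_q$-code $C$ can correct $t := \lfloor (d-1)/2 \rfloor$ errors and, at the same time, detect $\lfloor d/2 \rfloor$ errors. (Used for detection only, it detects up to $d - 1$ errors.)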
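The definitions above can be checked by brute force on tiny codes. This is only a sketch with hypothetical helper names; the pairwise scan is quadratic in the number of codewords and is meant for illustration.

```python
from itertools import combinations, product


def hamming_distance(x, y):
    """Number of coordinates where x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def weight(x):
    """Number of non-zero coordinates of x."""
    return sum(a != 0 for a in x)


def min_distance(code):
    """Minimum distance by comparing all pairs of distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))


def correctable(d):
    """Number of errors t = floor((d - 1) / 2) a code of distance d can correct."""
    return (d - 1) // 2


# Two small binary examples: the length-5 repetition code and the
# even-weight (parity check) code of length 4.
REPETITION_5 = [(0,) * 5, (1,) * 5]
EVEN_WEIGHT_4 = [x for x in product((0, 1), repeat=4) if sum(x) % 2 == 0]
```

For the linear code `EVEN_WEIGHT_4` the pairwise minimum agrees with the minimum weight of a non-zero codeword, as the second formula states.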
The Code Domain
$M := |C|$: the number of codewords
$k(C) := \log_q M$: the log-cardinality of $C$
($k(C) = \dim_{\mathbb{F}_q} C$, if $C$ is a linear code)
$R(C) := k/n$: the information rate
$\delta(C) := d_{\min}/n$: the relative minimum distance
In the (reversed) plane $(R, \delta)$
$V_q := \{(R(C), \delta(C)) \in [0, 1] \times [0, 1]\}$
(the points are counted up to equivalence of codes and with multiplicities)
$U_q := \{\text{limit points of } V_q\}$, $V_q \setminus U_q =: \{\text{isolated codes}\}$
Encoding+Transmission (error)+Decoding:
$M$: the message space
encoding $= E : M \to C$ (usually an inclusion)
decoding $= D : \mathbb{F}_q^n \to C$ s.t. $D(a) = a$, $\forall a \in C$ (a retraction)
The decoding strategy: nearest neighbour (or standard) decoding,
$D(b)$ = a codeword nearest to $b$ (which may not be unique)
A brute-force method is to compare $b$ with all $q^k$ codewords (infeasible when $k$ is large)
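The brute-force method just mentioned might look as follows. The function name and the small $[5, 2, 3]_2$ example code are assumptions chosen for illustration; ties are broken by the order in which the codewords are listed.

```python
def nearest_codeword(code, b):
    """Brute-force nearest-neighbour decoding: scan every codeword.

    If several codewords are equally close, the first one in `code` is returned.
    """
    return min(code, key=lambda c: sum(x != y for x, y in zip(c, b)))


# A small binary linear code with minimum distance 3 (illustrative choice).
EXAMPLE_CODE = [
    (0, 0, 0, 0, 0),
    (1, 0, 1, 1, 0),
    (0, 1, 1, 0, 1),
    (1, 1, 0, 1, 1),
]
```

The scan costs one distance computation per codeword, so the work grows like $q^k$; this is the reason the method breaks down for large $k$.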
The Main Problem of Coding Theory.
Find good codes, i.e. with both R, δ large
(efficiency + high capability to correct errors)
The whole space $\mathbb{F}_q^n$ is an $[n, n, 1]_q$-code.
The ISBN code is not linear, but it can be considered as a subcode of a linear code $\subset \mathbb{F}_{11}^{10}$.
The parity check code: an $[8, 7, 2]_2$-code going back to the time of punched paper tape,
$C := \{x \in \mathbb{F}_2^8 : \sum_{i=1}^{8} x_i = 0\}$
The repetition code has parameters $[n, 1, n]_q$.
A refined ISBN code: an $[11, 9, 3]_{11}$-code
$C := \{(x_0, \ldots, x_{10}) \in \mathbb{F}_{11}^{11} : \sum_{i=0}^{10} x_i = 0,\ \sum_{i=0}^{10} i\,x_i = 0\}$
(a dual first order Reed-Muller code)
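Since the refined ISBN code has distance 3, it corrects one error. A sketch of how the two checks locate it: if a single error $e$ occurs at position $p$, then $\sum y_i = e$ and $\sum i\,y_i = p\,e$, so $p$ is the quotient of the two sums in $\mathbb{F}_{11}$. The function names and the choice of $x_2, \ldots, x_{10}$ as information digits are assumptions for this example.

```python
P = 11  # field size; all arithmetic is modulo the prime 11


def encode(info):
    """Extend nine information digits (x2..x10) to a codeword (x0..x10)."""
    # x1 is forced by sum i*x_i = 0, then x0 by sum x_i = 0.
    x1 = -sum(i * v for i, v in enumerate(info, start=2)) % P
    x0 = -(x1 + sum(info)) % P
    return [x0, x1] + list(info)


def correct_single_error(y):
    """Correct at most one wrong symbol in the received word y (length 11)."""
    s0 = sum(y) % P                                # equals the error value e
    s1 = sum(i * v for i, v in enumerate(y)) % P   # equals p * e
    if s0 == 0 and s1 == 0:
        return list(y)
    if s0 == 0:
        raise ValueError("more than one error")
    pos = s1 * pow(s0, -1, P) % P  # modular inverse via pow needs Python 3.8+
    z = list(y)
    z[pos] = (z[pos] - s0) % P
    return z
```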

Linear Codes
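$\varphi : \mathbb{F}_q^k \to \mathbb{F}_q^n \longleftrightarrow G$: generator $k \times n$ matrix
$\psi : \mathbb{F}_q^n \to \mathbb{F}_q^{n-k} \longleftrightarrow H$: parity check $(n-k) \times n$ matrix
$C := \{u \cdot G : u \in \mathbb{F}_q^k\}$ (Im $\varphi$)
$C := \{x \in \mathbb{F}_q^n : H \cdot {}^t x = 0\}$ (Ker $\psi$)
$u_1 u_2 \ldots u_k$: a message; $x_1 x_2 \ldots x_n$: a codeword
$x_i = u_i$, $i = 1, \ldots, k$; $x_{k+1}, \ldots, x_n$: check digits
Then $H = (A \mid I_{n-k})$, $G = (I_k \mid B)$ with $B = -{}^t A$.
$C^{\perp} := \{x \in \mathbb{F}_q^n : x \cdot c := \sum x_i c_i = 0,\ \forall c \in C\}$: the dual code
Proposition. For a non-zero (linear) code $C$,
$d_{\min}(C) = \max\{d : \text{any } d - 1 \text{ column vectors of } H \text{ are linearly independent}\}$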
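The relation between $G$ and $H$ and the Proposition can be tested numerically. The sketch below assumes $q$ is prime (so arithmetic is plain modular arithmetic) and uses hypothetical helper names; the search over column subsets is exponential and only suitable for very small codes.

```python
from itertools import combinations, product


def systematic_pair(A, q):
    """From an (n-k) x k matrix A build H = (A | I) and G = (I | -A^T) over F_q."""
    r, k = len(A), len(A[0])
    H = [list(A[i]) + [int(i == j) for j in range(r)] for i in range(r)]
    G = [[int(i == j) for j in range(k)] + [(-A[j][i]) % q for j in range(r)]
         for i in range(k)]
    return G, H


def min_distance_from_H(H, q):
    """Smallest d such that some d columns of H are linearly dependent."""
    n = len(H[0])
    cols = [[row[j] for row in H] for j in range(n)]
    for d in range(1, n + 1):
        for chosen in combinations(cols, d):
            # A minimal dependent set uses only non-zero coefficients.
            for coeffs in product(range(1, q), repeat=d):
                if all(sum(c * col[i] for c, col in zip(coeffs, chosen)) % q == 0
                       for i in range(len(H))):
                    return d
    return None  # all columns independent: the code is {0}
```

With `A = [[1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1]]` over $\mathbb{F}_2$ the columns of $H$ are all seven non-zero binary triples, any two of them are independent and some three are dependent, so the distance is 3.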
Decoding Linear Codes
A standard coset partition, with $t = q^{n-k} - 1$:
$\mathbb{F}_q^n = C \cup (a_1 + C) \cup (a_2 + C) \cup \cdots \cup (a_t + C)$
Every $a_i$ is a coset leader, i.e. a minimum weight vector in its coset (chosen at random if not unique).
The decoder's strategy is to find the coset containing the received message $y$, take its coset leader $\hat{e}$, and decode $\hat{x} = y - \hat{e}$.
The standard array (maximum likelihood decoding): the first row consists of the $s + 1$ ($= q^k$) codewords
$0,\ c^{(1)},\ \cdots,\ c^{(s)}$
$a_1,\ a_1 + c^{(1)},\ \cdots,\ a_1 + c^{(s)}$
$\cdots$
$a_t,\ a_t + c^{(1)},\ \cdots,\ a_t + c^{(s)}$
The syndrome: $S(y) := H \cdot {}^t y$, a column vector of length $n - k$
Properties: 1) $S(y) = H \cdot {}^t e$, where $y = x + e$ with $x \in C$. In particular $S(y) = 0 \iff y$ is a codeword.
2) Two vectors are in the same coset $\iff$ they have the same syndrome.
3) For a binary code, if $e = (0 \cdots 0 1 0 \cdots 1 \cdots 1 \cdots)$ with ones at positions $a, b, c, \ldots$, then
$S(y) = \sum_i e_i h_i = h_a + h_b + h_c + \cdots$
that is, $S(y)$ = the sum of the columns of $H$ where the errors occurred. So the MLD problem is to find a minimal subset $M$ of $\{h_i\}$ s.t. $S(y) \in \langle M \rangle$.
Theorem (Berlekamp-McEliece-van Tilborg, 1978). The MLD problem is NP-complete (equivalent to the MAX-CUT problem).
(cf. Madhu Sudan, Algorithmic Introduction to Coding Theory, 2001).
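Syndrome decoding with coset leaders can be sketched for a tiny binary code. The table below is built by exhaustive enumeration of all $2^n$ vectors, which is exactly the exponential cost the NP-completeness result warns about; the parity check matrix `H_EXAMPLE` (a $[5, 2, 3]_2$ code) and all function names are assumptions for illustration.

```python
from itertools import product

# Parity check matrix H = (A | I_3) of a [5, 2, 3] binary code (illustrative).
H_EXAMPLE = [
    [1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
]


def syndrome(H, y):
    """S(y) = H * y^T over F_2, returned as a tuple."""
    return tuple(sum(h * v for h, v in zip(row, y)) % 2 for row in H)


def coset_leaders(H):
    """Map each syndrome to a minimum-weight vector of its coset."""
    n = len(H[0])
    table = {}
    # Vectors are visited in order of increasing weight, so the first vector
    # seen for a syndrome is a coset leader (ties broken by enumeration order).
    for e in sorted(product((0, 1), repeat=n), key=sum):
        table.setdefault(syndrome(H, e), e)
    return table


def decode(H, table, y):
    """Subtract the coset leader of y's coset: x_hat = y - e_hat."""
    e = table[syndrome(H, y)]
    return tuple((a - b) % 2 for a, b in zip(y, e))
```

Here the table has $2^{n-k} = 8$ entries: the zero vector, five leaders of weight 1, and two leaders of weight 2 that are not unique in their cosets.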
The Hamming Code
R. A. Fisher, The theory of confounding in factorial experiments in relation to the theory of groups, Ann. Eugenics, 11 (1942), 341–353.
R. A. Fisher, A system of confounding for factors with more than two alternatives, giving completely orthogonal cubes and higher powers, Ann. Eugenics, 12 (1945), 283–290.
M. J. E. Golay, Notes on digital coding, Proc. IEEE, 37 (1949), 657.
R. Hamming, Error detecting and error correcting codes, Bell System Tech. J., 29 (1950), 147–160.
see also R. Hamming (1915-1998)
The Hamming $[n, n-r, 3]_2$-code $H_r$:
The parity check matrix $H_r$ consists of all $n$ ($= 2^r - 1$) non-zero binary column vectors of length $r$, namely the binary representations of the $n$ numbers $1, 2, \ldots, 2^r - 1$, e.g. with $r = 3$ it is a $[7, 4, 3]_2$-code with
$$H_3 = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$
Let $e_i \in \mathbb{F}_2^n$ be the standard $i$-th basis vector; clearly $H_r \cdot {}^t e_i$ = the binary representation of $i$. So if $S(y) = h_j$, then the error occurred at the $j$-th position: just change the bit there.
The dual of $H_r$ is the so-called simplex $[n, r, (n+1)/2]_2$-code $S_r$ (all the non-zero codewords have the same weight $(n+1)/2$), e.g. $S_2$ (the tetrahedron code) is the parity check $[3, 2, 2]_2$-code and $H_2$ is the repetition $[3, 1, 3]_2$-code.
In general one has the Hamming $[n, n-r, 3]_q$-code with $n = (q^r - 1)/(q - 1)$.
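The decoding rule "the syndrome is the binary representation of the error position" is short enough to write out. This is a sketch with hypothetical function names; the column ordering (column $i$ is the binary expansion of $i$, most significant bit in the top row) follows the matrix $H_3$ above.

```python
def hamming_parity_check(r):
    """Parity check matrix whose i-th column (1-based) is i written in binary."""
    n = 2 ** r - 1
    # Row j holds bit (r - 1 - j) of the column index, so the top row is the MSB.
    return [[(i >> (r - 1 - j)) & 1 for i in range(1, n + 1)] for j in range(r)]


def hamming_correct(y, r):
    """Correct at most one flipped bit in the received word y of length 2^r - 1."""
    H = hamming_parity_check(r)
    s = [sum(h * v for h, v in zip(row, y)) % 2 for row in H]
    j = int("".join(map(str, s)), 2)  # syndrome read as a binary number
    z = list(y)
    if j:                             # j = 0 means y is already a codeword
        z[j - 1] ^= 1                 # flip the bit at position j (1-based)
    return z
```

For instance, `[1, 1, 1, 0, 0, 0, 0]` is a codeword of $H_3$ because columns 1, 2 and 3 sum to zero; flipping any one of its bits is undone by `hamming_correct(..., 3)`.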
Perfect Codes
$V_q(n, r) := \#B_r(x_0) := \#\{y \in \mathbb{F}_q^n : d(y, x_0) \leq r\}$
The Hamming or sphere-packing bound:
$M \cdot V_q(n, t) \leq q^n$.
A code $C$ is called perfect if it meets the Hamming bound, or equivalently
$\mathbb{F}_q^n = \bigsqcup_{c \in C} B_t(c)$ (disjoint union).
The trivial perfect codes are: $\mathbb{F}_q^n$ (the whole space), the single word code, a binary repetition code of odd length. Among non-trivial ones there are the Hamming codes, the Golay $[23, 12, 7]_2$-code $G_{23}$ and the Golay $[11, 6, 5]_3$-code $G_{11}$.
Theorem (Tietäväinen, van Lint, 1973). A non-trivial perfect code over $\mathbb{F}_q$ must have the same parameters $[n, M, d]$ as one of the Hamming or Golay codes.
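Perfection is a purely numerical condition on $(q, n, M, d)$ and can be checked directly, using $V_q(n, r) = \sum_{i=0}^{r} \binom{n}{i} (q-1)^i$. The sketch below uses hypothetical function names and `math.comb` (Python 3.8+); the parameters tested are those of the binary Hamming code of length 7, the binary Golay code, and the ternary Golay code $[11, 6, 5]_3$.

```python
from math import comb


def ball_size(q, n, r):
    """Number of words of F_q^n within Hamming distance r of a fixed word."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))


def is_perfect(q, n, M, d):
    """True if M * V_q(n, t) = q^n with t = floor((d - 1) / 2)."""
    t = (d - 1) // 2
    return M * ball_size(q, n, t) == q ** n
```

By contrast, the extended Hamming parameters $(q, n, M, d) = (2, 8, 16, 4)$ give $16 \cdot 9 = 144 < 256$, so that code is not perfect.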
MDS Codes
(Maximum Distance Separable Codes)
The Singleton bound (1964): $d \leq n - k + 1$ ($\iff R + \delta \leq 1 + 1/n$) (cf. Y. Komamiya, 1953)
Codes that meet Singleton's bound are called MDS.
Examples: the trivial code ($[n, n, 1]_q$), the parity check code ($[n, n-1, 2]_q$), the repetition code ($[n, 1, n]_q$) are MDS (trivial series), the RS code, ... Actually MDS codes are isolated.
Theorem. In an MDS code $k \leq q - 1$ if $d > 2$, and $d \leq q$ if $k \geq 2$. In particular, if $q = 2$ the only MDS codes are those from the trivial series.
The Main Conjecture on MDS Codes: non-trivial MDS codes are short.
The Reed-Solomon code
$k \leq n \leq q$; $\alpha_1, \ldots, \alpha_n$: distinct elements of $\mathbb{F}_q$
$L_{k-1} := \{f \in \mathbb{F}_q[x] : \deg(f) < k\}$.
Consider the evaluation map
$ev : L_{k-1} \to \mathbb{F}_q^n$, $f \mapsto (f(\alpha_1), \ldots, f(\alpha_n))$
$RS(q, n, k) := ev(L_{k-1})$ is an MDS code with parameters $[n, k, n-k+1]_q$
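A small sketch of the evaluation map for prime $q$, with a brute-force check of the minimum distance. The function names are hypothetical, and the restriction to prime $q$ (so that $\mathbb{F}_q$ is just the integers modulo $q$) is an assumption made to keep the example short.

```python
from itertools import product


def rs_encode(msg, alphas, q):
    """Evaluate f(x) = msg[0] + msg[1]*x + ... at each alpha, modulo the prime q."""
    return tuple(sum(c * pow(a, j, q) for j, c in enumerate(msg)) % q
                 for a in alphas)


def rs_min_distance(alphas, k, q):
    """Minimum weight over all non-zero messages (feasible only for tiny q, k)."""
    return min(sum(v != 0 for v in rs_encode(m, alphas, q))
               for m in product(range(q), repeat=k) if any(m))
```

The MDS property reflects the fact that a non-zero polynomial of degree less than $k$ has at most $k - 1$ roots, so every non-zero codeword has weight at least $n - k + 1$.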
Applications of RS codes: used in CD-Digital-Audio, e.g. CD, CD-ROM, DVD, DTV, ... (cf. the attached Web page).
Goppa's generalization. Let $X$ be an algebraic variety over $\mathbb{F}_q$, and let $\mathcal{P} = \{P_1, P_2, \ldots, P_n\}$ consist of distinct elements of $X(\mathbb{F}_q)$. Let $L$ be an $\mathbb{F}_q$-vector space of rational functions in $\mathbb{F}_q(X)$. The map $ev_{\mathcal{P}} : L \to \mathbb{F}_q^n$, evaluating at the points of $\mathcal{P}$ as above, gives rise to a promising code belonging to the class of Algebraic Geometry codes, or Goppa Algebraic-Geometric codes.
Cyclic Codes
BCH codes with $t \geq 2$: R. C. Bose, D. K. Ray-Chaudhuri (1960), A. Hocquenghem (1959)
The Hamming $[n, n-r, 3]_2$-code needed $r$ parity checks to correct one error ($n = 2^r - 1$). In the abbreviated form below each entry stands for the corresponding binary $r$-tuple:
$$H = \begin{bmatrix} 1 & 2 & \cdots & n \\ f(1) & f(2) & \cdots & f(n) \end{bmatrix}$$
With $f(i) := i^3$ it is an $[n, n-2r, \geq 5]_2$-code (the arithmetic operations in $\mathbb{F}_{2^r}$ play an important role here).
$(c_0, c_1, \ldots, c_{n-1}) \leftrightarrow c(x) := c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}$
Definition. A (linear) code $C$ is cyclic if it is invariant under any cyclic shift, or equivalently if it is an ideal of $R_n := \mathbb{F}_q[x]/(x^n - 1)$.
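Examples: The ideal $\langle x - 1 \rangle = \{f : f(1) = 0\} \leftrightarrow$ the parity check code. At the other extreme, the ideal $\langle g(x) \rangle$ with $g(x) := 1 + x + \cdots + x^{n-1}$, i.e. $\{\text{scalar multiples of } g\} \leftrightarrow$ the repetition code.
$(n, q) = 1$, $m = \min\{a : n \mid q^a - 1\}$, $\alpha \in \mathbb{F}_{q^m}^*$ has order $n$ (a primitive $n$-th root of unity)
BCH codes of designed distance $\delta$: $b \geq 0$, $\delta \geq 1$,
$C := \{c : c(\alpha^b) = c(\alpha^{b+1}) = \cdots = c(\alpha^{b+\delta-2}) = 0\}$
Theorem. $d_{\min}(C) \geq \delta$.
$b = 1$, $\delta = 3$: the binary Hamming code $H_r$:
$\mathbb{F}_{2^r}^* = \langle \alpha \rangle$, $n = 2^r - 1$, $H = [1, \alpha, \alpha^2, \ldots, \alpha^{n-1}]$
$c(\alpha) = 0$ (so $c(\alpha^2) = 0$) $\iff H \cdot {}^t c = 0$
$b = 1$, $\delta = 5$: the binary double-error-correcting BCH code, $c(\alpha) = c(\alpha^3) = 0 \leftrightarrow M^{(1)}(x) \cdot M^{(3)}(x)$
The RS code with $n \mid q - 1 \longleftrightarrow \prod_{j=1}^{n-k} (x - \alpha^j)$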
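The ideal description can be tried out on the binary cyclic code of length 7 generated by $g(x) = 1 + x + x^3$, a divisor of $x^7 - 1$ over $\mathbb{F}_2$. This choice of $g$ and the helper names are assumptions for illustration; polynomials are stored as coefficient tuples $(c_0, c_1, \ldots)$.

```python
from itertools import product


def poly_mul_mod(a, b, n):
    """Multiply binary polynomials a and b modulo x^n - 1 (coefficients mod 2)."""
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % n] ^= ai & bj  # x^n = 1, so exponents wrap around
    return tuple(out)


def cyclic_code(g, n):
    """All multiples m(x) * g(x) with deg m < n - deg g, as length-n tuples."""
    k = n - (len(g) - 1)
    return {poly_mul_mod(m, g, n) for m in product((0, 1), repeat=k)}


def shift(c):
    """Cyclic shift (c_0, ..., c_{n-1}) -> (c_{n-1}, c_0, ..., c_{n-2}), i.e. x * c(x)."""
    return c[-1:] + c[:-1]
```

The resulting 16 codewords are closed under `shift` and have minimum weight 3, matching the binary Hamming code $H_3$ up to the ordering of coordinates.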
Asymptotic Bounds
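$A_q(n, d) := \max\{M : \exists \text{ an } [n, M, d]_q\text{-code}\}$
The Asymptotic Problem: Let $\{d_n\}$ be a sequence of natural numbers s.t. $d_n/n \to \delta \in [0, 1]$. Investigate the behaviour of $A_q(n, d_n)$ as $n \to \infty$.
$\{C_i\}$: a family of $[n_i, k_i, d_i]$-codes
$R(\{C_i\}) := \lim_{i \to \infty} R(C_i)$, $\delta(\{C_i\}) := \lim_{i \to \infty} \delta(C_i)$
The family $\{C_i\}$ is called asymptotically good if $R(\{C_i\}) > 0$ and $\delta(\{C_i\}) > 0$.
$U_q = \{(R, \delta) : 0 \leq R \leq \alpha_q(\delta)\}$, where
$\alpha_q(\delta) := \limsup_{n \to \infty} \frac{\log_q A_q(n, [\delta n])}{n}$
($\alpha_q(\delta) = \sup\{R : \exists \{C_i\} \text{ s.t. } R_i \to R,\ \delta_i \to \delta\}$)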

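Theorem (Manin, 1981). $\alpha_q(\delta)$ is a continuous function decreasing on $[0, \theta]$, satisfying $\alpha_q(0) = 1$, $\alpha_q(\delta) = 0$ on $[\theta, 1]$ and on $[0, \theta]$:
• $\alpha_q(\delta) \leq 1 - \delta/\theta$ (Plotkin bound),
• $\alpha_q(\delta) \leq 1 - H_q(\delta/2)$ (Hamming bound),
• $\alpha_q(\delta) \leq 1 - H_q\big(\theta - \sqrt{\theta(\theta - \delta)}\big)$ (Bassalygo-Elias bound),
• $\alpha_q(\delta) \geq 1 - H_q(\delta)$ (Gilbert-Varshamov bound).
Here $H_q(\delta)$ denotes the $q$-ary entropy function
$$H_q(\delta) := \begin{cases} \delta \log_q(q-1) - \delta \log_q \delta - (1-\delta) \log_q(1-\delta), & 0 < \delta \leq \theta := 1 - 1/q \\ 0, & \delta = 0 \end{cases}$$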
Remark. In fact αq (δ) is Lipschitz, but it is an
open question whether it is differentiable.
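The four bounds of the theorem are easy to tabulate numerically. The following sketch uses hypothetical function names and assumes $0 \leq \delta \leq \theta$; it only evaluates the formulas stated above and says nothing about the true value of $\alpha_q(\delta)$.

```python
from math import log, sqrt


def entropy(q, x):
    """q-ary entropy H_q(x) for 0 <= x <= 1 - 1/q."""
    if x == 0:
        return 0.0
    return x * log(q - 1, q) - x * log(x, q) - (1 - x) * log(1 - x, q)


def bounds(q, delta):
    """Upper bounds (plotkin, hamming, elias) and lower bound (gv) on alpha_q(delta)."""
    theta = 1 - 1 / q
    return {
        "plotkin": 1 - delta / theta,
        "hamming": 1 - entropy(q, delta / 2),
        "elias": 1 - entropy(q, theta - sqrt(theta * (theta - delta))),
        "gv": 1 - entropy(q, delta),
    }
```

For example, `bounds(2, 0.2)` places the Gilbert-Varshamov value below the Bassalygo-Elias value, which in turn lies below the Hamming value; the last inequality holds for every $\delta$ because $\theta - \sqrt{\theta(\theta - \delta)} \geq \delta/2$.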
Algebraic-Geometric Codes
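V. D. Goppa, Codes on Algebraic Curves, Doklady USSR, 259 (1981), 1289–1290.
$X$: an algebraic curve over $\mathbb{F}_q$, $g := g(X)$
$\mathcal{P} = \{P_1, P_2, \ldots, P_n\}$: distinct points, $\subset X(\mathbb{F}_q)$
$D = \sum n_P P \in \mathrm{Div}_{\mathbb{F}_q}(X)$, $\mathrm{Supp}\, D \cap \mathcal{P} = \emptyset$
$L := L(D) := \{f \in \mathbb{F}_q(X)^* : \mathrm{div}(f) + D \geq 0\} \cup \{0\}$
'$\geq 0$' = 'effective'; '$f \in L$' = '$\forall P : \mathrm{ord}_P(f) \geq -n_P$'
$ev_{\mathcal{P}} : L \to \mathbb{F}_q^n$, $f \mapsto (f(P_1), \ldots, f(P_n))$
Goppa's code $C(D, \mathcal{P}) := ev_{\mathcal{P}}(L)$, $k := \dim_{\mathbb{F}_q} C(D, \mathcal{P})$
Theorem. If $2g - 2 < \deg(D) \leq n$, then
(i) $k = \deg(D) - g + 1$ (Riemann-Roch),
(ii) $d_{\min} \geq n - \deg(D)$ (Bézout).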
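Adding the two parts of the theorem gives a lower bound on $R + \delta$. The short derivation below is a sketch that uses only (i) and (ii) together with the definitions $R = k/n$ and $\delta = d_{\min}/n$.

```latex
\begin{align*}
k + d_{\min} &\geq (\deg D - g + 1) + (n - \deg D) = n + 1 - g, \\
R + \delta &= \frac{k + d_{\min}}{n} \geq 1 - \frac{g - 1}{n}.
\end{align*}
```

For $g = 0$ this reads $k + d_{\min} \geq n + 1$, which combined with the Singleton bound $d_{\min} \leq n - k + 1$ forces equality, so genus-zero codes of this kind are MDS.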

Idea: since $R + \delta \geq 1 - (g - 1)/n$ ($g = 0$ gives the RS code), for a fixed ratio $\deg(D)/n$ one wants $g/n$ to be as small as possible.
Problem: find $X/\mathbb{F}_q$ with small $g$ and large $\#X(\mathbb{F}_q)$.
$\#X(\mathbb{F}_q) \leq q + 1 + g[2\sqrt{q}]$ (Hasse-Weil-Serre bound)
Fermat (or Hermitian) curves are maximal: $x^{q+1} + y^{q+1} + z^{q+1} = 0$ over $\mathbb{F}_{q^2}$.
$S_q(X) := \frac{g - 1}{\#X(\mathbb{F}_q) - 1}$, $S_q := \liminf_{g \to \infty} S_q(X) \geq \frac{1}{[2\sqrt{q}]}$
Theorem (Tsfasman, 1982). The interval $R + \delta = 1 - S_q$, $0 \leq R, \delta \leq 1$, lies entirely in the code domain $U_q$. If $q \geq 49$, it intersects the Gilbert-Varshamov curve at two points, and the interval between them lies above that curve.
Remark. Plotkin's bound gives a better estimate for $S_2$, $S_3$.
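The maximality of the Hermitian curve can be verified by hand in the smallest case $q = 2$, i.e. $x^3 + y^3 + z^3 = 0$ over $\mathbb{F}_4$. The sketch below counts projective points by brute force; the representation of $\mathbb{F}_4$ as $\mathbb{F}_2[t]/(t^2 + t + 1)$ with elements coded as 0..3, and the function names, are assumptions for illustration. The genus used in the closing remark is the standard value $g = q(q-1)/2$ for a smooth plane curve of degree $q + 1$.

```python
def f4_mul(a, b):
    """Multiply in F_4 = F_2[t]/(t^2 + t + 1), elements 0..3 as bitmasks."""
    result = 0
    for _ in range(2):      # b has at most two coefficients
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1             # multiply a by t
        if a & 0b100:       # degree 2 reached: reduce modulo t^2 + t + 1
            a ^= 0b111
    return result


def cube(a):
    """a^3 in F_4."""
    return f4_mul(a, f4_mul(a, a))


def hermitian_point_count():
    """Count projective points of x^3 + y^3 + z^3 = 0 over F_4 (the case q = 2)."""
    affine = sum(
        1
        for x in range(4) for y in range(4) for z in range(4)
        if (x, y, z) != (0, 0, 0) and cube(x) ^ cube(y) ^ cube(z) == 0
    )
    return affine // 3  # each projective point has |F_4^*| = 3 representatives
```

The count is 9, and with $g = 1$ the Hasse-Weil-Serre bound over $\mathbb{F}_4$ is $4 + 1 + 1 \cdot [2\sqrt{4}] = 9$, so the bound is attained.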
$A_q := \limsup_{g \to \infty} \frac{\#X(\mathbb{F}_q)}{g(X)} = S_q^{-1}$ (Ihara's notation)
Theorem (Ihara, 1981). $A_{q^2} \geq q - 1$.
Theorem (Drinfel'd-Vladuts, 1983). $A_q \leq \sqrt{q} - 1$. In particular, if $q$ is a square, then $A_q = \sqrt{q} - 1$.
Theorem (Serre, 1983). There is an absolute constant $c > 0$ s.t. $A_q > c \log q$. As for $q = 2$ we have $A_2 \geq 2/9$.
Observation (Ihara & Tsfasman-Vladuts-Zink). $\exists$ a sequence $X_i/\mathbb{F}_{q^2}$ with $\#X_i(\mathbb{F}_{q^2})/g(X_i) \to q - 1$. These are (Drinfel'd) modular curves, having plenty of supersingular points.
Theorem (Tsfasman-Vladuts-Zink, 1982). $\exists$ a sequence of Goppa codes over $\mathbb{F}_{q^2}$ with limit point on or above the Tsfasman interval (better than the GV bound when $q^2 \geq 49$).
Conjecture. The GV bound is best possible for $q = 2$.
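The threshold 49 can be checked numerically. Over a field whose size is a square, the Tsfasman line is $R = 1 - \delta - 1/(\sqrt{\text{size}} - 1)$ and the Gilbert-Varshamov curve is $R = 1 - H(\delta)$, with $H$ the entropy function to the base of the field size. The sketch below (hypothetical names; a plain grid search rather than an exact argument) tests whether the line rises above the curve anywhere. Note that `field_size` is the size of the field itself, i.e. $q^2$ in the notation of the last theorem.

```python
from math import log


def entropy(field_size, x):
    """Entropy function to base field_size, for 0 < x < 1."""
    return (x * log(field_size - 1, field_size)
            - x * log(x, field_size)
            - (1 - x) * log(1 - x, field_size))


def line_minus_gv(field_size, delta):
    """(1 - delta - 1/(sqrt(size) - 1)) - (1 - H(delta)) for a square field size."""
    return entropy(field_size, delta) - delta - 1 / (field_size ** 0.5 - 1)


def line_beats_gv(field_size, steps=1000):
    """Grid search over 0 < delta < 1 - 1/size for a point where the line is higher."""
    theta = 1 - 1 / field_size
    return any(line_minus_gv(field_size, theta * i / steps) > 0
               for i in range(1, steps))
```

With these definitions the search succeeds for field size 49 and fails for 16 and 25, in line with the stated threshold.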
Some Additional References
Y. Ihara, Some remarks on the number of rational points of algebraic curves over finite fields, J. Fac. Sci. Univ. Tokyo, 28 (1981), Sec. IA, 721–724.
F. J. MacWilliams, N. J. A. Sloane, The Theory of Error-Correcting Codes, North-Holland Publ., Amsterdam, 1977.
Yu. I. Manin, What is the maximal number of points on a curve over $\mathbb{F}_2$, J. Fac. Sci. Univ. Tokyo, 28 (1981), Sec. IA, 715–720.
Yu. I. Manin, S. G. Vladuts, Linear Codes and Modular Curves, Contemporary Problems of Mathematics, 25 (1984), 209–257 (Russian).
M. A. Tsfasman, S. G. Vladuts, Algebraic-Geometric Codes, Kluwer Acad. Publ., Dordrecht-Boston-London, 1991.