2024 CInvestigation of The Permutation
Abstract
Dobbertin in 1999 proved that the Welch power function $x^{2^m+3}$ is almost perfect
nonlinear (APN) over the finite field $\mathbb{F}_{2^{2m+1}}$, where $m$ is a positive integer.
In his proof, Dobbertin showed that the APNness of $x^{2^m+3}$ essentially relies on
the bijectivity of the polynomial $g(x) = x^{2^{m+1}+1} + x^3 + x$ over $\mathbb{F}_{2^{2m+1}}$. In
this paper, we first determine the differential and Walsh spectra of the permutation
polynomial $g(x)$, revealing its favourable cryptographic properties. We
then explore four families of binary linear codes related to the Welch APN power
functions. For two cyclic codes among them, we propose algebraic decodings that
significantly outperform existing methods in terms of decoding complexity.
1 Introduction
Let F2n denote the finite field of 2n elements and F∗2n be its multiplicative group. Non-
linear functions over F2n have wide applications in cryptography and coding theory.
In symmetric cryptography, block ciphers are designed by appropriate compositions
of linear permutations and S-boxes that are the only nonlinear component. Hence the
cryptographic properties of the nonlinear S-boxes are crucial to the security of the
ciphers. Differential and linear attacks [1, 2] are two of the most powerful cryptographic
attacks against block ciphers, and the link between these two approaches was investigated
in [3]. To ensure good resistance to differential attacks, the differential uniformity
of the nonlinear function used in an S-box should be low. The lowest possible differen-
tial uniformity is 2 and functions with this property are called almost perfect nonlinear
(APN) functions. There has been much work and progress on APN functions; see, for
example, [4] and [5]. The nonlinearity quantifies the level of resistance of the function
to the linear attack: the higher the nonlinearity, the better the resistance of the
function against the linear attack. Besides the differential uniformity and the nonlinearity,
there are also some other cryptographic criteria that measure the resistance of
the nonlinear functions to various known attacks. For further details about this topic,
the reader is referred to [4, 6] and references therein. The study on the cryptographi-
cally significant functions during the past decades shows that it is difficult to design a
function attaining all good cryptographic criteria, and trade-offs must be considered.
Linear codes, particularly cyclic codes, have wide applications in reliable data
storage and communications. In coding theory, one of the most important topics is to
construct linear codes with desirable properties and to explore efficient decoding for
them. Constructing linear codes from nonlinear functions was extensively explored
in the past decades [7–10], and many optimal linear codes have been obtained from
cryptographically significant functions [11–15], such as perfect nonlinear functions,
almost perfect nonlinear functions, bent functions and plateaued functions. In those
works, the minimum distances and weight distributions of the constructed codes and
their duals were intensively studied (see for instance a recent survey by Li and Mes-
nager [10]). There are other parameters of linear codes, such as the covering radius
[16] and coset weight distribution [17], that are of fundamental interest, particularly
when evaluating the performance of linear codes in error correction. Nevertheless, due
to their intractability, there has been limited research progress on such topics. It
is well known that the problem of random syndrome decoding is NP-complete [18].
There do exist certain linear codes with efficient decoding. For instance, BCH codes,
due to their special property, allow for efficient decoding with polynomial-time com-
plexity [19]. However, efficiently decoding non-BCH cyclic codes remains a significant
open problem, despite recent efforts to develop decoders for generic cyclic codes by
investigating generalized error-locator polynomials [20–22].
In this paper, we first investigate important cryptographic properties, namely, the
differential spectrum and Walsh spectrum, of the permutation polynomial
$g(x) = x^{2^{m+1}+1} + x^3 + x$ over $\mathbb{F}_{2^{2m+1}}$, which we call the Welch permutation since it was used
to prove the APNness of the Welch power function $F(x) = x^{2^m+3}$ [23]. In the second
part, we explore two families of cyclic codes and two families of linear codes that
are closely related to the Welch power function. For the two binary cyclic codes, we
propose efficient algebraic decoders with complexity in the order of $O(N(\log N)^3)$,
where $N = 2^{2m+1} - 1$ is the code length. The second family of binary linear codes
is shown to have at most five nonzero weights, which provides a partial resolution
to the conjecture by Ding [13].
The remainder of this paper is organized as follows. Section 2 recalls basic def-
initions and auxiliary results. Section 3 determines the differential spectrum and
Walsh spectrum of g(x). Section 4 explores the properties and decoding of binary
codes derived from the Welch APN power function, and Section 5 summarizes our
contributions in this work.
2 Preliminaries
2.1 Cryptographic properties of vectorial Boolean functions
For a vectorial Boolean function F (x) from F2n to F2n , denote
ΩF = [ω0 , ω1 , . . . , ωδ ], (2)
It is easily seen that $\omega_i = 0$ in the differential spectrum if $i$ is odd. Moreover, we have
the following properties:
$$\sum_{i=0}^{\delta} \omega_i = 2^n(2^n - 1) \quad \text{and} \quad \sum_{i=0}^{\delta} (i \times \omega_i) = 2^n(2^n - 1). \tag{3}$$
For any APN function over F2n , there are only two possible values 0 and 2 in its
differential spectrum. Thus, from the equalities in (3), the differential spectrum can
be uniquely determined.
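As a numerical illustration of (3), the following sketch computes a differential spectrum by brute force. It is not part of the original text: the field model $\mathbb{F}_{2^5} = \mathbb{F}_2[x]/(x^5+x^2+1)$, the helper names `gf_mul`, `gf_pow`, `differential_spectrum`, and the choice of the Gold APN function $x^3$ as test function are all assumptions made for illustration.

```python
from collections import Counter

N_BITS, POLY = 5, 0b100101   # assumed model: F_{2^5} = F_2[x]/(x^5 + x^2 + 1)
Q = 1 << N_BITS

def gf_mul(a, b):
    """Multiply two field elements stored as 5-bit integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a, e):
    """Square-and-multiply exponentiation in the field."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def differential_spectrum(F):
    """Return {i: omega_i}: the number of pairs (a, b), a != 0, with i solutions."""
    table = [F(x) for x in range(Q)]
    spectrum = Counter()
    for a in range(1, Q):
        counts = Counter(table[x ^ a] ^ table[x] for x in range(Q))
        for b in range(Q):
            spectrum[counts[b]] += 1
    return dict(spectrum)

omega = differential_spectrum(lambda x: gf_pow(x, 3))
print(omega)
```

For an APN function only $\omega_0$ and $\omega_2$ are nonzero, so the two identities in (3) force $\omega_0 = \omega_2 = 2^{n-1}(2^n-1)$, which is 496 for $n = 5$.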
Another important criterion of a vectorial Boolean function F (x) is its nonlinearity,
which can be given in terms of the Walsh transform of F (x).
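Definition 2. The (extended) Walsh transform of a vectorial Boolean function $F(x)$
at $(a, b)$ is defined by
$$W_F(a, b) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(bF(x) + ax)},$$
and the nonlinearity of $F$ is given by
$$NL(F) = 2^{n-1} - \frac{1}{2} \max\{|W_F(a, b)| : a, b \in \mathbb{F}_{2^n},\ b \neq 0\}.$$
Remark 1. Note that for a Boolean function $G(x)$ from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$, the extended
Walsh transform reduces to the original Walsh-Hadamard transform
$$\widehat{G}(\lambda) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{G(x) + \mathrm{Tr}_1^n(\lambda x)}, \quad \lambda \in \mathbb{F}_{2^n}.$$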
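The definition can be evaluated directly for small fields. The sketch below is an illustration only: the field model $\mathbb{F}_{2^5} = \mathbb{F}_2[x]/(x^5+x^2+1)$, the helper names and the test function $F(x) = x^3$ are assumptions, not taken from the text.

```python
N_BITS, POLY = 5, 0b100101   # assumed model: F_{2^5} = F_2[x]/(x^5 + x^2 + 1)
Q = 1 << N_BITS

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def trace(x):
    """Absolute trace Tr_1^n(x) = x + x^2 + ... + x^(2^(n-1)), returned as 0 or 1."""
    t = 0
    for _ in range(N_BITS):
        t ^= x
        x = gf_mul(x, x)
    return t

F = [gf_mul(gf_mul(x, x), x) for x in range(Q)]      # F(x) = x^3
CHI = [(-1) ** trace(x) for x in range(Q)]           # additive character (-1)^Tr(x)

def walsh(a, b):
    """Extended Walsh transform W_F(a, b) as in Definition 2."""
    return sum(CHI[gf_mul(b, F[x]) ^ gf_mul(a, x)] for x in range(Q))

max_abs = max(abs(walsh(a, b)) for a in range(Q) for b in range(1, Q))
nonlinearity = Q // 2 - max_abs // 2
print(max_abs, nonlinearity)
```

Since $x^3$ is almost bent for odd $n$, the maximum absolute value is $2^{(n+1)/2} = 8$ here, giving nonlinearity $16 - 4 = 12$.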
Next we recall some results about the Walsh transforms of quadratic Boolean
functions. Given a quadratic Boolean function Q(x) from F2n to F2 , the function
B(x, z) = Q(x + z) + Q(x) + Q(z) is a bilinear function in x and z. When x, z are
expressed as vectors in Fn2 , the bilinear function can be written as B(x, z) = xBz ⊺ ,
where B is the n × n symplectic matrix of Q(x) satisfying B(i, j) = 1 for 1 ≤ i, j ≤ n
if and only if the multivariate form of Q(x) contains the term xi xj . The rank of Q(x)
is defined as the rank of its symplectic matrix B. Let
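By the rank-nullity theorem we have $\dim_{\mathbb{F}_2}(V_Q) + \mathrm{Rank}(Q) = n$. Note that
$$\Big(\sum_{x \in \mathbb{F}_{2^n}} (-1)^{Q(x)}\Big)^2 = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{Q(x)} \sum_{z \in \mathbb{F}_{2^n}} (-1)^{Q(x+z)+Q(x)+Q(z)} = 2^n \sum_{x \in V_Q} (-1)^{Q(x)},$$
where $V_Q$ is the $\mathbb{F}_2$-linear space defined as above. It is readily seen that $Q(x)$ is linear
over $V_Q$. Hence one has
$$\sum_{x \in \mathbb{F}_{2^n}} (-1)^{Q(x)} = \begin{cases} \pm 2^{n - \mathrm{Rank}(Q)/2}, & \text{if } Q(x) = 0 \text{ for any } x \in V_Q, \\ 0, & \text{otherwise.} \end{cases}$$
Moreover, when $\lambda$ runs through $\mathbb{F}_{2^n}$, the distribution of the Walsh transform $\widehat{Q}(\lambda)$
can be given as follows.
Lemma 1. [29, Theorem 6.2] Let $Q(x)$ be a quadratic form from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$ with rank
$2h$. Then its Walsh transform has the following distribution
$$\widehat{Q}(\lambda) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{Q(x) + \mathrm{Tr}_1^n(\lambda x)} = \begin{cases} \pm 2^{n-h}, & 2^{2h-1} \pm 2^{h-1} \text{ times}, \\ 0, & 2^n - 2^{2h} \text{ times}. \end{cases}$$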
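Lemma 1 can be checked numerically on a small example. The sketch below is illustrative only: the quadratic form $Q(x) = \mathrm{Tr}_1^5(x^3)$, the field model $\mathbb{F}_2[x]/(x^5+x^2+1)$ and all helper names are assumptions. The rank is obtained from the radical of the bilinear form $Q(x+z)+Q(x)+Q(z)$.

```python
from collections import Counter

N_BITS, POLY = 5, 0b100101   # assumed model: F_{2^5} = F_2[x]/(x^5 + x^2 + 1)
Q_SIZE = 1 << N_BITS

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q_SIZE:
            a ^= POLY
        b >>= 1
    return r

def trace(x):
    t = 0
    for _ in range(N_BITS):
        t ^= x
        x = gf_mul(x, x)
    return t

def quad(x):
    """Hypothetical example form Q(x) = Tr(x^3)."""
    return trace(gf_mul(gf_mul(x, x), x))

# Radical V_Q: all z with Q(x+z) + Q(x) + Q(z) = 0 for every x.
radical = [z for z in range(Q_SIZE)
           if all(quad(x ^ z) ^ quad(x) ^ quad(z) == 0 for x in range(Q_SIZE))]
rank = N_BITS - (len(radical).bit_length() - 1)      # rank-nullity
h = rank // 2

observed = Counter(
    sum((-1) ** (quad(x) ^ trace(gf_mul(lam, x))) for x in range(Q_SIZE))
    for lam in range(Q_SIZE))
predicted = {2 ** (N_BITS - h): 2 ** (2 * h - 1) + 2 ** (h - 1),
             -2 ** (N_BITS - h): 2 ** (2 * h - 1) - 2 ** (h - 1),
             0: 2 ** N_BITS - 2 ** (2 * h)}
print(rank, dict(observed))
```

Here the rank is 4, so $h = 2$ and the lemma predicts the values $+8$, $-8$ and $0$ with multiplicities 10, 6 and 16.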
as the generator polynomial of C. Equivalently, the code C = ⟨g(x)⟩ can be uniquely
given by its complete defining set $S_C = \{i : g(\alpha^i) = 0,\ 0 \le i < N\}$, where α is a
primitive N-th root of unity. Since $g(\alpha^i) = 0$ if and only if $g(\alpha^{2i}) = 0$ for any 0 ≤ i < N, the set $S_C$
is a union of disjoint 2-cyclotomic cosets modulo N. A subset of $S_C$ that
consists of one coset leader from each coset in $S_C$ uniquely defines C, and is therefore
termed the primary defining set of C. When the (complete) defining set $S_C$ contains
d − 1 consecutive integers, the cyclic code C has minimum distance at least d according
to the BCH bound [19].
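The passage from a primary defining set to the complete defining set, and the resulting BCH bound, can be sketched as follows. The function names are hypothetical and the length $N = 31$ is chosen only as an example.

```python
def cyclotomic_coset(i, N):
    """2-cyclotomic coset of i modulo N, as a sorted list."""
    coset, j = [], i % N
    while j not in coset:
        coset.append(j)
        j = (2 * j) % N
    return sorted(coset)

def complete_defining_set(primary, N):
    """Union of the cyclotomic cosets of the coset leaders in `primary`."""
    return sorted({j for i in primary for j in cyclotomic_coset(i, N)})

def bch_bound(defining_set, N):
    """d such that the set contains d - 1 cyclically consecutive integers (maximal run)."""
    s, best = set(defining_set), 0
    for start in range(N):
        run = 0
        while run < N and (start + run) % N in s:
            run += 1
        best = max(best, run)
    return best + 1

N = 31
print(cyclotomic_coset(3, N))
print(bch_bound(complete_defining_set([1, 3], N), N))
print(bch_bound(complete_defining_set([1, 3, 5], N), N))
```

For length 31 the primary defining sets {1, 3} and {1, 3, 5} give the bounds 5 and 7, matching the double- and triple-error-correcting BCH codes.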
Let C be a binary linear code of length N and minimum weight d. The space $\mathbb{F}_2^N$
can then be partitioned into cosets with respect to C. For each coset, the coset leader is
defined as an element with minimum weight in the coset. When the minimum weight
of a coset is no greater than $\lfloor (d-1)/2 \rfloor$, it has a unique coset leader; when its minimum
weight is larger, a coset may have several elements with the minimum weight, indicating
that the coset leader is not unique. For a received vector y = c + e with a certain
codeword c ∈ C and error vector $e \in \mathbb{F}_2^N$, the syndrome $s = yH^{\intercal} = eH^{\intercal}$ associates
the error e with a coset of C in $\mathbb{F}_2^N$. In particular, the case s = 0 corresponds
to the code C itself, of which the coset leader is the zero vector. This indicates that when
a codeword c is transmitted and the received vector y = c + e is another codeword
of C, the process of error detection by the parity-check equation $s = yH^{\intercal}$ fails. The
probability of the detection failure of the code C can be expressed in terms of its
weight distribution, which is defined as $(A_0, A_1, \ldots, A_N)$, where $A_i$ denotes the number
of codewords with Hamming weight i in the code C, and clearly $A_0 = 1$.
Thanks to the MacWilliams identity, the weight distribution of C can be derived from
the weight distribution $(1, B_1, \ldots, B_N)$ of its dual $C^{\perp}$.
A nonzero syndrome $s = yH^{\intercal} = eH^{\intercal}$ belongs to a coset with a nonzero
coset leader. The corresponding coset leader has the same syndrome as e, and it will
be deemed as the error e added to the received vector y, since the coset leader has
the minimum weight. The process can uniquely correct the error e when its weight
is within the packing radius of C given by $t = \lfloor (d(C)-1)/2 \rfloor$, for which the coset leader is
unique; when an error e has weight beyond the packing radius t, it is likely that the
corresponding coset no longer has a unique coset leader. In this case, the error
e cannot be uniquely decoded and the decoder may fail to return a correct codeword.
The performance of the aforementioned error correction procedure can be evaluated
in terms of weight distributions of cosets [19]. Unfortunately, a complete picture
of weight distributions of all cosets is intractable. Instead, some attempts have been
made in calculating the coset distribution $(1, K_1, K_2, \ldots, K_N)$, where $K_i$ denotes the
number of coset leaders with weight i, of the linear code C [17].
The largest weight of coset leaders of C is known as the covering radius of C, which
is defined by $\rho(C) = \max\{\min\{d(y, c) : c \in C\} : y \in \mathbb{F}_2^N\}$. The covering radius of C is
a basic geometric parameter, which is a measure of the maximum distortion when C is
used for data compression, and is the maximum weight of a correctable random error
when C is used for error correction [16]. It is clear that the covering radius of a code is
lower bounded by its packing radius t. Equality is achieved
by perfect codes. In addition, a linear code C is called quasi-perfect if ρ(C) = t + 1;
and a quasi-perfect code is called a uniformly packed code if ρ(C) is the same as the
external distance of C, which is the number of nonzero weights in its dual $C^{\perp}$.
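To make the notions of coset distribution and covering radius concrete, the following sketch enumerates all cosets of a small code. The [7, 4, 3] Hamming code is used purely as an assumed example (it does not appear in the text), and the helper names are hypothetical.

```python
from itertools import product

# Parity-check matrix of the [7, 4, 3] Hamming code: column j holds the binary digits of j + 1.
H = [[(j >> k) & 1 for j in range(1, 8)] for k in range(3)]

def syndrome(v):
    """Syndrome v * H^T over F_2, as a tuple of bits."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def coset_leader_weights():
    """Map each syndrome to the minimum weight found in its coset."""
    best = {}
    for v in product((0, 1), repeat=7):
        s, w = syndrome(v), sum(v)
        if s not in best or w < best[s]:
            best[s] = w
    return best

weights = coset_leader_weights()
K = [list(weights.values()).count(i) for i in range(8)]   # coset distribution (K_0, ..., K_7)
covering_radius = max(weights.values())
print(K, covering_radius)
```

The Hamming code is perfect, so the covering radius equals the packing radius $t = 1$ and the coset distribution is $(1, 7, 0, \ldots, 0)$.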
Generic construction 1.
Let F be a function from F2n to itself with F (0) = 0, and β be a primitive element
of F2n . A binary linear code C of length 2n − 1 can be constructed from F via the
following parity-check matrix
$$H = \begin{bmatrix} 1 & \beta & \beta^2 & \cdots & \beta^{2^n-2} \\ F(1) & F(\beta) & F(\beta^2) & \cdots & F(\beta^{2^n-2}) \end{bmatrix}, \tag{5}$$
where each symbol stands for the column of its coordinate with respect to a basis of
F2n over F2 . It is easy to verify that the dual code C ⊥ is given by
For the nonlinear function F , the code C has dimension 2n − 1 − 2n. In particular,
when F (x) is a power function xd , the code C is a cyclic code with primary defining set
{1, d}. This generic construction has a long history and pertains to Delsarte’s Theorem
[7]. Note that for the dual code C ⊥ , the Hamming weight of a codeword ca,b ∈ C ⊥ is
given by
Therefore, the weight distribution of $C^{\perp}$ can be directly derived from the extended
Walsh spectrum of F(x) given by $\{W_F(a, b) : a, b \in \mathbb{F}_{2^n}\}$. This relation has led to a
well-established coding-theoretic characterization of APN functions and almost bent (AB)
functions [8].
Theorem 1. [8] Let F be a function from $\mathbb{F}_{2^n}$ to itself with F(0) = 0 and n being
odd. Let the code C be defined by a parity-check matrix H as in (5). Then F(x) is an
APN function if and only if the code C has minimum distance 5. Furthermore, F(x) is
an AB function if and only if C is a $[2^n - 1, 2^n - 1 - 2n]$ uniformly packed code with
minimum distance 5 and covering radius 3.
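The first statement of Theorem 1 can be checked by brute force for small parameters. The sketch below is an illustration under assumptions: it uses the field model $\mathbb{F}_{2^5} = \mathbb{F}_2[x]/(x^5+x^2+1)$, takes $\beta$ to be the class of $x$, and picks the Welch power function with $m = 2$, i.e. $F(x) = x^7$. The minimum distance is found as the smallest number of columns of $H$ in (5) that sum to zero.

```python
from itertools import combinations

N_BITS, POLY = 5, 0b100101   # assumed model: F_{2^5} = F_2[x]/(x^5 + x^2 + 1)
Q = 1 << N_BITS

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def parity_check_columns(F):
    """Columns (beta^i, F(beta^i)) of H, each packed into one 2n-bit integer."""
    cols, x = [], 1
    for _ in range(Q - 1):
        cols.append((x << N_BITS) | F(x))
        x = gf_mul(x, 2)      # beta = 2 (the class of x) is primitive since 31 is prime
    return cols

def minimum_distance(cols, limit=5):
    """Smallest w <= limit such that some w columns XOR to zero, else None."""
    for w in range(1, limit + 1):
        for subset in combinations(cols, w):
            acc = 0
            for c in subset:
                acc ^= c
            if acc == 0:
                return w
    return None

welch = lambda x: gf_pow(x, 2 ** 2 + 3)        # Welch exponent 2^m + 3 with m = 2
d_min = minimum_distance(parity_check_columns(welch))
print(d_min)
```

Because the Welch power function is APN, the search finds no dependent set of fewer than five columns.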
Generic construction 2.
Let D = {d1 , d2 , . . . , dℓ } be a subset of F2n . A binary linear code having D as its
defining set is given by
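Proof. For $(a, b) \in \mathbb{F}_{2^n}^* \times \mathbb{F}_{2^n}$, let N(a, b) be the number of solutions of the derivative
equation g(x + a) + g(x) = b in $\mathbb{F}_{2^n}$. Note that
$$\begin{aligned} g(x + a) + g(x) + b &= x^{2^{m+1}} a + x a^{2^{m+1}} + a^{2^{m+1}+1} + x^2 a + x a^2 + a^3 + a + b \\ &= a x^{2^{m+1}} + a x^2 + (a^{2^{m+1}} + a^2) x + g(a) + b. \end{aligned}$$
Dividing by a, the derivative equation is therefore equivalent to
$$x^{2^{m+1}} + x^2 + cx = d, \tag{7}$$
where
$$c = a^{2^{m+1}-1} + a \quad \text{and} \quad d = \frac{g(a) + b}{a}. \tag{8}$$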
Note that c = 0 if and only if a = 1. Next we consider the following equation
$$x^{2^{m+1}} + x^2 + cx = 0. \tag{9}$$
If c = 0, i.e., a = 1, then (9) has two solutions in $\mathbb{F}_{2^n}$, which are 0 and 1. If c ≠ 0,
i.e., $a \notin \mathbb{F}_2$, then by raising (9) to the power $2^m$, we get
$$x + x^{2^{m+1}} + c^{2^m} x^{2^m} = 0, \tag{10}$$
which implies
$$x^{2^m} = \frac{x^2}{c^{2^m}} + \frac{c + 1}{c^{2^m}}\, x. \tag{11}$$
Substituting (11) into (10), we get
$$x^4 + (c^{2^{m+1}} + c^2 + 1) x^2 + c^{2^{m+1}+1} x = 0. \tag{12}$$
The above arguments show that when c ≠ 0, the solutions of (9) must be those of (12).
Note that the left hand side of (12) is a linearized polynomial over $\mathbb{F}_{2^n}$ and it may
have 1, 2 or 4 roots in $\mathbb{F}_{2^n}$. Thus, the equation (9) may also have 1, 2 or 4 solutions
in $\mathbb{F}_{2^n}$. Moreover, note that
$$c = a^{2^{m+1}-1} + a = \frac{a^{2^{m+1}} + a^2}{a}.$$
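(9), we should investigate the solutions of the following quadratic equation
$$x^2 + ax + \frac{c^{2^{m+1}+1}}{a} = 0. \tag{13}$$
Note that
$$\begin{aligned}
\mathrm{Tr}_1^n\Big(\frac{c^{2^{m+1}+1}}{a^3}\Big)
&= \mathrm{Tr}_1^n\Big(\frac{a^{2^{m+2}} + a^2}{a^{2^{m+1}}} \cdot \frac{a^{2^{m+1}} + a^2}{a^4}\Big) \\
&= \mathrm{Tr}_1^n\Big(\frac{a^4 + a^2 \cdot a^{2^{m+1}} + a^{2^{m+2}} \cdot a^{2^{m+1}} + a^2 \cdot a^{2^{m+2}}}{a^{2^{m+1}} \cdot a^4}\Big) \\
&= \mathrm{Tr}_1^n\Big(\frac{1}{a^{2^{m+1}}} + \frac{1}{a^2} + \frac{a^{2^{m+2}}}{a^4} + \frac{a^{2^{m+1}}}{a^2}\Big) \\
&= \mathrm{Tr}_1^n\Big(\frac{1}{a}\Big) + \mathrm{Tr}_1^n\Big(\frac{1}{a}\Big) + \mathrm{Tr}_1^n\Big(\frac{a^{2^{m+1}}}{a^2}\Big) + \mathrm{Tr}_1^n\Big(\frac{a^{2^{m+1}}}{a^2}\Big) \\
&= 0.
\end{aligned}$$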
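Thus, (13) has two solutions in $\mathbb{F}_{2^n}$. This also shows that for any $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, (12)
always has four solutions in $\mathbb{F}_{2^n}$. By Theorem 1 in [35], one can get the solutions of
(13), which are
$$x_1 = a \sum_{i=1}^{m} \Big(\frac{c^{2^{m+1}+1}}{a^3}\Big)^{2^{2i-1}}, \quad \text{and} \quad x_2 = x_1 + a.$$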
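Next we should verify whether $x_1$ is a solution of (9) or not. If $x_1$ is a solution of
(9), so is $x_2$.
Let $y = \frac{x_1}{a}$; then (13) becomes
$$y^2 + y + \frac{c^{2^{m+1}+1}}{a^3} = 0. \tag{14}$$
Moreover, $x_1 = ay$ is a solution of (9) if and only if
$$y^{2^{m+1}} + \frac{a^2}{a^{2^{m+1}}}\, y^2 + \frac{ca}{a^{2^{m+1}}}\, y = 0, \tag{15}$$
which, by (14) and $ca = a^{2^{m+1}} + a^2$, is equivalent to
$$y^{2^{m+1}} + y + \frac{c^{2^{m+1}+1}}{a^{2^{m+1}+1}} = 0. \tag{16}$$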
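On the other hand, by (14) we have
$$\begin{aligned}
y^{2^{m+1}} + y
&= \sum_{i=0}^{m} (y^2 + y)^{2^i} \\
&= \sum_{i=0}^{m} \Big(\frac{c^{2^{m+1}+1}}{a^3}\Big)^{2^i} \\
&= \sum_{i=0}^{m} \Big(\frac{1}{a^{2^{m+1}}} + \frac{1}{a^2} + \frac{a^{2^{m+2}}}{a^4} + \frac{a^{2^{m+1}}}{a^2}\Big)^{2^i} \\
&= \sum_{i=0}^{m} \Big(\Big(\frac{1}{a^2}\Big)^{2^m} + \frac{1}{a^2} + \Big(\frac{a^{2^{m+1}}}{a^2}\Big)^2 + \frac{a^{2^{m+1}}}{a^2}\Big)^{2^i} \\
&= \mathrm{Tr}_1^n\Big(\frac{1}{a^2}\Big) + \frac{1}{a^{2^{m+1}}} + \Big(\frac{a^{2^{m+1}}}{a^2}\Big)^{2^{m+1}} + \frac{a^{2^{m+1}}}{a^2} \\
&= \mathrm{Tr}_1^n\Big(\frac{1}{a^2}\Big) + \frac{1}{a^{2^{m+1}}} + \frac{a^{2^{m+1}}}{a^2} + \frac{a^2}{a^{2^{m+2}}} \\
&= \mathrm{Tr}_1^n\Big(\frac{1}{a^2}\Big) + 1 + \frac{a^{2^{m+2}} + a^2}{a^{2^{m+2}}} \cdot \frac{a^{2^{m+1}} + a^2}{a^2} \\
&= \mathrm{Tr}_1^n\Big(\frac{1}{a^2}\Big) + 1 + \frac{c^{2^{m+1}+1}}{a^{2^{m+1}+1}}.
\end{aligned} \tag{17}$$
By (17) and (16), we can conclude that for each $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, the solution $x_1$ of (13) is
also a solution of (9) if and only if $\mathrm{Tr}_1^n(\frac{1}{a^2}) = \mathrm{Tr}_1^n(\frac{1}{a}) = 1$. This means that for each
$a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, (9) has two (resp. four) solutions in $\mathbb{F}_{2^n}$ if and only if $\mathrm{Tr}_1^n(\frac{1}{a}) = 0$ (resp.
$\mathrm{Tr}_1^n(\frac{1}{a}) = 1$). It is obvious that the number of $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$ such that $\mathrm{Tr}_1^n(\frac{1}{a}) = 0$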
For each given $a \in \mathbb{F}_{2^n}^*$, denote the linearized polynomial on the left hand side
of (9) by $L_a(x)$. Then, $L_a(x)$ is a linear transformation from the vector space $\mathbb{F}_{2^n}$ to
itself. Let $A_i = \{a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2 \mid \mathrm{Tr}_1^n(\frac{1}{a}) = i\}$, where i = 0, 1. Then $\mathbb{F}_{2^n}^* = \{1\} \cup A_0 \cup A_1$.
The above arguments have shown that the kernel of $L_a(x)$, denoted by $\ker L_a$, contains
two elements of $\mathbb{F}_{2^n}$ if $a \in \{1\} \cup A_0$ and four elements if $a \in A_1$. Note that the
linear transformation $L_a(x)$ is also a homomorphism from the additive group of $\mathbb{F}_{2^n}$
to itself. Then by the homomorphism theorem, the image of $L_a(x)$ has cardinality
$\frac{2^n}{|\ker L_a|} = 2^{n-1}$ if $a \in \{1\} \cup A_0$ and has cardinality $2^{n-2}$ if $a \in A_1$. Moreover, for each
element d in the image of $L_a(x)$, there exist exactly $|\ker L_a|$ elements x in $\mathbb{F}_{2^n}$ such
that $L_a(x) = d$.
For each $a \in \mathbb{F}_{2^n}^*$, let $B_a$ denote the image of the linear transformation
$L_a(x) = x^{2^{m+1}} + x^2 + cx$. We have obtained that $|B_a| = 2^{n-1}$ if $a \in \{1\} \cup A_0$ and $|B_a| = 2^{n-2}$
if $a \in A_1$. By (8), for a given element $a \in \mathbb{F}_{2^n}^*$, the correspondence between d and b is
one-to-one. Recall that N(a, b) denotes the number of solutions of (7) in $\mathbb{F}_{2^n}$. Thus,
we can conclude that for each $a \in \{1\} \cup A_0$ (resp. $a \in A_1$), N(a, b) = 2 (resp. 4)
if and only if $b \in aB_a + g(a) = \{ad + g(a) \mid d \in B_a\}$. In all other cases, N(a, b) = 0.
Thus, the number of pairs $(a, b) \in \mathbb{F}_{2^n}^* \times \mathbb{F}_{2^n}$ such that N(a, b) = 2 (resp. 4) is equal
to $2^{n-1} \cdot 2^{n-1}$ (resp. $(2^{n-1} - 1) \cdot 2^{n-2}$). This together with (3) gives the differential
spectrum of g(x).
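The counting argument can be confirmed numerically for $n = 5$ ($m = 2$). The following sketch is illustrative only; the field model $\mathbb{F}_2[x]/(x^5+x^2+1)$ and the helper names are assumptions. It tabulates N(a, b) for the Welch permutation and records, for each a, the only nonzero value taken by N(a, b).

```python
from collections import Counter

N_BITS, M, POLY = 5, 2, 0b100101   # assumed model of F_{2^5}; n = 2m + 1
Q = 1 << N_BITS

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(x):
    t = 0
    for _ in range(N_BITS):
        t ^= x
        x = gf_mul(x, x)
    return t

def g(x):
    """Welch permutation g(x) = x^(2^(m+1)+1) + x^3 + x."""
    return gf_pow(x, 2 ** (M + 1) + 1) ^ gf_pow(x, 3) ^ x

G = [g(x) for x in range(Q)]
spectrum, multiplicity = Counter(), {}
for a in range(1, Q):
    counts = Counter(G[x ^ a] ^ G[x] for x in range(Q))
    for b in range(Q):
        spectrum[counts[b]] += 1
    multiplicity[a] = max(counts.values())   # nonzero value of N(a, b) for this a
print(dict(spectrum))
```

For $n = 5$ the proof predicts $\omega_2 = 2^{n-1} \cdot 2^{n-1} = 256$ and $\omega_4 = (2^{n-1}-1) \cdot 2^{n-2} = 120$, hence $\omega_0 = 616$ by (3), with N(a, b) = 4 occurring exactly for $a \in A_1$.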
Note that $\mathrm{Tr}_1^n(ag(x)) = \mathrm{Tr}_1^n\big(a(x^{2^{m+1}+1} + x^3 + x)\big)$ is a quadratic Boolean function
from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. According to Lemma 1, the Walsh transform of $\mathrm{Tr}_1^n(ag(x))$ heavily
depends on its rank. Below is an auxiliary result for the rank of $\mathrm{Tr}_1^n(ag(x))$.
Lemma 2. Let s, n, l be positive integers satisfying gcd(s, n) = 1 and let
$$Q(x) = \sum_{i=1}^{l} \mathrm{Tr}_1^n(c_i x^{2^{si}+1}),$$
where $c_i \in \mathbb{F}_{2^n}$ and at least one $c_i$ is nonzero for 1 ≤ i ≤ l. Then, the rank 2h of Q(x)
is in the range n − 2l ≤ 2h ≤ n.
Proof. We consider the following equation
$$\sum_{i=1}^{l} \big(c_i x^{2^{si}} + c_i^{2^{-is}} x^{2^{-is}}\big) = 0,$$
which is equivalent to
$$\begin{aligned}
\Big(\sum_{i=1}^{l} \big(c_i x^{2^{si}} + c_i^{2^{-is}} x^{2^{-is}}\big)\Big)^{2^{ls}}
&= \sum_{i=1}^{l} \big(c_i^{2^{ls}} x^{2^{s(l+i)}} + c_i^{2^{s(l-i)}} x^{2^{s(l-i)}}\big) \\
&= \sum_{i=l+1}^{2l} c_{i-l}^{2^{ls}} x^{2^{si}} + \sum_{j=0}^{l-1} c_{l-j}^{2^{sj}} x^{2^{sj}} \\
&= 0.
\end{aligned} \tag{18}$$
This can be written as
$$\sum_{i=0}^{2l} a_i x^{2^{si}} = 0, \tag{19}$$
where $a_i = c_{l-i}^{2^{si}}$ for i = 0, 1, ..., l − 1, $a_l = 0$ and $a_i = c_{i-l}^{2^{ls}}$ for i = l + 1, l + 2, ..., 2l.
Since gcd(s, n) = 1, according to [36, Corollary 1], the equation (19) has at most $2^{2l}$
solutions in $\mathbb{F}_{2^n}$. The desired result then follows.
With Theorem 2 and Lemma 2, we are ready to prove the following theorem.
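Theorem 3. Let n = 2m + 1 and $g(x) = x^{2^{m+1}+1} + x^3 + x$ be the Welch permutation
of $\mathbb{F}_{2^n}$. Then the extended Walsh spectrum of g(x) is given in Table 1.
Value            Frequency
0                $9 \cdot 2^{2n-4} + 3 \cdot 2^{n-3} - 1$
$\pm 2^{m+1}$    $\frac{5 \cdot 2^{n-1} - 2}{3} \big(2^{n-2} \pm 2^{\frac{n-3}{2}}\big)$
$\pm 2^{m+2}$    $\frac{2^{n-1} - 1}{3} \big(2^{n-4} \pm 2^{\frac{n-5}{2}}\big)$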
Then, by Lemma 2 and taking s = m and l = 2, we can conclude that the rank
of $Q_a(x)$ is n − 3 or n − 1. When a runs through $\mathbb{F}_{2^n}^*$, assume that the number of
$a \in \mathbb{F}_{2^n}^*$ such that $Q_a(x)$ has rank n − (2i − 1) is $N_i$, i = 1, 2. Then, by Lemma 1,
when (a, b) runs through $\mathbb{F}_{2^n} \times \mathbb{F}_{2^n}$, the extended Walsh transform $W_g(a, b)$ of g(x)
has the following distribution
$$W_g(a, b) = \begin{cases} 0, & (2^n - 1) + N_1(2^n - 2^{n-1}) + N_2(2^n - 2^{n-3}) \text{ times}, \\ \pm 2^{m+1}, & N_1\big(2^{n-2} \pm 2^{\frac{n-3}{2}}\big) \text{ times}, \\ \pm 2^{m+2}, & N_2\big(2^{n-4} \pm 2^{\frac{n-5}{2}}\big) \text{ times}. \end{cases}$$
Next we calculate the fourth power sum of $W_g(a, b)$. On one hand, we have
$$\sum_{a, b \in \mathbb{F}_{2^n}} (W_g(a, b))^4 = 2^{4n} + 2^{4m+4} \cdot 2^{n-1} \cdot N_1 + 2^{4m+8} \cdot 2^{n-3} \cdot N_2. \tag{20}$$
On the other hand, we have
$$\sum_{a, b \in \mathbb{F}_{2^n}} (W_g(a, b))^4 = 2^{2n}\, T, \tag{21}$$
where T denotes the number of $(x, y, u, v) \in (\mathbb{F}_{2^n})^4$ satisfying
$$\begin{cases} x + y + u + v = 0, \\ g(x) + g(y) + g(u) + g(v) = 0. \end{cases}$$
Let N(a, b) be the number of solutions of g(x + a) + g(x) = b in $\mathbb{F}_{2^n}$. Then, we have
$T = \sum_{a, b \in \mathbb{F}_{2^n}} N(a, b)^2$. Using the notation and results in Theorem 2, we have
$$T = \sum_{a, b \in \mathbb{F}_{2^n}} N(a, b)^2 = 2^{2n} + 4\omega_2 + 16\omega_4 = 4 \cdot 2^{2n} - 2^{n+2}. \tag{22}$$
Combining (20), (21), (22) and the fact that $N_1 + N_2 = 2^n - 1$, we obtain $N_1$ and
$N_2$. Then, we get the value distribution of the extended Walsh transform of g(x) as
in Table 1.
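Table 1 and the fourth power moment identity can be checked by exhaustive computation for $n = 5$. The sketch below is an illustration only; the field model $\mathbb{F}_2[x]/(x^5+x^2+1)$ and all helper names are assumptions. The pair (a, b) = (0, 0), where the transform equals $2^n$, is kept separate because Table 1 does not list it.

```python
from collections import Counter

N_BITS, M, POLY = 5, 2, 0b100101   # assumed model of F_{2^5}; n = 2m + 1
Q = 1 << N_BITS

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def trace(x):
    t = 0
    for _ in range(N_BITS):
        t ^= x
        x = gf_mul(x, x)
    return t

G = [gf_pow(x, 2 ** (M + 1) + 1) ^ gf_pow(x, 3) ^ x for x in range(Q)]
CHI = [(-1) ** trace(x) for x in range(Q)]

# Extended Walsh transform W_g(a, b) = sum_x (-1)^Tr(b g(x) + a x), excluding (0, 0).
spectrum = Counter(
    sum(CHI[gf_mul(b, G[x]) ^ gf_mul(a, x)] for x in range(Q))
    for a in range(Q) for b in range(Q) if (a, b) != (0, 0))

# T = sum over all (a, b) of N(a, b)^2, including a = 0.
T = 0
for a in range(Q):
    counts = Counter(G[x ^ a] ^ G[x] for x in range(Q))
    T += sum(c * c for c in counts.values())

n, m = N_BITS, M
N1, N2 = (5 * 2 ** (n - 1) - 2) // 3, (2 ** (n - 1) - 1) // 3
table1 = {0: 9 * 2 ** (2 * n - 4) + 3 * 2 ** (n - 3) - 1,
          2 ** (m + 1): N1 * (2 ** (n - 2) + 2 ** ((n - 3) // 2)),
          -2 ** (m + 1): N1 * (2 ** (n - 2) - 2 ** ((n - 3) // 2)),
          2 ** (m + 2): N2 * (2 ** (n - 4) + 2 ** ((n - 5) // 2)),
          -2 ** (m + 2): N2 * (2 ** (n - 4) - 2 ** ((n - 5) // 2))}
fourth_moment = Q ** 4 + sum(v ** 4 * c for v, c in spectrum.items())
print(dict(spectrum), T)
```

With $n = 5$ this gives $N_1 = 26$, $N_2 = 5$ and $T = 4 \cdot 2^{10} - 2^{7} = 3968$, in agreement with (20) to (22).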
the Welch exponent. That is to say, the matrix
$$H = \begin{bmatrix} 1 & \beta & \beta^2 & \cdots & \beta^{2^n-2} \\ 1 & \beta^d & \beta^{2d} & \cdots & \beta^{(2^n-2)d} \end{bmatrix} \tag{23}$$
where i1 , . . . , it are the t locations of the error, from the key equation
derivative equation of $x^d$ can be written as
$$(x + 1)^d + x^d = (x + x^{2^m})(x^2 + x + 1) + 1 = g(x + x^{2^m}) + 1,$$
where $g(x) = x^{2^{m+1}+1} + x^3 + x$ is the corresponding Welch permutation. Let
$z = x + x^{2^m}$. The task of correcting double errors for $C_1$ therefore can be re-arranged as
follows:
Step 1: solve the equation $g(z) = c = 1 + s_2/s_1^d$;
Step 2: solve the equation $y + y^{2^m} = \eta$, where η is the solution obtained in Step 1;
Step 3: determine error positions $i_1, i_2$ from $x_t = s_1 y_t$ for t = 1, 2.
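The three steps can be sketched for $n = 5$ ($m = 2$, $d = 7$) as follows. This is an illustration under assumptions, not an efficient decoder: the field model $\mathbb{F}_2[x]/(x^5+x^2+1)$ and the helper names are hypothetical, and Steps 1 and 2 are carried out by exhaustive table lookup rather than by any of the faster methods.

```python
N_BITS, M, POLY = 5, 2, 0b100101   # assumed model of F_{2^5}; n = 2m + 1
Q = 1 << N_BITS
D = 2 ** M + 3                     # Welch exponent d

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

G = [gf_pow(z, 2 ** (M + 1) + 1) ^ gf_pow(z, 3) ^ z for z in range(Q)]
G_INV = {G[z]: z for z in range(Q)}            # exhaustive inverse of the permutation g
FROB = [gf_pow(y, 2 ** M) for y in range(Q)]   # y -> y^(2^m)

def decode_two_errors(s1, s2):
    """Error locators {x1, x2} from s1 = x1 + x2 and s2 = x1^d + x2^d (s1 != 0)."""
    c = 1 ^ gf_mul(s2, gf_pow(gf_pow(s1, D), Q - 2))    # Step 1: c = 1 + s2 / s1^d
    eta = G_INV[c]                                      #         solve g(z) = c
    ys = [y for y in range(Q) if y ^ FROB[y] == eta]    # Step 2: y + y^(2^m) = eta
    return {gf_mul(s1, y) for y in ys}                  # Step 3: x_t = s1 * y_t

x1, x2 = 3, 22                                          # two hypothetical error locators
print(decode_two_errors(x1 ^ x2, gf_pow(x1, D) ^ gf_pow(x2, D)))
```

The error positions $i_1, i_2$ are then the discrete logarithms of the returned locators with respect to β.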
For the first step, one can find the preimage of c with the help of the com-
positional inverse g −1 (x) of the permutation g(x). Nevertheless, we don’t have an
explicit expression of the compositional inverse g −1 (x) yet. A straightforward way is
to exhaust possible z ∈ F2n for the equation g(z) + c = 0. For each evaluation g(z),
the Chien search method can reduce the computational complexity from $O(t^2)$ to
$O(t)$. The optimization in this part is negligible for t = 2. Another way is to calculate
$\gcd(z^{2^n-1} - 1, g(z) + c)$ over the polynomial ring $\mathbb{F}_{2^n}[z]$, which gives a linear term
$z + z_0$. This method can be further optimized based on the form of g(z). As observed
in [23], the equation g(z) = c for c ≠ 0 can be rewritten as
$$z^{2^{m+1}} = z^2 + 1 + \frac{c}{z}.$$
Dobbertin showed that $g_0(z)$ can only have one solution in $\mathbb{F}_{2^n}$. Hence, an alternative
way to solve g(z) = c is to calculate $\gcd(g(z) + c, g_0(z))$. To compare this calculation
with the typical root searching and the calculation of $\gcd(z^{2^n-1} - 1, g(z) + c)$, we recall
the result from [38].
Theorem 4. [38, Th. 5.4] Let $\mathbb{F}_q$ be the finite field of q elements and $\mathbb{F}_q[x]_t$ be the
polynomials in $\mathbb{F}_q[x]$ of degree t. Let e, d be positive integers such that q > d(2e − d + 1)/2
and e > d. Let $t_g^{\mathrm{div}}, t_g^{\div}, t_g^{-,\times}$ be the polynomial divisions, divisions, addition/multiplications
in $\mathbb{F}_q$. Given a polynomial $g \in \mathbb{F}_q[x]_e$, the average number $E(t_g^w)$ of operations
$w \in \{\mathrm{div}, \div, -, \times\}$ performed on (uniformly distributed) inputs from $\mathbb{F}_q[x]_d$ is bounded
in the following way:
$$\frac{E(t_g^{\mathrm{div}})}{d + 1} - 1 \le \frac{de}{q}, \quad \frac{E(t_g^{\div})}{e + d + 1} - 1 \le \frac{de}{q}, \quad \frac{E(t_g^{-,\times})}{de} - 1 \le \frac{de}{q}.$$
Note that $\gcd(z^{2^n-1} - 1, g(z) + c) = \gcd(g(z) + c, g_1(z))$, where $g_1$ is the remainder
polynomial with degree less than deg(g). Hence computing $\gcd(z^{2^n-1} - 1, g(z) + c)$ requires more
operations than $\gcd(g(z) + c, g_0(z))$, where $\deg(g_0) = 9$. According to the above
theorem, for the polynomial $g(z) + c \in \mathbb{F}_{2^n}[z]$ of degree $e = 2^{m+1} + 1$, calculating
$\gcd(g(z) + c, g_0(z))$ with $d = \deg(g_0) = 9$ on average takes d + 1 = 10 polynomial
divisions, $e + d + 1 = 2^{m+1} + 11 \approx \sqrt{q}$ divisions and $ed = 9(2^{m+1} + 1) \approx 9\sqrt{q}$ additions
and multiplications in $\mathbb{F}_q$ for $q = 2^n = 2^{2m+1}$. On the other hand, finding roots
of g(z) = c with the Chien search method for t = 2 takes on average $\frac{tq}{2} = q$ operations in
$\mathbb{F}_q$. In this sense, it is better to calculate $\gcd(g(z) + c, g_0(z))$ in solving the equation
g(z) = c for Step 1.
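Suppose η is the root of g(z) = c in Step 1. From the equality $y^{2^m} + y = \eta$, we
can obtain the quadratic equation $y^2 + y = \eta^{2^{m+1}} + \eta^2$. Let $\theta = \eta^{2^{m+1}} + \eta^2$. Then, it
satisfies $\mathrm{Tr}_1^n(\theta) = 0$. Suppose that, for a normal basis $\beta^{2^i}$, i = 0, 1, ..., n − 1, the elements
are expressed as $\theta = \sum_{i=0}^{n-1} \theta_i \beta^{2^i}$ and $y = \sum_{i=0}^{n-1} y_i \beta^{2^i}$. Then we obtain the following system of n
linear equations with rank n − 1 in n variables in $\mathbb{F}_2$:
$$\begin{bmatrix} 1 & 0 & 0 & \cdots & 1 \\ 1 & 1 & 0 & \cdots & 0 \\ 0 & 1 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & 1 \end{bmatrix} \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{bmatrix} = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_{n-1} \end{bmatrix},$$
which has solutions $y_i = y_{n-1} + \sum_{j=0}^{i} \theta_j$ for i = 0, 1, ..., n − 2 and $y_{n-1} \in \mathbb{F}_2$. Here
we provide this elementary process to show its complexity is in the order of O(n) instead
of the typical complexity $O(n^3)$ for solving linearized equations over $\mathbb{F}_{2^n}$.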
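At the level of coordinates, the O(n) procedure is a running XOR. The sketch below is illustrative and works only with the coordinate vectors; it assumes the normal-basis coordinates $\theta_0, \ldots, \theta_{n-1}$ are already available, and the function name is hypothetical.

```python
from itertools import product

def solve_cyclic_system(theta, y_last):
    """Solve y[i-1] + y[i] = theta[i] (indices mod n) over F_2 in O(n).

    `theta` lists the coordinates theta_0..theta_{n-1}; `y_last` is the free
    coordinate y_{n-1}. A solution exists only if the coordinates of theta
    sum to 0, which corresponds to Tr(theta) = 0.
    """
    n = len(theta)
    if sum(theta) % 2:
        raise ValueError("no solution: the coordinates of theta must sum to 0")
    y, acc = [0] * n, y_last
    for i in range(n - 1):
        acc ^= theta[i]          # y_i = y_{n-1} + theta_0 + ... + theta_i
        y[i] = acc
    y[n - 1] = y_last
    return y

print(solve_cyclic_system([1, 0, 1, 1, 1], 0))
```

The two choices of $y_{n-1}$ give the two solutions $y$ and $y + 1$ of the quadratic equation.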
With the two solutions $y_1, y_2$ in Step 2, the two corresponding error positions $i_1, i_2$
can be immediately obtained from $\beta^{i_1} = x_1 = y_1 s_1$, $\beta^{i_2} = x_2 = y_2 s_1$.
We now discuss another family of binary cyclic codes closely related to the Welch
permutation. For the Welch permutation $g(x) = x^{2^{m+1}+1} + x^3 + x$, we define a cyclic
code $C_2$ with the primary defining set $\{1, 3, 2^{m+1} + 1\}$. We see that the code $C_2$ is a
subcode of the trivial double-error-correcting BCH code, namely, the code with primary defining
set {1, 3}. Interestingly, the code $C_2$ actually has properties rather similar to those
of the triple-error-correcting BCH code with primary defining set {1, 3, 5}. With the
fact that $2^{2(m+1)} + 1 \equiv 3 \pmod{2^{2m+1} - 1}$, we see that the defining set of $C_2$ can be
written as $\{1, 2^k + 1, 2^{2k} + 1\}$, where k = m + 1. The first author and Bracken [39]
showed that $C_2$ has minimum distance 7. Note that the dual code $C_2^{\perp}$ is given by
$$C_2^{\perp} = \Big\{ \big(\mathrm{Tr}_1^n(a x^{2^{m+1}+1} + b x^3 + c x)\big)_{x \in \mathbb{F}_{2^n}^*} : a, b, c \in \mathbb{F}_{2^n} \Big\}.$$
when n/gcd(n, k) is odd. From [40, Th. 1] one can readily see that the codes $D_1$ and
$D_{m+1}$ have exactly the same weight distribution with a 5-weight spectrum
$$\big\{2^{n-1},\ 2^{n-1} \pm 2^{(n-1)/2},\ 2^{n-1} \pm 2^{(n+1)/2}\big\}.$$
This implies that $C_2$ with defining set $\{1, 3, 2^{m+1} + 1\}$ and the triple-error-correcting BCH
code with defining set {1, 3, 5} have exactly the same weight distribution. Since the
terms $x^{2^{m+1}+1}$ and $x^3$ have algebraic degree 2, the code $C_2^{\perp}$ is a super-code of the first-order
binary Reed-Muller code in the second-order Reed-Muller code. It is worth noting
that in this context, Kai-Uwe Schmidt has made significant contributions, including
first-order generalized Reed-Muller codes [41] and complementary sets in the context
of sequence design [42].
Charpin, Helleseth and Zinoviev [17] showed that the coset weight distribution of the
triple-error-correcting BCH code of length $N = 2^n - 1$ is given by
$$K_0 = 1, \quad K_1 = \binom{N}{1}, \quad K_2 = \binom{N}{2}, \quad K_3 = \binom{N}{3},$$
$$K_4 = \frac{N(5N^2 + 10N - 3)}{6}, \quad K_5 = \frac{4N(N + 2)}{3},$$
where Ki is the number of coset leaders with weight i. Moreno and Castro in [43]
showed that binary cyclic codes with primary defining set {1, 2k + 1, 22k + 1}, where
gcd(k, n) = 1, have covering radius 5. This implies that the cyclic code C2 has covering
radius 5. Experimental results show that for m ≥ 3, the cyclic code C2 has the same
coset distribution as above. In our view, this is an interesting connection. Nevertheless,
we are not able to provide a theoretical proof for this fact.
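As a sanity check on the formulas for $K_0, \ldots, K_5$, note that a triple-error-correcting BCH code of length $N = 2^n - 1$ has $2^{3n}$ cosets, so the values must sum to $2^{3n}$. The short sketch below verifies this arithmetic; the function name is hypothetical.

```python
from math import comb

def coset_distribution(n):
    """(K_0, ..., K_5) for length N = 2^n - 1, n odd, following the formulas above."""
    N = 2 ** n - 1
    return [1, comb(N, 1), comb(N, 2), comb(N, 3),
            N * (5 * N * N + 10 * N - 3) // 6,
            4 * N * (N + 2) // 3]

for n in (5, 7, 9):
    K = coset_distribution(n)
    print(n, K, sum(K) == 2 ** (3 * n))
```

For $n = 5$ this gives $(1, 31, 465, 4495, 26412, 1364)$, which sums to $2^{15} = 32768$.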
Below we discuss the decoding of this triple-error-correcting code. Suppose a
received vector y contains an error e of weight 2. Since $C_2$ has defining set
$\{1, 3, 2^{m+1} + 1\}$, the error e can be corrected with a BCH decoder. On the other hand, since the
code length is $2^n - 1$, finding the roots of the error-locator polynomial directly would
be costly when n increases. Instead, the following process works more efficiently. Let
$(s_1, s_2, s_3) = yH^{\intercal}$. We obtain the following system of equations as in (24):
$$\begin{cases} x_1 + x_2 = s_1, \\ x_1^3 + x_2^3 = s_2, \\ x_1^{2^{m+1}+1} + x_2^{2^{m+1}+1} = s_3, \end{cases}$$
where $x_t = \beta^{i_t}$ for t = 1, 2 and β is a primitive element in $\mathbb{F}_{2^n}$. The first two equations
immediately lead to the quadratic equation $s_1 x^2 + s_1^2 x = s_1^3 + s_2$, where $x = x_1$ or
$x_2$. This quadratic equation can be further transformed to $y^2 + y = 1 + \frac{s_2}{s_1^3}$ by letting
$y = \frac{x}{s_1}$. As discussed earlier, this equation can be solved in O(n) operations in $\mathbb{F}_2$.
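The reduction to $y^2 + y = 1 + s_2/s_1^3$ can be sketched as follows for $n = 5$. Note the substitution of technique: instead of the normal-basis system discussed earlier, this sketch solves the quadratic with the half-trace $H(\theta) = \sum_{i=0}^{(n-1)/2} \theta^{2^{2i}}$, which for odd $n$ satisfies $H(\theta)^2 + H(\theta) = \theta + \mathrm{Tr}_1^n(\theta)$. The field model $\mathbb{F}_2[x]/(x^5+x^2+1)$ and all helper names are assumptions.

```python
N_BITS, POLY = 5, 0b100101   # assumed model: F_{2^5} = F_2[x]/(x^5 + x^2 + 1)
Q = 1 << N_BITS

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def half_trace(t):
    """H(t) = t + t^4 + t^16 + ... ((n+1)/2 terms); solves y^2 + y = t when Tr(t) = 0."""
    h = 0
    for _ in range((N_BITS + 1) // 2):
        h ^= t
        t = gf_mul(t, t)
        t = gf_mul(t, t)
    return h

def decode_two_errors(s1, s2):
    """Error locators {x1, x2} from s1 = x1 + x2 and s2 = x1^3 + x2^3 (s1 != 0)."""
    theta = 1 ^ gf_mul(s2, gf_pow(gf_pow(s1, 3), Q - 2))   # 1 + s2 / s1^3
    y = half_trace(theta)                                   # one root; the other is y + 1
    return {gf_mul(s1, y), gf_mul(s1, y ^ 1)}               # x = s1 * y

x1, x2 = 5, 27                                              # two hypothetical error locators
print(decode_two_errors(x1 ^ x2, gf_pow(x1, 3) ^ gf_pow(x2, 3)))
```

Only the first two syndromes are needed for weight-2 errors; the third syndrome $s_3$ becomes relevant for three errors.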
Now we consider the decoding of triple errors in a vector y = c + e. Similarly, we
need to solve the following system of equations
$$\begin{cases} x_1 + x_2 + x_3 = s_1, \\ x_1^3 + x_2^3 + x_3^3 = s_2, \\ x_1^{2^{m+1}+1} + x_2^{2^{m+1}+1} + x_3^{2^{m+1}+1} = s_3, \end{cases} \tag{25}$$
implying
$$\begin{cases} y_1 y_2^2 + y_1^2 y_2 = 1 + s_2/s_1^3, \\ y_1 y_2^{2^k} + y_1^{2^k} y_2 = 1 + s_3/s_1^{2^k+1}. \end{cases}$$
Therefore, for any $s_1 \in \mathbb{F}_{2^n}$, it suffices to focus on only the following equations in $y_1, y_2$:
$$\begin{cases} y_1 y_2^2 + y_1^2 y_2 = \delta, \\ y_1 y_2^{2^{m+1}} + y_1^{2^{m+1}} y_2 = \tau, \\ y_1 y_2^{2^m} + y_1^{2^m} y_2 = \tau^{2^m}, \end{cases} \tag{26}$$
where $(\delta, \tau) = (s_2, s_3)$ for $s_1 = 0$, and $(\delta, \tau) = (1 + s_2/s_1^3,\ 1 + s_3/s_1^{2^{m+1}+1})$ for $s_1 \neq 0$. Note
that if δτ = 0, the equation can have only solutions $y_1 = y_2$, which is invalid here.
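The reduction from (25) to (26) can be checked exhaustively for $n = 5$. The change of variables is not spelled out in the surviving text, so the sketch below assumes $y_t = x_t/s_1 + 1$ when $s_1 \neq 0$ and $y_t = x_t$ when $s_1 = 0$; this assumption, the field model $\mathbb{F}_2[x]/(x^5+x^2+1)$ and the helper names are ours.

```python
from itertools import combinations

N_BITS, M, POLY = 5, 2, 0b100101   # assumed model of F_{2^5}; n = 2m + 1
Q = 1 << N_BITS
E = 2 ** (M + 1) + 1               # exponent 2^(m+1) + 1 of the third syndrome

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def inv(a):
    return gf_pow(a, Q - 2)

def cross(y1, y2, e):
    """The expression y1 * y2^e + y1^e * y2 appearing in (26)."""
    return gf_mul(y1, gf_pow(y2, e)) ^ gf_mul(gf_pow(y1, e), y2)

def satisfies_26(x1, x2, x3):
    s1 = x1 ^ x2 ^ x3
    s2 = gf_pow(x1, 3) ^ gf_pow(x2, 3) ^ gf_pow(x3, 3)
    s3 = gf_pow(x1, E) ^ gf_pow(x2, E) ^ gf_pow(x3, E)
    if s1 == 0:
        delta, tau, y1, y2 = s2, s3, x1, x2
    else:
        delta = 1 ^ gf_mul(s2, inv(gf_pow(s1, 3)))
        tau = 1 ^ gf_mul(s3, inv(gf_pow(s1, E)))
        y1, y2 = gf_mul(x1, inv(s1)) ^ 1, gf_mul(x2, inv(s1)) ^ 1   # assumed substitution
    return (cross(y1, y2, 2) == delta
            and cross(y1, y2, 2 ** (M + 1)) == tau
            and cross(y1, y2, 2 ** M) == gf_pow(tau, 2 ** M))

all_ok = all(satisfies_26(*t) for t in combinations(range(1, Q), 3))
print(all_ok)
```

Under this substitution all three equations of (26) hold for every set of three distinct error locators.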
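Furthermore, taking $z = y_1/y_2$, we derive the following equations from the above
system
$$\begin{cases} y_2^3 = \dfrac{\delta}{z^2 + z}, \\ y_2^{2^{m+1}+1} = \dfrac{\tau}{z^{2^{m+1}} + z}, \\ y_2^{2^m+1} = \dfrac{\tau^{2^m}}{z^{2^m} + z}, \end{cases} \Longrightarrow \begin{cases} y_2^{2^{m+1}-2} = \dfrac{\tau}{\delta} \cdot \dfrac{z^2 + z}{z^{2^{m+1}} + z}, \\ y_2^{2^m} = \dfrac{\tau}{\tau^{2^m}} \cdot \dfrac{z^{2^m} + z}{z^{2^{m+1}} + z}. \end{cases}$$
Assume $w = z^{2^{m+1}} + z$. It is readily seen that $z^2 + z = w^{2^{m+1}} + w$ and $z^{2^m} + z = w^{2^m}$.
Then we have
$$\begin{cases} y_2^{2^{m+1}-2} = \dfrac{\tau}{\delta} \cdot \dfrac{w^{2^{m+1}} + w}{w}, \\ y_2^{2^{m+1}} = (y_2^{2^m})^2 = \Big(\dfrac{\tau}{\tau^{2^m}} \cdot \dfrac{w^{2^m}}{w}\Big)^2, \\ y_2^{2} = (y_2^{2^m})^{2^{m+2}} = \Big(\dfrac{\tau}{\tau^{2^m}} \cdot \dfrac{w^{2^m}}{w}\Big)^{2^{m+2}}. \end{cases} \tag{27}$$
From the above equations, the fact $y_2^{2^{m+1}-2} \cdot y_2^2 = y_2^{2^{m+1}}$ gives
$$\frac{\tau}{\delta} \cdot \frac{w^{2^{m+1}} + w}{w} \cdot \Big(\frac{\tau}{\tau^{2^m}} \cdot \frac{w^{2^m}}{w}\Big)^{2^{m+2}} = \Big(\frac{\tau}{\tau^{2^m}} \cdot \frac{w^{2^m}}{w}\Big)^2,$$
i.e.,
$$\frac{\tau^{1+2^{m+2}} (w^{2^{m+1}} + w)\, w^2}{\delta \tau^2 w^{2^{m+2}+1}} = \frac{\tau^2}{\tau^{2^{m+1}}} \cdot \frac{w^{2^{m+1}}}{w^2}.$$
Rearranging the above equation yields
$$w^{3 \cdot 2^{m+1}} = \gamma \big(w^{2^{m+1}+3} + w^4\big),$$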
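Substituting the first equation into the second one in (28) gives
$\gamma\bar{\gamma}(\bar{w} + w)(\bar{w} + w^2) + w^3 = 0$, where the bar denotes raising to the power $2^{m+1}$,
i.e., $\bar{w} = w^{2^{m+1}}$ and $\bar{\gamma} = \gamma^{2^{m+1}}$. By the first equation and the new equation, we denote
$$\begin{cases} h_1 = \bar{w}^3 + \gamma w^3 \cdot \bar{w} + \gamma w^4 = 0, \\ h_2 = \bar{w}^2 + (w + w^2)\bar{w} + \gamma_1 w^3 = 0, \end{cases} \tag{29}$$
where $\gamma_1 = 1 + (\gamma\bar{\gamma})^{-1}$.
Below we will eliminate $\bar{w}$ from $h_1, h_2$. For the reader's convenience, we include the
process despite its simplicity. Viewing $h_1, h_2$ as polynomials in the variable $\bar{w}$, by the
Euclidean method, we have
$$h_1 = \big(\bar{w} + (w + w^2)\big) \cdot h_2 + h_3,$$
More explicitly,
where $\sigma_1, \sigma_2, \sigma_3$ are coefficients derived from the cubic polynomial in the second-to-last
equation and $h(w) = w^3 + \sigma_1 w^2 + \sigma_2 w + \sigma_3$. Therefore, the system (28) is reduced to
the cubic equation h(w) = 0.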
For correctable syndromes s = (s1 , s2 , s3 ) derived from an error e of weight 3, it
can be verified, according to the criteria in [44], that h(w) has three roots. In order
to obtain the roots of h(w), we follow the method by Berlekamp and Solomon [45].
Multiplying h(w) by w + σ1 , we obtain a linearized polynomial
in $\mathbb{F}_2$. Given a root w of h(w), from the equation $z^{2^{m+1}} + z = w$, one can obtain
the equation $z^2 + z = w^{2^{m+1}} + w$ and can find the roots z in O(n) operations in
$\mathbb{F}_2$. By (27) we can get the unique root $y_2$ from δ, τ, w; and by $z = y_1/y_2$, one can
get solutions $(y_1, y_2)$ from $\{(y_2 z, y_2), (y_2 z + y_2, y_2)\}$ for the system (26) of equations.
Furthermore, for either $s_1 = 0$ or $s_1 \neq 0$, we can get two solutions $(x_1, x_2, x_3)$ from
$(y_1, y_2) \in \{(y_2 z, y_2), (y_2 z + y_2, y_2)\}$. Here it is to be noted that the three roots w of
h(w) lead to six solutions $(x_1, x_2, x_3)$ for the syndrome equations. These six solutions
correspond to the 6 permutations of one error e with support $\{i_1, i_2, i_3\}$.
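The root-finding step for the cubic can be sketched as follows. Multiplying $h(w) = w^3 + \sigma_1 w^2 + \sigma_2 w + \sigma_3$ by $w + \sigma_1$ gives $w^4 + (\sigma_1^2 + \sigma_2) w^2 + (\sigma_1\sigma_2 + \sigma_3) w + \sigma_1\sigma_3$, whose non-constant part is $\mathbb{F}_2$-linear in w, so its roots can be found by linear algebra over $\mathbb{F}_2$. The code is an illustration under assumptions (field model $\mathbb{F}_2[x]/(x^5+x^2+1)$, hypothetical helper names, plain Gaussian elimination), not the exact procedure of [45].

```python
N_BITS, POLY = 5, 0b100101   # assumed model: F_{2^5} = F_2[x]/(x^5 + x^2 + 1)
Q = 1 << N_BITS

def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return r

def solve_affine(T, C):
    """All w with T(w) = C for an F_2-linear map T, via elimination on the bit basis."""
    pivots, kernel = {}, []            # leading bit -> (image, preimage); kernel basis
    for i in range(N_BITS):
        img, pre = T(1 << i), 1 << i
        while img:
            top = img.bit_length() - 1
            if top not in pivots:
                pivots[top] = (img, pre)
                break
            img, pre = img ^ pivots[top][0], pre ^ pivots[top][1]
        else:
            kernel.append(pre)         # image reduced to 0: pre lies in the kernel
    img, pre = C, 0
    while img:                         # reduce C to find one particular solution
        top = img.bit_length() - 1
        if top not in pivots:
            return []
        img, pre = img ^ pivots[top][0], pre ^ pivots[top][1]
    sols = [pre]
    for k in kernel:
        sols += [s ^ k for s in sols]
    return sols

def cubic_roots(s1, s2, s3):
    """Roots of h(w) = w^3 + s1 w^2 + s2 w + s3 obtained from (w + s1) h(w) = 0."""
    sq = lambda w: gf_mul(w, w)
    A, B, C = sq(s1) ^ s2, gf_mul(s1, s2) ^ s3, gf_mul(s1, s3)
    T = lambda w: sq(sq(w)) ^ gf_mul(A, sq(w)) ^ gf_mul(B, w)    # linear part
    h = lambda w: gf_mul(gf_mul(w ^ s1, w) ^ s2, w) ^ s3         # Horner form of h
    return sorted(w for w in solve_affine(T, C) if h(w) == 0)    # drop the extra root s1

r1, r2, r3 = 3, 7, 19                                            # hypothetical roots
sigma = (r1 ^ r2 ^ r3,
         gf_mul(r1, r2) ^ gf_mul(r1, r3) ^ gf_mul(r2, r3),
         gf_mul(gf_mul(r1, r2), r3))
print(cubic_roots(*sigma))
```

The elimination costs $O(n^3)$ bit operations, in line with the $O((\log N)^3)$ cost quoted for solving L(w) = 0.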
To summarize, the decoding of the cyclic code C2 for three errors can proceed as
follows:
• given a syndrome $(s_1, s_2, s_3) = yH^{\intercal}$, calculate the corresponding δ, τ from
$(s_1, s_2, s_3)$ and $\gamma = \tau^{3(2^{m+1}-1)}/\delta$;
• calculate the polynomial h(w) as in (31);
• construct the linearized polynomial L(w) from h(w) and find its solutions in $\mathbb{F}_{2^n}$;
• for each solution w, calculate the intermediate parameters $y_1, y_2$, and then use them
to calculate $x_t = \beta^{i_t}$ for t = 1, 2, 3;
• recover the codeword c = y + e with the support of e being $\{i_1, i_2, i_3\}$.
Denote by $N = 2^n - 1$ the length of the code $C_2$. In the above decoding procedure,
the calculation of the syndrome takes $O(N \log N)$ operations in $\mathbb{F}_2$; the calculation of
$h_4 = \gcd(h_1, h_2)$ is independent of the code length N and solving L(w) = 0 takes
$O((\log N)^3)$ operations in $\mathbb{F}_2$. This decoder significantly outperforms the syndrome
decoder with complexity $O(N^3)$ and recent decoders for cyclic codes in [20, 22], which
have complexity at least $O(N^2)$ for triple-error-correcting cyclic codes.
According to (6), the weight of nonzero codewords in $C_3^{\perp}$ can be expressed in terms of
the extended Walsh transform of g(x). From the Walsh spectrum obtained in Section
3, we see that the weight distribution of $C_3^{\perp}$ is obtained accordingly, which is a 5-weight
spectrum
$$\{2^{n-1},\ 2^{n-1} \pm 2^{m+1},\ 2^{n-1} \pm 2^m\}.$$
Now let us consider another binary linear code related to the Welch permutation.
Ding et al. in [9, 13] introduced a generic construction of binary linear codes from a
subset $D = \{d_1, d_2, \ldots, d_\ell\}$ of $\mathbb{F}_{2^n}$ and the absolute trace function $\mathrm{Tr}_1^n(\cdot)$ from $\mathbb{F}_{2^n}$ to
$\mathbb{F}_2$ as
$$C_D = \{c_a = (\mathrm{Tr}_1^n(ad_1), \mathrm{Tr}_1^n(ad_2), \ldots, \mathrm{Tr}_1^n(ad_\ell)) : a \in \mathbb{F}_{2^n}\}.$$
When the defining set D is properly chosen, the code $C_D$ can have a few nonzero
weights. Particularly, when the defining set is given as $D(F) = \{F(x) : x \in \mathbb{F}_{2^n}\}$ with
a two-to-one function F on $\mathbb{F}_{2^n}$, the Hamming weight of a codeword $c_a$ in $C_{D(F)}$ is
given by
That is to say, for studying the Hamming weight properties of the code CD(F ) , it is
critical to study the possible values of the exponential sum WF (a, 0).
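The link between the weight of a codeword and the exponential sum is a counting identity: since F is two-to-one, each element of D(F) is hit twice, which gives wt(c_a) = 2^{n-2} − (1/4) Σ_x (−1)^{Tr(aF(x))}. The Python sketch below checks this numerically on a toy example. The two-to-one map F(x) = x^2 + x, the field F_{2^5}, the modulus x^5 + x^2 + 1 and the helper names are hypothetical choices made for illustration; they are not the functions studied in this paper.

```python
N_BITS = 5            # assumed toy field GF(2^5)
MODULUS = 0b100101    # x^5 + x^2 + 1, irreducible over GF(2)
SIZE = 2**N_BITS


def gf_mul(a, b):
    """Multiply two elements of GF(2^5) stored as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N_BITS) & 1:
            a ^= MODULUS
    return r


def trace(a):
    """Absolute trace from GF(2^5) to GF(2), returned as 0 or 1."""
    t = 0
    for _ in range(N_BITS):
        t ^= a
        a = gf_mul(a, a)
    return t


def F(x):
    """Hypothetical two-to-one map x -> x^2 + x (F(x) = F(x + 1))."""
    return gf_mul(x, x) ^ x


D = sorted({F(x) for x in range(SIZE)})   # defining set D(F), of size 2^(n-1)


def weight_direct(a):
    """Weight of c_a = (Tr(a*d))_{d in D}, counted coordinate by coordinate."""
    return sum(trace(gf_mul(a, d)) for d in D)


def weight_from_sum(a):
    """Weight of c_a obtained from the exponential sum over the whole field."""
    s = sum((-1) ** trace(gf_mul(a, F(x))) for x in range(SIZE))
    return 2**(N_BITS - 2) - s // 4       # s is always a multiple of 4 here


print(all(weight_direct(a) == weight_from_sum(a) for a in range(SIZE)))
```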
In [13], Ding investigated the properties of binary linear codes from the images
of certain functions on $\mathbb{F}_{2^n}$ and proposed several conjectures on properties of the
constructed codes, including the following one from the Welch APN power function.
Conjecture 1. [13, Conjecture 33] Let $n = 2m + 1$ and $F(x) = x^{2^m+3}$. Let $f(x) =
F(x) + F(x + 1) + 1$ and $D(f) = \{d_1, d_2, \ldots, d_\ell\} = \{f(x) \mid x \in \mathbb{F}_{2^n}\}$. Define the binary
code $\mathcal{C}_{D(f)}$ as
$$\mathcal{C}_{D(f)} = \{ (\mathrm{Tr}_1^n(ad_1), \mathrm{Tr}_1^n(ad_2), \ldots, \mathrm{Tr}_1^n(ad_\ell)) : a \in \mathbb{F}_{2^n} \}.$$
If $n \in \{5, 7\}$, then $\mathcal{C}_{D(f)}$ is a three-weight code with length $2^{n-1}$ and dimension $n$. If
$n \geq 9$, then $\mathcal{C}_{D(f)}$ is a five-weight code with length $2^{n-1}$ and dimension $n$.
For the Welch APN power function $F(x) = x^{2^m+3}$ and $f(x) = F(x + 1) + F(x) + 1$,
it is easy to verify that
$$f(x) = F(x + 1) + F(x) + 1 = (x + x^{2^m})(x^2 + x + 1) = g(x + x^{2^m}),$$
where $g(x)$ is the Welch permutation of $\mathbb{F}_{2^n}$. With the properties of $g(x)$ discussed in
Section 3, we present the following result on the code $\mathcal{C}_{D(f)}$.
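The identity relating f and g can be confirmed by exhaustive evaluation in small fields. The Python sketch below does so for n = 5 and n = 7, with g(x) = x^{2^{m+1}+1} + x^3 + x as defined earlier in the paper. The field representations (moduli x^5 + x^2 + 1 and x^7 + x + 1) and the helper names are assumptions made for illustration; the check is a sanity test, not a proof.

```python
# Assumed field representations: x^5 + x^2 + 1 and x^7 + x + 1 are irreducible over GF(2).
MODULI = {5: 0b100101, 7: 0b10000011}


def gf_mul(a, b, n):
    """Multiply two elements of GF(2^n) stored as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:
            a ^= MODULI[n]
    return r


def gf_pow(a, e, n):
    """Square-and-multiply exponentiation in GF(2^n)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a, n)
        a = gf_mul(a, a, n)
        e >>= 1
    return r


def identity_holds(n):
    """Check f(x) = (x + x^(2^m))(x^2 + x + 1) = g(x + x^(2^m)) for every x in GF(2^n)."""
    m = (n - 1) // 2
    F = lambda x: gf_pow(x, 2**m + 3, n)
    g = lambda z: gf_pow(z, 2**(m + 1) + 1, n) ^ gf_pow(z, 3, n) ^ z
    for x in range(2**n):
        z = x ^ gf_pow(x, 2**m, n)                    # z = x + x^(2^m)
        f_x = F(x ^ 1) ^ F(x) ^ 1                     # F(x + 1) + F(x) + 1
        product = gf_mul(z, gf_mul(x, x, n) ^ x ^ 1, n)
        if not (f_x == product == g(z)):
            return False
    return True


print(identity_holds(5), identity_holds(7))
```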
Theorem 5. Let $n = 2m + 1$ for a positive integer $m \geq 2$. The binary linear code
$\mathcal{C}_{D(f)}$ defined in Conjecture 1 has length $2^{n-1}$, dimension $n$ and its nonzero weights
are contained in the following set:
$$\left\{ 2^{n-2},\ 2^{n-2} \pm 2^{\frac{n-3}{2}},\ 2^{2m-1} \pm 2^{\frac{n-1}{2}} \right\}.$$
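Proof. It is clear that the length of $\mathcal{C}_{D(f)}$ is $2^{n-1}$ since $f(x) = g(x + x^{2^m})$ is a
two-to-one function. As for the dimension, since $\mathcal{C}_{D(f)}$ is linear, we need to consider
the number of $a \in \mathbb{F}_{2^n}$ such that $\mathrm{Tr}_1^n(af(x)) = 0$ for any $x \in \mathbb{F}_{2^n}$, equivalently,
$$\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(af(x))} = 2^n.$$
Define $T_0 = \{x + x^{2^m} \mid x \in \mathbb{F}_{2^n}\}$ and $T_1 = \{x + 1 \mid x \in T_0\}$. Note that $x + x^{2^m}$ is a
two-to-one function over $\mathbb{F}_{2^n}$. Thus $T_0 \cup T_1 = \mathbb{F}_{2^n}$. Since $n$ is odd, we have $\mathrm{Tr}_1^n(1) = 1$
and $\mathrm{Tr}_1^n(x) = 1$ for any $x \in T_1$. Since $g(x)$ is a permutation of $\mathbb{F}_{2^n}$, one has
$$\sum_{z \in T_0} (-1)^{\mathrm{Tr}_1^n(bg(z))} + \sum_{z \in T_1} (-1)^{\mathrm{Tr}_1^n(bg(z))} = \sum_{z \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(bg(z))} = 0.$$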
It follows from (32) that
$$\mathrm{wt}(c_a) = 2^{n-2} - \frac{1}{4} \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(ag(x) + x)}.$$
From the Walsh spectrum of $g(x)$ in Table 1, the possible nonzero weights of the code
$\mathcal{C}_{D(f)}$ can be directly determined.
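For the smallest admissible case n = 5 (m = 2), the statement of Theorem 5 can be checked by brute force. The Python sketch below builds the defining set D(f) directly from F(x) = x^{2^m+3}, enumerates all codewords, and compares the observed weights with the set in the theorem. The modulus x^5 + x^2 + 1 and the helper names are assumptions made for illustration; the sketch checks containment only and does not determine the weight distribution.

```python
N_BITS, M = 5, 2      # assumed toy parameters, n = 2m + 1
MODULUS = 0b100101    # x^5 + x^2 + 1, irreducible over GF(2)
SIZE = 2**N_BITS


def gf_mul(a, b):
    """Multiply two elements of GF(2^5) stored as integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N_BITS) & 1:
            a ^= MODULUS
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^5)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def trace(a):
    """Absolute trace from GF(2^5) to GF(2), returned as 0 or 1."""
    t = 0
    for _ in range(N_BITS):
        t ^= a
        a = gf_mul(a, a)
    return t


def F(x):
    """Welch power function x^(2^m + 3)."""
    return gf_pow(x, 2**M + 3)


# Defining set D(f) = { F(x + 1) + F(x) + 1 : x in GF(2^n) }.
D = sorted({F(x ^ 1) ^ F(x) ^ 1 for x in range(SIZE)})

# All codewords c_a = (Tr(a*d))_{d in D}; a set removes repeated codewords.
codewords = {tuple(trace(gf_mul(a, d)) for d in D) for a in range(SIZE)}
weights = sorted({sum(c) for c in codewords} - {0})

# Weight set of Theorem 5 evaluated at n = 5, m = 2.
base = 2**(N_BITS - 2)
allowed = {
    base,
    base + 2**((N_BITS - 3) // 2), base - 2**((N_BITS - 3) // 2),
    2**(2 * M - 1) + 2**((N_BITS - 1) // 2), 2**(2 * M - 1) - 2**((N_BITS - 1) // 2),
}
print(len(D), len(codewords), weights, sorted(allowed))
```

Here the number of distinct codewords equals 2^n exactly when the code has dimension n, so the same enumeration also checks the length and dimension claims for this toy case.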
Theorem 5 provides a partial resolution to the conjecture by Ding [13]. It appears
that a new technique is required to completely settle the conjecture and determine the
weight distribution of the code $\mathcal{C}_{D(f)}$. With the help of Magma, we list some numerical
results in Table 4.2, which are in accordance with Theorem 5.
5 Conclusion
The contributions of this paper are twofold. First, we completely determined the
differential spectrum and the Walsh spectrum of the permutation polynomial $g(x)$
from the Welch APN power function $x^{2^m+3}$ over $\mathbb{F}_{2^{2m+1}}$. Second, we explored two
families of cyclic codes and two families of linear codes derived from the Welch APN
power function. For the two cyclic codes, whose properties have been well studied,
we presented efficient algebraic decoders; for the two linear codes, the weight
distribution of the first family can be easily obtained from the Walsh spectrum of
$g(x)$, and the weight spectrum of the second one was investigated, which partially
settles a conjecture by Ding in [13]. The Welch permutation $g(x)$ appears to have good
cryptographic properties, and other cryptographic criteria may deserve further
investigation.
Acknowledgment
Y. Xia was supported in part by the National Natural Science Foundation of China
under Grant 62171479 and in part by the Fundamental Research Funds for the Central
Universities, South-Central University for Nationalities under Grant CZZ23004.
C. Li and T. Helleseth were supported by the Research Council of Norway under
Grants 247742 and 311646.
Declarations
Conflict of interest: The authors declare that they have no conflicts of interest in
connection with the work submitted.
References
[1] Biham, E., Shamir, A.: Differential cryptanalysis of DES-like cryptosystems.
Journal of Cryptology 4(1), 3–72 (1991)
[2] Matsui, M.: Linear cryptanalysis method for DES cipher. In: Helleseth, T. (ed.)
Advances in Cryptology – Eurocrypt’93. Lecture Notes in Computer Science, vol.
765, pp. 386–397. Springer, Berlin, Heidelberg (1994)
[3] Chabaud, F., Vaudenay, S.: Links between differential and linear cryptanalysis.
In: De Santis, A. (ed.) Advances in Cryptology – EUROCRYPT’94. Lecture Notes
in Computer Science, vol. 950, pp. 356–365. Springer, Berlin, Heidelberg (1995)
[4] Carlet, C.: Boolean Functions for Cryptography and Coding Theory. Cambridge
University Press, Cambridge, United Kingdom (2021)
[6] Wu, C.-K., Feng, D.: Boolean Functions and Their Applications in Cryptography.
Springer, Berlin, Heidelberg (2016)
[8] Carlet, C., Charpin, P., Zinoviev, V.: Codes, bent functions and permutations
suitable for DES-like cryptosystems. Des. Codes Cryptogr. 15(2), 125–156 (1998)
[9] Ding, C., Niederreiter, H.: Cyclotomic linear codes of order 3. IEEE Trans. Inf.
Theory 53(6), 2274–2277 (2007)
[10] Li, N., Mesnager, S.: Recent results and problems on constructions of linear codes
from cryptographic functions. Cryptogr. Commun. 12(5), 965–986 (2020)
[11] Carlet, C., Ding, C., Yuan, J.: Linear codes from perfect nonlinear mappings and
their secret sharing schemes. IEEE Trans. Inf. Theory 51(6), 2089–2102 (2005)
[12] Ding, C., Wang, X.: A coding theory construction of new systematic
authentication codes. Theor. Comput. Sci. 330(1), 81–99 (2005)
[13] Ding, C.: A construction of binary linear codes from Boolean functions. Discrete
Math. 339(9), 2288–2303 (2016)
[14] Ding, C., Li, C., Li, N., Zhou, Z.: Three-weight cyclic codes and their weight
distributions. Discrete Math. 339(2), 415–427 (2016)
[15] Mesnager, S., Qu, L.: On two-to-one mappings over finite fields. IEEE Trans. Inf.
Theory 65(12), 7884–7895 (2019)
[16] Cohen, G., Karpovsky, M., Mattson, H., Schatz, J.: Covering radius—survey and
recent results. IEEE Trans. Inf. Theory 31(3), 328–343 (1985)
[17] Charpin, P., Helleseth, T., Zinoviev, V.A.: The coset distribution of
triple-error-correcting binary primitive BCH codes. IEEE Trans. Inf. Theory 52(4),
1727–1732 (2006)
[18] Berlekamp, E., McEliece, R., Van Tilborg, H.: On the inherent intractability
of certain coding problems (Corresp.). IEEE Trans. Inf. Theory 24(3), 384–386
(1978)
[19] Berlekamp, E.: Algebraic Coding Theory (Revised Edition). World Scientific
Publishing, Singapore (2015)
[20] Augot, D., Bardet, M., Faugère, J.-C.: On the decoding of binary cyclic codes
with the Newton identities. J. Symb. Comput. 44(12), 1608–1625 (2009)
[21] Lin, T.-C., Lee, C.-D., Chen, Y.-H., Truong, T.-K.: Algebraic decoding of cyclic
codes without error-locator polynomials. IEEE Trans. Commun. 64(7), 2719–2731
(2016)
[22] Caruso, F., Orsini, E., Sala, M., Tinnirello, C.: On the shape of the general error
locator polynomial for cyclic codes. IEEE Trans. Inf. Theory 63(6), 3641–3657
(2017)
[23] Dobbertin, H.: Almost perfect nonlinear power functions on GF($2^n$): the Welch
case. IEEE Trans. Inf. Theory 45(4), 1271–1275 (1999)
[24] Nyberg, K.: Differentially uniform mappings for cryptography. In: Helleseth, T.
(ed.) Advances in Cryptology – Eurocrypt’93. Lecture Notes in Computer Science,
vol. 765, pp. 55–64. Springer, Berlin, Heidelberg (1994)
[25] Blondeau, C., Canteaut, A., Charpin, P.: Differential properties of power
functions. Int. J. Information and Coding Theory 1(2), 149–170 (2010)
[26] Blondeau, C., Canteaut, A., Charpin, P.: Differential properties of $x \mapsto x^{2^t-1}$.
IEEE Trans. Inf. Theory 57(12), 8127–8137 (2011)
[27] Bracken, C., Leander, G.: A highly nonlinear differentially 4-uniform power
mapping that permutes fields of even degree. Finite Fields Appl. 16(4), 231–242
(2010)
[28] Charpin, P., Kyureghyan, G.M., Suder, V.: Sparse permutations with low
differential uniformity. Finite Fields Appl. 28, 214–243 (2014)
[29] Helleseth, T., Kumar, P.V.: Sequences with low correlation. In: Pless, V.S.,
Huffman, W.C. (eds.) Handbook of Coding Theory vol. II, pp. 1765–1853.
North-Holland, Amsterdam (1998)
[30] MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error-Correcting Codes. North-
Holland, Amsterdam (1977)
[31] Ding, C.: Linear codes from some 2-designs. IEEE Trans. Inf. Theory 61(6),
3265–3275 (2015)
[32] Ding, K., Ding, C.: A class of two-weight and three-weight codes and their
applications in secret sharing. IEEE Trans. Inf. Theory 61(11), 5835–5842 (2015)
[33] Zhou, Z., Li, N., Fan, C., Helleseth, T.: Linear codes with two or three weights
from quadratic bent functions. Des. Codes Cryptogr. 81(2), 283–295 (2016)
[34] Mesnager, S.: Linear codes with few weights from weakly regular bent functions
based on a generic construction. Cryptogr. Commun. 9(1), 71–84 (2017)
[35] Chen, C.L.: Formulas for the solutions of quadratic equations over GF($2^m$). IEEE
Trans. Inf. Theory 28(5), 792–794 (1982)
[36] Bracken, C., Byrne, E., Markin, N., McGuire, G.: Determining the nonlinearity of
a new family of APN functions. In: Boztas, S., Lu, H.-F. (eds.) Applied Algebra,
Algebraic Algorithms and Error-Correcting Codes. AAECC 2007. Lecture Notes
in Computer Science, vol. 4851, pp. 72–79. Springer, Berlin, Heidelberg (2007)
[37] Canteaut, A., Charpin, P., Dobbertin, H.: Binary m-sequences with three-valued
crosscorrelation: a proof of Welch’s conjecture. IEEE Trans. Inf. Theory 46(1),
4–8 (2000)
[38] Giménez, N., Matera, G., Pérez, M., Privitelli, M.: Average-case complexity of
the Euclidean algorithm with a fixed polynomial over a finite field. Comb. Probab.
Comput. 31(1), 166–183 (2022)
[39] Bracken, C., Helleseth, T.: Triple-error-correcting BCH-like codes. 2009 IEEE
International Symposium on Information Theory, Seoul, Korea (South), 28 June
– 03 July 2009 (2009)
[40] Luo, J.: On binary cyclic codes with five nonzero weights. Preprint at
https://doi.org/10.48550/arXiv.0904.2237 (2009)
[41] Schmidt, K.-U.: On cosets of the generalized first-order Reed–Muller code with
low PMEPR. IEEE Trans. Inf. Theory 52(7), 3220–3232 (2006)
[42] Schmidt, K.-U.: Complementary sets, generalized Reed–Muller codes, and power
control for OFDM. IEEE Trans. Inf. Theory 53(2), 808–814 (2007)
[43] Moreno, O., Castro, F.N.: Divisibility properties for covering radius of certain
cyclic codes. IEEE Trans. Inf. Theory 49(12), 3299–3303 (2003)
[44] Williams, K.S.: Note on cubics over GF($2^n$) and GF($3^n$). Journal of Number
Theory 7(4), 361–365 (1975)
[45] Berlekamp, E.R., Rumsey, H., Solomon, G.: On the solution of algebraic equations
over finite fields. Information and Control 10(6), 553–564 (1967)