
2024 CInvestigation of The Permutation

This paper investigates the cryptographic properties of the Welch APN function and its associated permutation polynomial, revealing favorable characteristics through the analysis of differential and Walsh spectra. Additionally, it explores four families of binary linear codes related to the Welch function, proposing efficient algebraic decoding methods that significantly reduce complexity. The findings contribute to the understanding of constructing linear codes from nonlinear functions and improving decoding techniques in coding theory.

Uploaded by

Chunlei Li

Investigation of the permutation and linear codes from the Welch APN function


Tor Helleseth^1, Chunlei Li^1, Yongbo Xia^2*
1 Department of Informatics, University of Bergen, Bergen, N-5008, Norway.
2* Department of Mathematics and Statistics, South-Central University, Minzu Street, Wuhan, 430074, China.

*Corresponding author(s). E-mail(s): [email protected];


Contributing authors: [email protected]; [email protected];

Abstract
Dobbertin in 1999 proved that the Welch power function $x^{2^m+3}$ is almost perfect nonlinear (APN) over the finite field $\mathbb{F}_{2^{2m+1}}$, where $m$ is a positive integer. In his proof, Dobbertin showed that the APNness of $x^{2^m+3}$ essentially relies on the bijectivity of the polynomial $g(x) = x^{2^{m+1}+1} + x^3 + x$ over $\mathbb{F}_{2^{2m+1}}$. In this paper, we first determine the differential and Walsh spectra of the permutation polynomial $g(x)$, revealing its favourable cryptographic properties. We then explore four families of binary linear codes related to the Welch APN power function. For two cyclic codes among them, we propose algebraic decoding methods that significantly outperform existing methods in terms of decoding complexity.

Keywords: Permutation, Differential spectrum, Walsh spectrum, Linear codes, Cyclic Codes, Algebraic Decoding

1 Introduction
Let $\mathbb{F}_{2^n}$ denote the finite field of $2^n$ elements and $\mathbb{F}_{2^n}^*$ be its multiplicative group. Nonlinear functions over $\mathbb{F}_{2^n}$ have wide applications in cryptography and coding theory. In symmetric cryptography, block ciphers are designed by appropriate compositions of linear permutations and S-boxes, the latter being the only nonlinear component. Hence the cryptographic properties of the nonlinear S-boxes are crucial to the security of the ciphers. Differential and linear attacks [1, 2] are two of the most powerful cryptographic attacks against block ciphers, and the link between these two approaches was investigated in [3]. To ensure good resistance to differential attacks, the differential uniformity of the nonlinear function used in an S-box should be low. The lowest possible differential uniformity is 2, and functions with this property are called almost perfect nonlinear (APN) functions. There has been much work and progress on APN functions; see, for example, [4] and [5]. The nonlinearity quantifies the resistance of a function to the linear attack: the higher the nonlinearity, the better the resistance. Besides the differential uniformity and the nonlinearity, there are other cryptographic criteria that measure the resistance of nonlinear functions to various known attacks. For further details about this topic, the reader is referred to [4, 6] and references therein. The study of cryptographically significant functions over the past decades shows that it is difficult to design a function meeting all good cryptographic criteria, and trade-offs must be considered.
Linear codes, particularly cyclic codes, have wide applications in reliable data
storage and communications. In coding theory, one of the most important topics is to
construct linear codes with desirable properties and to explore efficient decoding for
them. Constructing linear codes from nonlinear functions was extensively explored
in the past decades [7–10], and many optimal linear codes have been obtained from
cryptographically significant functions [11–15], such as perfect nonlinear functions,
almost perfect nonlinear functions, bent functions and plateaued functions. In those
works, the minimum distances and weight distributions of the constructed codes and
their duals were intensively studied (see for instance a recent survey by Li and Mes-
nager [10]). There are other parameters of linear codes, such as the covering radius
[16] and coset weight distribution [17], that are of fundamental interest, particularly
when evaluating the performance of linear codes in error correction. Nevertheless, due
to their intractability, there has been limited research progress on such topics. It
is well known that the problem of random syndrome decoding is NP-complete [18].
There do exist certain linear codes with efficient decoding. For instance, BCH codes,
due to their special property, allow for efficient decoding with polynomial-time com-
plexity [19]. However, efficiently decoding non-BCH cyclic codes remains a significant
open problem, despite recent efforts to develop decoders for generic cyclic codes by
investigating generalized error-locator polynomials [20–22].
In this paper, we first investigate important cryptographic properties, namely, the differential spectrum and Walsh spectrum, of the permutation polynomial $g(x) = x^{2^{m+1}+1} + x^3 + x$ over $\mathbb{F}_{2^{2m+1}}$, which we call the Welch permutation since it was used to prove the APNness of the Welch power function $F(x) = x^{2^m+3}$ [23]. In the second part, we explore two families of cyclic codes and two families of linear codes that are closely related to the Welch power function. For the two binary cyclic codes, we propose efficient algebraic decoders with complexity in the order of $O(N(\log N)^3)$, where $N = 2^{2m+1} - 1$ is the code length. The second family of binary linear codes is shown to have at most five nonzero weights, which provides a partial resolution to the conjecture by Ding [13].
The remainder of this paper is organized as follows. Section 2 recalls basic def-
initions and auxiliary results. Section 3 determines the differential spectrum and
Walsh spectrum of g(x). Section 4 explores the properties and decoding of binary
codes derived from the Welch APN power function, and Section 5 summarizes our
contributions in this work.

2 Preliminaries
2.1 Cryptographic properties of vectorial Boolean functions
For a vectorial Boolean function $F(x)$ from $\mathbb{F}_{2^n}$ to $\mathbb{F}_{2^n}$, denote

$$N_F(a, b) = |\{x \in \mathbb{F}_{2^n} \mid F(x + a) + F(x) = b\}|. \qquad (1)$$

The differential uniformity of $F(x)$ is defined by

$$\Delta_F = \max\{N_F(a, b) \mid a \in \mathbb{F}_{2^n}^*,\ b \in \mathbb{F}_{2^n}\}.$$

Nyberg defined a mapping $F(x)$ to be differentially $\delta$-uniform if $\Delta_F = \delta$ [24]. It is clear that the solutions of the equation $F(x + a) + F(x) = b$ come in pairs $\{x, x + a\}$. Thus, $\Delta_F = 2$ is the smallest possible value for the differential uniformity of $F(x)$. A function $F(x)$ is said to be almost perfect nonlinear (APN) if its differential uniformity is equal to 2. Equivalently, a function $F(x)$ is APN if its derivative function $D_aF(x) = F(x + a) + F(x)$, for any $a \in \mathbb{F}_{2^n}^*$, is a two-to-one function over $\mathbb{F}_{2^n}$.
Besides the differential uniformity, the differential spectrum of F (x) is also
an important notion for measuring its resistance against variants of differential
cryptanalysis [25–28]. Its definition is given as follows.
Definition 1. Let $F(x)$ be a function from $\mathbb{F}_{2^n}$ to itself and $N_F(a, b)$ be defined as in (1). Denote

$$\omega_i = |\{(a, b) \in \mathbb{F}_{2^n}^* \times \mathbb{F}_{2^n} \mid N_F(a, b) = i\}|.$$

The differential spectrum of $F(x)$ is defined as the multi-set of $N_F(a, b)$ for all $(a, b) \in \mathbb{F}_{2^n}^* \times \mathbb{F}_{2^n}$, which can be given by

$$\Omega_F = [\omega_0, \omega_1, \ldots, \omega_\delta], \qquad (2)$$

where $\delta$ is the differential uniformity of $F(x)$.

It is easily seen that $\omega_i = 0$ in the differential spectrum if $i$ is odd. Moreover, we have the following properties:

$$\sum_{i=0}^{\delta} \omega_i = 2^n(2^n - 1) \quad\text{and}\quad \sum_{i=0}^{\delta} (i \times \omega_i) = 2^n(2^n - 1). \qquad (3)$$

For any APN function over $\mathbb{F}_{2^n}$, there are only two possible values 0 and 2 in its differential spectrum. Thus, from the equalities in (3), the differential spectrum can be uniquely determined.
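As a small illustration of Definition 1 and the identities in (3), the following sketch computes the differential spectrum by exhaustive search. The field model ($\mathbb{F}_{2^5}$ built from $x^5 + x^2 + 1$), the test function $x^3$ (a standard APN example) and all function names are assumptions made for this example only.

```python
from collections import Counter

N = 5                 # field degree (assumption: small odd n for illustration)
POLY = 0b100101       # x^5 + x^2 + 1, irreducible over GF(2)
Q = 1 << N            # field size 2^n


def gf_mul(a, b):
    """Multiply two elements of GF(2^N), represented as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:      # degree reached N: reduce modulo POLY
            a ^= POLY
    return r


def differential_spectrum(F):
    """Return {i: omega_i}, counting pairs (a, b) with a != 0 and N_F(a, b) = i."""
    table = [F(x) for x in range(Q)]
    omega = Counter()
    for a in range(1, Q):
        # hits[b] = number of x with F(x + a) + F(x) = b
        hits = Counter(table[x ^ a] ^ table[x] for x in range(Q))
        for b in range(Q):
            omega[hits[b]] += 1
    return dict(omega)


def cube(x):
    """The power function x^3."""
    return gf_mul(gf_mul(x, x), x)


spectrum = differential_spectrum(cube)
uniformity = max(spectrum)
print(spectrum, uniformity)
```

For an APN function both identities in (3) force $\omega_0 = \omega_2 = 2^{n-1}(2^n - 1)$, which is what the exhaustive count returns here.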
Another important criterion of a vectorial Boolean function F (x) is its nonlinearity,
which can be given in terms of the Walsh transform of F (x).
Definition 2. The (extended) Walsh transform of a vectorial Boolean function $F(x)$ at $(a, b)$ is defined by

$$W_F(a, b) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(bF(x) + ax)},$$

where $a, b \in \mathbb{F}_{2^n}$. The (extended) Walsh spectrum of $F(x)$ is the multi-set

$$\Lambda_F = \{W_F(a, b) : a, b \in \mathbb{F}_{2^n},\ b \neq 0\}. \qquad (4)$$

The nonlinearity of $F$ is given by

$$NL(F) = 2^{n-1} - \frac{1}{2}\max\{|W_F(a, b)| : a, b \in \mathbb{F}_{2^n},\ b \neq 0\}.$$

Remark 1. Note that for a Boolean function $G(x)$ from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$, the extended Walsh transform reduces to the original Walsh-Hadamard transform

$$\widehat{G}(\lambda) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{G(x) + \mathrm{Tr}_1^n(\lambda x)}, \quad \lambda \in \mathbb{F}_{2^n}.$$

Next we recall some results about the Walsh transforms of quadratic Boolean functions. Given a quadratic Boolean function $Q(x)$ from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$, the function $B(x, z) = Q(x + z) + Q(x) + Q(z)$ is a bilinear function in $x$ and $z$. When $x, z$ are expressed as vectors in $\mathbb{F}_2^n$, the bilinear function can be written as $B(x, z) = xBz^\top$, where $B$ is the $n \times n$ symplectic matrix of $Q(x)$ satisfying $B(i, j) = 1$ for $1 \le i, j \le n$ if and only if the multivariate form of $Q(x)$ contains the term $x_ix_j$. The rank of $Q(x)$ is defined as the rank of its symplectic matrix $B$. Let

$$V_Q = \{x \in \mathbb{F}_{2^n} \mid Q(x + z) + Q(x) + Q(z) = 0,\ \forall z \in \mathbb{F}_{2^n}\}.$$

By the rank-nullity theorem we have $\dim_{\mathbb{F}_2}(V_Q) + \mathrm{Rank}(Q) = n$. Note that

$$\Big(\sum_{x \in \mathbb{F}_{2^n}} (-1)^{Q(x)}\Big)^2 = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{Q(x)} \sum_{z \in \mathbb{F}_{2^n}} (-1)^{Q(x+z)+Q(x)+Q(z)} = 2^n \sum_{x \in V_Q} (-1)^{Q(x)},$$

where $V_Q$ is the $\mathbb{F}_2$-linear space defined as above. It is readily seen that $Q(x)$ is linear over $V_Q$. Hence one has

$$\sum_{x \in \mathbb{F}_{2^n}} (-1)^{Q(x)} = \begin{cases} \pm 2^{\,n - \mathrm{Rank}(Q)/2}, & \text{if } Q(x) = 0 \text{ for any } x \in V_Q, \\ 0, & \text{otherwise.} \end{cases}$$

Moreover, when $\lambda$ runs through $\mathbb{F}_{2^n}$, the distribution of the Walsh transform $\widehat{Q}(\lambda)$ can be given as follows.
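Lemma 1. [29, Theorem 6.2] Let $Q(x)$ be a quadratic form from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$ with rank $2h$. Then its Walsh transform has the following distribution:

$$\widehat{Q}(\lambda) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{Q(x) + \mathrm{Tr}_1^n(\lambda x)} = \begin{cases} \pm 2^{n-h}, & 2^{2h-1} \pm 2^{h-1} \text{ times}, \\ 0, & 2^n - 2^{2h} \text{ times.} \end{cases}$$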
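To make Lemma 1 concrete, here is a brute-force sketch for the quadratic form $Q(x) = \mathrm{Tr}_1^5(x^3)$ over $\mathbb{F}_{2^5}$. This example function, the field model ($x^5 + x^2 + 1$) and the helper names are assumptions chosen for illustration. The radical $V_Q$ of this form is $\mathbb{F}_2$, so its rank is $2h = 4$ and Lemma 1 predicts the values $\pm 2^3$ occurring $2^3 \pm 2$ times and $0$ occurring $2^5 - 2^4$ times.

```python
from collections import Counter

N = 5
POLY = 0b100101       # x^5 + x^2 + 1 (assumed field model)
Q_SIZE = 1 << N


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) given as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q_SIZE:
            a ^= POLY
    return r


def trace(a):
    """Absolute trace Tr(a) = a + a^2 + ... + a^(2^(N-1)); the result is 0 or 1."""
    t, y = 0, a
    for _ in range(N):
        t ^= y
        y = gf_mul(y, y)
    return t


def quad_form(x):
    """Q(x) = Tr(x^3)."""
    return trace(gf_mul(gf_mul(x, x), x))


def walsh_hadamard(Qf):
    """List of sum_x (-1)^(Qf(x) + Tr(lam * x)) for every lam in the field."""
    return [
        sum((-1) ** (Qf(x) ^ trace(gf_mul(lam, x))) for x in range(Q_SIZE))
        for lam in range(Q_SIZE)
    ]


distribution = dict(Counter(walsh_hadamard(quad_form)))
print(distribution)
```

The positive value is the more frequent one because the transform values sum to $2^n(-1)^{Q(0)} = 2^n$.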

2.2 Linear codes from nonlinear functions


In this section we recall basics of linear codes and the two generic constructions for
linear codes from nonlinear functions. Below we focus only on binary linear codes
while the basics are valid for linear codes over finite fields in general [19, 30].

Basics of linear codes


An $[N, k, d]$ binary linear code $C$ is a $k$-dimensional subspace of $\mathbb{F}_2^N$ with minimum (Hamming) weight $d$. The code $C$ can be defined either by its generator matrix $G$ as $C = \{xG \mid x \in \mathbb{F}_2^k\}$ or by its parity-check matrix $H$ as $C = \{c \in \mathbb{F}_2^N \mid cH^\top = 0\}$. The dual code $C^\perp$ is given by $C^\perp = \{x \in \mathbb{F}_2^N \mid x_1c_1 + \cdots + x_Nc_N = 0,\ \forall (c_1, \ldots, c_N) \in C\}$, which has the parity-check matrix $H$ of $C$ as its generator matrix. For a received vector $y = c + e$ with certain codeword $c \in C$ and error vector $e \in \mathbb{F}_2^N$, the syndrome equation $s = yH^\top = eH^\top$ associates the error $e$ with a coset of $C$ in $\mathbb{F}_2^N$. The coset leader for each coset is defined as the element with minimum weight in the coset. A binary linear code $C$ is called cyclic if for any $c = (c_1, \ldots, c_N) \in C$, its cyclic shift $\sigma(c) = (c_N, c_1, \ldots, c_{N-1})$ is contained in $C$. An $[N, k]$ binary cyclic code $C$ can be equivalently seen as an ideal in $\mathbb{F}_2[x]/(x^N - 1)$. In this way, a binary cyclic code $C$ can be uniquely defined by a binary monic polynomial $g(x)$ dividing $x^N - 1$, known as the generator polynomial of $C$. Equivalently, the code $C = \langle g(x) \rangle$ can be uniquely given by its complete defining set $S_C = \{i : g(\alpha^i) = 0,\ 0 \le i < N\}$, where $\alpha$ is a primitive $N$-th root of unity. Since $g(\alpha^i) = 0$ if and only if $g(\alpha^{2i}) = 0$ for any $0 \le i < N$, the set $S_C$ is usually partitioned into disjoint cyclotomic cosets modulo $N$. A subset of $S_C$ that consists of one coset leader from each coset in $S_C$ uniquely defines $C$, and is therefore termed the primary defining set of $C$. When the (complete) defining set $S_C$ contains $d - 1$ consecutive integers, the cyclic code $C$ has minimum distance at least $d$ according to the BCH bound [19].
Let $C$ be a binary linear code of length $N$ and minimum weight $d$. The space $\mathbb{F}_2^N$ can then be partitioned into cosets with respect to $C$. For each coset, the coset leader is defined as one element with minimum weight in the coset. When the minimum weight of a coset is no greater than $\lfloor \frac{d-1}{2} \rfloor$, it has a unique coset leader; when its minimum weight is larger, a coset may have several elements with the minimum weight, indicating that the coset leader is not unique. For a received vector $y = c + e$ with certain codeword $c \in C$ and error vector $e \in \mathbb{F}_2^N$, the syndrome $s = yH^\top = eH^\top$ associates the error $e$ with a coset of $C$ in $\mathbb{F}_2^N$. In particular, the case $s = 0$ corresponds to the code $C$ itself, of which the coset leader is the zero vector. This indicates that when a codeword $c$ is transmitted and the received vector $y = c + e$ is another codeword of $C$, the process of error detection by the parity-check equation $s = yH^\top$ fails. The probability of the detection failure of the code $C$ can be expressed in terms of its weight distribution, which is defined as $(A_0, A_1, \ldots, A_N)$, where $A_i$ denotes the number of codewords with Hamming weight $i$ in the code $C$; obviously $A_0 = 1$. Thanks to the MacWilliams identity, the weight distribution of $C$ can be derived from the weight distribution $(1, B_1, \ldots, B_N)$ of its dual $C^\perp$.
A nonzero syndrome $s = yH^\top = eH^\top$ belongs to a coset with a nonzero coset leader. The corresponding coset leader has the same syndrome as $e$, and it will be deemed as the error $e$ added to the received vector $y$, since the coset leader has the minimum weight. The process can uniquely correct the error $e$ when its weight is within the packing radius of $C$ given by $t = \lfloor \frac{d(C)-1}{2} \rfloor$, for which the coset leader is unique; when an error $e$ has weight beyond the packing radius $t$, it is likely that the corresponding coset no longer has a unique coset leader. In this case, the error $e$ cannot be uniquely decoded and the decoder may fail to return a correct codeword. The performance of the aforementioned error correction procedure can be evaluated in terms of the weight distributions of cosets [19]. Unfortunately, a complete picture of the weight distributions of all cosets is intractable. Instead, some attempts have been made at calculating the coset distribution $(1, K_1, K_2, \ldots, K_N)$ of the linear code $C$, where $K_i$ denotes the number of coset leaders with weight $i$ [17].
The largest weight of coset leaders of $C$ is known as the covering radius of $C$, which is defined by $\rho(C) = \max\{\min\{d(y, c) : c \in C\} : y \in \mathbb{F}_2^N\}$. The covering radius of $C$ is a basic geometric parameter, which is a measure of the maximum distortion when $C$ is used for data compression, and is the maximum weight of a correctable random error when $C$ is used for error correction [16]. It is clear that the covering radius of a code is lower bounded by its packing radius $t$. Equality is achieved by perfect codes. In addition, a linear code $C$ is called quasi-perfect if $\rho(C) = t + 1$; and a quasi-perfect code is called a uniformly packed code if $\rho(C)$ is the same as the external distance of $C$, which is the number of nonzero weights in its dual $C^\perp$.
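The parameters recalled above can be computed by exhaustive enumeration for a toy code. The sketch below uses a $[7, 4]$ Hamming code (a perfect code, so $\rho = t$); the particular generator matrix and the variable names are assumptions made for illustration, and the approach is only feasible for very small lengths.

```python
from itertools import product

# Systematic generator matrix of a [7, 4] Hamming code (illustrative choice).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def codewords(gen):
    """Enumerate all codewords as integers (bit masks of the given length)."""
    length = len(gen[0])
    rows = [int("".join(map(str, row)), 2) for row in gen]
    words = []
    for coeffs in product([0, 1], repeat=len(gen)):
        w = 0
        for c, r in zip(coeffs, rows):
            if c:
                w ^= r
        words.append(w)
    return words, length


def weight(v):
    """Hamming weight of a bit mask."""
    return bin(v).count("1")


words, length = codewords(G)

# Weight distribution (A_0, ..., A_N).
weight_dist = [0] * (length + 1)
for w in words:
    weight_dist[weight(w)] += 1

d = min(weight(w) for w in words if w)     # minimum distance
t = (d - 1) // 2                           # packing radius
# Covering radius: largest distance from any vector to the code.
rho = max(min(weight(y ^ c) for c in words) for y in range(1 << length))

print(weight_dist, d, t, rho)
```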

Generic construction 1.
Let $F$ be a function from $\mathbb{F}_{2^n}$ to itself with $F(0) = 0$, and $\beta$ be a primitive element of $\mathbb{F}_{2^n}$. A binary linear code $C$ of length $2^n - 1$ can be constructed from $F$ via the following parity-check matrix

$$H = \begin{bmatrix} 1 & \beta & \beta^2 & \cdots & \beta^{2^n-2} \\ F(1) & F(\beta) & F(\beta^2) & \cdots & F(\beta^{2^n-2}) \end{bmatrix}, \qquad (5)$$

where each symbol stands for the column of its coordinates with respect to a basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$. It is easy to verify that the dual code $C^\perp$ is given by

$$C^\perp = \left\{ \left(\mathrm{Tr}_1^n(ax + bF(x))\right)_{x \in \mathbb{F}_{2^n}^*} : a, b \in \mathbb{F}_{2^n} \right\}.$$

For the nonlinear function $F$, the code $C$ has dimension $2^n - 1 - 2n$. In particular, when $F(x)$ is a power function $x^d$, the code $C$ is a cyclic code with primary defining set $\{1, d\}$. This generic construction has a long history and pertains to Delsarte's Theorem [7]. Note that for the dual code $C^\perp$, the Hamming weight of a codeword $c_{a,b} \in C^\perp$ is given by

$$\begin{aligned} wt(c_{a,b}) &= 2^n - 1 - \#\{x \in \mathbb{F}_{2^n}^* : \mathrm{Tr}_1^n(ax + bF(x)) = 0\} \\ &= 2^{n-1} - \frac{1}{2}\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(ax + bF(x))} = 2^{n-1} - \frac{1}{2}W_F(a, b). \end{aligned} \qquad (6)$$

Therefore, the weight distribution of $C^\perp$ can be directly derived from the extended Walsh spectrum of $F(x)$ given by $\{W_F(a, b) : a, b \in \mathbb{F}_{2^n}\}$. This relation has led to a well-established coding-theoretic characterization of APN functions and almost bent (AB) functions [8].
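The relation (6) between codeword weights in $C^\perp$ and the Walsh transform can be checked numerically. The sketch below does so for the example $F(x) = x^3$ over $\mathbb{F}_{2^5}$; the choice of $F$, the field model ($x^5 + x^2 + 1$) and the helper names are assumptions made for this illustration.

```python
N = 5
POLY = 0b100101       # x^5 + x^2 + 1 (assumed field model)
Q = 1 << N


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) given as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r


def trace(a):
    """Absolute trace from GF(2^N) to GF(2); the result is 0 or 1."""
    t, y = 0, a
    for _ in range(N):
        t ^= y
        y = gf_mul(y, y)
    return t


TR = [trace(x) for x in range(Q)]
F = [gf_mul(gf_mul(x, x), x) for x in range(Q)]   # F(x) = x^3, F(0) = 0


def walsh(a, b):
    """W_F(a, b) = sum over x of (-1)^Tr(b*F(x) + a*x)."""
    return sum(1 - 2 * TR[gf_mul(a, x) ^ gf_mul(b, F[x])] for x in range(Q))


def codeword_weight(a, b):
    """Hamming weight of c_{a,b} = (Tr(a*x + b*F(x))) over nonzero x."""
    return sum(TR[gf_mul(a, x) ^ gf_mul(b, F[x])] for x in range(1, Q))


weights = set()
identity_holds = True
for a in range(Q):
    for b in range(Q):
        w = codeword_weight(a, b)
        weights.add(w)
        # Equation (6), multiplied by 2 to stay in integers.
        if 2 * w != Q - walsh(a, b):
            identity_holds = False

print(identity_holds, sorted(weights))
```

For this example the dual code has exactly three nonzero weights, $2^{n-1}$ and $2^{n-1} \pm 2^{(n-1)/2}$, as expected for an AB function.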
Theorem 1. [8] Let $F$ be a function from $\mathbb{F}_{2^n}$ to itself with $F(0) = 0$ and $n$ odd. Let the code $C$ be defined by a parity-check matrix $H$ as in (5). Then $F(x)$ is an APN function if and only if the code $C$ has minimum distance 5. Furthermore, $F(x)$ is an AB function if and only if $C$ is a $[2^n - 1, 2^n - 1 - 2n]$ uniformly packed code with minimum distance 5 and covering radius 3.

Generic construction 2.
Let $D = \{d_1, d_2, \ldots, d_\ell\}$ be a subset of $\mathbb{F}_{2^n}$. A binary linear code having $D$ as its defining set is given by

$$C_D = \left\{ c_a = \left(\mathrm{Tr}_1^n(ad_1), \mathrm{Tr}_1^n(ad_2), \ldots, \mathrm{Tr}_1^n(ad_\ell)\right) : a \in \mathbb{F}_{2^n} \right\}.$$

It is clear that the code $C_D$ has length $\ell$ and dimension at most $n$.
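A minimal sketch of this construction, assuming the toy field $\mathbb{F}_{2^3}$ (built from $x^3 + x + 1$) and the defining set $D = \mathbb{F}_{2^3}^*$; both choices and all names are illustrative. With this $D$ the construction yields a one-weight (simplex) code of length 7.

```python
from collections import Counter

N = 3
POLY = 0b1011         # x^3 + x + 1 (assumed field model)
Q = 1 << N


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) given as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r


def trace(a):
    """Absolute trace from GF(2^N) to GF(2); the result is 0 or 1."""
    t, y = 0, a
    for _ in range(N):
        t ^= y
        y = gf_mul(y, y)
    return t


def code_from_defining_set(D):
    """All codewords c_a = (Tr(a*d_1), ..., Tr(a*d_l)) for a in the field."""
    return [tuple(trace(gf_mul(a, d)) for d in D) for a in range(Q)]


D = list(range(1, Q))                      # defining set: all nonzero elements
code = code_from_defining_set(D)
weight_counts = dict(Counter(sum(c) for c in code))
print(weight_counts)
```

Other defining sets, such as the image or support of a nonlinear function, can be passed to `code_from_defining_set` in the same way.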


When the defining set D is properly chosen, the code CD can have good or opti-
mal parameters. The above construction is generic in the sense that all linear codes
could be produced by selecting proper defining sets $D$. By considering defining sets $D$ as the support or image of certain functions $F$ over $\mathbb{F}_2^n$, researchers have proposed many families of few-weight linear codes with new code lengths; see, e.g., [9, 13, 31–34]. Interested readers may refer to a recent survey by Li and Mesnager [10] and references therein for more details about these two generic approaches.

3 Differential and Walsh spectra of the Welch permutation
For the Welch permutation $g(x) = x^{2^{m+1}+1} + x^3 + x$ over $\mathbb{F}_{2^n}$ with $n = 2m + 1$, this section will determine the differential spectrum $\Omega_g$ as defined in (2) and the Walsh spectrum $\Lambda_g$ as defined in (4).

Theorem 2. Let $n = 2m + 1$ and $g(x) = x^{2^{m+1}+1} + x^3 + x$. Then the function $g(x)$ over $\mathbb{F}_{2^n}$ is differentially 4-uniform. Furthermore, its differential spectrum is given by

$$\omega_0 = 2^{2n-1} + 2^{2n-3} - 3 \cdot 2^{n-2}, \quad \omega_2 = 2^{2n-2}, \quad \omega_4 = 2^{2n-3} - 2^{n-2}.$$
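As a sanity check of Theorem 2 in one small case, the sketch below computes the differential spectrum of $g$ exhaustively for $m = 2$, $n = 5$ and compares it with the stated formulas. The field model ($x^5 + x^2 + 1$) and all helper names are assumptions of this example; since $g$ has coefficients in $\mathbb{F}_2$, the result does not depend on the choice of irreducible polynomial.

```python
from collections import Counter

M = 2
N = 2 * M + 1         # n = 2m + 1 = 5
POLY = 0b100101       # x^5 + x^2 + 1 (assumed field model)
Q = 1 << N


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) given as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^N)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def g(x):
    """Welch permutation x^(2^(m+1)+1) + x^3 + x."""
    return gf_pow(x, (1 << (M + 1)) + 1) ^ gf_pow(x, 3) ^ x


table = [g(x) for x in range(Q)]
is_permutation = len(set(table)) == Q

omega = Counter()
for a in range(1, Q):
    hits = Counter(table[x ^ a] ^ table[x] for x in range(Q))
    for b in range(Q):
        omega[hits[b]] += 1
spectrum = dict(omega)

# Values predicted by Theorem 2.
predicted = {
    0: 2 ** (2 * N - 1) + 2 ** (2 * N - 3) - 3 * 2 ** (N - 2),
    2: 2 ** (2 * N - 2),
    4: 2 ** (2 * N - 3) - 2 ** (N - 2),
}
print(is_permutation, spectrum, predicted)
```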


 

Proof. For $(a, b) \in \mathbb{F}_{2^n}^* \times \mathbb{F}_{2^n}$, let $N(a, b)$ be the number of solutions of the derivative equation $g(x + a) + g(x) = b$ in $\mathbb{F}_{2^n}$. Note that

$$\begin{aligned} g(x + a) + g(x) + b &= x^{2^{m+1}}a + xa^{2^{m+1}} + a^{2^{m+1}+1} + x^2a + xa^2 + a^3 + a + b \\ &= ax^{2^{m+1}} + ax^2 + (a^{2^{m+1}} + a^2)x + g(a) + b. \end{aligned}$$

Since $a \neq 0$, the equation $g(x + a) + g(x) + b = 0$ is equivalent to

$$x^{2^{m+1}} + x^2 + cx + d = 0, \qquad (7)$$

where

$$c = a^{2^{m+1}-1} + a \quad\text{and}\quad d = \frac{g(a) + b}{a}. \qquad (8)$$

Note that $c = 0$ if and only if $a = 1$. Next we consider the following equation

$$x^{2^{m+1}} + x^2 + cx = 0. \qquad (9)$$

If $c = 0$, i.e., $a = 1$, then (9) has two solutions in $\mathbb{F}_{2^n}$, namely 0 and 1. If $c \neq 0$, i.e., $a \notin \mathbb{F}_2$, then by raising (9) to the power $2^m$, we get

$$x + x^{2^{m+1}} + c^{2^m}x^{2^m} = 0. \qquad (10)$$

Adding up (9) and (10), we get

$$c^{2^m}x^{2^m} + x^2 + (c + 1)x = 0,$$

which implies

$$x^{2^m} = \frac{x^2}{c^{2^m}} + \frac{c + 1}{c^{2^m}}x. \qquad (11)$$

Substituting (11) into (10), we get

$$x^4 + (c^{2^{m+1}} + c^2 + 1)x^2 + c^{2^{m+1}+1}x = 0. \qquad (12)$$

The above arguments show that when $c \neq 0$, the solutions of (9) must be those of (12). Note that the left-hand side of (12) is a linearized polynomial over $\mathbb{F}_{2^n}$ and it may have 1, 2 or 4 roots in $\mathbb{F}_{2^n}$. Thus, the equation (9) may also have 1, 2 or 4 solutions in $\mathbb{F}_{2^n}$. Moreover, note that

$$c = a^{2^{m+1}-1} + a = \frac{a^{2^{m+1}} + a^2}{a}.$$

Besides $x = 0$, for any given $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, it can be observed that $x = a$ must be a solution of (9). Thus, when $c \neq 0$, i.e., $a \notin \mathbb{F}_2$, the number of solutions of (9) in $\mathbb{F}_{2^n}$ is 2 or 4.
Denote by $M_1$ (resp. $M_2$) the number of $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$ such that (9) has two (resp. four) solutions in $\mathbb{F}_{2^n}$. In what follows, we need to determine $M_1$ and $M_2$. We further investigate the equation (12). Since $x = 0$ and $x = a$ are its solutions, the polynomial on the left-hand side of (12) has a factorization over $\mathbb{F}_{2^n}$ as follows:

$$x^4 + (c^{2^{m+1}} + c^2 + 1)x^2 + c^{2^{m+1}+1}x = x(x + a)\left(x^2 + ax + \frac{c^{2^{m+1}+1}}{a}\right),$$

where $c = \frac{a^{2^{m+1}} + a^2}{a}$. By verifying that $a^2 + \frac{c^{2^{m+1}+1}}{a} = c^{2^{m+1}} + c^2 + 1$, we can check the validity of the above factorization. To determine the exact number of solutions to (9), we should investigate the solutions of the following quadratic equation

$$x^2 + ax + \frac{c^{2^{m+1}+1}}{a} = 0. \qquad (13)$$
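Note that

$$\begin{aligned} \mathrm{Tr}_1^n\left(\frac{c^{2^{m+1}+1}}{a^3}\right) &= \mathrm{Tr}_1^n\left(\frac{a^{2^{m+2}} + a^2}{a^{2^{m+1}}} \cdot \frac{a^{2^{m+1}} + a^2}{a^4}\right) \\ &= \mathrm{Tr}_1^n\left(\frac{a^4 + a^2 \cdot a^{2^{m+1}} + a^{2^{m+2}} \cdot a^{2^{m+1}} + a^2 \cdot a^{2^{m+2}}}{a^{2^{m+1}} \cdot a^4}\right) \\ &= \mathrm{Tr}_1^n\left(\frac{1}{a^{2^{m+1}}} + \frac{1}{a^2} + \frac{a^{2^{m+2}}}{a^4} + \frac{a^{2^{m+1}}}{a^2}\right) \\ &= \mathrm{Tr}_1^n\left(\frac{1}{a}\right) + \mathrm{Tr}_1^n\left(\frac{1}{a}\right) + \mathrm{Tr}_1^n\left(\frac{a^{2^{m+1}}}{a^2}\right) + \mathrm{Tr}_1^n\left(\frac{a^{2^{m+1}}}{a^2}\right) \\ &= 0. \end{aligned}$$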
Thus, (13) has two solutions in $\mathbb{F}_{2^n}$. This also shows that for any $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, (12) always has four solutions in $\mathbb{F}_{2^n}$. By Theorem 1 in [35], one can get the solutions of (13), which are

$$x_1 = a\sum_{i=1}^{m} \left(\frac{c^{2^{m+1}+1}}{a^3}\right)^{2^{2i-1}} \quad\text{and}\quad x_2 = x_1 + a.$$

Next we should verify whether $x_1$ is a solution of (9) or not. If $x_1$ is a solution of (9), then so is $x_2$.
Let $y = \frac{x_1}{a}$; then (13) becomes

$$y^2 + y + \frac{c^{2^{m+1}+1}}{a^3} = 0. \qquad (14)$$

If $x_1$ is a solution of (9), we have

$$y^{2^{m+1}} + \frac{a^2}{a^{2^{m+1}}}y^2 + \frac{ca}{a^{2^{m+1}}}y = 0. \qquad (15)$$

Substituting (14) into (15), we have

$$y^{2^{m+1}} + y + \left(\frac{c}{a}\right)^{2^{m+1}+1} = 0. \qquad (16)$$

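On the other hand, by (14) we have

$$\begin{aligned} y^{2^{m+1}} + y &= \sum_{i=0}^{m} (y^2 + y)^{2^i} \\ &= \sum_{i=0}^{m} \left(\frac{c^{2^{m+1}+1}}{a^3}\right)^{2^i} \\ &= \sum_{i=0}^{m} \left(\frac{1}{a^{2^{m+1}}} + \frac{1}{a^2} + \frac{a^{2^{m+2}}}{a^4} + \frac{a^{2^{m+1}}}{a^2}\right)^{2^i} \\ &= \sum_{i=0}^{m} \left(\frac{1}{a^2} + \left(\frac{1}{a^2}\right)^{2^m}\right)^{2^i} + \frac{a^{2^{m+1}}}{a^2} + \left(\frac{a^{2^{m+1}}}{a^2}\right)^{2^{m+1}} \\ &= \mathrm{Tr}_1^n\left(\frac{1}{a^2}\right) + \frac{1}{a^{2^{m+1}}} + \frac{a^{2^{m+1}}}{a^2} + \left(\frac{a^{2^{m+1}}}{a^2}\right)^{2^{m+1}} \\ &= \mathrm{Tr}_1^n\left(\frac{1}{a^2}\right) + \frac{1}{a^{2^{m+1}}} + \frac{a^{2^{m+1}}}{a^2} + \frac{a^2}{a^{2^{m+2}}} \\ &= \mathrm{Tr}_1^n\left(\frac{1}{a^2}\right) + 1 + \left(\frac{a^{2^{m+1}} + a^2}{a^2}\right)^{2^{m+1}} \cdot \frac{a^{2^{m+1}} + a^2}{a^2} \\ &= \mathrm{Tr}_1^n\left(\frac{1}{a^2}\right) + 1 + \left(\frac{c}{a}\right)^{2^{m+1}+1}. \end{aligned} \qquad (17)$$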

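By (17) and (16), we can conclude that for each $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, the solution $x_1$ of (13) is also a solution of (9) if and only if $\mathrm{Tr}_1^n\left(\frac{1}{a^2}\right) = \mathrm{Tr}_1^n\left(\frac{1}{a}\right) = 1$. This means that for each $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$, (9) has two (resp. four) solutions in $\mathbb{F}_{2^n}$ if and only if $\mathrm{Tr}_1^n\left(\frac{1}{a}\right) = 0$ (resp. $\mathrm{Tr}_1^n\left(\frac{1}{a}\right) = 1$). It is obvious that the number of $a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2$ such that $\mathrm{Tr}_1^n\left(\frac{1}{a}\right) = 0$ (resp. $\mathrm{Tr}_1^n\left(\frac{1}{a}\right) = 1$) is equal to $2^{n-1} - 1$. Thus, we obtain that $M_1 = M_2 = 2^{n-1} - 1$.
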

For each given $a \in \mathbb{F}_{2^n}^*$, denote the linearized polynomial on the left-hand side of (9) by $L_a(x)$. Then, $L_a(x)$ is a linear transformation from the vector space $\mathbb{F}_{2^n}$ to itself. Let $A_i = \{a \in \mathbb{F}_{2^n} \setminus \mathbb{F}_2 \mid \mathrm{Tr}_1^n(\frac{1}{a}) = i\}$, where $i = 0, 1$. Then $\mathbb{F}_{2^n}^* = \{1\} \cup A_0 \cup A_1$. The above arguments have shown that the kernel of $L_a(x)$, denoted by $\ker L_a$, contains two elements of $\mathbb{F}_{2^n}$ if $a \in \{1\} \cup A_0$ and four elements if $a \in A_1$. Note that the linear transformation $L_a(x)$ is also a homomorphism from the additive group of $\mathbb{F}_{2^n}$ to itself. Then by the homomorphism theorem, the image of $L_a(x)$ has cardinality $\frac{2^n}{|\ker L_a|} = 2^{n-1}$ if $a \in \{1\} \cup A_0$ and has cardinality $2^{n-2}$ if $a \in A_1$. Moreover, for each element $d$ in the image of $L_a(x)$, there exist exactly $|\ker L_a|$ elements $x$ in $\mathbb{F}_{2^n}$ such that $L_a(x) = d$.
For each $a \in \mathbb{F}_{2^n}^*$, let $B_a$ denote the image of the linear transformation $L_a(x) = x^{2^{m+1}} + x^2 + cx$. We have obtained that $|B_a| = 2^{n-1}$ if $a \in \{1\} \cup A_0$ and $|B_a| = 2^{n-2}$ if $a \in A_1$. By (8), for a given element $a \in \mathbb{F}_{2^n}^*$, the correspondence between $d$ and $b$ is one-to-one. Recall that $N(a, b)$ denotes the number of solutions of (7) in $\mathbb{F}_{2^n}$. Thus, we can conclude that for each $a \in \{1\} \cup A_0$ (resp. $a \in A_1$), $N(a, b) = 2$ (resp. 4) if and only if $b \in aB_a + g(a) = \{ad + g(a) \mid d \in B_a\}$. In all other cases, $N(a, b) = 0$. Thus, the number of pairs $(a, b) \in \mathbb{F}_{2^n}^* \times \mathbb{F}_{2^n}$ such that $N(a, b) = 2$ (resp. 4) is equal to $2^{n-1} \cdot 2^{n-1}$ (resp. $(2^{n-1} - 1) \cdot 2^{n-2}$). This together with (3) gives the differential spectrum of $g(x)$.

Note that $\mathrm{Tr}_1^n(ag(x)) = \mathrm{Tr}_1^n\left(a(x^{2^{m+1}+1} + x^3 + x)\right)$ is a quadratic Boolean function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. According to Lemma 1, the Walsh transform of $\mathrm{Tr}_1^n(ag(x))$ heavily depends on its rank. Below is an auxiliary result for the rank of $\mathrm{Tr}_1^n(ag(x))$.
Lemma 2. Let $s, n, l$ be positive integers satisfying $\gcd(s, n) = 1$ and let

$$Q(x) = \sum_{i=1}^{l} \mathrm{Tr}_1^n\left(c_ix^{2^{si}+1}\right),$$

where $c_i \in \mathbb{F}_{2^n}$ and at least one $c_i$ is nonzero for $1 \le i \le l$. Then, the rank $2h$ of $Q(x)$ is in the range $n - 2l \le 2h \le n$.
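Proof. We consider the following equation

$$\begin{aligned} Q(x) + Q(z) + Q(x + z) &= \mathrm{Tr}_1^n\left(\sum_{i=1}^{l} \left(c_ix^{2^{si}}z + c_ixz^{2^{si}}\right)\right) \\ &= \mathrm{Tr}_1^n\left(\sum_{i=1}^{l} \left(c_ix^{2^{si}}z + c_i^{2^{-is}}x^{2^{-is}}z\right)\right) \\ &= \mathrm{Tr}_1^n\left(z\sum_{i=1}^{l} \left(c_ix^{2^{si}} + c_i^{2^{-is}}x^{2^{-is}}\right)\right) \\ &= 0 \end{aligned}$$

for all $z \in \mathbb{F}_{2^n}$. The above equation holds if and only if

$$\sum_{i=1}^{l} \left(c_ix^{2^{si}} + c_i^{2^{-is}}x^{2^{-is}}\right) = 0,$$

which is equivalent to

$$\begin{aligned} \left(\sum_{i=1}^{l} \left(c_ix^{2^{si}} + c_i^{2^{-is}}x^{2^{-is}}\right)\right)^{2^{ls}} &= \sum_{i=1}^{l} \left(c_i^{2^{ls}}x^{2^{s(l+i)}} + c_i^{2^{s(l-i)}}x^{2^{s(l-i)}}\right) \\ &= \sum_{i=l+1}^{2l} c_{i-l}^{2^{ls}}x^{2^{si}} + \sum_{j=0}^{l-1} c_{l-j}^{2^{sj}}x^{2^{sj}} \\ &= 0. \end{aligned} \qquad (18)$$

We can rewrite (18) in the following form

$$\sum_{i=0}^{2l} a_ix^{2^{si}} = 0, \qquad (19)$$

where $a_i = c_{l-i}^{2^{si}}$ for $i = 0, 1, \ldots, l - 1$, $a_l = 0$ and $a_i = c_{i-l}^{2^{ls}}$ for $i = l + 1, l + 2, \ldots, 2l$. Since $\gcd(s, n) = 1$, according to [36, Corollary 1], the equation (19) has at most $2^{2l}$ solutions in $\mathbb{F}_{2^n}$. The solution set of (19) is the space $V_Q$, so $\dim_{\mathbb{F}_2}(V_Q) \le 2l$ and hence $2h = n - \dim_{\mathbb{F}_2}(V_Q) \ge n - 2l$. The desired result then follows.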
With Theorem 2 and Lemma 2, we are ready to prove the following theorem.
Theorem 3. Let $n = 2m + 1$ and $g(x) = x^{2^{m+1}+1} + x^3 + x$ be the Welch permutation of $\mathbb{F}_{2^n}$. Then the extended Walsh spectrum of $g(x)$ is given in Table 1.

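Table 1 Walsh spectrum of g(x)

Value            Frequency
$0$              $9 \cdot 2^{2n-4} + 3 \cdot 2^{n-3} - 1$
$\pm 2^{m+1}$    $\frac{5 \cdot 2^{n-1} - 2}{3}\left(2^{n-2} \pm 2^{\frac{n-3}{2}}\right)$
$\pm 2^{m+2}$    $\frac{2^{n-1} - 1}{3}\left(2^{n-4} \pm 2^{\frac{n-5}{2}}\right)$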

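Proof. It is easily seen that

$$W_g(0, b) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(bx)} = \begin{cases} 2^n, & \text{if } b = 0, \\ 0, & \text{if } b \neq 0. \end{cases}$$

When $a \neq 0$, $W_g(a, b) = \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n\left(ax^{2^{m+1}+1} + ax^3 + (a+b)x\right)}$. Denote $\mathrm{Tr}_1^n\left(ax^{2^{m+1}+1} + ax^3\right)$ by $Q_a(x)$, which is a quadratic Boolean function from $\mathbb{F}_{2^n}$ to $\mathbb{F}_2$. Note that

$$Q_a(x) = \mathrm{Tr}_1^n\left(ax^{2^{m+1}+1} + ax^3\right) = \mathrm{Tr}_1^n\left(a^{2^m}x^{2^m+1} + a^{2^{2m}}x^{2^{2m}+1}\right).$$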

Then, by Lemma 2 and taking $s = m$ and $l = 2$, we can conclude that the rank of $Q_a(x)$ is $n - 3$ or $n - 1$. When $a$ runs through $\mathbb{F}_{2^n}^*$, assume that the number of $a \in \mathbb{F}_{2^n}^*$ such that $Q_a(x)$ has rank $n - (2i - 1)$ is $N_i$, $i = 1, 2$. Then, by Lemma 1, when $(a, b)$ runs through $\mathbb{F}_{2^n} \times \mathbb{F}_{2^n}$, the extended Walsh transform $W_g(a, b)$ of $g(x)$ has the following distribution

$$W_g(a, b) = \begin{cases} 0, & (2^n - 1) + N_1(2^n - 2^{n-1}) + N_2(2^n - 2^{n-3}) \text{ times}, \\ \pm 2^{m+1}, & N_1\left(2^{n-2} \pm 2^{\frac{n-3}{2}}\right) \text{ times}, \\ \pm 2^{m+2}, & N_2\left(2^{n-4} \pm 2^{\frac{n-5}{2}}\right) \text{ times.} \end{cases}$$

Next we calculate the fourth power sum of $W_g(a, b)$. On one hand, we have

$$\sum_{a, b \in \mathbb{F}_{2^n}} (W_g(a, b))^4 = 2^{4n} + 2^{4m+4} \cdot 2^{n-1} \cdot N_1 + 2^{4m+8} \cdot 2^{n-3} \cdot N_2. \qquad (20)$$

On the other hand, we have

$$\begin{aligned} \sum_{a, b \in \mathbb{F}_{2^n}} (W_g(a, b))^4 &= \sum_{x, y, u, v \in \mathbb{F}_{2^n}} \sum_{b \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(b(x+y+u+v))} \sum_{a \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(a(g(x)+g(y)+g(u)+g(v)))} \\ &= 2^{2n}T, \end{aligned} \qquad (21)$$

where $T$ denotes the number of $(x, y, u, v) \in (\mathbb{F}_{2^n})^4$ satisfying

$$\begin{cases} x + y + u + v = 0, \\ g(x) + g(y) + g(u) + g(v) = 0. \end{cases}$$

Let $N(a, b)$ be the number of solutions of $g(x + a) + g(x) = b$ in $\mathbb{F}_{2^n}$. Then, we have $T = \sum_{a, b \in \mathbb{F}_{2^n}} N(a, b)^2$. Using the notation and results in Theorem 2, we have

$$T = \sum_{a, b \in \mathbb{F}_{2^n}} N(a, b)^2 = 2^{2n} + 4\omega_2 + 16\omega_4 = 4 \cdot \left(2^{2n} - 2^n\right). \qquad (22)$$

Combining (20), (21), (22) and the fact that $N_1 + N_2 = 2^n - 1$, we obtain $N_1$ and $N_2$. Then, we get the value distribution of the extended Walsh transform of $g(x)$ as in Table 1.
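For completeness, the last step of the proof can be spelled out as follows. This is only a worked version of the arithmetic implied by (20), (21), (22), using $4m + 4 = 2n + 2$, $4m + 8 = 2n + 6$ and $T = 4(2^{2n} - 2^n)$; it introduces no new assumptions.

$$\begin{aligned} 2^{4n} + 2^{3n+1}N_1 + 2^{3n+3}N_2 &= 2^{2n} \cdot 4\left(2^{2n} - 2^n\right) = 2^{4n+2} - 2^{3n+2} \\ \Longrightarrow\quad N_1 + 4N_2 &= 2^{n+1} - 2 - 2^{n-1} = 3 \cdot 2^{n-1} - 2. \end{aligned}$$

Subtracting $N_1 + N_2 = 2^n - 1$ gives $3N_2 = 2^{n-1} - 1$, so that

$$N_2 = \frac{2^{n-1} - 1}{3}, \qquad N_1 = 2^n - 1 - N_2 = \frac{5 \cdot 2^{n-1} - 2}{3},$$

and the frequency of the value $0$ becomes $(2^n - 1) + 2^{n-1}N_1 + 7 \cdot 2^{n-3}N_2 = 9 \cdot 2^{2n-4} + 3 \cdot 2^{n-3} - 1$, in agreement with Table 1.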

According to Theorem 3 and Definition 2, we get the following corollary.

Corollary 1. Let $n = 2m + 1$ and $g(x) = x^{2^{m+1}+1} + x^3 + x$ be the Welch permutation over $\mathbb{F}_{2^n}$. Then, its nonlinearity $NL(g)$ is equal to $2^{n-1} - 2^{m+1}$.
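Corollary 1 can be checked by brute force in a small case. The sketch below does this for $m = 2$, $n = 5$, where the corollary predicts nonlinearity $2^4 - 2^3 = 8$. The field model ($x^5 + x^2 + 1$) and helper names are assumptions of the example; following the convention used in the proof of Theorem 3, the first argument of `walsh` multiplies $g(x)$.

```python
M = 2
N = 2 * M + 1         # n = 5
POLY = 0b100101       # x^5 + x^2 + 1 (assumed field model)
Q = 1 << N


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) given as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^N)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def trace(a):
    """Absolute trace from GF(2^N) to GF(2); the result is 0 or 1."""
    t, y = 0, a
    for _ in range(N):
        t ^= y
        y = gf_mul(y, y)
    return t


TR = [trace(x) for x in range(Q)]
# Table of the Welch permutation g(x) = x^(2^(m+1)+1) + x^3 + x.
G = [gf_pow(x, (1 << (M + 1)) + 1) ^ gf_pow(x, 3) ^ x for x in range(Q)]


def walsh(u, v):
    """Sum over x of (-1)^Tr(u*g(x) + v*x)."""
    return sum(1 - 2 * TR[gf_mul(u, G[x]) ^ gf_mul(v, x)] for x in range(Q))


# Only nonzero multipliers u of g(x) enter the nonlinearity.
abs_values = {abs(walsh(u, v)) for u in range(1, Q) for v in range(Q)}
nonlinearity = Q // 2 - max(abs_values) // 2
print(sorted(abs_values), nonlinearity)
```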

4 Binary codes related to the Welch APN function


4.1 Binary cyclic codes related to the Welch APN function
In this subsection we discuss the properties of two families of binary cyclic codes that are closely related to the Welch APN power function, and then present algebraic decoding for them.
Recall that $n = 2m + 1$ and $\beta$ is a primitive element of $\mathbb{F}_{2^n}$. We start from a family of binary cyclic codes $C_1$ with primary defining set $S_{C_1} = \{1, d\}$, where $d = 2^m + 3$ is the Welch exponent. That is to say, the matrix

$$H = \begin{bmatrix} 1 & \beta & \beta^2 & \cdots & \beta^{2^n-2} \\ 1 & \beta^d & \beta^{2d} & \cdots & \beta^{(2^n-2)d} \end{bmatrix} \qquad (23)$$

is a parity-check matrix of $C_1$. It was proved in [37] that the Walsh spectrum of $x^d$ is given by $\{0, \pm 2^{m+1}\}$. Therefore, it follows from Theorem 1 that $C_1$ is a $[2^n - 1, 2^n - 1 - 2n]$ double-error-correcting uniformly packed code with covering radius 3. Below we discuss the algebraic decoding of this uniformly packed code.
Notice that for a cyclic code $C$ with length $N$ and a BCH bound $2t$, when an error has weight $t$, one can decode $C$ by the well-known BCH decoder [19]:
• calculate the syndromes $s_j = y(\alpha^j) = \sum_{i=0}^{N-1} y_i\alpha^{ij}$ for $1 \le j \le 2t$ from the received vector $y$, where $\alpha$ is a primitive $N$-th root of unity;
• determine the error-locator polynomial

$$\sigma(x) = (1 - \alpha^{i_1}x) \cdots (1 - \alpha^{i_t}x) = 1 + \sigma_1x + \cdots + \sigma_tx^t,$$

where $i_1, \ldots, i_t$ are the $t$ locations of the error, from the key equation

$$s_{i+t} + \sigma_1s_{i+t-1} + \cdots + \sigma_ts_i = 0 \quad\text{for } 1 \le i \le t$$

by the Berlekamp-Massey algorithm;
• use the Chien search to find the roots $\alpha^{-i_1}, \ldots, \alpha^{-i_t}$ of $\sigma(x)$, thereby determining $i_1, \ldots, i_t$;
• use the Forney algorithm to determine the error values $e_{i_1}, \ldots, e_{i_t}$ (which is only needed for nonbinary codes).
For the code $C_1$ defined by $H$ in (23), although it has minimum distance 5, we cannot apply the BCH decoder when there are double errors in the received vector. Under such a circumstance, one can consider directly the following system of syndrome equations

$$\begin{cases} x_1 + x_2 = s_1, \\ x_1^d + x_2^d = s_2, \end{cases} \qquad (24)$$

where $x_t = \beta^{i_t}$ for $t = 1, 2$ and $\beta$ is a primitive element in $\mathbb{F}_{2^n}$. The task is to efficiently find $x_1, x_2$ for a given syndrome $s = (s_1, s_2) = yH^\top$.
Letting $y_t = x_t/s_1$ for $t = 1, 2$, Eq. (24) is equivalent to $y_1 = y_2 + 1$ and $y_1^d + y_2^d = \frac{s_2}{s_1^d}$. That is to say, it suffices to focus on finding the solution to the equation $(y + 1)^d + y^d = b$, where $b = \frac{s_2}{s_1^d}$. Dobbertin [23] showed that for the Welch exponent $d = 2^m + 3$, the derivative equation of $x^d$ can be written as

$$(x + 1)^d + x^d = (x + x^{2^m})(x^2 + x + 1) + 1 = g(x + x^{2^m}) + 1,$$

where $g(x) = x^{2^{m+1}+1} + x^3 + x$ is the corresponding Welch permutation. Let $z = x + x^{2^m}$. The task of correcting double errors for $C_1$ therefore can be re-arranged as follows:
Step 1: solve the equation $g(z) = c = 1 + s_2/s_1^d$;
Step 2: solve the equation $y + y^{2^m} = \eta$, where $\eta$ is the solution obtained in Step 1;
Step 3: determine error positions $i_1, i_2$ from $x_t = s_1y_t$ for $t = 1, 2$.
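The three steps can be sketched for a toy instance, assuming $m = 2$, $n = 5$ (so $d = 7$ and code length 31), the field model $x^5 + x^2 + 1$, and $\beta$ taken as the class of $x$. For brevity this sketch solves Steps 1 and 2 by exhaustive search over the 32 field elements rather than by the gcd-based method discussed in the text, and it assumes exactly two errors occurred; all function names are hypothetical.

```python
M = 2
N = 2 * M + 1          # n = 5, code length 2^n - 1 = 31
POLY = 0b100101        # x^5 + x^2 + 1 (assumed field model)
Q = 1 << N
D = (1 << M) + 3       # Welch exponent d = 2^m + 3
BETA = 0b10            # primitive element (every element except 0, 1 is, as 31 is prime)


def gf_mul(a, b):
    """Multiply two elements of GF(2^N) given as integers below 2^N."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= POLY
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^N)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def g(z):
    """Welch permutation z^(2^(m+1)+1) + z^3 + z."""
    return gf_pow(z, (1 << (M + 1)) + 1) ^ gf_pow(z, 3) ^ z


# Discrete logarithm table: LOG[beta^i] = i.
LOG = {gf_pow(BETA, i): i for i in range(Q - 1)}


def syndrome(error_positions):
    """Syndrome (s1, s2) of an error vector with ones at the given positions."""
    s1 = s2 = 0
    for i in error_positions:
        s1 ^= gf_pow(BETA, i)
        s2 ^= gf_pow(BETA, (i * D) % (Q - 1))
    return s1, s2


def decode_double_error(s1, s2):
    """Recover the two error positions from (s1, s2), assuming exactly two errors."""
    inv_s1d = gf_pow(gf_pow(s1, D), Q - 2)                 # 1 / s1^d
    c = 1 ^ gf_mul(s2, inv_s1d)
    eta = next(z for z in range(Q) if g(z) == c)           # Step 1 (brute force)
    ys = [y for y in range(Q) if y ^ gf_pow(y, 1 << M) == eta]  # Step 2
    return sorted(LOG[gf_mul(s1, y)] for y in ys)          # Step 3


print(decode_double_error(*syndrome([3, 17])))
```

A single error or an error of weight above two would need separate handling, which is omitted here.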
For the first step, one can find the preimage of $c$ with the help of the compositional inverse $g^{-1}(x)$ of the permutation $g(x)$. Nevertheless, we do not have an explicit expression of the compositional inverse $g^{-1}(x)$ yet. A straightforward way is to exhaust possible $z \in \mathbb{F}_{2^n}$ for the equation $g(z) + c = 0$. For each evaluation $g(z)$, the Chien search method can reduce the computational complexity from $O(t^2)$ to $O(t)$. The optimization in this part is negligible for $t = 2$. Another way is to calculate $\gcd(z^{2^n-1} - 1, g(z) + c)$ over the polynomial ring $\mathbb{F}_{2^n}[z]$, which gives a linear term $z + z_0$. This method can be further optimized based on the form of $g(z)$. As observed in [23], the equation $g(z) = c$ for $c \neq 0$ can be rewritten as

$$z^{2^{m+1}} = z^2 + 1 + \frac{c}{z}.$$

Raising this equation to the power of $2^{m+1}$ gives

$$z^2 = z^{2^{m+2}} + 1 + \frac{c^{2^{m+1}}}{z^{2^{m+1}}} = \left(z^2 + 1 + \frac{c}{z}\right)^2 + 1 + \frac{c^{2^{m+1}}}{z^2 + 1 + \frac{c}{z}}.$$

Rearranging the above equation gives $g_0(z) = 0$, where

$$g_0(z) = z^9 + cz^6 + z^5 + cz^4 + (c^{2^m} + c)^2z^3 + c^2z + c^3.$$

Dobbertin showed that g0 (z) can only have one solution in F2n . Hence, an alternative
way to solve g(z) = c is to calculate gcd(g(z) + c, g0 (z)). To compare this calculation
n
with the typical root searching and the calculation of gcd(z 2 −1 −1, g(z)+c), we recall
the result from [38].
Theorem 4. [38, Th. 5.4] Let $\mathbb{F}_q$ be the finite field of $q$ elements and $\mathbb{F}_q[x]_t$ be the
polynomials in $\mathbb{F}_q[x]$ of degree $t$. Let $e, d$ be positive integers such that $q > d(2e - d + 1)/2$
and $e > d$. Let $t_g^{\mathrm{div}}, t_g^{\div}, t_g^{-,\times}$ be the numbers of polynomial divisions, divisions, and
additions/multiplications in $\mathbb{F}_q$, respectively. Given a polynomial $g \in \mathbb{F}_q[x]_e$, the average number
$E[t_g^{w}]$ of operations $w \in \{\mathrm{div}, \div, (-,\times)\}$ performed on (uniformly distributed) inputs
from $\mathbb{F}_q[x]_d$ is bounded in the following way:
\[
\frac{E[t_g^{\mathrm{div}}]}{d+1} - 1 \le \frac{de}{q}, \qquad
\frac{E[t_g^{\div}]}{e+d+1} - 1 \le \frac{de}{q}, \qquad
\frac{E[t_g^{-,\times}]}{de} - 1 \le \frac{de}{q}.
\]

Note that $\gcd(z^{2^n-1} - 1, g(z) + c) = \gcd(g(z) + c, g_1(z))$, where $g_1$ is the remainder
polynomial with degree less than $\deg(g)$. Hence computing $\gcd(z^{2^n-1} - 1, g(z) + c)$ requires
more operations than computing $\gcd(g(z) + c, g_0(z))$, where $\deg(g_0) = 9$. According to the above
theorem, for the polynomial $g(z) + c \in \mathbb{F}_{2^n}[z]$ of degree $e = 2^{m+1} + 1$ (with $m \ge 3$, so that $e > d$),
calculating $\gcd(g(z) + c, g_0(z))$ with $d = \deg(g_0) = 9$ on average takes $d + 1 = 10$ polynomial
divisions, $e + d + 1 = 2^{m+1} + 11 \approx \sqrt{2q}$ divisions and $ed = 9(2^{m+1} + 1) \approx 9\sqrt{2q}$
additions and multiplications in $\mathbb{F}_q$ for $q = 2^n = 2^{2m+1}$. On the other hand, finding roots
of $g(z) = c$ with the Chien search method for $t = 2$ takes on average $tq/2 = q$ operations in
$\mathbb{F}_q$. In this sense, it is better to calculate $\gcd(g(z) + c, g_0(z))$ in solving the equation
$g(z) = c$ for Step 1.
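The gcd-based variant of Step 1 can be sketched as follows. This is a minimal illustration, assuming $m = 3$, $n = 7$ and the field representation $\mathbb{F}_2[x]/(x^7 + x + 1)$; all function names are hypothetical. Polynomials are stored as coefficient lists indexed by degree, and the exhaustive root search is used only to cross-check that the root $\eta$ of $g(z) = c$ is also a root of $g_0$ and of the computed common divisor.

```python
# Assumed parameters: m = 3, n = 7, GF(2^7) = F_2[x]/(x^7 + x + 1).
M, N = 3, 7
Q = 1 << N
MOD = 0b10000011


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a):
    return gf_pow(a, Q - 2)


def poly_trim(p):
    """Drop leading zero coefficients in place and return the list."""
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_mod(a, b):
    """Remainder of a divided by b (b nonzero, leading coefficient nonzero)."""
    a = a[:]
    inv_lead = gf_inv(b[-1])
    while len(a) >= len(b):
        factor = gf_mul(a[-1], inv_lead)
        shift = len(a) - len(b)
        for i, coeff in enumerate(b):
            a[i + shift] ^= gf_mul(factor, coeff)
        poly_trim(a)
    return a


def poly_gcd(a, b):
    """Monic greatest common divisor via the Euclidean algorithm."""
    a, b = poly_trim(a[:]), poly_trim(b[:])
    while b:
        a, b = b, poly_mod(a, b)
    inv_lead = gf_inv(a[-1])
    return [gf_mul(inv_lead, coeff) for coeff in a]


def poly_eval(p, z):
    acc = 0
    for coeff in reversed(p):
        acc = gf_mul(acc, z) ^ coeff
    return acc


def g_poly(c):
    """Coefficients of g(z) + c = z^(2^(m+1)+1) + z^3 + z + c."""
    p = [0] * ((1 << (M + 1)) + 2)
    p[0], p[1], p[3], p[-1] = c, 1, 1, 1
    return p


def g0_poly(c):
    """Coefficients of the degree-9 polynomial g_0(z)."""
    c2 = gf_mul(c, c)
    mid = gf_pow(c, 1 << (M + 1)) ^ c2  # (c^(2^m) + c)^2
    return [gf_mul(c2, c), c2, 0, mid, c, 1, c, 0, 0, 1]


c = 5  # any nonzero field element
eta = next(z for z in range(Q) if poly_eval(g_poly(c), z) == 0)
common = poly_gcd(g_poly(c), g0_poly(c))
print("root:", eta, "degree of common divisor:", len(common) - 1)
```

The sketch does not assume that the common divisor is always exactly linear; it only relies on the fact that $z + \eta$ divides it.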
Suppose $\eta$ is the root of $g(z) = c$ in Step 1. From the equality $y^{2^m} + y = \eta$, we
can obtain the quadratic equation $y^2 + y = \eta^2 + \eta^{2^{m+1}}$. Let $\theta = \eta^2 + \eta^{2^{m+1}}$. Then it
satisfies $\mathrm{Tr}_1^n(\theta) = 0$. Suppose that, for a normal basis $\beta^{2^i}$, $i = 0, 1, \ldots, n-1$, we write
$\theta = \sum_{i=0}^{n-1} \theta_i \beta^{2^i}$ and $y = \sum_{i=0}^{n-1} y_i \beta^{2^i}$. Then we obtain the following system of $n$
linear equations with rank $n - 1$ in $n$ variables over $\mathbb{F}_2$:
\[
\begin{pmatrix}
1 & 0 & 0 & \cdots & 1 \\
1 & 1 & 0 & \cdots & 0 \\
0 & 1 & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 1 & 1
\end{pmatrix}
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{pmatrix}
=
\begin{pmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_{n-1} \end{pmatrix},
\]
which has solutions $y_i = y_{n-1} + \sum_{j=0}^{i} \theta_j$ for $i = 0, 1, \ldots, n-2$ and $y_{n-1} \in \mathbb{F}_2$. Here
we provide this elementary process to show that its complexity is of order $O(n)$ instead
of the typical complexity $O(n^3)$ for solving linearized equations over $\mathbb{F}_{2^n}$.
With the two solutions $y_1, y_2$ in Step 2, the two corresponding error positions $i_1, i_2$
can be immediately obtained from $\beta^{i_1} = x_1 = y_1 s_1$ and $\beta^{i_2} = x_2 = y_2 s_1$.
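Steps 1 to 3 can be put together in a small end-to-end sketch. It assumes $m = 2$, $n = 5$, the field $\mathbb{F}_2[x]/(x^5 + x^2 + 1)$ and $\beta$ equal to the class of $x$; for brevity, exhaustive search stands in for the gcd computation of Step 1 and for the normal-basis solver of Step 2. All names are hypothetical.

```python
# Assumed parameters: m = 2, n = 5, GF(2^5) = F_2[x]/(x^5 + x^2 + 1), beta = x.
M = 2
N = 2 * M + 1
Q = 1 << N
MOD = 0b100101
D = (1 << M) + 3  # Welch exponent


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a):
    return gf_pow(a, Q - 2)


def g(z):
    return gf_pow(z, (1 << (M + 1)) + 1) ^ gf_pow(z, 3) ^ z


# Discrete logarithm table with respect to beta = 2 (the class of x).
LOG = {gf_pow(2, i): i for i in range(Q - 1)}


def decode_double_error(s1, s2):
    """Return the sorted error positions [i1, i2] for syndromes (s1, s2), s1 != 0."""
    c = 1 ^ gf_mul(s2, gf_inv(gf_pow(s1, D)))
    # Step 1: exhaustive search replaces the gcd computation here.
    eta = next(z for z in range(Q) if g(z) == c)
    # Step 2: y + y^(2^m) = eta has the two solutions y and y + 1.
    y = next(v for v in range(Q) if v ^ gf_pow(v, 1 << M) == eta)
    # Step 3: x_t = s1 * y_t, then read off the positions.
    return sorted(LOG[gf_mul(s1, v)] for v in (y, y ^ 1))


x1, x2 = gf_pow(2, 4), gf_pow(2, 17)
s1 = x1 ^ x2
s2 = gf_pow(x1, D) ^ gf_pow(x2, D)
print(decode_double_error(s1, s2))
```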

We now discuss another family of binary cyclic codes closely related to the Welch
permutation. For the Welch permutation $g(x) = x^{2^{m+1}+1} + x^3 + x$, we define a cyclic
code $C_2$ with the primary defining set $\{1, 3, 2^{m+1} + 1\}$. We see that the code $C_2$ is a
subcode of the double-error-correcting BCH code with primary defining
set $\{1, 3\}$. Interestingly, the code $C_2$ actually has properties rather similar to those
of the triple-error-correcting BCH code with primary defining set $\{1, 3, 5\}$. With the
fact that $2^{2(m+1)} + 1 \equiv 3 \pmod{2^{2m+1} - 1}$, we see that the defining set of $C_2$ can be
written as $\{1, 2^k + 1, 2^{2k} + 1\}$, where $k = m + 1$. The first author and Bracken [39]
showed that $C_2$ has minimum distance 7. Note that the dual code $C_2^{\perp}$ is given by
\[
C_2^{\perp} = \left\{ \left( \mathrm{Tr}_1^n(ax^{2^{m+1}+1} + bx^3 + cx) \right)_{x \in \mathbb{F}_{2^n}^*} : a, b, c \in \mathbb{F}_{2^n} \right\}.
\]
Luo [40] determined the weight distribution of binary codes given by
\[
D_k = \left\{ \left( \mathrm{Tr}_1^n(ax^{2^{2k}+1} + bx^{2^k+1} + cx) \right)_{x \in \mathbb{F}_{2^n}^*} : a, b, c \in \mathbb{F}_{2^n} \right\},
\]
when $n/\gcd(n, k)$ is odd. From [40, Th. 1] one can readily see that the codes $D_1$ and
$D_{m+1}$ have exactly the same weight distribution with a 5-weight spectrum
\[
\left\{ 2^{n-1},\ 2^{n-1} \pm 2^{(n-1)/2},\ 2^{n-1} \pm 2^{(n+1)/2} \right\}.
\]

This implies that $C_2$ with defining set $\{1, 3, 2^{m+1} + 1\}$ and the triple-error-correcting BCH
code with defining set $\{1, 3, 5\}$ have exactly the same weight distribution. Since the
terms $x^{2^{m+1}+1}$ and $x^3$ have algebraic degree 2, the code $C_2^{\perp}$ is a super-code of the first-order
binary Reed-Muller code in the second-order Reed-Muller code. It is worth noting
that in this context, Kai-Uwe Schmidt has made significant contributions, including work on
first-order generalized Reed-Muller codes [41] and on complementary sets in the context
of sequence design [42].
Charpin, Helleseth and Zinoviev [17] showed that the coset weight distribution of the
triple-error-correcting BCH code of length $N = 2^n - 1$ is given by
\[
K_0 = 1, \quad K_1 = \binom{N}{1}, \quad K_2 = \binom{N}{2}, \quad K_3 = \binom{N}{3},
\]
\[
K_4 = \frac{N(5N^2 + 10N - 3)}{6}, \quad K_5 = \frac{4N(N + 2)}{3},
\]
where $K_i$ is the number of coset leaders with weight $i$. Moreno and Castro in [43]
showed that binary cyclic codes with primary defining set $\{1, 2^k + 1, 2^{2k} + 1\}$, where
$\gcd(k, n) = 1$, have covering radius 5. This implies that the cyclic code $C_2$ has covering
radius 5. Experimental results show that for $m \ge 3$, the cyclic code $C_2$ has the same
coset distribution as above. In our view, this is an interesting connection. Nevertheless,
we are not able to provide a theoretical proof for this fact.
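As a consistency check on the coset leader counts $K_0, \ldots, K_5$, one can verify that they add up to the total number of cosets. The sketch below assumes that the code has dimension $N - 3n$, so that there are $2^{3n}$ cosets; the function name `coset_leader_counts` is hypothetical.

```python
from math import comb


def coset_leader_counts(n):
    """Return [K_0, ..., K_5] for length N = 2^n - 1, following the stated formulas."""
    N = 2**n - 1
    return [
        1,
        comb(N, 1),
        comb(N, 2),
        comb(N, 3),
        N * (5 * N * N + 10 * N - 3) // 6,
        4 * N * (N + 2) // 3,
    ]


for n in (5, 7, 9, 11):
    counts = coset_leader_counts(n)
    # Every coset has exactly one leader weight, so the counts should sum to 2^(3n).
    print(n, sum(counts) == 2 ** (3 * n))
```

Algebraically, the sum equals $(N+1)^3$ for every $N$, which is why the check succeeds for all the listed values of $n$.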
Below we discuss the decoding of this triple-error-correcting code. Suppose a
received vector $y$ contains an error $e$ of weight 2. Since $C_2$ has defining set
$\{1, 3, 2^{m+1} + 1\}$, the error $e$ can be corrected with a BCH decoder. On the other hand, since the
code length is $2^n - 1$, finding the roots of the error-locator polynomial directly would
be costly when $n$ increases. Instead, the following process works more efficiently. Let
$(s_1, s_2, s_3) = yH^{T}$. We obtain the following system of equations as in (24):
\[
\begin{cases}
x_1 + x_2 = s_1, \\
x_1^3 + x_2^3 = s_2, \\
x_1^{2^{m+1}+1} + x_2^{2^{m+1}+1} = s_3,
\end{cases}
\]
where $x_t = \beta^{i_t}$ for $t = 1, 2$ and $\beta$ is a primitive element in $\mathbb{F}_{2^n}$. The first two equations
immediately lead to the quadratic equation $s_1 x^2 + s_1^2 x = s_1^3 + s_2$, where $x = x_1$ or
$x_2$. This quadratic equation can be further transformed to $y^2 + y = 1 + s_2/s_1^3$ by letting
$y = x/s_1$. As discussed earlier, this equation can be solved in $O(n)$ operations in $\mathbb{F}_2$.
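The double-error case for $C_2$ can be sketched as follows. Instead of the normal-basis procedure described earlier, this sketch solves $y^2 + y = \theta$ with the half-trace $H(\theta) = \sum_{i=0}^{(n-1)/2} \theta^{4^i}$, which for odd $n$ satisfies $H(\theta)^2 + H(\theta) = \theta + \mathrm{Tr}_1^n(\theta)$; this is a swapped-in technique chosen for brevity, not the method of the text. The parameters $m = 2$, $n = 5$, the field $\mathbb{F}_2[x]/(x^5 + x^2 + 1)$ and all names are assumptions.

```python
# Assumed parameters: m = 2, n = 5, GF(2^5) = F_2[x]/(x^5 + x^2 + 1).
N = 5
Q = 1 << N
MOD = 0b100101


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a):
    return gf_pow(a, Q - 2)


def half_trace(t):
    """H(t) = t + t^4 + ... + t^(4^((n-1)/2)); solves y^2 + y = t when Tr(t) = 0."""
    acc = 0
    for i in range((N + 1) // 2):
        acc ^= gf_pow(t, 1 << (2 * i))
    return acc


def decode_double_error_c2(s1, s2):
    """Return the error locators {x1, x2} from s1 = x1 + x2 and s2 = x1^3 + x2^3."""
    theta = 1 ^ gf_mul(s2, gf_inv(gf_pow(s1, 3)))
    y = half_trace(theta)  # one root of y^2 + y = theta; the other is y + 1
    return {gf_mul(s1, y), gf_mul(s1, y ^ 1)}


x1, x2 = 7, 19
print(decode_double_error_c2(x1 ^ x2, gf_pow(x1, 3) ^ gf_pow(x2, 3)) == {x1, x2})
```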
Now we consider the decoding of triple errors in a vector $y = c + e$. Similarly, we
need to solve the following system of equations
\[
\begin{cases}
x_1 + x_2 + x_3 = s_1, \\
x_1^3 + x_2^3 + x_3^3 = s_2, \\
x_1^{2^{m+1}+1} + x_2^{2^{m+1}+1} + x_3^{2^{m+1}+1} = s_3,
\end{cases} \tag{25}
\]
where $x_t = \beta^{i_t}$ for $t = 1, 2, 3$. We first transform the above equations into simplified
ones in two variables.
Assume $s_1 = 0$. Substituting $x_3 = x_1 + x_2$ in the second and third equations in
(25) gives
\[
\begin{cases}
x_1 x_2^2 + x_1^2 x_2 = s_2, \\
x_1 x_2^{2^k} + x_1^{2^k} x_2 = s_3,
\end{cases}
\]
where $k = m + 1$. Assume $s_1 \neq 0$. Let $x_t = s_1(y_t + 1)$ for $t = 1, 2, 3$, so that
$y_3 = y_1 + y_2$. Then (25) becomes
\[
\begin{cases}
(y_1 + 1)^3 + (y_2 + 1)^3 + (y_1 + y_2 + 1)^3 = s_2/s_1^3, \\
(y_1 + 1)^{2^k+1} + (y_2 + 1)^{2^k+1} + (y_1 + y_2 + 1)^{2^k+1} = s_3/s_1^{2^k+1},
\end{cases}
\]
implying
\[
\begin{cases}
y_1 y_2^2 + y_1^2 y_2 = 1 + s_2/s_1^3, \\
y_1 y_2^{2^k} + y_1^{2^k} y_2 = 1 + s_3/s_1^{2^k+1}.
\end{cases}
\]
Therefore, for any $s_1 \in \mathbb{F}_{2^n}$, it suffices to focus on only the following equations in $y_1, y_2$:
\[
\begin{cases}
y_1 y_2^2 + y_1^2 y_2 = \delta, \\
y_1 y_2^{2^{m+1}} + y_1^{2^{m+1}} y_2 = \tau, \\
y_1 y_2^{2^m} + y_1^{2^m} y_2 = \tau^{2^m},
\end{cases} \tag{26}
\]
where $(\delta, \tau) = (s_2, s_3)$ for $s_1 = 0$, and $(\delta, \tau) = (1 + s_2/s_1^3,\ 1 + s_3/s_1^{2^{m+1}+1})$ for $s_1 \neq 0$. Note
that if $\delta\tau = 0$, the system can only have solutions with $y_1 = y_2$, which is invalid here.

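Furthermore, taking $z = y_1/y_2$, we derive the following equations from the above
system:
\[
\begin{cases}
y_2^3 = \dfrac{\delta}{z^2 + z}, \\[2mm]
y_2^{2^{m+1}+1} = \dfrac{\tau}{z^{2^{m+1}} + z}, \\[2mm]
y_2^{2^m+1} = \dfrac{\tau^{2^m}}{z^{2^m} + z},
\end{cases}
\quad \Longrightarrow \quad
\begin{cases}
y_2^{2^{m+1}-2} = \dfrac{\tau}{z^{2^{m+1}} + z} \cdot \dfrac{z^2 + z}{\delta}, \\[2mm]
y_2^{2^m} = \dfrac{\tau}{z^{2^{m+1}} + z} \cdot \dfrac{z^{2^m} + z}{\tau^{2^m}}.
\end{cases}
\]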
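Let $w = z^{2^{m+1}} + z$. It is readily seen that $z^2 + z = w^{2^{m+1}} + w$ and $z^{2^m} + z = w^{2^m}$.
Then we have
\[
\begin{cases}
y_2^{2^{m+1}-2} = \dfrac{\tau}{\delta} \cdot \dfrac{w^{2^{m+1}} + w}{w}, \\[2mm]
y_2^{2^{m+1}} = (y_2^{2^m})^2 = \left( \dfrac{\tau}{\tau^{2^m}} \cdot \dfrac{w^{2^m}}{w} \right)^2, \\[2mm]
y_2^{2} = (y_2^{2^m})^{2^{m+2}} = \left( \dfrac{\tau}{\tau^{2^m}} \cdot \dfrac{w^{2^m}}{w} \right)^{2^{m+2}}.
\end{cases} \tag{27}
\]
From the above equations, the fact $y_2^{2^{m+1}-2} \cdot y_2^2 = y_2^{2^{m+1}}$ gives
\[
\frac{\tau}{\delta} \cdot \frac{w^{2^{m+1}} + w}{w} \cdot \left( \frac{\tau}{\tau^{2^m}} \cdot \frac{w^{2^m}}{w} \right)^{2^{m+2}}
= \left( \frac{\tau}{\tau^{2^m}} \cdot \frac{w^{2^m}}{w} \right)^2,
\]
i.e.,
\[
\frac{\tau^{1+2^{m+2}} (w^{2^{m+1}} + w) w^2}{\delta \tau^2 w^{2^{m+2}+1}} = \frac{\tau^2}{\tau^{2^{m+1}}} \cdot \frac{w^{2^{m+1}}}{w^2}.
\]
Rearranging the above equation yields
\[
w^{3 \cdot 2^{m+1}} = \gamma (w^{2^{m+1}+3} + w^4),
\]
where $\gamma = \tau^{3(2^{m+1}-1)}/\delta$. Following Dobbertin's method in [23], we denote $\bar{w} = w^{2^{m+1}}$
and $\bar{\gamma} = \gamma^{2^{m+1}}$. Then we have $\bar{w}^{2^{m+1}} = w^2$. The above equation and its $2^{m+1}$-th
power give the two equations
\[
\begin{cases}
\bar{w}^3 + \gamma (\bar{w} w^3 + w^4) = 0, \\
w^6 + \bar{\gamma} (w^2 \bar{w}^3 + \bar{w}^4) = 0.
\end{cases} \tag{28}
\]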

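Substituting the first equation into the second one in (28) gives
$\gamma\bar{\gamma}(w + \bar{w})(\bar{w} + w^2) + w^3 = 0$. From the first equation and this new equation, we denote
\[
\begin{cases}
h_1 = \bar{w}^3 + \gamma w^3 \cdot \bar{w} + \gamma w^4 = 0, \\
h_2 = \bar{w}^2 + (w + w^2)\bar{w} + \gamma_1 w^3 = 0,
\end{cases} \tag{29}
\]
where $\gamma_1 = 1 + (\gamma\bar{\gamma})^{-1}$.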
Below we will eliminate $\bar{w}$ from $h_1, h_2$. For the reader's convenience, we include the
process despite its simplicity. Viewing $h_1, h_2$ as polynomials in the variable $\bar{w}$, by the
Euclidean method, we have
\[
h_1 = \left( \bar{w} + (w + w^2) \right) \cdot h_2 + h_3,
\]
where $h_3 = w^2 (w^2 + (\gamma + \gamma_1) w + 1)\bar{w} + w^4 (\gamma_1 w + (\gamma + \gamma_1))$. Furthermore, rewrite $h_2$
and $h_3$ as $h_2 = \bar{w}^2 + \phi_1 \bar{w} + \phi_2$ and $h_3 = \phi_3 \bar{w} + \phi_4$ for simplicity. The equation system
(29) is equivalent to the following system
\[
\begin{cases}
h_2 = \bar{w}^2 + \phi_1 \bar{w} + \phi_2 = 0, \\
h_3 = \phi_3 \bar{w} + \phi_4 = 0.
\end{cases} \tag{30}
\]

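From (30), we can eliminate $\bar{w}$ by calculating
\[
\begin{aligned}
h_4(w) &= \phi_3^2 h_2 + (\phi_3 \bar{w} + \phi_1 \phi_3 + \phi_4) h_3 \\
&= \phi_3^2 (\bar{w}^2 + \phi_1 \bar{w} + \phi_2) + (\phi_3 \bar{w} + \phi_1 \phi_3 + \phi_4)(\phi_3 \bar{w} + \phi_4) \\
&= \phi_2 \phi_3^2 + \phi_1 \phi_3 \phi_4 + \phi_4^2.
\end{aligned}
\]
More explicitly,
\[
\begin{aligned}
h_4(w) &= \gamma_1 w^3 \cdot w^4 (w^2 + (\gamma + \gamma_1) w + 1)^2 \\
&\quad + (w + w^2) \cdot w^2 (w^2 + (\gamma + \gamma_1) w + 1) \cdot w^4 (\gamma_1 w + (\gamma + \gamma_1)) \\
&\quad + w^8 (\gamma_1 w + (\gamma + \gamma_1))^2 \\
&= w^7 \left( \gamma (1 + \gamma_1) w^3 + (\gamma (\gamma + 1)(\gamma_1 + 1) + \gamma_1^3) w^2 + \gamma w + \gamma \right) \\
&= \gamma (1 + \gamma_1) w^7 \left( w^3 + \sigma_1 w^2 + \sigma_2 w + \sigma_3 \right) \\
&= \gamma (1 + \gamma_1) w^7 \cdot h(w),
\end{aligned} \tag{31}
\]
where $\sigma_1, \sigma_2, \sigma_3$ are obtained by dividing the coefficients of the cubic polynomial in the
second-to-last line by $\gamma(1 + \gamma_1)$, and $h(w) = w^3 + \sigma_1 w^2 + \sigma_2 w + \sigma_3$. Therefore, the system (28) is reduced to
the cubic equation $h(w) = 0$.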
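For concreteness, the normalized coefficients of the cubic can be written out. The following worked step is derived only from the expression for $h_4(w)$ and the definition $\gamma_1 = 1 + (\gamma\bar{\gamma})^{-1}$ with $\bar{\gamma} = \gamma^{2^{m+1}}$, which gives $(1 + \gamma_1)^{-1} = \gamma\bar{\gamma}$; it is offered as a sketch of the simplification rather than as notation used elsewhere in the text.

```latex
\[
\sigma_1 = \frac{\gamma(\gamma+1)(\gamma_1+1) + \gamma_1^3}{\gamma(1+\gamma_1)}
         = \gamma + 1 + \bar{\gamma}\,\gamma_1^3,
\qquad
\sigma_2 = \sigma_3 = \frac{\gamma}{\gamma(1+\gamma_1)} = \gamma\bar{\gamma}.
\]
```

In particular, all three coefficients are determined by $\gamma$ alone, so $h(w)$ can be set up with a handful of field operations once $\delta$ and $\tau$ are known.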
For correctable syndromes $s = (s_1, s_2, s_3)$ derived from an error $e$ of weight 3, it
can be verified, according to the criteria in [44], that $h(w)$ has three roots. In order
to obtain the roots of $h(w)$, we follow the method by Berlekamp, Rumsey and Solomon [45].
Multiplying $h(w)$ by $w + \sigma_1$, we obtain a linearized polynomial plus a constant term:
\[
L(w) = (w + \sigma_1) h(w) = w^4 + (\sigma_2 + \sigma_1^2) w^2 + (\sigma_3 + \sigma_1 \sigma_2) w + \sigma_1 \sigma_3.
\]

With a basis $\beta_1, \ldots, \beta_n$ of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$, we can express $L(w) = 0$ as a system of $n$
linear equations in $n$ variables $w_1, \ldots, w_n \in \mathbb{F}_2$. From the possible 4, 2 or 1 solutions
of $L(w) = (w + \sigma_1) h(w) = 0$, one of which is $w = \sigma_1$, we get 3, 1 or 0 roots of the cubic
polynomial $h(w)$ in general. However, since $\sigma_1, \sigma_2, \sigma_3$ were derived from $\gamma$, the cubic polynomial $h(w)$
actually has 3 roots. This process has complexity of order $O(n^3)$ operations
in $\mathbb{F}_2$. Given a root $w$ of $h(w)$, from the equation $z^{2^{m+1}} + z = w$, one can obtain
the equation $z^2 + z = w + w^{2^{m+1}}$ and can find the roots $z$ in $O(n)$ operations in
$\mathbb{F}_2$. By (27) we can get the unique root $y_2$ from $\delta, \tau, w$; and by $z = y_1/y_2$, one can
get solutions $(y_1, y_2)$ from $\{(y_2 z, y_2), (y_2 z + y_2, y_2)\}$ for the system (26) of equations.
Furthermore, for either $s_1 = 0$ or $s_1 \neq 0$, we can get two solutions $(x_1, x_2, x_3)$ from
$(y_1, y_2) \in \{(y_2 z, y_2), (y_2 z + y_2, y_2)\}$. Here it is to be noted that the three roots $w$ of
$h(w)$ lead to six solutions $(x_1, x_2, x_3)$ for the syndrome equations. These six solutions
correspond to the 6 permutations of one error $e$ with support $\{i_1, i_2, i_3\}$.
To summarize, the decoding of the cyclic code $C_2$ for three errors can proceed as
follows:
• given a syndrome $(s_1, s_2, s_3) = yH^{T}$, calculate the corresponding $\delta, \tau$ from
$(s_1, s_2, s_3)$ and $\gamma = \tau^{3(2^{m+1}-1)}/\delta$;
• calculate the polynomial $h(w)$ as in (31);
• construct the linearized polynomial $L(w)$ from $h(w)$ and find its solutions in $\mathbb{F}_{2^n}$;
• for a solution $w$, calculate the intermediate parameters $y_1, y_2$, and then use them
to calculate $x_t = \beta^{i_t}$ for $t = 1, 2, 3$;
• recover the codeword $c = y + e$ with the support of $e$ being $\{i_1, i_2, i_3\}$.
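The procedure above can be sketched end to end on a small field. The following illustration assumes $m = 2$, $n = 5$ and the field $\mathbb{F}_2[x]/(x^5 + x^2 + 1)$. It departs from the text in three labeled ways: the roots of $h(w)$ are found by exhaustive search instead of solving $L(w) = 0$, the quadratic in $z$ is solved with the half-trace, and $y_2$ is computed from the relation $\tau = y_2^{2^{m+1}+1} w$, which follows from (26) with $y_1 = z y_2$. Each candidate is re-checked against the syndromes. All names are hypothetical.

```python
# Assumed parameters: m = 2, n = 5, GF(2^5) = F_2[x]/(x^5 + x^2 + 1).
M, N = 2, 5
Q = 1 << N
MOD = 0b100101
K = M + 1
E3 = (1 << K) + 1  # exponent 2^(m+1) + 1


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a):
    return gf_pow(a, Q - 2)


def half_trace(t):
    """For odd n, a root of y^2 + y = t whenever Tr(t) = 0."""
    acc = 0
    for i in range((N + 1) // 2):
        acc ^= gf_pow(t, 1 << (2 * i))
    return acc


def syndromes(xs):
    s1 = s2 = s3 = 0
    for x in xs:
        s1 ^= x
        s2 ^= gf_pow(x, 3)
        s3 ^= gf_pow(x, E3)
    return s1, s2, s3


def decode_triple_error(s1, s2, s3):
    """Return the set of three error locators, or None if no candidate checks out."""
    if s1 == 0:
        delta, tau = s2, s3
    else:
        delta = 1 ^ gf_mul(s2, gf_inv(gf_pow(s1, 3)))
        tau = 1 ^ gf_mul(s3, gf_inv(gf_pow(s1, E3)))
    gamma = gf_mul(gf_pow(tau, 3 * ((1 << K) - 1)), gf_inv(delta))
    gamma_bar = gf_pow(gamma, 1 << K)
    gg = gf_mul(gamma, gamma_bar)  # equals 1 / (1 + gamma_1)
    gamma1 = 1 ^ gf_inv(gg)
    sigma1 = gamma ^ 1 ^ gf_mul(gamma_bar, gf_pow(gamma1, 3))
    sigma2 = sigma3 = gg
    for w in range(1, Q):  # exhaustive root search of the cubic h(w)
        h_w = gf_pow(w, 3) ^ gf_mul(sigma1, gf_mul(w, w)) ^ gf_mul(sigma2, w) ^ sigma3
        if h_w != 0:
            continue
        y2 = gf_pow(gf_mul(tau, gf_inv(w)), (1 << K) - 1)
        z = half_trace(w ^ gf_pow(w, 1 << K))  # root of z^2 + z = w + w^(2^(m+1))
        ys = (gf_mul(y2, z), y2, gf_mul(y2, z ^ 1))
        xs = ys if s1 == 0 else tuple(gf_mul(s1, y ^ 1) for y in ys)
        if len(set(xs)) == 3 and 0 not in xs and syndromes(xs) == (s1, s2, s3):
            return set(xs)
    return None


error_locators = {gf_pow(2, 3), gf_pow(2, 11), gf_pow(2, 20)}
print(decode_triple_error(*syndromes(error_locators)) == error_locators)
```

The final syndrome check makes the sketch robust against any spurious root of the cubic; by the minimum distance 7 of the code, a candidate that passes the check is the transmitted error pattern.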
Denote by $N = 2^n - 1$ the length of the code $C_2$. In the above decoding procedure,
the calculation of the syndrome takes $O(N \log N)$ operations in $\mathbb{F}_2$; the calculation of
$h_4$ from $h_1, h_2$ is independent of the code length $N$; and solving $L(w) = 0$ takes
$O((\log N)^3)$ operations in $\mathbb{F}_2$. This decoder significantly outperforms the syndrome
decoder with complexity $O(N^3)$ and recent decoders for cyclic codes in [20, 22], which
have complexity at least $O(N^2)$ for triple-error-correcting cyclic codes.

4.2 Binary linear codes related to the Welch permutation


In this section we will discuss two families of binary linear codes that are relevant to
the Welch permutation.
For the Welch permutation $g(x)$, the first family of binary linear codes $C_3$ is given
by a parity-check matrix similar to (23), where $\beta^{di}$ is replaced by $g(\beta^i)$. It is well
known [4] that $C_3$ defined in this way has minimum distance at most 5, and dimension
$2^n - 1 - 2n$. According to the proof of Theorem 2, there exist $(x, y, z) \in (\mathbb{F}_{2^n}^*)^3$ such
that
\[
\begin{cases}
x + y + z = 0, \\
g(x) + g(y) + g(z) = 0.
\end{cases}
\]
Thus, the code $C_3$ has minimum distance 3. In addition, its dual code is
\[
C_3^{\perp} = \left\{ \left( \mathrm{Tr}_1^n(a g(x) + bx) \right)_{x \in \mathbb{F}_{2^n}^*} : a, b \in \mathbb{F}_{2^n} \right\}.
\]
According to (6), the weight of nonzero codewords in $C_3^{\perp}$ can be expressed in terms of
the extended Walsh transform of $g(x)$. From the Walsh spectrum obtained in Section
3, we see that the weight distribution of $C_3^{\perp}$ is obtained accordingly, which is a 5-weight
spectrum
\[
\{2^{n-1},\ 2^{n-1} \pm 2^{m+1},\ 2^{n-1} \pm 2^m\}.
\]
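Now let us consider another binary linear code related to the Welch permutation.
Ding et al. [9, 13] introduced a generic construction of binary linear codes from a
subset $D = \{d_1, d_2, \ldots, d_\ell\}$ of $\mathbb{F}_{2^n}$ and the absolute trace function $\mathrm{Tr}_1^n(\cdot)$ from $\mathbb{F}_{2^n}$ to
$\mathbb{F}_2$ as
\[
C_D = \{ c_a = (\mathrm{Tr}_1^n(a d_1), \mathrm{Tr}_1^n(a d_2), \ldots, \mathrm{Tr}_1^n(a d_\ell)) : a \in \mathbb{F}_{2^n} \}.
\]
When the defining set $D$ is properly chosen, the code $C_D$ can have a few nonzero
weights. Particularly, when the defining set is given as $D(F) = \{F(x) : x \in \mathbb{F}_{2^n}\}$ with
a two-to-one function $F$ on $\mathbb{F}_{2^n}$, the Hamming weight of a codeword $c_a$ in $C_{D(F)}$ is
given by
\[
\begin{aligned}
\mathrm{wt}(c_a) &= \left| \{ 1 \le i \le 2^{n-1} : \mathrm{Tr}_1^n(a d_i) = 1 \} \right| \\
&= \frac{1}{2} \Big( 2^{n-1} - \sum_{d \in D(F)} (-1)^{\mathrm{Tr}_1^n(ad)} \Big) \\
&= \frac{1}{2} \Big( 2^{n-1} - \frac{1}{2} \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(aF(x))} \Big) \\
&= 2^{n-2} - \frac{1}{4} \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(aF(x))}.
\end{aligned} \tag{32}
\]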

That is to say, for studying the Hamming weight properties of the code CD(F ) , it is
critical to study the possible values of the exponential sum WF (a, 0).
In [13], Ding investigated the properties of binary linear codes from the images
of certain functions on F2n and proposed several conjectures on properties of the
constructed codes, including the following one from Welch APN power function.
Conjecture 1. [13, Conjecture 33] Let $n = 2m + 1$ and $F(x) = x^{2^m+3}$. Let $f(x) =
F(x) + F(x + 1) + 1$ and $D(f) = \{d_1, d_2, \ldots, d_\ell\} = \{f(x) \mid x \in \mathbb{F}_{2^n}\}$. Define the binary
code $C_{D(f)}$ as
\[
C_{D(f)} = \{ c_a = (\mathrm{Tr}_1^n(a d_1), \mathrm{Tr}_1^n(a d_2), \ldots, \mathrm{Tr}_1^n(a d_\ell)) : a \in \mathbb{F}_{2^n} \}.
\]
If $n \in \{5, 7\}$, then $C_{D(f)}$ is a three-weight code with length $2^{n-1}$ and dimension $n$. If
$n \ge 9$, then $C_{D(f)}$ is a five-weight code with length $2^{n-1}$ and dimension $n$.

For the Welch APN power function $F(x) = x^{2^m+3}$ and $f(x) = F(x + 1) + F(x) + 1$,
it is easy to verify that
\[
f(x) = F(x + 1) + F(x) + 1 = (x + x^{2^m})(x^2 + x + 1) = g(x + x^{2^m}),
\]
where $g(x)$ is the Welch permutation of $\mathbb{F}_{2^n}$. With the properties of $g(x)$ discussed in
Section 3, we present the following result on the code $C_{D(f)}$.
Theorem 5. Let $n = 2m + 1$ for a positive integer $m \ge 2$. The binary linear code
$C_{D(f)}$ defined in Conjecture 1 has length $2^{n-1}$, dimension $n$ and its nonzero weights
are contained in the following set:
\[
\left\{ 2^{n-2},\ 2^{n-2} \pm 2^{(n-3)/2},\ 2^{n-2} \pm 2^{(n-1)/2} \right\}.
\]

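Proof. It is clear that the length of $C_{D(f)}$ is $2^{n-1}$ since $f(x) = g(x + x^{2^m})$ is a
two-to-one function. As for the dimension, since $C_{D(f)}$ is linear, we need to consider
the number of $a \in \mathbb{F}_{2^n}$ such that $\mathrm{Tr}_1^n(af(x)) = 0$ for any $x \in \mathbb{F}_{2^n}$, equivalently,
$\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(af(x))} = 2^n$.
Define $T_0 = \{x + x^{2^m} \mid x \in \mathbb{F}_{2^n}\}$ and $T_1 = \{x + 1 \mid x \in T_0\}$. Note that $x + x^{2^m}$ is a
two-to-one function over $\mathbb{F}_{2^n}$. Thus $T_0 \cup T_1 = \mathbb{F}_{2^n}$. Since $n$ is odd, we have $\mathrm{Tr}_1^n(1) = 1$,
so $\mathrm{Tr}_1^n(x) = 0$ for any $x \in T_0$ and $\mathrm{Tr}_1^n(x) = 1$ for any $x \in T_1$. Since $g(x)$ is a permutation
of $\mathbb{F}_{2^n}$, for any nonzero $b$ one has
\[
\sum_{z \in T_0} (-1)^{\mathrm{Tr}_1^n(bg(z))} + \sum_{z \in T_1} (-1)^{\mathrm{Tr}_1^n(bg(z))} = \sum_{z \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(bg(z))} = 0.
\]
Then for any $a \in \mathbb{F}_{2^n}^*$,
\[
\begin{aligned}
\sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(af(x))} &= 2 \sum_{z \in T_0} (-1)^{\mathrm{Tr}_1^n(ag(z))} \\
&= \sum_{z \in T_0} (-1)^{\mathrm{Tr}_1^n(ag(z))} + \sum_{z \in T_0} (-1)^{\mathrm{Tr}_1^n(ag(z+1)+1)} \\
&= \sum_{z \in T_0} (-1)^{\mathrm{Tr}_1^n(ag(z)+z)} + \sum_{z \in T_0} (-1)^{\mathrm{Tr}_1^n(ag(z+1)+z+1)} \\
&= \sum_{z \in T_0} (-1)^{\mathrm{Tr}_1^n(ag(z)+z)} + \sum_{z \in T_1} (-1)^{\mathrm{Tr}_1^n(ag(z)+z)} \\
&= \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(ag(x)+x)}.
\end{aligned}
\]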

By the Walsh spectrum of $g(x)$ in Theorem 3, it is clear that $W_f(a, 0) = W_g(a, 1) \neq 2^n$
for any nonzero $a \in \mathbb{F}_{2^n}$. This implies that $C_{D(f)}$ has dimension $n$. Furthermore, it
follows from (32) that
\[
\mathrm{wt}(c_a) = 2^{n-2} - \frac{1}{4} \sum_{x \in \mathbb{F}_{2^n}} (-1)^{\mathrm{Tr}_1^n(ag(x)+x)}.
\]
From the Walsh spectrum of $g(x)$ in Table 1, the possible nonzero weights of the code
$C_{D(f)}$ can be directly determined.
Theorem 5 provides a partial resolution to the conjecture by Ding [13]. It appears
that new techniques are required to completely settle the conjecture and determine the
weight distribution of the code $C_{D(f)}$. With the help of Magma, we list some numerical
results in Table 2, which are in accordance with Theorem 5.

Table 2 Some numerical results

Values of $n$    Weight enumerator of $C_{D(f)}$
5                $1 + 6x^{10} + 15x^{8} + 10x^{6}$
7                $1 + 63x^{32} + 36x^{28} + 28x^{36}$
9                $1 + x^{144} + 108x^{120} + 285x^{128} + 108x^{136} + 9x^{112}$
11               $1 + 440x^{496} + 408x^{528} + 22x^{480} + 1155x^{512} + 22x^{544}$
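The first row of Table 2 can be reproduced with a short exhaustive computation. The sketch below assumes $m = 2$, $n = 5$ and the field representation $\mathbb{F}_2[x]/(x^5 + x^2 + 1)$ (the weight enumerator does not depend on this choice); it is a plain Python stand-in for the Magma computation, and all names are hypothetical.

```python
from collections import Counter

# Assumed parameters: m = 2, n = 5, GF(2^5) = F_2[x]/(x^5 + x^2 + 1).
M, N = 2, 5
Q = 1 << N
MOD = 0b100101
D = (1 << M) + 3  # Welch exponent


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= MOD
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def trace(x):
    """Absolute trace from GF(2^N) to GF(2), returned as 0 or 1."""
    acc = 0
    for _ in range(N):
        acc ^= x
        x = gf_mul(x, x)
    return acc


def f(x):
    """f(x) = F(x) + F(x + 1) + 1 with F(x) = x^d."""
    return gf_pow(x, D) ^ gf_pow(x ^ 1, D) ^ 1


# Defining set D(f): the distinct values of the two-to-one function f.
defining_set = sorted({f(x) for x in range(Q)})

# Weight of c_a = number of d in D(f) with Tr(a * d) = 1, for every a.
weights = Counter(
    sum(trace(gf_mul(a, d)) for d in defining_set) for a in range(Q)
)
for weight in sorted(weights):
    print(weight, weights[weight])
```

Each printed pair is a weight together with the number of codewords of that weight, which can be compared against the $n = 5$ row of the table.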

5 Conclusion
The contributions of this paper are twofold. First, we completely determined the
differential spectrum and the Walsh spectrum of the permutation polynomial $g(x)$
from the Welch APN power function $x^{2^m+3}$ over $\mathbb{F}_{2^{2m+1}}$. Second, we explored two
families of cyclic codes and two families of linear codes derived from the Welch APN
power function. For the two cyclic codes, their properties have been well studied, and
we presented efficient algebraic decoders for them; for the two linear codes, the weight
distribution of the first family can be easily obtained from the Walsh spectrum of
$g(x)$, and the weight spectrum of the second one was investigated, which partially
solved a conjecture by Ding in [13]. The Welch permutation $g(x)$ appears to have good
cryptographic properties, and some other cryptographic criteria may deserve further
investigation.

Acknowledgment
Y. Xia was supported in part by the National Natural Science Foundation of China
under Grant 62171479 and in part by the Fundamental Research Funds for the Central
Universities, South-Central University for Nationalities under Grant CZZ23004.
C. Li and T. Helleseth were supported by the Research Council of Norway under
Grants 247742 and 311646.

Declarations
Conflict of interest: The authors declared that they have no conflicts of interest in
connection with the work submitted.

References
[1] Biham, E., Shamir, A.: Differential cryptanalysis of DES-like cryptosystems.
Journal of Cryptology 4(1), 3–72 (1991)

[2] Matsui, M.: Linear cryptanalysis method for DES cipher. In: Helleseth, T. (ed.)
Advances in Cryptology – Eurocrypt’93. Lecture Notes in Computer Science, vol.
765, pp. 386–397. Springer, Berlin, Heidelberg (1994)

[3] Chabaud, F., Vaudenay, S.: Links between differential and linear cryptanalysis.
In: De Santis, A. (ed.) Advances in Cryptology – EUROCRYPT’94. Lecture Notes
in Computer Science, vol. 950, pp. 356–365. Springer, Berlin, Heidelberg (1995)

[4] Carlet, C.: Boolean Functions for Cryptography and Coding Theory. Cambridge
University Press, Cambridge, United Kingdom (2021)

[5] Budaghyan, L.: Construction and Analysis of Cryptographic Functions. Springer,
Switzerland (2014)

[6] Wu, C.-K., Feng, D.: Boolean Functions and Their Applications in Cryptography.
Springer, Berlin, Heidelberg (2016)

[7] Delsarte, P.: On subfield subcodes of modified Reed-Solomon codes (Corresp.).
IEEE Trans. Inf. Theory 21(5), 575–576 (1975)

[8] Carlet, C., Charpin, P., Zinoviev, V.: Codes, bent functions and permutations
suitable for DES-like cryptosystems. Des. Codes Cryptogr. 15(2), 125–156 (1998)

[9] Ding, C., Niederreiter, H.: Cyclotomic linear codes of order 3. IEEE Trans. Inf.
Theory 53(6), 2274–2277 (2007)

[10] Li, N., Mesnager, S.: Recent results and problems on constructions of linear codes
from cryptographic functions. Cryptogr. Commun. 12(5), 965–986 (2020)

[11] Carlet, C., Ding, C., Yuan, J.: Linear codes from perfect nonlinear mappings and
their secret sharing schemes. IEEE Trans. Inf. Theory 51(6), 2089–2102 (2005)

[12] Ding, C., Wang, X.: A coding theory construction of new systematic authentica-
tion codes. Theor. Comput. Sci. 330(1), 81–99 (2005)

[13] Ding, C.: A construction of binary linear codes from Boolean functions. Discrete
Math. 339(9), 2288–2303 (2016)

[14] Ding, C., Li, C., Li, N., Zhou, Z.: Three-weight cyclic codes and their weight
distributions. Discrete Math. 339(2), 415–427 (2016)

[15] Mesnager, S., Qu, L.: On two-to-one mappings over finite fields. IEEE Trans. Inf.
Theory 65(12), 7884–7895 (2019)

[16] Cohen, G., Karpovsky, M., Mattson, H., Schatz, J.: Covering radius—survey and
recent results. IEEE Trans. Inf. Theory 31(3), 328–343 (1985)

[17] Charpin, P., Helleseth, T., Zinoviev, V.A.: The coset distribution of triple-error-
correcting binary primitive BCH codes. IEEE Trans. Inf. Theory 52(4), 1727–1732
(2006)

[18] Berlekamp, E., McEliece, R., Van Tilborg, H.: On the inherent intractability
of certain coding problems (Corresp.). IEEE Trans. Inf. Theory 24(3), 384–386
(1978)

[19] Berlekamp, E.: Algebraic Coding Theory (Revised Edition). World Scientific
Publishing, Singapore (2015)

[20] Augot, D., Bardet, M., Faugere, J.-C.: On the decoding of binary cyclic codes
with the Newton identities. J. Symb. Comput. 44(12), 1608–1625 (2009)

[21] Lin, T.-C., Lee, C.-D., Chen, Y.-H., Truong, T.-K.: Algebraic decoding of cyclic
codes without error-locator polynomials. IEEE Trans. Commun. 64(7), 2719–2731
(2016)

[22] Caruso, F., Orsini, E., Sala, M., Tinnirello, C.: On the shape of the general error
locator polynomial for cyclic codes. IEEE Trans. Inf. Theory 63(6), 3641–3657
(2017)

[23] Dobbertin, H.: Almost perfect nonlinear power functions on GF($2^n$): the Welch
case. IEEE Trans. Inf. Theory 45(4), 1271–1275 (1999)

[24] Nyberg, K.: Differentially uniform mappings for cryptography. In: Helleseth, T.
(ed.) Advances in Cryptology – Eurocrypt’93. Lecture Notes in Computer Science,
vol. 765, pp. 55–64. Springer, Berlin, Heidelberg (1994)

[25] Blondeau, C., Canteaut, A., Charpin, P.: Differential properties of power func-
tions. Int. J. Information and Coding Theory 1(2), 149–170 (2010)

[26] Blondeau, C., Canteaut, A., Charpin, P.: Differential properties of $x \mapsto x^{2^t-1}$.
IEEE Trans. Inf. Theory 57(12), 8127–8137 (2011)

[27] Bracken, C., Leander, G.: A highly nonlinear differentially 4-uniform power map-
ping that permutes fields of even degree. Finite Fields Appl. 16(4), 231–242
(2010)

[28] Charpin, P., Kyureghyan, G.M., Sunder, V.: Sparse permutations with low
differential uniformity. Finite Fields Appl. 28, 214–243 (2014)

[29] Helleseth, T., Kumar, P.V.: Sequences with low correlation. In: Pless, V.S.,
Huffman, W.C. (eds.) Handbook of Coding Theory vol. II, pp. 1765–1853.
North-Holland, Amsterdam (1998)

[30] MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error-Correcting Codes. North-
Holland, Amsterdam (1977)

[31] Ding, C.: Linear codes from some 2-designs. IEEE Trans. Inf. Theory 61(6),
3265–3275 (2015)

[32] Ding, K., Ding, C.: A class of two-weight and three-weight codes and their
applications in secret sharing. IEEE Trans. Inf. Theory 61(11), 5835–5842 (2015)

[33] Zhou, Z., Li, N., Fan, C., Helleseth, T.: Linear codes with two or three weights
from quadratic bent functions. Des. Codes Cryptogr. 81(2), 283–295 (2016)

[34] Mesnager, S.: Linear codes with few weights from weakly regular bent functions
based on a generic construction. Cryptogr. Commun. 9(1), 71–84 (2017)

[35] Chen, C.L.: Formulas for the solutions of quadratic equations over GF($2^m$). IEEE
Trans. Inf. Theory 28(5), 792–794 (1982)

[36] Bracken, C., Byrne, E., Markin, N., McGuire, G.: Determining the nonlinearity of
a new family of APN functions. In: Boztas, S., Lu, H.-F. (eds.) Applied Algebra,
Algebraic Algorithms and Error-Correcting Codes. AAECC 2007. Lecture Notes
in Computer Science, vol. 765, pp. 72–79. Springer, Berlin, Heidelberg (2007)

[37] Canteaut, A., Charpin, P., Dobbertin, H.: Binary m-sequences with three-valued
crosscorrelation: a proof of Welch’s conjecture. IEEE Trans. Inf. Theory 46(1),
4–8 (2000)

[38] Giménez, N., Matera, G., Pérez, M., Privitelli, M.: Average-case complexity of
the Euclidean algorithm with a fixed polynomial over a finite field. Comb. Probab.
Comput. 31(1), 166–183 (2022)

[39] Bracken, C., Helleseth, T.: Triple-error-correcting BCH-like codes. 2009 IEEE
International Symposium on Information Theory, Seoul, Korea (South), 28 June
– 03 July 2009 (2009)

[40] Luo, J.: On binary cyclic codes with five nonzero weights. Preprint at
https://doi.org/10.48550/arXiv.0904.2237 (2009)

[41] Schmidt, K.-U.: On cosets of the generalized first-order Reed–Muller code with
low OFDM. IEEE Trans. Inf. Theory 52(7), 3220–3232 (2006)

[42] Schmidt, K.-U.: Complementary sets, generalized Reed–Muller codes, and power
control for OFDM. IEEE Trans. Inf. Theory 53(2), 808–814 (2007)

[43] Moreno, O., Castro, F.N.: Divisibility properties for covering radius of certain
cyclic codes. IEEE Trans. Inf. Theory 49(12), 3299–3303 (2003)

[44] Williams, K.S.: Note on cubics over GF($2^n$) and GF($3^n$). Journal of Number
Theory 7(4), 361–365 (1975)

[45] Berlekamp, E.R., Rumsey, H., Solomon, G.: On the solution of algebraic equations
over finite fields. Information and Control 10(6), 553–564 (1967)
