
RANDOM MATRICES OVER F2 IN CRYPTOGRAPHY

FALKUSH

Abstract. We describe two cryptographic hash functions, a cryptographic pseudorandom number
generator, two block ciphers and a Diffie-Hellman key exchange, all based on random matrices over F2.
The simplicity of the schemes presented and their highly parallelizable structure suggest good
performance. The security of the schemes is based on the avalanche effect exhibited by the matrices
and their chaotic behavior when non-linearity is introduced.

1. Introduction
Let Fq be the finite field with q elements. In particular, F2 ≅ Z/2Z. Surveys of the theory of random
matrices over Fq can be found in [Ful00] and [LMN19]; another paper studying random matrices over Fq
is [GL21]. We will focus on the special case q = 2 in this paper and its applications in cryptography.

The use of random matrices over F2 for hashing purposes is presented in the lecture notes [Blu11].
The hash functions described there are non-cryptographic, since any hash can be reversed quickly
using Gaussian elimination. Random matrices modulo 26 were also used in the Hill cipher [Hil29].
For pseudorandom number generation, linear-feedback shift registers use a special kind of random
matrix over F2 [ABM12].

Let A be an n × n invertible matrix over F2 and let V = F2^n. We require A to be invertible in order
to have Im(A) = V. We say that A is primitive if and only if

{A^i v : i = 0, 1, ..., 2^n − 2} = V − {~0}

for any v ∈ V − {~0}, where ~0 is the zero vector. In other words, the matrix A acting repeatedly on
a non-zero vector cycles through all the non-zero vectors of V.
Theorem 1.1. Let I be the identity matrix. For n ∈ Z such that 2^n − 1 is a Mersenne prime, the n × n
matrix A ≠ I is primitive if and only if

A^(2^n − 1) = I.

Proof. Let u_i be the vector with a one in the ith position and zeros everywhere else. If A is
primitive, then A^(2^n − 1) u_i = u_i for all i, which implies A^(2^n − 1) = I.

In the other direction, we notice that the order of A is the least common multiple of the lengths of
all its cycles, where a cycle of A is a subset W ⊂ V such that

{A^i w : i = 0, 1, ..., 2^n − 2} = W

for any w ∈ W. If A^(2^n − 1) = I, the order of A divides the prime 2^n − 1, and since A ≠ I it
equals 2^n − 1. Some cycle must then have length 2^n − 1, and a cycle of that length exhausts the
2^n − 1 non-zero vectors, so A is primitive. □
Because of this theorem, we will focus on values of n such that 2^n − 1 is a Mersenne prime.
Exponentiation by squaring makes it possible to compute such large powers of a matrix. According to
numerical computations, the probability that a random invertible matrix is primitive seems to be
about 1/2n. It took about three days to find a primitive matrix with n = 521 on an i5-6600K
processor.
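As an illustration, the primitivity criterion of Theorem 1.1 can be checked with exponentiation by squaring. Below is a minimal Python sketch (all names are our own; rows are stored as n-bit integers, with bit j of rows[i] being the entry (i, j)):

```python
# Sketch: test primitivity via Theorem 1.1 (valid when 2^n - 1 is a
# Mersenne prime). A != I is primitive iff A^(2^n - 1) = I.

def identity(n):
    # Row i has a single 1 in column i.
    return [1 << i for i in range(n)]

def mat_mul(a, b, n):
    # Row i of a*b is the XOR of the rows of b selected by row i of a.
    out = []
    for row in a:
        acc = 0
        for j in range(n):
            if (row >> j) & 1:
                acc ^= b[j]
        out.append(acc)
    return out

def mat_pow(a, e, n):
    # Exponentiation by squaring over F2.
    result = identity(n)
    while e:
        if e & 1:
            result = mat_mul(result, a, n)
        a = mat_mul(a, a, n)
        e >>= 1
    return result

def is_primitive(a, n):
    return a != identity(n) and mat_pow(a, (1 << n) - 1, n) == identity(n)
```

For instance, with n = 2 the companion matrix of x^2 + x + 1, with rows 0b10 and 0b11, is primitive.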

We present the following method to produce an invertible matrix. Starting from the identity matrix,
we XOR the first row into each of the other rows independently with probability 50%, and we repeat
this procedure for each row in turn. Since only about 30% of random binary matrices are invertible
[ja11], a simple counting argument shows that this

Date: June 29, 2021.


procedure cannot produce all invertible matrices. We can apply the method twice to address this
problem. Alternatively, we can generate uniformly random matrices and keep the first invertible one.
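The rejection-sampling variant can be sketched as follows, with invertibility tested by Gaussian elimination over F2 (rows are bitmask integers; the names are illustrative):

```python
import random

def is_invertible(rows, n):
    # Gaussian elimination over F2 on rows stored as n-bit integers.
    rows = list(rows)
    for col in range(n):
        pivot = next((r for r in range(col, n) if (rows[r] >> col) & 1), None)
        if pivot is None:
            return False          # no pivot: this column is dependent
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and (rows[r] >> col) & 1:
                rows[r] ^= rows[col]
    return True

def random_invertible(n, rng=random):
    # About 30% of uniform matrices are invertible, so a few tries suffice.
    while True:
        m = [rng.getrandbits(n) for _ in range(n)]
        if is_invertible(m, n):
            return m
```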

2. The Linearity Problem


Some precautions must be taken regarding the linearity of a matrix action and of the XOR operator,
denoted ⊕. For example, finding the vector v given the matrix A and the vector w in the equation
Av ⊕ v = w is very fast: we factor Av ⊕ v = (A ⊕ I)v, where I is the identity matrix, and use
Gaussian elimination, or invert the matrix A ⊕ I when it is invertible, to recover v.

To stay away from this linearity, we use addition modulo 2^n instead, denoted +. However, it is
still possible for + to act very similarly to ⊕. For two arbitrary vectors a = (a_n, ..., a_1) and
b = (b_n, ..., b_1), the bits of a + b and a ⊕ b start to differ at a position i where a_i = b_i = 1,
and they keep differing until the carry reaches a position j > i such that a_j = b_j = 0.
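The carry behavior described above can be seen directly on small examples (the specific values are illustrative):

```python
# Demonstration: a + b (mod 2^n) and a XOR b first differ at a position
# where both inputs are 1 (a carry starts), and they agree again once
# the carry is absorbed at the next position where both inputs are 0.
n = 8
a, b = 0b00010100, 0b00000100     # both have bit 2 set: a carry starts there

add = (a + b) % (1 << n)          # 0b00011000
xor = a ^ b                       # 0b00010000
diff = add ^ xor                  # 0b00001000 -> only bit 3, where the carry landed

# When a and b share no 1 bits, no carry is generated and + equals XOR.
c, d = 0b10100000, 0b00000101
assert (c + d) % (1 << n) == c ^ d
```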

If one of the vectors has many zeros, a + b and a ⊕ b have a higher chance of being very close
bitwise. This is problematic because, for example, knowing A and w, we can try to recover v in the
equation Av + v = w by trying values w′ close to w bitwise in w′ = Av′ ⊕ v′ until we find v′ = v.

If the addition occurs inside a matrix action, for example when finding v given A, b, and w in the
equation A(v + b) ⊕ v = w, we have a similar problem. If we treat + as ⊕, it is easy to solve
(A ⊕ I)v = w ⊕ Ab. The difference here is that we have to try values w′ equal to w XOR'd with some
columns of the matrix A.

3. Two Cryptographic Hash Functions


We describe two cryptographic hash functions in this section.
Hash function 1. Let A0 and A1 be two n × n primitive matrices. We initialize h0 ∈ V to the number
of bytes of the message. Let m_i be the value of the ith bit of the message. Then, for each bit of
the message, we compute

h_{i+1} = A_{m_i} h_i.

The last state of this procedure is the hash of the message. The idea behind this hash function is
that each matrix, acting on a vector, should send it to an apparently random position in the cycle
generated by the other matrix. We initialize h0 with the length of the message to counter the length
extension attack. It also provides protection against the possibility that some product of the two
matrices has several very short cycles: repeating such a product to exploit this fact changes the
length of the message and therefore modifies every previous state. This algorithm is extremely slow
on a CPU, so we propose another, faster algorithm.
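A toy sketch of hash function 1 follows (state size n = 8 for readability, far too small for security; the matrices used in practice must be primitive, and we process the bits of each byte least-significant first, a choice the text leaves open):

```python
# Toy sketch of hash function 1. Rows are 8-bit integers; bit j of
# rows[i] is the matrix entry (i, j).

def mat_vec(rows, v):
    # Bit i of the output is the parity of (row i AND v).
    out = 0
    for i, row in enumerate(rows):
        out |= (bin(row & v).count("1") & 1) << i
    return out

def hash1(message: bytes, a0, a1, n=8):
    h = len(message) % (1 << n)      # h0 = message length
    for byte in message:
        for k in range(8):           # one matrix action per message bit
            bit = (byte >> k) & 1
            h = mat_vec(a1 if bit else a0, h)
    return h
```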

Hash function 2. Let A and B be two n × n primitive matrices. We initialize h0 ∈ V to the number of
bytes of the message. We pad the message with zeros so that its length in bits is divisible by n.
Let m_i be the ith block of n bits of the message. Then, for each block, we compute

h_{i+1} = A(m_i + h_i) + B(m_i + ¬h_i) + m_i

where ¬ is the bitwise NOT operator and the addition is modulo 2^n.

The purpose of the NOT operator is to counter the possibility that a quantity has many zeros, since
the complement then has many ones. In general, we expect a difference of at least 128 bits between
a + b and a ⊕ b. Using four additions seems strong enough to prevent exploiting the linearity
problem to gain an advantage over brute force in a preimage attack.

For collision attacks, we add m_i at the end to prevent a possible way to halve the expected time of
the birthday attack. Without this term, choosing m_i = h_i for every block gives

h_{i+1} = A(2h_i) + B(1, 1, 1, ..., 1)

since h_i + ¬h_i = (1, 1, 1, ..., 1), where ¬ denotes bitwise NOT. The last bit of 2h_i is always
zero, so we lose half of the image. We remark that the two matrix actions can be computed in
parallel.
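A toy sketch of hash function 2's update rule, with the NOT applied to h_i inside the second matrix action (again n = 8 and, in the test, non-primitive matrices, purely for illustration):

```python
# Toy sketch of hash function 2:
#   h_{i+1} = A(m_i + h_i) + B(m_i + NOT h_i) + m_i  (mod 2^n)
MASK = 0xFF                          # 2^n - 1 for n = 8

def mat_vec(rows, v):
    out = 0
    for i, row in enumerate(rows):
        out |= (bin(row & v).count("1") & 1) << i
    return out

def hash2(message: bytes, a, b):
    h = len(message) & MASK          # h0 = message length in bytes
    for m in message:                # with n = 8, each byte is one block
        t = (m + h) & MASK           # m_i + h_i       (mod 2^n)
        u = (m + (~h & MASK)) & MASK # m_i + NOT(h_i)  (mod 2^n)
        h = (mat_vec(a, t) + mat_vec(b, u) + m) & MASK
    return h
```

The two mat_vec calls are independent and could run in parallel, as remarked above.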
4. Cryptographic Pseudorandom Number Generator
Linear-feedback shift registers already use a special kind of random matrix over F2 to generate
pseudorandom numbers. However, only the first bit of the new state vector is fresh; the other bits
are the previous state shifted by one position. A general matrix introduces an avalanche effect: the
new state has no apparent relation to the previous one, and changing a single bit of the input
completely changes the output of the matrix action.

The first obvious way to produce a pseudorandom sequence is to take a primitive n × n matrix A and
look at the sequence

s_0 = (seed)
s_i = A s_{i−1}

which has period 2^n − 1. A major problem with this procedure is that each state we go through never
comes back until we have gone through the whole cycle. Even worse, with the knowledge of at least n
states, we can recover the matrix A and the seed if we can construct an invertible matrix using the
states as columns.

We propose the following scheme:

k_0 = (seed1)
s_0 = (seed2)
k_i = A k_{i−1}
s_i = A s_{i−1} + s_{i−1} + A k_{i−1} + k_{i−1} + A(s_{i−1} + ¬k_{i−1})

where s_i is the pseudorandom sequence, the addition is modulo 2^n, and ¬ is the bitwise NOT
operator. The first state s_1 should be discarded.
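The generator can be sketched as follows (n = 8 and a fixed arbitrary matrix, purely for illustration; s_1 is discarded as prescribed):

```python
# Toy sketch of the proposed generator. Rows are 8-bit integers.
MASK = 0xFF

def mat_vec(rows, v):
    out = 0
    for i, row in enumerate(rows):
        out |= (bin(row & v).count("1") & 1) << i
    return out

def generator(a, seed1, seed2):
    k, s = seed1 & MASK, seed2 & MASK
    first = True
    while True:
        nk = mat_vec(a, k)                        # k_i = A k_{i-1}
        # s_i = A s + s + A k + k + A(s + NOT k)  (all mod 2^n)
        s = (mat_vec(a, s) + s + nk + k
             + mat_vec(a, (s + (~k & MASK)) & MASK)) & MASK
        k = nk
        if first:
            first = False                         # discard s_1
            continue
        yield s
```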

Assuming the matrix A is public, the scheme passes the next-bit test [Yao82] as long as the attacker
cannot isolate a single k_i. Given the first few blocks, an attacker can isolate the first few
values of A k_{i−1} + k_{i−1} + A(s_{i−1} + ¬k_{i−1}). For the same reasons given in the section on
hash functions, exploiting the linearity problem should not offer better odds than brute force when
trying to recover k_{i−1}.

The generator should also resist state compromise extensions. Given the values k_i and s_i at some
point, we can reconstruct every k_j and even find seed1, but recovering s_{i−1} would imply being
able to isolate it from A s_{i−1} + s_{i−1} + A(s_{i−1} + ¬k_{i−1}). Unlike the next-bit test, where
solving for one k_i gives away the whole sequence, here all preceding blocks have to be solved one
by one.

The expected period of this generator, for any given pair of seeds, should be O(2^(n + n/2)) blocks
by the design of k_i and because A is primitive.

5. Two Block Ciphers


Let A be an invertible n × n matrix over F2. The simplest algorithm would use A as the secret key
and replace every block m_i by A m_i; to decrypt, we apply A^(−1) to every block. The first problem
is that blocks repeated within a message can be detected. The second problem is that a
known-plaintext attack can recover the matrix given around n blocks.
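The known-plaintext attack on this naive scheme can be sketched concretely: given n plaintext blocks m_t whose columns form an invertible matrix M, and their encryptions c_t = A m_t forming the matrix C, we recover A = C M^(−1) (toy n = 4; all helper names are illustrative):

```python
# Sketch of the known-plaintext attack: A = C * M^{-1}, where M and C
# have the plaintext and ciphertext blocks as columns. Rows are bitmask
# integers with bit j of rows[i] the entry (i, j).

def mat_vec(rows, v):
    return sum((bin(r & v).count("1") & 1) << i for i, r in enumerate(rows))

def mat_mul(a, b, n):
    out = []
    for row in a:
        acc = 0
        for j in range(n):
            if (row >> j) & 1:
                acc ^= b[j]
        out.append(acc)
    return out

def mat_inv(rows, n):
    # Gauss-Jordan inversion over F2 (assumes the matrix is invertible).
    a, inv = list(rows), [1 << i for i in range(n)]
    for col in range(n):
        p = next(r for r in range(col, n) if (a[r] >> col) & 1)
        a[col], a[p], inv[col], inv[p] = a[p], a[col], inv[p], inv[col]
        for r in range(n):
            if r != col and (a[r] >> col) & 1:
                a[r] ^= a[col]
                inv[r] ^= inv[col]
    return inv

def columns_to_matrix(cols, n):
    # Build the matrix whose t-th column is cols[t].
    return [sum(((c >> i) & 1) << t for t, c in enumerate(cols)) for i in range(n)]

def recover_key(plaintexts, ciphertexts, n):
    m = columns_to_matrix(plaintexts, n)
    c = columns_to_matrix(ciphertexts, n)
    return mat_mul(c, mat_inv(m, n), n)
```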

Block Cipher 1. One way to solve these problems is to include random bits at the end of each block.
For n = 2048, we can break the message into blocks of length 1920 bits and concatenate 128 random
bits at the end of each block. This increases the file size by ≈ 7%. It might also be a good idea to
use 96 random bits instead and use the last 32 bits for a checksum of the block (including the
random bits). This way, if a few bits in an encrypted block become corrupted, it would be possible
to recover the original block by brute force. We note that the matrix A does not have to be
primitive for this method. If the plaintext is known, even with the knowledge of the first 1920
columns of the matrix A it would be difficult
for an attacker to recover the columns acting on the random bits. The only information they would
get is a collection of vectors in the image of the rectangular matrix, and reconstructing a matrix
from its image seems to be a hard problem.

Block Cipher 2. An obvious cipher would use the cryptographic pseudorandom number generator from the
last section and keep the seeds as the private keys. Then, we apply m_i ↦ s_i ⊕ m_i on each block of
the message. We propose another cipher that should offer better performance while increasing the
file size by only 2n bits. If the matrix A is the key and kept private, we can simplify the
pseudorandom number generator by redefining s_i to

s_i = A s_{i−1} + k_i.

This scheme still passes the next-bit test, but if the matrix A is revealed, all previous states can
be recovered. It is also important to discard the first state, since it may reveal a pair (Av, v)
that could be used to recover A. The seeds can be public and placed at the beginning of the file.
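A toy sketch of Block Cipher 2 (n = 8 purely for illustration; the matrix a is the private key, the seeds are public, and the first keystream state is computed and discarded):

```python
# Toy sketch of Block Cipher 2: keystream s_i = A s_{i-1} + k_i with
# k_i = A k_{i-1}; blocks are encrypted as m_i XOR s_i.
MASK = 0xFF

def mat_vec(rows, v):
    out = 0
    for i, row in enumerate(rows):
        out |= (bin(row & v).count("1") & 1) << i
    return out

def keystream(a, seed1, seed2):
    k = mat_vec(a, seed1 & MASK)                # k_1
    s = (mat_vec(a, seed2 & MASK) + k) & MASK   # s_1: discarded
    while True:
        k = mat_vec(a, k)
        s = (mat_vec(a, s) + k) & MASK
        yield s

def crypt(a, seed1, seed2, blocks):
    # XOR each block with the keystream; the same function en/decrypts.
    ks = keystream(a, seed1, seed2)
    return bytes(b ^ next(ks) for b in blocks)
```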

6. Diffie-Hellman key exchange


The Diffie-Hellman key exchange [Mer78] can be carried out in the random matrix setting. Given a
primitive matrix A, we choose an integer m between 1 and 2^n − 1 as the private key. The public key
is A^m. The shared secret is a power of A that can be used as the secret matrix in the second block
cipher of the last section.

The authors behind Logjam [ABD+15] recommend using 2048-bit keys. Finding a primitive matrix with
n = 2203 would require a supercomputer, but once it is found, it can be made public and reused for
the key exchange.
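A toy run of the exchange, using the 5 × 5 companion matrix of the primitive polynomial x^5 + x^2 + 1, so that 2^5 − 1 = 31 is prime (real parameters would use n = 2203; the exponents and names are illustrative):

```python
# Toy Diffie-Hellman over matrix powers. Rows are bitmask integers.

def identity(n):
    return [1 << i for i in range(n)]

def mat_mul(a, b, n):
    out = []
    for row in a:
        acc = 0
        for j in range(n):
            if (row >> j) & 1:
                acc ^= b[j]
        out.append(acc)
    return out

def mat_pow(a, e, n):
    # Exponentiation by squaring.
    result = identity(n)
    while e:
        if e & 1:
            result = mat_mul(result, a, n)
        a = mat_mul(a, a, n)
        e >>= 1
    return result

A = [16, 1, 18, 4, 8]         # companion matrix of x^5 + x^2 + 1
alice, bob = 11, 23           # private exponents in [1, 2^5 - 1]
pub_a = mat_pow(A, alice, 5)  # public keys
pub_b = mat_pow(A, bob, 5)
shared_a = mat_pow(pub_b, alice, 5)
shared_b = mat_pow(pub_a, bob, 5)
```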

7. Implementations
Using about n^2/2 XOR gates, we can construct a circuit that computes the action of a given matrix
on any vector. This should provide good performance for the hash functions and the pseudorandom
number generator. The other protocols proposed in this paper should be implemented on GPUs, since
many operations can be parallelized: when computing a matrix action, each row of the matrix can be
processed in parallel, and even a single row can be divided and processed in parallel.
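The row-parallel structure can also be exploited in software by bit-slicing: the same matrix is applied to many vectors at once using plain word XORs, mirroring the XOR-gate circuit above (the packing layout and names are our own):

```python
# Bit-sliced sketch: slices[j] packs bit j of all input vectors into one
# integer, so one XOR per set matrix bit processes every vector at once.

def pack(vectors, n):
    return [sum(((v >> j) & 1) << t for t, v in enumerate(vectors)) for j in range(n)]

def unpack(slices, count, n):
    return [sum(((slices[j] >> t) & 1) << j for j in range(n)) for t in range(count)]

def mat_apply_sliced(rows, slices, n):
    # out[i] = XOR of slices[j] over all j with A[i][j] = 1; each i is
    # independent (on a GPU, one thread per row).
    out = []
    for i in range(n):
        acc = 0
        for j in range(n):
            if (rows[i] >> j) & 1:
                acc ^= slices[j]
        out.append(acc)
    return out

def mat_vec(rows, v):
    # Reference single-vector matrix action, for comparison.
    return sum((bin(r & v).count("1") & 1) << i for i, r in enumerate(rows))
```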

A Java implementation (using the CPU) of the two hash functions can be downloaded here:
https://bit.ly/3drHhtS. Applied to the zip file itself, they return

hash2: 979aafd82026fada06b4224acfb852c1f465333c9b31a05af7c2098f5d3c123b36a54575738a2bb816e2727a294fdf3fd65142e3ff2b2278c8281333557d7c4b
hash1: 0b254180052594ddeeddb55a5efa2208cc3fa385a41f34e7eb9a5a6ba1056518d5c643937647581a8b6e4a89a2604f1f3d1085d1c5960412e1851fee00f57f2d

8. Acknowledgments
A huge thanks to Prof. Glenn G. Chappell for his various corrections, suggestions and improvements.
Another huge thanks to the people who contributed to the discussion in the Reddit threads. A final
huge thanks to /u/lucy tatterhood for the idea to initialize h0 with the length of the message.

References
[ABD+15] David Adrian, Karthikeyan Bhargavan, Zakir Durumeric, Pierrick Gaudry, Matthew Green,
J. Alex Halderman, Nadia Heninger, Drew Springall, Emmanuel Thomé, Luke Valenta, et al. Imperfect
forward secrecy: How Diffie-Hellman fails in practice. In Proceedings of the 22nd ACM SIGSAC
Conference on Computer and Communications Security, pages 5–17, 2015.
[ABM12] Shadab Alam, Mohammad Bokhari, and Faheem Masoodi. An analysis of linear feedback shift
registers in stream ciphers. International Journal of Computer Applications, 46:46–49, June 2012.
[Blu11] Avrim Blum. Lecture 10: Universal and Perfect Hashing. 2011.
https://www.cs.cmu.edu/~avrim/451f11/lectures/lect1004.pdf.
[Ful00] Jason Fulman. Random matrix theory over finite fields: a survey. arXiv Mathematics e-prints, page math/0003195,
March 2000.
[GL21] Heide Gluesing-Luerssen and Hunter Lehmann. Automorphism Groups and Isometries for Cyclic Orbit Codes. arXiv
e-prints, page arXiv:2101.09548, January 2021.
[Hil29] Lester S. Hill. Cryptography in an algebraic alphabet. The American Mathematical Monthly, 36(6):306–312, 1929.
[ja11] joriki's answer. Math Stack Exchange: Probability that a random binary matrix is invertible?
2011. https://math.stackexchange.com/questions/54246/probability-that-a-random-binary-matrix-is-invertible.
[LMN19] Kyle Luh, Sean Meehan, and Hoi H. Nguyen. Some new results in random matrices over finite fields. arXiv e-prints,
page arXiv:1907.02575, July 2019.
[Mer78] Ralph C. Merkle. Secure communications over insecure channels. Commun. ACM, 21(4):294–299, April 1978.
[Yao82] Andrew C. Yao. Theory and application of trapdoor functions. In 23rd Annual Symposium on Foundations of
Computer Science (sfcs 1982), pages 80–91, 1982.
Email address: [email protected]
