1 Introduction
scope of this work. A statistically ideal PRNG is one that passes the theoretical next-bit test [9, p. 171].
The research is inspired by Abadi and Andersen’s work on neural network
learning of encryption schemes [1], conjecturing that a neural network can rep-
resent a good pseudo-random generator function, and that discovering such a
function by stochastic gradient descent is tractable. Motivation is also drawn
from the needs of security: a hypothetical neural-network-based PRNG has sev-
eral potentially desirable properties. These include the ability to perform ad-hoc
modifications to the generator by means of further training, which could consti-
tute the basis of strategies for dealing with the kind of non-statistical attacks
described by Kelsey et al. in [7].
Related Work Few attempts have been made to produce pseudo-random number sequences with neural networks [2,3,5,11]. The most successful approaches
have been presented by Tirdad and Sadeghian [11], and by Jeong et al. [5]. The
former employed Hopfield neural networks adapted so as to prevent convergence
and encourage chaotic behavior, while the latter used an LSTM trained on a
sample of random data to obtain indices into the digits of π. Both papers reported strong performance in statistical randomness tests. However, neither
scheme sought to train an “end-to-end” neural network PRNG, instead using
the networks as components of more complex algorithms.
We undertake the task differently, by applying a deep learning method known
as generative adversarial networks [4] to train an end-to-end neural PRNG which
outputs pseudo-random sequences directly. We present two conceptually simple
architectures, and evaluate their strength as PRNGs using the NIST test suite
[10].
Fig. 1. Conceptual view of a PRNG (left) and our neural implementation (right).
$G(s, o_1) : B^2 \rightarrow B^8. \quad (5)$
Fig. 3. Architecture of the generator: FCFF layers with leaky ReLU and mod activations.
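For concreteness, the following is a minimal Keras sketch of such an FCFF generator, matching the signature of Eq. (5). The hidden layer widths and the exact form of the mod activation are illustrative assumptions, not values taken from the paper.

import tensorflow as tf
from tensorflow.keras import layers, Model

# Sketch of the FCFF generator G(s, o_1): B^2 -> B^8 of Eq. (5) and Fig. 3.
# Hidden widths and the modulus of the "mod" activation are assumptions.
OUTPUT_RANGE = 2 ** 16  # assumed: outputs constrained to [0, 2^16 - 1]

def mod_activation(x):
    # Fold activations back into [0, OUTPUT_RANGE) with a floating-point modulo.
    return tf.math.floormod(x, OUTPUT_RANGE)

def build_generator(hidden_units=30, hidden_layers=2):
    inputs = layers.Input(shape=(2,))            # (seed s, offset o_1)
    x = inputs
    for _ in range(hidden_layers):
        x = layers.Dense(hidden_units)(x)
        x = layers.LeakyReLU()(x)
    x = layers.Dense(8)(x)                       # eight output values
    outputs = layers.Lambda(mod_activation)(x)   # "mod" activation
    return Model(inputs, outputs, name="generator")

generator = build_generator()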
labels. The discriminator outputs a scalar p(true) in the range [0, 1], representing the probability that the input sequence is a truly random sequence rather than one produced by the generator.
The discriminator consists of four stacked convolutional layers, each with 4
filters, kernel size 2, and stride 1, followed by a max pooling layer and two FCFF
layers with 4 and 1 units, respectively. The stack of convolutional layers allows the network to discover complex patterns in the input.
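A minimal Keras sketch of this discriminator follows. The filter count, kernel size, stride, and the 4-unit and 1-unit FCFF head are taken from the description above; the input length (eight values, matching Eq. (5)), the padding mode, and the intermediate activations are assumptions.

from tensorflow.keras import layers, Model

def build_discriminator(seq_len=8):
    inputs = layers.Input(shape=(seq_len, 1))
    x = inputs
    for _ in range(4):                            # four stacked convolutional layers
        x = layers.Conv1D(filters=4, kernel_size=2, strides=1, padding="same")(x)
        x = layers.LeakyReLU()(x)
    x = layers.MaxPooling1D(pool_size=2)(x)       # max pooling layer
    x = layers.Flatten()(x)
    x = layers.Dense(4)(x)                        # FCFF layer with 4 units
    x = layers.LeakyReLU()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # scalar p(true) in [0, 1]
    return Model(inputs, outputs, name="discriminator")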
$P(r_{\mathrm{split}}) : B^7 \rightarrow B \quad (7)$
where $r_{\mathrm{split}}$ is the generator's output vector with the last element removed. The last element is used as the corresponding label for the predictor's input.
Apart from the input size and meaning of the output, the discriminator and the
predictor share the same architecture.
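As an illustration of Eq. (7), the sketch below splits a batch of generator outputs into the predictor's input $r_{\mathrm{split}}$ and the corresponding labels; the function name is ours.

import tensorflow as tf

def split_for_predictor(generated_batch):
    # generated_batch: tensor of shape (batch_size, 8) produced by the generator.
    r_split = generated_batch[:, :-1]   # first 7 values: input to the predictor
    labels = generated_batch[:, -1:]    # last value: label the predictor must guess
    return r_split, labels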
Loss Functions and Optimizer We use standard loss functions. In the discriminative case, the generator and the discriminator both use a least squares loss. In the predictive case, the generator and the predictor both use an absolute difference loss. We use the popular Adam stochastic gradient descent optimizer [8].
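In TensorFlow terms, these choices correspond to the sketch below. How each generator's adversarial objective is formed from the adversary's loss (e.g. by negation) is not spelled out in this excerpt and is omitted here; the learning rate anticipates the value reported in Section 3.

import tensorflow as tf

def least_squares_loss(y_true, y_pred):          # discriminative case
    return tf.reduce_mean(tf.square(y_true - y_pred))

def absolute_difference_loss(y_true, y_pred):    # predictive case
    return tf.reduce_mean(tf.abs(y_true - y_pred))

# Adam optimizer; the 0.02 learning rate is the one reported in Section 3.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.02)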
3 Experiments
We measure the extent to which training the GANs improves the randomness
properties of the generators by analyzing large quantities of outputs, produced
for a single seed, using the NIST statistical test suite both before and after train-
ing.
Training parameters In each experiment we train the GAN for 200,000 epochs
over mini-batches of 2,048 samples, with the generator performing one gradient
update per mini-batch and the adversary performing three. We set the learning
rate of the networks to 0.02. The generator outputs floating-point numbers constrained to the range $[0, 2^{16} - 1]$, which are rounded to the nearest 16-bit integer
for evaluation. The evaluation dataset consists of 400 mini-batches of 2,048 in-
put vectors each, for a total of 819,200 input samples. The generator outputs 8
floating-point numbers for each input, each yielding 16 bits for the full output
sequence. In total, each evaluation output thus consists of 104,857,600 bits, pro-
duced from a single random seed. Larger outputs were not produced due to disk
quotas on the cluster used to run the models.
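The conversion from generator outputs to the evaluation bit sequence can be sketched as follows; the rounding and the 16-bit expansion follow the description above, while the function name and the MSB-first bit order are assumptions.

import numpy as np

def outputs_to_bits(float_outputs):
    # float_outputs: array of shape (n_samples, 8), values in [0, 2^16 - 1].
    ints = np.rint(float_outputs).astype(np.uint32)        # nearest 16-bit integer
    bits = (ints[..., None] >> np.arange(15, -1, -1)) & 1  # 16 bits per value, MSB first
    return bits.astype(np.uint8).ravel()                   # 128 bits per input sample

# One mini-batch of 2,048 samples yields 2,048 * 8 * 16 = 262,144 bits;
# 400 such mini-batches give the 104,857,600 evaluation bits quoted above.
demo = np.random.uniform(0, 2 ** 16 - 1, size=(2048, 8))
assert outputs_to_bits(demo).size == 262_144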
NIST testing procedure The NIST test suite is applied with default set-
tings. The test suite consists of 188 distinct tests, each repeated 10 times, with
1,000,000 input bits consumed for each repetition. Each repetition will be re-
ferred to as a test instance. For every test, NIST reports the number of individual
instances that passed, the p-values of all individual instances, as well as a p-value
for the distribution of the instance p-values. A test instance fails if its p-value
is below a critical value (α = 0.01). An overall test fails if either the number of
passed instances is below a threshold, or the p-value for the distribution of test
instance p-values is below a critical value.
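The pass/fail logic just described can be summarized by the following sketch; the minimum number of passed instances and the critical value for the uniformity p-value are left as parameters, since their exact values depend on the NIST STS configuration and are not given here.

ALPHA = 0.01  # significance level for individual test instances

def test_passes(instance_p_values, uniformity_p_value,
                min_passed_instances, uniformity_critical_value):
    # An instance passes if its p-value is at least ALPHA; the overall test
    # passes only if enough instances pass and the distribution of instance
    # p-values is not itself abnormal.
    passed = sum(p >= ALPHA for p in instance_p_values)
    return (passed >= min_passed_instances
            and uniformity_p_value >= uniformity_critical_value)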
Results Table 1 shows the average performance across experiments, before and
after training, for both GAN approaches. Table 2 shows the average improvement
across all experiments for both approaches. Figures 5 and 6 display the training loss during a discriminative training run and a predictive training run, respectively.
Table 1. NIST test suite results for the generators, before and after training. Di and Pi refer to discriminative and predictive experiments, respectively. T is the overall number of distinct tests carried out by NIST STS, and TI is the total number of test instances. FI and FI% are the number and percentage of failed test instances. Fp is the number of distinct tests failed due to an abnormal distribution of the test instance p-values. FT and F% refer to the absolute number and percentage of distinct tests failed.
Table 2. Performance change from before training to after training for the discrimi-
native and predictive approaches across all tests.
Fig. 5. Training loss of the discriminative model. The discriminator has a tendency
to gradually improve its performance while the generator plateaus. Occasionally the
learning destabilizes and the discriminator’s loss increases by a large factor.
Fig. 6. Training loss of the predictive model. The predictor and generator converge in the initial phase of training.
Fig. 7. Visualization of the generator output as produced in the 9th predictive training
instance, before (left half) and after (right half) training. The 200×200 grid shows the
first 40,000 bits in the generator’s sample output. Obvious patterns are visible before
training, but not after.
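A visualization in the style of Fig. 7 can be reproduced with a few lines of Python; bits is assumed to be a flat 0/1 array such as the evaluation bit sequence described in Section 3, and the function name is ours.

import numpy as np
import matplotlib.pyplot as plt

def plot_bit_grid(bits, side=200):
    # Render the first side*side bits (40,000 for a 200x200 grid) as an image.
    grid = np.asarray(bits[: side * side]).reshape(side, side)
    plt.imshow(grid, cmap="gray", interpolation="nearest")
    plt.axis("off")
    plt.show()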
References
1. Abadi, M., Andersen, D.G.: Learning to protect communications with adversarial
neural cryptography. arXiv preprint arXiv:1610.06918 (2016)
2. Desai, V., Deshmukh, V., Rao, D.: Pseudo random number generator using Elman neural network. In: Recent Advances in Intelligent Computational Systems (RAICS), 2011 IEEE. pp. 251–254. IEEE (2011)
3. Desai, V., Patil, R.T., Deshmukh, V., Rao, D.: Pseudo random number generator
using time delay neural network. World 2(10), 165–169 (2012)
4. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair,
S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in neural
information processing systems. pp. 2672–2680 (2014)
5. Jeong, Y.S., Oh, K., Cho, C.K., Choi, H.J.: Pseudo random number generation using LSTMs and irrational numbers. In: Big Data and Smart Computing (BigComp), 2018 IEEE International Conference on. pp. 541–544. IEEE (2018)
6. Karpathy, A.: Lecture notes for CS231n: Convolutional Neural Networks for Visual Recognition (2017)
7. Kelsey, J., Schneier, B., Wagner, D., Hall, C.: Cryptanalytic attacks on pseudo-
random number generators. In: Fast Software Encryption. pp. 168–188. Springer
(1998)
8. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint
arXiv:1412.6980 (2014)
9. Menezes, A.J., Van Oorschot, P.C., Vanstone, S.A.: Handbook of applied cryptog-
raphy. CRC press (1996)
10. Rukhin, A., Soto, J., Nechvatal, J., Smid, M., Barker, E.: A statistical test suite
for random and pseudorandom number generators for cryptographic applications.
Tech. rep., Booz-Allen and Hamilton Inc Mclean Va (2001)
11. Tirdad, K., Sadeghian, A.: Hopfield neural networks as pseudo random number gen-
erators. In: Fuzzy Information Processing Society (NAFIPS), 2010 Annual Meeting
of the North American. pp. 1–6. IEEE (2010)