ECE Project Final Documents
ON
Bachelor of Technology
In
Submitted by
B. ANUSHA 17FH1A0401
CERTIFICATE
This is to certify that the project work on “DESIGN OF A HIGH PERFORMANCE 2-BIT
MAGNITUDE COMPARATOR USING HYBRID LOGIC STYLE” is a bonafide work
done by B. LAKSHMIPRASANNA (17FH1A0402), K. TEJASREE (17FH1A0416),
V. NAGAJYOTHI (17FH1A0436), K. MARYTEJASWI (17FH1A0412), and B. ANUSHA (17FH1A0401),
in partial fulfillment of the requirement for the award of the degree of “Bachelor of
Technology in Electronics and Communication Engineering” JNTUA, Anantapuramu
during 2017-2021.
Date:
ACKNOWLEDGEMENT
At the outset, I sincerely thank our guide, K. ABDUL REHMAN, M. Tech, Assistant
Professor in the Dept. of Electronics and Communication Engineering at Dr. KVSRIT, for his kind
cooperation and encouragement toward the successful completion of this seminar and for providing the
necessary facilities.
I am most obliged and grateful to Sri. T. VIJAY KUMAR, Head of the Department of
Electronics and Communication of Dr. KVSRIT, for giving me guidance in completing this
seminar successfully.
It is my privilege and pleasure to express my profound sense of gratitude and
indebtedness to Dr. S G GOVINDARAJULU, Professor in the Department of Electronics and
Communication, Dr. KVSRIT, for his guidance, cogent discussion, constructive criticism, and
encouragement throughout this dissertation work.
I would like to express my very great appreciation to Dr. M.V. SRUTHI, Professor in the
Department of Electronics and Communication, Dr. KVSRIT, for her valuable and constructive
suggestions during the planning and development of this research work.
I am grateful to Dr. L. THIMMAIAH, Principal, Dr. KVSRIT, for his sagacious and scholarly
guidance and for the inspiration offered in an amiable and pleasant manner, which helped me
complete this seminar successfully.
We would like to thank our college management, the chairman Dr. K.V. SUBBA
REDDY Garu and Mrs. K. VIJAYALAKSHMAMMA Garu, who has inspired us a lot through her
speeches. She has given meaning to our technological studies and taught us how to survive in this
competitive world. We wish to express our thanks to all staff members and our friends who have
rendered their whole-hearted support at all times for the successful completion of this seminar
within the limited time.
LIST OF TABLES
GATES
PROPOSED METHOD
1. OPERATIONAL TABLE OF 2-BIT MC
ABSTRACT
The design of a 2-bit binary Magnitude Comparator (MC) is presented in this research. The proposed
MC has been designed using Conventional CMOS (CCMOS) logic and Pass Transistor Logic (PTL).
The design is simulated along with 5 other existing MC designs in order to carry out evaluation
and comparison. The proposed 2-bit MC displayed a satisfactory level of improvement in speed
and power. As a result, a significant enhancement in Power Delay Product (PDP) has
been attained. Owing to this enhancement in performance, the proposed MC can be
considered a highly effective alternative to the existing MC designs.
HARDWARE EFFICIENT POST PROCESSING ARCHITECTURE FOR TRUE RANDOM NUMBER GENERATOR
CHAPTER-1
INTRODUCTION
1.1 OVERVIEW
on DPR capabilities available on Xilinx FPGAs. The major contribution of this paper is the
development of an architecture which allows on-the-fly tunability of the statistical qualities of a
TRNG by utilizing the DPR capabilities of modern FPGAs to vary the DCM modeling
parameters. To the best of our knowledge, this is the first reported work which incorporates
tunability in a TRNG.
This approach is applicable only to Xilinx FPGAs, which provide a programmable clock
generation mechanism and DPR capability. DPR is a relatively new enhancement in FPGA
technology, whereby modifications to predefined portions of the FPGA logic fabric are possible
on-the-fly, without affecting the normal functionality of the FPGA. Xilinx Clock Management
Tiles (CMTs) contain a Dynamic Reconfiguration Port (DRP), which allows DPR to be performed
through much simpler means [1]. Using DPR, the generated clock frequencies can be changed
on-the-fly by adjusting the corresponding DCM parameters. DPR via the DRP is an added
advantage in FPGAs, as it allows the user to tune the clock frequency as needed. Design
techniques exist to prevent malicious manipulations via DPR, which could otherwise
detrimentally affect the security of the system [2].
Mersenne Twister (MT) is a widely used, fast pseudorandom number generator
(PRNG) designed by Matsumoto [8]. In MT, more CPU time is required for initialization than for
generation; hence, after the Mersenne Twister, the WELL generators were introduced by
Panneton [9]. CPUs for personal computers later acquired new features such as SIMD operations
(i.e., 128-bit operations) and multi-stage pipelines. A 128-bit based PRNG was then proposed by
Saito [7], named the SIMD-oriented Fast Mersenne Twister (SFMT),
which is analogous to MT but uses SIMD operations. Tsoi [10]
mentioned that if the function call is avoided, WELL may be slower than MT on some CPUs.
The SFMT pseudorandom number generator is a very fast generator with a satisfactorily high-
dimensional equidistribution property. Random number generators based on linear
recurrences modulo 2 were then introduced. Linear Feedback Shift Register random number
generators, also called Tausworthe generators, work on linear recurrences modulo 2.
Trinomial-based generators have important statistical defects, but combining them can yield
generators that are relatively fast and robust. Such combinations have been proposed and
analyzed by Matsumoto and Wang [11, 12]. The generators given there are for 32-bit computers.
Nowadays 64-bit computers are becoming increasingly common, so it is important to have
good generators designed to fully use 64-bit words, as given by P. L’Ecuyer [6]. The huge-
period generators proposed thereafter were not quite optimal. New generators with better
equidistribution and bit-mixing properties were required.
The state of these generators evolves in a more chaotic way than that of the Mersenne Twister.
They reduce the impact of the persistent dependencies among successive output values that can be
observed in certain parts of the period of the Mersenne Twister [7]. A
generator with a period of 2^k - 1 can be implemented using k flip-flops and k LUTs, and provides k
random output bits each cycle. Despite these advantages, FPGA-optimized generators are not
widely used in practice, as the process of constructing a generator for a given parameterization is
time consuming, in terms of both developer man-hours and CPU time. While it is possible to
construct all possible generators ahead of time, the resulting set of cores would require many
megabytes and be difficult to integrate into existing tools and design flows. Faced with these
unpalatable choices, engineers under time constraints understandably choose less efficient
methods, such as combined Tausworthe generators [3] or parallel linear feedback shift registers
(LFSRs). The approach here uses cheap bit-wise shift registers to provide long periods and good
quality without requiring expensive resources. The number of bits generated per cycle is generally
chosen to meet the needs of the application.
A permutation of the resulting outputs is given to the XOR gates. The outputs of the XOR
gates are then given to the PIPO SRs, where the XOR gate outputs are shifted, and thus random
number generation takes place. Random number generation is performed as
per the methodology. The simulations are performed in ModelSim 6.4a, and the design is
synthesized using Xilinx PlanAhead on a Virtex-5 kit, verified on the Spartan-3E kit, and
written in Verilog.
The results obtained from the tools and the design summary obtained from Xilinx
8.1i are shown below. The initial seed is given as input, and the seed is permuted.
The results for the 8-bit RNG are discussed below. The same scheme is carried out for the 64-bit
RNG. The permuted bits are given to the XOR gates. For the 8-bit RNG, the number of XOR
gates is 8 (t = 8). Permutation is used to improve randomness among the bits and
thus to provide unpredictability.
The first and last bits are interchanged. The same concept of permutation is used for
RNGs of different bit widths. The permuted outputs are fed into the XOR gates, and the remaining
inputs to the XOR gates are assigned on a round basis. The resulting XOR gate output bits are fed in
parallel into the PIPO SR. The resulting outputs generate the random number cycle. The cycle is
fed into SISO SRs [FIFO] of varying lengths (length = k).
The length should not exceed r. As each bit crosses the flip-flop, it is set to zero. Thus,
random number generation takes place. The resulting random numbers are generated such that
their period is 2^r - 1. If the number of bits is 16, then the period is 2^16 - 1. The all-zero
state is excluded from the count, since the all-zero state leads to an idle condition.
Random number and random bit
generators, RNGs and RBGs, respectively, are a fundamental tool in many different areas. The
two main fields of application are stochastic simulation and cryptography. In stochastic
simulation, RNGs are used for mimicking the behavior of a random variable with a given
probability distribution. In cryptography, these generators are employed to produce secret keys,
to encrypt messages or to mask the content of certain protocols by combining the content with a
random sequence. A further application of cryptographically secure random numbers is the
growing area of internet gambling since these games should imitate very closely the distribution
properties of their real equivalents and must not be predictable or influenceable by any
adversary.
A random number generator is an algorithm that, based on an initial seed or by means of
continuous input, produces a sequence of numbers or respectively bits. We demand that this
sequence appears “random” to any observer. This topic leads us to the question: What is
random? Most people will claim that they know what randomness means, but if they are asked
to give an exact definition they will have a problem doing so. In most cases terms like
unpredictable or uniformly distributed will be used in the attempt to describe the necessary
properties of random numbers. However, when can a particular number or output string be
called unpredictable or uniformly distributed? In Part I we will introduce three different
approaches to define randomness or related notions.
In the context of random numbers and RNGs the notions of “real” random numbers and
true random number generators (TRNGs) appear quite frequently. By real random numbers we
mean the independent realizations of a uniformly distributed random variable, by TRNGs we
denote generators that output the result of a physical experiment which is considered to be
random, like radioactive decay or the noise of a semiconductor diode. In certain circumstances,
RNGs employ TRNGs in connection with an additional algorithm to produce a sequence that
behaves almost like real random numbers. However, why would we use RNGs instead of
TRNGs? First of all, TRNGs are often biased. This means, for example, that on average their
output might contain more ones than zeros and therefore does not correspond to a uniformly
distributed random variable. This effect can be balanced by different means, but such post-
processing reduces the number of useful bits as well as the efficiency of the generator.
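One classic way to balance the bias described above is the von Neumann corrector. The text does not name a specific method, so this choice is an assumption made purely for illustration. The sketch below reads non-overlapping bit pairs, emits one bit for each unequal pair, and discards equal pairs, which also shows why post-processing reduces the number of useful bits.

```python
import random


def von_neumann_debias(bits):
    """Debias a bit sequence: pair 01 -> 0, pair 10 -> 1, pairs 00 and 11 are dropped."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out


def biased_source(n, p_one, seed=0):
    """Hypothetical stand-in for a biased TRNG: independent bits with P(1) = p_one."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_one else 0 for _ in range(n)]


raw = biased_source(100_000, p_one=0.8, seed=1)
clean = von_neumann_debias(raw)
print("raw ones fraction:  ", sum(raw) / len(raw))
print("clean ones fraction:", sum(clean) / len(clean))
print("bits kept:", len(clean), "of", len(raw))
```

The corrector removes bias only if the input bits are independent; correlated raw bits would need a different post-processing step.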
Another problem is that some TRNGs are very expensive or need at least an extra
hardware device. In addition, these generators are often too slow for the intended applications.
Ordinary RNGs need no additional hardware, are much faster than TRNGs, and their output
fulfills the fundamental conditions, like unbiasedness, that are expected from random numbers.
These conditions are required for high quality RNGs but they cannot be generalized to the wide
range of available generators. Despite the arguments above, TRNGs have their place in the
arsenal. They are used to generate the seed or the continuous input for RNGs. In [Ell95] the
author lists several hardware sources that can be applied for such a purpose. Independently of
whether a RNG is used for stochastic simulation or for cryptographic applications, it has to
satisfy certain conditions. First of all the output should imitate the realization of a sequence of
independent uniformly distributed random variables. Random variables that are not uniformly
Random Number Generation: Types and Techniques
A degree of randomness is built into
the fabric of reality. It is impossible to say for certain what a baby’s personality will be, how the
temperature will fluctuate next week, or which way dice will land on their next roll. A planet in
which everything could be predicted would be bland, and much of the excitement of life would
be lost. Because randomness is so inherent in everyday life, many researchers have tried to
either harvest or simulate its effect inside the digital realm. Before accomplishing this feat,
however, many important questions need to be answered. What does it mean to be random?
How does a person go about creating randomness, and how can he capture the randomness he
encounters? How can someone know if an event or number sequence is random or not? Over
generations, the answers to these questions have progressively been developed.
This paper takes a look at the current solutions, and attempts to organize the methods for
creating chaos.
Defining Random
It is impossible to appreciate a random number generator
without first understanding what it means to be random. Developing a well-rounded definition of
randomness can be accomplished by studying a random phenomenon, such as a dice roll, and
exploring what qualities make it random. To begin, imagine that a family game includes a die
to make things more interesting. In the first turn, the die rolls a five. By itself, the roll of five is
completely random. However, as the game goes on, the sequence of rolls is five, five, five, and
five. The family playing the game will not take long to realize that the die they received
probably is not random.
From this illustration, it is apparent that when discussing randomness, a sequence of
random numbers should be the focus of the description, as opposed to the individual numbers
themselves (Kenny, 2005). To make sure the next die the family buys is random, they roll it 200
times. This time, the die did not land on the same face every time, but half of the rolls came up
as a one. This die would not be considered random either, because it has a disproportionate bias
toward a specific number. To be random, the die should land on all possible values equally often. In a
third scenario, the dice manufacturer guarantees that all its dice now land on all numbers
equally often. Cautious, the family rolls this new die 200 times to verify. Although the numbers were hit
uniformly, the family realized that throughout the entire experiment the numbers always
followed a sequence: five, six, one, two, etc. Once again, the randomness of the die would be
questioned. For the die to be accepted as random, it could not have any obvious patterns in a
sequence of dice rolls.
If it can be predicted what will happen next, or anywhere in the future, the die cannot
truly be random. From the results of these dice illustrations a more formal definition of
randomness can be constructed. A generally accepted and basic definition of a random number
sequence is as follows: a random number sequence is uniformly distributed over all possible
values and each number is independent of the numbers generated before it (Marsaglia, 2005). A
random number generator can be defined as any system that creates random sequences like the
one just defined. Unfortunately, time has shown that the requirements for a random number
generator change greatly depending on the context in which it is used.
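The two requirements in this definition, uniformity and independence, can be checked separately. The sketch below applies them to the dice scenarios described above. The function names are illustrative, and 198 rolls are used instead of 200 so that the patterned die lands on each face exactly 33 times. A chi-square statistic measures uniformity, and a count of distinct consecutive pairs exposes the fixed "five, six, one, two" ordering.

```python
from collections import Counter


def chi_square_uniform(rolls, faces=6):
    """Chi-square statistic of observed face counts against a uniform expectation."""
    expected = len(rolls) / faces
    counts = Counter(rolls)
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, faces + 1))


def distinct_successor_pairs(rolls):
    """Number of different (roll, next roll) pairs seen; a fair die can show up to 36."""
    return len(set(zip(rolls, rolls[1:])))


stuck_die = [5] * 200                                     # always lands on five
patterned_die = [(4 + i) % 6 + 1 for i in range(198)]     # 5, 6, 1, 2, 3, 4, 5, ...

# 11.07 is the 5% critical value of chi-square with 5 degrees of freedom.
print("stuck die fails uniformity:", chi_square_uniform(stuck_die) > 11.07)
print("patterned die chi-square:", chi_square_uniform(patterned_die))
print("patterned die distinct pairs:", distinct_successor_pairs(patterned_die))
```

The patterned die is perfectly uniform yet shows only six of the 36 possible consecutive pairs, which is why uniformity alone does not establish randomness.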
When a random number generator is used in cryptography, it is vital that the past
sequences can neither be discovered nor repeated; otherwise, attackers would be able to break
into systems (Kenny, 2005). The opposite is true when a generator is used in simulations. In this
context, it is actually desirable to obtain the same random sequence multiple times. This allows
for experiments that are performed based on changes in individual values. The new major
requirement typical of simulations, especially Monte Carlo simulations, is that vast amounts of
random numbers need to be generated quickly, since they are consumed quickly (Chan, 2009).
For example, in a war simulator a new random number might be needed every time a soldier
fires a weapon to determine if he hits his target. If a battle consists of hundreds or thousands of
soldiers, providing a random generator quick enough to accommodate it is not trivial. Random
numbers are often used in digital games and in statistical sampling as well. These last two
categories put very few requirements on the random numbers other than that they be actually
random. Within each of these contexts, requirements beyond the additional ones listed can
exist, depending on the specific application.
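The simulation requirement described above, obtaining the same random sequence multiple times, is usually met by seeding a pseudo-random generator. The sketch below is a toy version of the war-simulator example. The function name, the hit probability, and the seed values are illustrative assumptions.

```python
import random


def simulate_shots(seed, n_shots=1000, hit_probability=0.3):
    """Count hits in a toy battle; the same seed reproduces the same random draws."""
    rng = random.Random(seed)
    return sum(rng.random() < hit_probability for _ in range(n_shots))


baseline = simulate_shots(seed=42)
repeat = simulate_shots(seed=42)                          # identical draws, identical result
improved = simulate_shots(seed=42, hit_probability=0.4)   # only one parameter changed

print(baseline, repeat, improved)
```

Because the draws are identical across runs, any difference between `baseline` and `improved` comes from the changed parameter and not from chance. The same reproducibility is what makes such a generator unsuitable for cryptography.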
There is a general definition describing a random number generator, but this definition
needs to be tailored for each situation a generator is used in.
Types of Random Number Generators
With a description of randomness in hand, focus can shift to random number
generators themselves and how they are constructed. Typically, whenever a random number
generator is being discussed, its output is given in binary. Generators exist that have non-binary
outputs, but whatever is produced can be converted into binary after the fact. There are two main
types of random number generators. The first type attempts to capture random events in the real
world to create its sequences. It is referred to as a true random number generator, because in
normal circumstances it is impossible for anyone to predict the next number in the sequence.
The second camp believes that algorithms with unpredictable outputs (assuming no one knows
the initial conditions) are sufficient to meet the requirements for randomness. The generators
produced through algorithmic techniques are called pseudo-random generators, because in
reality each value is determined based off the system’s state, and is not truly random. To gain an
understanding of how these generators work, specific examples from both categories will be
examined.
True Random Number Generators
A true random number generator uses entropy sources
that already exist instead of inventing them. Entropy refers to the amount of uncertainty about an
outcome. Real-world events such as coin flips have a high degree of entropy, because it is almost
impossible to accurately predict what the end result will be. It is the source of entropy that
makes a true random number generator unpredictable. Flipping coins and rolling dice are two
ways entropy could be obtained for a generator, although the rate at which random numbers
could be produced would be restricted. Low production rate is a problem that plagues most true
random number generators (Foley, 2001). Another major disadvantage of these generators is that
they rely on some sort of hardware. Since they use real world phenomena, some physical device
capable of recording the event is needed. This can make true random generators a lot more
expensive to implement, especially if the necessary device is not commonly used. It also means
that the generators are vulnerable to physical attacks that can bias the number sequences.
Finally, even when there are no attackers present, physical devices are typically vulnerable to
wear over time and errors in their construction that can naturally bias the sequences produced
(Sunar, Martin, & Stinson, 2006).
To overcome bias, most true random number generators have some sort of post
processing algorithm that can compensate for it. Despite these disadvantages, there are many
contexts where having number sequences that are neither artificially made nor reproducible is
important enough to accept the obstacles. For security experts, there is a peace of mind that
comes with knowing that no mathematician can break a code that does not exist. In the next
sections, four major true random generators will be covered: Random.org, Hot Bits, lasers, and oscillators.
Random.org
A widely used true random number generator is hosted on a website
named Random.org. Random.org freely distributes the random sequences it generates, leading to
a varied user base (Haahr, 2011). Applications of these numbers have ranged everywhere from
an online backgammon server to a company that uses the numbers for random drug screenings
(Kenny, 2005). However, since the numbers are obtained over the Internet, it would be unwise
to use them for security purposes or situations where the sequence absolutely needs to stay
private. There is always the risk that the transmission will be intercepted. The random number
generator from this site collects its entropy from atmospheric noise.
Radio devices pick up on the noise and run it through a postprocessor that converts it
into a stream of binary ones and zeroes. Scholars have pointed out that the laws governing
atmospheric noise are actually deterministic, so the sequences produced by this generator are not
completely random (Random.org, 2012). The proponents of this claim believe that only quantum
phenomena are truly nondeterministic. Random.org has countered this argument by pointing out
that the number of variables that would be required to predict the values of atmospheric noise
is infeasible for humans to obtain. Guessing the next number produced would mean accurately
recording every broadcasting device and atmospheric fluctuation in the area, possibly even down
to molecules. It has been certified by several third parties that the number sequences on this site
pass the industry-standard test suites, making it a free and viable option for casual consumers of
random numbers.
Hot Bits
The other popular free Internet-based random number generator is referred to
as Hot Bits. This site generates its random number sequences based off radioactive decay.
Because this is a quantum-level phenomenon, there is no debate over whether the number
sequences are truly non-deterministic. At the same time, the process involved in harvesting this
phenomenon restricts Hot Bits to only producing numbers at the rate of 800 bits (100 bytes) per
second (Hot Bits, 2012). Although the Hot Bits server stores a backlog of random numbers, the
rate at which random sequences can be extracted is still limited in comparison to other options.
As with Random.org, random numbers obtained from this generator are sent over the Internet, so
there is always the possibility that a third party has knowledge of the sequence. This makes it
unsuitable for security-focused applications, but Hot Bits is useful when unquestionably random
data is necessary.
Lasers
The use of lasers allows for true random number generators that
overcome the obstacle of slow production. In laser-based generators, entropy can be obtained by
several different means. Having two photons race to a destination is one.
1.4 OBJECTIVE
The goal of this paper is the design, analysis, and implementation of an easy-to-design,
improved, low-overhead, tunable TRNG for the FPGA platform. The following are
our major contributions:
1) We investigate the limitations of the BFD–TRNG [3] when implemented on an FPGA
design platform. To solve the shortcomings, we propose an improved BFD–TRNG
architecture suitable for FPGA based applications. To the best of our knowledge this is the
first reported work which incorporates tunability in a fully digital TRNG.
2) We analyze the modified architecture mathematically and experimentally.
3) Our experimental results strongly support the proposed mathematical model. The
proposed TRNG has low hardware overhead, and the random bitstreams derived from the
proposed TRNG pass all tests in the NIST statistical test suite.
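As a small illustration of what passing a test in the NIST statistical test suite involves, the sketch below implements the frequency (monobit) test from NIST SP 800-22, the first and simplest test in the suite. It is a minimal sketch and not the evaluation flow used in this work. The 10-bit input is the worked example from the NIST document itself, and the 0.01 threshold is the suite's default significance level.

```python
import math


def monobit_p_value(bits):
    """NIST SP 800-22 frequency (monobit) test: P-value for the balance of ones and zeros."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)      # +1 for each one, -1 for each zero
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


example = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]       # worked example sequence from SP 800-22
p_value = monobit_p_value(example)
print("P-value:", round(p_value, 6))
print("passes at 0.01 level:", p_value >= 0.01)
```

NIST recommends at least 100 bits per sequence for this test, so the 10-bit input only serves to check the arithmetic.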
CHAPTER 2
LITERATURE REVIEW
From a rigorous review of related work and published literature, it is observed that many
researchers have designed random number generators by applying different techniques.
Researchers have studied different systems, processes, and phenomena in order to design and
analyze RNGs and have attempted to find the unknown parameters. A pseudorandom number
generator (PRNG) is an algorithm for generating a sequence of numbers that approximates the
properties of random numbers. These sequences are not truly random. Although sequences that are
closer to truly random can be generated using hardware random number generators, pseudorandom
numbers are important in practice for simulations (e.g., of physical systems with the Monte Carlo
method) and in cryptography. Ray C. C. Cheung, Dong-U Lee, and John
D. Villasenor [1] presented an automated methodology for producing hardware-based random
number generator (RNG) designs for arbitrary distributions using the inverse cumulative
distribution function (ICDF).
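The idea behind the ICDF method can be sketched in software: a uniform sample u in [0, 1) is passed through the inverse CDF of the target distribution. The example below uses the exponential distribution because its inverse CDF has a closed form. This is an assumption made for illustration, since the cited hardware design [1] approximates the ICDF with piecewise polynomials instead.

```python
import math
import random


def exponential_icdf(u, rate):
    """Inverse CDF of the exponential distribution: F^-1(u) = -ln(1 - u) / rate."""
    return -math.log(1.0 - u) / rate


def exponential_samples(n, rate, seed=0):
    """Draw n exponential samples by transforming uniform samples through the ICDF."""
    rng = random.Random(seed)
    return [exponential_icdf(rng.random(), rate) for _ in range(n)]


samples = exponential_samples(100_000, rate=2.0, seed=7)
print("sample mean:", sum(samples) / len(samples), "(theory: 0.5)")
```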
The ICDF is evaluated via piecewise polynomial approximation with a hierarchical
segmentation scheme that involves uniform segments and segments with size varying by powers of
two, which can adapt to local function nonlinearities. Analytical error analysis is used to guarantee
accuracy to one unit in the last place (ulp). Compact and efficient RNGs that can reach arbitrary
multiples of the standard deviation can be generated. For instance, a Gaussian RNG based on their
approach for a Xilinx Virtex-4 XC4VLX100-12 field programmable gate array produces 16-bit
random samples up to 8.2σ. It occupies 487 slices, 2 block-RAMs, and 2 DSP-blocks. The
design is capable of running at 371 MHz and generates one sample every clock cycle. The designs
are capable of generating random numbers from arbitrary distributions provided that the ICDF is
known. GU Xiao-chen and ZHANG Min-xuan [2] presented “Uniform Random Number Generator
using Leap-Ahead LFSR Architecture”, introducing a new kind of URNG using a Leap-Ahead LFSR
architecture which can generate an m-bit random number per cycle using only one LFSR.
A normal LFSR can generate only one random bit per cycle. As multiple bits are required to
form a random number in most applications, a Multi-LFSR architecture is used to implement a
URNG. This means 32 different LFSRs are needed in a 32-bit output URNG. The Leap-Ahead
architecture avoids this and generates one multi-bit random number per cycle using only one
LFSR. The Leap-Ahead architecture consumes less than 10% of the slices that the Multi-LFSR
architecture consumes. One of the reasons for this is that the Leap-Ahead architecture has only
1 LFSR in the URNG hardware, while the Multi-LFSR architecture has 18. The other reason is that
every register in the URNG has to be initialized separately when the circuit is restarted, and the logic
for this is complicated.
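The leap-ahead idea rests on the fact that an LFSR update is linear over GF(2), so m single-bit steps can be collapsed into one precomputed XOR network. The sketch below illustrates that principle in software and is not the authors' hardware design. The 16-bit tap set (16, 14, 13, 11) and the leap size m = 8 are assumptions chosen for the example.

```python
WIDTH = 16
TAPS = (16, 14, 13, 11)   # assumed polynomial: x^16 + x^14 + x^13 + x^11 + 1


def step(state):
    """Advance a Fibonacci LFSR by one bit (right shift, feedback into the top bit)."""
    feedback = 0
    for tap in TAPS:
        feedback ^= (state >> (WIDTH - tap)) & 1
    return (state >> 1) | (feedback << (WIDTH - 1))


def build_leap_table(m):
    """For each state bit, precompute where that bit alone ends up after m steps."""
    table = []
    for i in range(WIDTH):
        image = 1 << i
        for _ in range(m):
            image = step(image)
        table.append(image)
    return table


def leap(state, table):
    """Advance m steps at once: XOR the precomputed images of every set state bit."""
    out = 0
    for i, image in enumerate(table):
        if (state >> i) & 1:
            out ^= image
    return out


table = build_leap_table(8)
state = 0xACE1
stepped = state
for _ in range(8):
    stepped = step(stepped)
print(hex(leap(state, table)), hex(stepped))   # both values are the same state
```

In hardware, each entry of `table` would correspond to a fixed set of XOR connections, so the m-step update costs one clock cycle.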
As the Multi-LFSR architecture has 18×18 registers, while the Leap-Ahead architecture has
only 23 registers, it needs more slices for the initializing function. By implementing the Leap-Ahead
LFSR architecture and the Multi-LFSR architecture, of both Galois type and Fibonacci type, on a Xilinx
Virtex-4 FPGA, the authors conclude that, with only a very small loss in speed, the Leap-Ahead
LFSR architecture consumes only 10% of the slices that the Multi-LFSR architecture needs to generate
random numbers with the same period. Compared with other URNGs, the Leap-Ahead
LFSR architecture has very good Area-Time and Throughput performance:
2.18×10^-9 slices×sec per bit and 17.87×10^9 bits per sec. Jonathan M. Comer, Juan C. Cerda, Chris
D. Martinez, and David H. K. Hoe [3] introduced a new architecture using Cellular Automata.
Cellular Automata (CA) have been found to make good pseudo-random number
generators (PRNGs), and these CA-based PRNGs are well suited for implementation on Field
Programmable Gate Arrays (FPGAs). To improve the quality of the random numbers that are
generated, the basic CA structure is enhanced in two ways. First, the addition of a super-rule to
each CA cell is considered. The work gives an overview of the design of linear feedback shift registers (LFSRs)
and cellular automata (CA), followed by a review of related works that have utilized LFSRs and CA
for generating random numbers. It then evaluates the performance of CA-based PRNGs
suitable for implementation on FPGAs. Synthesis results for the Xilinx Spartan 3E FPGA give a
good idea of the relative resources required for each configuration.
Pawel Dabal and Ryszard Pelka [4] presented “FPGA Implementation of Chaotic Pseudo-
Random Bit Generators”. Modern communication systems (including mobile systems) require the
use of advanced methods of information protection against unauthorized access. Therefore, one of
the essential problems of modern cryptography is the generation of keys having relevant statistical
properties. In recent years, cryptographers have paid increasing attention to digital systems based
on chaos theory and to the use of chaotic signals to carry information. The idea of using a nonlinear
chaotic dynamic system to design a cryptographically secure pseudorandom number or bit generator
(PRNG or PRBG) is interesting for practical reasons. Carlos Arturo Gayoso, C.
González, L. Arnone, M. Rabini, and Jorge Castiñeira Moreira [5] presented “Pseudorandom Number
Generator Based on the Residue Number System and its FPGA Implementation”, which uses the Residue Number
System (RNS) to design a very fast circuit that operates in a very different way
from other generators.
A set of classic tests, the Diehard test, the statistical complexity test, and the Hurst exponent
test are used to measure the quality of the randomness of the proposed pseudorandom
number generator. David B. Thomas and Wayne Luk [6] presented “The LUT-SR Family of Uniform
Random Number Generators for FPGA Architectures”. This work describes a type of FPGA RNG called a LUT-SR
RNG, which takes advantage of bitwise XOR operations and the ability to turn lookup tables
(LUTs) into shift registers of varying lengths. This provides a good resource-quality balance
compared to previous FPGA-optimized generators, between the previous high-resource high-period
LUT-FIFO RNGs and low-resource low-quality LUT-OPT RNGs, with quality comparable to the
best software generators.
The LUT-SR generators can also be expressed using a simple C++ algorithm
contained within that paper, allowing 60 fully specified LUT-SR RNGs with different
characteristics to be embedded in it, backed up by an online set of very high-speed
integrated circuit hardware description language (VHDL) generators and test benches. Ravi Saini,
Sanjay Singh, Anil K Saini, A S Mandal, and Chandra Shekhar [7] presented “Design of a Fast and
Efficient Hardware Implementation of a Random Number Generator in FPGA”, a fast and
efficient hardware implementation of a pseudo-random number generator based on the Lehmer linear
congruential method. The paper demonstrates how the introduction of application
specificity in the architecture can deliver large gains in area and speed. The design
has been specified in VHDL, is implemented on the Xilinx FPGA device XC5VFX130T-3ff1738,
and takes up only 23 slice LUTs. In 2014, Purushottam Y. Chawle and R.V. Kshirsagar [8]
presented a simple algorithm to generate pseudo-random numbers using a Linear Feedback Shift
Register (LFSR). The generated pseudo-random sequence is mainly used in communication applications such as
cryptography, encoding, and decoding. In LFSR operation, the linear
operation on single bits is exclusive-or (XOR). 8-bit and 16-bit LFSRs are designed in Verilog
HDL to study their performance and randomness. An LFSR is a shift register whose
output state sequence depends on its feedback polynomial.
CHAPTER 3
EXISTING SYSTEM
randomness originating from process variation effects associated with deep sub-micron
CMOS manufacturing, one of the oscillators (say, ROSCA) oscillates slightly faster than the
other oscillator (ROSCB). In addition, the authors [3] proposed to employ trimming
capacitors to further tune the oscillator output frequencies.
2) The output of one of the ROs is used to sample the output of the other, using a D flip-flop
(DFF). Without loss of generality, assume the output of ROSCA is fed to the D-input of the
DFF, while the output of ROSCB is connected to the clock input of the DFF.
3) At certain time intervals (determined by the frequency difference of the two ROs), the
faster oscillator signal catches up with and overtakes the slower signal in phase. Due to
random jitter, these capturing events happen at random intervals, called “Beat Frequency
Intervals”. As a result, the DFF outputs a logic-1 at random instants.
4) A counter controlled by the DFF increments during the beat frequency intervals, and gets
reset due to the logic-1 output of the DFF. Due to the random jitter, the free running counter
output ramps up to different peak values in each of the count-up intervals before getting
reset.
5) The output of the counter is sampled by a sampling clock before it reaches its maximum
value.
6) The sampled response is then serialized to obtain the random bitstream.
3.2 Shortcoming of the BFD-TRNG
One shortcoming of the previous BFD-TRNG circuit is that its statistical randomness is dependent on the design quality of the ring oscillators. Any design bias in the ring oscillators might adversely affect the statistical randomness of the bitstream generated by the TRNG.
Fig. 1: Architecture of single-phase BFD–TRNG [3].
Designs with the same number of inverters but different placements resulted in varying counter maxima. Additionally, the same ring-oscillator based BFD-TRNG implemented on different FPGAs of the same family shows distinct counter maxima. Unfortunately, since the ring oscillators are free running, it is difficult to control them to eliminate any design bias. The problem is exacerbated in FPGAs, where it is often difficult to control design bias because of the lack of fine-grained designer control over routing in the FPGA design fabric. A relatively simple way of tuning clock generator hardware primitives on Xilinx FPGAs, particularly the Phase Locked Loop (PLL) or the Digital Clock Manager (DCM) as used in this work, is by enabling dynamic reconfiguration via the Dynamic Reconfiguration Ports (DRPs). Once enabled, the clock generators can be tuned to generate clock signals of different frequencies by modifying values at the DRPs [1] on-the-fly, without needing to bring the device off-line.
A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by the customer after manufacturing, hence "field-programmable".
The FPGA configuration is generally defined using a hardware description language (HDL),
similar to that used for an application-specific integrated circuit (ASIC) (circuit diagrams were
previously used to specify the configuration, as they were for ASICs, but this is increasingly rare).
FPGAs can be used to implement any logical function that an ASIC can perform. The ability to
update the functionality after shipping, partial re-configuration of the portion of the design and the
low non-recurring engineering costs relative to an ASIC design, offer advantages for many
applications. FPGAs contain programmable logic components called "logic blocks", and a hierarchy
of reconfigurable interconnects that allow the blocks to be "connected together" somewhat like a
one-chip programmable breadboard. Logic blocks can be configured to perform complex
combinational functions, or merely simple logic like AND and NAND. In most FPGAs, the logic
blocks also include memory elements, which may be simple flip-flops or more complete blocks of
memory.
The XOR gate provides feedback to the register that shifts bits from left to right. The
maximal sequence consists of every possible state except the "00000000" state. In computing, a
linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its
previous state. The most commonly used linear function of single bits is exclusive-or (XOR). Thus,
an LFSR is most often a shift register whose input bit is driven by the XOR of some bits of the
overall shift register value.
CHAPTER 4
PROPOSED SYSTEM
Tunability is established by setting the DCM parameters on-the-fly, using DPR capabilities through the DRP ports. This capability gives the design greater flexibility than the ring oscillator-based BFD-TRNG. The difference in the frequencies of the two generated clock signals is captured using a DFF. The DFF sets when the faster oscillator completes one cycle more than the slower one (at the beat frequency interval). A counter is driven by one of the generated clock signals, and is reset when the DFF is set. Effectively, the counter increases the throughput of the generated random numbers. The three LSBs of the maximum count values reached by the counter were found to show good randomness properties.
Fig. 3: Overall architecture of proposed Digital Clock Manager based tunable BFD–TRNG
Fig. 3 shows the overall architecture of the proposed TRNG. In place of two ring
oscillators, two DCM modules generate the oscillation waveforms. The DCM primitives are
parameterized to generate slightly different frequencies, by adjusting two design parameters M
(Multiplication Factor) and D (Division Factor). In the proposed design, the source of randomness
is the jitter present in the DCM circuitry. The DCM modules allow greater designer control over
the clock waveforms, and their usage eliminates the need for initial calibration [3].
Additionally, we have a simple post-processing unit using a Von Neumann Corrector
(VNC) [5] to eliminate any biasing in the generated random bits. VNC is a well-known
low-overhead scheme to eliminate bias from a random bitstream. In this scheme, the input is read as pairs of bits: any “00”
or “11” pair is discarded; otherwise, if the input pair is “01” or “10”, only the first bit is
retained. The three LSBs of the generated random number are passed through the VNC. The
VNC improves the statistical qualities at the cost of a slight decrease in throughput.
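The pair-wise rule described above can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware description used in the project: the function names, the MSB-first ordering of the three LSBs, and the sample counter values are all assumptions made for the example.

```python
def lsb3_bits(count):
    """Return the three least-significant bits of a counter value.

    The MSB-first ordering is an assumption made for this sketch.
    """
    return [(count >> shift) & 1 for shift in (2, 1, 0)]


def von_neumann_correct(bits):
    """Von Neumann corrector over non-overlapping bit pairs.

    "00" and "11" pairs are discarded; for "01" and "10" pairs only
    the first bit of the pair is kept. A trailing unpaired bit is dropped.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out


if __name__ == "__main__":
    # Hypothetical counter maxima, used only to exercise the functions.
    counts = [13, 6, 10]
    raw = [b for c in counts for b in lsb3_bits(c)]
    print("raw bits      :", raw)
    print("corrected bits:", von_neumann_correct(raw))
```

Because at least half of the input bits are always discarded, the output rate is lower than the input rate, which is the throughput cost mentioned above.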
The initial value of the LFSR is called the seed, and because the operation of the register is
deterministic, the stream of values produced by the register is completely determined by its
current (or previous) state. Likewise, because the register has a finite number of possible states, it
must eventually enter a repeating cycle. However, an LFSR with a well-chosen feedback function
can produce a sequence of bits which appears random and which has a very long cycle.
Applications of LFSRs include generating pseudo-random numbers, pseudo-noise sequences, fast
digital counters, and whitening sequences. Both hardware and software implementations of
LFSRs are common.
The feedback tap numbers correspond to a primitive polynomial in the table, so the register cycles through the maximum number of states: all 256 possible states except the all-zeroes state. The bit positions that affect the next state are called the taps. In the diagram the taps are [7,6,4,3]. The rightmost bit of the LFSR is called the output bit. The taps are XOR'd sequentially with the output bit and then fed back into the leftmost bit. The sequence of bits in the rightmost position is called the output stream. The arrangement of taps for feedback in an LFSR can be expressed in finite field arithmetic as a polynomial mod 2. This means that the coefficients of the polynomial must be 1's or 0's.
This is called the feedback polynomial or reciprocal characteristic polynomial. For example, if the taps are at the 7th, 6th, 4th and 3rd bits (as shown), the feedback polynomial is $x^7 + x^6 + x^4 + x^3 + 1$. The 'one' in the polynomial does not correspond to a tap – it corresponds to the input to the first bit (i.e. $x^0$, which is equivalent to 1). The powers of the terms represent the tapped bits, counting from the left. The first and last bits are always connected as an input and output tap respectively. The LFSR is maximal-length if and only if the corresponding feedback polynomial is primitive. This means that the following conditions are necessary (but not sufficient): The number of taps should be even. The set of taps – taken all together, not pairwise (i.e. as pairs of elements) – must be relatively prime. In other words, there must be no divisor other than 1 common to all taps.
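A small software model makes the shift-and-feedback operation described above concrete. The sketch below is an assumption-laden illustration rather than the project's Verilog: the function names are hypothetical, and the tap set (8, 6, 5, 4) is a commonly tabulated maximal-length choice for an 8-bit register that is swapped in here in place of the taps quoted in the text.

```python
def fibonacci_lfsr_step(state, taps, nbits):
    """Advance a right-shifting Fibonacci LFSR by one clock.

    taps are 1-based positions counted from the left (leftmost bit = 1),
    so tap t reads bit (nbits - t) of the integer state.
    """
    feedback = 0
    for t in taps:
        feedback ^= (state >> (nbits - t)) & 1
    # Shift right and feed the XOR of the taps into the leftmost bit.
    return (state >> 1) | (feedback << (nbits - 1))


def lfsr_period(seed, taps, nbits):
    """Count clocks until the register first returns to its seed."""
    state = fibonacci_lfsr_step(seed, taps, nbits)
    steps = 1
    while state != seed:
        state = fibonacci_lfsr_step(state, taps, nbits)
        steps += 1
    return steps


if __name__ == "__main__":
    TAPS = (8, 6, 5, 4)  # assumed example tap set, not taken from the report
    print("period from seed 0x01:", lfsr_period(0x01, TAPS, 8))
    print("all-zero seed stays at:", fibonacci_lfsr_step(0x00, TAPS, 8))
```

The all-zero state maps to itself, which is why it is excluded from the maximal sequence.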
4.2 Galois LFSRs:
Named after the French mathematician Évariste Galois, an LFSR in Galois configuration,
which is also known as modular, internal XORs as well as one-to-many LFSR, is an alternate
structure that can generate the same output stream as a conventional LFSR (but offset in time). In
the Galois configuration, when the system is clocked, bits that are not taps are shifted one position
to the right unchanged. The taps, on the other hand, are XOR'd with the output bit before they are
stored in the next position. The new output bit is the next input bit. The effect of this is that when
the output bit is zero all the bits in the register shift to the right unchanged, and the input bit
becomes zero. When the output bit is one, the bits in the tap positions all flip (if they are 0, they
become 1, and if they are 1, they become 0), and then the entire register is shifted to the right and
the input bit becomes 1.
To generate the same output stream, the order of the taps is the counterpart (see above) of the
order for the conventional LFSR, otherwise the stream will be in reverse. Note that the internal
state of the LFSR is not necessarily the same.
The Galois register shown has the same output stream as the Fibonacci register in the first
section. A time offset exists between the streams, so a different start point will be needed to get
the same output each cycle. Galois LFSRs do not concatenate every tap to produce the new input
(the XOR'ing is done within the LFSR and no XOR gates are run in serial, therefore the
propagation times are reduced to that of one XOR rather than a whole chain), thus it is possible
for each tap to be computed in parallel, increasing the speed of execution. In a software
implementation of an LFSR, the Galois form is more efficient as the XOR operations can be
implemented a word at a time: only the output bit must be examined individually.
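The word-at-a-time update mentioned above can be sketched as follows. As before, this is a hypothetical software model: the function names are illustrative, and the toggle mask 0xB8 (tap positions 8, 6, 5, 4 of an 8-bit register) is an assumed example rather than a value from this report.

```python
def galois_lfsr_step(state, toggle_mask):
    """Advance a right-shifting Galois LFSR by one clock."""
    output_bit = state & 1
    state >>= 1
    if output_bit:
        # Flip all tap positions in a single XOR; the top bit of the
        # mask doubles as the new input bit.
        state ^= toggle_mask
    return state


def galois_period(seed, toggle_mask):
    """Count clocks until the register first returns to its seed."""
    state = galois_lfsr_step(seed, toggle_mask)
    steps = 1
    while state != seed:
        state = galois_lfsr_step(state, toggle_mask)
        steps += 1
    return steps


if __name__ == "__main__":
    MASK = 0xB8  # assumed 8-bit example mask
    print("period from seed 0x01:", galois_period(0x01, MASK))
```

Only the output bit is examined individually; every tap is then updated by one XOR on the whole word, which is the efficiency argument made in the text.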
Blum Blum Shub (BBS) is a pseudorandom number generator proposed in 1986 by Lenore Blum, Manuel Blum and Michael Shub (Blum et al., 1986). Blum Blum Shub takes the form $X_{i+1} = X_i^2 \bmod n$, where $n = p \times q$ is the product of two large primes p and q. At each step of the algorithm, some output is derived from $X_{i+1}$; the output is commonly the bit parity of $X_{i+1}$ or one or more of the least significant bits of $X_{i+1}$. The two primes, p and q, should both be congruent to 3 (mod 4).
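The recurrence can be tried out with a short Python sketch. The primes used in the demonstration (p = 11, q = 23) and the seed 3 are toy values chosen only so the arithmetic can be followed by hand; a real generator would need large primes, and this sketch does not check primality. The function name is hypothetical.

```python
from math import gcd


def bbs_bits(p, q, seed, count):
    """Generate `count` Blum Blum Shub output bits (LSB of each state)."""
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q must both be congruent to 3 mod 4")
    n = p * q
    if gcd(seed, n) != 1:
        raise ValueError("seed must be relatively prime to n")
    x = (seed * seed) % n      # X_0 = s^2 mod n
    bits = []
    for _ in range(count):
        x = (x * x) % n        # X_i = X_{i-1}^2 mod n
        bits.append(x % 2)     # B_i = X_i mod 2
    return bits


if __name__ == "__main__":
    # Toy parameters for illustration only.
    print(bbs_bits(11, 23, 3, 6))
```

With these toy values the states are 9, 81, 236, 36, 31, 202, 71, so the first six output bits are 1, 0, 0, 1, 0, 1.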
Steps for executing Blum Blum Shub Generator algorithm:
The Blum Blum Shub generator is known to be a cryptographically secure pseudorandom number generator (CSPRNG). The algorithm for the BBS generator is as follows:
1) Select two big prime numbers p and q, such that both numbers leave a remainder of 3 when divided by 4.
2) Compute n = p * q.
3) Choose a seed s that is relatively prime to n, which means that neither p nor q is a factor of s.
4) Compute $X_0 = s^2 \bmod n$.
5) Generate the subsequent values according to the formula $X_i = (X_{i-1})^2 \bmod n$.
6) Produce a sequence of binary digits according to the formula $B_i = X_i \bmod 2$.
The output sequence is $B_1, B_2, B_3, B_4, \ldots$
Pipelining Introduction
a) Pipelining: The name comes from the idea of a water pipe: water keeps being sent in without waiting for the water already in the pipe to come out. Pipelining leads to a reduction in the critical path, which either increases the clock speed (or sampling speed) or reduces the power consumption at the same speed in a DSP system.
b) Parallel Processing: Multiple outputs are computed in parallel in a clock period. The effective sampling speed is increased by the level of parallelism. Parallel processing can also be used to reduce the power consumption.
An instruction pipeline is a technique used in the design of computers to increase their instruction throughput (the number of instructions that can be executed in a unit of time).
The basic instruction cycle is broken up into a series called a pipeline. Rather than
processing each instruction sequentially (one at a time, finishing one instruction before starting
the next), each instruction is split up into a sequence of steps so different steps can be executed
concurrently (at the same time) and in parallel (by different circuitry). Pipelining increases
instruction throughput by performing multiple operations at the same time (concurrently) but does
not reduce instruction latency (the time to complete a single instruction from start to finish) as it
still must go through all steps. Indeed, it may increase latency due to additional overhead from
breaking the computation into separate steps and worse, the pipeline may stall (or even need to be
flushed), further increasing latency. Pipelining thus increases throughput at the cost of latency,
and is frequently used in CPUs, but avoided in real time systems, where latency is a hard
constraint. Each instruction is split into a sequence of dependent steps.
The first step is always to fetch the instruction from memory; the final step is usually
writing the results of the instruction to processor registers or to memory. Pipelining seeks to let
the processor work on as many instructions as there are dependent steps, just as an assembly line
builds many vehicles at once, rather than waiting until one vehicle has passed through the line
before admitting the next one. Just as the goal of the assembly line is to always keep each
assembler productive, pipelining seeks to keep every portion of the processor busy with some
instruction. Pipelining lets the computer's cycle time be the time of the slowest step, and ideally
lets one instruction complete in every cycle.
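The throughput and latency trade-off described above can be checked with simple arithmetic. The sketch below is a simplified timing model under stated assumptions: no stalls or flushes, no register overhead between stages, and hypothetical stage delays in arbitrary time units.

```python
def unpipelined_time(stage_delays, n_instructions):
    """Total time when each instruction finishes before the next starts."""
    return sum(stage_delays) * n_instructions


def pipelined_time(stage_delays, n_instructions):
    """Total time for an ideal pipeline with no stalls.

    The cycle time is set by the slowest stage; the first instruction
    needs one cycle per stage, and each later one completes a cycle after
    the previous instruction.
    """
    cycle = max(stage_delays)
    return (len(stage_delays) + n_instructions - 1) * cycle


if __name__ == "__main__":
    stages = [2, 3, 1, 2]  # hypothetical stage delays
    print("100 instructions, unpipelined:", unpipelined_time(stages, 100))
    print("100 instructions, pipelined  :", pipelined_time(stages, 100))
    print("single-instruction latency   :",
          unpipelined_time(stages, 1), "vs", pipelined_time(stages, 1))
```

In this model the pipelined design finishes 100 instructions much sooner, while the latency of a single instruction grows because every stage is stretched to the slowest stage's delay.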
A) The LUT-SR RNG: The LUT-SR generators provide a middle ground between the LUT-Opt [1] and LUT-FIFO generators [1], using cheap bit-wise shift registers to provide long periods and good quality without requiring expensive resources. The number of bits generated per cycle is generally chosen to meet the needs of the application.
B) Algorithm: The LUT-SR generator family uses a short and precise algorithm for expanding the full RNG structure. The algorithm [1] uses the parameters r, t and k, with period $2^r - 1$, where r is the number of random output bits generated per cycle, t is the XOR gate input count, and k is the maximum shift register length. The parameters (r, t, k) describe the properties of the generator in terms of application requirements and architectural restrictions. The algorithmic steps are as follows.
• Initial loading: The loading step is done by supplying a seed. For an r-bit generator the seed size is r. As soon as the seed is given, the bits are permuted. Any seed other than the all-zero state can be given. The seed is also known as the initial seed. The all-zero state cancels random number generation and makes the generator idle [1].
• Permutation: The simple dependency between adjoining bits is masked using a final output permutation. The model is shown in Fig. 1.
• Loading XOR connections: The permuted outputs are given as inputs to XOR gates. The number of inputs should not exceed r. The number of rounds should be t or t-1 [1], where t is the number of XOR gates given. Each permuted output bit is used at most t times. Some bits will be assigned the same FIFO bit in multiple rounds. The XOR-ed outputs are given to the PIPO SR and fed back to the FIFO extensions [1].
• PIPO SR: A universal shift register performs the shifting operation in addition to the parallel-in parallel-out function. A parallel-in parallel-out shift register processes multiple inputs at a time: its purpose is to take in parallel data, shift it, then output the data [1].
• FIFO Extension: 1-bit shift registers are used. Bitwise shift registers improve the rate of mixing [1]. For an 8-bit RNG, the length of the FIFO SR should not exceed k, where k = 8. The length of the shift register is given by the number of flip-flops. The outputs from the PIPO SR are fed back to the FIFO or SISO SR. A FIFO is a sequential data buffer that is very easy to use. Very small FIFOs can be implemented with flip-flops or register arrays, sometimes even with shift registers [1].
ModelSim Results
The initial seed for an 8-bit RNG is given or triggered through the PIPO SR. A shift register is an n-bit register that shifts its stored data by one bit position for every clock tick. The resulting sequence is fed back to the SISO SR, or FIFO SR. A permutation of the resulting outputs is given to the XOR gates. The output of the XOR gates is then given to the PIPO SRs, where the XOR gate outputs are shifted, and thus random number generation takes place. The random number generation is performed as per the methodology.
The simulations are performed in ModelSim 6.4a, and the design is synthesized using Xilinx PlanAhead for the Virtex-5 kit and verified on the Spartan-3E kit; the code is written in Verilog. The results obtained from the tools and the design summary obtained from Xilinx 8.1i are shown below. The initial seed is given as input, and the seed is permuted. The results for the 8-bit RNG are discussed below; the same scheme is carried out for the 64-bit RNG. The permuted bits are given to the XOR gates. For the 8-bit RNG the number of XOR gates is 8 (t = 8). Permutation is used to improve randomness among the bits and thus make the output unpredictable. The first and last bits are interchanged. The same concept of permutation is used for RNGs of different bit widths. The permuted outputs are fed into the XOR gates, and the remaining XOR gate inputs are assigned on a round basis. The resulting XOR gate output bits are fed in parallel into the PIPO SR, and the resulting outputs generate the random number cycle. The cycle is fed into the SISO SR [FIFO] of varying lengths (length = k).
The length should not exceed r. As each bit crosses the flip-flop, it will be set to zero. Hence, random number generation takes place. The resulting random numbers are generated such that their period is $2^r - 1$. If the number of bits is 16, then the period is $2^{16} - 1$. The all-zero state is excluded from the count, since it leads to the idle condition. The period is the duration after which the entire sequence repeats, based on the initial seed and the permutations. So, the periods for 32, 64, 128 and 512-bit RNGs are $2^{32} - 1$, $2^{64} - 1$, $2^{128} - 1$ and $2^{512} - 1$ respectively.
4.3 Tuning Circuitry
The architecture of the tuning circuitry is shown in Fig. 3. The target clock
frequency is determined by the set of parameter values actually selected. The random values
reached by the counter, as well as the jitter are related to the chosen parameters M and D (details
are discussed in Section IV). This makes it possible to tune the proposed TRNG using the
predetermined stored M and D values. As unrestricted DPR has been shown to be a potential threat
to the circuit [6], the safe operational value combinations of the D and M parameters for each DCM
are predetermined during the design time and stored on an on-chip Block RAM (BRAM) memory
block in the FPGA.
There are actually two different options for the clock generators – one can use the Phase
Locked Loop (PLL) hard macros available on Xilinx FPGAs, or the DCMs. We next describe
analytical and experimental results which compelled us to choose DCM in favor of the PLL
modules for clock waveform generation.
4.4 Circuit Behavior with PLL as Clock Generator
We first consider the operational principle for the PLL, and its feasibility as a component of the proposed TRNG. The Xilinx PLL synthesizes a clock signal whose frequency is given by:
$F_{CLKFX} = F_{CLKIN} \cdot \frac{M}{D}$ (1)
where $F_{CLKIN}$ is the frequency of the input clock signal, and M and D are the multiplication and division factors previously mentioned. Values of M and D can be varied to generate the required clock frequency. The two PLLs can be parametrized with the necessary set of (M, D) values to generate two slightly different clock frequencies. Without loss of generality, assume PLLA is set up to be slightly faster than PLLB, i.e. the time periods are related by $T_A < T_B$. On reaching the beat frequency interval (say, N clock cycles), by definition, PLLA completes one cycle more than the slower one.
The following equation depicts this simple model:
$\frac{T_A}{T_B} = \frac{N}{N + 1}$ (2)
where $N = 2n$ and n is the estimated maximum counter value. For the first n clock cycles, the counter does not increment, and then it increments by one for each of the next n clock cycles. Hence, the maximum counter value reached is n. Then, for $N \gg 1$, Eqn. (2) leads to:
$n = \frac{T_B}{2(T_B - T_A)}$ (3)
Using the design configuration parameters (M and D), one of the oscillators is made to run faster than the other. This is done in order to limit the range of counter values produced.
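Eqns. (1) and (3) can be evaluated numerically with a short sketch. The 100 MHz input clock is an assumption made only for this example, and the (M, D) pairs (15, 31) and (14, 29) are borrowed for illustration from the Sl. No. 1 configuration quoted for the DCMs later in this chapter; the function names are hypothetical. Exact fractions are used so that no rounding error enters the estimate.

```python
from fractions import Fraction


def synthesized_frequency(f_clkin, m, d):
    """Eqn. (1): F_CLKFX = F_CLKIN * M / D."""
    return Fraction(f_clkin) * m / d


def estimated_max_count(t_a, t_b):
    """Eqn. (3): n = T_B / (2 * (T_B - T_A)), valid for T_A < T_B."""
    if not t_a < t_b:
        raise ValueError("clock A must be the faster one (T_A < T_B)")
    return t_b / (2 * (t_b - t_a))


if __name__ == "__main__":
    F_CLKIN = 100_000_000  # assumed 100 MHz input clock, in Hz
    f_a = synthesized_frequency(F_CLKIN, 15, 31)
    f_b = synthesized_frequency(F_CLKIN, 14, 29)
    n = estimated_max_count(1 / f_a, 1 / f_b)
    print("F_A (MHz):", float(f_a) / 1e6)
    print("F_B (MHz):", float(f_b) / 1e6)
    print("estimated maximum counter value:", float(n))
```

For this pair of settings the estimate works out to 217.5 regardless of the input clock frequency, since the frequency cancels out of the ratio in Eqn. (3).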
If both oscillators were configured to run at the same frequency we might still get random numbers, but the maximum counter value produced would be very high (theoretically infinite) as per Eqn. (3). In other words, the latency of the circuit would be very high, since the counter sets and resets only after reaching a very large count value. When the Xilinx PLLs are used as clock generators, the predicted and observed counter values for all combinations of (M, D) values remain the same. This confirms that the Xilinx PLL instances demonstrate close-to-ideal behavior, are quasi-identical, and have negligible jitter between the waveforms they generate. Since the BFD-TRNG is critically dependent on the presence of jitter between the two generated clock waveforms, PLLs seem unsuitable as components of the proposed TRNG. Hence, we next examine the DCMs as clock generators.
4.5 Circuit Behavior with DCM as Clock Generator
Without loss of generality, the clock signal produced by one of the DCMs (say, DCMA) is slightly faster than the other (DCMB), implying $T_A < T_B$. This is ensured by assigning the design parameters M and D as in Eqn. (7). More details are discussed in Section IV-C. Timing diagrams of the DCM clock outputs and the resultant DFF response are shown in Fig. 4.
Fig. 7: Timing diagram of DCM output waveforms and the corresponding DFF response.
Let N be the number of clock cycles of the slower clock signal in which the faster clock signal completes exactly one cycle more. Then,
$t_A[N + 1] = (N + 1) T_A + \epsilon_A$ (4)
and
$t_B[N] = N T_B + \epsilon_B$ (5)
where $\epsilon_A$ and $\epsilon_B$ are the uncertainties due to jitter in DCMA and DCMB respectively. The uncertainties due to jitter in DCMA and DCMB are different because the DCMs are designed with distinct modeling parameters M and D. The corresponding jitter for each of the DCMs used in the proposed design is presented in Table III. For example, consider the configuration presented in Sl. No. 1. In this case, DCMA is configured with M=15 and D=31 and DCMB is configured with M=14 and D=29. This results in peak-to-peak jitter of 0.600 ns and 0.568 ns for DCMA and DCMB respectively. Of course, we also have: $t_A[N + 1] = t_B[N]$.
Assuming there is no metastability for the DFF if signal transitions occur in the setup-hold timing window around its driving clock edge (the metastability issue can be avoided by a cascaded DFF combination), the transition time ($t_d$) of the DFF, i.e. the time interval after which it sets (and the counter driven by the DFF resets), is estimated by:
$t_d = \frac{t_A[N + 1] + t_B[N]}{2} = \frac{(N + 1) T_A + N T_B + \epsilon_A + \epsilon_B}{2}$ (6)
From Eqn. (6), the transition time of the DFF is a random process. The output of the DFF, i.e. the time interval ($t_d$) after which the counter resets, is thus a random function. As a result, the count value obtained when the counter resets is also a random quantity.
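The transition-time estimate can be evaluated for concrete numbers with a small sketch. Time is measured here in units of the input clock period, so the periods 31/15 and 29/14 correspond to the Sl. No. 1 settings quoted above; the jitter values passed in are arbitrary placeholders rather than measured data, and the function names are hypothetical.

```python
from fractions import Fraction


def beat_cycles(t_a, t_b):
    """Solve (N + 1) * T_A = N * T_B for N (jitter-free case)."""
    return t_a / (t_b - t_a)


def dff_transition_time(n, t_a, t_b, eps_a, eps_b):
    """Average of the two edge times, each with its own jitter term."""
    t_a_edge = (n + 1) * t_a + eps_a   # Eqn. (4)
    t_b_edge = n * t_b + eps_b         # Eqn. (5)
    return (t_a_edge + t_b_edge) / 2   # Eqn. (6)


if __name__ == "__main__":
    T_A = Fraction(31, 15)  # period of the faster clock, in input-clock periods
    T_B = Fraction(29, 14)  # period of the slower clock
    N = beat_cycles(T_A, T_B)
    print("N =", N)
    print("t_d without jitter:", dff_transition_time(N, T_A, T_B, 0, 0))
    # Placeholder jitter values, chosen only to show how t_d shifts.
    print("t_d with jitter   :",
          float(dff_transition_time(N, T_A, T_B, Fraction(3, 5), Fraction(1, 5))))
```

Without jitter the transition time is fixed by the two periods alone; the jitter terms shift it by their average, which is where the randomness of the count value comes from in this model.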
CHAPTER-5
INTRODUCTION TO VLSI
The first semiconductor chips held one transistor each. Subsequent advances added more transistors and, as a result, more individual functions or systems were integrated over time. The first integrated circuits held only a few devices, perhaps as many as ten diodes, transistors, resistors and capacitors, making it possible to fabricate one or more logic gates on a single device. These early devices are now known retrospectively as "small-scale integration" (SSI); improvements in technique led to devices with hundreds of logic gates and then to large-scale integration (LSI), i.e. systems with at least a thousand logic gates. Current technology has moved far past this mark, and today's chips have many millions of gates and an even larger number of individual transistors.
At one time, there was an effort to name and calibrate various levels of large-scale integration above VLSI. Terms like ultra-large-scale integration (ULSI) were used. However, the huge number of gates and transistors available on common devices has rendered such fine distinctions moot.
Terms suggesting greater-than-VLSI levels of integration are no longer in widespread use. Indeed, even VLSI is now somewhat quaint, given the common assumption that all chips are VLSI or better.
2. Lower power consumption. Replacing a handful of standard parts with a single chip reduces
total power consumption. Reducing power consumption has a ripple effect on the rest of the
system: a smaller, cheaper power supply can be used; since less power consumption means less
heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic
shielding may be feasible, too.
3. Reduced cost. Reducing the number of components, the power supply requirements, cabinet
costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such that
the cost of a system built from custom ICs can be less, even though the individual ICs cost more
than the standard parts they replace. Understanding why integrated circuit technology has such
profound influence on the design of digital systems requires understanding both the
technology of IC manufacturing and the economics of ICs and digital systems.
Applications
Digital electronics control VCRs
Transaction processing system, ATM
Personal computers and Workstations
Medical electronic systems.
Etc….
Personal entertainment systems such as portable MP3 players and DVD players
perform sophisticated algorithms with remarkably little energy.
Electronic systems in cars operate stereo systems and displays; they also control fuel
injection systems, adjust suspensions to varying terrain, and perform the control
functions required for anti-lock braking (ABS) systems.
Digital electronics compress and decompress video, even at high-definition data
rates, on-the-fly in consumer electronics.
Low-cost terminals for Web browsing still require sophisticated electronics, despite
their dedicated function.
Personal computers and workstations provide word-processing, financial analysis,
and games. Computers include both central processing units (CPUs) and special-
purpose hardware for disk access, faster screen display, etc.
Medical electronic systems measure bodily functions and perform complex
processing algorithms to warn about unusual conditions. The availability of these
complex systems, far from overwhelming consumers, only creates demand for even
more complex systems.
The growing sophistication of applications continually pushes the design and
manufacturing of integrated circuits and electronic systems to new levels of
complexity. And perhaps the most amazing characteristic of this collection of
systems is its variety-as systems become more complex, we build not a few general-
purpose computers but an ever-wider range of special-purpose systems. Our ability
to do so is a testament to our growing mastery of both integrated circuit
manufacturing and design, but the increasing demands of customers continue to test
the limits of design and manufacturing
5.7 ASIC:
An Application-Specific Integrated Circuit (ASIC) is an integrated circuit (IC) customized
for a particular use, rather than intended for general-purpose use. For example, a chip designed
solely to run a cell phone is an ASIC. Intermediate between ASICs and industry standard integrated
circuits, like the 7400 or the 4000 series, are application specific standard products (ASSPs).
As feature sizes have shrunk and design tools improved over the years, the maximum
complexity (and hence functionality) possible in an ASIC has grown from 5,000 gates to over 100
million. Modern ASICs often include entire 32-bit processors, memory blocks including ROM,
RAM, EEPROM, Flash and other large building blocks. Such an ASIC is often termed a SoC
(system-on-a-chip). Designers of digital ASICs use a hardware description language (HDL), such
as Verilog or VHDL, to describe the functionality of ASICs.
Field-programmable gate arrays (FPGA) are the modern-day technology for building a
breadboard or prototype from standard parts; programmable logic blocks and programmable
interconnects allow the same FPGA to be used in many different applications. For smaller designs
and/or lower production volumes, FPGAs may be more cost effective than an ASIC design even in
production.
A structured ASIC falls between an FPGA and a standard cell-based ASIC.
Structured ASICs are used mainly for mid-volume designs. The design task for
structured ASICs is to map the circuit into a fixed arrangement of known cells.
CHAPTER-6
INTRODUCTION TO XILINX
To Migrate a Project:
1. In the ISE 12 Project Navigator, select File > Open Project.
2. In the Open Project dialog box, select the .XISE file to migrate. Note You may
need to change the extension in the Files of type field to display .npl (ISE 5 and ISE
6 software) or .ise (ISE 7 through ISE 10 software) project files.
3. In the dialog box that appears, select Backup and Migrate or Migrate Only.
4. If you chose to Backup and Migrate, a backup of the original project is created at
project_name_ise12migration.zip.
5. Implement the design using the new version of
the software. Note Implementation status is not maintained after migration.
6.2 Properties:
For information on properties that have changed in the ISE 12 software, see ISE 11 to ISE
12 Properties Conversion.
6.3 IP Modules:
If your design includes IP modules that were created using CORE Generator™ software
or Xilinx® Platform Studio (XPS) and you need to modify these modules, you may be required
to update the core. However, if the core netlist is present and you do not need to modify the core,
updates are not required, and the existing netlist is used during implementation.
The ISE 12 software supports most of the source types that were supported in the ISE 11 software.
If you are working with projects from previous releases, state diagram source files (.dia), ABEL source files (.abl), and test bench waveform source files (.tbw) are no longer supported. For state diagram and ABEL source files, the software finds an associated HDL file and adds it to the project, if possible. For test bench waveform files, the software automatically converts the TBW file to an HDL test bench and adds it to the project. To convert a TBW file after project migration, see Converting a TBW File to an HDL Test Bench.
6.5 Using ISE Example Projects:
To help familiarize you with the ISE® software and with FPGA and CPLD designs, a set
of example designs is provided with Project Navigator. The examples show different design
techniques and source types, such as VHDL, Verilog, schematic, or EDIF, and include different
constraints and IP.
To Open an Example
1. Select File > Open Example.
2. In the Open Example dialog box, select the Sample Project Name. Note To help
you choose an example project, the Project Description field describes each
project. In addition, you can scroll to the right to see additional fields, which
provide details about the project.
3. In the Destination Directory field, specify the directory for the example project.
4. Click OK.
The example project is extracted to the directory you specified in the Destination
Directory field and is automatically opened in Project Navigator. You can then run
processes on the example project and save any changes. Note If you modified an
example project and want to overwrite it with the original example project, select
File > Open Example, select the Sample Project Name, and specify the same
Destination Directory you originally used. In the dialog box that appears, select
Overwrite the existing project and click OK.
6.6 Creating a Project:
Project Navigator allows you to manage your FPGA and CPLD designs using an
ISE® project, which contains all the source files and settings specific to your design.
First, you must create a project and then add source files and set process
properties. After you create a project, you can run processes to implement, constrain, and
analyze your design. Project Navigator provides a wizard to help you create a project
as follows.
Note If you prefer, you can create a project using the New Project dialog
box instead of the New Project Wizard. To use the New Project dialog box, deselect
the Use New Project wizard option in the ISE General page of the Preferences dialog
box.
To Create a Project
1. Select File > New Project to launch the New Project Wizard.
2. In the Create New Project page, set the name, location, and project type, and
click Next.
3. For EDIF or NGC/NGO projects only: In the Import EDIF/NGC Project page,
select the input and constraint file for the project, and click Next.
4. In the Project Settings page, set the device and project properties, and click
Next.
5. In the Project Summary page, review the information, and click Finish to create
the project.
6. Project Navigator creates the project file (project_name.xise) in the directory you
specified. After you add source files to the project, the files appear in the
Hierarchy panel.
6.7 Design panel:
Project Navigator manages your project based on the design properties (top-level module
type, device type, synthesis tool, and language) you selected when you created the project. It
organizes all the parts of your design and keeps track of the processes necessary to move the
design from design entry through implementation to programming the targeted Xilinx® device.
Note For information on changing design properties, see Changing Design Properties.
You can now perform any of the following:
Create new source files for your project.
Design source files are left in their existing location, and the copied project
refers to them there.
Design source files, excluding generated files, are copied and placed in a
specified directory.
Copied projects are the same as other projects in both form and function. For example,
you can do the following with copied projects:
Open the copied project using the File > Open Project menu command.
View, modify, and implement the copied project.
Use the Project Browser to view key summary data for the copied project
and then, open the copied project for further analysis and implementation, as
described.
1. Select File > Copy Project. In the Copy Project dialog box, enter the Name for
the copy. Note The name for the copy can be the same as the name for the
project, as long as you specify a different location.
2. Enter a directory Location to store the copied project.
3. Optionally, enter a Working directory. By default, this is blank, and the working
directory is the same as the project directory. However, you can specify a
working directory if you want to keep your ISE® project file (.xise extension)
separate from your working area.
4. Optionally, enter a Description for the copy.
The description can be useful in identifying key traits of the project for reference
later. In the Source options area, do the following: Select one of the following
options:
5. A ZIP file is created in the specified directory. To open the archived project,
you must first unzip the ZIP file, and then, you can open the project.
CHAPTER-7
INTRODUCTION TO VERILOG
Verilog was one of the first modern hardware description languages. It was
created by Phil Moorby and Prabhu Goel during the winter of 1983/1984 at
"Automated Integrated Design Systems" (renamed Gateway Design
Automation in 1985) as a hardware modeling language. Gateway Design Automation was
purchased by Cadence Design Systems in 1990. Cadence now has full proprietary rights to
Gateway's Verilog and to Verilog-XL, the HDL simulator that would become the de facto
standard (of Verilog logic simulators) for the next decade. Originally, Verilog was intended to
describe and allow simulation; only afterwards was support for synthesis added.
7.1 Verilog-95:
With the increasing success of VHDL at the time, Cadence decided to make the
language available for open standardization. Cadence transferred Verilog into the public domain
under the Open Verilog International (OVI) (now known as Accellera) organization. Verilog was
later submitted to IEEE and became IEEE Standard 1364-1995, commonly referred to as Verilog-
95.
In the same time frame Cadence initiated the creation of Verilog-A to put standards
support behind its analog simulator Spectre. Verilog-A was never intended to be a standalone
language and is a subset of Verilog-AMS which encompassed Verilog-95.
Verilog 2001:
Extensions to Verilog-95 were submitted back to IEEE to cover the deficiencies that
users had found in the original Verilog standard. These extensions became IEEE Standard 1364-
2001 known as Verilog-2001.
Verilog-2001 is a significant upgrade from Verilog-95. First, it adds explicit support for
(2's complement) signed nets and variables. Previously, code authors had to perform signed
operations using awkward bit-level manipulations (for example, the carry-out bit of a simple 8-
bit addition required an explicit description of the Boolean algebra to determine its correct
value). The same function under Verilog-2001 can be more succinctly described by one of the
built-in operators: +, -, /, *, >>>. A generate/endgenerate construct (similar to VHDL's
generate/endgenerate) allows Verilog-2001 to control instance and statement instantiation
through normal decision operators (case/if/else). Using generate/endgenerate, Verilog-2001 can
instantiate an array of instances, with control over the connectivity of the individual instances.
File I/O has been improved by several new system tasks. And finally, a few syntax additions
were introduced to improve code readability (e.g. always @*, named parameter override, C-
style function/task/module header declaration). Verilog-2001 is the dominant flavor of Verilog
supported by the majority of commercial EDA software packages.
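The Verilog-2001 features listed above can be illustrated with a small sketch. The module name v2001_demo, its ports, and the WIDTH parameter are assumptions made for illustration and are not part of the project design.

```verilog
// Verilog-2001 sketch: ANSI-style header, signed arithmetic, always @*, generate.
// All names here are illustrative assumptions, not taken from the project.
module v2001_demo #(
    parameter WIDTH = 4
) (
    input  wire signed [7:0]       a,
    input  wire signed [7:0]       b,
    input  wire        [WIDTH-1:0] x,
    input  wire        [WIDTH-1:0] y,
    output reg  signed [8:0]       sum,     // one extra bit so the result cannot overflow
    output reg  signed [7:0]       half_a,  // arithmetic shift keeps the sign bit
    output wire        [WIDTH-1:0] x_and_y
);
    // always @* infers the sensitivity list from the signals that are read.
    always @* begin
        sum    = a + b;     // both operands are signed, so they are sign-extended
        half_a = a >>> 1;   // arithmetic right shift
    end

    // generate/endgenerate instantiates one AND gate per bit.
    genvar i;
    generate
        for (i = 0; i < WIDTH; i = i + 1) begin : g_and
            and u_and (x_and_y[i], x[i], y[i]);
        end
    endgenerate
endmodule
```

With a = -8 and b = 3, sum evaluates to -5 and half_a to -4, with no manual sign handling.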
Verilog 2005
Not to be confused with SystemVerilog, Verilog 2005 (IEEE Standard 1364-2005)
consists of minor corrections, spec clarifications, and a few new language features (such as the
uwire keyword). A separate part of the Verilog standard, Verilog-AMS, attempts to integrate
analog and mixed signal modelling with traditional Verilog.
SystemVerilog
SystemVerilog is a superset of Verilog-2005, with many new features and capabilities
to aid design verification and design modeling. As of 2009, the SystemVerilog and Verilog
language standards were merged into SystemVerilog 2009 (IEEE Standard 1800-2009).
In the late 1990s, the Verilog Hardware Description Language (HDL) became the most
widely used language for describing hardware for simulation and synthesis. However, the first
two versions standardized by the IEEE (1364-1995 and 1364-2001) had only simple
constructs for creating tests. As design sizes outgrew the verification capabilities of the language,
commercial Hardware Verification Languages (HVLs) such as OpenVera and e were created. Companies
that did not want to pay for these tools instead spent hundreds of man-years
creating their own custom tools. This productivity crisis (along with a similar one
on the design side) led to the creation of Accellera, a consortium of EDA companies and
users who wanted to create the next generation of Verilog. The donation of the OpenVera language
formed the basis for the HVL features of SystemVerilog. Accellera's goal was met in
November 2005 with the adoption of the IEEE standard P1800-2005 for SystemVerilog, IEEE
(2005). The most valuable benefit of SystemVerilog is that it allows the user to construct
reliable, repeatable verification environments, in a consistent syntax, that can be
used across multiple projects.
Examples
module main;
initial
begin
$display("Hello world!");
$finish;
end
endmodule
reg a, b, c, d;
wire e;
always @(b or e)
  begin
    a = b & e;
    b = a | b;
    #5 c = b;
    d = #6 c ^ e;
  end
The always clause above illustrates the other way the construct is used: it
executes whenever any entity in the sensitivity list changes, i.e. when b or e changes. When one of these
changes, a is immediately assigned a new value, and because of the blocking assignment b is assigned a
new value afterward (taking into account the new value of a). After a delay of 5 time units, c is
assigned the value of b, and the value of c ^ e is tucked away in an invisible store. Then, after 6 more
time units, d is assigned the value that was tucked away.
Signals that are driven from within a process (an initial or always block) must be of type
reg. Signals that are driven from outside a process must be of type wire. The keyword reg
does not necessarily imply a hardware register.
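As a minimal sketch of this rule, the module below drives the same AND function once from a continuous assignment (a wire) and once from an always block (a reg). The module and signal names are assumptions chosen for illustration.

```verilog
// reg vs. wire: both outputs describe the same combinational AND gate.
// Names are illustrative assumptions, not taken from the project.
module reg_wire_demo (
    input  wire a,
    input  wire b,
    output wire y_wire,   // driven by a continuous assignment
    output reg  y_reg     // driven from inside an always block
);
    // Driven outside a process, so the target must be a wire.
    assign y_wire = a & b;

    // Driven inside a process, so the target must be a reg,
    // even though no storage element is implied here.
    always @(a or b)
        y_reg = a & b;
endmodule
```

Both outputs follow a & b, which shows why the reg keyword on its own says nothing about whether a hardware register is created.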
7.2 Constants
The definition of constants in Verilog supports the addition of a width parameter. The basic
syntax is:
<width in bits>'<base letter><number>
Examples:
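A short sketch of sized constants in the four common bases follows. The module name and the chosen values are assumptions for illustration only.

```verilog
// Sized constants: width, then ', then a base letter (h, d, b, o), then digits.
// Module name and values are illustrative assumptions.
module constants_demo;
    reg [11:0] h;   // 12-bit value written in hexadecimal
    reg [19:0] d;   // 20-bit value written in decimal
    reg [3:0]  b;   // 4-bit value written in binary
    reg [5:0]  o;   // 6-bit value written in octal

    initial begin
        h = 12'h123;
        d = 20'd44;
        b = 4'b1010;
        o = 6'o77;
        $display("h=%h d=%0d b=%b o=%o", h, d, b, o);
    end
endmodule
```

Here 12'h123 is 291 in decimal, 4'b1010 is 10, and 6'o77 is 63; the width prefix fixes how many bits each constant occupies.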
There are several statements in Verilog that have no analog in real hardware, e.g.
$display. Consequently, much of the language cannot be used to describe hardware. The examples
presented here are the classic subset of the language that has a direct mapping to real gates.
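// A mux described with a case statement
// inside a procedure.
reg out;
always @(a or b or sel)
  begin
    case (sel)
      1'b0: out = b;
      1'b1: out = a;
    endcase
  end
// Finally - you can use if/else in a
// procedural structure.
reg out;
always @(a or b or sel)
  if (sel)
    out = a;
  else
    out = b;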
The next interesting structure is a transparent latch; it passes the input to the output when
the gate signal is set for "pass-through", and captures the input and stores it upon transition of the
gate signal to "hold". The output remains stable regardless of the input signal while the gate is set
to "hold". In the example below the "pass-through" level of the gate is when the value of the
if clause is true, i.e. gate = 1. This is read "if gate is true, din is fed to out continuously."
Once the if clause is false, the last value at out remains and is independent of the value of
din.
EX6: // Transparent latch example
reg out;
always @(gate or din)
  if (gate)
    out = din; // Pass through state
// Note that the else isn't required here. The variable
// out will follow the value of din while gate is high.
The flip-flop is the next significant template; in Verilog, the D-flop is the simplest, and it
can be modeled as:
reg q;
always @(posedge clk)
  q <= d;
The significant thing to notice in the example is the use of the non-blocking
assignment. A basic rule of thumb is to use <= when there is a
posedge or negedge statement within the always clause.
A variant of the D-flop is one with an asynchronous reset; there is a convention that
the reset state will be the first if clause within the statement.
reg q;
always @(posedge clk or posedge reset)
  if (reset)
    q <= 0;
  else
    q <= d;
The next variant is including both an asynchronous reset and asynchronous set condition;
again the convention comes into play, i.e. the reset term is followed by the set term.
reg q;
always @(posedge clk or posedge reset or posedge set)
  if (reset)
    q <= 0;
  else
    if (set)
      q <= 1;
    else
      q <= d;
Note: If this model is used to model a Set/Reset flip flop then simulation errors can result.
Consider the following test sequence of events. 1) reset goes high 2) clk goes high 3) set goes high
4) clk goes high again 5) reset goes low followed by 6) set going low. Assume no setup and hold
violations.
In this example the always @ statement would first execute when the rising edge of
reset occurs, which would set q to a value of 0. The next time the always block executes
would be the rising edge of clk, which again would keep q at a value of 0. The always
block then executes when set goes high, which, because reset is high, forces q to remain at
0. This condition may or may not be correct depending on the actual flip-flop. However,
this is not the main problem with this model. Notice that when reset goes low, set is still high.
In a real flip-flop this will cause the output to go to 1. However, in this model it will not
occur, because the always block is triggered by rising edges of set and reset, not by
levels. A different approach may be necessary for set/reset flip-flops.
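The event sequence described above can be replayed in simulation. The sketch below wraps the edge-triggered model in a module and applies the six events; the module names dff_sr and dff_sr_tb and the delay values are assumptions for illustration.

```verilog
// Edge-triggered set/reset model from the text, wrapped in a module.
// Module names and delays are illustrative assumptions.
module dff_sr (
    input  wire clk, reset, set, d,
    output reg  q
);
    always @(posedge clk or posedge reset or posedge set)
        if (reset)    q <= 0;
        else if (set) q <= 1;
        else          q <= d;
endmodule

// Replays the six events and prints q after reset and set are released.
module dff_sr_tb;
    reg clk, reset, set, d;
    wire q;

    dff_sr dut (.clk(clk), .reset(reset), .set(set), .d(d), .q(q));

    initial begin
        clk = 0; reset = 0; set = 0; d = 0;
        #10 reset = 1;   // 1) reset goes high
        #10 clk = 1;     // 2) clk goes high
        #10 set = 1;     // 3) set goes high
        #5  clk = 0;
        #5  clk = 1;     // 4) clk goes high again
        #10 reset = 0;   // 5) reset goes low while set is still high
        #1  $display("after reset falls: set=%b q=%b", set, q);
        #9  set = 0;     // 6) set goes low
        #1  $display("after set falls:   set=%b q=%b", set, q);
        $finish;
    end
endmodule
```

The first message reports set=1 with q=0: the falling edge of reset does not trigger the always block, so the model never sees that set is still asserted, whereas a real set/reset flip-flop would drive its output to 1 at that point.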
Note that there are no "initial" blocks mentioned in this description. There is a split
between FPGA and ASIC synthesis tools on this structure. FPGA tools allow
initial blocks where reg values are established instead of using a "reset" signal. ASIC synthesis
tools don't support such a statement. The reason is that an FPGA's initial state is
something that is downloaded into the memory tables of the FPGA. An ASIC is an actual
hardware implementation.
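As a hedged sketch of the FPGA-style approach, the module below gives a register its power-up value with an initial block instead of a reset input. The module name toggle_init is an assumption, and whether a given synthesis tool honors the initial value should be checked in that tool's documentation.

```verilog
// Power-up initialization without a reset signal (FPGA-style coding).
// Module name is an illustrative assumption.
module toggle_init (
    input  wire clk,
    output reg  q
);
    // Establishes the starting value of q. In simulation this runs at time 0;
    // FPGA tools that support it place the value in the configuration data.
    initial q = 1'b0;

    // q toggles on every rising clock edge, starting from the initial value.
    always @(posedge clk)
        q <= ~q;
endmodule
```

For an ASIC flow the same register would normally need an explicit reset term, as in the asynchronous-reset flip-flop shown earlier.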
There are two separate ways of declaring a Verilog process. These are the always and
the initial keywords. The always keyword indicates a free-running process. The initial keyword
indicates a process executes exactly once. Both constructs begin execution at simulator time 0, and
both execute until the end of the block. Once an always block has reached its end, it is rescheduled
(again). It is a common misconception to believe that an initial block will execute before an always
block. In fact, it is better to think of the initial-block as a special-case of the always-block, one
which terminates after it completes for the first time.
//Examples:
initial
begin
a =1;// Assign a value to reg a at time 0
#1;// Wait 1 time unit
b = a;// Assign the value of reg a to reg b
end
These are the classic uses for these two keywords, but there are two significant
additional uses. The most common of these is an always keyword without the @(...) sensitivity list.
It is possible to use always as shown below:
always
begin// Always begins executing at time 0 and NEVER stops
clk=0;// Set clk to 0
#1;// Wait for 1 time unit
clk=1;// Set clk to 1
#1;// Wait 1 time unit
end// Keeps executing - so continue back at the top of the begin
The always keyword acts similar to the "C" construct while (1) {..} in the sense that it will
execute forever.
The other interesting exception is the use of the initial keyword with the addition of
the forever keyword.
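A minimal sketch of the initial-plus-forever form is given here as a clock generator. The module name, the edge counter, and the 10-time-unit run length are assumptions added so the example is self-contained.

```verilog
// Clock generation with initial + forever. Names and timing are
// illustrative assumptions.
module forever_clock_demo;
    reg clk;
    integer edges;

    // The initial block starts once; forever keeps the toggle running.
    initial begin
        clk = 0;
        forever #1 clk = ~clk;
    end

    // Count rising edges so the effect of the clock is visible.
    initial edges = 0;
    always @(posedge clk)
        edges = edges + 1;

    // Stop the simulation, since the forever loop never ends by itself.
    initial begin
        #10;
        $display("rising edges in 10 time units: %0d", edges);
        $finish;
    end
endmodule
```

The clock rises at times 1, 3, 5, 7 and 9, so five rising edges are counted before the simulation stops. Unlike the always form, the initial block allows one-time setup (clk = 0) before the loop begins.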
initial
a =0;
initial
b = a;
initial
begin
#1;
$display("Value a=%b Value of b=%b", a, b);
end
What will be printed out for the values of a and b? Depending on the order of execution of
the initial blocks, it could be zero and zero, or alternately zero and some other arbitrary uninitialized
value. The $display statement will always execute after both assignment blocks have completed, due
to the #1 delay.
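One way to remove this race, sketched below under the assumption that both assignments may share a process, is to place them in a single initial block so that their order is fixed by the language. The module name is illustrative.

```verilog
// Race-free version: statements inside one begin/end block run in order.
// Module name is an illustrative assumption.
module no_race_demo;
    reg a, b;

    initial begin
        a = 0;   // always executes first
        b = a;   // always executes second, so b receives 0
        #1;
        $display("Value a=%b Value of b=%b", a, b);
    end
endmodule
```

Because the order is now defined, the message always reports a=0 and b=0, whichever simulator is used.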
7.6 Operators
Operator type    Operator symbols    Operation performed
Bitwise          ^                   Bitwise XOR
                 ~^ or ^~            Bitwise XNOR
Logical          !                   NOT
                 &&                  AND
                 ||                  OR
Reduction        ~|                  Reduction NOR
                 ^                   Reduction XOR
                 ~^ or ^~            Reduction XNOR
Arithmetic       +                   Addition
                 -                   Subtraction
                 -                   2's complement
                 *                   Multiplication
                 /                   Division
                 **                  Exponentiation (*Verilog-2001)
Concatenation    { , }               Concatenation
Conditional      ?:                  Conditional
Table 1 Verilog operators
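The operators in the table can be tried out with a short simulation. The sketch below applies several of them to two 4-bit values; the module name and operand values are assumptions for illustration.

```verilog
// Exercises operators from the table on two 4-bit operands.
// Module name and operand values are illustrative assumptions.
module operators_demo;
    reg [3:0] x, y;

    initial begin
        x = 4'b1100;
        y = 4'b1010;
        $display("bitwise XOR    x ^ y  = %b", x ^ y);    // 0110
        $display("bitwise XNOR   x ~^ y = %b", x ~^ y);   // 1001
        $display("logical AND    x && y = %b", x && y);   // 1 (both nonzero)
        $display("logical NOT    !x     = %b", !x);       // 0 (x is nonzero)
        $display("reduction XOR  ^x     = %b", ^x);       // 0 (even number of ones)
        $display("reduction NOR  ~|x    = %b", ~|x);      // 0 (some bit is set)
        $display("2's complement -x     = %b", -x);       // 0100
        $display("concatenation  {x, y} = %b", {x, y});   // 11001010
        $display("conditional    max    = %b", (x > y) ? x : y);  // 1100
    end
endmodule
```

Note that ^ acts as a bitwise operator when it has two operands and as a reduction operator when it has one, which is why it appears twice in the table.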
System tasks are available to handle simple I/O, and various design measurement functions.
All system tasks are prefixed with $ to distinguish them from user tasks and functions. This section
presents a short list of the most often used tasks. It is by no means a comprehensive list.
$dumpfile - Declare the VCD (Value Change Dump) format output file name.
$dumpvars - Turn on and dump the variables.
$dumpports - Turn on and dump the variables in Extended-VCD format.
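A minimal sketch of how $dumpfile and $dumpvars are typically used in a testbench follows. The module name, the file name dump_demo.vcd, and the small counter are assumptions for illustration.

```verilog
// Records every signal of a small counter into a VCD file.
// Module name, file name and timing are illustrative assumptions.
module dump_demo;
    reg clk;
    reg [3:0] count;

    initial begin
        $dumpfile("dump_demo.vcd");   // declare the VCD output file name
        $dumpvars(0, dump_demo);      // level 0: dump all signals at and below dump_demo
        clk = 0;
        count = 0;
        #40 $finish;                  // end the run so the file is closed
    end

    // Clock with a period of 10 time units.
    always #5 clk = ~clk;

    // Counter whose activity ends up in the waveform file.
    always @(posedge clk)
        count <= count + 1;
endmodule
```

After the run, the resulting VCD file can be opened in a waveform viewer to inspect clk and count over time.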
CHAPTER-8
SIMULATION RESULTS
CHAPTER 9
CONCLUSION
We have presented an improved fully digital tunable TRNG for FPGA based
applications, based on the principle of Beat Frequency Detection and clock jitter, and with
in-built error correction capabilities. The TRNG utilizes this tunability feature for
determining the degree of randomness, thus providing a high degree of flexibility for
various applications.
CHAPTER 10
FUTURE SCOPE
Presently, this project deals with a 4-digit numeric OTP system. To
improve security, an alphanumeric (symbol-based) OTP system needs to be
developed.
CHAPTER 11
BIBLIOGRAPHY
[1] D. B. Thomas and W. Luk, “The LUT-SR Family of Uniform Random Number
Generators for FPGA Architectures,” IEEE Transactions on Very Large Scale Integration
(VLSI) Systems, March 2012.
[2] D. B. Thomas and W. Luk, “FPGA-optimized uniform random number generators using
LUTs and shift registers,” in Proc. Int. Conf. Field Program. Logic Appl., 2010, pp. 77–82.
[3] D. B. Thomas and W. Luk, “FPGA-optimized high-quality uniform random number
generators,” in Proc. Field Program. Logic Appl. Int. Conf., 2008, pp. 235–244.
[4] D. B. Thomas and W. Luk, “High quality uniform random number generation using
LUT optimized state-transition matrices,” J. VLSI Signal Process., vol. 47, no. 1, pp. 77–92,
2007.
[5] F. Panneton, P. L’Ecuyer, and M. Matsumoto, “Improved long-period generators based
on linear recurrences modulo 2,” ACM Trans. Math. Software, vol. 32, no. 1, pp. 1–16,
2006.
[6] P. L’Ecuyer, “Tables of maximally equidistributed combined LFSR generators,”
Math. Comput., vol. 68, no. 225, pp. 261–269, 1999.
[7] M. Saito and M. Matsumoto, “SIMD-oriented fast Mersenne twister: A 128-bit
pseudorandom number generator,” in Monte Carlo and Quasi-Monte Carlo Methods. New York:
Springer-Verlag, 2006, pp. 607–622.
[8] M. Matsumoto and T. Nishimura, “Mersenne twister: A 623-dimensionally
equidistributed uniform pseudo-random number generator,” ACM Trans. Model.
Comput. Simul., vol. 8, no. 1, pp. 3–30, Jan. 1998.
[9] F. Panneton, P. L’Ecuyer, and M. Matsumoto, “Improved long-period generators based
on linear recurrences modulo 2,” ACM Trans. Math. Software, vol. 32, no. 1, pp. 1–16,
2006.
CHAPTER 12
SOURCE CODE
`timescale 1ns / 1ps
////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Target Device:
// Tool versions:
// Description:
//
//
// Dependencies:
//
// Revision:
// Additional Comments:
//
////////////////////////////////////////////////////////////////////////////////
module lfsr_tb1;
// Inputs
reg clk;
reg rst;
// Outputs
wire qout;
lfsr uut (
.clk(clk),
.rst(rst),
.qout(qout)
);
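initial begin
// Initialize Inputs
clk = 0;
rst = 1;
// Hold reset for 50 ns, then release it
#50
rst = 0;
#50
rst = 0;
end
// Free-running clock; the 10 ns period is an assumption, since the
// original test bench never toggles clk and the LFSR would not advance
always #5 clk = ~clk;
endmodule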