RTL Design Using VHDL
declare enumerated types and subtypes of array types in architectures and packages
We will also learn an approach to logic design called Register Transfer Level (RTL) or dataflow design. This is the
method currently used for the design of complex logic circuits such as microprocessors.
You should be able to:
select a sufficient set of registers and logic/arithmetic functions required to implement an algorithm
convert the algorithm into a sequence of register transfers through logic/arithmetic functions
write synthesizeable VHDL RTL code to implement the algorithm
We also cover three topics related to the design of interfaces to logic circuits: metastability, input synchronization
and glitches. You should be able to: identify circuits where metastable behaviour is possible; compute the mean time
between metastable outputs; identify circuits that could fail due to asynchronous inputs; add synchronizer flip-flops
to reduce the probability of metastability; remove race conditions by registering inputs; and use registered outputs to
eliminate glitches.
Reserved Words
In order to avoid declaring each component in every architecture where it is used, we typically place
component declarations in packages. A package
typically contains a set of component declarations
for a particular application. Packages are themselves
abs access after alias all and architecture array assert attribute begin block body buffer bus
case component configuration constant disconnect downto else elsif end entity exit file for function generate generic group guarded if impure in
inertial inout is label library linkage literal loop
map mod nand new next nor not null of on open
or others out package port postponed procedure
process pure range record register reject rem report return rol ror select severity signal shared
sla sll sra srl subtype then to transport type unaffected units until use variable wait when while
with xnor xor
library ieee ;
use ieee.numeric_bit.all ;
Exercise 35:
example?
Note that a component defines an interface to another device. That device may not have been designed with VHDL so there may not necessarily be a
corresponding entity declaration.
Creating Components
stored in libraries:
[Figure: a library containing a package, which contains several components]
A component declaration is similar to an entity declaration and defines the input and output signals.
Component declarations can be placed in an architecture before the begin. But it's usually more
convenient to put component declarations within a
package declaration. When we compile (or analyze) the package declaration the information about
the components in the package is saved in a file in the
WORK library. The components in the packages can
then be used in an architecture (in that same file or in
other files) by using the appropriate use statements.
For example, the following code declares a package called flipflops. This package contains only
one component, rs, with inputs r and s and an output q:
package flipflops is
component rs
port ( r, s : in bit ; q : out bit ) ;
end component ;
end flipflops ;
Component Instantiation
1 The logic synthesizer used to create the schematics in these
lecture notes.
2 An exception: when an architecture immediately follows its
entity you need not repeat the library and use statements.
up to the other signals in the architecture. It is a concurrent statement (as is a selected assignment statement).
The following example shows how three 2-input
exclusive-or gates can be used to build a 4-input
parity-check circuit using component instantiation.
This type of description is called structural VHDL
because we are defining the structure rather than the
behaviour of the circuit.
In this case we have put the component declaration
into the file mypackage.vhd. The xor_pkg contains
the xor2 component (although a typical package defines more than one component):
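The listing for mypackage.vhd does not appear in the text here. A minimal sketch of what such a package could look like follows. The port names a, b and x are assumptions (the parity architecture uses positional association, so only the port order matters), and the xor2 entity and architecture are added only so that the component has something to bind to.

```vhdl
-- mypackage.vhd (sketch; port names a, b, x are assumed, not from the notes)
package xor_pkg is
  component xor2
    port ( a, b : in bit ; x : out bit ) ;
  end component ;
end xor_pkg ;

-- A matching entity so that default binding (same name, same ports) works.
entity xor2 is
  port ( a, b : in bit ; x : out bit ) ;
end xor2 ;

architecture rtl of xor2 is
begin
  -- two-input exclusive-or
  x <= a xor b ;
end rtl ;
```

Because the component and the entity share the same name and port list, no configuration specification is needed for simulation or synthesis.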
A second file, parity.vhd, describes the parity
entity that uses the xor2 component:
-- parity function built from xor gates
use work.xor_pkg.all ;
entity parity is
port ( a, b, c, d : in bit ; p : out bit ) ;
end parity ;
architecture rtl of parity is
-- internal signals
signal x, y : bit ;
begin
x1: xor2 port map ( a, b, x ) ;
x2: xor2 port map ( c, x, y ) ;
x3: xor2 port map ( d, y, p ) ;
end rtl ;
The following comparison shows some rough equivalents between the VHDL concepts described above
and C programming³:
VHDL          C
analyze       compile
elaborate     link
component     function
instantiate   call
use           #include
package       DLL
library       directory
std_logic, std_logic_vector: declared in std_logic_1164
signed, unsigned: declared in std_logic_arith
The standard arithmetic operators (+, -, *, /, **,
>, <, <=, >=, =, /=) can be applied to signals of type
signed or unsigned. Note that it may not be practical or possible to synthesize complex operators such
as multiplication, division or exponentiation.
For example, we could generate the combinational
logic to build a 4-bit adder using the following architecture:
library ieee ;
use ieee.std_logic_1164.all ;
use ieee.std_logic_arith.all ;
entity adder4 is
port (
a, b : in unsigned (3 downto 0) ;
c : out unsigned (3 downto 0) ) ;
end adder4 ;
Constants
You can declare symbolic constants in the same way
as signals. For example:
constant zero_bits : unsigned (3 downto 0) := "0000" ;
A constant declared in a package is available to all
design units (packages, entities and architectures)
that use that package. You should use symbolic constants for any values that are likely to change or if it
makes your code easier to read or easier to modify.
Integers
VHDL also includes an integer type which is useful for specifying small constants (e.g. next_x <=
x + 1 ;). However, signals should be declared
std_logic or one of its subtypes, not integer.
Declarations sometimes use the natural (values
>= 0), and positive (values > 0) types. Integer
constants can be specified in non-decimal base. For
example, the value 2000 hex can be specified as:
16#2000#.
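The architecture body for the adder4 entity shown earlier does not appear in the text. A minimal sketch, assuming the carry out is simply discarded so that the sum fits the 4-bit output:

```vhdl
-- Sketch only: meant to follow the adder4 entity and its library/use
-- clauses (ieee.std_logic_1164 and ieee.std_logic_arith).
architecture rtl of adder4 is
begin
  -- "+" is overloaded for unsigned in std_logic_arith; the result has the
  -- width of the wider operand (4 bits here), so any carry is dropped.
  c <= a + b ;
end rtl ;
```

If a carry output were needed, one option would be to widen both operands by one bit before adding, but that changes the entity and is not part of the design described here.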
Type Declarations
package dsp_types is
type mode is (slow, medium, fast) ;
subtype word is std_logic_vector (15 downto 0) ;
end dsp_types ;
Although functions and operators are overloaded so that you can use the same function/operator with more than one type, in many cases
you will need to use type conversion functions.
In the tables below, lv is std_logic_vector, bv is bit_vector, un is unsigned and in is integer.
The following type conversion functions are found
in the std_logic_1164 package in the ieee library:
from   to   function
lv     bv   to_bitvector(x)
bv     lv   to_stdlogicvector(x)
Conversions involving unsigned and integer (the conv_ functions are declared in std_logic_arith):
from   to   function
lv     un   unsigned(x)
un     lv   std_logic_vector(x)
un     in   conv_integer(x)
in     un   conv_unsigned(x,len)
in     lv   conv_std_logic_vector(x,len)
Exercise 38:
Generics
package registers is
component nregister
generic ( width : integer ) ;
port ( d : in bit_vector (width-1 downto 0) ;
q : out bit_vector (width-1 downto 0) ;
clk : in bit ) ;
end component ;
end registers ;
Exercise 39:
Attributes
Conditional Assignment
library ieee ;
use ieee.std_logic_1164.all ;
entity nbits is port (
b : in std_logic_vector (3 downto 0) ;
n : out std_logic_vector (2 downto 0) ) ;
end nbits ;
architecture rtl of nbits is
begin
n <= "100" when b(3) = '1' else
"011" when b(2) = '1' else
"010" when b(1) = '1' else
"001" when b(0) = '1' else
"000" ;
end rtl ;
[Figure: schematic for nbits showing the output n selected from "000", "001", "010", "011" and "100" by the conditions b(0)=1, b(1)=1, b(2)=1 and b(3)=1]
Exercise 40: Write a conditional assignment that models a 2-to-1 multiplexer. Use an array x as the input, a signal sel to
select the input and a signal y as the output. Repeat for a 4-to-1
multiplexer (sel is now an array).
Tri-State Buses
A tri-state output can be set to high and low logic levels as well as to a third state: high-impedance (Z).
This type of output is used where different devices'
outputs are connected together and drive a common
bus (hopefully at different times!). To specify that an
output should be set to the high-impedance state, we
use a signal of type std_logic and assign it a value
of 'Z'.
The following example shows an implementation
of a 4-bit buffer with an enable input. When the
enable is not asserted the output is in high-impedance
mode:
library ieee ;
use ieee.std_logic_1164.all ;
Memory Models
VHDL also allows the use of arrays with signal indices to model random-access memory (RAM). The
following example demonstrates the use of VHDL
arrays as well as bi-directional buses. We must use
the type-conversion function conv_integer because
the address input, a, is of type unsigned while the
array index must be of type integer.
library ieee ;
use ieee.std_logic_1164.all ;
use ieee.std_logic_arith.all ;
entity ram is
port (
-- bi-directional data signal
d : inout std_logic_vector (7 downto 0) ;
-- address input
a : in unsigned (1 downto 0) ;
-- output enable and write strobe (clock)
oe, wr : in std_logic ) ;
end ram ;
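The architecture for the ram entity is not reproduced in the text. A minimal sketch follows, assuming a four-word memory (matching the 2-bit address), a write on the rising edge of wr, and a data bus that is driven only while oe is asserted. The names mem_array and mem are hypothetical.

```vhdl
-- Sketch only: meant to follow the ram entity and its library/use
-- clauses (ieee.std_logic_1164 and ieee.std_logic_arith).
architecture rtl of ram is
  -- four 8-bit words, indexed by an integer
  type mem_array is array (0 to 3) of std_logic_vector (7 downto 0) ;
  signal mem : mem_array ;
begin
  -- write: store the value on the data bus on the rising edge of wr
  process (wr)
  begin
    if wr'event and wr = '1' then
      mem(conv_integer(a)) <= d ;
    end if ;
  end process ;

  -- read: drive the bus when oe is asserted, otherwise release it
  d <= mem(conv_integer(a)) when oe = '1' else (others => 'Z') ;
end rtl ;
```

The external device must keep oe low while it drives d for a write; otherwise both sides drive the bus at once and the resolved value becomes 'X' in simulation.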
Exercise 41: Modify the design above to create a 16-element, 4-bit wide RAM with separate input and output signals. How could
you model a ROM?
Structural Design
For many implementation technologies (FPGAs,
gate arrays, or standard-cell ASICs) there are usually vendor-specific ways of implementing memory
arrays that give better results. However, using a
VHDL-only model with random logic as shown
above is more portable and may be practical for small
memories such as CPU register files.
Behavioral Design
RTL Design
Register Transfer Level, or RTL⁵, design lies between
a purely behavioral description of the desired circuit
and a purely structural one. An RTL description describes a circuit's registers and the sequence of transfers between these registers but does not describe the
hardware used to carry out these operations.
The steps in RTL design are: (1) determine the
number and sizes of registers needed to hold the
data used by the device, (2) determine the logic and
arithmetic operations that need to be performed on
these register contents, and (3) design a state machine whose outputs control how the register contents are updated in order to obtain the desired results.
Producing an RTL design is similar to writing a
computer program in a conventional programming
language. Choosing registers is the same as choosing variables. Designing the flow of data in the datapath is analogous to writing expressions involving the variables (registers) and operators (combinational functions). Designing the controller state machine is similar to deciding on the flow of control
within the program (if/then/else, while-loops, etc).
s = 0
s = s + a
s = s + b
s = s + c
s = s + d
[Figure: datapath block diagram with inputs a, b, c and d, a multiplexer and a register]
The RTL designer can trade off datapath complexity (e.g. using more adders and thus using more chip
area) against speed (e.g. having more adders means
fewer steps are required to obtain the result). RTL
design is well suited for the design of CPUs and
special-purpose processors such as disk drive controllers, video display cards, network adapter cards,
etc. It gives the designer great flexibility in choosing
between processing speed and circuit complexity.
[Figure, continued: adder and multiplexer, with control inputs from the controller and a clock input]
The diagram below shows a generic component in
the datapath. Each RTL design will be composed of
one of the following building blocks for each register. The structure allows the contents of each register
to be updated at the end of each clock period
with a value selected by the controller. The widths
of the registers, the types of combinational functions
and their inputs will be determined by the application. A typical design will include many of these
components.
[Figure: generic datapath building block: arithmetic/logic functions (inputs from registers) feeding a multiplexer and a register; multiplexer control from the controller; clock]
Exercise 45: Other datapaths could compute the same result.
Draw the block diagram of a datapath capable of computing the
sum of the four numbers in three clock cycles.
The first design unit is a package that defines a
new type, num, for eight-bit unsigned numbers and an
enumerated type, states, with six possible values.
nums are defined as a subtype of the unsigned type.
-- RTL design of 4-input summer
-- subtype used in design
library ieee ;
use ieee.std_logic_1164.all ;
use ieee.std_logic_arith.all ;
package averager_types is
subtype num is unsigned (7 downto 0) ;
type states is (clr, add_a, add_b, add_c,
add_d, hold) ;
end averager_types ;
The inputs to the datapath from the controller are
a 2-bit selector for the multiplexer and two control
signals to load or clear (set to 0) the register.
-- datapath
library ieee ;
use ieee.std_logic_1164.all ;
use ieee.std_logic_arith.all ;
use work.averager_types.all ;
entity datapath is
port (
a, b, c, d : in num ;
sum : out num ;
sel : in std_logic_vector (1 downto 0) ;
load, clear, clk : in std_logic
) ;
end datapath ;
The controller input is an update signal that tells our device to recompute the sum (presumably because one or more
of the inputs has changed).
This particular state machine sits at the hold
state until the update signal is true. It then sequences
through the other five states and then stops at the hold
state again. The other five states are used to clear the
register and to add the four inputs to the current value
of the register.
-- controller
library ieee ;
use ieee.std_logic_1164.all ;
use work.averager_types.all ;
entity controller is
port (
update : in std_logic ;
sel : out std_logic_vector (1 downto 0) ;
load, clear : out std_logic ;
clk : in std_logic
) ;
end controller ;
Exercise 46: Label the block diagram above with the bus widths
and signal names used in the entity.
What would happen if both clear and load inputs were asserted? Why do we need to define both sum_reg and sum signals?
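The body of the datapath architecture is not reproduced in this section. A minimal sketch consistent with the datapath entity follows. It assumes that clear takes priority over load when both are asserted and that the register is updated on the rising clock edge. The name mux_out is hypothetical; sum_reg is the internal register signal the exercise refers to.

```vhdl
-- Sketch only: meant to follow the datapath entity and its library/use
-- clauses (std_logic_1164, std_logic_arith, work.averager_types).
architecture rtl of datapath is
  signal sum_reg : num ;   -- the accumulator register
  signal mux_out : num ;   -- hypothetical name for the multiplexer output
begin
  -- multiplexer: the controller picks which input is added next
  with sel select mux_out <=
    a when "00",
    b when "01",
    c when "10",
    d when others ;

  -- register with synchronous clear and load
  process (clk)
  begin
    if clk'event and clk = '1' then
      if clear = '1' then            -- assumption: clear wins over load
        sum_reg <= (others => '0') ;
      elsif load = '1' then
        sum_reg <= sum_reg + mux_out ;
      end if ;
    end if ;
  end process ;

  -- copy the register to the output port
  sum <= sum_reg ;
end rtl ;
```

The 8-bit addition wraps around on overflow, since num has no room for a carry.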
-- controller outputs
with s select sel <=
"00" when add_a,
"01" when add_b,
"10" when add_c,
"11" when others ;
How many clock cycles will it take to compute the sum of the
four inputs?
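Only the output assignment of the controller architecture survives in the listing above. A minimal sketch of a complete architecture around that fragment follows, assuming a separate next-state process and no reset input (the entity has none). The name next_s is hypothetical, and the way load and clear are decoded from the state is an assumption that follows the description of the state machine.

```vhdl
-- Sketch only: meant to follow the controller entity and its library/use
-- clauses (ieee.std_logic_1164 and work.averager_types).
architecture rtl of controller is
  signal s, next_s : states ;   -- next_s is a hypothetical name
begin
  -- next-state logic: wait in hold, then clr, add_a .. add_d, back to hold
  process (s, update)
  begin
    case s is
      when hold =>
        if update = '1' then
          next_s <= clr ;
        else
          next_s <= hold ;
        end if ;
      when clr   => next_s <= add_a ;
      when add_a => next_s <= add_b ;
      when add_b => next_s <= add_c ;
      when add_c => next_s <= add_d ;
      when add_d => next_s <= hold ;
    end case ;
  end process ;

  -- state register (no reset: in simulation s starts at clr, the
  -- leftmost value of the states type)
  process (clk)
  begin
    if clk'event and clk = '1' then
      s <= next_s ;
    end if ;
  end process ;

  -- controller outputs
  with s select sel <=
    "00" when add_a,
    "01" when add_b,
    "10" when add_c,
    "11" when others ;
  clear <= '1' when s = clr else '0' ;
  load  <= '1' when s = add_a or s = add_b or s = add_c or s = add_d
           else '0' ;
end rtl ;
```

A real design would normally add a reset so that the state register powers up in a known state.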
load
11
end averager_components ;
[Figure: timing diagram showing the clock, the controller state (hold, clear, add_a, add_b, add_c, add_d, hold, hold) and the sum output reaching a+b, a+b+c and finally a+b+c+d]
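[Figure: timing within one clock period (tclock), showing the clock-to-output delay (tCO), the maximum propagation delay (tPD), the register setup time and the remaining timing margin at the register input; the clock edges mark the changes between state n-1, state n and state n+1]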
Using a single clock means we only need to compute the delay through combinational logic blocks
which is much simpler than having to predict the
effect of propagation delays on clock signals. This
is why almost all large-scale digital circuits are synchronous designs.
Synthesis tools can be asked to synthesize logic
that operates at a particular clock period. The synthesizer is supplied with the propagation delay specifications for the combinational logic components
available in the particular technology being used and
it will then try to arrange the logic so that the longest
propagation delay between any register output and
any register input is less than the clock period (minus
setup and clock-to-output delays). This ensures that
the circuit will work properly at the specified clock
rate.
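The constraint in the paragraph above can be written compactly. The symbols for the clock period, clock-to-output delay and propagation delay follow the labels of the timing figure; the symbol for the register setup time and the numbers in the example are assumptions chosen only for illustration.

```latex
\[
  t_{PD} < t_{clock} - t_{SU} - t_{CO}
  \quad\Longleftrightarrow\quad
  t_{clock} > t_{CO} + t_{PD} + t_{SU}
\]
% Illustrative values (not from the notes):
% t_CO = 2 ns, t_PD = 15 ns, t_SU = 3 ns
\[
  t_{clock} > 2\,\mathrm{ns} + 15\,\mathrm{ns} + 3\,\mathrm{ns} = 20\,\mathrm{ns}
  \quad\Rightarrow\quad
  f_{clock} < 50\,\mathrm{MHz}
\]
```

Any slack beyond this minimum period is the timing margin.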
Behavioural Synthesis
Metastability
Introduction
The proper operation of a clocked flip-flop depends
on the input being stable for a certain period of time
before (the setup time) and after (the hold time) the
clock edge. If the setup and hold time requirements
are met, the correct output will appear at a valid
output level (below VOL or above VOH) at the flip-flop
output after a maximum delay of tCO (the clock-to-output delay). However, if these setup and hold time
requirements are not met then the output of the flip-flop may take much longer than tCO to reach a valid
logic level. This is called metastable behaviour or
metastability.
An invalid logic level at the output of the flip-flop
may be interpreted by some logic gates as a 1 and
by others as a 0. This leads to unpredictable and
usually incorrect behaviour of the circuit.
In the synchronous circuits we have studied thus
far we have been able to prevent metastability by
clocking all flip-flops from the same clock and ensuring that the maximum propagation delay of any combinational logic path is less than the clock period minus the flip-flop setup time and clock-to-output delay.
However, when inputs to a synchronous circuit are
not synchronized to the clock, it is impossible to ensure that the setup and hold times will be met. This
will eventually lead to the incorrect behaviour of the
device. It is important to realize that all practical
logic circuits will eventually fail due to metastability. However, the designer should try to ensure that
these failures happen very infrequently (e.g. once per
10^3 or 10^6 years of operation) so that other causes of
failure predominate.
Reducing Metastability
The simplest approach is to slow down the clock
since this provides a longer time for the output of
the flip-flop to reach a stable output value. Because
the MTBF increases exponentially with t_MET a small
reduction in clock frequency will often be enough to
increase the MTBF to an acceptable value. However,
in other cases this approach will be unacceptable because the resulting clock rate will be too slow.
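The MTBF expression this argument relies on does not appear in this section. A commonly used model, assumed here to be the one intended, is shown below. The symbols for the clock frequency and the rate of asynchronous input changes are assumptions, and the numerical values are purely illustrative.

```latex
\[
  \mathrm{MTBF} = \frac{e^{C_2\, t_{MET}}}{C_1\, f_{clock}\, f_{data}}
\]
% Lengthening the resolution time by \Delta t multiplies the MTBF by
% e^{C_2 \Delta t}. With an assumed C_2 = 10 ns^{-1} and \Delta t = 1 ns:
\[
  \frac{\mathrm{MTBF}(t_{MET} + \Delta t)}{\mathrm{MTBF}(t_{MET})}
  = e^{C_2\, \Delta t} = e^{10} \approx 2.2 \times 10^{4}
\]
```

Under this model a smaller C1 or a larger C2 also increases the MTBF.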
Another approach is to use flip-flops with shorter
setup and hold times (and correspondingly smaller
C1 and larger C2 values). Whenever possible, these
Glitches
Glitches are short temporary changes in outputs that
are caused by different propagation delays in a circuit. There are two reasons why glitches are undesirable.
The first set of problems is related to noise and
power. Since glitches are short pulses much of their
energy is at high frequencies and this power couples easily onto adjacent conductors. This induces
noise into other circuits and reduces their noise immunity. Glitches also cause power supply current
spikes which result in voltage transients on the power
supply lines. Another problem with glitches is that
in CMOS logic families current consumption is proportional to the number of transistor switchings and
glitches lead to increased current consumption.
The second set of problems arises when the digital output of one circuit is used as a clock in another
circuit (e.g. to drive a counter or register). In this
case glitches cause undesired clock edges (similar to
switch bounce). In synchronous (single-clock) circuits these glitches are not a problem.
Glitches can be reduced by modifying the design
of the combinational logic. However, this usually introduces additional logic. Glitches on signals that
are confined to short paths within a circuit or inside
a chip are usually tolerated. However, when outputs
are brought off a chip, board or system (e.g. onto a
bus) it is good practice to eliminate glitches.
The simplest way to eliminate glitches is to use
a registered output signal. The output of a flip-flop
changes only once, on the clock edge, and thus eliminates any glitches on its input. There are two ways
to register outputs. Often it is possible to use register
outputs directly such as when an output is already in
a data register or when the signals are state machine
state registers. The second method is to pass the signal through an additional flip-flop before it is output.
The disadvantage of this method is that the output
will be delayed by up to one clock period.
As a general rule, always register outputs.
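The second method can be sketched in VHDL as follows. This is a minimal illustration rather than a design from these notes: the entity name registered_out, the signal y_comb and the xor function standing in for "some combinational logic" are all hypothetical.

```vhdl
library ieee ;
use ieee.std_logic_1164.all ;

-- hypothetical example entity: a combinational function with a
-- registered output
entity registered_out is
  port (
    a, b, clk : in std_logic ;
    y : out std_logic ) ;
end registered_out ;

architecture rtl of registered_out is
  signal y_comb : std_logic ;
begin
  -- combinational logic: may glitch if a and b change at slightly
  -- different times
  y_comb <= a xor b ;

  -- output flip-flop: y changes only on the rising clock edge, so
  -- glitches on y_comb between edges never reach the output
  process (clk)
  begin
    if clk'event and clk = '1' then
      y <= y_comb ;
    end if ;
  end process ;
end rtl ;
```

The cost is the delay mentioned above: y reflects the inputs as they were just before the most recent clock edge.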