Blaine Readler - Verilog by Example - A Concise Introduction For FPGA Design (2011, Full Arc Press) PDF
level here doesn’t necessarily mean just AND and OR gates, but
includes basic functional blocks such as muxes and flip-flops. It is at this
step that we find out if our code can be practically translated into
logic that can be implemented in an FPGA. The output from the
synthesis step looks very much like a netlist. Expensive stand-alone
synthesis tools are often used for large or complex designs, but
most FPGA vendor software includes synthesis that is quite
adequate for many applications.
Step 4: compile
Whereas the synthesis of step 3 still comprises somewhat
abstract logic constructs, the final compile step maps the synthesis
netlist-like logic description into the specific logic and routing
resources of the FPGA device. This step is always performed by
the vendor software. We can define pin assignments, or let the
tool automatically assign them (almost never done on all but the
most difficult designs). It is in this compile step that we find out if
the design that was synthesizable can actually be implemented in
our chosen device. The output of the compile step is the binary
load file that is used to configure the FPGA.
In and Out
First, a word about coding style is necessary. What you find in
this book are the author’s methods developed over many years of
practice. The goal should always be to produce readable code that
is easy to understand. Different people have different preferences,
though, and you will find as many individual styles as there are
people coding. About the only absolutely wrong style is no style,
i.e., where all the text is smashed to the left margin with no
indenting or consistent parenthetical blocking. You should note
that there are many shortcuts that could be taken with the code
used throughout this book. You'll never be wrong by including
optional parentheses or block flags, but you could very well cause
your code to synthesize in an unintended manner if you make
careless eliminations.
The synthesis tool expects certain standard file structures. We'll
start with almost the simplest design possible in order to introduce
the minimum requirements: two combinatorial operations on three
inputs. Here’s how it looks as logic block flow. Note that this box
represents the entire FPGA.
[Figure: simple_in_n_out block diagram, three inputs feeding two combinatorial outputs]
simple_in_n_out
The verilog code can be seen on the next page. The text file
implements one “module,” which for this simple design is the entire
design. The word “module” is a required keyword, and is followed
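module simple_in_n_out
(
    // Inputs
    in_1,
    in_2,
    in_3,
    // Outputs
    out_1,
    out_2
);
// Port definitions (I/O declarations)
input in_1;
input in_2;
input in_3;
output out_1;
output out_2;
// The right-hand side of out_1 is missing in the source; an AND is assumed.
assign out_1 = in_1 & in_2 & in_3;
assign out_2 = in_1 | in_2 | in_3;
endmodule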
A port list follows the name of the module, and defines all the
signals in and out of the module, separated by commas. I/O
declarations then follow, defining the direction of each signal listed
in the port list (as we'll see later, new verilog versions allow the
direction declarations to reside directly in the port list).
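As a rough sketch of the newer style mentioned above (Verilog-2001 onward), the direction declarations can sit directly in the port list. The module name “ansi_in_n_out” and the AND/OR logic are assumptions made for illustration, not the book's listing.

```verilog
// Hypothetical module: direction declarations inside the port list.
module ansi_in_n_out
(
    input  in_1,
    input  in_2,
    input  in_3,
    output out_1,
    output out_2
);
    // Two continuous assignments on three inputs (illustrative logic).
    assign out_1 = in_1 & in_2 & in_3;
    assign out_2 = in_1 | in_2 | in_3;
endmodule
```

The separate block of input/output lines disappears, which shortens the file but changes nothing about the logic that is built.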
Following the declarations, the design proper begins. In this
simplest case, the design consists of simple combinatorial
[Figure: intermed_wire logic diagram with inputs in_1, in_2, in_3, internal signal intermediate_sig, and outputs out_1, out_2]
intermed_wire
module intermed_wire
(
    // Inputs
    in_1,
    in_2,
    in_3,
    // Outputs
    out_1,
    out_2
);
// Port definitions (I/O declarations)
input in_1;
input in_2;
input in_3;
output out_1;
output out_2;
wire intermediate_sig;
endmodule
[Figure: bus_sigs logic diagram with 4-bit buses in_1[3:0] and in_2[3:0], single-bit in_3, and a 4-bit output bus]
Bus Signals
new bus have the same value as the original “in_3”. This is done
using a replication operator, where the value of the signal inside the
inner pair of braces is repeated the number of times indicated by
the number between the pairs of braces.
Note that “~” is a bitwise negation operator, i.e., it inverts each
bit of the vector signal (bus “in_3_bus”).
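The two operators just described can be sketched in isolation as follows. The module name “replicate_sketch” and the output “in_3_bus_n” are hypothetical, and the 4-bit width is an assumption taken from the bus widths in the figure.

```verilog
// Hypothetical module showing replication and bitwise negation.
module replicate_sketch ( in_3, in_3_bus, in_3_bus_n );
input        in_3;
output [3:0] in_3_bus;
output [3:0] in_3_bus_n;
// Replication: the inner value (in_3) is repeated 4 times,
// so every bit of in_3_bus equals in_3.
assign in_3_bus   = { 4{in_3} };
// Bitwise negation: every bit of the bus is inverted.
assign in_3_bus_n = ~in_3_bus;
endmodule
```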
module bus_sigs
(
    // Inputs
    in_1,
    in_2,
    in_3,
    // Outputs
    out_1
);
// Port definitions
input [3:0] in_1;
input [3:0] in_2;
input in_3;
output [3:0] out_1;
endmodule
Bus Signals
The next block logic diagram shows the logic gates of the
previous diagram collected together into a standard mux symbol.
Note that we have not changed the function, just the
representation.
[Figure: standard_mux, the gates of the previous diagram drawn as a single mux symbol]
Standard Mux
//////////////////////////////
//
// Header information
// Standard Mux
//
//////////////////////////////
module standard_mux
(
    // Inputs
    in_1,
    in_2,
    in_3,
    // Outputs
    out_1
);
// Port definitions
input [3:0] in_1;
input [3:0] in_2;
input in_3;
output [3:0] out_1;
endmodule
Standard Mux
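One common way to write a two-input mux like the one in the diagram is the conditional operator. This is only a sketch: the module name “mux_sketch” is hypothetical, and which input is selected when “in_3” is high is an assumption.

```verilog
// Hypothetical module: 4-bit two-input mux using the conditional operator.
module mux_sketch ( in_1, in_2, in_3, out_1 );
input  [3:0] in_1;
input  [3:0] in_2;
input        in_3;
output [3:0] out_1;
// Assumed polarity: in_3 high selects in_2, in_3 low selects in_1.
assign out_1 = in_3 ? in_2 : in_1;
endmodule
```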
[Figure: bus_breakout logic diagram, individual bits of in_1[3:0] and in_2[3:0] are broken out and recombined into out_1[5:0]]
Bus Breakout
module bus_breakout
(
    // Inputs
    in_1,
    in_2,
    // Outputs
    out_1
);
// Port definitions
input [3:0] in_1;
input [3:0] in_2;
output [5:0] out_1;
endmodule
Bus Breakout
Clocks and Registers
In the introduction, I indicated that this book assumes that you
have a working familiarity with digital design. The rubber is about
to meet the road.
Clocked state logic comprises the vast majority of the workings
of modern FPGAs, and it is here that the true complexity and
sophistication of any hardware descriptive language unfolds. The
fundamental principles of clocked operation in verilog, though, are
straightforward, and easy to grasp if we take them a step at a time.
Until now, our code has consisted of continuous assignments,
i.e., direct combinatorial logic. These “assign” statements are
continuous in the sense that the output signal (the one being
assigned) is continuously responsive to any and all inputs. Any
input that changes (and is not gated off by the intervening logic)
will immediately affect the output (ignoring physical delays).
Contrary to this, registers hold or store information, and therefore
require a different coding mechanism called a structured procedural
statement. The most common structured procedural statement,
and the one used almost exclusively for register implementations, is
the “always block.” There are a variety of flavors of this, but for
implementation (i.e., synthesis) of clocked registers, we use
exclusively the sequential, non-blocking version. That probably
doesn’t mean much to you, and that’s okay for now. It is helpful to
know that there are other forms in case you may happen across
them, but for the time being, an always-block is synonymous with a
register.
We'll begin by implementing the simplest form of a D-flop.
Since this represents the basis for the various forms of registers we
will continue to encounter, it is labeled as a “Reg.” As shown in the
timing diagram, output “out_1” follows input “in_1” at the clocked
edges.
[Figure: simple_dflop timing diagram (clk, in_1, out_1 over states s1 to s5) and block diagram of a single “Reg” clocked by clk]
Simple D-flop
For the sake of brevity we’ve modified the file format a bit in
the code on the opposite page (you'll get used to this as you look
across different people’s code).
We've added a new declaration for a “reg.” This is necessary
since we will be implementing output signal “out_1” as a register
type. This is in contrast to the “wire” declaration. We have not
previously needed to declare outputs explicitly as wires since in
verilog outputs default to wire types (it wouldn’t have been wrong
to declare all the previous outputs as wires, just not necessary).
The section of code shown as the always-block implements the
D-flop register. The information inside the parentheses next to the
“@” symbol is called the sensitivity list, and defines which signals
can contribute to changes inside the block. Specifically, no activity
inside the block can occur unless something in the sensitivity list
changes. In the case of our simplest of D-flop registers, the
sensitivity list contains just the clock signal. Further, “posedge”
defines the flop as rising-edge triggered (“negedge” would be
falling-edge triggered).
The operation is easy to see: at every rising clock edge (and
only at a rising clock edge), the value of “in_1” is assigned to
“out_1”. You may wonder why we use the two-part “<=” symbol
instead of a simple “=” for the assignment like we did with the
combinatorial assignments, and the answer is that this defines it as a
module simple_dflop ( clk, in_1, out_1 );
input clk;
input in_1;
output out_1;
reg out_1;
always @( posedge clk )
begin
    out_1 <= in_1;
end
endmodule
Simple D-flop
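To see why the non-blocking “<=” matters for clocked registers, consider a minimal sketch of two flops in series. The module name “two_stage_delay” and the signal “stage_1” are hypothetical names chosen for this illustration.

```verilog
// Hypothetical two-stage delay line (names are illustrative).
module two_stage_delay ( clk, in_1, out_1 );
input  clk;
input  in_1;
output out_1;
reg    stage_1;
reg    out_1;
// Both right-hand sides are sampled at the same clock edge,
// so out_1 receives the OLD value of stage_1: two flops in series.
always @( posedge clk )
begin
    stage_1 <= in_1;
    out_1   <= stage_1;
end
endmodule
```

Had the two lines used the blocking “=”, the second assignment would see the value just written by the first, and the two registers would collapse into one stage of delay.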
[Figure: dflop_n_reset timing diagram (clk, in_1, out_1, reset over states s1 to s7) and block diagram of a “Reg” with clk and reset inputs]
dflop_n_reset
In the code on the opposite page you can see that the always-
block has now grown to accommodate the reset. Since the reset is
asynchronous and results in activity immediately, it must be
included in the sensitivity list. Tagging it as “posedge” means that it
will be high-active: the flop resets as soon as the reset goes high,
but after the reset is lifted, the flop doesn’t change until the next
clock edge, thus only the rising edge of the reset requires immediate
attention.
The body of the always-block has now become more
complicated as we introduce if/else conditional statements to
accommodate the reset. Any time “reset” is high, “out_1” is
forced to zero. Since this happens as soon as reset goes active
(reset is part of the sensitivity list), and at every rising clock edge,
you can see that this effects an asynchronous clear. When reset is
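module dflop_n_reset ( clk, reset, in_1, out_1 );
input clk;
input reset;
input in_1;
output out_1;
reg out_1;
always @( posedge clk or posedge reset )
    if ( reset )
        out_1 <= 1'b0;
    else
        out_1 <= in_1;
endmodule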
[Figure: dflop_en_clr block diagram, a “Reg” with clk, reset, enable and clear inputs]
dflop_en_clr
input clk;
input reset;
input in_1;
input enable;
input clear;
output out_1;
reg out_1;
endmodule
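[Figure: clks_n_regs_4 block diagram, “start” and “stop” control cnt_en, which enables the count[3:0] counter; “stop” is also delayed through two registers to give stop_d2]
clks_n_regs_4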
in a sense a latch. Notice that since the counter is modulo 14, the
first “else if” statement clears it when the count is 13. A couple of
things to note here: “ 4’d13 ” indicates a decimal thirteen, and we’re
now using a double “&&” in the conditional expression. This is
because “&&” is a Boolean AND (versus the bitwise “&”), which is
required for the conditional decision. In the same sense, “||” is a
Boolean OR (versus the bitwise “|”). Note that we use “ 4’h0 ” for
clearing the counter. This indicates a hex zero. It could just as well
have been 4’b0000, or 4’d0. Similarly, the 4’d13 modulo rollover
could have been the slightly less readable 4’hD, or even 4’b1101.
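The points above about number formats and the logical versus bitwise operators can be checked in a simulator with a few lines. This is a simulation-only sketch; the module name “literal_demo_sketch” is hypothetical.

```verilog
// Hypothetical simulation-only module (not synthesizable logic).
module literal_demo_sketch;
initial
begin
    // Three spellings of the same 4-bit value (thirteen).
    if ( 4'd13 == 4'hD && 4'hD == 4'b1101 )
        $display( "4'd13, 4'hD and 4'b1101 are equal" );
    // Logical AND: each operand is treated as true (non-zero) or false.
    $display( "4'b1010 && 4'b0101 = %b", 4'b1010 && 4'b0101 );  // 1
    // Bitwise AND: works bit by bit; no bit position has two ones here.
    $display( "4'b1010 &  4'b0101 = %b", 4'b1010 & 4'b0101 );   // 0000
end
endmodule
```

The last two lines show why the distinction matters in a condition: both operands are "true", yet their bitwise AND is zero.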
module clks_n_regs_4 ( clk, reset, start, stop, count );
input clk;
input reset;
input start;
input stop;
output [3:0] count;
reg cnt_en;
reg [3:0] count;
reg stop_d1;
reg stop_d2;
// SR flop
always @( posedge clk or posedge reset )
begin
    if ( reset )
        cnt_en <= 1'b0;
    else if ( start )
        cnt_en <= 1'b1;
    else if ( stop )
        cnt_en <= 1'b0;
end
// Counter
always @( posedge clk or posedge reset )
begin
    if ( reset )
        count <= 4'h0;
    else if ( cnt_en
           && count == 4'd13
            )
        count <= 4'h0;
    else if ( cnt_en )
        count <= count + 1;
end
// delay
always @( posedge clk or posedge reset )
begin
    if ( reset )
    begin
        stop_d1 <= 1'b0;
        stop_d2 <= 1'b0;
    end
    else
    begin
        stop_d1 <= stop;
        stop_d2 <= stop_d1;
    end
end
endmodule
    if ( cnt_en
      && count == 4'd13
       )
        count <= 4'h0;
    else if ( cnt_en )
        count <= count + 1;
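[Figure: state_machine diagram with states “idle”, “active”, “finish” and “abort”; inputs go and kill; a count[6:0] counter that runs while “active” and is cleared in “finish” or “abort”; the transition to “finish” occurs when the count reaches 100; output done]
State Machine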
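// State Machine
module state_machine_1 ( clk, reset, go, kill, done ); // module name assumed
input clk;
input reset;
input go;
input kill;
output done;
reg done;
reg [6:0] count;
reg [1:0] state_reg;
// State parameters (the text gives idle = 2'b00; the other encodings are assumed)
parameter idle   = 2'b00;
parameter active = 2'b01;
parameter finish = 2'b10;
parameter abort  = 2'b11;
// State Machine
always @( posedge clk or posedge reset )
begin
    if ( reset )
        state_reg <= idle;
    else
        case ( state_reg )
            idle:
                if ( go )    state_reg <= active;
            active:
                if ( kill )  state_reg <= abort;
                else if
                   ( count == 7'd100 ) state_reg <= finish;
            finish:
                             state_reg <= idle;
            abort:
                if ( !kill ) state_reg <= idle;
            default:
                             state_reg <= idle;
        endcase
end
// Counter
always @( posedge clk or posedge reset )
begin
    if ( reset )
        count <= 7'h00;
    else if ( state_reg == finish
           || state_reg == abort
            )
        count <= 7'h00;
    else if ( state_reg == active )
        count <= count + 1;
end
// done register
always @( posedge clk or posedge reset )
begin
    if ( reset )
        done <= 1'b0;
    else if ( state_reg == finish )
        done <= 1'b1;
    else
        done <= 1'b0;
end
endmodule
State Machine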
contains “idle” (2’b00). Each clock, the case selection executes the
idle group, where if “go” is not high (not active) then nothing is
done, so that for the next clock “state_reg” still contains “idle.”
Eventually “go” transitions high, and “state_reg” is assigned
“active”. This corresponds to the first transition of the state
machine. For the next clock, the case statement selects for
execution the “active” group, where “state_reg” remains unchanged
until either “kill” goes high, or the counter reaches its terminal value
(decimal 100), when the state machine then transitions to “abort” or
“finish” respectively.
We'll not detail the entire machine operation, as you’ve surely
gotten the gist by now. Note, however, that “!” is used to indicate
“not kill.” This is the same as “ kill == 1’b0 ”. Like the double
“&&” and “||”, “!” is a logical operator, and is normally used in
conditional expressions. The “~” symbol (a bitwise negation) is
usually used in combinatorial assignments. Since the results are
often the same, designers sometimes use them indiscriminately.
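A short simulation-only sketch shows where the two negations agree and where they part ways. The module name “not_vs_negate_sketch” and the 4-bit signal “bus” are hypothetical.

```verilog
// Hypothetical simulation-only module comparing "!" and "~".
module not_vs_negate_sketch;
reg       kill;
reg [3:0] bus;
initial
begin
    kill = 1'b0;
    bus  = 4'b1010;
    // One-bit signal: logical and bitwise negation give the same result.
    $display( "!kill = %b, ~kill = %b", !kill, ~kill );   // 1, 1
    // Multi-bit signal: "!" yields a single true/false bit,
    // while "~" inverts every bit of the vector.
    $display( "!bus  = %b, ~bus  = %b", !bus, ~bus );     // 0, 0101
end
endmodule
```

For single-bit signals such as “kill” the choice is a matter of style; for buses the two operators answer different questions.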
We'll now review the coding structure. Normally the
assignment statement (e.g. “state_reg <= active”) follows the
conditional statement on the next line. Here, though, we have it
following on the same line. Verilog doesn’t care, and this allows for
a visually coherent form: the state machine operation is easily
understood based on the transition decisions. We note that this is
only possible because this always-block contains nothing but the
state machine. If it didn’t (as we'll soon see), then we would have
to block multiple assignments with begin/end borders, ruining the
regular matrix structure.
Finally, be aware that many synthesis programs require the
default statement, even if the case statement already includes all
possible selection branch combinations (considered “full”). The
label “default” is a keyword (it was not defined as a parameter).
The counter and output register of this module are similar to
those we’ve already looked at. Note that we decode state machine
states directly in these blocks using the “state_reg” register signal
and the state parameters. Also note that the “done” output is set to
one based on a conditional “else if” test of the state machine. Most
newer synthesis tools allow a more direct form:
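The listing for that direct form is not reproduced here. One plausible reading, offered only as a sketch, is to assign the result of the comparison itself to the register. The module name “done_direct_sketch”, the use of “state_reg” as an input port, and the encoding of “finish” are all assumptions made to keep the example self-contained.

```verilog
// Hypothetical module: "done" takes the result of the comparison directly.
module done_direct_sketch ( clk, reset, state_reg, done );
parameter finish = 2'b10;   // assumed encoding
input        clk;
input        reset;
input  [1:0] state_reg;
output       done;
reg          done;
always @( posedge clk or posedge reset )
begin
    if ( reset )
        done <= 1'b0;
    else
        // The comparison evaluates to 1'b1 or 1'b0, replacing the if/else pair.
        done <= ( state_reg == finish );
end
endmodule
```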
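module state_machine_2 ( clk, reset, go, kill, done ); // module name assumed
input clk;
input reset;
input go;
input kill;
output done;
reg done;
reg [6:0] count;
reg [1:0] state_reg;
// State parameters (encodings other than idle are assumed)
parameter idle   = 2'b00;
parameter active = 2'b01;
parameter finish = 2'b10;
parameter abort  = 2'b11;
// State Machine
always @( posedge clk or posedge reset )
begin
    if ( reset )
    begin
        state_reg <= idle;
        count <= 7'h00;
        done <= 1'b0;
    end
    else
        case ( state_reg )
            idle:
            begin
                count <= 7'h00;
                done <= 1'b0;
                if ( go )
                    state_reg <= active;
            end
            active:
            begin
                count <= count + 1;
                done <= 1'b0;
                if ( kill )
                    state_reg <= abort;
                else if ( count == 7'd100 )
                    state_reg <= finish;
            end
            finish:
            begin
                count <= 7'h00;
                done <= 1'b1;
                state_reg <= idle;
            end
            abort:
            begin
                count <= 7'h00;
                done <= 1'b0;
                if ( !kill )
                    state_reg <= idle;
            end
            default:
            begin
                count <= 7'h00;
                done <= 1'b0;
                state_reg <= idle;
            end
        endcase
end
endmodule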
they operate exactly the same, and have the same input/outputs, we
could use either one.
[Figure: modular_1 block diagram, three state_machine instances (go_delay_1, go_delay_2, go_delay_3) with inputs go_1/kill_1, go_2/kill_2, go_3/kill_3 and outputs done_1, done_2, done_3; a kill latch set by any kill input and cleared by kill_clr; common clk]
modular_1
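module modular_1
( clk, reset, go_1, kill_1, go_2, kill_2, go_3, kill_3, kill_clr,
  done_1, done_2, done_3, kill_ltchd
);
input clk;
input reset;
input go_1;
input kill_1;
input go_2;
input kill_2;
input go_3;
input kill_3;
input kill_clr;
output done_1;
output done_2;
output done_3;
output kill_ltchd;
reg kill_ltchd;
// first module instantiation (reconstructed from the block diagram)
state_machine_1 go_delay_1
(
    .reset ( reset ),
    .clk   ( clk ),
    .go    ( go_1 ),
    .kill  ( kill_1 ),
    .done  ( done_1 )
);
// second module instantiation
state_machine_1 go_delay_2
(
    .reset ( reset ),
    .clk   ( clk ),
    .go    ( go_2 ),
    .kill  ( kill_2 ),
    .done  ( done_2 )
);
// third module instantiation (reconstructed from the block diagram)
state_machine_1 go_delay_3
(
    .reset ( reset ),
    .clk   ( clk ),
    .go    ( go_3 ),
    .kill  ( kill_3 ),
    .done  ( done_3 )
);
// Kill Latch
always @( posedge clk or posedge reset )
begin
    if ( reset )
        kill_ltchd <= 1'b0;
    else if ( kill_1
           || kill_2
           || kill_3
            )
        kill_ltchd <= 1'b1;
    else if ( kill_clr )
        kill_ltchd <= 1'b0;
end
endmodule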
[Figure: modular_2 block diagram, the three state_machine instances are chained: done_1 is ORed with go_2 to start go_delay_2, and done_1, done_2 and go_3 are ORed to start go_delay_3, whose output is done_out; the kill latch (kill_ltchd) is set by any kill input and cleared by kill_clr]
modular_2
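module modular_2
( clk, reset, go_1, kill_1, go_2, kill_2, go_3, kill_3, kill_clr,
  done_out, kill_ltchd
);
input clk;
input reset;
input go_1;
input kill_1;
input go_2;
input kill_2;
input go_3;
input kill_3;
input kill_clr;
output done_out;
output kill_ltchd;
reg kill_ltchd;
wire done_1;
wire done_2;
// first module instantiation (reconstructed from the block diagram)
state_machine_1 go_delay_1
(
    .reset ( reset ),
    .clk   ( clk ),
    .go    ( go_1 ),
    .kill  ( kill_1 ),
    .done  ( done_1 )
);
// second module instantiation
state_machine_1 go_delay_2
(
    .reset ( reset ),
    .clk   ( clk ),
    .go    ( done_1 | go_2 ),
    .kill  ( kill_2 ),
    .done  ( done_2 )
);
// third module instantiation
state_machine_1 go_delay_3
(
    .reset ( reset ),
    .clk   ( clk ),
    .go    ( done_1
           | done_2
           | go_3
           ),
    .kill  ( kill_3 ),
    .done  ( done_out )
);
// Kill Latch
always @( posedge clk or posedge reset )
begin
    if ( reset )
        kill_ltchd <= 1'b0;
    else if ( kill_1
           || kill_2
           || kill_3
            )
        kill_ltchd <= 1'b1;
    else if ( kill_clr )
        kill_ltchd <= 1'b0;
end
endmodule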
Now that we’ve revealed this, it’s like cracking the top of
Pandora’s Box. You can see the attraction: simple and short, but
also prone to mistakes. In fact, I go so far as to consider this
method a trap waiting to be sprung. The reason is that port
misconnections are not obvious, and if the mismatched port types
are the same, then the synthesis tool will not flag a problem.
Mismatches can happen if, for example, modifications are made to
the instantiated module that cause the ordering of the port list to
change (remember that often the instantiated module is not under
your control). Worse still, verilog allows you to leave unconnected
output ports of instantiated modules out of the port list. This
means that if, for example, a new output signal is added to your
instantiated module (this happens quite often), shifting down all
the original signals, and if the last signal is an output, it is now
unconnected (and signals above are misconnected), but no error
flag is raised by the tools.
I urge you to resist.
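To make the risk concrete, here is a minimal sketch contrasting connection by order with connection by name. The modules “and_or” and “port_style_sketch” and all of their signal names are hypothetical.

```verilog
// Hypothetical leaf module used only for this illustration.
module and_or ( a, b, y_and, y_or );
input  a, b;
output y_and, y_or;
assign y_and = a & b;
assign y_or  = a | b;
endmodule

module port_style_sketch ( in_1, in_2, out_1, out_2 );
input  in_1, in_2;
output out_1, out_2;
wire   unused_or, unused_and;
// By order: the meaning depends entirely on the order of and_or's port list.
// If that list is ever reordered, this line silently connects the wrong signals.
and_or u_by_order ( in_1, in_2, out_1, unused_or );
// By name: each connection is explicit, so the order here does not matter.
and_or u_by_name  ( .y_or ( out_2 ), .y_and ( unused_and ), .a ( in_1 ), .b ( in_2 ) );
endmodule
```

Both instances compile today; only the named one keeps working if someone rearranges the ports of “and_or”.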
Memories
Memories are an important component in many fields of digital
design, and they come in a variety of forms: DRAM, SDRAM,
DDR, QDR, SRAM, FIFO, LIFO, DP, etc. Of these, the first
four are of course not (yet) available for FPGA implementation, but
almost any other form imaginable has probably been implemented.
Memory design in FPGAs is another topic that could be a whole
book unto itself, and here we will simply review the fundamentals
of designing memories using verilog.
Memories implemented in FPGAs (versus memory controllers,
which would also include the DRAMs, etc.) can be defined in three
general ways:
1) infer the memory directly via the verilog code;
2) build the memory using the vendor’s primitive RAM
structures;
3) design the memory using the FPGA vendor’s specialized
tools.
We'll discuss the last two first. The second option (primitive
RAM structures) uses RAM resources that are built into the FPGA
device fabric, and thus are the most efficient means of building
memory functions (and if you don’t use them, then they represent
valuable substrate that goes unused). Each RAM block occupies a
fixed amount of FPGA die, and they usually have a limited degree
of flexibility as to their depth versus width (aspect ratio). These
RAM blocks are an example of primitive cores discussed in the
previous section, and as explained there, when using these in a
verilog design, they are instantiated as black box modules. In this
mode, it’s up to you the designer to build up in verilog any
associated control logic, such as circular addressing for FIFOs, logic
for the FIFO depth flags, etc. Most built-in FPGA RAM blocks
can be configured to operate as dual-port memories, vastly
simplifying many designs.
module simple_dp_mem
( clk,
  reset,
  dat_in,
  wr_adr,
  wr_en,
  dat_out,
  rd_adr
);
// Memory
//
always @( posedge clk )
begin
    if (wr_en)
        memory[wr_adr] <= dat_in;
    dat_out <= memory[rd_adr];
end
endmodule
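Since the declarations of the listing above are not shown, the following is a self-contained sketch of the same inferred-memory idea. The module name “inferred_dp_mem_sketch” and the parameters “DATA_WIDTH” and “ADDR_WIDTH” (with their values) are assumptions, not the book's figures.

```verilog
// Hypothetical module: simple dual-port memory inferred from an array of regs.
module inferred_dp_mem_sketch ( clk, dat_in, wr_adr, wr_en, dat_out, rd_adr );
parameter DATA_WIDTH = 8;   // assumed width
parameter ADDR_WIDTH = 4;   // assumed depth of 16 words
input                     clk;
input  [DATA_WIDTH-1:0]   dat_in;
input  [ADDR_WIDTH-1:0]   wr_adr;
input                     wr_en;
output [DATA_WIDTH-1:0]   dat_out;
input  [ADDR_WIDTH-1:0]   rd_adr;
reg    [DATA_WIDTH-1:0]   dat_out;
// The memory itself: 2**ADDR_WIDTH words of DATA_WIDTH bits each.
reg    [DATA_WIDTH-1:0]   memory [0:(1<<ADDR_WIDTH)-1];
always @( posedge clk )
begin
    if ( wr_en )
        memory[wr_adr] <= dat_in;     // write port
    dat_out <= memory[rd_adr];        // registered read port
end
endmodule
```

Whether a given tool maps this onto a built-in RAM block or onto fabric registers depends on the vendor's inference rules, so it is worth checking the synthesis report.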
// Port a
//
always @( posedge clk_a )
begin
    dat_out_a <= memory[address_a];
    if (wr_a)
    begin
        dat_out_a <= dat_in_a;
        memory[address_a] <= dat_in_a;
    end
end
// Port b
//
always @( posedge clk_b )
begin
    dat_out_b <= memory[address_b];
    if (wr_b)
    begin
        dat_out_b <= dat_in_b;
        memory[address_b] <= dat_in_b;
    end
end
endmodule
[Figure: single_port_mem block diagram, a 1Kx16 memory with bidirectional data_io[15:0], address[9:0], wr_en, rd and clk]
Single-port Memory
//////////////////////////////
// Single-port Memory
module single_port_mem
( clk,
  reset,
  data_io,
  address,
  wr_en,
  rd
);
input clk;
input reset;
inout [15:0] data_io; //new I/O type
input [9:0] address;
input wr_en;
input rd;
// Memory
//
always @( posedge clk )
begin
    if (wr_en)
        memory[address] <= data_io;
    dat_out <= memory[address];
    rd_d1 <= rd;
end
endmodule
Single-port Memory
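A bidirectional (“inout”) port is usually driven through a tri-state assignment, so that the pin is released whenever the module is not driving it. The following is a generic sketch of that technique, not the book's listing; the module name “tristate_sketch” and the signals “drive_en”, “dat_out” and “dat_in” are hypothetical.

```verilog
// Hypothetical module: driving and reading a 16-bit bidirectional bus.
module tristate_sketch ( data_io, drive_en, dat_out, dat_in );
inout  [15:0] data_io;
input         drive_en;
input  [15:0] dat_out;
output [15:0] dat_in;
// Drive the bus only when enabled; otherwise release it (high impedance).
assign data_io = drive_en ? dat_out : 16'bz;
// The bus can always be read, whoever is driving it.
assign dat_in  = data_io;
endmodule
```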
along for free. The following diagram shows a Xilinx clock buffer
(BUFG), but each FPGA vendor has its own version. For example,
the predominant Altera clock buffer is called “GCLK”.
[Figure: a BUFG clock buffer inside the FPGA]
Clock Buffer
module clock_buffer
( reset,
  clk_in,
  dat_in,
  dat_out
);
input reset;
input clk_in;
input dat_in;
output dat_out;
wire clk;
reg dat_out;
// register
always @( posedge clk or posedge reset )
begin
    if ( reset )
        dat_out <= 1'b0;
    else
        dat_out <= dat_in;
end
endmodule
Clock Buffer
o frequency multiplication.
We start with the DLLs, the simpler of the two synthesis
functions. The delay in the diagram below is just that, except that it
consists of a series of precise delay elements, from which the
propagating clock signal can be tapped as “clk.” The phase
detector compares the phases (relative edges) at the input and
output of the delay, and can select the delay tap that creates the
desired phase offset. Thus, for example, we could choose a slightly
negative phase offset (i.e., something less than 360 degrees) so that
“clk” is effectively moved back in time. Then it could incur
propagation delay in the FPGA and be back to approximately
where it was coming in as “clk_in.” The clock edges at the FPGA’s
internal registers would be (approximately) synchronized with those
on the circuit board.
[Figure: Simple DLL, clk_in passes through a tapped delay to produce clk; a phase detector compares input and output and selects the tap]
Simple DLL
[Figure: PLL, a phase detector compares clk_in with the feedback clock and steers a VCO with an error signal; the VCO output passes through a BUFG to become clk]
Frequency Synthesis
[Figure: clock generator (DLL) with phase detector, feedback (FB) divider, output divider and BUFGs, producing i_clk and clk from clk_in]
[Timing diagram: clk_in, i_clk, feedback, FB Divider output, Output Divider output, clk]
Note that “clk_in” and “clk” line up. This of course was the
whole point. The output of the FB Divider also occurs coincident
with “clk_in”, and this is automatically a result of configuring a zero
phase offset in the phase detector. Finally, note that “i_clk” is first
in the pack, occurring far “before” the input clock “clk_in”. This is
the magic of phase-locked loops.
DLL versus PLL
jitter
DLL: The digital nature of DLLs results in some amount of small
impulse-type jitter. This is rarely a problem with the internal
digital logic, but can pose a problem for external interfaces that
limit allowable jitter, such as communications links.
PLL: The analog nature of PLLs, on the other hand, results in much
less jitter, and in some cases, a PLL might be inserted prior to a
DLL for the exclusive purpose of reducing input jitter.
o pullup/pulldown/keepers;
o tri-state drive.
output voltage levels
Interface signal levels can range from 3.3 Volts down to less
than a Volt. External input pins define the drive voltage. The
FPGA designer must coordinate with the circuit board design to
make sure the proper voltage is used for a desired standard. Note
that I/O pins are grouped together in banks, where all I/O pins in
the same bank share the same external drive voltage pin. Thus, all
the signals connecting to a bank must share the same signal voltage
I/O Flavors
slew rate
switching thresholds
Input voltage thresholds are for the most part defined via
constraints. Each pin has its own constraint line, where the actual
I/O standard is declared (one that is supported by the vendor).
The constraint format is defined in the FPGA vendor’s
documentation.
Some I/O standards (those that define pseudo-differential input
amplifiers) require a reference voltage provided by an external pin
similar to that which establishes the output drive voltage.
inserted delay
drive strength
tri-state drive
[Figure: sim_sample block diagram with inputs dat_in[7:0], enable, reset and clk, and output comp_cnt[9:0]]
sim_sample
input clk;
input reset;
input [7:0] dat_in;
input enable;
output [9:0] comp_cnt;
[Figure: tb_sim_sample testbench, a vector generator drives dat_in[7:0] and a reset generator drives reset into sim_sample; comp_cnt[9:0] is observed in the simulation tool]
tb_sim_sample
// Simple Testbench Using Embedded, Explicit Vectors
module tb_sim_sample_1
(
// no I/O for the testbench
);
// testbench signals.
integer i;
integer j;
// clock periods
parameter CLK_PERIOD = 10; // 10 ns = 100 MHz.
always #( CLK_PERIOD/2.0 )
    sim_clk = ~sim_clk;
    i = i+1;
    if (i == 20)
        #1 reset <= 1'b0;
end
half that time, and since the clock consists of a high time and a low
time, the total duration of the final clock is indeed 10ns. At the end
of each half-clock wait time, the assignment statement simply
toggles the clock polarity. “~sim_clk” means “not sim_clk”.
Next we generate a reset signal. First, we again establish the
initial state, and here that is a one, because we want the
simulation to start with the reset active. Next we initialize the entity
labeled “i”. We can see from an earlier declaration that this has
been declared as an integer. Integers are very useful in simulation,
but are rarely used in designs (they can’t be synthesized as flip-flops
or wires). In the always statement that then follows, the “i” integer
is incremented, and when it reaches 20 the reset is released.
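The clock and reset generation described above can be sketched as a small stand-alone testbench. The module name “tb_clk_reset_sketch” is hypothetical, and the initial clock value, the use of the rising clock edge to count, and the $finish time are assumptions made so that the example runs on its own.

```verilog
// Hypothetical simulation-only module: free-running clock plus a timed reset.
module tb_clk_reset_sketch;
parameter CLK_PERIOD = 10;   // 10 time units per clock period
reg     sim_clk;
reg     reset;
integer i;
// Clock: establish the initial state, then toggle every half period.
initial sim_clk = 1'b0;      // initial value assumed
always #( CLK_PERIOD/2.0 )
    sim_clk = ~sim_clk;
// Reset: start active and initialize the clock counter.
initial
begin
    reset = 1'b1;
    i     = 0;
end
// Count clocks and release the reset after the 20th one.
always @( posedge sim_clk )  // trigger assumed
begin
    i = i+1;
    if (i == 20)
        #1 reset <= 1'b0;
end
// Stop the run so the sketch terminates by itself.
initial #( 30*CLK_PERIOD ) $finish;
endmodule
```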
//////////////////////////////
// Simple Testbench Using Embedded, Automatic Vectors
module tb_sim_sample_2
(
// no I/O for the testbench
);
// testbench signals.
integer i;
integer j;
// clock periods
parameter CLK_PERIOD = 10; // 10 ns = 100 MHz.
always #( CLK_PERIOD/2.0 )
    sim_clk = ~sim_clk;
begin
    i = i+1;
    if (i == 20)
        #1 reset <= 1'b0;
end
endmodule
generate for the test run. Next, we’ve introduced a new register
signal, “random_num”. As the name suggests, this will hold a
random number. Nothing is different now until we get to the
assignment of the stimulus vectors down in the initial-block. Where
in the first testbench we specifically assigned a list of vector values,
here we have a for-loop. We see that the length of the loop is
defined by our earlier QUANT_VECTORS parameter. Each pass
through the loop, i.e., each subsequent clock period, we assign a
new value to “dat_in”. Sometimes we increment the value, and
sometimes we don’t. Those times that the value is not incremented
will result in our comparison counter in our design incrementing.
But what determines if we increment “dat_in” or not? As
you've astutely guessed, a random number, of course. The
expression “$random(1)” is a verilog system task. It generates a 32-
bit random number. The “1” in the parentheses is the seed, and is
optional, but by including a seed, we ensure that each simulation
run will be the same. Now you see why we declared
“random_num” as a 32-bit signal. We’re only using the LS three
bits, though. Each clock period, there is a 1:8 chance that
“random_num” will have “3’b000” as the LS bits, and the “dat_in”
value will not increment.
In operation, you would monitor both “dat_in” and
“comp_cnt” from within the simulation tool to confirm that the
design is working properly.
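The random stimulus loop just described might look roughly like this. It is a sketch under stated assumptions: the module name “tb_random_sketch” and the value of QUANT_VECTORS are hypothetical, there is no design under test, and the seed is held in an integer variable (here called “seed”) because many simulators expect a variable rather than a constant for that argument.

```verilog
// Hypothetical simulation-only module: mostly-incrementing random stimulus.
module tb_random_sketch;
parameter CLK_PERIOD    = 10;
parameter QUANT_VECTORS = 32;   // assumed number of vectors
reg        sim_clk;
reg [7:0]  dat_in;
reg [31:0] random_num;
integer    seed;
integer    j;
initial sim_clk = 1'b0;
always #( CLK_PERIOD/2.0 )
    sim_clk = ~sim_clk;
initial
begin
    seed   = 1;                 // fixed seed: every run produces the same sequence
    dat_in = 8'h00;
    for (j = 0; j < QUANT_VECTORS; j = j+1)
    begin
        @( posedge sim_clk );
        random_num = $random( seed );
        // 1:8 chance that the three LS bits are zero and dat_in repeats.
        if ( random_num[2:0] != 3'b000 )
            dat_in <= dat_in + 1;
    end
    $finish;
end
endmodule
```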
Although efficient and easily understood, this testbench does
have the weakness that the stimulus vector values are always
incrementing by one (or occasionally static). In some designs, this
could be limiting, foregoing some value transitions that
might be important, such as values stepping from 8’h55 to 8’hAA, to take
one example. Often, it is better to incorporate more randomness
into the vector generation. In the case of our simple design, for
example, we could have simply used the LS eight bits of the 32-bit
random number. Of course, then there would only be a 1:256
chance that we would see repeating values and consequential
“comp_cnt” increments.
In the opposite direction, we might need less randomness and
even more control. One example is a local processor bus, where we
are simulating bus protocol activity, perhaps a microprocessor on
A Taste of Simulation
Expressions
Concatenation {}
{4'h6, 3'b101, 5'h02} = 12'b011010100010
Replication {{}}
{3{2'b10}} = {2'b10, 2'b10, 2'b10} = 6'b101010
Arithmetic +, -, *, /
Modulus %
(4'hA % 4'h3) = 1
(4'hA % 4'h2) = 0
(4'hA % 4'h4) = 2
Logical Negation !
Logical OR ||
Logical Equality ==
Logical Inequality !=
Case Equality ===
(4'b0x01 === 4'b0x01) = 1
(4'b0x01 === 4'b0xx1) = 0
where “x” is “don’t care”
Case Inequality !==
Bitwise Negation ~
~(4'b1001) = 4'b0110
Bitwise OR |
(6'b111111 | 4'b1010) = 6'b111111
Bitwise XOR ^
(4'b1010 ^ 4'b1110) = 4'b0100
Bitwise Equivalence ~^
Reduction AND &
& 4'b1111 = 1
Reduction OR |
| 4'b0000 = 0
| 4'b1010 = 1
Reduction NOR ~|
Reduction XOR ^
^ 4'b1000 = 1
^ 4'b1100 = 0
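A few of these expressions can be confirmed in a simulator with a throwaway module such as the following. The module name “expr_demo_sketch” is hypothetical, and the comparison with “==” on the last line is added for contrast.

```verilog
// Hypothetical simulation-only module evaluating some reference expressions.
module expr_demo_sketch;
initial
begin
    $display( "%b",  {4'h6, 3'b101, 5'h02} );   // 011010100010
    $display( "%b",  {3{2'b10}} );              // 101010
    $display( "%0d", 4'hA % 4'h3 );             // 1
    $display( "%b",  ^4'b1000 );                // 1
    // Case equality compares x bits literally ...
    $display( "%b",  4'b0x01 === 4'b0x01 );     // 1
    // ... whereas logical equality returns x when an x bit is involved.
    $display( "%b",  4'b0x01 ==  4'b0x01 );     // x
end
endmodule
```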
Shortcuts
Shortcuts often end up being the long way round in the end,
but for the record, here are some you might see in your travels.
Declarations don’t have to be one-per-line. You can smash
them together as much as you like.
More Shortcuts
module combine_declarations
Combinatorial Always-block
Note that we used the regular “=” assignment rather than the
non-blocking “<=”. This is because the order of execution doesn’t
matter for combinatorial logic (as opposed to clocked registers,
where it very much does, thus the non-blocking assignment).
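As a minimal sketch of a combinatorial always-block, here is a one-bit mux written with the regular “=” assignment. The module name “comb_always_sketch” and its signal names are hypothetical.

```verilog
// Hypothetical module: combinatorial logic described in an always-block.
module comb_always_sketch ( in_1, in_2, sel, out_1 );
input  in_1, in_2, sel;
output out_1;
reg    out_1;   // declared "reg" because it is assigned in an always-block
// Every input appears in the sensitivity list, and out_1 is assigned on
// every path, so this builds plain gates: no flip-flop and no latch.
always @( in_1 or in_2 or sel )
begin
    if ( sel )
        out_1 = in_1;
    else
        out_1 = in_2;
end
endmodule
```

Note that “reg” here is only a language requirement; the synthesized result contains no register.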
Passing Parameters
module mux
( in1,
  in2,
  sel,
  out
);
parameter SIZE = 8;
input [SIZE - 1:0] in1, in2;
input sel;
output [SIZE - 1:0] out;
assign out = sel ? in1 : in2;
endmodule
You can see that we’ve snuck in a new verilog feature: bus
width fields can be labels. So, “[SIZE - 1:0]” is the same as
“[7:0]”. This is a handy way to quickly configure bus widths, and
you'll see this used often.
Next, we'll instantiate the “mux” module within a higher-level
module and pass the “SIZE” parameter down.
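module top_level
( in1, in2, sel_a, out_a );
parameter SIZE_A = 16;
input [SIZE_A - 1:0] in1, in2;
input sel_a;
output [SIZE_A - 1:0] out_a;
mux
#( .SIZE ( SIZE_A ) )
U1
( .in1 (in1),
  .in2 (in2),
  .sel (sel_a),
  .out (out_a)
);
endmodule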
In this top level module we’ve set the bus widths using a
parameter (SIZE_A) in the same way that we did in “mux”, but
here they are 16 bits instead of 8. We pass this value down into the
“mux” module via the line that starts with “#” between the name
of the module being instantiated (mux) and the instantiation label
(U1). If we had multiple parameters to pass down, each additional
parameter set would be separated by commas. The field following
the “.” is the name of the parameter in the lower-level module
(SIZE), and the field in the parentheses is the value to pass down.
This could be a direct number, or as here, another constant label
(SIZE_A). The parameter value passed down overrides whatever
was set inside the lower-level module, so the parameter “SIZE” in
“mux” will become 16 instead of 8, which is good, since that’s what
we want in order to be compatible with the bus widths of the top
level module.
Since a parameter label can be used to assign the value to be
passed down, you can see that you could pass parameters down
through multiple layers of a hierarchy, using the parameter name
within each intermediate module (which itself is overridden from
above) to assign the value that’s passed down.
But there’s another way to pass parameters down through a
hierarchy—one that some designers dislike, but that you will surely
see (or use) eventually. This is the defparam, and it explicitly
defines both the instantiated name of the lower-level module as well
as the parameter name used in that lower-level module. This is how
it would be used in our top level module.
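A sketch of the defparam form follows, with a copy of the “mux” module included so that the example stands alone. The top-level name “top_level_defparam” is hypothetical, and the port names follow the mux listing as best it can be read.

```verilog
// Lower-level module with a default SIZE of 8.
module mux ( in1, in2, sel, out );
parameter SIZE = 8;
input  [SIZE - 1:0] in1, in2;
input               sel;
output [SIZE - 1:0] out;
assign out = sel ? in1 : in2;
endmodule

// Hypothetical top level using defparam instead of the "#" form.
module top_level_defparam ( in1, in2, sel_a, out_a );
parameter SIZE_A = 16;
input  [SIZE_A - 1:0] in1, in2;
input                 sel_a;
output [SIZE_A - 1:0] out_a;
// Names the instance (U1) and the parameter inside it (SIZE) explicitly.
defparam U1.SIZE = SIZE_A;
mux U1
( .in1 (in1),
  .in2 (in2),
  .sel (sel_a),
  .out (out_a)
);
endmodule
```

The override is detached from the instantiation it affects, which is one reason some designers prefer the “#” form.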
Passing `Defines
`Include files
`include "header_defs.h"
Conditional Compiling
generate
    if (UPCOUNT == 1)
        upcounter u1
        ( in1,
          in2,
          count_out
        );
    else
        downcounter u2
        ( in1,
          in2,
          count_out
        );
endgenerate
generate
    case (OPTION)
        1: upcounterfrz u1 //freezes at max
           ( in1,
             in2,
             count_out
           );
        2: upcounterroll u2 //rolls over
           ( in1,
             in2,
             count_out
           );
        3: downcounterfrz u3 //freezes at zero
           ( in1,
             in2,
             count_out
           );
        4: downcounterroll u4 //rolls under
           ( in1,
             in2,
             count_out
           );
        default:
           staticreg u5
           ( in1,
             in2,
             count_out
           );
    endcase
endgenerate
generate
    genvar i;
    for (i=0; i<=7; i=i+1)
    begin : mem_bank // block label assumed; Verilog-2001 requires a named block here
        memory U ( read,
                   write,
                   dat_in[(i*8)+7:(i*8)],
                   addr,
                   dat_out[(i*8)+7:(i*8)]
                 );
    end
endgenerate
generate
    genvar i;
    for (i=0; i<=SIZE; i=i+1)
    begin : diff_bits // block label assumed
        assign diffs[i] = insig[i] ^ insig[i+1];
    end
endgenerate
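For reference, a complete module built around the last loop might look like the sketch below. The module name “adjacent_diffs_sketch”, the signal name “in_sig”, the XOR operator and the value of SIZE are assumptions; the loop bound is written so that “in_sig[i+1]” never reaches past the top bit.

```verilog
// Hypothetical module: flag every pair of adjacent bits that differ.
module adjacent_diffs_sketch ( in_sig, diffs );
parameter SIZE = 7;               // assumed; in_sig is SIZE+1 bits wide
input  [SIZE:0]   in_sig;
output [SIZE-1:0] diffs;
genvar i;
generate
    for (i=0; i<SIZE; i=i+1)
    begin : diff_bits             // named block for the generated instances
        // One XOR gate per adjacent bit pair.
        assign diffs[i] = in_sig[i] ^ in_sig[i+1];
    end
endgenerate
endmodule
```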
             word_in[4] ? 5'd5 :
             word_in[3] ? 5'd4 :
             word_in[2] ? 5'd3 :
             word_in[1] ? 5'd2 :
             word_in[0] ? 5'd1 :
                          5'd0;
end
endfunction
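Because only the tail of the function survives above, here is a shortened, self-contained sketch of the same idea for an 8-bit word. The module name “ms_bit_sketch”, the function name “find_ms_bit” and the argument name “w” are hypothetical.

```verilog
// Hypothetical module: locate the most-significant one in an 8-bit word.
module ms_bit_sketch ( word_in, ms_loc );
input  [7:0] word_in;
output [3:0] ms_loc;
// Returns the position (1 to 8) of the most-significant one,
// or zero if no ones are found.
function [3:0] find_ms_bit;
    input [7:0] w;
    begin
        find_ms_bit = w[7] ? 4'd8 :
                      w[6] ? 4'd7 :
                      w[5] ? 4'd6 :
                      w[4] ? 4'd5 :
                      w[3] ? 4'd4 :
                      w[2] ? 4'd3 :
                      w[1] ? 4'd2 :
                      w[0] ? 4'd1 :
                             4'd0;
    end
endfunction
// A function is called inside an expression and returns a single value.
assign ms_loc = find_ms_bit( word_in );
endmodule
```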
begin
    // find the location of the most-
    // significant bit. Return zero
    // if no ones found.
    ms_loc = word_in[15] ? 5'd16 :
             word_in[14] ? 5'd15 :
             word_in[13] ? 5'd14 :
             word_in[12] ? 5'd13 :
             word_in[11] ? 5'd12 :
             word_in[10] ? 5'd11 :
             word_in[9]  ? 5'd10 :
             word_in[8]  ? 5'd9 :
             word_in[7]  ? 5'd8 :
             word_in[6]  ? 5'd7 :
             word_in[5]  ? 5'd6 :
             word_in[4]  ? 5'd5 :
             word_in[3]  ? 5'd4 :
             word_in[2]  ? 5'd3 :
             word_in[1]  ? 5'd2 :
             word_in[0]  ? 5'd1 :
                           5'd0;
end
endtask
RTL: we haven’t used this acronym, but you will see it. It
stands for “Register Transfer Level,” and refers to a description of
digital operation (i.e. HDL) that includes registers (and thus, easily
extendable to counters, state machines, etc.). One of the primary
functions of synthesis software is to translate RTL into gate-level
interconnections appropriate for ASIC or FPGA implementation.
“RTL” has become somewhat synonymous generically with HDL
languages (verilog and VHDL).