Digital Design With SystemVerilog
Spring 2014
Synchronous Digital Design
Combinational Logic
Sequential Logic
Summary of Modeling Styles
Example: Bresenham's Line Algorithm
Testbenches
Why HDLs?
1970s: SPICE transistor-level netlists
An XOR built from four NAND gates

    .MODEL P PMOS
    .MODEL N NMOS
    .SUBCKT NAND A B Y Vdd Vss
    M1 Y A Vdd Vdd P
    M2 Y B Vdd Vdd P
    M3 Y A X   Vss N
    M4 X B Vss Vss N
    .ENDS
    X1 A  B  I1 Vdd 0 NAND
    X2 A  I1 I2 Vdd 0 NAND
    X3 B  I1 I3 Vdd 0 NAND
    X4 I2 I3 Y  Vdd 0 NAND

[Figure: transistor-level NAND schematic (inputs A and B, output Y, supplies Vdd and Vss) and the XOR built from NAND instances X1 to X4 with internal nodes I1, I2, and I3]
1980s: Graphical schematic capture programs
1990s: HDLs and Logic Synthesis
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity ALU is
      port(A:   in  unsigned(1 downto 0);
           B:   in  unsigned(1 downto 0);
           Sel: in  unsigned(1 downto 0);
           Res: out unsigned(1 downto 0));
    end ALU;

    architecture behv of ALU is
    begin
      process (A,B,Sel) begin
        case Sel is
          when "00" => Res <= A + B;
          when "01" => Res <= A + (not B) + 1;
          when "10" => Res <= A and B;
          when "11" => Res <= A or B;
          when others => Res <= "XX";
        end case;
      end process;
    end behv;
Verilog: More succinct, really messy
VHDL: Verbose, overly flexible, fairly messy
The part of each language that people actually use is identical
Every synthesis system supports both
SystemVerilog: a newer version of Verilog that supports many more features
Synchronous Digital Design
All flip-flops driven by the same clock
No other clock signals
Every cyclic path contains at least one flip-flop
No combinational loops
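[Figure: combinational logic (CL) computes NEXT STATE from STATE; a register clocked by CLOCK holds the state. Timing diagram of CLK, Q, and D with clock period tc.]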
Hold time constraint: how soon after the clock edge can D start changing? Min. FF delay + min. logic delay
[Figure: timing diagram of CLK, Q, and D around the combinational logic (CL), showing tp(max,FF)]
Setup time constraint: when before the clock edge is D guaranteed stable? Max. FF delay + max. logic delay
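The two constraints can be written as inequalities. This is a sketch in the usual textbook form: only t_c and t_p(max,FF) appear on the slides, while t_setup, t_hold, and the min/max delays of the combinational logic (CL) are notation assumed here.

```latex
\begin{align}
  % Setup: the slowest path must settle one setup time before the next edge
  t_c &\ge t_{p(\max,\mathrm{FF})} + t_{p(\max,\mathrm{CL})} + t_{\mathrm{setup}} \\
  % Hold: the fastest path must not change D until the hold time has passed
  t_{\mathrm{hold}} &\le t_{p(\min,\mathrm{FF})} + t_{p(\min,\mathrm{CL})}
\end{align}
```

The first inequality bounds the clock period from below; the second does not involve the clock period at all, so a hold violation cannot be fixed by slowing the clock.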
Combinational Logic
Full Adder
    // Full adder
    module full_adder(input  logic a, b, c,
                      output logic sum, carry);
      assign sum   = a ^ b ^ c;
      assign carry = a & b | a & c | b & c;
    endmodule

Annotations: full_adder is the module name; a, b, and c are input ports (port names) whose data type, logic, is a single bit. Each continuous assignment (assign) expresses combinational logic; its right-hand side is a logical expression.
[Figure: synthesized schematic of sum and carry, with carry built from gates carry~0 through carry~3]
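Because the adder is purely combinational, it can be checked exhaustively. The sketch below repeats the slide's module so it is self-contained; the testbench name full_adder_tb and the variable expected are hypothetical and not from the slides.

```systemverilog
// Full adder, repeated from the slide
module full_adder(input  logic a, b, c,
                  output logic sum, carry);
  assign sum   = a ^ b ^ c;
  assign carry = a & b | a & c | b & c;
endmodule

// Hypothetical testbench: tries all eight input combinations
module full_adder_tb;
  logic a, b, c, sum, carry;
  logic [1:0] expected;

  full_adder dut(.*);

  initial begin
    for (int i = 0; i < 8; i++) begin
      {a, b, c} = i[2:0];
      expected  = a + b + c;  // evaluated at 2 bits because of the 2-bit target
      #1;                     // let the continuous assignments settle
      assert ({carry, sum} == expected)
        else $error("mismatch for a=%b b=%b c=%b", a, b, c);
    end
    $display("full_adder: all 8 input combinations checked");
    $finish;
  end
endmodule
```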
    module gates(input  logic [3:0] a, b,
                 output logic [3:0] y1, y2, y3, y4, y5);
      /* Five groups of two-input logic gates
         acting on 4-bit busses */        // Multi-line comment
      assign y1 = a & b;     // AND
      assign y2 = a | b;     // OR
      assign y3 = a ^ b;     // XOR
      assign y4 = ~(a & b);  // NAND
      assign y5 = ~(a | b);  // NOR
    endmodule
    module and8(input  logic [7:0] a,
                output logic       y);
      assign y = &a;  // Reduction AND

      // Equivalent to
      // assign y = a[7] & a[6] & a[5] & a[4] &
      //            a[3] & a[2] & a[1] & a[0];

      // Also ~&a   NAND
      //      |a    OR
      //      ~|a   NOR
      //      ^a    XOR
      //      ~^a   XNOR
    endmodule
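A small sketch of the reduction operators on concrete values; the module name reduction_demo and the test values are made up for illustration.

```systemverilog
// Hypothetical demo module: not part of the slides
module reduction_demo;
  logic [7:0] a;

  initial begin
    a = 8'b1111_1111;            // every bit set
    assert ((&a)  == 1'b1);      // AND of all bits
    assert ((~&a) == 1'b0);      // NAND

    a = 8'b0001_0110;            // three bits set
    assert ((&a)  == 1'b0);      // AND: not all bits are 1
    assert ((|a)  == 1'b1);      // OR: at least one bit is 1
    assert ((~|a) == 1'b0);      // NOR
    assert ((^a)  == 1'b1);      // XOR: odd number of 1s
    assert ((~^a) == 1'b0);      // XNOR
    $finish;
  end
endmodule
```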
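    module mux2(input  logic [3:0] d0, d1,
                input  logic       s,
                output logic [3:0] y);
      // Array of two-input muxes
      assign y = s ? d1 : d0;
    endmodule

[Figure: synthesized schematic of four two-input muxes (y~0 to y~3), each selecting one bit of d0[3..0] or d1[3..0] under s to drive y[3..0]]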
[Figure: schematic of a hierarchical design built from mynand2 instances n1, n3, and n4, each with inputs a and b and output y]
A Decimal-to-Seven-Segment Decoder
always_comb: combinational logic in an imperative style

    module dec7seg(input  logic [3:0] a,
                   output logic [6:0] y);
      always_comb
        case (a)
          4'd0: y = 7'b111_1110;    // Blocking assignment: use in always_comb
          4'd1: y = 7'b011_0000;
          4'd2: y = 7'b110_1101;
          4'd3: y = 7'b111_1001;
          4'd4: y = 7'b011_0011;
          4'd5: y = 7'b101_1011;
          4'd6: y = 7'b101_1111;
          4'd7: y = 7'b111_0000;
          4'd8: y = 7'b111_1111;
          4'd9: y = 7'b111_0011;
          default: y = 7'b000_0000; // Mandatory
        endcase
    endmodule
Verilog Numbers
    16'h8_0F

16: number of bits
h: base (b, o, d, or h)
8_0F: value; _'s are ignored; zero-padded
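A few equalities make the literal format concrete. This is a minimal sketch; the module name numbers_demo is hypothetical.

```systemverilog
// Hypothetical demo module: not part of the slides
module numbers_demo;
  initial begin
    // Underscores are ignored and the value is zero-padded to 16 bits
    assert (16'h8_0F == 16'h080F);
    assert (16'h8_0F == 16'b0000_1000_0000_1111);
    assert (16'h8_0F == 16'd2063);     // 8*256 + 15
    assert ($bits(16'h8_0F) == 16);    // the size prefix sets the width

    // The other bases
    assert (4'd9 == 4'b1001);
    assert (3'o7 == 3'b111);

    $display("16'h8_0F = %b", 16'h8_0F);  // prints all 16 bits
    $finish;
  end
endmodule
```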
[Figure: synthesized schematics of imperative combinational logic: an adder (Add0) on a[3..0] and b[3..0] and chains of two-input muxes controlled by s and t driving y[3..0]]
All three expressions computed in parallel. Cascaded muxes implement priority (s over t).
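The code for this slide did not survive in this copy, so the following is only a sketch of the idea under stated assumptions: the module name comb_priority and the three operations are made up. An if / else if chain becomes cascaded muxes, and the first condition tested wins.

```systemverilog
// Hypothetical module: name and operations are assumptions, not the slide's code
module comb_priority(input  logic [3:0] a, b,
                     input  logic       s, t,
                     output logic [3:0] y);
  always_comb
    if (s)      y = a + b;   // s has priority over t
    else if (t) y = a & b;
    else        y = a | b;   // all three results exist in hardware; muxes pick one
endmodule

// Hypothetical testbench
module comb_priority_tb;
  logic [3:0] a, b, y;
  logic       s, t;

  comb_priority dut(.*);

  initial begin
    a = 4'd3; b = 4'd5;
    s = 1; t = 1; #1; assert (y == 4'd8);  // s wins: 3 + 5
    s = 0; t = 1; #1; assert (y == 4'd1);  // 3 & 5
    s = 0; t = 0; #1; assert (y == 4'd7);  // 3 | 5
    $finish;
  end
endmodule
```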
[Figure: synthesized schematics with adders Add0 and Add1 and mux chains driving y[3..0] and z[3..0]; some mux inputs are tied to the constant 1'h0]
An Address Decoder
    module adecode(input  logic [15:0] address,
                   output logic RAM, ROM,
                   output logic VIDEO, IO);
      always_comb begin
        {RAM, ROM, VIDEO, IO} = 4'b 0;          // Vector concatenation; default: all zeros
        if (address[15])                        // Select bit 15
          RAM = 1;
        else if (address[14:13] == 2'b 00)
          VIDEO = 1;
        else if (address[14:12] == 3'b 101)     // Select bits 14, 13, & 12
          IO = 1;
        else if (address[14:13] == 2'b 11)
          ROM = 1;
      end
    endmodule
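Omitting the defaults for RAM, etc. gives the error "construct does not infer purely combinational logic": an output left unassigned on some path would have to hold its previous value, which implies a latch.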
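A short check of the decoder on one address per region. The module is repeated from the slide so the sketch is self-contained; the testbench name adecode_tb and the chosen addresses are hypothetical.

```systemverilog
// Address decoder, repeated from the slide
module adecode(input  logic [15:0] address,
               output logic RAM, ROM,
               output logic VIDEO, IO);
  always_comb begin
    {RAM, ROM, VIDEO, IO} = 4'b0;             // default: all zeros
    if (address[15])                     RAM   = 1;
    else if (address[14:13] == 2'b00)    VIDEO = 1;
    else if (address[14:12] == 3'b101)   IO    = 1;
    else if (address[14:13] == 2'b11)    ROM   = 1;
  end
endmodule

// Hypothetical testbench
module adecode_tb;
  logic [15:0] address;
  logic RAM, ROM, VIDEO, IO;

  adecode dut(.*);

  initial begin
    address = 16'h8000; #1; assert ({RAM, ROM, VIDEO, IO} == 4'b1000); // bit 15 set
    address = 16'h0000; #1; assert ({RAM, ROM, VIDEO, IO} == 4'b0010); // [14:13] = 00
    address = 16'h5000; #1; assert ({RAM, ROM, VIDEO, IO} == 4'b0001); // [14:12] = 101
    address = 16'h6000; #1; assert ({RAM, ROM, VIDEO, IO} == 4'b0100); // [14:13] = 11
    address = 16'h2000; #1; assert ({RAM, ROM, VIDEO, IO} == 4'b0000); // no match: defaults
    $finish;
  end
endmodule
```

The last case is the one the default assignment exists for: no branch matches, yet every output still has a defined value.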
Sequential Logic
A D-Flip-Flop
always_ff introduces sequential logic

    module mydff(input  logic clk,
                 input  logic d,
                 output logic q);
      always_ff @(posedge clk)  // Triggered by the rising edge of clk
        q <= d;                 // Copy d to q
    endmodule

[Figure: synthesized schematic: a single flip-flop q~reg0 with inputs d and clk]
    module count4(input  logic       clk,
                  output logic [3:0] count);
      always_ff @(posedge clk)
        count <= count + 4'd 1;
    endmodule
[Figure: synthesized schematics of counters: an adder (Add0) feeding the count[3..0] register, and a larger counter with a comparator (Equal0, against 4'h9) and mux chains in front of the register]
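[Figure: block diagram in which the current state feeds the output logic, which produces the outputs]
The Moore Form: Outputs are a function of only the current state.
[Figure: block diagram in which the current state and the inputs both feed the output logic, which produces the outputs]
The Mealy Form: Outputs may be a function of both the current state and the inputs.
A mnemonic: Moore machines often have more states.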
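As an illustration of the mnemonic, here is a sketch of a rising-edge detector in both forms. The module names, state names, and the ports in and pulse are all hypothetical; nothing here is taken from the slides.

```systemverilog
// Moore form: pulse depends only on the state, so a third state (RISE)
// is needed to represent "an edge was just seen"
module edge_moore(input  logic clk, reset, in,
                  output logic pulse);
  enum logic [1:0] {LOW, RISE, HIGH} state;

  always_ff @(posedge clk)
    if (reset) state <= LOW;
    else case (state)
      LOW:     if (in)  state <= RISE;
      RISE:    if (in)  state <= HIGH;
               else     state <= LOW;
      HIGH:    if (!in) state <= LOW;
      default:          state <= LOW;
    endcase

  assign pulse = state == RISE;
endmodule

// Mealy form: pulse depends on the state and the input, so two states suffice
module edge_mealy(input  logic clk, reset, in,
                  output logic pulse);
  enum logic {LOW, HIGH} state;

  always_ff @(posedge clk)
    if (reset)   state <= LOW;
    else if (in) state <= HIGH;
    else         state <= LOW;

  assign pulse = (state == LOW) & in;
endmodule
```

The Mealy output reacts in the same cycle the input rises, while the Moore output appears one clock later but changes only on clock edges.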
      // ...
      assign red    = state == R;
      assign yellow = state == Y;
      assign green  = state == G;
    endmodule
    module blocking(input        clk,
                    input  logic a,
                    output logic d);
      logic b, c;
      always_ff @(posedge clk)
        begin
          b = a;  // Blocking assignment:
          c = b;  // effect felt by next statement
          d = c;
        end
    endmodule

[Figure: synthesized schematics: a chain of flip-flops from a to d, and a single flip-flop d~reg0 from a to d; both clocked by clk]
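For contrast, a sketch of the same three statements written with non-blocking assignments. The module name nonblock and the testbench are assumptions, not code from this copy of the slides. Every right-hand side is sampled before any left-hand side is updated, so the result is a three-stage shift register rather than a single flip-flop.

```systemverilog
// Hypothetical counterpart to the "blocking" module
module nonblock(input  logic clk,
                input  logic a,
                output logic d);
  logic b, c;
  always_ff @(posedge clk) begin
    b <= a;  // each assignment sees the values from before the clock edge,
    c <= b;  // so a takes three edges to reach d
    d <= c;
  end
endmodule

// Hypothetical testbench: a is held at 1 and d is watched
module nonblock_tb;
  logic clk = 0;
  logic a, d;

  nonblock dut(.*);

  always #10 clk = ~clk;

  initial begin
    a = 1;
    repeat (2) @(posedge clk);
    #1 assert (d !== 1'b1);  // after two edges the 1 has only reached c
    @(posedge clk);
    #1 assert (d === 1'b1);  // the third edge moves it into d
    $finish;
  end
endmodule
```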
A Contrived Example
    module styles_tlc(input  logic clk, reset,
                      input  logic advance,
                      output logic red, yellow, green);
      enum logic [2:0] {R, Y, G} state;

      always_ff @(posedge clk)              // Imperative sequential
        if (reset) state <= R;              // Non-blocking assignment
        else case (state)                   // Case
          R: if (advance) state <= G;       // If-else
          G: if (advance) state <= Y;
          Y: if (advance) state <= R;
          default:        state <= R;
        endcase

      always_comb begin                     // Imperative combinational
        {red, yellow} = 2'b 0;              // Blocking assignment
        if (state == R) red = 1;            // If-else
        case (state)                        // Case
          Y: yellow = 1;
          default: ;
        endcase
      end

      assign green = state == G;
    endmodule
Example: Bresenham's Line Algorithm
Approach
1. Understand the algorithm
   I went to Wikipedia; doesn't everybody?
2. Code and test the algorithm in software
   I used C and the SDL library for graphics
3. Define the interface for the hardware module
   A communication protocol: consider the whole system
4. Schedule the operations
   Draw a timing diagram! In hardware, you must know in which cycle each thing happens.
5. Code in RTL
   Always envision the hardware you are asking for
6. Test in simulation
   Create a testbench: code that mimics the environment (e.g., generates clocks, inputs).
7. Test on the FPGA
   Simulating correctly is necessary but not sufficient.
My C Code
    void line(Uint16 x0, Uint16 y0, Uint16 x1, Uint16 y1)
    {
      Sint16 dx, dy;   // Width and height of bounding box
      Uint16 x, y;     // Current point
      Sint16 err;      // Loop-carried value
      Sint16 e2;       // Temporary variable
      int right, down; // Boolean

      dx = x1 - x0; right = dx > 0; if (!right) dx = -dx;
      dy = y1 - y0; down  = dy > 0; if (down)   dy = -dy;
      err = dx + dy; x = x0; y = y0;
      for (;;) {
        plot(x, y);
        if (x == x1 && y == y1) break; // Reached the end
        e2 = err << 1;                 // err * 2
        if (e2 > dy) { err += dy; if (right) x++; else x--; }
        if (e2 < dx) { err += dx; if (down)  y++; else y--; }
      }
    }
Module Interface
    module bresenham(input logic clk, reset,
                     input logic start,
                     input logic [10:0] x0, y0, x1, y1,
                     // ...

start indicates (x0, y0) and (x1, y1) are valid
plot indicates (x, y) is a point to plot
done indicates we are ready for the next start
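[Figure: datapath for the algorithm: subtractors and conditional negation produce dx and dy (with the down flag from the sign of y1 - y0); err and e2 feed the comparators e2 > dy and e2 < dx, whose results steer the muxes]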
      output logic [7:0] VGA_R, VGA_G, VGA_B,
      output logic       VGA_CLK, VGA_HS, VGA_VS, VGA_BLANK_n, VGA_SYNC_n);

      parameter HACTIVE      = 11'd 1280,
                HFRONT_PORCH = 11'd 32,
                HSYNC        = 11'd 192,
                HBACK_PORCH  = 11'd 96,
                HTOTAL       = HACTIVE + HFRONT_PORCH + HSYNC + HBACK_PORCH; //1600

      parameter VACTIVE      = 10'd 480,
                VFRONT_PORCH = 10'd 10,
                VSYNC        = 10'd 2,
                VBACK_PORCH  = 10'd 33,
                VTOTAL       = VACTIVE + VFRONT_PORCH + VSYNC + VBACK_PORCH; //525

      assign endOfLine = hcount == HTOTAL - 1;

      // Vertical counter
      logic [9:0] vcount;
      logic       endOfField;

      always_ff @(posedge clk50 or posedge reset)
        if (reset)          vcount <= 0;
        else if (endOfLine)
          if (endOfField)   vcount <= 0;
          else              vcount <= vcount + 10'd 1;

      assign endOfField = vcount == VTOTAL - 1;

      assign VGA_HS = !( (hcount[10:7] == 4'b1010) & (hcount[6] | hcount[5]) );
      assign VGA_VS = !( vcount[9:1] == (VACTIVE + VFRONT_PORCH) / 2);
      output logic [10:0] x0, y0, x1, y1,
      output logic        start, pixel_color);

      // ...

      // Typical state:
      S_TOP:
        if (done) begin
          start <= 1;
          if (x0 < 620)
            x0 <= x0 + 10'd 10;
          else begin
            state <= S_RIGHT;
            x0 <= 639;
            y0 <= 0;
          end
        end
    VGA_framebuffer fb(.clk50(OSC_50_B3B), .reset(~RESET_n), .*);
    bresenham liner(.clk(OSC_50_B3B), .reset(~RESET_n),
                    .plot(pixel_write), .*);
    hallway hall(.clk(OSC_50_B3B), .reset(~RESET_n), .*);

.* connects the other bresenham ports to wires with the same name, e.g., .x(x), .y(y), ...
Testbenches
A model of the environment; exercises the module.
    // Module to test:
    // Three-bit
    // binary counter
    module count3(input  logic       clk, reset,
                  output logic [2:0] count);
      always_ff @(posedge clk)
        if (reset) count <= 3'd 0;
        else       count <= count + 3'd 1;
    endmodule

    module count3_tb;                // No ports
      logic       clk, reset;        // Signals for each DUT port
      logic [2:0] count;

      count3 dut(.*);                // Device Under Test; connect everything

      initial begin                  // Initial block: imperative code runs once
        clk = 0;
        forever                      // Infinite loop
          #20ns clk = ~clk;          // Delay
      end

      initial begin // Reset
        reset = 0;
        repeat (2) @(posedge clk);   // Counted loop
        reset = 1;
        repeat (2) @(posedge clk);
        reset = 0;
      end
    endmodule
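The testbench above only generates stimulus; a person still has to inspect the waveform. A self-checking variant might look like the sketch below. The name count3_check_tb, the ten-cycle check, and the non-blocking release of reset are choices made here, not taken from the slides; the non-blocking assignment keeps the release of reset from racing with the counter sampling it on the same clock edge.

```systemverilog
// Counter under test, repeated from the slide
module count3(input  logic       clk, reset,
              output logic [2:0] count);
  always_ff @(posedge clk)
    if (reset) count <= 3'd0;
    else       count <= count + 3'd1;
endmodule

// Hypothetical self-checking testbench
module count3_check_tb;
  logic       clk, reset;
  logic [2:0] count;

  count3 dut(.*);

  initial begin
    clk = 0;
    forever #20ns clk = ~clk;
  end

  initial begin
    reset = 1;
    repeat (2) @(posedge clk);
    reset <= 0;                    // takes effect after this edge has been sampled

    // After reset, count should advance by one per rising edge, modulo 8
    for (int i = 1; i <= 10; i++) begin
      @(posedge clk);
      #1ns;                        // look just after the edge
      assert (count == i[2:0])
        else $error("cycle %0d: count = %0d", i, count);
    end
    $display("count3: 10 cycles checked");
    $finish;
  end
endmodule
```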