Digital Design With Verilog: Course Notes For First Edition
NOTE:
2014-01-11
Table of Contents
Introduction
Course Description
Prerequisites
Class Organization and Materials
Proprietary Information and Licensing Limitations
Textbook
Textbook Errata (edition of 2008)
Supplementary References
Supplementary Textbooks
Interactive Language Tutorial
Recommended Free Verilog Simulator
VCS Simulator Summary (distributed separately)
DC Synthesizer Lab 1 Summary (distributed separately)
DC Synthesizer Command Reminder (distributed separately)
Course Grading Policy
Getting Ready for Class Attendance
Introduction
This Course Note Book is provided as a supplement to the required Textbook. This
Note Book describes course-related matters, summarizes the day-by-day lectures and
labs, and includes tool-related updates. All EDA tools change rapidly, so any textbook,
including ours, will be somewhat obsolescent as soon as it has been published.
These Notes soon will be incorporated in a future edition of the Textbook.
At present, all references in the Textbook and this Note Book are to IEEE Std 1364-2005
for verilog and not to the later SystemVerilog Std document -- unless otherwise
stated. Except (a) for inclusion of existing C or C++ code, or (b) for complex assertion
composition, SystemVerilog and verilog 2005 differ very little, as explained in detail in
the final lecture below.
The Textbook soon will be available in print as the second edition of
Digital VLSI Design with Verilog. Some changes in the second edition
are:
Many minor typographical errors have been corrected, as have been several
other errors newly discovered in the text and figures.
Major upgrades in the second edition are:
Modified Day 1 presentation making it more useful to verilog beginners
Dozens of new figures
Expansion or clarification of explanations on almost every page
Upgrade of the simulation figures to be in color
New coverage of the features of SystemVerilog and VerilogA/MS
A new summary introduction to each chapter and lab exercise
IEEE Stds references include SystemVerilog as well as verilog
A new, optional lab checklist for recording learning progress
Any corrections or updates of the second edition will be posted on Scribd.
There will be no more updates to Scribd concerning the first edition.
Course Description
This hands-on course presents the design of digital integrated circuits, using the
verilog digital design language as described in IEEE Standard 1364 (2001 and after).
By a balanced mixture of lectures and labs, the students are introduced to language
constructs in a progressively more complex project environment. During the course,
students are familiarized with the use of the Synopsys Design Compiler to synthesize
gate-level netlists from behavioral, RTL, and structural Verilog code. The synthesis
constraints most useful for area and speed optimization are emphasized. Almost all the
class project work is done in the synthesizable subset of the language; logic simulation is
treated as a verification method in preparation for synthesis.
The Synopsys VCS simulator, or optionally [tbd] the Mentor QuestaSim simulator,
will be used in class; the majority of the labs are small enough to be worked with the
demo-limited Silos simulator which comes on the CD-ROM included with the Thomas
and Moorby or Palnitkar supplementary textbooks cited in the required Textbook.
Instruction in simulation tools will be minimal in this course, which is focussed on the
verilog language and on those constructs permitting logic synthesis.
Other course topics include design partitioning, hierarchy decomposition, safe coding
styles, assertions as designer aids, and design for test.
Prerequisites
A bachelor's degree in electrical engineering or the equivalent, with digital design
experience. Familiarity with programming in a modern language such as C.
Required Textbook
The Textbook for this course is Digital VLSI Design with Verilog (see below for
citation). The Textbook contains explanations, homework readings, and all the lab
instructions. The CD which ships with the hardcopy edition of the Textbook contains the
original versions of the answers which are provided in the Labxx working directories.
It is not recommended that you purchase the PDF edition of the Textbook for this
course; you will want a hard copy anyway, to take marginal notes; and, viewing it onscreen will make your lab work much more difficult than if you had the capacity to page
through a book independent of what your computer was doing. If you did get the PDF
edition, you could read it on your own laptop computer during lecture or lab, but you will
not be permitted access to any computer during the (open-book) final exam, so notes in
your computer will be unavailable.
Textbook CD-ROM
Page xx (twenty) of the Textbook briefly suggests how to use the contents of the CD-ROM
which accompanies it. As a suggestion, a good working environment in which to
organize the contents of the CD-ROM would be set up as shown in the following figure:
If you create the DC and VCS directories as suggested, you will have used everything in
the CD-ROM misc directory, so you will not need a misc directory in your working
environment.
The majority of the exercise answer subdirectories which depend upon the
tcbn90ghp_v2001.v simulation library contain a zero-length copy of that .v file. This
was done to reduce the space occupied by the answer CD-ROM. Whenever you
encounter a zero-length file named tcbn90ghp_v2001.v, you should replace it with a
full-length version from your VCS directory (see above) or from the misc data directory
on your CD-ROM.
The _vimrc file in the CD-ROM misc directory is a convenient startup file for the vim
text editor, which is the recommended text editor for verilog. The _vimrc file should be
renamed to .vimrc and copied to your home directory ("~"), if running Linux or Unix. If
you are running Windows, _vimrc should be copied to the startup directory which you
have configured for vim.
Class Attendance
Commitment to attendance at all lectures and labs is advised: This course is not an
easy one; any missed mandatory session should be made up as soon as possible. The
school provides for as many as three makeup lecture sessions during the course; and, a
total of two missed sessions (lecture or lab), not made up, is allowed. Attendance not
meeting these criteria means that a certificate will not be granted.
Textbook
Williams, J. M. Digital VLSI Design with Verilog. Springer, 2008. This is the required
textbook and lab exercise book for the course. ISBN: 978-1-4020-8445-4.
16. p. 68: In the second paragraph, 4th line, the units are wrong, and there is a period
('.') missing. The line should read, "...of about 350 MB/s. A parallel ...".
17. p. 70: Change the paragraph below Figure 4.8 to read, ". . .; the horizontal grids in
Fig. 4.8 are spaced vertically at about 300 mV divisions".
18. p. 71: In the second paragraph, second line, there is an extra "of"; the correct wording
should be, ". . . we can create a netlist, . . .".
19. p. 73: The first begin in the code example is capitalized (Microsoft Word bug); it
should be "begin".
20. p. 75: In the first paragraph in Step 3, the reference should be to figure 4.10, not
"figure 3.10".
21. p. 76: In the last line of the code example, the "2'h00" should be 2'b00.
22. p. 77: In the second sentence following the code example, for consistency with the
code example, the second sentence should begin, "So, "HiBit'b1" would not be a legal
increment expression."
23. p. 85: In the paragraph below the second code example, the first few words should be,
"To declare a memory, ...".
24. p. 86: In the first sentence, it should be, "... more than three ...".
25. p. 86: The last bullet on the page no longer is correct. It should be changed to,
Many simulators such as the Silos demo version can not display a memory
storage waveform; new VCS, QuestaSim, and Aldec can.
26. p. 90: In the last paragraph, the end of the first sentence should be changed to, "... for
p1 and the rest of the 8 bits.".
27. p. 94: Two of the framing-error examples are garbled. There are ten bits per frame,
so the correct framing should appear this way:
40'b_xXxxxxxxxP_xXxxxxxxxP_xXxxxxxxxP_xXxxxxxxxP
The Rx framing error lagging should appear this way:
40'b_XxxxxxxxPx_XxxxxxxxPx_XxxxxxxxPx_XxxxxxxxPx
The Rx framing error leading is correct as printed.
28. p. 95: In the second sentence of Step 2, the module name should not have spaces:
"Mem1kx32".
29. p. 96: The first line on tip should read, ". . . given in the preceding . . .".
30. p. 96: The Fig. 5.4 label's module name should not have spaces: "Fig. 5.4 The
Mem1kx32 RAM schematic".
31. p. 97: In the $display of the code example, the format-specifier quote characters
should be double-quotes, thus: $display("...", $time, . . .);. Of course,
". . ." represents a format specifier.
32. p. 101: Figure 6.1 caption is missing a period ('.') after the word "values".
33. p. 113: In section 7.1.1, 3rd paragraph, there is a period instead of a comma: It
should read, "... assignment of '#'), there is ...".
34. p. 121: In the Symbol/Type list at the top of the page, the last symbol on the left is
reversed: It should be, "<<< shift (arith)".
35. p. 141: In the code example, the commented offset numbers and the 64-bit packet
should line up vertically this way:
//    60         50         40          30         20         10          0
//  32109876 54321098 76543210 98765432 10987654 32109876 54321098 76543210
64'b01100001_00011000_01100010_00010000_01111001_00001000_01111010_00000000;
36. p. 149: The last paragraph before Section 8.1.3 refers to a file "FindPattern.v" in
the Lab11 directory. This file can be found instead in the discussion of Lab11 below.
37. p. 159: The normal one of the five simple states at the top of the page should refer to
a 4-state FIFO, not a "4-register" FIFO. Similarly, the Fig. 8.13 text should describe a
FIFO with five or more storage states.
38. p. 170: The figure caption for Fig. 9.1 should call the clock Clk, not "C1K".
39. p. 173: The figure caption for Fig. 9.2 has two periods after the word Queue (instead
of one).
40. p. 176: In Case 2, step 9, the case, of course, should be Case 2, not "Case 1".
41. p. 179: In the lab Step 1, left code example, the second always block has two '$'
typos: It should be, "always@(a) b <= a;".
42. p. 190: In the first paragraph after Fig. 10.4, the second sentence should be, "With X
= 1, if In goes to 0, q and qn both go to 1; if In then returns to 1, there is a latched
value, but it is indeterminate, because if either q or qn shifts to 0 sooner than the
other, it can clamp or reverse the other to its original 1 value."
43. p. 192: At time 7 in the truth-table, the !D value should be 0, not "1".
44. p. 203: In the next-to-last text paragraph, the third sentence should begin with, "If
one input bit never toggles, then each change of CheckInBus will spawn a new fork-join
block event in memory, ...".
Also, the last text paragraph should be preceded by this new one:
"The always-block example above also suffers from this problem, but to a lesser
extent."
45. p. 205: Delete the words, ", for temporary use, even", in the first sentence of the
second paragraph on this page. See the Supplementary Note on this section for an
explanation and a rewrite of the last paragraph on this page.
46. p. 206: The beginning of the first sentence in the first text paragraph should be
changed to say, "Variables declared in a task, like those in any other named block, are
static and hold their most recent value . . .".
Also, the third sentence in the first paragraph should begin, "This last holds . . .".
Also on this page, the last two sentences of the third text paragraph should read this
way:
"Task local variables are static reg-type variables shared among all calls of a given
task; so, of course, they share the same declaration. Sometimes this sharing may be
desirable, because it allows different running instances of the task to communicate
with each other."
47. p. 224: The first code example contains an incorrect always block. It should be the
same as the others, namely,
always@(Abus) #1 temp <= &Abus;
48. p. 225: In the first line of the Lab Procedure, "generate" should not end with a space
before ".".
49. p. 225: The second sentence in Step 4 should read, "Below the instructions in Step 4
F, is a slightly . . .".
50. p. 226: The Lab 15, Step 4C instruction following Figure 12.9 should be changed: Do
add a 33rd bit for parity; however, leave the 33rd bit unused for now.
51. p. 228: The figure caption to Fig. 12.11 should mention that the waveforms are for a
synthesized netlist simulation, not for a simulation of the verilog source.
52. p. 232: In the middle of the second paragraph, there is an extra "or": Change it to,
"...; or, instead of such a flag, we might provide ...".
53. p. 250: The first sentence of the second paragraph in MOS Switches is
grammatically incorrect, beginning with the semicolon. It should be changed to, . . .
more negative, and an N-channel transistor (nmos) conducts . . ..
54. p. 257: The schematic in Fig. 14.8 shows a fourth trireg connecting the output, but
the waveforms in Figs. 14.8 and 14.10 are for a design in which that trireg has been
omitted. These figures are corrected in the Supplementary Notes below.
55. p. 259: The second parameter example in 15.1.1 contains an unnecessary "-".
56. p. 275: In the first code example, delete the if (FastClock==1'b1) as redundant.
57. p. 286: In the first paragraph, second sentence in 16.2.4, specparam is misspelled.
58. p. 288: The first paragraph after the last code example on this page should read, "In
the first statement above, assume that Clk clocks in D to Q1; ...".
59. p. 289: In Timing Lab 20, Lab Procedure, 1st paragraph, all occurrences of
"SpecIt" should be in Courier (fixed-width) font.
60. p. 291: For clarity, the first sentence in Step 3B should be rephrased to, "Change the
delay in the specify block to . . .". Also, in the second paragraph of Step 3B, delete, ".
. . and, in a change from x, the longest delay is used."
61. p. 298: In 17.1.3.3, in the explanation of $recovery, the middle of the first sentence
should be, "... control, such as a set or clear, to effective ...". Also, the last sentence in
the $setuphold discussion should be, "See the explanation in the section below on
negative time limits."
62. p. 299: Figure 17.1 is incorrect in regard to $removal; it and the explanation of
$removal should be corrected as in the Supplementary Note below. The second
sentence in the $removal explanation, continued onto this page, should say, "... how
long after that edge does an asynchronous control have to remove itself in order for the
clock edge to be ignored?"
In 17.1.3.3, the last sentence on $recrem should be, "See the discussion below on
negative time limits." [also corrects an extra period ('.')]
63. p. 303: The first sentence in 17.1.7 should have the word notifier as shown here (in
italicized proportional font, not italicized fixed-width font).
64. p. 303: In the middle of the paragraph beginning "PATHPULSE", the sentence
mentioning the "type" of a timing path should be changed to, "It changes the inertial
delay behavior of the timing path involved."
65. p. 307: The 3rd paragraph in Step 6 should say, "Add a specify block after the
module header declaration in DFFC, setting . . .."
66. p. 312: In the caption to Fig. 18.1, the final sentence should be changed to, "None of
the PLL, FIFO, and DesDecoder will synthesize correctly, but all will simulate
usably."
67. p. 317: In Fig. 18.7, the ClockIn to the "B" two-bit counter should not be labelled
"(delayed)". The paragraph before this figure should be ignored.
68. p. 323: At the bottom of the page, the third editing instruction should be:
Rename the testbench module to PLLTopTst.v.
69. p. 327: In the paragraph beginning, "Prepare to set . . .", change the last sentence to,
"Assign a default value of 5 (for 2^5 = 32 words)."
70. p. 330: In the second code example, the last line of the port map has an incorrect
underscore. It should be, ", .Empty(FIFOEmpty)".
71. p. 342: In Fig. 19.3, the input port should be named ClkW, not the "CkW" which is
shown in a Supplementary Note for that page.
72. pp. 351-352: For improved readability in both the incrRead and incrWrite tasks,
the first else in each of these should have expressions reversed to "if
(ReadAr==OldReadAr)" and "if (WriteAr==OldWriteAr)".
73. p. 356: In Step 9, first paragraph after the first code example, change the second
sentence to read, "Start by removing the SerValid gate . . ..".
74. p. 358: The 4th bulleted item should end with a period '.'.
75. p. 361 ff: Throughout Chapter 20 (Week 10, Class 2) and Lab 23, "serdes" should be
spelled "SerDes", and "deserializer" should be "Deserializer", when referring
specifically to the class project design.
76. p. 378: In the middle of the third paragraph, the reference to a "cell array" should be
"cell vector", for consistency with verilog terminology.
77. p. 383: In Lab 25, Step 2, the final words in the 1st paragraph should be separated:
"... our larger SerDes design."
78. p. 399: In the last line on this page, there is an extra blank before the comma
following "Deserializer".
79. p. 401: The "Loose Glitch Sinks Chips" pun should be in a Gothic typeface, such as
Windows MariageD:
80. p. 402: In Step 8A, second paragraph, the scan-insertion commands are in
Lab25_Ans/Step01_IntScan/... instead of in "Lab25_Ans/IntScan/...".
81. p. 406: In the last sentence of the first paragraph below Figure 22.1, the reference to
SDF punctuation should be proportional-font parentheses enclosing fixed-width
single-quoted parentheses, thus: " . . . ( '(' and ')' ) . . .". The single-quotes and the inner
parentheses should be in fixed-width font (such as Courier).
82. p. 409: The verilog code example is displayed with the wrong quote characters; it
should read, initial $sdf_annotate("Intro_TopNetlist.sdf");
83. p. 409: In the SDF code example, the wrong quote characters appear: It should be
written, (CELLTYPE "ctype").
84. p. 415: In the bulleted list of omitted topics, "design_vision" has been split into
two words.
85. p. 416: In the system task and function tabulation, $readmemb and $readmemh
should not be boldface; they were mentioned but never exercised.
86. p. 427: In the bulleted SystemVerilog feature list, the `timescale should be
backquoted in fixed-width font, not single-quoted, thus:
timeunit and timeprecision declarations instead of `timescale.
Supplementary References
Note: Active-HDL, Design Compiler, Design Vision, DVE, HSPICE, Liberty, ModelSim,
NanoSim, QuestaSim, Silos, VCS, and Verilog (capitalized) are trademarks of their respective
owners.
Wood, B. "Backplane tutorial: RapidIO, PCIe, and Ethernet". DSP Designline, January
14, 2009 issue. Posted at
http://www.dspdesignline.com/howto/showArticle.jhtml;articleID=212701200 (2009-01-14).
Supplementary Textbooks
The Thomas and Moorby and Palnitkar textbooks are referenced many times in our
Textbook for optional, supplementary exercises and readings. These may be useful to
clear up understanding of specific topics presented during the course. Each of these
books includes a Simucad Silos Verilog-2001 (MS Windows) simulator on CD-ROM,
with Silos simulation projects. This is a capacity-limited demo version of the Silos
simulator, but it will work with many of the smaller lab exercises in this course.
References to the Thomas and Moorby textbook are for the 2002 (4th) edition. Since
then, a new edition has been published (using verilog 2001 headers throughout); the page
and section numbers may differ somewhat from those given in our Textbook.
Quizzes
There will be a quiz every scheduled lecture day, except during the first and last weeks
of the course. The total of 20 quizzes primarily are meant to be instructional.
Generally, quizzes will include material presented in the current or most recent lab or
lecture.
Each quiz will be brief, about 15 minutes, and will count 10 points, for a total of 200
points over the whole course. Quizzes are teaching as well as testing instruments, so
poor performance on quizzes need not mean very much for an enrollee's final status.
No makeup will be allowed for any missed quiz.
Quiz weighting rule: Round the quiz average percent up to the next 10%; then, for
every 10% above 50%, or any quiz missed, add 1% to the baseline 5% weight of the
quizzes.
Examples:
1. An enrollee takes every quiz and averages 95%: 95 --> 100. The quizzes
therefore equal 5% + 5 x 1% = 10% of the total course score. The final exam
contributes 90%.
2. Every quiz is missed: 0 --> 0. The 0% quiz average will be weighted as 5% + 20
x 1% = 25% of the total course score. The final exam contributes 75%.
3. Every quiz is taken and the quiz average is 34%: 34 --> 40. The 34% quiz
average then contributes 5% of the total course score.
4. 1 quiz is missed and the quiz average is 65%: 65 --> 70. Then, the 65% counts
as 5% + 2 x 1% + 1 x 1% = 8% of the total course score.
Final Exam
The final exam will be a two-hour, open-book comprehensive exam scheduled after the
final class lecture day of the course. The exam will be paper-and-pen; no electronic
assistance. Enrollees will be expected to answer content-related questions and to code
verilog. A makeup may be taken for the final, if the scheduled final exam should be
missed for good reason.
Week 1 Class 1
Today's Agenda:
Introductory Lecture
Topics: Course content and organization.
Summary: This is a verilog language course. It is heavily lab-oriented and its
goal is netlist synthesis, more than simulation, as a content-related skill.
Homework reading assignments are of utmost importance, but neither labs nor
homework are graded. Daily quizzes (after the first week) and the final exam
are graded.
Introductory Lab 1
This introduces tool operation of the synthesizer and simulator. It is meant to be
executed by rote; explanations will follow.
Lab Postmortem
After the enrollee has seen a verilog design through the tool flow, the basic
language constructs which were involved are explained.
Reality-Check Self-Exam
This is a quiz to help enrollees understand how well prepared they are for the
course. It is not collected or graded.
Lecture: Verilog vectors and numerical expressions
Topics: Vector notation and numerical constants.
Summary: Verilog vector declarations and assignment statements are presented.
Numerical expressions in hexadecimal, binary and other bases are described.
Simulation delay expressions and verilog timescale specifications are
introduced.
Verilog Operator Lab 2
This exercises basic vector assignments and associated simulation delays. It also
exercises simple boolean operators and shows how a verilog module header
may be written either in ANSI standard format or in the older, verilog-1995,
"K&R C" format.
Lab Postmortem
After lab Q&A, vector extensions and reduction operators are explained.
Advantages of the ANSI module header format are given. Module timescale
and time resolution are discussed.
Lecture: VCD and SDF files
Summary: Simulator VCD files and synthesizer SDF file formats are described
briefly and their purposes explained.
Note: The old first-day lecture has been removed. It is assumed that students in the
future will obtain a copy of the Textbook before the scheduled first day has begun.
Week 1 Class 2
Today's Agenda:
Lecture: More language constructs
Topics: Traditional module header format. Comments, procedural blocks,
integer and logical types, constant expressions, implicit type conversions and
truncations, parameter declarations, and macro (`define) definitions.
Summary: We expand on the previously introduced, basic verilog constructs
allowed within a module. The emphasis is on procedural blocks and vector
value conversions. Verilog parameters are introduced as the preferred way of
propagating constants through the modules of a design; the main alternative,
globally `defined macro compilation, also is presented.
Type Conversion Lab 3
A trivial design illustrates the use of `define to control parameters, and the use
of comments. We exercise constraints to synthesize for area and for speed.
We also do exercises in signed and unsigned vectors, and in vector assignments
of different width.
Lab postmortem
This is mostly lab Q&A; also, the use of vector negative indices is discussed.
Lecture: Control constructs; strings and messages
Topics: The if, for, and case; the conditional expression; event controls; the
distinct contexts for blocking and nonblocking assignments. The forever
block and edge event expressions. Messaging system tasks. Shift registers.
Summary: Procedural control and the basic if, for, and case statements are
introduced. The relationship of if to the conditional ("?:") is described, with
warning about 'x' evaluations. Blocking assignments for unclocked logic;
nonblocking assignments for clocked logic. The forever block is presented but
its use is not encouraged. posedge and negedge for sequential logic are
presented. Assertions based on $display, $strobe, and $monitor are
advocated as good design practice. Finally, a shift register is presented briefly
to prepare for the next lab session.
Nonblocking Control Lab 4
After flip-flop and latch models, verilog for a serial-load and parallel-load shift
register is coded, simulated, and synthesized. A simple procedural model of a
shift-register is given.
Lab postmortem
In addition to lab Q&A, some thought-provoking questions about synthesis and
assertions are asked.
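As an illustration of the nonblocking, clocked style summarized in today's agenda, a
serial-load and parallel-load shift register might be sketched as below. This is only a
sketch and is not the Textbook's Lab 4 answer; the module name, port names, and Width
parameter are assumed for this example.

```verilog
// Hypothetical shift register; all names here are assumed for illustration only.
module ShiftReg #(parameter Width = 8)
  ( output reg[Width-1:0] Q
  , input[Width-1:0] ParIn
  , input SerIn, Load, Clk
  );
//
// Clocked logic: nonblocking assignments only.
always@(posedge Clk)
  if (Load==1'b1)
       Q <= ParIn;                  // Parallel load.
  else Q <= {Q[Width-2:0], SerIn};  // Shift left by one; SerIn enters at the LSB.
endmodule
```

Because every assignment to Q is nonblocking and triggered by the same clock edge, the
right-hand sides are all sampled before any bit of Q is updated.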
Supplementary Note on $display. Add a new paragraph above the first code
example on p. 28: "The $display system task is like printf in C or C++: It formats a
message to be printed to the terminal window of the simulator. It is ignored during
netlist synthesis. We shall discuss details of $display later today."
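A minimal sketch of the printf-like use of $display follows. The module and variable
names are assumed for this example and do not come from the Textbook.

```verilog
`timescale 1ns/1ps
// Hypothetical demonstration module; names are assumed for illustration only.
module DisplayDemo;
reg[3:0] Count;
//
initial
  begin
  Count = 4'd9;
  // Format specifiers work much as in C: %d decimal, %h hex, %b binary.
  #10 $display("time=%0d: Count=%d (decimal), %h (hex), %b (binary)"
              , $time, Count, Count, Count
              );
  $finish;
  end
endmodule
```

Unlike printf, $display appends a newline to each message automatically.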
always@(negedge Clk)
  Q <= D;
Supplementary Note on clock gating and reconvergence. While the simple and gate shown
in Figure 2.2 (Textbook p. 36) illustrates the point, with the logic shown, an
asynchronous ShiftEna could cause a glitch on the gated clock, with unpredictable
results. A realistic clock gate would use a transparent latch which was enabled by the
inactive edge of the clock to be gated, as shown to the right.
Of course, the additional delay of the latch would make the reconvergence delay longer,
possibly requiring more design effort to avoid it.
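The latch-based gate described in this note might be modelled as in the following
sketch. It assumes a clock whose active edge is the rising one, so the latch is
transparent while Clk is low. Only ShiftEna comes from the note; the module name and
the other signal names are assumed for this example.

```verilog
// Hypothetical latch-based clock gate; ClockGate, GatedClk, and EnaLatched are
// assumed names. Assumes the rising edge of Clk is the active edge.
module ClockGate (output GatedClk, input Clk, ShiftEna);
reg EnaLatched;
//
// Transparent latch: follows ShiftEna while Clk is low; holds while Clk is high.
always@(Clk or ShiftEna)
  if (Clk==1'b0) EnaLatched <= ShiftEna;
//
// The enable seen by the and cannot change while Clk is high:
assign GatedClk = Clk & EnaLatched;
endmodule
```

With this arrangement, a change on ShiftEna while Clk is high is held off until Clk
goes low, so it cannot chop a clock pulse which already has begun.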
Week 2 Class 1
Today's Agenda:
Verilog Lecture
Topics: Variable reg and net types; constants; basic simulation and relation to
synthesis; basic system tasks & PLI. Internal scan.
Summary: This time we study a variety of features of the verilog language. We
begin by explaining reg and net variables again, and we distinguish several of
the types of net available in verilog. We mention BNF format for clarifying
syntax. We then return to sequential logic synthesis, using latch and mux
similarities to illustrate how combinational logic should be modelled. We
introduce the basics of clock simulation and point out some simple ways of
avoiding race conditions. We wrap up with a brief presentation of internal and
boundary scan, which increase observability of design state in the hardware.
Simple Scan Lab 5
Using the combinational Intro_Top design of the first lab of the course, we add a
JTAG port, register all inputs and outputs using flip-flops, install muxes, and
link the flip-flop muxes to form a scan chain. Finally, we add an assertion to
be sure the scan mode operates correctly.
Lab Postmortem
We discuss glitches and scan-chain operational refinements.
Supplementary Note on procedural blocks: On p. 45, change the 3rd item in the
bulleted list of block contents to:
procedural control statements (if; for; case; forever)
Supplementary Note on pp. 45-46, BNF: The example should be replaced with a
better one as follows:
". . .. For example, suppose for simplicity's sake that the BNF for a logical operator
was,
logic_op ::= boolean_op | bitwise_op
boolean_op ::= && | || | !
"Then, from this we may deduce that a logical operator always will be a bitwise
operator or a boolean &&, ||, or !. An attempt to use subtraction (-) as a logical
operation then would be an error easily recognized from the BNF given in this example."
Supplementary Note on Textbook Lab 5 Step 7: The new VCS shows the
assertion messages like this:
Week 2 Class 2
Today's Agenda:
Lecture on SerDes
Topics: Serial vs parallel data transfer, PCI Express, PLL functionality.
Serializer-Deserializer and PLL design.
Summary: We introduce our class project, a full-duplex serializer-deserializer
(serdes), by discussing serial bus transfer advantages and the performance
specifications of a PCI Express serdes lane. We then look at the serializer and
deserializer ends of a serdes on a block level. We also see how a PLL works on
a block level. We decide to start our project by designing and implementing
the PLL which will be instantiated at each end of each serial line.
PLL Clock Lab 6
After designing and testing a digital PLL, we sketch out and simulate a
preliminary, generic parallel-serial converter. We finish with a
parallel-to-parallel frame encoder which may be used to prepare data for serialization.
Lab Postmortem
We discuss simulator time resolution and digital formats, comparator features,
and our choice of frame encoding.
Supplementary Note on last part of Lab 6: The format of our class project
SerDes serial packets will be:
64'bxxxxxxxx00011000xxxxxxxx00010000xxxxxxxx00001000xxxxxxxx00000000
The last part of Lab 6 is just to perform a reformat of 32 bits of data to the above
kind of 64-bit packet:
    Data3 Data2 Data1 Data0
becomes
    Data3 Pad3 Data2 Pad2 Data1 Pad1 Data0 Pad0
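A minimal sketch of such a reformat follows, assuming that a purely combinational
concatenation is acceptable. The pad values are taken from the packet format given in
this note; the module names and port names are assumed for this example and are not
those of the Textbook's lab answer.

```verilog
// Hypothetical frame encoder; FrameEnc, Packet, and Data are assumed names.
module FrameEnc (output[63:0] Packet, input[31:0] Data);
// Pad bytes, as given in the packet format of this note:
localparam[7:0] Pad3 = 8'b00011000, Pad2 = 8'b00010000
              , Pad1 = 8'b00001000, Pad0 = 8'b00000000;
//
// Each data byte is followed by its pad byte, most significant byte first:
assign Packet = { Data[31:24], Pad3, Data[23:16], Pad2
                , Data[15:8],  Pad1, Data[7:0],   Pad0
                };
endmodule
//
// Small testbench (also hypothetical) which prints one encoded packet.
module FrameEncTst;
reg[31:0]  Data;
wire[63:0] Packet;
//
FrameEnc U1 (.Packet(Packet), .Data(Data));
//
initial
  begin
  Data = 32'h6162_797A;
  #1 $display("Packet = %b", Packet);
  $finish;
  end
endmodule
```

The data word 32'h6162_797A was chosen because its four bytes are the data bytes of the
64-bit packet shown in Textbook erratum 35 (p. 141).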
Week 3 Class 1
Today's Agenda:
Lecture on memories and verilog arrays
Topics: Memory chip size descriptions, verilog arrays, parity checks and data
integrity; framing for error detection.
Summary: We look into how the storage capacity (in bits) of a memory chip can
be described, and then we show how to model memory storage by verilog
arrays. We then study parity, checksums, and error-correcting codes (ECC) as
ways of tracking data integrity in stored memory contents -- or, in data storage
in general. We then show how parity may be used in 10-bit serial data frames
to extract a serial clock.
Lab on RAM models
We show how to model a simple single-port RAM in verilog and then to
instantiate it in a wrapper to give it bidirectional I/O ports.
Lab Postmortem
We discuss hierarchical references in verilog, the concatenation operation, and
some RAM details.
Supplementary Note on the simple RAM model, p. 87: Parity checking can be
added this way:
module RAM (output[7:0] Obus, output ParityErr
           , input[7:0] Ibus
           , input[3:0] Adr, input Clk, Read
           );
  reg[8:0] Storage[15:0];  // MSB is parity bit.
  reg[7:0] ObusReg;        // Parity not used for data.
  //
  assign #1 Obus = ObusReg;
  assign ParityErr = (Read==1'b1)? (^Storage[Adr]): 1'b0;
  //
  always@(posedge Clk)
    if (Read==1'b0)
         Storage[Adr] <= {^Ibus, Ibus}; // Store parity bit.
    else ObusReg <= Storage[Adr];       // Discard parity bit.
endmodule
Supplementary Note on ECC: Concerning Section 5.1.6, Textbook, p. 90, see the
Plank reference in these Notes for an example of how to generate Reed-Solomon ECC for
multiple-error recovery. While reading this reference, keep in mind that matrix
multiplication by a vector makes each term in the product vector dependent on all terms
in one matrix row and on every term in the multiplier vector. Also, note that the errors
being corrected in the Plank paper always have been localized exactly in the
system (e.g., by identified crash of a specific hard disc) before the ECC is invoked.
Memory Lab 7
Topic and Location: Single-port RAM and a simple way to make memory I/O ports
bidirectional.
Preview: We exercise various kinds of verilog array assignment. Then, we use an
array to design a single-port static RAM 32 bits wide and 32 addresses deep, with parity.
We add a for loop in our testbench to check the addressing and data storage of this
model; we use the simulator memory or $display() to look inside the memory. Next,
we modify the RAM from two unidirectional input and output data busses to a single
bidirectional input-output data bus. We use the concept of a "wrapper" module to
combine the original data busses. Finally, we synthesize the bidirectional, single-port
RAM.
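The "wrapper" idea can be sketched as follows. This is an illustration only, not the lab solution: it assumes the 8-bit RAM module of the Supplementary Note on p. 87 (ports Obus, ParityErr, Ibus, Adr, Clk, Read) rather than the 32-bit lab RAM, and the names RAMWrap, DataBus, ObusWire, and RAM_U1 are hypothetical.

```verilog
// Hypothetical wrapper giving the RAM of the p. 87 note a single inout data bus.
module RAMWrap (inout[7:0] DataBus, output ParityErr
               , input[3:0] Adr, input Clk, Read
               );
  wire[7:0] ObusWire;
  // Drive the shared bus only while reading; release it ('z') otherwise,
  // so that an external driver can supply the write data.
  assign DataBus = (Read==1'b1)? ObusWire : 8'bz;
  //
  RAM RAM_U1 ( .Obus(ObusWire), .ParityErr(ParityErr)
             , .Ibus(DataBus), .Adr(Adr), .Clk(Clk), .Read(Read) );
endmodule
```

A testbench for such a wrapper would have to release its own driver on DataBus (assign 'z') whenever Read is asserted, to avoid contention on the shared bus.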
Deliverables: Step 1: A verilog module in a file with the specified declarations and
assignments; simulated. Step 3: A corner-case-correct single-port RAM simulation
model with parity and an assertion which reports any parity error. Step 4: A testbench
including a for loop with exhaustive testing and display of data storage by the RAM
model. Step 6: A modified, correctly-simulating RAM model with one, bidirectional data
bus. Step 7: Two synthesized verilog netlists of the bidirectional RAM model, one
optimized for area and the other optimized for speed.
Supplementary Note on Section 5.2.2 Additional Study, p. 99: Since this web
site was visited last, Wagner's The Laws of Cryptography with Java Code has been
published in PDF format, with the Hamming code presentation as one chapter. It is
available at http://www.cs.utsa.edu/~wagner/laws or at
http://openpdf.com/ebook/java-cryptography-pdf.html (2010-04-07).
Week 3 Class 2
Today's Agenda:
Lecture on counters and the verilog for them
Topics: Kinds of counter by sequence. Counter structures discussed: ripple,
synchronous, one-hot, ring, gray code.
Summary: After defining what we mean by a "counter", we examine typical
terminology used to describe HDL modelling in general. Then, we examine
several kinds of counter typically used in hardware design: Ripple,
synchronous (carry look-ahead), one-hot, ring, and gray code.
Counter Lab
We write, simulate, and synthesize several different kinds of counter, showing
how the rarely-used wor net may be applied in one of them. We finish by
using our old PLL clock output to drive three of them in the same design.
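Of the counter structures listed in today's topics, the gray code counter may be the least familiar. The following is a minimal sketch, assuming a 4-bit width and an asynchronous reset; the module and signal names (GrayCount4, Bin, Gray) are hypothetical. It is not one of the lab counters.

```verilog
// Hypothetical 4-bit gray code counter: a binary counter with a converted output.
module GrayCount4 (output[3:0] Gray, input Clk, Rst);
  reg[3:0] Bin;
  // Binary-to-gray conversion: each gray bit is the xor of two adjacent
  // binary bits, so that successive counts differ in exactly one bit.
  assign Gray = Bin ^ (Bin >> 1);
  //
  always@(posedge Clk, posedge Rst)
    if (Rst==1'b1) Bin <= 4'h0;
    else           Bin <= Bin + 4'h1;
endmodule
```

The conversion here is combinational, so the Gray output may glitch briefly while Bin changes; a design which passes the count to another clock domain normally would register the converted value.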
2. Preload the counter to bias the wrap-around. This gives a count over any ten
contiguous, ascending values. For example, for the ten highest values,
reg[3:0] Count;
always@(posedge Clk, posedge Rst)
if (Rst == 1'b1 || Count == 4'hf)
Count <= 4'h6;
else Count <= Count + 1;
Counter Lab 8
Topic and Location: We write verilog counter models, simulate them, and
synthesize them to compare performance of the synthesized netlists. We also create a
new design in which our old PLL is used to clock three different counter structures.
Work in the Lab08 directory for this lab.
Preview: We start with verilog for a ripple counter, using flip-flop models from
previous labs. We synthesize a netlist from the ripple-counter model and compare its
simulation speed with a netlist synthesized from a synchronous-counter model. We also
write a behavioral verilog model of a counter and simulate and synthesize it. After this,
we replace or expressions in the synchronous-counter model with a wor net, to see how a
wired or works. Finally, we create a new model by instantiating our old PLL and one
each of the ripple, synchronous, and behavioral counters in a single, top-level module
named ClockedByPLL; we end the lab by simulating this model.
Deliverables: Step 1: A correctly-simulating ripple counter and two synthesized
netlists, one optimized for area and the other for speed. Also, a record of the fastest clock
speed at which this verilog source model will count correctly. Step 2: The same as Step
one, but for a synchronous counter. Also, a dual-netlist simulation, using
speed-optimized synthesized verilog netlists for the ripple-counter of Step 1 and the
synchronous counter of this Step. Step 3: A behavioral verilog counter, and netlists
synthesized for area and speed. A behavioral counter modified to count down and
simulated correctly. Step 4: The synchronous counter model of Step 2 with or gates (or
expressions) replaced by wor nets. Step 5: A correctly simulating model in which a PLL
instance supplies the clock to drive a ripple counter instance, a synchronous counter
instance, and a behavioral counter instance.
Step 2. You will have to use different module names for the DFF's in the
synchronous vs. ripple counters, to get VCS to compile their synthesized netlists together
in the second part of this Step. The reason is that if "DFFC" is used to synthesize both
netlists, each netlist will have a declaration of a "DFFC"; and, the new VCS will reject the
idea of multiple declarations of the same module, even if both are identical.
So, begin this Step by copying your standard DFFC.v to a new name, for example,
DFF.v. In the new copy, change the module name to match that of the file, viz., to DFF.
Then, use DFF, not DFFC, in your synchronous counter. If you do this, after synthesis
the ripple counter netlist will contain a declaration of a DFFC, and the synchronous
counter netlist will contain a declaration of an otherwise identical DFF; this resolves the
name conflict.
Week 4 Class 1
Today's Agenda:
Lecture on strength, contention, and operator precedence
Topics: Verilog drive and charge strength, race conditions, and operator
precedence in expressions.
Summary: Verilog charge strength and drive strength are defined and ranked in
order to show how contention among different logic levels is resolved. Then,
race conditions which lead to contention are discussed. Finally, the
precedence of verilog operators is presented, so that the logic in complicated
expressions may be handled correctly.
Lab on strength and contention
Verilog drive strengths and race conditions are simulated. Some of this lab is
optional; if you have the Silos simulator provided with Thomas and Moorby or
Palnitkar, you should be able to do the optional parts at home.
Lab Postmortem
Simulation of contention.
Lecture on PLL synchronization
Topics: Named blocks, and behavioral and patterned extraction of serial clocks.
Summary: Named blocks are presented to provide means of terminating
execution of a procedural loop. Then, the general problem of transmitting a
clock with serial data is solved, for purposes of the class serdes project, by
specifying 32 bits of inert padding for every 32 bits of data transmitted. An
unsynthesizable behavioral (procedural) PLL is described as an optional
digression but is not used. The lecture leaves for lab the detailed development
of a patterned verilog solution for PLL clock extraction.
Lab on PLL synchronization
This lab exercises some sophisticated procedural code for serial clock extraction.
It then shows how a combinational-logic bit swizzle can be equally effective and
much simpler than a purely procedural solution.
Lab Postmortem
Extraction of a serial clock: How to deal with loss of synchronization?
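As a small illustration of how a named block provides a means of terminating a procedural loop, the following sketch searches a vector for its first set bit and leaves the loop with disable. All names here (FindFirstOne, Search, Vec, Pos) are hypothetical and unrelated to the lab code.

```verilog
// Hypothetical example: leave a for loop early by disabling its named block.
module FindFirstOne;
  reg[7:0] Vec;
  integer  i, Pos;
  //
  initial
    begin
      Vec = 8'b0010_0100;
      Pos = -1;                      // -1 means "no bit set".
      begin : Search                 // The name makes the block disable-able.
        for (i=0; i<8; i=i+1)
          if (Vec[i]==1'b1)
            begin
              Pos = i;
              disable Search;        // Terminates the whole Search block.
            end
      end
      $display("Lowest set bit position = %0d", Pos);
    end
endmodule
```

Execution resumes with the statement following the disabled block, which is why the $display is placed outside Search.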
Supplementary Note concerning Step 2: Netter should have a 4-bit input port
Sel[3:0] and a 1-bit output port named XYout. The simulation testbench should
include a 4-bit counter.
Supplementary Note concerning Steps 1 - 5: VCS will not simulate either the
verilog source or the synthesized netlists correctly for these Steps: For the source, VCS is
not designed to distinguish strengths in contention and simply will produce unknowns
('x'). As for the netlists, DC does not synthesize gates capable of reliable resolution of
contention, and the synthesis libraries don't include such gates, anyway. Contention
almost always is an error in VLSI design, except only contention vs. 'z' -- which VCS and
DC handle correctly. The only reason for running VCS on the verilog source in these
Steps is for a good syntax check.
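For reference, a minimal strength-contention case looks like the sketch below; the module and net names are hypothetical. According to IEEE Std 1364, the stronger driver should prevail, so a simulator which resolves strengths would report a logic 1 at strong strength; a simulator which does not, as described in the note above, would report 'x'.

```verilog
// Hypothetical contention example: two continuous assignments of different
// drive strength onto the same wire.
module StrengthDemo;
  wire W;
  assign (strong1, strong0) W = 1'b1;   // Strong driver, driving 1.
  assign (weak1,   weak0)   W = 1'b0;   // Weak driver, driving 0.
  //
  // %b prints the logic value; %v prints the value with its strength.
  initial #1 $display("W = %b  (strength format: %v)", W, W);
endmodule
```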
Week 4 Class 2
Today's Agenda:
Lecture on FIFO state-machine design
Topics: Tasks, functions, fork-join blocks, state machines, and FIFOs
Summary: After introducing tasks and functions, we describe the procedural
concurrency of the fork-join block. We then discuss the best use of verilog in
state machine design. After that, we describe operation of a FIFO in detail,
focussing on the read and write address controls. We end by describing how to
code a FIFO controller state machine in verilog.
Lab on FIFO design
After a warmup on tasks, a simple FIFO state machine is coded.
Lab Postmortem
FIFO design across clock domains; gray code counters.
Here, OutBusReg is updated 4 ns after clock, and the change in DataBus[3] is not
seen until the next clock:
always@(posedge Clk)
  begin
  DataBus[0] <= #1 1'b0;
  DataBus[1] <= #2 1'b1;
  DataBus[2]  = #3 1'b0;
  DataBus[3] <= #4 1'b1;  // OutBusReg misses this one.
  #1 OutBusReg = DataBus; // Updated after DataBus[2].
  end
With this fork-join, OutBusReg is updated with all changes 5 ns after clock:
always@(posedge Clk)
begin
fork
#1 DataBus[0] <= 1'b0;
#2 DataBus[1] <= 1'b1;
#3 DataBus[2] = 1'b0;
#4 DataBus[3] <= 1'b1;
join
#1 OutBusReg = DataBus; // Updated after DataBus[3].
end
Supplementary Note on the verilog FIFO, p. 160: Add a new last sentence to the
second paragraph after the incrRead code example: "Another reason to encapsulate the
address increments in tasks would be to handle address wrap-arounds for a register file
with a number of words not equal to a power of 2."
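A sketch of such a wrap-around task follows. The task name incrRead is the Textbook's, but its body here is an assumption, as are the Depth parameter, the ReadAdr register, and the small test loop.

```verilog
// Hypothetical wrap-around increment for a register file of 10 words.
module WrapDemo;
  parameter Depth = 10;          // Assumed depth; not a power of 2.
  reg[3:0]  ReadAdr;
  integer   i;
  //
  task incrRead;
    begin
      if (ReadAdr == Depth-1) ReadAdr = 4'h0;      // Wrap at the last word.
      else                    ReadAdr = ReadAdr + 4'h1;
    end
  endtask
  //
  initial
    begin
      ReadAdr = 4'h0;
      for (i=0; i<12; i=i+1)
        begin
          incrRead;
          $display("ReadAdr = %0d", ReadAdr);
        end
    end
endmodule
```

With the wrap logic in one task, a change of depth touches only the parameter, not every place where the address is incremented.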
FIFO Lab 11
Topic and Location: Task coding, and the design of a FIFO in verilog.
Do all work for this lab in the Lab11 directory.
Preview: We start off with a minor modification to add a task to a previously
completed module. Then, we code a useful, generic error-handling assertion, using a
task to make it easily available in any verilog design. After that, we code a simplified
verilog FIFO state machine controller based on today's lecture. Finally, we connect our
previously coded verilog RAM model to our state machine to make a fully-functional (but
imperfect) FIFO.
Deliverables: Step 1: A modified FindPatternBeh module, using a task to collect
repeated lines of code. Step 2: Simulation of a task containing the error handling
assertion specified. Step 3: A correctly simulating FIFO state machine controller (minus
register file), and, if DC permits, two synthesized netlists, one optimized for area and the
other for speed. Step 4: The same as Step 3, but with a register file attached.
// =======================================================
`ifdef DC
`else
module FindPatternTst;
//
reg  StartSearchStim;
wire FoundWatch;
//
initial
  begin
  #0   StartSearchStim = 1'b0;
  #5   StartSearchStim = 1'b1;
  #100 StartSearchStim = 1'b0;
  #5   StartSearchStim = 1'b1;
  #100 $finish;
  end
//
FindPattern FindPatternInst1( .Found(FoundWatch), .StartSearch(StartSearchStim) );
//
endmodule //FindPatternTst.
`endif
Week 5 Class 1
Today's Agenda:
Lecture on rise-fall delays and the verilog event queue
Topics: Regular and scheduled delays, rise vs. fall, event controls, and the verilog
simulation event-scheduling queue.
Summary: We describe regular vs. scheduled (intra-assignment) delay statements
inline in the verilog code, which represent gate-to-gate or module-to-module
delays, only. We then see how to specify different rise and fall delays, and
different delays for high-impedance, in these statements. After that, we look
into how the verilog language requires a simulator to queue up, evaluate, and
schedule assignments to variables. We finish with the two different verilog
event controls, '@' and 'wait'.
Lab
Delay statements and event scheduling.
Lab Postmortem
Q & A on the event queue, lab results, and some fine points on delays.
...
reg x, y, z;
//
always@(x) z = x;
always@(y) z = y;
//
initial
begin
z = 1'bz;
#2 x = 1'b0;
y = 1'b1;
#5 $finish;
end
The way that this is written, y will be updated to 1'b1 at time 2 after x is updated to
1'b0, which will leave z at 1'b1. However, changing the first always block to
always@(x) #0 z = x;
will cause z to be left at 1'b0 at time 2. Why should a zero delay do this?
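To try both variants, the fragment may be wrapped in a module as sketched below. The module name and the USE_ZERO_DELAY macro are hypothetical; the macro would be defined through whatever compile-time macro-definition option the simulator provides. As a hint toward the answer, recall that IEEE Std 1364 places #0 events in the inactive part of the event queue, which is processed only after the active events of the same time step. Note also that the standard does not fix the order in which the two always blocks run, so the result without #0 is typical rather than guaranteed.

```verilog
// Hypothetical harness for the zero-delay puzzle above.
module ZeroDelayRace;
  reg x, y, z;
  //
`ifdef USE_ZERO_DELAY
  always@(x) #0 z = x;    // Variant with the zero delay.
`else
  always@(x) z = x;       // Variant as first written.
`endif
  always@(y) z = y;
  //
  initial
    begin
      z = 1'bz;
      #2 x = 1'b0;
         y = 1'b1;
      #1 $display("time %0t: z = %b", $time, z);  // Inspect z after time 2.
      #4 $finish;
    end
endmodule
```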
B. The third example on p. 172 will not be simulated correctly by any simulator
available to the class. This is because modern designers rarely if ever write delayed
nonblocking assignments; so, modern simulators usually ignore the possibility and treat
delayed nonblocking assignments as delayed blocking assignments.
B. Initialization before simulation time is set to 0. In Figure 9.2, the block pointing to
the "set t = 0" block should say, "Read past [all] event controls fulfilled", not just
level-sensitive ones. Statements not preceded by a procedural delay and controlled by
edge-sensitive event controls will be executed here. Use a #0 delay in an initial block to
ensure initializations which must be present at time 0 (this is the only use of #0 if one is
coding in the recommended style).
The corrected Fig. 9.2:
Scheduling Lab 12
Topic and Location: Event scheduling by the simulator and rise-fall delay
statements.
Do this work in the Lab12 directory.
Preview: We start with a few simple puzzles, asking when different events will be
scheduled to occur. We then look into problems caused by mixing blocking and
nonblocking assignments. We revisit the Week 2 Class 1 latch-synthesis problems
caused by omission of variables from a sensitivity list. Finally, we look at statements to
schedule different delays on rising vs. falling edges.
Deliverables: The first three are observational and require no deliverable in fixed
form: Step 1: Eight answers to the questions posed. Step 2: Three answers. Step 3:
Simulation of the file provided; your comments on the results. Step 4: A synthesized
verilog netlist containing instances of the two modules provided. Step 5: A completed
module which simulates the statements given with the three delay values provided.
Supplementary Note on the answers provided: None was given for Steps 1 & 2;
see below. The other answers are somewhat expanded from those explicitly required in
the Textbook. Note that "rise" and "fall" refer to changes, not to initial assignments.
Step 1 SchedDelayA: First rise a = t 5; b = same. Last change a = t 6; b = same.
SchedDelayB: First rise a = t 0; b = same. Last change a = t 6; b = same.
Step 2 SchedDelayC: A. t = 1; B. Predictable (continuous assigns); C. a = 1'b1.
Supplementary Note on the Step 5 bus delays. According to the IEEE LRM,
1364-2005, vector assignments must be simulated with a single delay, not bitwise.
But, the LRM Section 6.1.3 gives this three-part rule for multivalue delays:
(a) If the RHS makes a transition to 0, the falling delay shall be used.
(b) If the RHS makes a transition to z, then the turnoff delay shall be used.
(c) In all other cases, the rising delay shall be used.
However, this rule is ambiguous, because "makes" can be interpreted bitwise or to
mean the numerical value in the vector. Both Aldec and VCS seem to treat vector delays
consistent with the numerical value interpretation: Thus, they require a numerical value
of 0 for case (a), all bits 'z' for case (b), and any other pattern for case (c). Other
simulators may use the transition on the LSB, only.
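The following sketch may be used to see which interpretation a given simulator applies. The module name, the delay values (3, 5, 7), and the stimulus patterns are all hypothetical. The comments state what the numerical-value interpretation described above would predict; they are not recorded simulator output.

```verilog
// Hypothetical probe of vector delay selection in a continuous assignment.
module VecDelay;
  reg[3:0]  In;
  wire[3:0] Out;
  //
  assign #(3, 5, 7) Out = In;    // (rise, fall, turn-off) delays.
  //
  initial
    begin
      $monitor("time %0t: In = %b  Out = %b", $time, In, Out);
      In = 4'b0000;
      #20 In = 4'b0101;   // Nonzero value: rise delay (3) predicted.
      #20 In = 4'b0100;   // Bit 0 falls, but value nonzero: rise delay predicted.
      #20 In = 4'b0000;   // Numerical value 0: fall delay (5) predicted.
      #20 In = 4'bzzzz;   // All bits 'z': turn-off delay (7) predicted.
      #20 $finish;
    end
endmodule
```

A simulator using only the LSB transition would be expected to differ on the third stimulus, where bit 0 falls while the vector value remains nonzero.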
Week 5 Class 2
Today's Agenda:
Lecture on verilog built-in gates and net types
Topics: Verilog built-in elementary gates, net types, implied wires, and a port
and parameter review.
Summary: A netlist consists of gate instances and wiring. Here, we enumerate
the elementary logic gates which are built into the verilog language, pointing
out their input and output port organization. We then enumerate all verilog
built-in net types useful in VLSI design, saving switch-level constructs for
later. We mention the `default_nettype control, and we then discuss the
rationale for verilog rules for mapping of net and reg types to a port. We point
out that parameter and delay values both may be preceded by '#' but that they
should not be confused.
Lab
We construct several verilog netlists by hand, and we simulate and synthesize
them.
Lab Postmortem
After general Q&A, we discuss synthesis results for our netlist counter vs one
built with behavioral flip-flops.
186, a new sentence should be added after the first one: "Representing a tap into one of
the chip power rails, the strength of a pullup or pulldown gate is source strength."
p. 188: Add a new sentence just before the last one: "Also, the parameter declaration or
value-list always immediately follows the identifying module name."
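The two uses of '#' can be seen side by side in the sketch below, in which all module, instance, and signal names are hypothetical. In an instantiation of a module, '#' introduces a parameter value list; in an instantiation of a built-in gate, or in a continuous assignment, it introduces a delay.

```verilog
// Hypothetical parameterized module containing a delayed assignment.
module Delayer #(parameter Width = 4)
               (output[Width-1:0] Q, input[Width-1:0] D);
  assign #2 Q = D;                 // '#2' is a delay of 2 time units.
endmodule
//
module PoundDemo;
  reg[7:0]  D;
  wire[7:0] Q;
  wire      NandOut;
  // '#(8)' is a parameter value list: It follows the module name.
  Delayer #(8) Delayer_U1 ( .Q(Q), .D(D) );
  // '#3' is a delay: It follows the built-in gate type. Output port is first.
  nand #3 Nand_U1 (NandOut, D[0], D[1]);
  //
  initial
    begin
      D = 8'h03;
      #10 $display("Q = %h  NandOut = %b", Q, NandOut);
    end
endmodule
```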
Netlist Lab 13
Topic and Location: We construct by hand several verilog netlists and simulate and
synthesize them.
Work in the Lab13 directory.
Preview: Using elementary combinational gates, we construct a netlist
representation of a D flip-flop, connect several of our flip-flops in a synchronous counter,
and simulate and synthesize the verilog to library gates. We also construct a similar
counter from behavioral D flip-flops, to compare the synthesis result.
Deliverables: Step 1: A correctly simulating model of a D flip-flop consisting of a
hand-entered netlist of verilog primitive combinational gates. Step 2: A
correctly-simulating synchronous counter created by replacing the behavioral D flip-flops of the
Synch4DFF Lab08 exercise with your flip-flops of this Step 1. Step 3: Four synthesized
netlists: One each, optimized for area and speed, from the old Synch4DFF counter and
from your new counter made from structural D flip-flops.
Week 6 Class 1
Today's Agenda:
Lecture on procedural control and concurrency
Topics: Procedural control constructs, procedural concurrency, and verilog name
space.
Summary: We deepen our understanding of the procedural control constructs
forever, repeat, while, case, and the use of the case equality operator.
We also study casex and casez, but we warn against them. Then, we change
topic somewhat to revisit procedural control of concurrency by fork-join
blocks. We end with a few new observations on the scope of verilog names.
Lab on concurrency
After exercises in common procedural control constructs, we design a watchdog
device which runs concurrently with the CPU it guards.
Lab Postmortem
We review the problem of casex and devise a bubble-diagram for our watchdog-CPU design.
Supplementary Note on the CheckToggle examples of pp. 203 and 206. These
examples are to illustrate task instance concurrency and are not synthesizable; they do
not exemplify good modelling. A simpler and better way to solve the bit-checking
problem concurrently and synthesizably would be,
// Use bit-selects:
always@(InBus[0]) TempToggledReg = 2'h0;
always@(InBus[1]) TempToggledReg = 2'h1;
always@(InBus[2]) TempToggledReg = 2'h2;
always@(InBus[3]) TempToggledReg = 2'h3;
always@(posedge CheckInBus) #1 ToggledReg = TempToggledReg;
Supplementary Note on the example at the top of p. 206. The error message in
the code example comment, as well as part of the last paragraph on p. 205, are wrong: It
is not an error; a variable declared in an always block will retain its value between block
executions.
VCS will simulate the example at the top of p. 206 correctly as an up-counter. It
should be mentioned that ClockCount is local to the named always block and can't be
accessed directly by simple name; it can be accessed from outside the Ticker always
block by hierarchical name, as will be studied in the next Chapter.
The last paragraph on p. 205 should be rewritten thus:
"Notice in the next example that the count will be preserved, even though the counter
storage was declared local to the always block."
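The behavior described in this note can be checked with a sketch such as the following. The names Ticker and ClockCount are those used above, but the surrounding module, the clock generator, and the initialization by hierarchical name are assumptions made for illustration; the Textbook example may differ in detail.

```verilog
// Hypothetical demonstration: a variable local to a named always block
// keeps its value between executions of the block.
module TickerDemo;
  reg Clk;
  //
  initial
    begin
      Clk = 1'b0;
      forever #5 Clk = ~Clk;
    end
  //
  always@(posedge Clk)
    begin : Ticker
      integer ClockCount;              // Local to the Ticker block.
      ClockCount = ClockCount + 1;     // Value persists from clock to clock.
    end
  //
  initial
    begin
      Ticker.ClockCount = 0;           // Access by hierarchical name.
      #52 $display("ClockCount = %0d", Ticker.ClockCount);
      $finish;
    end
endmodule
```

The count is initialized through the hierarchical name because an uninitialized integer starts at 'x', and 'x' plus 1 remains 'x'.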
Concurrency Lab 14
Topic and Location: Procedural control constructs, concurrency in procedural code,
and a watchdog-CPU design.
Do this lab in the Lab14 directory.
Preview: We do a few elementary exercises with forever, repeat, while, and
case. Then, we take on a fairly complex design which requires concurrent execution of a
watchdog device and another device, in this case a CPU.
Deliverables: Step 1: A simple module containing a correctly simulating forever
loop. Step 2: The same design with a repeat loop added. Step 3: The same design
with a while loop added. Step 4: A new, small design of an encoder implemented with a
case statement. Step 5: A correctly simulating design containing a minimal emulation
of a CPU and a concurrently-running watchdog.
Supplementary Note on Step 5, Fig. 11.2, p. 208: The "t" in the various
"t_(command)" actions stands for task. It is suggested that each "t" action be
implemented as a separate task instance; however, the assignment (p. 209, B) only
requires one task to be used.
Week 6 Class 2
Today's Agenda:
Lecture on hierarchical names and generate blocks
Topics: Hierarchical names, arrayed instances, compiler macros vs. conditional
generates, looping generates.
Summary: We lay groundwork by presenting in detail the rationale for
hierarchical names in verilog. We then show how to create arrays of instances
wherever a single instance could be used, assuming the connectivity is fairly
simple. After that, we introduce the generate statement. First, we discuss
the conditional generate, which may be compared with conditional
compilation using verilog macro definitions. Next, we discuss the unique
looping generate, which allows creation of arrays of instances, and of wiring
of any complexity among them. We expand on the looping generate construct
by stepping through the design of a very large decoder.
Lab
After brief exercises in arrayed instances, we use generate to design a RAM,
which we then substitute into our previous FIFO design.
Lab Postmortem
We discuss indices in arrayed instances and generate; then, we compare the
effect of writing a certain generate block three different ways.
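For comparison, the same simple structure is written below first as an arrayed instance and then as a looping generate. The module, instance, and block names are hypothetical; the structure (four xor gates) is chosen only for brevity and is not one of the lab designs.

```verilog
// Hypothetical arrayed-instance version: one statement, four gates.
module XorArray (output[3:0] Y, input[3:0] A, B);
  // Each 4-bit port expression is split bitwise across the four instances.
  xor XorInst[3:0] (Y, A, B);
endmodule
//
// Hypothetical looping-generate version of the same structure.
module XorGen (output[3:0] Y, input[3:0] A, B);
  genvar i;
  generate
    for (i=0; i<4; i=i+1)
      begin : Bit                  // Block name is required in the loop.
        xor XorInst (Y[i], A[i], B[i]);
      end
  endgenerate
endmodule
```

The hierarchical names differ: The arrayed gates are XorInst[0] through XorInst[3], while the generated gates are Bit[0].XorInst through Bit[3].XorInst. The generate form also permits wiring between iterations, such as connecting Y[i] to an input of iteration i+1, which the arrayed form does not.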
Supplementary Notes on the RTL code on p. 220. Two comments on the RTL
decoder model at the top of this page: (a) The block name, Decoder, is not required; and,
(b) the delayed blocking assignment causes the delays to accumulate, so that
AdrEna[1023] is not assigned until 1024 ns after each Address change. Another good
reason not to put delays in procedural code. Also, keep in mind that the delays in the
preceding generate example are structural, not procedural.
A different and efficient way to implement a large decoder procedurally would be,
parameter NumInputs = 10;
parameter NumOutputs = 1<<NumInputs;
input[NumInputs-1:0] Address;
reg[NumOutputs-1:0] AdrEna;
//
always@(Address)
begin : Decode
AdrEna = 'b0;
#1 AdrEna[Address] = 1'b1;
end
This does not match the generate loop structure, but it does not accumulate delays.
Supplementary Note on Figure 12.7, p. 221. The heavier arrow paths represent
enables, not addresses.
Generate Lab 15
Topic and Location: Synthesis comparison of arrayed instances and generated
instances; coding of a RAM using generate; and, use of a generate'd RAM in a FIFO.
Work in the Lab15 directory, and be sure to save the results of Step 1 and Step 2
separately, so they can be compared in Step 3.
Preview: First, we ensure that the alternative construct, arrayed instances, can be
used by doing a small design from the Textbook with them; we save synthesis results for
later comparison. Next, we do a very similar design using generate, again saving
synthesis results. Then, we compare area and speed optimizations for one of the designs
previously synthesized. After these warm-ups, we provide requirements for a new RAM
design which is to be coded using generate. Once this RAM has been simulated
superficially, we include it in a FIFO (from a previous lab) and simulate and synthesize
the result.
Deliverables: Step 1 and 2: A synthesized netlist. Step 3: Two synthesized netlists,
one optimized for area and the other for speed. Step 4: A correctly simulating RAM
design, according to requirements given in the lab instructions. Step 5: A correctly
simulating FIFO which uses the generated RAM of Step 4; also, two, and, optionally,
three, synthesized netlists of the FIFO design.
A better representation would be to show the flip-flops with the enable inputs (see Lab
4; Week 1 Class 2) which would be supplied by the decoders:
Supplementary Note concerning Step 5, pp. 228 - 229. After the simulation, the
rest of this Step is a set of exercises on synthesis constraints.
Also, the last sentence in Step 5, on p. 229, perhaps should repeat the reason why the
FIFO netlist at this point is incorrect: Among other things, the FIFO state machine
includes tasks with event controls; this causes DC to skip entirely any synthesis of the
FIFOStateM module.
One final comment: A single, full synthesis of this FIFO, with or without generate
assistance, very likely would take a considerable time to finish, perhaps more than an
hour on a fast machine. This means that synthesis of this FIFO alone, which generally
would be expected to require multiple reruns to tune the size and gate speeds, might take
a week or more and probably would not be as good as a commercially available IP model.
Week 7 Class 1
Today's Agenda:
Lecture on serial-parallel conversion (deserialization)
Topics: Generic requirements for serial-parallel conversion; uses of functions and
tasks.
Summary: We briefly discuss serial-parallel conversion as a generic problem and
then move on to the specific usefulness of verilog functions and tasks in
simplifying this kind of conversion. Continuation of our serdes class project is
reserved for the lab work.
Lab on serial-parallel conversion
We first do two related, generic deserializer designs; then, we move on to recall
our serdes class project. We implement a first cut on the required Deserializer
module of our project.
Lab Postmortem
We discuss reusability issues of our class serdes deserializer.
Serial-Parallel Lab 16
Topic and Location: A generic serial-parallel converter, a modification sensitive to
bit-patterns in the serial stream, and a first-cut at the Deserializer module of our class
serdes project.
Do this work in the Lab16 directory.
Preview: We first do two related, generic deserializer designs: A generic
serial-parallel converter which ignores content of the incoming serial stream, and then a
modification of it which requires use of bit-patterns in the serial stream to control
functionality. After the generic designs, we move on to coding what will become the first
cut at the Deserializer of our serdes class project.
Deliverables: Step 1: A correctly simulating deserializer which converts an incoming
serial stream to registered (parallel) 32-bit words. Step 2: A modification of the Step 1
design which controls the deserialization process based on serial content. Step 3: A
correctly simulating Deserializer which produces registered 32-bit words from a serial
stream arriving in the packet format required by our serdes project.
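As orientation for Step 1, a content-independent serial-parallel converter can be as small as a shift register and a bit counter. The sketch below is one possible arrangement under stated assumptions, not the lab solution: It assumes MSB-first arrival and an asynchronous reset, and all names (Deser32, SerIn, SerClk, ParOut, ParValid, Shift, BitCount) are hypothetical.

```verilog
// Hypothetical generic deserializer: 32 serial bits in, one parallel word out.
module Deser32 (output reg[31:0] ParOut, output reg ParValid
               , input SerIn, SerClk, Reset
               );
  reg[31:0] Shift;
  reg[4:0]  BitCount;              // Counts 0 .. 31 and wraps naturally.
  //
  always@(posedge SerClk, posedge Reset)
    if (Reset==1'b1)
      begin
        Shift    <= 32'b0;
        BitCount <= 5'd0;
        ParValid <= 1'b0;
        ParOut   <= 32'b0;
      end
    else
      begin
        Shift    <= {Shift[30:0], SerIn};       // MSB arrives first.
        BitCount <= BitCount + 5'd1;
        if (BitCount==5'd31)
          begin                                 // 32nd bit is on SerIn now.
            ParOut   <= {Shift[30:0], SerIn};
            ParValid <= 1'b1;
          end
        else ParValid <= 1'b0;
      end
endmodule
```

This version has no notion of packet boundaries; it simply emits a word every 32 clocks after reset, which is why Step 2 goes on to use bit-patterns in the stream.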
Supplementary Note on function and task, p. 234. In the second code block, the
"ParValidReg" is the same as the "ParValidFlagr" of the second code block on the
previous page. The Unload32 task briefly lowers the ParValid flag each time it
updates the parallel bus.
The same always block calls the task and lowers the ParValid flag on reset; this is
consistent with synthesizer requirements, although any delay on an assignment in the
task will be ignored by the synthesizer.
Week 7 Class 2
Today's Agenda:
Lecture on UDPs, timing triplets, and switch-level models
Topics: User-defined primitives, timing triplets, switch-level primitives and nets,
trireg nets.
Summary: We present the basics of verilog user-defined look-up table primitives
(UDPs) and how to use them for simple component modelling. We then pass to
a review of two- and three-value delay specifications to introduce the second
dimension of delay timing, the min:typ:max triplets which may be used in
place of single delay values anywhere. We then enumerate the switch-level
primitives and explain their drive strength parameters. We end with the
switch-level trireg net, which may be used to model decay transition time of a
capacitor.
Lab
We model a combinational and sequential UDP first; then, we model several
devices at switch level, including a trireg delay line.
Lab Postmortem
Q&A, and two little questions.
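A combinational UDP and the use of min:typ:max triplets can be sketched together as follows. The primitive, module, and signal names are hypothetical, as are the delay values; this is not the UDP of the lab.

```verilog
// Hypothetical combinational UDP: a 2-to-1 mux defined by look-up table.
primitive Mux2UDP (Y, Sel, A, B);
  output Y;
  input  Sel, A, B;
  table
  // Sel  A  B  :  Y
      0   0  ?  :  0;
      0   1  ?  :  1;
      1   ?  0  :  0;
      1   ?  1  :  1;
      x   0  0  :  0;    // Sel unknown, but both inputs agree.
      x   1  1  :  1;
  endtable
endprimitive
//
module UDPDemo;
  reg  Sel, A, B;
  wire Y;
  // Rise and fall delays, each given as a min:typ:max triplet.
  Mux2UDP #(1:2:3, 2:3:4) Mux_U1 (Y, Sel, A, B);
  //
  initial
    begin
      $monitor("time %0t: Sel=%b A=%b B=%b Y=%b", $time, Sel, A, B, Y);
      Sel = 1'b0; A = 1'b0; B = 1'b1;
      #10 A   = 1'b1;
      #10 Sel = 1'b1;
      #10 $finish;
    end
endmodule
```

Which member of each triplet is used is chosen when the simulator is invoked; the typical value usually is the default.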
Supplementary Note on the cmos switch primitive, pp. 250 - 252. Because a
[Figure: a cmos switch]
However, when used in a gate such as an inverter, the elements of a cmos would be
rearranged so that the pass-gate "input" and "output" were connected between supply
and ground, as shown in Textbook Figure 14.3. In this arrangement, the inputs of the
pmos and nmos switches would be tied to the power supply rails. The output of such an
arrangement then always would be at strong strength, because, differing from the
Textbook rules on p. 150, switches always reduce supply strength inputs to strong
output strength.
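The inverter arrangement described in this note can be sketched with one pmos and one nmos switch primitive, each with its data input tied to a supply net. The module, net, and instance names are hypothetical; the %v format is used so that the output strength can be inspected.

```verilog
// Hypothetical switch-level inverter built from pmos and nmos primitives.
module InvSw (output Y, input A);
  supply1 Vdd;
  supply0 Gnd;
  // Switch port order: (output, data input, control input).
  pmos P1 (Y, Vdd, A);     // Conducts when A is 0: pulls Y high.
  nmos N1 (Y, Gnd, A);     // Conducts when A is 1: pulls Y low.
endmodule
//
module InvSwTst;
  reg  A;
  wire Y;
  InvSw Inv_U1 ( .Y(Y), .A(A) );
  //
  initial
    begin
      A = 1'b0;
      #1 $display("A = %b  Y = %b  (strength format: %v)", A, Y, Y);
      A = 1'b1;
      #1 $display("A = %b  Y = %b  (strength format: %v)", A, Y, Y);
    end
endmodule
```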
Supplementary Note on trireg nets. On p. 253, Section 14.1.5, add a new last
sentence to the paragraph beginning, "A delay value may be assigned . . .": "A
generated or arrayed structure including trireg nets thus could be used to model the
refresh behavior of a dynamic RAM."
Also, in the paragraph beginning, "If a trireg net has no delay, . . .", add a new
sentence before the last one: "This resembles procedural behavior and so explains the
peculiar name, trireg, of this construct."
Component Lab 17
Topic and Location: UDPs, switch-level models, trireg models.
Do the work for this lab in the Lab17 directory.
Preview: This lab does not involve any synthesis. We start by modelling a
combinational UDP and then a sequential UDP. We model a switch-level inverter using
n and p MOS switches. We examine a cmos switch. Next, we model a mux using pass
transistor switches. We complete our study of switch-level components by modelling
simple nand and nor gates. We simulate a delay line built on trireg nets.
Deliverables: Steps 1 and 2: Correctly simulating UDP component models. Step 3:
A correctly simulating MOS-switch inverter component. Step 4: The Step 3 model with
a cmos output added; a completed lab truth table for the cmos. Step 5: A mux built
from pass transistor switches and simulated correctly both with tranif and rtranif
primitives. Step 6: A module containing a correctly simulating switch-level model of a
nand and a nor gate; a second version of this module using the Step 3 switch-level
inverter to create and and or gates. Step 7: Two versions of a trireg pulse-delay line,
one with rnmos primitive components, and the other with rtran primitive components.
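For orientation before Steps 1 and 2, here is a sketch of one combinational and one sequential UDP. The primitive names and the choice of a mux and a latch are assumptions made for illustration; they are not necessarily the components used in the Textbook lab.

```verilog
// Combinational UDP: 2-to-1 mux (hypothetical example).
primitive Mux2UDP (Y, Sel, A, B);
  output Y;
  input  Sel, A, B;
  table
  // Sel A  B : Y
     0   0  ? : 0;
     0   1  ? : 1;
     1   ?  0 : 0;
     1   ?  1 : 1;
     x   0  0 : 0;   // both data inputs agree, so Sel does not matter
     x   1  1 : 1;
  endtable
endprimitive

// Sequential UDP: level-sensitive D latch (hypothetical example).
primitive DLatchUDP (Q, Ena, D);
  output Q;
  reg    Q;
  input  Ena, D;
  table
  // Ena D : Q : Q+
     1   0 : ? : 0;
     1   1 : ? : 1;
     0   ? : ? : -;   // latch closed: hold the current state
  endtable
endprimitive

module UdpTst;
  reg  Sel, A, B, Ena;
  wire Y, Q;

  // UDP instances take positional connections only.
  Mux2UDP   M1 (Y, Sel, A, B);
  DLatchUDP L1 (Q, Ena, Y);

  initial begin
    $monitor("%0t Sel=%b A=%b B=%b Ena=%b Y=%b Q=%b", $time, Sel, A, B, Ena, Y, Q);
    Sel = 1'b0; A = 1'b1; B = 1'b0; Ena = 1'b1;
    #5 Ena = 1'b0;   // close the latch
    #5 Sel = 1'b1;   // mux output changes; latch should hold
    #5 $finish;
  end
endmodule
```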
Supplementary Note for revision of Step 7. First of all, this Step should be
optional.
The Textbook Step 7 uses delays which do not clearly present the behavior of this
design. Also, VCS currently (2009) never will simulate triregs correctly when
connected to rtrans; so, the waveforms shown in Textbook Figs. 14.9 and 14.10 should be
ignored.
Instead, use the schematic given in Fig. 14.8 modified by replacing the small trireg
driven by the input buffer with a large trireg. Change the delays as follows, naming
the triregs Tri01 - Tri04, from left to right:
Name     Drive Strength   Delay
Tri01    large            (19, 23, 50)
Tri02    large            (21, 29, 53)
Tri03    medium           (9, 13, 17)
Tri04    small            (5, 7, 11)
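Written as verilog declarations, the four triregs listed above might look like the following sketch. The wrapper module name is hypothetical, and the switches and buffer of Fig. 14.8 are deliberately not reproduced here.

```verilog
// Hypothetical wrapper; only the trireg declarations are shown.
module RCnetTriregs;
  // Delays are (rise, fall, charge decay).
  trireg (large)  #(19, 23, 50) Tri01;
  trireg (large)  #(21, 29, 53) Tri02;
  trireg (medium) #( 9, 13, 17) Tri03;
  trireg (small)  #( 5,  7, 11) Tri04;
endmodule
```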
These delays will make the expected simulation easier to understand; however, VCS
(2009) will not simulate this design correctly. Assigning a delay of 1 to each rnmos, the
generally expected result would be as follows:
Revised Fig. 14.9: VCS simulation of RCnet, with rnmos components (incorrect -- output oscillates).
Revised Fig. 14.10: VCS simulation of RCnet, with rtran components (incorrect).
The QuestaSim simulator (v. 6.5 of 2008) does a better job with switch-level constructs:
Revised Fig. 14.9: QuestaSim simulation of RCnet, with rnmos components (overall view).
Revised Fig. 14.9: QuestaSim simulation of RCnet, with rnmos components (buffer disable closeup).
Revised Fig. 14.9: QuestaSim simulation of RCnet, with rnmos components (buffer reenable closeup).
Revised Fig. 14.10: QuestaSim simulation of RCnet, with rtran components (buffer disable closeup).
Revised Fig. 14.10: QuestaSim simulation of RCnet, with rtran components (buffer reenable closeup).
Week 8 Class 1
Today's Agenda:
This time we'll do some review of parameters and parameter passing. We'll also look
into some problems concerning design hierarchy.
Lecture on parameter types and module connection
Topics: Parameter types, port and parameter mapping, and defparams
Summary: We examine in some detail the typing of parameter declarations, and
ANSI and traditional port and parameter mapping. We mention defparam, a
construct to be avoided, and localparam, a construct devised to prevent use of
defparam.
Connection Lab
We exercise port and parameter declarations, as well as a hierarchical design
comparing `defines with parameters to parameterize bus widths.
Lab Postmortem
We look into possible confusion of parameters with delays in component
instantiation.
Lecture on hierarchical names and design partitions
Topics: Hierarchical names, verilog identifiers, design partitioning, and
synchronization across clock domains.
Summary: We review and extend our coverage of verilog hierarchical names,
generalizing to a discussion of the scope of identifiers in verilog. Then, we
present various criteria which might be used to decide how a design should be
partitioned into modules and module hierarchy. We dedicate some of the
discussion to the problem of data which crosses independent clock domains and
the related problem of how to synchronize the clocking of such data.
Hierarchy Lab
We experiment with hierarchical truncation and extension of mismatched bus
widths. Then, we use a small design to see the benefit of latching module
outputs.
Lab Postmortem
We point out a conflict between rules of thumb (a) to use continuous assigns to
lump module simulation delays and (b) to latch all outputs.
been retained in IEEE Std 1364, but its use has been discouraged in all Std versions
through 1364-2012. As explained in the Textbook, defparam should be avoided: Use
parameter or localparam instead.
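The following sketch, with hypothetical module and signal names, contrasts a named parameter override with the discouraged defparam, and shows a localparam which can not be overridden from outside. It also shows why parameters may be confused with delays: the same #(...) syntax on a primitive instance is a delay.

```verilog
// Hypothetical example modules.
module Counter #(parameter Width = 4)
  (output reg [Width-1:0] Count, input Clk, Reset);
  // Derived constant: a localparam can not be changed by an override or a defparam.
  localparam [Width-1:0] MaxCount = {Width{1'b1}};

  always @(posedge Clk)
    if (Reset || Count == MaxCount) Count <= {Width{1'b0}};
    else                            Count <= Count + 1'b1;
endmodule

module Top (output [7:0] Count8, output AndOut, input Clk, Reset);
  // Preferred: named parameter override in the instantiation.
  Counter #(.Width(8)) U1 (.Count(Count8), .Clk(Clk), .Reset(Reset));

  // Discouraged alternative, shown only as a comment:
  // defparam U1.Width = 8;

  // On a primitive, #(...) is a delay, not a parameter override.
  and #(3) G1 (AndOut, Count8[0], Count8[1]);
endmodule
```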
Connection Lab 18
Topic and Location: Traditional and ANSI port mapping, instantiation of parameter
overrides, hierarchical comparison of `define vs parameter override.
Do this work in the Lab18 directory.
Preview: We start with a simple exercise in reformatting a module header. Then, we
do a few simple but perhaps unusual parameter overrides. After that, we view in VCS
the hierarchy in a skeleton design with configurable bus widths, first using `define and
then using parameters.
Deliverables: Step 1: A compilable rewrite of a traditional-port module. Steps 2 - 4:
Compiling versions of a design with various parameter overrides. Step 5: Compilation of
the given design with macro bus widths in correct order. Answers for the two questions
asked. Step 6: A compiling rewrite of the design of Step 5, but with parameter overrides
in place of `defines.
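As a preview of the comparison in Steps 5 and 6, this sketch (hypothetical names throughout) sets a bus width once with a `define and once with a parameter. The macro gives one global value which depends on compilation order; the parameter can differ per instance.

```verilog
// Global macro: applies to every module compiled after this line.
`define BUS_WIDTH 16

// Width fixed by the macro.
module RegDefine (output reg [`BUS_WIDTH-1:0] Q,
                  input      [`BUS_WIDTH-1:0] D, input Clk);
  always @(posedge Clk) Q <= D;
endmodule

// Width set per instance by a parameter.
module RegParam #(parameter BusWidth = 16)
  (output reg [BusWidth-1:0] Q, input [BusWidth-1:0] D, input Clk);
  always @(posedge Clk) Q <= D;
endmodule

module BusTop (input Clk, input [31:0] D32, input [15:0] D16,
               output [31:0] Q32, output [15:0] Q16a, Q16b);
  RegDefine                 U0 (.Q(Q16a), .D(D16), .Clk(Clk));
  RegParam                  U1 (.Q(Q16b), .D(D16), .Clk(Clk));  // default width
  RegParam #(.BusWidth(32)) U2 (.Q(Q32),  .D(D32), .Clk(Clk));  // overridden width
endmodule
```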
Also, as is visible in the preceding figure, the new VCS gui will display parameters
and their values. In addition, to see the values in the [Data] window pane, it is
possible just to pick [] in the "Value" column.
Hierarchy Lab 19
Topic and Location: Bus routing in a module hierarchy, synchronization across
clock domains.
Do this work in the Lab19 directory.
Preview: We create a skeleton hierarchy of module ports and then see how bus-width
mismatches are handled across hierarchical boundaries. We then write a testbench
which clocks two 3-bit counters at different clock speeds. We examine glitching across
the two clock domains and see how latched outputs ameliorate the glitching problem,
even in a digital simulator which has no capability to represent different voltages.
Deliverables: Step 1: An empty hierarchy acceptable to the simulator compiler.
Steps 2 - 5: Answers to the questions, checked by the simulator. Step 6: A correctly
simulating ClockDomains testbench.
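A rough sketch of the kind of testbench meant in Step 6 follows. All names, clock periods, and the combining logic are assumptions; this is not the Lab19 answer. The Raw net can change after an edge of either clock, whereas the Latched register changes only on ClkB edges.

```verilog
`timescale 1ns/1ps
// Hypothetical sketch of two counters in independent clock domains.
module ClockDomainsSketch;
  reg        ClkA, ClkB, Reset;
  reg  [2:0] CountA, CountB;
  wire [2:0] Raw;
  reg  [2:0] Latched;

  // Two unrelated clock periods (10 ns and 14 ns).
  always #5 ClkA = ~ClkA;
  always #7 ClkB = ~ClkB;

  always @(posedge ClkA, posedge Reset)
    if (Reset) CountA <= 3'b0; else CountA <= CountA + 3'b1;

  always @(posedge ClkB, posedge Reset)
    if (Reset) CountB <= 3'b0; else CountB <= CountB + 3'b1;

  // Combinational mix of both domains: may change after either clock edge.
  assign #1 Raw = CountA ^ CountB;

  // Latched copy: updated only in the ClkB domain.
  always @(posedge ClkB, posedge Reset)
    if (Reset) Latched <= 3'b0; else Latched <= Raw;

  initial begin
    ClkA = 1'b0; ClkB = 1'b0; Reset = 1'b1;
    #12 Reset = 1'b0;
    #500 $finish;
  end
endmodule
```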
To make this example work and get a fatal error from new VCS, before performing
Step 3, instantiate Narrow in Wide, mapping the Narrow output port to OutWide and
the Narrow input port to InWide.
Week 8 Class 2
Today's Agenda:
Lecture on verilog configurations
Topics: Libraries and configs.
Summary: We explain the relation of a library to a design and enumerate the
major keywords for a config. There is no lab exercise or example available
because of lack of vendor implementation.
Lecture on timing arcs and specify delays
Topics: Timing arcs within modules, specify blocks, path and full delays.
Summary: We build on the library concept to cover verilog delays within a
module, especially a module for a library component. We elaborate on the use
of lumped vs distributed delays on a path. We present specify blocks and
their delay statements, including specparams, leaving timing checks for a
later class.
Lab
We study details of specify-block delays, concentrating on resolution of conflicts
among overlapping paths. We write out an SDF file after a synthesis.
Lab Postmortem
We discuss simulation of conflicting delay specifications.
the paragraph beginning, "When creating . . .": "The reason that this is a good practice is
that specify blocks are used in models of small devices; the specparams then would be
named for specified performance parameters on a datasheet for such a device."
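As an illustration of this naming practice, the following sketch uses specparams named in datasheet style. The cell name, the parameter names, and the delay values are invented for the example.

```verilog
`timescale 1ns/1ps
// Hypothetical library-style buffer with datasheet-named specparams.
module Buf1 (output Y, input A);
  buf (Y, A);

  specify
    // Propagation delays, low-to-high and high-to-low, from A to Y.
    specparam tPLH_A_Y = 3, tPHL_A_Y = 4;
    (A => Y) = (tPLH_A_Y, tPHL_A_Y);   // parallel path with (rise, fall) delays
  endspecify
endmodule
```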
C. A new last paragraph on p. 287 should be added: "For paths on which delay
pessimism must be overridden, up to twelve different delays are allowed, counting to- and
from-x as well as to- and from-z. See the next Section."
289: It should be emphasized that an SDF delay, if applied, supersedes and replaces any
of the kinds of delay described here.
Timing Lab 20
Topic and Location: Path delays and conflicts; SDF for a netlist.
Do this work in the Lab20 directory, using the verilog provided.
Preview: We use a small hierarchical design to study assignment of delays to
component internal paths. We especially show how delays on multiple paths, potentially
conflicting, are resolved in simulation. We synthesize a netlist and write out SDF to see
the effect of the specify delays.
Deliverables: Step 1: A testbench counter for the SpecIt design which compiles in
the simulator. Step 2: Correctly simulating distributed and lumped delays in SpecIt.
Step 3: Various correctly simulating delays using a specify block in SpecIt. Step 4:
A correctly simulating full-parallel path conflict, using specparams in a specify block.
Step 5: Simulation and synthesis delays, and SDF delays, for a SpecIt netlist.
Supplementary Note concerning Step 4B. p. 292: The instruction to "Use only
rise and fall delays" just means not to use posedge or negedge expressions.
Week 9 Class 1
Today's Agenda:
Lecture on timing checks and pulse controls
Topics: Relation to assertions, the twelve verilog timing checks, pulse filtering
and delay pessimism.
Summary: This will complete our study of timing issues in verilog, as well as of
the allowed contents of a specify block. We start by discussing the
relationships among timing checks, assertions, and system tasks. After
introducing the time-stamp/time-check rationale, we present the 12 timing
checks and their default arguments. After describing conditioned events and
timing-check notifiers, we explain verilog simulator pulse handling, inertial
delay, the PATHPULSE$ specparam, and pessimism reduction in the specify block.
Lab on timing checks
We exercise the timing checks and pulse-filtering features of specify blocks.
Lab Postmortem
We entertain a Q&A session only, this time.
Supplementary Notes on $setuphold (p. 298) and $recrem (p. 299) advice to
see "below": These references are to Textbook section 17.1.4 on negative time limits.
timing check has been implemented in later releases of VCS (e. g., v. 2008-09).
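As a reminder of the basic timing-check syntax, here is a sketch of a flip-flop model with three checks and a shared notifier. The module name and the limit values are assumptions. A notifier is a reg which the simulator toggles whenever the associated check reports a violation.

```verilog
`timescale 1ns/1ps
// Hypothetical flip-flop model with timing checks.
module DFFchecked (output reg Q, input D, Clk);
  reg Notify;   // toggled by the simulator on any violation below

  always @(posedge Clk) Q <= D;

  specify
    specparam tSetup = 2, tHold = 1, tMinHigh = 3;
    $setup(D, posedge Clk, tSetup, Notify);       // (data event, reference event, limit, notifier)
    $hold (posedge Clk, D, tHold, Notify);        // (reference event, data event, limit, notifier)
    $width(posedge Clk, tMinHigh, 0, Notify);     // (edge, limit, threshold, notifier)
  endspecify
endmodule
```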
(b) the pulse width is between the error limit and the rejection limit: The leading-edge delay is imposed, and the output changes as though the input pulse had
been narrowed to a width equal to the rejection limit; or,
(c) [as-is].
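For reference, the two limits are set in a specify block by a PATHPULSE$ specparam, as in this sketch. The cell name and all values are hypothetical; the first value is the rejection limit and the second is the error limit.

```verilog
`timescale 1ns/1ps
// Hypothetical buffer with path-specific pulse limits.
module PulseBuf (output Y, input A);
  buf (Y, A);

  specify
    specparam tA_Y = 10;
    // PATHPULSE$<input>$<output> = (rejection limit, error limit).
    specparam PATHPULSE$A$Y = (3, 7);
    (A => Y) = tA_Y;
  endspecify
endmodule
```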
p. 304: Figure 17.5: Some readers may find this figure simpler with the time scale
reversed, as follows:
Supplementary Note concerning Step 6. p. 307: In this Step, new VCS will
print a warning for each PATHPULSE error, as well as displaying an 'x' in the waveform.
Also, a $nochange timing check has been added for this Step in the Lab21_Ans
directory, just to show what the message looks like.
Supplementary Note concerning Step 8B. p. 309: To clarify the third sentence,
the reference to "10 ps" is to the clock period, not the half-period delay in the clock
generator.
Week 9 Class 2
Today's Agenda:
Lecture on sequential deserializer of our serdes
Topics: Redesign of PLL for synthesis. Blocks to complete for the Deserializer.
Summary: The Deserializer is called "sequential" because our FIFO doesn't yet have a
dual-port RAM; reads and writes have to be mutually exclusive (sequential, not
concurrent). We review the status of our project and decide to complete the
Deserializer block in the next lab. In the current lab, we prepare by
redesigning the PLL to make it synthesizable and then setting up a new,
permanent directory structure for the FIFO, PLL, and Deserializer components
(the deserializing decoder and serial receiver).
Lab on first-cut Deserializer
Much of the work is in renaming and connecting blocks. However, we spend
some effort in redesigning our PLL so it will operate about the same as before
but will be synthesizable. We connect the PLL and our (single-port RAM)
FIFO and then set up a divide-by-two clock for parallel data. We then connect
the serial input in a testbench and simulate. In addition to redesigning the
PLL, we make many small modifications to everything else. Finally, we
synthesize for an idea of the design size and speed.
Lab Postmortem
We focus on synthesis constraints and don't worry about the design functionality
at this time.
Supplementary Note on the VFO Figs. 18.2 and 18.3, p. 313: Keep in mind that
our VFO both (a) receives an external 1 MHz ParClk and (b) generates its own internal
~32 MHz ClockOut.
Supplementary Note on the library delay cell, p. 314, 3rd paragraph: Rewrite
the first sentence to say, ". . . available as the simple, small, noninverting delay-cell
buffer DEL005 in our synthesis target library . . .".
Supplementary Note on the ClockComparator. p. 316: To clarify the abstract
example's relation to the SerDes class project, a new paragraph should be added after the
code example: "The VarFreq vector, of course, will be the same as ClockCounterN,
controlled by the AdjFreq output of the ClockComparator block."
Fig. 18.14a. Block-level schematic of the synthesizable PLL, with connection details. Some Reset lines omitted.
this step was introduced in Fig. 13.4 of Textbook p. 238 and Fig. 18.1 of Textbook p. 312.
Supplementary Notes concerning Step 9. pp. 331-332: The end of the first
sentence in the 2nd paragraph of p. 331 of this Step might be better phrased as, ". . .
design module; keep the testbench module, which originally was at the bottom of the file."
p. 332: In Step 9, the last bulleted paragraph, after the code example, there is a
reference to "if (SerValid==YES)": This simply should be "if (SerValid==1'b1)".
The "YES" is taken from my answer code for the DesDecoder, in which "localparam
YES = 1'b1;" is defined. Later, in Lab24, Week 10 Class 2, both "YES" and "NO" will be
defined as part of the exercise and located in a new include file, SerDesFormats.inc.
The actual readability value of these words is minimal, but they do show how parameters
can be used to conceal the literal values of numerical constants. The FIFO state
machine state codes and the packet pad values are other, better, examples.
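A small sketch of this use of localparam follows. The module and its Loaded output are hypothetical; only SerValid, YES, and NO come from the text above. In the project, the two definitions eventually would be moved into the include file rather than being declared in each module.

```verilog
// Hypothetical module showing YES and NO as named constants.
module SerValidDemo (output reg Loaded, input SerValid, Clk);
  localparam YES = 1'b1;
  localparam NO  = 1'b0;

  always @(posedge Clk)
    if (SerValid == YES) Loaded <= YES;   // same as: if (SerValid==1'b1)
    else                 Loaded <= NO;
endmodule
```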
p. 332: Add a new last sentence to Step 9: "Further modifications to make the
DesDecoder synthesize correctly will be done later in the course."
Supplementary Note concerning Step 10. The new VCS may produce a first
working source simulation with different output values than those shown in the
Textbook.
Also, the on-disc answers for this Step include some changes in the source verilog
(different from the Textbook CD) which have been made to prevent the new VCS from
setting certain signals to 'x' where the old VCS did not. The main change is to comment
out the FIFO state machine assignment of 1'bz to the ReadCounter when the FIFO is
in its empty state.
Supplementary Note concerning Step 11. pp. 333-335: In the on-disc answers
given, loose-end Step A, p. 333, was done already by Step 10.
The specific data I/O values in the on-disc answers will differ from those in the
Textbook (p. 334) for the new VCS. For example, the first valid word out with new VCS
might well be 32'h0573_8705, as shown in the next figures, corresponding not-quite-correctly to the ParWordIn value 32'h0573_870a at simulation time 61,947 ns.
Zoomed-in view of the New VCS Deserializer output (not quite correct).
Week 10 Class 1
Today's Agenda:
Lecture on concurrent deserializer lab
This brief lecture merely introduces the lab, which includes simulation and
synthesis, the latter primarily as a verification tool.
Lab
We debug our first-draft Deserializer and make many small changes to modify
the memory so it will operate as a dual-port. We change our FIFO state
machine so it can take advantage of the dual-port memory.
Lab Postmortem
We stop for Q&A and discuss a few verilog debugging techniques.
of Fig. 19.1, p. 337, may be compared with that of the sequential Deserializer in the
lower half of the schematic of Fig. 18.1, p. 312. The PLL which is displayed on p. 312 as
PLL_RxU1 in Deserializer.v/*SerialRx_U1 is especially noteworthy.
The clock speed for the FIFOs is 1/2 MHz, as was mentioned in the Textbook at
several previous points and is detailed on p. 133 (Week 4 Class 1). Two clocks at 1 MHz
per FIFO clock means a FIFO clock rate of (1 MHz)/2 = 1/2 MHz.
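The arithmetic above corresponds to a simple divide-by-two clock, which might be sketched as follows. The module and port names are hypothetical and are not taken from the project files.

```verilog
// Hypothetical divide-by-two: ClkOut toggles once per ClkIn period,
// so a 1 MHz ClkIn yields a 1/2 MHz ClkOut.
module DivBy2 (output reg ClkOut, input ClkIn, Reset);
  always @(posedge ClkIn, posedge Reset)
    if (Reset) ClkOut <= 1'b0;
    else       ClkOut <= ~ClkOut;
endmodule
```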
Supplementary Note concerning Step 2. On p. 340, the filename listing, and the
sentence following, mention "default.cfg": New VCS does not use this file; instead, it
saves state in a .tcl session script.
Supplementary Note concerning Step 3. On pp. 340 - 341, this Step mostly is
preliminary discussion. The one important change in this Step is to wire ChipEna
permanently high; the changes which actually will enable simultaneous read and write
are in Step 4.
Supplementary Note concerning Step 8B. On p. 349, making the state clock
generator synthesizable is a problem. Using the receiving-domain clock as the state
clock would work but would risk a variety of machine race conditions on certain sending-domain clock phases.
Using "always@(ClkR, ClkW) StateClockRaw = !StateClockRaw; " would not
synthesize correctly, because neither ClkR nor ClkW is used in the statement. To avoid
danglers, hypothetically one might try using any available edge on either domain, as in,
"always@(posedge ClkR, negedge ClkR, posedge ClkW, negedge ClkW)". However,
event controls using both edges of the same variable are not synthesizable.
The "!(ClkR && ClkW)" solution in the Textbook reduces the clocking problem to one
of glitch filtering. Some experimentation shows that a plain and, "(ClkR && ClkW)",
actually works better than the book nand; so, this is used in the lab answers. However,
any logical operation does imply that for certain very improbable ClkR vs. ClkW phases,
the state clock would stop -- temporarily, assuming that at least one clock could drift in
phase. Watchdog logic could avert this, but why complicate this FIFO, which is a
training exercise, further?
Supplementary Note concerning Step 8C, pp. 350 - 355: New incrRead and
incrWrite blocks are developed on pp. 351 & 352; their purpose is to replace the original
and unsynthesizable @(posedge Clk) statements in the old incrRead and incrWrite
of FIFOStateM.v. The use of the new blocks, minus their deleted fork-join's, is
shown on p. 353.
After completion of the changes in this Step, we have a clocked "combinational" state
machine block, but with blocking, not the usual nonblocking, assignments. This is so
that the statements in this block still are updated in order as they are run. The usual
setup and hold functionalities of nonblocking assignments in a simple clocked model are
not relevant here, because we can be confident that everything will have settled to its
final condition by the time the next posedge of StateClock arrives.
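The idea of a clocked block which uses blocking assignments so that its statements take effect in order can be sketched as follows. This is a hypothetical fragment written for illustration, not the FIFOStateM.v code; only the StateClock name comes from the text.

```verilog
// Hypothetical sketch: ordered updates with blocking assignments in a clocked block.
module OrderedUpdate (output reg [3:0] Count, output reg Full,
                      input StateClock, Reset);
  reg [3:0] Next;

  always @(posedge StateClock, posedge Reset)
    if (Reset) begin
      Count = 4'b0;
      Full  = 1'b0;
    end
    else begin
      Next  = Count + 4'b1;     // blocking: Next is updated before the next statement runs
      Full  = (Next == 4'hF);   // uses the value of Next just computed
      Count = Next;
    end
endmodule
```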
Overall view of the new DC netlist simulation (with SDF timing back-annotation) of Step 8C.
The FIFOTop.sct file for this Step contains suppress_message* statements. These
shorten the screen and .log report warning messages as we increase design and .sct
complexity. Otherwise, eventually, the warnings would consume all the .sct runtime
screen display, making error messages disappear.
The Decode4 task, rewritten as a negedge always block, may be done this way:
always@(negedge SerClock, posedge Reset)
  begin : Decode4
  if (Reset==YES)
    begin
    DecodeReg = 'b0;
    doParSync = NO;
    SyncOK    = NO;
    UnLoad    = NO;
    end
  else begin : PacketFind // Look for packet alignment:
    UnLoad    = NO;
    doParSync = NO;
    if ( FrameSR[7:0]==PAD0 )
      begin : FoundPAD0
      SyncOK = YES;
      if ( FrameSR[23:16]==PAD1 && FrameSR[39:32]==PAD2
                                && FrameSR[55:48]==PAD3 )
        begin // All pads indicate all frames aligned:
        DecodeReg = { FrameSR[63:56], FrameSR[47:40]
                    , FrameSR[31:24], FrameSR[15:8] };
        UnLoad = YES;
        end
      else // Found a PAD0, but rest failed; so, synchronize:
        begin
        doParSync = YES;
        SyncOK    = NO;
        end
      end // FoundPAD0.
    end // PacketFind.
  end
As mentioned previously, for variety, in the above always block we write YES (=1'b1)
or NO (=1'b0).
The procedural complexity of this decoder requires blocking assignments. By
generating the parallel clock on a posedge of the serial clock and using all DesDecoder
results only on the negedge, race conditions easily are avoided.
Supplementary Note concerning Step 9, pp. 355 - 357. This Step is important;
and, except for the "fancy" localparam expression on p. 357, it should not be considered
optional.
Because each Deserializer main subblock (FIFO, PLL, and DesDecoder) now
synthesizes correctly, a mixed RTL-netlist simulation of the entire design is possible by
putting together the synthesized subblock netlists instead of trying to synthesize a single
netlist from the whole verilog design. For a mixed simulation, you may leave
Deserializer.v and SerialRx.v in source form and create a new
DeserializerSubNets.vcs file containing this:
DeserializerTst.v
Deserializer.v
./DesDecoder/DesDecoderNetlist.v
./FIFO/FIFOTopNetlistSDF.v
./PLL/PLLTopNetlist.v
./SerialRx/SerialRx.v
-v tcbn90ghp_v2001.v
The answers for this Step include synthesis scripts to synthesize the subblocks. With
current (2010) versions of DC, the SDF timing only is required for FIFO; the other
modules will simulate without back-annotation.
In the .sct file mentioned above, you will notice a considerably increased number of
detailed assignment statements, along with an (expected) increased number of
suppress_message* statements. This growth is normal: As the design becomes more
complex, correct synthesis requires more complex and more detailed constraints.
Week 10 Class 2
Today's Agenda:
Lecture on the Serializer and SerDes
Topics: Serializer submodules, assembly and checkout of the SerDes.
Summary: The brief lecture is used to outline the functionality of the
SerEncoder and the SerialTx, and to point out the reorganization of the
project files.
Lab
The Serializer model is completed, and it is attached to the Lab23
Deserializer to form a complete serdes. We simulate and synthesize the
result.
Lab Postmortem
We look into possible incompatibilities of our Ser and Des. We also question
where assertions and timing checks should be installed.
SerDes Lab 24
Topic and Location: Design and simulation of a Serializer; assembly, simulation,
and synthesis of a serdes.
Do this work in the Lab24 directory.
Preview: We reorganize our files from previous labs so that the PLL, FIFO,
Deserializer, and new Serializer are in the same working directory. Then, we
design the Serializer to share packet pad definitions and otherwise to be compatible
with our Deserializer. We connect the Deserializer and Serializer together
with a single serial wire, and we demonstrate correctness by simulation.
Deliverables: Step 1: Copied previous Lab23 answer files. Step 2: Reorganized
design files as shown by successful load into the simulator. Step 3: A correctly
configured and assembled Serializer, to the extent of successful load into the
simulator. Step 4: Completed connections for the SerialTx. Step 5: Completed
SerEncoder. Step 6: A correctly simulating Serializer. Step 7: A correctly
assembled, completed serdes project. Step 8: Synthesized, simulated unit netlists for
the completed serdes' PLL, FIFO, DesDecoder, and SerEncoder.
In new VCS, using the on-disc answers, the deserialized packet with data
32'h462d_f78c appeared on the parallel-data input bus at time 28,598 ns:
Closeup of the Step 7 source simulation, showing the output just described.
The *SDF.v netlist files are the ones with $sdf_annotate(), which you used during
the optional SDF-based simulation in Step 8. The other files are unsynthesized verilog
source -- they could be synthesized hierarchically to netlists and then edited for
individual use, but this would be somewhat time-consuming and unnecessary, in view of
the fact that they are simple enough to assume that synthesis easily would have created
netlists which would simulate successfully.
During simulation of the design subnetted as above, the new VCS command window
will report the SDF annotation as follows:
The two time markers are set at 567,572.27 ns, when parallel data word
32'h22d500c5 was clocked out of the Ser_U1 FIFO, and at 745,427.73 ns, when that
word was clocked onto the Des_U1 ParOut bus.
The next figure is a closeup of the time around which the 32'h22d500c5 word arrived
on the Des output:
There remain a few netlist glitches, but the SerDes clearly is operating more or less
correctly.
Week 11 Class 1
Today's Agenda:
Lecture on DFT
Topics: Basics of DFT, observability, coverage, boundary scan, internal scan, and
BIST.
Summary: After introducing DFT as a methodology, we relate it to assertions
and then explain the terms, observability and coverage. We then introduce
boundary scan, internal scan, and BIST as the major DFT techniques.
Lab 25 on DFT
We recall an optional Lab05 insertion of internal scan by the synthesizer and also
show how to attach I/O pads in a verilog design for automated synthesizer
insertion of boundary scan. We put a memory in a wrapper and attach a BIST
module to it.
Lab Postmortem
We discuss the effect of the BIST on memory die size.
Lecture on full-duplex SerDes
Topics: Implementation parameters for connecting two of our SerDes into a full-duplex lane.
Summary: We give special FIFO depth requirements for our full-duplex system,
and we discuss addition of DFT functions.
Lab 26 on a DFT, full-duplex SerDes
After putting together a working, full-duplex serdes, we add assertions, timing
checks, and automated internal scan.
Lab Postmortem
We discuss the effect of the scan elements on SerDes size.
Supplementary Note concerning boundary scan and internal scan, pp. 379 - 382.
These were introduced briefly in Week 2 Class 1, in Textbook Section 3.1.8, and in Lab05
pages 50 - 59.
the hard error rate expected. Then, BIST is run during manufacture to identify defective
bits. Knowing the location of the defects, if any, the tester programs a within-chip or
within-board register with the spares to be substituted. Then, either the spare row(s)
and column(s) are substituted by permanent "burned-in" logic, or on-chip logic is
programmed dynamically by shifting in a substitution configuration (as in programming
an FPGA)."
"Defects are rare enough so that almost all memories can be repaired by substitution of
just one or two spare rows or columns. The next figure illustrates the statistics for this:"
The graph (a) is for a typical memory with 0.002 defects/mm^2; graph (b) is for a defect rate 1/10 as great.
The upper curves show the improvement in BIST pass rate gained by having one spare row or column
available vs. none; the lower curves show the improvement by having two spare rows or columns vs. just
one. Clearly, there is little expected benefit in providing for more than two spare rows or columns.
The statistic is Yield/Area. After figure 7 of the Mentor Memory Repair Primer of November 2008.
Reproduced with permission of Mentor Graphics, Inc.
Also, in the next-to-last paragraph in Section 21.1.8, the last sentence should say, ". . .
soft errors caused by the Solar wind and cosmic rays . . .".
should have been symlinked in the VCS directory to tpdn90g18tc_3PAD.v. The CD-ROM (misc directory) version, tpdn90g18tc_3PAD_NonTSMCProprietary.v, also
should work; it may be copied to the VCS directory (and may be renamed to
tpdn90g18tc_3PAD.v).
Step 2C. The detailed instructions on p. 384 are to provide some experience with
inserting and connecting pads in a design; they may be skipped if such experience is
unnecessary. At the end of this Step, you may simulate the design if you wish. It
should function as the familiar Intro_Top of previous labs; however, the undriven
ScanOut pad output will remain unknown.
Step 2D. On p. 385, the TAP controller can be implemented by means of DC
commands too time-consuming for this course. The final netlist can be simulated
without TAP; however, the scan-mode scanned output data will not be meaningful.
VCS can be used to view the netlist schematic: Select "Topper01 (Intro_Top)" in
the control window hierarchy view, and then pick the "and gate" icon just above. This
may take a minute, but the result will be better-arranged than the one from
design_vision. Tool details are in your DC_Synth_Lab01_Summary.pdf handout.
The remainder of this lab exercise includes some detailed file rewriting. If the
user is not planning to work with BIST development and has sufficient experience with
verilog, Steps 3 - 8 may be skipped to conclude with the Step 9 synthesis exercises. In
any case, it would be instructive to read the Lab25_Ans answers provided for all the
steps of this lab.
Step 6. On p. 389, ClkRw and "Resetw" are named "*w" because they have been
declared wires.
Step 9. Using the latest 2011 version of DC, synthesis of the memory with BIST and
with timing constraints but without set_fix_hold, takes about 15 minutes on a 1 GHz
machine; the resulting netlist simulates correctly with the approximate delays given in
tcbn90ghp_v2001.v.
Step 3: On p. 395, in the middle of the second paragraph after the second code
example, the cautionary sentence is meant to be read as, "Do not rename the FIFO
parameter; just change the name of the value assigned to it in SerDes.v. For example,
in SerDes.v, 'Serializer #( .DWid(DWid), .AWid(AWid) ) ...' will become,
'Serializer #( .DWid(DWid), .AWid(TxLogDepth) ) ...'"
Step 6: Here is the VCS console display showing a $width timing violation at time
21,935 ns:
The Step 6 DoParSync pulse is too narrow at time 21,935, causing a $width violation.
Even zoomed in, the violating pulse (obscured by the measurement cursors) does not
draw attention in the waveforms:
Step 7: The FullDup source verilog simulation should be about the same as shown
Step 8A: Rather than following the Step 8A Textbook instructions on p. 402, which
recommend copying the Lab25 .sct file, you probably should copy the
FullDupScan.sct file from the Lab26_Ans, Step 8 directory. This is because of minor
changes in DC since the book was published.
In these .sct files, notice the large number of DC warning messages which have been
suppressed; each one was studied individually and deemed unimportant.
Step 8B: To clarify these instructions on p. 402, the netlist comparison should be
between the results of Lab26 Step 7 and Lab26 Step 8. The purpose is to get an idea of
the netlist size difference caused by scan insertion.
Week 11 Class 2
Today's Agenda:
Lecture on SDF back-annotation
Topics: Back-annotation basics; SDF file use with a verilog design.
Summary: We briefly explain the use of back-annotation and describe some of the
characteristics of an SDF file. We show how SDF fits into the design flow.
We then give the syntax for the use of SDF with a verilog module.
SDF Lab
We synthesize the Intro_Top design and then simulate the netlist with back-annotated timing.
Lab Postmortem
Q&A, and how to back-annotate part of a design.
(rest of day may be used for retrospection on recent, previous labs)
Supplementary Note on SDF file functionality. Two new last paragraphs should
be added on p. 407, after the final summary:
"Our presentation here is meant only to show how SDF relates to verilog; we cover only
a minimum of what an SDF file can do. In particular, the SDF language can include
conditional and multivalue delays, PATHPULSE overrides, and timing checks."
"SDF back-annotation may be used for netlist-based tasks such as simulation or static
timing verification; it can not be used for synthesis."
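For simulation, back-annotation usually is requested with the $sdf_annotate system task, as in the following sketch. The cell, the testbench, and the SDF file name are all hypothetical; the named file must exist and must contain a CELL entry matching the annotated instance, or the simulator will report warnings and keep the original delays.

```verilog
`timescale 1ns/1ps
// Hypothetical cell with placeholder specify delays.
module BufCell (output Y, input A);
  buf (Y, A);
  specify
    (A => Y) = (1, 1);   // replaced by the SDF values if annotation succeeds
  endspecify
endmodule

// Hypothetical testbench which annotates instance U1 at time 0.
module SdfSketchTst;
  reg  A;
  wire Y;

  BufCell U1 (.Y(Y), .A(A));

  initial begin
    // Arguments: SDF file name (assumed), then the instance to annotate.
    $sdf_annotate("BufCell.sdf", U1);
    $monitor("%0t A=%b Y=%b", $time, A, Y);
    A = 1'b0;
    #20 A = 1'b1;
    #20 $finish;
  end
endmodule
```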
SDF Lab 27
Topic and Location: Simulation and synthesis of Intro_Top; simulation of SDF
back-annotated netlists.
Do this work in the Lab27 directory.
Preview: We simulate the old Intro_Top design and compare timing of the original
verilog with that of the synthesized, back-annotated netlist. We then edit the SDF file
manually to see how it controls the netlist simulation timing.
Deliverables: Step 1: A correctly simulating Intro_Top design in the orig
directory. Step 2: A correctly simulating synthesized netlist of Intro_Top in the orig
directory. Step 3: A correctly simulating netlist of Intro_Top, with written-out SDF
back-annotated timing, in the ba1 ("back-annotated #1") directory. Step 4: A correctly
simulating netlist of Intro_Top in the ba2 directory, with manually modified SDF back-annotated timing.
Supplementary Note concerning all these Steps: In new VCS, to display two
different simulation waveforms at once, begin by simulating one of them as usual. After
arranging the wave window nicely, save the session. Then, use the [x] icon in the upper
left-hand corner of the console window to dismiss the console window; this will leave the
wave window displayed and active. Bring up a second shell window, cd to the same
directory, and run VCS again for the second simulation. Load the saved session to create
a new wave window with the same file-lists and geometry as the first one.
Also: Old VCS used config files (.cfg), as in the Textbook, to save window setups and
wave-window signals; new VCS uses session files (.tcl), which are Tcl scripts.
Change the TestBench.v timescale to 1ns/1ps or 1ns/10ps before simulating; this will
make the source and netlist simulations more precisely comparable.
Week 12 Class 1
Today's Agenda:
Lecture wrapping up coverage of the verilog language
Summary: We describe briefly the differences between verilog-1995 and verilog-2001 (or verilog-2005). We review the synthesizable subset of the language
and list the constructs not specifically covered in the course. We explain the
relationship of the verilog PLI to the language. We mention a few wiring
tricks which may be found in legacy code. We also summarize what will be the
scope of the final exam.
Continued Lab Work (Lab 23 or later)
A name used in a port map implies a wire and need not be declared:
ALU U1 (.Result(Res), .Ain(A), .Bin(B), .Clk(GatedClock));
...
bufif1 ClockGater (GatedClock, Clock, Ena);
GatedClock is implied and need not be declared.
Not recommended: (a) widths of implied wires always are 1; and, (b) there may be lost
time and confusion searching for the declarations of the implied names.
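A minimal sketch of drawback (a), using a hypothetical 4-bit module Adder4 which is not part of the course labs: the undeclared name Implied becomes a 1-bit wire, so only the least significant bit of the 4-bit Sum port is connected to it. Tools typically issue a port-width warning here, but the code still compiles.

```systemverilog
// Hypothetical 4-bit adder used only to illustrate implied-wire width.
module Adder4 (output [3:0] Sum, input [3:0] A, B);
  assign Sum = A + B;
endmodule

module ImpliedWidth;
  reg  [3:0] X, Y;
  wire [3:0] Declared;          // Explicitly declared, full width.

  Adder4 U1 (.Sum(Declared), .A(X), .B(Y));
  Adder4 U2 (.Sum(Implied),  .A(X), .B(Y)); // Implied is a 1-bit wire.

  initial
    begin
    X = 4'd5;
    Y = 4'd6;
    // Declared shows all four sum bits; Implied shows only bit 0.
    #1 $display("Declared = %b, Implied = %b", Declared, Implied);
    end
endmodule
```

The verilog-2001 compiler directive `default_nettype none makes an undeclared name an error instead of an implied wire.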
Week 12 Class 2
Today's Agenda:
Lecture on deep-submicron problems and verification
Deep-submicron scaling physics.
Importance of correct logic and clocking.
Functional and timing verification.
SystemVerilog major features.
Verilog-AMS (Analogue / Mixed-Signal).
Continued Lab Work (Lab 23 or later)
24.1.6 SystemVerilog
SystemVerilog is an Accellera standard which has been adopted by IEEE. The 2012
standard document runs over 1300 pages, but the verilog-related subset is considerably
shorter. We concentrate here on those SystemVerilog features which are related to
verilog and are likely to be synthesizable by recently updated tools.
SystemVerilog consists in part of a superset of verilog-2005 features; all of verilog-2005
is included. This superset additionally contains many C++ design constructs. Also
included is a standalone assertion sublanguage.
A major goal of SystemVerilog is to make the porting of code to and from C++ easier
than it has been with verilog-2005. Another important goal is to make complex system-level design, as well as assertion statements, directly expressible.
A Caution
As we have seen, even the synthesizable subset of plain verilog, now almost 20 years
old, has been implemented neither fully nor always correctly by the major EDA vendors.
SystemVerilog has been an IEEE standard for less than half of that time, and its
synthesizable subset is more complex than that of verilog. It behooves the designer to
try a small test case on any unfamiliar SystemVerilog feature before committing it to use
in commercial coding, whether for simulation or synthesis. Vendors generally will be
quick to fix bugs, but they have to know about them first.
declarations.*
Loop C-like break, continue, and return commands (no block name required).*
A new interface type to make reuse of module I/O declarations easier.*
A self-contained assertion language subset for assertions at module scope or in
procedural code.*
The features marked "*" are likely to be useful for logic synthesis, or for simulation
directly leading to synthesis, and are next discussed individually in more detail.
a, b, c, d;
Same as a = 8'b0000_0000.
Same as b = 8'b1111_1111.
Same as c = 8'bxxxx_xxxx.
Same as d = 8'bzzzz_zzzz.
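The equivalences listed above match the behavior of the SystemVerilog unsized fill literals '0, '1, 'x, and 'z, each of which sets every bit of the target to the given value. The following is a minimal sketch under that assumption; the module name FillLiterals and the 8-bit logic declaration are assumptions, because the original declaration type is not shown.

```systemverilog
module FillLiterals;
  logic [7:0] a, b, c, d;   // Assumed 8-bit width, to match the comments below.

  initial
    begin
    a = '0;   // Same as a = 8'b0000_0000.
    b = '1;   // Same as b = 8'b1111_1111.
    c = 'x;   // Same as c = 8'bxxxx_xxxx.
    d = 'z;   // Same as d = 8'bzzzz_zzzz.
    $display("a=%b b=%b c=%b d=%b", a, b, c, d);
    end
endmodule
```

Because the literal carries no width of its own, the same assignments remain correct if the declared width of the variables later changes.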
return: This causes an exit from a SystemVerilog function call or triggers an exit
from a running task.
A SystemVerilog function call may be terminated by a return(value) command
instead of by an assignment to the function's declared name. For example, in our Lab11
FnTask.v example, instead of
function[7:0] doCheckSum ( . . . );
. . .
doCheckSum = temp1[7:0] + temp2[7:0]^temp1[15:8] + temp2[15:8];
end
endfunction
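A minimal sketch of the return form, assuming the function body reduces to the single expression shown above. The argument list (two 16-bit inputs named temp1 and temp2) and the module name ReturnDemo are hypothetical, because the original argument list and local declarations are elided. The added parentheses only make explicit the grouping which operator precedence already gives.

```systemverilog
module ReturnDemo;

  // Hypothetical argument list; the FnTask.v original is elided.
  function automatic logic [7:0] doCheckSum
    (input logic [15:0] temp1, input logic [15:0] temp2);
    // return replaces the assignment to the function's declared name.
    return (temp1[7:0] + temp2[7:0]) ^ (temp1[15:8] + temp2[15:8]);
  endfunction

  initial
    $display("CheckSum = %h", doCheckSum(16'h1234, 16'h0102));

endmodule
```

A return statement also ends the function at that point, so it may be placed inside a conditional to exit early.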
Because the module headers repeat the same port declarations, those headers may be
simplified by means of an interface which contains the common declarations. This is
typically the simplest and most obvious use of an interface: to bundle a collection of
variables or nets. Other, more involved uses are presented in the IEEE Std 1800 chapter
referenced above.
Keeping in mind that a default port directionality of inout is assumed, the same
design fragment as above may be rewritten as follows to take advantage of an
interface declaration:
interface simple_bus; // Define the interface.
  logic      req, gnt;
  logic[7:0] addr, data;
  logic[1:0] mode;
  logic      start, rdy;
endinterface: simple_bus
//
module memMod
  (simple_bus a, // Access the simple_bus interface.
   input logic clk
  );
  logic avail;
  //
  // When memMod is instantiated in module top, a.req is the
  // "logic req, gnt;" signal in the sb_intf instance of the
  // 'simple_bus' interface:
  always @(posedge clk)
    a.gnt <= a.req & avail;
  //
endmodule
//
module cpuMod(simple_bus b, input logic clk);
  ...
endmodule
//
module top;
  logic clk = 0;
  simple_bus sb_intf(); // Instantiate the interface.
  //
  memMod mem(sb_intf, clk); // Connect the interface by position
  cpuMod cpu(.b(sb_intf), .clk(clk)); // or by name.
  //
endmodule
This kind of assertion is declared within a package, module, or interface and may
be called in procedural code. Instead of the "else $display . . ." shown in the code
example, omitting the else action causes a default, tool-specific $error() system task to
run if the assertion should fail. The $error() may be called explicitly and should accept
printf-like formatting, the same as $display(). Other messaging system tasks in
SystemVerilog may be used to define severity levels and include $fatal(), $warning(),
and $info(). The verilog stratified event queue is extended for SystemVerilog to handle
assertions and other enhanced language features.
Assertion messages can be verbose and thus may be used in place of comments in the
code, wherever the comment would specify proper behavior of the module.
SystemVerilog assertions are simulation objects, only; they can not be synthesized and
should be ignored by the synthesizer.
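A minimal sketch of the two forms described above, assuming an immediate assertion placed in procedural code. The module name AssertDemo, the signal names req and gnt, and the label CheckGnt are hypothetical. The first assertion supplies its own else action; the second omits it and so falls back on the default $error().

```systemverilog
module AssertDemo;
  logic clk = 0;
  logic req = 0, gnt = 0;

  always #5 clk = ~clk;

  always @(posedge clk)
    begin
    // Explicit else action, with printf-like formatting:
    CheckGnt: assert (req || !gnt)
      else $error("gnt without req at time %0t", $time);
    // No else action: a failure runs the default, tool-specific $error().
    assert (req || !gnt);
    end

  initial
    begin
    #12 gnt = 1'b1;   // Breaks the rule as of the next rising clock edge.
    #20 $finish;
    end
endmodule
```

Replacing $error with $fatal, $warning, or $info in the else action changes only the reported severity.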
SystemVerilog Conclusion
There are, of course, other SystemVerilog features useful for synthesis or synthesis-aimed simulation, but the ones discussed above are those most likely to be useful to a
designer and to be implemented by an EDA vendor.
24.1.7 Verilog-AMS
Introduction
Verilog-A/MS means "Verilog Analogue / Mixed-Signal", in which "mixed" refers to
mixed analogue and digital statements in the same verilog design.
Notes on terminology:
We shall abbreviate Verilog-A/MS as VAMS.
Because analog is a keyword in verilog-A/MS, the alternate
spelling, "analogue", will be used wherever functionality is meant.
Many large ASICs include some analogue functionality. Analogue components
generally require higher voltages and greater noise immunity than digital ones. VAMS
may be used to simulate such an ASIC with accuracy good enough to ensure correct
operation of the digital side of the design.
VAMS contains verilog (IEEE Std. 1364) as a proper subset. In addition, when a
digital or analogue module can not itself resolve inputs of the other kind, VAMS permits
a verilog source file to include declaration and instantiation of special connectmodules
for communication between the analogue and digital domains. Within a verilog module,
VAMS provides analog blocks, similar to always blocks, to implement the analogue
functionality of the module. Basically, VAMS allows a designer to code analogue
functionality in verilog.
SPICE may be included, if necessary, in a VAMS design. A VAMS simulator is
required to permit instantiation of SPICE-deck subckt code blocks in the same manner
as verilog modules. VAMS specifies a core list of SPICE constructs required to be
implemented by a conforming simulator; the precise version of SPICE otherwise is left to
the simulator author.
The assumption is that we have a VDD of 1.0 V (digital CMOS high). The threshold-crossing operator cross permits the State variable assignment to occur whenever
V(Clk) rises through 0.5 V. The voltage of Q is assigned by the transition operator
regardless of whether the preceding event control has been fulfilled or not. The
transition operator provides a gradual transition rather than an immediate digital one
with infinite slope.
Benefits of VAMS
The main benefit of VAMS is in relation to SPICE: It is more abstract than SPICE and
simulates far faster; this makes VAMS usable for large analogue circuits.
A secondary benefit is that VAMS is convenient for large-design designers -- which is
to say, digital designers -- because its analogue syntax is that of the familiar verilog, and
its digital semantics is identical to that of verilog.
Otherwise stated, VAMS permits the easy integration of analogue functionality into
large projects requiring top-down design methodologies.
================================================================