Chisel 2.2 Tutorial
Jonathan Bachrach, Krste Asanović, John Wawrzynek
EECS Department, UC Berkeley
{jrb|krste|johnw}@eecs.berkeley.edu

April 26, 2016

1 Introduction

This document is a tutorial introduction to Chisel (Constructing Hardware In a Scala Embedded Language). Chisel is a hardware construction language embedded in the high-level programming language Scala. At some point we will provide a proper reference manual, in addition to more tutorial examples. In the meantime, this document along with a lot of trial and error should set you on your way to using Chisel. Chisel is really only a set of special class definitions, predefined objects, and usage conventions within Scala, so when you write a Chisel program you are actually writing a Scala program. However, for the tutorial we don’t presume that you understand how to program in Scala. We will point out necessary Scala features through the Chisel examples we give, and significant hardware designs can be completed using only the material contained herein. But as you gain experience and want to make your code simpler or more reusable, you will find it important to leverage the underlying power of the Scala language. We recommend you consult one of the excellent Scala books to become more expert in Scala programming.

Chisel is still in its infancy and you are likely to encounter some implementation bugs, and perhaps even a few conceptual design problems. However, we are actively fixing and improving the language, and are open to bug reports and suggestions. Even in its early state, we hope Chisel will help designers be more productive in building designs that are easy to reuse and maintain.

Through the tutorial, we format commentary on our design choices as in this paragraph. You should be able to skip the commentary sections and still fully understand how to use Chisel, but we hope you’ll find them interesting.

We were motivated to develop a new hardware language by years of struggle with existing hardware description languages in our research projects and hardware design courses. Verilog and VHDL were developed as hardware simulation languages, and only later did they become a basis for hardware synthesis. Much of the semantics of these languages is not appropriate for hardware synthesis and, in fact, many constructs are simply not synthesizable. Other constructs are non-intuitive in how they map to hardware implementations, or their use can accidentally lead to highly inefficient hardware structures. While it is possible to use a subset of these languages and yield acceptable results, they nonetheless present a cluttered and confusing specification model, particularly in an instructional setting.

However, our strongest motivation for developing a new hardware language is our desire to change the way that electronic system design takes place. We believe that it is important to not only teach students how to design circuits, but also to teach them how to design circuit generators: programs that automatically generate designs from a high-level set of design parameters and constraints. Through circuit generators, we hope to leverage the hard work of design experts and raise the level of design abstraction for everyone. To express flexible and scalable circuit construction, circuit generators must employ sophisticated programming techniques to make decisions concerning how to best customize their output circuits according to high-level parameter values and constraints. While Verilog and VHDL include some primitive constructs for programmatic circuit generation, they lack the powerful facilities present in modern programming languages, such as object-oriented programming, type inference, support for functional programming, and reflection.

Instead of building a new hardware design language from scratch, we chose to embed hardware construction primitives within an existing language. We picked Scala not only because it includes the programming features we feel are important for building circuit generators, but because it was specifically developed as a base for domain-specific languages.

2 Hardware expressible in Chisel

This version of Chisel only supports binary logic, and does not support tri-state signals.

We focus on binary logic designs as they constitute the vast majority of designs in practice. We omit support for tri-state logic in the current Chisel language as this is in any case poorly supported by industry flows, and difficult to use reliably outside of controlled hard macros.

3 Datatypes in Chisel

Chisel datatypes are used to specify the type of values held in state elements or flowing on wires. While hardware designs ultimately operate on vectors of binary digits, other more abstract representations for values allow clearer specifications and help the tools generate more optimal circuits. In Chisel, a raw collection of bits is represented by the Bits type. Signed and unsigned integers are considered subsets of fixed-point numbers and are represented by types SInt and UInt respectively. Signed fixed-point numbers, including integers, are represented using two’s-complement format. Boolean values are represented as type Bool. Note that these types are distinct from Scala’s builtin types such as Int or Boolean. Additionally, Chisel defines Bundles for making collections of values with named fields (similar to structs in other languages), and Vecs for indexable collections of values. Bundles and Vecs will be covered later.

Constant or literal values are expressed using Scala integers or strings passed to constructors for the types:

    UInt(1)       // decimal 1-bit lit from Scala Int.
    UInt("ha")    // hexadecimal 4-bit lit from string.
    UInt("o12")   // octal 4-bit lit from string.
    UInt("b1010") // binary 4-bit lit from string.

    SInt(5)       // signed decimal 4-bit lit from Scala Int.
    SInt(-8)      // negative decimal 4-bit lit from Scala Int.
    UInt(5)       // unsigned decimal 3-bit lit from Scala Int.

    Bool(true)    // Bool lits from Scala lits.
    Bool(false)

Underscores can be used as separators in long string literals to aid readability, but are ignored when creating the value, e.g.:

    UInt("h_dead_beef") // 32-bit lit of type UInt

By default, the Chisel compiler will size each constant to the minimum number of bits required to hold the constant, including a sign bit for signed types. Bit widths can also be specified explicitly on literals, as shown below:

    UInt("ha", 8)     // hexadecimal 8-bit lit of type UInt
    UInt("o12", 6)    // octal 6-bit lit of type UInt
    UInt("b1010", 12) // binary 12-bit lit of type UInt

    SInt(5, 7)        // signed decimal 7-bit lit of type SInt
    UInt(5, 8)        // unsigned decimal 8-bit lit of type UInt

For literals of type UInt, the value is zero-extended to the desired bit width. For literals of type SInt, the value is sign-extended to fill the desired bit width. If the given bit width is too small to hold the argument value, then a Chisel error is generated.

We are working on a more concise literal syntax for Chisel using symbolic prefix operators, but are stymied by the limitations of Scala operator overloading and have not yet settled on a syntax that is actually more readable than constructors taking strings. We have also considered allowing Scala literals to be automatically converted to Chisel types, but this can cause type ambiguity and requires an additional import.

The SInt and UInt types will also later support an optional exponent field to allow Chisel to automatically produce optimized fixed-point arithmetic circuits.

4 Combinational Circuits

A circuit is represented as a graph of nodes in Chisel. Each node is a hardware operator that has zero or more inputs and that drives one output. A literal, introduced above, is a degenerate kind of node that has no inputs and drives a constant value on its output. One way to create and wire together nodes is using textual expressions. For example, we can express a simple combinational logic circuit using the following expression:

    (a & b) | (~c & d)

The syntax should look familiar, with & and | representing bitwise-AND and -OR respectively, and ~ representing bitwise-NOT. The names a through d represent named wires of some (unspecified) width.

Any simple expression can be converted directly into a circuit tree, with named wires at the leaves and operators forming the internal nodes. The final circuit output of the expression is taken from the operator at the root of the tree, in this example, the bitwise-OR.

Simple expressions can build circuits in the shape of trees, but to construct circuits in the shape of arbitrary directed acyclic graphs (DAGs), we need to describe fan-out. In Chisel, we do this by naming
a wire that holds a subexpression that we can then reference multiple times in subsequent expressions. We name a wire in Chisel by declaring a variable. For example, consider the select expression, which is used twice in the following multiplexer description:

    val sel = a | b
    val out = (sel & in1) | (~sel & in0)

The keyword val is part of Scala, and is used to name variables that have values that won’t change. It is used here to name the Chisel wire, sel, holding the output of the first bitwise-OR operator so that the output can be used multiple times in the second expression.

5 Builtin Operators

Chisel defines a set of hardware operators for the builtin types shown in Table 1.

5.1 Bitwidth Inference

Users are required to set bitwidths of ports and registers, but otherwise, bit widths on wires are automatically inferred unless set manually by the user. The bit-width inference engine starts from the graph’s input ports and calculates node output bit widths from their respective input bit widths according to the following set of rules:

    operation          bit width
    z = x + y          wz = max(wx, wy)
    z = x - y          wz = max(wx, wy)
    z = x & y          wz = min(wx, wy)
    z = Mux(c, x, y)   wz = max(wx, wy)
    z = x * y          wz = wx + wy
    z = x << n         wz = wx + maxNum(n)
    z = x >> n         wz = wx - minNum(n)
    z = Cat(x, y)      wz = wx + wy
    z = Fill(n, x)     wz = wx * maxNum(n)

where for instance wz is the bit width of wire z, and the & rule applies to all bitwise logical operations.

The bit-width inference process continues until no bit width changes. Except for right shifts by known constant amounts, the bit-width inference rules specify output bit widths that are never smaller than the input bit widths, and thus, output bit widths either grow or stay the same. Furthermore, the width of a register must be specified by the user either explicitly or from the bitwidth of the reset value or the next parameter. From these two requirements, we can show that the bit-width inference process will converge to a fixpoint.

Our choice of operator names was constrained by the Scala language. We have to use triple equals === for equality and =/= for inequality to allow the native Scala equals operator to remain usable.

We are also planning to add further operators that constrain bitwidth to the larger of the two inputs.

6 Functional Abstraction

We can define functions to factor out a repeated piece of logic that we later reuse multiple times in a design. For example, we can wrap up our earlier example of a simple combinational logic block as follows:

    def clb(a: UInt, b: UInt, c: UInt, d: UInt): UInt =
      (a & b) | (~c & d)

where clb is the function which takes a, b, c, d as arguments and returns a wire to the output of a boolean circuit. The def keyword is part of Scala and introduces a function definition, with each argument followed by a colon then its type, and the function return type given after the colon following the argument list. The equals (=) sign separates the function argument list from the function definition.

We can then use the block in another circuit as follows:

    val out = clb(a,b,c,d)

We will later describe many powerful ways to use functions to construct hardware using Scala’s functional programming support.

7 Bundles and Vecs

Bundle and Vec are classes that allow the user to expand the set of Chisel datatypes with aggregates of other types.

Bundles group together several named fields of potentially different types into a coherent unit, much like a struct in C. Users define their own bundles by defining a class as a subclass of Bundle:

    class MyFloat extends Bundle {
      val sign = Bool()
      val exponent = UInt(width = 8)
      val significand = UInt(width = 23)
    }

    val x = new MyFloat()
    val xs = x.sign

A Scala convention is to capitalize the name of new classes and we suggest you follow that convention in Chisel too. The width named parameter to the UInt constructor specifies the number of bits in the type.
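
As a rough illustration of the struct analogy, the MyFloat bundle can be pictured as a 32-bit word with three fields. The sketch below is plain Scala, not Chisel: MyFloatModel, toBits and fromBits are hypothetical names, and the layout (first-declared field most significant) is an assumption borrowed from the argument order of Cat(sign, exponent, mantissa) in Table 1, not something the text specifies.

```scala
// Plain-Scala model of the MyFloat bundle as a packed 32-bit word.
// Assumption (not stated in the text): the first-declared field is the most
// significant, matching the argument order of Cat(sign, exponent, mantissa).
final case class MyFloatModel(sign: Boolean, exponent: Int, significand: Int) {
  require(exponent >= 0 && exponent < (1 << 8), "exponent must fit in 8 bits")
  require(significand >= 0 && significand < (1 << 23), "significand must fit in 23 bits")

  // Pack the three fields into bit 31, bits 30..23 and bits 22..0.
  def toBits: Long =
    ((if (sign) 1L else 0L) << 31) | (exponent.toLong << 23) | significand.toLong
}

object MyFloatModel {
  // Inverse of toBits: slice a 32-bit word back into named fields.
  def fromBits(bits: Long): MyFloatModel =
    MyFloatModel(
      sign = ((bits >> 31) & 1L) == 1L,
      exponent = ((bits >> 23) & 0xFFL).toInt,
      significand = (bits & 0x7FFFFFL).toInt
    )
}
```

Under this assumed layout, MyFloatModel(true, 127, 0).toBits gives the word 0xBF800000, and fromBits recovers the three named fields from it.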

Example                                     Explanation

Bitwise operators. Valid on SInt, UInt, Bool.
  val invertedX = ~x                        Bitwise NOT
  val hiBits = x & UInt("h_ffff_0000")      Bitwise AND
  val flagsOut = flagsIn | overflow         Bitwise OR
  val flagsOut = flagsIn ^ toggle           Bitwise XOR

Bitwise reductions. Valid on SInt and UInt. Returns Bool.
  val allSet = andR(x)                      AND reduction
  val anySet = orR(x)                       OR reduction
  val parity = xorR(x)                      XOR reduction

Equality comparison. Valid on SInt, UInt, and Bool. Returns Bool.
  val equ = x === y                         Equality
  val neq = x =/= y                         Inequality

Shifts. Valid on SInt and UInt.
  val twoToTheX = SInt(1) << x              Logical left shift.
  val hiBits = x >> UInt(16)                Right shift (logical on UInt and arithmetic on SInt).

Bitfield manipulation. Valid on SInt, UInt, and Bool.
  val xLSB = x(0)                           Extract single bit, LSB has index 0.
  val xTopNibble = x(15,12)                 Extract bit field from end to start bit position.
  val usDebt = Fill(3, UInt("hA"))          Replicate a bit string multiple times.
  val float = Cat(sign,exponent,mantissa)   Concatenates bit fields, with first argument on left.

Logical operations. Valid on Bools.
  val sleep = !busy                         Logical NOT
  val hit = tagMatch && valid               Logical AND
  val stall = src1busy || src2busy          Logical OR
  val out = Mux(sel, inTrue, inFalse)       Two-input mux where sel is a Bool

Arithmetic operations. Valid on Nums: SInt and UInt.
  val sum = a + b                           Addition
  val diff = a - b                          Subtraction
  val prod = a * b                          Multiplication
  val div = a / b                           Division
  val mod = a % b                           Modulus

Arithmetic comparisons. Valid on Nums: SInt and UInt. Returns Bool.
  val gt = a > b                            Greater than
  val gte = a >= b                          Greater than or equal
  val lt = a < b                            Less than
  val lte = a <= b                          Less than or equal

Table 1: Chisel operators on builtin data types.
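
To make the bitfield and reduction rows of Table 1 concrete, here is a small software model of what those operators compute on unsigned values. It is plain Scala on BigInt rather than Chisel code. The object BitOps and its method names are hypothetical, and the explicit width arguments are an assumption of this sketch, since a BigInt carries no bit width of its own.

```scala
// Software model (not Chisel) of some Table 1 operators on unsigned values.
// Widths are passed explicitly because BigInt has no notion of bit width.
object BitOps {
  // x(hi, lo): extract bits hi..lo inclusive; the LSB has index 0.
  def extract(x: BigInt, hi: Int, lo: Int): BigInt =
    (x >> lo) & ((BigInt(1) << (hi - lo + 1)) - 1)

  // x(i): extract a single bit.
  def bit(x: BigInt, i: Int): Boolean = x.testBit(i)

  // Cat(hi, lo): hi ends up on the left (most significant side).
  def cat(hi: BigInt, lo: BigInt, loWidth: Int): BigInt = (hi << loWidth) | lo

  // Fill(n, x): replicate a width-bit pattern n times.
  def fill(n: Int, x: BigInt, width: Int): BigInt =
    (0 until n).foldLeft(BigInt(0))((acc, _) => (acc << width) | x)

  // Reductions over the low `width` bits.
  def andR(x: BigInt, width: Int): Boolean = x == ((BigInt(1) << width) - 1)
  def orR(x: BigInt): Boolean = x.signum != 0
  def xorR(x: BigInt): Boolean = x.bitCount % 2 == 1
}
```

For instance, BitOps.extract(BigInt(0xABCD), 15, 12) models x(15,12) on a 16-bit value and yields 0xA, while BitOps.fill(3, BigInt(0xA), 4) models Fill(3, UInt("hA")) and yields 0xAAA.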

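The bit-width rules of Section 5.1 apply to the operators of Table 1 and can be tried out as ordinary integer arithmetic. The sketch below is plain Scala; WidthRules and its method names are hypothetical. It only covers shifts and fills by a known constant n, where we read maxNum(n) and minNum(n) as both equal to n. That reading is our assumption, since the text does not define the two functions.

```scala
// Plain-Scala model of the width rules listed in Section 5.1.
// Only the formulas come from the text; all names here are illustrative.
object WidthRules {
  def add(wx: Int, wy: Int): Int = math.max(wx, wy)      // z = x + y
  def sub(wx: Int, wy: Int): Int = math.max(wx, wy)      // z = x - y
  def bitwise(wx: Int, wy: Int): Int = math.min(wx, wy)  // z = x & y, as listed
  def mux(wx: Int, wy: Int): Int = math.max(wx, wy)      // z = Mux(c, x, y)
  def mul(wx: Int, wy: Int): Int = wx + wy               // z = x * y
  def cat(wx: Int, wy: Int): Int = wx + wy               // z = Cat(x, y)

  // Constant amounts only (assumption: maxNum(n) = minNum(n) = n).
  def shiftLeftBy(wx: Int, n: Int): Int = wx + n         // z = x << n
  def shiftRightBy(wx: Int, n: Int): Int = wx - n        // z = x >> n
  def fillBy(n: Int, wx: Int): Int = wx * n              // z = Fill(n, x)
}
```

As a usage example, the product of two 8-bit values added to a 16-bit value has width WidthRules.add(WidthRules.mul(8, 8), 16), which is 16, and concatenating a 1-bit, an 8-bit and a 23-bit field has width WidthRules.cat(1, WidthRules.cat(8, 23)), which is 32.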
Vecs create an indexable vector of elements, and are constructed as follows:

    // Vector of 5 23-bit signed integers.
    val myVec = Vec.fill(5){ SInt(width = 23) }

    // Connect to one element of vector.
    val reg3 = myVec(3)

(Note that we have to specify the type of the Vec elements inside the trailing curly brackets, as we have to pass the bitwidth parameter into the SInt constructor.)

The set of primitive classes (SInt, UInt, and Bool) plus the aggregate classes (Bundles and Vecs) all inherit from a common superclass, Data. Every object that ultimately inherits from Data can be represented as a bit vector in a hardware design.

Bundles and Vecs can be arbitrarily nested to build complex data structures:

    class BigBundle extends Bundle {
      // Vector of 5 23-bit signed integers.
      val myVec = Vec.fill(5) { SInt(width = 23) }
      val flag = Bool()
      // Previously defined bundle.
      val f = new MyFloat()
    }

Note that the builtin Chisel primitive and aggregate classes do not require the new when creating an instance, whereas new user datatypes will. A Scala apply constructor can be defined so that a user datatype also does not require new, as described in Section 14.

8 Ports

Ports are used as interfaces to hardware components. A port is simply any Data object that has directions assigned to its members.

Chisel provides port constructors to allow a direction to be added (input or output) to an object at construction time. Primitive port constructors take the direction as the first argument (where the direction is INPUT or OUTPUT) and the number of bits as the second argument (except booleans which are always one bit).

An example port declaration is as follows:

    class Decoupled extends Bundle {
      val ready = Bool(OUTPUT)
      val data = UInt(INPUT, 32)
      val valid = Bool(INPUT)
    }

After defining Decoupled, it becomes a new type that can be used as needed for module interfaces or for named collections of wires.

The direction of an object can also be assigned at instantiation time:

    class ScaleIO extends Bundle {
      val in = new MyFloat().asInput
      val scale = new MyFloat().asInput
      val out = new MyFloat().asOutput
    }

The methods asInput and asOutput force all members of the data object to the requested direction.

By folding directions into the object declarations, Chisel is able to provide powerful wiring constructs described later.

9 Modules

Chisel modules are very similar to Verilog modules in defining a hierarchical structure in the generated circuit. The hierarchical module namespace is accessible in downstream tools to aid in debugging and physical layout. A user-defined module is defined as a class which:

• inherits from Module,

• contains an interface stored in a port field named io, and

• wires together subcircuits in its constructor.

As an example, consider defining your own two-input multiplexer as a module:

    class Mux2 extends Module {
      val io = new Bundle{
        val sel = UInt(INPUT, 1)
        val in0 = UInt(INPUT, 1)
        val in1 = UInt(INPUT, 1)
        val out = UInt(OUTPUT, 1)
      }
      io.out := (io.sel & io.in1) | (~io.sel & io.in0)
    }

The wiring interface to a module is a collection of ports in the form of a Bundle. The interface to the module is defined through a field named io. For Mux2, io is defined as a bundle with four fields, one for each multiplexer port.

The := assignment operator, used here in the body of the definition, is a special operator in Chisel that wires the input of the left-hand side to the output of the right-hand side.

9.1 Module Hierarchy

We can now construct circuit hierarchies, where we build larger modules out of smaller sub-modules. For
example, we can build a 4-input multiplexer module in terms of the Mux2 module by wiring together three 2-input multiplexers:

    class Mux4 extends Module {
      val io = new Bundle {
        val in0 = UInt(INPUT, 1)
        val in1 = UInt(INPUT, 1)
        val in2 = UInt(INPUT, 1)
        val in3 = UInt(INPUT, 1)
        val sel = UInt(INPUT, 2)
        val out = UInt(OUTPUT, 1)
      }
      val m0 = Module(new Mux2())
      m0.io.sel := io.sel(0)
      m0.io.in0 := io.in0; m0.io.in1 := io.in1

      val m1 = Module(new Mux2())
      m1.io.sel := io.sel(0)
      m1.io.in0 := io.in2; m1.io.in1 := io.in3

      val m3 = Module(new Mux2())
      m3.io.sel := io.sel(1)
      m3.io.in0 := m0.io.out; m3.io.in1 := m1.io.out

      io.out := m3.io.out
    }

We again define the module interface as io and wire up the inputs and outputs. In this case, we create three Mux2 child modules, using the Module constructor function and the Scala new keyword to create a new object. We then wire them up to one another and to the ports of the Mux4 interface.

10 Running and Testing Examples

Now that we have defined modules, we will discuss how we actually run and test a circuit. Chisel translates into either C++ or Verilog. In order to build a circuit we need to call chiselMain:

    object tutorial {
      def main(args: Array[String]) = {
        chiselMain(args, () => Module(new Mux2()))
      }
    }

Testing is a crucial part of circuit design, and thus in Chisel we provide a mechanism for testing circuits by providing test vectors within Scala using subclasses of the Tester class:

    class Tester[T <: Module] (val c: T, val isTrace: Boolean = true) {
      var t: Int
      val rnd: Random
      def int(x: Boolean): BigInt
      def int(x: Int): BigInt
      def int(x: Bits): BigInt
      def reset(n: Int = 1)
      def step(n: Int): Int
      def pokeAt(data: Mem[T], index: Int, x: BigInt)
      def poke(data: Bits, x: BigInt)
      def poke(data: Aggregate, x: Array[BigInt])
      def peekAt(data: Mem[T], index: Int)
      def peek(data: Bits): BigInt
      def peek(data: Aggregate): Array[BigInt]
      def expect (good: Boolean, msg: String): Boolean
      def expect (data: Bits, target: BigInt): Boolean
    }

which binds a tester to a module and allows users to write tests using the given debug protocol. In particular, users utilize:

• poke to set input port and state values,

• step to execute the circuit one time unit,

• peek to read port and state values, and

• expect to compare peeked circuit values to expected arguments.

Users connect tester instances to modules using:

    object chiselMainTest {
      def apply[T <: Module]
        (args: Array[String], comp: () => T)(
         tester: T => Tester[T]): T
    }

When --test is given as an argument to chiselMainTest, a tester instance runs the Design Under Test (DUT) in a separate process with stdin and stdout connected so that debug commands can be sent to the DUT and responses can be received from the DUT as shown in Figure 1.

Figure 1: DUT run using a Tester object in Scala with stdin and stdout connected (inputs go to the Chisel DUT, outputs come back from it).

For example, in the following:

    class Mux2Tests(c: Mux2) extends Tester(c) {
      val n = pow(2, 3).toInt
      for (s <- 0 until 2) {
        for (i0 <- 0 until 2) {
          for (i1 <- 0 until 2) {
            poke(c.io.sel, s)
            poke(c.io.in1, i1)
            poke(c.io.in0, i0)
            step(1)
            expect(c.io.out, (if (s == 1) i1 else i0))
          }
        }
      }
    }

assignments for each input of Mux2 are set to the appropriate values using poke. For this particular example, we are testing the Mux2 by hardcoding the inputs to some known values and checking if the output corresponds to the known one. To do this, on each iteration we generate appropriate inputs to the module and tell the simulation to assign these values to the inputs of the device under test, c, step the circuit, and test the expected value. Finally, the following shows how the tester is invoked:

    chiselMainTest(args :+ "--test", () => Module(new Mux2())) {
      c => new Mux2Tests(c)
    }

Other command arguments are as follows:

    --targetDir    target pathname prefix
    --genHarness   generate harness file for C++
    --backend v    generate verilog
    --backend c    generate C++ (default)
    --vcd          enable vcd dumping
    --debug        put all wires in class file

11 State Elements

The simplest form of state element supported by Chisel is a positive edge-triggered register, which can be instantiated as:

    val reg = Reg(next = in)

This circuit has an output that is a copy of the input signal in delayed by one clock cycle. Note that we do not have to specify the type of Reg as it will be automatically inferred from its input when instantiated in this way. In the current version of Chisel, clock and reset are global signals that are implicitly included where needed.

Using registers, we can quickly define a number of useful circuit constructs. For example, a rising-edge detector that takes a boolean signal in and outputs true when the current value is true and the previous value is false is given by:

    def risingedge(x: Bool) = x && !Reg(next = x)

Counters are an important sequential circuit. To construct an up-counter that counts up to a maximum value, max, then wraps around back to zero (i.e., modulo max+1), we write:

    def counter(max: UInt) = {
      val x = Reg(init = UInt(0, max.getWidth))
      x := Mux(x === max, UInt(0), x + UInt(1))
      x
    }

The counter register is created in the counter function with a reset value of 0 (with width large enough to hold max), to which the register will be initialized when the global reset for the circuit is asserted. The := assignment to x in counter wires an update combinational circuit which increments the counter value unless it hits the max at which point it wraps back to zero. Note that when x appears on the right-hand side of an assignment, its output is referenced, whereas when on the left-hand side, its input is referenced.

Counters can be used to build a number of useful sequential circuits. For example, we can build a pulse generator by outputting true when a counter reaches zero:

    // Produce pulse every n cycles.
    def pulse(n: UInt) = counter(n - UInt(1)) === UInt(0)

A square-wave generator can then be toggled by the pulse train, toggling between true and false on each pulse:

    // Flip internal state when input true.
    def toggle(p: Bool) = {
      val x = Reg(init = Bool(false))
      x := Mux(p, !x, x)
      x
    }

    // Square wave of a given period.
    def squareWave(period: UInt) = toggle(pulse(period / UInt(2)))

11.1 Forward Declarations

Purely combinational circuits cannot have cycles between nodes, and Chisel will report an error if such a cycle is detected. Because they do not have cycles, combinational circuits can always be constructed in a feed-forward manner, by adding new nodes whose inputs are derived from nodes that have already been defined. Sequential circuits naturally have feedback between nodes, and so it is sometimes necessary to reference an output wire before the producing node has been defined. Because Scala evaluates program statements sequentially, we allow data nodes to serve as a wire providing a declaration of a node that can be used immediately, but whose input will be set later. For example, in a simple CPU, we need to define the pcPlus4 and brTarget wires so they can be referenced before defined:

    val pcPlus4 = UInt()
    val brTarget = UInt()
    val pcNext = Mux(io.ctrl.pcSel, brTarget, pcPlus4)
    val pcReg = Reg(next = pcNext, init = UInt(0, 32))
    pcPlus4 := pcReg + UInt(4)
    ...
    brTarget := addOut

The wiring operator := is used to wire up the connection after pcReg and addOut are defined.

11.2 Conditional Updates

In our previous examples using registers, we simply wired the combinational logic blocks to the inputs of the registers. When describing the operation of state elements, it is often useful to instead specify when updates to the registers will occur and to specify these updates spread across several separate statements. Chisel provides conditional update rules in the form of the when construct to support this style of sequential logic description. For example,

    val r = Reg(init = UInt(0, 16))
    when (cond) {
      r := r + UInt(1)
    }

where register r is updated at the end of the current clock cycle only if cond is true. The argument to when is a predicate circuit expression that returns a Bool. The update block following when can only contain update statements using the assignment operator :=, simple expressions, and named wires defined with val.

In a sequence of conditional updates, the last conditional update whose condition is true takes priority. For example,

    when (c1) { r := UInt(1) }
    when (c2) { r := UInt(2) }

leads to r being updated according to the following truth table:

    c1  c2  r
    0   0   r    r unchanged
    0   1   2
    1   0   1
    1   1   2    c2 takes precedence over c1

Figure 2 shows how each conditional update can be viewed as inserting a mux before the input of a register to select either the update expression or the previous input according to the when predicate. In addition, the predicate is OR-ed into an enable signal that drives the load enable of the register. The compiler places initialization values at the beginning of the chain so that if no conditional updates fire in a clock cycle, the load enable of the register will be deasserted and the register value will not change.

Figure 2: Equivalent hardware constructed for conditional updates. Each when statement adds another level of data mux and ORs the predicate into the enable chain. The compiler effectively adds the termination values to the end of the chain automatically.

Chisel provides some syntactic sugar for other common forms of conditional update. The unless construct is the same as when but negates its condition. In other words,

    unless (c) { body }

is the same as

    when (!c) { body }

The update block can target multiple registers, and there can be different overlapping subsets of registers present in different update blocks. Each register is only affected by conditions in which it appears. The same is possible for combinational circuits (update of a Wire). Note that all combinational circuits need a default value. For example:

    r := SInt(3); s := SInt(3)
    when (c1) { r := SInt(1); s := SInt(1) }
    when (c2) { r := SInt(2) }

leads to r and s being updated according to the following truth table:

    c1  c2  r  s
    0   0   3  3
    0   1   2  3    // r updated in c2 block, s at top level.
    1   0   1  1
    1   1   2  1

We are considering adding a different form of conditional update, where only a single update block will
take effect. These atomic updates are similar to Bluespec guarded atomic actions.

Conditional update constructs can be nested and any given block is executed under the conjunction of all outer nesting conditions. For example,

    when (a) { when (b) { body } }

is the same as:

    when (a && b) { body }

Conditionals can be chained together using when, .elsewhen, .otherwise corresponding to if, else if and else in Scala. For example,

    when (c1) { u1 }
    .elsewhen (c2) { u2 }
    .otherwise { ud }

is the same as:

    when (c1) { u1 }
    when (!c1 && c2) { u2 }
    when (!(c1 || c2)) { ud }

We introduce the switch statement for conditional updates involving a series of comparisons against a common key. For example,

    switch(idx) {
      is(v1) { u1 }
      is(v2) { u2 }
    }

is equivalent to:

    when (idx === v1) { u1 }
    .elsewhen (idx === v2) { u2 }

Chisel also allows a Wire, i.e., the output of some combinational logic, to be the target of conditional update statements to allow complex combinational logic expressions to be built incrementally. Chisel does not allow a combinational output to be incompletely specified and will report an error if an unconditional update is not encountered for a combinational output.

In Verilog, if a procedural specification of a combinational logic block is incomplete, a latch will silently be inferred causing many frustrating bugs.

It could be possible to add more analysis to the Chisel compiler, to determine if a set of predicates covers all possibilities. But for now, we require a single predicate that is always true in the chain of conditional updates to a Wire.

11.3 Finite State Machines

A common type of sequential circuit used in digital design is a Finite State Machine (FSM). An example of a simple FSM is a parity generator:

    class Parity extends Module {
      val io = new Bundle {
        val in = Bool(dir = INPUT)
        val out = Bool(dir = OUTPUT) }
      val s_even :: s_odd :: Nil = Enum(UInt(), 2)
      val state = Reg(init = s_even)
      when (io.in) {
        when (state === s_even) { state := s_odd }
        when (state === s_odd) { state := s_even }
      }
      io.out := (state === s_odd)
    }

where Enum(UInt(), 2) generates two UInt literals. States are updated when in is true. It is worth noting that all of the mechanisms for FSMs are built upon registers, wires, and conditional updates.

Below is a more complicated FSM example which is a circuit for accepting money for a vending machine:

    class VendingMachine extends Module {
      val io = new Bundle {
        val nickel = Bool(dir = INPUT)
        val dime = Bool(dir = INPUT)
        val valid = Bool(dir = OUTPUT) }
      val s_idle :: s_5 :: s_10 :: s_15 :: s_ok :: Nil =
        Enum(UInt(), 5)
      val state = Reg(init = s_idle)
      when (state === s_idle) {
        when (io.nickel) { state := s_5 }
        when (io.dime) { state := s_10 }
      }
      when (state === s_5) {
        when (io.nickel) { state := s_10 }
        when (io.dime) { state := s_15 }
      }
      when (state === s_10) {
        when (io.nickel) { state := s_15 }
        when (io.dime) { state := s_ok }
      }
      when (state === s_15) {
        when (io.nickel) { state := s_ok }
        when (io.dime) { state := s_ok }
      }
      when (state === s_ok) {
        state := s_idle
      }
      io.valid := (state === s_ok)
    }

Here is the vending machine FSM defined with a switch statement:

    class VendingMachine extends Module {
      val io = new Bundle {
        val nickel = Bool(dir = INPUT)
        val dime = Bool(dir = INPUT)
        val valid = Bool(dir = OUTPUT)
      }
  val s_idle :: s_5 :: s_10 :: s_15 :: s_ok :: Nil =
    Enum(UInt(), 5)
  val state = Reg(init = s_idle)

  switch (state) {
    is (s_idle) {
      when (io.nickel) { state := s_5 }
      when (io.dime) { state := s_10 }
    }
    is (s_5) {
      when (io.nickel) { state := s_10 }
      when (io.dime) { state := s_15 }
    }
    is (s_10) {
      when (io.nickel) { state := s_15 }
      when (io.dime) { state := s_ok }
    }
    is (s_15) {
      when (io.nickel) { state := s_ok }
      when (io.dime) { state := s_ok }
    }
    is (s_ok) {
      state := s_idle
    }
  }
  io.valid := (state === s_ok)
}

12 Memories

Chisel provides facilities for creating both read only and read/write memories.

12.1 ROM

Users can define read only memories with a Vec:

Vec(inits: Seq[T])
Vec(elt0: T, elts: T*)

where inits is a sequence of initial Data literals that initialize the ROM. For example, users can create a small ROM initialized to 1, 2, 4, 8 and loop through all values using a counter as an address generator as follows:

val m = Vec(Array(UInt(1), UInt(2), UInt(4), UInt(8)))
val r = m(counter(UInt(m.length)))

We can create an n value sine lookup table using a ROM initialized as follows:

def sinTable (amp: Double, n: Int) = {
  val times =
    Range(0, n, 1).map(i => (i*2*Pi)/(n.toDouble-1) - Pi)
  val inits =
    times.map(t => SInt(round(amp * sin(t)), width = 32))
  Vec(inits)
}
def sinWave (amp: Double, n: Int) =
  sinTable(amp, n)(counter(UInt(n)))

where amp is used to scale the fixpoint values stored in the ROM.

12.2 Mem

Memories are given special treatment in Chisel since hardware implementations of memory have many variations, e.g., FPGA memories are instantiated quite differently from ASIC memories. Chisel defines a memory abstraction that can map to either simple Verilog behavioral descriptions, or to instances of memory modules that are available from external memory generators provided by foundry or IP vendors.

Chisel supports random-access memories via the Mem construct. Writes to Mems are positive-edge-triggered and reads are either combinational or positive-edge-triggered.¹

¹ Current FPGA technology no longer supports combinational (asynchronous) reads. The read address needs to be registered.

object Mem {
  def apply[T <: Data](gen: T, depth: Int,
                       seqRead: Boolean = false): Mem
}

class Mem[T <: Data](gen: T, depth: Int,
                     seqRead: Boolean = false)
    extends Updateable {
  def apply(idx: UInt): T
}

Ports into Mems are created by applying a UInt index. A 32-entry register file with one write port and two combinational read ports might be expressed as follows:

val rf = Mem(UInt(width = 64), 32)
when (wen) { rf(waddr) := wdata }
val dout1 = rf(raddr1)
val dout2 = rf(raddr2)

If the optional parameter seqRead is set, Chisel will attempt to infer sequential read ports when the read address is a Reg. A one-read port, one-write port SRAM might be described as follows:

val ram1r1w =
  Mem(UInt(width = 32), 1024, seqRead = true)
val reg_raddr = Reg(UInt())
when (wen) { ram1r1w(waddr) := wdata }
when (ren) { reg_raddr := raddr }
val rdata = ram1r1w(reg_raddr)

Single-ported SRAMs can be inferred when the read and write conditions are mutually exclusive in the same when chain:

val ram1p =

  Mem(UInt(width = 32), 1024, seqRead = true)
val reg_raddr = Reg(UInt())
when (wen) { ram1p(waddr) := wdata }
.elsewhen (ren) { reg_raddr := raddr }
val rdata = ram1p(reg_raddr)

If the same Mem address is both written and sequentially read on the same clock edge, or if a sequential read enable is cleared, then the read data is undefined.

Mem also supports write masks for subword writes. A given bit is written if the corresponding mask bit is set.

val ram = Mem(UInt(width = 32), 256)
when (wen) { ram.write(waddr, wdata, wmask) }

13 Interfaces and Bulk Connections

For more sophisticated modules it is often useful to define and instantiate interface classes while defining the IO for a module. First and foremost, interface classes promote reuse, allowing users to capture once and for all common interfaces in a useful form. Secondly, interfaces allow users to dramatically reduce wiring by supporting bulk connections between producer and consumer modules. Finally, users can make changes in large interfaces in one place, reducing the number of updates required when adding or removing pieces of the interface.

13.1 Port Classes, Subclasses, and Nesting

As we saw earlier, users can define their own interfaces by defining a class that subclasses Bundle. For example, a user could define a simple link for handshaking data as follows:

class SimpleLink extends Bundle {
  val data = UInt(OUTPUT, 16)
  val valid = Bool(OUTPUT)
}

We can then extend SimpleLink by adding parity bits using bundle inheritance:

class PLink extends SimpleLink {
  val parity = UInt(OUTPUT, 5)
}

In general, users can organize their interfaces into hierarchies using inheritance.

From there we can define a filter interface by nesting two PLinks into a new FilterIO bundle:

class FilterIO extends Bundle {
  val x = new PLink().flip
  val y = new PLink()
}

where flip recursively changes the “gender” of a bundle, changing input to output and output to input.

We can now define a filter by defining a filter class extending Module:

class Filter extends Module {
  val io = new FilterIO()
  ...
}

where the io field contains FilterIO.

13.2 Bundle Vectors

Beyond single elements, vectors of elements form richer hierarchical interfaces. For example, in order to create a crossbar with a vector of inputs, producing a vector of outputs, and selected by a UInt input, we utilize the Vec constructor:

class CrossbarIo(n: Int) extends Bundle {
  val in = Vec.fill(n){ new PLink().flip() }
  val sel = UInt(INPUT, sizeof(n))
  val out = Vec.fill(n){ new PLink() }
}

where Vec.fill takes a size as the first argument and a block returning a port as the second argument.

13.3 Bulk Connections

We can now compose two filters into a filter block as follows:

class Block extends Module {
  val io = new FilterIO()
  val f1 = Module(new Filter())
  val f2 = Module(new Filter())

  f1.io.x <> io.x
  f1.io.y <> f2.io.x
  f2.io.y <> io.y
}

where <> bulk connects interfaces of opposite gender between sibling modules or interfaces of same gender between parent/child modules. Bulk connections connect leaf ports of the same name to each other. After all connections are made and the circuit is being elaborated, Chisel warns users if ports have other than exactly one connection to them.
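
As a rough illustration of the name-matching rule described for bulk connections in Section 13.3, the following plain-Scala sketch models an interface as a map from leaf-port name to direction and pairs up ports that share a name. It does not use Chisel at all: Interface, flip, bulkConnect, and unconnected are hypothetical names chosen for this example, and the real <> operator also handles nested bundles and parent/child connections, which are left out here.

```scala
object BulkConnectSketch {
  sealed trait Dir
  case object Input extends Dir
  case object Output extends Dir

  // An interface modeled as a flat map: leaf-port name -> direction.
  // (Assumption for this sketch; Chisel bundles are richer than this.)
  type Interface = Map[String, Dir]

  // Reverse every port direction, like calling flip on a bundle.
  def flip(i: Interface): Interface =
    i.map { case (name, dir) =>
      val flipped: Dir = if (dir == Input) Output else Input
      name -> flipped
    }

  // Sibling-style bulk connect: a leaf port is connected when the other
  // interface has a port of the same name and the opposite direction.
  // Returns the names that were connected.
  def bulkConnect(a: Interface, b: Interface): Set[String] =
    a.keySet.intersect(b.keySet).filter(name => a(name) != b(name))

  // Ports of `i` that the connection left dangling.
  def unconnected(i: Interface, connected: Set[String]): Set[String] =
    i.keySet.diff(connected)

  def main(args: Array[String]): Unit = {
    val simpleLink: Interface = Map("data" -> Output, "valid" -> Output)
    val pLink: Interface = simpleLink + ("parity" -> Output)

    // Producer PLink to consumer PLink: every leaf port is matched.
    val full = bulkConnect(pLink, flip(pLink))
    println(s"connected: $full")

    // Producer PLink to consumer SimpleLink: parity has no partner.
    val partial = bulkConnect(pLink, flip(simpleLink))
    println(s"dangling: ${unconnected(pLink, partial)}")
  }
}
```

In this model, a dangling port such as parity corresponds to the kind of port that would draw the elaboration-time warning about ports without exactly one connection.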

13.4 Interface Views

Consider a simple CPU consisting of control path and data path submodules and host and memory interfaces shown in Figure 3.

Figure 3: Simple CPU involving control and data path submodules and host and memory interfaces.

In this CPU we can see that the control path and data path each connect only to a part of the instruction and data memory interfaces. Chisel allows users to do this with partial fulfillment of interfaces. A user first defines the complete interface to a ROM and Mem as follows:

class RomIo extends Bundle {
  val isVal = Bool(INPUT)
  val raddr = UInt(INPUT, 32)
  val rdata = UInt(OUTPUT, 32)
}

class RamIo extends RomIo {
  val isWr = Bool(INPUT)
  val wdata = UInt(INPUT, 32)
}

Now the control path can build an interface in terms of these interfaces:

class CpathIo extends Bundle {
  val imem = new RomIo().flip()
  val dmem = new RamIo().flip()
  ...
}

and the control and data path modules can be built by partially assigning to these interfaces as follows:

class Cpath extends Module {
  val io = new CpathIo();
  ...
  io.imem.isVal := ...;
  io.dmem.isVal := ...;
  io.dmem.isWr := ...;
  ...
}

class Dpath extends Module {
  val io = new DpathIo();
  ...
  io.imem.raddr := ...;
  io.dmem.raddr := ...;
  io.dmem.wdata := ...;
  ...
}

We can now wire up the CPU using bulk connects as we would with other bundles:

class Cpu extends Module {
  val io = new CpuIo()
  val c = Module(new Cpath())
  val d = Module(new Dpath())
  c.io.ctl <> d.io.ctl
  c.io.dat <> d.io.dat
  c.io.imem <> io.imem
  d.io.imem <> io.imem
  c.io.dmem <> io.dmem
  d.io.dmem <> io.dmem
  d.io.host <> io.host
}

Repeated bulk connections of partially assigned control and data path interfaces completely connect up the CPU interface.

14 Functional Creation of Modules

It is also useful to be able to make a functional interface for module construction. For instance, we could build a constructor that takes multiplexer inputs as parameters and returns the multiplexer output:

object Mux2 {
  def apply (sel: UInt, in0: UInt, in1: UInt) = {
    val m = Module(new Mux2())
    m.io.in0 := in0
    m.io.in1 := in1
    m.io.sel := sel
    m.io.out
  }
}

where object Mux2 creates a Scala singleton object on the Mux2 module class, and apply defines a method for creation of a Mux2 instance. With this Mux2 creation function, the specification of Mux4 now is significantly simpler.

class Mux4 extends Module {
  val io = new Bundle {
    val in0 = UInt(INPUT, 1)

    val in1 = UInt(INPUT, 1)
    val in2 = UInt(INPUT, 1)
    val in3 = UInt(INPUT, 1)
    val sel = UInt(INPUT, 2)
    val out = UInt(OUTPUT, 1)
  }
  io.out := Mux2(io.sel(1),
                 Mux2(io.sel(0), io.in0, io.in1),
                 Mux2(io.sel(0), io.in2, io.in3))
}

Selecting inputs is so useful that Chisel builds it in and calls it Mux. However, unlike Mux2 defined above, the builtin version allows any datatype on in0 and in1 as long as they have a common super class. In Section 15 we will see how to define this ourselves.

Chisel provides MuxCase which is an n-way Mux

MuxCase(default, Array(c1 -> a, c2 -> b, ...))

where each condition / value is represented as a tuple in a Scala array and where MuxCase can be translated into the following Mux expression:

Mux(c1, a, Mux(c2, b, Mux(..., default)))

Chisel also provides MuxLookup which is an n-way indexed multiplexer:

MuxLookup(idx, default,
          Array(UInt(0) -> a, UInt(1) -> b, ...))

which can be rewritten in terms of MuxCase as follows:

MuxCase(default,
        Array((idx === UInt(0)) -> a,
              (idx === UInt(1)) -> b, ...))

Note that the cases (e.g., c1, c2) must be in parentheses.

15 Polymorphism and Parameterization

Scala is a strongly typed language and uses parameterized types to specify generic functions and classes. In this section, we show how Chisel users can define their own reusable functions and classes using parameterized classes.

This section is advanced and can be skipped at first reading.

15.1 Parameterized Functions

Earlier we defined Mux2 on Bool, but now we show how we can define a generic multiplexer function. We define this function as taking a boolean condition and con and alt arguments (corresponding to then and else expressions) of type T:

def Mux[T <: Bits](c: Bool, con: T, alt: T): T = { ... }

where T is required to be a subclass of Bits. Scala ensures that in each usage of Mux, it can find a common superclass of the actual con and alt argument types, otherwise it causes a Scala compilation type error. For example,

Mux(c, UInt(10), UInt(11))

yields a UInt wire because the con and alt arguments are each of type UInt.

We now present a more advanced example of parameterized functions for defining an inner product FIR digital filter generically over Chisel Num’s. The inner product FIR filter can be mathematically defined as:

$$ y[t] = \sum_j w_j * x_j[t - j] \qquad (1) $$

where x is the input and w is a vector of weights. In Chisel this can be defined as:

def delays[T <: Data](x: T, n: Int): List[T] =
  if (n <= 1) List(x) else x :: delays(RegNext(x), n-1)

def FIR[T <: Data with Num[T]](ws: Seq[T], x: T): T =
  (ws, delays(x, ws.length)).zipped.
    map( _ * _ ).reduce( _ + _ )

where delays creates a list of incrementally increasing delays of its input and reduce constructs a reduction circuit given a binary combiner function f. In this case, reduce creates a summation circuit. Finally, the FIR function is constrained to work on inputs of type Num where Chisel multiplication and addition are defined.

15.2 Parameterized Classes

Like parameterized functions, we can also parameterize classes to make them more reusable. For instance, we can generalize the Filter class to use any kind of link. We do so by parameterizing the FilterIO class and defining the constructor to take a link type argument as follows:

class FilterIO[T <: Data](gen: T) extends Bundle {
  val x = gen.asInput.flip
  val y = gen.asOutput
}

We can now define Filter by defining a module class that also takes a link type argument and passes it through to the FilterIO interface constructor:

class Filter[T <: Data](gen: T) extends Module {
  val io = new FilterIO(gen)
  ...

}

We can now define a PLink based Filter as follows:

val f = Module(new Filter(new PLink()))

where the new PLink() argument supplies the link type.

A generic FIFO could be defined as shown in Figure 4 and used as follows:

class DataBundle extends Bundle {
  val A = UInt(width = 32)
  val B = UInt(width = 32)
}

object FifoDemo {
  def apply () = Module(new Fifo(new DataBundle, 32))
}

class Fifo[T <: Data] (gen: T, n: Int)
    extends Module {
  val io = new Bundle {
    val enq_val = Bool(INPUT)
    val enq_rdy = Bool(OUTPUT)
    val deq_val = Bool(OUTPUT)
    val deq_rdy = Bool(INPUT)
    val enq_dat = gen.asInput
    val deq_dat = gen.asOutput
  }
  val enq_ptr = Reg(init = UInt(0, sizeof(n)))
  val deq_ptr = Reg(init = UInt(0, sizeof(n)))
  val is_full = Reg(init = Bool(false))
  val do_enq = io.enq_rdy && io.enq_val
  val do_deq = io.deq_rdy && io.deq_val
  val is_empty = !is_full && (enq_ptr === deq_ptr)
  val deq_ptr_inc = deq_ptr + UInt(1)
  val enq_ptr_inc = enq_ptr + UInt(1)
  val is_full_next =
    Mux(do_enq && ~do_deq && (enq_ptr_inc === deq_ptr),
        Bool(true),
        Mux(do_deq && is_full, Bool(false), is_full))
  enq_ptr := Mux(do_enq, enq_ptr_inc, enq_ptr)
  deq_ptr := Mux(do_deq, deq_ptr_inc, deq_ptr)
  is_full := is_full_next
  val ram = Mem(gen, n)
  when (do_enq) {
    ram(enq_ptr) := io.enq_dat
  }
  io.enq_rdy := !is_full
  io.deq_val := !is_empty
  ram(deq_ptr) <> io.deq_dat
}

Figure 4: Parameterized FIFO example.

It is also possible to define a generic decoupled interface:

class DecoupledIO[T <: Data](data: T)
    extends Bundle {
  val ready = Bool(INPUT)
  val valid = Bool(OUTPUT)
  val bits = data.clone.asOutput
}

This template can then be used to add a handshaking protocol to any set of signals:

class DecoupledDemo
    extends DecoupledIO( new DataBundle )

The FIFO interface in Figure 4 can now be simplified as follows:

class Fifo[T <: Data] (data: T, n: Int)
    extends Module {
  val io = new Bundle {
    val enq = new DecoupledIO( data ).flip()
    val deq = new DecoupledIO( data )
  }
  ...
}

16 Multiple Clock Domains

Chisel 2.0 introduces support for multiple clock domains.

16.1 Creating Clock Domains

In order to use multiple clock domains, users must create multiple clocks. In Chisel, clocks are first class nodes created with a reset signal parameter and defined as follows:

class Clock (reset: Bool) extends Node {
  def reset: Bool // returns reset pin
}

In Chisel there is a builtin implicit clock that state elements use by default:

var implicitClock = new Clock( implicitReset )

The clock for state elements and modules can be defined using an additional named parameter called clock:

Reg(... clock: Clock = implicitClock)
Mem(... clock: Clock = implicitClock)
Module(... clock: Clock = implicitClock)

16.2 Crossing Clock Domains

There are two ways that circuits can be defined to send data between clock domains. The first and most primitive way is by using a synchronizer circuit comprised of two registers as follows:

// signalA is in clock domain clockA,
// want a version in clockB as signalB
val s1 = Reg(init = UInt(0), clock = clockB)
val s2 = Reg(init = UInt(0), clock = clockB)
s1 := signalA
s2 := s1
signalB := s2

Due to metastability issues, this technique is limited to communicating one-bit data between domains.

The second and more general way to send data between domains is by using an asynchronous fifo:

class AsyncFifo[T<:Data]
    (gen: T, entries: Int, enq_clk: Clock, deq_clock: Clock)
    extends Module

We can then get a version of signalA from clock domain clockA in clock domain clockB by specifying the standard fifo parameters and the two clocks and then using the standard decoupled ready/valid signals:

val fifo =
  Module(new AsyncFifo(UInt(width = 32), 2, clockA, clockB))
fifo.io.enq.bits := signalA
signalB := fifo.io.deq.bits
fifo.io.enq.valid := condA
fifo.io.deq.ready := condB
...

16.3 Backend Specific Multiple Clock Domains

Each Chisel backend requires the user to set up and control multiple clocks in a backend specific manner. For the purposes of showing how to drive a multi clock design, consider the example of hardware with two modules communicating using an AsyncFifo with each module on separate clocks: fastClock and slowClock.

16.3.1 C++

In the C++ backend, for every clock i there is a

• size_t clk.len field representing the clock i’s period,
• clock_lo_i and clock_hi_i,
• int reset() function which ensures that all clock_lo and clock_hi functions are called at least once, and
• int clock(reset) function which computes min delta, invokes appropriate clock_lo and clock_hi’s and returns min delta used.

In order to set up a C++ simulation, the user

• initializes all period fields to desired period,
• initializes all count fields to desired phase,
• calls reset and then
• repeatedly calls clock to step the simulation.

The following is a C++ example of a main function for the slowClock / fastClock example:

int main(int argc, char** argv) {
  ClkDomainTest_t dut;
  dut.init(1);
  dut.clk = 2;
  dut.clk_cnt = 1;
  dut.fastClock = 4;
  dut.fastClock_cnt = 0;
  dut.slowClock = 6;
  dut.slowClock_cnt = 0;
  for (int i = 0; i < 12; i ++)
    dut.reset();
  for (int i = 0; i < 96; i ++)
    dut.clock(LIT<1>(0));
}

16.3.2 Verilog

In Verilog,

• Chisel creates a new port for each clock / reset,
• Chisel wires all the clocks to the top module, and
• the user must create an always block clock driver for every clock i.

The following is a Verilog example of a top level harness to drive the slowClock / fastClock example circuit:

module emulator;
  reg fastClock = 0, slowClock = 0,
      resetFast = 1, resetSlow = 1;
  wire [31:0] add, mul, test;
  always #2 fastClock = ~fastClock;
  always #4 slowClock = ~slowClock;
  initial begin
    #8
    resetFast = 0;
    resetSlow = 0;
    #400
    $finish;
  end
  ClkDomainTest dut (
    .fastClock(fastClock),
    .slowClock(slowClock),
    .io_resetFast(resetFast),
    .io_resetSlow(resetSlow),
    .io_add(add), .io_mul(mul), .io_test(test));
endmodule

See http://www.asic-world.com/verilog/verifaq2.html for more information about simulating clocks in Verilog.
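
The stepping behaviour that Section 16.3.1 attributes to the C++ backend's clock(reset) function (compute the minimum delta, invoke the clocks that are due, return the delta used) can be modelled in a few lines of plain Scala. This is only a sketch of that description under stated assumptions, not the generated code: ClockDomain, step, and the edges counter are hypothetical names, each clock is assumed to carry a period and a countdown to its next edge, and the periods 4 and 6 are borrowed from the fastClock / slowClock example.

```scala
object MultiClockSketch {
  // One clock domain: a fixed period and a countdown to its next edge.
  // (Hypothetical model; not the data layout of the real emulator.)
  final class ClockDomain(val name: String, val period: Int, var count: Int) {
    var edges: Int = 0 // number of edges fired so far
  }

  // One simulation step: advance time by the smallest pending countdown,
  // fire every domain whose countdown reaches zero, reload it with its
  // period, and return the time delta that was used.
  // Requires at least one domain.
  def step(domains: Seq[ClockDomain]): Int = {
    val delta = domains.map(_.count).min
    domains.foreach { d =>
      d.count -= delta
      if (d.count == 0) {
        d.edges += 1
        d.count = d.period
      }
    }
    delta
  }

  def main(args: Array[String]): Unit = {
    val fast = new ClockDomain("fastClock", period = 4, count = 0)
    val slow = new ClockDomain("slowClock", period = 6, count = 0)
    val deltas = (1 to 5).map(_ => step(Seq(fast, slow)))
    println(s"deltas=$deltas fast=${fast.edges} slow=${slow.edges}")
  }
}
```

Because each step jumps straight to the nearest pending edge, clocks with unrelated periods stay correctly interleaved without simulating every intermediate time unit.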

17 Acknowledgements

Many people have helped out in the design of Chisel, and we thank them for their patience, bravery, and belief in a better way. Many Berkeley EECS students in the Isis group gave weekly feedback as the design evolved, including but not limited to Yunsup Lee, Andrew Waterman, Scott Beamer, and Chris Celio. Yunsup Lee gave us feedback in response to the first RISC-V implementation, called TrainWreck, translated from Verilog to Chisel. Andrew Waterman and Yunsup Lee helped us get our Verilog backend up and running and Chisel TrainWreck running on an FPGA. Brian Richards was the first actual Chisel user, first translating (with Huy Vo) John Hauser’s FPU Verilog code to Chisel, and later implementing generic memory blocks. Brian gave many invaluable comments on the design and brought a vast experience in hardware design and design tools. Chris Batten shared his fast multiword C++ template library that inspired our fast emulation library. Huy Vo became our undergraduate research assistant and was the first to actually assist in the Chisel implementation. We appreciate all the EECS students who participated in the Chisel bootcamp and proposed and worked on hardware design projects, all of which pushed the Chisel envelope. We appreciate the work that James Martin and Alex Williams did in writing and translating network and memory controllers and non-blocking caches. Finally, Chisel’s functional programming and bit-width inference ideas were inspired by earlier work on a hardware description language called Gel [2] designed in collaboration with Dany Qumsiyeh and Mark Tobenkin.

References

[1] Bachrach, J., Vo, H., Richards, B., Lee, Y., Waterman, A., Avižienis, R., Wawrzynek, J., Asanović, K. Chisel: Constructing Hardware in a Scala Embedded Language. In DAC ’12.
[2] Bachrach, J., Qumsiyeh, D., Tobenkin, M. Hardware Scripting in Gel. In Field-Programmable Custom Computing Machines, 2008. FCCM ’08. 16th.
