Memory Bus Encoding For Low Power: A Tutorial

Wei-Chung Cheng and Massoud Pedram
Abstract

This paper contains a tutorial on bus-encoding techniques that target low power dissipation. Three general classes of codes, i.e., algebraic, permutation-based, and probability-based, are reviewed. A new mathematical framework for unifying the power-aware algebraic coding techniques based on the notion of leader sets is also presented.

1. Introduction

Modern electronic systems face a dichotomy: they must be low power and high performance at the same time. This arises largely from their use in battery-operated portable (wearable) platforms. Even in fixed, power-rich platforms, the packaging and reliability costs associated with high power and high performance systems are forcing designers to look for ways to reduce power consumption. Power-efficient design requires reducing power dissipation in all parts of the design and during all stages of the design process subject to constraints on the system performance and quality of service (QoS). Sophisticated power-aware, high-level language compilers, dynamic power management policies, memory management, bus-encoding techniques, and hardware design tools are demanded to meet these often-conflicting design requirements. This paper focuses on the low power bus-encoding problem.

The major building blocks of a computer system include the CPU, the memory controller, the memory chips, and the communication channels dedicated to providing the means for data transfer between the CPU and the memory. These channels tend to support heavy traffic and often constitute the performance bottleneck in many systems. At the same time, the energy dissipation per memory bus access is quite high, which in turn limits the power efficiency of the overall system.

In a computer system, the bus can be an on-chip bus [20], a local bus between the CPU and the memory controller [5], or a memory bus between the memory controller (which may be on-chip or off-chip) and the memory devices. The emphasis of this paper is on low power encoding techniques for the memory bus.

A widely used class of codes is the block code, where the encoded message consists of code words w of fixed length (number of letters) forming a subset of all words of the same length. Each code word has a message part m, which can be chosen arbitrarily, and a check part c, which is used to complete the message part to a proper code word. If an error of transmission has occurred, resulting in a word u which is not a code word, it is then possible to see that there has been an error, and it may be possible to locate the original code word as the closest code word to u. For bit codes, i.e., codes where the letters are 0 and 1, the distance d(w,w') between two words w and w' is defined as the number of digit positions where the two words differ.

For any encoding scheme, the encoder and decoder functions are the inverse of one another; therefore, the encoding and decoding functions are usually discussed together in the literature. There is, however, a fundamental difference between the two functions. The encoder must consider the target objective (e.g., low transition count on the bus) and as a result exploits an optimization algorithm to generate the code words from the source words, whereas the decoder does a straightforward decoding of the code words without attention to why the encoding took place in one way or another. From a mathematical point of view, it is therefore easier to explain and analyze the behavior of the decoder rather than the encoder. This is the approach we adopted in this paper. Although we sometimes provide hints about the encoding algorithms, the emphasis is on the decoding process.

2. Low Power Coding Techniques

Generally speaking, a code C is a bijection T from a set A of letters to another set B of letters. A message M written with the letters of A is encoded by T to a coded message TM written in the letters of B, and the original message is recovered by applying T⁻¹ to TM, i.e., M = T⁻¹(TM). In a classical example, A and B consist of the same letters, and T is a permutation of them.

Low power bus codes can be classified as algebraic, permutation, or probabilistic. Algebraic codes refer to those codes that are produced by encoders that take two or more operands (e.g., the current source word, the previous source word, the previous code word, etc.) to produce the current code word using arithmetic or bit-level logic operations. Permutation codes refer to a permutation of a set of source words. Probabilistic codes are generated by encoders that examine the probability distribution of source words or pairs of source words and use this distribution to assign codes to the source words (Level Signaling) or pairs of source words (Transition Signaling). In all cases, the objective is to minimize the number of transitions when transmitting all of the source words on the bus. The overhead of the encoder/decoder circuitry is often ignored.
1 This work is sponsored in part by the DARPA PAC/C program under contract number DAAB07-00-C-L516.
In the remainder of this paper, we will use the following terminology. The original data to be encoded will be referred to as source words and denoted by si. The encoded data will be referred to as code words and denoted by ci. The index i refers to the time stamp of the words. Let S denote the set of all source words and C the set of all code words. We use ws and wc to denote the bit widths of the source and code words, respectively. In general, ws is different from wc. The encoder takes the source words as input and produces the code words based on some algorithm or policy. The decoder takes the code words as input and produces the source words.

2.1. Algebraic Codes

Although algebraic coding is well developed, it has seldom been mentioned in the low power bus encoding literature for the following reasons. Conventionally, algebraic coding is used for error correction in a noisy communication channel. Not only is power consideration absent in coding theory, but there is also a need for excessive redundancy in order to perform error detection or correction. For reduced-transition bus encoding, the temporal sequence of code words needs to be considered, and coding theory helps little in this respect.

Error-correcting codes and information entropy, two branches of applied mathematics, have been applied to computer systems. In the conventional setup of coding theory, there are source encoding and channel encoding. Source encoding reduces the data (i.e., compresses it toward the lower bound, which is the entropy), and channel encoding inserts redundancy to correct common errors generated in the noisy channel. Because the goal of coding theory is to perform error correction and its redundancy is high, it has not been used in power-aware bus encoding.

2.1.1. Example Codes

In the following, we provide a brief explanation of a number of power-aware bus-encoding techniques. Because of space limitations, in each case we only explain how the decoder works. The encoder, of course, works in a dual manner with the decoder, but its operation is based on some optimization algorithms that would require additional explanation. For details about the encoding algorithms, please refer to the appropriate references.

Bus-Invert Code [18]: The Bus-Invert encoding technique uses an extra signal (INV) to indicate the "polarity" of the data. Let the Bus-Invert code word be denoted as INV@x, where @ is the concatenation operator and x denotes either the source word or its one's complement. The Bus-Invert decoder takes the code word and produces the corresponding source word as follows. If the INV signal is set, the result is the one's complement of x; otherwise it is x.

Bus-Invert encoding is a simple, yet efficient technique for reducing transitions on a bus. In [11], it is proved that the Bus-Invert code minimizes the bus switching activity if the data is random and only one extra bit is available to the encoder. We refer to this type of added redundancy to a bus as a Bus-Invert Code in Space.
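As an illustration of the decoder rule above and of one common encoder policy (invert whenever more than half of the bus lines would otherwise toggle), consider the following Python sketch; the function names, the popcount helper, and the 8-bit round-trip are our own choices rather than details from [18].

```python
def popcount(x: int) -> int:
    """Number of 1 bits in x."""
    return bin(x).count("1")

def bus_invert_encode(s: int, prev_bus: int, width: int) -> tuple[int, int]:
    """Return (INV, x): send the one's complement of s when that causes
    fewer transitions with respect to the previous value on the bus."""
    mask = (1 << width) - 1
    if popcount((s ^ prev_bus) & mask) > width // 2:
        return 1, (~s) & mask
    return 0, s & mask

def bus_invert_decode(inv: int, x: int, width: int) -> int:
    """Decoder from the text: complement x when INV is set, else pass x."""
    mask = (1 << width) - 1
    return (~x) & mask if inv else x & mask

# Round trip on an 8-bit bus.
prev_bus = 0x00
for s in (0x0F, 0xFE, 0x01):
    inv, x = bus_invert_encode(s, prev_bus, 8)
    assert bus_invert_decode(inv, x, 8) == s
    prev_bus = x
```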
Partial Bus-Invert Code [17]: The Partial Bus-Invert encoding technique breaks the source word into two parts u@v. Let the Partial Bus-Invert code word be denoted as INV@x@y, where x is the same as u and y denotes v or its one's complement. The Partial Bus-Invert decoder takes the code word and produces the corresponding source word as follows. If the INV signal is set, the result is x concatenated with the one's complement of y; otherwise it is x@y.

Partial Bus-Invert encoding is effective when certain bits of the source words in the data stream exhibit strong spatio-temporal correlations. The key idea is to identify such bits, group them together, and then apply the Bus-Invert coding technique only to these bits. Given the input data stream, the problem of determining this bit grouping to minimize the expected switching activity on the bus using the Partial Bus-Invert encoding is proven to be NP-complete [17].

Interleaving Partial Bus-Invert Code [21]: This code is similar to the Partial Bus-Invert Code, with the difference that the bit-width and the position of the x and y bits are dynamically changed. In [21], Field Programmable Gate Array (FPGA) devices are used to dynamically change the x and y bit groupings. A heuristic off-line algorithm was proposed to divide a given input stream into several sub-sequences while considering the runtime and power cost overhead of reconfiguring the encoder circuit. Note that FPGA devices are not necessary to realize this scheme. By adding extra redundant bits, we can achieve the same effect of changing the x and y bit partitions at runtime even when using a fixed encoding function.

M-bit Bus-Invert Code: The M-bit Bus-Invert encoding technique breaks the source word into M parts u1@u2@...@uM. Let the M-bit Bus-Invert code word be denoted as INV1@INV2@...@INVM@x1@x2@...@xM, where xi is ui or its one's complement. The M-bit Bus-Invert decoder takes the code word and produces the corresponding source word as follows. If the INVi signal is set, the result is the one's complement of xi; otherwise it is xi.

This is an obvious generalization of the Partial Bus-Invert code to the case where there is more than one group of correlated bits; a separate Bus-Invert signal is used for each group to allow independent control of the encoded value of each group of bits. Note that the M-bit Bus-Invert reduces the number of transitions on the data bits. However, there is an extra cost associated with the transitions on the redundant INV bits, which may offset the reduction in the switching activity of the original set of bits.
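The following is a hypothetical sketch of the M-bit Bus-Invert decoder, assuming equal-width groups and a packing in which the M INV bits precede the M data groups; this field layout and the function name are illustrative assumptions, not specified in the text.

```python
def m_bit_bus_invert_decode(code: int, m: int, group_width: int) -> int:
    """Decode an M-bit Bus-Invert code word.

    Layout assumed here: the top m bits are INV1..INVm and the low
    m*group_width bits are x1..xm (x1 in the most significant group).
    Each group xi is complemented when its INVi bit is set.
    """
    gmask = (1 << group_width) - 1
    data = code & ((1 << (m * group_width)) - 1)
    inv_bits = code >> (m * group_width)
    source = 0
    for i in range(m):                       # i = 0 is the most significant group
        shift = (m - 1 - i) * group_width
        xi = (data >> shift) & gmask
        if (inv_bits >> (m - 1 - i)) & 1:    # INVi set -> one's complement
            xi = (~xi) & gmask
        source = (source << group_width) | xi
    return source

# Example: two 4-bit groups, only the second group inverted.
code_word = (0b01 << 8) | (0xA << 4) | 0x5   # INV = 01, x1 = 0xA, x2 = 0x5
assert m_bit_bus_invert_decode(code_word, m=2, group_width=4) == 0xAA
```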
Bus-Invert Code in Time: This is a coding scheme in which the decoder decodes the last N code words based on an INVERT word received as the (N+1)-st code word; that is, the i-th source word is the Bus-Invert of the i-th code word based on the i-th bit of the INVERT word.
This technique assumes that the input data is transmitted in packets and will be decoded as a fixed-length block of source words in one shot. The redundant Bus-Invert bits for the words in a word block are added at one time as an additional word that follows the block. Thus, redundancy is added in time.

Two-dimensional Code [19]: This is a hybrid of the Bus-Invert codes in space and time.

The Two-dimensional code uses both spatial and temporal redundancy to enlarge the encoding space (i.e., the set E). The degree of spatial redundancy should be limited or else the increased bus width increases the implementation cost and the switching activity. Similarly, temporal redundancy imposes an extra bus cycle for every N bus cycles and, therefore, reduces the system performance. In [19], a signal modulation scheme is proposed to hide the extra cycle. However, the signal modulation scheme requires a dramatic change in the implementation of the physical layer.

Transition Signaling Code: The Transition Signaling decoder produces the source word by doing an exclusive-OR operation on the current code word and the previous source word, i.e., si = ci ⊕ si-1, where ⊕ denotes the XOR operation.

This code does not introduce any redundancy to the bus (either in time or in space). However, it requires remembering the previous source word, which implies the need for a memory element in the encoder and decoder circuits.
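A minimal sketch of this codec follows; the stateful class that holds the one-word memory on each side is our own packaging of the rule si = ci ⊕ si-1.

```python
class TransitionSignalingCodec:
    """Encoder sends ci = si XOR s(i-1); decoder recovers si = ci XOR s(i-1).
    Both sides keep the previous source word in a register (here an attribute)."""

    def __init__(self, width: int):
        self.mask = (1 << width) - 1
        self.prev_source_enc = 0   # encoder-side memory element
        self.prev_source_dec = 0   # decoder-side memory element

    def encode(self, s: int) -> int:
        c = (s ^ self.prev_source_enc) & self.mask
        self.prev_source_enc = s & self.mask
        return c

    def decode(self, c: int) -> int:
        s = (c ^ self.prev_source_dec) & self.mask
        self.prev_source_dec = s
        return s

codec = TransitionSignalingCodec(width=8)
stream = [0x10, 0x11, 0x11, 0x12]
assert [codec.decode(codec.encode(s)) for s in stream] == stream
```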
Offset Code [9]: The Offset code is similar to the Transition Signaling code except that the Offset decoder uses arithmetic addition (in two's complement) of the previous source word and the current code word, i.e., si = ci + si-1.

The Offset code tends to reduce the dynamic range of the values transmitted on the bus, and hence it can reduce the number of bits that switch from one bus value to the next. In the extreme case of sequential data access with a fixed stride value, only the stride value is transmitted on the bus, i.e., we arrive at the T0 code.

T0 Code [2]: The T0 encoding technique uses an extra signal (INC) to indicate the sequentiality of the data. Let the T0 code word be denoted as INC@x, where x denotes either the current source word or a "don't-care" word. The T0 decoder takes the code word and produces the corresponding source word as follows. If the INC signal is set, the result is the previous source word plus a stride value S (si = si-1 + S); otherwise it is x (si = x).

Notice that for the increment function the stride value is positive, whereas for the decrement function it is negative. The stride value may be fixed and known by the encoder/decoder a priori, or it may be sent on the bus as explained next. When INC is set to 1, the value on the bus is a don't-care. Normally this value is set to the previous word on the bus to minimize the bus activity. However, it may be replaced with the stride value itself. In this way, we can have variable stride values, which will reduce the number of times that the INC signal is set to zero. However, changing the stride value on the fly will cause switching activity on the bus. One can use one-hot coding or some other technique to minimize the activity when the stride value is changed.
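The T0 decoding rule with a fixed, a priori known stride can be sketched as follows; modeling the INC line as a separate argument and masking to a 16-bit bus width are illustrative choices.

```python
def t0_decode(inc: int, x: int, prev_source: int, stride: int, width: int) -> int:
    """T0 decoding: when INC is set, si = s(i-1) + S (wrapping at the bus width);
    otherwise the value x on the bus is the source word itself."""
    mask = (1 << width) - 1
    return (prev_source + stride) & mask if inc else x & mask

# Sequential addresses with stride 4 on a 16-bit bus: the bus lines never switch.
prev, stride = 0x1000, 4
for _ in range(3):
    prev = t0_decode(inc=1, x=0, prev_source=prev, stride=stride, width=16)
assert prev == 0x100C
```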
T0-XOR Code [9]: The T0-XOR decoder produces the source word as follows: si = ci-1 ⊕ ci ⊕ (si-1 + S).

This code is a hybrid of Transition Signaling and the T0 code. In contrast to the T0 code, which requires a redundant INC line, the T0-XOR code uses the XOR function to avoid the introduction of the redundant line to the bus.

Offset-XOR Code [9]: The Offset-XOR decoder produces the source word as follows: si = (ci-1 ⊕ ci) + si-1.

Again, there is no need to add a redundant bit because of the decorrelating effect of the Transition Signaling. In [9], other variations of T0, Offset, and Transition Signaling codes are described.
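A small sketch of the two decoders just stated follows; the 16-bit mask that emulates two's-complement wrap-around and the matching Offset-XOR encoder used in the round-trip are our own illustrative assumptions.

```python
MASK16 = (1 << 16) - 1          # assume a 16-bit bus for illustration

def t0_xor_decode(c_prev: int, c_cur: int, s_prev: int, stride: int) -> int:
    """si = c(i-1) XOR ci XOR (s(i-1) + S): when the bus value does not change,
    the decoder infers the sequential value s(i-1) + S."""
    return (c_prev ^ c_cur ^ ((s_prev + stride) & MASK16)) & MASK16

def offset_xor_decode(c_prev: int, c_cur: int, s_prev: int) -> int:
    """si = (c(i-1) XOR ci) + s(i-1): the XOR undoes the transition signaling,
    then the recovered offset is added to the previous source word."""
    return (((c_prev ^ c_cur) & MASK16) + s_prev) & MASK16

# Offset-XOR round trip with one possible encoder: ci = c(i-1) XOR (si - s(i-1)).
s_prev, c_prev = 0x2000, 0x0000
s_cur = 0x2008
c_cur = (c_prev ^ ((s_cur - s_prev) & MASK16)) & MASK16
assert offset_xor_decode(c_prev, c_cur, s_prev) == s_cur
```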
Dual Mode Code [1]: The Dual Mode encoding technique classifies the source word as either address or data. For the data source words it uses the Bus-Invert code or any of its variants, whereas for the address source words it uses the T0 code or any of its variants.

This encoding technique is an example of algorithms that examine the nature of the transmitted words on the bus (in this case, whether the word is an address or data).

Limited-weight Code [18]: A k-limited-weight code is a code having at most k ones per word. This can be achieved by adding appropriate redundant lines.

These codes are useful in conjunction with transition signaling. Thus, a k-limited-weight code would guarantee at most k transitions per bus cycle.

Working-Zone Code [14]: The Working-Zone encoding technique generates a code word as PRESENT@IDEN@OFFSET, where the PRESENT bit denotes a hit or miss of the working-zone, IDEN denotes the identifier of the current working-zone, and OFFSET denotes the one-hot coded offset value within that zone. The Working-Zone decoder takes the code word and produces the corresponding source word as follows. If PRESENT is set to one, then the decoder produces zone(IDEN)+offset, where zone(IDEN) is obtained from a look-up table relating the starting addresses of the working-zones to their identifiers, and offset is the decoded value of the OFFSET. Otherwise, the decoder outputs the current code word.

This encoding scheme attempts to exploit the locality of reference that is usually present in software programs. The proposed encoding technique partitions the address space into working-zones whose starting addresses are stored in a number of registers. A bit is used to denote a hit or a miss of the working-zone. When there is a miss, the full address is transmitted on the bus; otherwise, the bus is used to transmit the offset, which is one-hot coded. Additional lines are used to transmit the identifier of the working-zone. A miss of the working-zone means that either the working-zone starting address is not stored in the registers or the offset is too large to be one-hot coded. For the case in which the number of zones is larger than the number of registers, a replacement policy is implemented. In [14], it is suggested to use Transition Signaling on the offset to further reduce the number of transitions on that portion of the bus.
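A behavioral sketch of this decoder follows; the zone-register table, the one-hot offset width, and the way the code-word fields are passed in are illustrative assumptions rather than details taken from [14].

```python
def one_hot_to_offset(one_hot: int) -> int:
    """Decode a one-hot OFFSET field into its integer value (the bit position)."""
    return one_hot.bit_length() - 1

def working_zone_decode(present: int, iden: int, offset_field: int,
                        zone_table: dict[int, int], full_word: int) -> int:
    """If PRESENT is set, return zone start + decoded offset;
    otherwise the code word carries the full (missed) address."""
    if present:
        return zone_table[iden] + one_hot_to_offset(offset_field)
    return full_word

# Two working zones held in decoder-side registers.
zones = {0: 0x4000, 1: 0x8000}
assert working_zone_decode(1, 1, 1 << 5, zones, 0) == 0x8005
assert working_zone_decode(0, 0, 0, zones, 0x1234) == 0x1234   # miss: full address
```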
Codebook-based Code [10]: The Codebook-based encoding technique uses a code word with two parts ID@x, where ID denotes the index of a pattern stored in a codebook, and x denotes either the source word or its XOR with a pattern in the codebook. The decoder produces the source word as follows: si = ci ⊕ pattern(ID), where pattern(ID) refers to the codebook pattern corresponding to ID.

The Codebook-based code can be thought of as a generalized version of the Bus-Invert code. The codebook contains the set of patterns and their corresponding IDs. The patterns are chosen so that the average Hamming distance between a source word and the "best" pattern in the codebook is minimized. The encoder compares each source word with all of the patterns in the codebook to find the pattern that has the minimum Hamming distance from the source word and then produces the code word as the concatenation of the ID bits and the XOR of that pattern and the source word. Both the encoder and the decoder know the codebook, which can be initialized offline and/or updated online.
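An illustrative sketch with a tiny fixed codebook follows; the helper names are ours, and the online update of the codebook mentioned above is omitted.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two equal-width words."""
    return bin(a ^ b).count("1")

def codebook_encode(s: int, codebook: list[int]) -> tuple[int, int]:
    """Pick the pattern with minimum Hamming distance to s and send
    (ID, s XOR pattern)."""
    best_id = min(range(len(codebook)), key=lambda i: hamming(s, codebook[i]))
    return best_id, s ^ codebook[best_id]

def codebook_decode(pattern_id: int, x: int, codebook: list[int]) -> int:
    """Recover the source word by XOR-ing the pattern selected by ID back in."""
    return x ^ codebook[pattern_id]

book = [0x00, 0xFF, 0xF0, 0x0F]          # patterns known to both sides
for s in (0x07, 0xF8, 0x3C):
    pid, x = codebook_encode(s, book)
    assert codebook_decode(pid, x, book) == s
```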
2.1.2. A Unifying Framework

We will define a code based on algebraic principles.

Group: A group (S, ∗) is a set S together with a binary operation ∗ defined on S for which the following properties hold:
(1) Closure: For all a, b ∈ S, a ∗ b ∈ S.
(2) Identity: There is an element e ∈ S such that e ∗ a = a ∗ e = a for all a ∈ S.
(3) Associativity: For all a, b, c ∈ S, we have (a ∗ b) ∗ c = a ∗ (b ∗ c).
(4) Inversion: For each a ∈ S, there exists a unique element b ∈ S such that a ∗ b = b ∗ a = e.

Code: A code f is a mapping from a set of symbols S (alphabet) to a set of binary numbers C. If every element in C has the same length, then f is a block code. Otherwise, f is a variable-length code (e.g., Huffman code).

Given the source word set S and the code word set C, one can define any mapping from S to C as a code. The purpose of encoding is to detect or correct errors generated by a noisy communication channel or, in our case, to reduce the transition count. The effectiveness of encoding depends on the choice of C and f.

Nearest Neighbor Decoding Algorithm: We consider a class of codes where the decoding function is based on a Nearest Neighbor Decoding (NND) algorithm. For this class of codes, the code words (including correctly and incorrectly received code words) are partitioned into groups where each group represents one source word (i.e., the correct intended word). A leader represents each group. After the decoder receives a code word, it calculates the distance between this word and each of the leaders by an XOR operation. The leader with the minimum distance is selected, and the represented source word is recognized.

Let L ⊂ C denote a leader set containing some special elements l. The decoding process may be expressed by the following equation: ci ∗ l = si, where ci, l ∈ C, si ∈ S, and ∗ is a binary operation over E. Note that if there exists more than one element in the leader set, the decoder decides which leader element is to be used during the decoding based on the NND algorithm. However, in many cases, the decoder is told which leader should be used by explicitly sending the information about the leader with the data.

Example 1: Consider a 4-bit bus with Bus-Invert encoding. In this case, |C| = 2^5, |S| = 2^4, L = {00000, 11111}, ∗ is the bitwise Exclusive-OR (XOR, ⊕) operation, ci = 00000, and si = ci ∗ 11111. During the decoding process, we use l = 11111 when INV = 1; otherwise we use l = 00000.

Using this terminology and notation, the formal statement of the Bus-Invert decoding is: si = ci ⊕ l_INV, where l_0 = 00...0 and l_1 = 11...1.
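The table-lookup form of this decoding rule, together with the nearest-neighbor selection used when no leader information is sent explicitly, can be sketched as follows; the function names and the 4-bit examples are our own.

```python
def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def leader_decode_explicit(c: int, inv: int, width: int) -> int:
    """Bus-Invert as a leader-set code: si = ci XOR l_INV with
    l_0 = 00...0 and l_1 = 11...1 (as in Example 1)."""
    leaders = {0: 0, 1: (1 << width) - 1}
    return c ^ leaders[inv]

def leader_decode_nnd(c: int, leaders: list[int]) -> int:
    """Nearest Neighbor Decoding: pick the leader closest (in Hamming distance)
    to the received word, then apply the group operation (XOR here)."""
    l = min(leaders, key=lambda ld: hamming(c, ld))
    return c ^ l

assert leader_decode_explicit(0b0000, inv=1, width=4) == 0b1111
assert leader_decode_nnd(0b1110, [0b0000, 0b1111]) == 0b0001
```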
The binary operation ∗ is used to encode and decode the code words. There are two types of operations in use:

a) XOR (⊕): Examples include the Bus-Invert and Transition Signaling codes.

b) Binary addition (+): Examples include the Working-Zone and T0 codes.

Note that the number of ones in the XOR of two operands equals the Hamming distance between them, and that binary addition requires an inverse operation, i.e., binary subtraction. The XOR and binary addition/subtraction operations give rise to a group of codes that are easy to implement in practice.

Two parameters are used to characterize the leader set: size and scope.

Size

a) |L| = 1: For example, the Transition Signaling code implicitly uses the previous code word as the leader without introducing any redundant bits to the bus.

b) |L| = 2: It is necessary to add one extra bit to the bus. For example, the Bus-Invert code uses the leader set {00...0, 11...1}. Notice that the two leaders are complementary, which is, in general, not required (e.g., Partial Bus-Invert code).

c) |L| ≥ 3: For example, the M-bit Bus-Invert and Working-Zone codes use three or more leaders.

Scope

We define the scope of a leader as a window of size W where all of the source words in that window are decoded using the leader. Consider the following cases:

a) W = ∞: Examples are the Bus-Invert and M-bit Bus-Invert codes. Note that the leader set L is fixed.

b) W = t: Examples are the Working-Zone and Dual Mode codes. Note that L may be fixed or adaptively changed as t itself may be changing. For example, with Codebook and Adaptive coding, L is changed on the fly.
c) W = 1: Examples are the Transition Signaling and T0 codes. Note that L = {previous code word} for the Transition Signaling code, and L = {previous source word} for the T0 code.

Comparing the sizes of E and A, we discuss the following three cases:

a) |E| = |A|: The code word has no redundant bits; that is, if N bits are required to uniquely describe all elements of A, then all elements of E use the same number of bits. If E = A, then the coding is a permutation such as the Gray or Pyramid code.

b) |E| > |A|: The code words have redundant bits. Most encoding schemes are redundant. Usually, the extra bits are used to indicate the leader. For example, the INV bit of the Bus-Invert code is used to select between the two leaders 00...0 and 11...1.

c) |E| < |A|: Multiple source words are mapped to the same code word. In this case, the different source words are assigned to different leaders for the purpose of avoiding ambiguity in the decoding. Examples include the Working-Zone and Codebook-based codes.

2.2. Permutation Codes

If redundancy is not feasible on the memory bus, the encoding function becomes a permutation, i.e., a one-to-one and onto mapping from a set S to itself. Without any knowledge of the access pattern, the only locality that can be made use of is sequentiality. The address flow of instruction segments or large arrays is a good example. For a simple bus and sequential access, Gray code is optimal because the transition count between consecutive data words is exactly one, the minimum.

Gray Code [20]: Only one bit differs between two consecutive words. Let the source word be s = bn-1 bn-2 ... b1 b0 and the code word be c = gn-1 gn-2 ... g1 g0. The encoding function from binary to Gray code is gn-1 = bn-1, gi = bi+1 ⊕ bi. The decoding function from Gray to binary code is bn-1 = gn-1, bi = bi+1 ⊕ gi.
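At the word level, these bitwise definitions collapse to the familiar shift-and-XOR forms sketched below; the prefix-XOR loop in the decoder is a standard identity rather than something introduced in [20].

```python
def binary_to_gray(b: int) -> int:
    """gn-1 = bn-1, gi = bi+1 XOR bi  <=>  g = b XOR (b >> 1)."""
    return b ^ (b >> 1)

def gray_to_binary(g: int) -> int:
    """bn-1 = gn-1, bi = bi+1 XOR gi: equivalent to XOR-ing all right shifts of g."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

# Consecutive values differ in exactly one bit after encoding.
for v in range(255):
    assert bin(binary_to_gray(v) ^ binary_to_gray(v + 1)).count("1") == 1
    assert gray_to_binary(binary_to_gray(v)) == v
```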
Dynamic RAM (DRAM) is omnipresent in computer systems because of its high density and low cost. The long access time of DRAM, which is its major shortcoming compared to Static RAM (SRAM), has been significantly improved during the past few years by developing DRAM technologies such as Page Mode, EDO, Synchronous DRAM, Rambus DRAM, and DDR DRAM [8].

Because of the physical layout and the reduction in pin count, a DRAM address bus is always multiplexed. Typically, an address is divided into a Row address and a Column address. These two addresses are transmitted on the bus one after the other and distinguished by two control signals, Row Address Strobe (RAS) and Column Address Strobe (CAS). The newer Rambus DRAM channel uses a packetized bus, which can be considered m-way multiplexed, and the address fields are interpreted according to the packet format [15].

Due to address multiplexing, the switching activity on the bus is generated in a completely different way. Hence, any one of the above encoding schemes either needs modification to work or does not work at all. We will present how this problem can be formulated and solved.

Pyramid Code [6]: Let us partition a source word x into three fields p, q, and s of widths N-1, N, and 1 bits, respectively:

| p (N-1 bits) | q (N bits) | s (1 bit) |

The Pyramid code for x is given by

M(p, q, s) = ⟨p^s, 0^s⟩        if p = q
M(p, q, s) = ⟨q + s, p⟩_s      if p > q
M(p, q, s) = ⟨q + s, p⟩_(1-s)  if p < q

where x^s denotes x when s = 0 and the complement of x when s = 1, and ⟨x, y⟩_s denotes the ordered pair (x, y) when s = 0 and (y, x) when s = 1.

It has been shown in [6] that the Pyramid code produces the minimum transition count for sequential access on a multiplexed bus. The code remains quite effective in reducing the power dissipation of the multiplexed bus even when the sequentiality of the addresses is interrupted every four addresses.

Data Ordering-based Code [12][13]: This is a coding scheme in which the encoder takes a block of N source words and produces a block of N code words, where the code words ci are a permutation π of the source words sj. Both the encoder and the decoder know the permutation function π.

For cache write-back or built-in self-test systems, changing the data flow does not affect the original semantics. Hence, a data-ordering problem can be stated to minimize the bus transitions. Given a set of data, the encoder looks for the optimal order that generates the minimum switching activity on the bus. Since the data-ordering problem is NP-complete, a bounded-error approximation algorithm was proposed in [12] to solve the offline version of this problem. To further reduce transitions, a Bus-Invert signal can be added to the Data Ordering-based code in [13].
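Because the exact ordering problem is NP-complete, the sketch below uses a simple greedy nearest-neighbor heuristic as an illustrative stand-in for the bounded-error algorithm of [12]: it repeatedly transmits the remaining word that is closest, in Hamming distance, to the word currently on the bus.

```python
def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def greedy_order(block: list[int], start_bus_value: int = 0) -> list[int]:
    """Greedy heuristic: repeatedly pick the remaining word with the smallest
    Hamming distance to the value currently on the bus."""
    remaining = list(block)
    order, current = [], start_bus_value
    while remaining:
        nxt = min(remaining, key=lambda w: hamming(current, w))
        remaining.remove(nxt)
        order.append(nxt)
        current = nxt
    return order

def transitions(seq: list[int], start: int = 0) -> int:
    """Total number of bit transitions when transmitting seq after start."""
    return sum(hamming(a, b) for a, b in zip([start] + seq, seq))

block = [0b1111_0000, 0b0000_1111, 0b1111_0001, 0b0000_0111]
assert transitions(greedy_order(block)) <= transitions(block)
```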
2.3. Probabilistic Codes

Entropy-reducing Code [11]: This term refers to a group of codes that attempt to reduce the entropy rate of the source given a fixed level of redundancy in the bus. The key idea is to compute the error between the current source word and its predicted value, followed by a coding algorithm that minimizes the transition activity. The result is then sent on the bus using the Transition Signaling technique and is decoded accordingly.

The rationale for this class of codes is that the power savings obtainable by encoding depend on the entropy rate of the incoming source data and on the amount of redundancy in the code. The higher the entropy rate, the lower the energy savings that can be achieved by encoding the source words for a specified level of redundant bits on the bus.

Beach Code [1]: The Beach encoder analyzes the word-level correlations between source words to assign codes with small Hamming distance to data words that are likely to be sent on the bus in two consecutive clock cycles.
The Beach code is a subset of the entropy-reducing codes. The Beach encoder and decoder do not, however, use decorrelator and correlator blocks.

Probability-based Codes [4]: These codes are generated based on a general codec architecture that uses encoder/decoder functions based on the current and previous values of the source and code words, and decorrelator/correlator functions that implement a Transition Signaling scheme on the bus.

These codes start with the assumption that a detailed statistical characterization of the data source is available, that is, the stationary probability distribution of all pairs of consecutive values in the input stream is known. The Exact Encoding function uses an exponential table (in the bit width of the bus) that stores all possible pairs of source words and their joint occurrence probability to assign minimum-transition-activity codes to each pair of source words (Transition Signaling). Clustered Encoding uses a spatial partitioning of the bits into groups (or clusters) of bits, which are then individually coded for minimum transition activity while considering the complete set of transition probability statistics for each cluster. Discretized Encoding accounts for temporal correlations between the M most probable source word pairs and, hence, completely accounts for the intra-word spatial correlations while ignoring some of the inter-word temporal correlations. Both encoding techniques assume a priori knowledge of the input source. Adaptive Encoding does not require a priori knowledge of the input source statistics. Instead, it operates on the basis of approximate information collected by observation of the input word stream over a window of fixed size S.
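As a toy illustration of probability-based code assignment (not the Exact, Clustered, Discretized, or Adaptive algorithms of [4]), the sketch below maps the most frequent source words to the lowest-weight code words so that, under Transition Signaling, the most frequent words cause the fewest transitions.

```python
from collections import Counter

def weight(x: int) -> int:
    return bin(x).count("1")

def build_probability_code(stream: list[int], width: int) -> dict[int, int]:
    """Assign low Hamming-weight code words to high-probability source words.
    Under transition signaling, the weight of the transmitted word equals the
    number of bus transitions it causes."""
    by_frequency = [w for w, _ in Counter(stream).most_common()]
    by_weight = sorted(range(1 << width), key=weight)
    return {src: by_weight[i] for i, src in enumerate(by_frequency)}

stream = [0xFF, 0xFF, 0xFF, 0xA5, 0xA5, 0x3C]
code = build_probability_code(stream, width=8)
assert code[0xFF] == 0x00        # most frequent word gets the all-zero code
assert weight(code[0xA5]) <= weight(code[0x3C])
```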
3. Conclusions

This paper reviewed a number of bus-encoding techniques that target low power dissipation. Three general classes of codes, i.e., algebraic, permutation-based, and probability-based, were analyzed. A new mathematical framework for unifying the power-aware algebraic coding techniques based on the notion of leader sets was also presented.

4. References

[1] L. Benini, G. De Micheli, E. Macii, M. Poncino, and S. Quer, "System-level power optimization of special purpose applications: The beach solution," Proc. of Int'l Symp. on Low Power Electronics and Design, Monterey, CA, pp. 24-29, Aug. 1997.
[2] L. Benini, G. De Micheli, E. Macii, D. Sciuto, and C. Silvano, "Asymptotic zero-transition activity encoding for address busses in low-power microprocessor-based systems," Proc. of the Seventh Great Lakes Symp. on VLSI, pp. 77-82, 1997.
[3] L. Benini, G. De Micheli, E. Macii, D. Sciuto, and C. Silvano, "Address bus encoding techniques for system-level power optimization," Proc. of Design, Automation and Test in Europe, Paris, France, pp. 861-866, Feb. 1998.
[4] L. Benini, A. Macii, E. Macii, M. Poncino, and R. Scarsi, "Synthesis of low-overhead interface for power-efficient communication over wide busses," Proc. of Design Automation Conf., pp. 128-133, 1999.
[5] N. Chang, K. Kim, and J. Cho, "Bus encoding for low-power high-performance memory systems," Proc. of Design Automation Conf., pp. 800-805, 2000.
[6] W. C. Cheng and M. Pedram, "Power-optimal encoding for DRAM address bus," Proc. of Int'l Symp. on Low Power Electronics and Design, pp. 250-252, 2000.
[7] W. C. Cheng and M. Pedram, "Low power techniques for address encoding and memory allocation," Proc. of Asia and South Pacific Design Automation Conference, Jan. 2001.
[8] V. Cuppu, B. Jacob, B. Davis, and T. Mudge, "A performance comparison of contemporary DRAM architectures," Proc. of Int'l Symp. on Computer Architecture, pp. 222-233, 1999.
[9] W. Fornaciari, M. Polentarutti, D. Sciuto, and C. Silvano, "Power optimization of system-level address buses based on software profiling," Proc. of the Eighth Int'l Workshop on Hardware/Software Codesign, pp. 29-33, 2000.
[10] S. Komatsu, M. Ikeda, and K. Asada, "Low power chip interface based on bus data encoding with adaptive code-book method," Proc. of the Ninth Great Lakes Symp. on VLSI, pp. 368-371, 1999.
[11] S. Ramprasad, N. R. Shanbhag, and I. N. Hajj, "A coding framework for low-power address and data busses," IEEE Trans. on VLSI, Vol. 7, No. 2, pp. 212-221, June 1999.
[12] R. Murgai, M. Fujita, and A. Oliveira, "Using complementation and resequencing to minimize transitions," Proc. of Design Automation Conf., pp. 694-697, 1998.
[13] R. Murgai and M. Fujita, "On reducing transition through data modifications," Proc. of Design, Automation and Test in Europe Conf. and Exhibition, pp. 82-88, 1999.
[14] E. Musoll, T. Lang, and J. Cortadella, "Exploiting the locality of memory references to reduce the address bus energy," Proc. of Int'l Symp. on Low Power Electronics and Design, Monterey, CA, pp. 202-207, Aug. 1997.
[15] Rambus Inc., "Rambus Signaling Technologies: RSL, QRSL and SerDes Technology Overview," June 2000.
[16] N. R. Shanbhag, "A mathematical basis for power-reduction in digital VLSI systems," IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 44, No. 11, pp. 935-951, Nov. 1997.
[17] Y. Shin, S. Chae, and K. Choi, "Partial bus-invert coding for power optimization of system level bus," Proc. of Int'l Symp. on Low Power Electronics and Design, pp. 127-129, 1998.
[18] M. R. Stan and W. P. Burleson, "Coding a terminated bus for low power," Proc. of Fifth Great Lakes Symp. on VLSI, pp. 70-73, 1995.
[19] M. R. Stan and W. P. Burleson, "Two-dimensional codes for low power," Proc. of Int'l Symp. on Low Power Electronics and Design, pp. 335-340, 1996.
[20] C. L. Su, C. Y. Tsui, and A. M. Despain, "Saving power in the control path of embedded processors," IEEE Design and Test of Computers, Vol. 11, No. 4, pp. 24-30, 1994.
[21] S. Yoo and K. Choi, "Interleaving partial bus-invert coding for low power reconfiguration of FPGAs," Proc. of the Sixth Int'l Conf. on VLSI and CAD, pp. 549-552, 1999.