RaPiD - Reconfigurable Pipelined Datapath
Carl Ebeling, Darren C. Cronquist, and Paul Franklin
Department of Computer Science and Engineering
University of Washington
Box 352350
Seattle, WA 98195-2350
1 Introduction
Configurable computing promises to deliver the high performance required by
computationally demanding applications while providing the flexibility and adapt-
ability of programmed processors. As such, configurable computing platforms lie
somewhere between ASIC solutions, which provide the highest performance/cost
at the expense of flexibility and adaptability, and programmable processors,
which provide the greatest flexibility at the expense of performance/cost. Un-
fortunately the promise of configurable computing has yet to be realized in spite
of some very successful examples[1, 5]. There are two main reasons for this.
First, configurable computing platforms are currently implemented using com-
mercial FPGAs, which are very efficient for implementing random logic functions
but much less so for general arithmetic functions. Building a multiplier using an
FPGA incurs a performance/cost penalty of at least 100. Second, current config-
urable platforms are extremely hard to program[5, 6]. Taking an application from
concept to a high-performance implementation is a time-consuming, designer-
intensive task. The dream of automatic compilation from high-level specification
to a fast and efficient implementation is still unattainable.
* This paper appeared in FPL '96: The 6th International Workshop on Field-
Programmable Logic and Applications, pages 126-135. Springer-Verlag, 1996.
† This work was supported in part by the Defense Advanced Research Projects Agency
under Contract DAAH04-94-G0272. D. Cronquist was supported in part by an IBM
fellowship. P. Franklin was supported by an NSF fellowship.
The RaPiD architecture takes aim at these two problems in the context
of computationally demanding tasks such as those found in signal processing
applications. RaPiD is a coarse-grained FPGA architecture that allows deeply
pipelined computational datapaths to be constructed dynamically from a mix
of ALUs, multipliers, registers and local memories. The goal of RaPiD is to
compile regular computations like those found in DSP applications into both
an application-specific datapath and the program for controlling that datapath.
The datapath is controlled using a combination of static and dynamic control
signals. The static control determines the underlying structure of the datapath
that remains constant for a particular application. The dynamic control signals
can change from cycle to cycle and specify the variable operations performed
and the data to be used by those operations. The static control signals are
generated by static RAM cells that are changed only between applications while
the dynamic control is provided by a control program.
The structure of the datapaths constructed in RaPiD is biased strongly to-
wards linear arrays of functional units communicating in mostly a nearest neigh-
bor fashion. Systolic arrays[2], for example, map very well into RaPiD datapaths,
which allows the considerable amount of research on compiling to systolic ar-
rays to be applied to compiling computations to RaPiD[4, 3]. RaPiD is not
limited to implementing systolic arrays, however. For example, a pipeline can be
constructed which comprises different computations at different stages and at
different times.
The computational bandwidth provided by a RaPiD array is extremely high
and scales with the size of the array. The input and output data bandwidth,
however, is limited to the data memory bandwidth which does not scale. Thus
the amount of computation performed per I/O operation bounds the amount of
parallelism and thus the speedup an application can exhibit when implemented
using RaPiD. The RaPiD architecture assumes that at most three memory ac-
cesses are made per cycle. Providing even this much bandwidth requires a very
high-performance memory architecture.
RaPiD is also not suited for tasks that are unstructured, not highly repetitive,
or whose control flow depends strongly on the data. The assumption is that
RaPiD will be integrated closely with a RISC engine on the same chip. The
RISC would control the overall computational flow, farming out to RaPiD the
heavy-duty computations that require brute force.
The concept of RaPiD can in theory be extended to 2-D arrays of functional
units. However, dynamically configuring 2-D arrays is much more difficult, and
the underlying communication structure is much more costly. Since most 2-D
computations can be computed efficiently using a linear array, RaPiD is currently
restricted to linear arrays.
The paper begins with a description of the datapath architecture and how
computations are configured. This is followed by a description of the way dy-
namic control signals are generated. Next, a FIR filter example is used to illus-
trate how computations are mapped to the RaPiD architecture. The paper ends
with a discussion of the performance of RaPiD-I and future work.
2 RaPiD Architecture
This section describes the version of the RaPiD architecture, called RaPiD-I,
which is currently being implemented at the University of Washington. Variants
of this architecture with a different data width and data format, different func-
tional units, different number and configuration of busses, and so on, could be
defined for different application domains. The RaPiD-I architecture contains all
the salient features of RaPiD and will allow us to describe RaPiD computations
for a variety of applications.
[Figure 1: a cell's RAMs, ALUs, multiplier, and registers arranged over segmented busses, with bus connectors between adjacent bus segments.]
Fig. 1. The basic cell of RaPiD-I. This cell is replicated left to right to form a
complete RaPiD array.
RaPiD-I is a linear array of functional units which can be configured to form
a (mostly) linear computational pipeline. This array of functional units is divided
into identical cells which are replicated to form a complete array. One cell for
RaPiD-I is shown in Figure 1. This cell comprises an integer multiplier, two
integer ALUs, six general-purpose registers and three small local memories. The
complete RaPiD-I array contains 16 of these cells. Although the array is divided
into cells, this division is invisible when it comes to mapping an application to
the functional units and busses.
The functional units are interconnected using a set of ten segmented busses
that run the length of the datapath. Each input of the functional units is at-
tached to a multiplexer which is configured to select one of eight busses. Each
output of the functional units is attached to a demultiplexer comprised of tristate
drivers, each driving one of eight busses. Each output driver can be configured
independently, which allows an output to fan out to several busses, or none at
all if the functional unit is not being used. The assignment of operations to func-
tional units must be done so there is a bus segment available to connect units
that communicate.
The busses in different tracks are segmented into different lengths so that
bus tracks are used efficiently. In several tracks, adjacent bus segments can be
connected together using either a buffer or a register. This bus connector is
shown in Figure 2 and is represented in Figure 1 as a pair of lines between bus
segments. The connection is active and can drive in either direction but not both
at once. Many of the registers in a pipelined computation can be implemented
using these bus pipeline registers. In theory, all the bus segments in one track
could be connected together by bus connectors configured as bypass buffers to
provide a broadcast signal the length of the array. In practice, the delay is much
too long and all signals are pipelined to some degree.
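For concreteness, a bus connector's behavior can be sketched in a few lines of Python. This is our illustrative model, not the hardware description; the configuration names are ours.

    # Sketch (ours) of a bus connector joining two adjacent bus segments.
    class BusConnector:
        def __init__(self, left_to_right=True, registered=True):
            self.left_to_right = left_to_right  # static: drive direction
            self.registered = registered        # static: register vs. bypass buffer
            self.reg = 0                        # pipeline register contents

        def step(self, left_seg, right_seg):
            """One clock cycle: return the value driven onto the output segment."""
            src = left_seg if self.left_to_right else right_seg
            out = self.reg if self.registered else src  # bypass is combinational
            self.reg = src
            return out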
[Figure 2: a connector between a left and a right bus segment, configurable as left-to-right or right-to-left, through either a register or a bypass buffer.]
Fig. 2. Bus connectors can be used to connect adjacent bus segments via a buffer
or a register.
Functional unit outputs are registered, although this output register can
be bypassed via configuration control. Functional units may additionally be
pipelined internally depending on their complexity. These pipeline registers can
also be bypassed if appropriate.
RaPiD-I operates on 16-bit signed or unsigned fixed-point data, whose fixed-
point format is maintained via shifters in the multipliers. Different fixed-point
representations can be used in the same application by appropriately configuring
the different shifters in the datapath. An extra tag bit is associated with each
data value to indicate whether an overflow has occurred. Once set, the overflow
tag is propagated to all results. The datapath thus generates no exceptions during
operation, but incorporates them into the data produced.
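The tagged-overflow discipline can be illustrated with a short sketch. This is our own encoding of a value as a (value, tag) pair, for illustration only:

    # Illustrative tagged 16-bit add (encoding ours): any out-of-range result
    # sets the tag, and a set tag sticks to every downstream result.
    MIN16, MAX16 = -(1 << 15), (1 << 15) - 1

    def wrap16(x):
        """Reinterpret the low 16 bits as a signed value."""
        x &= 0xFFFF
        return x - 0x10000 if x & 0x8000 else x

    def tagged_add(a, b):
        (av, a_ovf), (bv, b_ovf) = a, b
        s = av + bv
        overflow = a_ovf or b_ovf or not (MIN16 <= s <= MAX16)
        return (wrap16(s), overflow)  # no exception raised; tag travels with data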
The ALUs perform the usual logical and arithmetic operations on 16-bit
data. The two ALUs in a cell can be combined to perform a pipelined 32-bit
operation, most typically as a 32-bit add for multiply-accumulate computations.
The ALU output register can be used as the accumulator for multiply-accumulate
operations.
The multiplier multiplies two 16-bit numbers and produces a 32-bit result,
shifted by a statically programmed amount to maintain the appropriate fixed-
point representation. Both 16-bit halves of the result are available as output via
separate bus drivers. Either driver can be turned off to drop the corresponding
output if it is not needed. The multiplier uses a modified Booth's algorithm and
includes one configurable pipeline register.
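A behavioral sketch (ours) of the multiplier's fixed-point product; the shift amount and the Q15 example are illustrative assumptions:

    # Sketch of the multiplier's statically shifted product (parameters ours).
    def fixed_mul(x, y, shift):
        """16-bit x 16-bit -> 32-bit product, realigned by a static shift.

        For Q15 operands (15 fractional bits), shift=15 keeps the result in Q15.
        """
        p = (x * y) >> shift          # statically programmed realignment
        lo = p & 0xFFFF               # low half, on one bus driver
        hi = (p >> 16) & 0xFFFF       # high half, on a separate bus driver
        return hi, lo                 # either driver may be turned off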
[Figure 3: a datapath register with clr, load, and bypass controls, reading from and driving the busses.]
The registers in the datapath are used to store constants and temporary val-
ues as well as to create pipelines of different lengths. These registers are completely
general, unlike the registers found in the bus connectors and functional units,
which are used only for pipelining. Figure 3 shows the design of the datapath
registers. The datapath register inputs and outputs are connected to the busses
just like other functional units. One configuration signal controls whether the
output is driven by the register or the bypass path. This bypass is used to con-
nect a bus segment on one track to a bus segment in a different track. The load
and clear signals control the operation of the register. As discussed in Section 3,
these control signals must be set dynamically. While datapath registers are very
general, they are expensive in terms of both area and bus utilization. While
the datapath registers themselves are relatively small, their input multiplexer
and output drivers are quite large. Wiring the input and output of a datapath
register usually requires bus segments in two different tracks, which consumes
extra routing resources. Thus the bus pipeline registers and the functional unit
registers are used whenever possible.
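The register's behavior under its static bypass bit and dynamic load and clear signals can be summarized as follows. The modeling is ours; the signal names follow Figure 3:

    # Behavioral sketch (ours) of a datapath register (Figure 3).
    class DatapathRegister:
        def __init__(self, bypass=False):
            self.bypass = bypass   # static: drive output from register or bypass
            self.value = 0

        def step(self, from_bus, load=False, clr=False):
            if clr:                # dynamic control, set by the control path
                self.value = 0
            elif load:
                self.value = from_bus
            # Bypass connects a segment on one track to one on another track.
            return from_bus if self.bypass else self.value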
A limited amount of local memory is provided in the datapath for saving
and reusing data over many cycles. In many applications, the input or output
data is segmented into blocks that are accessed once, saved locally and reused as
needed, and then discarded. Local memory can also be used for constant arrays.
RaPiD-I includes three local memories per cell. The input and output data lines
are connected to busses as in other functional units. Because of the time needed
to read and write memory, configurable registers are included on both the input
and output data ports. The memory address is supplied either by a data bus or
by a local address generator, shown in Figure 4, that supports simple sequential
memory access. If values are read and written to consecutive addresses, which
is the most common case, then the memory address generator can supply the
addresses without using datapath resources.
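A behavioral sketch (ours) of the address generator; the priority among load, clear, and increment is our assumption:

    # Sketch (ours) of the local-memory address generator (Figure 4).
    class AddressGenerator:
        def __init__(self):
            self.addr = 0

        def step(self, load=False, clear=False, inc=True, address_in=0):
            if clear:
                self.addr = 0
            elif load:
                self.addr = address_in  # address supplied from a data bus
            elif inc:
                self.addr += 1          # the common sequential-access case
            return self.addr            # otherwise hold the current address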
[Figure 4: a local memory with DataIn/DataOut ports; its address comes either from AddressIn or from an address generator (a +1 incrementer with inc/hold, load/clear/count, and R/W controls).]
Input and output data enter and exit the datapath via I/O streams at each
end of the datapath. These streams act as the interface to external memory. Each
stream contains a FIFO which is filled with data required by the computation or
with results produced by the computation. The data for each stream is associated
with a predetermined block of memory from which it is read or to which it is
written. The datapath reads from an input stream to obtain the next input data
value and writes to an output stream to store a result. Address generation and
memory reads and writes are handled entirely by the I/O streams themselves.
The I/O stream FIFOs operate asynchronously: if the datapath reads a value
from an empty FIFO or writes a value to a full FIFO, the datapath is stalled
until the FIFO is ready.
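The stall semantics are those of a bounded FIFO, as the following sketch (ours) illustrates; returning None/False stands in for holding the datapath:

    from collections import deque

    # Sketch (ours) of an I/O stream FIFO: the datapath stalls on an empty
    # read or a full write until the memory side catches up.
    class StreamFIFO:
        def __init__(self, depth):
            self.q, self.depth = deque(), depth

        def datapath_read(self):
            if not self.q:
                return None       # stall: hold until data arrives from memory
            return self.q.popleft()

        def datapath_write(self, v):
            if len(self.q) >= self.depth:
                return False      # stall: hold until space frees up
            self.q.append(v)
            return True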
3 Datapath Control
For the most part, the signals that control the operation of the functional units
and their interconnection can be static over an entire application. However,
there are almost always some control signals that must be dynamic. For ex-
ample, constants are loaded into datapath registers during initialization but
then remain unchanged. The load signals of the datapath registers thus take
on different values during initialization and computation. More complex exam-
ples include double-buffering the local memories and performing data-dependent
calculations.
The control signals are thus divided into static control signals provided by
configuration memory as in ordinary FPGAs, and dynamic control signals which
must be provided on every cycle. RaPiD is programmed for a particular application
by first mapping the computation onto a datapath pipeline. The static program-
ming bits are used to construct this pipeline and the dynamic programming bits
are used to schedule the operations of the computation onto the datapath over
time. A controller is programmed to generate the dynamic information needed
to produce the dynamic programming bits.
Of the 230 control signals in a RaPiD-I cell, 80 are dynamic. Thus there is a
total of over 1200 dynamic control signals for the entire datapath. While config-
uration memory is relatively cheap, producing and communicating the dynamic
control signals on every cycle, using a standard microprogram for example, would
be very expensive.
The problem of generating dynamic control signals is solved using a control path
which parallels the datapath. RaPiD applications map into pipelines of similar,
if not identical, repeating pipeline stages. The control signals of these stages are
thus similar as well, except that their values are skewed in time in the same way
the data passing through the pipeline is skewed in time.
The control path is thus a set of segmented busses containing configurable
pipeline registers through which control signal values are sent from one end of the
datapath to the other. Control values are inserted at one end of the control path
and are passed from stage to stage where they are applied to the appropriate
control signals. The configurable pipeline registers allow different control signals
to travel at different rates through the control path.
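A small simulation (ours) makes the skew concrete: a control value inserted at one end reaches stage s after s register delays, assuming one pipeline register per stage:

    # Illustrative control-path skew, one register per stage (modeling ours).
    def run_control_path(inserted_values, num_stages):
        """inserted_values[t] enters stage 0 at cycle t; returns, per cycle,
        the control value visible at each stage."""
        regs = [0] * num_stages
        seen = []
        for v in inserted_values:
            regs = [v] + regs[:-1]   # shift control values one stage onward
            seen.append(list(regs))
        return seen

    # A pulse inserted at cycle 0 reaches stage 3 at cycle 3:
    # run_control_path([1, 0, 0, 0, 0], 4) ->
    # [[1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1], [0,0,0,0]]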
Generating the dynamic control signals is then accomplished by connecting
each dynamic control signal to a bus in the control path that carries the ap-
propriate value each cycle. The number of busses required in the control path
varies by application, but it is kept manageable because many control signals
have identical values. The values inserted into the control path are generated by
a simple microprogrammed controller whose microinstructions contain the dat-
apath control information in addition to looping constructs that allow datapath
instructions or instruction sequences to be repeated many times.
4 Example: FIR Filter
A finite impulse response (FIR) filter computes each output value as a weighted
sum of NumTaps input values; Figure 5 gives the algorithm and the computation
it induces.
[Figure 5: a loop over j = 0 to NumTaps-1 accumulating weight-input products; the unrolled computation for weights W0-W3.]
Fig. 5. (a) Algorithm for FIR filter. (b) Computation for NumTaps=4 and i=6.
As with most applications, there are a variety of ways to map a FIR filter to
RaPiD. The choice of mapping is driven by the parameters of both the RaPiD
array and the application. For example, if the number of taps is less than the
number of RaPiD multipliers, then each multiplier is assigned to multiply a
specific weight. The weights are first preloaded into datapath registers whose
outputs drive the input of a specific multiplier. Pipeline registers are used to
stream the X inputs and Y outputs. Since each Y output must see NumTaps
inputs, the X and Y busses are pipelined at different rates. Figure 6a shows a
schematic diagram for this implementation of a four-tap FIR filter. The X input
bus was chosen to be doubly pipelined and the Y bus singly pipelined.
Wires are annotated with the weight, input, and output values from a single
point in time during the computation phase.
[Figure 6(a): the IN bus feeds X values X9 X8 X7 X6 X5 X4 X3 X2 through a doubly pipelined X bus past multipliers holding W0-W3; an ALU chain accumulates the partial sums 0, Y8, Y7, Y6, Y5 toward OUT. Figure 6(b): two taps of the same structure placed on RaPiD cells.]
Fig. 6. (a) Schematic diagram for four-tap FIR filter, labeled at a point in time
(computing four parallel computations for y5, y6, y7, and y8). (b) Two taps of
the FIR filter mapped to the RaPiD array (this is replicated to form more taps).
This implementation maps easily to the RaPiD array, as shown for two taps in
Figure 6b. For clarity, all unused functional units are removed, and used busses
are highlighted. The bus connectors from Figure 1 are left open to represent
no connection and boxed to represent a register. The control for this mapping
consists of two phases of execution: loading the weights and computing the out-
put results. In the first phase, the weights are sent down the IN double pipeline
along with a singly pipelined control bit (not shown) which sets the state of each
datapath register to "LOAD". When the final weight is inserted, the control bit
is switched to "HOLD". Since the control bit travels twice as fast as the weights,
each datapath register will hold a unique weight. No special signals are required
to begin the computation; hence, the second phase is started the moment the
control bit is set to "HOLD".
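The mapping can be checked against a direct FIR implementation with a short simulation. The sketch below (ours) models the doubly pipelined X bus as two registers per tap and the singly pipelined Y bus as one; the indexing convention y_i = sum_j w_j * x_{i-j} is our assumption, chosen to be consistent with the snapshot in Figure 6a.

    # Simulation sketch (ours) of the Figure 6 mapping: weights stationary,
    # X doubly pipelined, Y singly pipelined, one output per cycle.
    def systolic_fir(weights, xs):
        T = len(weights)
        xa, xb, yreg = [0] * T, [0] * T, [0] * T   # per-tap pipeline registers
        outputs = []
        for t in range(len(xs) + 2 * T):           # extra cycles drain the pipe
            x_in = xs[t] if t < len(xs) else 0
            # Every stage adds its tap product to the partial sum from the left.
            y_new = [(yreg[k - 1] if k else 0) + weights[k] * xa[k]
                     for k in range(T)]
            xa, xb, yreg = [x_in] + xb[:-1], xa, y_new  # X moves 2 regs, Y moves 1
            outputs.append(yreg[-1])
        return outputs[T:]   # after a T-cycle latency, one output per cycle

    def direct_fir(weights, xs):
        return [sum(w * xs[i - k] for k, w in enumerate(weights) if i - k >= 0)
                for i in range(len(xs))]

    ws, data = [3, 1, 4, 1], list(range(10))
    assert systolic_fir(ws, data)[:len(data)] == direct_fir(ws, data)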
5 Performance
This section evaluates the sustained computation rate (ignoring initialization
and finalization) of mapping the FIR filter and matrix multiply to the RaPiD array.
These results are a function of both the RaPiD array parameters and the algo-
rithmic parameters. The parameters associated with the RaPiD array are the
clock rate in MHz ($\nu$), the number of cells ($S$), and the number of addressable
memory locations per cell ($M$). Because RaPiD by its very nature is heavily
pipelined, a conservative estimate of the RaPiD-I clock rate for a mapped appli-
cation is 100 MHz. In addition, conservative estimates of the number of RaPiD-I
cells and memory locations per cell are 16 and 96, respectively. Results will be
measured in MOPS or GOPS, where an operation is a single multiply-accumulate
combination. The maximum rate on RaPiD-I is 1.6 GOPS.
5.1 FIR Filter
The only algorithmic parameter affecting the sustained computation rate of the
FIR filter is the number of taps, $T$. The mapping described in Section 4 produces
one output per cycle and thus $\nu T$ MOPS, with the constraint that $T \le S$. For a
more general mapping restricting the taps to $T \le \frac{1}{3}MS$, the RaPiD array can
produce $\min(1, S/T)$ outputs per cycle and $\nu \min(T, S)$ MOPS (a simplified version
of a more complex formulation which is beyond the scope of this paper). For example,
with $\nu = 100$, $S = 16$, $M = 96$, and $T \ge 16$, RaPiD can sustain a rate of
1.6 GOPS on a FIR filter with up to 512 taps (and an unbounded number of
input values).
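A transcription (ours) of this rate model; the function name and argument order are our own:

    # Sustained FIR rate under the section's model (our transcription).
    def fir_mops(nu_mhz, S, M, T):
        assert T <= M * S / 3          # taps must fit the general mapping's bound
        return nu_mhz * min(T, S)      # min(1, S/T) outputs/cycle * T MACs each

    # fir_mops(100, 16, 96, 512) == 1600, i.e. 1.6 GOPS at 512 taps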
5.2 Matrix Multiply
Matrix multiply takes an X Y matrix A P and a Y Z matrix B and computes
the X Z matrix C = A B as cij = Yk=0 ?1 a b . Many dierent RaPiD
ik kj
mappings exist, each producing slightly dierent performance results. In one
implementation, the RaPiD array can produce min(1; YS ; 3YM ) operations per
1
cycle and min(Y; 13 M; S ) MOPS. With = 100, S = 16, M = 96, and Y 16,
RaPiD can perform a sustained rate of 1:6 GOPS (X and Z are unbounded).
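And the corresponding transcription (ours) of the matrix-multiply model:

    # Sustained matrix-multiply rate under the section's model (ours).
    def matmul_mops(nu_mhz, S, M, Y):
        results_per_cycle = min(1, S / Y, M / (3 * Y))
        return nu_mhz * Y * results_per_cycle   # = nu_mhz * min(Y, S, M/3)

    # matmul_mops(100, 16, 96, 64) == 1600.0, i.e. 1.6 GOPS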
6 Conclusions and Future Work
The RaPiD architecture potentially provides a very efficient reconfigurable plat-
form for implementing computationally intensive applications. Many applica-
tions have been mapped successfully by hand to RaPiD and simulated with very
promising results. However, there are several open problems which need to be
solved to make RaPiD truly successful.
- The domain of applicability must be explored by mapping more problems
from different domains to RaPiD.
- Thus far all RaPiD applications have been designed by hand. The next
step will be to apply compiler technology, particularly loop-transformation
theory[7] and systolic array compiling methods[4], to build a compiler for
RaPiD.
- A memory architecture must be designed which can support the I/O band-
width required by RaPiD over a wide range of applications.
- Although it is clear that RaPiD should be closely coupled to a generic RISC
processor, it is not clear exactly how this should be done. This is a problem
being faced by other reconfigurable computers.
Acknowledgments
We would like to thank the rest of the RaPiD team, Chris Fisher, Larry Mc-
Murchie, and Jeffrey Weener, for their contributions to the project.
References
1. J. M. Arnold, D. A. Buell, D. T. Hoang, D. V. Pryor, N. Shirazi, and M. R. This-
tle. The Splash 2 processor and applications. In Proceedings IEEE International
Conference on Computer Design: VLSI in Computers and Processors, pages 482-485.
IEEE Comput. Soc. Press, 1993.
2. H. T. Kung. Let's design algorithms for VLSI systems. Technical Report CMU-CS-
79-151, Carnegie-Mellon University, January 1979.
3. P. Lee and Z. M. Kedem. Synthesizing linear array algorithms from nested FOR
loop algorithms. IEEE Transactions on Computers, 37(12):1578-1598, 1988.
4. D. I. Moldovan and J. A. B. Fortes. Partitioning and mapping algorithms into fixed
size systolic arrays. IEEE Transactions on Computers, C-35(1):1-12, 1986.
5. J. E. Vuillemin, P. Bertin, D. Roncin, M. Shand, H. H. Touati, and P. Boucard.
Programmable active memories: reconfigurable systems come of age. IEEE Trans-
actions on Very Large Scale Integration (VLSI) Systems, 4(1):56-69, 1996.
6. M. Wazlowski, L. Agarwal, T. Lee, A. Smith, E. Lam, P. Athanas, H. Silverman,
and S. Ghosh. PRISM-II compiler and architecture. In Proceedings IEEE Workshop
on FPGAs for Custom Computing Machines, pages 9-16. IEEE Comput. Soc. Press,
1993.
7. M. E. Wolf and M. S. Lam. A loop transformation theory and an algorithm to maxi-
mize parallelism. IEEE Transactions on Parallel and Distributed Systems, 2(4):452-
471, 1991.