Smart Pixel Arrays PDF
INTRODUCTION
To take advantage of the high space-bandwidth product of optics, free-space
digital optical technologies that consist of optically interconnected two-dimensional arrays of
smart pixels have emerged as an attractive interconnection platform (Ref. 1). A smart pixel is an
optoelectronic device that combines optical inputs, outputs, or both with electronic
processing circuitry and can be integrated into two-dimensional arrays. A field-programmable
smart pixel is a smart pixel whose electronic circuitry can be
dynamically programmed in the field. Because of their functional versatility, field-programmable
smart-pixel arrays (FP-SPA's) can implement a wide range of optical
interconnection architectures and functions, which is not possible with custom-designed
application-specific smart-pixel arrays.
The flexibility of FP-SPA's, as with most other programmable devices, has some
economic advantages. FP-SPA's can eliminate the need for the custom digital and VLSI
design of an application-specific optoelectronic smart-pixel array, which is costly. FP-SPA's
can also eliminate months of turnaround time associated with the fabrication of such a device.
Currently, the design of a custom optoelectronic device can require six months, and the
fabrication can require a year and cost on the order of $10,000 (depending on the die size).
In contrast, the functionality of an FP-SPA device can be programmed dynamically in the field
in a matter of minutes, typically by the downloading of a control bit pattern into the device.
interconnected with LED's, which was considered in Ref. 8. However, this approach does not
overcome the electronic bandwidth bottleneck of conventional electronic field-programmable
gate arrays (FPGA's). Merging optical I/O directly onto the CMOS substrate makes it
possible to create programmable devices that can process vast amounts of optical
data.
In this paper we describe the design, VLSI implementation, and free-space optical
interconnect applications of a first-generation 4 × 3 FP-SPA, implemented in the CMOS
self-electro-optic effect device (SEED) optoelectronic technology made available through
the 1995-1996 Lucent Technologies Advanced Research Projects Agency Cooperative
(Lucent/ARPA/COOP) workshop. We report SPICE simulation results in which we configure
this FP-SPA in two sample applications: first as an array of free-space optical binary switches
that can be used in an optical multistage network such as a Benes or a Clos network, and
second as an optoelectronic transceiver for a dynamically reconfigurable free-space optical
backplane architecture called the hyperplane (Ref. 10). We also describe the testing setup and the
results of electrical and optical tests that demonstrate the correct functionality of the
fabricated FP-SPA device.
denoted V.B, which similarly is supplied to the device from the external world. The logical
functions of the rows and the columns can be modified by external adjustment of the H.B or
the V.B signals, which are then distributed to those rows and columns, respectively. Finally,
each pixel also receives three global control signals, called the FF1-Clock, the FF2-Clock,
and the Reset.
Each pixel also has two optical input bits, denoted Opt.in.1 and Opt.in.2, and two
optical output bits, denoted Opt.out.1 and Opt.out.2. We form a large two-dimensional array
of pixels on the FP-SPA by abutting neighboring pixels on the CMOS substrate and
connecting their electrical I/O appropriately.
Fig. 3. Schematic of a single programmable pixel. Ci,j denotes the jth control bit for the ith
LUT or the eight-to-one MUX. Mi denotes the control bit for the ith two-to-one MUX. Each
pixel has 55 control bits.
Several binary MUX's (M1-M7 in Fig. 3) are used to route certain signals to the select
lines of the LUT's. Hence the three logic variables in each programmable LUT can be
determined to a certain extent by the user through the programming of the binary MUX's
M1-M7.
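The relationship between the control bits and the logic function of a LUT can be illustrated with a minimal sketch. It assumes that each LUT behaves as an eight-to-one MUX whose data inputs are its 8 control bits and whose select lines are the three logic variables. The function names and the ordering of the select lines are hypothetical and are not taken from the device schematic.

```python
def lut3(control_bits, a, b, c):
    """Evaluate a three-variable LUT modeled as an eight-to-one MUX.

    control_bits: the 8 programmed bits of the LUT.
    a, b, c: the three logic variables on the select lines
             (the ordering a = MSB, c = LSB is an assumption).
    """
    if len(control_bits) != 8:
        raise ValueError("a three-variable LUT needs exactly 8 control bits")
    index = ((a & 1) << 2) | ((b & 1) << 1) | (c & 1)
    return control_bits[index]


def mux2(select, in0, in1):
    """Binary (two-to-one) MUX: one control bit chooses between two signals."""
    return in1 if select else in0


def truth_table_to_bits(fn):
    """Derive the 8 control bits that make lut3 compute the function fn."""
    return [fn((i >> 2) & 1, (i >> 1) & 1, i & 1) & 1 for i in range(8)]


# Example: program a LUT as a three-input XOR, then route one of two
# candidate signals to a select line through a binary MUX.
xor_bits = truth_table_to_bits(lambda a, b, c: a ^ b ^ c)
routed = mux2(1, 0, 1)            # control bit 1 selects the second signal
result = lut3(xor_bits, routed, 0, 0)
```

Any Boolean function of three variables fits in the 8 control bits, which is why the choice of signals routed by the binary MUX's determines what each LUT can compute.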
2.2.
pixel. With reference to Fig. 3, there are 8 bits per LUT times 6 LUT's and 7 bits for the
select lines of the binary MUX's (M1-M7), for a total of 55 control bits per pixel. All the
pixels in the odd-numbered columns (1 and 3) are programmed with the same control RAM bits,
and programming is similar for the even-numbered columns (2 and 4). Therefore a total of
110 bits are required to program the entire
device. Pairs of adjacent pixels can be programmed to realize a larger finite-state machine
than an individual pixel is capable of.
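The control-bit budget described above can be checked with a short calculation. The constant and function names below are illustrative only; the numbers are those given in the text.

```python
BITS_PER_LUT = 8              # one bit per entry of a three-variable LUT
LUTS_PER_PIXEL = 6
BINARY_MUX_BITS = 7           # one select bit for each of M1-M7
DISTINCT_COLUMN_PROGRAMS = 2  # odd-numbered and even-numbered columns


def control_bits_per_pixel():
    """Control bits needed to program one pixel."""
    return BITS_PER_LUT * LUTS_PER_PIXEL + BINARY_MUX_BITS


def control_bits_per_device():
    """Control bits for the whole device: one program per column group."""
    return control_bits_per_pixel() * DISTINCT_COLUMN_PROGRAMS


per_pixel = control_bits_per_pixel()
per_device = control_bits_per_device()
```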
The control RAM is implemented by 110 D FF's, which are connected to form an
internal 110-bit shift register. As with the programming of static-RAM-based FPGA's, when
the FP-SPA is programmed a set of 110 control bits is serially shifted into the control RAM.
A programming clock must also be applied externally to activate the D FF's. Dynamic
reconfiguration of FP-SPA functionality is possible by means of loading a new set of control bits. If
we assume a 2-MHz clock rate, a complete reconfiguration requires approximately 50 µs.
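The serial loading of the control RAM can be sketched as a behavioral model of a shift register. The model assumes that one bit enters the first flip-flop on every programming clock, so loading 110 bits at 2 MHz takes 110 / 2 MHz = 55 µs. The class and function names are hypothetical, and the direction of the shift is an assumption.

```python
class ControlRAM:
    """Behavioral model of a chain of D flip-flops forming a shift register."""

    def __init__(self, length=110):
        self.bits = [0] * length

    def clock_in(self, bit):
        """One programming clock: every stage takes the value of the stage
        before it, and the new bit enters stage 0 (assumed shift direction)."""
        self.bits = [bit & 1] + self.bits[:-1]

    def program(self, pattern):
        """Serially shift in a complete control bit pattern."""
        if len(pattern) != len(self.bits):
            raise ValueError("pattern length must match the register length")
        for bit in pattern:
            self.clock_in(bit)


def reconfiguration_time_s(n_bits, clock_hz):
    """Time to load n_bits at one bit per programming clock."""
    return n_bits / clock_hz


pattern = [(i // 3) % 2 for i in range(110)]  # arbitrary example pattern
ram = ControlRAM()
ram.program(pattern)
load_time = reconfiguration_time_s(110, 2e6)
```

In this model the first bit shifted in ends up in the last stage, so the pattern sent by the host must be ordered accordingly.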
To test the device, we mounted the FP-SPA chip on a prototype printed circuit
board with an Altera FLEX Model 81500 FPGA. Programming is accomplished under
computer control through the parallel port of a PC workstation. The Altera FPGA supplies a
programming-clock signal and a serial data path to the field-programmable smart pixel to
load the control RAM. It also provides several serial data paths to the N, E, S, and W data
inputs of the device and to the V.B and the H.B signals of the device.
Fig. 4. Photograph of the FP-SPA VLSI die. The control RAM is at the top of the
integrated circuit. The 4 × 3 array of pixels is underneath the control RAM.
Software supervises the transfer of data between the PC and the FPGA with parity
checking. In one upload or download transaction 128 bits can be transferred. The data stream
is divided into 16 blocks of 8 bits. In each block, 7 bits are used for control RAM data, and 1
bit is reserved for even parity. Hence the effective data-carrying capacity of each transaction
is 16 × 7 = 112 bits, which allows the entire control RAM contents to be transferred from the
PC to the FPGA and then to the FP-SPA in a single transaction.
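The framing of one transaction can be sketched as follows. This is a minimal illustration, not the software used with the test board: the position of the parity bit within each block and the zero padding of the two unused data bits are assumptions, and all function names are hypothetical.

```python
BLOCKS = 16
DATA_BITS_PER_BLOCK = 7
CAPACITY = BLOCKS * DATA_BITS_PER_BLOCK  # 112 data bits per transaction


def even_parity_bit(data_bits):
    """Parity bit that makes the number of 1s in the block even."""
    return sum(data_bits) % 2


def frame_transaction(control_bits):
    """Pack control RAM bits into 16 blocks of 7 data bits + 1 parity bit."""
    if len(control_bits) > CAPACITY:
        raise ValueError("too many bits for a single transaction")
    padded = list(control_bits) + [0] * (CAPACITY - len(control_bits))
    blocks = []
    for i in range(BLOCKS):
        data = padded[i * DATA_BITS_PER_BLOCK:(i + 1) * DATA_BITS_PER_BLOCK]
        blocks.append(data + [even_parity_bit(data)])  # parity last (assumed)
    return blocks


def unframe_transaction(blocks, n_bits):
    """Check even parity on every block and recover the first n_bits."""
    data = []
    for block in blocks:
        if sum(block) % 2 != 0:
            raise ValueError("parity error")
        data.extend(block[:DATA_BITS_PER_BLOCK])
    return data[:n_bits]


control_bits = [(i * 5) % 3 % 2 for i in range(110)]  # arbitrary example
blocks = frame_transaction(control_bits)
recovered = unframe_transaction(blocks, 110)
```

With 110 control bits, two of the 112 data positions remain unused, so the whole control RAM fits in one 128-bit transaction.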
This simple and versatile programming setup can easily be adapted for future
generations of FP-SPA's with larger control RAM capacities. We successfully achieved a
programming speed of 2 MHz. The Altera FPGA utilizes 1160 of 1296 logic cells (89%
utilization) to implement the logic needed to program the FP-SPA. The logic is easily
extended to program devices with much larger control RAM's.
capability, we increase its optical signals' contrast ratio by using differential optical signaling.
In this signaling scheme, shown in Fig. 5, the logic value at an optical window is determined
by the difference in optical intensity at a differential SEED pair. To implement four optical
windows per pixel requires 8 SEED's/pixel; thus our device has a density of 7300
SEED's/cm².
4. FIELD-PROGRAMMABLE APPLICATION
We demonstrate the programmability of the described FP-SPA by configuring it in
two sample applications. In the first application the FP-SPA is programmed to implement an
array of free-space optical binary switches, which can be used in an optical multistage
network such as a Benes or a Clos network. In the second application the FP-SPA is
programmed to implement an optoelectronic transceiver for a reconfigurable intelligent
optical backplane called the hyperplane. We report SPICE simulation results for these two
configurations.
Fig. 6. a) Bar state of a two-input, two-output binary switch. b) Cross state of a two-input,
two-output binary switch.
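The two states of Fig. 6 can be expressed as a simple behavioral model of a two-input, two-output binary switch. The sketch is abstract: it does not model how the function is mapped onto the LUT's of two adjacent pixels, and the names used are hypothetical.

```python
from enum import Enum


class SwitchState(Enum):
    BAR = "bar"      # each input passes straight through to its own output
    CROSS = "cross"  # the two inputs are exchanged


def binary_switch(state, in_a, in_b):
    """Return (out_a, out_b) for a two-input, two-output binary switch."""
    if state is SwitchState.BAR:
        return in_a, in_b
    if state is SwitchState.CROSS:
        return in_b, in_a
    raise ValueError("unknown switch state")


bar_outputs = binary_switch(SwitchState.BAR, 1, 0)
cross_outputs = binary_switch(SwitchState.CROSS, 1, 0)
```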
SPICE simulations are shown in Figs. 7 and 8. In Fig. 7 the optical input Opt.in.1 of
one pixel is directed to the optical output Opt.out.2 of the same pixel. Two adjacent pixels
thus implement the bar state, as shown in Fig. 6. From Fig. 7 we can see that there is a 7-ns
latency between the time the optical data are received on Opt.in.1 (bold curve) and the time they
appear on the optical output Opt.out.2 of the same pixel (dotted curve). In Fig. 8 the optical
input Opt.in.1 of one pixel is directed to the optical output Opt.out.2 of its adjacent pixel.
Two adjacent pixels thus implement the cross state. From Fig. 8 we can see that there is a
9-ns latency between the time the optical data are received on Opt.in.1 (bold curve) and the time
they appear on the optical output Opt.out.2 of the neighboring pixel (dotted curve). The latency
shown in Fig. 8 is larger than that shown in Fig. 7 because the optical signal must travel
through additional LUT's when it is transferred to the neighboring pixel. These SPICE
simulations indicate that a clock period of 10 ns should be sufficient to allow the LUT's to
stabilize in each pixel, corresponding to a maximum clock rate of approximately 100 MHz.
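The timing argument can be restated numerically: the clock period must exceed the worst-case simulated latency, and the clock rate is the reciprocal of that period. The sketch below uses the 7-ns and 9-ns latencies and the 10-ns period quoted above; the function name is illustrative.

```python
def max_clock_rate_hz(latencies_ns, clock_period_ns):
    """Clock rate for a period that must cover the worst-case latency."""
    worst_case_ns = max(latencies_ns)
    if clock_period_ns < worst_case_ns:
        raise ValueError("clock period is shorter than the worst-case latency")
    return 1e9 / clock_period_ns


latencies_ns = {"bar": 7.0, "cross": 9.0}  # simulated latencies, Figs. 7 and 8
period_ns = 10.0
rate_hz = max_clock_rate_hz(latencies_ns.values(), period_ns)
slack_ns = period_ns - max(latencies_ns.values())
```

The cross state sets the limit, leaving 1 ns of margin within the 10-ns period.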
Fig. 9. FP-SPA's configured to implement the hyperplane optical backplane: (a) injection of
electrical data onto one optical channel and (b) extraction of electrical data from one optical
channel.
for the optical I/O for the devices for such a backplane, and Ref. 13 describes a custom
CMOS design for a hyperplane smart-pixel array.
Figure 9 illustrates the two configurations used to implement the backplane functions.
In this application the 4 × 3 array of pixels is viewed as four columns, each containing
3 bits. In Fig. 9(a) electrical data are injected into the optical backplane by use of the three
H.B electrical input pins. These three electrical bits appear on the three optical outputs
Opt.out.2 of the pixels in column 1. These optical bits are optically transferred to the
neighboring FP-SPA, where they are imaged into the optical inputs Opt.in.1 of column 1. The
data are converted to electrical form and routed out of the FP-SPA on the three electrical
output pads, EW.out. The optical data are also regenerated and propagated out over the
optical outputs, Opt.out.2, of column 1, where they normally would continue to travel down
the backplane to the next smart-pixel array.
are directed simultaneously to the E.W.out electrical output pad of the FP-SPA chip and to
the Opt.out.2 optical output port. This case can be considered as the data-extraction and
data-regeneration operation in which optical data are extracted from the backplane and removed
from the chip in electrical form; the same optical data are also propagated down the
backplane in optical form.
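The two backplane operations of Fig. 9 can be summarized in a behavioral sketch of one 3-bit pixel column. It models only the data movement described in the text (injection from the H.B pins, and extraction with regeneration), not the LUT programming that realizes it. The function names and dictionary keys are hypothetical.

```python
def inject(hb_bits):
    """Injection: electrical bits on the H.B pins appear on Opt.out.2."""
    return {"opt_out_2": list(hb_bits)}


def extract_and_regenerate(opt_in_1_bits):
    """Extraction with regeneration: optical input bits are delivered in
    electrical form and also re-emitted optically down the backplane."""
    return {
        "electrical_out": list(opt_in_1_bits),
        "opt_out_2": list(opt_in_1_bits),
    }


# One column carries 3 bits from a sending FP-SPA to its neighbor.
sent = inject([1, 0, 1])
received = extract_and_regenerate(sent["opt_out_2"])
```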
Tanner pads have a maximum output clock rate of approximately 30 MHz; above 30 MHz,
the output resembled a sine wave, and the logic transitions were no longer sharp enough to be
useful for digital processing. We are unaware of any confirmed reports of smart-pixel arrays
made through the 1995-1996 Lucent/ARPA/COOP workshop clocked at rates exceeding 30 MHz.
Second, because of limitations on design time, we were forced to make some VLSI
routing decisions that may have affected the clock rate. The Tanner standard cells all used
metal 1 and metal 2 for power and ground, respectively. Normally, the Tanner standard cell
design methodology uses metal 3 to route signals between standard cells. However, all
optoelectronic design submissions to the Lucent/ARPA/COOP workshop had to reserve
metal 3 for the bonding pads for the SEED optical modulators. As a result, we did not have a
spare metal layer available for routing signals between standard cells in the locality of the
metal 3 bonding pads, and we were forced to use polysilicon to route some signals between
some standard cells, which resulted in larger capacitance and higher resistance when
compared with metal 3. Our
experimental measurements indicated a maximum clock rate of between 3 and 6 MHz.
Note, however, that the limitation on the use of metal 3 is not unique to our design; rather,
it is an inherent constraint when using the CMOS SEED technology, which uses metal 3 for
the flip-chip bonding of the optical I/O onto the CMOS die. There are several ways to design
around this constraint. For example, computer-aided-design tools can use metal 3 to route
signals between standard cells as long as the optical I/O are not nearby, and we exploited this
attribute. However, if the optical I/O are nearby and are using the metal 3 layer, the standard
cells underneath the optical I/O must use only two layers of metal. Our FP-SPA design is
quite dense, and considerable logic was placed underneath the optical I/O; these standard
cells already use metal 1 and metal 2 for power and ground, respectively. Thus these layers
cannot be used for routing signals between standard cells beneath the optical I/O. Polysilicon
was used to interconnect these standard cells, which has resulted in the lower clock rate. In
general, an optoelectronic technology would be more versatile if an additional layer of metal
were made available for routing signals between standard cells to replace the metal 3 layer
lost to the flip-chip-bonding process. The latest CMOS processes offer as many as six layers
of metal, which would simplify the interconnections of the standard cells. Alternatively, an
optoelectronic technology would be more versatile if the standard cells were redesigned to
leave two layers of metal free, so that one layer of metal could be used for flip-chip bonding of
optical I/O, and the second layer of metal could be used for routing signals between standard
cells.
copper interspersed with FR-4. The two inner layers are planes of copper to support ground
and power, while the outer layers contain traces for signal interconnection. By
fixing the thickness of the dielectric (FR-4) between signal and ground planes, as well as the
thickness of the outer signal traces and their width, we have designed a microstrip transmission line
with a nominal characteristic impedance of 50 ohm.
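The dependence of the characteristic impedance on the board geometry can be illustrated with a widely used closed-form approximation for surface microstrip (the formula quoted in IPC-2141). This is not necessarily the method used for the daughterboard design, and every dimension below, as well as the FR-4 relative permittivity of 4.4, is an assumed example value rather than a figure from the board.

```python
import math


def microstrip_z0_ohm(h, w, t, eps_r):
    """Approximate characteristic impedance of a surface microstrip.

    h: dielectric thickness between trace and ground plane
    w: trace width
    t: trace thickness (h, w, and t in the same length unit)
    eps_r: relative permittivity of the dielectric
    The approximation is usually quoted as valid for 0.1 < w/h < 2.0.
    """
    if not 0.1 < w / h < 2.0:
        raise ValueError("w/h is outside the range of the approximation")
    return 87.0 / math.sqrt(eps_r + 1.41) * math.log(5.98 * h / (0.8 * w + t))


# Assumed example geometry in millimetres; not taken from the actual board.
z0 = microstrip_z0_ohm(h=0.25, w=0.45, t=0.035, eps_r=4.4)
z0_wider = microstrip_z0_ohm(h=0.25, w=0.49, t=0.035, eps_r=4.4)
```

With the dielectric and trace thicknesses fixed, the trace width is the remaining free parameter: widening the trace lowers the impedance, so the width can be tuned toward the 50-ohm target.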
A prototype of the daughterboard was fabricated to verify the system design, and the
measured -3 dB bandwidth was found to be greater than 1.5 GHz. In addition to providing a
first- and second-level package for the chip, this daughterboard supports a single high-speed
AMP microstrip connector for connection to the motherboard via a flexible,
impedance-controlled (50 Ω) ribbon. In this way, optical alignment of the photonic interconnect fabric is
decoupled from the support electronics, while a reasonably high number of interconnects
(40) is maintained at a bandwidth of more than 1 GHz per connection.
Due to space and connectivity constraints imposed by our optical design, we found it
impossible to exploit any of the standard chip carriers described previously. With this in mind
we chose to mount the Hybrid-CMOS chip directly on the daughterboard using an MCM-L
arrangement (Chip-on-board).
The daughterboard provides for chip-on-board mounting with a central, round
die-attachment pad surrounded by 44 bond fingers. All exposed copper is plated with immersion
gold to allow for direct wire bonding.
One of the constraints imposed by the system was the positional tolerance of the chip.
With this in mind, we have devised an alignment, bonding, and gluing rig ("puck") for
attaching the chip to the daughterboard. The daughterboard is first attached to the circular
aluminium "puck" by means of mounting screws. A Teflon spacer is then located on the
daughterboard by means of two mounting rods. This L-shaped spacer is designed to provide a
stop to chip travel during placement of the chip die, thereby providing an effective alignment
mechanism. The entire "puck" can be moved to a heating unit to set the epoxy, and the chip
can then be wire bonded on a height-adjustable, stable base. The anticipated die-to-daughterboard
positional accuracy will be ±100 micrometres.
7. DISCUSSION
The packaging used in this system can be expanded and improved upon to suit future
application needs. We have addressed many of the issues and concerns facing a system
designer when confronted with the task of packaging a two-dimensional array of smart pixels,
but others still require attention. For example, thermal management is a key system parameter
which must be addressed in order to build and package large smart pixel arrays for system
applications. Figure 3 shows the schematic of a package incorporating chip-on-board
technology and cooling via both a passive heat-sink and a Peltier cooler. This or a
similar solution will be required in future system applications.
8. CONCLUSION
In this paper we have described the design, the VLSI implementation, and the optical
applications of a first-generation CMOS-SEED FP-SPA. To the best of our knowledge, this
device is the first fully operational field-programmable optoelectronic device yet
demonstrated. We have reported SPICE simulation results in which we configured this
FP-SPA in two sample applications: first as an array of free-space optical binary switches
suitable for use in an optical multistage network such as a Benes or a Clos network, and
second as an array of dynamically reconfigurable free-space optical switches for a backplane
architecture called the hyperplane. We have also described the testing setup and the results of
electrical and optical tests that demonstrated the correct functionality of the fabricated
FP-SPA device. This device establishes the feasibility of dynamically programmable
optoelectronic devices with thousands of transistors per optical I/O bit. Such devices have the
potential to reduce significantly the need for the custom design and fabrication of
application-specific optoelectronic devices in the same manner that FPGA's have largely eliminated the
need for custom design and fabrication of application-specific gate arrays, except in the most
demanding applications.
9. REFERENCES
1. H. S. Hinton, "Architectural considerations for photonic switching networks," IEEE J.
Selected Areas Commun. 6, 1209-1226 (1988).
2. T. H. Szymanski and H. S. Hinton, "Architecture of field programmable smart pixel arrays,"
in Proceedings of the International Conference on Optical Computing 94, Vol. 139 of IOP
Conference Proceedings (Institute of Physics, Bristol, UK, 1995), pp. 497-500.
3. T. H. Szymanski, "Field programmable smart pixel arrays for an intelligent optical
backplane," in Proceedings of the Fourth Canadian Workshop on Field Programmable
Devices (The University of Toronto, Toronto, Canada, 1996), pp. 55-61.
4. S. S. Sherif, T. H. Szymanski, and H. S. Hinton, "Design and implementation of a field
programmable smart pixel array," in Proceedings of the 1996 IEEE/LEOS Summer Topical Meeting
(Institute of Electrical and Electronics Engineers, New York, 1996), pp. 78-79.
5. T. M. Pinkston and C. Kuznia, "Smart-pixel-based network interface chip," Appl. Opt. 36,
4871-4880 (1997).