
Microelectronics Journal 42 (2011) 33–42

Contents lists available at ScienceDirect

Microelectronics Journal
journal homepage: www.elsevier.com/locate/mejo

Power efficient asynchronous multiplexer for X-ray sensors in medical imaging analog front-end electronics

Rafał Długosz a,b,*, Pierre-André Farine a, Kris Iniewski c

a Swiss Federal Institute of Technology in Lausanne, Institute of Microtechnology, Rue A.-L. Breguet 2, CH-2000 Neuchâtel, Switzerland
b University of Technology and Life Sciences, Faculty of Telecommunication and Electrical Engineering, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland
c CMOS Emerging Technologies Inc., 2865 Stanley Pl., Coquitlam, BC V3B 7L7, Canada

Article info

Article history:
Received 31 January 2010
Received in revised form 29 August 2010
Accepted 1 September 2010
Available online 18 September 2010

Keywords:
Medical imaging
Multiplexers
Asynchronous circuits
Analog front end
X-ray detection

Abstract

This paper presents a new power efficient asynchronous multiplexer (MUX) for application in analog front-end electronics (AFE) used in X-ray medical imaging systems. Contrary to typical synchronous MUXes, which have to be controlled by a clock, this circuit features a simple structure, as no clock is required. The circuit dissipates power only while detecting the active signals and then automatically returns to the power-down mode. Medical imaging systems usually consist of several dozen to several hundred channels that operate asynchronously. The proposed MUX enables an unambiguous choice of the active channel. When two or more channels become active at the same time, the MUX serializes the readout of data from particular channels. This characteristic leads to 100% effectiveness in data processing and no loss of impulses. The proposed MUX, together with an experimental readout ASIC, has been implemented in the CMOS 0.18 µm process and occupies an area of 1100 µm² per channel. It operates correctly over a wide supply voltage range, from 0.8 to 1.8 V. The energy consumed during the detection of one active channel is below 1 pJ, while the detection time is about 1 ns.
© 2010 Elsevier Ltd. All rights reserved.

* Corresponding author at: University of Technology and Life Sciences, Faculty of Telecommunication and Electrical Engineering, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland. Tel.: +48 668 160 217.
E-mail addresses: [email protected] (R. Długosz), pierre-andre.farine@epfl.ch (P.-A. Farine), [email protected] (K. Iniewski).

0026-2692/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.mejo.2010.09.006

1. Introduction

Electronic signal detection and processing of X-ray images are gaining widespread acceptance due to their inherent benefits of data storage and transmission in a digital format, as opposed to conventional X-ray films [1]. At present, most nuclear medical imaging devices use a scintillator–photomultiplier combination to detect X-rays or gamma rays. The scintillator absorbs X-ray or gamma photons that are emitted by radionuclides introduced into the patient's body with pharmaceuticals, and re-emits the energy as visible light. This light is absorbed by the photocathode of a photomultiplier tube (PMT) and re-emitted as a burst of electrons. Further data processing is performed using external analog front-end electronics (AFE).

Due to the multi-step detection process that involves visible light, PMT devices suffer from poor imaging resolution. Recently, these problems were addressed by the fabrication of solid-state detectors that operate at room temperature and convert X-ray photons directly to electrical signals [1]. Charge carriers of these detectors are sensed by an array of pixel electrodes (sensors) and directly processed in the associated AFE, which conditions the analog signals received from the array of sensors and then performs an analog-to-digital (A/D) conversion. The ability to use solid-state detectors has enabled a great improvement in the spatial resolution of X-ray based medical imaging techniques.

A typical AFE used in medical imaging applications is realized as a multi-channel application-specific integrated circuit (ASIC); its block diagram is shown in Fig. 1. Each channel in this system consists of a sensor (S), a charge amplifier (G), a pulse shaping filter (PS) and a peak detector (PD) [2]. The signal processing scheme in each channel starts with the detection of incident radiation by a sensor that generates, in response, an equivalent amount of charge. The problem faced at this stage is that the charge is very small: for 1 keV X-rays it is on the level of several dozen aC. The other problem is the random distribution of this charge over time. The amplifier is therefore used to amplify this charge and to integrate it in an embedded capacitor. If this integration is fast enough, the PS block receives a voltage that approximates the Heaviside step function. The PS block is realized as a continuous-time band-pass filter, which converts this step function into a pulse with a given peaking time and an amplitude that is linearly proportional to the value of the step voltage. The subsequent PD precisely determines the peaking time, which is necessary to catch the peak value, and finally sets the FLAG signal, which is a request directed to the output MUX to read data. Since pulses in particular channels appear asynchronously, a special multiplexing scheme that serializes the tasks is required. After reading the data the MUX block resets the corresponding FLAG signal and the given channel is ready to detect the next event.

[Figure: block diagram of the ASIC. An X-ray or gamma photon hits one of the channels 1 to n; each channel contains a sensor (S), a charge amplifier with pulse shaper (G) and a peak detector (PD) that delivers the peak value and a digital FLAG to the asynchronous multiplexer, which is followed by the ADC and FIFO and outputs the i-bit address of the active channel.]
Fig. 1. A typical front-end ASIC for multi-element detectors [4].

In the literature one can find several AFE readout systems that differ in the structure of particular blocks. The most advanced solutions of this type, with 32 asynchronously operating channels, have been described in [3,4,22]. In the chip described in [3,4] two 16-channel preamplifier-shaper ASICs produce unipolar pulses with 1.2 µs width, which are provided to a self-switched MUX (SSM). The SSM chip consists of a comparator bank, a 32:1 switch matrix and arbitration logic. The SSM detects above-threshold inputs and routes them to the output FIFO (first in, first out) structure. It also presents the 5-bit address of the selected channel at the output. In the circuit described in [3,4] no collision-prevention mechanism has been applied, making this solution generally sufficient in cases when impulses occur seldom. If two or more impulses arrive close in time (simultaneous events), the impulses overlap and a collision occurs in the system, since several channels try to connect to the output at the same time. As a result, a portion of all detected impulses may be lost. This problem might be important in some applications and therefore the authors of [3,4] developed the Simultaneous Events Catcher (SEC) that prevents the collisions of the simultaneous events. The solution described in [22] is the first known circuit addressing this problem.

Another AFE realized in the CMOS 0.18 µm technology has been reported in [5]. This system is composed of 64 analog channels and a synchronous analog MUX controlled by a multi-phase clock. This clock cyclically checks all the channels in a loop. This approach eliminates the collision problem from the system, but the MUX has to be active the whole time. It also limits the maximum data rate of a single channel to fS/M, where fS is the sampling frequency of the MUX and M is the number of channels. Moreover, the clock makes the overall circuit much more complex. From both the power dissipation and the chip area points of view this approach is not optimal. The MUX occupies an area of more than 2.5 mm², although 66% of this area is occupied in this case by differential amplifiers used to compensate the noise.

The AFE in medical imaging applications can also be realized without an explicit MUX block. In the system with 32 channels reported in [6], instead of using the MUX followed by a high data rate analog-to-digital converter (ADC), as shown in Fig. 1, each channel has its own converter, while the outputs of the channels are shorted together. A disadvantage of this approach is the large number of ADC blocks, which significantly enlarges the chip area. This system in the CMOS 0.25 µm technology occupies 42 mm², with about 20% of this area occupied by the ADCs, which are in this case 10-bit charge-redistribution successive approximation (SAR) converters. In this type of converter capacitors occupy a large chip area [7,8]. On the other hand, the advantage of this approach is that it eliminates the bottleneck problem that is caused by an insufficiently fast single ADC at the output of typical AFE systems. In systems with a large number of channels, in which the number of events is also large, this is an important feature. In such systems a possible solution could be a mixed one, with an intermediate number (K) of ADCs and K MUXes, each with M/K inputs. Such an approach is to be explored in the future.

In this paper we focus on the last element in the signal processing chain, i.e. the MUX circuit, which in this approach has been designed as a fully asynchronous circuit. The proposed circuit enables detection of the active channel in less than 1 ns. The circuit has a built-in de-randomization mechanism that eliminates collisions in the system even if all channels become active at the same time. Since the proposed circuit operates in an asynchronous fashion, it can be viewed as an alternative to the solution reported earlier in [22]. In [22] the block preventing the collisions is placed at the input of the system. This block directs the input signals to whichever of the eight channels containing peak detectors and time-to-amplitude converters (PD/TAC) is available at the moment, i.e. does not process any signal. As a result, the system is able to handle eight simultaneous events. In our circuit the collision-preventing block is included in the MUX, which is the last block in the system. In this case particular channels operate independently. The proposed system is able to catch a number of impulses equal to the number of channels, as each channel contains its own PS and PD blocks. The bottleneck in this system is the output ADC. Comparing both these solutions, the proposed circuit is able to catch more simultaneous events, but on the other hand the solution described in [22] requires a smaller number of PD blocks (eight in this case), which are shared between all channels.

The paper is organized as follows. An overview of typical MUX architectures is given in the next section. Section 3 is devoted to the proposed multiplexer realized in the CMOS technology. A verification of the proposed mechanism by means of detailed postlayout circuit level simulations is presented in Section 4. Finally, the conclusions are covered in Section 5.

2. Multiplexing circuits: an overview

Multiplexers are widely used in various fields of industry, mostly in telecommunication and also in medical applications, in nuclear physics and others. A variety of multiplexing circuits, both digital and analog, have been reported in the literature [9–19]. So far, in most cases synchronous solutions have been used, controlled by an arbitrary clock circuit that determines the multiplexing data sequence. On the other hand, asynchronous solutions are useful in those applications where data occur randomly at the inputs, e.g. in nuclear medicine. Since the application of the proposed MUX significantly differs from that of the synchronous circuits, we do not focus in detail on particular state-of-the-art solutions, presenting only the information that is necessary to place our circuit in a proper perspective.

Several typical MUX architectures are usually distinguished [11]. The most popular of them are based either on the shift register concept [9], the multi-phase approach, also known as single-stage [5,10,19], or the binary-tree (BT) structures [11–18]. The first two are straightforward structures that are able to handle an arbitrary number of inputs, which is one of their advantages. Unfortunately, in the shift register-type MUXes all circuit components operate at the highest rate, which usually is the source of high power dissipation [11]. On the other hand, the multi-phase solutions feature a simple structure and require a low frequency multi-phase clock, which is an advantage, but suffer from a large capacitive load that limits the maximum sampling

[Figure: (a) eight peak detectors (PD) deliver flags F11–F18 to a three-layer tree of blocks ACDB11–ACDB14, ACDB21–ACDB22 and ACDB31 with output flag F41; signals X of the particular layers and switches S1–S8 connect the analog paths to the FIFO. (b) A single ACDB with input flags F1 and F2, a DFF generating clk1 and clk2, gates A1–A3, signals Y and EN, and outputs X1, X2 and F. (c) Logic deriving the address signals S1–S4 from signals X11–X14, X21, X22 and X31.]
Fig. 2. The proposed binary-tree asynchronous multiplexer: (a) the general structure, (b) a single active channel detection block (ACDB) and (c) the circuit that determines the address of the active channel.

frequency. The binary-tree MUXes to some degree overcome the limitations of the first two architectures. The capacitive load in this approach is much smaller than in the multi-phase structures, so the circuit can operate at higher speed. In the binary-tree MUXes particular layers in the tree operate at different frequencies. Since only some blocks operate at the highest speed, the power dissipation is reduced. A disadvantage of this solution is that it requires a multi-rate clock, in which the distribution of clock signals of different frequencies has to be precisely controlled. Despite its disadvantages, this approach has recently been the most frequently used [11–18].

Fig. 3. A prototype AFE chip implemented in the TSMC 0.18 µm technology. The MUX block composed of seven active channel detection blocks (ACDB) occupies an area of 300 × 30 µm².

Considering analog data processing, e.g. in medical imaging applications, the most reasonable MUX structure that can be applied is the multi-phase one. In the other two cases input data need to be copied several times in the structure, e.g. between layers in the tree. In the case of analog signals the copying process would be the source of large errors. In the binary-tree structures, for example, data are copied log2 M times, while in the shift register MUX even M times. This is the reason for using only multi-phase structures in medical imaging systems [3,5]. While in all these cases a similar structure is used in the analog signal path, with only some modifications, an essential difference exists in the structure of the channel selection circuit, which is always a digital block. In the AFE described in [5] the loop selection sequence is imposed by the arbitrary clock. Since this is a typical multi-phase scheme, the MUX has to be oversampled at least M times, in comparison with the data rate of particular channels, to ensure a proper readout of all data. We have proposed quite a different approach, which can be referred to as the asynchronous one. The analog signal processing path is similar to those used in [3–5], while the channel selection circuit is based on the binary-tree concept. In this approach, to detect any active channel, at most log2 M switching operations (steps) in the tree are required, but since in some cases no switching occurs, the average number of

[Figure: three panels plotted over 79.6–81.4 ns. (a) V [mV]: flags F15, F23, F32, F41 and address S5, with a marked delay of 0.8 ns. (b) V [mV]: enable signals EN13, EN22 and EN31. (c) I [µA]: supply current IDD, annotated with a standby current of 10 nA and Iavr = 600 µA → E ≈ 0.85 pJ (1 event).]
Fig. 4. Transistor level simulations for the worst case scenario, in which all ACDBs between a given input and the output are turned over, for VDD = 1.8 V: (a) the flag signals, F, at particular layers in the tree and a corresponding address, S, (b) the clock enable signals, EN, in ACDBs and (c) the total current consumption.

[Figure: three panels plotted over 79–90 ns. (a) V [mV]: flags F15, F23, F32, F41 and address S5, with a marked delay of 8 ns. (b) V [mV]: enable signals EN13, EN22 and EN31. (c) I [µA]: supply current IDD, annotated with a standby current of 5 nA and Iavr = 12 µA → E ≈ 0.07 pJ (1 event).]
Fig. 5. Similar results as shown in Fig. 4 for VDD = 0.8 V.

switching operations equals 0.5 log2 M. These steps are started automatically with a delay of only two logic gates and without the arbitrary clock, which allows for achieving very small detection times and consequently relatively high data rates.

3. The proposed asynchronous multiplexer

The proposed MUX has been optimized for application in an AFE that operates asynchronously. The role of the proposed circuitry is to detect the event of data appearing in one of the M input channels, and to establish a connection between the given active input and the MUX output in order to read out data. Upon detection, the MUX copies the peak of the analog impulse stored in the sample-and-hold (S&H) cell to the output stage, and determines the address of the given active channel. The general MUX architecture used in this application is fairly typical (a tree-type structure), but the mechanisms used to detect the active channels proposed here are novel. The binary-tree structure ensures an unambiguous selection of only one path between a given MUX input and the output.

One of the most important innovations is that the proposed MUX does not require an external clock and operates fully asynchronously. As a result, when all channels are inactive the circuit is in the power-down mode, and it is automatically activated when new data occur at any input. Since the circuit is composed of CMOS elements, in the standby mode it consumes a negligible power of several nW only. One of the features introduced in our circuit, which typically is not used in synchronous MUXes, is the mechanism that allows detecting the address of the channel that became active. This information is relevant in medical diagnostics applications, as it identifies the pixel that has received an X-ray photon.

Another of the introduced innovations is the ability to solve the problem of collisions between events. Even if two or more channels become active at the same time, the circuit properly serializes access to the MUX output so that no data will be lost. As long as a given active channel is being read out, data in other channels are held and these channels are not allowed to detect new events. After reading out data a given channel is reset (its flag becomes logical '0'), while the MUX automatically turns to another active channel.

The general structure of the proposed MUX is shown in Fig. 2(a). It is composed of M − 1 active channel detection blocks (ACDB), shown in detail in Fig. 2(b). The MUX can be in several states. In a particular ACDB the input signals are flags (F) received either directly from the corresponding channels, in the case of the bottom layer in the tree, or from the preceding layers. If all flags are logical '0', which means that all channels are inactive, the overall structure is in the standby mode. Each ACDB contains an asynchronously started 2-phase clock that is composed of a D flip-flop (DFF). The operation principle of this concept is explained for an example case of M = 4, i.e. for two layers in the tree, with two ACDBs at the bottom layer, namely ACDB11 and ACDB12, and one (ACDB21) at the top. The example sequence is as follows:

1. For all flags F11–F14 being '0', signals X11–X14 are also '0' and the corresponding signals Y are logical high. Similarly, flags F21 and F22 are zero, signals X21 and X22 are zero, and the clock enable EN in each ACDB is zero. In this case the tree is turned off.
2. As a signal appears in channel 1, its corresponding flag F11 changes to logical high. As a result the F21 flag changes to logical high as well. At this point there are two options:
2.1 clk1 was logical low (clk2 was logical high) at the moment that the signal appeared in channel 1 and the corresponding F11 flag was set. As a result, the outputs of A1 and A2 in ACDB11 are still zero, while Y is logical high, and since F21 is one, EN becomes logical high, starting the clock for the ACDB11 pair. After one clock cycle clk1 becomes logical high and the X11 signal becomes logical high, which stops the clock.
2.2 clk1 was one at the time the signal appeared in channel 1 and the corresponding F11 flag was set. In this case the A1 output is at one, which causes Y21 to change value to zero, and the enable signal EN in ACDB11 remains zero. Consequently, the clock is not started, as desired.
3. Almost in parallel with the operation described in point 2 above, the ACDB21 circuitry at the second level is activated, as the F21 flag becomes logical high after only one OR delay. Subsequently, that circuit operates in exactly the same fashion as at the lower level. The asynchronous operation plays the main role in this case, as even for larger numbers of layers in the tree the proposed MUX is able to establish the connection between the given input and the output after a time that is equal to the delay of several OR gates only. Once the connection is established, the MUX returns to the standby mode.
4. It is important to point out that the clock in any ACDB is automatically started only in a situation where one channel is active and the clock is not pointing at this channel. In the case when both flags F11 and F12 are active at the same time and a connection between one of these inputs and the MUX output is established, the second flag must wait until the first flag is reset. This is one of the main advantages of the proposed solution, as even in the worst case scenario, when all flags are logical high at the same time, access to the output is always limited to only one input, while the other channels must wait. This prevents collisions at the MUX output, as mentioned above.

The access to the MUX output is controlled by signals Sx that control the switches in the analog paths (see Fig. 2(a)). These signals depend on signals X from particular layers in the tree, as shown in Fig. 2(c). This concept features a very simple structure. Each ACDB block consists of only one DFF and several logic gates, thus occupying a very small chip area.

[Figure: two panels plotted against fS [MHz] from 0 to 1400, each with curves for VDD = 0.6, 0.7, 0.8, 0.9, 1, 1.2, 1.5 and 1.8 V. (a) E1event [fJ] from 0 to 1000. (b) PI: fS/E1event [GHz/pJ] from 0 to 2.5, with the most optimal case marked.]
Fig. 6. Performance characteristics of the proposed asynchronous MUX: (a) energy consumed during detection of a single active channel vs. sampling frequency, fS, for selected supply voltages, (b) Performance Index (PI) defined as fS over energy per event vs. achievable data rate for different values of VDD.
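As a rough cross-check of the annotations in Figs. 4 and 5, the energy per event can be estimated as average supply current × supply voltage × detection time, and the performance index of Fig. 6(b) as data rate divided by that energy. The sketch below assumes that the data rate is the reciprocal of the marked detection time (0.8 ns and 8 ns), which the source does not state explicitly; the function names are illustrative only.

```python
def energy_per_event_pj(i_avg_ua, vdd_v, t_ns):
    """E = I_avg * VDD * t in pJ (uA * V * ns gives fJ, hence the division)."""
    return i_avg_ua * vdd_v * t_ns / 1000.0


def performance_index(t_ns, energy_pj):
    """PI = fS / E in GHz/pJ, assuming fS = 1 / t (an assumption, not from the source)."""
    fs_ghz = 1.0 / t_ns
    return fs_ghz / energy_pj


# Values read from the annotations of Fig. 4 (1.8 V) and Fig. 5 (0.8 V).
e_18 = energy_per_event_pj(600, 1.8, 0.8)
e_08 = energy_per_event_pj(12, 0.8, 8.0)
pi_18 = performance_index(0.8, e_18)
pi_08 = performance_index(8.0, e_08)
print(f"1.8 V: E = {e_18:.3f} pJ, PI = {pi_18:.2f} GHz/pJ")
print(f"0.8 V: E = {e_08:.3f} pJ, PI = {pi_08:.2f} GHz/pJ")
```

The estimates (about 0.86 pJ and 0.08 pJ) are close to the 0.85 pJ and 0.07 pJ annotated in the figures, which suggests that the annotated energies follow this simple product.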

4. Implementation of the proposed multiplexer in the CMOS technology

The proposed MUX has been applied in a prototype AFE with eight analog channels realized in the CMOS 0.18 µm process, as shown in Fig. 3. Other system components, i.e. the PS filter and the PD with an S&H memory cell, have been described by the authors in [20,21]. The proposed MUX occupies an area of 0.009 mm², i.e. 1100 µm² per channel.

To illustrate the MUX performance, selected transistor level simulation results are shown in Figs. 4, 5 and 7. The MUX can operate in a wide range of the supply voltage. Figs. 4 and 5 show that the supply voltage only has an influence on the circuit speed, while it does not affect its functionality. Comparing both these cases one can see that the power dissipation in the first case is higher by two orders of magnitude than in the second case, but since the circuit is ca. 10 times faster in the first case, the energy consumption per single detection operation increases only moderately.

A comparative study for different values of the supply voltage is shown in Fig. 6. Defining the performance index (PI) as the ratio between the data rate and the energy that is consumed by the MUX during detection of a single event, it can be seen that the optimal case is for VDD = 1 V. For higher supply voltages the data rate increases, but at the expense of energy consumption, which increases much faster. On the other hand, reducing the supply voltage decreases the energy consumption per single event, but the data rate decreases faster, which is also not the optimal case.

The results shown in Figs. 4 and 5 are for only one channel being active at any time. This situation allows us to determine the maximum delay introduced by the MUX, which is measured as the period between setting up the flag at the output of a given channel and the corresponding S signal (the address). The maximum data rate, which exceeds 1 GHz, is obtained for VDD = 1.8 V. In this case the energy consumed per single event is the largest, but it is still below 1 pJ, which is much less than in synchronous MUXes reported in the literature.

The results in Fig. 7 are shown for the worst case scenario, for all channels being active at the same time. This case demonstrates how the proposed de-randomization mechanism operates. Fig. 7 illustrates two cycles, each starting with the occurrence of all eight flags at the MUX inputs. In this case, in the first cycle the 3rd channel is read out first. After the flag F13 is reset the ACDB12

[Figure: four panels plotted over 4–30 ns, covering two cycles of events. (a) V [mV]: input flags, reset in the order F3, F4, F1, F2, F7, F8, F5, F6 in the first cycle and F6, F5, F8, F7, F2, F1, F4, F3 in the second. (b) V [mV]: EN signals of the ACDBs in the bottom layer (EN11–EN14) and in the upper layers (EN21, EN22, EN31). (c) V [mV]: address signals S3, S4, S1, S2, S7, S8, S5, S6 followed by S6, S5, S8, S7, S2, S1, S4, S3. (d) I [µA]: supply current IDD, annotated with a standby current of 15 nA and Iavr = 400 µA → Eavr ≈ 0.95 pJ (1 event).]
Fig. 7. Simulation results for all channels being active at the same time: (a) the input flag signals, (b) the EN signals in particular ACDBs, (c) the corresponding address signals and (d) total current consumption.

block immediately switches to F14. As the flag F22 is still '1' at this time, the clock in ACDB21 remains inactive. After reading out the 4th channel only ACDB21 turns over and the 1st channel is read out next, followed by the 2nd channel. After that ACDB31 turns over to the group F15–F18 and the process continues until all channels are read out. Note that after the first cycle all ACDBs remember their last settings, which is the reason for the opposite reading sequence in the next cycle. From the power dissipation point of view this case is optimal, as only one ACDB turns over for each input.

The proposed MUX can also potentially be used as a synchronous circuit, as shown in Fig. 8(a) for an example case of 8 inputs. In this configuration, additional 2-input AND gates need to be used at the MUX inputs. The clock signals (clk1, …, clk8) are applied to one input of the AND gates, while the transmitted data (In1, …, In8) are applied to the other. In this case flag F41 becomes the MUX output signal. In this configuration neither the X nor the S signals are used, since there is no need to determine the addresses of the channels. Although the data rate of this MUX (about 1.5 GHz) is lower than in state-of-the-art solutions, the energy per single bit is below 0.3 pJ. This is two orders of magnitude less than in the circuit described in [17], also realized in the 0.18 µm CMOS process. Simulation results for the synchronous configuration are presented in Fig. 8(b). The input signals are sampled at the rate of 0.19 GHz, while the output is sampled at 1.51 GHz. One of the advantages of this solution is that a multi-rate clock, which is typically used at particular layers in synchronous binary-tree MUXes, is not required in this case. This significantly simplifies the circuit structure and reduces power dissipation.

Since the proposed MUX is going to be used in a commercial AFE, careful tests on process, voltage and temperature (PVT) variation have been undertaken. The results for VDD = 1.8 V as well as for 0.8 V are shown in Fig. 9 for temperatures ranging between −40 and 100 °C, and for several representative transistor models, namely the typical–typical (TT), fast–fast (FF) and slow–slow (SS) models. As can be seen, these parameters influence only the achievable data rate. The presented results show that the optimal supply voltage is 1 V, as in this case there is the best energy usage (see Fig. 6) and the data rate is stable over a wide range of environmental temperatures.

The results presented above are shown for eight channels, i.e. for three layers in the tree. To illustrate the circuit performance in a wider perspective, Fig. 10 presents the maximum achievable data rate as a function of the number of inputs. The results shown for two, four and eight inputs have been obtained from transistor level simulations, while the results for larger numbers of inputs have been calculated. The BT solution provides an important advantage here. Even for a large number of channels the number of layers in the tree is small, e.g. 8 for 256 channels, so the data rate is not significantly limited in that case. The maximum data rate can be calculated as follows:

$$ f_{S\_\mathrm{max}} = \frac{f_{S\_\mathrm{1ACDB}}}{\log_2 M} \quad [\mathrm{Hz}] \qquad (1) $$

In the formula above fS_1ACDB is the maximum data rate of a single ACDB, while M is the number of MUX inputs. Note that doubling the number of inputs always adds only one layer to the tree, which makes the data rate decrease rather moderately with the number of inputs. This is an important advantage in

[Figure 8: diagram labels In1 … In8, clk1 … clk8, F11 … F18, F41, MUX.]

Fig. 8. The proposed MUX used in the synchronous mode for an input data rate of 0.19 GHz and an output data rate of 1.52 GHz: (a) a test structure controlled by an 8-phase clock and (b) the input and output signals. The results are presented for VDD = 1.8 V and a room temperature of 20 °C.
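Eq. (1) can be evaluated numerically, as in the minimal sketch below. The function name `max_data_rate` and the single-ACDB rate of 3 GHz are assumptions made only for illustration; the 3 GHz figure is not reported in the text and is chosen so that eight inputs (three layers) give about 1 GHz. The sketch also assumes that the number of inputs is a power of two, so that log2(M) equals the number of layers in the tree.

```python
import math


def max_data_rate(f_single_acdb_hz: float, num_inputs: int) -> float:
    """Evaluate Eq. (1): f_S_max = f_S_1ACDB / log2(M), in Hz."""
    if num_inputs < 2:
        raise ValueError("a binary tree needs at least two inputs")
    layers = math.log2(num_inputs)  # number of layers for M = 2, 4, 8, ...
    return f_single_acdb_hz / layers


# Hypothetical single-ACDB data rate (an assumption, not a reported value).
F_SINGLE_ACDB_HZ = 3.0e9

for m in (2, 4, 8, 256):
    rate_ghz = max_data_rate(F_SINGLE_ACDB_HZ, m) / 1e9
    print(f"M = {m:3d}: layers = {math.log2(m):.0f}, f_S_max = {rate_ghz:.3f} GHz")
```

Under this assumed single-ACDB rate, going from 8 to 256 inputs adds five layers and lowers the estimated rate by a factor of 8/3, which is the moderate decrease described for the binary-tree structure.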

comparison with other MUXes used in the AFE chips described above.

Fig. 10 also illustrates the results for the synchronous mode. In this case the addresses of the channels do not need to be determined, which makes the circuit significantly faster and more power efficient than in the asynchronous mode.

[Figure 9: two plots of fS [GHz] vs. T [°C], with curves for the FF, TT and SS models.]

Fig. 9. Corner analysis of the circuit performance: maximum data rate, fS, over the environmental temperature for the slow, typical and fast transistor models, for an example case of eight inputs and three layers in the tree, for: (a) VDD = 1.8 V and (b) VDD = 0.8 V. For VDD of 1.0 V particular waveforms are approximately flat at the levels of 0.267, 0.35 and 0.53 GHz, for SS, TT and FF models respectively.

[Figure 10: plot of fS [GHz] vs. number of inputs, with curves for VDD = 1.8 V and VDD = 0.8 V in the synchronous and asynchronous modes.]

Fig. 10. Maximum data rate vs. number of inputs for 1.8 and 0.8 V supply voltages for the MUX working either in the synchronous or the asynchronous mode. The results are presented for the TT transistor model and a room temperature of 20 °C.

In this paper we have presented only the simulation results of the MUX. The main reason is that this circuit has been used as one of the components of the AFE chip, in which the measurement of the MUX as a separate block is not possible. Measurement results of this particular chip do not provide exact information on the achievable data rates, since particular analog channels in the system, operating at much lower sampling frequencies, are the bottleneck here. In our opinion the simulation results in this particular case can be viewed as a good estimation of MUX performance, as this block, being a simple feed-forward structure, is composed of only digital elements. This has additionally been confirmed by means of the detailed corner analysis presented above.

4.1. A comparative study with other multiplexing circuits

Performance comparison between reported MUX structures is provided in Table 1. Since the most commonly used solutions are the synchronous ones, used for multiplexing digital data, for our solution we report the active channel selection time, as this is the comparable parameter. We should remember that if the MUX is used in the AFE, a significantly larger time is required to read out and then to reset the channel. For a meaningful comparison a Figure-of-Merit (FOM) has been defined as the ratio of a normalized sampling frequency to the dissipated power per single input:

$$\mathrm{FOM} = \frac{\overbrace{f_s / \log_2 M}^{f_{NORM}}}{P / M} \quad [\mathrm{GHz/mW}] \qquad (2)$$

To make the reported sampling frequencies comparable we have normalized them to a single layer only (in case of BT solutions), since each layer usually introduces an approximately equal delay.

One of the most important reported parameters is the achievable data rate. As shown in Table 1, the circuits implemented in the CMOS 0.18 µm process, i.e. in the technology comparable with our case, can be as fast as 5–10 GHz (for 8 inputs), but at the expense of much larger power dissipation and larger chip area. On the other hand, since the application of our circuit is specific, we focused rather on power dissipation, as well as circuit complexity. The MUX has been used in the AFE, in which reading out a single channel takes ca. 100 ns. In this case a very large data rate is not the most important parameter. Nevertheless, the sampling frequency of ca. 1 GHz is not a significantly worse result, while we achieved a much better FOM.

An important advantage of our solution is also a very good matching of energy consumption to the values of the input data. If in the asynchronous mode no events are present at the channel inputs (i.e. all flags are zero), or in the synchronous mode most input signals are equal in a given period of time, regardless of their values, the circuit operates at low power dissipation or is in the power down mode, since most or all ACDBs do not switch in this case. This makes the proposed circuit very useful in various portable devices or in wireless sensor network (WSN) applications, in which energy consumption is one of the main parameters.

5. Conclusions

A novel binary-tree multiplexer with ultra-low power, asynchronous selection circuit has been proposed and realized in the

Table 1
Performance comparison of selected MUX circuits.

Ref.        Process [µm]  VDD [V]  M   P [mW]  fS [GHz]  Type         Area [mm²]  FOM [GHz/mW]
[11]        0.18          ND       8   50      5         BTS (a)      0.9         0.27
[12]        0.12          1.3      4   105     15        BTS          0.66        0.29
[3]         0.18          2        16  30      3.6       SR (b)       >2          0.48
[2]         0.15          2        8   118     3         MP (c)       >2          0.07
[15]        0.18          1.8      16  24      1.65      BTS          0.858       0.28
[16]        0.18          1.8      16  36.2    2         BTS          0.78        0.22
[17]        0.18          1.8      8   30      5         BTS          0.029       0.44
[14]        0.18          2.2      8   112     10.2      BTS          0.13        0.24
[13]        0.18          1.5      2   40      40        BTS          0.30        2.00
[18]        0.18          2        2   110     15        BTS          0.11        0.27
This work   0.18          1.8      8   1.08    1         BTA (d)      0.009       2.47
This work   0.18          0.8      8   0.0096  0.1       BTA          0.009       27.78
This work   0.18          1.8      8   0.51    1.52      used as BTS  0.009       7.95

(a) Binary tree synchronous.
(b) Shift register.
(c) Multi-phase.
(d) BT asynchronous.
CMOS 0.18 µm process. The proposed circuit has been designed as an important part of the readout front-end ASIC for multi-element detectors in medical imaging, where input data appear in random fashion in the input channels. The proposed MUX offers an alternative solution to the collision preventing circuit proposed in [22].

A de-randomization block included in the MUX automatically finds the channels that became active, i.e. detects the peaks of the impulses, then holds the information on the amplitudes of these peaks in analog memory cells of particular channels until they are read out by the output stage, and performs a reset of these channels at the end. One of the important purposes of the proposed circuit is full elimination, or at least strong limitation, of situations in which more than one input signal attempts to connect to the output stage at the same time. In the proposed circuit this feature does not depend on the number of simultaneous events, even in the case of a large number of parallel channels.

The proposed circuit enables a simple standby mode when no data arrives at its input. When a signal is detected, the energy used to detect one active channel is below 1 pJ (for 8 inputs). The circuit is able to work with an input data frequency of about 1 GHz, which is sufficient with a large margin in the designed medical imaging application, where the data rate will be at the level of 10 MHz.

References

[1] K. Iniewski (Ed.), Medical Imaging: Principles, Detectors, and Electronics, Wiley, 2009.
[2] H. Spieler, Semiconductor Detector Systems, Oxford University Press, 2005.
[3] G. De Geronimo, A. Kandasamy, P. O'Connor, Analog peak detector and derandomizer for high-rate spectroscopy, IEEE Transactions on Nuclear Science 49 (4, Pt 1) (2002) 1769–1773.
[4] G. De Geronimo, P. O'Connor, J. Grosholz, A generation of CMOS readout ASIC's for CZT detectors, IEEE Transactions on Nuclear Science 47 (2000) 1857–1867.
[5] M. Zoladz, P. Grybos, M. Kachel, P. Kmon, R. Szczygiel, Analogue multiplexer for neural application in 180 nm CMOS technology, in: Proceedings of the Sixteenth International Conference Mixed Design of Integrated Circuits & Systems (MIXDES), 2009, pp. 230–233.
[6] G. Mazza, A. Rivetti, G. Anelli, F. Anghinolfi, M.I. Martinez, F. Rotondo, A 32-channel, 0.25 µm CMOS ASIC for the readout of the silicon drift detectors of the ALICE experiment, IEEE Transactions on Nuclear Science 51 (5, Pt 1) (2004) 1942–1947.
[7] A. Rivetti, G. Anelli, F. Anghinolfi, G. Mazza, F. Rotondo, A low power 10-bit ADC in a 0.25 µm CMOS: design considerations and test results, IEEE Transactions on Nuclear Science 48 (2001) 1225–1228.
[8] N. Verma, A.P. Chandrakasan, A 25 µW 100 kS/s 12b ADC for wireless micro-sensor applications, in: Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2006, pp. 822–831.
[9] M. Kurisu, M. Kaneko, T. Suzaki, A. Tanabe, M. Togo, A. Furukawa, T. Tamura, K. Nakajima, K. Yoshida, 2.8-Gb/s 176-mW byte-interleaved and 3.0-Gb/s 118-mW bit-interleaved 8:1 multiplexers with a 0.15 µm CMOS technology, IEEE Journal of Solid-State Circuits 31 (12) (1996) 2024–2029.
[10] T. Nakura, K. Ueda, K. Kubo, Y. Matsuda, K. Mashiko, T. Yoshihara, 3.6-Gb/s 340-mW 16:1 pipe-lined multiplexer using 0.18 µm SOI-CMOS technology, IEEE Journal of Solid-State Circuits 35 (5) (2000) 751–756.
[11] Hung-Wen Lu, Chau-Chin Su, A 5 Gbps CMOS LVDS transmitter with multi-phase tree type multiplexer, in: Proceedings of the IEEE Asia-Pacific Conference on Advanced System Integrated Circuits, 2004, pp. 228–231.
[12] D. Kehrer, H.-D. Wohlmuth, H. Knapp, A.L. Scholtz, A 15 Gb/s 4:1 parallel-to-serial data multiplexer in 0.12 µm CMOS, in: Proceedings of the 28th European Solid-State Circuits Conference, 2002, pp. 227–230.
[13] D. Kehrer, H.-D. Wohlmuth, H. Knapp, M. Wurzer, A.L. Scholtz, 40-Gb/s 2:1 multiplexer and 1:2 demultiplexer in 120-nm standard CMOS, IEEE Journal of Solid-State Circuits 38 (11) (2003) 1830–1837.
[14] A. Tanabe, M. Umetani, I. Fujiwara, T. Ogura, K. Kataoka, M. Okihara, H. Sakuraba, T. Endoh, F. Masuoka, 0.18-µm CMOS 10-Gb/s multiplexer/demultiplexer ICs using current mode logic with tolerance to threshold voltage fluctuation, IEEE Journal of Solid-State Circuits 36 (6) (2001) 988–996.
[15] K. Short, N.T. Trung, Soo-Won Kim, Jae-Tack Yoo, A reliable static-logic-based 16:1 binary-tree multiplexer in 0.18 µm CMOS, in: Proceedings of the 50th Midwest Symposium on Circuits and Systems (MWSCAS), 2007, pp. 1193–1196.
[16] X. Tang, X.J. Wang, S.Y. Zhang, Y.S. Chi, N. Jiang, F.Y. Huang, A 2-Gb/s 16:1 multiplexer in 0.18-µm CMOS process, in: Proceedings of the International Conference on Microwave and Millimeter Wave Technology (ICMMT), vol. 2, 2008, pp. 868–870.
[17] Hungwen Lu, Chauchin Su, Chien-Nan Jimmy Liu, A tree-topology multiplexer for multi-phase clock system, IEEE Transactions on Circuits and Systems I: Regular Papers 56 (1) (2009) 124–131.
[18] Jun-Chau Chien, Liang-Hung Lu, A 15-Gb/s 2:1 multiplexer in 0.18 µm CMOS, IEEE Microwave and Wireless Components Letters 16 (10) (2006) 558–560.
[19] C.K. Yang, M.A. Horowitz, A 0.8-µm CMOS 2.5 Gb/s Oversampling Receiver and Transmitter for Serial Links, IEEE Journal of Solid-State Circuits 31 (12) (1996) 2015–2023.
[20] R. Długosz, K. Iniewski, High precision analog peak detector for X-ray imaging applications, Electronics Letters 43 (8) (2007) 440–441.
[21] R. Długosz, R. Wojtyna, Novel CMOS analog pulse shaping filter for solid state X-ray sensors in medical imaging systems, in: E. Kącki, M. Rudnicki, J. Stempczyńska (Eds.), Computers in Medical Activities, Book series: Advances in Intelligent and Soft Computing, ISSN: 1615-3871, ISBN: 978-3-642-04461-8, vol. 65/2009, Springer-Verlag, Berlin/Heidelberg, 2009, pp. 155–165 (Chapter 16).
[22] A. Dragone, G. De Geronimo, et al., The PDD ASIC: highly efficient energy and timing extraction for high-rate applications, in: Proceedings of the IEEE Nuclear Science Symposium, 2005, pp. 914–918.
