This document is Yu-Shiang Lin's dissertation submitted in partial fulfillment of the requirements for a Doctor of Philosophy degree in Electrical Engineering at the University of Michigan in 2008. The dissertation focuses on designing low power circuits for miniature sensor systems. It includes an introduction describing typical components of sensor systems and challenges in developing low power systems. It then presents six chapters that describe Lin's work on developing ultra low power timers, temperature sensors, level shifters, wireless data retrieval techniques, and inductive coupling systems for miniature sensor applications.

LOW POWER CIRCUITS FOR MINIATURE SENSOR SYSTEMS

by
Yu-Shiang Lin

A dissertation submitted in partial fulfillment

of the requirements for the degree of
Doctor of Philosophy
(Electrical Engineering)
in The University of Michigan
2008

Doctoral Committee:
Associate Professor Dennis M. Sylvester, Chair
Professor David T. Blaauw
Associate Professor Michael P. Flynn
Professor Marios C. Papaefthymiou
© Yu-Shiang Lin 2008
All rights reserved.
To My Family

ACKNOWLEDGEMENTS

When I look back on the days spent pursuing the Ph.D., I see challenges and many precious memories. Coming to a foreign country to live and study for the first time in my life, I have been helped by many people over the past five years.

I always knew that the VLSI program at the University of Michigan was my top choice, especially Professor Sylvester's group. However, not until I joined the group did I realize how fortunate I was to have chosen the right program. Under Professor Sylvester's guidance, I was always given all the resources I needed to implement my ideas.

I want to thank Professor Blaauw for his advice on my research. Having two advisors has always been a positive experience for me. I also want to thank Professor Flynn and Professor Papaefthymiou for serving as my committee members and sharing their expertise.

I would like to thank the staff of the EECS department. You made my life so much easier. It has been a pleasure to work with a group of brilliant people in our lab. I find myself always learning new things from you.

Last but not least, I want to dedicate my Ph.D. to my family for their support. My wife Yen-Ting takes care of most daily routines for me and always encourages me. My son Andruw always gives me his big smile when I come back home. It is my parents who made me who I am.

My best wishes to all of you.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES

I. INTRODUCTION
    1.1 Overview
        1.1.1 Sensors
        1.1.2 Microcontroller
        1.1.3 Storage elements
        1.1.4 Communication module
        1.1.5 Timer
        1.1.6 Power source
    1.2 Low power sensor system

II. ULTRA LOW POWER TIMER DESIGN FOR SENSOR APPLICATIONS
    2.1 Introduction
    2.2 A Sub-pW gate leakage timer
        2.2.1 Circuit Design for the timer
        2.2.2 Measurement results
    2.3 Self temperature compensation for low power timer
        2.3.1 Oscillator with self temperature compensated current source
        2.3.2 Power reduction by charge holding technique
        2.3.3 Test chip and measurement results
    2.4 Summary

III. AN ULTRA LOW POWER 1V, 220NW TEMPERATURE SENSOR FOR PASSIVE WIRELESS APPLICATIONS
    3.1 Introduction
    3.2 Low power temperature sensor design
    3.3 Measurement results
    3.4 Conclusion
    3.5 Improving the voltage sensitivity

IV. SINGLE STAGE STATIC LEVEL SHIFTER DESIGN FOR SUBTHRESHOLD TO I/O VOLTAGE CONVERSION
    4.1 Introduction
    4.2 Conventional approach
    4.3 Proposed approach
    4.4 Simulation results
    4.5 Conclusion

V. SENSOR DATA RETRIEVAL USING ALIGNMENT INDEPENDENT CAPACITIVE SIGNALING
    5.1 Introduction
    5.2 Geometry optimization
        5.2.1 Sizing of the sensor pad
        5.2.2 Single-ended vs. differential signaling
    5.3 System architecture
        5.3.1 Data retrieval circuits design
        5.3.2 Sensor chip circuit design
    5.4 Chip measurement
        5.4.1 Test chip
        5.4.2 Alignment detection
        5.4.3 Measurement results
    5.5 Conclusions

VI. NEAR FIELD INDUCTIVE COUPLING USING PLL PHASE-LOCKING AND PULSE SIGNALING
    6.1 Introduction
    6.2 System architecture
        6.2.1 Integrated inductor
        6.2.2 Transponder circuits
        6.2.3 Reader circuits
    6.3 Measurement results
    6.4 Conclusion

VII. CONTRIBUTIONS AND FUTURE WORKS
    7.1 Contributions
    7.2 Future works

BIBLIOGRAPHY
LIST OF FIGURES

1.1 Illustration of the building blocks of a monitoring system.
1.2 The relationship between supply voltage and energy consumption per instruction.
1.3 Intraocular pressure monitoring system.
1.4 Monitoring system considering power gating scheme.

2.1 Illustration of the lifetime of a sensor system.
2.2 The concept of a one-shot oscillator. (a) The circuit diagram. (b) The operation waveform.
2.3 Proposed timer structure for subthreshold operation.
2.4 Power consumption vs. supply voltage at different temperature points.
2.5 Timer period vs. temperature at various supply voltages.
2.6 Output period scatter plot highlighting die-to-die and within-die variations.
2.7 Timer output period variation with respect to time.
2.8 The bias stage showing the voltage divider and a resistor based self-biasing loop.
2.9 One-shot oscillator for timer output.
2.10 One-shot oscillator for timer output.
2.11 Circuits for charge holding. (a) type I, (b) type II.
2.12 Timing diagram of the program-and-hold method.
2.13 Die photo of the timer test chip.
2.14 Normalized frequency vs. temperature and supply voltage.
2.15 Average of normalized frequency drift over time.
2.16 Refresh the timer every four minutes with 1.1 second programming time.
2.17 Normalized frequency vs. programming time.
2.18 Power consumption at the programming mode and the active mode with respect to different temperatures.
2.19 Power consumption and frequency deviation with different refreshing time.

3.1 Temperature sensor block diagram.
3.2 Schematic for IPTAT generation.
3.3 Schematic for Iref generation.
3.4 Block diagram and timing diagram of the sensor controller.
3.5 Die photo of the temperature sensor.
3.6 Power consumption of the temperature sensor.
3.7 Temperature inaccuracy of the temperature sensor with two-point calibration at 20 °C and 80 °C.
3.8 Temperature inaccuracy over samples (top: 10 samples/s; bottom: 100 samples/s; solid line: actual temperature).
3.9 Modified temperature insensitive current source.
3.10 Voltage reference generator.

4.1 Conventional DCVS-type level shifter with cross-coupled pull-up transistors.
4.2 Simulation results showing the operating frequency with respect to pull-down transistor width Wn.
4.3 Proposed approach that uses input voltage independent diode-connected transistor stacks for pull-up devices.
4.4 Proposed level shifter with feedback path for leakage reduction.
4.5 Reduced swing inverter.
4.6 Waveforms demonstrating the operation of our proposed circuit.
4.7 Sizing of diode-connected stacked PMOS (Wp) versus gate delay and power dissipation.
4.8 Comparison of level shifters. (a) Gate delay, (b) Power consumption.
4.9 Monte Carlo simulation results showing gate delay variation across process spread.

5.1 The relative position of the receiver array and sensor signal pad when WRX <= 2WTX.
5.2 Relationship between pad size of sensor chip to the coupling capacitance with WRX = 50µm.
5.3 Differential signaling scheme. Pad A (square with slant lines) together with all the other pads in light gray are used to recover the signal from the sensor.
5.4 System architecture for the proposed data retrieval mechanism.
5.5 Data retrieval array showing 20 by 20 cells and controller.
5.6 Alignment detector. (a) block diagram, (b) operation waveform.
5.7 Data retrieval with capacitive coupled input and periodic precharge to sensitize the amplifier.
5.8 Timing diagram showing the operation of data retrieval circuits when switching happens.
5.9 AC to DC conversion circuits for sensor chip power harvesting.
5.10 Voltage limiter. (a) circuits diagram, (b) open loop voltage transfer curve.
5.11 Envelope detector for sensor chip.
5.12 Chip die photo.
5.13 Parasitic components for the system of two chips in a stack.
5.14 Procedures for alignment detection and pad reconfiguration.
5.15 Decoded data waveform showing pseudo random bit sequences up to 15 unrepeated cycles.
5.16 Operating frequency versus transmitting amplitude and carrier frequency with estimated working distance showing on the second x-axis.
5.17 Energy consumption versus transmitting amplitude and carrier frequency.
5.18 (a) Tw versus BER, (b) Clock modulation circuit that defines Tw.
5.19 Data rate versus BER with 10 random position testing.

6.1 Comparison of transponder data encoding with back scattering and pulse signaling.
6.2 System architecture for the proposed pulse signaling method.
6.3 Power harvesting module with the schematic of the voltage limiter.
6.4 Schematic for the voltage regulator.
6.5 Schematic for the voltage regulator.
6.6 TSPC phase frequency detector. (a) TSPC D flip-flop, (b) circuit diagram of the phase frequency detector.
6.7 Schematics of (a) the charge pump, (b) the VCO.
6.8 Timing waveform of the pulse signaling mode.
6.9 Timing waveform of the PLL locking mode.
6.10 The driver circuits for the transponder.
6.11 Pulse generator and output driver of the reader.
6.12 Data receiving scheme for the reader.
6.13 Die photo for the reader and transponder of the system.
6.14 (a) Test setup with the micromanipulator and the PC board, (b) close-up photo.
6.15 Measured waveform from the oscilloscope showing the output data and the clock signal.
6.16 Measured communication distance with respect to the data rate and the switching amplitude.
6.17 Measured achievable communication distance with misalignment in the x-axis or the y-axis.
LIST OF TABLES

1.1 Summary of power generation sources.
1.2 Comparison between small batteries.
1.3 Summary of the contributions of the works.
2.1 Comparison for the timers.
3.1 Comparison of temperature sensors.
4.1 Comparison of level shifters.
5.1 Summary of pad dimensions.
6.1 Summary of the integrated inductor.
CHAPTER I

INTRODUCTION

1.1 Overview

Sensor systems are ubiquitous in modern daily life. Applications ranging from chemical sensing and biomedical monitoring to industrial and automotive systems have all made large strides. These systems are increasingly cost effective and highly integrated, thanks to highly developed silicon technology [1, 2, 3, 4]. Basically, a sensor system uses transducers to translate the nonelectrical world into something more familiar to electrical engineers, for instance digital or analog signals. By interfacing nonelectrical properties with signals that can be processed by electronic devices, the sensor system can perform more functions than just sensing. For example, sensors can be used to build devices such as beams or diaphragms that exploit their mechanical properties [5]. Such devices, when controlled by externally applied voltages, can be used as actuators. Another example is continuously monitoring an object over an extended period of time. The sensed signal is translated into digital data that can be managed by a microcontroller, which can perform tasks such as compression and store the recorded data in the storage elements. One such example is a watchdog system that monitors the condition of perishables [6]. The deterioration of the food can be represented by chemical reactions whose progress is a function of the activation energy and the temperature integrated over time. The reaction process can be simulated with a simple CMOS circuit with digital outputs. In this way, the condition of the perishables can be monitored throughout their lifetime. Another example is commonly implemented in modern VLSI microprocessors, where a thermal sensor is used for hot-spot detection and thermal management [7].
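The perishables example can be made concrete with a small numerical sketch. It assumes that the deterioration rate follows the Arrhenius relation k = A exp(-Ea/(RT)) and accumulates that rate over a logged temperature history. This is a generic model, not the circuit of [6]; the prefactor, the activation energy and both temperature logs are hypothetical values chosen only for illustration.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def arrhenius_rate(temp_c, activation_energy=80e3, prefactor=1e12):
    """Reaction rate (1/hour) at temp_c degrees Celsius.

    activation_energy (J/mol) and prefactor (1/hour) are hypothetical.
    """
    temp_k = temp_c + 273.15
    return prefactor * math.exp(-activation_energy / (GAS_CONSTANT * temp_k))


def accumulated_deterioration(temps_c, step_hours):
    """Integrate the rate over equally spaced temperature samples."""
    return sum(arrhenius_rate(t) * step_hours for t in temps_c)


# Hypothetical 48-hour logs sampled once per hour.
cold_chain = [4.0] * 48                      # always refrigerated
warm_excursion = [4.0] * 40 + [25.0] * 8     # eight hours at room temperature

cold = accumulated_deterioration(cold_chain, step_hours=1.0)
warm = accumulated_deterioration(warm_excursion, step_hours=1.0)
print(f"refrigerated log:   {cold:.4f}")
print(f"with warm excursion: {warm:.4f}")
```

Because the rate depends exponentially on temperature, a short warm excursion can dominate the accumulated value, which is why the temperature has to be integrated over the whole history rather than sampled once.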

With RF modules, wireless sensor networks (WSNs) enable even broader applications such as environmental monitoring. Compact sensor nodes can be widely distributed to collect environmental data such as the temperature and humidity inside an ecosystem [8]. The lifetime of a wireless sensor network is limited by the energy consumption of the individual sensor nodes. Because the power sources have limited capacity, most components in a sensor node need to be turned off when not in use. A wakeup receiver that consumes less than 100µW was proposed to provide low standby power and to activate the main receiver upon request [9]. When the RF input power is reduced or when the wakeup sequence is shortened, the mean time between false alarms also decreases. This results in more frequent wakeup cycles than the sensor system needs. It was shown that by increasing the mean time between false alarms to more than 1018 seconds, -50dBm of sensitivity can be achieved with a 7-bit code. Compared to the -100dBm sensitivity of the main receiver, the power saving comes at the expense of a shorter range. Further discussion of wireless sensor networks is beyond the scope of this work; we will mostly focus on a single sensor system throughout the chapter.

The advancement of CMOS technology is the driving force behind high performance computing systems. A sensor system, although its throughput requirement is usually low, can also benefit from the higher density and smaller parasitic capacitances that come with device scaling. Fig. 1.1 illustrates a sensor monitoring system that integrates various functions into the same package. The components include a sensor/transducer, power source, controller, storage elements, timer and communication module. In the following sections, we discuss the role of each component in the monitoring system and the challenges that need to be addressed.

Figure 1.1: Illustration of the building blocks of a monitoring system. (Blocks shown: battery, sensor, storage element, controller, timer, communication.)

1.1.1 Sensors

Sensors, actuators and microelectronics form the backbone of a microelectromechanical system (MEMS). The underlying technology is the so-called micromachining process, which selectively etches the silicon wafer or adds layers to form mechanical or electrical devices. The pressure sensor, formed by creating a thin diaphragm, is an early and successful example of a MEMS sensor. Nowadays, sensors are not limited to mechanical devices. Thermal, optical, magnetic and chemical sensors have all seen promising developments [10, 11, 12, 13]. Typically, interface circuits such as analog amplifiers or analog-to-digital converters (ADCs) are used to convert the signal into formats that can be understood by the microelectronic devices. As an example, a microfabricated capacitive servo pressure sensor was demonstrated with integrated circuits [14]. The cavity between the diaphragm and the electrode is evacuated so that the pressure sensor detects absolute pressure. The idea is to use a capacitance-to-voltage converter to form a closed loop with the servo sensor and to balance the position of the silicon electrode. An amplified voltage corresponding to the absolute pressure is then generated automatically by the sensor.

Temperature sensing is another example of broad interest for all sorts of applications. Unlike other environmental parameters, temperature has a direct and measurable impact on the characteristics of integrated circuit components such as resistors and transistors. Therefore, one does not need a special MEMS process to design a temperature sensor that translates the nonelectrical property of temperature. For most applications, the ambient temperature varies slowly, so a conversion time of less than a hundred ms is sufficient. On the other hand, the thermal sensor for a VLSI microprocessor may require a conversion time of less than 1ms for effective thermal throttling.

1.1.2 Microcontroller

After the nonelectrical property is quantized into digital signals that can be understood by logic circuits, a microcontroller takes over control of the system. The microcontroller can be as simple as a finite state machine that performs fixed routines such as serializing data to the communication module, or as capable as a full-fledged general purpose processor that handles data manipulation and analysis. For sensor applications, the system throughput is generally lower than 1bps. As a result, the microcontroller spends most of its time idling yet still consumes leakage power because MOS switches are not ideal. Technology scaling results in smaller parasitics and thus less switching power. On the other hand, to sustain the performance gains across technology generations, the threshold voltage needs to be reduced as well, which leads to more subthreshold leakage [15].

Fig. 1.2 plots the energy consumption per instruction of a processor against its supply voltage. The graph shows that there is a specific supply voltage (Vmin) that produces the minimum energy per instruction for a fixed activity rate [16]. This is because when the voltage is high, the circuit can operate at a higher frequency and the total power is dominated by the dynamic power. On the other hand, running at a lower voltage means that the circuits spend more time leaking while the dynamic power remains the same. Usually Vmin is below the threshold voltage, so the operating frequency decreases rapidly as the supply voltage is reduced below Vmin. When the activity rate increases, the curve shifts upward, and the minimum energy voltage moves from Vmin1 to a lower value of Vmin2. The analysis of Vmin rests on the underlying assumption that the processor consumes zero power after the computation is completed, which is not realistic for a sensor system. In practice, the strength of the power gating transistors should be considered. A footer (or header) transistor provides a virtual rail when the circuit is in the active mode. When idling, it behaves like a high impedance path between the supply rails. There is a tradeoff in sizing the footer transistor: a weak footer transistor helps reduce leakage but hurts the performance and robustness of the system [17]. Process variation has to be considered when choosing the optimal transistor size. To sum up, operation in the subthreshold region and heavy power gating are both required to reduce the energy consumption of the microcontroller.
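The existence of a minimum energy voltage can be illustrated with a first-order sketch. It assumes a textbook-style subthreshold model, not the analysis of [16]: switching energy is taken as a·C·Vdd^2, and leakage energy as Ioff·Vdd·t with the instruction time t proportional to C·Vdd/Ion and Ion/Ioff proportional to exp(Vdd/(n·vT)). The normalized capacitance, the leakage weight, the slope factor and the search range are all hypothetical, and the model is only meaningful near and below the threshold voltage.

```python
import math

THERMAL_VOLTAGE = 0.026  # V, kT/q near room temperature


def energy_per_instruction(vdd, activity, c_eff=1.0, leak_weight=100.0,
                           slope_factor=1.5):
    """Normalized energy per instruction in a first-order subthreshold model.

    switching = activity * C * Vdd^2
    leakage   = leak_weight * C * Vdd^2 * exp(-Vdd / (n * vT))
    leak_weight lumps logic depth and idle gate count (hypothetical value).
    """
    n_vt = slope_factor * THERMAL_VOLTAGE
    switching = activity * c_eff * vdd ** 2
    leakage = leak_weight * c_eff * vdd ** 2 * math.exp(-vdd / n_vt)
    return switching + leakage


def minimum_energy_point(activity, v_low=0.15, v_high=0.60, step=0.001):
    """Grid search for the supply voltage with the lowest energy."""
    steps = int(round((v_high - v_low) / step))
    candidates = [v_low + i * step for i in range(steps + 1)]
    v_min = min(candidates, key=lambda v: energy_per_instruction(v, activity))
    return v_min, energy_per_instruction(v_min, activity)


for activity in (0.1, 0.2):
    v_min, e_min = minimum_energy_point(activity)
    print(f"activity={activity}: Vmin={v_min:.3f} V, energy={e_min:.4f} (normalized)")
```

In this toy model, raising the activity rate lifts the whole curve and moves the minimum to a lower supply voltage. The model also ignores the energy consumed while idle, which is exactly the assumption that power gating has to make good on in a real sensor system.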

1.1.3 Storage elements

Storage elements serve two purposes: recording the measurement data and providing the instruction routines. SRAM is commonly used as the storage element since it provides a good compromise between high density and low latency. While the same minimum energy analysis can be applied to SRAM devices, SRAM cells that are designed for nominal voltage operation do not work reliably below 700mV without modifications [18]. The fundamental problem is the loss of Ion/Ioff ratio when operating in the subthreshold region. The other problem is process variation, which reduces the static noise margin (SNM) of the cells. An early effort replaced the SRAM with a multiplexer based memory that works successfully at 180mV [19]. To optimize the read and write margins at the same time, a virtual rail can be used to selectively weaken the latch transistors during the write operation [20]. Another solution is to increase the number of transistors in the SRAM cell so that the cell sizing can be optimized individually for the read and the write operations [21, 22]. Using these techniques, the supply voltage can be lowered to less than 200mV. An ultra low standby power SRAM cell consuming 10.9fW was reported using stack-forcing and gate length biasing techniques, at the cost of bitcell area [17].

Figure 1.2: The relationship between supply voltage and energy consumption per instruction. (Axes: energy per instruction versus Vdd; the minimum energy points Vmin1 and Vmin2 are marked.)
An alternative for the storage elements is nonvolatile memory such as ROM, EPROM and FLASH. Such devices do not rely on a power source to retain the data and are therefore very energy efficient for infrequent operations. ROM is suitable for storing instruction routines because of its high density. However, writable memories are required for recording data. FLASH is widely used for mass storage in consumer electronics such as digital cameras and mobile phones. Generally, it requires a special floating gate process during fabrication and relies on a high programming voltage to provide the electric field needed to access the floating node [23]. A CMOS compatible FLASH was proposed with a 5V programming voltage and 1.2V for the read operation [24].

1.1.4 Communication module

There are two types of communication schemes that can be applied to the system: one in which each node can independently send data to and receive data from other nodes, and one that relies on a base station. The former is the concept of a wireless sensor network, which requires a reliable power source. The latter, on the other hand, can be remotely powered. Passive wireless nodes have been demonstrated with low data rates [25, 26, 27]. In these systems, the passive nodes harvest energy from the radio frequency (RF) input. In general, power and data are sent simultaneously. In radio frequency identification (RFID) terminology, the base station is called an interrogator or reader and the device that responds to the request is called a transponder. Typical ranges for passive RFID devices are from less than 1cm to a few meters. Long range batteryless wireless telemetry has been reported with up to 18 meters of distance [28, 29]. To harvest enough energy for signal transmission, large capacitors are used to store the charge.

Active communication usually adopts a so-called scheduled rendezvous scheme that only periodically wakes up the transceiver to perform communication [30]. The power saving is a strong function of the required responsiveness of the sensor node, which is highly application dependent. As an alternative, the aforementioned wakeup receiver, which consumes less power when it is inactive, can be used. With -50dBm RF input sensitivity, a wakeup detector is able to operate with 100nA of standby leakage and only a few mV of activation voltage [31]. A hybrid scheme can be implemented so that the sensors only turn on their low power wakeup detector every t seconds (where t is a design parameter). This scheme further reduces power consumption at the expense of response delay.
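The tradeoff in the hybrid scheme can be sketched with simple duty-cycle arithmetic. The sketch assumes the detector listens for a fixed window once every t seconds and is power gated otherwise; the power levels and the listening window are hypothetical numbers, not values from [31].

```python
def average_power(p_on_w, p_off_w, on_time_s, period_s):
    """Time-averaged power of a block that is on for on_time_s every period_s."""
    if not 0.0 < on_time_s <= period_s:
        raise ValueError("on_time_s must be in (0, period_s]")
    duty = on_time_s / period_s
    return duty * p_on_w + (1.0 - duty) * p_off_w


def worst_case_response_delay(on_time_s, period_s):
    """A request arriving just after a window closes waits for the next one."""
    return period_s - on_time_s


# Hypothetical wakeup detector figures, for illustration only.
DETECTOR_ON_W = 100e-9    # power while listening
DETECTOR_OFF_W = 1e-9     # residual leakage while gated
LISTEN_WINDOW_S = 0.01    # length of each listening window

for period_s in (0.1, 1.0, 10.0):
    power = average_power(DETECTOR_ON_W, DETECTOR_OFF_W, LISTEN_WINDOW_S, period_s)
    delay = worst_case_response_delay(LISTEN_WINDOW_S, period_s)
    print(f"t={period_s:5.1f} s: average power={power * 1e9:.2f} nW, "
          f"worst-case delay={delay:.2f} s")
```

Lengthening t lowers the average power toward the gated leakage floor while the worst-case response delay grows almost linearly with t, which is the tradeoff described above.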

1.1.5 Timer

Time keeping is essential to some sensor systems since the recorded content can be highly time dependent. For example, a doctor may choose the proper treatment by knowing the temperature variation of the patient over the past 48 hours. In another example, a scientist may be interested in the humidity of a forest during a particular time of day to study its impact on the ecosystem. The timer has to be able to adapt to different weather conditions such as dramatic changes in temperature. Another function of the timer is to track the sleep time of the system when power gating is applied. The precision requirement of the timer is highly application dependent. For medical applications the temperature variation is small, whereas it can be dramatically larger in automotive systems. Generally, power consumption is the biggest challenge for the timer since it is the only active block while the other parts of the system are heavily power gated. The power consumption of the timer should not dominate the sleep power of the system.

1.1.6 Power source

Energy scavenging and batteries are two potential power sources for sensor systems. Energy scavenging is the process by which energy is captured and stored. There are a variety of scavenging sources such as solar power, thermal energy, vibration energy or even human power [32, 33, 34, 35, 36]. Table 1.1 summarizes the power density of various power generation sources [37]. For a lifespan of 10 years, the power density of the energy scavenging sources outperforms that of lithium batteries. Among them, vibration is potentially the most favorable mechanism because of its abundance. One way of exploiting it is to use piezoelectric materials, which produce an electric field when deformed by external forces. Other methods such as magnetic and electrical transducers can also be used to harvest vibration energy. Vibration induced by the traffic on a bridge or by wind, for example, is a reasonable power source for structural health sensor nodes [38]. However, considering the inconsistent nature of the mechanism, such a power source is not reliable enough to guarantee the operation of the system.

Table 1.1: Summary of power generation sources.
Source                   Power density (µW/cm3)
Solar (outdoors)         15,000 (direct sun); 150 (cloudy day)
Solar (indoors)          6 (office desk)
Vibrations               200
Acoustic noise           0.003 @ 75dB; 0.96 @ 100dB
Daily temp. variation    10
Temperature gradient     15 @ 10 °C gradient
Batteries are able to supply a constant current until their lifetime is over. The lifetime
of a battery depends mostly on its form factor and chemistry. Table 1.2 compares
several commercial miniature batteries that are potential candidates for the sensor
system. 4A and CR-1025 batteries are commonly used in small electrical devices such
as watches and toys. Although their charge density is high, they are not compatible
with microfabrication processes. Power Paper [39] (an ink-based technique) and Cymbet
[40] (a thin-film battery) are advantageous in terms of size because the thickness of the
battery can be less than 1 mm. Taking Cymbet as an example, a 1 mm by 1 mm by 25 µm
battery is able to provide a capacity of roughly 10 µAh. The lifetime of the sensor
system can be calculated from its power consumption and the capacity of the
power source. A lifetime of one year means that the whole system can consume only
about 1 nA of average current when directly supplied by the aforementioned Cymbet battery.
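The lifetime estimate above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the cell is assumed to be an ideal 1 mm by 1 mm by 25 µm volume with the 0.40 mAh/mm³ charge density listed for the thin-film battery, and self-discharge and regulator losses are ignored.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 h, ignoring leap years


def capacity_uah(length_mm: float, width_mm: float, thickness_mm: float,
                 charge_density_mah_per_mm3: float) -> float:
    """Capacity in microamp-hours of a rectangular cell (hypothetical helper)."""
    volume_mm3 = length_mm * width_mm * thickness_mm
    return volume_mm3 * charge_density_mah_per_mm3 * 1000.0  # mAh -> uAh


def average_current_na(cap_uah: float, lifetime_years: float) -> float:
    """Average current budget in nanoamps for a target lifetime."""
    return cap_uah / (lifetime_years * HOURS_PER_YEAR) * 1000.0  # uA -> nA


cap = capacity_uah(1.0, 1.0, 0.025, 0.40)   # 1 mm x 1 mm x 25 um cell
budget = average_current_na(cap, 1.0)       # one-year lifetime
print(f"capacity = {cap:.1f} uAh, one-year budget = {budget:.2f} nA")
```

Under these assumptions the cell stores about 10 µAh, which corresponds to an average draw of a little over 1 nA for a one-year lifetime.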

Table 1.2: Comparison between small batteries.
Product       Nominal voltage   Capacity   Size          Charge density
4A battery    1.5 V             625 mAh    2298.0 mm³    0.27 mAh/mm³
CR-1025       3.0 V             30 mAh     196.25 mm³    0.15 mAh/mm³
Power paper   1.5 V             30 mAh     1064.7 mm³    0.03 mAh/mm³
Cymbet        3.6 V             N/A        N/A           0.40 mAh/mm³

1.2 Low power sensor system

In this dissertation, we target a sensor system that is limited mostly by its form
factor. Applications such as implantable or non-intrusive systems may find a 1 mm³
form factor attractive. One good example is the intraocular pressure monitoring system
shown in Fig. 1.3. Eye pressure is closely related to eye diseases
such as glaucoma. When the intraocular pressure (IOP) increases, it can cause
malfunction of the eye's drainage structure. If left untreated, it will eventually damage
the optic nerve and result in permanent vision loss. The rise in pressure inside the
eye is due to an imbalance between the drainage and production of fluid [41]. Fluid
continuously enters the eye but cannot be drained because the drainage channels do not
function properly. An IOP higher than 22 mmHg is considered suspicious
and possibly abnormal [42, 43]. The traditional way to examine eye pressure is with a
tonometer. An applanation tonometer measures eye pressure by the force required to
flatten a constant area of the cornea. It is considered the most accurate way of
measuring eye pressure. Other methods, such as the air puff test or transpalpebral
tonometry, do not require direct contact with the cornea and are less accurate than
applanation tonometers. The disadvantage of these measurements is that the
patient has to visit the clinic frequently in order to take them.

In addition, according to [44], measurements taken at the clinic do not reflect
the peak pressure or the pressure variation. Large fluctuation in diurnal IOP is a
risk factor for open-angle glaucoma. High IOP on awakening is reported in many

Figure 1.3: Intraocular pressure monitoring system.

publications [45, 46, 47]. Therefore, a measurement that is not taken in the
morning is very likely to miss the peak eye pressure. The average IOP is similar
at different times of the day, but the peak IOP is about 5 mmHg higher on awakening
than at other times. In addition, Goldmann tonometry is performed in the sitting position,
whereas IOP is reported to be higher in the supine position [48]. This suggests that
portable tonometry or self-tonometry is advantageous over the traditional methods.
In [49], a non-invasive self-tonometry device is reported and demonstrated
to measure the IOP in a pig's eye.

In order to continuously monitor the intraocular pressure, an implantable pressure
sensor is the preferable solution. A passive resonant sensor uses an
inductance-capacitance oscillating circuit: the resonant frequency of the sensor is
detected and used to determine the absolute pressure from the capacitance value
[50]. Another way to implement a passive sensor is to design the device so that the
pressure is transformed into a change in its electromagnetic properties [2].
The signal is then coupled to an external device through a coil and can be digitized and
stored in memory. The pressure sensor can also be implemented with surface
micromachining [51]. In [52], the authors summarize the development of nontelemetry
intraocular sensors to date, with implant sizes ranging from 1.1 mm to 11.5 mm
in diameter.
In [53], radio-frequency (RF) transmission is used to send the signal from the
transponder to an external telemetric component. This system, however, still requires
an external processing unit such as an A/D converter or network analyzer to monitor the
processor in real time. A full system demonstration of an intraocular sensor was reported
in [54], with an on-chip micromechanical pressure sensor, a microcontroller, the readout
circuits and an RF transponder. Another readout method for intraocular applications
is to use a coil in parallel with the capacitive sensor [55]. This LC resonant circuit
converts the pressure into a shift of the resonance frequency. A VCO is then used to
excite the sensor over a frequency range and to detect the resonant frequency of the
internal sensor.

For a monitoring system on the order of mm³, the power source, whether
energy scavenging or a microfabricated battery, is the limiting factor. As discussed
in the previous section, less than 1 nA of average current consumption is required
for a year of lifetime based on the capacity of the battery. To operate within such a
tight power budget, the system has to adopt aggressive power gating when it is not
actively monitoring. Fig. 1.4 shows such a monitoring system, designed with the goal
of minimizing the total power. The system relies on a battery to supply
power to all the components except the wireless module. The wireless
module should harvest AC power directly from the RF input. The voltage regulator
downconverts the voltage level from a typical battery output to the energy minimum

Figure 1.4: Monitoring system considering power gating scheme.

voltage level. In this way, the dynamic power is reduced quadratically with the
supply voltage. With a switched-capacitor DC-DC converter, the current efficiency can
be higher than 100% [56].
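To make the benefit of voltage scaling concrete, the following sketch evaluates the quadratic dependence of dynamic power on supply voltage, together with the current gain of an idealized switched-capacitor step-down converter. It assumes the usual P = C·V²·f model at fixed capacitance and frequency and a lossless N:1 converter; the function names and the 3.6 V to 300 mV example are illustrative choices, not measured results.

```python
def dynamic_power_ratio(v_low: float, v_high: float) -> float:
    """Dynamic power at v_low relative to v_high, assuming P = C * V**2 * f
    with the same switched capacitance C and frequency f at both voltages."""
    return (v_low / v_high) ** 2


def ideal_sc_current_efficiency(n: int) -> float:
    """Current efficiency (I_out / I_in, in percent) of an ideal, lossless
    N:1 switched-capacitor step-down converter, for which I_out = N * I_in."""
    return 100.0 * n


ratio = dynamic_power_ratio(0.3, 3.6)   # 300 mV logic supplied from a 3.6 V cell
print(f"dynamic power scales by {ratio:.4f}")
print(f"ideal 2:1 converter: {ideal_sc_current_efficiency(2):.0f}% current efficiency")
```

In this idealized picture the current efficiency exceeds 100% simply because the output voltage is lower than the input voltage; power efficiency is still bounded by 100%.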

In the active mode, the CPU reads its routines from the ROM, compresses
the digitized data from the sensor, and writes it to the data memory. The storage units
are partially retentive so that the data is not lost in the sleep mode. After the
computation is completed, the CPU sends a request to the power controller before entering
the sleep mode. The power controller then takes over and switches off the footer/header
transistors of the non-retentive blocks. At the same time, a START signal is
sent to the timer to start counting the time spent in the sleep mode. During
the sleep mode, a retention memory keeps the recorded data. After a given
number of cycles, the timer sends an expire signal to the power controller, which returns
the control to the CPU again.
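The handshake between the CPU, the power controller and the timer can be summarized as a small behavioral model. The sketch below is only a toy illustration of the sequence described above; the class, method and message names are hypothetical and do not correspond to signals in the actual design.

```python
from dataclasses import dataclass, field


@dataclass
class SensorNode:
    """Toy model of the sleep/wake handshake; all names are illustrative."""
    sleep_cycles: int                              # timer ticks spent asleep
    gated_blocks_on: bool = True                   # CPU, ROM and other gated blocks
    retention: list = field(default_factory=list)  # data that survives sleep
    log: list = field(default_factory=list)

    def active_phase(self, sample: int) -> None:
        # The CPU stores the sample in retention memory, then asks the
        # power controller to put the system to sleep.
        self.retention.append(sample)
        self.log.append("cpu: request sleep")
        self.sleep_phase()

    def sleep_phase(self) -> None:
        # The power controller gates the non-retentive blocks and starts the timer.
        self.gated_blocks_on = False
        self.log.append("ctrl: gate blocks, START timer")
        for _ in range(self.sleep_cycles):
            pass  # only the timer, controller and retention memory stay powered
        # The timer expires; the controller restores power and wakes the CPU.
        self.log.append("timer: expire")
        self.gated_blocks_on = True
        self.log.append("ctrl: wake cpu")


node = SensorNode(sleep_cycles=3)
for reading in (17, 18):
    node.active_phase(reading)
print(node.retention)
print(node.log[:4])
```

The point of the model is that the retained samples accumulate across sleep periods while everything else is switched off between wake-ups.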

Recently, many research efforts have focused on ultra low power design for
digital logic and memories. Energy consumption in the single-digit pJ range has been
reported for the processor as well as for SRAM [57, 20]. On the other hand, the power
consumption of peripheral circuits such as the timer has received little attention.
In this dissertation, we discuss the design issues of peripheral circuits under
stringent power constraints and propose our solutions.

Crystal oscillators are widely used in digital systems because they provide excellent
insensitivity to process, supply and temperature variations. The package dimension,
however, is at least a few mm³ for discrete crystal oscillators. With a Colpitts crystal
oscillator and a cascode-connected common-base buffer amplifier, an integrated crystal
oscillator can be produced with low frequency sensitivity [58]. However, the total area
is still on the order of 1 mm² and the current consumption is close to 1 mA. As an
alternative, a current controlled one-shot timer was proposed to provide a steady output
frequency with a circuit that combines a Schmitt trigger and a charge pump [59, 60].
For a given period of time, the charge pump provides a fixed amount of charge to the
load capacitor, and the output eventually flips when the voltage level exceeds the
transition point of the Schmitt trigger. In this design, the current source has the most
impact on the output frequency variation. Implementing a steady low current source
is challenging given that 1 nA is the total system current budget. A ring oscillator
based circuit can also be used to generate the clock signal with low hardware overhead
[61]. For MOS transistors, the drain current decreases at higher temperatures, mainly
due to the degradation of electron mobility. On the other hand, the drain current
increases with temperature in the subthreshold region, where the
threshold voltage becomes the dominating factor. Therefore, by operating
the ring oscillator at a properly chosen voltage, it can be made temperature insensitive
as well. However, the temperature coefficient is highly sensitive to voltage and process
variations.

In Chap. II, two timer designs that are suitable for sub-nA operation are
presented. The first one uses the gate leakage of a MOS transistor as the current
source for the timer. Gate leakage is relatively insensitive to temperature compared
to other current sources in CMOS technologies. In addition, it provides a large
time constant, which is ideal for reducing switching activity, at negligible area
cost. The second timer design generates a temperature insensitive current by
forcing a constant voltage across a resistor. The same current flows into a reference
transistor, so that its current also becomes temperature insensitive. To further reduce
the power consumption, a program-and-hold scheme is implemented to store the bias
voltage on a capacitor while the biasing circuit is turned off. The current is reduced
by mirroring it to a transistor that is 200X smaller than the reference one.
An ultra low power temperature sensor is shown in Chap. III. The temperature
sensor accounts for a large portion of the total leakage when the system is remotely
powered, and the transmitting distance of such a device is closely tied to its power
consumption. In this work, a PTAT (proportional to absolute temperature) current
source and a temperature insensitive current source are implemented. After
current-to-frequency conversion, a digital counter is used to generate the temperature
reading. Since both current sources are defined by reference resistors, the power
consumption can be traded off against the size of the resistor.

Chap. V presents a chip-to-chip proximity communication design for reading out
the data from a sensor node. In certain situations, the reader can be brought into close
proximity to the sensor node, so that a strong field is not needed for communication.
Capacitive coupling is suitable for such applications, where it has the advantage of
low hardware overhead. Capacitive coupling was first proposed to ease I/O
communication between chips and was implemented through a substrate trace [62].
A face-to-face communication scheme was proposed to allow more channels and thus
higher bandwidth [63, 64]. One obvious advantage is the power reduction and
performance improvement due to the absence of the electrostatic discharge (ESD) protection
devices that are commonly used in wired communication. It is also suitable for passive
communication since the transmitting frequency is not limited by the resonant
frequency of a passive device. However, the major concern with the capacitive coupling
scheme is misalignment of the chips. Since the coupling capacitance is inversely
proportional to the distance between the pads, the chip alignment has a strong impact
on the signal strength at the receiver. An alignment independent method is proposed
in this work. The transmitting pads are divided into smaller microplates, and each
microplate can be reconfigured to either transmit power or receive data depending on
its location. The solution is demonstrated by randomly dropping the sensor chip on the
so-called data retrieval chip, with less than 15% of achievable data rate.
For sensor-type applications, passive RFID is a potential solution when
the transmitting distance is less than 10 m [65]. However, such a system usually requires
a bulky external coil so that enough energy can be harvested by the transponder. On
the other hand, there are applications like intraocular pressure sensing that require
only a few mm of distance between the reader and the transponder. Inductive
coupling is well suited to this type of application. In Chap. VI, a pulse signaling
based scheme is proposed to provide more robust transmission than the traditional
backscattering scheme. Pulse signaling is widely used in ultra wide band (UWB)
communications [66]. It is also used in high performance proximity communication
with inductive coupling [67]. In this work, a PLL is used to lock onto the incoming
RF frequency, which is excited at the resonant frequency of the reader. A short gap
between continuous waves is created so that the transponder can use it to send
pulses back to the reader at the frequency previously acquired by the PLL.
As a result, the signal-to-noise ratio (SNR) can be greatly improved, and filtering of
the strong interference from the reader's local resonant clock is not required. The

Table 1.3: Summary of the contributions of the works.
Chap.  Title                          Contributions                                  Ref.
II     Gate leakage based timer       Sub-pW power consumption at 300 mV.            [68]
II     Program-and-hold timer         Lower temperature dependency and supply
                                      sensitivity at comparable power consumption
                                      compared to [68].
III    Low power temperature sensor   Lowest published power consumption.            [69]
IV     Static subthreshold to I/O     Single stage design with consistent 5 FO4      [70]
       voltage level shifter          delay for 300 mV to 2.5 V level conversion.
V      Alignment independent          First alignment independent capacitive         [71]
       capacitive signaling           coupling test chip with simultaneous power
                                      and data transmission.
VI     Inductive coupling using       Pulse signaling and PLL phase-locking scheme
       pulse signaling                that automatically acquires the resonant
                                      frequency for near field inductive coupling
                                      data transmission.

measured chip demonstrates a communication distance of 1.1 mm with 1 mm × 1 mm
integrated inductors for both the reader and the transponder.

Chap. VII concludes the contributions of this dissertation, which are also
summarized in Table 1.3. Future directions based on this work are also discussed.

CHAPTER II

ULTRA LOW POWER TIMER DESIGN FOR

SENSOR APPLICATIONS

2.1 Introduction

In this chapter, the designs of ultra-low power timers are presented. To reduce
the power consumption and extend the lifetime of a sensor system, both the active
power and the idle power are crucial. For example, Fig. 2.1 illustrates the energy
consumption of a sensor system during its lifetime. The system actively performs
tasks during only a short period of time. Previous work shows that an energy Emin of
a few pJ per instruction can be achieved by operating at a subthreshold voltage Vmin
[57]. Although the voltage Vmin optimizes the energy consumption during the active mode,
the system spends most of its time idling. The idle energy can be greatly reduced
by power gating. As shown in [17], however, strong power gating requires
weak footer or header transistors, which also reduce the performance and robustness
of the circuits. Moreover, not every component can be turned off during the idle
time of the sensor system. A timer that keeps track of the time in the sleep mode is
one example. The power consumption of the timer should not dominate that
of the other circuits; otherwise, the power reduction from power gating
is largely lost.
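A rough energy budget shows why the always-on timer matters. The sketch below compares the energy spent on computation in one wake-up with the energy the timer consumes during the following sleep interval. All numbers are illustrative assumptions (1000 instructions per wake-up at 3 pJ each, a 10-minute sleep interval, and two candidate timer powers); only the structure of the calculation follows the discussion above.

```python
def timer_energy_share(n_instr: int, e_instr_j: float,
                       p_timer_w: float, t_sleep_s: float) -> float:
    """Fraction of one active/sleep period's energy consumed by the timer."""
    e_active = n_instr * e_instr_j      # energy of the active burst
    e_timer = p_timer_w * t_sleep_s     # energy of the always-on timer
    return e_timer / (e_active + e_timer)


# Hypothetical workload: 1000 instructions at 3 pJ, then 600 s of sleep.
for p_timer in (1e-12, 1e-9):           # 1 pW and 1 nW timers
    share = timer_energy_share(1000, 3e-12, p_timer, 600.0)
    print(f"timer power {p_timer:.0e} W -> {100 * share:.1f}% of the energy")
```

With these assumed numbers, a nW-class timer would account for almost all of the energy, whereas a pW-class timer would leave the active burst as the dominant term.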

The crystal oscillator is widely used as a frequency reference due to its accurate
value and insensitivity to temperature and supply variations. Although it typically
requires a bulky external component, it can also be implemented with a Colpitts oscillating
Figure 2.1: Illustration of the lifetime of a sensor system.

circuit on-chip [58]. However, the frequency produced by the crystal oscillator is
orders of magnitude higher than what we need, and the resulting power consumption
is unacceptable. As an alternative, a current controlled one-shot timer has
been proposed to provide a steady output frequency with a circuit that combines a
comparator and a charge pump [59, 60]. For a given time, the charge pump provides
a fixed amount of charge to the load capacitance, and the output eventually flips
when the voltage level exceeds the transition point of the Schmitt trigger. The design
of the current source has a direct impact on the frequency sensitivity of the circuit.
While the supply voltage of the sensor system is preferably in the subthreshold
region to minimize the active energy, designing a current source that is insensitive to
temperature and voltage and also small in magnitude is not trivial. In a standard CMOS
process, several leakage sources are available options to provide a reasonably large time
constant for the timer. The subthreshold leakage is well studied but unfortunately has
an exponential dependency on temperature. The gate leakage is relatively insensitive
to temperature; however, it can vary by several orders of magnitude among different
processes.

The remainder of the chapter is organized as follows. We first show the
design of a timer using gate leakage as the current source in Sec. 2.2. In Sec. 2.3, a
timer with a self temperature compensated current source is presented. It relies
on a charge holding technique to save power during the active mode. We
compare the proposed timers and summarize our work in Sec. 2.4.

2.2 A Sub-pW gate leakage timer

2.2.1 Circuit Design for the timer

Fig. 2.2(a) shows the typical implementation of the previously mentioned one-shot
oscillator design [59, 60]. The output of the oscillator is decided by the voltage Vin.
The waveform shown in Fig. 2.2(b) illustrates the operation of the circuit. When
Vout is 0, Vin is charged toward the supply voltage by current source I1. When Vin
surpasses Vb1, the comparator flips the output of the oscillator, and Vin
starts to be discharged by I2 instead until it falls below Vb2. Assuming that both I1
and I2 are equal to Ion, each ramp takes Ct(Vb1 − Vb2)/Ion, and the frequency of the
timer can be written as

\[ f = \frac{I_{on}}{2 C_t (V_{b1} - V_{b2})} \tag{2.1} \]

which is independent of the supply voltage.
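The behavior of the ideal oscillator can be checked with a simple time-stepped model of the ramp. The sketch below is a minimal illustration that assumes ideal current sources and an ideal comparator, so that each ramp between Vb2 and Vb1 takes Ct(Vb1 − Vb2)/Ion; the function names and component values (1 pA, 1 pF, thresholds of 100 mV and 200 mV) are hypothetical.

```python
def one_shot_period(i_on: float, c_t: float, v_b1: float, v_b2: float) -> float:
    """Ideal period: one ramp up plus one ramp down between v_b2 and v_b1."""
    return 2.0 * c_t * (v_b1 - v_b2) / i_on


def simulate_period(i_on: float, c_t: float, v_b1: float, v_b2: float,
                    dt: float) -> float:
    """Step the capacitor voltage in time and return the duration of one cycle."""
    v, t, charging, flips = v_b2, 0.0, True, 0
    while flips < 2:
        v += (i_on if charging else -i_on) * dt / c_t
        t += dt
        if charging and v >= v_b1:          # upper threshold reached: discharge
            charging, flips = False, flips + 1
        elif not charging and v <= v_b2:    # lower threshold reached: charge
            charging, flips = True, flips + 1
    return t


ideal = one_shot_period(1e-12, 1e-12, 0.2, 0.1)
simulated = simulate_period(1e-12, 1e-12, 0.2, 0.1, dt=1e-4)
print(f"ideal period = {ideal:.4f} s, simulated = {simulated:.4f} s")
```

Note that the supply voltage does not appear anywhere in the model: with ideal sources, the period depends only on the current, the capacitance and the threshold spacing.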

Figure 2.2: The concept of a one-shot oscillator. (a) The circuit diagram. (b) The operation
waveform.

In practice, the current sources are susceptible to bias conditions and temperature
if not carefully designed. There are many publications on CMOS
temperature-compensated current sources; however, none are targeted at ultra low power
applications [72, 73]. To reduce the overall power consumption of the timer, the circuit
needs to be biased in the subthreshold region, further reducing the headroom for the
current source. Implementing the voltage references is another challenge, since the
voltages Vb1 and Vb2 should also be independent of operating conditions such as temperature
and supply voltage. Although a bandgap reference provides a reference voltage with great
accuracy, it does not comply with the stringent power budget that will be
enforced on the timer. The goal of the timer is to awaken the processor just in time.
Therefore, operating at a frequency higher than a very low value, for example one in the
sub-Hz to 10 Hz range, should be avoided as it only wastes energy. This means
that either the load capacitor Ct has to be very large or the current source should
generate very little current in order to achieve a large RC time constant.
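The trade-off between capacitor size and charging current can be quantified by inverting the ideal ramp relation f = Ion / (2·Ct·ΔV), where ΔV is the spacing between the two thresholds. The following sketch is illustrative only; the 100 mV threshold spacing and the example targets are assumptions rather than values from the design.

```python
def required_current_a(freq_hz: float, c_t_f: float, delta_v: float) -> float:
    """Charging current needed for a target frequency with a given capacitor."""
    return 2.0 * c_t_f * delta_v * freq_hz


def required_capacitance_f(freq_hz: float, i_on_a: float, delta_v: float) -> float:
    """Load capacitance needed for a target frequency with a given current."""
    return i_on_a / (2.0 * delta_v * freq_hz)


i_needed = required_current_a(0.1, 1e-12, 0.1)      # 0.1 Hz with a 1 pF load
c_needed = required_capacitance_f(0.1, 1e-9, 0.1)   # 0.1 Hz with a 1 nA source
print(f"1 pF load needs {i_needed * 1e15:.0f} fA; "
      f"1 nA source needs {c_needed * 1e9:.0f} nF")
```

Under these assumptions, a nA-level source would call for a load of tens of nF, which is impractical on chip, whereas a pF-level load calls for a current of only tens of fA.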

To address the aforementioned challenges, a circuit that takes
advantage of the gate leakage of MOS transistors is proposed to replace the current
sources I1 and I2 shown in Fig. 2.2(a). As CMOS technology scales, the gate oxide
thickness continues to shrink to maintain good channel control and drive current
at reduced channel lengths and supply voltages. Therefore, tunneling-based gate
leakage has become a non-negligible leakage source alongside subthreshold
leakage, especially at 65 nm and below. Gate leakage is the sum of several different
tunneling currents, such as electron tunneling from the conduction band (ECB),
electron tunneling from the valence band (EVB), and hole tunneling from the
valence band (HVB). In general, the gate current density has the following form [74]

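\[ J_g = A \cdot T_{oxratio} \cdot \frac{V_g V_{aux}}{t_{ox}^2} \cdot \exp\left[-B(\alpha - \beta|V_{ox}|)(1 + \gamma|V_{ox}|)\,t_{ox}\right] \tag{2.2} \]

where $A = q^2/8\pi h\phi_b$, $B = 8\pi\sqrt{2 q m_{ox}}\,\phi_b^{3/2}/3h$, $m_{ox}$ is the effective carrier mass in the
oxide, $\phi_b$ the tunneling barrier height, $t_{ox}$ the oxide thickness, and $V_{aux}$ is a fitting
function of the tunneling carrier density and available states. $V_{aux}$ is a weak function
of temperature and has the following form

\[ V_{aux} = NIGC \cdot v_t \cdot \log\left[1 + \exp\left(\frac{V_{gs\_eff} - V_{th0}}{NIGC \cdot v_t}\right)\right] \tag{2.3} \]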

The typical temperature sensitivity of gate leakage is about 10% per 10 °C, which is much
lower than that of subthreshold leakage or junction leakage. For the timer
application, gate leakage is also advantageous because of its small magnitude compared
to a transistor's saturation current (e.g., in 0.13 µm CMOS a typical gate leakage is on
the order of 10s of pA/µm [75]). The benefit of the small magnitude is two-fold.
First, the static current used to charge and discharge the load capacitance is
small. Second, the large time constant helps reduce the switching of the clock network in
the subsequent digital circuits (e.g., a digital counter).

For simplicity, the comparator function is implemented with a CMOS
Schmitt trigger. The hysteresis of a Schmitt trigger is often used to suppress
signal noise [76]. The low-to-high transition voltage VM+ and the high-to-low transition
voltage VM− are defined as the two crossover points in the voltage transfer characteristic,
i.e., the points where the input voltage Vin equals the output voltage Vout. In this work,
VM+ and VM− are the equivalents of Vb1 and Vb2 in Fig. 2.2, respectively. Since VM+ and VM−
are set by the circuit itself, there is no need to generate extra bias voltages.

Based on the previous discussion, our proposed timer is shown in Fig. 2.3. The

Schmitt trigger inverter contains transistors MS1 through MS6. When operating at

Figure 2.3: Proposed timer structure for subthreshold operation

superthreshold voltages, VM+ can be determined when MS1, MS2 and MS5 are all in
saturation. Considering I_MS1 = I_MS2 + I_MS5, and assuming that the channel length
modulation effect is negligible, VM+ can be found as a function of Vth and Vx,tran

\[ V_{M+} = \frac{\sqrt{k_2 + k_5}}{\sqrt{k_2 + k_5} - \sqrt{k_1}}\, V_{x,tran} + V_{th} \tag{2.4} \]

where Vx,tran is the voltage Vx in Fig. 2.3 when VM+ occurs. From simulation results,
we know that Vx,tran is nearly constant when temperature varies. Therefore, the
temperature impacts VM+ in the same way it affects Vth. The transition voltage of
the Schmitt trigger decreases as the temperature rises. VM− can be computed in the
same way. Our simulation results show that at 300 mV, ΔVM = VM+ − VM− reduces by
0.1%/°C due to the lower on-off current ratio in the subthreshold region. These results
show that the temperature dependency is adequate when the Schmitt trigger is used in
place of the comparator and voltage references.

INV1 and INV2 are inverters that provide a sharper transition for the timer. To
reduce the leakage power, they are stack-forced and sized with long channel lengths.
The clock output is buffered from the load by a tri-state inverter TINV to
isolate any noise coming from other parts of the system. MC1 and
MC2 are thin oxide MOS transistors that serve as the charging and discharging
devices. Both PMOS and NMOS transistors are used to provide comparable
charging/discharging strength. The load capacitance ML1 is implemented with a thick
gate oxide transistor, which is commonly available in modern CMOS processes, to
avoid unwanted gate leakage to ground. The corresponding waveforms of Vin and Vout
are similar to those illustrated in Fig. 2.2(b). When Vout is pulled up, Vinv is pulled
down to discharge Vin through both MC1 and MC2 until Vin is lower than VM− of
the Schmitt trigger, and vice versa. Vclk goes to a digital counter that is configurable
by the system. Depending on the application, the number of timer ticks can be used to
set the time between active modes.

Figure 2.4: Power consumption vs. supply voltage at different temperature points

2.2.2 Measurement results

The test chip was implemented in a commercial 0.13 µm digital CMOS process. The
total circuit area is approximately 480 µm², of which half is allocated to the
load capacitor ML1. Fig. 2.4 plots the power consumption of the timer as a function of
both the supply voltage and temperature. At 300 mV, the power consumption of the

Figure 2.5: Timer period vs. temperature at various supply voltages.

timer is less than 1 pW at 20 °C, and it consumes roughly 2 nW at 600 mV, as measured
by a Keithley 6217 electrometer.

Fig. 2.5 shows the timer output period measured at different supply voltages and
temperatures. The timer is less temperature sensitive at higher supply voltages,
largely because the impact of ΔVM is minimized in the superthreshold
region. Similarly, the variation due to supply voltage is also reduced at higher Vdd
for the same reason. The measured temperature sensitivity is 0.16%/°C at 600 mV
and 0.6%/°C at 300 mV; the supply sensitivity is 0.15%/mV from 300 mV to 500 mV and
0.04%/mV at 600 mV, the lower figure at 600 mV being due to operation in the
superthreshold region. For in-tissue biomedical sensor-type applications, the temperature
normally does not deviate by more than a couple of degrees, and the temperature
sensitivity is adequate. Also, for a system containing a temperature sensor, updated
temperature information can be obtained when the processor wakes up, and the number of
sleep clock cycles can be adjusted the next time the system goes to sleep.
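The recalibration described above amounts to rescaling the tick count by the expected change in timer period. A minimal sketch follows, assuming a linear temperature model around a reference point; the 15 s reference period, the 20 °C reference temperature, and the sign of the coefficient (period shrinking as temperature rises) are assumptions, while the 0.6%/°C magnitude is the sensitivity measured at 300 mV.

```python
def period_at(temp_c: float, period_ref_s: float, temp_ref_c: float,
              sens_per_c: float) -> float:
    """Timer period at temp_c under a linear model around the reference point."""
    return period_ref_s * (1.0 + sens_per_c * (temp_c - temp_ref_c))


def ticks_for_sleep(sleep_s: float, period_s: float) -> int:
    """Number of timer ticks that best approximates the desired sleep time."""
    return max(1, round(sleep_s / period_s))


SENS = -0.006                      # assumed sign: -0.6 %/C at 300 mV
for temp in (20.0, 25.0):
    period = period_at(temp, 15.0, 20.0, SENS)
    print(f"{temp:.0f} C: period {period:.2f} s, "
          f"{ticks_for_sleep(3600.0, period)} ticks per hour of sleep")
```

In a real system the processor would read the temperature sensor on wake-up and preload the counter with the recomputed tick count before going back to sleep.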

Figure 2.6: Output period scatter plot highlighting die-to-die and within-die variations.

Within-die and die-to-die process variation is a significant concern in advanced
VLSI technologies. We measured ten timers (2x5) on each of 25 dies and plot
the output period in Fig. 2.6. To characterize die-to-die variation, we first compute
the mean for each die and obtain σ/µ across all 25 dies. Die-to-die variation is 28%
and 27% at supply voltages of 300 mV and 600 mV, respectively. Within-die variation is
obtained by taking the average of σ/µ within each individual die and is measured at 12.4%
and 9.2% for 300 mV and 600 mV. Key sources of variation include oxide thickness
variation and the voltage shift of the Schmitt trigger trip points VM+ and VM− due
to transistor mismatch. In general, the variation can be calibrated by adjusting the
aforementioned counter. The processor can easily configure the number of counts
between readings by preloading digital values.
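The two variation metrics can be computed directly from the per-die measurements. The sketch below follows the procedure described above; it assumes the sample standard deviation is used (the text does not say which estimator was applied), and the tiny data set is made up purely to exercise the code.

```python
from statistics import mean, stdev


def die_to_die_variation(periods_by_die):
    """sigma/mu of the per-die mean periods, taken across all dies."""
    die_means = [mean(die) for die in periods_by_die]
    return stdev(die_means) / mean(die_means)


def within_die_variation(periods_by_die):
    """Average over dies of the sigma/mu computed inside each die."""
    return mean([stdev(die) / mean(die) for die in periods_by_die])


# Made-up periods (seconds) for two dies with two timers each.
sample = [[10.0, 12.0], [20.0, 24.0]]
print(f"die-to-die: {die_to_die_variation(sample):.3f}")
print(f"within-die: {within_die_variation(sample):.3f}")
```

For the measured data set, `periods_by_die` would hold 25 lists of ten periods each.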
We also tested the proposed timer by running it continuously for 20 hours to

measure the timing stability over an extended period. Measurement results over time

26
Output period (s)
21.4

21.3

21.2
500 1000 1500 2000 2500 3000
Number of ticks

400
Count

300
200
100
0
21.1 21.15 21.2 21.25 21.3 21.35 21.4 21.45 21.5
second

Figure 2.7: Timer output period variation with respect to time.

along with the resulting histogram are shown in Fig. 2.7. It takes approximately
ten minutes for the timer to reach steady state, after which the output frequency is
always within 1% throughout the remaining 20 hours of testing. The rms jitter for

this timer is 30ms, equivalent to 0.14% of the output period.

2.3 Self temperature compensation for low power

timer

As mentioned at the beginning of the chapter, the fundamental challenge in designing a
low power timer is to find a reliable current source that defines the output period
accurately at low cost. The previous section discussed a design in which the gate
leakage of a thin oxide device provides such a current. In general, gate leakage
modeling is a complicated problem, and accurately simulating its behavior is not
feasible. Also, porting the design to another technology is not trivial since the change in
gate leakage is typically huge. Therefore, we would like to explore other options that
are available in a standard CMOS process.

2.3.1 Oscillator with self temperature compensated current

source

Subthreshold leakage is still the most dominant contributor to the total leakage power
dissipation in advanced CMOS technologies [77]. Therefore, it is also the most
studied leakage source, and a large body of measurement data can be used to refine the
circuit simulation model [78, 79]. The results of much recent literature on subthreshold
circuit design provide further evidence of this [57, 80, 19]. An advantage of using
the subthreshold leakage is that its value can be easily replicated by a current
mirror, which is critical for lowering the power consumption, as we will present in Sec.
2.3.2. On the other hand, subthreshold current does suffer from temperature and
process variation, as shown by the equation
\[ I_{sub} = \mu_{eff} C_{ox} \frac{W}{L} (m - 1) \left(\frac{kT}{q}\right)^2 e^{q(V_g - V_t)/mkT} \left(1 - e^{-qV_{ds}/kT}\right) \tag{2.5} \]

As temperature affects both the thermal voltage and the threshold voltage in a
nonlinear way, finding an inverse function that is able to compensate for the temperature
effect is not practical. In addition, process variation in doping and geometry
can cause the current to differ by up to several times from the nominal value [81, 82].
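To give a feel for the temperature dependence in Eq. 2.5, the expression can be evaluated numerically. The sketch below is an illustration under stated assumptions: the device parameters (threshold voltage, slope factor m, µCox, W/L) are hypothetical, and the threshold voltage is held constant, so the real variation, which also includes the shift of Vt with temperature, would be larger.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def i_sub(v_g: float, v_ds: float, temp_k: float, v_t: float = 0.35,
          m: float = 1.3, mu_cox: float = 200e-6, w_over_l: float = 1.0) -> float:
    """Subthreshold drain current following the form of Eq. 2.5.
    All default device parameters are hypothetical placeholders."""
    v_th = K_B * temp_k / Q_E                          # thermal voltage kT/q
    prefactor = mu_cox * w_over_l * (m - 1.0) * v_th ** 2
    gate_term = math.exp((v_g - v_t) / (m * v_th))
    drain_term = 1.0 - math.exp(-v_ds / v_th)
    return prefactor * gate_term * drain_term


ratio_temp = i_sub(0.1, 0.3, 350.0) / i_sub(0.1, 0.3, 300.0)
ratio_vds = i_sub(0.1, 0.1, 300.0) / i_sub(0.1, 0.3, 300.0)
print(f"I(350 K) / I(300 K) = {ratio_temp:.2f}")
print(f"I(Vds = 0.1 V) / I(Vds = 0.3 V) = {ratio_vds:.3f}")
```

Even with a fixed threshold voltage, the current changes several-fold over a 50 K range, while it is almost independent of Vds once Vds exceeds a few kT/q.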

In typical CMOS technologies, unsilicided polysilicon resistors provide decent sheet
resistance and a well-controlled temperature coefficient [83]. By forcing a constant
voltage across its two terminals, a resistor becomes a superior current source
compared to the subthreshold current of a MOS transistor that is not temperature
compensated. Fig. 2.8 shows the bias stage with resistor R1. Diode-connected
transistors Md1-Md6, which are equally sized, evenly divide the supply voltage, and the
intermediate voltages are insensitive to temperature. Transistor M0 is biased with a gate
voltage lower than the threshold voltage and forms a negative feedback loop with

Figure 2.8: The bias stage showing the voltage divider and a resistor based self-biasing loop.

resistor R1 and amplifier A0. The voltage on node nd3 is replicated to n1 through the
feedback loop. As a result, the current that flows through R1 and the drain current
of M0 are identical and are given by

\[ I_{R1} = I_{M0} = \frac{V_{dd} - V_{nd3}}{R_1} \tag{2.6} \]

The gate overdrive voltage bn is thus self-biased at different temperatures. When the
temperature is high, the voltage bn decreases to compensate for the higher leakage, and
vice versa. A transistor biased in the subthreshold region provides very high output
resistance, since the drain current barely changes with Vds when Vds is larger than
3kT/q. The magnitude of the current-mirrored output can be easily adjusted for process
variation by dividing M0 into smaller parallel transistors with series switches that can
be selectively turned on to change the ratio. A reset signal is used to turn off the
biasing circuit in order to save power during the active mode. We will discuss this
further in a later section.

The oscillator that generates the output of the timer is shown in Fig. 2.9. It is a
one-shot oscillator whose oscillation period is determined by the load capacitor CL and
the current sources that are biased by bn and bp. Both bn and bp are originally generated
in the bias stage and replicated by the hold stage, which is presented later. When
out is logic low, the charge stored on load is sunk to ground. Conversely, load is
pulled up toward Vdd when out is logic high. The switching transistors for pull-up and
pull-down are long-channel devices and stack-forced with N=4. This is because biasing
transistors Mn and Mp are biased in the deep subthreshold region. To guarantee correct
behavior at low temperatures, the switching transistors should be significantly weaker
than the biasing transistors when they are turned off, so that unwanted leakage does
not contend with the charging transistors during oscillation. The load voltage is
compared to reference voltages refh and refl, and the output flips from its previous
state once load rises above refh or falls below refl. In this work, refh and refl are
also generated by the bias stage (voltages nd2 and nd4). The gain of the comparator is
an important design parameter since it determines the delay of signals ss and rs once
load triggers the comparator. To maintain the comparator gain at different
temperatures, the comparators are also biased with the same gate overdrive voltage bp.
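
As a rough illustration of how the period follows from CL and the bias currents, the sketch below treats the oscillator as an ideal relaxation oscillator: load ramps linearly between refl and refh, and comparator delay is ignored. The capacitance and current values are hypothetical; they are chosen only so that the result lands near the nominal period reported later, and the divider taps assume an ideal six-way split of Vdd with nd1 closest to the supply.

```python
def oscillator_period(c_load, v_refh, v_refl, i_charge, i_discharge=None):
    """Period of an ideal relaxation oscillator with constant-current ramps.

    Each half cycle moves the load node across (v_refh - v_refl), so the time
    for one ramp is C * dV / I. Comparator delay is ignored.
    """
    if i_discharge is None:
        i_discharge = i_charge  # assume matched pull-up and pull-down currents
    swing = v_refh - v_refl
    return c_load * swing / i_charge + c_load * swing / i_discharge


vdd = 0.6
v_refh = vdd * 4 / 6  # nd2 of an ideal six-device divider (assumption)
v_refl = vdd * 2 / 6  # nd4 of the same divider (assumption)

# Hypothetical values: 2.25 pF load, 10 pA charging and discharging current.
period = oscillator_period(2.25e-12, v_refh, v_refl, 10e-12)
print(f"period = {period:.3f} s")
```

With these assumed values the period is about 0.09 s. The sketch also makes the design trade-off visible: the period scales linearly with CL and inversely with the bias current.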

2.3.2 Power reduction by charge holding technique

From Eq. 2.6, the current consumption of the circuit is mainly determined by the
resistance value. When Vdd = 600 mV, Vnd3 = 300 mV and R1 equals 15 MΩ, the
power consumption of the bias stage is 10 nW, assuming that the power consumption
of the amplifier at room temperature can be neglected. Further increasing R1
reduces power consumption at the expense of increased silicon area. Reducing the
voltage difference between Vdd and Vnd3 is another option. However, it magnifies the
impact of the voltage offset between Vnd3 and Vn1 and increases the temperature
sensitivity. In an effort to keep an adequate footprint for the timer, a
program-and-hold technique is proposed to bring down the power by two orders of
magnitude with little hardware overhead.

Figure 2.9: One-shot oscillator for timer output.

The circuit diagram for the proposed method is shown in Fig. 2.10. The bias stage
and the oscillator stage were presented in the previous section; a hold stage is
added for power saving. The idea is to store the bias voltage on a capacitor and hold
it after the bias stage is turned off through power gating. The time before
the bias stage is turned off is called the programming mode. The timer enters the
active mode when the bias voltage is sustained only by the hold stage. By applying the
bias voltage to a transistor much smaller than the bias transistor, the bias current
for the oscillator, and thus the total active power, can be proportionally reduced. For
example, if the ratio between the transistor widths of M0 and M1 is 200:1, the power
consumption can be reduced from 10 nW during the programming mode to merely
50 pW in the active mode.
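
The numbers in this example can be checked with a short sketch of Eq. 2.6 and the mirror ratio. It takes R1 as 15 MΩ and neglects amplifier power; the function names are illustrative only.

```python
def bias_current(vdd, v_nd3, r1):
    """Resistor-defined bias current from Eq. 2.6, in amperes."""
    return (vdd - v_nd3) / r1


def hold_mode_power(programming_power, mirror_ratio):
    """Active-mode power when the bias current is scaled down by the mirror ratio."""
    return programming_power / mirror_ratio


i_bias = bias_current(0.6, 0.3, 15e6)      # 20 nA through R1
p_branch = 0.6 * i_bias                    # about 12 nW drawn from the supply
p_active = hold_mode_power(10e-9, 200)     # 10 nW scaled by 200:1 gives 50 pW

print(f"{i_bias * 1e9:.0f} nA, {p_branch * 1e9:.0f} nW, {p_active * 1e12:.0f} pW")
```

The resistor branch alone draws about 12 nW from a 600 mV supply, which is of the same order as the roughly 10 nW quoted for the bias stage, and the 200:1 ratio reproduces the 50 pW active-mode figure.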

Figure 2.10: One-shot oscillator for timer output.

Two types of charge holding circuit are shown in Fig. 2.11. For the type I circuit,
the bias voltage bn is written into bn1 and bn2 when c[1] is high. After c[1] goes
low, bn is discharged to ground while amplifier A0 is turned on simultaneously.
Assuming A0 has no input offset voltage, bn1 is identical to bn2, which eliminates
the subthreshold leakage through Ms2. Ideally, the charge stored on CL can be
maintained for a long period of time before it needs to be replenished. However, the
junction leakage of Ms2 becomes a dominant source that discharges CL when the
temperature increases. To address this issue, the type II circuit is considered.
Instead of charging CL through pass transistors, the gate leakage of a thin-oxide
transistor is used. When the bias stage is on, amplifier A1 controls the voltage of
bn1 until bn2 reaches the same value as bn. After the bias stage turns off, node bn2
acts like a floating node. In this scheme, node bn2 does not suffer from any leakage
path other than the one through transistor Ms. Since gate leakage is relatively
insensitive to temperature, the type II circuit is able to operate at much higher
temperatures than its type I counterpart.

To further understand the operation of the hold circuit, the node voltages are
plotted in Fig. 2.12. Assume that at power-on every node has an initial voltage of 0.
During phase P1, node bn reaches steady state first, as discussed in Sec. 2.3.1. bn2
slowly converges to bn with a time constant that depends on the ratio of the gate
leakage and the load capacitance. Note that there is a finite voltage offset between
bn1 and bn due to the input offset voltage and finite gain of A1. In order to
eliminate the gap between bn1 and bn, transistor Ms is turned on during P2. At this
point, ideally bn, bn1 and bn2 all have the same voltage. In phases P3 and P4,
amplifier A2 is turned on to minimize the voltage difference between bn1 and bn2,
while the bias stage can be turned off by c[0] to save power. While P2 and P3 can take
less than milliseconds, P1 dominates the total programming time. Since bn1 and bn2
follow each other closely during P4, bn1 is chosen to drive the oscillator stage. In
this way, coupling noise from the switching of the oscillator is prevented from
entering the pseudo-floating node bn2. The programming time of the timer is defined
as the total duration of P1, P2 and P3.

Figure 2.11: Circuits for charge holding. (a) type I, (b) type II.

2.3.3 Test chip and measurement results

The proposed timer design was fabricated in a 0.13 µm technology. The die photo of
the timer circuit is shown in Fig. 2.13. The total area of the timer is 0.019 mm², of
which the resistor occupies about half. Ambient temperature is controlled by a
TestEquity TE-105A temperature chamber. In order to guarantee a precisely controlled
temperature, all measurements are performed 10 minutes after the temperature is
ramped up. The output frequency of the timer is measured by a Tektronix TDS5104A
oscilloscope. In this setup, all the control signals are supplied externally through
the parallel interface of a PC.

Figure 2.12: Timing diagram of the program-and-hold method.

The normalized frequency variation due to temperature and supply voltage is
plotted in Fig. 2.14. Since the frequency of the timer drifts over time due to the gate
leakage through the programming transistor, the output frequency in this figure
refers to the frequency right after the programming mode is completed. The
programming time is set to 30 seconds for this measurement. At lower temperatures,
the output frequency decreases from the room temperature value. This is mainly
because of the reduced gain of the comparators in the oscillator stage. The variation
of the output frequency across temperature when the timer is operating at 600 mV is
6% over the range from 0 °C to 90 °C. At different supply voltages, the curves closely
track each other with respect to temperature, suggesting that the characteristics
remain unchanged with the bias voltage. At room temperature, varying the supply
voltage by 50 mV results in +4/-2% of frequency variation. In this work, the timer
is the only active component in the sleep mode. Thus, the switching supply noise
can be neglected. It is reasonable to assume that the cycle time error due to supply
variation should be lower than 1%.

Figure 2.13: Die photo of the timer test chip.

The expected timer behavior is, for example, to wake up the system every 10
minutes. Ideally, the timer needs to be reprogrammed only when the processor is
woken up, which simplifies the programming process. Because the timer output
frequency is not constant after entering the active mode, the expiration time, which
is defined by a certain number of cycles counted by the timer, varies depending on
how often the timer is refreshed. Fig. 2.15 shows the average timer frequency versus
the time between refreshes. The results are again measured at 600 mV with a
programming time of 30 seconds. Each curve in the figure represents a different
temperature point: 0, 20, 50 and 90 °C. Higher temperature means higher leakage, and
thus a larger slope is shown in the figure. At 0 °C, the frequency drift is about 0.8%
per minute, while it is 1.7% per minute at 90 °C. The measurement results back up the
statement that choosing the type II circuit for the hold stage is advantageous for its
low temperature coefficient. When the timer is refreshed every 4 minutes, 7% of
frequency deviation is observed across the temperatures. Reducing the refresh period
to 2 minutes reduces the frequency deviation to 5%.

Figure 2.14: Normalized frequency vs. temperature and supply voltage. (Axes: normalized frequency vs. temperature (°C); curves for VDD = 550 mV, 600 mV and 650 mV.)
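
To make the link between frequency deviation and wake-up timing concrete, the following sketch converts a target interval into a cycle count and then computes the interval actually elapsed when the average frequency is off. It assumes the 0.09 s nominal period listed in Table 2.1 for the program-and-hold timer; the 5% slow-down is just an example input.

```python
def cycles_for_interval(interval_s, nominal_period_s):
    """Number of timer cycles to count for a target wake-up interval."""
    return round(interval_s / nominal_period_s)


def actual_interval(cycles, nominal_period_s, avg_freq_deviation):
    """Elapsed time for a fixed cycle count when the average frequency deviates.

    avg_freq_deviation = -0.05 means the timer runs 5% slow on average,
    so each cycle takes longer than the nominal period.
    """
    return cycles * nominal_period_s / (1.0 + avg_freq_deviation)


n_cycles = cycles_for_interval(600, 0.09)        # 10-minute wake-up target
elapsed = actual_interval(n_cycles, 0.09, -0.05)

print(n_cycles, round(elapsed, 1))
```

Under these assumptions the counter target is 6667 cycles, and a timer that runs 5% slow on average wakes the system after roughly 632 s instead of 600 s.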

So far, the programming time has been set to 30 seconds to guarantee that the timer
is properly biased. However, the power saving of the program-and-hold method is
maximized by staying in the power-hungry programming mode for as short a time as
possible.

Figure 2.15: Average of normalized frequency drift over time. (Axes: normalized frequency (%) vs. refresh time (s); curves at 0, 20, 50 and 90 °C.)

In Fig. 2.16, the timer is refreshed every four minutes with a programming time of 1.1

seconds. It shows that 3 to 4 programming cycles are required to bias the timer at the
target frequency. After that, the timer will operate at a steady frequency. In order to
achieve the steady state frequency in a single programming cycle, the programming

time needs to be increased to at least 1.5 seconds.

Although reducing the programming time is desirable, its impact on temperature
sensitivity needs to be considered. Fig. 2.17 shows the output frequency, normalized
to that at a programming time of 10 seconds, at three different temperature
settings. When the programming time is decreased below 1 second, the timer can no
longer be properly programmed and no oscillation can be observed. This is
consistently true across the temperatures of interest. At room temperature, the
output frequency drops 8% when the programming time is reduced to less than 2
seconds. Since the programming time will be fixed across temperature, what matters
is the frequency deviation at the same programming time. According to the figure,
reducing the programming time will not introduce more than 2% error on top of the
frequency deviation shown in Fig. 2.14. Therefore, it is reasonable to always use the
minimum programming time that is available for this work.

Figure 2.16: Refresh the timer every four minutes with 1.1 second programming time. (Axes: output frequency (Hz) vs. time (s).)

Power consumption is measured in the programming mode and the active mode,
respectively. Fig. 2.18 shows the power consumption at a supply voltage of 550 mV.
The programming power has a clear floor at around 11 nW. This is mainly due to the
fixed voltage drop across the polysilicon resistor in the bias stage. At higher
temperatures, the programming power grows faster than linearly due to the
exponentially increasing power of the amplifiers, which are biased in the
subthreshold region. The active power at room temperature is 55 pW, which is
consistent with the 200:1 current mirror ratio. In the active mode, the power
consumption of the comparators starts to dominate the total power above 60 °C.

Combining Fig. 2.15 and Fig. 2.18, the tradeoff between power and frequency
deviation across the temperature range of interest from 0 °C to 90 °C is shown in
Fig. 2.19. The programming time is set to 1 second, which is also the shortest
period that can still bias the timer properly. As the refresh period gets longer, the
average power consumption, which includes both the programming power and the active
power, is reduced. The frequency deviation is 5% without considering the shift
of frequency over time. When the refresh period increases beyond two minutes, the
frequency deviation begins to rise as a result of the leakage current difference
between low and high temperatures. To sum up, 150 pW and 100 pW of power consumption
can be achieved if the tolerable errors are 5% and 7%, respectively.

Figure 2.17: Normalized frequency vs. programming time. (Axes: normalized frequency vs. programming time (s); curves at 0 °C, 27 °C and 80 °C.)

Figure 2.18: Power consumption at the programming mode and the active mode with respect to
different temperatures. (Axes: programming power (nW) and active power (pW) vs. temperature (°C).)

Figure 2.19: Power consumption and frequency deviation with different refreshing time. (Axes: average power consumption (pW) and frequency deviation (%) vs. refresh period (s).)
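
The average power figures can be reproduced with a simple duty-cycle model, sketched below. It assumes the timer spends a fixed programming time at the programming power and the rest of each refresh period at the active power, and it uses the room-temperature values read from the text (about 11 nW and 55 pW, 1 second of programming). Temperature dependence is ignored.

```python
def average_power(p_prog, p_active, t_prog, refresh_period):
    """Time-weighted average power over one refresh period, in watts."""
    t_active = refresh_period - t_prog
    return (p_prog * t_prog + p_active * t_active) / refresh_period


P_PROG = 11e-9     # programming-mode power floor
P_ACTIVE = 55e-12  # active-mode power at room temperature
T_PROG = 1.0       # programming time in seconds

p_2min = average_power(P_PROG, P_ACTIVE, T_PROG, 120.0)
p_4min = average_power(P_PROG, P_ACTIVE, T_PROG, 240.0)

print(f"{p_2min * 1e12:.0f} pW at 2 min, {p_4min * 1e12:.0f} pW at 4 min")
```

This simple model gives about 146 pW for a 2-minute refresh and about 101 pW for a 4-minute refresh, close to the 150 pW and 100 pW quoted above.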

2.4 Summary

In this chapter, two types of timers are proposed for ultra low power sensor
platforms. Table 2.1 summarizes the characteristics of the timers. The gate leakage
timer has the advantage of a smaller footprint and also consumes less power than
the program-and-hold timer when operating at 300 mV. In terms of applications,
the program-and-hold timer works only under the assumption that temperature
varies slowly compared to the refresh period. The gate leakage timer, however,
suffers from larger variation across temperature, especially when the supply voltage
becomes lower. Improving the temperature insensitivity of the gate leakage timer
requires a higher supply voltage, which results in higher power consumption. In
terms of low voltage operation, the gate leakage timer is preferred since its
operation does not rely on analog components as the program-and-hold timer does.
Control of the gate leakage timer is trivial, whereas the program-and-hold timer
requires a finite-state machine to program it properly.

Table 2.1: Comparison for the timers.

                       Gate leakage timer    Program-and-hold timer
Technology             0.13 µm               0.13 µm
Total area             480 µm²               0.019 mm²
Nominal period         11 s                  0.09 s
Cycle time error       16.3% @ 450 mV *      5% (refresh every 2 minutes) **
due to temperature                           7% (refresh every 4 minutes) **
Power consumption      120 pW @ 450 mV       150 pW (refresh every 2 minutes)
                                             100 pW (refresh every 4 minutes)
Supply sensitivity     7.5% @ 450 mV         +4/-2% @ 600 mV
(50 mV offset)

*  Over a temperature range from 0 °C to 80 °C.
** Over a temperature range from 0 °C to 90 °C.

CHAPTER III

AN ULTRA LOW POWER 1V, 220NW TEMPERATURE SENSOR FOR PASSIVE WIRELESS APPLICATIONS

3.1 Introduction

Over the last decade, demand for smart temperature sensors has grown in VLSI,
automotive, and wireless sensing applications due to their low cost. Monitoring VLSI
chip temperature plays a key role in long-term system-level reliability and
performance. Rapidly increasing transistor counts require embedded sensors with small
area and low power that can be spread over the chip for temperature management
[84]. Sensors with low power consumption not only help with power grid
integrity but also alleviate self-heating issues. Recently, growing interest in
building monitoring systems with wireless telemetry or RFID cards has placed even more
stringent demands on power consumption [85, 86]. The energy range, which is defined as
the distance between the transponder and the reader that is just enough to operate the
transponder, can be extended by cutting down the power dissipation [87]. In the work
reported in [65], the temperature sensor consumes 10 µW compared to 2W by the reader
for their passive RFID transponder. This means that the power consumption of the
temperature sensor is highly related to the working distance of such wireless systems.

Smart temperature sensor ICs were first developed using bandgap references and
analog-to-digital converters (ADCs) [88, 89]. Such sensors are typically able to achieve
better than 1 °C accuracy with calibration. By combining offset cancellation, dynamic
element matching and room-temperature calibration, an accuracy of 0.1 °C with
247.5 µW power consumption was reported [90]. A time-to-digital converter (TDC) was
also proposed to measure the temperature by tracking a pulsed signal along a delay
line [91]. In this work, our goal is to implement a temperature sensor with sub-µW
power dissipation and acceptable accuracy for ultra low power passive wireless
sensor applications. In Sec. 3.2, the architecture of the proposed circuits is
discussed and the power consumption is analyzed. Measurement results are shown in
Sec. 3.3, followed by the conclusion in Sec. 3.4. A discussion on improving the
voltage sensitivity is given in Sec. 3.5.

3.2 Low power temperature sensor design

Fig. 3.1 shows the block diagram of the temperature sensor. A temperature
insensitive current source Iref and a proportional to absolute temperature (PTAT)
current source IPTAT are generated separately. Each current source is mirrored and fed
into the current-starved ring oscillator to translate the temperature information into
frequency. The clock signals are then fed into an up-counter that is triggered
by a start signal in order to produce a digitized output. The sensor controller
decides when the conversion should start and responds with a data valid signal when
the data is available. The key challenge of this work is to generate current sources
Iref and IPTAT with low power dissipation while still maintaining reasonable
temperature characteristics.
Generating IPTAT is a commonly used technique in bandgap reference design for
compensating the complementary to absolute temperature (CTAT) current sources.
Fig. 3.2 shows a schematic for this purpose, which was originally implemented with
bipolar circuits [92]. CMOS transistors can be used in place of bipolar transistors
when operating in the subthreshold region. In this way, we can significantly reduce
the power consumption of this block, which accounts for roughly 30% of the total
power dissipation. When Vgs is less than Vth and Vds is larger than three VT, the drain
current of transistors M4 and M5 can be approximately written as

I_{sub} = \mu C_{OX} \frac{W}{L} V_T^2 \exp\left(\frac{V_{gs} - V_{th}}{n V_T}\right)    (3.1)

where VT is the thermal voltage kT/q and Vth is the threshold voltage of the
transistor. Through current mirror transistors M2 and M3, the current through
resistor RPTAT can be expressed as

I_{RPTAT} = \frac{n V_T}{R_{PTAT}} \ln\left(\frac{W_5 W_3 L_4 L_2}{W_4 W_2 L_5 L_3}\right)    (3.2)

assuming that Vth mismatch is ignored. By properly biasing the circuit, the output
current is proportional to VT. The sensitivity to geometric variations can be
minimized by designing a large value inside the log function. Large transistor sizes
also help to reduce the impact of random doping fluctuations on the threshold voltage.

Figure 3.1: Temperature sensor block diagram.
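
The PTAT behavior of Eq. 3.2 can be sketched numerically as follows. The resistor value is the 3.2 MΩ used for RPTAT in this work; the subthreshold slope factor n = 1.5 and the combined size ratio of 8 inside the logarithm are hypothetical values chosen only for illustration.

```python
import math

K_OVER_Q = 8.617333262e-5  # Boltzmann constant divided by electron charge, in V/K


def ptat_current(temp_k, r_ptat, size_ratio, n=1.5):
    """Current through R_PTAT from Eq. 3.2, in amperes.

    size_ratio is the combined geometry ratio (W5*W3*L4*L2)/(W4*W2*L5*L3).
    n = 1.5 is an assumed subthreshold slope factor.
    """
    v_t = K_OVER_Q * temp_k  # thermal voltage kT/q
    return n * v_t * math.log(size_ratio) / r_ptat


i_300 = ptat_current(300.0, 3.2e6, size_ratio=8.0)
i_350 = ptat_current(350.0, 3.2e6, size_ratio=8.0)

print(f"{i_300 * 1e9:.1f} nA at 300 K, ratio {i_350 / i_300:.4f}")
```

With these assumed values the current is about 25 nA at 300 K, and it scales exactly with absolute temperature (a ratio of 350/300 between the two points), ignoring the temperature dependence of the resistor itself.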

The temperature insensitive current source is generated by a self-biasing
technique. The circuit diagram is shown in Fig. 3.3. M1 through M5 are diode-connected
transistors used to provide bias voltages that are proportional to the supply voltage.
The voltage of nb is replicated to node na through a negative feedback loop consisting
of transistor M6, resistor Rref and the amplifier. Therefore, the drain current of M6
is given by (Vdd - Vna - Vos)/Rref, where Vos is the input offset voltage of the
amplifier. The fractional temperature coefficient (TCF) of Id6 is

TC_F(I_{d6}) = \frac{1}{I_{d6}} \frac{dI_{d6}}{dT}    (3.3)

             = -\frac{1}{V_{dd} - V_{na} - V_{os}} \frac{dV_{na}}{dT} - \frac{1}{R_{ref}} \frac{dR_{ref}}{dT}    (3.4)

To reduce the non-ideal temperature effects on the sensor, we do the following: 1) the
resistor is chosen so that the second-order temperature coefficient (TC2) is minimized;
and 2) transistors M1-M5 are identically sized to eliminate the first term of
Eq. 3.4.

Figure 3.2: Schematic for IPTAT generation.
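
The step from Eq. 3.3 to Eq. 3.4 can be written out as a short derivation. It is a sketch under the assumption that Vdd and the amplifier offset Vos do not vary with temperature, which the text does not state explicitly.

```latex
\begin{align*}
I_{d6} &= \frac{V_{dd} - V_{na} - V_{os}}{R_{ref}} \\
\ln I_{d6} &= \ln\left(V_{dd} - V_{na} - V_{os}\right) - \ln R_{ref} \\
TC_F(I_{d6}) = \frac{d \ln I_{d6}}{dT}
  &= -\frac{1}{V_{dd} - V_{na} - V_{os}} \frac{dV_{na}}{dT}
     - \frac{1}{R_{ref}} \frac{dR_{ref}}{dT}
\end{align*}
```

The first term vanishes when Vna is a temperature independent fraction of Vdd, which is what identical sizing of the divider transistors aims for. The second term is the temperature coefficient of the resistor.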

Note that in this work, the voltage reference circuitry in Fig. 3.3 was implemented
as a voltage divider. Thus, Iref varies with the supply voltage, and the output value
changes with power supply noise. To fix this issue in the future, the voltage
reference should be redesigned to have a constant output regardless of the supply
voltage.

Figure 3.3: Schematic for Iref generation.

Both the Iref and IPTAT blocks generate analog voltages bn and bp to provide the
starving voltage for the ring oscillator. The temperature information in Iref and
IPTAT is translated into the frequencies of the signals clk_i and clk_l. Fig. 3.4
shows the sensor controller together with its timing diagram. clk_i and clk_l are used
to clock the q-counter and the d-counter, respectively. When start is 0, both counter
outputs are cleared. Triggered by the input signal start, the controller asserts the
output data_valid when the q-counter overflows 2^10 cycles later. data_valid
immediately stops both counters from changing their contents until start goes to 0
again to reset the states. The temperature sensor, including the Iref and IPTAT
blocks, is implemented so that it can be deactivated during the sleep state by
asserting the reset signal. When a high conversion rate is not required, the
temperature sensor can be periodically deactivated to save power.

Figure 3.4: Block diagram and timing diagram of the sensor controller.
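
The conversion scheme can be illustrated with a small behavioral model. It assumes that clk_i (clocking the q-counter) is derived from Iref and sets a fixed window of 2^10 cycles, while clk_l (clocking the d-counter) is derived from IPTAT; this mapping is inferred from the order in which the signals are named and is an assumption. The 110.84 kHz clk_l frequency is hypothetical.

```python
def convert(f_ref_hz, f_ptat_hz, q_bits=10):
    """Value left in the d-counter when the q-counter has counted 2**q_bits cycles.

    Integer arithmetic models counters that only register whole clock cycles.
    """
    window_cycles = 1 << q_bits
    return (f_ptat_hz * window_cycles) // f_ref_hz


def conversion_rate(f_ref_hz, q_bits=10):
    """Conversions per second if a new conversion starts immediately."""
    return f_ref_hz / (1 << q_bits)


code = convert(100_000, 110_840)  # hypothetical clk_l of 110.84 kHz
rate = conversion_rate(100_000)   # clk_i at 100 kHz

print(code, rate)
```

In this model the output code is simply the frequency ratio scaled by 1024, so a clk_l about 10.8% faster than clk_i yields a code of 1135. A 100 kHz clk_i gives about 98 conversions per second, in line with the roughly 100 samples/s of the test chip.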

The total power of our proposed temperature sensor can be written as follows:

P_{tot} = V_{dd} [(n+1) I_{PTAT} + (m+1) I_{ref}] + P_{ctrl}    (3.5)

where n and m are the multiplication constants of the current mirrors and Pctrl is the
power consumption of the sensor controller. For simplicity, static power consumption
is neglected in this first-order analysis. Therefore, Pctrl can be expressed as
α Cc Vdd^2 fclk, given the total capacitance Cc, effective activity factor α and clock
frequency fclk. Considering fclk as a function of Iref, IPTAT and Vdd, Eq. 3.5 can be
rewritten as

P_{tot} = k_1 \frac{V_{dd}}{R_{PTAT}} V_T + k_2 \frac{V_{dd}^2}{R_{ref}}    (3.6)

where k1 and k2 are geometry and process related constants. It is shown that 1) the
power consumption of the sensor is a linear function of temperature; and 2) power
consumption can be proportionally reduced by using large resistors. The sizes of the
resistors are determined by the target current consumption of 200 nA and by matching
Iref and IPTAT at room temperature. In this work, 6.2 MΩ and 3.2 MΩ P+ poly resistors
are chosen for Rref and RPTAT, respectively.

3.3 Measurement results

The chip was implemented in a 0.18 µm 1P6M digital CMOS process. The total area
of the temperature sensor module is 0.05 mm². The die photo is shown in Fig. 3.5.
In this test chip, 85% of the area is occupied by the resistors for biasing the
current sources.

Figure 3.5: Die photo of the temperature sensor. (Labeled blocks: sensor controller, decap, IPTAT module and Iref module; dimensions 0.3 mm by 0.165 mm.)

Figure 3.6: Power consumption of the temperature sensor. (Axes: power (nW) vs. temperature (°C).)

The measurement is set up inside a TestEquity TE-105A environment chamber.
The power consumption is measured by a Keithley 6517A electrometer and the results
are shown in Fig. 3.6. The supply voltage is set to 1 V, while the nominal supply
voltage for this technology is 1.8 V. The power consumption increases from 200 nW
to 310 nW from 0 °C to 100 °C, which matches the expected trend from Eq. 3.6. The
slope at higher temperatures is larger mainly because leakage components become
non-negligible in this region. Note that there is a trade-off between power
consumption and area. Since most of the area is occupied by the resistors, reducing
the resistance by half also reduces the total area by 43%. At the same time, the
conversion rate is doubled because of the boost in the ring oscillator's starving
current. In this test chip, clk_i runs at 100 kHz for an equivalent of 100 samples/s.
This is sufficiently fast for most applications, and in fact the conversion rate can
be lowered to reduce the reading noise, as shown in Fig. 3.8.

The temperature inaccuracy of 5 test samples after two-point calibration is shown
in Fig. 3.7. The temperature error ranges from -1.6 °C to +3 °C over the sweeping
range from 0 °C to 100 °C. With an 11-bit output from the sensor controller, the
temperature resolution is 0.3 °C.

Table 3.1: Comparison of temperature sensors.

Sensor      Inaccuracy   Power         Technology   Area     Temperature   Conversion rate
            (°C)         consumption                (mm²)    range (°C)    (samples/s)
[88]        1            7 µW          2 µm         1.5      -40 to 120    50
[89]        1            1 mW          0.6 µm       3.32     -55 to 125    40k
[90]        0.1          247.5 µW      0.7 µm       4.5      -55 to 125    110
[91]        -0.7/+0.9    10 µW         0.35 µm      0.175    0 to 100      10k
[65]        -1.8/+2.2    10 µW         N/A          N/A      0 to 100      2
[86]        1            0.9 µW        0.18 µm      0.2      27 to 47      N/A
This work   -1.6/+3      0.22 µW       0.18 µm      0.05     0 to 100      100

Figure 3.7: Temperature inaccuracy of the temperature sensor with two-point calibration at 20 °C
and 80 °C. (Axes: error (°C) vs. temperature (°C).)
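
Two-point calibration can be sketched as a straight line fitted through the counter codes recorded at the two calibration temperatures. The code values below are hypothetical; they are chosen so that the slope is close to the 0.3 °C per code resolution mentioned above. The function names are illustrative only.

```python
def two_point_calibration(code_lo, temp_lo, code_hi, temp_hi):
    """Return a function that maps a raw counter code to temperature in °C.

    The mapping is the straight line through the two calibration points.
    """
    slope = (temp_hi - temp_lo) / (code_hi - code_lo)

    def code_to_temp(code):
        return temp_lo + slope * (code - code_lo)

    return code_to_temp


# Hypothetical codes recorded at the 20 °C and 80 °C calibration points.
to_temp = two_point_calibration(1100, 20.0, 1280, 80.0)

midpoint = to_temp(1190)
print(round(midpoint, 2))
```

A code halfway between the two calibration codes maps to 50 °C. Any curvature in the real sensor response shows up as the residual error plotted in Fig. 3.7, since a two-point fit can only remove offset and gain errors.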

Table 3.1 lists previous works on smart temperature sensors and compares their
key circuit parameters to this work. It can be seen that our proposed temperature
sensor adopts an approach that favors low power operation at the expense of
temperature accuracy. The total area of our test chip is comparable to or even
smaller than that of other works after accounting for the different technologies.
Fig. 3.8 shows the long-term characteristics of the sensor, measured with the chip set
up in the temperature chamber (top: 10 samples/s; bottom: 100 samples/s). After taking
1000 samples successively, the 3σ inaccuracy over the samples is 2.5 °C. By
lowering the conversion rate to 10 samples/s, the 3σ inaccuracy is reduced to 0.28 °C
by averaging the samples. The actual temperature is also shown as a solid line in
Fig. 3.8.

Figure 3.8: Temperature inaccuracy over samples (top: 10 samples/s; bottom: 100 samples/s; solid
line: actual temperature). (Axes: temperature offset (°C) vs. sample number.)

3.4 Conclusion

In this work, we implemented an ultra low power temperature sensor for passive
wireless applications. At room temperature, it consumes merely 220 nW while running
continuously. By utilizing a temperature independent current source Iref and a PTAT
current source IPTAT, the temperature information can be synthesized and translated
into a digital output at a conversion rate of 100 samples/s. Measured data shows that
the temperature inaccuracy of the sensor is -1.6 °C/+3 °C from 0 °C to 100 °C.

Figure 3.9: Modified temperature insensitive current source.

3.5 Improving the voltage sensitivity

In order to minimize the supply voltage sensitivity, the circuit shown in Fig. 3.3
needs to be revisited. As the supply voltage deviates from its nominal value by ΔV,
the current varies by ΔI = ΔV/Rref. ΔI results in a shift of the output value for
the same measured temperature. More importantly, it also distorts the
current-to-frequency conversion of the current-starved ring oscillator, since the
relationship between the input current and the output frequency is nonlinear. One
solution is to investigate another way of translating current to frequency that is
less affected by the supply voltage. For example, a transmission gate based
current-starved oscillator can be used.

Another solution is to consider the circuit shown in Fig. 3.9. Instead of using a
voltage divider as the voltage reference, a reference Vref that provides an absolute
voltage level needs to be generated. The other change is to place the resistor between
ground and na. In this way, a supply voltage insensitive current source can be
generated. Vref can be provided by the circuit shown in Fig. 3.10. The details of the
voltage reference will be discussed in Sec. 6.2.2. According to Monte Carlo
simulations, more than 40 dB of power supply rejection ratio (PSRR) can be achieved
using the circuit. The temperature coefficient of Vref can be compensated by sizing
transistor M1. At the temperature of interest, the temperature coefficients of the
thermal voltage and the threshold voltage of M1 cancel each other. Again from 100
Monte Carlo simulations, the worst-case temperature coefficient is 167 ppm/°C.

Figure 3.10: Voltage reference generator.

CHAPTER IV

SINGLE STAGE STATIC LEVEL SHIFTER DESIGN FOR SUBTHRESHOLD TO I/O VOLTAGE CONVERSION

4.1 Introduction

Operating in the subthreshold region helps to greatly reduce power dissipation for
applications that do not require high performance [80, 19]. Level conversion has
always been an issue for systems that need to deal with two or more power domains.
This problem is more severe in subthreshold circuits. Since the drive strength of the
input devices is mostly limited to subthreshold operation and has a corresponding
exponential dependency on voltage, several intermediate voltages are typically
required to up-convert to I/O voltage levels. Generating intermediate voltages and the
extra wiring requirements are undesirable side effects of multi-stage level conversion.
In this chapter we discuss the design issues associated with bridging the subthreshold
core logic and I/O voltage in a single stage design, and propose a robust circuit that
addresses these issues.

Differential cascode voltage switch (DCVS) is a commonly used circuit technique
for voltage level conversion [93]. It has the advantage of low static power consumption
and small propagation delay due to the cross-coupled latch structure. The drawback
is that converting from ultra-low input voltages requires large transistors, which will
be discussed in Sec. 4.2. A modified DCVS was proposed to alleviate the contention
problem [94]; however, it still suffers from the same sizing issue as the DCVS level
shifter when the input is at a subthreshold voltage. A dynamic level converter is a
reliable way to achieve level conversion at low voltages [95, 96]. The disadvantage is
that it is more power hungry than its static counterpart and requires extra clock
routing and synchronization circuitry. Since power consumption is the critical
performance metric in subthreshold systems, dynamic level conversion becomes
undesirable. Single supply diode-voltage-limited buffers and half-latch level
converters are other options used for dual supply systems [97]. Because they are not
specifically designed for subthreshold level conversion, these designs generally
require the cascading of multiple stages.

In this chapter, we first examine the sizing of conventional DCVS level shifters
at very low voltages and demonstrate their susceptibility to
process/voltage/temperature (PVT) variation. We then present a modified
diode-voltage-limited level shifter that takes advantage of input level independent
pull-up devices. We compare the proposed circuit to the conventional approach in terms
of power consumption, area, and delay.

4.2 Conventional approach

Fig. 4.1 shows the circuit diagram of a DCVS-type level shifter. The circuit operates
on the basis of contention between pull-up and pull-down devices. In order for the

output to switch, the NMOS drive strength has to be sufficiently greater than the
PMOS drive strength. When VDDL is at a subthreshold level and VDDH is the I/O

voltage, the difference in drive strength for the pull-up and pull-down transistors can
easily be greater than three orders of magnitude.
Fig. 4.2 shows the simulated gate delay of the DCVS level shifter converting a
periodic input to the I/O voltage in 0.13µm CMOS. Transistors Mp1 and Mp2 are
both sized at W/L of 0.36µm/5µm (0.36µm is the minimum width of an I/O device in
this technology) to provide decent fall delay with respect to the

Figure 4.1: Conventional DCVS-type level shifter with cross-coupled pull-up transistors.

rise delay while restricting the size of the pull-down transistors Mn1 and Mn2. When
VDDL is 0.35V, the operating frequency is primarily limited by the fall delay if the
width of Mn1 and Mn2 (represented by Wn) is greater than 15µm. This also indicates
that the results can be further optimized by up-sizing Mp1 and Mp2. However, at
lower VDDL, the rise delay is not able to catch up with the fall delay until Mn1 and
Mn2 are disproportionately large. Other than the area and corresponding leakage power
arising from such large pull-down transistor sizes, operating at low temperatures leads
to another problem for this type of circuit. The drive strength of Mp1/Mp2 is (at
most) quadratically sensitive to Vt shift while the drain current of Mn1/Mn2 has an
exponential dependency on Vt. As a result it is much more difficult to balance drive
strengths at lower temperature by sizing up Wn. In typical cases, a conventional level
shifter is able to achieve less than 5 fanout-of-four inverter (FO4) delays at VDDL,
which is sufficiently fast for most subthreshold applications. However, considering 3σ
process variation, Wn needs to be up-sized to at least two times the nominal
case to maintain functionality.

Figure 4.2: Simulation results showing the operating frequency with respect to pull-down transistor
width Wn. (Plot: delay in number of FO4 versus Wn in µm, for VDDL = 0.25V, 0.30V, and 0.35V.)

4.3 Proposed approach

To overcome the dramatic difference in overdrive voltage for pull-up and pull-down
transistors, diode-connected PMOS transistors are used to replace the pull-up tran-

sistors. The proposed circuit is shown in Fig. 4.3. The pull-down of internal nodes
intn and intp is directly through normal Vt transistor Mn1 and zero Vt thick oxide

transistor Mn2. The use of Mn2 was previously proposed to reduce the potential
difference across the drain and source of Mn1 [98]. Therefore, the drive strength of
Mn1 can be increased by avoiding the use of thick oxide devices. We apply this stack

transistor technique to the conventional level shifter for fair comparisons in Sec. 4.4.
The purpose of Mp1 is to help eliminate part of the cross-bar current at the beginning
of the transition, although it also introduces a delay penalty of roughly 10% due to the
extra loading.

For pulling up the internal nodes, Md1 through MdN transistor stacks provide a
variable resistance path to the supply. At the beginning of input switching, most of


Figure 4.3: Proposed approach that uses input voltage independent diode-connected transistor stacks
for pull-up devices.

the VDDH voltage drop is across Md1-MdN. Assuming that each transistor in the
stack is still biased above threshold voltage and neglecting second order effects, the

effective resistance of the stack can be represented by

$$R_{eff} = \frac{V_{DDH}\, L_p}{\mu C_{ox} W_p \left(\frac{V_{DDH}}{N} - V_t\right)^2} \tag{4.1}$$

where N is the number of devices in the stack. A small N helps to achieve faster
falling delay by initially providing a smaller $R_{eff}$ and reducing the time it stays in
the subthreshold region before the state is switched. However, a smaller N also leads to
larger leakage when the input is at state "1". Nodes intn and intp are used to drive
output transistors Mno and Mpo, respectively. In this way, the good "0" property of
intn and good "1" property of intp can both be used to reduce static power with little
impact on the ON current of the output stage.
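
The dependence of Eq. 4.1 on the stack height N can be explored numerically. The following Python sketch evaluates the expression for several values of N. The threshold voltage, the mobility-oxide product, and the device dimensions are illustrative assumptions and are not values from this work; only VDDH = 2.5V follows the text.

```python
def stack_resistance(vddh, n, vt, mu_cox, wp, lp):
    """Effective resistance of N diode-connected PMOS devices per Eq. 4.1.

    Valid only while each device is biased above threshold (vddh / n > vt).
    """
    overdrive = vddh / n - vt
    if overdrive <= 0:
        raise ValueError("each device must be biased above threshold")
    return vddh * lp / (mu_cox * wp * overdrive ** 2)


# Illustrative (assumed) parameters, not taken from the text.
VDDH = 2.5              # V, I/O supply
VT = 0.4                # V, assumed threshold magnitude of the stacked devices
MU_COX = 50e-6          # A/V^2, assumed mobility-oxide product
WP, LP = 1e-6, 0.5e-6   # m, assumed device width and length

for n in range(2, 6):
    r_eff = stack_resistance(VDDH, n, VT, MU_COX, WP, LP)
    print(f"N={n}: R_eff = {r_eff / 1e3:.1f} kOhm")
```

Under these assumptions the resistance grows steeply with N because the per-device overdrive VDDH/N approaches Vt, which mirrors the speed versus leakage tradeoff described above.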

The circuit in Fig. 4.4 is proposed to solve the leakage problem introduced in


Figure 4.4: Proposed level shifter with feedback path for leakage reduction.

the previous paragraph. The idea is to add a PMOS header Mp2 on top of the
pull-up transistor stack. When both the input and the output are high, n0 is low
and n1 is designed to sit 500mV below VDDH due to the reduced swing inverter. As a
consequence, n2 will be pulled high, strongly turning off Mp2 to save leakage. When
the input switches low, Mn3 needs to be strong enough with a gate voltage of VDDL

to pull down n2. Otherwise, the pull-up transistor stack will not be able to charge
intn and intp, causing functional failure. Assuming that node n2 can be pulled down

very quickly after in goes low, the rise delay is dictated by the size of transistors Md1
through MdN. In this way, we can choose circuit parameters (including the number
of transistors in the stack N, and transistor widths) to match the fall delay with the

rise delay without sacrificing leakage power or vice versa.

The circuit diagram for the reduced swing inverter is shown in Fig. 4.5. Note that


Figure 4.5: Reduced swing inverter.

all devices are thick oxide I/O devices in this circuit. When in is low, Mp2 and Mp3
can easily pull out to VDDH. When in goes high, it behaves like a reduced swing

driver, which is used to save switching energy for interconnect [99]. Instead of pulling
all the way to 0, out will remain slightly higher than (VDDH-Vtn) in this situation.

The use of the reduced swing inverter helps to match the gate overdrive voltage to the
subthreshold voltage input that is being converted. The inverted output is designed
at 2V for 0.3V to 2.5V voltage conversion. It provides a fast response time for leakage

reduction and still makes Mp3 weak enough compared to Mn3 when there is logic
contention at n2.

The simulated waveforms of our proposed circuit are shown in Fig. 4.6 where N is
5. As in transitions from high to low, n1 is pulled up to VDDH to ensure the diode
transistor stack is able to draw current from VDDH. Therefore, both intp and intn
rise to within 10% of their corresponding steady state voltage in a few hundred ns
(FO4 delay at 0.3V in this technology is 18ns). Node intp has to be able to quickly

turn on the output pull-up transistor Mpo. In this design, the internal node between
Md2 and Md3 is chosen as intp. The tradeoff in the selection of this node is the


Figure 4.6: Waveforms demonstrating the operation of our proposed circuit. (Plot: voltages of nodes
out, intn, intp, and n1 in V versus time, 0 to 350ns.)

leakage power since intp will never reach VDDH. When in goes high again, n1 drops
from VDDH to turn off Mp2 in order to save leakage power in this state. Nodes intp

and intn decrease as well, allowing out to go high.

The guidelines for transistor sizing in the proposed level shifter can be summarized
as follows:

• Use minimal transistor dimensions for Mp1 to minimize the intrinsic loading on
  node intn.

• Determine the size of Mp2 based on leakage current constraints when the input
  is at state "1".

• Size Mn1 to meet the target leakage current at state "0" and the rise delay
  requirements.

• Choose N (the number of stacked PMOS devices) by calculating the fall delay
  based on supply voltage and Vt of each transistor.

• Verify that the pull-down strength of Mn3 is always stronger than the pull-up
  strength of Mp3 at process corners.


Figure 4.7: Sizing of diode-connected stacked PMOS (Wp) versus gate delay and power dissipation.
(Plot: gate delay in number of FO4 and power dissipation in nW versus Wp in µm.)

Among these steps, the sizing of transistors Md1 through MdN is the most difficult
to determine analytically since the operating range is mostly between the subthreshold

and superthreshold regions. For a better understanding of the tradeoffs, transistor

size Wp versus gate delay and power dissipation is simulated and shown in Fig. 4.7.
By taking the maximum of the rise and fall delays, gate delay represents the maximum

operating frequency of the level shifter. N is chosen to be 5 and the width of transistor
Mn1 is 5µm. When Wp is small, gate delay is dominated by the fall delay. Increasing
Wp both decreases the fall delay by reducing the effective pull-up resistance and
increases the rise delay as the parasitic capacitance also grows. To illustrate a typical
case for subthreshold operation at 300mV, the circuit is running at 100 FO4 delays with
an activity factor of 0.1. The cross-bar current dominates other leakage sources in
this scenario. Therefore, as Wp increases the power consumption decreases due to the
faster rising transition of internal nodes intp and intn, despite the fact that parasitic
capacitances also rise. As expected, power dissipation saturates to a certain value as
Wp becomes large and finally will start rising as parasitic capacitances dominate.


Figure 4.8: Comparison of level shifters (conventional versus proposed) across temperature.
(a) Gate delay, (b) Power consumption.

4.4 Simulation results

In this section, we compare the performance of the proposed level shifter to the
conventional design in terms of delay, power, robustness, and area. The target output
voltage VDDH is 2.5V, converted from a subthreshold input voltage of
300mV. The simulations are conducted using a commercial 0.13µm CMOS logic
technology. Power values do not include the switching power that drives a physical package
pin. As was explained in Sec. 4.2, the W/L of the pull-up transistors is 0.36µm/5µm
for the conventional level shifter.
We first examine the gate delay and power dissipation of both circuits at different

temperatures. Worst-case corner (fast PMOS and slow NMOS) is applied for all
transistors. In Fig. 4.8a, gate delay is plotted with respect to temperature. Both

circuits can operate sufficiently fast (roughly 5 FO4 delays) above room temperature.
However, the conventional circuit runs much slower at low temperatures and even
fails to function completely below -10°C. This is due to the pull-down NMOS devices
becoming exponentially weaker at low temperature while the PMOS devices become stronger
due to their mobility increase (as they are not operating in subthreshold). On the
other hand, the gate delay of our proposed circuit remains almost constant in terms
of FO4 delays across temperature. The power dissipation of the level converters is
calculated with a period of 5000 FO4 inverter delays. Fig. 4.8b clearly shows that the
proposed circuit has lower power than the conventional design. The first reason is
that the conventional circuit is slower, making it more susceptible to cross-bar
current. In addition, large NMOS sizes are required in the DCVS case due to
the difficulty in low temperature conversion, resulting in large parasitic capacitances.
This increases both switching and leakage power components.

Process variation is an important characteristic for subthreshold circuits. We
perform 5000 Monte Carlo SPICE simulations on both level shifters and show the
variation in gate delay when converting from 0.3V to 2.5V (Fig. 4.9). Process related
parameters such as Vt and geometry are the sweeping factors in this setup. Both level
shifters are configured the same way as the experiment in the previous paragraph. The
µ and σ of the conventional level shifter are 3.1 and 2.77 FO4 inverter delays, respectively.
On the other hand, the µ and σ of our proposed level shifter are 2.63 and 1.13 FO4
delays, respectively. Although the proposed level shifter is smaller in total area and
has a more complicated circuit structure, it is still less affected by process variations.
The reason is that our proposed circuit with the diode-connected pull-up stack has
less contention at the beginning of state switching compared to the conventional one.
In other words, the speed penalty caused by contention has less impact on our circuit
when process parameters vary.
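
To compare the two spreads on an equal footing, the reported statistics can be normalized by their means. The sketch below computes the coefficient of variation from the gate delay figures quoted above, under the assumption that the two numbers given for each design are the mean and the standard deviation of the Monte Carlo distribution.

```python
def coefficient_of_variation(mean, sigma):
    """Relative spread: standard deviation divided by the mean."""
    return sigma / mean


# Gate delay statistics in FO4 delays, as quoted in the text.
conventional = coefficient_of_variation(mean=3.1, sigma=2.77)
proposed = coefficient_of_variation(mean=2.63, sigma=1.13)

print(f"conventional sigma/mean = {conventional:.2f}")
print(f"proposed     sigma/mean = {proposed:.2f}")
```

With these inputs the relative spread is roughly 0.89 for the conventional design and 0.43 for the proposed design, so the proposed circuit shows about half the relative variation.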

Table 4.1 summarizes the circuit parameters of the conventional and proposed
level shifters at worst-case process corner and room temperature. These values do
not include the input buffer to the level shifters to simplify the comparison. Therefore,


Figure 4.9: Monte Carlo simulation results showing gate delay variation across process spread.
(Histogram: number of counts versus gate delay in number of FO4, for the proposed and conventional designs.)

Table 4.1: Comparison of level shifters.

                              Conventional design   Proposed design
Active energy (fJ)            828                   102
Leakage power (pW)            1080                  121
Rise delay (# of FO4)         4.74                  4.16
Fall delay (# of FO4)         4.64                  3.79
Total transistor area (µm²)   30.58                 11.11

the power consumption of the conventional one is underestimated due to its larger
input capacitances. The conventional level shifter suffers from area (calculated as the

sum of transistor areas) and power penalties in order to increase the driving strength
of pull-down transistors, especially at low temperatures. The routing cost and irregular
structure of the proposed circuit will reduce the area difference in physical design;
however, the proposed circuit still has a clear edge in this category.
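
The relative advantage in each row of Table 4.1 can be made explicit by dividing the conventional value by the proposed value. The short sketch below does this with the table entries; the dictionary layout and function name are illustrative only.

```python
# Values from Table 4.1 as (conventional, proposed).
table_4_1 = {
    "Active energy (fJ)": (828, 102),
    "Leakage power (pW)": (1080, 121),
    "Rise delay (FO4)": (4.74, 4.16),
    "Fall delay (FO4)": (4.64, 3.79),
    "Total transistor area (um^2)": (30.58, 11.11),
}


def improvement_ratios(rows):
    """Ratio of conventional to proposed value for every metric."""
    return {name: conv / prop for name, (conv, prop) in rows.items()}


ratios = improvement_ratios(table_4_1)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

The ratios come out to roughly 8x for active energy, 9x for leakage power, and 2.8x for transistor area, while the delays differ by about 1.1x to 1.2x.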

4.5 Conclusion

In this work, we proposed a subthreshold to I/O voltage level shifter that relies
on a pull-up transistor stack independent of the input voltage. Through a feedback

mechanism that reduces leakage when the input is high, it also improves the transition

speed of the circuit. The proposed level shifter was compared to the conventional
DCVS-type level shifter, and shows advantages in power dissipation, gate delay and

total area. The proposed level shifter is also capable of converting a 0.3V incoming
signal to 2.5V output robustly across process variation according to Monte Carlo
SPICE simulations.

CHAPTER V

SENSOR DATA RETRIEVAL USING

ALIGNMENT INDEPENDENT CAPACITIVE
SIGNALING

5.1 Introduction

Miniature self-sustaining sensor nodes have become a viable option with silicon tech-
nology scaling. Such a system can be easily attached to, or implanted into, various
objects for applications such as periodic sensing and recording of temperature or bio-

chemical data. With energy minimization techniques [80, 100, 101] and aggressive
power gating, these systems can potentially operate using a micro-fabricated battery

with comparable form factor over an extended period of time [56]. To maintain the
form factor for such systems, data read-out can be challenging from two perspec-

tives. First, hardware overhead has to be kept low such that the size of the system is
not dominated by the communication components. Secondly, power consumption and
instantaneous power spikes during read-out will determine the size of battery and pas-

sive components such as decoupling capacitors. Passive radio-frequency identification

(RFID) transponder techniques can be used to eliminate read-out energy dissipation

for the sensor, but this generally requires an external coil on a centimeter scale,
significantly limiting the application space [102]. Near-field pulse signaling through

inductive coupling has been reported to achieve high bandwidth using integrated in-
ductors while also being energy efficient [103, 104]. However, the power required for
sending data back from the sensor chip still needs to be supplied externally.

Capacitive coupling is another favored candidate for near field communication

due to its high bandwidth and low energy consumption capable of achieving less
than 0.1pJ/b [105, 106]. Simultaneous data and power transmission has also been

successfully tested for silicon on a stack [107]. It is an advantageous solution for

small form factor systems where chip stacking is applicable since all hardware can be
integrated into a silicon chip. On the other hand, the signal strength of capacitive

coupling is inversely proportional to the distance between the pads, which makes the
robustness of such a scheme very susceptible to misalignment. Pad alignment of about
3µm, achieved by markings on the edge of a scribe line, was reported [63]. Vernier
bar patterns were also proposed to electrically detect the alignment between chips so
that alignment error down to 1.4µm can be detected [64]. The accuracy of alignment
can be further improved by dividing each transmit plate into smaller microplates. By
driving the appropriate microplates with embedded switching circuits, mechanical
misalignment of up to ±25µm can be compensated [108]. More finely quantized alignment
information can be provided by a capacitive sensor and alignment circuits. As
reported in [109], the analog output of the alignment circuits is able to differentiate
alignment error down to 0.1µm.

In this work, we propose a capacitive coupling based method where the communi-
cation module is fully integrated with the sensor node on a sub-mm scale. The goal
is to provide a convenient read-out mechanism without the aid of an optical
microscope and positioning by micromanipulator. We use the terminology sensor chip (SC)
and data retrieval (DR) chip to indicate corresponding concepts referred to as the

transponder and interrogator in RFID systems. By dividing the data retrieval pads
into microplates, individual microplates can be grouped together to establish power

and signal channels after alignment is known. A digital alignment circuit is used to
translate misalignment into digital output so that the configuration can be computed
externally. In Sec. 5.2, the geometric design issues associated with capacitive coupling

will be discussed. The proposed system architecture will be shown in Sec. 5.3 along
with circuit blocks. Several design aspects will be highlighted in Sec. 5.4 with silicon
measurement results. Sec. 5.5 concludes the work.

5.2 Geometry optimization

Since our goal is to achieve chip to chip communication without fine tuning the

alignment, the pad pattern is designed considering the electrical field in the worst
case due to misalignment. Thus, we first seek the worst case scenario when stacking
two chips face-to-face. There are two assumptions in the following analysis:

1. The data retrieval chip is composed of a large array of square pads so that the
sensor chip is completely covered by the pad array.

2. The distance between the coupling pads is not a function of location, i.e. the
thickness of passivation layer is fixed.

The first assumption requires that the data retrieval array is large enough such that

the sensor chip can be easily dropped on top of it while the entire sensor chip is still
within the boundary of the receiver array. By designing the receiver array two times

larger than the sensor chip, this assumption can be satisfied without fine positioning
by micromanipulator. The second assumption relies on the uniformity of the final

passivation, which can be affected by many issues such as dust on the surface of
the chip. In general, this is not a deterministic process from the circuit designer's
point of view. Therefore, it is reasonable to make this assumption at design

time.
Both the power and signal channels are required to be established during com-

munication. For the sensor chip, the power pads will be allocated with as much area
as possible to maximize the charge that can be harvested. On the other hand, sig-

nal pad sizing presents a tradeoff between capacitive loading and coupling factor. A


Figure 5.1: The relative position of the receiver array and sensor signal pad when $W_{RX} \le 2W_{TX}$.
(Diagram labels: vertices A, B, C; side lengths r1 through r4; dimensions WRX, WTX, and Wsep.)

larger signal pad means greater energy consumption at each transition, while reduc-

ing the size decreases the sensible voltage seen by the data retrieval pad given fixed
parasitic capacitances. For the data retrieval array, the pads are placed as close as
possible so that the uncovered area can be minimized. The spacing between pads

is typically constrained by two DRC (Design Rule Check) rules in advanced VLSI
technologies: the metal density rule that is allowed in the process and the minimum

spacing between top metal layers. In the following analysis, the separation between
data retrieval pads is fixed at 5µm according to the CMOS process we use. Ideally,

dividing the pads into a smaller dimension is helpful for a finer configuration. In
reality, however, the minimum size of the pads will be decided by the area of the
functional blocks associated with each pad.

5.2.1 Sizing of the sensor pad

Fig. 5.1 illustrates the worst case condition given that all the signal pads are square
in this work. $\theta$ is defined as the offset angle between the two chips. Here $W_{RX}$, $W_{TX}$,
and $W_{sep}$ are the width of the data retrieval pads, the width of the sensor signal pad,
and the separation between the data retrieval pads, respectively. Since the pads are all
squares, by symmetry $\theta$ only needs to be considered over $[0, \pi/4]$. In this case
$W_{RX} \le 2W_{TX}$, and polygon ABCD represents the area of interest, which is used to
calculate the coupling capacitance of the pads. Line segments $r_1$ through $r_4$ are used
to represent the length of the sides. Coupling capacitance is the sum of the parallel plate
capacitance and the fringing capacitance

$$C_c = C_{pp} \cdot (\text{Area of polygon } ABCD) + C_{fr,TX} \cdot \overline{BCD} + C_{fr,RX} \cdot \overline{DAB} \tag{5.1}$$

where $C_{pp}$ is the parallel plate capacitance per unit area, $C_{fr,RX}$ is the fringing
capacitance per unit length of the data retrieval pads and $C_{fr,TX}$ is the fringing capacitance
per unit length of the sensor pads. Using trigonometric functions, $r_1$ through $r_4$ can
be written as:

$$r_1 = \frac{W_{TX}}{2}\sec\theta - \frac{W_{sep}}{2}(1 - \tan\theta) \tag{5.2}$$

$$r_2 = \frac{W_{TX}}{2}(1 - \tan\theta) - \frac{W_{sep}}{2}\sec\theta \tag{5.3}$$

$$r_3 = \frac{W_{TX}}{2}\sec\theta - \frac{W_{sep}}{2}(1 + \tan\theta) \tag{5.4}$$

$$r_4 = \frac{W_{TX}}{2}(1 + \tan\theta) - \frac{W_{sep}}{2}\sec\theta \tag{5.5}$$

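It is noted that the above expressions are physically meaningful only when vertex C is
still inside the data retrieval pad. In other words, they are valid when $\theta \le \theta_t$, where
$\theta_t$ is the angle when vertex B and vertex C overlap. Combining Eq. 5.2 through Eq.
5.5, the area of polygon ABCD can be obtained and simplified as

$$\begin{aligned}
\text{Area of polygon } ABCD &= \frac{1}{2}(r_1 r_3 + r_2 r_4) \\
&= \frac{1}{2}\left[\frac{W_{TX}}{2}\sec\theta - \frac{W_{sep}}{2}(1 - \tan\theta)\right]\left[\frac{W_{TX}}{2}\sec\theta - \frac{W_{sep}}{2}(1 + \tan\theta)\right] \\
&\quad + \frac{1}{2}\left[\frac{W_{TX}}{2}(1 - \tan\theta) - \frac{W_{sep}}{2}\sec\theta\right]\left[\frac{W_{TX}}{2}(1 + \tan\theta) - \frac{W_{sep}}{2}\sec\theta\right] \\
&= \frac{1}{4}W_{TX}^2 + \frac{1}{4}W_{sep}^2 - \frac{1}{2}W_{TX}W_{sep}\sec\theta
\end{aligned} \tag{5.6}$$

while the line segments BCD and DAB are

$$\overline{BCD} = r_2 + r_4 = W_{TX} - W_{sep}\sec\theta \tag{5.7}$$

$$\overline{DAB} = r_1 + r_3 = W_{TX}\sec\theta - W_{sep} \tag{5.8}$$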

As mentioned before, $W_{sep}$ is designed to be small enough to maximize the coupled
area. Thus, it is reasonable to assume that $C_{fr,RX}$ is negligible compared to
$C_{fr,TX}$ because the electric field lines from sidewall DAB are mostly terminated at
the neighboring receiver pads instead of the sensor pad. $C_c$ can then be rewritten as

$$C_c = C_{pp}\left(\frac{1}{4}W_{TX}^2 + \frac{1}{4}W_{sep}^2 - \frac{1}{2}W_{TX}W_{sep}\sec\theta\right) + C_{fr,TX}\left(W_{TX} - W_{sep}\sec\theta\right) \tag{5.9}$$

Similar derivations can also be applied to other cases such as $\theta > \theta_t$ or when
$W_{RX} > 2W_{TX}$. Based on the analysis, the worst case occurs when $\theta = \pi/4$.
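
The algebra leading from the side lengths to the closed-form area can be checked numerically. The following Python sketch implements Eqs. 5.2 through 5.9, reading each Wsep term as subtracted from the corresponding WTX term, and compares the area computed from the segments against the simplified expression. The pad dimensions follow the text; the per-area and per-length capacitance values are hypothetical placeholders chosen only to exercise the formula.

```python
import math


def side_lengths(w_tx, w_sep, theta):
    """Segments r1..r4 of polygon ABCD (Eqs. 5.2-5.5)."""
    sec, tan = 1.0 / math.cos(theta), math.tan(theta)
    r1 = w_tx / 2 * sec - w_sep / 2 * (1 - tan)
    r2 = w_tx / 2 * (1 - tan) - w_sep / 2 * sec
    r3 = w_tx / 2 * sec - w_sep / 2 * (1 + tan)
    r4 = w_tx / 2 * (1 + tan) - w_sep / 2 * sec
    return r1, r2, r3, r4


def area_closed_form(w_tx, w_sep, theta):
    """Simplified polygon area (last line of Eq. 5.6)."""
    return w_tx ** 2 / 4 + w_sep ** 2 / 4 - w_tx * w_sep / (2 * math.cos(theta))


def coupling_capacitance(w_tx, w_sep, theta, c_pp, c_fr_tx):
    """Eq. 5.9: parallel-plate term plus sensor-pad fringing term."""
    bcd = w_tx - w_sep / math.cos(theta)
    return c_pp * area_closed_form(w_tx, w_sep, theta) + c_fr_tx * bcd


W_TX, W_SEP = 150.0, 5.0      # um, dimensions used in the text
C_PP, C_FR_TX = 1e-3, 1e-2    # fF/um^2 and fF/um, hypothetical values

# Angles kept below theta_t so that all four segments stay positive.
for theta in (0.0, 0.25, 0.5, 0.75):
    r1, r2, r3, r4 = side_lengths(W_TX, W_SEP, theta)
    area_from_segments = 0.5 * (r1 * r3 + r2 * r4)
    c_c = coupling_capacitance(W_TX, W_SEP, theta, C_PP, C_FR_TX)
    print(f"theta={theta:.2f}: area={area_from_segments:.1f} um^2, Cc={c_c:.3f} fF")
```

Because both the area and the BCD length decrease with sec θ, the computed capacitance falls monotonically as the offset angle grows, consistent with the worst case lying at the largest angle.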

With the aid of 3D field solver tools [110], the relationship between $W_{TX}$ and
coupling capacitance can be found as in Fig. 5.2. In the technology used in this work,


Figure 5.2: Relationship between pad size of sensor chip to the coupling capacitance with
$W_{RX}$ = 50µm. (Plot: coupling capacitance in fF versus $W_{TX}$ in µm, from 40 to 200µm.)

the minimum size of the data retrieval pad is 50µm due to the active circuit area.
With $W_{RX}$ and $W_{sep}$ being 50µm and 5µm, the simulation results with respect to
different outer dimensions of $W_{TX}$ are plotted. The coupling capacitance gradually
increases until about 150µm. At this point the sensor pad is large enough to cover at
least one data retrieval pad no matter where it is located. Further simulation results
show that the difference between coupling capacitance at different orientations is
within 1%, suggesting that a consistently good coupling ratio can be achieved at
$W_{TX}$ = 150µm. To sum up, sensor pads are chosen to be about three times larger
than the receiver pad to maximize coupling in the worst case condition.
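
A simple geometric argument gives a feel for why a sensor pad of about 150µm suffices. The sketch below is not the field-solver analysis used in this work; it is a simplified sufficient condition that assumes the sensor pad can be replaced by the largest centered axis-aligned square it contains, and that such a square always encloses one whole data retrieval pad once its side is at least one pad pitch plus one pad width. The function names are illustrative.

```python
import math


def inscribed_axis_aligned_side(w_tx, theta):
    """Side of the largest centered axis-aligned square that fits inside a
    square of side w_tx rotated by theta (0 <= theta <= pi/4)."""
    return w_tx / (math.cos(theta) + math.sin(theta))


def guaranteed_full_cover(w_tx, w_rx, w_sep):
    """Sufficient (not necessary) test that the sensor pad fully covers at
    least one data retrieval pad for any offset and rotation."""
    pitch = w_rx + w_sep
    worst_side = inscribed_axis_aligned_side(w_tx, math.pi / 4)
    # A span of length L always contains one whole pad if L >= pitch + w_rx.
    return worst_side >= pitch + w_rx


def min_sensor_pad(w_rx, w_sep):
    """Smallest w_tx that passes the sufficient test above."""
    return math.sqrt(2) * (2 * w_rx + w_sep)


print(f"minimum W_TX = {min_sensor_pad(50.0, 5.0):.1f} um")
print("150 um passes:", guaranteed_full_cover(150.0, 50.0, 5.0))
print("140 um passes:", guaranteed_full_cover(140.0, 50.0, 5.0))
```

For a 50µm pad and 5µm spacing this bound evaluates to about 148.5µm, which is close to the 150µm value read from the simulation results, although the agreement should be taken as indicative only.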

5.2.2 Single-ended vs. differential signaling

In the previous section, only a single pad was considered to transmit a signal from
the sensor. On the other hand, the signal strength can be doubled by implementing

differential signaling. Consider the diagram shown in Fig. 5.3, assuming that the
dimensions of the pads are the same as given in Sec. 5.2.1. In this scheme, both Pads


Figure 5.3: Differential signaling scheme. Pad A (square with slant lines) together with all the other
pads in light gray are used to recover the signal from the sensor.

A and B are required to amplify the differential signal from the sensor pads. Since
the sensor chip can land in any orientation, 15 DR pads along with Pad B have to
be routed into Pad A to make sure that signals from both sensor pads are able to be

picked up by the DR pad.

In a simplified analysis, the coupled voltage from the sensor pad to the receiver pad
is proportional to the ratio of coupling capacitance ($C_{couple}$) and ground capacitance
($C_{gnd}$), where $C_{gnd}$ already includes the input capacitance of the amplifier. For a
single-ended signaling scheme, the coupling coefficient can be written as

$$C_{c,single} = \frac{V_{couple}}{V_{tran}} = \frac{C_{couple}}{C_{gnd} + C_{couple}} \tag{5.10}$$

$V_{couple}$ and $V_{tran}$ are the coupled voltage and transmitted amplitude, respectively. For
a differential signaling scheme, the coupling coefficient is given by

$$C_{c,diff} = \sum_{i=1}^{2} \frac{C_{couple,i}}{C_{gnd,i} + C_{couple,i} + N C_{sw} + C_{wire}} \tag{5.11}$$

where $C_{sw}$ is the device loading of the switches that control the destination of the coupled
signal, N is the number of other pads the pad has to connect to, and $C_{wire}$ denotes
the extra wire loading due to the differential signaling scheme. Assuming
$C_{couple,1} \approx C_{couple,2} = C_{couple}$ and $C_{gnd,1} \approx C_{gnd,2} = C_{gnd}$, the difference
between $C_{c,single}$ and $C_{c,diff}$ is

$$\begin{aligned}
C_{c,single} - C_{c,diff} &= \frac{C_{couple}}{C_{gnd} + C_{couple}} - \frac{2C_{couple}}{C_{gnd} + C_{couple} + N C_{sw} + C_{wire}} \\
&= \frac{C_{couple}}{C_{gnd} + C_{couple}} \cdot \frac{N C_{sw} + C_{wire} - C_{gnd} - C_{couple}}{C_{gnd} + C_{couple} + N C_{sw} + C_{wire}}
\end{aligned} \tag{5.12}$$

In other words, the differential scheme is better than the single-ended scheme only
when the sum of $N C_{sw}$ and $C_{wire}$ is smaller than the sum of $C_{gnd}$ and $C_{couple}$.
$C_{gnd}$ and $C_{couple}$ can be estimated from the process and geometry, or more precisely,
through RC extraction tools. For a DR pad that is 50µm on each side, $C_{gnd}$ is
40-50fF if the signal and power routing underneath it are restricted to metal 3 or
below. $C_{sw}$ can generally be ignored if, for example, a transmission gate that is four
times as large as the minimum sized transistor is used. $C_{wire}$ can be estimated from the
wire length. Considering that 15 extra connections each require 150µm of minimum-width
metal wiring, the total wire loading is 150fF assuming isolated wires. Unless
$C_{couple}$ is more than two times larger than $C_{gnd}$, the differential signaling scheme will not
offer any advantage over the single-ended counterpart. In addition, complex
wiring in the differential signaling scheme will force wires to be routed at higher
levels of metal and will increase $C_{gnd}$ as a result. Therefore, single-ended signaling is
implemented in this work. The dimensions of the pads used in the data retrieval chip and
sensor chip are summarized in Table 5.1. Due to fabrication constraints, the actual
footprints of the pads are slightly different from the designed values. For example, the
DR pad size is reduced from 50µm to 48µm on a side to comply with metal density
rules.
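
The comparison between the two signaling schemes can be reproduced with a few lines of Python. The sketch below evaluates Eq. 5.10 and Eq. 5.11 (with both pads assumed identical) using the ground and wire capacitance estimates from the text. The sweep of coupling capacitance values is hypothetical and chosen only to show both sides of the break-even point.

```python
def coupling_single(c_couple, c_gnd):
    """Single-ended coupling coefficient (Eq. 5.10)."""
    return c_couple / (c_gnd + c_couple)


def coupling_differential(c_couple, c_gnd, n, c_sw, c_wire):
    """Differential coupling coefficient (Eq. 5.11), identical pads assumed."""
    return 2 * c_couple / (c_gnd + c_couple + n * c_sw + c_wire)


C_GND = 45.0     # fF, midpoint of the 40 to 50 fF estimate in the text
C_WIRE = 150.0   # fF, total extra wire loading estimated in the text
C_SW = 0.0       # fF, switch loading treated as negligible, as in the text
N = 15           # number of extra pad connections

for c_couple in (5.0, 50.0, 105.0, 150.0):   # fF, hypothetical sweep
    single = coupling_single(c_couple, C_GND)
    diff = coupling_differential(c_couple, C_GND, N, C_SW, C_WIRE)
    better = "differential" if diff > single else "single-ended"
    print(f"C_couple={c_couple:6.1f} fF: single={single:.3f}, "
          f"diff={diff:.3f} -> {better}")
```

With these numbers the two schemes break even at a coupling capacitance of 105fF (that is, N·Csw + Cwire − Cgnd), which is a little over twice Cgnd. Coupling capacitances of only a few fF, as in Fig. 5.2, fall well on the single-ended side.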

Table 5.1: Summary of pad dimensions.

                      Pad size (µm)         Pad spacing (µm)   Number of pads
Sensor chip           Power: 225 by 225     20                 Power: 2
                      Signal: 150 by 150                       Signal: 1
Data retrieval chip   48 by 48              5                  400

5.3 System architecture

Fig. 5.4 shows the proposed system diagram for sensor data retrieval. The data
retrieval chip is responsible for sending power and recovering data from the sensor

chip at the same time. Since there is no common reference for both chips, two power
channels are required to send AC power differentially. An AC to DC converter at the
sensor chip side is used to harvest the supply voltage for the sensor. The clock signal

is modulated with the power signals and can be demodulated by the sensor chip,
so no additional channel is needed for synchronization. This also helps to precisely

control the sensing window of the receiver circuit for better noise rejection. A single
signal channel is used to transmit data back to the data retrieval chip as suggested

in the previous section.


Figure 5.4: System architecture for the proposed data retrieval mechanism.


Figure 5.5: Data retrieval array showing 20 by 20 cells and controller.

5.3.1 Data retrieval circuits design

While the sensor chip has three pads dedicated to individual channels, the data
retrieval chip contains an array of 20 by 20 cells that each can be assigned as the

signal channel or can be clustered as a power channel as needed (Fig. 5.5). Each cell
is tied to a corresponding DR pad, which serves as communication channels that are

reconfigurable based on alignment information. One of the following three functions
can be performed by the DR cell at any given time:

1. Alignment detection. Alignment information is transformed to digital output

and can be scanned out for post-processing.

2. Power transmission. The pad is driven by level converters with elevated ampli-

tude to strengthen the signal that is able to reach the sensor pads.

3. Signal recovery. The capacitively coupled signal is first amplified and then
decoded by the DR controller.


Figure 5.6: Alignment detector. (a) block diagram, (b) operation waveform.

After the sensor chip is dropped on top of the data retrieval chip, the alignment detector
shown in Fig. 5.6(a) is used to determine the best configuration. The alignment
detector is essentially a ring-oscillator-based capacitance-to-digital converter that
digitizes the capacitive loading of each DR pad. The ring oscillator converts the
capacitance into frequency information represented by RING_CLK. RING_CLK then
increments the synchronous counter during a fixed period in which ENABLE is high
(defined by SYS_CLK). The operation waveform is shown in Fig. 5.6(b). To account for
the different speeds of the ring oscillators across the DR array, a one-time
zero-calibration method needs to be implemented (Sec. 5.4.2). Although the output
has to be limited to 9 bits to physically fit underneath each cell, the circuits can be
operated in cyclic mode. This means that the alignment information is maintained
even though the counter overflows and the carry-out information is discarded. We will
revisit the alignment detection issue in Sec. 5.4.2 to explain how useful information
can be extracted efficiently for the whole data retrieval array.
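The cyclic-mode property can be illustrated with a minimal sketch, assuming the 9-bit counter width given above. The count values and function names below are hypothetical and are not taken from the test chip, and the sketch only holds when the true change in count is smaller than half the counter range.

```python
COUNTER_BITS = 9                 # output width stated in the text
MODULUS = 1 << COUNTER_BITS      # 512 counts before the counter wraps


def counter_readout(ring_cycles):
    """Value left in a wrapping counter after `ring_cycles` RING_CLK edges."""
    return ring_cycles % MODULUS


def count_shift(zero_cal, loaded):
    """Signed drop in count between the calibration and loaded readouts.

    Valid only when the true shift lies within [-MODULUS/2, MODULUS/2).
    """
    diff = (zero_cal - loaded) % MODULUS
    return diff - MODULUS if diff >= MODULUS // 2 else diff


# Hypothetical example: an unloaded pad yields 1300 ring cycles while ENABLE
# is high, and extra coupling capacitance slows the oscillator to 1270 cycles.
zero_cal = counter_readout(1300)        # the counter has wrapped twice
loaded = counter_readout(1270)
shift = count_shift(zero_cal, loaded)   # carry-out is lost, the shift is not
```

Because only the difference between two readouts matters, discarding the carry-out does not lose the alignment information as long as the shift stays within half the counter range.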
For the power transmission drivers, traditional DCVS (differential cascode voltage
switch) level converters are used. Such level converters can easily operate at
an output amplitude three times higher than the nominal supply voltage
within the carrier frequency range of interest (tens of MHz). The clock signal is globally
distributed to every cell and is locally inverted if an out-of-phase signal is required.
To reduce the parasitic capacitance of the DR pads, we restrict the routing
layers to metal 3 and below. Uniform clock wire routing is achieved throughout
the DR array by placing the clock drivers on one side of each row. This
provides a feasible routing scheme compared to an H-tree clock network, at the
expense of larger clock skew. The problem of clock skew will be discussed in Sec. 5.4,
as it limits the carrier frequency for power harvesting.
Fig. 5.7 shows the data retrieval mechanism. Two differential amplifiers are
used to detect both the rising and falling transitions. The input node (Vin) is
precharged high before the clock goes low to sensitize the amplifiers. Immediately
after the clock fires, either Vlh or Vhl will be pulled down depending on the direction
of the coupled signal. The high-to-low transition triggers the 400-to-1 AND tree
that simultaneously monitors all DR pads and produces an UP/DN signal for the
one-bit saturation counter that determines the data output. The difference between
Vdc and Vdc1/Vdc2 is designed to be 50mV to mitigate input offset voltage and the
impact of noise. The timing diagram in Fig. 5.8 shows that the operation is
synchronized to ext_clk. The signal transition only happens after the negative edge of
ext_clk and is latched at the positive edge. In this scheme, the preset signal is used
both to precharge Vin and to enable the decoder to detect switching events. In other words,
the impact of noise on the floating node Vin can be minimized by properly controlling
the pulse width of preset. Both the pulse width of preset and its delay from
ext_clk are programmable through delay lines.
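The decoding step can be sketched behaviorally. This is a minimal model, assuming that a detected rising transition produces UP, a falling transition produces DN, and a cycle without a transition holds the previous bit; the function names are hypothetical and the model ignores analog effects such as offset and noise.

```python
def transitions_of(bits, initial=0):
    """Per-cycle events a transmitter would produce: 'UP', 'DN' or None (no edge)."""
    events, previous = [], initial
    for bit in bits:
        if bit > previous:
            events.append("UP")
        elif bit < previous:
            events.append("DN")
        else:
            events.append(None)
        previous = bit
    return events


def decode_events(events, initial=0):
    """One-bit saturating counter: UP saturates at 1, DN saturates at 0, None holds."""
    state, bits = initial, []
    for event in events:
        if event == "UP":
            state = min(state + 1, 1)
        elif event == "DN":
            state = max(state - 1, 0)
        bits.append(state)
    return bits


sent = [1, 1, 0, 1, 0, 0, 1]
received = decode_events(transitions_of(sent))
```

Saturation means that a repeated UP (or DN) event leaves the output unchanged, so a spurious duplicate detection in the same direction does not corrupt the bit.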

5.3.2 Sensor chip circuit design

The main building block of the sensor chip is the AC to DC conversion circuit shown
in Fig. 5.9. The AC coupled inputs Vin and Vinn are rectified into DC supply voltages

by cascading voltage doublers. Each voltage doubler contains a full-bridge hybrid

cross-coupled PMOS rectifier. Transistors md1 and md2 set the lowest voltage of Vn1

and Vn2 to Vdc1 . After each input transition at Vin and Vinn , Vdc2 is charged with

Figure 5.7: Data retrieval with capacitively coupled input and periodic precharge to sensitize the
amplifier.

Figure 5.8: Timing diagram showing the operation of data retrieval circuits when switching happens.

Figure 5.9: AC to DC conversion circuits for sensor chip power harvesting.

a potential equal to Vdc1 + Vin by the cross-coupled PMOS transistors md3 and md4, where
Vin is the coupled amplitude at the sensor chip. Although replacing md1 and md2
with cross-coupled NMOS transistors is advantageous in reducing the turn-on voltage of
the first few stages, it is not feasible for stages with higher input voltages. The reason
is that without a triple-well or deep N-well process, the body effect can eventually result
in a large NMOS threshold voltage. At the output of the 10th stage, a voltage limiter
prevents the supply voltage from going above the operating range.
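The stage-by-stage build-up can be sketched with an idealized model. It assumes lossless stages that each add exactly the coupled amplitude, and it models the limiter as a simple clamp on the last stage only, which is a simplification of the shunt behavior; the function name and the example amplitude are hypothetical.

```python
def ideal_stage_voltages(v_coupled, n_stages=10, v_limit=1.6):
    """Ideal outputs VDD1..VDDn of cascaded stages, each adding v_coupled.

    Rectifier losses are ignored and only the final output is clamped
    to v_limit (assumed simplification of the voltage limiter).
    """
    outputs = [k * v_coupled for k in range(1, n_stages + 1)]
    outputs[-1] = min(outputs[-1], v_limit)
    return outputs


# Hypothetical coupled amplitude chosen so that the 4th stage sits at 0.65 V.
stages = ideal_stage_voltages(0.65 / 4)
vdd4, vdd10 = stages[3], stages[9]
```

Under these assumptions an amplitude that places the fourth stage at 0.65V would push the tenth stage slightly above 1.6V, which is where the clamp takes effect.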

The design of the voltage limiter is shown in Fig. 5.10(a). The general concept
is similar to the mode selector in [85]. In this work, a shunt transistor m10 is used
to discharge current from Vin (VDD10 in Fig. 5.9) to ground when Vin is above a
certain voltage level. The open-loop voltage transfer curve in Fig. 5.10(b) helps
explain how this voltage is set in hardware. Node n2 will remain close to VSS
until Vin exceeds 2V (where V is the turn-on voltage of the diode-connected

Figure 5.10: Voltage limiter. (a) Circuit diagram, (b) open-loop voltage transfer curve.

transistors m5 and m6). When Vin increases beyond 2V, the excess voltage drop
will occur mainly across R1, and thus the voltage on n2 begins to track the supply
voltage. On the other hand, the voltage on n1 will be limited to 2V once the supply
voltage is higher than this value. By comparing n1 and n2, the amplifier output n3 will
begin to turn on m10 strongly when the supply voltage is greater than 1.6V. Since each
voltage doubler stage is identical, intermediate voltage levels VDD1 through VDD10
are inherently generated. In this work, we use VDD4 (0.65V) to supply a 4-bit LFSR
circuit that generates a data stream with low power consumption, and then
up-convert the data to VDD10 to increase signal strength before transmission. A
power-on-reset circuit is usually required to avoid the deadlock situation in which all
the register outputs are zero. This is relatively easy to handle for the LFSR circuit used
in this work to represent logic, since a NAND4 gate can force the LFSR to advance
its state if it starts in the deadlock state.
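The deadlock escape can be sketched in software. The text does not give the feedback taps, so the polynomial x^4 + x^3 + 1 below is an assumption, as are the function names; the zero-state check only mimics the role of the NAND4 gate and is not a gate-level model.

```python
def lfsr4_step(state):
    """Advance a 4-bit Fibonacci LFSR (assumed polynomial x^4 + x^3 + 1)."""
    if state == 0:
        # Deadlock escape: force a 1 into the register so the sequence starts.
        return 0b0001
    feedback = ((state >> 3) ^ (state >> 2)) & 1
    return ((state << 1) | feedback) & 0xF


def run_lfsr(seed, steps):
    """List of register states after each of `steps` clock cycles."""
    states, state = [], seed
    for _ in range(steps):
        state = lfsr4_step(state)
        states.append(state)
    return states


from_seed_one = run_lfsr(0b0001, 15)   # visits every nonzero state once
from_deadlock = run_lfsr(0b0000, 16)   # leaves the all-zero state on the first step
```

With these taps the register cycles through all 15 nonzero states before repeating.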

For clock synchronization, the system clock is amplitude modulated onto the carrier
frequency fc using the same power channels. An envelope detector is used to demodulate
the clock signal as shown in Fig. 5.11. The differential AC input signal is first
rectified and then filtered by an RC low-pass filter. Since the input amplitude varies

Figure 5.11: Envelope detector for sensor chip.

due to several factors such as the transmitting amplitude and the distance between
pads, a level converter is required so that the demodulated clock is able to drive
the logic blocks at 0.65V. For robust level conversion of subthreshold input voltages,
a single-stage comparator is implemented. In this circuit, VDD1 and VDD2 from
the voltage doubler stages are used as the reference voltage and bias voltage for the
comparator, respectively. In this way, as long as the rectified voltage is higher than
VDD1, the demodulator is able to work properly.

5.4 Chip measurement

5.4.1 Test chip

A test chip was fabricated in 0.13µm CMOS technology. The die photo is shown in
Fig. 5.12. The active die area consumed by the sensor chip is 0.014mm². The size of
the data retrieval array is 1.1mm × 1.1mm, while the total size of the DR controller
and clock generator is 0.08mm². During measurement, the data retrieval chip is
packaged and mounted on a PCB. The sensor chip is diced to 0.5mm by 0.5mm and
is manually dropped on top of the data retrieval array without precise positioning.
Once the two chips are stacked, we first perform alignment detection and scan out

Figure 5.12: Chip die photo.

the information to be externally processed by a PC. The PC will match the data to
a known pattern and determine the channel that a particular pad should be assigned

to for the DR array. Alternatively, the computations can also be processed on chip if
an ALU (Arithmetic Logic Unit) is available. Data clock fdata is generated externally

by a function generator and sent along with the decoded data to a PC-based logic
analyzer to compute BER (Bit Error Rate).

5.4.2 Alignment detection

We have seen that alignment information can be obtained using the ring oscillators to
extract different coupling capacitances seen by each DR pad. To reduce the conver-

sion time, we would like to run as many alignment detectors in parallel as possible.
However, activating all alignment detectors at the same time will yield results that do

not contain any alignment information. This can be explained by Fig. 5.13 showing
the parasitic components of the system when two chips are put in a stack. For DR
pads P1 through P5, the parasitic capacitors include coupling capacitors Cc1 through

Figure 5.13: Parasitic components for the system of two chips in a stack.

Cc5, ground capacitors Cg1 through Cg5, and capacitors Cp1 through Cp4 that exist
between pads. If all the pads oscillate simultaneously, the coupling capacitances
are blocked from the AC ground, and therefore the location of
the sensor chip has no impact on the alignment detectors. In addition,
since the impedance of Cp1 through Cp4 is low at high frequency, the whole system
will oscillate at the same frequency. To solve this problem, at least one neighboring
pad should be grounded for any given oscillating pad. For example, P2 and P4 are
grounded when P1, P3 and P5 are running to provide a close AC return path to
ground.
From this analysis, we can develop the alignment detection algorithm in a sys-

tematic way (Fig. 5.14). The DR array is first divided into four quadrants and only
one quadrant is activated at a time. By repeating the capacitance-to-digital conver-

sion four times the results can be merged into a two dimensional table. The table
represents a set of zero calibration values for the specific data retrieval chip. The

same procedure needs to be repeated again every time the sensor is dropped on top

Figure 5.14: Procedures for alignment detection and pad reconfiguration.

of the DR array to generate another 2D table that represents the actual alignment.
The 2D contour plot shown on the bottom right of Fig. 5.14 is obtained by simply
subtracting one 2D table from the other. Each pixel of the plot indicates the
excess coupling capacitance due to the presence of the sensor chip. From the plot,
both the outline of the sensor chip and the positions of the power pads and signal pad
can be clearly seen. With the digitized alignment information, the clusters for power
pads and signal pad can be computed by comparing the results with a known pattern
derived from the chip geometry. As a result, the channels for power transmission and
signal reception can be identified and reconfigured properly every time, regardless of
the position and orientation of the sensor chip.
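The procedure in this section can be sketched as follows. The 2x2 zone interleaving is an assumption read from the pattern drawn in Fig. 5.14, the sign convention (zero-calibration minus loaded count) is also assumed, and all names and table values are hypothetical.

```python
ARRAY_SIZE = 20          # 20 by 20 DR cells, as in the text
MODULUS = 512            # 9-bit cyclic counter

# Assumed zone interleaving, following the 2x2 pattern drawn in Fig. 5.14.
ZONE_BY_PARITY = {(0, 0): 2, (0, 1): 1, (1, 0): 3, (1, 1): 4}


def zone_of(row, col):
    """Zone (1..4) that a cell belongs to under the assumed 2x2 interleaving."""
    return ZONE_BY_PARITY[(row % 2, col % 2)]


def has_grounded_neighbor(row, col, n=ARRAY_SIZE):
    """True if an adjacent cell is in another zone, i.e. grounded while this cell runs."""
    neighbors = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return any(
        0 <= r < n and 0 <= c < n and zone_of(r, c) != zone_of(row, col)
        for r, c in neighbors
    )


def excess_counts(zero_table, loaded_table, modulus=MODULUS):
    """Signed per-cell difference between zero-calibration and loaded readouts."""
    def signed(diff):
        diff %= modulus
        return diff - modulus if diff >= modulus // 2 else diff
    return [
        [signed(z - l) for z, l in zip(zero_row, loaded_row)]
        for zero_row, loaded_row in zip(zero_table, loaded_table)
    ]


# Hypothetical 2x2 corner of the array: only the top-left cell sees extra coupling.
zero_table = [[276, 300], [6, 150]]
loaded_table = [[246, 301], [5, 150]]
contour = excess_counts(zero_table, loaded_table)
```

With this interleaving, every cell in the active zone has all of its direct neighbors in other zones, so the grounded-neighbor requirement from the parasitic analysis is met in each of the four passes.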

5.4.3 Measurement results

Measured waveforms of the test chip are shown in Fig. 5.15. At a clock frequency of
1.1MHz, the decoded output shows the data sequence that repeats every 15 cycles. We
define the achievable operating frequency (or data rate, since there is only one serial
data bit) of this system as the highest frequency at which no errors occur in 10^9 cycles.
Achievable data rate

Figure 5.15: Decoded data waveform showing pseudo random bit sequences up to 15 unrepeated
cycles.

is measured with different transmitting amplitudes (Ain) and carrier frequencies (fc). The
results are shown in Fig. 5.16. I/O devices are used for power transmission so Ain can
be as high as 3.3V in this 0.13µm technology. The system starts successfully receiving
the data sequence with a BER of less than 10^-9 when Ain exceeds 1.8V. The estimated
working distance is also shown on the second x-axis of the fdata plot. Based on measurement
data, I/O devices would not be needed if the passivation thickness were reduced by 1/3
from its original value of 5.6µm (e.g., by further polishing). Increasing Ain monotonically
increases the data rate as expected. At 3V, a data rate as high as 2.5MHz can be
achieved with fc of 216MHz. However, it is observed that raising fc above 150MHz
in fact reduces fdata. The reason is that at higher frequencies the clock skew between
different cells can cause a phase offset between signals in the same power cluster,
eventually reducing the electric field. Since targeted data working sets for sensor
nodes are on the order of kb [111, 112], the achievable data rate is sufficient for
complete data retrieval on the ms timescale.
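As a rough check of the ms-timescale claim, the retrieval time for a working set of N bits at data rate f_data is N divided by f_data. The 1kb working set below is an illustrative assumption consistent with the "order of kb" figure; the two data rates are the measured values quoted above.

```latex
\[
t = \frac{N}{f_{data}}
  = \frac{10^{3}\ \text{bits}}{1.1 \times 10^{6}\ \text{bits/s}} \approx 0.91\ \text{ms},
\qquad
\frac{10^{3}\ \text{bits}}{2.5 \times 10^{6}\ \text{bits/s}} = 0.4\ \text{ms}
\]
```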

Energy numbers for the test chip are shown in Fig. 5.17. It is clear that increasing
fc penalizes overall energy consumption since data rate does not scale well with carrier

Figure 5.16: Operating frequency versus transmitting amplitude and carrier frequency, with
estimated working distance shown on the second x-axis.

frequency fc. In this measurement, the transmitting amplitude for minimal energy
is around 2.8V. If operated above the minimal energy point, the junction will be
slightly forward-biased after each transition for the rectifier circuit shown in Fig. 5.9.
Therefore, the charge that can be harvested begins to saturate, which lowers
rectifier efficiency. The lowest energy achieved by the proposed system is 2nJ/bit.
Fig. 5.18(a) shows BER with respect to the window size (Tw), which is related to
the modulated clock for power transmission. Tw is defined as the period during which the
output clkmod (Fig. 5.18(b)) remains at 0. It is required for clock synchronization,
as the sensor chip needs to demodulate the clock signal and send back the
data within the window Tw. This sets the lower bound for Tw because of the
demodulator's response time. From Fig. 5.18(a), the bathtub shape of BER suggests
that there is also an upper bound for Tw. The reason is that the charge that can be
harvested by the sensor chip in a given period of time decreases as Tw increases. In
general, we need to fine tune Tw within a range of tens of ns for higher data rates. On

Figure 5.17: Energy consumption versus transmitting amplitude and carrier frequency.

Figure 5.18: (a) Tw versus BER, (b) Clock modulation circuit that defines Tw.

the other hand, since data rates in the MHz range may be excessive for the application,
the design requirement for Tw can be relaxed by simply reducing the transmitting
data rate.
Fig. 5.19 shows the data rate vs. BER for 10 random locations at which the

sensor was dropped. The alignment 2D contour plots (8 out of 10 locations) are

Figure 5.19: Data rate versus BER with 10 random position testing.

also shown for the corresponding BER curves. Some regions yield a lower data rate
mainly because the electric field between the pads is not as strong as in the others. The
results fall into two distinct regions of the plot; however, there is no clear
correlation between the position and the achievable data rate. A non-uniform
passivation surface may be one cause of the discrepancy. These results verify
that the proposed system adapts to different locations and orientations without the
need for precise positioning.

5.5 Conclusions

In this work, we presented a near field data retrieval system using capacitive coupling.
To alleviate the problem of chip misalignment, an alignment detection and pad
reconfiguration method was proposed. The data retrieval pad is divided into an array of
micropads, and each micropad can be assigned to send power or receive data
depending on the alignment information. The chip measurement results show
that a data rate higher than 900kbps can be achieved across 10 random positioning
tests. For small form factor sensor systems, this work provides the advantage
of little hardware overhead and a flexible operating frequency that is not limited by
the dimensions of passive components.

CHAPTER VI

NEAR FIELD INDUCTIVE COUPLING USING PLL PHASE-LOCKING AND PULSE SIGNALING

6.1 Introduction

Radio frequency identification (RFID) is widely used in areas including
personal identification, public transportation, and many more. For a near field
RFID transponder, the range of operation can vary from a few meters to less than
10cm depending on the operating frequency [113, 114]. Because of their low cost,
near field applications usually adopt passive RFID tags that do not rely on any
internal power supply. While harvesting power from the reader (interrogator), the
transponder transmits data back through backscattering [102, 115, 116]. The concept
of backscattering is shown on the left of Fig. 6.1. The data is modulated by
changing the load impedance Zm that is seen by the incoming AC signal of the
transponder. Changing Zm leads to phase modulation (PM) or amplitude modulation (AM)
depending on whether Zm is capacitive or resistive. The
modulated data is usually located at a subcarrier frequency that is tens of kHz
away from the carrier signal. One of the subcarrier frequencies can be downconverted
at the reader to decode the data. In this scheme, there are two main limitations for
the reader. First, the inductors should be designed with a high quality factor (Q) to
maximize the energy range. However, a higher Q damps the subcarrier frequency and
weakens the detectable amplitude. Second, the transmitter is continuously switching

Figure 6.1: Comparison of transponder data encoding with backscattering and pulse signaling.

in order to power the transponder, which causes a significant amount of noise from the
oscillator. Therefore, the subcarrier signal should be no more than 100dB lower than
the transmitter's carrier signal [87].

A unique near field application is the implanted medical device,
for example, an intraocular pressure sensor that helps glaucoma detection and
diagnosis. The required range of operation for such an application can be as short
as a few mm. At the same time, the form factor should be small enough to limit
intrusiveness to the body. In this work, we propose a time-multiplexed
inductive coupling scheme in an effort to alleviate the design limitations of the
traditional backscattering method. The concept is shown on the right of Fig. 6.1. Instead
of sending power continuously to the transponder, a small gap is created so that the
transponder can send the uplink signals through pulse signaling using the same
inductor. During the same period of time, the transponder is also synchronized to the
reader by an envelope detector. The oscillator of the reader can be turned off so that
the noise floor of the receiver can be greatly reduced. Pulse signaling is widely used in
ultra wide band (UWB) communications [66] and has recently been used in proximity
inductive coupling to achieve high data rates and low energy operation [67, 103, 117]. By
sending the pulses at the resonant frequency, the amplitude that reaches the receiver
input is maximized under a given power constraint. A key to the scheme is to design
the transponder and the reader with identical resonant frequencies and the maximal
available Q. In Sec. 6.2, the system architecture along with the design of the circuits will
be shown. Test chip and silicon measurement results will be presented in Sec. 6.3.
Conclusions will be drawn in Sec. 6.4.

6.2 System architecture

The proposed system is shown in Fig. 6.2. The reader sends a continuous power signal
that is modulated by the clock. At the transponder side, a power harvesting module

generates the DC supply voltage by rectifying the incoming AC signals. In order to

send pulses at the resonant frequency, a phase-locked-loop (PLL) is used to replicate

the frequency fVCO from the input frequency fin . The clock for synchronization is
demodulated from the notch of the continuous wave. A timing controller keeps a

small state machine to control the behavior of the transponder and ensures that the
timing is precisely followed.

Figure 6.2: System architecture for the proposed pulse signaling method.

Table 6.1: Summary of the integrated inductor.

Parameter            Value
Target distance      2mm
Metal width          12µm
Number of turns      13
Metal spacing        5µm
Shielding            M1 patterned ground
Hollowness           0.79
DC inductance        163nH
Natural frequency    294MHz
Q @ 200MHz           8.62

6.2.1 Integrated inductor

Instead of using external coils, an integrated inductor is advantageous in saving
device size. On the other hand, it suffers from a poor quality factor (Q) due to the lossy
substrate. In this work, we restrict the size of the inductor to 1mm by 1mm for both
the reader and the transponder. It is reasonable to assume that in our pulse signaling
scheme, the limiting factor for the range of operation is the power that can be
harvested by the transponder. To maximize the transponder supply power, the geometry
of the inductor should be optimized for the target distance of 2mm. The
power available to the transponder is a function of the self inductance, the coupling
coefficient, the resistive loading and the operating frequency at a desirable Q. However,
the resistive loading is not a linear function of input amplitude due to the nonlinear
transistors. Instead of trying to solve the problem analytically, an inductor simulation
tool called ASITIC is used to calculate the S-parameters and the coupling coefficient k
of the inductor. The geometries of the inductor are constrained by the process, and
metal fills are neglected during the simulation. The resulting S-parameters are
transformed into discrete R, L and C values that can be used in circuit simulators such
as HSPICE. The optimized parameters for the integrated inductor are shown in Table 6.1.
The operating frequency is set to 200MHz as a compromise between the quality factor
and the achievable operating frequency of the transponder.
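To give a feel for the numbers in Table 6.1, the sketch below estimates the capacitance that resonates with the inductor. It assumes a lumped parallel LC model, uses the DC inductance at both frequencies, and treats the natural frequency as the self-resonance set by parasitic capacitance. None of these capacitance values are stated in the original, and the function name is hypothetical.

```python
import math


def resonant_capacitance(inductance_h, frequency_hz):
    """Capacitance that resonates with `inductance_h` at `frequency_hz` (ideal LC tank)."""
    return 1.0 / ((2.0 * math.pi * frequency_hz) ** 2 * inductance_h)


L_DC = 163e-9                                        # DC inductance from Table 6.1

c_total = resonant_capacitance(L_DC, 200e6)          # about 3.9 pF for 200 MHz operation
c_parasitic = resonant_capacitance(L_DC, 294e6)      # about 1.8 pF implied by self-resonance
c_added = c_total - c_parasitic                      # extra tuning capacitance under this model
```

Under this simplified model, roughly 2pF of additional capacitance would move the resonance from the natural frequency down to the 200MHz operating point.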

6.2.2 Transponder circuits

In this section, the building blocks of the transponder will be presented. The goal of
the power harvesting module is to perform the following three tasks:

• AC to DC conversion.

• Signaling the controller when the rectified voltage is below a certain level.

• Voltage regulation.

The block diagram of the power harvesting module is shown in Fig. 6.3. AC to DC
conversion is accomplished with the same method as that used in Sec. 5.3. Instead of
10 stages of voltage doublers, 5 stages are used in this work: the minimum
required input amplitude is higher, so the power harvesting module would not
benefit from having more stages. The voltage limiter clamps the supply voltage at
2V, which is used to supply the voltage regulators and the output drivers. As
explained in Sec. 5.3, voltage Vn3 will surpass voltage Vn6 when the voltage limiter
approaches the designed voltage of 2V. When the supply voltage VDD5 drops below
2V, Vn3 decreases while Vn6 should remain unchanged until VDD5 is lower than one
diode drop. By implementing a Schmitt trigger with Vn6 as the supply voltage and Vn3 as
the input voltage, the state of pump_enable is flipped when Vn3 becomes lower
than 1.2V. The active-high pump_enable signal provides important information,
since it is asserted when the power harvesting module is unable to sustain the power
of the PLL. The controller can use this information to determine when the PLL
should be disabled to stop it from draining more power. Two voltage regulators are
used in this work. One supplies the voltage controlled oscillator (VCO) of the
PLL and the other supplies the rest of the chip. With a dedicated power supply
for the VCO, the supply noise from the digital controller can be largely reduced.

The regulator output voltage is a critical parameter to the transponder. Ideally,

we would like to lower the supply voltage as low as possible to take advantage of the

Figure 6.3: Power harvesting module with the schematic of the voltage limiter (LC: level converter).

quadratic saving of the dynamic power. On the other hand, it still needs to meet

the requirement of the timing critical elements, in this case, the input buffer and the
phase frequency detector of the PLL. Based on our simulation results, the minimum
operating voltage for the PLL with 200MHz input is 670mV. Considering the margins

for process variations, the output voltage is designed at 770mV. The voltage regulator
circuit is shown in Fig. 6.4. It includes a voltage reference stage, a start-up stage and

an output stage. The reference voltage Vref can be expressed as

\[ V_{ref} = n V_T \ln\left(\frac{W_3 W_4 L_2 L_5}{W_2 W_5 L_3 L_4}\right) + V_{n0} \qquad (6.1) \]

where VT is the thermal voltage, equal to kT/q. If biased at least three thermal
voltages above ground potential, the Vds dependency of transistor M3 can be neglected.

Figure 6.4: Schematic for the voltage regulator.

Vn0 becomes

\[ V_{n0} = V_{th} + V_T \ln\left[\frac{W_4 L_5}{L_4 W_5}\left(\frac{I_{R1}}{\mu_{eff} C_{ox} \frac{W_3}{L_3} V_T^2}\right)\right] \qquad (6.2) \]

where IR1 is determined by the first term on the right-hand side of Eq. 6.1 and the
resistance of R1. This shows that Vref is independent of the supply voltage.
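A short numerical sketch of Eq. 6.1 shows how the sizing ratio sets the first term. The subthreshold slope factor, the temperature, the transistor dimensions and the value of Vn0 below are all hypothetical and are not the values used on the test chip.

```python
import math

BOLTZMANN = 1.380649e-23            # J/K
ELECTRON_CHARGE = 1.602176634e-19   # C


def thermal_voltage(temp_kelvin):
    """Thermal voltage kT/q in volts."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def reference_voltage(n, temp_kelvin, sizes, v_n0):
    """Evaluate Eq. 6.1; `sizes` maps 'W2', 'L2', ..., 'W5', 'L5' to dimensions in one common unit."""
    ratio = (sizes["W3"] * sizes["W4"] * sizes["L2"] * sizes["L5"]) / (
        sizes["W2"] * sizes["W5"] * sizes["L3"] * sizes["L4"]
    )
    return n * thermal_voltage(temp_kelvin) * math.log(ratio) + v_n0


# Hypothetical sizing that gives a ratio of 8 inside the logarithm.
sizes = {"W2": 1.0, "L2": 1.0, "W3": 4.0, "L3": 1.0,
         "W4": 2.0, "L4": 1.0, "W5": 1.0, "L5": 1.0}
v_ref = reference_voltage(n=1.5, temp_kelvin=300.0, sizes=sizes, v_n0=0.30)
```

The supply voltage does not appear as an argument anywhere in this evaluation, which mirrors the supply independence noted in the text.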

The output stage is simply a unity gain buffer. The sizes of transistor pairs (M12,
M13) and (M15, M16) are made unbalanced so that the output voltage level is
shifted to a higher voltage than Vref. In addition, unbalanced
sizing also provides temperature compensation [92]. The bias current of the output
stage is also controlled by Vref. Transistors M18 and M19 provide sufficient
current loading to stabilize the output. The start-up circuit assists the
transient response of the voltage reference stage before it reaches the steady state.
The current of transistors M4 and M5 needs to be large enough for the self-biasing
mechanism to take over. The voltage reference circuit has two operating points:
the desired state with Iref flowing and the undesired state where the current is near
0. As the supply voltage ramps up, voltage Vn1 will be pulled up to near Vdd while
Vn0 will remain close to 0. Vn0 will eventually reach the desired state because of
leakage. However, this can be a slow process that largely depends on the supply
voltage. The startup stage speeds up the process. When the current of transistor
M7 is low, transistor M6 is turned on to connect nodes Vn1 and Vref. The charge is
redistributed between Vn1 and Vref so that Vref rises faster toward the desired voltage.
Simulation results show that Vref takes 1µs to reach the steady state, which is at least 20
times faster than the circuit without the startup stage.

The block diagram of the PLL is shown in Fig. 6.5. A type II PLL is used to
lock over a wider range of incoming frequencies [118]. To save power, each
block of the PLL can be individually turned off. The PLL operates in either a locking
mode or a signal pulsing mode, depending on whether the voltage controlled oscillator
(VCO) is the only active circuit. During the locking mode, the PLL operates in a
negative feedback loop and every block is activated. In the pulsing mode,
the blocks associated with VDDR1 are turned off so that the VCO runs directly
off the voltage stored in the low-pass filter (LPF). A second order loop filter
is implemented to reduce noise injection at every clock cycle while still maintaining a
stable loop [119]. As a rule of thumb, the loop bandwidth has to be at least 5
times smaller than the reference frequency to avoid instability [120]. With a reference
frequency of 200MHz in this work, a loop bandwidth of 20MHz is chosen to provide
a reasonable tradeoff between stability and transient response. With the polysilicon
resistor and the MOS capacitors, the total area of the integrated loop filter is 180µm²
in this 0.13µm technology.
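The rule of thumb can be checked directly against the numbers above; the margin factor is simply computed from the stated values.

```latex
\[
f_{BW,max} = \frac{f_{ref}}{5} = \frac{200\ \text{MHz}}{5} = 40\ \text{MHz},
\qquad
\frac{f_{ref}}{f_{BW}} = \frac{200\ \text{MHz}}{20\ \text{MHz}} = 10
\]
```

The chosen 20MHz bandwidth therefore sits a factor of two below the rule-of-thumb limit.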

In order to operate at 200MHz with 770mV supply voltage, operating speed is the

major concern for the phase frequency detector. We use the true single phase clock
(TSPC) D flip-flops to replace the static flip-flops in the phase frequency detector
[121]. The circuit diagram of the TSPC D flip-flop is shown in Fig. 6.6(a). When

Figure 6.5: Block diagram of the PLL.

reset is high, the output will be precharged high as well. After reset goes low,
the output node becomes floating and is only sensitive to the positive edge of the input
signal. From the energy point of view, the TSPC D flip-flop is more efficient than
static logic due to lower parasitic capacitance. The phase frequency detector is
shown in Fig. 6.6(b). Two TSPC D flip-flops are used to detect the rising edges of
fref and fVCO, respectively. The reset signal is asserted whenever both of the
flip-flop outputs are high. The output signals upn and dn are balanced in terms of
propagation delay to reduce the noise injected into the charge pump. The
phase frequency detector can be disabled by setting pll_en_bar high.
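The detector's behavior can be sketched with an event-driven model. This assumes an ideal reset with zero delay and abstracts away the active-low polarity of upn; the function name and the edge-time lists are hypothetical.

```python
def pfd_net_up_time(ref_edges, vco_edges):
    """Total time UP is asserted minus total time DN is asserted.

    Each rising edge sets its flip-flop; when both outputs are high,
    both are reset immediately (ideal reset, assumed zero delay).
    """
    events = sorted([(t, "ref") for t in ref_edges] + [(t, "vco") for t in vco_edges])
    up = dn = False
    up_start = dn_start = 0.0
    net = 0.0
    for time, source in events:
        if source == "ref" and not up:
            up, up_start = True, time
        elif source == "vco" and not dn:
            dn, dn_start = True, time
        if up and dn:
            # Both outputs high: account for the pulse widths and reset.
            net += (time - up_start) - (time - dn_start)
            up = dn = False
    return net


# Edge times in ns: the reference leads the VCO by 1 ns in every cycle.
leading = pfd_net_up_time([0, 5, 10], [1, 6, 11])
lagging = pfd_net_up_time([1, 6, 11], [0, 5, 10])
```

A positive result means the charge pump would, on balance, be asked to source current, which raises the control voltage and speeds up the VCO; a negative result does the opposite.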

Typically, the charge pump and the VCO are the most sensitive components in
a PLL regarding phase noise. In this work, however, the PLL does not work in frequency
locking mode during transmission. Thus noise sources such as charge injection
and charge sharing are irrelevant to the pulse signaling events. In addition, the timing
jitter from the VCO has little impact on the resonant signal since the quality factor of the
integrated inductors is not high. We can therefore trade noise performance for lower circuit
complexity and power consumption in the sensitive components. The circuit
diagram of the charge pump is shown in Fig. 6.7(a). The bias current is generated by
transistors M1 and M3. The charge pump pulls current from the supply to the output

Figure 6.6: TSPC phase frequency detector. (a) TSPC D flip-flop, (b) circuit diagram of the phase frequency detector.

when the upn signal is low, and vice versa. In pulse signaling mode, the charge pump
can be turned off by switching off transistor M2. The output then becomes a floating
node but can still drift over time due to the subthreshold leakage of transistors
M4 to M7. To minimize the impact of leakage, careful sizing is needed to balance
the pull-up and pull-down networks when they are off. Fig. 6.7(b) shows the
current-starved VCO. The VCO is composed of 5 inverter stages that are current-starved,
with the control voltage coming from the loop filter output. Since it is current-starved
only through NMOS transistors, the duty cycle of the oscillator is not 50%. To
compensate for the uneven rising and falling transitions, another current-starved
inverter is used to generate an output that is close to 50% duty cycle. A separate
control signal vco_en_bar is used to stop the VCO from running when it is not needed.

For the timing control, the reference clock is obtained from the gap between
the incoming AC signals. The demodulated output demod_out from the envelope
detector starts to drop in amplitude once the incoming signal stops switching,
as shown in the waveform of Fig. 6.8. The figure shows the related control signals
when the transponder is in the pulse signaling mode. By detecting the transitions

Figure 6.7: Schematics of (a) the charge pump, (b) the VCO.

from demod_out, the active-low signal clk_bar is generated to indicate the period
during which the transponder takes over control of the communication channel.
vco_en_bar becomes 0 right after clk_bar goes low to enable the VCO in a free-running
mode. The VCO is in fact activated a few hundred nanoseconds before the
pulse signaling happens, because the voltage regulator requires some response time
when switching from a very light load to the load of a free-running VCO. The
delay between the time when clk_bar goes low and the actual pulse signaling events
is controlled by counting cycles of the VCO output. A series of pulses is then
fired by the drivers at the output frequency of the VCO. At the same time, the
signal demod_out may rise since the same communication channel is being excited
again. To prevent the controller from misinterpreting this as the end of the pulse
signaling mode, the controller needs to mask out demod_out and prevent the clk_bar
signal from rising. The VCO can be put to sleep right after the pulse signaling mode is
completed to save power.

The waveform when the system is in the PLL locking mode is shown in Fig. 6.9. At
the beginning of this mode, clk_bar goes high and activates both the VCO and the rest
of the PLL. After a number of cycles, the VCO output reaches the same frequency
as the incoming signal. Note that the power consumption is at its peak

Figure 6.8: Timing waveform of the pulse signaling mode. (Signals shown while sending a 1: Vin, demod_out, clk_bar, vco_en_bar, vpulse, pump_enable, pll_en. Annotations: glitches at the demodulator output should be masked out by the controller; the VCO is disabled after pulse signaling is completed.)

Figure 6.9: Timing waveform of the PLL locking mode. (Signals shown: Vin, demod_out, clk_bar, vco_en_bar, vpulse, pump_enable, pll_en. Annotation: PLL locking is stopped in either condition: 1. 128 cycles have passed, 2. VDD5 dropped below 1.2V.)
in this mode, where the PLL is the dominant consumer. The energy range of the reader
depends on the power consumption of the transponder. When the two chips are far
enough apart, the harvested power is not enough to supply the PLL. To extend the
energy range, the harvested power should be allowed to be lower than what the PLL
consumes. As mentioned before, the pump_enable signal prevents the rectified voltage
from falling below a certain level. The PLL operation is stopped at the request of
pump_enable. At close distances, however, the PLL may keep running until the
transponder enters the pulse signaling mode again. When that happens, the reference
clock of the PLL suddenly disappears while the PLL is still trying to track a frequency
that does not exist. To avoid this false locking attempt, a counter in the timing
controller counts the cycles since the PLL entered the locking mode and stops it after
128 cycles if pump_enable has not been asserted. In general, the PLL locking mode
accounts for only a fraction of the time during which the transponder is remotely
powered. The remaining cycles are used to replenish the other components that need to
be charged for pulse signaling. Part of the harvested energy goes to the supply
capacitors, which are discussed in the next paragraph.
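The two stop conditions for the locking mode can be summarized in a few lines. This Python sketch is only an illustration under assumptions: pump_enable is modeled as one boolean sample per reference cycle, and the function name `lock_mode_duration` is hypothetical. Only the 128-cycle limit comes from the text.

```python
MAX_LOCK_CYCLES = 128  # cycle limit stated in the text


def lock_mode_duration(pump_enable_samples, max_cycles=MAX_LOCK_CYCLES):
    """Return the number of reference cycles spent in PLL locking mode.

    pump_enable_samples[n] is True if pump_enable is asserted during
    cycle n (hypothetical per-cycle sampling of the signal). Locking
    stops at the first asserted sample or when the counter reaches
    max_cycles, whichever comes first.
    """
    for n, pump_enable in enumerate(pump_enable_samples):
        if pump_enable or n >= max_cycles:
            return n
    return min(len(pump_enable_samples), max_cycles)


# Far from the reader: the supply sags early and pump_enable stops the PLL.
far = [False] * 40 + [True] * 10
# Close to the reader: pump_enable never fires, so the counter stops the PLL.
near = [False] * 500
print(lock_mode_duration(far), lock_mode_duration(near))
```

The counter only matters in the close-range case, where the supply never sags enough to assert pump_enable.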

During the pulse signaling mode, the open-loop VCO output pll_clk is used to
produce pulses at the resonant frequency. The scheme is shown in Fig. 6.10. To
provide enough current for exciting the reader inductor, charge is stored on
the supply capacitors during power harvesting. Supply voltages vddp1 to vddp4 are
replenished by the rectifier output VDD5 when pulse_en is low. When a 1 is sent,
a series of pulses is generated. The output drivers successively inject the charge
previously stored on vddp1 to vddp4 into the inductor and sink the
current out of it at the other end. lcn is the input of the pull-down transistor Mn,
while lc1 to lc4 are the inputs of the pull-up transistors. Each capacitor for supply
voltages vddp1 to vddp4 is 10pF, which corresponds to 1890µm² of silicon area.

Figure 6.10: The driver circuits for the transponder. (Labels shown: pulse_en, pll_clk, pulse generation logic, LC blocks, lcn, lc1-lc4, pull-down transistor Mn, pull-up transistors Mp1-Mp4, supplies vddp1-vddp4, VDD5.)

6.2.3 Reader circuits

Compared to the transponder, the design of the reader is much simpler since no
other signals interfere with the readout data. The output driver is similar to
the transponder driver except that it can be driven strongly by the power supply.
The carrier clock is modulated with ext_clk, which defines the data rate. The circuit
that generates the control signals for the output driver is shown in Fig. 6.11. Two
pulse generators are composed of delay chains, which define two critical periods for the
system.

T1 is the time during which the reader stops sending power and instead waits for the
readout signals.

T1-T2 is the time during which the reader is allowed to amplify the incoming signals.

Both T1 and T2 are referenced to the negative edge of ext_clk. In this work, T1
is nominally designed to be 500ns, with adjustable delays to accommodate process
variations. To ensure that the receiver is not unintentionally triggered by the
decaying resonant clock of the reader itself, T2 is set to 200ns. Signal clk is
the carrier of the system at 200MHz. The signals produced by the pulse
generators are pulse_mod and pulse_pre, respectively. The drivers are controlled by
pulse_mod and clk, and charge is replenished into the inductor from opposite
directions every half cycle. In order to conveniently adjust the magnetic field, level
converters are implemented to drive the driver with a larger swing. Assuming that
the series resistance of the inductor dominates the current, the AC current is
proportional to the raised switching amplitude. Therefore, the driver transistors
should be implemented with thick-oxide devices in order to sustain the
higher-than-nominal voltages.

The carrier clock is generated on die as well. The clock generator supports
two modes of operation. The first mode generates the clock output at the
resonant frequency, which can be sent directly to the driver. The second mode
produces a clock at twice the resonant frequency, followed by a
divide-by-two circuit before feeding the output drivers. The purpose of the second
mode is to produce a near-50% duty cycle signal from the clock generator. Although
the second mode consumes more power, the sinusoidal signal is always replenished
at the right time, and the efficiency of the driver improves compared to the first
mode.
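Why a divide-by-two stage yields a near-50% duty cycle can be seen from a small simulation. The Python sketch below is a generic illustration, not the on-die clock generator: the sampled square wave, its 30% duty cycle, and the function names are all assumptions chosen for the example.

```python
def square_wave(period, high, n_periods):
    """Sampled clock: 'high' samples at 1, then (period - high) samples at 0."""
    return [1 if (t % period) < high else 0 for t in range(period * n_periods)]


def divide_by_two(wave):
    """Toggle flip-flop: the output flips on every rising edge of the input."""
    out, q, prev = [], 0, 0
    for x in wave:
        if x == 1 and prev == 0:
            q ^= 1
        out.append(q)
        prev = x
    return out


def duty_cycle(wave):
    """Fraction of samples at logic 1."""
    return sum(wave) / len(wave)


# A clock at twice the target frequency with a poor (30%) duty cycle.
fast = square_wave(period=10, high=3, n_periods=8)
slow = divide_by_two(fast)
print(duty_cycle(fast), duty_cycle(slow))
```

Because the divider reacts only to rising edges, the high and low times of its output both equal one full period of the input, regardless of the input duty cycle.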
Since data receiving is time-multiplexed with the power transmitting mode, the
receiver does not have to deal with strong interference. Fig. 6.12 shows the circuit
diagram of the data decoder and the timing diagram when a single bit of 1
is sent. First, a single-stage amplifier with reasonable gain is sufficient to amplify the
resonant signals to full rail. The input signal in is AC coupled and properly
DC biased before being amplified to rec_in. The data can then be easily decoded by
digital logic gates. Signal pulse_pre is used here to reset the state of rec_data and
ensure that the transition of rec_data is unidirectional. Assuming that the amplitude

Figure 6.11: Pulse generator and output driver of the reader. (Signals shown: ext_clk, clk, clkn, pulse_mod with width T1, pulse_pre with width T2.)

Figure 6.12: Data receiving scheme for the reader. (Signals shown: in, rec_in, rec_data, pulse_pre, ext_clk, out_data.)
of rec_in is large enough to trigger the D flip-flop and produce a rising edge on rec_data,
the output data_out is latched at the positive edge of ext_clk. When a
0 is sent, data_out remains the same while rec_data is precharged to 0.
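The decoding sequence can be sketched behaviorally as follows. This is a simplified Python model under assumptions, not the gate-level circuit of Fig. 6.12: each bit period is represented by a list of 0/1 samples of rec_in taken inside the listening window, and the function name `decode_bits` is hypothetical.

```python
def decode_bits(rec_in_windows):
    """Decode one bit per listening window.

    rec_in_windows: list of lists with 0/1 samples of the amplified
    signal rec_in, one inner list per bit period (hypothetical format).
    """
    decoded = []
    for samples in rec_in_windows:
        rec_data = 0      # pulse_pre resets rec_data before each window
        prev = 0
        for s in samples:
            if s == 1 and prev == 0:
                rec_data = 1   # a rising edge of rec_in sets rec_data;
                               # it cannot fall again until the next reset
            prev = s
        decoded.append(rec_data)   # latched at the positive edge of ext_clk
    return decoded


# First window: resonant pulses arrive (a 1). Second: silence (a 0).
print(decode_bits([[0, 1, 0, 1, 0], [0, 0, 0, 0]]))
```

Since rec_data can only move from 0 to 1 within a window, a single full-rail swing of rec_in is enough to register a 1.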

6.3 Measurement results

The test chip for inductive coupling was fabricated in a 0.13µm CMOS technology.

Fig. 6.13 shows the die photo of both the reader and the transponder on the same
die. The integrated inductors for both circuits are designed with the same dimensions,
as shown in Table 6.1. The active area of the transponder measures 0.084mm², while
the active area of the reader is 0.04mm².

Fig. 6.14 shows the setup for the two-chip test. The reader is packaged and

Figure 6.13: Die photo for the reader and transponder of the system.

Figure 6.14: (a) Test setup with the micromanipulator and the PC board, (b) close-up photo.

mounted on a PC board with interfaces to the oscilloscope and a laptop for control
signals. The transponder chip is attached to a micromanipulator that allows precise
3D positioning down to 10µm in the vertical direction. For the x-axis and the
y-axis, the resolution is 0.1mm. In order to maximize the energy range, the resonant
frequency should be measured first. As the switching frequency gets closer
to the resonant frequency, the loss of charge due to the resistive components
decreases. Therefore, the resonant frequency can be found at the local minimum of
power consumption by sweeping the switching frequency.
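The sweep procedure amounts to a minimum search over measured power. The Python sketch below assumes the sweep range brackets a single minimum; the function name `find_resonance` is hypothetical, and the power values are synthetic placeholders, not measurements from this work.

```python
def find_resonance(freqs_mhz, power_mw):
    """Return the swept frequency at which the measured power is lowest."""
    if not freqs_mhz or len(freqs_mhz) != len(power_mw):
        raise ValueError("need equally long, non-empty sweep lists")
    best = min(range(len(power_mw)), key=lambda i: power_mw[i])
    return freqs_mhz[best]


# Synthetic sweep: a parabola with its minimum at 200MHz stands in for
# the measured power curve (illustrative numbers only).
freqs = list(range(180, 221, 2))
power = [10.0 + 0.02 * (f - 200) ** 2 for f in freqs]
print(find_resonance(freqs, power))
```

In practice the frequency step of the sweep sets the resolution with which the resonant frequency is located.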

The received waveform is shown in Fig. 6.15. The data stream is generated by a
4-bit LFSR that produces pseudo-random numbers and repeats every 15 cycles. The
worst case happens when a 1 is sent following another 1 from the previous cycle,
since it gives the transponder the shortest time to replenish the supply capacitors.
The measurement shows that the data can be correctly decoded.
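A 4-bit LFSR with a period of 15 can be sketched as below. The feedback polynomial used in the test setup is not stated in the text; this Python example assumes the primitive polynomial x^4 + x^3 + 1 in a Fibonacci configuration, and the function names are hypothetical.

```python
from itertools import islice


def lfsr4_states(seed=0b0001):
    """Yield successive states of a 4-bit Fibonacci LFSR.

    Assumed feedback polynomial: x^4 + x^3 + 1, i.e. the new bit is the
    XOR of the two most significant state bits. The seed must be nonzero.
    """
    if not 0 < seed < 16:
        raise ValueError("seed must be a nonzero 4-bit value")
    state = seed
    while True:
        yield state
        feedback = ((state >> 3) ^ (state >> 2)) & 1
        state = ((state << 1) | feedback) & 0xF


def lfsr4_period(seed=0b0001):
    """Count the steps until the LFSR returns to its seed."""
    gen = lfsr4_states(seed)
    first = next(gen)
    for n, state in enumerate(gen, start=1):
        if state == first:
            return n


# One full period of output bits (most significant bit of each state).
bits = [(s >> 3) & 1 for s in islice(lfsr4_states(), 15)]
print(lfsr4_period(), bits)
```

A maximal-length sequence visits every nonzero state once, so it contains runs of consecutive 1s, which exercise the worst case described above.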

The communication distance (dmax) is shown on the left of Fig. 6.16. The switching
amplitude (Vsw) of the reader can be higher than the nominal Vdd to increase the

Figure 6.15: Measured waveform from the oscilloscope showing the output data and the clock signal. (Axes: data_out (V) and ext_clk (V) versus time (µs).)

AC current. At a given data rate fdata, dmax increases monotonically with Vsw.
A distance of 1.1mm is achievable with Vsw = 3V and fdata = 50kHz, at a power
consumption of 16mW. At a reduced distance of 0.9mm, fdata = 400kHz can
be achieved. At higher data rates, dmax starts decreasing because the harvested
energy is also reduced. On the other hand, reducing the data rate is not always
advantageous, because the frequency of pulse signaling relies on the ability
of the loop filter in the PLL to hold the bias voltage. The filter, however, suffers
from 20pA of leakage even after the charge pump is turned off. As the time between
refreshes increases, the frequency deviates further from the resonant frequency,
which results in signals that are harder to detect at the reader. As a result, the
absolute minimum data rate is 6kHz regardless of the switching amplitude.
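The effect of leakage on the held bias voltage can be estimated with the relation ΔV = I·t/C. The Python sketch below is a rough illustration only: the 20pA leakage is from the text, but the 1pF hold capacitance is a hypothetical value (the loop filter capacitance is not given here), and the time between refreshes is assumed to equal one data period.

```python
def hold_droop_volts(i_leak_a, hold_time_s, c_hold_f):
    """Voltage drift of a floating capacitor: dV = I * t / C."""
    return i_leak_a * hold_time_s / c_hold_f


I_LEAK = 20e-12   # 20pA leakage, from the text
C_HOLD = 1e-12    # hypothetical 1pF hold capacitance, NOT from the source

for f_data in (6e3, 50e3, 400e3):
    # Assumption: the filter is refreshed once per data period.
    dv = hold_droop_volts(I_LEAK, 1.0 / f_data, C_HOLD)
    print(f"{f_data / 1e3:.0f} kHz: {dv * 1e3:.3f} mV")
```

The drift grows in inverse proportion to the data rate, which is consistent with the existence of a lower bound on the usable data rate; the resulting frequency error would additionally depend on the VCO gain, which is not modeled here.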

Fig. 6.17 shows the plot where the data can be successfully communicated with a

Figure 6.16: Measured communication distance with respect to the data rate and the switching amplitude. (Axes: data rate (kHz) and switching amplitude (V); plotted quantity: distance (mm).)

combination of the horizontal misalignment and the vertical distance. At Vsw = 3V,
every 0.1mm of misalignment translates to a loss of roughly 0.1mm of communication

distance. At Vsw = 2.3V, the impact of misalignment is only half of that.
The power consumption at Vsw = 2.3V is about 8mW.

6.4 Conclusion

In this work, we present a pulse signaling based method for data readout from
inductively coupled coils at short range. The use of time-multiplexed pulse signaling
allows the quality factor to be optimized for both the reader and the transponder, as
long as the resonant frequency is the same. It also relaxes the constraint on the
receiver's sensitivity by eliminating the dominant noise source during data receiving.
A PLL is

Figure 6.17: Measured achievable communication distance with misalignment in the x-axis or the y-axis. (Axes: alignment offset (mm) versus communication distance (mm), for Vsw = 3.0V and Vsw = 2.3V.)

implemented on the transponder to replicate the resonant frequency while harvesting
the power. The replicated frequency is stored on the loop filter in the form of a
voltage, which is later used to drive the VCO and generate pulses that effectively
excite the reader's inductor. The test chip was fabricated in a 0.13µm technology
with 1mm×1mm integrated inductors on both the reader and the transponder.
The measurement results demonstrate successful reading at a distance of 1.1mm with
16mW of power consumption.

CHAPTER VII

CONTRIBUTIONS AND FUTURE WORKS

7.1 Contributions

In this dissertation, several building blocks for a miniature sensor system were
discussed. To achieve a small form factor, low energy operation is the
key to such a system. Unlike microcontrollers or storage units such as SRAM,
the peripheral circuits have not received much emphasis in terms of low power operation.
The timer, for example, is the only active component while the system is in the sleep
mode and often dominates the total power if it is not properly designed. Passive
communication is another important feature for the miniature system. At the mm³
scale, a system that is able to sustain the instantaneous power required by an RF
module has yet to be reported. Passive RFID technology provides a solution to access
data remotely without actively powering the transponder. In order to work at close
distance and with a limited form factor, new techniques are proposed in this work.

The contributions of the dissertation are summarized as follows:

Two ultra low power timers that oscillate at sub-Hz to 10Hz are proposed. To
effectively reduce the power consumption, the transistors are aggressively biased
in the subthreshold region. The drain current of a MOS transistor has an exponential
dependency on temperature when operating in this region. The first design
uses the gate leakage of a MOS transistor as the charging/discharging source to
reduce the temperature sensitivity and also to provide a low output frequency.
The chip is measured with less than 0.1Hz of nominal frequency and sub-pW
power consumption at 300mV supply voltage. By raising the voltage from
300mV to 600mV, the variation due to temperature reduces from 0.6%/°C to
0.16%/°C while the power consumption also increases to 2nW.

Another approach for the low power timer is a program-and-hold
technique. A current source is generated by referring to a resistor that has a low
temperature sensitivity. To further reduce the power consumption in the active
mode, the bias stage needs to be turned off. A hold stage is implemented to store
the bias voltage so that the oscillation period remains temperature insensitive
after the bias stage is turned off. Although the footprint of the program-and-hold
timer is 40X larger than the gate leakage based design, it still accounts
for less than 2% of a 1mm² chip. The average power consumption is 150pW,
with 5% cycle time error from 0°C to 90°C when the timer is refreshed every 2
minutes.

A low power temperature sensor is proposed for remotely powered systems. The
energy range of such a system is highly correlated to the power consumption,
which is typically dominated by the temperature sensor. The proposed temperature
sensor generates a temperature insensitive current source and a PTAT
current source, which both operate in the subthreshold region. The current
sources are translated into oscillating frequencies and can be used to generate
a digitized output. In this work, the size of the temperature sensor is inversely
proportional to the power consumption. With a footprint of 0.05mm², the total
power consumption is 220nW. The temperature inaccuracy is -1.6°C to +3°C
over the temperature range from 0°C to 100°C.

For communication between different power domains and testing of the
subthreshold circuits, level shifters are widely used. A single-stage static
subthreshold to I/O voltage level shifter is implemented with the advantage of robustness
across temperatures. The idea is to use cascoded diode-connected transistors
as the pull-up network so that the pull-up and pull-down networks have
comparable driving strength. The experimental results show better performance
and power consumption compared to a widely used DCVS level shifter when
converting from 0.3V to 2.5V. The FO4 delay of the proposed design remains
unchanged at temperatures lower than 0°C, while that of the DCVS counterpart
degrades exponentially below room temperature.

A passive, capacitive coupling based technique for proximity sensor data retrieval
is presented. An alignment independent technique is proposed to alleviate the
requirement for precise positioning in capacitive coupling systems. By dividing
the data retrieval chip into smaller microplates, each microplate can detect the
alignment information and reconfigure its function during communication. The
test chip demonstrates that the achievable data rate varies by less than 15% over 10
experiments in which the sensor chip is randomly dropped on the data retrieval
chip. In this work, data, clock and power can all be capacitively transmitted at
the same time through different channels.

For communication in the mm range, an inductive coupling technique is proposed to
make the data read out from the transponder more robust than with
a traditional backscattering scheme. The readout data is signaled as a series
of pulses at the resonant frequency to maximize the received amplitude for a
fixed energy. Another advantage is that by time-multiplexing the power and
data signals, the noise floor of the receiver is greatly reduced. The test chips for
both the transponder and the reader are implemented with integrated inductors
of 1mm×1mm. The achievable communication distance is 1.1mm, which can be
improved with a larger reader inductor in the future.

7.2 Future works

In the future, we can work toward two directions for the sensor system.

Wireless sensor network. The low power circuits shown in this dissertation were
motivated by a single sensor system. It is also possible to apply the circuit
techniques to a wireless sensor network with the addition of RF modules. The
RF module, however, has unacceptable active power consumption for a system
that relies on a mm³-scale battery. As a result, strong power gating must
be applied to the RF circuits as well while data is not being transmitted. In order to
exchange data during a short wakeup period, a synchronization protocol
should be established between the transmitter and the receiver. One solution
is to use the wakeup receiver that was discussed in Chap. I. The idle power
of the wakeup receiver should be further reduced so that it does not dominate
the system power during the sleep mode. Another solution is to develop a low
power timer that can be used at both ends of the link so that the chips
remain synchronized after an extended period of time. This requires
precise control of the output period of the timer as well as of the jitter caused
by noise. Supply sensitivity should also be minimized since the operating
conditions can differ between the two chips.

High level integration.

Supply voltage management is a big challenge for a system with small
current loads. The output level from a microfabricated battery is
consistent but typically too high for subthreshold operation, according to the
discussion in Sec. 1.1.6. A voltage regulator with higher than 90% efficiency
can be implemented with switched capacitors or buck converters under a
reasonable load. However, the efficiency drops significantly when the load
is reduced in the sleep mode. In order to maintain reasonable efficiency, the
devices that are responsible for static power consumption should be minimized.
Lowering the clock frequency intelligently based on the load is the key to
cutting down the power consumption. For example, the voltage regulator only
needs to supply 100pW for the timer in the sleep mode, but the power
demand increases to 1µW for the whole system during the active mode.

Another aspect of voltage management is the use of hybrid power
sources, such as the combination of a battery and energy scavenging. Coming
up with a scheme that utilizes the advantages of both power sources is not
trivial. Lifetime is not limited with energy scavenging; however, switching
to the battery when the harvested energy is insufficient is challenging. A
supply power monitor is required to implement such a hybrid
power system.

3D stacking is an attractive option for the sensor system. The first reason
is that by stacking the chips, higher densities can be achieved. Another reason
is that heterogeneous technologies can be used to fabricate components
such as the FLASH memory. This allows us to explore new architectures for
the system, as individual components can be optimized. For example, the
timers proposed in this work rely on the magnitude of gate leakage being
in a certain range, so technology scaling can be detrimental. On the other
hand, the microcontroller typically favors smaller dimensions so that the
parasitic capacitances can be reduced.

BIBLIOGRAPHY

[1] G. Schimetta, F. Dollinger, G. Scholl, and R. Weigel, "Optimized design and fabrication of a wireless pressure and temperature sensor unit based on SAW transponder technology," in Proc. MTT-S, vol. 1, 20-25 May 2001, pp. 355-358.
[2] S. Lizon Martinez, R. Giannetti, and J. L. Rodriguez Marrero, "Design of a system for continuous intraocular pressure monitoring," in Proc. Instrumentation Measurement Technology Conf., vol. 3, 18-20 May 2004, pp. 1693-1696.
[3] F. Udrea and J. Gardner, "SOI CMOS gas sensors," in Proc. Sensors, vol. 2, 2002, pp. 1379-1384.
[4] Y. Leng, G. Zhao, Q. Li, C. Sun, and S. Liu, "A High Accuracy Signal Conditioning Method and Sensor Calibration System for Wireless Sensor in Automotive Tire Pressure Monitoring System," in Proc. WiCOM, Sept. 2007, pp. 1833-1837.
[5] K. Wise, "Microelectromechanical systems: interfacing electronics to a non-electronic world," in Proc. IEDM, Dec. 1996, pp. 11-18.
[6] K. Ueno, T. Hirose, T. Asai, and Y. Amemiya, "A CMOS Watchdog Sensor for Certifying the Quality of Various Perishables with a Wider Activation Energy," Trans. IEICE, vol. E89-A, pp. 902-907, Apr. 2006.
[7] E. Saneyoshi, K. Nose, M. Kajita, and M. Mizuno, "A 1.1V 35µm × 35µm thermal sensor with supply voltage sensitivity of 2°C/10%-supply for thermal management on the SX-9 supercomputer," in Symp. VLSI Circuits Dig. Tech. Papers, June 2008, pp. 152-153.
[8] D. Culler, D. Estrin, and M. Srivastava, "Overview of Sensor Networks," Computer, vol. 37, no. 8, pp. 41-49, Aug. 2004.
[9] N. Pletcher, S. Gambini, and J. Rabaey, "A 65µW, 1.9 GHz RF to Digital Baseband Wakeup Receiver for Wireless Sensor Nodes," in Proc. IEEE Custom Integrated Circuits Conf. (CICC), Sept. 2007, pp. 539-542.
[10] S. Selvarasah, C.-L. Chen, S.-H. Chao, P. Makaram, A. Busnaina, and M. Dokmeci, "A Three Dimensional Thermal Sensor Based on Single-Walled Carbon Nanotubes," in Transducers'07, June 2007, pp. 1023-1026.
[11] A. Malik, M. Aceves, and S. Alcantara, "Novel FTO/SRO/silicon optical sensors: characterization and applications," in Proc. Sensors, vol. 1, 2002, pp. 116-120.
[12] C. Dolabdjian, A. Qasimi, and C. Cordier, "Applied magnetic sensing: a long way," in Proc. Sensors, vol. 1, Oct. 2003, pp. 477-482.
[13] D. Wilson, S. Hoyt, J. Janata, K. Booksh, and L. Obando, "Chemical sensors for portable, handheld field instruments," IEEE Sensors J., vol. 1, no. 4, pp. 256-274, Dec. 2001.
[14] M. Esashi, S. Sugiyama, K. Ikeda, Y. Wang, and H. Miyashita, "Vacuum-sealed silicon micromachined pressure sensors," Proceedings of the IEEE, vol. 86, no. 8, pp. 1627-1639, Aug. 1998.
[15] Y. Taur and T. H. Ning, Fundamentals of Modern VLSI Devices. Cambridge Univ Pr, 1998.
[16] S. Hanson, B. Zhai, D. Blaauw, and D. Sylvester, "Energy-Optimal Circuit Design," in Symp. System-On-Chip, Nov. 2007, pp. 1-4.
[17] M. Seok, S. Hanson, D. Sylvester, and D. Blaauw, "Analysis and Optimization of Sleep Modes in Subthreshold Circuit Design," in Proc. Design Automation Conf., June 2007, pp. 694-699.
[18] H. Onoda, K. Miyashita, T. Nakayama, T. Kinoshita, H. Nishimura, A. Azuma, S. Yamada, and F. Matsuoka, "0.7 V SRAM Technology with Stress-Enhanced Dopant Segregated Schottky (DSS) Source/Drain Transistors for 32 nm Node," in Symp. VLSI Circuits Dig. Tech. Papers, June 2007, pp. 76-77.
[19] A. Wang and A. Chandrakasan, "A 180-mV subthreshold FFT processor using a minimum energy design methodology," IEEE J. Solid-State Circuits, vol. 40, no. 1, pp. 310-319, Jan. 2005.
[20] B. Zhai, D. Blaauw, D. Sylvester, and S. Hanson, "A Sub-200mV 6T SRAM in 0.13µm CMOS," in IEEE ISSCC Dig. Tech. Papers, Feb. 2007, pp. 332-606.
[21] I. J. Chang, J.-J. Kim, S. Park, and K. Roy, "A 32kb 10T Subthreshold SRAM Array with Bit-Interleaving and Differential Read Scheme in 90nm CMOS," in IEEE ISSCC Dig. Tech. Papers, Feb. 2008, pp. 388-622.
[22] N. Verma and A. Chandrakasan, "A 256 kb 65 nm 8T Subthreshold SRAM Employing Sense-Amplifier Redundancy," IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 141-149, Jan. 2008.
[23] R. Bez, E. Camerlenghi, A. Modelli, and A. Visconti, "Introduction to flash memory," IEEE Proc., vol. 91, no. 4, pp. 489-502, Apr. 2003.
[24] S. Shukuri, K. Tanagisawa, and K. Ishibashi, "CMOS process compatible ie-Flash (inverse gate electrode Flash) technology for system-on-a-chip," in Proc. IEEE Custom Integrated Circuits Conf. (CICC), May 2001, pp. 179-182.
[25] Q. Huang and M. Oberle, "A 0.5-mW passive telemetry IC for biomedical applications," IEEE J. Solid-State Circuits, vol. 33, no. 7, pp. 937-946, July 1998.
[26] A. DeHennis and K. Wise, "A double-sided single-chip wireless pressure sensor," in Proc. IEEE Conf. Microelectromechanical Syst., Jan. 2002, pp. 252-255.
[27] S. Kaiser, "Passive Telemetric Readout System," IEEE Sensors J., vol. 6, no. 5, pp. 1340-1345, Oct. 2006.
[28] D. Dudenbostel, K.-L. Krieger, C. Candler, and R. Laur, "A new passive CMOS telemetry chip to receive power and transmit data for a wide range of sensor applications," in Proc. Solid State Sensors and Actuators, vol. 2, 16-19 June 1997, pp. 995-998.
[29] F. Kocer and M. P. Flynn, "A new transponder architecture with on-chip ADC for long-range telemetry applications," IEEE J. Solid-State Circuits, vol. 41, no. 5, pp. 1142-1148, May 2006.
[30] W. Nosovic and T. Todd, "Scheduled rendezvous and RFID wakeup in embedded wireless networks," in Proc. ICC, vol. 5, 2002, pp. 3325-3329.
[31] S. von der Mark and G. Boeck, "Ultra low power wakeup detector for sensor networks," in Proc. IMOC, Oct. 2007, pp. 865-868.
[32] H. Kulah and K. Najafi, "Energy Scavenging From Low-Frequency Vibrations by Using Frequency Up-Conversion for Wireless Sensor Applications," IEEE Sensors J., vol. 8, no. 3, pp. 261-268, Mar. 2008.
[33] E. Yeatman, "Rotating and Gyroscopic MEMS Energy Scavenging," in International Workshop on Wearable and Implantable Body Sensor Networks (BSN 06), 2006, pp. 42-45.
[34] M. Renaud, T. Sterken, P. Fiorini, R. Puers, K. Baert, and C. van Hoof, "Scavenging energy from human body: design of a piezoelectric transducer," in Proc. Transducers, vol. 1, 5-9 June 2005, pp. 784-787.
[35] E. Reilly, E. Carleton, and P. Wright, "Thin Film Piezoelectric Energy Scavenging Systems for Long Term Medical Monitoring," in International Workshop on Wearable and Implantable Body Sensor Networks (BSN 06), Apr. 2006, pp. 38-41.
[36] B. C. Yen and J. H. Lang, "A variable-capacitance vibration-to-electric energy harvester," IEEE Trans. Circuits Syst. I, vol. 53, no. 2, pp. 288-295, Feb. 2006.
[37] S. J. Roundy, Energy Scavenging for Wireless Sensor Nodes with a Focus
on Vibration to Electricity Conversion, Ph.D. dissertation, The University of
California, Berkeley, 2003.

[38] H. Li and P. Pillay, A Linear Generator Powered from Bridge Vibrations for
Wireless Sensors, in Proc. IAS, 2007, pp. 523529.

[39] Power paper, http://www.powerpaper.com. [Online]. Available:

http://www.powerpaper.com

[40] J. Klassen, A description of Cymbet battery technology and its comparison

with other battery technologies, Available: http://www.cymbet.com. [Online].
Available: http://www.cymbet.com

[41] J. Flammer, S. Org ul, V. Costa, N. Orzalesi, G. Krieglstein, L. Serra, J.-P.

Renard, and E. Stefansson, The impact of ocular blood flow in glaucoma,
Progress Retinal Eye Research, vol. 21, pp. 359393, 2002.

[42] C. H. Hong and R. A. A. Arosemena, D. Zurakowski, Glaucoma drainage de-

vices: a systematic literature review and current controversies, Ophthalmology
Survey, vol. 50, pp. 4860, 2005.

[43] S. Chandrasekaran, R. Cumming, E. Rochtchina, and P. Mitchell, Associ-

ations between elevated intraocular pressure and glaucoma, use of glaucoma
medications, and 5-year incident cataract: the Blue Mountains Eye Study,
Ophthalmology, pp. 417424, 2006.

[44] S. Asrani, R. Zeimer, J. Wilensky, D. Gieser, S. Vitale, and K. Lindenmuth,

Large Diurnal Fluctuations in Intraocular Pressure Are an Independent Risk
Factor in Patients With Glaucoma, Journal of Glaucoma, vol. 9, pp. 134142,
2000.

[45] R. C. Zeimer, J. T. Wilensky, and D. K. Gieser, Presence and Rapid Decline

of Early Morning Intraocular Pressure in Glaucoma Patients, Ophthalmology,
vol. 97, pp. 547550, 1990.

[46] D. Da Rin and B. Brown, "Diurnal variation of intraocular pressure and the
overriding effects of sleep," Am J Optom Physiol Opt, vol. 64, pp. 54-61, 1987.

[47] P. P. Syam, I. Mavrikakis, and C. Liu, "Importance of early morning intraocular
pressure recording for measurement of diurnal variation of intraocular pressure,"
British Journal of Ophthalmology, vol. 89, pp. 926-927, 2005.

[48] J. H. K. Liu, X. Zhang, D. F. Kripke, and R. N. Weinreb,
"Twenty-Four-Hour Pattern of Intraocular Pressure in the Aging Population,"
Investigative Ophthalmology and Visual Science, vol. 40, pp. 2912-2917, 1999.

[49] A. Banobre, T. Alvarez, R. Fechtner, R. Greene, G. Thomas, O. Levi, and
N. Ciampa, "Measurement of intraocular pressure in pigs' eyes using a new
tonometer prototype," in Proc. NEBC, 2-3 April 2005, pp. 260-261.

[50] C. C. Collins, "Miniature passive pressure transensor for implanting in the eye,"
IEEE Trans. Biomed. Eng., vol. 14, pp. 74-83, 1967.

[51] M. Kandler and W. Mokwa, "Capacitive silicon pressure sensor for invasive
measurement of blood pressure," in Proc. Micromech. Euro. Tech. Dig, Nov
1990, pp. 203-208.

[52] K. C. Katuri, S. Asrani, and R. M.K., "Intraocular Pressure Monitoring
Sensors," IEEE Sensors J., vol. 8, no. 1, pp. 12-19, Jan. 2008.

[53] W. Mokwa and U. Schnakenberg, "Micro-transponder systems for medical
applications," IEEE Trans. Instrumentation and Measurement, vol. 50, no. 6, pp.
1551-1555, Dec. 2001.

[54] K. Stangel, S. Kolnsberg, D. Hammerschmidt, B. Hosticka, H. Trieu, and
W. Mokwa, "A programmable intraocular CMOS pressure sensor system
implant," IEEE J. Solid-State Circuits, vol. 36, no. 7, pp. 1094-1100, July 2001.

[55] J. Coosemans, M. Catrysse, and R. Puers, "A readout circuit for an intra-ocular
pressure sensor," Sens. Actuators, vol. 110, no. 1-3, pp. 432-438, 2004.

[56] Y.-S. Lin, S. Hanson, F. Albano, C. Tokunaga, R.-U. Haque, K. Wise, A. Sastry,
D. Blaauw, and D. Sylvester, "Low-voltage circuit design for widespread sensing
applications," in IEEE Int. Symp. on Circuits and Systems, May 2008, pp.
2558-2561.

[57] B. Zhai, L. Nazhandali, J. Olson, A. Reeves, M. Minuth, R. Helfand, S. Pant,
D. Blaauw, and T. Austin, "A 2.60pJ/Inst Subthreshold Sensor Processor for
Optimal Energy Efficiency," in Symp. VLSI Circuits Dig. Tech. Papers, June
2006, pp. 154-155.

[58] K. Hosaka, S. Harase, S. Izumiya, and T. Adachi, "A cascode crystal oscillator
suitable for integrated circuits," in Proc. Frequency Control, 29-31 May 2002,
pp. 610-614.

[59] R. Woudsma and J. M. Noteboom, "The Modular Design of Clock-Generator
Circuits in a CMOS Building-Block System," IEEE J. Solid-State Circuits,
vol. 20, no. 3, pp. 770-774, Jun 1985.

[60] H. Okuno, T. Tominaka, S. Fujishima, T. Mitsumoto, T. Kubo, T. Kawaguchi,
J.-W. Kim, K. Ikegami, N. Sakamoto, S. Yokouchi, T. Morikawa, T. Tanaka,
A. Goto, and Y. Yano, "A programmable clock oscillator for integrated sensor
applications," in Proc. Electron Devices Meeting, 29 Aug. 1998, pp. 1075-1077.

[61] K. Sundaresan, K. Brouse, K. U-Yen, F. Ayazi, and P. Allen, "A 7-MHz process,
temperature and supply compensated clock oscillator in 0.25 µm CMOS," in
IEEE Int. Symp. on Circuits and Systems, vol. 1, 25-28 May 2003, pp. 693-696.

[62] S. Mick, J. Wilson, and P. Franzon, "4 Gbps high-density AC coupled
interconnection," in Proc. IEEE Custom Integrated Circuits Conf. (CICC), May 2002,
pp. 133-140.

[63] K. Kanda, D. Antono, K. Ishida, H. Kawaguchi, T. Kuroda, and T. Sakurai,
"1.27Gb/s/pin 3mW/pin wireless superconnect (WSC) interface scheme," in
IEEE ISSCC Dig. Tech. Papers, vol. 1, 2003, pp. 186-487.

[64] R. Drost, R. Hopkins, R. Ho, and I. Sutherland, "Proximity communication,"
IEEE J. Solid-State Circuits, vol. 39, no. 9, pp. 1529-1535, Sept. 2004.

[65] K. Opasjumruskit, T. Thanthipwan, O. Sathusen, P. Sirinamarattana,
P. Gadmanee, E. Pootarapan, N. Wongkomet, A. Thanachayanont, and
M. Thamsirianunt, "Self-powered wireless temperature sensors exploit RFID technology,"
IEEE Pervasive Comput., vol. 5, no. 1, pp. 54-61, Jan.-March 2006.

[66] R. Xu, Y. Jin, and C. Nguyen, "Power-efficient switching-based CMOS UWB
transmitters for UWB communications and Radar systems," IEEE Trans.
Microw. Theory Tech., vol. 54, no. 8, pp. 3271-3277, Aug. 2006.

[67] N. Miura, D. Mizoguchi, T. Sakurai, and T. Kuroda, "Analysis and design
of inductive coupling and transceiver circuit for inductive inter-chip wireless
superconnect," IEEE J. Solid-State Circuits, vol. 40, no. 4, pp. 829-837, April
2005.

[68] Y.-S. Lin, D. Sylvester, and D. Blaauw, "A sub-pW timer using gate leakage for
ultra low-power sub-Hz monitoring systems," in Proc. IEEE Custom Integrated
Circuits Conf. (CICC), Sept. 2007, pp. 397-400.

[69] Y.-S. Lin, D. Sylvester, and D. Blaauw, "An ultra low power 1V, 220nW
temperature sensor for passive wireless applications," in Proc. IEEE Custom Integrated
Circuits Conf. (CICC), Sept. 2008, pp. 507-510.

[70] Y.-S. Lin and D. Sylvester, "Single stage static level shifter design for
subthreshold to I/O voltage conversion," in Proc. Int. Symp. Low Power Electronics and
Design, Aug. 2008, pp. 197-200.

[71] Y.-S. Lin, D. Sylvester, and D. Blaauw, "Sensor data retrieval using alignment
independent capacitive signaling," in Symp. VLSI Circuits Dig. Tech. Papers,
June 2008, pp. 66-67.

[72] C. H. Lee and H. J. Park, "All-CMOS temperature independent current
reference," in Electronics Letters, vol. 32, no. 14, 4 July 1996, pp. 1280-1281.

[73] J. Georgiou and C. Toumazou, "A resistorless low current reference circuit for
implantable devices," in IEEE Int. Symp. on Circuits and Systems, vol. 3, May
2002, pp. 193-196.

[74] K. M. Cao, W.-C. Lee, W. Liu, X. Jin, P. Su, S. Fung, J. An, B. Yu, and C. Hu,
"BSIM4 gate leakage model including source-drain partition," in Proc. IEDM,
10-13 Dec. 2000, pp. 815-818.

[75] C.-H. Choi, K.-Y. Nam, Z. Yu, and R. Dutton, "Impact of gate direct tunneling
current on circuit performance: a simulation study," IEEE Trans. Electron
Devices, vol. 48, no. 12, pp. 2823-2829, Dec. 2001.

[76] M. J. S. Smith and J. D. Meindl, "Exact analysis of the Schmitt trigger
oscillator," IEEE J. Solid-State Circuits, vol. 19, no. 6, pp. 1043-1046, Dec 1984.

[77] S. Borkar, "Design challenges of technology scaling," IEEE Micro, vol. 19, no. 4,
pp. 23-29, Jul-Aug 1999.

[78] Y. Liu, R. Dick, L. Shang, and H. Yang, "Accurate Temperature-Dependent
Integrated Circuit Leakage Power Estimation is Easy," in Proc. Design
Automation Test Eur., Apr. 2007, pp. 1-6.

[79] H. Su, F. Liu, A. Devgan, E. Acar, and S. Nassif, "Full chip leakage-estimation
considering power supply and temperature variations," in Proc. Int. Symp. Low
Power Electronics and Design, Aug. 2003, pp. 78-83.

[80] S. Hanson, B. Zhai, M. Seok, B. Cline, K. Zhou, M. Singhal, M. Minuth,
J. Olson, L. Nazhandali, T. Austin, D. Sylvester, and D. Blaauw, "Performance and
Variability Optimization Strategies in a Sub-200mV, 3.5pJ/inst, 11nW
Subthreshold Processor," in Symp. VLSI Circuits Dig. Tech. Papers, June 2007,
pp. 152-153.

[81] S. Narendra, V. De, S. Borkar, D. Antoniadis, and A. Chandrakasan,
"Full-Chip Subthreshold Leakage Power Prediction and Reduction Techniques for
Sub-0.18-µm CMOS," IEEE J. Solid-State Circuits, vol. 39, no. 3, pp. 501-510,
Mar. 2004.

[82] R. Rao, A. Srivastava, D. Blaauw, and D. Sylvester, "Statistical Analysis of
Subthreshold Leakage Current for VLSI Circuits," IEEE Trans. VLSI Syst.,
vol. 12, no. 2, pp. 131-139, Feb. 2004.

[83] H.-M. Chuang, K.-B. Thei, S.-F. Tsai, and W.-C. Liu, "Temperature-dependent
characteristics of polysilicon and diffused resistors," IEEE Trans. Electron
Devices, vol. 50, no. 5, pp. 1413-1415, May 2003.

[84] D. Duarte, G. Geannopoulos, U. Mughal, K. Wong, and G. Taylor,
"Temperature Sensor Design in a High Volume Manufacturing 65nm CMOS Digital
Process," in Proc. IEEE Custom Integrated Circuits Conf. (CICC), Sept. 2007,
pp. 221-224.

[85] F. Kocer and M. Flynn, "An RF-powered, wireless CMOS temperature sensor,"
IEEE Sensors J., vol. 6, no. 3, pp. 557-564, 2006.

[86] S. Zhou and N. Wu, "A novel ultra low power temperature sensor for UHF
RFID tag chip," in Proc. ASSCC, Nov 2007, pp. 464-467.

[87] K. Finkenzeller, RFID Handbook: Fundamentals and Applications in
Contactless Smart Cards and Identification. John Wiley & Sons, 2003.

[88] A. Bakker and J. Huijsing, "Micropower CMOS temperature sensor with digital
output," IEEE J. Solid-State Circuits, vol. 31, no. 7, pp. 933-937, July 1996.

[89] M. Tuthill, "A switched-current, switched-capacitor temperature sensor in
0.6-µm CMOS," IEEE J. Solid-State Circuits, pp. 1117-1122, 1998.

[90] M. A. P. Pertijs, K. A. A. Makinwa, and J. H. Huijsing, "A CMOS smart
temperature sensor with a 3σ inaccuracy of ±0.1 °C from -55 °C to 125 °C,"
IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2805-2815, Dec. 2005.

[91] P. Chen, C.-C. Chen, C.-C. Tsai, and W.-F. Lu, "A time-to-digital-converter-
based CMOS smart temperature sensor," IEEE J. Solid-State Circuits, vol. 40,
no. 8, pp. 1642-1648, Aug 2005.

[92] K. Kimura, "Low voltage techniques for bias circuits," IEEE Trans. Circuits
Syst. I, vol. 44, no. 5, pp. 459-465, May 1997.

[93] K. Usami, "Automated low-power technique exploiting multiple supply voltages
applied to a media processor," IEEE J. Solid-State Circuits, vol. 33, no. 3, pp.
463-472, 1998.

[94] C.-C. Yu, W.-P. Wang, and B.-D. Liu, "A new level converter for low-power
applications," in IEEE Int. Symp. on Circuits and Systems, vol. 1, May 2001,
pp. 113-116.

[95] F. Ishihara, F. Sheikh, and B. Nikolic, "Level Conversion for Dual-Supply
Systems," IEEE Trans. VLSI Syst., vol. 12, no. 2, pp. 185-195, 2004.

[96] I. J. Chang, J.-J. K., and K. Roy, "Robust Level Converter Design for
Subthreshold Logic," in Proc. Int. Symp. Low Power Electronics and Design, 2006,
pp. 14-19.

[97] R. Puri, L. Stok, J. Cohn, D. Kung, D. Pan, D. Sylvester, A. Srivastava, and
S. Kulkarni, "Pushing ASIC performance in a power envelope," in Proc. Design
Automation Conf., June 2003, pp. 788-793.

[98] W.-T. Wang, M.-D. Ker, M.-C. Chiang, and C.-H. Chen, "Level shifters for
high-speed 1 V to 3.3 V interfaces in a 0.13 µm Cu-interconnection/low-k CMOS
technology," in Proc. VLSI TSA, 2001, pp. 307-310.

[99] H. Zhang, V. George, and J. Rabaey, "Low-swing on-chip signaling techniques:
effectiveness and robustness," IEEE Trans. VLSI Syst., vol. 8, no. 3, pp.
264-272, June 2000.

[100] Y. Ramadass and A. Chandrakasan, "Minimum Energy Tracking Loop with
Embedded DC-DC Converter Delivering Voltages down to 250mV in 65nm
CMOS," in IEEE ISSCC Dig. Tech. Papers, Feb. 2007, pp. 64-587.

[101] M.-E. Hwang, A. Raychowdhury, K. Kim, and K. Roy, "A 85mV 40nW
Process-Tolerant Subthreshold 8x8 FIR Filter in 130nm Technology," in Symp. VLSI
Circuits Dig. Tech. Papers, June 2007, pp. 154-155.

[102] U. Karthaus and M. Fischer, "Fully integrated passive UHF RFID transponder
IC with 16.7-µW minimum RF input power," IEEE J. Solid-State Circuits,
vol. 38, no. 10, pp. 1602-1608, Oct. 2003.

[103] N. Miura, D. Mizoguchi, M. Inoue, T. Sakurai, and T. Kuroda, "A 195-Gb/s
1.2-W inductive inter-chip wireless superconnect with transmit power control
scheme for 3-D-stacked system in a package," IEEE J. Solid-State Circuits,
vol. 41, no. 1, pp. 23-34, Jan. 2006.

[104] T. Kuroda, "Wireless Proximity Communications for 3D System Integration,"
in IEEE Workshop on RFIT, Dec. 2007, pp. 21-25.

[105] R. Drost, R. Hopkins, and I. Sutherland, "Proximity communication," in Proc.
IEEE Custom Integrated Circuits Conf. (CICC), Sept. 2003, pp. 469-472.

[106] A. Fazzi, R. Canegallo, L. Ciccarelli, L. Magagni, F. Natali, E. Jung,
P. Rolandi, and R. Guerrieri, "3D Capacitive Interconnections with Mono- and
Bi-Directional Capabilities," in IEEE ISSCC Dig. Tech. Papers, Feb. 2007, pp.
356-608.

[107] E. Culurciello and A. G. Andreou, "Capacitive Inter-Chip Data and Power
Transfer for 3-D VLSI," IEEE Trans. Circuits Syst. II, vol. 53, no. 12, pp.
1348-1352, 2006.

[108] R. Drost, R. Ho, D. Hopkins, and I. Sutherland, "Electronic alignment for
proximity communication," in IEEE ISSCC Dig. Tech. Papers, 15-19 Feb. 2004,
pp. 144-518.

[109] R. Canegallo, M. Mirandola, A. Fazzi, L. Magagni, R. Guerrieri, and
K. Kaschlun, "Electrical measurement of alignment for 3D stacked chips," in
Proc. ESSCIRC, 12-16 Sept. 2005, pp. 347-350.

[110] Raphael, Synopsys Inc., Mountain View, California, 2005.

[111] L. Nazhandali, B. Zhai, A. Olson, A. Reeves, M. Minuth, R. Helfand, S. Pant,
T. Austin, and D. Blaauw, "Energy optimization of subthreshold-voltage sensor
network processors," in Proc. of the International Symposium on Computer
Architecture (ISCA), June 2005, pp. 197-207.

[112] M. Seok, S. Hanson, Y.-S. Lin, Z. Foo, D. Kim, Y. Lee, N. Liu, D. Sylvester, and
D. Blaauw, "The Phoenix Processor: A 30pW platform for sensor applications,"
in Symp. VLSI Circuits Dig. Tech. Papers, June 2008, pp. 188-189.

[113] V. Chawla and D. S. Ha, "An overview of passive RFID," IEEE Commun.
Mag., vol. 45, no. 9, pp. 11-17, Sept. 2007.

[114] K. W. Min, S. B. Chai, and S. Kim, "An Analog Front-End Circuit for ISO/IEC
14443-compatible RFID Interrogators," Jour. ETRI, vol. 26, no. 6, pp. 560-564,
2004.

[115] J.-P. Curty, N. Joehl, C. Dehollain, and M. Declercq, "Remotely powered
addressable UHF RFID integrated system," IEEE J. Solid-State Circuits, vol. 40,
no. 11, pp. 2193-2202, Nov. 2005.

[116] G. Balachandran and R. Barnett, "A 110 nA Voltage Regulator System With
Dynamic Bandwidth Boosting for RFID Systems," IEEE J. Solid-State
Circuits, vol. 41, no. 9, pp. 2019-2028, Sept. 2006.

[117] N. Miura, H. Ishikuro, T. Sakurai, and T. Kuroda, "A 0.14pJ/b
Inductive-Coupling Inter-Chip Data Transceiver with Digitally-Controlled Precise Pulse
Shaping," in IEEE ISSCC Dig. Tech. Papers, Feb. 2007, pp. 358-608.

[118] R. E. Best, Phase-Locked Loops. McGraw-Hill Professional, 2003.

[119] B. Razavi, Design of Analog CMOS Integrated Circuits. McGraw-Hill, 2000.

[120] S. Levantino, M. Milani, C. Samori, and A. Lacaita, "Fast-Switching Analog
PLL With Finite-Impulse Response," IEEE Trans. Circuits Syst. I, vol. 51,
no. 9, pp. 1697-1701, Sept. 2004.

[121] W.-H. Lee, J.-D. Cho, and S.-D. Lee, "A high speed and low power
phase-frequency detector and charge-pump," in Proc. Asia and South Pacific Design
Automation Conf., vol. 1, 1999, pp. 269-272.
