A Time-Domain Computing-In-Memory Micro Using Ring Oscillator
Yixuan He1, Minsu Choi2, Kyung-Ki Kim3, Yong-Bin Kim1
1Dept. of ECE, Northeastern University, Boston, MA, USA
2Dept. of ECE, Missouri University of Science & Technology, Rolla, MO, USA
3Dept. of Electronic Eng., Daegu University, Gyeongsan, Korea
2021 18th International SoC Design Conference (ISOCC) | 978-1-6654-0174-6/21/$31.00 ©2021 IEEE | DOI: 10.1109/ISOCC53507.2021.9613954
Abstract— This paper proposes a novel time-domain computing-in-memory core that implements XNOR-and-accumulate (XAC) of an XNOR network in an 8T SRAM cell. This new technique uses an inverter-based ring oscillator to generate periodic waves whose period represents the accumulation result of the input XNOR values. The circuit is built and simulated using the PTM16_HP 16nm CMOS model with a 0.7V power supply. The results show correct functionality, a large signal margin and 463 TOPS/W efficiency. With further exploration, time-domain computation could be a new candidate for in-memory computing since it has its own advantages over mixed-signal and digital methods.

Keywords— Artificial intelligence (AI), computing-in-memory (CIM), static random access memory (SRAM), ring oscillator.

I. INTRODUCTION
With the rapid development of artificial intelligence (AI) and the Internet of Things (IoT), numerous challenges and constraints have been imposed on existing computing architectures. In edge applications such as smartphones and self-driving vehicles, instant inference or even on-time training on chip is often considered the goal for future AI. Therefore, high-speed and low-power computing and data movement have become a dire need and can directly determine the performance of machine learning algorithms implemented in computing architectures.
In the traditional von Neumann topology, data is continuously moved between memory and computing units in series. When a large-scale algorithm is applied to the system, massive data transfer can be expected, which results in high latency and power consumption. This conundrum is often referred to as the “memory-wall bottleneck” associated with the conventional computing architecture for machine learning applications. In order to keep up with the rapid growth of AI algorithms and increasingly intricate neural networks, beyond-von Neumann machines such as computing-in-memory (CIM) technology are treated as convincing candidates to break through the memory wall, because they eliminate the distance and barrier between memory and computing units.
In fact, plenty of effort has been made in implementing XNOR neural networks in SRAM using mixed-signal CMOS circuits, with promising results [1]. This kind of network has a 1-bit input and a 1-bit weight and performs XNOR logic and accumulation operations. Ref. [2] presented a binarized neural network (BNN) implementation using an 8T1C (8 transistors and 1 capacitor) SRAM CIM cell. A small capacitor is attached to each SRAM cell to store the XNOR output, and charge is shared between all the capacitors in a row to obtain the average voltage as the accumulation output. It consumes much less energy than other approaches because it does not waste DC current during the whole process. Moreover, Ref. [3] presented an 8T1C SRAM array using capacitive-coupling computing. The output is determined by the ratio of the capacitors. Thus, it is less sensitive to temperature and transistor process variation. However, those approaches have limitations such as poor signal margin, especially when large numbers of SRAM cells are integrated in the same row for higher efficiency. When many inputs are connected, the voltage difference between adjacent output values becomes small compared to the offset voltage of the ADC and thereby causes an error.
In this paper, a new type of implementation for XNOR computing is proposed to address the signal margin problem in existing mixed-signal approaches. An inverter-based ring oscillator is used to perform the XNOR and accumulation operations, and the output is represented by time and converted to digital data through a time-to-digital converter (TDC).

II. PROPOSED TIME-DOMAIN XNOR COMPUTING
The proposed time-domain structure is shown in Fig. 1. As it shows, the SRAM is modified into an 8T cell so that it can perform the XNOR logic operation on the input and the weight stored in the cell itself. The XNOR value is represented by Sn with logic “0” or “1”. In addition, a total of (3+2n) inverters are connected in series to form a ring oscillator (RO) in each SRAM row, and they are controlled by Sn (“n” denotes the index of the SRAM cell in the array; the figures use “0” as an example). When S0 is logic high, it enables the two inverters associated with this memory cell. When S0 is logic low, the inverters are disabled to save power and an additional path is created which shorts the inverters. Therefore, the ring oscillator has (3+2n_1) inverters in series, which results in the oscillating frequency and period

$$f_{RO} = \frac{1}{2\tau(3 + 2n_1)} \tag{1}$$

$$T_{RO} = 2\tau(3 + 2n_1) \tag{2}$$

where τ represents the time delay of a single inverter and n_1 is the number of Sn that are logic high.
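As an illustration of Eqs. (1) and (2), the following is a minimal behavioral sketch in Python, not the authors' circuit or simulation setup. It assumes ideal, identical inverters. The function names, the example input and weight vectors, and the inverter delay `TAU` are hypothetical values chosen only for illustration.

```python
def xnor(a: int, b: int) -> int:
    """1-bit XNOR: returns 1 when the input bit equals the stored weight bit."""
    return 1 - (a ^ b)


def count_high(inputs, weights):
    """n_1: number of cells in a row whose XNOR output S_n is logic high."""
    return sum(xnor(x, w) for x, w in zip(inputs, weights))


def ro_period(n1: int, tau: float) -> float:
    """Eq. (2): T_RO = 2 * tau * (3 + 2 * n_1)."""
    return 2.0 * tau * (3 + 2 * n1)


def ro_frequency(n1: int, tau: float) -> float:
    """Eq. (1): f_RO = 1 / (2 * tau * (3 + 2 * n_1))."""
    return 1.0 / ro_period(n1, tau)


def decode_n1(period: float, tau: float) -> int:
    """Invert Eq. (2) to recover n_1, as an ideal TDC would (assumption)."""
    return round((period / (2.0 * tau) - 3) / 2)


TAU = 10e-12  # hypothetical inverter delay in seconds, not taken from the paper

inputs = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical 1-bit inputs
weights = [1, 1, 1, 0, 0, 1, 1, 0]  # hypothetical 1-bit stored weights

n1 = count_high(inputs, weights)
period = ro_period(n1, TAU)
print(n1, period, ro_frequency(n1, TAU), decode_n1(period, TAU))
```

Under this idealized model, the periods for adjacent values of n_1 differ by a constant 4τ regardless of how many cells share the row. This agrees with the signal-margin argument above, where the step between adjacent outputs in a voltage-averaging scheme shrinks as the row grows.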
Fig. 2. The circuit schematic of the proposed 8T SRAM cell.

As for the ring oscillator shown in Fig. 3, each SRAM cell controls two inverters through five switches. A switch labeled S0 in the figure is on when S0 is “1” and off when S0 is “0”. Therefore, when S0 is high, the top four switches are closed and the inverters are connected to the others in the RO. When S0 is low, the circuit is equivalent to a short circuit.

REFERENCES
[1] C.-J. Jhang, C.-X. Xue, J.-M. Hung, F.-C. Chang and M.-F. Chang, "Challenges and Trends of SRAM-Based Computing-In-Memory for AI Edge Devices," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 68, no. 5, pp. 1773-1786, May 2021.
[2] H. Valavi, P. J. Ramadge, E. Nestler and N. Verma, "A Mixed-Signal Binarized Convolutional-Neural-Network Accelerator Integrating Dense Weight Storage and Multiplication for Reduced Data Movement," 2018 IEEE Symposium on VLSI Circuits, 2018, pp. 141-142.
[3] Z. Jiang, S. Yin, J. Seo and M. Seok, "C3SRAM: An In-Memory-Computing SRAM Macro Based on Robust Capacitive Coupling Computing Mechanism," in IEEE Journal of Solid-State Circuits, vol. 55, no. 7, pp. 1888-1897, July 2020.