Constant-Time Synchronous Binary Counter With Minimal Clock Period
Abstract—A synchronous binary counter is one of the basic components widely used in VLSI design, and it is required to be fast and support a wide bit-width in many applications. However, most of the previous counters are associated with a limited counting rate due to large fan-outs and long carry chains, especially when the counter size is not small. This brief proposes a new fast structure for synchronous binary counting, which has a minimal counting period for practical counter sizes ranging from 8 to 128 bits. We first adopt a 1-bit Johnson counter to reduce the overall hardware complexity, and then duplicate the 1-bit Johnson counter to decrease the propagation delay caused by large fan-outs. Implementation results show that the proposed design can be realized with a small number of flip-flops, which is almost linear in the counter size, and it operates at a clock frequency of 2 GHz in a 65 nm CMOS technology, being limited only by the counting rate of the least significant bit.

Index Terms—Backward carry propagation, binary counter, constant-time counter, prescaled counter.

I. INTRODUCTION

A COUNTER is one of the basic components actively used in many applications such as measurement systems, analog-to-digital converters, frequency dividers, phase-locked-loop frequency synthesizers, and so on. Due to recent advances in the applications, it is commonly required to implement a fast, wide counter supporting a constant counting rate independent of the counter size. However, the counting rate and the size conflict with each other, because the carry propagation from a low-order bit to a high-order bit becomes longer as the counter size gets larger. Asynchronous counters, sometimes called ripple counters, can be realized with a small number of logic gates, but the accumulated delay caused by the ripple propagation produces false outputs for a short period of time because the flip-flops (F/Fs) are connected to different clock signals. The ripple effect becomes worse when the counter size is large and can be critical for some applications where the counting value needs to be stable.

To obtain a stable binary output, a synchronous binary counter can be used. The simplest synchronous counter is the ripple carry counter, in which the carry-out of a one-bit adder is connected to the carry-in of the succeeding stage. The chain of carry signals is called a ripple carry chain, as the carry signal is continually rippled into the next stage. The main limiting factor of the speed of a synchronous counter is the long carry propagation caused by the carry chain. Many techniques developed for fast adders have also been applied to derive fast counters [1]. The ripple carry chain in the traditional binary counter was replaced with a carry-lookahead circuit in order to achieve a significant speedup [2]. In addition, a hierarchical Manchester carry chain was used for carry propagation in [3], and a state-lookahead topology was used in [4] to break the carry chain by adding D F/Fs, avoiding the rippling. In [5], the carry chain was constructed by employing a tree structure. However, regarding a counter as a combination of an adder and a state register is not effective in achieving a constant clock period, since the lower bound of the adder delay is not constant. There have been other efforts to speed up the counter by improving the F/F. For example, high-speed synchronous counters were developed by using the F/F based on the true single-phase clock [2], [6].

If only fast synchronous counting is required, rather than a binary sequence, a counter associated with a constant clock period can be achieved by employing a state generator. For instance, a pipelined carry propagation chain was presented in [7], [8] by taking systolic structures, but it doubles the number of F/Fs required as well as the overall hardware complexity. Another approach to realize a state generator is to use a linear-feedback shift register (LFSR) [3], [9], but it demands large additional circuits to convert the state order to a binary value and make the number of states a power of two.

To accomplish both constant delay and binary sequence, another carry propagation method called backward carry propagation was presented in [10]. It exploits the characteristic of a binary sequence that the more significant bits become high earlier than the less significant bits. This approach can be applied to achieve a constant-delay counter since the carry propagation is determined only by the least significant bit (LSB). However, the LSB has to drive all F/Fs of the counter, leading to a large fan-out problem. In other words, the number of input ports connected to the LSB exceeds the maximum value that can be driven by the LSB. In addition, another

Manuscript received December 17, 2020; accepted January 19, 2021. Date of publication January 25, 2021; date of current version June 29, 2021. This work was supported in part by the National Research Foundation of Korea under Grant NRF-2017R1E1A1A01076992, and in part by the Ministry of Science and ICT (MSIT), South Korea, through the Information Technology Research Center (ITRC) support program under Grant IITP-2020-0-01847 supervised by the Institute of Information & Communications Technology Planning & Evaluation (IITP). This brief was recommended by Associate Editor B.-H. Gwee. (Corresponding author: In-Cheol Park.)
The authors are with the School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, South Korea (e-mail: [email protected]; [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSII.2021.3054014.
Digital Object Identifier 10.1109/TCSII.2021.3054014
1549-7747 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Amrita School of Engineering. Downloaded on September 28,2022 at 06:17:25 UTC from IEEE Xplore. Restrictions apply.
2646 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 68, NO. 7, JULY 2021
HYUN AND PARK: CONSTANT-TIME SYNCHRONOUS BINARY COUNTER WITH MINIMAL CLOCK PERIOD 2647
58-bit subblock. Increasing the counter size makes the fan-out problem more critical, and the propagation delay of the PEN signal keeps growing and eventually exceeds the desired minimal propagation delay. The theoretical delay is far different from the real delay elongated by large fan-out nodes.

In the proposed work, the fan-out delay and the hardware complexity are reduced by exploiting the redundancy made by duplicating 1-bit Johnson counters and applying the backward carry propagation in generating the PEN signal, which allows the proposed counter to achieve a high counting rate for practical counter sizes.

III. PROPOSED COUNTER

The proposed N-bit counter is illustrated in Fig. 4. For the sake of simplicity, let us assume that n = log_2 N and m = ⌈(N − n)/L⌉, where L is the maximum fan-out to be determined by conducting simulations. An N-bit counter is partitioned into three different subcounters in order to take advantage of prescaling, and m 1-bit Johnson counters are employed to generate m PEN signals to be used for the last subcounter. The Johnson counter is initialized to 0, and the PEN signal is generated to enable the counting of the next subcounter when the Johnson counter undergoes a state change from 0 to 1.

A. Counter Block

A counter block plays the role of a sequence generator that counts from 00···000 to 11···111. A counter is in general composed of a register part that stores the present state and a combinational incrementer that computes the next value. The counting rate is mainly limited by the computation time of the incrementer. The delay of the incrementer can be mitigated by prescaling. In the proposed counter architecture, an N-bit counter is realized by partitioning it into three subcounters, C1, C2, and C3, as shown in Fig. 4. Subcounter C1 is a 1-bit counter that toggles between 0 and 1 every clock. Subcounter C2 is an (n − 1)-bit counter that works based on the backward carry propagation, and the last subcounter C3 is an (N − n)-bit conventional binary counter.

The basic principle of the partitioned counter is to prescale the high-order block by considering the low-order block. An N-bit counter is divided into 3 subcounters such that the propagation delay of the (N − n)-bit synchronous ripple carry binary counter C3, which consists of (N − n − 1) AND gates, is smaller than the period of PEN2 generated in C2. As the period of PEN2 is 2^n clock cycles, the carry propagation in C3 can be stabilized before the next PEN2 arrives from C2 [11]. Subcounter C2 is an (n − 1)-bit backward carry propagation counter enabled by the 1-bit counter C1. Observing that the delay of the long carry chain is reduced to only one AND gate by employing the backward carry propagation, we can guarantee that the carry propagation of C2 is shorter than the period of PEN1 generated in C1. As a result, we have only 3 subcounters, and the partitioning process is not recursively applied to the subcounters, unlike [11]. The carry propagation delay of C2 is just the summed delay of an AND gate and an XOR gate plus the loading delay of a D F/F. Note that the fan-out effect of the first bit is small enough to be negligible because the size (n − 1) is quite small. If the minimum clock period is set by additionally considering the setup time of a D F/F, the carry propagation delay of C2 is always shorter than the period of PEN1 generated in C1, which is 2 clock cycles. As a result, the clock period is indeed determined by the least significant subcounter C1.

B. Prescaled Enable Signal Generation

In the prescaled counter, the PEN signal should be synchronous with the clock, and its delay caused by the fan-out should be negligible. The typical method to generate the PEN is to use a ring or twisted-tail counter [11]. The ring counter connects the output of the last F/F to the input of the first one, making a circular structure. When the n-bit ring counter reaches the value 2^(n−1), the PEN signal becomes 1. Similarly, the n-bit twisted-tail counter or the Johnson counter, in which the inverted output of the last F/F is connected to the input of the first one, activates the PEN signal when the count value becomes 2^(n−1). They can operate at a high frequency, as there is no combinational circuit between adjacent F/Fs, allowing the PEN to be synchronous with the clock. However, the approach is not efficient, as it needs N F/Fs to traverse N states, increasing the hardware complexity. Moreover, the PEN signal needs to drive all the F/Fs in the next partition, leading to a high fan-out that increases the propagation delay and thus decreases the overall counting speed. For example, a 64-bit counter can be made of a 1-bit subcounter C1, a 5-bit subcounter C2, a 58-bit subcounter C3, a 2-bit ring counter generating PEN1, and a 64-bit ring counter making PEN2. The fan-out of PEN1 is small enough to be ignored. However, as PEN2 drives 58 enable ports in C3, the delay caused by the high fan-out should be considered in the design. As will be found
in the simulation and implementation results later, the propagation delay of PEN2 induces the critical path, preventing the counter from having a minimal clock period.

To deal with the fan-out issue, a 2^n-bit ring counter is replaced with a 1-bit Johnson counter as illustrated in Fig. 5, where a 5-bit backward carry propagation counter and a PEN generator are exemplified for N = 64, n = 6 and m = 4. The 1-bit Johnson counter changes its state when enabled after being initialized to 0 at the beginning. Our goal is to make PEN2 have a pulse every 2^n cycles, 64 cycles in this example. For this purpose, the enable signal should be high at the (2^n − 2)th and (2^n − 1)th cycles, the 62nd and 63rd cycles in the example, in order to make PEN2 equal to 1 at the (2^n − 1)th cycle, or the 63rd cycle in the example. Such a signal can be generated by exploiting the backward carry propagation method depicted in Fig. 6. The AND operation of Q[5], Q[4], Q[3] and Q[2] can be realized by employing backward AND chains. The Q[5]&Q[4]&Q[3]&Q[2] signal becomes high when Q[2] undergoes a transition from low to high and lasts for four cycles. The late arriving signal Q[1] is connected to the last AND gate to make the output of the AND chain high for two cycles. The enable signal is equivalent to the result of &Q[5:1], and the computation takes only one AND gate as &Q[5:2] is already computed in advance thanks to the backward carry propagation. The enable signal is high at the 62nd and 63rd cycles, and repeats periodically every 64 cycles. PEN2 is inverted one clock cycle after the enable signal is asserted. Consequently, PEN2 becomes high once every 64 cycles. In other words, PEN2 is equivalent to &Q[5:0], which is computed with the least significant 6 bits of the counter.

Fig. 6. Timing diagram for prescaled enable signal generation using backward carry propagation and 1-bit Johnson counter.

The 1-bit Johnson counter can be redundantly duplicated to cope with large fan-out nodes as depicted in Fig. 5. The number of redundant Johnson counters, m, is determined by evaluating m = ⌈(N − n)/L⌉, where L is the number of input ports that can be driven by an F/F. For N ranging from 8 to 128 bits, m is usually less than 8. As the redundant F/Fs are 8 at maximum, the additional complexity caused by the redundancy is a small portion of the whole counter. The maximum fan-out of a Johnson counter is set to 16 by conducting intensive simulations in order not to degrade the counting rate. All the Johnson counters are driven by the same signal and thus generate m identical PEN2 signals. Each PEN2 is distributed evenly to drive up to L F/Fs of the next subcounter C3.

IV. IMPLEMENTATION AND COMPARISON

This section compares the performance of the proposed counter to the conventional binary counter, the backward carry propagation counter [10] and the prescaled counter employing ring counters [11]. The performance analysis has been conducted to investigate the maximum clock frequency and the hardware complexity for the counter sizes ranging from 8 to 128 bits. The maximum clock frequency is determined by additionally considering the setup time of a D F/F, while the propagation delay does not include it. The counters have been implemented with a 65 nm standard cell library, analyzed by Synopsys Design Compiler and simulated with the parasitic resistors and capacitors extracted from the layout.

Fig. 7. (a) Delay of prescaled enable signal 2. (b) Maximum counting rate.

There are two critical paths in the prescaled counter. The first path is in the 1-bit counter block and consists of an XOR gate plus a D F/F, and the other one is related to the propagation delay of PEN2. The propagation delay of PEN2 is depicted in Fig. 7(a). The delay of [11] continues to increase linearly according to the size of C3, and exceeds the upper bound of the delay allowed, which is the sum of an XOR delay and an F/F delay. However, the delay in the proposed counter is always smaller than the upper bound, as the fan-out is restricted to less than 16 thanks to the redundant PEN2 signals. In fact, the real critical path is related to the propagation of PEN2
associated with a large fan-out [11]. In the proposed counter, on the contrary, PEN2 is associated with an almost constant delay and is not on a critical path, so the overall delay is only affected by the XOR gate in the rightmost 1-bit counter.

The maximum counting rates are compared in Fig. 7(b). The counter structures in [10] and [11] are mainly limited by the large fan-out node. In [11], the maximum frequency of 2 GHz can be achieved only for small counters ranging up to 15 bits. As the counter size increases further, however, the counting rate becomes slower due to the large fan-out. Similarly, the maximum counting frequency of [10] decreases, as the fan-out of the LSB grows according to the counter size. The conventional binary counter is slowest due to the long AND carry chain. The counting frequency of the proposed counter is almost constant, 2 GHz, and almost independent of the counter size up to 128 bits.

Fig. 8. (a) Combinational equivalent gate counts and flip-flops. (b) Total equivalent gate counts.

The total equivalent gate (EG) counts required in each counter are shown in Fig. 8, where an EG corresponds to a 2-input NAND gate, and an XOR gate and a D F/F are regarded as 2.25 EGs and 7.5 EGs by considering the standard cell library used in the experiments. While the combinational gate counts in [10] increase exponentially with the counter size, they increase linearly in the conventional counter, [11], and the proposed counter. In [11], the total number of F/Fs is almost two times the counter size, as the size of the ring counter used to generate PEN2 is similar to the counter size. On the other hand, the number of F/Fs in the proposed counter increases linearly with the counter size like the conventional counter and [10]. More precisely, the slope is about 1, which means that the

V. CONCLUSION

In this brief, we have proposed a new synchronous binary counter architecture of which the delay is almost constant for practical counter sizes. The proposed counter design has used backward carry propagation and exploited redundant 1-bit Johnson counters to reduce the number of flip-flops and the unwanted propagation delay caused by large fan-out nodes. The proposed counter can be realized with a small number of flip-flops, which is slightly larger than the counter size, and can operate at 2 GHz in a 65 nm CMOS technology, which is almost independent of the counter size.

ACKNOWLEDGMENT

The authors would like to thank IC Design Education Center (IDEC), South Korea for supporting the EDA tool.

REFERENCES

[1] M. R. Stan, A. F. Tenca, and M. D. Ercegovac, “Long and fast up/down counters,” IEEE Trans. Comput., vol. 47, no. 7, pp. 722–735, Jul. 1998.
[2] J.-R. Yuan, “Efficient CMOS counter circuits,” Electron. Lett., vol. 24, no. 21, pp. 1311–1313, Oct. 1988, doi: 10.1049/el:19880891.
[3] A. Ajane, P. M. Furth, E. E. Johnson, and R. L. Subramanyam, “Comparison of binary and LFSR counters and efficient LFSR decoding algorithm,” in Proc. IEEE 54th Int. Midwest Symp. Circuits Syst. (MWSCAS), Seoul, South Korea, 2011, pp. 1–4, doi: 10.1109/MWSCAS.2011.6026392.
[4] S. Abdel-Hafeez, S. M. Harb, and W. R. Eisenstadt, “High speed digital CMOS divide-by-N frequency divider,” in Proc. IEEE Int. Symp. Circuits Syst., Seattle, WA, USA, 2008, pp. 592–595, doi: 10.1109/ISCAS.2008.4541487.
[5] M. Kondo and T. Watnaba, “Synchronous counter,” U.S. Patent 5 526 393, Jun. 1996.
[6] P. R. Thota and A. K. Mal, “A high speed counter for analog-to-digital converters,” in Proc. Int. Conf. Microelectron. Comput. Commun. (MicroCom), Durgapur, India, 2016, pp. 1–5, doi: 10.1109/MicroCom.2016.7522592.
[7] D. R. Lutz and D. N. Jayasimha, “Programmable modulo-K counters,” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 43, no. 11, pp. 939–941, Nov. 1996, doi: 10.1109/81.542285.
[8] K. Z. Pekmestzi and N. Thanasouras, “Systolic frequency dividers/counters,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 41, no. 11, pp. 775–776, Nov. 1994, doi: 10.1109/82.331551.
[9] D. Morrison, D. Delic, M. R. Yuce, and J.-M. Redouté, “Multistage linear feedback shift register counters with reduced decoding logic in 130-nm CMOS for large-scale array applications,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 27, no. 1, pp. 103–115, Jan. 2019, doi: 10.1109/TVLSI.2018.2872021.
[10] P. Larsson and J. Yuan, “Novel carry propagation in high-speed synchronous counters and dividers,” Electron. Lett., vol. 29, no. 16, pp. 1457–1458, Aug. 1993, doi: 10.1049/el:19930975.
[11] M. Ercegovac and T. Lang, “Binary counter with counting period of one half adder independent of counter size,” IEEE Trans. Circuits Syst., vol. 36, no. 6, pp. 924–926, Jun. 1989, doi: 10.1109/31.90421.
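The PEN2 timing described in Section III-B can be checked with a small cycle-level model. The following is a minimal behavioral sketch in Python, not the authors' implementation: it ignores gate delays, the function names `simulate_pen2` and `pen2_matches_and_reduction` are hypothetical, and it assumes that the low n bits Q[n−1:0] count every clock and that the 1-bit Johnson counter toggles on the clock edge whenever the enable &Q[n−1:1] is high.

```python
def simulate_pen2(n=6, cycles=256):
    """Cycle-level model of PEN2 generation (assumed behavior, no gate delays).

    Returns a list of (q, pen2) pairs, one per clock cycle, where q is the
    value of the low n counter bits and pen2 is the Johnson counter output.
    """
    mask = (1 << n) - 1
    q = 0        # low n bits of the counter, Q[n-1:0]
    johnson = 0  # 1-bit Johnson counter, initialized to 0; drives PEN2
    trace = []
    for _ in range(cycles):
        trace.append((q, johnson))
        # Enable is the AND reduction &Q[n-1:1]: all bits above the LSB are 1.
        enable = (q >> 1) == (mask >> 1)
        # Clock edge: the Johnson counter inverts its state only when enabled.
        if enable:
            johnson ^= 1
        q = (q + 1) & mask
    return trace


def pen2_matches_and_reduction(n=6, cycles=256):
    """Check that PEN2 equals &Q[n-1:0] in every simulated cycle."""
    mask = (1 << n) - 1
    return all(pen2 == int(q == mask) for q, pen2 in simulate_pen2(n, cycles))
```

For n = 6, the enable is high in cycles 62 and 63 of each 64-cycle period, so the modeled PEN2 rises for cycle 63 and falls again for cycle 0, which agrees with the statement that PEN2 is equivalent to &Q[5:0].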
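The choice of m in Section III-B can be illustrated with a short calculation. The sketch below is an illustration under stated assumptions rather than the authors' tool flow: the helper name `pen2_fanout_plan` is hypothetical, N is assumed to be a power of two so that n = log_2 N is an integer, the number of Johnson counters is taken as the ceiling of (N − n)/L (which reproduces m = 4 for N = 64 and L = 16), and "distributed evenly" is interpreted as loads that differ by at most one F/F.

```python
from math import ceil


def pen2_fanout_plan(N, L=16):
    """Return (n, m, loads) for an N-bit counter with maximum fan-out L.

    n     : number of low-order bits in C1 and C2 (assumes N is a power of two)
    m     : number of duplicated 1-bit Johnson counters
    loads : number of C3 F/Fs driven by each PEN2 copy
    """
    if N < 2 or N & (N - 1):
        raise ValueError("this sketch assumes N is a power of two")
    n = N.bit_length() - 1       # n = log2(N)
    c3_bits = N - n              # width of the conventional subcounter C3
    m = ceil(c3_bits / L)        # Johnson counters needed to cap the fan-out
    base, extra = divmod(c3_bits, m)
    # Spread the C3 F/Fs as evenly as possible over the m PEN2 copies.
    loads = [base + 1] * extra + [base] * (m - extra)
    return n, m, loads
```

With L = 16, this gives n = 6, m = 4 and loads of 15, 15, 14 and 14 F/Fs for N = 64, and m = 8 for N = 128, in line with the statement that at most 8 redundant F/Fs are needed for counter sizes up to 128 bits.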