A Triple Burst Error Correction Based On Region Selection Code
Abstract: The evolution of microelectronics boosts more scalable and complex circuit designs, providing high processing speed and greater storage capacity. However, reliability issues have grown significantly as electronic devices scale down, increasing the fault rate, mainly in critical applications exposed to radiation. Memories are sensitive to charged particles, which can corrupt data due to the transient effects. Error correction codes (ECCs) are highly applied to mitigate data failures, increasing memory reliability. The matrix region selection code (MRSC) is an ECC designed to correct a high rate of adjacent errors in memory but less effectively for nonadjacent errors. However, MRSC has a 2-D structure that makes it challenging to implement in memory where one address is accessed at a time. This article introduces the triple burst error correction based on region selection code (TBEC-RSC), an ECC that uses MRSC concepts, converting the MRSC format to a 1-D structure. TBEC-RSC was implemented and evaluated in a 16-bit data version; however, the code is easily extensible to the higher base-2 data words (e.g., 64 bits). Experimental results showed that TBEC-RSC corrects 100% of triple burst errors and more than 40% of 8-bit burst errors.

Index Terms: Critical application, error correction code (ECC), multiple cell upset (MCU), reliable memory.

Manuscript received 24 July 2022; revised 31 December 2022 and 21 March 2023; accepted 27 April 2023. Date of publication 19 May 2023; date of current version 26 July 2023. This work was supported in part by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) under Finance Code 001. (Corresponding author: Felipe Silva.)
Felipe Silva, Alan Pinheiro, and Jarbas A. N. Silveira are with the Department of Teleinformatics, Federal University of Ceará, Fortaleza, Ceará 60455-970, Brazil (e-mail: [email protected]; [email protected]; [email protected]).
César Marcon is with the Polytechnic School, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Rio Grande do Sul 90619-900, Brazil (e-mail: [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TVLSI.2023.3273085.
Digital Object Identifier 10.1109/TVLSI.2023.3273085

I. INTRODUCTION

A modern and complex integrated circuit (IC) has high integration levels and processing power, enclosing complex modules implemented by numerous components. This enhancement emerged driven by the shrinkage of the CMOS technology in the nanoscale era, which brings together the increase of single event effects (SEEs) induced by the impact of charged particles. Additionally, single cell upset (SCU) is a type of SEE that causes memory bitflip [1], [2]. Modern ICs usually contain memory devices whose occurrence of an SCU can lead to a total operation failure; thus, dealing with SEE is a vital concern for safety-critical applications [3], [4].

Single error correction-double error detection (SEC-DED) is a type of error correction code (ECC) widely applied to increase data reliability [5], [6]. In aggressive environments, such as space, a multiple cell upset (MCU) induced by the strike of high-energy cosmic particles becomes more likely to occur with CMOS technology reduction, especially in SRAM memories. Typically, an MCU appears in memory as a burst error pattern, which means that an energized particle may affect a group of neighbor cells, provoking either continuous errors (adjacent errors) or spaced errors by n bits in a word (nonadjacent errors) [7], [8], [9]. Since SEC-DEDs do not correct an MCU properly, more efficacious ECCs have become a design trend.

Several ECC techniques focusing on mitigating errors due to the occurrence of MCUs in memory have been proposed recently [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [21], [22], [23], [24], [25], [26], [27], [28]. Some of them proposed exploring elaborated 2-D ECCs for increasing correction capacity [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [21], [22]; others explore the error correction efficacy targeting a specific memory hierarchy level, i.e., cache [23], [24], [25], [26], [27], [28]. Notwithstanding the error correction improvements being a significant contribution of these ECCs, the designers should also consider energy, area, and delay aspects caused by the ECC usage, since complex ECCs can compromise the performance of applications with limited energy and time resources, such as space ones.

In this context, matrix region selection code (MRSC) [10] is an efficacious and low-cost ECC for MCU correction, which uses a 2-D parity encoding scheme. The MRSC decoding divides the data bits into regions and uses logic equations to specify the location of the errors, creating a region selection algorithm (RSA). MRSC is a 2-bit error corrector that also covers several adjacent error patterns with lower synthesis costs when compared to other robust ECCs. The same authors of MRSC propose eMRSC [11], an extended version of MRSC that explores the RSA capacities to improve the ECC efficacy.

This article proposes the triple burst error corrector based on region selection code (TBEC-RSC), a 2-D ECC scheme mapped in a 1-D physical structure. TBEC-RSC employs the main concepts presented in the MRSC algorithm, although with some modifications for improving error correction efficacy, enabling the correction of burst errors with high coverage and similar implementation costs compared with the original MRSC. TBEC-RSC was implemented and evaluated in a 16-bit data version; however, the code is easily extensible to higher base-2 data words (e.g., 64 bits). Experimental results showed that TBEC-RSC corrects 100% of triple burst errors
1063-8210 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Amrita School of Engineering. Downloaded on August 18,2023 at 08:40:28 UTC from IEEE Xplore. Restrictions apply.
SILVA et al.: TRIPLE BURST ERROR CORRECTION BASED ON REGION SELECTION CODE 1215
TABLE I
BURST ERROR PATTERNS AVAILABLE FOR l RANGING FROM 2 TO 8
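The body of Table I did not survive in this copy, so the following is only a sketch of how such patterns are commonly enumerated, not a reproduction of the table. It assumes the usual definition of a burst of length l: the first and last bits of an l-bit window are in error and the inner bits may or may not be, which gives 2^(l-2) patterns for l ≥ 2. The function names and the default `word_bits=16` are hypothetical; the actual TBEC-RSC codeword width is not stated in this excerpt.

```python
def burst_patterns(length):
    """Return the error patterns of a burst of the given length as bit strings.

    Assumed definition (not taken from the paper): the first and last bits
    of the window are flipped, and every inner bit is free.
    """
    if length < 1:
        raise ValueError("burst length must be at least 1")
    if length == 1:
        return ["1"]
    inner_bits = length - 2
    patterns = []
    for inner in range(2 ** inner_bits):
        # Zero-padded binary form of the inner bits (empty when length == 2).
        middle = format(inner, "0{}b".format(inner_bits)) if inner_bits else ""
        patterns.append("1" + middle + "1")
    return patterns


def burst_error_masks(length, word_bits=16):
    """Slide every burst pattern across a word and return the XOR masks.

    word_bits is a hypothetical parameter; pass the real codeword width.
    """
    masks = []
    for pattern in burst_patterns(length):
        value = int(pattern, 2)
        for shift in range(word_bits - length + 1):
            masks.append(value << shift)
    return masks


if __name__ == "__main__":
    for burst_length in range(2, 9):
        print(burst_length, burst_patterns(burst_length))
```

XOR-ing each mask into a stored word and running the decoder is one way to measure correction coverage per burst length exhaustively, under the assumptions above.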
1216 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 31, NO. 8, AUGUST 2023
TABLE III
PATTERN POSSIBILITIES WITH BURST LENGTH VARYING FROM 1 TO 8 BITS
Fig. 14. ECC correction efficacy according to the burst error length: TBEC-RSC scaling analysis.
TABLE IV
MTTF RESULTS FOR THE EVALUATED ECCs REGARDING M = 4096, 8192, AND 16384 CODEWORDS, AND λ = 10^-5, 10^-4, 10^-3 UPSETS/BIT/DAY
their tradeoff between coverage and synthesis cost: 1) CSC based on the metric presented in [10], [11], and [22]; and 2) CSCR based on the metric presented in [16].

A. Synthesis Results

The 16-bit ECCs were designed in Verilog and synthesized to 28 and 65 nm CMOS technology by Cadence Genus Synthesis Solution. Table V presents the synthesis results of the encoders and decoders of the six evaluated ECCs, enabling us to compare area, power, and delay according to CMOS technology variation.

All analyzed ECCs have simple encoding processes that do not require substantial logic; thus, the encoder circuit is negligible compared to the decoder circuit. The decoding methodology applied in TBEC-RSC is based on the MRSC algorithm that requires simple logic equations to detect and correct the errors. Consequently, TBEC-RSC presented equivalent

B. Tradeoff Evaluation

Silva et al. [10], [11], and Argyrides et al. [22] presented the equations that represent a tradeoff between error coverage and synthesis cost. Meanwhile, the authors in [16] also considered the percentage increase in redundancy bits as a part of the tradeoff metric. We evaluated the codes applying both metrics, and the results are presented next.

1) Error Correction Coverage Per Synthesis Cost: This experiment considers the synthesis cost (SC) represented by (18), which reflects the parameters computed in Section VII-A

$$SC = \text{Area} \times \text{Power} \times \text{Delay}. \quad (18)$$

The encoder circuit is negligible compared to the decoder circuit; therefore, we only consider the decoder SC. Equation (19) describes the error correction coverage per synthesis cost (CSC), a metric that correlates SC with the error correction coverage (CC), whose results are presented in Fig. 13

$$CSC = \frac{CC}{SC}. \quad (19)$$

Fig. 16 depicts the CSC results of each ECC for the burst error scenarios regarding a 65 nm CMOS technology. We normalized the results for better visualization and comprehension.

TBEC-RSC reaches the highest efficacy results across all the error patterns. Since all the codes, except MRSC and TBEC-RSC, require matchup tables to correct errors, reducing correction power reduces the CSC results since they were the codes with the lowest synthesis results. The compared codes have a limited scope of correction and use a syndrome look-up
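As an illustration of the metric in (18) and (19), the sketch below computes SC and CSC for a set of decoders and normalizes the results. The class name, field names, units, and code labels are hypothetical, and the sample figures are made-up placeholders rather than values from Table V. The text does not say how the results were normalized, so dividing by the largest CSC is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DecoderSynthesis:
    """Decoder synthesis figures plus correction coverage (hypothetical fields)."""
    name: str
    area: float      # e.g. square micrometres
    power: float     # e.g. milliwatts
    delay: float     # e.g. nanoseconds
    coverage: float  # correction coverage CC, as a fraction in [0, 1]


def synthesis_cost(decoder):
    """Equation (18): SC = Area x Power x Delay."""
    return decoder.area * decoder.power * decoder.delay


def coverage_per_synthesis_cost(decoder):
    """Equation (19): CSC = CC / SC."""
    return decoder.coverage / synthesis_cost(decoder)


def normalized_csc(decoders):
    """Scale every CSC by the largest one (assumed normalization)."""
    raw = {d.name: coverage_per_synthesis_cost(d) for d in decoders}
    best = max(raw.values())
    return {name: value / best for name, value in raw.items()}


if __name__ == "__main__":
    # Placeholder numbers for illustration only; not measured results.
    sample = [
        DecoderSynthesis("code_A", area=100.0, power=2.0, delay=5.0, coverage=0.5),
        DecoderSynthesis("code_B", area=400.0, power=4.0, delay=5.0, coverage=1.0),
    ]
    for name, value in normalized_csc(sample).items():
        print(name, round(value, 3))
```

Because SC multiplies three quantities with different units, only relative comparisons between codes synthesized under the same technology and units are meaningful, which is one reason normalization helps.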
[8] B. Varghese, S. Sreelal, P. Vinod, and A. R. Krishnan, “Multiple bit error correction for high data rate aerospace applications,” in Proc. IEEE Conf. Inf. Commun. Technol., Apr. 2013, pp. 1086–1090.
[9] C. Ogden and M. Mascagni, “The impact of soft error event topography on the reliability of computer memories,” IEEE Trans. Rel., vol. 4, pp. 966–979, Dec. 2017.
[10] F. Silva, W. Freitas, J. Silveira, O. Lima, F. Vargas, and C. Marcon, “An efficient, low-cost ECC approach for critical-application memories,” in Proc. 30th Symp. Integr. Circuits Syst. Design (SBCCI), Aug. 2017, pp. 198–203.
[11] F. Silva, W. Freitas, J. Silveira, C. Marcon, and F. Vargas, “Extended matrix region selection code: An ECC for adjacent multiple cell upset in memory arrays,” Microelectron. Rel., vol. 106, Mar. 2020, Art. no. 113582.
[12] J. Li, P. Reviriego, L. Xiao, Z. Liu, L. Li, and A. Ullah, “Low delay single error correction and double adjacent error correction (SEC-DAEC) codes,” Microelectron. Rel., vol. 97, pp. 31–37, Jun. 2019.
[13] S. Choi, H. K. Ahn, B. K. Song, J. P. Kim, S. H. Kang, and S. Jung, “A decoder for short BCH codes with high decoding efficiency and low power for emerging memories,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 27, no. 2, pp. 387–397, Feb. 2019.
[14] D. Freitas, D. Mota, C. Marcon, J. Silveira, and J. Mota, “LPC: An error correction code for mitigating faults in 3D memories,” IEEE Trans. Comput., vol. 70, no. 11, pp. 2001–2013, Nov. 2021.
[15] F. Garcia-Herrero, A. Sánchez-Macián, and J. A. Maestro, “Low delay non-binary error correction codes based on orthogonal Latin squares,” Integration, vol. 76, pp. 55–60, Jan. 2021.
[16] J. Gracia-Morán, L. J. Saiz-Adalid, D. Gil-Tomás, and P. J. Gil-Vicente, “Improving error correction codes for multiple-cell upsets in space applications,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 26, no. 10, pp. 2132–2142, Oct. 2018.
[17] J. Li, P. Reviriego, L. Xiao, C. Argyrides, and J. Li, “Extending 3-bit burst error-correction codes with quadruple adjacent error correction,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 26, no. 2, pp. 221–229, Feb. 2018.
[18] J. Li, L. Xiao, J. Guo, and X. Cao, “Efficient implementations of multiple bit burst error correction for memories,” in Proc. 14th IEEE Int. Conf. Solid-State Integr. Circuit Technol. (ICSICT), Oct. 2018, pp. 1–3.
[19] J. Li, P. Reviriego, and L. Xiao, “Low delay 3-bit burst error correction codes,” J. Electron. Test., vol. 35, no. 3, pp. 413–420, Jun. 2019.
[20] A. Das and N. A. Touba, “Online correction of hard errors and soft errors via one-step decodable OLS codes for emerging last level caches,” in Proc. IEEE Latin Amer. Test Symp. (LATS), Mar. 2019, pp. 1–6.
[21] A. Das and N. Touba, “A new class of single burst error correcting codes with parallel decoding,” IEEE Trans. Comput., vol. 69, no. 2, pp. 253–260, Feb. 2020.
[22] C. Argyrides, H. R. Zarandi, and D. K. Pradhan, “Matrix codes: Multiple bit upsets tolerant method for SRAM memories,” in Proc. 22nd IEEE Int. Symp. Defect Fault-Tolerance VLSI Syst. (DFT), Sep. 2007, pp. 340–348.
[23] H. Farbeh, L. Delshadtehrani, H. Kim, and S. Kim, “ECC-United Cache: Maximizing efficiency of error detection/correction codes in associative cache memories,” IEEE Trans. Comput., vol. 70, no. 4, pp. 640–654, Apr. 2021.
[24] D. Yoon and M. Erez, “Memory-Mapped ECC: Low-cost error protection for last-level caches,” ACM SIGARCH Comput. Archit. News, vol. 3, pp. 116–127, Jun. 2009.
[25] S. Paul, F. Cai, X. Zhang, and S. Bhunia, “Reliability-driven ECC allocation for multiple bit error resilience in processor cache,” IEEE Trans. Comput., vol. 60, no. 1, pp. 20–34, Jan. 2011.
[26] J. Kim, N. Hardavellas, K. Mai, B. Falsafi, and J. Hoe, “Multi-bit error tolerant caches using two-dimensional error coding,” in Proc. 40th Annu. IEEE/ACM Int. Symp. Microarchitecture (MICRO), Dec. 2007, pp. 197–209.
[27] E. Cheshmikhani, H. Farbeh, and H. Asadi, “ROBIN: Incremental oblique interleaved ECC for reliability improvement in STT-MRAM caches,” in Proc. 24th Asia South Pacific Design Autom. Conf., Jan. 2019, pp. 173–178.
[28] S. G. Ghaemi, I. Ahmadpour, M. Ardebili, and H. Farbeh, “SMARTag: Error correction in cache tag array by exploiting address locality,” in Proc. Design, Autom. Test Eur. Conf. Exhib. (DATE), Mar. 2018, pp. 1658–1663.
[29] C. Argyrides, R. Chipana, F. Vargas, and D. K. Pradhan, “Reliability analysis of H-Tree random access memories implemented with built in current sensors and parity codes for multiple bit upset correction,” IEEE Trans. Rel., vol. 60, no. 3, pp. 528–537, Sep. 2011.

Felipe Silva received the master’s degree in teleinformatics engineering from the Federal University of Ceará (UFC), Fortaleza, Brazil, in 2018, where he is currently working toward the Ph.D. degree.
His research interests are in the fields of error correction codes for embedded systems, fault-tolerant systems, and real-time systems.

Alan Pinheiro received the M.Sc. degree in teleinformatics engineering from the Federal University of Ceará (UFC), Fortaleza, Brazil, in 2019, where he is currently working toward the Ph.D. degree in teleinformatics engineering.
His research interests are on-chip communication architectures, fault tolerance, embedded systems, and real-time systems.

Jarbas A. N. Silveira received the Ph.D. degree in teleinformatics engineering from the Federal University of Ceará (UFC), Fortaleza, Brazil, in 2015.
He has been an Adjunct Professor with the Teleinformatics Department, UFC, since 2009, where he is with the Engineering Laboratory Computer Systems. His research interests are embedded systems on digital circuits, computer architecture, on-chip communication architectures, fault tolerance, and real-time systems.

César Marcon (Senior Member, IEEE) received the Ph.D. degree in computer science from the Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil, in 2005.
He has been a Professor at the School of Computer Science, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, since 1995, where he is an Advisor of Ph.D. graduate students at the Graduate Program in Computer Science. He has more than 150 papers published in prestigious journals and conference proceedings.
Dr. Marcon is a Brazilian Computer Society (SBC) Member.