CRC Circuit Design for SRAM-Based FPGA Configuration Bit Correction

Wenlong Yang, Lingli Wang, Xuegong Zhou*
Department of Microelectronics, Fudan University, Shanghai 201203, China
Email: zhouxg@fudan.edu.cn

Abstract

Field Programmable Gate Arrays (FPGAs) are used in a variety of applications. However, SRAM-based FPGAs are easily affected by single event upsets (SEUs) in the configuration memory [1]. An SEU can lead to catastrophic consequences ranging from unwanted functionality and data loss to failure of the whole system. It is therefore very important to find a way to mitigate the SEU effect.

In this paper, the existing techniques for SEU mitigation are introduced, and their shortcomings are pointed out. A new mitigation algorithm is proposed. Experimental results indicate that our algorithm is both feasible and effective.

1. Introduction

Reconfigurable FPGAs promise performance and flexibility, which help in achieving a short design cycle and no Non-Recurring Engineering (NRE) costs, as well as reduced time to market. They are therefore increasingly widely used in communications, automotive, medical and other fields.

As for SRAM-based FPGAs, the configuration information stored in the SRAMs determines the routing and functionality of the design. However, an inadvertent change in the value of one of the SRAM cells can potentially modify the functionality of the design. A major reason for such inadvertent bit flips is SEU caused by cosmic radiation or by radiation emanating from materials [2].

To make matters worse, as semiconductor technology scales down, high density and low voltage bring a great challenge. The charge stored at the sensitive nodes of the SRAM is reduced because $Q_{node} = C_{node} \times V_{dd}$, making the SRAM more prone to flip when a particle strikes a transistor [3].

An SEU with sufficient energy changes the logic state of the SRAM element, producing a soft error [4]. Soft errors can be divided into two groups: transient and permanent. A transient error is corrected by the next load of the latches. A permanent error, unless some detection and correction technique is applied, can only be corrected by a re-download of the bitstream [1].

The rest of the paper is organized as follows. Section two describes the existing mitigation techniques. Section three illustrates the proposed method in detail. Finally, experimental results and conclusions are discussed in Section four and Section five.

2. Existing Mitigation Technique

2.1 Zero-hardened SRAM cells

It was observed in [5] that 87% of the configuration bits are zeros across different designs. The high proportion of zeros is mainly due to the large number of unused routing bits [6]. So, designing SRAM cells that have better SEU immunity while storing a zero can decrease the soft error rate.

There are six transistors in a standard SRAM cell. In [2], a new SRAM model called ASRAM0 was proposed, in which the threshold voltage of two of the six transistors is raised. According to its tests, the soft error rate can decrease by 41%.

Improving the SRAM structure can lead to better SEU immunity, but this mitigation technique lacks an error correction mechanism. If a permanent soft error occurs, nothing can be done about it.

2.2 TMR

TMR means triple modular redundancy [7]. The basic concept of TMR is that a single circuit may be sensitive to SEUs, but by implementing three copies of the same circuit and performing a bit-wise "majority vote" on the output of the triplicated circuit, the SEU immunity can be improved [8].

Although effective, TMR imposes more than double the area and power overhead, and it incurs a performance penalty. It may not be affordable to put redundancy in each and every module when power and area are important constraints [9].

Besides, like the zero-hardened SRAM cells described in Section 2.1, the TMR approach does not have an error correction mechanism either. If more than one module is wrong, the TMR system will continuously output the wrong value without being aware of it. So, the TMR technology can only partially mitigate the impact of SEU.
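
The bit-wise majority vote behind TMR can be illustrated with a few lines of Python. This is only a sketch: the function name `majority_vote` and the 8-bit test values are assumptions made for illustration, not part of any cited design.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bit-wise majority of three redundant outputs: a bit is 1 if at least two copies agree on 1."""
    return (a & b) | (a & c) | (b & c)


good = 0b1011_0010              # output of an unaffected copy (illustrative value)
upset = good ^ 0b0000_1000      # the same output with one bit flipped by an SEU

# A single upset copy is outvoted by the two good copies.
masked = majority_vote(good, upset, good)

# If the same bit is upset in two copies, the voter passes the wrong value on silently.
unmasked = majority_vote(upset, upset, good)
```

Here `masked` equals `good`, while `unmasked` equals `upset`, which is the failure mode described above when more than one module is wrong.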

978-1-4244-5798-4/10/$26.00 ©2010 IEEE

Authorized licensed use limited to: BMS College of Engineering. Downloaded on October 16,2024 at 08:54:46 UTC from IEEE Xplore. Restrictions apply.
2.3 Scrubbing

All the configuration bits are stored in the SRAM of the FPGA. Using the readback function of the FPGA, the configuration bits can be read out continuously, frame by frame. By comparing them to a correct copy of the bitstream stored in a ROM, an upset can be detected, and by simply reloading the frame that contains the affected bit, the error can be corrected. This is called "scrubbing" [10].

Scrubbing needs the support of the FPGA device. It can not only detect an error caused by an SEU, but also correct it. Scrubbing provides better SEU immunity, but its main drawback is that the entire bit file must be stored in a ROM.

3. Proposed approach

3.1 Overview

In our research, ECC, which here refers to Error Check and Correction based on CRC, is introduced as an FPGA mitigation technique. The goal of our research is to develop an embedded IP core that can be implanted into FPGAs to detect and correct soft errors automatically. The structure of our system is shown in Figure 1.

Figure 1. The structure of ECC system for FPGA

The main circuit is the IP core embedded in an FPGA. To coordinate with its work, a ROM is needed. Before the bit-stream for a specific design is downloaded to the FPGA, the corresponding CRC constants are generated and stored in the ROM. The readback controller controls the FPGA's readback function. Since the FPGA outputs data byte by byte, a buffer is needed to combine 4 bytes into a word. The CRC CHECK module is the main module that performs the CRC operation. If errors are detected, the reconfig controller will re-download the frame that contains errors with the correct data in the temp RAM. The flow diagram of the ECC system is shown in Figure 2.

Figure 2. The flow diagram of the ECC system

First of all, the ECC system reads back one word of configuration bits from the FPGA and, at the same time, the word's corresponding CRC constants from the ROM. Combining these two, a wrong pattern can be calculated. From the wrong pattern, the system knows whether the word is wrong and where the error is. After the error is corrected, the word is stored in a temporary RAM to wait for reconfiguration. Since reconfiguration operates frame by frame, words without errors are stored in the RAM, too. When the whole frame has been checked and errors have been detected, the system uses the correct data stored in the temporary RAM to partially reconfigure the FPGA. Then the system is ready to read back the next frame and repeat the process until all frames are checked.

Our system is capable of detecting soft errors and can correct a 1-bit error in every 11 bits of the bit-stream.
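
The per-word check described above can be sketched in Python, using the (15,11) encoding with generator polynomial x^4 + x + 1 that Section 3.2 specifies. The function names (`crc4`, `crc_word`, `correct_word`), the MSB-first order of the three groups and the packing of the three 4-bit remainders into one 12-bit constant are assumptions; they were chosen because they reproduce the wrong patterns of Table 1 and the CRC values of Table 3. The paper does not give its hardware implementation, so this is a software model only.

```python
from __future__ import annotations

GEN_POLY = 0b10011  # x^4 + x + 1


def crc4(data11: int) -> int:
    """Remainder of data11 * x^4 divided by GEN_POLY over GF(2)."""
    reg = data11 << 4                       # append four zero bits
    for bit in range(14, 3, -1):            # cancel bits 14..4 from the top down
        if reg & (1 << bit):
            reg ^= GEN_POLY << (bit - 4)
    return reg                              # only the 4-bit remainder is left


def crc_word(word: int) -> int:
    """12-bit CRC constant of a 32-bit word: one 4-bit remainder per 11-bit group."""
    crc = 0
    for shift in (22, 11, 0):               # 33-bit zero-extended word, MSB group first
        crc = (crc << 4) | crc4((word >> shift) & 0x7FF)
    return crc


# Wrong pattern -> position of the erroneous data bit inside its 11-bit group.
SYNDROME_TO_BIT = {crc4(1 << i): i for i in range(11)}


def correct_word(word: int, stored_crc: int) -> tuple[int, bool]:
    """Return (word after correction, ok); ok is False if a group cannot be corrected."""
    ok = True
    for idx, shift in enumerate((22, 11, 0)):
        expected = (stored_crc >> (8 - 4 * idx)) & 0xF
        pattern = crc4((word >> shift) & 0x7FF) ^ expected
        if pattern == 0:
            continue                        # this group is clean
        pos = SYNDROME_TO_BIT.get(pattern)
        if pos is None or shift + pos > 31: # unknown pattern, or points at the padded zero
            ok = False
        else:
            word ^= 1 << (shift + pos)      # invert the wrong bit
    return word, ok


# First word of Table 4 (one flipped bit per group) checked against its Table 3 CRC.
fixed, ok = correct_word(0x74AB7061, 0x872)
print(f"{fixed:08X}", ok)                   # F4BB7065 True
```

In this model a multi-bit error inside one group either produces a pattern that is not in the table, which is reported as uncorrectable, or aliases to a single-bit pattern and is miscorrected. This is why the correction ability is stated as one bit error per 11-bit group.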

Compared with the scrubbing technique, our method does not have to store the whole bit-stream, only the CRC values. The ROM area used can be reduced to 37.5%.

3.2 CRC Encoding and Decoding

A word in FPGA configuration data means 32-bit binary data. To achieve a high correction rate while keeping the circuit simple, we add a zero before the MSB so that a word becomes 33-bit data. Then we divide it into 3 pieces. Each 11-bit piece has 4 redundancy bits, which are generated using the (15,11) Hamming CRC encoding method. The generator polynomial used for encoding is $x^4 + x + 1$. According to Hamming coding theory, a (15,11) code can correct a 1-bit error or, if no correction is attempted, detect a 2-bit error. The relationship between the error bit and the wrong pattern is shown in Table 1.

Table 1. The wrong pattern

Wrong pattern[3:0]   The error bit   Wrong pattern[3:0]   The error bit
1001                 10th bit        0101                 4th bit
1101                 9th bit         1011                 3rd bit
1111                 8th bit         1100                 2nd bit
1110                 7th bit         0110                 1st bit
0111                 6th bit         0011                 0th bit
1010                 5th bit         0000                 No error

Once the wrong pattern is calculated, the wrong bit can be found and corrected by inverting it.

4. Experimental Results

4.1 Implementation and Verification

Readback and reconfiguration require the support of the FPGA device. So, in our experiment, only the error check and correction algorithm has been verified. We implemented the ECC circuit on a Xilinx Spartan-3E Starter Kit, which carries a Xilinx XC3S500E FPGA. Table 2 shows the synthesis report and the timing analysis report from ISE 9.2i.

Table 2. Synthesis and timing report

Number of Slices   Number of LUTs   Max Speed
126                205              159.4 MHz

To simulate a frame of configuration bits read back from the FPGA, we use the ISE BlockRAM generator to generate a DATA ROM inside the XC3S500E, so the ECC circuit can read from the ROM as if it were reading back from an FPGA. The ROM for the CRC values is also generated and placed in the XC3S500E. What we need to do is modify the correct configuration bits in the ROM to see whether the ECC circuit can detect and correct the errors. The data processed by the circuit is transmitted to a PC through the RS-232 port on the board, so we can see the result. Besides, we use the LCD on the board to show the status.

Assuming that there are 20 words in a frame, the 20 words of configuration data and their CRC values are as follows:

Table 3. The right frame and its CRC value

No.   Config bits   CRC     No.   Config bits   CRC
1     F4BB_7065     0x872   11    758F_E22B     0x9E9
2     E220_F433     0x7D3   12    9052_F27B     0x9EB
3     D9A3_04F8     0x544   13    A415_6AC6     0x6DE
4     DE69_B577     0x237   14    B757_CB1F     0x535
5     E16D_7DCD     0x30B   15    D05A_C374     0x6B6
6     F8EF_CE18     0x48A   16    E1DD_03B4     0x56F
7     0A82_0071     0x77B   17    F2DE_FBE7     0x5C8
8     2305_48D6     0x9C6   18    FC5F_ABFE     0x7B5
9     3E08_9929     0xDFD   19    E49C_C39F     0xF2B
10    568C_11B7     0x077   20    0681_4034     0x813

We store the CRC values in the "ROM For CRC", and modify the correct frame data into Table 4. In this experiment, we keep the error rate within the circuit's correction ability, which is a 1-bit error in 11 bits of data. We store the frame with errors in the "Data ROM".

Table 4. The frame with errors

No.   Config bits   No.   Config bits
1     74AB_7061     11    658B_E22F
2     F222_F431     12    8052_FA7B
3     D9A3_04F8     13    A515_7AC7
4     DE69_B577     14    9756_CB1F
5     E16D_7DCD     15    D05A_C376
6     F8EF_CE18     16    E0DD_03B4
7     0A82_0071     17    F2DE_FBE7
8     A307_48DE     18    FC5F_EBFE
9     7E08_192B     19    E49C_C3BF
10    548E_11B3     20    0681_4134

After running the ECC system, the LCD shows that the process is done and all errors are corrected, as shown in Figure 3.

Figure 3. The status of the ECC system

The data received by the PC is given in Table 5. It is the same as the original configuration bits.

Table 5. The frame after ECC's process

No.   Config bits   No.   Config bits
1     F4BB_7065     11    758F_E22B
2     E220_F433     12    9052_F27B
3     D9A3_04F8     13    A415_6AC6
4     DE69_B577     14    B757_CB1F
5     E16D_7DCD     15    D05A_C374
6     F8EF_CE18     16    E1DD_03B4
7     0A82_0071     17    F2DE_FBE7
8     2305_48D6     18    FC5F_ABFE
9     3E08_9929     19    E49C_C39F
10    568C_11B7     20    0681_4034

However, if the error rate is beyond the system's
ability, for example when we modify the first word F4BB_7065 to A5BC_7065 and download it to the ROM, so that there is clearly more than 1 error in 11 bits of data, the system detects the error after running but cannot correct it. It warns the user on the screen, as shown in Figure 4.

Figure 4. The status of the ECC system

4.2 Performance Analysis

The total time for a complete error detection and correction pass over the whole FPGA configuration can be calculated by Equation (1) [11].

$T_{total} = col \times f \times w \times r \times t$ (1)

col = number of columns per FPGA
f = number of frames per column
w = number of words per frame
r = number of cycles for configuration readback, error detection and correction for a single word
t = clock period

As described in Section 4.1, the circuit can run at 159.4 MHz, which means that the smallest clock period is 6.3 ns. The circuit takes 3 clocks to correct the errors in a word, so r = 3. For an FPGA device like the Virtex XCV300, col = 48, f = 32 and w = 21. Based on Equation (1), $T_{total}$ for this kind of FPGA is 0.6 ms.

Suppose the occurrence of SEUs follows the Poisson distribution, which can be represented by the following formula:

$P(k) = \frac{e^{-\lambda} \lambda^{k}}{k!}$ (2)

In [12] the data show that under worst-case conditions in space, the upset rate of a Virtex series FPGA is 81.5 upsets per day, that is, $5.7 \times 10^{-7}$ bit upsets per 0.6 ms. Since we put 11 bits into a group, there are N = 96768 groups in the XCV300, and the error probability $\lambda$ in each group is $5.9 \times 10^{-12}$. Our circuit can tolerate one error in each group, so the probability that the errors in a single group exceed the circuit's tolerance per 0.6 ms is $p = 1 - P(0) - P(1) = 1.74 \times 10^{-23}$. The probability of an error overflow anywhere in the whole FPGA per 0.6 ms is then $P_{all} = 1 - (1 - p)^{N} = 1.68 \times 10^{-18}$. This is a small probability, and the catastrophe moment occurs only once in every $4 \times 10^{9}$ days.

Based on the experiments above, our mitigation technique can help FPGAs resist SEUs caused by high radiation.

5. Conclusion

A new method to mitigate SEU is proposed and verified in our research. It can detect and correct soft errors based on the CRC technique. Experiments show that for an FPGA device like the Virtex XCV300, our approach can detect a soft error within 0.6 ms, which is fast enough to detect and correct the soft errors caused by radiation. The catastrophe moment occurs only once in every $4 \times 10^{9}$ days. Compared with the scrubbing technique, our method saves 62.5% of the ROM space, and the circuit consumes only 2% of the area of the XC3S500E. In conclusion, the mitigation technique we propose is feasible and efficient.

Acknowledgment

This project is sponsored by the Shanghai Pujiang Program and the National Natural Science Foundation of China under Grant No. 60676020.

References

[1] Kamanu E., Reddy P., Hsu K., Lukowiak M., "A new architecture for single-event detection & reconfiguration of SRAM-based FPGAs," High Assurance Systems Engineering Symposium, 2007. HASE '07. 10th IEEE.
[2] S. Srinivasan, A. Gayasen, N. Vijaykrishnan, M. Kandemir, Y. Xie and M. J. Irwin, "Improving soft-error tolerance of FPGA configuration bits", ICCAD, 2004.
[3] Balkaran S. Gill, Chris Papachristou, and Francis G. Wolff, "A New Asymmetric SRAM Cell to Reduce Soft Errors and Leakage Power in FPGA". Design, Automation & Test in Europe Conference & Exhibition, 2007: 1-6.
[4] G. Asadi and M. B. Tahoori, "Soft Error Rate Estimation and Mitigation for SRAM-Based FPGAs," Proc. of the 13th ACM Intl. Symp. on Field-Programmable Gate Arrays (FPGA), Monterey, CA, Feb. 2005.
[5] Suresh Srinivasan, Aman Gayasen, N. Vijaykrishnan, M. Kandemir, Y. Xie, and M. J. Irwin, "Improving Soft-error Tolerance of FPGA Configuration Bits". International Conference on Computer Aided Design (ICCAD-2004), pp. 107-110, 2004.
[6] Balkaran S. Gill, Chris Papachristou, and Francis G. Wolff, "A New Asymmetric SRAM Cell to Reduce Soft Errors and Leakage Power in FPGA". Design, Automation & Test in Europe Conference & Exhibition, pp. 1-6, April 2007.
[7] Xilinx Inc., "Triple Module Redundancy Design Techniques for Virtex FPGAs", November 2001.
[8] Kyriakoulakos, K.; Pnevmatikatos, D., "A novel SRAM-based FPGA architecture for efficient TMR fault tolerance support", Field Programmable Logic and Applications, 2009. FPL 2009. International
Conference.
[9] G. Asadi, M. B. Tahoori, "Soft Error Mitigation for SRAM-Based FPGAs," 23rd IEEE VLSI Test Symposium, pages 207-212, May 2005.
[10] Xilinx Inc., "Correcting Single-Event Upsets Through Virtex Partial Configuration", June 2000.
[11] W. Huang, E. J. McCluskey, "A Memory Coherence Technique for Online Transient Error Recovery of FPGA Configurations", Proc. 9th ACM Intl. Symposium on Field Programmable Gate Arrays, 2001, pp. 183-192.
[12] Fuller, E., et al., "Radiation Test Results of the Virtex FPGA and ZBT SRAM for Space Based Reconfigurable Computing," MAPLD 1999.
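
As a check on the arithmetic of Section 4.2, the following Python sketch recomputes the scan time of Equation (1) and the Poisson overflow estimate from the figures quoted there. The variable names are illustrative assumptions. The two probabilities are evaluated with their leading-order terms, which is a substitution made here for numerical reasons rather than something the paper states. Because the script carries unrounded intermediate values, its results agree with the quoted ones only to within rounding.

```python
# Figures quoted in Section 4.2 (Virtex XCV300, circuit clocked at 159.4 MHz).
COLS, FRAMES_PER_COL, WORDS_PER_FRAME = 48, 32, 21
CYCLES_PER_WORD = 3
CLOCK_PERIOD_S = 6.3e-9
UPSETS_PER_DAY = 81.5
SECONDS_PER_DAY = 86_400

words = COLS * FRAMES_PER_COL * WORDS_PER_FRAME
t_total = words * CYCLES_PER_WORD * CLOCK_PERIOD_S      # Equation (1), in seconds
groups = 3 * words                                      # three 11-bit groups per word

upsets_per_pass = UPSETS_PER_DAY / SECONDS_PER_DAY * t_total
lam = upsets_per_pass / groups                          # Poisson mean per group

# 1 - P(0) - P(1) = lam^2/2 - lam^3/3 + ...  Evaluating it directly in double
# precision loses every significant digit for lam ~ 1e-11, so keep the leading term.
p_group = lam ** 2 / 2
# 1 - (1 - p)^N is N * p to leading order for such a small p.
p_fpga = groups * p_group

days_between_overflows = t_total / p_fpga / SECONDS_PER_DAY

print(f"T_total = {t_total * 1e3:.2f} ms, N = {groups}")
print(f"lambda = {lam:.2e}, p = {p_group:.2e}, P = {p_fpga:.2e}")
print(f"one overflow every {days_between_overflows:.1e} days")
```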
