Introd Uzi One

A soft error is an error in a signal or datum which is wrong, but is not assumed to imply such a mistake or breakage. Highly reliable systems use error correction to correct soft errors on the fly. Soft errors can occur on transmission lines, in digital logic, analog circuits, magnetic storage.

Uploaded by

vitade1983
Copyright
© Attribution Non-Commercial (BY-NC)

Soft error - Wikipedia, the free encyclopedia

Soft error
From Wikipedia, the free encyclopedia

In electronics and computing, an error is a signal or datum which is wrong. Errors

may be caused by a defect, usually understood either to be a mistake in design or construction, or a

broken component. A soft error is also a signal or datum which is wrong, but is not assumed to

imply such a mistake or breakage. After observing a soft error, there is no implication that the

system is any less reliable than before.

If detected, a soft error may be corrected by rewriting correct data in place of erroneous data. Highly

reliable systems use error correction to correct soft errors on the fly. However, in many systems, it

may be impossible to determine the correct data, or even to discover that an error is present at all.

In addition, before the correction can occur, the system may have crashed, in which case the

recovery procedure must include a reboot.

Soft errors involve changes to data — the electrons in a storage circuit, for example — but not

changes to the physical circuit itself, the atoms. If the data is rewritten, the circuit will work perfectly

again.

Soft errors can occur on transmission lines, in digital logic, analog circuits, magnetic storage, and

elsewhere, but are most commonly known in semiconductor storage.

Soft errors should not be confused with software programming errors.

http://en.wikipedia.org/wiki/Soft_error (1 of 11) 2/11/2011 5:03:48 PM


Contents

● 1 Critical charge

● 2 Causes of soft errors

❍ 2.1 Alpha particles from package decay

❍ 2.2 Cosmic rays creating energetic neutrons and protons

❍ 2.3 Thermal neutrons

❍ 2.4 Other causes

● 3 Designing around soft errors

❍ 3.1 Soft error mitigation

❍ 3.2 Correcting soft errors

● 4 Soft errors in combinational logic

● 5 Soft error rate

● 6 See also

● 7 External links

● 8 References

Critical charge

Whether a circuit experiences a soft error depends on the energy of the incoming particle, the

geometry of the impact, the location of the strike, and the design of the logic circuit. Logic circuits

with higher capacitance and higher logic voltages are less likely to suffer an error. This combination

of capacitance and voltage is described by the critical charge parameter, Qcrit, the minimum

electron charge disturbance needed to change the logic level. A higher Qcrit means fewer soft

errors. Unfortunately, a higher Qcrit also means a slower logic gate and a higher power dissipation.

Reduction in chip feature size and supply voltage, desirable for many reasons, decreases Qcrit.

Thus, the importance of soft errors increases as chip technology advances.

In a logic circuit, Qcrit is defined as the minimum amount of induced charge required at a circuit

node to cause a voltage pulse to propagate from that node to the output and be of sufficient

duration and magnitude to be reliably latched. Since a logic circuit contains many nodes that may

be struck, and each node may be of unique capacitance and distance from output, Qcrit is typically

characterized on a per-node basis.
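As a rough illustration of how capacitance and voltage combine, the sketch below uses a common first-order approximation, Qcrit ≈ C × V. This approximation is an assumption made here and is not a formula given in the article; the node values are hypothetical, and a real Qcrit also depends on pulse shape and circuit response.

```python
# First-order sketch (assumed approximation, not from the article):
# Qcrit is taken as node capacitance times supply voltage.

ELECTRON_CHARGE_C = 1.602176634e-19  # coulombs per electron


def qcrit_coulombs(node_capacitance_f, supply_voltage_v):
    """Approximate critical charge of a circuit node, in coulombs."""
    return node_capacitance_f * supply_voltage_v


def qcrit_electrons(node_capacitance_f, supply_voltage_v):
    """The same approximate critical charge, expressed as a number of electrons."""
    return qcrit_coulombs(node_capacitance_f, supply_voltage_v) / ELECTRON_CHARGE_C


# Hypothetical nodes: an older, larger node and a newer, smaller one.
older_node = qcrit_coulombs(10e-15, 3.3)  # 10 fF at 3.3 V
newer_node = qcrit_coulombs(1e-15, 1.0)   # 1 fF at 1.0 V

print(f"older node: {older_node * 1e15:.1f} fC")
print(f"newer node: {newer_node * 1e15:.1f} fC")
```

Under this simplified model, shrinking both capacitance and voltage reduces Qcrit, which matches the trend described above.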

Causes of soft errors

Alpha particles from package decay

Soft errors became widely known with the introduction of dynamic RAM in the 1970s. In these early

devices, chip packaging materials contained small amounts of radioactive contaminants. Very low

decay rates are needed to avoid excess soft errors, and chip companies have occasionally suffered

problems with contamination ever since. It is extremely hard to maintain the material purity needed.

Controlling alpha particle emission rates for critical packaging materials to less than a level of 0.001

counts per hour per cm² (cph/cm²) is required for reliable performance of most circuits. For

comparison, the count rate of a typical shoe's sole is between 0.1 and 10 cph/cm².

Package radioactive decay usually causes a soft error by alpha particle emission. The positively

charged alpha particle travels through the semiconductor and disturbs the distribution of electrons

there. If the disturbance is large enough, a digital signal can change from a 0 to a 1 or vice versa. In

combinational logic, this effect is transient, perhaps lasting a fraction of a nanosecond, and this has

led to the challenge of soft errors in combinational logic mostly going unnoticed. In sequential logic

such as latches and RAM, even this transient upset can become stored for an indefinite time, to be

read out later. Thus, designers are usually much more aware of the problem in storage circuits.

Cosmic rays creating energetic neutrons and protons

Once the electronics industry had determined how to control package contaminants, it became

clear that other causes were also at work. James F. Ziegler led a program of work at IBM which

culminated in the publication of a number of papers (Ziegler and Lanford, 1979) demonstrating that

cosmic rays also could cause soft errors. Indeed, in modern devices, cosmic rays may be the

predominant cause. Although the primary particle of the cosmic ray does not generally reach the

Earth's surface, it creates a shower of energetic secondary particles. At the Earth's surface

approximately 95% of the particles capable of causing soft errors are energetic neutrons with the

remainder composed of protons and pions (Ziegler, 1996).[1] This flux of energetic neutrons is
typically referred to as "cosmic rays" in the soft error literature. Neutrons are uncharged and cannot

disturb a circuit on their own, but undergo neutron capture by the nucleus of an atom in a chip. This

process may result in the production of charged secondaries, such as alpha particles and oxygen

nuclei, which can then cause soft errors.

Cosmic ray flux depends on altitude. For the common reference location of 40.7N, 74W at 0 meters

(sea level in New York City, NY, USA) the flux is approximately 14 neutrons/cm²/hour. Burying a

system in a cave reduces the rate of cosmic-ray induced soft errors to a negligible level. In the

lower levels of the atmosphere, the flux increases by a factor of about 2.2 for every 1000 m (1.3 for

every 1000 ft) increase in altitude above sea level. Computers operated on top of mountains

experience an order of magnitude higher rate of soft errors compared to sea level. The rate of

upsets in aircraft may be more than 300 times the sea level upset rate. This is in contrast to

package decay induced soft errors, which do not change with location. A model of the energetic

neutron flux is presented in (Gordon & Goldhagen, 2004).[2] An online calculator for this model is
available at www.seutest.com.
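The altitude scaling quoted above can be sketched as a simple exponential. This is only an illustration built from the two figures in the text (14 neutrons/cm²/hour at sea level in New York City and a factor of about 2.2 per 1000 m). It is not the Gordon and Goldhagen model, and because the rule is stated for the lower levels of the atmosphere it should not be extrapolated to aircraft altitudes. The function names are hypothetical.

```python
# Rule-of-thumb sketch using only the figures quoted in the text.
SEA_LEVEL_FLUX = 14.0     # neutrons / cm^2 / hour, New York City reference
FACTOR_PER_1000_M = 2.2   # flux multiplier per 1000 m of altitude gain


def relative_flux(altitude_m):
    """Flux relative to sea level, valid only for the lower atmosphere."""
    return FACTOR_PER_1000_M ** (altitude_m / 1000.0)


def neutron_flux(altitude_m):
    """Approximate energetic neutron flux in neutrons / cm^2 / hour."""
    return SEA_LEVEL_FLUX * relative_flux(altitude_m)


for altitude in (0, 1000, 2000, 3000):
    print(f"{altitude:5d} m: x{relative_flux(altitude):.1f}, "
          f"{neutron_flux(altitude):.0f} n/cm^2/h")
```

At 3000 m the multiplier is about 10.6, consistent with the order-of-magnitude increase quoted for mountain-top computers, and 1000 ft (304.8 m) gives a factor that rounds to 1.3.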

The average rate of cosmic-ray soft errors is inversely related to sunspot activity. That is, the

average number of cosmic-ray soft errors decreases during the active portion of the sunspot cycle

and increases during the quiet portion. This counterintuitive result occurs for two reasons. The sun

does not generally produce cosmic ray particles with energy above 1 GeV that are capable of

penetrating to the Earth's upper atmosphere and creating particle showers, so the changes in the

solar flux do not directly influence the number of errors. Further, the increase in the solar flux during

an active sun period does have the effect of reshaping the Earth's magnetic field providing some

additional shielding against higher energy cosmic rays, resulting in a decrease in the number of

particles creating showers. The effect is fairly small in any case, resulting in a ±7% modulation of

the energetic neutron flux in New York City. Other locations are similarly affected.

Energetic neutrons produced by cosmic rays may lose most of their kinetic energy and reach

thermal equilibrium with their surroundings as they are scattered by materials. The resulting

neutrons are simply referred to as thermal neutrons and have an average kinetic energy of about 25

millielectron-volts at 25°C. Thermal neutrons are also produced by environmental radiation sources

such as the decay of naturally occurring uranium or thorium. The thermal neutron flux from sources

other than cosmic-ray showers may still be noticeable in an underground location and an important

contributor to soft errors for some circuits.
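The figure of about 25 millielectron-volts can be checked against the thermal energy scale k_B × T. The sketch below assumes that the quoted energy corresponds to k_B × T (the article does not say which convention it uses) and takes the Boltzmann constant from the CODATA value in eV/K.

```python
# Sketch: thermal energy scale k_B * T, assuming that is what the
# "about 25 meV" figure refers to.
BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def thermal_energy_millielectronvolts(temp_celsius):
    """Return k_B * T in millielectron-volts for a temperature in Celsius."""
    temp_kelvin = temp_celsius + 273.15
    return BOLTZMANN_EV_PER_K * temp_kelvin * 1000.0


print(f"{thermal_energy_millielectronvolts(25.0):.1f} meV at 25 C")
```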

Thermal neutrons

Neutrons that have lost kinetic energy until they are in thermal equilibrium with their surroundings

are an important cause of soft errors for some circuits. At low energies many neutron capture

reactions become much more probable and result in fission of certain materials creating charged

secondaries as fission byproducts. For some circuits the capture of a thermal neutron by the

nucleus of the B-10 isotope of boron is particularly important. This nuclear reaction is an efficient

producer of an alpha particle, Li-7 nucleus and gamma ray. Either of the charged particles (alpha or

Li-7) may cause a soft error if produced in very close proximity, approximately 5 micrometers, to a

critical circuit node. The capture cross section for B-11 is 6 orders of magnitude smaller and does

not contribute to soft errors (Baumann et al., 1995).[3]

Boron has been used in BPSG, the insulator in the interconnection layers of integrated circuits,

particularly in the lowest one. The inclusion of boron lowers the melt temperature of the glass

providing better reflow and planarization characteristics. In this application the glass is formulated

with a boron content of 4% to 5% by weight. Naturally occurring boron is 20% B-10 with the

remainder the B-11 isotope. Soft errors are caused by the high level of B-10 in this critical lower

layer of some older integrated circuit processes. Boron-11, used at low concentrations as a p-type

dopant, does not contribute to soft errors. Integrated circuit manufacturers eliminated borated

dielectrics by the 150 nm process node, largely due to this problem.

In critical designs, depleted boron, consisting almost entirely of boron-11, is used to avoid this

effect and therefore to reduce the soft error rate. Boron-11 is a by-product of the nuclear industry.

For applications in medical electronic devices this soft error mechanism may be extremely

important. Neutrons are produced during high energy cancer radiation therapy using photon beam

energies above 10 MV. These neutrons are moderated as they are scattered from the equipment

and walls in the treatment room resulting in a thermal neutron flux that is about 40 × 10^6 times higher than

the normal environmental neutron flux. This high thermal neutron flux will generally result in a very

high rate of soft errors and consequent circuit upset (Wilkinson et al., 2005),[4] (Franco et al., 2005).[5]

Other causes

Soft errors can also be caused by random noise or signal integrity problems, such as inductive or

capacitive crosstalk. However, in general, these sources represent a small contribution to the

overall soft error rate when compared to radiation effects.

Designing around soft errors

Soft error mitigation

A designer can attempt to minimize the rate of soft errors by judicious device design, choosing the

right semiconductor, package and substrate materials, and the right device geometry. Often,

however, this is limited by the need to reduce device size and voltage, to increase operating speed

and to reduce power dissipation. The susceptibility of devices to upsets is described in the industry

using the JEDEC JESD-89 standard.

One technique that can be used to reduce the soft error rate in digital circuits is called radiation

hardening. This involves increasing the capacitance at selected circuit nodes in order to increase their

effective Qcrit value. This reduces the range of particle energies at which the logic value of the node

can be upset. Radiation hardening is often accomplished by increasing the size of transistors that

share a drain/source region at the node. Since the area and power overhead of radiation hardening

can be restrictive to design, the technique is often applied selectively to nodes which are predicted

to have the highest probability of resulting in soft errors if struck. Tools and models that can predict

which nodes are most vulnerable are the subject of past and current research in the area of soft

errors.

Correcting soft errors

Main article: ECC memory

Designers can choose to accept that soft errors will occur, and design systems with appropriate

error detection and correction to recover gracefully. Typically, a semiconductor memory design

might use forward error correction, incorporating redundant data into each word to create an error

correcting code. Alternatively, roll-back error correction can be used, detecting the soft error with an

error-detecting code such as parity, and rewriting correct data from another source. This technique

is often used for write-through cache memories.
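To make forward error correction concrete, here is a minimal sketch of a single-error-correcting code. It uses the textbook Hamming(7,4) code purely as an illustration; the article does not say which code any particular memory uses, and real ECC memories typically protect wider words. The function names are hypothetical.

```python
# Illustrative Hamming(7,4) code: 4 data bits plus 3 redundant parity bits.
# Codeword positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4.


def hamming74_encode(data_bits):
    """Encode [d1, d2, d3, d4] into a 7-bit codeword (list of 0/1)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data bits, error position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of a single flipped bit
    if syndrome:
        c[syndrome - 1] ^= 1  # rewrite the erroneous bit
    return [c[2], c[4], c[5], c[6]], syndrome


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # model a soft error flipping one stored bit
data, position = hamming74_correct(stored)
print(data, position)
```

A code like this corrects any single flipped bit in a word but not two flips in the same word.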

Soft errors in logic circuits are sometimes detected and corrected using the techniques of fault

tolerant design. These often include the use of redundant circuitry or computation of data, and

typically come at the cost of circuit area, decreased performance, and/or higher power

consumption. The concept of triple modular redundancy (TMR) can be employed to ensure very

high soft-error reliability in logic circuits. In this technique, three identical copies of a circuit compute

on the same data in parallel and outputs are fed into majority voting logic, returning the value that

occurred in at least two of three cases. In this way, the failure of one circuit due to soft error is

discarded assuming the other two circuits operated correctly. In practice, however, few designers

can afford the greater than 200% circuit area and power overhead required, so it is usually only

selectively applied. Another common concept to correct soft errors in logic circuits is temporal (or

time) redundancy, in which one circuit operates on the same data multiple times and compares

subsequent evaluations for consistency. This approach, however, often incurs performance

overhead, area overhead (if copies of latches are used to store data), and power overhead, though it

is considerably more area-efficient than modular redundancy.
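The majority voting used in triple modular redundancy can be sketched in software as follows. This is a behavioral illustration only: the `compute` function and the `upset_masks` fault-injection parameter are hypothetical stand-ins for three hardware copies of a circuit and for soft errors on their outputs.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority: each output bit takes the value held by at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def tmr(compute, x, upset_masks=(0, 0, 0)):
    """Run three copies of `compute` on the same input and vote on the result.

    Each mask is XORed onto one copy's output to model a soft error
    (a hypothetical fault-injection hook for this sketch).
    """
    outputs = [compute(x) ^ mask for mask in upset_masks]
    return majority_vote(*outputs)


def example_circuit(x):
    """Hypothetical 8-bit function standing in for a logic block."""
    return (x * 3 + 1) & 0xFF


clean = tmr(example_circuit, 5)
one_upset = tmr(example_circuit, 5, upset_masks=(0b100, 0, 0))
two_upsets = tmr(example_circuit, 5, upset_masks=(0b100, 0b100, 0))
print(clean, one_upset, two_upsets)
```

A single upset copy is outvoted, while the same bit flipped in two copies defeats the voter, which is why the scheme relies on the other two circuits operating correctly.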

Traditionally, DRAM has had the most attention in the quest to reduce or work around soft errors,

because DRAM has comprised the majority share of susceptible device surface area in

desktop and server computer systems (hence the prevalence of ECC RAM in server computers). Hard

figures for DRAM susceptibility are hard to come by, and vary considerably across designs,

fabrication processes, and manufacturers. 1980s-technology 256-kilobit DRAMs could have

clusters of five or six bits flip from a single alpha particle. Modern DRAMs have much smaller

feature sizes, so the deposition of a similar amount of charge could easily cause many more bits to

flip.

The design of error detection and correction circuits is helped by the fact that soft errors usually are

localised to a very small area of a chip. Usually, only one cell of a memory is affected, although high

energy events can cause a multi-cell upset. Conventional memory layout usually places one bit of

many different correction words adjacent on a chip. So, even a multi-cell upset leads to only a

number of separate single-bit upsets in multiple correction words, rather than a multi-bit upset in a

single correction word. So, an error correcting code needs only to cope with a single bit in error in

each correction word in order to cope with all likely soft errors. The term 'multi-cell' is used for

upsets affecting multiple cells of a memory, whatever correction words those cells happen to fall in.

'Multi-bit' is used when multiple bits in a single correction word are in error.
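The layout argument above can be illustrated with a small model of bit interleaving. The mapping below (adjacent physical cells assigned to different correction words in round-robin order) is an assumed, simplified layout chosen for illustration; real memory arrays differ in detail.

```python
def cell_to_word_bit(cell_index, interleave):
    """Map a physical cell in a row to (correction word, bit within that word).

    With an interleave factor of N, N adjacent cells belong to N different words.
    """
    return cell_index % interleave, cell_index // interleave


def bits_hit_per_word(first_cell, cells_upset, interleave):
    """Count erroneous bits per correction word for a run of adjacent upset cells."""
    hits = {}
    for cell in range(first_cell, first_cell + cells_upset):
        word, _bit = cell_to_word_bit(cell, interleave)
        hits[word] = hits.get(word, 0) + 1
    return hits


# A 4-cell upset with 8-way interleaving: four single-bit upsets.
print(bits_hit_per_word(10, 4, interleave=8))
# The same upset with no interleaving: one multi-bit upset.
print(bits_hit_per_word(10, 4, interleave=1))
```

In this model a multi-cell upset stays correctable by a single-error-correcting code as long as the number of adjacent cells hit does not exceed the interleave factor.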

Soft errors in combinational logic

The three natural masking effects in combinational logic that determine whether a single event

upset (SEU) will propagate to become a soft error are electrical masking, logical masking, and

temporal (or timing-window) masking. An SEU is logically masked if its propagation is blocked from

reaching an output latch because off-path gate inputs prevent a logical transition of that gate's

output. An SEU is electrically masked if the signal is attenuated by the electrical properties of gates

on its propagation path such that the resulting pulse is of insufficient magnitude to be reliably

latched. An SEU is temporally masked if the erroneous pulse reaches an output latch, but it does not

occur close enough to when the latch is actually triggered to hold.
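The three masking effects can be expressed as simple predicates. This is a deliberately simplified sketch: the 2-input AND gate, the amplitude threshold, and the timing values (in picoseconds) are hypothetical, and the temporal check treats any overlap with the latching window as a capture, which real latches do not guarantee.

```python
def glitch_propagates_through_and(struck_input, off_path_input):
    """Logical masking: does flipping one input of a 2-input AND gate change its output?"""
    normal = struck_input & off_path_input
    upset = (struck_input ^ 1) & off_path_input
    return normal != upset


def electrically_masked(pulse_amplitude_v, latch_threshold_v):
    """Electrical masking: the attenuated pulse is too small to be latched."""
    return pulse_amplitude_v < latch_threshold_v


def temporally_masked(pulse_start, pulse_end, clock_edge, setup, hold):
    """Temporal masking: the pulse misses the window [clock_edge - setup, clock_edge + hold]."""
    window_start = clock_edge - setup
    window_end = clock_edge + hold
    return pulse_end < window_start or pulse_start > window_end


# The off-path input at 0 blocks the glitch; at 1 it lets the glitch through.
print(glitch_propagates_through_and(0, 0), glitch_propagates_through_and(0, 1))
# A pulse well before the clock edge is masked; one spanning the edge is not.
print(temporally_masked(1000, 1200, 5000, 100, 100),
      temporally_masked(4950, 5050, 5000, 100, 100))
```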

If all three masking effects fail to occur, the propagated pulse becomes latched and the output of

the logic circuit will be an erroneous value. In the context of circuit operation, this erroneous output

value may be considered a soft error event. However, from a microarchitectural-level standpoint,

the affected result may not change the output of the currently-executing program. For instance, the

erroneous data could be overwritten before use, masked in subsequent logic operations, or simply

never be used. If erroneous data does not affect the output of a program, it is considered to be an

example of microarchitectural masking.

Soft error rate

Soft error rate (SER) is the rate at which a device or system encounters or is predicted to encounter

soft errors. It is typically expressed as either number of failures-in-time (FIT), or mean time between

failures (MTBF). The unit adopted for quantifying failures in time is called FIT, equivalent to 1 error

per billion hours of device operation. MTBF is usually given in years of device operation. To put it in

perspective, 1 year MTBF is equal to approximately 114,077 FIT.
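The conversion between MTBF and FIT quoted above can be reproduced with a few lines of arithmetic. The sketch assumes a year of 365.25 days (8766 hours), which is the value that yields the 114,077 figure; the function names are hypothetical.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours (assumed year length)
FIT_HOURS = 1e9               # 1 FIT = 1 error per billion device-hours


def mtbf_years_to_fit(mtbf_years):
    """Convert a mean time between failures in years to a FIT rate."""
    return FIT_HOURS / (mtbf_years * HOURS_PER_YEAR)


def fit_to_mtbf_years(fit):
    """Convert a FIT rate to a mean time between failures in years."""
    return FIT_HOURS / (fit * HOURS_PER_YEAR)


print(round(mtbf_years_to_fit(1.0)))       # FIT for a 1-year MTBF
print(round(fit_to_mtbf_years(100.0), 1))  # MTBF in years for 100 FIT
```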

While many electronic systems have an MTBF that exceeds the expected lifetime of the circuit, the

SER may still be unacceptable to the manufacturer or customer. For instance, many failures per

million circuits due to soft errors can be expected in the field if the system does not have adequate

soft error protection. The failure of even a few products in the field, particularly if catastrophic, can

tarnish the reputation of the product and company that designed it. Also, in safety- or cost-critical

applications where the cost of system failure far outweighs the cost of the system itself, a 1%

chance of soft error failure per lifetime may be too high to be acceptable to the customer. Therefore,

it is advantageous to design for low SER when manufacturing a system in high-volume or requiring

extremely high reliability.
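The point that an MTBF far beyond the product lifetime can still produce many field failures can be illustrated numerically. The sketch assumes a constant failure rate (an exponential model, which the article does not specify) and treats every soft error as a failure; the 100 FIT rate, 10-year lifetime, and fleet size are hypothetical.

```python
import math

HOURS_PER_YEAR = 365.25 * 24  # assumed year length in hours


def lifetime_failure_probability(fit, lifetime_years):
    """Probability of at least one failure over the lifetime, assuming a constant rate."""
    expected_errors = fit * lifetime_years * HOURS_PER_YEAR / 1e9
    return 1.0 - math.exp(-expected_errors)


def expected_failed_units(fit, lifetime_years, units):
    """Expected number of units in a fleet that see at least one failure."""
    return units * lifetime_failure_probability(fit, lifetime_years)


# Hypothetical product: 100 FIT (MTBF over 1100 years), 10-year life, one million units.
p = lifetime_failure_probability(100.0, 10.0)
print(f"per-unit probability: {p:.4f}")
print(f"expected failed units: {expected_failed_units(100.0, 10.0, 1_000_000):.0f}")
```

Under these assumptions a unit has a little under a 1% chance of a soft error failure in its lifetime, yet thousands of units per million would be affected.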

See also

● Single event upset

● Radiation hardening

External links

● Book on "Architecture Design for Soft Errors" by Shubu Mukherjee, published by Elsevier, Inc.

Book review by Max Baron of Microprocessor Report (May 27, 2008), “Dr. Shubu Mukherjee’s

book is a welcome surprise: books by architecture leaders in major companies are few and far

between. Written from the viewpoint of a working engineer, the book describes sources of soft

errors and solutions involving device, logic, and architecture design to reduce the effects of soft

errors”

● Ionizing Radiation Effects in MOS Devices and Circuits by Tso Ping Ma and Paul V.

Dressendorfer. The first comprehensive overview describing the effects of ionizing radiation on

MOS devices, as well as how to design, fabricate, and test integrated circuits intended for use

in a radiation environment.

● Radiation Effects And Soft Errors In Integrated Circuits And Electronic Devices by Dan

Fleetwood and Ron D Schrimpf, Vanderbilt University, Nashville, Tennessee, USA A collection

of the most important concepts in Radiation Effects by two pioneers in this field.

● Soft Errors in Electronic Memory - A White Paper - A good summary paper with many

references - Tezzaron Jan 2004. Concludes that 1000–5000 FIT per Mbit (0.2–1 error per day

per Gbyte) is a typical DRAM soft error rate.

● Benefits of Chipkill-Correct ECC for PC Server Main Memory - A 1997 discussion of SDRAM

reliability - some interesting information on "soft errors" from cosmic rays, especially with

respect to Error-correcting code schemes

● Soft errors' impact on system reliability - Ritesh Mastipuram and Edwin C Wee, Cypress

Semiconductor, 2004

● Scaling and Technology Issues for Soft Error Rates - A Johnston - 4th Annual Research

Conference on Reliability Stanford University, October 2000

● Evaluation of LSI Soft Errors Induced by Terrestrial Cosmic rays and Alpha Particles - H.

Kobayashi, K. Shiraishi, H. Tsuchiya, H. Usuki (all of Sony), and Y. Nagai, K. Takahisa (Osaka

University), 2001.

● SELSE Workshop Website - Website for the workshop on the System Effects of Logic Soft

Errors

● TRAD Tests & Radiations - A company dedicated to Single events and soft error Test, solutions

and products

● iRoC Technologies - A company dedicated to Soft Error solutions and products

● EADS Nucletudes - A company dedicated to hardening systems in harsh electromagnetic and

radiative environments

References

1. ^ J.F. Ziegler, Terrestrial cosmic rays, IBM Journal of Research and Development, Vol. 40, no. 1, pp.

19-40, Jan 1996.

2. ^ Gordon, Goldhagen, "Measurement of the Flux and Energy Spectrum of Cosmic-Ray Induced

Neutrons on the Ground", IEEE Trans on Nuclear Science, vol. 51, no. 6, pp. 3427-34, Dec. 2004.

3. ^ R. Baumann, T. Hossain, S. Murata, H. Kitagawa, Boron compounds as a dominant source of alpha

particles in semiconductor devices, IRPS Proceedings, pp. 297-302, 1995.

4. ^ J. Wilkinson, C. Bounds, T. Brown, B.J. Gerbi, J. Peltier, Cancer radiotherapy equipment as a cause

of soft errors in electronic equipment, IEEE Trans Device and Materials Reliability, Vol. 5, No. 3, pp.

449-51, Apr. 2005

5. ^ Franco, L., Gómez, F., Iglesias, A., Pardo, J., Pazos, A., Pena, J., Zapata, M., SEUs on commercial

SRAM induced by low energy neutrons produced at a clinical linac facility, RADECS Proceedings,

Sept. 2005

● Ziegler, J. F. and W. A. Lanford, "Effect of Cosmic Rays on Computer Memories", Science,

206, 776 (1979).

● Mukherjee, S, "Architecture Design for Soft Errors," Elsevier, Inc., Feb. 2008.

● Mukherjee, S, "Computer Glitches from Soft Errors: A Problem with Multiple Solutions,"

Microprocessor Report, May 19, 2008.

Categories: Digital electronics | Computer memory

● This page was last modified on 10 February 2011 at 20:15.

● Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. See

Terms of Use for details.

Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.
