
Contents

Abstract

Chapter 1
1.1 OVERVIEW
1.2 LITERATURE SURVEY
1.3 OBJECTIVE OF THESIS

Chapter 2: ERROR CONTROL CODING


2.1. INTRODUCTION
2.2 ERROR DETECTION AND ERROR CORRECTION
2.3 ERROR DETECTION AND CORRECTION CODES
2.3.1 TERMINOLOGIES AND DEFINITIONS USED IN EDAC CODES
2.3.2 CONCURRENT ERROR DETECTION SCHEMES
2.3.2.1 Parity Codes
2.3.2.2 Checksum Codes
2.3.2.3 m-out-of-n Codes
2.3.2.4 Berger Codes
2.3.3 Concurrent Error Correction Schemes
2.3.3.1 Bose–Chaudhuri–Hocquenghem (BCH) Codes
2.3.3.2 Hamming Single Error – Correcting Codes
2.3.3.3 Burst Error Correcting Codes

Chapter 3: GALOIS FIELD ARITHMETIC


3.1 INTRODUCTION
3.2 DEFINITION OF GALOIS FIELD
3.3 PROPERTIES OF GALOIS FIELDS
3.4 CONSTRUCTION OF GALOIS FIELDS
3.5 GALOIS FIELD ARITHMETIC

Chapter 4: REED SOLOMON CODES


4.1 INTRODUCTION
4.2 RS CODES BACKGROUND
4.3 Characteristics of the RS Encoder

4.4 DESIGN AND IMPLEMENTATION


4.5 SELF-CHECKING RS ENCODER
4.6 CONCURRENT ERROR DETECTION SCHEME OF THE RS
DECODER

Chapter 5: Field Programmable Gate Array (FPGA)


5.1 History of FPGA
5.2 Basic concepts of FPGA
5.3 FPGA Advantage
5.4 Language Used in FPGA
5.5 Importance of HDLs

Chapter 6: INTRODUCTION TO VERILOG


6.1 Introduction

6.2 Behavioral level
6.3 Register-Transfer Level
6.4 Gate Level
6.5 History of Verilog
6.6 Features of Verilog HDL
6.7 Simulation
6.8 Synthesis

Concurrent Error Detection in Reed–Solomon
Encoders and Decoders

ABSTRACT

Reed–Solomon (RS) codes are widely used to identify and correct errors
in transmission and storage systems. When RS codes are used in highly reliable
systems, the designer should also take into account the occurrence of faults in
the encoder and decoder subsystems. In this project, self-checking RS encoder
and decoder architectures are presented. The RS encoder architecture exploits
some properties of the arithmetic operations in GF(2^m).

These properties are related to the parity of the binary representation of
the elements of the Galois field. In the RS decoder, the implicit redundancy of
the received codeword, under suitable assumptions explained in this work,
allows implementing concurrent error detection schemes useful for a wide range
of different decoding algorithms with no intervention on the decoder architecture.
Moreover, the performance in terms of area and delay overhead of the proposed
circuits is presented.

CHAPTER 1
INTRODUCTION
A digital communication system is used to transport an information-bearing signal from the
source to a user destination via a communication channel. The information signal is
processed in a digital communication system to form discrete messages, which makes the
information more reliable for transmission. Channel coding is an important signal
processing operation for the efficient transmission of digital information over the
channel. It was introduced by Claude E. Shannon in 1948, using the channel capacity
as an important parameter for error-free transmission. In channel coding, the number of
symbols in the source-encoded message is increased in a controlled manner in order to
facilitate two basic objectives at the receiver: error detection and error correction. Error
detection and error correction to achieve good communication are also employed in
electronic devices, where they are used to reduce the effect of noise and interference in the
electronic medium. The amount of error detection and correction required and its effectiveness
depend on the signal-to-noise ratio (SNR).

1.1 OVERVIEW

Every information signal has to be processed in a digital communication system before it
is transmitted, so that the user at the receiver end receives error-free information. A
digital communication system has three basic signal processing operations: source
coding, channel coding and modulation. A digital communication system is shown in the
block diagram given below:

Figure-1.1: Block Diagram of a Digital Communication System

In source coding, the encoder maps the digital signal generated at the source output into another
signal in digital form. The objective is to eliminate or reduce redundancy so as to provide
an efficient representation of the source output. Since the source encoder mapping is
one-to-one, the source decoder on the other end simply performs the inverse mapping,
thereby delivering to the user a reproduction of the original digital source output. The
primary benefit thus gained from the application of source coding is a reduced bandwidth
requirement.

In channel coding, the objective for the encoder is to map the incoming digital signal into
a channel input, and for the decoder to map the channel output into an output signal, in
such a way that the effect of channel noise is minimized. That is, the combined role of the
channel encoder and decoder is to provide reliable communication over a noisy
channel. This provision is satisfied by introducing redundancy in a prescribed fashion in
the channel encoder and exploiting it in the decoder to reconstruct the original encoder
input as accurately as possible. Thus, in source coding redundant bits are removed,
whereas in channel coding redundancy is introduced in a controlled manner.

Modulation is then performed for the efficient transmission of the signal over the
channel. Various digital modulation techniques can be applied, such as
Amplitude Shift Keying (ASK), Frequency Shift Keying (FSK) or Phase Shift Keying
(PSK). The addition of redundancy in the coded messages implies the need for increased
transmission bandwidth. Moreover, the use of coding adds complexity to the system,
especially for the implementation of decoding operations in the receiver. Thus, bandwidth
and system complexity have to be considered in the design trade-offs when error-control
coding is used to achieve acceptable error performance.

Different error-correcting codes can be used depending on the properties of the system
and the application in which the error correction is to be introduced. Generally,
error-correcting codes are classified into block codes and convolutional codes. The
distinguishing feature for the classification is the presence or absence of memory in the
encoders for the two codes. To generate a block code, the incoming information stream is
divided into blocks and each block is processed individually by adding redundancy in
accordance with a prescribed algorithm. The decoder processes each block individually
and corrects errors by exploiting the redundancy.
Many of the important block codes used for error detection are cyclic codes. These are
also called cyclic redundancy check codes.

In a convolutional code, the encoding operation may be viewed as the discrete-time
convolution of the input sequence with the impulse response of the encoder. The duration
of the impulse response equals the memory of the encoder. Accordingly, the encoder for
a convolutional code operates on the incoming message sequence using a "sliding
window" equal in duration to its own memory. Hence, in a convolutional code, unlike a
block code where codewords are produced on a block-by-block basis, the channel
encoder accepts message bits as a continuous sequence and thereby generates a continuous
sequence of encoded bits at a higher rate.

1.2 LITERATURE SURVEY

Channel coding is a widely used technique for the reliable transmission and reception of
data. Generally systematic linear cyclic codes are used for channel coding. In 1948,
Shannon introduced the linear block codes for complete correction of errors [1]. Cyclic
codes were first discussed in a series of technical notes and reports written between 1957
and 1959 by Prange [2], [3], [4]. This led directly to the work on BCH codes published in
March and September of 1960 by Bose and Ray-Chaudhuri [5], [6], [7]. In 1959,
Irving Reed and Gus Solomon described a new class of error-correcting codes called
Reed-Solomon codes [8]. Originally, Reed-Solomon codes were constructed and decoded
through the use of finite field arithmetic [9], [10], which relied on nonsingular Vandermonde
matrices [10]. In 1964, Singleton showed that the error correction capability of these codes
is the best possible for any code of the same length and dimension [11]. Codes that achieve this
"optimal" error correction capability are called Maximum Distance Separable (MDS).
Reed-Solomon codes are by far the dominant members, both in number and utility, of the
class of MDS codes. MDS codes have a number of interesting properties that lead to
many practical consequences.

The generator polynomial construction for Reed-Solomon codes is the approach most
commonly used today in the error control literature. This approach initially evolved
independently from Reed-Solomon codes as a means for describing cyclic codes.
Gorenstein and Zierler then generalized Bose and Ray-Chaudhuri's work to arbitrary
Galois fields of size p^m, thus developing a new means for describing Reed and
Solomon's "polynomial codes" [12]. It was shown that a vector c is a code word in the
code defined by g(x) if and only if its corresponding code polynomial c(x) is a multiple of
g(x). So the information symbols can easily be mapped onto code words. All valid code
polynomials are multiples of the generator polynomial. It follows that any valid code
polynomial must have as roots the same 2t consecutive powers of α that form the roots of
g(x). This approach leads to a powerful and efficient set of decoding algorithms.

After the discovery of Reed-Solomon codes, a search began for an efficient decoding
algorithm. In 1960, Reed and Solomon proposed a decoding algorithm based on the
solution of sets of simultaneous equations [8]. Though much more efficient than a look-up
table, Reed and Solomon's algorithm is still useful only for the smallest Reed-Solomon
codes. In 1960, Peterson provided the first explicit description of a decoding algorithm for
binary BCH codes [13]. His "direct solution" algorithm is quite useful for correcting
small numbers of errors but becomes computationally intractable as the number of errors
increases. Peterson's algorithm was improved and extended to non-binary codes by
Gorenstein and Zierler (1961) [12], Chien (1964) [14], and Forney (1965) [15]. These
efforts were productive, but Reed-Solomon codes capable of correcting more than six or
seven errors still could not be used in an efficient manner.

In 1967, Berlekamp demonstrated his efficient decoding algorithm for both non-binary
BCH and Reed-Solomon codes [16], [17]. Berlekamp's algorithm allows for the efficient

decoding of dozens of errors at a time using very powerful Reed-Solomon codes. In 1968
Massey showed that the BCH decoding problem is equivalent to the problem of
synthesizing the shortest Linear Feedback Shift Register capable of generating a given
sequence [18]. Massey then demonstrated a fast shift register-based decoding algorithm
for BCH and Reed-Solomon codes that is equivalent to Berlekamp's algorithm. This shift
register-based approach is now referred to as the Berlekamp-Massey algorithm.

In 1975 Sugiyama, Kasahara, Hirasawa, and Namekawa showed that Euclid's algorithm
can also be used to efficiently decode BCH and Reed-Solomon codes [19]. Euclid's
algorithm is a means for finding the greatest common divisor of a pair of integers. It can
also be extended to more complex collections of objects, including certain sets of
polynomials with coefficients from finite fields.

As mentioned above, Reed-Solomon codes are based on finite fields, so they can be
extended or shortened. In this thesis, Reed-Solomon codes of the kind used in
compact discs are encoded and decoded. The generator polynomial approach has been
used for encoding and decoding of the data.

1.3 OBJECTIVE OF THESIS

The objectives of the thesis are:


1. To analyze the important characteristics of various coding techniques that could
be used for error control in a communication system for reliable transmission of
digital information over the channel.
2. To study the Galois Field Arithmetic on which the most important and powerful
ideas of coding theory are based.
3. To study the Reed-Solomon codes and the various methods used for encoding
and decoding of the codes to achieve efficient detection and correction of
errors.
4. To analyze the simulation results of the Reed-Solomon encoder and decoder.

CHAPTER 2

ERROR CONTROL CODING


2.1 INTRODUCTION

The designer of an efficient digital communication system faces the task of providing a
system which is cost-effective and gives the user an acceptable level of reliability. The information
transmitted through the channel to the receiver is prone to errors. These errors can be
controlled by using Error-Control Coding, which provides reliable transmission of data
through the channel. In this chapter, a few error control coding techniques are discussed
that rely on systematic addition of redundant symbols to the transmitted information.
Using these techniques, two basic objectives at the receiver are facilitated: Error
Detection and Error Correction.

2.2 ERROR DETECTION AND ERROR CORRECTION

When a message is transmitted or stored it is influenced by interference which can distort
the message. Radio transmission can be affected by noise, multipath propagation or by
other transmitters. In different types of storage, apart from noise, there is also interference
due to damage or contaminants in the storage medium. There are several ways
of reducing the interference; however, some interference is too expensive or impossible
to remove. An alternative is to design the messages in such a way that the receiver
can detect if an error has occurred and possibly even correct it. This can be
achieved by Error-Correcting Coding. In such coding the number of symbols in the
source-encoded message is increased in a controlled manner, which means that
redundancy is introduced [20].

To make error correction possible the symbol errors must be detected. When an error has
been detected, the correction can be obtained in the following ways:
(1) Asking for a repeated transmission of the incorrect codeword from the receiver
(Automatic Repeat Request, ARQ).
(2) Using the structure of the error correcting code to correct the error (Forward Error
Correction (FEC)).

It is easier to detect an error than it is to correct it. FEC therefore requires a higher
number of check bits and a higher transmission rate, given that a certain amount of
information has to be transmitted within a certain time and with a certain minimum error
probability. The reverse is also true; if the channel offers a certain possible transmission
rate, ARQ permits a higher information rate than FEC, especially if the channel has a low
error rate. FEC, however, has the advantage of not requiring a reply channel. The choice in
each particular case therefore depends on the properties of the system or on the
application in which the error correction is to be introduced. In many applications, such
as radio broadcasting or the Compact Disc (CD), there is no reply channel. Another
advantage of FEC is that the transmission is never completely blocked, even if the
channel quality falls to such low levels that an ARQ system would be requesting
retransmissions continuously. In a system using FEC, the receiver has no real-time contact
with the transmitter and cannot verify whether the data was received correctly. It must
make a decision about the received data and do whatever it can to either fix it or declare
an alarm.

There are two main methods to introduce Error-Correcting Coding. In one of them the
symbol stream is divided into blocks and each block is coded; this is consequently called
Block Coding. In the other, a convolution operation is applied to the symbol stream; this is
called Convolutional Coding.

Figure-2.1: Error Detection and Correction

FEC techniques repair the signal to enhance the quality and accuracy of the received
information, improving system performance. Various techniques used for FEC are
described in the following sections.

2.3 ERROR DETECTION AND CORRECTION CODES
The telecom industry has used FEC codes for more than 20 years to transmit digital data
through different transmission media. Claude Shannon first introduced techniques for
FEC in 1948 [1]. These error-correcting codes compensated for noise and other physical
elements to allow for full data recovery.

For an efficient digital communication system, early detection of errors is crucial in
preserving the received data and preventing data corruption. This reliability issue can be
addressed by making use of Error Detection And Correction (EDAC) schemes for
concurrent error detection (CED).

The EDAC schemes employ an algorithm which encodes the information message
such that any introduced error can easily be detected and corrected (within certain
limitations), based on the redundancy introduced into it. Such a code is said to be e-error
detecting if it can detect any error affecting at most e bits, e.g. parity codes, two-rail codes,
m-out-of-n codes, Berger codes, etc.

Similarly, it is called e-error correcting if it can correct e-bit errors, e.g. Hamming codes,
Single Error Correction Double Error Detection (SECDED) codes, Bose–Chaudhuri–
Hocquenghem (BCH) codes, residue codes, Reed-Solomon codes, etc. As mentioned
earlier, both error detection and correction schemes require additional check bits for
achieving CED. An implementation of this CED scheme is shown in Figure 2.2.

Figure-2.2: Concurrent Error Detection Implementation

Here the message generated by the source is passed to the encoder which adds redundant
check bits, and turns the message into a codeword. This encoded message is then sent
through the channel, where it may be subjected to noise and hence altered. When this
message arrives at the decoder of the receiver, it gets decoded to the most likely message.
If any error had occurred during its transmission, the error may either get detected and
necessary action taken (Error Detection scheme) or the error gets corrected and the
operations continue (Error Correction Scheme).
Detection of an error in an error detection scheme usually leads to the stalling of the
operation in progress and results in a possible retry of some or all of the past
10
computations. On the other hand the error correction scheme permits the process to
continue uninterrupted, but usually requires higher amount of redundancy than the error
detection. Each of these schemes has varied applications, depending on the reliability
requirements of the system.

2.3.1 TERMINOLOGIES AND DEFINITIONS USED IN EDAC CODES

 An EDAC scheme is said to be:


1. Unidirectional: when all components affected by a multiple error change their
values in only one direction from, say, 0 to 1, or vice versa, but not both.
2. Asymmetric: if its detecting capabilities are restricted to a single error type (0 to
1 or 1 to 0). These codes are useful in applications which expect only single
type of error during their operation.
3. Linear: if the sum of two codewords (encoded data) is also a codeword. Sum
here means the bitwise XOR of two binary blocks.
4. Non-separable or Non-systematic: if the check-bits are embedded within the
codeword, and cannot be processed concurrently with the information, else is
referred to as separable or systematic codes.
5. Block codes: if the codewords can be considered as a collection of binary
blocks, all of the same length, say n. Such codes are characterized by the fact
that the encoder accepts k information symbols from the information source and
appends a set of r redundant symbols derived from the information symbols, in
accordance with the code algorithm.
6. Binary: if the elements (or symbols) can assume either one of two possible
states (0 and 1).
7. Cyclic: if it is a parity check code with the additional property that every cyclic
shift of the word is also a code word.
8. Forward Error Correcting: if enough extra parity bits are transmitted along
with the original data so as to enable the receiver to correct up to a predetermined
maximum amount of corrupted data without any further retransmissions.
9. Backward Error Correcting: if the redundancy is enough only to detect errors
and retransmission is required.

 The Hamming Distance of two codewords x, y ∈ F^n, denoted by d(x, y), is the
number of bit positions in which they differ, i.e., the number of 1s in x XOR y. A
distance-d code can detect (d − 1)-bit errors and correct ⌊(d − 1)/2⌋-bit errors.

 The Error Syndrome is a defined function which is used to exactly identify the error
location by addition of the bad parity bit locations. It is the vector sum of the received
parity digits and the parity check digits recomputed from the received information
digits. A Codeword is a block of n symbols that carries the k information symbols and
the r redundant symbols (n = k + r).

 For an (n, k) block code, the rate of the code is defined as the ratio of the number of
information symbols to the length of the code, i.e., k/n.

2.3.2 CONCURRENT ERROR DETECTION SCHEMES

Schemes for Concurrent Error Detection (CED) find a wide range of applications, since
only after an error has been detected can any preventive measure be initiated. The principle of
an error detecting scheme is very simple: an encoded codeword must preserve some
characteristic of that particular scheme, and a violation of that characteristic indicates the
occurrence of an error. Some of the CED techniques are discussed below.

2.3.2.1 Parity Codes

These are the simplest form of error detecting codes, with a Hamming distance of two
(d = 2) and a single check bit (irrespective of the size of the input data). They are of two basic
types: odd and even. For an even-parity code the check bit is defined so that the total
number of 1s in the code word is always even; for an odd-parity code, this total is odd.
Whenever a fault affects a single bit, the total count is altered and hence the fault is
easily detected. A major drawback of these codes is that their multiple-fault detection
capability is very limited: any error affecting an even number of bits goes undetected.
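
A minimal Python sketch of the even-parity mechanism (illustrative only; the function name is hypothetical):

    def even_parity_bit(bits):
        # Check bit chosen so that the total number of 1s in the codeword is even.
        return sum(bits) & 1

    word = [1, 0, 1, 1, 0, 1, 0]
    coded = word + [even_parity_bit(word)]
    assert sum(coded) % 2 == 0
    # Flipping any single bit makes the overall parity odd, which the receiver
    # detects; flipping two bits restores even parity and goes undetected.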

2.3.2.2 Checksum Codes

In these codes the sum of all the information bytes is appended to the information
as a b-bit checksum. Any error in the transmission will show up as a mismatch in
the checksum, which leads to detection of the error. When b = 1, these codes reduce to
parity check codes. The codes are systematic in nature and require simple hardware units.

2.3.2.3 m-out-of-n Codes

In this scheme every codeword has a fixed weight m and a fixed length of n bits.
Whenever an error occurs during transmission, the weight of the code word changes and
the error is detected. If the error is a 0-to-1 transition, an increase in weight is detected;
similarly, a 1-to-0 transition leads to a reduction in the weight of the code word, again
allowing easy detection of the error. This scheme can be used for the detection of
unidirectional errors, which are the most common form of error in digital systems.

2.3.2.4 Berger Codes

Berger codes are systematic unidirectional error detecting codes. They can be considered
as an extension of the parity codes. Parity codes have one check bit, which can be
considered as the number of information bits of value 1 counted modulo 2. Berger codes,
on the other hand, have enough check bits to represent the count of the information
bits having value 0. The number of check bits (r) required for k information bits is given
by

r = ⌈log2 (k + 1)⌉
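
A small Python sketch of the Berger check symbol computation (illustrative only; the helper name is hypothetical):

    import math

    def berger_check(info_bits):
        # The check symbol is the count of 0-bits, encoded in ceil(log2(k+1)) bits.
        k = len(info_bits)
        r = math.ceil(math.log2(k + 1))
        zeros = info_bits.count(0)
        return format(zeros, "0{}b".format(r))

    # Eight information bits need four check bits; a unidirectional error can only
    # change the number of zeros in one direction, so it cannot go undetected.
    print(berger_check([1, 0, 1, 1, 0, 0, 1, 0]))   # '0100' (four zeros)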

Of all the unidirectional error detecting codes that exist, [21] suggests m-out-of-n codes
to be the most optimal. These codes, however, are not of much practical use because of their
nonseparable nature. Amongst the separable codes in use, the Berger codes have been
proven to be the most optimal, requiring the smallest number of check bits [22].
The Berger codes, however, are not optimal when only t unidirectional errors need to be
detected instead of all unidirectional errors. For this reason a number of different
modified Berger codes exist: Hao Dong introduced a code [23] that accepts slightly
reduced error detection capabilities, but does so using fewer check bits and smaller
checker sizes. In this code the number of check bits is independent of the number of
information bits. Bose and Lin [24] have introduced their own variation on Berger codes,
and Bose [25] has further introduced a code that improves on the burst error detection
capabilities of his previous code for the case where erroneous bits are expected to appear
in groups.

Blaum [26] further improves on the Bose–Lin code. Favalli proposes an approach where
the code cost is reduced through graph-theoretic optimization [27].

2.3.3 Concurrent Error Correction Schemes

Error-correcting codes (ECC) were first developed in the 1940s following a theorem of
Claude Shannon that showed that almost error-free communication could be obtained
over a noisy channel [1]. The quality of the recovered signal will however depend on the
error correcting capability of the codes.

Error correction coding requires lower rate codes than error detection, but it is a basic
necessity in safety-critical systems, where it is absolutely critical to get it right the first
time. In these special circumstances, the additional bandwidth required for the redundant
check bits is an acceptable price.

Over the years, the correcting capability of error correction schemes has gradually
increased for a constrained number of computation steps. Concurrently, the time and
hardware cost to perform a given number of computational steps have also greatly
decreased. These trends have led to greater application of error-correcting
techniques.

One application of ECC is to correct or detect errors in communication over channels
where the errors appear in bursts, i.e. the errors tend to be grouped in such a way that
several neighboring symbols are incorrectly detected. Non-binary codes are used to
correct such errors. Since the error value is always a nonzero element of the field, in a
binary code it is always one. In a non-binary code the error value can take many values,
and the magnitude of the error has to be determined in order to correct it. Some of the
non-binary codes are discussed in the following sections.

2.3.3.1 Bose–Chaudhuri–Hocquenghem (BCH) Codes

BCH codes are among the most important and powerful classes of linear block codes; they are
cyclic codes with a wide variety of parameters. The most common BCH codes are
characterized as follows. For any positive integers m (equal to or greater than
3) and t < (2^m − 1)/2, there exists a binary BCH code with the following
parameters:
Block length: n = 2^m − 1
Number of message bits: k ≥ n − mt
Minimum distance: d_min ≥ 2t + 1

Here n − k is the number of parity bits and t is the number of errors that can be corrected.
Each BCH code is a t-error correcting code in that it can detect and correct up to t
random errors per codeword. The Hamming single-error correcting codes can be
described as BCH codes. The BCH codes offer flexibility in the choice of code
parameters, block length and code rate.

2.3.3.2 Hamming Single Error – Correcting Codes

Hamming codes can also be defined over non-binary fields. The parity check matrix
is designed by setting its columns as the vectors of GF(p)^m whose first nonzero
element equals one. There are n = (p^m − 1)/(p − 1) such vectors, and any pair of them is
linearly independent.

If p = 3 and r = 3, then a Hamming single-error correcting code is generated.


These codes represent a simple error correcting scheme, which is key to the
understanding of more complex correcting schemes. Here the codewords have a
minimum Hamming distance of 3 (i.e. d = 3), so that one error can be corrected or two errors
detected. To enable error correction, beyond mere error detection, the location of the
error must also be identified. So, for one-bit correction on an n-bit frame, if there is an
error in one of the n bit positions it must be identified; otherwise it must be
stated that there is no error. Once located, the correction is trivial: the bit is inverted.

Hamming codes have the advantage of requiring the fewest possible check bits for their code
lengths, but suffer from the disadvantage that, whenever more than a single error occurs, it
is wrongly interpreted as a single error, because each nonzero syndrome is matched with
one of the single-error events. Thus they are inefficient at handling burst errors.

2.3.3.3 Burst Error Correcting Codes

The transmission channel can be memoryless or it may have some memory. If the
channel is memoryless, the errors are independent and identically distributed.
Sometimes the channel errors exhibit some kind of memory; the most
common example of this is burst errors. If a particular symbol is in error, the
chances are good that its immediate neighbors are also wrong. Burst errors occur, for
instance, in mobile communications due to fading and in magnetic recording due to media
defects. Burst errors can be converted to independent errors by the use of an interleaver.

A burst error can also be viewed as another type of random error pattern and be handled
accordingly. But some schemes are particularly well suited to dealing with burst errors.
Cyclic codes represent one such class of codes. Most of the linear block codes are either
cyclic or closely related to cyclic codes. An advantage of cyclic codes over most
other codes is that they are easy to encode. Furthermore, cyclic codes possess a
well-defined mathematical structure, based on the Galois field, which has led to the development
of very efficient decoding schemes for them.

Reed Solomon codes represent the most important sub-class of the cyclic codes [5], [14].

CHAPTER 3

GALOIS FIELD ARITHMETIC

3.1 INTRODUCTION
In chapter 2, various types of error correcting codes were discussed. Burst errors are
efficiently corrected by using cyclic codes. Galois fields, or finite fields, are
extensively used in error-correcting codes (ECC) based on linear block codes. In
this chapter these finite fields are discussed thoroughly. A Galois field is a finite set of
elements with defined rules for arithmetic. These rules are not fundamentally
different from those used in arithmetic with ordinary numbers; the only difference
is that only a finite set of elements is involved. Galois fields have been extensively used in
Digital Signal Processing (DSP), pseudo-random number generation, and encryption and
decryption protocols in cryptography.

The design of efficient multiplier, inverter and exponentiation circuits for Galois Field
arithmetic is needed for these applications.

3.2 DEFINITION OF GALOIS FIELD

A finite field, also called a Galois field, is a field with a finite order (i.e., number of elements).
The order of a finite field is always a prime or a power of a prime. For each
prime power, there exists exactly one finite field GF(p^m). A field is said to be infinite if
it consists of an infinite number of elements, e.g. the set of real numbers or of complex
numbers. A finite field, on the other hand, consists of a finite number of elements.

GF(p^m) is an extension field of the ground field GF(p), where m is a positive integer.
For p = 2, GF(2^m) is an extension field of the ground field GF(2) of two elements (0, 1).
GF(2^m) is a vector space of dimension m over GF(2) and hence is represented using a
basis of m linearly independent vectors. The finite field GF(2^m) contains 2^m − 1 nonzero
elements. All finite fields contain a zero element and an element, called a generator
or primitive element α, such that every nonzero element in the field can be expressed as
a power of this element. The existence of this primitive element (of order 2^m − 1)
follows from the fact that the nonzero elements of GF(2^m) form a cyclic group.
Encoders and decoders for linear block codes over GF(2^m), such as Reed-Solomon
codes, require arithmetic operations in GF(2^m). In addition, decoders for some codes
over GF(2), such as BCH codes, require computations in extension fields GF(2^m). In
GF(2^m), addition and subtraction are simply bitwise exclusive-or. Multiplication can be
performed by several approaches, including bit serial, bit parallel (combinational), and
software. Division requires the reciprocal of the divisor, which can be computed in
hardware using several methods, including Euclid's algorithm, lookup tables,
exponentiation, and subfield representations. With the exception of division,
combinational circuits for Galois field arithmetic are straightforward. Fortunately, most
decoding algorithms can be modified so that only a few divisions are needed, so fast
methods for division are not essential.
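
To make these operations concrete, the following Python sketch (illustrative, not the thesis implementation) performs GF(2^8) addition as a bitwise XOR and multiplication as a shift-and-XOR product reduced modulo the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (bit pattern 0x11D), the polynomial used later in this work for the field GF(256):

    PRIM_POLY = 0x11D   # x^8 + x^4 + x^3 + x^2 + 1

    def gf_add(a, b):
        # Addition (and subtraction) in GF(2^m) is a bitwise XOR of the symbols.
        return a ^ b

    def gf_mul(a, b, prim=PRIM_POLY, m=8):
        # Carry-less multiplication, reduced modulo the primitive polynomial
        # whenever the intermediate result reaches degree m.
        result = 0
        while b:
            if b & 1:
                result ^= a
            b >>= 1
            a <<= 1
            if a & (1 << m):
                a ^= prim
        return result

    print(hex(gf_add(0x53, 0xCA)))   # 0x99 -- addition is just XOR
    print(hex(gf_mul(0x02, 0x80)))   # 0x1d -- x * x^7 = x^8, reduced by the polynomial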

3.3 PROPERTIES OF GALOIS FIELDS

1. In GF(2^m) fields, there is always a primitive element α such that every element
of GF(2^m) except zero can be expressed as a power of α [28]. Every field
GF(2^m) can be generated using a primitive polynomial over GF(2), and the
arithmetic performed in the GF(2^m) field is modulo this primitive polynomial.
2. If α is a primitive element of GF(2^m), its conjugates α^(2^i) are also primitive
elements of GF(2^m).
3. If α is an element of order n in GF(2^m), all its conjugates have the same order n.
4. If α, an element of GF(2^m), is a root of a polynomial f(x) over GF(2), then all
the distinct conjugates of α, also elements of GF(2^m), are roots of f(x).
5. The 2^m − 1 nonzero elements of GF(2^m) form all the roots of x^(2^m − 1) − 1 = 0.
The elements of GF(2^m) form all the roots of x^(2^m) − x = 0.

3.4 CONSTRUCTION OF GALOIS FIELDS

A Galois field GF(2^m) with primitive element α is generally represented as (0, 1,
α, α^2, …, α^(2^m − 2)). The simplest example of a finite field is the binary field
consisting of the elements (0, 1). Traditionally referred to as GF(2), the operations in
this field are defined as integer addition and multiplication reduced modulo 2. Larger
fields can be created by extending GF(2) into a vector space, leading to finite fields of size
2^m. These are simple extensions of the base field GF(2) over m dimensions. The field
GF(2^m) is thus defined as a field with 2^m elements, each of which is a binary m-tuple.
Using this definition, m bits of binary data can be grouped and referred to as an element
of GF(2^m). This in turn allows the associated mathematical operations of the
field to be applied to encode and decode data [10].
Let the primitive polynomial be φ(x), of degree m over GF(2). Any element of the field is
then obtained as a power of α reduced modulo φ(α); hence all the elements of this field can
be generated as powers of α. This is the polynomial representation of the field elements,
and it assumes the leading coefficient of φ(x) to be equal to 1. Figure 3.1 shows the finite
field generated by the primitive polynomial 1 + α^2 + α^3 + α^4 + α^8, represented as
GF(2^8) or GF(256).

Figure 3.1: Representation of some elements in GF( 2^8)

Note that here the primitive polynomial is of degree 8, and the numeric value assigned to α is
considered purely arbitrary. Using the irreducibility property of the polynomial φ(x), it
can be proven that this construction indeed produces a field.
The primitive polynomial is used in a simple iterative algorithm to generate all the
elements of the field. Hence different polynomials will generate different fields. In the
work [29], the author notes that even though there are numerous choices for the
irreducible polynomial, the fields constructed are all isomorphic.
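
The iterative construction can be made concrete with a short Python sketch (illustrative, not part of the thesis). It builds the power (antilog) and logarithm tables of GF(2^8) for the primitive polynomial above, written as the bit pattern 0x11D, and uses them for table-based multiplication:

    def build_gf_tables(prim=0x11D, m=8):
        # exp[i] holds alpha^i; log[alpha^i] holds i (the logarithm of 0 is undefined).
        size = (1 << m) - 1            # 2^m - 1 nonzero elements
        exp = [0] * size
        log = [0] * (1 << m)
        x = 1
        for i in range(size):
            exp[i] = x
            log[x] = i
            x <<= 1                    # multiply the current element by alpha
            if x & (1 << m):
                x ^= prim              # reduce modulo the primitive polynomial
        return exp, log

    exp_table, log_table = build_gf_tables()

    def gf_mul_table(a, b):
        # a * b = alpha^((log a + log b) mod (2^m - 1)); any factor of zero gives zero.
        if a == 0 or b == 0:
            return 0
        return exp_table[(log_table[a] + log_table[b]) % 255]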

3.5 GALOIS FIELD ARITHMETIC

Galois Field Arithmetic (GFA) is attractive in several ways. All the operations that
would result in overflow or underflow in traditional arithmetic get mapped onto a
value inside the field because of the modulo arithmetic followed in GFA; hence rounding
issues are also automatically eliminated.

A Galois field facilitates the representation of all elements with a finite-length binary
word. For example, in GF(2^m) all the operands are m bits wide, where m is always a number
smaller than the conventional bus width of 32 or 64. This in turn introduces a large
amount of parallelism into the GFA operations. Further, 'm' is assumed to be a multiple
of 8, or a power of 2, because of its inherent convenience for sub-word parallelism [30].
To study GFA, an introduction to the mathematical concepts of the trace and the dual
basis is necessary [31], [32].

Definition 1: The trace of an element β belonging to GF(2^m) is defined as
Tr(β) = β + β^2 + β^(2^2) + … + β^(2^(m−1)).

Definition 2: A basis {μj} in GF(2^m) is a set of m linearly independent elements of
GF(2^m), where 0 ≤ j ≤ m − 1.

Definition 3: Two bases {μj} and {λk} are the dual of one another if
Tr(μj λk) = 1 when j = k and Tr(μj λk) = 0 when j ≠ k.

CHAPTER 4

REED SOLOMON CODES

4.1 INTRODUCTION

Highly reliable data transmission and storage systems frequently use error
correction codes (ECC) to protect data. By adding a certain degree of redundancy,
these codes are able to detect and correct errors in the coded information. In the
design of highly reliable electronic systems, both the Reed-Solomon (RS) encoder
and decoder should be self-checking in order to avoid faults in these blocks
compromising the reliability of the whole system. In fact, a fault in the
encoder can produce an incorrect codeword, while a fault in the decoder can
give a wrong data word even if no errors occur in the codeword transmission.

Therefore, great attention must be paid to detecting and recovering from faults in the
encoding and decoding circuitry. Nowadays, the most widely used error correcting
codes are the RS codes, based on the properties of finite field arithmetic. In
particular, finite fields with 2^m elements are suitable for digital implementations
due to the isomorphism between the addition, performed modulo 2, and the XOR
operation between the bits representing the elements of the field.

The use of the XOR operation in addition and multiplication allows the use of
parity-check-based strategies to check for the presence of faults in the RS encoder,
while the implicit redundancy in the codeword is used both to correct erroneous
data and to detect faults inside the decoder block.

4.2 RS CODES BACKGROUND

In this section, a short background on RS codes is outlined; more information about
finite fields and RS codes can be found in the literature. The finite fields used in
digital implementations are of the form GF(2^m), where m represents the number of
bits of a symbol to be coded. An element a(x) ∈ GF(2^m) is a polynomial with
coefficients a_i ∈ {0, 1} and can be seen as a symbol of m bits.


The addition of two elements a(x) and b(x) ∈ GF(2^m) is the sum modulo 2 of the
coefficients a_i and b_i, i.e., it is the bitwise XOR of the two symbols a and b. The
multiplication of two elements a(x) and b(x) ∈ GF(2^m) requires the multiplication of
the two polynomials followed by reduction modulo i(x), where i(x) is an
irreducible polynomial of degree m. Multiplication can be implemented as an
AND-XOR network.

The RS(n, k) code is defined by representing the data word symbols as
elements of the field GF(2^m); the overall data word is treated as a polynomial
d(x) of degree k − 1 with coefficients in GF(2^m). The RS codeword is then generated
by using the generator polynomial g(x). All valid code words are exactly divisible
by g(x). The general form of g(x) is

g(x) = (x − α^(i0)) (x − α^(i0+1)) ⋯ (x − α^(i0+2t−1))

where 2t = n − k and α is a primitive element of the field.

The code words of a separable RS(n, k) code correspond to polynomials c(x)
of degree n − 1 that can be generated by using the following formulas:

c(x) = d(x) · x^(n−k) + p(x),   p(x) = d(x) · x^(n−k) mod g(x)

where p(x) is a polynomial of degree less than n − k representing the parity
symbols. In practice, the encoder takes k data symbols and adds 2t parity
symbols, obtaining an n-symbol codeword. The 2t parity symbols allow the
correction of up to t symbols containing errors in a codeword. Defining the
Hamming distance H(a(x), b(x)) of two polynomials a(x) and b(x) of degree n as the
number of coefficients of the same degree that are different, and
the Hamming weight W(a(x)) as the number of nonzero coefficients of a(x),
it is easy to prove that H(a(x), b(x)) = W(a(x) − b(x)).

In an RS(n, k) code the minimum Hamming distance between two code words is
n − k + 1. After the transmission of the coded data on a noisy channel, the decoder
receives as input a polynomial r(x) = c(x) + e(x), where e(x) is the error
polynomial. The RS decoder identifies the position and magnitude of up to t
errors and is able to correct them.

In other words, the decoder is able to identify the e(x) polynomial if the
Hamming weight W(e(x)) is not greater than t. The decoding algorithm provides
as output the only codeword having a Hamming distance not greater than t from
the received polynomial r(x).

METHODOLOGY AND PREVIOUS WORK

In this section, the motivations for the design methodology used for the
proposed implementations are described, starting from an overview of the
existing literature.

A radiation-tolerant RS encoder, hardened against space radiation effects
through circuit and layout techniques, has been presented in the literature. Single and
multiple parity bit schemes have also been proposed to check the correctness of addition
and multiplication in the polynomial basis representation of finite fields.
These techniques have been extended to detect faults occurring in
the RS encoder, achieving the self-checking property for the RS encoder
implementation. Moreover, a method to obtain CED circuits for finite field
multipliers and inverters has been proposed. Since both the RS encoder and
decoder are based on GF(2^m) addition, multiplication, and inversion, their
self-checking implementation can be obtained by using CED implementations of
these basic arithmetic operations.

Moreover, a self-checking algorithm for solving the key equation (that is, a
part of the overall decoding algorithm) has been introduced. By exploiting that
algorithm and substituting the elementary operations with the corresponding CED
implementations in the other parts of the decoding algorithm, a self-checking decoder
can be implemented. This approach can be used for the encoder, which uses only
addition and constant multiplication, as illustrated in the following subsection, but it is
unusable for the decoder, as described later in this work; a specific technique for the
decoder is explained in the subsequent section.

4.3 Characteristics of the RS Encoder

I. INTRODUCTION

Reed-Solomon error correcting codes (RS codes) are widely used in
communication systems and data storage to recover data from errors
that occur during transmission and from disc errors, respectively. One typical
application of the RS codes is the Forward Error Correction (FEC), shown in Fig.
1, in the optical transport network G.709, which has a transmission rate of 40 Gbps.

Before data transmission, the encoder attaches parity symbols to the data
using a predetermined algorithm. At the receiving side, the
decoder detects and corrects a limited, predetermined number of errors that occurred
during transmission. Transmitting the extra parity symbols requires extra
bandwidth compared to transmitting the pure data. However, transmitting the
additional symbols introduced by FEC is better than retransmitting the whole
package when at least one error has been detected by the receiver. Many
implementations of the RS codec are targeted at ASIC design and only a few
papers discuss synthesizing the RS codec for reconfigurable devices.
Implementing a Reed-Solomon codec on reconfigurable devices is attractive for
two main reasons. FPGAs provide flexibility, where the algorithm parameters can
be altered to provide different error correction capabilities. They also provide a
rapid development cycle, resulting in a short time to market, which is a major
factor in industry.

The objective of this work is to implement a generic Reed-Solomon VHDL
code to measure the performance of the RS codec on Altera's Stratix II. The
performance of the implemented RS codec will be compared to the performance
of Altera's RS codec. The performance metrics to be used are the area occupied
by the design and the speed at which the design can run. The Reed-Solomon
code to be implemented is RS(255, 223). This project covers the theory behind
the Reed-Solomon code, the architecture of the implemented RS codec, the
preliminary results and future work extending the current research.

II. REED-SOLOMON THEORY

A Reed-Solomon code is a block code and can be specified as RS(n, k), as
shown in Fig. 2. The variable n is the size of the codeword in symbols,
k is the number of data symbols and 2t is the number of parity symbols.
Each symbol contains s bits.

The relationship between the symbol size, s, and the size of the
codeword, n, is given by (1). This means that if there are s bits in one symbol,
there exist 2^s − 1 distinct nonzero symbol values (excluding the all-zero symbol),
which is also the codeword length in symbols.

n = 2^s − 1 (1)

The RS code allows correcting up to t symbol errors, where t is given by

t = (n − k) / 2 (2)

A. Galois Field

The Reed-Solomon code is defined over the Galois field, which contains a
finite set of numbers where any arithmetic operation on elements of that set
results in an element belonging to the same set. Every element, except zero, can
be expressed as a power of a primitive element of the field. The nonzero field
elements form a cyclic group defined by a binary primitive polynomial.
Addition of two elements in the Galois field is simply the exclusive-OR (XOR)
operation. However, multiplication in the Galois field is more complex than
standard arithmetic: it is multiplication modulo the primitive polynomial used
to define the Galois field. For example, the Galois field GF(8) is constructed with
the primitive polynomial p(z) = z^3 + z + 1, based on the primitive element α = z.

GF(8) will be the basis for the Reed-Solomon code RS(7, 3). Because
each symbol has log2(8) = 3 bits, the variables for the RS code are
n = 2^3 − 1 = 7 and k = 3 (arbitrarily chosen to balance the number of information
and parity symbols in one codeword), so that t = (n − k)/2 = 2. A stream of data
symbols can then be represented as a polynomial with coefficients in GF(8).
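
As a quick illustration, the table-building sketch from Chapter 3 can be reused with this polynomial, written as the bit pattern 0b1011, to enumerate the seven nonzero elements of GF(8) as powers of α (illustrative only):

    # Reusing build_gf_tables from the Chapter 3 sketch with p(z) = z^3 + z + 1.
    exp8, log8 = build_gf_tables(prim=0b1011, m=3)
    print(exp8)   # [1, 2, 4, 3, 6, 7, 5]: alpha^0 .. alpha^6 as 3-bit patterns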

B. Encoder

The transmitted codeword is systematically encoded and defined in (3) as
a function of the transmitted message m(x), the generator polynomial g(x) and
the number of parity symbols 2t.

c(x) = m(x) · x^(2t) + (m(x) · x^(2t) mod g(x)) (3)

where g(x) is the generator polynomial of degree 2t, formed as the product of 2t
factors (x + α^i) over consecutive powers of the primitive element α (4). The
variable α is a root of the binary primitive polynomial of degree s. In
OTN G.709, the binary primitive polynomial is defined as

x^8 + x^4 + x^3 + x^2 + 1.
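
The following Python sketch (illustrative only, not the thesis VHDL) performs this systematic encoding as a polynomial long division. It reuses exp_table and gf_mul_table from the Chapter 3 sketches; the choice of consecutive generator roots starting at α^0 is an assumption made for illustration, since the starting exponent differs between standards:

    def gf_poly_mul(p, q):
        # Multiply two polynomials with GF(2^8) coefficients (highest degree first).
        r = [0] * (len(p) + len(q) - 1)
        for i, pi in enumerate(p):
            for j, qj in enumerate(q):
                r[i + j] ^= gf_mul_table(pi, qj)
        return r

    def rs_generator_poly(nsym):
        # g(x) = (x + alpha^0)(x + alpha^1) ... (x + alpha^(nsym-1)).
        g = [1]
        for i in range(nsym):
            g = gf_poly_mul(g, [1, exp_table[i]])
        return g

    def rs_encode(msg, nsym):
        # Systematic encoding: append the remainder of msg(x) * x^nsym divided by g(x).
        gen = rs_generator_poly(nsym)
        rem = list(msg) + [0] * nsym
        for i in range(len(msg)):
            coef = rem[i]
            if coef != 0:
                for j in range(1, len(gen)):
                    rem[i + j] ^= gf_mul_table(gen[j], coef)
        return list(msg) + rem[-nsym:]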

C. Decoder
After going through a noisy transmission channel, the encoded data can
be represented as

r(x) = c(x) + e(x) (5)

where e(x) represents the error polynomial, of the same degree as c(x)
and r(x). Once the decoder evaluates e(x), the transmitted codeword, c(x), is
recovered by adding the received message, r(x), to the error polynomial, e(x), as
shown in Equation (6).

c(x) = r(x) + e(x) = c(x) + e(x) + e(x) = c(x) (6)

Note that e(x) + e(x) = 0 because addition in a Galois field is equivalent to an
exclusive-OR, and e(x) XOR e(x) = 0.

Five functional blocks form the decoder:
1) Syndrome Calculator: In this block, errors are detected by calculating
the syndrome polynomial S(x), as shown in (7); its coefficients S_i are obtained
by evaluating the received polynomial r(x) at the 2t roots α^i of the generator
polynomial. S(x) is used by the Key Equation functional block. When S(x) = 0,
the received codeword is error free. Otherwise, the Key Equation Solver will use
S(x) to generate the error locator polynomial, Λ(x), and the error evaluator
polynomial, Ω(x).
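
A minimal Python sketch of the syndrome computation (illustrative; it reuses exp_table and gf_mul_table from the Chapter 3 sketches and assumes the same generator roots α^0 … α^(2t−1) as the encoder sketch):

    def rs_syndromes(received, nsym):
        # S_i = r(alpha^i), evaluated with Horner's rule for i = 0 .. nsym-1.
        synd = []
        for i in range(nsym):
            x = exp_table[i]              # alpha^i
            s = 0
            for coef in received:         # coefficients, highest degree first
                s = gf_mul_table(s, x) ^ coef
            synd.append(s)
        return synd
    # All-zero syndromes indicate an error-free (or undetectably corrupted) word.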

2) Key Equation Solver: The Key Equation describes the relationship between the
syndrome polynomial S(x), the error locator polynomial Λ(x) and the error evaluator
polynomial Ω(x): Λ(x)S(x) ≡ Ω(x) (mod x^(2t)). Solving it gives the error locator
polynomial Λ(x) and the error evaluator polynomial Ω(x), which can be represented
in general polynomial form.

The error locator polynomial Λ(x) has degree e ≤ t, where e is the number of errors.
The error evaluator polynomial Ω(x) has degree at most e − 1 and is used to determine
the magnitudes of the e errors.
Different algorithms have been used to solve the key equation; two common ones are
the Euclidean algorithm and the Berlekamp-Massey (BM) algorithm, shown in Fig. 3
and Fig. 4, respectively. A glance at the flowcharts reveals that the Euclidean algorithm
has a simpler structure than the BM algorithm. However, it needs a significant number
of logic elements to implement the polynomial division function. The BM algorithm,
on the other hand, has a more complex structure but requires fewer gates to implement.
In this research project, the BM algorithm is chosen for implementation because of its
lower hardware utilization in solving the key equation.

3) Error Locator: The locations of the errors are determined from the error
locator polynomial Λ(x). Each nonzero field element is substituted into Λ(x) in turn;
the elements at which Λ(x) evaluates to zero identify the positions of the erroneous
symbols in the codeword. This exhaustive evaluation is known as the Chien search
algorithm.
4) Error Evaluator: The magnitudes of the errors are determined from the error
evaluator polynomial Ω(x); for every symbol position flagged as erroneous, the
corresponding error value is computed using Forney's method, described below.

4.4 DESIGN AND IMPLEMENTATION

A. Encoder
The encoder is architected using the Linear Feedback Shift Register (LFSR)
design. The multiplier coefficients of the LFSR are the coefficients of the
generator polynomial g(x), given in (12).
g(x) = x^16 + 59x^15 + 13x^14 + 104x^13 + 189x^12 + 68x^11 + 209x^10 + 30x^9 + 8x^8
+ 163x^7 + 65x^6 + 41x^5 + 229x^4 + 98x^3 + 50x^2 + 36x + 59 (12)

Each message is accompanied by a pulse signal, which indicates the
beginning of a message. After 239 clock cycles, the encoder starts concatenating
the 16 calculated parity symbols to the message to make a codeword of 255 symbols.
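
Behaviorally this is the same operation as the rs_encode sketch given earlier; a quick, purely illustrative check of the systematic structure for a 239-symbol message:

    message = [i % 256 for i in range(239)]       # arbitrary 239 data symbols
    codeword = rs_encode(message, nsym=16)        # append 16 parity symbols
    assert len(codeword) == 255 and codeword[:239] == message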

B. Decoder

The high-level architecture of the decoding data path is shown in Fig. 6.
The decoder first calculates the syndrome of the received codeword to detect
any potential errors that occurred during transmission. If the syndrome polynomial,
S(x), is not zero, the received codeword is erroneous and will be corrected,
provided the number of erroneous symbols does not exceed eight.

1) Syndrome Calculator: The syndrome calculator takes in codeword after codeword at a
rate of 1 symbol per clock cycle. The i_start signal indicates the beginning of each
codeword. The syndrome architecture is shown in Fig. 7.

The syndrome coefficients are obtained by evaluating (13). After 255 clock cycles,
S(x) is ready to be processed by the Key Equation Solver.

2) Key Equation Solver: The Key Equation Solver waits on the i_start signal
before capturing the syndrome polynomial, S(x). The Berlekamp-Massey algorithm is
implemented using a state machine design, as shown in Fig. 8.

The state machine is designed from the BM flowchart of Fig. 4. The state
machine is initialized each time a syndrome polynomial S(x) is ready to be processed
by the Key Equation Solver, and it generates the error locator polynomial Λ(x).
Once Λ(x) is found, the error evaluator polynomial Ω(x) is calculated using (14).

3) Chien Search Error Location: The Chien search architecture is shown in Fig. 9.

The Chien and Forney algorithms for calculating the error locations and values
iterate over all 255 symbol positions of the codeword: for each position i, the error
locator polynomial is evaluated at the corresponding field element, and if it
evaluates to zero the symbol at that position is corrected using the error value
given by Forney's formula.

The Chien algorithm calculates the locations of the erroneous symbols in
each codeword. In this process, cc_i and r_i represent the i-th symbol of the
corrected codeword and of the received word, respectively; Λ'(x), the formal
derivative of the error locator polynomial, is also required by Forney's formula.
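
A brute-force software sketch of the Chien-style search (illustrative; the mapping between the roots of Λ(x) and the codeword positions depends on the chosen convention, so the indexing below is an assumption):

    def chien_search(sigma, n=255):
        # Evaluate the error-locator polynomial (coefficients highest degree first)
        # at alpha^(-i) for every position i; return the positions where it is zero.
        positions = []
        for i in range(n):
            x = exp_table[(255 - i) % 255]   # alpha^(-i) in GF(2^8)
            val = 0
            for coef in sigma:               # Horner's rule
                val = gf_mul_table(val, x) ^ coef
            if val == 0:
                positions.append(i)
        return positions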

4) Forney’s method for error values:

Fig. 10 shows the architecture of the Forney error evaluator. It implements
part of the algorithm described above. An "INVERSE ROM" is implemented as a
look-up table to store the inverses of the Galois field elements, since current
state-of-the-art reconfigurable devices have ample resources for look-up tables.

4.5 SELF-CHECKING RS ENCODER

The implementation of RS encoders is usually based on an LFSR, which
implements the polynomial division over the finite field. In Fig. 1, the
implementation of an RS encoder is shown. The additions and multiplications are
performed in GF(2^m) and the g_i are the coefficients of the generator polynomial g(x).
The RS encoder architecture is composed of slice blocks, each containing a constant
multiplier, an adder, and a register (see the shaded block in Fig. 1). The number of
slices to implement for an RS(n, k) code is n − k. The self-checking
implementation requires the insertion of some parity prediction blocks and a
parity checker. The correctness of each slice is checked by using the architecture
shown in Fig. 2.

The input and output signals to the slice are as follows.
• Ain is the registered output of the previous slice.
• Pin is the registered parity of the previous slice.
• Fin is the feed-back of the LFSR.
• PFin is the parity of the feed-back input.
• Aout is the result of the multiplication and addition operation.
• Pout is the predicted parity of the result.

The parity prediction block is implemented by using (5). It must be noticed
that some constraints in the implementation of the constant multiplier must be
added in order to avoid interference between different outputs when a fault
occurs. These interferences are due to the sharing of intermediate results
between different outputs and, therefore, can be avoided by using networks with
fan-out equal to one; considering the field-programmable gate array (FPGA)
implementation of the constant multiplier, this constraint is not a serious drawback.
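
The idea behind the parity prediction can be illustrated in software (a hedged sketch, not the thesis circuit; function names are hypothetical). Because multiplication by a constant g is a linear map on the bits of the input symbol, the parity of the product a·g is the XOR of a fixed subset of the bits of a, and that subset can be computed once off-line:

    def parity(x):
        # XOR of all bits of an integer.
        return bin(x).count("1") & 1

    def parity_prediction_mask(g, m=8):
        # Bit i of the mask is set iff input bit a_i flips the parity of a*g.
        mask = 0
        for i in range(m):
            if parity(gf_mul_table(1 << i, g)):
                mask |= 1 << i
        return mask

    # Sanity check for an arbitrary constant coefficient g: the parity predicted
    # from the input alone matches the parity of the actual product for every input.
    g = 0x1D
    mask = parity_prediction_mask(g)
    assert all(parity(a & mask) == parity(gf_mul_table(a, g)) for a in range(256))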

In fact, each output bit is computed by implementing an XOR network
requiring a very limited number of LUTs: for example, considering the field
GF(2^8) and an FPGA based on four-input LUTs, three LUTs are required in the
worst case. Table I reports the overhead introduced for different constants g_i
without resource sharing in the case of GF(2^8).
The predicted parity bit and the outputs of each slice are evaluated by the
parity checker block, as shown in Fig. 3, and an error indicator signals whether a
difference between the predicted parity bit and the parity of the m slice outputs
is detected.

The parity checker block checks whether the parity of its inputs is even or odd.
The self-checking implementation of the parity checker is realized with a two-rail
circuit. The two outputs are each equal to the parity of one of two disjoint
subsets of the inputs, as proposed in the literature. The fault-free behavior of the
checker, when a correct set of inputs is provided (i.e., no faults occur in the slices),
is the following: the output codes 01 or 10 are generated for an odd parity checker,
or the output codes 00 or 11 for an even parity checker. If the checker receives as
input an erroneous codeword (i.e., a fault occurs in a slice), the checker provides
the output codes 11 or 00 for an odd parity checker, or the output codes 01 or 10
for an even parity checker.

Also, if a fault occurs in the checker itself, the outputs provided are 11 or 00 for
an odd parity checker, or 01 or 10 for an even parity checker.
These considerations guarantee the self-checking property of the checker. It can
be noticed that, due to the LFSR-based structure of the RS encoder, there are no
control state machines to be protected against faults.
Therefore, the use of the described self-checking arithmetic structures
allows checking the entire RS encoder. The evaluation in terms of area and
delay of this structure has been carried out using a Xilinx Virtex-II FPGA as
the target device, and the design flow has been performed using the Xilinx
ISE Foundation framework.

Table I reports the area of each of the blocks described in this section.
The adder is implemented by using one LUT for each output, while the area of
the constant multipliers and of the parity prediction block depends on the
coefficients g_i. In Table I, the row named "additional logic" represents the logic
added to the slice in order to predict the parity bit. The number of LUTs required
to implement the parity checker depends on the number of slices of the encoder,
i.e., the number n − k of check symbols of the RS code.

In particular, implementing the parity checker as a network of XOR gates,
the number of required LUTs is ⌈(n − k)(m + 1)/3⌉. Starting from the results shown in
Table I, the area overhead has been computed for the given case. The overhead
is about 50% and is independent of the number of check symbols (n − k). In fact, for
each check symbol (m = 8) the overhead of a single slice is about six LUTs,
plus the overhead due to the parity checker (three LUTs). The overall overhead is,
therefore, the sum of the per-slice additional logic and of the parity checker
contribution, divided by the area of the unprotected encoder.

The characterization of the critical path is different for each slice,
depending on the complexity of the constant multiplier gi. In the worst case, the
constant multiplier gi is implemented by an eight-input XOR network requiring three
LUTs; therefore, in the worst-case path five LUTs are crossed.

In order to compute the critical path of the overall self-checking encoder
architecture, the following additional signal paths must be considered:
• the path crossing the parity prediction block, which is comparable with the path of the
worst-case constant multiplier;
• the path crossing the parity checker. This path depends on the number of bits
provided as input to the checker; in fact, the number of LUT levels of the
four-input XOR network is ⌈log4((n − k)(m + 1))⌉. For example, with n − k = 16 and
m = 8, ⌈log4(16 × 9)⌉ = 4 levels are needed.
The number of levels of the two-rail parity checker increases very slowly
with the number of check symbols and, therefore, does not limit the maximum
frequency of the self-checking encoder.

4.6 CONCURRENT ERROR DETECTION SCHEME OF THE RS DECODER

In Fig. 4, the CED implementation of the RS decoder is shown. Its main


blocks are as follows.

 RS decoder, i.e., the block to be checked.
 An optional error polynomial recovery block (the shaded block shown in
Fig. 4). This block is needed if the RS decoder does not provide the error
polynomial coefficients at its output.
 Hamming weight counter, which counts the number of nonzero coefficients
of the error polynomial.
 Codeword checker, which checks whether the output data of the RS decoder form
a correct codeword.
 Error detection block, which takes as inputs the outputs of the Hamming weight
counter and of the codeword checker and provides an error detection signal if
a fault in the RS decoder has been detected.

The RS decoder can be considered as a black box performing an
algorithm for the error detection and correction of the input data (the coefficients
of the received data forming the polynomial c(x)). The error polynomial recovery
block is composed of a shift register of length L (the latency of the decoder)
and of a GF(2^m) adder having as operands the coefficients of c(x) and of the
decoded codeword ĉ(x).
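
A minimal Verilog sketch of this recovery block is given below; the port names and the default latency value are assumptions for the example.

// Sketch of the error polynomial recovery block: the received coefficients
// are delayed by the decoder latency L and added (XOR) to the decoded
// coefficients, giving the coefficients of e(x).
module err_poly_recovery #(parameter M = 8, parameter L = 16) (
    input          clk,
    input  [M-1:0] c_in,     // received coefficient (decoder input)
    input  [M-1:0] c_dec,    // decoded coefficient (decoder output)
    output [M-1:0] e_coeff   // recovered coefficient of e(x)
);
    reg [M-1:0] delay_line [0:L-1];
    integer i;

    always @(posedge clk) begin
        delay_line[0] <= c_in;
        for (i = 1; i < L; i = i + 1)
            delay_line[i] <= delay_line[i-1];
    end

    // GF(2^m) addition is a bit-wise XOR.
    assign e_coeff = delay_line[L-1] ^ c_dec;
endmodule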

The Hamming weight counter is composed of the following (a minimal sketch is given after the list):

 A comparator indicating (at each clock cycle) whether the current e(x) coefficient is
zero;
 A counter that counts the number of nonzero coefficients;
 A comparator between the counter output and t, the maximum
allowed number of nonzero elements.
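
The following minimal Verilog sketch illustrates such a counter; the symbol width, the threshold parameter, and the port names are assumptions for the example.

// Sketch of the Hamming weight counter: counts the nonzero coefficients of
// e(x) as they are produced and flags a fault when the count exceeds t.
module hamming_weight_counter #(parameter M = 8, parameter T = 8) (
    input             clk,
    input             rst,          // asserted at the start of each codeword
    input             coeff_valid,
    input  [M-1:0]    e_coeff,      // current coefficient of e(x)
    output reg [15:0] weight,       // number of nonzero coefficients so far
    output            too_many      // high if weight exceeds the threshold t
);
    always @(posedge clk) begin
        if (rst)
            weight <= 16'd0;
        else if (coeff_valid && (e_coeff != {M{1'b0}}))
            weight <= weight + 16'd1;
    end

    assign too_many = (weight > T);
endmodule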

The codeword checker block checks whether the reconstructed polynomial ĉ(x) is a
codeword, i.e., whether it is exactly divisible by the generator polynomial g(x). The
following two implementations of this block are proposed.

Implementation 1:

It is based on the computation of the remainder of the polynomial division
of ĉ(x) by g(x). If all the coefficients of the remainder polynomial are zero,
then the polynomial ĉ(x) is a correct codeword. The remainder of the division by
g(x) is exactly the function computed by the systematic RS encoder.

Therefore, a systematic RS encoder with the same g(x) polynomial as the
decoder can be used to check whether ĉ(x) is a codeword. Faults in the decoder can
thus be detected without knowing g(x) and without knowing how the operations
in GF(2^m) are performed.

We only need to reuse the same RS encoder used to create the codeword for the
computation of the remainder of the ĉ(x) obtained from the decoder. The drawback of
this implementation is the additional latency introduced by the RS encoder, which
is n − k clock cycles. This latency must be considered by the error detection block,
which must wait n − k clock cycles before checking the two properties defined previously.
The area occupation of the RS encoder is smaller than that of the decoder;
therefore, the overhead introduced by this block is about 15% of the
decoder area.

Implementation 2:

The codeword checker block is based on the so-called syndrome
calculation. This operation is the first to be performed in the decoder; therefore,
conceptually, this approach implies a partial duplication of the RS decoder and
requires knowledge of the used Galois field and of the roots of g(x). The
syndrome calculation implies the evaluation of the reconstructed polynomial ĉ(x) for the
values of x in the set A of the roots of g(x), i.e., A = {α^i : g(α^i) = 0}. The
polynomial ĉ(x) is exactly divisible by g(x) if and only if it is exactly divisible by
all the monomials (x − α^i), where α^i is a root of g(x); and ĉ(x) is divisible by
(x − α^i) if and only if ĉ(α^i) is zero.

Therefore, the reconstructed polynomial is a codeword if and only if all the
computed syndromes are zero. Each syndrome computation block is composed
of a GF(2^m) constant multiplier, an adder, and an m-bit register. The output of
this block is valid one clock cycle after the computation of the last coefficient
of the polynomial. The area occupation of the syndrome calculation block is
equivalent to the encoder area occupation: in fact, in both cases we need n − k
blocks composed of an adder, a constant multiplier, and an m-bit register.
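
As an illustration, a minimal Verilog sketch of one syndrome cell is given below; it evaluates the reconstructed codeword at one root of g(x) by Horner's rule. The assumed primitive polynomial (the same one used in the earlier encoder sketch), the default root α, and the port names are illustrative choices.

// Sketch of one syndrome cell: it evaluates the reconstructed codeword at
// one root of g(x) by Horner's rule as the coefficients arrive, highest
// degree first. One cell is instantiated for each of the n-k roots.
module syndrome_cell #(parameter [7:0] ALPHA_I = 8'h02) (
    input            clk,
    input            rst,         // clears the accumulator between codewords
    input            coeff_valid,
    input      [7:0] c_coeff,     // current coefficient of the codeword
    output reg [7:0] syndrome     // all-zero for a valid codeword
);
    // General GF(2^8) multiplication by shift-and-add, assuming the
    // primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (1Dh after the MSB).
    function [7:0] gf_mul(input [7:0] a, input [7:0] b);
        integer i;
        reg [7:0] acc, aa;
        begin
            acc = 8'h00;
            aa  = a;
            for (i = 0; i < 8; i = i + 1) begin
                if (b[i]) acc = acc ^ aa;                        // add a*x^i if b_i = 1
                aa = {aa[6:0], 1'b0} ^ (aa[7] ? 8'h1D : 8'h00);  // aa = aa * x
            end
            gf_mul = acc;
        end
    endfunction

    always @(posedge clk) begin
        if (rst)
            syndrome <= 8'h00;
        else if (coeff_valid)
            // Horner step: acc = acc * alpha_i + next coefficient.
            syndrome <= gf_mul(syndrome, ALPHA_I) ^ c_coeff;
    end
endmodule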

The main difference between implementations 1 and 2 is the latency of the
codeword checker block. The error detection block takes as inputs the outputs of
the Hamming weight counter and of the codeword checker. Its
implementation depends on the chosen implementation of the codeword
checker. If implementation 1 is used, the error detection block must delay the
output of the Hamming weight counter by n − k clock cycles and then check whether
all the coefficients of the remainder polynomial are zero.

On the other hand, if the syndrome calculation block is used, the inputs
are the computed syndromes and the error detection block checks whether all of
them are zero. The additional blocks used to detect faults inside the
decoder are themselves susceptible to faults and, therefore, their implementation must
assure the self-checking property, in order to face the age-old question of “who
checks the checker.” For the codeword checker and the error polynomial
recovery blocks, only registers and GF(2^m) additions and constant multiplications
are used; therefore, the same considerations made for the encoder can be applied
to obtain the self-checking property of these blocks. For the counters and the
comparators used in the Hamming weight counter and error detection blocks,
many efficient techniques can be found in the literature.

CHAPTER 5
Field Programmable Gate Array (FPGA)

5.1 History of FPGA

The historical roots of FPGAs are in complex programmable logic devices


(CPLDs) of the early to mid 1980s. A Xilinx co-founder invented the field
programmable gate array in 1984. CPLDs and FPGAs include a relatively large
number of programmable logic elements. CPLD logic gate densities range from
the equivalent of several thousand to tens of thousands of logic gates, while
FPGAs typically range from tens of thousands to several million.

The primary differences between CPLDs and FPGAs are architectural. A


CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number of
clocked registers. The result of this is less flexibility, with the advantage of more
predictable timing delays and
a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand,
are dominated by interconnect. This makes them far more flexible (in terms of the
range of designs that are practical for implementation within them) but also far
more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in


most FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories. Some FPGAs have the capability of partial re-
configuration that lets one portion of the device be re-programmed while other
portions continue running.

5.2 Basic concepts of FPGA

An FPGA is a device that contains a matrix of reconfigurable gate array logic
circuitry. The programmable logic components can be programmed to duplicate
the functionality of basic logic gates such as AND, OR, XOR, and NOT, or more
complex combinational functions such as decoders or simple mathematical
functions. Most FPGAs include memory elements in these programmable logic
components, which consist of simple flip-flops or more complete blocks of
memory. When an FPGA is configured, the internal circuitry is connected in a way
that creates a hardware implementation of the software application. Unlike
processors, FPGAs use dedicated hardware for processing logic and do not
require an operating system.

FPGAs consist of three components:

 an array of programmable logic blocks, containing look-up tables (LUTs), registers, and multiplexers
 programmable interconnect
 I/O blocks around the perimeter

Block Diagram of FPGA

The performance of an application is not affected when additional processing is
added to the FPGA, since the logic is parallel in nature and does not have to compete
for the same resources. An FPGA can enforce critical interlock logic and can be
designed to prevent I/O forcing by an operator. Unlike hardwired printed circuit
board (PCB) designs, which have fixed and limited hardware resources, an FPGA-based
system can literally rewire its internal circuitry to allow reconfiguration after
the control system is deployed in the field.

FPGA devices deliver the performance and reliability of dedicated
hardware circuitry. Thousands of discrete components can be replaced by
a single FPGA, which incorporates millions of logic gates in a single integrated
circuit. The internal resources of an FPGA chip consist of a matrix of
configurable logic blocks (CLBs) connected to a periphery of I/O blocks. Signals are
routed within the FPGA matrix by programmable interconnect switches and wire routes.

5.3 FPGA Advantage

FPGAs have many advantages. One of them is a unified design flow that
allows designing for any silicon target, vendor, and language. The silicon targets
include PLDs, platform FPGAs, structured ASICs, ASIC prototypes, ASICs, and SoCs.
Many vendors are available, such as Altera, Xilinx, Actel, Atmel, ChipExpress, and
Lattice. Many languages can be used to design for an FPGA, such as VHDL, Verilog,
SystemVerilog, C, C++, PSL, and SVA.

FPGAs are well suited for large designs and are also reconfigurable. When
creating designs, simple VHDL or Verilog constructs can be used to build a
complex FPGA design. Moreover, FPGA tool flows deliver a technical edge, such as
optimizing FPGA timing closure with Precision Synthesis, advanced timing analysis,
and timing closure with I/O optimization and PCB integration. FPGAs can also
streamline the design process, roughly halving the design time thanks to a rapid
development flow.

5.4 Language Used in FPGA

For a long time, programming languages such as FORTRAN, Pascal, and


C were being used to describe computer programs that were sequential in
nature. Similarly, in the digital design field, designers felt the need for a standard
language to describe digital circuits. Thus, Hardware Description Languages
(HDLs) came into existence. HDLs allowed the designers to model the
concurrency of processes found in hardware elements. Hardware description
languages such as Verilog HDL and VHDL became popular. Verilog HDL
originated in 1983 at Gateway Design Automation. Later, VHDL was developed
under contract from DARPA. Simulators for both Verilog® and VHDL, capable of
simulating large digital circuits, quickly gained acceptance from designers.

Even though HDLs were popular for logic verification, designers had to
manually translate the HDL-based design into a schematic circuit with
interconnections between gates. The advent of logic synthesis in the late 1980s
changed the design methodology radically. Digital circuits could be described at
a register transfer level (RTL) by use of an HDL. Thus, the designer had to
specify how the data flows between registers and how the design processes the
data. The details of gates and their interconnections to implement the circuit were
automatically extracted by logic synthesis tools from the RTL description.

Thus, logic synthesis pushed the HDLs into the forefront of digital design.
Designers no longer had to manually place gates to build digital circuits. They
could describe complex circuits at an abstract level in terms of functionality and
data flow by designing those circuits in HDLs. Logic synthesis tools would
implement the specified functionality in terms of gates and gate interconnections.

HDLs also began to be used for system-level design. HDLs were used for
simulation of system boards, interconnect buses, FPGAs (Field Programmable
Gate Arrays), and PALs (Programmable Array Logic). A common approach is to

design each IC chip, using an HDL, and then verify system functionality via
simulation.

Today, Verilog HDL is an accepted IEEE standard. In 1995, the original


standard IEEE 1364-1995 was approved. IEEE 1364-2001 is the latest Verilog
HDL standard that made significant improvements to the original standard.

5.5 Importance of HDLs


HDLs have many advantages compared to traditional schematic-based design.

 Designs can be described at a very abstract level by use of HDLs.


Designers can write their RTL description without choosing a specific
fabrication technology. Logic synthesis tools can automatically convert the
design to any fabrication technology. If a new technology emerges,
designers do not need to redesign their circuit. They simply input the RTL
description to the logic synthesis tool and create a new gate-level netlist,
using the new fabrication technology. The logic synthesis tool will optimize
the circuit in area and timing for the new technology.
 By describing designs in HDLs, functional verification of the design can be
done early in the design cycle. Since designers work at the RTL level, they
can optimize and modify the RTL description until it meets the desired
functionality. Most design bugs are eliminated at this point. This cuts down
design cycle time significantly because the probability of hitting a
functional bug at a later time in the gate-level netlist or physical layout is
minimized.

 Designing with HDLs is analogous to computer programming. A textual


description with comments is an easier way to develop and debug circuits.
This also provides a concise representation of the design, compared to
gate-level schematics. Gate-level schematics are almost incomprehensible
for very complex designs.

CHAPTER 6

INTRODUCTION TO VERILOG

6.1 Introduction
Verilog is a hardware description language (HDL). A hardware
description language is a language used to describe a digital system, for
example, a microprocessor, a memory, or a simple flip-flop. This just means
that, by using an HDL, one can describe any digital hardware at any level. Verilog
is one of the HDLs available in the industry for designing hardware.
Verilog allows us to describe a digital design at the behavioral level, register transfer
level (RTL), gate level, and switch level. Verilog allows hardware designers to
express their designs with behavioral constructs, deferring the details of
implementation to a later stage of the design.

Verilog supports a design at many different levels of abstraction. Three of


them are very important:
 Behavioral level
 Register-Transfer Level
 Gate Level

6.2 Behavioral level

This level describes a system by concurrent algorithms (behavioral description). Each
algorithm itself is sequential, which means it consists of a set of instructions that
are executed one after the other. Functions, tasks, and always blocks are the
main elements. There is no regard to the structural realization of the design.

6.3 Register-Transfer Level

Designs at the register-transfer level specify the characteristics of a
circuit in terms of operations and the transfer of data between registers. An explicit
clock is used. An RTL design contains exact timing information: operations are
scheduled to occur at specific times.
6.4 Gate Level

At the gate level, the characteristics of a system are described by
logical links and their timing properties. All signals are discrete signals; they can
only have definite logical values (`0', `1', `X', `Z`). The usable operations are
predefined logic primitives (AND, OR, NOT, and other gates). Writing gate-level code
by hand is usually not a good idea for logic design: gate-level code is
generated by tools such as synthesis tools, and this netlist is used for gate-level
simulation and for the back end.
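
As a small illustration of these abstraction levels, the following sketch describes the same 2-to-1 multiplexer twice: once with a behavioral/RTL construct and once with gate-level primitives (the module and signal names are chosen for the example).

// Behavioral / RTL description: what the circuit does.
module mux2_rtl (input a, input b, input sel, output reg y);
    always @(*) begin
        if (sel) y = b;
        else     y = a;
    end
endmodule

// Gate-level description: how it is built from primitives.
module mux2_gate (input a, input b, input sel, output y);
    wire nsel, w0, w1;
    not g0 (nsel, sel);
    and g1 (w0, a, nsel);
    and g2 (w1, b, sel);
    or  g3 (y, w0, w1);
endmodule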

6.5 History of Verilog

Verilog was started initially as a proprietary hardware modeling language


by Gateway Design Automation Inc. around 1984. It is rumored that the original
language was designed by taking features from the most popular HDL language
of the time, called HiLo as well as from traditional computer language such as C.
At that time, Verilog was not standardized and the language modified itself in
almost all the revisions that came out within 1984 to 1990.

The Verilog simulator was first used beginning in 1985 and was extended
substantially through 1987. The implementation was the Verilog simulator sold by
Gateway. The first major extension was Verilog-XL, which added a few features
and implemented the infamous "XL algorithm," a very efficient method
for doing gate-level simulation.

The time was late 1990. Cadence Design Systems, whose primary products
at that time included a thin-film process simulator, decided to acquire Gateway
Design Automation. Along with the other Gateway products, Cadence now became
the owner of the Verilog language, and continued to market Verilog as both a
language and a simulator.
At the same time, Synopsys was marketing the top-down design
methodology, using Verilog. This was a powerful combination. In 1990, Cadence
recognized that if Verilog remained a closed language, the pressures of
standardization would eventually cause the industry to shift to VHDL.
Consequently, Cadence organized Open Verilog International (OVI), and in 1991
gave it the documentation for the Verilog Hardware Description Language.

This was the event which "opened" the language. OVI did a considerable
amount of work to improve the Language Reference Manual (LRM), clarifying
things and making the language specification as vendor-independent as
possible. Soon it was realized that if there were too many companies in the
market for Verilog, potentially everybody would like to do what Gateway had done
so far, i.e., change the language for their own benefit. This would defeat the main
purpose of releasing the language to the public domain.

As a result in 1994, the IEEE 1364 working group was formed to turn the
OVI LRM into an IEEE standard. This effort was concluded with a successful
ballot in 1995, and Verilog became an IEEE standard in December, 1995. When
Cadence gave OVI the LRM, several companies began working on Verilog
simulators. In 1992, the first of these were announced, and by 1993 there were
several Verilog simulators available from companies other than Cadence. The
most successful of these was VCS, the Verilog Compiled Simulator, from
Chronologic Simulation. This was a true compiler as opposed to an interpreter,
which is what Verilog-XL was. As a result, compile time was substantial, but
simulation execution speed was much faster. In the meantime, the popularity of
Verilog and PLI was rising exponentially.

Verilog as a HDL found more admirers than well-formed and federally
funded VHDL. It was only a matter of time before people in OVI realized the need
of a more universally accepted standard. Accordingly, the board of directors of
OVI requested IEEE to form a working committee for establishing Verilog as an
IEEE standard.

The working committee 1364 was formed in mid-1993 and had its first meeting on
October 14, 1993. The standard, which combined both the Verilog
language syntax and the PLI in a single volume, was passed in May 1995 and
is now known as IEEE Std 1364-1995. Over the following years, new features were
added to Verilog, and the new version is called Verilog-2001. This version fixed
a lot of problems that Verilog-1995 had and is standardized as IEEE Std 1364-2001.
All that remains is for all the tool vendors to implement it.

6.6 Features of Verilog HDL

Verilog HDL has evolved as a standard hardware description language. Verilog


HDL offers many useful features

 Verilog HDL is a general-purpose hardware description language that is


easy to learn and easy to use. It is similar in syntax to the C programming
language. Designers with C programming experience will find it easy to
learn Verilog HDL.
 Verilog HDL allows different levels of abstraction to be mixed in the same
model. Thus, a designer can define a hardware model in terms of
switches, gates, RTL, or behavioral code. Also, a designer needs to learn
only one language for stimulus and hierarchical design.

 Most popular logic synthesis tools support Verilog HDL. This makes it the
language of choice for designers.

 All fabrication vendors provide Verilog HDL libraries for post-logic synthesis
simulation. Thus, designing a chip in Verilog HDL allows the widest choice
of vendors.

 The Programming Language Interface (PLI) is a powerful feature that


allows the user to write custom C code to interact with the internal data
structures of Verilog. Designers can customize a Verilog HDL simulator to
their needs with the PLI.

6.7 Simulation
Simulation is the process of verifying the functional characteristics of
models at any level of abstraction. We use simulators to simulate hardware
models, to test whether the RTL code meets the functional requirements of the
specification, and to see whether all the RTL blocks are functionally correct. To
achieve this, we need to write a testbench, which generates the clock, reset, and
the required test vectors.

We use the waveform output from the simulator to see whether the DUT (Device
Under Test) is functionally correct. Most simulators come with a waveform
viewer. As the design becomes complex, we write a self-checking testbench, where the
testbench applies the test vectors and compares the output of the DUT with the
expected values.
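
As a small illustration, a minimal self-checking testbench for the 2-to-1 multiplexer sketched earlier in this chapter might look as follows (module and signal names are assumptions for the example).

// Minimal self-checking testbench: applies vectors and compares the DUT
// output with the expected value.
module tb_mux2;
    reg  a, b, sel;
    wire y;
    integer i, errors;

    mux2_rtl dut (.a(a), .b(b), .sel(sel), .y(y));

    initial begin
        errors = 0;
        for (i = 0; i < 8; i = i + 1) begin
            {a, b, sel} = i;           // apply a test vector
            #10;                       // wait for the output to settle
            if (y !== (sel ? b : a)) begin
                $display("Mismatch: a=%b b=%b sel=%b y=%b", a, b, sel, y);
                errors = errors + 1;
            end
        end
        if (errors == 0) $display("All vectors passed");
        $finish;
    end
endmodule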

There is another kind of simulation, called timing simulation, which is


done after synthesis or after P&R (place and route). Here we include the gate
delays and wire delays and check whether the DUT works at the rated clock speed.
This is also called SDF simulation or gate-level simulation.

6.8 SYNTHESIS:

Synthesis is the process in which a synthesis tool, such as Design Compiler or
Synplify, takes the RTL in Verilog or VHDL, the target technology, and constraints as
inputs, and maps the RTL to target technology primitives.

After mapping the RTL to gates, the synthesis tool also performs a minimal
amount of timing analysis to see whether the mapped design meets the timing
requirements.

SOFTWARE DETAILS:

SPARTAN-III FPGA FAMILY:

Architectural Description:

Spartan-III Array:

The Spartan-III user-programmable gate array is composed of five major


configurable elements:
 IOBs provide the interface between the package pins and the internal
logic.
 CLBs provide the functional elements for constructing most logic.
 Dedicated block RAM memories of 4096 bits each.
 Clock DLLs for clock-distribution delay compensation and clock domain
control.
 Versatile multi-level interconnect structure.

Values stored in static memory cells control all the configurable logic
elements and interconnect resources. These values load into the memory cells
on power-up, and can reload if necessary to change the function of the device.
Each of these elements will be discussed in detail.

Input/Output Block:

The Spartan-III IOB, as seen in the figure, features inputs and outputs that
support a wide variety of I/O signaling standards. These high-speed inputs and
outputs are capable of supporting various state-of-the-art memory and bus
interfaces.

The three IOB registers function either as edge-triggered D-type flip-flops


or as level- sensitive latches. Each IOB has a clock signal (CLK) shared by the
three registers and independent Clock Enable (CE) signals for each register. In
addition to the CLK and CE control signals, the three registers share a Set/Reset
(SR). For each register, this signal can be independently configured as a
synchronous Set, a synchronous Reset, an asynchronous Preset, or an
asynchronous Clear.

Figure Spartan-III Input/Output Block (IOB)

Input Path:

A buffer in the Spartan-II IOB input path routes the input signal either
directly to the internal logic or through an optional input flip-flop. An optional delay
element at the D-input of this flip-flop eliminates pad-to-pad hold time. The delay
is matched to the internal clock-distribution delay of the FPGA and, when used,
assures that the pad-to-pad hold time is zero.

Output Path:

The output path includes a 3-state output buffer that drives the output
signal onto the pad. The output signal can be routed to the buffer directly from
the internal logic or through an optional IOB output flip-flop.

The 3-state control of the output can also be routed directly from the
internal logic or through a flip-flop that provides synchronous enable and disable.
Each output driver can be individually programmed for a wide range of low-
voltage signaling standards. Each output buffer can source up to 24 mA and sink
up to 48 mA. Drive strength and slew rate controls minimize bus transients.

Storage Elements:

Storage elements in the Spartan-II slice can be configured either as


edge-triggered D-type flip-flops or as level-sensitive latches. The D inputs can be
driven either by function generators within the slice or directly from slice inputs,
bypassing the function generators.

Block RAM:

Spartan-III FPGAs incorporate several large block RAM memories. These


complement the distributed RAM. Block RAM memory blocks are organized in
columns. All Spartan-III devices contain two such columns, one along each
vertical edge. These columns extend the full height of the chip. Each memory

54
block is four CLBs high, and consequently, a Spartan-II device eight CLBs high
will contain two memory blocks per column, and a total of four blocks.
.

Design Implementation:

The place-and-route tools (PAR) automatically provide the


implementation flow described in this section. The partitioner takes the EDIF
netlist for the design and maps the logic into the architectural resources of the
FPGA (CLBs and IOBs, for example).

The placer then determines the best locations for these blocks based on
their interconnections and the desired performance. Finally, the router
interconnects the blocks. The PAR algorithms support fully automatic
implementation of most designs. For demanding applications, however, the user
can exercise various degrees of control over the process. User partitioning,
placement, and routing information are optionally specified during the design-
entry process.

The implementation of highly structured designs can benefit greatly from


basic floor planning. The implementation software incorporates Timing Wizard
timing-driven placement and routing. Designers specify timing requirements
along entire paths during design entry. The timing path analysis routines in PAR
then recognize these user-specified requirements and accommodate them.

Timing requirements are entered on a schematic in a form directly


relating to the system requirements, such as the targeted clock frequency, or the
maximum allowable delay between two registers. In this way, the overall
performance of the system along entire signal paths is automatically tailored to

user-generated specifications. Specific timing information for individual nets is
unnecessary.

Configuration:

Configuration is the process by which the bit stream of a design, as


generated by the Xilinx development software, is loaded into the internal
configuration memory of the FPGA.

Spartan-III devices support both serial configuration, using the
master/slave serial and JTAG modes, and byte-wide configuration
employing the Slave Parallel mode.

Modes

Spartan-III devices support the following four configuration modes:


• Slave Serial mode
• Master Serial mode
• Slave Parallel mode
• Boundary-scan mode

The Configuration mode pins (M2, M1, and M0) select among these
configuration modes with the option in each case of having the IOB pins either
pulled up or left floating prior to configuration.

Serial Modes:

There are two serial configuration modes: In Master Serial mode, the FPGA
controls the configuration process by driving CCLK as an output. In Slave Serial
mode, the FPGA passively receives CCLK as an input from an external agent
(e.g., a microprocessor, CPLD, or second FPGA in master mode) that is
controlling the configuration process. In both modes, the FPGA is configured by
loading one bit per CCLK cycle. The MSB of each configuration data byte is
always written to the DIN pin first.

Slave Parallel Mode:

The Slave Parallel mode is the fastest configuration option. Byte-wide data is
written into the FPGA. A BUSY flag is provided for controlling the flow of data at
a clock frequency FCCNH above 50 MHz.

Slave Serial Mode:

In Slave Serial mode, the FPGAs CCLK pin is driven by an external source,
allowing FPGAs to be configured from other logic devices such as
microprocessors or in a Daisy-chain configuration.

Master Serial Mode:

In Master Serial mode, the CCLK output of the FPGA drives a Xilinx PROM
which feeds a serial stream of configuration data to the FPGA’s DIN input.

Operating Modes:

Block RAM memory supports two operating modes.


• Read Through
• Write Back

Figure: Configuration Flow Diagram

Read Through (One Clock Edge):

The read address is registered on the read port clock edge and data
appears on the output after the RAM access time. Some memories may place
the latch/register at the outputs, depending on the desire for a faster clock-
to-out versus setup time. This is generally considered to be an inferior solution,
since it changes the read operation into an asynchronous function, with the
possibility of missing an address/control line transition during the generation of
the read pulse clock.

Write Back (One Clock Edge):

The write address is registered on the write port clock edge and the data input is
written to the memory and mirrored on the write port input.

Features

 7.5 ns pin-to-pin logic delays on all pins


 fCNT to 125 MHz
 72 macrocells with 1,600 usable gates
 Up to 72 user I/O pins
 5 V in-system programmable (ISP)
 Endurance of 10,000 program/erase cycles
 Program/erase over full commercial voltage and temperature range
 Enhanced pin-locking architecture
 90 product terms drive any or all of 18 macrocells within Function Block
 Flexible 36V18 Function Block
 Global and product term clocks, output enables, set and reset signals
 Extensive IEEE Std 1149.1 boundary-scan (JTAG) support
 Programmable power reduction mode in each macrocell
 Slew rate control on individual outputs

 User programmable ground pin capability
 Extended pattern security features for design protection
 High-drive 24 mA outputs
 3.3 V or 5 V I/O capability
 Advanced CMOS 5V FastFLASH technology
 Supports parallel programming of more than one XC9500 concurrently
 Available in 44-pin PLCC, 84-pin PLCC, 100-pin PQFP and 100-pin TQFP
packages

Xilinx ISE

The Xilinx ISE tools allow you to use schematics, hardware description
languages (HDLs), and specially designed modules in a number of ways.
Schematics are drawn by using symbols for components and lines for
wires. Xilinx Tools is a suite of software tools used for the design of digital circuits
implemented using a Xilinx Field Programmable Gate Array (FPGA) or Complex
Programmable Logic Device (CPLD).

The design procedure consists of (a) design entry, (b) synthesis and
implementation of the design, (c) functional simulation and (d) testing and
verification. Digital designs can be entered in various ways using the above CAD
tools: using a schematic entry tool, using a hardware description language (HDL)
such as Verilog or VHDL, or a combination of both. In this project, we only use the
design flow based on Verilog HDL. The CAD tools enable the design of
combinational and sequential circuits starting from Verilog HDL design
specifications.

The steps of this design procedure are listed below:


1. Create Verilog design input file(s) using template driven editor.
2. Compile and implement the Verilog design file(s).

3. Create the test-vectors and simulate the design (functional simulation) without
using a PLD (FPGA or CPLD).
4. Assign input/output pins to implement the design on a target device.
5. Download bitstream to an FPGA or CPLD device.
6. Test design on FPGA/CPLD device
A Verilog input file in the Xilinx software environment consists of the following
segments (a skeleton example is shown after the list):

• Header: module name, list of input and output ports.

• Declarations: input and output ports, registers and wires.

• Logic Descriptions: equations, state machines and logic functions.

• End: endmodule
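
As an illustration, a skeleton Verilog input file containing these segments might look as follows (the module and its ports are invented for the example).

// Header: module name and port list.
module majority3 (
    // Declarations: input and output ports, registers and wires.
    input  wire a,
    input  wire b,
    input  wire c,
    output wire y
);
    // Logic description: a simple combinational equation.
    assign y = (a & b) | (b & c) | (a & c);

// End
endmodule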

ModelSim 6.2C

ModelSim is our UNIX, Linux, and Windows-based simulation and debug


environment, combining high performance with the most powerful and intuitive
GUI in the industry.

FEATURES:
 Unified Coverage Database (UCDB) which is a central point for managing,
merging, viewing, analyzing and reporting all coverage information.
 Source Annotation. The source window can be enabled to display the
values of objects during simulation or when reviewing simulation results
logged to WLF.
 Finite State Machine Coverage for both VHDL and Verilog is now
supported.
 Code Coverage results can now be reviewed post-simulation using the
graphical user environment.
 Simulation messages are now logged in the WLF file and new capabilities
for managing message viewing are provided in the message viewer.
 SystemC is now supported for x86 Linux 64-bit platforms.

 Transaction recording and viewing is supported for SystemC using the
SCV transaction recording facilities.
 The GUI debug and analysis environment continues to evolve to provide
greater user-customization and better performance.
 SystemVerilog for design support continues to expand with many new
constructs added for this release.
 Message logging and viewing. Simulation messages are now logged in
the WLF and new capabilities for managing message viewing are
provided. Messages are organized by their severity and type.

BENEFITS:
 The best mixed-language environment and performance in the industry.
 The intuitive GUI makes it easy to view and access the many powerful
capabilities of ModelSim. There is no learning curve as the debug
environment is common across all languages.
 All ModelSim products are 100% standards based. This means your
investment is protected, risk is lowered, reuse is enabled, and productivity
is enhanced.
 Award-winning technical support.

Figure 8.a Model Sim

High-Performance Simulation Environment:

ModelSim combines high performance and high capacity with the most
advanced code coverage and debugging capabilities in the industry. ModelSim
offers unmatched flexibility by supporting 32 and 64 bit UNIX and Linux and 32
bit Windows®-based platforms. Model Technology™ was the first to put the
award-winning single kernel simulator (SKS) technology in the hands of
engineers, enabling transparent mixing of VHDL, Verilog, and SystemC in one
design, using a common, intuitive graphical interface for development and debug
at any level, regardless of the language.

The combination of industry-leading performance and capacity with the


best integrated debug and analysis environment make ModelSim the simulator of
choice for both ASIC and FPGA design. The best standards and platform support
in the industry make it easy to adopt in the majority of process and tool flows.

Verilog for Design:

ModelSim fully supports the Verilog design constructs, providing new


capabilities that aid in modeling at higher levels of abstraction. Some of the most
significant design productivity features include:
 Interfaces
 Enumerated, structures, unions, and user-defined types
 Assignment and increment/decrement operators
 Enhanced procedural blocks
 Jump statements
 Dynamic arrays
 Associative arrays
 Default task and function arguments and named argument association
 Packages and global declarations
ModelSim’s native support of Verilog also includes a fully integrated debug environment.

CONCLUSION

In this project, self-checking architectures for an RS encoder and decoder
are described. The parity properties of the binary representation of the elements
of GF(2^m) have been studied, and a method for a self-checking implementation of
the arithmetic structures used in the RS encoder has been proposed. The
problems related to the presence of undetected faults in parity check-based
schemes have been faced by imposing some constraints on the logical netlist
implementation of the constant multiplier.

Evaluations of area and delay overhead for the self-checking RS encoder
have been provided. For the self-checking RS decoder, two main properties of the
fault-free decoder have been identified and used to detect faults inside the
decoder. The proposed method can be used for a wide range of algorithms
implementing the decoder function. Some concurrent error detection schemes
have been explained in this project, and some evaluations of area overhead have
been provided. Our method is non-intrusive, i.e., the decoder architecture is not
modified. This fact enables design reuse in the development of very complex
digital systems.

Simulation results

The simulation output waveform of the top module of Reed


Solomon

The simulation output waveform of the top module of Reed
Solomon

Synthesis results

RTL schematic of the top module of the self-checking Reed Solomon design

RTL schematic of a sub-module of the self-checking Reed Solomon design

RTL schematic of an inner sub-block module of the self-checking Reed Solomon design

REFERENCES

1. R. E. Blahut, Theory and Practice of Error Control Codes. Reading, MA:


Addison-Wesley Publishing Company, 1983.
2. A. R. Masoleh and M. A. Hasan, “Low complexity bit parallel architectures for
polynomial basis multiplication over GF(2^m),” IEEE Trans. Comput.,
vol. 53, no. 8, pp. 945–959, Aug. 2004.
3. J. Gambles, L. Miles, J. Has, W. Smith, and S. Whitaker, “An ultra-low power,
radiation-tolerant Reed Solomon encoder for space applications,” in Proc. IEEE
Custom Integr. Circuits Conf., 2003, pp. 631–634.
4. A. R. Masoleh and M. A. Hasan, “Error Detection in Polynomial Basis
Multipliers over Binary Extension Fields,” in Lecture Notes in Computer Science.
New York: Springer-Verlag, 2003, vol. 2523, pp.515–528.
5. S. B. Sarmadi and M. A. Hasan, “Concurrent error detection of polynomial
basis multiplication over extension fields using a multiple-bit parity scheme,” in
Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst., 2005, pp. 102–110.
6. G. C. Cardarilli, S. Pontarelli, M. Re, and A. Salsano, “Design of a self
checking Reed Solomon encoder,” in Proc. 11th IEEE Int. On-Line Test. Symp.
(IOLTS’05), 2005, pp. 201–202.
7. G. C. Cardarilli, S. Pontarelli, M. Re, and A. Salsano, “A self checking Reed
Solomon encoder: Design and analysis,” in Proc. IEEE Int. Symp. Defect Fault
Tolerance VLSI Syst., 2005, pp. 111–119.

8. M. Gossel, S. Fenn, and D. Taylor, “On-line error detection for finite field
multipliers,” in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Syst., 1997,
pp. 307–311.
9. Y.-C. Chuang and C.-W. Wu, “On-line error detection schemes for a systolic
finite-field inverter,” in Proc. 7th Asian Test Symp., 1998, pp. 301–305.
10. M. Boyarinov, “Self-checking algorithm of solving the key equation,” in Proc.
IEEE Int. Symp. Inf. Theory, 1998, p. 292.
11. C. Bolchini, F. Salice, and D. Sciuto, “A novel methodology for designing TSC
networks based on the parity bit code,” in Proc. Eur. Design Test Conf., 1997, pp.
440–444.
12. Altera Corp., San Jose, CA, “Altera Reed-Solomon compiler user guide
3.3.3,” 2006.
13. Xilinx, San Jose, CA, “Xilinx logicore Reed-Solomon decoder v5.1,” 2006.
14. D. Nikolos, “Design techniques for testable embedded error checkers,”
Computer, vol. 23, no. 7, pp. 84–88, Jul. 1990.
15. P. K. Lala, Fault Tolerant and Fault Testable Hardware Design. Englewood
Cliffs, NJ: Prentice-Hall, 1985.

