Unit5 3
5.5 QUANTIZATION:
Quantization refers to the process of approximating the continuous set of values in the image data
with a finite (preferably small) set of values. The input to a quantizer is the original data, and the
output is always one among a finite number of levels. The quantizer is a function whose set of
output values is discrete, and usually finite. This is a process of approximation, and a
good quantizer is one which represents the original signal with minimum loss or distortion. The
difference between the actual analog value and the quantized digital value is called the quantization
error. This error is due either to rounding or to truncation.
The set of possible input values may be infinitely large, and may possibly be continuous
and therefore uncountable (such as the set of all real numbers, or all real numbers within some
limited range). The set of possible output values may be finite or countably infinite. The input
and output sets involved in quantization can be defined in a rather general way. For example,
vector quantization is the application of quantization to multi-dimensional (vector-valued) input
data.
Quantization of a single real value at a time, as performed by an analog-to-digital converter
(ADC), is called scalar quantization; outside the realm of signal processing it may simply be
called rounding. An ADC can be modeled as two processes: sampling and quantization. Sampling
converts a voltage signal (function of time) into a discrete-time signal (sequence of real
numbers). Quantization replaces each real number with an approximation from a finite set of
discrete values (levels), which is necessary for storage and processing by numerical methods.
Most commonly, these discrete values are represented as fixed-point words (either proportional
to the waveform values or companded) or floating-point words. Common word-lengths are 8-bit
(256 levels), 16-bit (65,536 levels), 32-bit (4.3 billion levels), and so on, though any number of
quantization levels is possible (not just powers of two). Quantizing a sequence of numbers
produces a sequence of quantization errors which is sometimes modeled as an additive random
signal called quantization noise because of its stochastic behavior. The more levels a quantizer
uses, the lower its quantization noise power.
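The two ADC stages described above can be sketched as follows. This is a minimal illustration, not a model of any particular converter: the function names, the 50 Hz test tone, the 1 kHz sampling rate, and the choice of saturating out-of-range inputs are all assumptions made for the example.

```python
import math

def sample(signal, rate_hz, count):
    """Sampling: evaluate a continuous-time signal at the instants t = n / rate_hz."""
    return [signal(n / rate_hz) for n in range(count)]

def quantize(x, bits, full_scale=1.0):
    """Quantization: map x to the nearest of 2**bits levels spanning [-full_scale, full_scale).

    Returns (integer code, reconstructed value). Out-of-range inputs saturate (an assumption).
    """
    levels = 2 ** bits
    step = 2.0 * full_scale / levels
    code = math.floor(x / step + 0.5)
    code = max(-(levels // 2), min(levels // 2 - 1, code))
    return code, code * step

def tone(t):
    """Hypothetical analog input: a 50 Hz sine with amplitude 0.9."""
    return 0.9 * math.sin(2.0 * math.pi * 50.0 * t)

samples = sample(tone, rate_hz=1000.0, count=200)

def noise_power(bits):
    """Mean squared quantization error over the sampled tone."""
    errors = [x - quantize(x, bits)[1] for x in samples]
    return sum(e * e for e in errors) / len(errors)

for bits in (4, 8, 12):
    print(bits, "bits,", 2 ** bits, "levels, noise power", noise_power(bits))
```

Running the loop shows the noise power falling as the number of levels grows, which is the behaviour the paragraph describes.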
In general, both ADC processes lose some information. So discrete-valued signals are
only an approximation of the continuous-valued discrete-time signal, which is itself only an
approximation of the original continuous-valued continuous-time signal. But both types of
approximation errors can, in theory, be made arbitrarily small by good design.
5.7 FIXED POINT AND FLOATING POINT NUMBER REPRESENTATIONS:
Finite register lengths and A/D converters cause errors in:
(i) Input quantisation.
(ii) Coefficient (or multiplier) quantisation.
(iii) Products of multiplication, which are truncated or rounded to fit the machine word length.
Truncation: simply chop off the remaining digits; also called rounding toward zero.
Round to nearest: round to the nearest representable value, with ties broken in one of two
ways. The result may be rounded up or rounded down.
A round-off error, also called rounding error, is the difference between the calculated
approximation of a number and its exact mathematical value. Numerical analysis
specifically tries to estimate this error when using approximation equations and/or
algorithms, especially when using finite digits to represent real numbers (which in theory
have infinite digits). This is a form of quantization error.
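The rounding rules above can be compared with Python's standard `decimal` module. The text does not say which two tie-breaking rules it has in mind; the sketch below assumes the two common ones, "half away from zero" (`ROUND_HALF_UP`) and "half to even" (`ROUND_HALF_EVEN`). The helper names are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP

def to_integer(text, mode):
    """Quantize a decimal string to an integer using the given rounding mode."""
    return Decimal(text).quantize(Decimal("1"), rounding=mode)

def round_off_error(text, mode):
    """Calculated approximation minus the exact value."""
    return to_integer(text, mode) - Decimal(text)

for text in ("2.5", "3.5", "-2.5", "2.7", "-2.7"):
    print(
        text,
        "truncate:", to_integer(text, ROUND_DOWN),
        "half-up:", to_integer(text, ROUND_HALF_UP),
        "half-even:", to_integer(text, ROUND_HALF_EVEN),
        "truncation error:", round_off_error(text, ROUND_DOWN),
    )
```

The two tie rules differ only on exact halves such as 2.5; for every other input they agree.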
Rounding example:
As an example, rounding a real number $x$ to the nearest integer value forms a very basic type of
quantizer, a uniform one. A typical (mid-tread) uniform quantizer with a quantization step size
equal to some value $\Delta$ can be expressed as
$$Q(x) = \Delta \cdot \left\lfloor \frac{x}{\Delta} + \frac{1}{2} \right\rfloor,$$
where the notation $\lfloor \cdot \rfloor$ denotes the floor function. For simple rounding to the nearest
integer, the step size $\Delta$ is equal to 1. With $\Delta = 1$ or with $\Delta$ equal to any other integer value, this
quantizer has real-valued inputs and integer-valued outputs, although this property is not a
necessity – a quantizer may also have an integer input domain and may also have non-integer
output values. The essential property of a quantizer is that it has a countable set of possible
output values that has fewer members than the set of possible input values. The members of the
set of output values may have integer, rational, or real values (or even other possible values as
well, in general – such as vector values or complex numbers).
When the quantization step size is small (relative to the variation in the signal being measured),
it is relatively simple to show[3][4][5][6][7][8] that the mean squared error produced by such a
rounding operation will be approximately $\Delta^2/12$. Mean squared error is also called the
quantization noise power. Adding one bit to the quantizer halves the value of Δ, which reduces
the noise power by the factor ¼. In terms of decibels, the noise power change is
$10 \log_{10}(1/4) \approx -6.02$ dB.
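The mean-squared-error claim can be checked numerically. The sketch below is an illustration under stated assumptions: inputs are drawn uniformly from [-1, 1], the step size is taken as 2 / 2^bits, and the function names, sample count, and seed are arbitrary choices for the example.

```python
import math
import random

def mid_tread(x, step):
    """Mid-tread uniform quantizer: round x to the nearest multiple of step."""
    return step * math.floor(x / step + 0.5)

def measured_mse(bits, count=50000, seed=0):
    """Empirical mean squared rounding error for inputs uniform on [-1, 1]."""
    rng = random.Random(seed)
    step = 2.0 / 2 ** bits
    total = 0.0
    for _ in range(count):
        x = rng.uniform(-1.0, 1.0)
        total += (x - mid_tread(x, step)) ** 2
    return total / count, step

previous_db = None
for bits in (6, 7, 8):
    mse, step = measured_mse(bits)
    level_db = 10.0 * math.log10(mse)
    change = None if previous_db is None else level_db - previous_db
    print(bits, "bits: measured", mse, "predicted", step ** 2 / 12.0, "change in dB", change)
    previous_db = level_db
```

Each added bit should move the measured noise power by roughly the decibel figure quoted above, and the measured value should sit close to the predicted one.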
Because the set of possible output values of a quantizer is countable, any quantizer can be
decomposed into two distinct stages, which can be referred to as the classification stage (or
forward quantization stage) and the reconstruction stage (or inverse quantization stage), where
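the classification stage maps the input value to an integer quantization index $k$ and the
reconstruction stage maps the index $k$ to the reconstruction value $y_k$ that is the output
approximation of the input value. For the example uniform quantizer described above, the
forward quantization stage can be expressed as
$$k = \left\lfloor \frac{x}{\Delta} + \frac{1}{2} \right\rfloor,$$
and the reconstruction stage for this example quantizer is simply $y_k = k \cdot \Delta$.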
This decomposition is useful for the design and analysis of quantization behavior, and it
illustrates how the quantized data can be communicated over a communication channel – a
source encoder can perform the forward quantization stage and send the index information
through a communication channel (possibly applying entropy coding techniques to the
quantization indices), and a decoder can perform the reconstruction stage to produce the output
approximation of the original input data. In more elaborate quantization designs, both the
forward and inverse quantization stages may be substantially more complex. In general, the
forward quantization stage may use any function that maps the input data to the integer space of
the quantization index data, and the inverse quantization stage can conceptually (or literally) be a
table look-up operation to map each quantization index to a corresponding reconstruction value.
This two-stage decomposition applies equally well to vector as well as scalar quantizers.
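The two-stage decomposition can be sketched for the uniform quantizer as follows. The function names, the step size of 0.25, and the contents of the look-up table are hypothetical; the table here simply reproduces the uniform levels, and a non-uniform design would only change its entries.

```python
import math

def forward_quantize(x, step):
    """Classification stage: map an input value to an integer index k."""
    return math.floor(x / step + 0.5)

def inverse_quantize(k, step):
    """Reconstruction stage for the uniform quantizer: y_k = k * step."""
    return k * step

def inverse_quantize_table(k, table):
    """Reconstruction stage as a table look-up from index to reconstruction value."""
    return table[k]

step = 0.25
inputs = [0.37, -0.4, 1.12, 0.0]

# Encoder side: only the integer indices need to be transmitted.
indices = [forward_quantize(x, step) for x in inputs]       # [1, -2, 4, 0]

# Decoder side: rebuild the approximation from the indices alone.
outputs = [inverse_quantize(k, step) for k in indices]      # [0.25, -0.5, 1.0, 0.0]

# Same decoder expressed as a look-up table.
table = {k: k * step for k in range(-8, 9)}
table_outputs = [inverse_quantize_table(k, table) for k in indices]

print(indices, outputs, table_outputs)
```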
5.11 TRUNCATION:
In mathematics, truncation is the term for limiting the number of digits right of the
decimal point, by discarding the least significant ones.
For example, consider the real numbers
5.6341432543653654
32.438191288
-6.3444444444444
To truncate these numbers to 4 decimal digits, we only consider the 4 digits to the right
of the decimal point.
The result would be
5.6341
32.4381
-6.3444
Note that in some cases, truncating would yield the same result as rounding, but
truncation does not round up or round down the digits; it merely cuts off at the specified digit.
The truncation error can be twice the maximum error in rounding.
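The example above can be reproduced with Python's `decimal` module, which works on the decimal digits directly and so avoids binary floating-point surprises. The helper names are hypothetical, and "round half to even" is an assumed choice for the rounding comparison.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

def truncate(text, digits):
    """Keep `digits` digits after the decimal point and discard the rest."""
    quantum = Decimal(1).scaleb(-digits)          # e.g. 0.0001 for digits=4
    return Decimal(text).quantize(quantum, rounding=ROUND_DOWN)

def round_nearest(text, digits):
    """Round to `digits` digits after the decimal point (ties to even)."""
    quantum = Decimal(1).scaleb(-digits)
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_EVEN)

for text in ("5.6341432543653654", "32.438191288", "-6.3444444444444"):
    exact = Decimal(text)
    t, r = truncate(text, 4), round_nearest(text, 4)
    print(text, "truncated:", t, "error:", abs(exact - t),
          "rounded:", r, "error:", abs(exact - r))
```

For 32.438191288 truncation gives 32.4381 while rounding gives 32.4382, and the truncation error is the larger of the two. In general the truncation error can approach one full unit in the last kept digit, whereas the rounding error is at most half of that.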
5.12 LIMIT CYCLE OSCILLATIONS:
For an IIR filter implemented with infinite precision arithmetic the output should approach
zero in the steady state if the input is zero and it should approach a constant value if the input is a
constant. However, with an implementation using a finite-length register, an output can occur
even with zero input. The output may be a fixed value or it may oscillate between finite positive
and negative values. This effect is referred to as (zero-input) limit cycle oscillation.
A limit cycle, sometimes referred to as a multiplier round-off limit cycle, is a low-level
oscillation that can exist in an otherwise stable filter as a result of the nonlinearity associated
with rounding (or truncating) internal filter calculations. Limit cycles require recursion to exist
and do not occur in non-recursive FIR filters. There are at least three ways of dealing with limit
cycles when fixed-point arithmetic is used. One is to determine a bound on the maximum limit
cycle amplitude, expressed as an integral number of quantization steps. It is then possible to
choose a word length that makes the limit cycle amplitude acceptably low. Alternatively, limit
cycles can be prevented by randomly rounding calculations up or down. However, this approach
is complicated to implement. The third approach is to properly choose the filter realization
structure and then quantize the filter calculations using magnitude truncation. This approach has
the disadvantage of producing more round-off noise than ordinary truncation or rounding.
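A zero-input limit cycle is easy to reproduce for a first-order recursive filter. The sketch below assumes the filter y[n] = Q(a·y[n-1]) with zero input, where Q rounds the product to the nearest quantization step with ties rounded away from zero; values are held as integer multiples of the step, and exact fractions are used so that the only error is the rounding itself. The function names, the coefficients ±1/2, and the starting value of 15 steps are choices made for the example.

```python
from fractions import Fraction
import math

def round_magnitude(value):
    """Round to the nearest integer, rounding the magnitude so ties go away from zero."""
    magnitude = math.floor(abs(value) + Fraction(1, 2))
    return magnitude if value >= 0 else -magnitude

def zero_input_response(a, y0, steps):
    """Iterate y[n] = Q(a * y[n-1]) with zero input; y is in units of one quantization step."""
    out = [y0]
    for _ in range(steps):
        out.append(round_magnitude(a * out[-1]))
    return out

def ideal_response(a, y0, steps):
    """Same recursion with infinite precision: y[n] = a**n * y0, which decays to zero."""
    return [y0 * a ** n for n in range(steps + 1)]

print(zero_input_response(Fraction(1, 2), 15, 8))    # [15, 8, 4, 2, 1, 1, 1, 1, 1]
print(zero_input_response(Fraction(-1, 2), 15, 8))   # [15, -8, 4, -2, 1, -1, 1, -1, 1]
print([float(v) for v in ideal_response(Fraction(1, 2), 15, 8)])
```

With a = 1/2 the quantized output sticks at one step instead of decaying to zero, and with a = -1/2 it oscillates between plus and minus one step, while the infinite-precision response keeps shrinking.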
The addition of two fixed-point numbers causes overflow when the sum exceeds the word
size available to store it. Overflow in the adder can make the filter output oscillate
between the maximum amplitude limits. Such limit cycles are referred to as overflow
oscillations. Limit cycles also occur as a result of quantization effects in
multiplication. The amplitudes of the output during such a limit cycle are confined to a
range of values called the dead band of the filter.
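The jump that drives overflow oscillations can be seen in a single addition. The sketch below assumes two's-complement words, where a sum that exceeds the word size wraps around to the opposite end of the range; a saturating adder, which clips instead of wrapping, is shown for comparison. The function names and the 8-bit word size are choices made for the example.

```python
def wrap(value, bits):
    """Two's-complement overflow: keep only the low `bits` bits of the result."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half

def saturate(value, bits):
    """Saturating overflow: clip to the largest or smallest representable value."""
    half = 1 << (bits - 1)
    return max(-half, min(half - 1, value))

bits = 8                       # representable range is -128 .. 127
total = 100 + 60               # true sum 160 does not fit
print(wrap(total, bits))       # -96: a large positive sum becomes a large negative value
print(saturate(total, bits))   # 127

total = -100 + -60             # true sum -160 does not fit either
print(wrap(total, bits))       # 96
print(saturate(total, bits))   # -128
```

The sign flip produced by wrap-around is what lets a recursive filter swing between its amplitude limits once an overflow has occurred.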
Truncating the Fourier series is known to introduce unwanted ripples in the
frequency response H(ω), because the Fourier series does not converge uniformly at a
discontinuity. This oscillatory behaviour near the band edge of the filter is known
as the Gibbs phenomenon (or Gibbs oscillation). Two techniques reduce it:
1. The discontinuity between the pass band and stop band in the frequency response is avoided by
introducing a transition band between the pass band and stop band.
2. The truncated series is multiplied by a window function that contains a taper, which decays
towards zero gradually instead of abruptly.
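The effect of a tapered window can be sketched numerically. The example below truncates the Fourier series of an ideal low-pass response and compares the pass-band overshoot with and without a taper. The Hamming window, the cutoff of 0.4π, the half-length M = 25, and all function names are assumptions chosen for illustration; the notes do not prescribe a specific window.

```python
import math

def ideal_lowpass(n, cutoff):
    """Fourier-series coefficient h_d[n] of an ideal low-pass response (cutoff in rad/sample)."""
    if n == 0:
        return cutoff / math.pi
    return math.sin(cutoff * n) / (math.pi * n)

def hamming(n, half_length):
    """Hamming taper for n = -half_length .. half_length (symmetric about n = 0)."""
    return 0.54 + 0.46 * math.cos(math.pi * n / half_length)

def amplitude_response(taps, omega):
    """Zero-phase response of symmetric taps h[0..M]: h[0] + 2 * sum_n h[n] cos(omega n)."""
    return taps[0] + 2.0 * sum(h * math.cos(omega * n) for n, h in enumerate(taps) if n > 0)

def peak_overshoot(taps, grid=1000):
    """Largest amount by which the response exceeds the ideal pass-band gain of 1."""
    return max(amplitude_response(taps, math.pi * i / grid) for i in range(grid + 1)) - 1.0

M = 25
cutoff = 0.4 * math.pi
rectangular = [ideal_lowpass(n, cutoff) for n in range(M + 1)]      # abrupt truncation
tapered = [h * hamming(n, M) for n, h in enumerate(rectangular)]    # gradual taper

print("overshoot, abrupt truncation:", peak_overshoot(rectangular))
print("overshoot, Hamming taper:    ", peak_overshoot(tapered))
```

The tapered design trades the large ripple near the band edge for a wider transition between pass band and stop band, which corresponds to the first technique listed above.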
5.15 SCALING:
Saturation arithmetic eliminates limit cycles due to overflow, but it causes undesirable
signal distortion due to the non-linearity of the clipper. In order to limit the amount of non-linear
distortion, it is important to scale the input signal and the unit sample response between the input
and any internal summing node in the system such that overflow becomes a rare event.
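One way to choose the scale factor can be sketched as follows. The sketch assumes the most conservative rule, dividing the input by the sum of the magnitudes of the unit sample response, which makes overflow impossible for any input bounded by 1 rather than merely rare; less conservative rules exist. The unit sample response `h` and the function name are hypothetical.

```python
def convolve(h, x):
    """Direct-form FIR output y[n] = sum_k h[k] * x[n - k], with x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

h = [0.5, 0.8, 0.6, -0.3]                       # hypothetical unit sample response
scale = 1.0 / sum(abs(hk) for hk in h)          # conservative scale factor, 1 / sum |h[k]|

# Worst-case full-scale input: every sample lines up with the sign of the tap it meets.
worst = [1.0 if hk >= 0 else -1.0 for hk in reversed(h)]

unscaled_peak = max(abs(v) for v in convolve(h, worst))
scaled_peak = max(abs(v) for v in convolve(h, [scale * v for v in worst]))

print("peak without scaling:", unscaled_peak)   # exceeds 1, so the summing node overflows
print("peak with scaling:   ", scaled_peak)     # stays within the range [-1, 1]
```

The price of this guarantee is a smaller signal level relative to the quantization noise, which is why scaling is usually a compromise between overflow and noise.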
The dynamic range of a signal can be defined as its maximum decibel level
minus its average "noise level" in dB. For digital signals, the limiting noise is
ideally quantization noise.
Quantization noise is generally modeled as a uniform random variable between plus and
minus half the least significant bit (since rounding to the nearest representable sample value is
normally used). If $\Delta$ denotes the quantization interval, then the maximum quantization-error
magnitude is $\Delta/2$, and its variance ("noise power") is $\Delta^2/12$ (see §G.3 for a derivation of
this value).
The number system (see Appendix G) and number of bits chosen to represent signal
samples determine their available dynamic range. Signal processing operations such as digital
filtering may use the same number system as the input signal, or they may use extra bits in the
computations, yielding an increased "internal dynamic range".
Since the threshold of hearing is near 0 dB SPL, and since the "threshold of pain" is
often defined as 120 dB SPL, we may say that the dynamic range of human hearing is
approximately 120 dB.
The dynamic range of magnetic tape is approximately 55 dB. To increase the dynamic
range available for analog recording on magnetic tape, companding is often used. "Dolby A"
adds approximately 10 dB to the dynamic range that will fit on magnetic tape (by compressing
the signal dynamic range by 10 dB), while DBX adds 30 dB (at the cost of more "transient
distortion"). In general, any dynamic range can be mapped to any other dynamic range, subject
only to noise limitations.
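The link between word length and dynamic range can be estimated with a short calculation. The sketch below assumes a simple definition: the ratio of full scale (2^bits quantization steps) to a single step, expressed in dB, which gives roughly 6.02 dB per bit. This ignores the exact statistics of the signal and the noise, so the figures are approximate, and the function names are hypothetical.

```python
import math

def dynamic_range_db(bits):
    """Approximate dynamic range of a fixed-point word: 20 * log10(2**bits) dB."""
    return 20.0 * math.log10(2 ** bits)

def bits_for(range_db):
    """Smallest word length whose approximate dynamic range covers range_db."""
    return math.ceil(range_db / dynamic_range_db(1))

for bits in (8, 16, 24):
    print(bits, "bits:", round(dynamic_range_db(bits), 2), "dB")

print("bits to cover 55 dB (magnetic tape):  ", bits_for(55.0))    # 10
print("bits to cover 120 dB (human hearing): ", bits_for(120.0))   # 20
```

Under this estimate a 16-bit word covers about 96 dB, comfortably more than the tape figure quoted above but short of the full range of hearing.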